Repository: nomic-ai/gpt4all
Branch: main
Commit: b666d16db5ae
Files: 307
Total size: 3.0 MB

Directory structure:
gitextract_o3o4fycr/
├── .circleci/
│   ├── config.yml
│   ├── continue_config.yml
│   └── grab_notary_id.py
├── .codespellrc
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bindings-bug.md
│   │   ├── chat-bug.md
│   │   ├── config.yml
│   │   ├── documentation.md
│   │   ├── feature-request.md
│   │   └── other-bug.md
│   ├── pull_request_template.md
│   └── workflows/
│       ├── close_issues.yml
│       └── codespell.yml
├── .gitignore
├── .gitmodules
├── CONTRIBUTING.md
├── LICENSE.txt
├── MAINTAINERS.md
├── README.md
├── common/
│   └── common.cmake
├── gpt4all-backend/
│   ├── CMakeLists.txt
│   ├── README.md
│   ├── include/
│   │   └── gpt4all-backend/
│   │       ├── llmodel.h
│   │       ├── llmodel_c.h
│   │       └── sysinfo.h
│   ├── llama.cpp.cmake
│   └── src/
│       ├── dlhandle.cpp
│       ├── dlhandle.h
│       ├── llamamodel.cpp
│       ├── llamamodel_impl.h
│       ├── llmodel.cpp
│       ├── llmodel_c.cpp
│       ├── llmodel_shared.cpp
│       └── utils.h
├── gpt4all-bindings/
│   ├── README.md
│   ├── cli/
│   │   ├── README.md
│   │   ├── app.py
│   │   └── developer_notes.md
│   ├── python/
│   │   ├── .gitignore
│   │   ├── .isort.cfg
│   │   ├── CHANGELOG.md
│   │   ├── LICENSE.txt
│   │   ├── MANIFEST.in
│   │   ├── README.md
│   │   ├── docs/
│   │   │   ├── css/
│   │   │   │   └── custom.css
│   │   │   ├── gpt4all_api_server/
│   │   │   │   └── home.md
│   │   │   ├── gpt4all_desktop/
│   │   │   │   ├── chat_templates.md
│   │   │   │   ├── chats.md
│   │   │   │   ├── cookbook/
│   │   │   │   │   ├── use-local-ai-models-to-privately-chat-with-Obsidian.md
│   │   │   │   │   ├── use-local-ai-models-to-privately-chat-with-One-Drive.md
│   │   │   │   │   ├── use-local-ai-models-to-privately-chat-with-google-drive.md
│   │   │   │   │   └── use-local-ai-models-to-privately-chat-with-microsoft-excel.md
│   │   │   │   ├── localdocs.md
│   │   │   │   ├── models.md
│   │   │   │   ├── quickstart.md
│   │   │   │   └── settings.md
│   │   │   ├── gpt4all_help/
│   │   │   │   ├── faq.md
│   │   │   │   └── troubleshooting.md
│   │   │   ├── gpt4all_python/
│   │   │   │   ├── home.md
│   │   │   │   ├── monitoring.md
│   │   │   │   └── ref.md
│   │   │   ├── index.md
│   │   │   └── old/
│   │   │       ├── gpt4all_chat.md
│   │   │       ├── gpt4all_cli.md
│   │   │       ├── gpt4all_faq.md
│   │   │       ├── gpt4all_monitoring.md
│   │   │       ├── gpt4all_nodejs.md
│   │   │       ├── gpt4all_python.md
│   │   │       ├── gpt4all_python_embedding.md
│   │   │       └── index.md
│   │   ├── gpt4all/
│   │   │   ├── __init__.py
│   │   │   ├── _pyllmodel.py
│   │   │   ├── gpt4all.py
│   │   │   └── tests/
│   │   │       ├── __init__.py
│   │   │       ├── test_embed_timings.py
│   │   │       └── test_gpt4all.py
│   │   ├── makefile
│   │   ├── mkdocs.yml
│   │   └── setup.py
│   └── typescript/
│       ├── .clang-format
│       ├── .gitignore
│       ├── .npmignore
│       ├── .yarnrc.yml
│       ├── README.md
│       ├── binding.ci.gyp
│       ├── binding.gyp
│       ├── index.cc
│       ├── index.h
│       ├── package.json
│       ├── prompt.cc
│       ├── prompt.h
│       ├── scripts/
│       │   ├── build.js
│       │   ├── docs.js
│       │   ├── mkclangd.js
│       │   └── prebuild.js
│       ├── spec/
│       │   ├── callbacks.mjs
│       │   ├── chat-memory.mjs
│       │   ├── chat-minimal.mjs
│       │   ├── concurrency.mjs
│       │   ├── embed-jsonl.mjs
│       │   ├── embed.mjs
│       │   ├── llmodel.mjs
│       │   ├── long-context.mjs
│       │   ├── model-switching.mjs
│       │   ├── stateless.mjs
│       │   ├── streaming.mjs
│       │   └── system.mjs
│       ├── src/
│       │   ├── chat-session.js
│       │   ├── config.js
│       │   ├── gpt4all.d.ts
│       │   ├── gpt4all.js
│       │   ├── models.js
│       │   └── util.js
│       └── test/
│           ├── gpt4all.test.js
│           └── models.json
├── gpt4all-chat/
│   ├── .flake8
│   ├── CHANGELOG.md
│   ├── CMakeLists.txt
│   ├── LICENSE
│   ├── cmake/
│   │   ├── Modules/
│   │   │   ├── SignMacOSBinaries.cmake
│   │   │   └── SignWindowsBinaries.cmake
│   │   ├── cpack-steal-config.cmake.in
│   │   ├── cpack_config.cmake
│   │   ├── deploy-qt-linux.cmake.in
│   │   ├── deploy-qt-mac.cmake.in
│   │   ├── deploy-qt-windows.cmake.in
│   │   ├── download_model.cmake
│   │   ├── installer_control.qs
│   │   ├── installer_gpt4all_component.qs
│   │   ├── installer_maintenancetool_component.qs
│   │   └── sign_dmg.py
│   ├── contributing_translations.md
│   ├── deps/
│   │   └── CMakeLists.txt
│   ├── dev-requirements.txt
│   ├── flatpak-manifest/
│   │   ├── io.gpt4all.gpt4all.appdata.xml
│   │   └── io.gpt4all.gpt4all.desktop
│   ├── main.qml
│   ├── metadata/
│   │   ├── latestnews.md
│   │   ├── models.json
│   │   ├── models2.json
│   │   ├── models3.json
│   │   └── release.json
│   ├── pyproject.toml
│   ├── qa_checklist.md
│   ├── qml/
│   │   ├── AddCollectionView.qml
│   │   ├── AddGPT4AllModelView.qml
│   │   ├── AddHFModelView.qml
│   │   ├── AddModelView.qml
│   │   ├── AddRemoteModelView.qml
│   │   ├── ApplicationSettings.qml
│   │   ├── ChatCollapsibleItem.qml
│   │   ├── ChatDrawer.qml
│   │   ├── ChatItemView.qml
│   │   ├── ChatMessageButton.qml
│   │   ├── ChatTextItem.qml
│   │   ├── ChatView.qml
│   │   ├── CollectionsDrawer.qml
│   │   ├── ConfirmationDialog.qml
│   │   ├── HomeView.qml
│   │   ├── LocalDocsSettings.qml
│   │   ├── LocalDocsView.qml
│   │   ├── ModelSettings.qml
│   │   ├── ModelsView.qml
│   │   ├── MyBusyIndicator.qml
│   │   ├── MyButton.qml
│   │   ├── MyCheckBox.qml
│   │   ├── MyComboBox.qml
│   │   ├── MyDialog.qml
│   │   ├── MyDirectoryField.qml
│   │   ├── MyFancyLink.qml
│   │   ├── MyFileDialog.qml
│   │   ├── MyFileIcon.qml
│   │   ├── MyFolderDialog.qml
│   │   ├── MyMenu.qml
│   │   ├── MyMenuItem.qml
│   │   ├── MyMiniButton.qml
│   │   ├── MySettingsButton.qml
│   │   ├── MySettingsDestructiveButton.qml
│   │   ├── MySettingsLabel.qml
│   │   ├── MySettingsStack.qml
│   │   ├── MySettingsTab.qml
│   │   ├── MySlug.qml
│   │   ├── MyTabButton.qml
│   │   ├── MyTextArea.qml
│   │   ├── MyTextButton.qml
│   │   ├── MyTextField.qml
│   │   ├── MyToolButton.qml
│   │   ├── MyWelcomeButton.qml
│   │   ├── NetworkDialog.qml
│   │   ├── NewVersionDialog.qml
│   │   ├── PopupDialog.qml
│   │   ├── RemoteModelCard.qml
│   │   ├── SettingsView.qml
│   │   ├── StartupDialog.qml
│   │   ├── Theme.qml
│   │   ├── ThumbsDownDialog.qml
│   │   ├── Toast.qml
│   │   └── ToastManager.qml
│   ├── resources/
│   │   ├── gpt4all.icns
│   │   └── gpt4all.rc
│   ├── src/
│   │   ├── chat.cpp
│   │   ├── chat.h
│   │   ├── chatapi.cpp
│   │   ├── chatapi.h
│   │   ├── chatlistmodel.cpp
│   │   ├── chatlistmodel.h
│   │   ├── chatllm.cpp
│   │   ├── chatllm.h
│   │   ├── chatmodel.cpp
│   │   ├── chatmodel.h
│   │   ├── chatviewtextprocessor.cpp
│   │   ├── chatviewtextprocessor.h
│   │   ├── codeinterpreter.cpp
│   │   ├── codeinterpreter.h
│   │   ├── config.h.in
│   │   ├── database.cpp
│   │   ├── database.h
│   │   ├── download.cpp
│   │   ├── download.h
│   │   ├── embllm.cpp
│   │   ├── embllm.h
│   │   ├── jinja_helpers.cpp
│   │   ├── jinja_helpers.h
│   │   ├── jinja_replacements.cpp
│   │   ├── jinja_replacements.h
│   │   ├── llm.cpp
│   │   ├── llm.h
│   │   ├── localdocs.cpp
│   │   ├── localdocs.h
│   │   ├── localdocsmodel.cpp
│   │   ├── localdocsmodel.h
│   │   ├── logger.cpp
│   │   ├── logger.h
│   │   ├── macosdock.h
│   │   ├── macosdock.mm
│   │   ├── main.cpp
│   │   ├── modellist.cpp
│   │   ├── modellist.h
│   │   ├── mysettings.cpp
│   │   ├── mysettings.h
│   │   ├── network.cpp
│   │   ├── network.h
│   │   ├── server.cpp
│   │   ├── server.h
│   │   ├── tool.cpp
│   │   ├── tool.h
│   │   ├── toolcallparser.cpp
│   │   ├── toolcallparser.h
│   │   ├── toolmodel.cpp
│   │   ├── toolmodel.h
│   │   ├── utils.h
│   │   ├── utils.inl
│   │   ├── xlsxtomd.cpp
│   │   └── xlsxtomd.h
│   ├── system_requirements.md
│   ├── test-requirements.txt
│   ├── tests/
│   │   ├── CMakeLists.txt
│   │   ├── cpp/
│   │   │   ├── basic_test.cpp
│   │   │   └── test_main.cpp
│   │   └── python/
│   │       ├── __init__.py
│   │       ├── config.py.in
│   │       └── test_server_api.py
│   └── translations/
│       ├── gpt4all_en_US.ts
│       ├── gpt4all_es_MX.ts
│       ├── gpt4all_it_IT.ts
│       ├── gpt4all_pt_BR.ts
│       ├── gpt4all_ro_RO.ts
│       ├── gpt4all_zh_CN.ts
│       └── gpt4all_zh_TW.ts
├── gpt4all-training/
│   ├── GPT-J_MAP.md
│   ├── README.md
│   ├── TRAINING_LOG.md
│   ├── clean.py
│   ├── configs/
│   │   ├── deepspeed/
│   │   │   ├── ds_config.json
│   │   │   ├── ds_config_gptj.json
│   │   │   ├── ds_config_gptj_lora.json
│   │   │   ├── ds_config_mpt.json
│   │   │   └── ds_config_pythia.json
│   │   ├── eval/
│   │   │   ├── generate_baseline.yaml
│   │   │   ├── generate_gpt4all_gptj.yaml
│   │   │   ├── generate_gpt4all_gptj_lora.yaml
│   │   │   └── generate_gpt4all_llama_lora.yaml
│   │   ├── generate/
│   │   │   ├── generate.yaml
│   │   │   ├── generate_gptj.yaml
│   │   │   ├── generate_gptj_lora.yaml
│   │   │   └── generate_llama.yaml
│   │   ├── inference/
│   │   │   └── gptj.yaml
│   │   └── train/
│   │       ├── finetune.yaml
│   │       ├── finetune_falcon.yaml
│   │       ├── finetune_gptj.yaml
│   │       ├── finetune_gptj_lora.yaml
│   │       ├── finetune_lora.yaml
│   │       ├── finetune_mpt.yaml
│   │       └── finetune_openllama.yaml
│   ├── create_hostname.sh
│   ├── data.py
│   ├── env.yaml
│   ├── eval_figures.py
│   ├── eval_self_instruct.py
│   ├── generate.py
│   ├── inference.py
│   ├── launcher.sh
│   ├── old-README.md
│   ├── read.py
│   ├── requirements.txt
│   └── train.py
└── roadmap.md

================================================
FILE CONTENTS
================================================

================================================
FILE: .circleci/config.yml
================================================
version: 2.1
setup: true
orbs:
  path-filtering: circleci/path-filtering@1.3.0

workflows:
  version: 2.1
  generate-config:
    jobs:
      - path-filtering/filter:
          filters:
            tags:
              only:
                - /.*/
          base-revision: main
          config-path: .circleci/continue_config.yml
          mapping: |
            .circleci/.*                   run-all-workflows   true
            gpt4all-backend/.*             run-all-workflows   true
            gpt4all-bindings/python/.*     run-python-workflow true
            gpt4all-bindings/typescript/.* run-ts-workflow     true
            gpt4all-chat/.*                run-chat-workflow   true

================================================
FILE: .circleci/continue_config.yml
================================================
version: 2.1

orbs:
  win: circleci/windows@5.0
  python: circleci/python@1.2
  node: circleci/node@5.1

parameters:
  run-all-workflows:
    type: boolean
    default: false
  run-python-workflow:
    type: boolean
    default: false
  run-chat-workflow:
    type: boolean
    default: false
  run-ts-workflow:
    type: boolean
    default: false

job-macos-executor: &job-macos-executor
  macos:
    xcode: 16.2.0
  resource_class: macos.m1.medium.gen1
  environment:
    HOMEBREW_NO_AUTO_UPDATE: 1

job-macos-install-deps: &job-macos-install-deps
  name: Install basic macOS build dependencies
  command: brew install ccache llvm wget

job-linux-install-chat-deps: &job-linux-install-chat-deps
  name: Install Linux build dependencies for gpt4all-chat
  command: |
    # Prevent apt-get from interactively prompting for service restart
    echo "\$nrconf{restart} = 'l'" | sudo tee /etc/needrestart/conf.d/90-autorestart.conf >/dev/null
    wget -qO- 'https://apt.llvm.org/llvm-snapshot.gpg.key' | sudo tee /etc/apt/trusted.gpg.d/apt.llvm.org.asc >/dev/null
    sudo add-apt-repository -yn 'deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy-19 main'
    wget -qO- "https://packages.lunarg.com/lunarg-signing-key-pub.asc" \
      | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc >/dev/null
    wget -qO- "https://packages.lunarg.com/vulkan/1.3.290/lunarg-vulkan-1.3.290-jammy.list" \
      | sudo tee /etc/apt/sources.list.d/lunarg-vulkan-1.3.290-jammy.list >/dev/null
    wget "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb"
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    packages=(
      bison build-essential ccache clang-19 clang-tools-19 cuda-compiler-11-8 flex gperf
      libcublas-dev-11-8 libfontconfig1 libfreetype6 libgl1-mesa-dev libmysqlclient21
      libnvidia-compute-550-server libodbc2 libpq5 libstdc++-12-dev libwayland-dev libx11-6
      libx11-xcb1 libxcb-cursor0 libxcb-glx0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1
      libxcb-randr0 libxcb-render-util0 libxcb-shape0 libxcb-shm0 libxcb-sync1 libxcb-util1
      libxcb-xfixes0 libxcb-xinerama0 libxcb-xkb1 libxcb1 libxext6 libxfixes3 libxi6
      libxkbcommon-dev libxkbcommon-x11-0 libxrender1 patchelf python3 vulkan-sdk
    )
    sudo apt-get update
    sudo apt-get install -y "${packages[@]}"
    wget "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-linux-x64-4.8.1.run"
    chmod +x qt-online-installer-linux-x64-4.8.1.run
    ./qt-online-installer-linux-x64-4.8.1.run --no-force-installations --no-default-installations \
      --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations \
      --email "$QT_EMAIL" --password "$QT_PASSWORD" install \
      qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.linux_gcc_64 qt.qt6.682.addons.qt5compat \
      qt.qt6.682.debug_info extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver

job-linux-install-backend-deps: &job-linux-install-backend-deps
  name: Install Linux build dependencies for gpt4all-backend
  command: |
    wget -qO- 'https://apt.llvm.org/llvm-snapshot.gpg.key' | sudo tee /etc/apt/trusted.gpg.d/apt.llvm.org.asc >/dev/null
    sudo add-apt-repository -yn 'deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy-19 main'
    wget -qO- "https://packages.lunarg.com/lunarg-signing-key-pub.asc" \
      | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc >/dev/null
    wget -qO- "https://packages.lunarg.com/vulkan/1.3.290/lunarg-vulkan-1.3.290-jammy.list" \
      | sudo tee /etc/apt/sources.list.d/lunarg-vulkan-1.3.290-jammy.list >/dev/null
    wget "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb"
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    packages=(
      build-essential ccache clang-19 clang-tools-19 cuda-compiler-11-8 libcublas-dev-11-8
      libnvidia-compute-550-server libstdc++-12-dev vulkan-sdk
    )
    sudo apt-get update
    sudo apt-get install -y "${packages[@]}"
    pyenv global 3.13.2
    pip install setuptools wheel cmake ninja

jobs:
  # work around CircleCI-Public/path-filtering-orb#20
  noop:
    docker:
      - image: cimg/base:stable
    steps:
      - run: "true"

  validate-commit-on-main:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run:
          name: Verify that commit is on the main branch
          command: git merge-base --is-ancestor HEAD main

  build-offline-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-macos-
      - run:
          <<: *job-macos-install-deps
      - run:
          name: Install Rosetta
          command: softwareupdate --install-rosetta --agree-to-license  # needed for QtIFW
      - run:
          name: Installing Qt
          command: |
            wget "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-macOS-x64-4.8.1.dmg"
            hdiutil attach qt-online-installer-macOS-x64-4.8.1.dmg
            /Volumes/qt-online-installer-macOS-x64-4.8.1/qt-online-installer-macOS-x64-4.8.1.app/Contents/MacOS/qt-online-installer-macOS-x64-4.8.1 \
              --no-force-installations --no-default-installations --no-size-checking --default-answer \
              --accept-licenses --confirm-command --accept-obligations --email "$QT_EMAIL" --password "$QT_PASSWORD" \
              install \
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.clang_64 qt.qt6.682.addons.qt5compat \
              extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver
            hdiutil detach /Volumes/qt-online-installer-macOS-x64-4.8.1
      - run:
          name: Setup Keychain
          command: |
            echo $MAC_SIGNING_CERT | base64 --decode > cert.p12
            security create-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security default-keychain -s sign.keychain
            security unlock-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security import cert.p12 -k sign.keychain -P "$MAC_SIGNING_CERT_PWD" -T /usr/bin/codesign
            security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MAC_KEYCHAIN_KEY" sign.keychain
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            export PATH=$PATH:$HOME/Qt/Tools/QtInstallerFramework/4.8/bin
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake \
              -S ../gpt4all-chat -B . -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_PREFIX_PATH:PATH=~/Qt/6.8.2/macos/lib/cmake \
              -DCMAKE_MAKE_PROGRAM:FILEPATH=~/Qt/Tools/Ninja/ninja \
              -DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
              -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
              -DCMAKE_RANLIB=/usr/bin/ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DBUILD_UNIVERSAL=ON \
              -DCMAKE_OSX_DEPLOYMENT_TARGET=12.6 \
              -DGGML_METAL_MACOSX_VERSION_MIN=12.6 \
              -DMACDEPLOYQT=~/Qt/6.8.2/macos/bin/macdeployqt \
              -DGPT4ALL_OFFLINE_INSTALLER=ON \
              -DGPT4ALL_SIGN_INSTALL=ON \
              -DGPT4ALL_GEN_CPACK_CONFIG=ON
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target package
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake . -DGPT4ALL_GEN_CPACK_CONFIG=OFF
            # The 'install' step here *should* be completely unnecessary. There is absolutely no reason we should have
            # to copy all of the build artifacts to an output directory that we do not use (because we package GPT4All
            # as an installer instead).
            # However, because of the way signing is implemented in the cmake script, the *source* files are signed at
            # install time instead of the *installed* files. This side effect is the *only* way libraries that are not
            # processed by macdeployqt, such as libllmodel.so, get signed (at least, with -DBUILD_UNIVERSAL=ON).
            # Also, we have to run this as a *separate* step. Telling cmake to run both targets in one command causes it
            # to execute them in parallel, since it is not aware of the dependency of the package target on the install
            # target.
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target install
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target package
            ccache -s
            mkdir upload
            cp gpt4all-installer-* upload
      # persist the unsigned installer
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-macos-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      # add workspace so signing jobs can connect & obtain dmg
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-offline-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      # attach to a workspace containing unsigned dmg
      - attach_workspace:
          at: build
      - run:
          name: "Setup Keychain"
          command: |
            echo $MAC_SIGNING_CERT | base64 --decode > cert.p12
            security create-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security default-keychain -s sign.keychain
            security unlock-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security import cert.p12 -k sign.keychain -P "$MAC_SIGNING_CERT_PWD" -T /usr/bin/codesign
            security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MAC_KEYCHAIN_KEY" sign.keychain
            rm cert.p12
      - run:
          name: "Sign App Bundle"
          command: |
            python3 -m pip install click
            python3 gpt4all-chat/cmake/sign_dmg.py --input-dmg build/upload/gpt4all-installer-darwin.dmg --output-dmg build/upload/gpt4all-installer-darwin-signed.dmg --signing-identity "$MAC_SIGNING_CERT_NAME"
      - run:
          name: "Sign DMG"
          command: |
            codesign --options runtime --timestamp -s "$MAC_SIGNING_CERT_NAME" build/upload/gpt4all-installer-darwin-signed.dmg
      # add workspace so signing jobs can connect & obtain dmg
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  notarize-offline-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: "Notarize"
          command: |
            xcrun notarytool submit build/upload/gpt4all-installer-darwin-signed.dmg --apple-id "$MAC_NOTARIZATION_ID" --team-id "$MAC_NOTARIZATION_TID" --password "$MAC_NOTARIZATION_KEY" --wait | tee notarize_log.txt
      - run:
          name: "Report Notarization Failure"
          command: |
            NID=`python3 .circleci/grab_notary_id.py notarize_log.txt` && export NID
            xcrun notarytool log $NID --keychain-profile "notary-profile"
            exit 1
          when: on_fail
      - run:
          name: "Staple"
          command: |
            xcrun stapler staple build/upload/gpt4all-installer-darwin-signed.dmg
      - store_artifacts:
          path: build/upload
      - run:
          name: Install Rosetta
          command: softwareupdate --install-rosetta --agree-to-license  # needed for QtIFW
      - run:
          name: Test installation and verify that it is signed
          command: |
            set -e
            hdiutil attach build/upload/gpt4all-installer-darwin-signed.dmg
            codesign --verify --deep --verbose /Volumes/gpt4all-installer-darwin/gpt4all-installer-darwin.app
            /Volumes/gpt4all-installer-darwin/gpt4all-installer-darwin.app/Contents/MacOS/gpt4all-installer-darwin \
              --no-size-checking --default-answer --accept-licenses --confirm-command \
              install gpt4all
            codesign --verify --deep --verbose /Applications/gpt4all/bin/gpt4all.app
            codesign --verify --deep --verbose /Applications/gpt4all/maintenancetool.app
            hdiutil detach /Volumes/gpt4all-installer-darwin

  build-online-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-macos-
      - run:
          <<: *job-macos-install-deps
      - run:
          name: Install Rosetta
          command: softwareupdate --install-rosetta --agree-to-license  # needed for QtIFW
      - run:
          name: Installing Qt
          command: |
            wget "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-macOS-x64-4.8.1.dmg"
            hdiutil attach qt-online-installer-macOS-x64-4.8.1.dmg
            /Volumes/qt-online-installer-macOS-x64-4.8.1/qt-online-installer-macOS-x64-4.8.1.app/Contents/MacOS/qt-online-installer-macOS-x64-4.8.1 \
              --no-force-installations --no-default-installations --no-size-checking --default-answer \
              --accept-licenses --confirm-command --accept-obligations --email "$QT_EMAIL" --password "$QT_PASSWORD" \
              install \
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.clang_64 qt.qt6.682.addons.qt5compat \
              extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver
            hdiutil detach /Volumes/qt-online-installer-macOS-x64-4.8.1
      - run:
          name: Setup Keychain
          command: |
            echo $MAC_SIGNING_CERT | base64 --decode > cert.p12
            security create-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security default-keychain -s sign.keychain
            security unlock-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security import cert.p12 -k sign.keychain -P "$MAC_SIGNING_CERT_PWD" -T /usr/bin/codesign
            security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MAC_KEYCHAIN_KEY" sign.keychain
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            export PATH=$PATH:$HOME/Qt/Tools/QtInstallerFramework/4.8/bin
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake \
              -S ../gpt4all-chat -B . -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_PREFIX_PATH:PATH=~/Qt/6.8.2/macos/lib/cmake \
              -DCMAKE_MAKE_PROGRAM:FILEPATH=~/Qt/Tools/Ninja/ninja \
              -DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
              -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
              -DCMAKE_RANLIB=/usr/bin/ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DBUILD_UNIVERSAL=ON \
              -DCMAKE_OSX_DEPLOYMENT_TARGET=12.6 \
              -DGGML_METAL_MACOSX_VERSION_MIN=12.6 \
              -DMACDEPLOYQT=~/Qt/6.8.2/macos/bin/macdeployqt \
              -DGPT4ALL_OFFLINE_INSTALLER=OFF \
              -DGPT4ALL_SIGN_INSTALL=ON \
              -DGPT4ALL_GEN_CPACK_CONFIG=ON
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target package
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake . -DGPT4ALL_GEN_CPACK_CONFIG=OFF
            # See comment above related to the 'install' target.
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target install
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build . --target package
            ccache -s
            mkdir upload
            cp gpt4all-installer-* upload
            tar -cvzf upload/repository.tar.gz -C _CPack_Packages/Darwin/IFW/gpt4all-installer-darwin repository
      # persist the unsigned installer
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-macos-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      # add workspace so signing jobs can connect & obtain dmg
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-online-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      # attach to a workspace containing unsigned dmg
      - attach_workspace:
          at: build
      - run:
          name: "Setup Keychain"
          command: |
            echo $MAC_SIGNING_CERT | base64 --decode > cert.p12
            security create-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security default-keychain -s sign.keychain
            security unlock-keychain -p "$MAC_KEYCHAIN_KEY" sign.keychain
            security import cert.p12 -k sign.keychain -P "$MAC_SIGNING_CERT_PWD" -T /usr/bin/codesign
            security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MAC_KEYCHAIN_KEY" sign.keychain
            rm cert.p12
      - run:
          name: "Sign App Bundle"
          command: |
            python3 -m pip install click
            python3 gpt4all-chat/cmake/sign_dmg.py --input-dmg build/upload/gpt4all-installer-darwin.dmg --output-dmg build/upload/gpt4all-installer-darwin-signed.dmg --signing-identity "$MAC_SIGNING_CERT_NAME"
      - run:
          name: "Sign DMG"
          command: |
            codesign --options runtime --timestamp -s "$MAC_SIGNING_CERT_NAME" build/upload/gpt4all-installer-darwin-signed.dmg
      # add workspace so signing jobs can connect & obtain dmg
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  notarize-online-chat-installer-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: "Notarize"
          command: |
            xcrun notarytool submit build/upload/gpt4all-installer-darwin-signed.dmg --apple-id "$MAC_NOTARIZATION_ID" --team-id "$MAC_NOTARIZATION_TID" --password "$MAC_NOTARIZATION_KEY" --wait | tee notarize_log.txt
      - run:
          name: "Report Notarization Failure"
          command: |
            NID=`python3 .circleci/grab_notary_id.py notarize_log.txt` && export NID
            xcrun notarytool log $NID --keychain-profile "notary-profile"
            exit 1
          when: on_fail
      - run:
          name: "Staple"
          command: |
            xcrun stapler staple build/upload/gpt4all-installer-darwin-signed.dmg
      - store_artifacts:
          path: build/upload
      - run:
          name: Install Rosetta
          command: softwareupdate --install-rosetta --agree-to-license  # needed for QtIFW
      - run:
          name: Test installation and verify that it is signed
          command: |
            set -e
            hdiutil attach build/upload/gpt4all-installer-darwin-signed.dmg
            codesign --verify --deep --verbose /Volumes/gpt4all-installer-darwin/gpt4all-installer-darwin.app
            tar -xf build/upload/repository.tar.gz
            /Volumes/gpt4all-installer-darwin/gpt4all-installer-darwin.app/Contents/MacOS/gpt4all-installer-darwin \
              --no-size-checking --default-answer --accept-licenses --confirm-command --set-temp-repository repository \
              install gpt4all
            codesign --verify --deep --verbose /Applications/gpt4all/bin/gpt4all.app
            codesign --verify --deep --verbose /Applications/gpt4all/maintenancetool.app
            hdiutil detach /Volumes/gpt4all-installer-darwin

  build-offline-chat-installer-linux:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-linux-amd64-
      - run:
          <<: *job-linux-install-chat-deps
      - run:
          name: Build linuxdeployqt
          command: |
            git clone https://github.com/nomic-ai/linuxdeployqt
            cd linuxdeployqt && qmake && sudo make install
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            set -eo pipefail
            export CMAKE_PREFIX_PATH=~/Qt/6.8.2/gcc_64/lib/cmake
            export PATH=$PATH:$HOME/Qt/Tools/QtInstallerFramework/4.8/bin
            export PATH=$PATH:/usr/local/cuda/bin
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            mkdir upload
            ~/Qt/Tools/CMake/bin/cmake \
              -S ../gpt4all-chat -B . \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=clang-19 \
              -DCMAKE_CXX_COMPILER=clang++-19 \
              -DCMAKE_CXX_COMPILER_AR=ar \
              -DCMAKE_CXX_COMPILER_RANLIB=ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON \
              -DGPT4ALL_OFFLINE_INSTALLER=ON
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target all
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target install
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target package
            ccache -s
            cp gpt4all-installer-* upload
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-linux-amd64-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - run:
          name: Test installation
          command: |
            mkdir ~/Desktop
            build/upload/gpt4all-installer-linux.run --no-size-checking --default-answer --accept-licenses \
              --confirm-command \
              install gpt4all

  build-online-chat-installer-linux:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-linux-amd64-
      - run:
          <<: *job-linux-install-chat-deps
      - run:
          name: Build linuxdeployqt
          command: |
            git clone https://github.com/nomic-ai/linuxdeployqt
            cd linuxdeployqt && qmake && sudo make install
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            set -eo pipefail
            export CMAKE_PREFIX_PATH=~/Qt/6.8.2/gcc_64/lib/cmake
            export PATH=$PATH:$HOME/Qt/Tools/QtInstallerFramework/4.8/bin
            export PATH=$PATH:/usr/local/cuda/bin
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            mkdir upload
            ~/Qt/Tools/CMake/bin/cmake \
              -S ../gpt4all-chat -B . \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=clang-19 \
              -DCMAKE_CXX_COMPILER=clang++-19 \
              -DCMAKE_CXX_COMPILER_AR=ar \
              -DCMAKE_CXX_COMPILER_RANLIB=ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON \
              -DGPT4ALL_OFFLINE_INSTALLER=OFF
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target all
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target install
            ~/Qt/Tools/CMake/bin/cmake --build . -j$(nproc) --target package
            ccache -s
            cp gpt4all-installer-* upload
            tar -cvzf upload/repository.tar.gz -C _CPack_Packages/Linux/IFW/gpt4all-installer-linux repository
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-linux-amd64-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - run:
          name: Test installation
          command: |
            mkdir ~/Desktop
            build/upload/gpt4all-installer-linux.run --no-size-checking --default-answer --accept-licenses \
              --confirm-command \
              --set-temp-repository build/_CPack_Packages/Linux/IFW/gpt4all-installer-linux/repository \
              install gpt4all

  build-offline-chat-installer-windows:
    machine:
      # we use 2024.04.01 because nvcc complains about the MSVC ver if we use anything newer
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-amd64-
      - run:
          name: Install dependencies
          command: choco install -y ccache wget
      - run:
          name: Installing Qt
          command: |
            wget.exe "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-windows-x64-4.8.1.exe"
            & .\qt-online-installer-windows-x64-4.8.1.exe --no-force-installations --no-default-installations `
              --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations `
              --email "${Env:QT_EMAIL}" --password "${Env:QT_PASSWORD}" install `
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.win64_msvc2022_64 qt.qt6.682.addons.qt5compat `
              qt.qt6.682.debug_info extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver
      - run:
          name: Install VulkanSDK
          command: |
            wget.exe "https://sdk.lunarg.com/sdk/download/1.3.261.1/windows/VulkanSDK-1.3.261.1-Installer.exe"
            .\VulkanSDK-1.3.261.1-Installer.exe --accept-licenses --default-answer --confirm-command install
      - run:
          name: Install CUDA Toolkit
          command: |
            wget.exe "https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe"
            .\cuda_11.8.0_windows_network.exe -s cudart_11.8 nvcc_11.8 cublas_11.8 cublas_dev_11.8
      - run:
          name: "Install Dotnet 8"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
            $Env:PATH = "${Env:PATH};C:\VulkanSDK\1.3.261.1\bin"
            $Env:PATH = "${Env:PATH};C:\Qt\Tools\QtInstallerFramework\4.8\bin"
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            & "C:\Qt\Tools\CMake_64\bin\cmake.exe" `
              -S ..\gpt4all-chat -B . -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              "-DCMAKE_PREFIX_PATH:PATH=C:\Qt\6.8.2\msvc2022_64" `
              "-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Qt\Tools\Ninja\ninja.exe" `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache `
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON `
              -DGPT4ALL_OFFLINE_INSTALLER=ON
            & "C:\Qt\Tools\Ninja\ninja.exe"
            & "C:\Qt\Tools\Ninja\ninja.exe" install
            & "C:\Qt\Tools\Ninja\ninja.exe" package
            ccache -s
            mkdir upload
            copy gpt4all-installer-win64.exe upload
      - store_artifacts:
          path: build/upload
      # add workspace so signing jobs can connect & obtain dmg
      - save_cache:
          key: ccache-gpt4all-win-amd64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-offline-chat-installer-windows:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: Install dependencies
          command: choco install -y wget
      - run:
          name: "Install Dotnet 8 && Azure Sign Tool"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: "Sign Windows Installer With AST"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            AzureSignTool.exe sign -du "https://gpt4all.io/index.html" -kvu https://gpt4all.vault.azure.net -kvi "$Env:AZSignGUID" -kvs "$Env:AZSignPWD" -kvc "$Env:AZSignCertName" -kvt
"$Env:AZSignTID" -tr http://timestamp.digicert.com -v "$($(Get-Location).Path)\build\upload\gpt4all-installer-win64.exe" - store_artifacts: path: build/upload - run: name: Test installation command: | build\upload\gpt4all-installer-win64.exe --no-size-checking --default-answer --accept-licenses ` --confirm-command ` install gpt4all build-online-chat-installer-windows: machine: image: windows-server-2022-gui:2024.04.1 resource_class: windows.large shell: powershell.exe -ExecutionPolicy Bypass steps: - checkout - run: name: Update Submodules command: | git submodule sync git submodule update --init --recursive - restore_cache: keys: - ccache-gpt4all-win-amd64- - run: name: Install dependencies command: choco install -y ccache wget - run: name: Installing Qt command: | wget.exe "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-windows-x64-4.8.1.exe" & .\qt-online-installer-windows-x64-4.8.1.exe --no-force-installations --no-default-installations ` --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations ` --email "${Env:QT_EMAIL}" --password "${Env:QT_PASSWORD}" install ` qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.win64_msvc2022_64 qt.qt6.682.addons.qt5compat ` qt.qt6.682.debug_info extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver - run: name: Install VulkanSDK command: | wget.exe "https://sdk.lunarg.com/sdk/download/1.3.261.1/windows/VulkanSDK-1.3.261.1-Installer.exe" .\VulkanSDK-1.3.261.1-Installer.exe --accept-licenses --default-answer --confirm-command install - run: name: Install CUDA Toolkit command: | wget.exe "https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe" .\cuda_11.8.0_windows_network.exe -s cudart_11.8 nvcc_11.8 cublas_11.8 cublas_dev_11.8 - run: name: "Install Dotnet 8" command: | mkdir dotnet cd dotnet 
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
      - run:
          name: "Setup Azure SignTool"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
            $Env:PATH = "${Env:PATH};C:\VulkanSDK\1.3.261.1\bin"
            $Env:PATH = "${Env:PATH};C:\Qt\Tools\QtInstallerFramework\4.8\bin"
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            & "C:\Qt\Tools\CMake_64\bin\cmake.exe" `
              -S ..\gpt4all-chat -B . -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              "-DCMAKE_PREFIX_PATH:PATH=C:\Qt\6.8.2\msvc2022_64" `
              "-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Qt\Tools\Ninja\ninja.exe" `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache `
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON `
              -DGPT4ALL_OFFLINE_INSTALLER=OFF
            & "C:\Qt\Tools\Ninja\ninja.exe"
            & "C:\Qt\Tools\Ninja\ninja.exe" install
            & "C:\Qt\Tools\Ninja\ninja.exe" package
            ccache -s
            mkdir upload
            copy gpt4all-installer-win64.exe upload
            Set-Location -Path "_CPack_Packages/win64/IFW/gpt4all-installer-win64"
            Compress-Archive -Path 'repository' -DestinationPath '..\..\..\..\upload\repository.zip'
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-win-amd64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      # add workspace so signing jobs can connect & obtain the installer
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-online-chat-installer-windows:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: Install dependencies
          command: choco install -y wget
      - run:
          name: "Install Dotnet 8"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
      - run:
          name: "Setup Azure SignTool"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: "Sign Windows Installer With AST"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            AzureSignTool.exe sign -du "https://gpt4all.io/index.html" -kvu https://gpt4all.vault.azure.net -kvi "$Env:AZSignGUID" -kvs "$Env:AZSignPWD" -kvc "$Env:AZSignCertName" -kvt "$Env:AZSignTID" -tr http://timestamp.digicert.com -v "$($(Get-Location).Path)/build/upload/gpt4all-installer-win64.exe"
      - store_artifacts:
          path: build/upload
      - run:
          name: Test installation
          command: |
            Expand-Archive -LiteralPath build\upload\repository.zip -DestinationPath .
            build\upload\gpt4all-installer-win64.exe --no-size-checking --default-answer --accept-licenses `
              --confirm-command --set-temp-repository repository `
              install gpt4all

  build-offline-chat-installer-windows-arm:
    machine:
      # we use 2024.04.1 because nvcc complains about the MSVC version if we use anything newer
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-aarch64-
      - run:
          name: Install dependencies
          command: choco install -y ccache wget
      - run:
          name: Installing Qt
          command: |
            wget.exe "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-windows-x64-4.8.1.exe"
            & .\qt-online-installer-windows-x64-4.8.1.exe --no-force-installations --no-default-installations `
              --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations `
              --email "${Env:QT_EMAIL}" --password "${Env:QT_PASSWORD}" install `
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.win64_msvc2022_64 `
              qt.qt6.682.win64_msvc2022_arm64_cross_compiled qt.qt6.682.addons.qt5compat qt.qt6.682.debug_info `
              qt.qt6.682.addons.qthttpserver
      - run:
          name: "Install Dotnet 8"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -Arch arm64 -HostArch amd64 -DevCmdArguments '-no_logo'
            $Env:PATH = "${Env:PATH};C:\Qt\Tools\QtInstallerFramework\4.8\bin"
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            & "C:\Qt\Tools\CMake_64\bin\cmake.exe" `
              -S ..\gpt4all-chat -B . -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              "-DCMAKE_PREFIX_PATH:PATH=C:\Qt\6.8.2\msvc2022_arm64" `
              "-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Qt\Tools\Ninja\ninja.exe" `
              "-DCMAKE_TOOLCHAIN_FILE=C:\Qt\6.8.2\msvc2022_arm64\lib\cmake\Qt6\qt.toolchain.cmake" `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DLLMODEL_CUDA=OFF `
              -DLLMODEL_KOMPUTE=OFF `
              "-DWINDEPLOYQT=C:\Qt\6.8.2\msvc2022_64\bin\windeployqt.exe;--qtpaths;C:\Qt\6.8.2\msvc2022_arm64\bin\qtpaths.bat" `
              -DGPT4ALL_TEST=OFF `
              -DGPT4ALL_OFFLINE_INSTALLER=ON
            & "C:\Qt\Tools\Ninja\ninja.exe"
            & "C:\Qt\Tools\Ninja\ninja.exe" install
            & "C:\Qt\Tools\Ninja\ninja.exe" package
            ccache -s
            mkdir upload
            copy gpt4all-installer-win64-arm.exe upload
      - store_artifacts:
          path: build/upload
      # add workspace so signing jobs can connect & obtain the installer
      - save_cache:
          key: ccache-gpt4all-win-aarch64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-offline-chat-installer-windows-arm:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: Install dependencies
          command: choco install -y wget
      - run:
          name: "Install Dotnet 8 && Azure Sign Tool"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: "Sign Windows Installer With AST"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            AzureSignTool.exe sign -du "https://gpt4all.io/index.html" -kvu https://gpt4all.vault.azure.net -kvi "$Env:AZSignGUID" -kvs "$Env:AZSignPWD" -kvc "$Env:AZSignCertName" -kvt "$Env:AZSignTID" -tr http://timestamp.digicert.com -v "$($(Get-Location).Path)\build\upload\gpt4all-installer-win64-arm.exe"
      - store_artifacts:
          path: build/upload
      - run:
          name: Test installation
          command: |
            build\upload\gpt4all-installer-win64-arm.exe --no-size-checking --default-answer --accept-licenses `
              --confirm-command `
              install gpt4all

  build-online-chat-installer-windows-arm:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-aarch64-
      - run:
          name: Install dependencies
          command: choco install -y ccache wget
      - run:
          name: Installing Qt
          command: |
            wget.exe "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-windows-x64-4.8.1.exe"
            & .\qt-online-installer-windows-x64-4.8.1.exe --no-force-installations --no-default-installations `
              --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations `
              --email "${Env:QT_EMAIL}" --password "${Env:QT_PASSWORD}" install `
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.win64_msvc2022_64 `
              qt.qt6.682.win64_msvc2022_arm64_cross_compiled qt.qt6.682.addons.qt5compat qt.qt6.682.debug_info `
              qt.qt6.682.addons.qthttpserver
      - run:
          name: "Install Dotnet 8"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
      - run:
          name: "Setup Azure SignTool"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -Arch arm64 -HostArch amd64 -DevCmdArguments '-no_logo'
            $Env:PATH = "${Env:PATH};C:\Qt\Tools\QtInstallerFramework\4.8\bin"
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            mkdir build
            cd build
            & "C:\Qt\Tools\CMake_64\bin\cmake.exe" `
              -S ..\gpt4all-chat -B . -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              "-DCMAKE_PREFIX_PATH:PATH=C:\Qt\6.8.2\msvc2022_arm64" `
              "-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Qt\Tools\Ninja\ninja.exe" `
              "-DCMAKE_TOOLCHAIN_FILE=C:\Qt\6.8.2\msvc2022_arm64\lib\cmake\Qt6\qt.toolchain.cmake" `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DLLMODEL_CUDA=OFF `
              -DLLMODEL_KOMPUTE=OFF `
              "-DWINDEPLOYQT=C:\Qt\6.8.2\msvc2022_64\bin\windeployqt.exe;--qtpaths;C:\Qt\6.8.2\msvc2022_arm64\bin\qtpaths.bat" `
              -DGPT4ALL_TEST=OFF `
              -DGPT4ALL_OFFLINE_INSTALLER=OFF
            & "C:\Qt\Tools\Ninja\ninja.exe"
            & "C:\Qt\Tools\Ninja\ninja.exe" install
            & "C:\Qt\Tools\Ninja\ninja.exe" package
            ccache -s
            mkdir upload
            copy gpt4all-installer-win64-arm.exe upload
            Set-Location -Path "_CPack_Packages/win64/IFW/gpt4all-installer-win64-arm"
            Compress-Archive -Path 'repository' -DestinationPath '..\..\..\..\upload\repository.zip'
      - store_artifacts:
          path: build/upload
      - save_cache:
          key: ccache-gpt4all-win-aarch64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      # add workspace so signing jobs can connect & obtain the installer
      - persist_to_workspace:
          root: build
          # specify path to only include components we want to persist
          # across builds
          paths:
            - upload

  sign-online-chat-installer-windows-arm:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - attach_workspace:
          at: build
      - run:
          name: Install dependencies
          command: choco install -y wget
      - run:
          name: "Install Dotnet 8"
          command: |
            mkdir dotnet
            cd dotnet
            $dotnet_url="https://download.visualstudio.microsoft.com/download/pr/5af098e1-e433-4fda-84af-3f54fd27c108/6bd1c6e48e64e64871957289023ca590/dotnet-sdk-8.0.302-win-x64.zip"
            wget.exe "$dotnet_url"
            Expand-Archive -LiteralPath .\dotnet-sdk-8.0.302-win-x64.zip
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
      - run:
          name: "Setup Azure SignTool"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            $Env:DOTNET_SKIP_FIRST_TIME_EXPERIENCE=$true
            dotnet tool install --global AzureSignTool
      - run:
          name: "Sign Windows Installer With AST"
          command: |
            $Env:DOTNET_ROOT="$($(Get-Location).Path)\dotnet\dotnet-sdk-8.0.302-win-x64"
            $Env:PATH="$Env:DOTNET_ROOT;$Env:PATH"
            AzureSignTool.exe sign -du "https://gpt4all.io/index.html" -kvu https://gpt4all.vault.azure.net -kvi "$Env:AZSignGUID" -kvs "$Env:AZSignPWD" -kvc "$Env:AZSignCertName" -kvt "$Env:AZSignTID" -tr http://timestamp.digicert.com -v "$($(Get-Location).Path)/build/upload/gpt4all-installer-win64-arm.exe"
      - store_artifacts:
          path: build/upload
      - run:
          name: Test installation
          command: |
            Expand-Archive -LiteralPath build\upload\repository.zip -DestinationPath .
            build\upload\gpt4all-installer-win64-arm.exe --no-size-checking --default-answer --accept-licenses `
              --confirm-command --set-temp-repository repository `
              install gpt4all

  build-gpt4all-chat-linux:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-linux-amd64-
      - run:
          <<: *job-linux-install-chat-deps
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            export CMAKE_PREFIX_PATH=~/Qt/6.8.2/gcc_64/lib/cmake
            export PATH=$PATH:/usr/local/cuda/bin
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            ~/Qt/Tools/CMake/bin/cmake \
              -S gpt4all-chat -B build \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=clang-19 \
              -DCMAKE_CXX_COMPILER=clang++-19 \
              -DCMAKE_CXX_COMPILER_AR=ar \
              -DCMAKE_CXX_COMPILER_RANLIB=ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
            ~/Qt/Tools/CMake/bin/cmake --build build -j$(nproc) --target all
            ccache -s
      - save_cache:
          key: ccache-gpt4all-linux-amd64-{{ epoch }}
          when: always
          paths:
            - ../.ccache

  build-gpt4all-chat-windows:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-amd64-
      - run:
          name: Install dependencies
          command: choco install -y ccache wget
      - run:
          name: Installing Qt
          command: |
            wget.exe "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-windows-x64-4.8.1.exe"
            & .\qt-online-installer-windows-x64-4.8.1.exe --no-force-installations --no-default-installations `
              --no-size-checking --default-answer --accept-licenses --confirm-command --accept-obligations `
              --email "${Env:QT_EMAIL}" --password "${Env:QT_PASSWORD}" install `
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.win64_msvc2022_64 qt.qt6.682.addons.qt5compat `
              qt.qt6.682.debug_info extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver
      - run:
          name: Install VulkanSDK
          command: |
            wget.exe "https://sdk.lunarg.com/sdk/download/1.3.261.1/windows/VulkanSDK-1.3.261.1-Installer.exe"
            .\VulkanSDK-1.3.261.1-Installer.exe --accept-licenses --default-answer --confirm-command install
      - run:
          name: Install CUDA Toolkit
          command: |
            wget.exe "https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe"
            .\cuda_11.8.0_windows_network.exe -s cudart_11.8 nvcc_11.8 cublas_11.8 cublas_dev_11.8
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
            $Env:PATH = "${Env:PATH};C:\VulkanSDK\1.3.261.1\bin"
            $Env:VULKAN_SDK = "C:\VulkanSDK\1.3.261.1"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            & "C:\Qt\Tools\CMake_64\bin\cmake.exe" `
              -S gpt4all-chat -B build -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              "-DCMAKE_PREFIX_PATH:PATH=C:\Qt\6.8.2\msvc2022_64" `
              "-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Qt\Tools\Ninja\ninja.exe" `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache `
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
            & "C:\Qt\Tools\Ninja\ninja.exe" -C build
            ccache -s
      - save_cache:
          key: ccache-gpt4all-win-amd64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache

  build-gpt4all-chat-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-macos-
      - run:
          <<: *job-macos-install-deps
      - run:
          name: Install Rosetta
          command: softwareupdate --install-rosetta --agree-to-license  # needed for QtIFW
      - run:
          name: Installing Qt
          command: |
            wget "https://qt.mirror.constant.com/archive/online_installers/4.8/qt-online-installer-macOS-x64-4.8.1.dmg"
            hdiutil attach qt-online-installer-macOS-x64-4.8.1.dmg
            /Volumes/qt-online-installer-macOS-x64-4.8.1/qt-online-installer-macOS-x64-4.8.1.app/Contents/MacOS/qt-online-installer-macOS-x64-4.8.1 \
              --no-force-installations --no-default-installations --no-size-checking --default-answer \
              --accept-licenses --confirm-command --accept-obligations --email "$QT_EMAIL" --password "$QT_PASSWORD" \
              install \
              qt.tools.cmake qt.tools.ifw.48 qt.tools.ninja qt.qt6.682.clang_64 qt.qt6.682.addons.qt5compat \
              extensions.qtpdf.682 qt.qt6.682.addons.qthttpserver
            hdiutil detach /Volumes/qt-online-installer-macOS-x64-4.8.1
      - run:
          name: Build
          no_output_timeout: 30m
          command: |
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake \
              -S gpt4all-chat -B build -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_PREFIX_PATH:PATH=~/Qt/6.8.2/macos/lib/cmake \
              -DCMAKE_MAKE_PROGRAM:FILEPATH=~/Qt/Tools/Ninja/ninja \
              -DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
              -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
              -DCMAKE_RANLIB=/usr/bin/ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DBUILD_UNIVERSAL=ON \
              -DCMAKE_OSX_DEPLOYMENT_TARGET=12.6 \
              -DGGML_METAL_MACOSX_VERSION_MIN=12.6
            ~/Qt/Tools/CMake/CMake.app/Contents/bin/cmake --build build --target all
            ccache -s
      - save_cache:
          key: ccache-gpt4all-macos-{{ epoch }}
          when: always
          paths:
            - ../.ccache

  build-ts-docs:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - node/install:
          node-version: "18.16"
      - run: node --version
      - run: corepack enable
      - node/install-packages:
          pkg-manager: npm
          app-dir: gpt4all-bindings/typescript
          override-ci-command: npm install --ignore-scripts
      - run:
          name: build docs ts yo
          command: |
            cd gpt4all-bindings/typescript
            npm run docs:build

  deploy-docs:
    docker:
      - image: circleci/python:3.8
    steps:
      - checkout
      - run:
          name: Install dependencies
          command: |
            sudo apt-get update
            sudo apt-get -y install python3 python3-pip
            sudo pip3 install awscli --upgrade
            sudo pip3 install mkdocs mkdocs-material mkautodoc 'mkdocstrings[python]' markdown-captions pillow cairosvg
      - run:
          name: Make Documentation
          command: |
            cd gpt4all-bindings/python
            mkdocs build
      - run:
          name: Deploy Documentation
          command: |
            cd gpt4all-bindings/python
            aws s3 sync --delete site/ s3://docs.gpt4all.io/
      - run:
          name: Invalidate docs.gpt4all.io cloudfront
          command: aws cloudfront create-invalidation --distribution-id E1STQOW63QL2OH --paths "/*"

  build-py-linux:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - restore_cache:
          keys:
            - ccache-gpt4all-linux-amd64-
      - run:
          <<: *job-linux-install-backend-deps
      - run:
          name: Build C library
          no_output_timeout: 30m
          command: |
            export PATH=$PATH:/usr/local/cuda/bin
            git submodule update --init --recursive
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            cmake -B build -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=clang-19 \
              -DCMAKE_CXX_COMPILER=clang++-19 \
              -DCMAKE_CXX_COMPILER_AR=ar \
              -DCMAKE_CXX_COMPILER_RANLIB=ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON \
              -DCMAKE_CUDA_ARCHITECTURES='50-virtual;52-virtual;61-virtual;70-virtual;75-virtual'
            cmake --build build -j$(nproc)
            ccache -s
      - run:
          name: Build wheel
          command: |
            cd gpt4all-bindings/python/
            python setup.py bdist_wheel --plat-name=manylinux1_x86_64
      - store_artifacts:
          path: gpt4all-bindings/python/dist
      - save_cache:
          key: ccache-gpt4all-linux-amd64-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - persist_to_workspace:
          root: gpt4all-bindings/python/dist
          paths:
            - "*.whl"

  build-py-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - restore_cache:
          keys:
            - ccache-gpt4all-macos-
      - run:
          <<: *job-macos-install-deps
      - run:
          name: Install dependencies
          command: |
            pip install setuptools wheel cmake
      - run:
          name: Build C library
          no_output_timeout: 30m
          command: |
            git submodule update --init  # don't use --recursive because macOS doesn't use Kompute
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            cmake -B build \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
              -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
              -DCMAKE_RANLIB=/usr/bin/ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DBUILD_UNIVERSAL=ON \
              -DCMAKE_OSX_DEPLOYMENT_TARGET=12.6 \
              -DGGML_METAL_MACOSX_VERSION_MIN=12.6
            cmake --build build --parallel
            ccache -s
      - run:
          name: Build wheel
          command: |
            cd gpt4all-bindings/python
            python setup.py bdist_wheel --plat-name=macosx_10_15_universal2
      - store_artifacts:
          path: gpt4all-bindings/python/dist
      - save_cache:
          key: ccache-gpt4all-macos-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - persist_to_workspace:
          root: gpt4all-bindings/python/dist
          paths:
            - "*.whl"
  build-py-windows:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-amd64-
      - run:
          name: Install dependencies
          command: choco install -y ccache cmake ninja wget --installargs 'ADD_CMAKE_TO_PATH=System'
      - run:
          name: Install VulkanSDK
          command: |
            wget.exe "https://sdk.lunarg.com/sdk/download/1.3.261.1/windows/VulkanSDK-1.3.261.1-Installer.exe"
            .\VulkanSDK-1.3.261.1-Installer.exe --accept-licenses --default-answer --confirm-command install
      - run:
          name: Install CUDA Toolkit
          command: |
            wget.exe "https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe"
            .\cuda_11.8.0_windows_network.exe -s cudart_11.8 nvcc_11.8 cublas_11.8 cublas_dev_11.8
      - run:
          name: Install Python dependencies
          command: pip install setuptools wheel cmake
      - run:
          name: Build C library
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
            $Env:PATH += ";C:\VulkanSDK\1.3.261.1\bin"
            $Env:VULKAN_SDK = "C:\VulkanSDK\1.3.261.1"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            cmake -B build -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache `
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON `
              -DCMAKE_CUDA_ARCHITECTURES='50-virtual;52-virtual;61-virtual;70-virtual;75-virtual'
            cmake --build build --parallel
            ccache -s
      - run:
          name: Build wheel
          command: |
            cd gpt4all-bindings/python
            python setup.py bdist_wheel --plat-name=win_amd64
      - store_artifacts:
          path: gpt4all-bindings/python/dist
      - save_cache:
          key: ccache-gpt4all-win-amd64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      - persist_to_workspace:
          root: gpt4all-bindings/python/dist
          paths:
            - "*.whl"

  deploy-wheels:
    docker:
      - image: circleci/python:3.8
    steps:
      - setup_remote_docker
      - attach_workspace:
          at: /tmp/workspace
      - run:
          name: Install dependencies
          command: |
            sudo apt-get update
            sudo apt-get install -y build-essential cmake
            pip install setuptools wheel twine
      - run:
          name: Upload Python package
          command: |
            twine upload /tmp/workspace/*.whl --username __token__ --password $PYPI_CRED
      - store_artifacts:
          path: /tmp/workspace

  build-bindings-backend-linux:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-linux-amd64-
      - run:
          <<: *job-linux-install-backend-deps
      - run:
          name: Build Libraries
          no_output_timeout: 30m
          command: |
            export PATH=$PATH:/usr/local/cuda/bin
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            mkdir -p runtimes/build
            cd runtimes/build
            cmake ../.. -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=clang-19 \
              -DCMAKE_CXX_COMPILER=clang++-19 \
              -DCMAKE_CXX_COMPILER_AR=ar \
              -DCMAKE_CXX_COMPILER_RANLIB=ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
            cmake --build . -j$(nproc)
            ccache -s
            mkdir ../linux-x64
            cp -L *.so ../linux-x64  # otherwise persist_to_workspace seems to mess up symlinks
      - save_cache:
          key: ccache-gpt4all-linux-amd64-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - runtimes/linux-x64/*.so

  build-bindings-backend-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-macos-
      - run:
          <<: *job-macos-install-deps
      - run:
          name: Build Libraries
          no_output_timeout: 30m
          command: |
            ccache -o "cache_dir=${PWD}/../.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            mkdir -p runtimes/build
            cd runtimes/build
            cmake ../.. \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
              -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
              -DCMAKE_RANLIB=/usr/bin/ranlib \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
              -DBUILD_UNIVERSAL=ON \
              -DCMAKE_OSX_DEPLOYMENT_TARGET=12.6 \
              -DGGML_METAL_MACOSX_VERSION_MIN=12.6
            cmake --build . --parallel
            ccache -s
            mkdir ../osx-x64
            cp -L *.dylib ../osx-x64
            cp ../../llama.cpp-mainline/*.metal ../osx-x64
            ls ../osx-x64
      - save_cache:
          key: ccache-gpt4all-macos-{{ epoch }}
          when: always
          paths:
            - ../.ccache
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - runtimes/osx-x64/*.dylib
            - runtimes/osx-x64/*.metal

  build-bindings-backend-windows:
    machine:
      image: windows-server-2022-gui:2024.04.1
    resource_class: windows.large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - run:
          name: Update Submodules
          command: |
            git submodule sync
            git submodule update --init --recursive
      - restore_cache:
          keys:
            - ccache-gpt4all-win-amd64-
      - run:
          name: Install dependencies
          command: |
            choco install -y ccache cmake ninja wget --installargs 'ADD_CMAKE_TO_PATH=System'
      - run:
          name: Install VulkanSDK
          command: |
            wget.exe "https://sdk.lunarg.com/sdk/download/1.3.261.1/windows/VulkanSDK-1.3.261.1-Installer.exe"
            .\VulkanSDK-1.3.261.1-Installer.exe --accept-licenses --default-answer --confirm-command install
      - run:
          name: Install CUDA Toolkit
          command: |
            wget.exe "https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe"
            .\cuda_11.8.0_windows_network.exe -s cudart_11.8 nvcc_11.8 cublas_11.8 cublas_dev_11.8
      - run:
          name: Build Libraries
          no_output_timeout: 30m
          command: |
            $vsInstallPath = & "C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe" -property installationpath
            Import-Module "${vsInstallPath}\Common7\Tools\Microsoft.VisualStudio.DevShell.dll"
            Enter-VsDevShell -VsInstallPath "$vsInstallPath" -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
            $Env:Path += ";C:\VulkanSDK\1.3.261.1\bin"
            $Env:VULKAN_SDK = "C:\VulkanSDK\1.3.261.1"
            ccache -o "cache_dir=${pwd}\..\.ccache" -o max_size=500M -p -z
            cd gpt4all-backend
            mkdir runtimes/win-x64_msvc
            cd runtimes/win-x64_msvc
            cmake -S ../.. -B . -G Ninja `
              -DCMAKE_BUILD_TYPE=Release `
              -DCMAKE_C_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache `
              -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache `
              -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
            cmake --build . --parallel
            ccache -s
            cp bin/Release/*.dll .
      - save_cache:
          key: ccache-gpt4all-win-amd64-{{ epoch }}
          when: always
          paths:
            - ..\.ccache
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - runtimes/win-x64_msvc/*.dll

  build-nodejs-linux:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - attach_workspace:
          at: /tmp/gpt4all-backend
      - node/install:
          install-yarn: true
          node-version: "18.16"
      - run: node --version
      - run: corepack enable
      - node/install-packages:
          app-dir: gpt4all-bindings/typescript
          pkg-manager: yarn
          override-ci-command: yarn install
      - run:
          command: |
            cd gpt4all-bindings/typescript
            yarn prebuildify -t 18.16.0 --napi
      - run:
          command: |
            mkdir -p gpt4all-backend/prebuilds/linux-x64
            mkdir -p gpt4all-backend/runtimes/linux-x64
            cp /tmp/gpt4all-backend/runtimes/linux-x64/*-*.so gpt4all-backend/runtimes/linux-x64
            cp gpt4all-bindings/typescript/prebuilds/linux-x64/*.node gpt4all-backend/prebuilds/linux-x64
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - prebuilds/linux-x64/*.node
            - runtimes/linux-x64/*-*.so

  build-nodejs-macos:
    <<: *job-macos-executor
    steps:
      - checkout
      - attach_workspace:
          at: /tmp/gpt4all-backend
      - node/install:
          install-yarn: true
          node-version: "18.16"
      - run: node --version
      - run: corepack enable
      - node/install-packages:
          app-dir: gpt4all-bindings/typescript
          pkg-manager: yarn
          override-ci-command: yarn install
      - run:
          command: |
            cd gpt4all-bindings/typescript
            yarn prebuildify -t 18.16.0 --napi
      - run:
          name: "Persisting all necessary things to workspace"
          command: |
            mkdir -p gpt4all-backend/prebuilds/darwin-x64
            mkdir -p gpt4all-backend/runtimes/darwin
            cp /tmp/gpt4all-backend/runtimes/osx-x64/*-*.* gpt4all-backend/runtimes/darwin
            cp gpt4all-bindings/typescript/prebuilds/darwin-x64/*.node gpt4all-backend/prebuilds/darwin-x64
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - prebuilds/darwin-x64/*.node
            - runtimes/darwin/*-*.*

  build-nodejs-windows:
    executor:
      name: win/default
      size: large
    shell: powershell.exe -ExecutionPolicy Bypass
    steps:
      - checkout
      - attach_workspace:
          at: /tmp/gpt4all-backend
      - run: choco install wget -y
      - run:
          command: |
            wget.exe "https://nodejs.org/dist/v18.16.0/node-v18.16.0-x86.msi" -P C:\Users\circleci\Downloads\
            MsiExec.exe /i C:\Users\circleci\Downloads\node-v18.16.0-x86.msi /qn
      - run:
          command: |
            Start-Process powershell -verb runAs -Args "-start GeneralProfile"
            nvm install 18.16.0
            nvm use 18.16.0
      - run: node --version
      - run: corepack enable
      - run:
          command: |
            npm install -g yarn
            cd gpt4all-bindings/typescript
            yarn install
      - run:
          command: |
            cd gpt4all-bindings/typescript
            yarn prebuildify -t 18.16.0 --napi
      - run:
          command: |
            mkdir -p gpt4all-backend/prebuilds/win32-x64
            mkdir -p gpt4all-backend/runtimes/win32-x64
            cp /tmp/gpt4all-backend/runtimes/win-x64_msvc/*-*.dll gpt4all-backend/runtimes/win32-x64
            cp gpt4all-bindings/typescript/prebuilds/win32-x64/*.node gpt4all-backend/prebuilds/win32-x64
      - persist_to_workspace:
          root: gpt4all-backend
          paths:
            - prebuilds/win32-x64/*.node
            - runtimes/win32-x64/*-*.dll

  deploy-npm-pkg:
    docker:
      - image: cimg/base:stable
    steps:
      - attach_workspace:
          at: /tmp/gpt4all-backend
      - checkout
      - node/install:
          install-yarn: true
          node-version: "18.16"
      - run: node --version
      - run: corepack enable
      - run:
          command: |
            cd gpt4all-bindings/typescript
            # excluding llmodel.
nodejs bindings don't need llmodel.dll mkdir -p runtimes/win32-x64/native mkdir -p prebuilds/win32-x64/ cp /tmp/gpt4all-backend/runtimes/win-x64_msvc/*-*.dll runtimes/win32-x64/native/ cp /tmp/gpt4all-backend/prebuilds/win32-x64/*.node prebuilds/win32-x64/ mkdir -p runtimes/linux-x64/native mkdir -p prebuilds/linux-x64/ cp /tmp/gpt4all-backend/runtimes/linux-x64/*-*.so runtimes/linux-x64/native/ cp /tmp/gpt4all-backend/prebuilds/linux-x64/*.node prebuilds/linux-x64/ # darwin has universal runtime libraries mkdir -p runtimes/darwin/native mkdir -p prebuilds/darwin-x64/ cp /tmp/gpt4all-backend/runtimes/darwin/*-*.* runtimes/darwin/native/ cp /tmp/gpt4all-backend/prebuilds/darwin-x64/*.node prebuilds/darwin-x64/ # Fallback build if user is not on above prebuilds mv -f binding.ci.gyp binding.gyp mkdir gpt4all-backend cd ../../gpt4all-backend mv llmodel.h llmodel.cpp llmodel_c.cpp llmodel_c.h sysinfo.h dlhandle.h ../gpt4all-bindings/typescript/gpt4all-backend/ # Test install - node/install-packages: app-dir: gpt4all-bindings/typescript pkg-manager: yarn override-ci-command: yarn install - run: command: | cd gpt4all-bindings/typescript yarn run test - run: command: | cd gpt4all-bindings/typescript npm set //registry.npmjs.org/:_authToken=$NPM_TOKEN npm publish # only run a job on the main branch job_only_main: &job_only_main filters: branches: only: main # allow a job to run on tags as well as commits job_allow_tags: &job_allow_tags filters: tags: only: - /.*/ # standard chat workflow filter workflow-when-chat-requested: &workflow-when-chat-requested when: and: - or: [ << pipeline.parameters.run-all-workflows >>, << pipeline.parameters.run-chat-workflow >> ] - not: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] workflows: version: 2 noop: when: not: or: - << pipeline.parameters.run-all-workflows >> - << pipeline.parameters.run-python-workflow >> - << pipeline.parameters.run-ts-workflow >> - << pipeline.parameters.run-chat-workflow >> - equal: [ <<
pipeline.trigger_source >>, scheduled_pipeline ] jobs: - noop schedule: # only run when scheduled by CircleCI when: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] jobs: - build-offline-chat-installer-macos: context: gpt4all - build-offline-chat-installer-windows: context: gpt4all - build-offline-chat-installer-windows-arm: context: gpt4all - build-offline-chat-installer-linux: context: gpt4all - sign-offline-chat-installer-macos: context: gpt4all requires: - build-offline-chat-installer-macos - notarize-offline-chat-installer-macos: context: gpt4all requires: - sign-offline-chat-installer-macos - sign-offline-chat-installer-windows: context: gpt4all requires: - build-offline-chat-installer-windows - sign-offline-chat-installer-windows-arm: context: gpt4all requires: - build-offline-chat-installer-windows-arm build-chat-installers-release: # only run on main branch tags that start with 'v' and a digit when: and: - matches: { pattern: '^v\d.*', value: << pipeline.git.tag >> } - not: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] jobs: - validate-commit-on-main: <<: *job_allow_tags - build-offline-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-offline-chat-installer-windows: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-offline-chat-installer-windows-arm: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-offline-chat-installer-linux: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - sign-offline-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - build-offline-chat-installer-macos - notarize-offline-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - sign-offline-chat-installer-macos - sign-offline-chat-installer-windows: <<: *job_allow_tags context: gpt4all requires: - build-offline-chat-installer-windows - sign-offline-chat-installer-windows-arm: <<: 
*job_allow_tags context: gpt4all requires: - build-offline-chat-installer-windows-arm - build-online-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-online-chat-installer-windows: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-online-chat-installer-windows-arm: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - build-online-chat-installer-linux: <<: *job_allow_tags context: gpt4all requires: - validate-commit-on-main - sign-online-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - build-online-chat-installer-macos - notarize-online-chat-installer-macos: <<: *job_allow_tags context: gpt4all requires: - sign-online-chat-installer-macos - sign-online-chat-installer-windows: <<: *job_allow_tags context: gpt4all requires: - build-online-chat-installer-windows - sign-online-chat-installer-windows-arm: <<: *job_allow_tags context: gpt4all requires: - build-online-chat-installer-windows-arm build-chat-offline-installers: <<: *workflow-when-chat-requested jobs: - build-hold: type: approval - sign-hold: type: approval - build-offline-chat-installer-macos: context: gpt4all requires: - build-hold - sign-offline-chat-installer-macos: context: gpt4all requires: - sign-hold - build-offline-chat-installer-macos - notarize-offline-chat-installer-macos: context: gpt4all requires: - sign-offline-chat-installer-macos - build-offline-chat-installer-windows: context: gpt4all requires: - build-hold - sign-offline-chat-installer-windows: context: gpt4all requires: - sign-hold - build-offline-chat-installer-windows - build-offline-chat-installer-windows-arm: context: gpt4all requires: - build-hold - sign-offline-chat-installer-windows-arm: context: gpt4all requires: - sign-hold - build-offline-chat-installer-windows-arm - build-offline-chat-installer-linux: context: gpt4all requires: - build-hold build-chat-online-installers: <<: *workflow-when-chat-requested 
jobs: - build-hold: type: approval - sign-hold: type: approval - build-online-chat-installer-macos: context: gpt4all requires: - build-hold - sign-online-chat-installer-macos: context: gpt4all requires: - sign-hold - build-online-chat-installer-macos - notarize-online-chat-installer-macos: context: gpt4all requires: - sign-online-chat-installer-macos - build-online-chat-installer-windows: context: gpt4all requires: - build-hold - sign-online-chat-installer-windows: context: gpt4all requires: - sign-hold - build-online-chat-installer-windows - build-online-chat-installer-windows-arm: context: gpt4all requires: - build-hold - sign-online-chat-installer-windows-arm: context: gpt4all requires: - sign-hold - build-online-chat-installer-windows-arm - build-online-chat-installer-linux: context: gpt4all requires: - build-hold build-and-test-gpt4all-chat: <<: *workflow-when-chat-requested jobs: - hold: type: approval - build-gpt4all-chat-linux: context: gpt4all requires: - hold - build-gpt4all-chat-windows: context: gpt4all requires: - hold - build-gpt4all-chat-macos: context: gpt4all requires: - hold deploy-docs: when: and: - equal: [ << pipeline.git.branch >>, main ] - or: - << pipeline.parameters.run-all-workflows >> - << pipeline.parameters.run-python-workflow >> - not: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] jobs: - deploy-docs: context: gpt4all build-python: when: and: - or: [ << pipeline.parameters.run-all-workflows >>, << pipeline.parameters.run-python-workflow >> ] - not: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] jobs: - pypi-hold: <<: *job_only_main type: approval - hold: type: approval - build-py-linux: requires: - hold - build-py-macos: requires: - hold - build-py-windows: requires: - hold - deploy-wheels: <<: *job_only_main context: gpt4all requires: - pypi-hold - build-py-windows - build-py-linux - build-py-macos build-bindings: when: and: - or: [ << pipeline.parameters.run-all-workflows >>, << 
pipeline.parameters.run-ts-workflow >> ] - not: equal: [ << pipeline.trigger_source >>, scheduled_pipeline ] jobs: - backend-hold: type: approval - nodejs-hold: type: approval - npm-hold: <<: *job_only_main type: approval - docs-hold: type: approval - build-bindings-backend-linux: requires: - backend-hold - build-bindings-backend-macos: requires: - backend-hold - build-bindings-backend-windows: requires: - backend-hold - build-nodejs-linux: requires: - nodejs-hold - build-bindings-backend-linux - build-nodejs-windows: requires: - nodejs-hold - build-bindings-backend-windows - build-nodejs-macos: requires: - nodejs-hold - build-bindings-backend-macos - build-ts-docs: requires: - docs-hold - deploy-npm-pkg: <<: *job_only_main requires: - npm-hold - build-nodejs-linux - build-nodejs-windows - build-nodejs-macos

================================================
FILE: .circleci/grab_notary_id.py
================================================

import re
import sys

ID_REG = r"id: (.*)"


def main() -> None:
    notary_log = sys.argv[1]
    with open(notary_log, "r") as f:
        notary_output = f.read()

    id_m = re.search(ID_REG, notary_output)
    if id_m:
        print(id_m.group(1))
    else:
        raise RuntimeError("Unable to parse ID from notarization logs")


if __name__ == "__main__":
    main()

================================================
FILE: .codespellrc
================================================

[codespell]
ignore-words-list = blong, afterall, assistent, crasher, requestor
skip = ./.git,./gpt4all-chat/translations,*.pdf,*.svg,*.lock

================================================
FILE: .github/ISSUE_TEMPLATE/bindings-bug.md
================================================

---
name: "\U0001F6E0 Bindings Bug Report"
about: A bug report for the GPT4All Bindings
labels: ["bindings", "bug-unconfirmed"]
---

### Bug Report

### Example Code

### Steps to Reproduce

1.
2.
3.

### Expected Behavior

### Your Environment

- Bindings version (e.g.
"Version" from `pip show gpt4all`): - Operating System: - Chat model used (if applicable): ================================================ FILE: .github/ISSUE_TEMPLATE/chat-bug.md ================================================ --- name: "\U0001F4AC GPT4All Bug Report" about: A bug report for GPT4All Chat labels: ["chat", "bug-unconfirmed"] --- ### Bug Report ### Steps to Reproduce 1. 2. 3. ### Expected Behavior ### Your Environment - GPT4All version: - Operating System: - Chat model used (if applicable): ================================================ FILE: .github/ISSUE_TEMPLATE/config.yml ================================================ version: 2.1 ================================================ FILE: .github/ISSUE_TEMPLATE/documentation.md ================================================ --- name: "\U0001F4C4 Documentation" about: An issue related to the GPT4All documentation labels: ["documentation"] --- ### Documentation ================================================ FILE: .github/ISSUE_TEMPLATE/feature-request.md ================================================ --- name: "\U0001F680 Feature Request" about: Submit a proposal/request for a new GPT4All feature title: "[Feature] Feature request title..." labels: ["enhancement"] --- ### Feature Request ================================================ FILE: .github/ISSUE_TEMPLATE/other-bug.md ================================================ --- name: "\U0001F41B Other Bug Report" about: A bug in another component of GPT4All labels: ["bug-unconfirmed"] --- ### Bug Report ### Steps to Reproduce 1. 2. 3. 
### Expected Behavior ### Your Environment - GPT4All version (if applicable): - Operating System: - Chat model used (if applicable): ================================================ FILE: .github/pull_request_template.md ================================================ ## Describe your changes ## Issue ticket number and link ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] I have added thorough documentation for my code. - [ ] I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed. - [ ] If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution. ## Demo ### Steps to Reproduce ## Notes ================================================ FILE: .github/workflows/close_issues.yml ================================================ # This workflow will close issues that do not have labels or additional comments. # Trigger manually. 
name: "Close Issues"

on:
  workflow_dispatch:

jobs:
  close_issues:
    runs-on: ubuntu-latest
    steps:
      - name: Close issues without label or comment
        uses: actions/github-script@v3
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            const repo = context.repo;
            let page = 1;
            let issues = [];
            while (true) {
              const result = await github.issues.listForRepo({...repo, per_page: 100, page: page});
              if (result.data.length === 0) break;
              issues = issues.concat(result.data);
              page += 1;
            }
            for (let { number } of issues) {
              const issueData = await github.issues.get({...repo, issue_number: number});
              const comments = await github.issues.listComments({...repo, issue_number: number});
              if (issueData.data.labels.length === 0 && comments.data.length < 1) {
                await github.issues.update({...repo, issue_number: number, state: 'closed'});
                await github.issues.createComment({...repo, issue_number: number, body: 'Issue closed as it does not have any labels or comments.'});
              }
            }

================================================
FILE: .github/workflows/codespell.yml
================================================

---
name: Codespell

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Codespell
        uses: codespell-project/actions-codespell@v2

================================================
FILE: .gitignore
================================================

*.arrow squad_* *sbert_embedded* *.pkl ckpts* .deepspeed_env *.jsonl *tar.gz ckpts** wandb # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ share/python-wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into
it. *.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .nox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover *.py,cover .hypothesis/ .pytest_cache/ cover/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 db.sqlite3-journal # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder .pybuilder/ target/ # Jupyter Notebook .ipynb_checkpoints # IPython profile_default/ ipython_config.py # pyenv # For a library or package, you might want to ignore these files since the code is # intended to run in multiple environments; otherwise, check them in: # .python-version # pipenv # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. # However, in case of collaboration, if having platform-specific dependencies or dependencies # having no cross-platform support, pipenv may install dependencies that don't work, or not # install all needed dependencies. #Pipfile.lock # poetry # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. # This is especially recommended for binary packages to ensure reproducibility, and is more # commonly ignored for libraries. # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control #poetry.lock # pdm # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. #pdm.lock # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it # in version control. # https://pdm.fming.dev/#use-with-ide .pdm.toml # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm __pypackages__/ # Celery stuff celerybeat-schedule celerybeat.pid # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .dmypy.json dmypy.json # Pyre type checker .pyre/ # pytype static type analyzer .pytype/ # Cython debug symbols cython_debug/ # PyCharm # JetBrains specific template is maintained in a separate JetBrains.gitignore that can # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ # vs code .vscode *.bin .DS_Store # gpt4all-chat CMakeLists.txt.user gpt4all-chat/models/* build_* build-* cmake-build-* /gpt4all-chat/tests/python/config.py # IntelliJ .idea/ # LLM models *.gguf ================================================ FILE: .gitmodules ================================================ [submodule "llama.cpp-mainline"] path = gpt4all-backend/deps/llama.cpp-mainline url = https://github.com/nomic-ai/llama.cpp.git branch = master [submodule "gpt4all-chat/usearch"] path = gpt4all-chat/deps/usearch url = https://github.com/nomic-ai/usearch.git [submodule "gpt4all-chat/deps/SingleApplication"] path = gpt4all-chat/deps/SingleApplication url = https://github.com/nomic-ai/SingleApplication.git [submodule "gpt4all-chat/deps/fmt"] path = gpt4all-chat/deps/fmt url = https://github.com/fmtlib/fmt.git [submodule "gpt4all-chat/deps/DuckX"] path = gpt4all-chat/deps/DuckX url = https://github.com/nomic-ai/DuckX.git [submodule "gpt4all-chat/deps/QXlsx"] path = gpt4all-chat/deps/QXlsx url = https://github.com/nomic-ai/QXlsx.git [submodule "gpt4all-chat/deps/minja"] path = gpt4all-chat/deps/minja url = 
https://github.com/nomic-ai/minja.git [submodule "gpt4all-chat/deps/json"] path = gpt4all-chat/deps/json url = https://github.com/nlohmann/json.git

================================================
FILE: CONTRIBUTING.md
================================================

# Contributing

When contributing to this repository, please first discuss the change you wish to make via an issue, email, or any other method with the owners of this repository before making a change.

Please note we have a code of conduct; please follow it in all your interactions with the project.

## Pull Request Process

1. Ensure any install or build dependencies are removed before the end of the layer when doing a build.
2. Make sure the Pull Request is tagged with appropriate project identifiers and has a clear description of the contribution.
3. Any new or updated code must have documentation and preferably tests included with the Pull Request.
4. Significant feature or code changes should provide a short video or screenshot demo.
5. Fill out the relevant parts of the Pull Request template.
6. Pull requests must have sign-off from one other developer. Reach out to a repository owner once your code is ready to be merged into `main`.

## Code of Conduct

### Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.

### Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

### Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

### Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

### Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at support@nomic.ai. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

### Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/

================================================
FILE: LICENSE.txt
================================================

Copyright (c) 2023 Nomic, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
================================================
FILE: MAINTAINERS.md
================================================

# MAINTAINERS

## Rules

* All content inside GPT4All shall have a documented maintainer
* If a maintainer decides to retire or resign, a call for volunteers will go out
* If no further maintainer can be found in a reasonable time frame, then the content will be marked deprecated and removed in time

## Job

Maintainers will be...

1. Responsible for overseeing content under their stewardship
2. Responsible for triaging new issues, reviewing PRs, and assigning priority to tasks
3. Responsible for keeping content at sufficient quality in a timely fashion

## List

Adam Treat ([@manyoso](https://github.com/manyoso))
E-mail: adam@nomic.ai
Discord: `@gonzochess75` - Overall project maintainer - Chat UI Jared Van Bortel ([@cebtenzzre](https://github.com/cebtenzzre))
E-mail: jared@nomic.ai
Discord: `@cebtenzzre` - gpt4all-backend - Python binding - Python CLI app Jacob Nguyen ([@jacoobes](https://github.com/jacoobes))
Discord: `@jacoobes`
E-mail: `jacoobes@sern.dev` - TypeScript binding Dominik ([@cosmic-snow](https://github.com/cosmic-snow))
E-mail: cosmic-snow@mailfence.com
Discord: `@cosmic__snow` - Community documentation (GitHub Wiki) Max Cembalest ([@mcembalest](https://github.com/mcembalest))
E-mail: max@nomic.ai
Discord: `@maxcembalest.` - Official documentation (gpt4all-bindings/python/docs -> https://docs.gpt4all.io/) Thiago Ramos ([@thiagojramos](https://github.com/thiagojramos))
E-mail: thiagojramos@outlook.com
- pt\_BR translation 不知火 Shiranui ([@supersonictw](https://github.com/supersonictw))
E-mail: supersonic@livemail.tw
Discord: `@supersonictw` - zh\_TW translation Jeremy Tayco ([@jstayco](https://github.com/jstayco))
E-mail: jstayco@protonmail.ch
Discord: `@vertana` - es\_MX translation Riccardo Giovanetti ([@Harvester62](https://github.com/Harvester62))
E-mail: riccardo.giovanetti@gmail.com
Discord: `@harvester62` - it\_IT translation Tim ([@Tim453](https://github.com/Tim453))
E-mail: tim453@mailbox.org
Discord: `@Tim453` - Flatpak Jack ([@wuodoo](https://github.com/wuodoo))
E-mail: 2296103047@qq.com
Discord: `@mikage` - zh\_CN translation ================================================ FILE: README.md ================================================

GPT4All

Now with support for DeepSeek R1 Distillations

Website · Documentation · Discord · YouTube Tutorial

GPT4All runs large language models (LLMs) privately on everyday desktops & laptops.

No API calls or GPUs required - you can just download the application and get started.

Read about what's new in our blog.

Subscribe to the newsletter

https://github.com/nomic-ai/gpt4all/assets/70534565/513a0f15-4964-4109-89e4-4f9a9011f311

GPT4All is made possible by our compute partner Paperspace.

## Download Links

Windows Installer

Windows ARM Installer

macOS Installer

Ubuntu Installer

The Windows and Linux builds require Intel Core i3 2nd Gen / AMD Bulldozer, or better.

The Windows ARM build supports Qualcomm Snapdragon and Microsoft SQ1/SQ2 processors.

The Linux build is x86-64 only (no ARM).

The macOS build requires Monterey 12.6 or newer. Best results with Apple Silicon M-series processors.

See the full [System Requirements](gpt4all-chat/system_requirements.md) for more details.
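The Core i3 2nd Gen / AMD Bulldozer baseline above roughly corresponds to CPUs with AVX support. As an illustrative, Linux-only sketch (not part of this repository — the function name and approach are hypothetical), you could check whether your processor advertises the AVX flag before downloading an installer:

```python
from pathlib import Path

def cpu_meets_baseline() -> bool:
    """Best-effort Linux-only check that the CPU supports AVX,
    the instruction set implied by the requirement above."""
    try:
        cpuinfo = Path("/proc/cpuinfo").read_text()
    except OSError:
        return False  # not Linux, or /proc unavailable
    # Each "flags" line lists one token per supported CPU feature.
    for line in cpuinfo.splitlines():
        if line.startswith("flags"):
            return "avx" in line.split()
    return False

if __name__ == "__main__":
    print("AVX supported" if cpu_meets_baseline() else "AVX not detected")
```

On Windows or macOS this sketch simply returns `False`; consult the linked System Requirements document for the authoritative per-platform details.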

Get it on Flathub
Flathub (community maintained)

## Install GPT4All Python

`gpt4all` gives you access to LLMs with our Python client around [`llama.cpp`](https://github.com/ggerganov/llama.cpp) implementations.

Nomic contributes to open source software like [`llama.cpp`](https://github.com/ggerganov/llama.cpp) to make LLMs accessible and efficient **for all**.

```bash
pip install gpt4all
```

```python
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf") # downloads / loads a 4.66GB LLM
with model.chat_session():
    print(model.generate("How can I run LLMs efficiently on my laptop?", max_tokens=1024))
```

## Integrations

:parrot::link: [Langchain](https://python.langchain.com/v0.2/docs/integrations/providers/gpt4all/)

:card_file_box: [Weaviate Vector Database](https://github.com/weaviate/weaviate) - [module docs](https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules/text2vec-gpt4all)

:telescope: [OpenLIT (OTel-native Monitoring)](https://github.com/openlit/openlit) - [Docs](https://docs.openlit.io/latest/integrations/gpt4all)

## Release History

- **July 2nd, 2024**: V3.0.0 Release
  - Fresh redesign of the chat application UI
  - Improved user workflow for LocalDocs
  - Expanded access to more model architectures
- **October 19th, 2023**: GGUF Support Launches with Support for:
  - Mistral 7b base model, an updated model gallery on our website, several new local code models including Rift Coder v1.5
  - [Nomic Vulkan](https://blog.nomic.ai/posts/gpt4all-gpu-inference-with-vulkan) support for Q4\_0 and Q4\_1 quantizations in GGUF.
  - Offline build support for running old versions of the GPT4All Local LLM Chat Client.
- **September 18th, 2023**: [Nomic Vulkan](https://blog.nomic.ai/posts/gpt4all-gpu-inference-with-vulkan) launches supporting local LLM inference on NVIDIA and AMD GPUs.
- **July 2023**: Stable support for LocalDocs, a feature that allows you to privately and locally chat with your data.
- **June 28th, 2023**: [Docker-based API server] launches allowing inference of local LLMs from an OpenAI-compatible HTTP endpoint.

[Docker-based API server]: https://github.com/nomic-ai/gpt4all/tree/cef74c2be20f5b697055d5b8b506861c7b997fab/gpt4all-api

## Contributing

GPT4All welcomes contributions, involvement, and discussion from the open source community!
Please see CONTRIBUTING.md and follow the issues, bug reports, and PR markdown templates.

Check the project Discord, reach out to project owners, or look through existing issues/PRs to avoid duplicate work.
Please make sure to tag all of the above with relevant project identifiers or your contribution could potentially get lost.
Example tags: `backend`, `bindings`, `python-bindings`, `documentation`, etc.

## Citation

If you utilize this repository, models or data in a downstream project, please consider citing it with:

```
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
```

================================================
FILE: common/common.cmake
================================================
function(gpt4all_add_warning_options target)
    if (MSVC)
        return()
    endif()
    target_compile_options("${target}" PRIVATE
        # base options
        -Wall
        -Wextra
        # extra options
        -Wcast-align
        -Wextra-semi
        -Wformat=2
        -Wmissing-include-dirs
        -Wsuggest-override
        -Wvla
        # errors
        -Werror=format-security
        -Werror=init-self
        -Werror=pointer-arith
        -Werror=undef
        # disabled warnings
        -Wno-sign-compare
        -Wno-unused-parameter
    )
    if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
        target_compile_options("${target}" PRIVATE
            -Wduplicated-branches
            -Wduplicated-cond
            -Wlogical-op
            -Wno-reorder
            -Wno-null-dereference
        )
    elseif (CMAKE_CXX_COMPILER_ID MATCHES "^(Apple)?Clang$")
        target_compile_options("${target}" PRIVATE
            -Wunreachable-code-break
            -Wunreachable-code-return
            -Werror=pointer-integer-compare
            -Wno-reorder-ctor
        )
    endif()
endfunction()

================================================
FILE: gpt4all-backend/CMakeLists.txt
================================================
cmake_minimum_required(VERSION 3.23)  # for FILE_SET

include(../common/common.cmake)

set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

if (APPLE)
    option(BUILD_UNIVERSAL "Build a Universal binary on macOS" ON)
else()
    option(LLMODEL_KOMPUTE "llmodel: use Kompute" ON)
    option(LLMODEL_VULKAN  "llmodel: use Vulkan"  OFF)
    option(LLMODEL_CUDA    "llmodel: use CUDA"    ON)
    option(LLMODEL_ROCM    "llmodel: use ROCm"    OFF)
endif()

if (APPLE)
    if (BUILD_UNIVERSAL)
        # Build a Universal binary on macOS
        # This requires that the found Qt library is compiled as Universal binaries.
        set(CMAKE_OSX_ARCHITECTURES "arm64;x86_64" CACHE STRING "" FORCE)
    else()
        # Build for the host architecture on macOS
        if (NOT CMAKE_OSX_ARCHITECTURES)
            set(CMAKE_OSX_ARCHITECTURES "${CMAKE_HOST_SYSTEM_PROCESSOR}" CACHE STRING "" FORCE)
        endif()
    endif()
endif()

# Include the binary directory for the generated header file
include_directories("${CMAKE_CURRENT_BINARY_DIR}")

set(LLMODEL_VERSION_MAJOR 0)
set(LLMODEL_VERSION_MINOR 5)
set(LLMODEL_VERSION_PATCH 0)
set(LLMODEL_VERSION "${LLMODEL_VERSION_MAJOR}.${LLMODEL_VERSION_MINOR}.${LLMODEL_VERSION_PATCH}")
project(llmodel VERSION ${LLMODEL_VERSION} LANGUAGES CXX C)

set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_RUNTIME_OUTPUT_DIRECTORY})
set(BUILD_SHARED_LIBS ON)

# Check for IPO support
include(CheckIPOSupported)
check_ipo_supported(RESULT IPO_SUPPORTED OUTPUT IPO_ERROR)
if (NOT IPO_SUPPORTED)
    message(WARNING "Interprocedural optimization is not supported by your toolchain!
                     This will lead to bigger file sizes and worse performance: ${IPO_ERROR}")
else()
    message(STATUS "Interprocedural optimization support detected")
endif()

set(DIRECTORY deps/llama.cpp-mainline)
include(llama.cpp.cmake)

set(BUILD_VARIANTS)
if (APPLE)
    list(APPEND BUILD_VARIANTS metal)
endif()
if (LLMODEL_KOMPUTE)
    list(APPEND BUILD_VARIANTS kompute kompute-avxonly)
else()
    list(PREPEND BUILD_VARIANTS cpu cpu-avxonly)
endif()
if (LLMODEL_VULKAN)
    list(APPEND BUILD_VARIANTS vulkan vulkan-avxonly)
endif()
if (LLMODEL_CUDA)
    cmake_minimum_required(VERSION 3.18)  # for CMAKE_CUDA_ARCHITECTURES

    # Defaults must be set before enable_language(CUDA).
    # Keep this in sync with the arch list in ggml/src/CMakeLists.txt (plus 5.0 for non-F16 branch).
    if (NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
        # 52 == lowest CUDA 12 standard
        # 60 == f16 CUDA intrinsics
        # 61 == integer CUDA intrinsics
        # 70 == compute capability at which unrolling a loop in mul_mat_q kernels is faster
        if (GGML_CUDA_F16 OR GGML_CUDA_DMMV_F16)
            set(CMAKE_CUDA_ARCHITECTURES "60;61;70;75")  # needed for f16 CUDA intrinsics
        else()
            set(CMAKE_CUDA_ARCHITECTURES "50;52;61;70;75")  # lowest CUDA 12 standard + lowest for integer intrinsics
            #set(CMAKE_CUDA_ARCHITECTURES "OFF")  # use this to compile much faster, but only F16 models work
        endif()
    endif()
    message(STATUS "Using CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")

    include(CheckLanguage)
    check_language(CUDA)
    if (NOT CMAKE_CUDA_COMPILER)
        message(WARNING "CUDA Toolkit not found.
                         To build without CUDA, use -DLLMODEL_CUDA=OFF.")
    endif()
    enable_language(CUDA)
    list(APPEND BUILD_VARIANTS cuda cuda-avxonly)
endif()
if (LLMODEL_ROCM)
    enable_language(HIP)
    list(APPEND BUILD_VARIANTS rocm rocm-avxonly)
endif()

# Go through each build variant
foreach(BUILD_VARIANT IN LISTS BUILD_VARIANTS)
    # Determine flags
    if (BUILD_VARIANT MATCHES avxonly)
        set(GPT4ALL_ALLOW_NON_AVX OFF)
    else()
        set(GPT4ALL_ALLOW_NON_AVX ON)
    endif()
    set(GGML_AVX2 ${GPT4ALL_ALLOW_NON_AVX})
    set(GGML_F16C ${GPT4ALL_ALLOW_NON_AVX})
    set(GGML_FMA  ${GPT4ALL_ALLOW_NON_AVX})

    set(GGML_METAL OFF)
    set(GGML_KOMPUTE OFF)
    set(GGML_VULKAN OFF)
    set(GGML_CUDA OFF)
    set(GGML_ROCM OFF)
    if (BUILD_VARIANT MATCHES metal)
        set(GGML_METAL ON)
    elseif (BUILD_VARIANT MATCHES kompute)
        set(GGML_KOMPUTE ON)
    elseif (BUILD_VARIANT MATCHES vulkan)
        set(GGML_VULKAN ON)
    elseif (BUILD_VARIANT MATCHES cuda)
        set(GGML_CUDA ON)
    elseif (BUILD_VARIANT MATCHES rocm)
        set(GGML_HIPBLAS ON)
    endif()

    # Include GGML
    include_ggml(-mainline-${BUILD_VARIANT})

    if (BUILD_VARIANT MATCHES metal)
        set(GGML_METALLIB "${GGML_METALLIB}" PARENT_SCOPE)
    endif()

    # Function for preparing individual implementations
    function(prepare_target TARGET_NAME BASE_LIB)
        set(TARGET_NAME ${TARGET_NAME}-${BUILD_VARIANT})
        message(STATUS "Configuring model implementation target ${TARGET_NAME}")
        # Link to ggml/llama
        target_link_libraries(${TARGET_NAME} PRIVATE ${BASE_LIB}-${BUILD_VARIANT})
        # Let it know about its build variant
        target_compile_definitions(${TARGET_NAME} PRIVATE GGML_BUILD_VARIANT="${BUILD_VARIANT}")
        # Enable IPO if possible
        # FIXME: Doesn't work with msvc reliably.
        # See https://github.com/nomic-ai/gpt4all/issues/841
        # set_property(TARGET ${TARGET_NAME}
        #              PROPERTY INTERPROCEDURAL_OPTIMIZATION ${IPO_SUPPORTED})
    endfunction()

    # Add each individual implementation
    add_library(llamamodel-mainline-${BUILD_VARIANT} SHARED
        src/llamamodel.cpp src/llmodel_shared.cpp)
    gpt4all_add_warning_options(llamamodel-mainline-${BUILD_VARIANT})
    target_compile_definitions(llamamodel-mainline-${BUILD_VARIANT} PRIVATE
        LLAMA_VERSIONS=>=3 LLAMA_DATE=999999)
    target_include_directories(llamamodel-mainline-${BUILD_VARIANT} PRIVATE
        src include/gpt4all-backend
    )
    prepare_target(llamamodel-mainline llama-mainline)

    if (NOT PROJECT_IS_TOP_LEVEL AND BUILD_VARIANT STREQUAL cuda)
        set(CUDAToolkit_BIN_DIR ${CUDAToolkit_BIN_DIR} PARENT_SCOPE)
    endif()
endforeach()

add_library(llmodel
    src/dlhandle.cpp
    src/llmodel.cpp src/llmodel_c.cpp
    src/llmodel_shared.cpp
)
gpt4all_add_warning_options(llmodel)
target_sources(llmodel PUBLIC
    FILE_SET public_headers TYPE HEADERS BASE_DIRS include
    FILES include/gpt4all-backend/llmodel.h
          include/gpt4all-backend/llmodel_c.h
          include/gpt4all-backend/sysinfo.h
)
target_compile_definitions(llmodel PRIVATE LIB_FILE_EXT="${CMAKE_SHARED_LIBRARY_SUFFIX}")
target_include_directories(llmodel PRIVATE src include/gpt4all-backend)

set_target_properties(llmodel PROPERTIES
    VERSION ${PROJECT_VERSION}
    SOVERSION ${PROJECT_VERSION_MAJOR})

set(COMPONENT_NAME_MAIN ${PROJECT_NAME})
set(CMAKE_INSTALL_PREFIX ${CMAKE_BINARY_DIR}/install)

================================================
FILE: gpt4all-backend/README.md
================================================
# GPT4ALL Backend

This directory contains the C/C++ model backend used by GPT4All for inference on the CPU. This backend acts as a universal library/wrapper for all models that the GPT4All ecosystem supports. Language bindings are built on top of this universal library. The native GPT4All Chat application directly uses this library for all inference.

# What models are supported by the GPT4All ecosystem?
Currently, three different model architectures are supported:

1. GPTJ - Based on the GPT-J architecture, with examples found [here](https://huggingface.co/EleutherAI/gpt-j-6b)
2. LLAMA - Based on the LLAMA architecture, with examples found [here](https://huggingface.co/models?sort=downloads&search=llama)
3. MPT - Based on Mosaic ML's MPT architecture, with examples found [here](https://huggingface.co/mosaicml/mpt-7b)

# Why so many different architectures? What differentiates them?

One of the major differences is license. Currently, the LLAMA-based models are subject to a non-commercial license, whereas the GPTJ and MPT base models allow commercial usage. Early in the recent explosion of activity in open source local models, the LLAMA models were generally seen as performing better, but that is changing quickly. Every week - even every day! - new models are released, with some of the GPTJ and MPT models competitive in performance/quality with LLAMA. What's more, there are some very nice architectural innovations in the MPT models that could lead to new performance/quality gains.

# How does GPT4All make these models available for CPU inference?

By leveraging the ggml library written by Georgi Gerganov and a growing community of developers. There are currently multiple different versions of this library. The original GitHub repo can be found [here](https://github.com/ggerganov/ggml), but the developer of the library has also created a LLAMA-based version [here](https://github.com/ggerganov/llama.cpp). Currently, this backend uses the latter as a submodule.

# Does that mean GPT4All is compatible with all llama.cpp models and vice versa?

Unfortunately, no, for three reasons:

1. The upstream [llama.cpp](https://github.com/ggerganov/llama.cpp) project has recently introduced a [compatibility-breaking](https://github.com/ggerganov/llama.cpp/commit/b9fd7eee57df101d4a3e3eabc9fd6c2cb13c9ca1) re-quantization method.
   This is a breaking change that renders all previous models (including the ones that GPT4All uses) inoperative with newer versions of llama.cpp.
2. The GPT4All backend has the llama.cpp submodule specifically pinned to a version prior to this breaking change.
3. The GPT4All backend currently supports MPT-based models as an added feature. Neither llama.cpp nor the original ggml repo supports this architecture as of this writing; however, efforts are underway to make MPT available in the ggml repo, which you can follow [here](https://github.com/ggerganov/ggml/pull/145).

# What is being done to make them more compatible?

A few things. Number one, we are maintaining compatibility with our current model zoo by way of the submodule pinning. However, we are also exploring how we can update to newer versions of llama.cpp without breaking our current models. This might involve an additional magic-header check, or it could involve keeping the currently pinned submodule while also adding a new submodule with later changes, differentiating them with namespaces or in some other manner. Investigations continue.

# What about GPU inference?

Newer versions of llama.cpp have added support for inference on NVIDIA GPUs. We're investigating how to incorporate this into our downloadable installers.

# Ok, so bottom line... how do I make my model on Hugging Face compatible with the GPT4All ecosystem right now?

1. Check to make sure the Hugging Face model is available in one of our three supported architectures.
2. If it is, then you can use the conversion script inside of our pinned llama.cpp submodule for GPTJ and LLAMA based models.
3. Or, if your model is an MPT model, you can use the conversion script located directly in this backend directory under the scripts subdirectory.

# Check back for updates as we'll try to keep this updated as things change!
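Since the pinned submodule and newer llama.cpp releases disagree on file formats, a quick sanity check before loading a local model is to look at the magic bytes at the start of the file: current llama.cpp-lineage models use the GGUF container, which begins with the ASCII bytes `GGUF`. A minimal illustrative sketch (the helper name is ours, not part of this backend):

```python
def sniff_model_format(path: str) -> str:
    """Best-effort guess at a local model file's container format.

    Only the GGUF magic is checked for certain; anything else is reported
    as an unknown (possibly legacy ggml-era) format that may need conversion.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"GGUF":  # GGUF files begin with these four ASCII bytes
        return "gguf"
    return "unknown (possibly a legacy ggml-era file; conversion may be required)"
```

Running this against a downloaded `.gguf` file should report `gguf`; anything else likely predates the format break described above.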
================================================
FILE: gpt4all-backend/include/gpt4all-backend/llmodel.h
================================================
#ifndef LLMODEL_H
#define LLMODEL_H

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <expected>
#include <functional>
#include <optional>
#include <span>
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

class Dlhandle;

using namespace std::string_literals;

#define LLMODEL_MAX_PROMPT_BATCH 128

class LLModel {
public:
    using Token = int32_t;
    using PromptCallback = std::function<bool(std::span<const Token> batch, bool cached)>;
    using ResponseCallback = std::function<bool(Token token, std::string_view piece)>;
    using EmbedCancelCallback = bool(unsigned *batchSizes, unsigned nBatch, const char *backend);
    using ProgressCallback = std::function<bool(float progress)>;

    class BadArchError: public std::runtime_error {
    public:
        BadArchError(std::string arch)
            : runtime_error("Unsupported model architecture: " + arch)
            , m_arch(std::move(arch))
        {}

        const std::string &arch() const noexcept { return m_arch; }

    private:
        std::string m_arch;
    };

    class MissingImplementationError: public std::runtime_error {
    public:
        using std::runtime_error::runtime_error;
    };

    class UnsupportedModelError: public std::runtime_error {
    public:
        using std::runtime_error::runtime_error;
    };

    struct GPUDevice {
        const char *backend;
        int index;
        int type;
        size_t heapSize;
        std::string name;
        std::string vendor;

        GPUDevice(const char *backend, int index, int type, size_t heapSize, std::string name, std::string vendor):
            backend(backend), index(index), type(type), heapSize(heapSize), name(std::move(name)),
            vendor(std::move(vendor)) {}

        std::string selectionName() const
        {
            assert(backend == "cuda"s || backend == "kompute"s);
            return backendName() + ": " + name;
        }

        std::string backendName() const { return backendIdToName(backend); }

        static std::string backendIdToName(const std::string &backend) { return s_backendNames.at(backend); }

        static std::string updateSelectionName(const std::string &name)
        {
            if (name == "Auto" || name == "CPU" || name == "Metal")
                return name;
            auto it = std::find_if(s_backendNames.begin(), s_backendNames.end(), [&name](const auto &entry) {
                return name.starts_with(entry.second + ": ");
            });
            if (it != s_backendNames.end())
                return name;
            return "Vulkan: " + name; // previously, there were only Vulkan devices
        }

    private:
        static inline const std::unordered_map<std::string, std::string> s_backendNames {
            {"cpu", "CPU"}, {"metal", "Metal"}, {"cuda", "CUDA"}, {"kompute", "Vulkan"},
        };
    };

    class Implementation {
    public:
        Implementation(const Implementation &) = delete;
        Implementation(Implementation &&);
        ~Implementation();

        std::string_view modelType() const { return m_modelType; }
        std::string_view buildVariant() const { return m_buildVariant; }

        static LLModel *construct(const std::string &modelPath, const std::string &backend = "auto", int n_ctx = 2048);
        static std::vector<GPUDevice> availableGPUDevices(size_t memoryRequired = 0);
        static int32_t maxContextLength(const std::string &modelPath);
        static int32_t layerCount(const std::string &modelPath);
        static bool isEmbeddingModel(const std::string &modelPath);
        static auto chatTemplate(const char *modelPath) -> std::expected<std::string, std::string>;
        static void setImplementationsSearchPath(const std::string &path);
        static const std::string &implementationsSearchPath();
        static bool hasSupportedCPU();
        // 0 for no, 1 for yes, -1 for non-x86_64
        static int cpuSupportsAVX2();

    private:
        Implementation(Dlhandle &&);

        static const std::vector<Implementation> &implementationList();
        static const Implementation *implementation(const char *fname, const std::string &buildVariant);
        static LLModel *constructGlobalLlama(const std::optional<std::string> &backend = std::nullopt);

        char *(*m_getFileArch)(const char *fname);
        bool (*m_isArchSupported)(const char *arch);
        LLModel *(*m_construct)();

        std::string_view m_modelType;
        std::string_view m_buildVariant;
        Dlhandle *m_dlhandle;
    };

    struct PromptContext {
        int32_t n_predict      = 200;
        int32_t top_k          = 40;
        float   top_p          = 0.9f;
        float   min_p          = 0.0f;
        float   temp           = 0.9f;
        int32_t n_batch        = 9;
        float   repeat_penalty = 1.10f;
        int32_t repeat_last_n  = 64;   // last n tokens to penalize
        float   contextErase   = 0.5f; // percent of context to erase if we exceed the context window
    };

    explicit LLModel() {}
    virtual ~LLModel() {}

    virtual bool supportsEmbedding() const = 0;
    virtual bool supportsCompletion() const = 0;
    virtual bool loadModel(const std::string &modelPath, int n_ctx, int ngl) = 0;
    virtual bool isModelBlacklisted(const std::string &modelPath) const { (void)modelPath; return false; }
    virtual bool isEmbeddingModel(const std::string &modelPath) const { (void)modelPath; return false; }
    virtual bool isModelLoaded() const = 0;
    virtual size_t requiredMem(const std::string &modelPath, int n_ctx, int ngl) = 0;
    virtual size_t stateSize() const = 0;
    virtual size_t saveState(std::span<uint8_t> stateOut, std::vector<Token> &inputTokensOut) const = 0;
    virtual size_t restoreState(std::span<const uint8_t> state, std::span<const Token> inputTokens) = 0;

    // This method requires the model to return true from supportsCompletion otherwise it will throw
    // an error
    virtual void prompt(std::string_view prompt,
                        const PromptCallback &promptCallback,
                        const ResponseCallback &responseCallback,
                        const PromptContext &ctx);

    virtual int32_t countPromptTokens(std::string_view prompt) const;

    virtual size_t embeddingSize() const {
        throw std::logic_error(std::string(implementation().modelType()) + " does not support embeddings");
    }
    // user-specified prefix
    virtual void embed(const std::vector<std::string> &texts, float *embeddings, std::optional<std::string> prefix,
                       int dimensionality = -1, size_t *tokenCount = nullptr, bool doMean = true, bool atlas = false,
                       EmbedCancelCallback *cancelCb = nullptr);
    // automatic prefix
    virtual void embed(const std::vector<std::string> &texts, float *embeddings, bool isRetrieval,
                       int dimensionality = -1, size_t *tokenCount = nullptr, bool doMean = true, bool atlas = false);

    virtual void setThreadCount(int32_t n_threads) { (void)n_threads; }
    virtual int32_t threadCount() const { return 1; }

    const Implementation &implementation() const { return *m_implementation; }

    virtual std::vector<GPUDevice> availableGPUDevices(size_t memoryRequired) const
    {
        (void)memoryRequired;
        return {};
    }

    virtual bool initializeGPUDevice(size_t memoryRequired, const std::string &name) const
    {
        (void)memoryRequired;
        (void)name;
        return false;
    }

    virtual bool initializeGPUDevice(int device, std::string *unavail_reason = nullptr) const
    {
        (void)device;
        if (unavail_reason) {
            *unavail_reason = "model has no GPU support";
        }
        return false;
    }

    virtual bool usingGPUDevice() const { return false; }
    virtual const char *backendName() const { return "cpu"; }
    virtual const char *gpuDeviceName() const { return nullptr; }

    void setProgressCallback(ProgressCallback callback) { m_progressCallback = callback; }

    virtual int32_t contextLength() const = 0;
    virtual auto specialTokens() -> std::unordered_map<std::string, std::string> const = 0;

protected:
    // These are pure virtual because subclasses need to implement as the default implementation of
    // 'prompt' above calls these functions
    virtual std::vector<Token> tokenize(std::string_view str) const = 0;
    virtual bool isSpecialToken(Token id) const = 0;
    virtual std::string tokenToString(Token id) const = 0;
    virtual void initSampler(const PromptContext &ctx) = 0;
    virtual Token sampleToken() const = 0;
    virtual bool evalTokens(int32_t nPast, std::span<const Token> tokens) const = 0;
    virtual void shiftContext(const PromptContext &promptCtx, int32_t *nPast) = 0;
    virtual int32_t inputLength() const = 0;
    virtual int32_t computeModelInputPosition(std::span<const Token> input) const = 0;
    virtual void setModelInputPosition(int32_t pos) = 0;
    virtual void appendInputToken(Token tok) = 0;
    virtual std::span<const Token> inputTokens() const = 0;
    virtual const std::vector<Token> &endTokens() const = 0;
    virtual bool shouldAddBOS() const = 0;

    virtual int32_t maxContextLength(std::string const &modelPath) const
    {
        (void)modelPath;
        return -1;
    }

    virtual int32_t layerCount(std::string const &modelPath) const
    {
        (void)modelPath;
        return -1;
    }

    virtual auto chatTemplate(const char *modelPath) const -> std::expected<std::string, std::string>
    {
        (void)modelPath;
        return std::unexpected("not implemented");
    }

    const Implementation *m_implementation = nullptr;

    ProgressCallback m_progressCallback;
    static bool staticProgressCallback(float progress, void* ctx)
    {
        LLModel* model = static_cast<LLModel*>(ctx);
        if (model && model->m_progressCallback)
            return model->m_progressCallback(progress);
        return true;
    }

    // prefill context with prompt
    auto decodePrompt(const PromptCallback &promptCallback,
                      const PromptContext &promptCtx,
                      std::vector<Token> embd_inp)
        -> std::optional<int32_t>;
    // generate a response
    void generateResponse(const ResponseCallback &responseCallback,
                          const PromptContext &promptCtx,
                          int32_t nPast);

    friend class LLMImplementation;
};

#endif // LLMODEL_H

================================================
FILE: gpt4all-backend/include/gpt4all-backend/llmodel_c.h
================================================
#ifndef LLMODEL_C_H
#define LLMODEL_C_H

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __GNUC__
#define DEPRECATED __attribute__ ((deprecated))
#elif defined(_MSC_VER)
#define DEPRECATED __declspec(deprecated)
#else
#pragma message("WARNING: You need to implement DEPRECATED for this compiler")
#define DEPRECATED
#endif

#ifdef __cplusplus
extern "C" {
#endif

/**
 * Opaque pointer to the underlying model.
 */
typedef void *llmodel_model;

/**
 * A token.
 */
typedef int32_t token_t;

/**
 * llmodel_prompt_context structure for holding the prompt context.
 * NOTE: The implementation takes care of all the memory handling of the raw logits pointer and the
 * raw tokens pointer. Attempting to resize them or modify them in any way can lead to undefined
 * behavior.
 */
struct llmodel_prompt_context {
    int32_t n_predict;      // number of tokens to predict
    int32_t top_k;          // top k logits to sample from
    float   top_p;          // nucleus sampling probability threshold
    float   min_p;          // Min P sampling
    float   temp;           // temperature to adjust model's output distribution
    int32_t n_batch;        // number of predictions to generate in parallel
    float   repeat_penalty; // penalty factor for repeated tokens
    int32_t repeat_last_n;  // last n tokens to penalize
    float   context_erase;  // percent of context to erase if we exceed the context window
};

struct llmodel_gpu_device {
    const char * backend;
    int index;
    int type;            // same as VkPhysicalDeviceType
    size_t heapSize;
    const char * name;
    const char * vendor;
};

#ifndef __cplusplus
typedef struct llmodel_prompt_context llmodel_prompt_context;
typedef struct llmodel_gpu_device llmodel_gpu_device;
#endif

/**
 * Callback type for prompt processing.
 * @param token_ids An array of token ids of the prompt.
 * @param n_token_ids The number of tokens in the array.
 * @param cached Whether the tokens were already in cache.
 * @return a bool indicating whether the model should keep processing.
 */
typedef bool (*llmodel_prompt_callback)(const token_t *token_ids, size_t n_token_ids, bool cached);

/**
 * Callback type for response.
 * @param token_id The token id of the response.
 * @param response The response string. NOTE: a token_id of -1 indicates the string is an error string.
 * @return a bool indicating whether the model should keep generating.
 */
typedef bool (*llmodel_response_callback)(token_t token_id, const char *response);

/**
 * Embedding cancellation callback for use with llmodel_embed.
 * @param batch_sizes The number of tokens in each batch that will be embedded.
 * @param n_batch The number of batches that will be embedded.
 * @param backend The backend that will be used for embedding. One of "cpu", "kompute", "cuda", or "metal".
 * @return True to cancel llmodel_embed, false to continue.
 */
typedef bool (*llmodel_emb_cancel_callback)(unsigned *batch_sizes, unsigned n_batch, const char *backend);

typedef void (*llmodel_special_token_callback)(const char *name, const char *token);

/**
 * Create a llmodel instance.
 * Recognises correct model type from file at model_path
 * @param model_path A string representing the path to the model file.
 * @return A pointer to the llmodel_model instance; NULL on error.
 */
DEPRECATED llmodel_model llmodel_model_create(const char *model_path);

/**
 * Create a llmodel instance.
 * Recognises correct model type from file at model_path
 * @param model_path A string representing the path to the model file; will only be used to detect model type.
 * @param backend A string representing the implementation to use. One of 'auto', 'cpu', 'metal', 'kompute', or 'cuda'.
 * @param error A pointer to a string; will only be set on error.
 * @return A pointer to the llmodel_model instance; NULL on error.
 */
llmodel_model llmodel_model_create2(const char *model_path, const char *backend, const char **error);

/**
 * Destroy a llmodel instance.
 * Recognises correct model type using type info
 * @param model a pointer to a llmodel_model instance.
 */
void llmodel_model_destroy(llmodel_model model);

/**
 * Estimate RAM requirement for a model file
 * @param model A pointer to the llmodel_model instance.
 * @param model_path A string representing the path to the model file.
 * @param n_ctx Maximum size of context window
 * @param ngl Number of GPU layers to use (Vulkan)
 * @return size greater than 0 if the model was parsed successfully, 0 if file could not be parsed.
 */
size_t llmodel_required_mem(llmodel_model model, const char *model_path, int n_ctx, int ngl);

/**
 * Load a model from a file.
 * @param model A pointer to the llmodel_model instance.
 * @param model_path A string representing the path to the model file.
 * @param n_ctx Maximum size of context window
 * @param ngl Number of GPU layers to use (Vulkan)
 * @return true if the model was loaded successfully, false otherwise.
 */
bool llmodel_loadModel(llmodel_model model, const char *model_path, int n_ctx, int ngl);

/**
 * Check if a model is loaded.
 * @param model A pointer to the llmodel_model instance.
 * @return true if the model is loaded, false otherwise.
 */
bool llmodel_isModelLoaded(llmodel_model model);

/**
 * Get the size of the internal state of the model.
 * NOTE: This state data is specific to the type of model you have created.
 * @param model A pointer to the llmodel_model instance.
 * @return the size in bytes of the internal state of the model
 */
uint64_t llmodel_state_get_size(llmodel_model model);

/**
 * Saves the internal state of the model.
 * NOTE: This state data is specific to the type of model you have created.
 * @param model A pointer to the llmodel_model instance.
 * @param state Where to store the state. This must be a buffer of at least llmodel_state_get_size() bytes.
 * @param state_size The size of the destination for the state.
 * @param input_tokens_out Where to store the address of the token cache state. This is dynamically allocated and must
 *                         be freed with llmodel_state_free_input_tokens.
 * @param n_input_tokens Where to store the size of the token cache state.
 * @return The number of bytes copied. On error, zero is returned, the token cache is set to NULL, and the token cache
 *         size is set to zero.
 */
uint64_t llmodel_state_get_data(llmodel_model model, uint8_t *state_out, uint64_t state_size,
                                token_t **input_tokens_out, uint64_t *n_input_tokens);

/**
 * Frees the temporary token cache buffer created by a call to llmodel_state_get_data().
 * @param input_tokens The token cache buffer.
 */
void llmodel_state_free_input_tokens(token_t *input_tokens);

/**
 * Restores the internal state of the model using data from the specified address.
 * NOTE: This state data is specific to the type of model you have created.
 * @param model A pointer to the llmodel_model instance.
 * @param state A pointer to the state data.
 * @param state_size The size of the state data.
 * @param input_tokens The token cache associated with the saved state.
 * @param n_input_tokens The number of tokens in input_tokens.
 * @return The number of bytes read, or zero on error.
 */
uint64_t llmodel_state_set_data(llmodel_model model, const uint8_t *state, uint64_t state_size,
                                const token_t *input_tokens, uint64_t n_input_tokens);

/**
 * Generate a response using the model.
 * @param model A pointer to the llmodel_model instance.
 * @param prompt A string representing the input prompt.
 * @param prompt_callback A callback function for handling the processing of prompt.
 * @param response_callback A callback function for handling the generated response.
 * @param ctx A pointer to the llmodel_prompt_context structure.
 * @param error A pointer to a string; will only be set on error.
 */
bool llmodel_prompt(llmodel_model model, const char *prompt,
                    llmodel_prompt_callback prompt_callback,
                    llmodel_response_callback response_callback,
                    llmodel_prompt_context *ctx,
                    const char **error);

/**
 * Generate an embedding using the model.
 * NOTE: If given NULL pointers for the model or text, or an empty text, a NULL pointer will be
 * returned. Bindings should signal an error when NULL is the return value.
 * @param model A pointer to the llmodel_model instance.
 * @param texts A pointer to a NULL-terminated array of strings representing the texts to generate an
 * embedding for.
 * @param embedding_size A pointer to a size_t type that will be set by the call indicating the length
 * of the returned floating point array.
 * @param prefix The model-specific prefix representing the embedding task, without the trailing colon. NULL for no
 * prefix.
 * @param dimensionality The embedding dimension, for use with Matryoshka-capable models. Set to -1 for full-size.
 * @param token_count Return location for the number of prompt tokens processed, or NULL.
 * @param do_mean True to average multiple embeddings if the text is longer than the model can accept, False to
 * truncate.
 * @param atlas Try to be fully compatible with the Atlas API. Currently, this means texts longer than 8192 tokens with
 * long_text_mode="mean" will raise an error. Disabled by default.
 * @param cancel_cb Cancellation callback, or NULL. See the documentation of llmodel_emb_cancel_callback.
 * @param error Return location for a malloc()ed string that will be set on error, or NULL.
 * @return A pointer to an array of floating point values passed to the calling method which then will
 * be responsible for lifetime of this memory. NULL if an error occurred.
 */
float *llmodel_embed(llmodel_model model, const char **texts, size_t *embedding_size, const char *prefix,
                     int dimensionality, size_t *token_count, bool do_mean, bool atlas,
                     llmodel_emb_cancel_callback cancel_cb, const char **error);

/**
 * Frees the memory allocated by the llmodel_embedding function.
 * @param ptr A pointer to the embedding as returned from llmodel_embedding.
 */
void llmodel_free_embedding(float *ptr);

/**
 * Set the number of threads to be used by the model.
 * @param model A pointer to the llmodel_model instance.
 * @param n_threads The number of threads to be used.
 */
void llmodel_setThreadCount(llmodel_model model, int32_t n_threads);

/**
 * Get the number of threads currently being used by the model.
 * @param model A pointer to the llmodel_model instance.
 * @return The number of threads currently being used.
 */
int32_t llmodel_threadCount(llmodel_model model);

/**
 * Set llmodel implementation search path.
 * Default is "."
 * @param path The path to the llmodel implementation shared objects. This can be a single path or
 * a list of paths separated by ';' delimiter.
 */
void llmodel_set_implementation_search_path(const char *path);

/**
 * Get llmodel implementation search path.
 * @return The current search path; lifetime ends on the next llmodel_set_implementation_search_path() call.
 */
const char *llmodel_get_implementation_search_path();

/**
 * Get a list of available GPU devices given the memory required.
 * @param memoryRequired The minimum amount of VRAM, in bytes.
 * @param num_devices Return location for the number of devices in the returned array.
 * @return A pointer to an array of llmodel_gpu_device whose length is given by num_devices.
 */
struct llmodel_gpu_device* llmodel_available_gpu_devices(size_t memoryRequired, int* num_devices);

/**
 * Initializes a GPU device based on a specified string criterion.
 *
 * This function initializes a GPU device based on a string identifier provided. The function
 * allows initialization based on general device type ("gpu"), vendor name ("amd", "nvidia", "intel"),
 * or any specific device name.
 *
 * @param model A pointer to the llmodel_model instance.
 * @param memoryRequired The amount of memory (in bytes) required by the application or task
 * that will utilize the GPU device.
 * @param device A string specifying the desired criterion for GPU device selection. It can be:
 * - "gpu": To initialize the best available GPU.
 * - "amd", "nvidia", or "intel": To initialize the best available GPU from that vendor.
 * - A specific GPU device name: To initialize a GPU with that exact name.
 *
 * @return True if the GPU device is successfully initialized based on the provided string
 * criterion. Returns false if the desired GPU device could not be initialized.
 */
bool llmodel_gpu_init_gpu_device_by_string(llmodel_model model, size_t memoryRequired, const char *device);

/**
 * Initializes a GPU device by specifying a valid gpu device pointer.
 * @param model A pointer to the llmodel_model instance.
 * @param device A gpu device pointer.
 * @return True if the GPU device is successfully initialized, false otherwise.
 */
bool llmodel_gpu_init_gpu_device_by_struct(llmodel_model model, const llmodel_gpu_device *device);

/**
 * Initializes a GPU device by its index.
 * @param model A pointer to the llmodel_model instance.
 * @param device An integer representing the index of the GPU device to be initialized.
 * @return True if the GPU device is successfully initialized, false otherwise.
 */
bool llmodel_gpu_init_gpu_device_by_int(llmodel_model model, int device);

/**
 * @return The name of the llama.cpp backend currently in use. One of "cpu", "kompute", or "metal".
 */
const char *llmodel_model_backend_name(llmodel_model model);

/**
 * @return The name of the GPU device currently in use, or NULL for backends other than Kompute.
 */
const char *llmodel_model_gpu_device_name(llmodel_model model);

/**
 * Count the tokens that the given prompt would occupy, without generating a response.
 * @param model A pointer to the llmodel_model instance.
 * @param prompt A string representing the input prompt.
 * @param error A pointer to a string; will only be set on error.
 * @return The number of tokens, or a negative value on error.
 */
int32_t llmodel_count_prompt_tokens(llmodel_model model, const char *prompt, const char **error);

/**
 * Invoke the callback once for each of the model's special tokens.
 */
void llmodel_model_foreach_special_token(llmodel_model model, llmodel_special_token_callback callback);

#ifdef __cplusplus
}
#endif

#endif // LLMODEL_C_H


================================================
FILE: gpt4all-backend/include/gpt4all-backend/sysinfo.h
================================================
#ifndef SYSINFO_H
#define SYSINFO_H

#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

#if defined(__linux__)
#   include <unistd.h>
#elif defined(__APPLE__)
#   include <sys/types.h>
#   include <sys/sysctl.h>
#elif defined(_WIN32)
#   define WIN32_LEAN_AND_MEAN
#   ifndef NOMINMAX
#       define NOMINMAX
#   endif
#   include <windows.h>
#endif

static long long getSystemTotalRAMInBytes()
{
    long long totalRAM = 0;

#if defined(__linux__)
    std::ifstream file("/proc/meminfo");
    std::string line;
    while (std::getline(file, line)) {
        if (line.find("MemTotal") != std::string::npos) {
            std::string memTotalStr = line.substr(line.find(":") + 1);
            memTotalStr.erase(0, memTotalStr.find_first_not_of(" "));
            memTotalStr = memTotalStr.substr(0, memTotalStr.find(" "));
            totalRAM = std::stoll(memTotalStr) * 1024; // Convert from KB to bytes
            break;
        }
    }
    file.close();
#elif defined(__APPLE__)
    int mib[2] = {CTL_HW, HW_MEMSIZE};
    size_t length = sizeof(totalRAM);
    sysctl(mib, 2, &totalRAM, &length, NULL, 0);
#elif defined(_WIN32)
    MEMORYSTATUSEX memoryStatus;
    memoryStatus.dwLength = sizeof(memoryStatus);
    GlobalMemoryStatusEx(&memoryStatus);
    totalRAM = memoryStatus.ullTotalPhys;
#endif

    return totalRAM;
}

static double getSystemTotalRAMInGB()
{
    return static_cast<double>(getSystemTotalRAMInBytes()) / (1024 * 1024 * 1024);
}

static std::string getSystemTotalRAMInGBString()
{
    std::stringstream ss;
    ss << std::fixed << std::setprecision(2) << getSystemTotalRAMInGB() << " GB";
    return ss.str();
}

#endif // SYSINFO_H


================================================
FILE: gpt4all-backend/llama.cpp.cmake
================================================
cmake_minimum_required(VERSION 3.14)  # for add_link_options and implicit target directories.

set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)

#
# Option list
#

# some of the options here are commented out so they can be set "dynamically" before calling include_ggml()

set(GGML_LLAMAFILE_DEFAULT ON)

# general
option(LLAMA_STATIC "llama: static link libraries" OFF)
option(LLAMA_NATIVE "llama: enable -march=native flag" OFF)

# debug
option(LLAMA_ALL_WARNINGS           "llama: enable all compiler warnings" ON)
option(LLAMA_ALL_WARNINGS_3RD_PARTY "llama: enable all compiler warnings in 3rd party libs" OFF)
option(LLAMA_GPROF                  "llama: enable gprof" OFF)

# build
option(LLAMA_FATAL_WARNINGS "llama: enable -Werror flag" OFF)

# instruction set specific
#option(GGML_AVX         "ggml: enable AVX"         ON)
#option(GGML_AVX2        "ggml: enable AVX2"        ON)
#option(GGML_AVX512      "ggml: enable AVX512"      OFF)
#option(GGML_AVX512_VBMI "ggml: enable AVX512-VBMI" OFF)
#option(GGML_AVX512_VNNI "ggml: enable AVX512-VNNI" OFF)
#option(GGML_FMA         "ggml: enable FMA"         ON)
# in MSVC F16C is implied with AVX2/AVX512
#if (NOT MSVC)
#    option(GGML_F16C "ggml: enable F16C" ON)
#endif()

if (WIN32)
    set(LLAMA_WIN_VER "0x602" CACHE STRING "llama: Windows Version")
endif()

# 3rd party libs
option(GGML_ACCELERATE "ggml: enable Accelerate framework" ON)
option(GGML_BLAS       "ggml: use BLAS"                    OFF)
option(GGML_LLAMAFILE  "ggml: use llamafile SGEMM"         ${GGML_LLAMAFILE_DEFAULT})
set(GGML_BLAS_VENDOR "Generic" CACHE STRING "ggml: BLAS library vendor")
#option(GGML_CUDA "ggml: use CUDA" OFF)
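# The comment at the top of this option list says that the commented-out options
# (GGML_CUDA, GGML_METAL, the GGML_AVX* family, etc.) are meant to be set
# "dynamically" before calling include_ggml(). A minimal sketch of a consumer,
# assuming a hypothetical "-cuda" variant suffix (the real suffixes are chosen
# by gpt4all-backend/CMakeLists.txt, not by this file):
#
#     set(GGML_CUDA ON)         # backend option that is commented out above
#     set(GGML_AVX2 ON)         # instruction-set options are likewise caller-set
#     include(llama.cpp.cmake)
#     include_ggml(-cuda)       # defines the ggml-cuda and llama-cuda targets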
option(GGML_CUDA_FORCE_DMMV "ggml: use dmmv instead of mmvq CUDA kernels" OFF) option(GGML_CUDA_FORCE_MMQ "ggml: use mmq kernels instead of cuBLAS" OFF) option(GGML_CUDA_FORCE_CUBLAS "ggml: always use cuBLAS instead of mmq kernels" OFF) set (GGML_CUDA_DMMV_X "32" CACHE STRING "ggml: x stride for dmmv CUDA kernels") set (GGML_CUDA_MMV_Y "1" CACHE STRING "ggml: y block size for mmv CUDA kernels") option(GGML_CUDA_F16 "ggml: use 16 bit floats for some calculations" OFF) set (GGML_CUDA_KQUANTS_ITER "2" CACHE STRING "ggml: iters./thread per block for Q2_K/Q6_K") set (GGML_CUDA_PEER_MAX_BATCH_SIZE "128" CACHE STRING "ggml: max. batch size for using peer access") option(GGML_CUDA_NO_PEER_COPY "ggml: do not use peer to peer copies" OFF) option(GGML_CUDA_NO_VMM "ggml: do not try to use CUDA VMM" OFF) option(GGML_CUDA_FA_ALL_QUANTS "ggml: compile all quants for FlashAttention" OFF) option(GGML_CUDA_USE_GRAPHS "ggml: use CUDA graphs (llama.cpp only)" OFF) #option(GGML_HIPBLAS "ggml: use hipBLAS" OFF) option(GGML_HIP_UMA "ggml: use HIP unified memory architecture" OFF) #option(GGML_VULKAN "ggml: use Vulkan" OFF) option(GGML_VULKAN_CHECK_RESULTS "ggml: run Vulkan op checks" OFF) option(GGML_VULKAN_DEBUG "ggml: enable Vulkan debug output" OFF) option(GGML_VULKAN_VALIDATE "ggml: enable Vulkan validation" OFF) option(GGML_VULKAN_RUN_TESTS "ggml: run Vulkan tests" OFF) #option(GGML_METAL "ggml: use Metal" ${GGML_METAL_DEFAULT}) option(GGML_METAL_NDEBUG "ggml: disable Metal debugging" OFF) option(GGML_METAL_SHADER_DEBUG "ggml: compile Metal with -fno-fast-math" OFF) set(GGML_METAL_MACOSX_VERSION_MIN "" CACHE STRING "ggml: metal minimum macOS version") set(GGML_METAL_STD "" CACHE STRING "ggml: metal standard version (-std flag)") #option(GGML_KOMPUTE "ggml: use Kompute" OFF) option(GGML_QKK_64 "ggml: use super-block size of 64 for k-quants" OFF) set(GGML_SCHED_MAX_COPIES "4" CACHE STRING "ggml: max input copies for pipeline parallelism") # add perf arguments option(LLAMA_PERF "llama: 
enable perf" OFF)

#
# Compile flags
#

set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package(Threads REQUIRED)

list(APPEND GGML_COMPILE_DEFS GGML_SCHED_MAX_COPIES=${GGML_SCHED_MAX_COPIES})

# enable libstdc++ assertions for debug builds
if (CMAKE_SYSTEM_NAME MATCHES "Linux")
    list(APPEND GGML_COMPILE_DEFS $<$<CONFIG:Debug>:_GLIBCXX_ASSERTIONS>)
endif()

if (APPLE AND GGML_ACCELERATE)
    find_library(ACCELERATE_FRAMEWORK Accelerate)
    if (ACCELERATE_FRAMEWORK)
        message(STATUS "Accelerate framework found")

        list(APPEND GGML_COMPILE_DEFS GGML_USE_ACCELERATE)
        list(APPEND GGML_COMPILE_DEFS ACCELERATE_NEW_LAPACK)
        list(APPEND GGML_COMPILE_DEFS ACCELERATE_LAPACK_ILP64)
        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} ${ACCELERATE_FRAMEWORK})
    else()
        message(WARNING "Accelerate framework not found")
    endif()
endif()

if (GGML_BLAS)
    if (LLAMA_STATIC)
        set(BLA_STATIC ON)
    endif()
    if (CMAKE_VERSION VERSION_GREATER_EQUAL 3.22)
        set(BLA_SIZEOF_INTEGER 8)
    endif()

    set(BLA_VENDOR ${GGML_BLAS_VENDOR})
    find_package(BLAS)

    if (BLAS_FOUND)
        message(STATUS "BLAS found, Libraries: ${BLAS_LIBRARIES}")

        if ("${BLAS_INCLUDE_DIRS}" STREQUAL "")
            # BLAS_INCLUDE_DIRS is missing in FindBLAS.cmake.
# see https://gitlab.kitware.com/cmake/cmake/-/issues/20268 find_package(PkgConfig REQUIRED) if (${GGML_BLAS_VENDOR} MATCHES "Generic") pkg_check_modules(DepBLAS REQUIRED blas) elseif (${GGML_BLAS_VENDOR} MATCHES "OpenBLAS") # As of openblas v0.3.22, the 64-bit is named openblas64.pc pkg_check_modules(DepBLAS openblas64) if (NOT DepBLAS_FOUND) pkg_check_modules(DepBLAS REQUIRED openblas) endif() elseif (${GGML_BLAS_VENDOR} MATCHES "FLAME") pkg_check_modules(DepBLAS REQUIRED blis) elseif (${GGML_BLAS_VENDOR} MATCHES "ATLAS") pkg_check_modules(DepBLAS REQUIRED blas-atlas) elseif (${GGML_BLAS_VENDOR} MATCHES "FlexiBLAS") pkg_check_modules(DepBLAS REQUIRED flexiblas_api) elseif (${GGML_BLAS_VENDOR} MATCHES "Intel") # all Intel* libraries share the same include path pkg_check_modules(DepBLAS REQUIRED mkl-sdl) elseif (${GGML_BLAS_VENDOR} MATCHES "NVHPC") # this doesn't provide pkg-config # suggest to assign BLAS_INCLUDE_DIRS on your own if ("${NVHPC_VERSION}" STREQUAL "") message(WARNING "Better to set NVHPC_VERSION") else() set(DepBLAS_FOUND ON) set(DepBLAS_INCLUDE_DIRS "/opt/nvidia/hpc_sdk/${CMAKE_SYSTEM_NAME}_${CMAKE_SYSTEM_PROCESSOR}/${NVHPC_VERSION}/math_libs/include") endif() endif() if (DepBLAS_FOUND) set(BLAS_INCLUDE_DIRS ${DepBLAS_INCLUDE_DIRS}) else() message(WARNING "BLAS_INCLUDE_DIRS neither been provided nor been automatically" " detected by pkgconfig, trying to find cblas.h from possible paths...") find_path(BLAS_INCLUDE_DIRS NAMES cblas.h HINTS /usr/include /usr/local/include /usr/include/openblas /opt/homebrew/opt/openblas/include /usr/local/opt/openblas/include /usr/include/x86_64-linux-gnu/openblas/include ) endif() endif() message(STATUS "BLAS found, Includes: ${BLAS_INCLUDE_DIRS}") list(APPEND GGML_COMPILE_OPTS ${BLAS_LINKER_FLAGS}) list(APPEND GGML_COMPILE_DEFS GGML_USE_OPENBLAS) if (${BLAS_INCLUDE_DIRS} MATCHES "mkl" AND (${GGML_BLAS_VENDOR} MATCHES "Generic" OR ${GGML_BLAS_VENDOR} MATCHES "Intel")) list(APPEND GGML_COMPILE_DEFS GGML_BLAS_USE_MKL) 
endif() set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} ${BLAS_LIBRARIES}) set(LLAMA_EXTRA_INCLUDES ${LLAMA_EXTRA_INCLUDES} ${BLAS_INCLUDE_DIRS}) else() message(WARNING "BLAS not found, please refer to " "https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors" " to set correct GGML_BLAS_VENDOR") endif() endif() if (GGML_LLAMAFILE) list(APPEND GGML_COMPILE_DEFS GGML_USE_LLAMAFILE) set(GGML_HEADERS_LLAMAFILE ${DIRECTORY}/ggml/src/llamafile/sgemm.h) set(GGML_SOURCES_LLAMAFILE ${DIRECTORY}/ggml/src/llamafile/sgemm.cpp) endif() if (GGML_QKK_64) list(APPEND GGML_COMPILE_DEFS GGML_QKK_64) endif() if (LLAMA_PERF) list(APPEND GGML_COMPILE_DEFS GGML_PERF) endif() function(get_flags CCID CCVER) set(C_FLAGS "") set(CXX_FLAGS "") if (CCID MATCHES "Clang") set(C_FLAGS -Wunreachable-code-break -Wunreachable-code-return) set(CXX_FLAGS -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi) if ( (CCID STREQUAL "Clang" AND CCVER VERSION_GREATER_EQUAL 3.8.0) OR (CCID STREQUAL "AppleClang" AND CCVER VERSION_GREATER_EQUAL 7.3.0) ) list(APPEND C_FLAGS -Wdouble-promotion) endif() elseif (CCID STREQUAL "GNU") set(C_FLAGS -Wdouble-promotion) set(CXX_FLAGS -Wno-array-bounds) if (CCVER VERSION_GREATER_EQUAL 7.1.0) list(APPEND CXX_FLAGS -Wno-format-truncation) endif() if (CCVER VERSION_GREATER_EQUAL 8.1.0) list(APPEND CXX_FLAGS -Wextra-semi) endif() endif() set(GF_C_FLAGS ${C_FLAGS} PARENT_SCOPE) set(GF_CXX_FLAGS ${CXX_FLAGS} PARENT_SCOPE) endfunction() if (LLAMA_FATAL_WARNINGS) if (CMAKE_CXX_COMPILER_ID MATCHES "GNU" OR CMAKE_CXX_COMPILER_ID MATCHES "Clang") list(APPEND C_FLAGS -Werror) list(APPEND CXX_FLAGS -Werror) elseif (CMAKE_CXX_COMPILER_ID STREQUAL "MSVC") list(APPEND GGML_COMPILE_OPTS /WX) endif() endif() if (LLAMA_ALL_WARNINGS) if (NOT MSVC) list(APPEND WARNING_FLAGS -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function) list(APPEND C_FLAGS -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int 
                              -Werror=implicit-function-declaration)
        list(APPEND CXX_FLAGS -Wmissing-declarations -Wmissing-noreturn)

        list(APPEND C_FLAGS   ${WARNING_FLAGS})
        list(APPEND CXX_FLAGS ${WARNING_FLAGS})

        get_flags(${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION})

        list(APPEND GGML_COMPILE_OPTS "$<$<COMPILE_LANGUAGE:C>:${C_FLAGS};${GF_C_FLAGS}>"
                                      "$<$<COMPILE_LANGUAGE:CXX>:${CXX_FLAGS};${GF_CXX_FLAGS}>")
    else()
        # todo : msvc
        set(C_FLAGS   "")
        set(CXX_FLAGS "")
    endif()
endif()

if (WIN32)
    list(APPEND GGML_COMPILE_DEFS _CRT_SECURE_NO_WARNINGS)

    if (BUILD_SHARED_LIBS)
        set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
    endif()
endif()

# this version of Apple ld64 is buggy
execute_process(
    COMMAND ${CMAKE_C_COMPILER} ${CMAKE_EXE_LINKER_FLAGS} -Wl,-v
    ERROR_VARIABLE output
    OUTPUT_QUIET
)
if (output MATCHES "dyld-1015\.7")
    list(APPEND GGML_COMPILE_DEFS HAVE_BUGGY_APPLE_LINKER)
endif()

# Architecture specific
# TODO: probably these flags need to be tweaked on some architectures
#       feel free to update the Makefile for your architecture and send a pull request or issue
message(STATUS "CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}")
if (MSVC)
    string(TOLOWER "${CMAKE_GENERATOR_PLATFORM}" CMAKE_GENERATOR_PLATFORM_LWR)
    message(STATUS "CMAKE_GENERATOR_PLATFORM: ${CMAKE_GENERATOR_PLATFORM}")
else ()
    set(CMAKE_GENERATOR_PLATFORM_LWR "")
endif ()

if (NOT MSVC)
    if (LLAMA_STATIC)
        list(APPEND GGML_LINK_OPTS -static)
        if (MINGW)
            list(APPEND GGML_LINK_OPTS -static-libgcc -static-libstdc++)
        endif()
    endif()
    if (LLAMA_GPROF)
        list(APPEND GGML_COMPILE_OPTS -pg)
    endif()
endif()

if (MINGW)
    # Target Windows 8 for PrefetchVirtualMemory
    list(APPEND GGML_COMPILE_DEFS _WIN32_WINNT=${LLAMA_WIN_VER})
endif()

#
# POSIX conformance
#

# clock_gettime came in POSIX.1b (1993)
# CLOCK_MONOTONIC came in POSIX.1-2001 / SUSv3 as optional
# posix_memalign came in POSIX.1-2001 / SUSv3
# M_PI is an XSI extension since POSIX.1-2001 / SUSv3, came in XPG1 (1985)
list(APPEND GGML_COMPILE_DEFS _XOPEN_SOURCE=600)

# Somehow in OpenBSD whenever POSIX conformance is specified # some string
functions rely on locale_t availability, # which was introduced in POSIX.1-2008, forcing us to go higher if (CMAKE_SYSTEM_NAME MATCHES "OpenBSD") list(REMOVE_ITEM GGML_COMPILE_DEFS _XOPEN_SOURCE=600) list(APPEND GGML_COMPILE_DEFS _XOPEN_SOURCE=700) endif() # Data types, macros and functions related to controlling CPU affinity and # some memory allocation are available on Linux through GNU extensions in libc if (CMAKE_SYSTEM_NAME MATCHES "Linux") list(APPEND GGML_COMPILE_DEFS _GNU_SOURCE) endif() # RLIMIT_MEMLOCK came in BSD, is not specified in POSIX.1, # and on macOS its availability depends on enabling Darwin extensions # similarly on DragonFly, enabling BSD extensions is necessary if ( CMAKE_SYSTEM_NAME MATCHES "Darwin" OR CMAKE_SYSTEM_NAME MATCHES "iOS" OR CMAKE_SYSTEM_NAME MATCHES "tvOS" OR CMAKE_SYSTEM_NAME MATCHES "DragonFly" ) list(APPEND GGML_COMPILE_DEFS _DARWIN_C_SOURCE) endif() # alloca is a non-standard interface that is not visible on BSDs when # POSIX conformance is specified, but not all of them provide a clean way # to enable it in such cases if (CMAKE_SYSTEM_NAME MATCHES "FreeBSD") list(APPEND GGML_COMPILE_DEFS __BSD_VISIBLE) endif() if (CMAKE_SYSTEM_NAME MATCHES "NetBSD") list(APPEND GGML_COMPILE_DEFS _NETBSD_SOURCE) endif() if (CMAKE_SYSTEM_NAME MATCHES "OpenBSD") list(APPEND GGML_COMPILE_DEFS _BSD_SOURCE) endif() function(include_ggml SUFFIX) message(STATUS "Configuring ggml implementation target llama${SUFFIX} in ${CMAKE_CURRENT_SOURCE_DIR}/${DIRECTORY}") # # libraries # if (GGML_CUDA) cmake_minimum_required(VERSION 3.18) # for CMAKE_CUDA_ARCHITECTURES get_property(LANGS GLOBAL PROPERTY ENABLED_LANGUAGES) if (NOT CUDA IN_LIST LANGS) message(FATAL_ERROR "The CUDA language must be enabled.") endif() find_package(CUDAToolkit REQUIRED) set(CUDAToolkit_BIN_DIR ${CUDAToolkit_BIN_DIR} PARENT_SCOPE) # architectures are set in gpt4all-backend/CMakeLists.txt set(GGML_HEADERS_CUDA ${DIRECTORY}/ggml/include/ggml-cuda.h) file(GLOB GGML_HEADERS_CUDA 
"${DIRECTORY}/ggml/src/ggml-cuda/*.cuh") list(APPEND GGML_HEADERS_CUDA "${DIRECTORY}/ggml/include/ggml-cuda.h") file(GLOB GGML_SOURCES_CUDA "${DIRECTORY}/ggml/src/ggml-cuda/*.cu") list(APPEND GGML_SOURCES_CUDA "${DIRECTORY}/ggml/src/ggml-cuda.cu") file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/fattn-wmma*.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/mmq*.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) if (GGML_CUDA_FA_ALL_QUANTS) file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/fattn-vec*.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) add_compile_definitions(GGML_CUDA_FA_ALL_QUANTS) else() file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/fattn-vec*q4_0-q4_0.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/fattn-vec*q8_0-q8_0.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) file(GLOB SRCS "${DIRECTORY}/ggml/src/ggml-cuda/template-instances/fattn-vec*f16-f16.cu") list(APPEND GGML_SOURCES_CUDA ${SRCS}) endif() list(APPEND GGML_COMPILE_DEFS_PUBLIC GGML_USE_CUDA) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_DMMV_X=${GGML_CUDA_DMMV_X}) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_MMV_Y=${GGML_CUDA_MMV_Y}) list(APPEND GGML_COMPILE_DEFS K_QUANTS_PER_ITERATION=${GGML_CUDA_KQUANTS_ITER}) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_PEER_MAX_BATCH_SIZE=${GGML_CUDA_PEER_MAX_BATCH_SIZE}) if (GGML_CUDA_USE_GRAPHS) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_USE_GRAPHS) endif() if (GGML_CUDA_FORCE_DMMV) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_FORCE_DMMV) endif() if (GGML_CUDA_FORCE_MMQ) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_FORCE_MMQ) endif() if (GGML_CUDA_FORCE_CUBLAS) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_FORCE_CUBLAS) endif() if (GGML_CUDA_NO_VMM) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_NO_VMM) endif() if (GGML_CUDA_F16) list(APPEND GGML_COMPILE_DEFS GGML_CUDA_F16) endif() if (GGML_CUDA_NO_PEER_COPY) list(APPEND 
GGML_COMPILE_DEFS GGML_CUDA_NO_PEER_COPY) endif() if (LLAMA_STATIC) if (WIN32) # As of 12.3.1 CUDA Toolkit for Windows does not offer a static cublas library set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} CUDA::cudart_static CUDA::cublas CUDA::cublasLt) else () set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} CUDA::cudart_static CUDA::cublas_static CUDA::cublasLt_static) endif() else() set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} CUDA::cudart CUDA::cublas CUDA::cublasLt) endif() set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} CUDA::cuda_driver) endif() if (GGML_VULKAN) find_package(Vulkan REQUIRED) set(GGML_HEADERS_VULKAN ${DIRECTORY}/ggml/include/ggml-vulkan.h) set(GGML_SOURCES_VULKAN ${DIRECTORY}/ggml/src/ggml-vulkan.cpp) list(APPEND GGML_COMPILE_DEFS_PUBLIC GGML_USE_VULKAN) if (GGML_VULKAN_CHECK_RESULTS) list(APPEND GGML_COMPILE_DEFS GGML_VULKAN_CHECK_RESULTS) endif() if (GGML_VULKAN_DEBUG) list(APPEND GGML_COMPILE_DEFS GGML_VULKAN_DEBUG) endif() if (GGML_VULKAN_VALIDATE) list(APPEND GGML_COMPILE_DEFS GGML_VULKAN_VALIDATE) endif() if (GGML_VULKAN_RUN_TESTS) list(APPEND GGML_COMPILE_DEFS GGML_VULKAN_RUN_TESTS) endif() set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} Vulkan::Vulkan) endif() if (GGML_HIPBLAS) if ($ENV{ROCM_PATH}) set(ROCM_PATH $ENV{ROCM_PATH}) else() set(ROCM_PATH /opt/rocm) endif() list(APPEND CMAKE_PREFIX_PATH ${ROCM_PATH}) string(REGEX MATCH "hipcc(\.bat)?$" CXX_IS_HIPCC "${CMAKE_CXX_COMPILER}") if (CXX_IS_HIPCC AND UNIX) message(WARNING "Setting hipcc as the C++ compiler is legacy behavior." " Prefer setting the HIP compiler directly. See README for details.") else() # Forward AMDGPU_TARGETS to CMAKE_HIP_ARCHITECTURES. 
        if (AMDGPU_TARGETS AND NOT CMAKE_HIP_ARCHITECTURES)
            set(CMAKE_HIP_ARCHITECTURES ${AMDGPU_TARGETS})
        endif()
        cmake_minimum_required(VERSION 3.21)
        get_property(LANGS GLOBAL PROPERTY ENABLED_LANGUAGES)
        if (NOT HIP IN_LIST LANGS)
            message(FATAL_ERROR "The HIP language must be enabled.")
        endif()
    endif()

    find_package(hip     REQUIRED)
    find_package(hipblas REQUIRED)
    find_package(rocblas REQUIRED)

    message(STATUS "HIP and hipBLAS found")

    set(GGML_HEADERS_ROCM ${DIRECTORY}/ggml/include/ggml-cuda.h)

    file(GLOB GGML_SOURCES_ROCM "${DIRECTORY}/ggml/src/ggml-rocm/*.cu")
    list(APPEND GGML_SOURCES_ROCM "${DIRECTORY}/ggml/src/ggml-rocm.cu")

    list(APPEND GGML_COMPILE_DEFS_PUBLIC GGML_USE_HIPBLAS GGML_USE_CUDA)

    if (GGML_HIP_UMA)
        list(APPEND GGML_COMPILE_DEFS GGML_HIP_UMA)
    endif()
    if (GGML_CUDA_FORCE_DMMV)
        list(APPEND GGML_COMPILE_DEFS GGML_CUDA_FORCE_DMMV)
    endif()
    if (GGML_CUDA_FORCE_MMQ)
        list(APPEND GGML_COMPILE_DEFS GGML_CUDA_FORCE_MMQ)
    endif()
    if (GGML_CUDA_NO_PEER_COPY)
        list(APPEND GGML_COMPILE_DEFS GGML_CUDA_NO_PEER_COPY)
    endif()

    list(APPEND GGML_COMPILE_DEFS GGML_CUDA_DMMV_X=${GGML_CUDA_DMMV_X})
    list(APPEND GGML_COMPILE_DEFS GGML_CUDA_MMV_Y=${GGML_CUDA_MMV_Y})
    list(APPEND GGML_COMPILE_DEFS K_QUANTS_PER_ITERATION=${GGML_CUDA_KQUANTS_ITER})

    if (CXX_IS_HIPCC)
        set_source_files_properties(${GGML_SOURCES_ROCM} PROPERTIES LANGUAGE CXX)
        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} hip::device)
    else()
        set_source_files_properties(${GGML_SOURCES_ROCM} PROPERTIES LANGUAGE HIP)
    endif()

    if (LLAMA_STATIC)
        message(FATAL_ERROR "Static linking not supported for HIP/ROCm")
    endif()

    set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} hip::host roc::rocblas roc::hipblas)
endif()

set(LLAMA_DIR ${CMAKE_CURRENT_SOURCE_DIR}/${DIRECTORY})

if (GGML_KOMPUTE AND NOT GGML_KOMPUTE_ONCE)
    set(GGML_KOMPUTE_ONCE ON PARENT_SCOPE)
    if (NOT EXISTS "${LLAMA_DIR}/ggml/src/kompute/CMakeLists.txt")
        message(FATAL_ERROR "Kompute not found")
    endif()
    message(STATUS "Kompute found")

    find_package(Vulkan COMPONENTS glslc)
    if (NOT Vulkan_FOUND)
message(FATAL_ERROR "Vulkan not found. To build without Vulkan, use -DLLMODEL_KOMPUTE=OFF.") endif() find_program(glslc_executable NAMES glslc HINTS Vulkan::glslc) if (NOT glslc_executable) message(FATAL_ERROR "glslc not found. To build without Vulkan, use -DLLMODEL_KOMPUTE=OFF.") endif() function(compile_shader) set(options) set(oneValueArgs) set(multiValueArgs SOURCES) cmake_parse_arguments(compile_shader "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN}) foreach(source ${compile_shader_SOURCES}) get_filename_component(OP_FILE ${source} NAME) set(spv_file ${CMAKE_CURRENT_BINARY_DIR}/${OP_FILE}.spv) add_custom_command( OUTPUT ${spv_file} DEPENDS ${LLAMA_DIR}/ggml/src/kompute-shaders/${source} ${LLAMA_DIR}/ggml/src/kompute-shaders/common.comp ${LLAMA_DIR}/ggml/src/kompute-shaders/op_getrows.comp ${LLAMA_DIR}/ggml/src/kompute-shaders/op_mul_mv_q_n_pre.comp ${LLAMA_DIR}/ggml/src/kompute-shaders/op_mul_mv_q_n.comp COMMAND ${glslc_executable} --target-env=vulkan1.2 -o ${spv_file} ${LLAMA_DIR}/ggml/src/kompute-shaders/${source} COMMENT "Compiling ${source} to ${source}.spv" ) get_filename_component(RAW_FILE_NAME ${spv_file} NAME) set(FILE_NAME "shader${RAW_FILE_NAME}") string(REPLACE ".comp.spv" ".h" HEADER_FILE ${FILE_NAME}) string(TOUPPER ${HEADER_FILE} HEADER_FILE_DEFINE) string(REPLACE "." 
"_" HEADER_FILE_DEFINE "${HEADER_FILE_DEFINE}")
            set(OUTPUT_HEADER_FILE "${HEADER_FILE}")
            message(STATUS "${HEADER_FILE} generating ${HEADER_FILE_DEFINE}")
            if(CMAKE_GENERATOR MATCHES "Visual Studio")
                add_custom_command(
                    OUTPUT ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "/*THIS FILE HAS BEEN AUTOMATICALLY GENERATED - DO NOT EDIT*/" > ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#ifndef ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#define ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "namespace kp {" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "namespace shader_data {" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_BINARY_DIR}/bin/$<CONFIG>/xxd -i ${RAW_FILE_NAME} >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "}}" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#endif // define ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    DEPENDS ${spv_file} xxd
                    COMMENT "Converting to hpp: ${FILE_NAME} ${CMAKE_BINARY_DIR}/bin/$<CONFIG>/xxd"
                )
            else()
                add_custom_command(
                    OUTPUT ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "/*THIS FILE HAS BEEN AUTOMATICALLY GENERATED - DO NOT EDIT*/" > ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#ifndef ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#define ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "namespace kp {" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "namespace shader_data {" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_BINARY_DIR}/bin/xxd -i ${RAW_FILE_NAME} >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo "}}" >> ${OUTPUT_HEADER_FILE}
                    COMMAND ${CMAKE_COMMAND} -E echo \"\#endif // define ${HEADER_FILE_DEFINE}\" >> ${OUTPUT_HEADER_FILE}
                    DEPENDS ${spv_file} xxd
                    COMMENT "Converting to hpp: ${FILE_NAME} ${CMAKE_BINARY_DIR}/bin/xxd"
                )
            endif()
        endforeach()
    endfunction()

    set(KOMPUTE_OPT_BUILT_IN_VULKAN_HEADER_TAG
"v1.3.239" CACHE STRING "Kompute Vulkan headers tag") set(KOMPUTE_OPT_LOG_LEVEL Critical CACHE STRING "Kompute log level") set(FMT_INSTALL OFF) add_subdirectory(${LLAMA_DIR}/ggml/src/kompute) # Compile our shaders compile_shader(SOURCES op_scale.comp op_scale_8.comp op_add.comp op_addrow.comp op_mul.comp op_silu.comp op_relu.comp op_gelu.comp op_softmax.comp op_norm.comp op_rmsnorm.comp op_diagmask.comp op_mul_mat_mat_f32.comp op_mul_mat_f16.comp op_mul_mat_q8_0.comp op_mul_mat_q4_0.comp op_mul_mat_q4_1.comp op_mul_mat_q6_k.comp op_getrows_f32.comp op_getrows_f16.comp op_getrows_q4_0.comp op_getrows_q4_1.comp op_getrows_q6_k.comp op_rope_f16.comp op_rope_f32.comp op_cpy_f16_f16.comp op_cpy_f16_f32.comp op_cpy_f32_f16.comp op_cpy_f32_f32.comp ) # Create a custom target for our generated shaders add_custom_target(generated_shaders DEPENDS shaderop_scale.h shaderop_scale_8.h shaderop_add.h shaderop_addrow.h shaderop_mul.h shaderop_silu.h shaderop_relu.h shaderop_gelu.h shaderop_softmax.h shaderop_norm.h shaderop_rmsnorm.h shaderop_diagmask.h shaderop_mul_mat_mat_f32.h shaderop_mul_mat_f16.h shaderop_mul_mat_q8_0.h shaderop_mul_mat_q4_0.h shaderop_mul_mat_q4_1.h shaderop_mul_mat_q6_k.h shaderop_getrows_f32.h shaderop_getrows_f16.h shaderop_getrows_q4_0.h shaderop_getrows_q4_1.h shaderop_getrows_q6_k.h shaderop_rope_f16.h shaderop_rope_f32.h shaderop_cpy_f16_f16.h shaderop_cpy_f16_f32.h shaderop_cpy_f32_f16.h shaderop_cpy_f32_f32.h ) # Create a custom command that depends on the generated_shaders add_custom_command( OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/ggml-kompute.stamp COMMAND ${CMAKE_COMMAND} -E touch ${CMAKE_CURRENT_BINARY_DIR}/ggml-kompute.stamp DEPENDS generated_shaders COMMENT "Ensuring shaders are generated before compiling ggml-kompute.cpp" ) endif() if (GGML_KOMPUTE) list(APPEND GGML_COMPILE_DEFS VULKAN_HPP_DISPATCH_LOADER_DYNAMIC=1) # Add the stamp to the main sources to ensure dependency tracking set(GGML_SOURCES_KOMPUTE ${LLAMA_DIR}/ggml/src/ggml-kompute.cpp 
${CMAKE_CURRENT_BINARY_DIR}/ggml-kompute.stamp) set(GGML_HEADERS_KOMPUTE ${LLAMA_DIR}/ggml/include/ggml-kompute.h) list(APPEND GGML_COMPILE_DEFS_PUBLIC GGML_USE_KOMPUTE) set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} kompute) endif() set(CUDA_CXX_FLAGS "") if (GGML_CUDA) set(CUDA_FLAGS -use_fast_math) if (LLAMA_FATAL_WARNINGS) list(APPEND CUDA_FLAGS -Werror all-warnings) endif() if (LLAMA_ALL_WARNINGS AND NOT MSVC) set(NVCC_CMD ${CMAKE_CUDA_COMPILER} .c) if (NOT CMAKE_CUDA_HOST_COMPILER STREQUAL "") list(APPEND NVCC_CMD -ccbin ${CMAKE_CUDA_HOST_COMPILER}) endif() execute_process( COMMAND ${NVCC_CMD} -Xcompiler --version OUTPUT_VARIABLE CUDA_CCFULLVER ERROR_QUIET ) if (NOT CUDA_CCFULLVER MATCHES clang) set(CUDA_CCID "GNU") execute_process( COMMAND ${NVCC_CMD} -Xcompiler "-dumpfullversion -dumpversion" OUTPUT_VARIABLE CUDA_CCVER OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_QUIET ) else() if (CUDA_CCFULLVER MATCHES Apple) set(CUDA_CCID "AppleClang") else() set(CUDA_CCID "Clang") endif() string(REGEX REPLACE "^.* version ([0-9.]*).*$" "\\1" CUDA_CCVER ${CUDA_CCFULLVER}) endif() message("-- CUDA host compiler is ${CUDA_CCID} ${CUDA_CCVER}") get_flags(${CUDA_CCID} ${CUDA_CCVER}) list(APPEND CUDA_CXX_FLAGS ${CXX_FLAGS} ${GF_CXX_FLAGS}) # This is passed to -Xcompiler later endif() if (NOT MSVC) list(APPEND CUDA_CXX_FLAGS -Wno-pedantic) endif() endif() if (GGML_METAL) find_library(FOUNDATION_LIBRARY Foundation REQUIRED) find_library(METAL_FRAMEWORK Metal REQUIRED) find_library(METALKIT_FRAMEWORK MetalKit REQUIRED) message(STATUS "Metal framework found") set(GGML_HEADERS_METAL ${DIRECTORY}/ggml/include/ggml-metal.h) set(GGML_SOURCES_METAL ${DIRECTORY}/ggml/src/ggml-metal.m) list(APPEND GGML_COMPILE_DEFS_PUBLIC GGML_USE_METAL) if (GGML_METAL_NDEBUG) list(APPEND GGML_COMPILE_DEFS GGML_METAL_NDEBUG) endif() # copy ggml-common.h and ggml-metal.metal to bin directory configure_file(${DIRECTORY}/ggml/src/ggml-common.h ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-common.h COPYONLY) 
configure_file(${DIRECTORY}/ggml/src/ggml-metal.metal ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal COPYONLY) if (GGML_METAL_SHADER_DEBUG) # custom command to do the following: # xcrun -sdk macosx metal -fno-fast-math -c ggml-metal.metal -o ggml-metal.air # xcrun -sdk macosx metallib ggml-metal.air -o default.metallib # # note: this is the only way I found to disable fast-math in Metal. it's ugly, but at least it works # disabling fast math is needed in order to pass tests/test-backend-ops # note: adding -fno-inline fixes the tests when using MTL_SHADER_VALIDATION=1 # note: unfortunately, we have to call it default.metallib instead of ggml.metallib # ref: https://github.com/ggerganov/whisper.cpp/issues/1720 set(XC_FLAGS -fno-fast-math -fno-inline -g) else() set(XC_FLAGS -O3) endif() # Append macOS metal versioning flags if (GGML_METAL_MACOSX_VERSION_MIN) message(STATUS "Adding -mmacosx-version-min=${GGML_METAL_MACOSX_VERSION_MIN} flag to metal compilation") list(APPEND XC_FLAGS -mmacosx-version-min=${GGML_METAL_MACOSX_VERSION_MIN}) endif() if (GGML_METAL_STD) message(STATUS "Adding -std=${GGML_METAL_STD} flag to metal compilation") list(APPEND XC_FLAGS -std=${GGML_METAL_STD}) endif() set(GGML_METALLIB "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib") set(GGML_METALLIB "${GGML_METALLIB}" PARENT_SCOPE) add_custom_command( OUTPUT ${GGML_METALLIB} COMMAND xcrun -sdk macosx metal ${XC_FLAGS} -c ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal -o ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air COMMAND xcrun -sdk macosx metallib ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air -o ${GGML_METALLIB} COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-common.h COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal DEPENDS ${DIRECTORY}/ggml/src/ggml-metal.metal ${DIRECTORY}/ggml/src/ggml-common.h COMMENT "Compiling Metal kernels" ) add_custom_target( ggml-metal ALL DEPENDS 
${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib ) set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} ${FOUNDATION_LIBRARY} ${METAL_FRAMEWORK} ${METALKIT_FRAMEWORK} ) endif() set(ARCH_FLAGS "") if (CMAKE_OSX_ARCHITECTURES STREQUAL "arm64" OR CMAKE_GENERATOR_PLATFORM_LWR STREQUAL "arm64" OR (NOT CMAKE_OSX_ARCHITECTURES AND NOT CMAKE_GENERATOR_PLATFORM_LWR AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|arm.*|ARM64)$")) message(STATUS "ARM detected") if (MSVC) # TODO: arm msvc? else() check_cxx_compiler_flag(-mfp16-format=ieee COMPILER_SUPPORTS_FP16_FORMAT_I3E) if (NOT "${COMPILER_SUPPORTS_FP16_FORMAT_I3E}" STREQUAL "") list(APPEND ARCH_FLAGS -mfp16-format=ieee) endif() if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv6") # Raspberry Pi 1, Zero list(APPEND ARCH_FLAGS -mfpu=neon-fp-armv8 -mno-unaligned-access) endif() if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv7") if ("${CMAKE_SYSTEM_NAME}" STREQUAL "Android") # Android armeabi-v7a list(APPEND ARCH_FLAGS -mfpu=neon-vfpv4 -mno-unaligned-access -funsafe-math-optimizations) else() # Raspberry Pi 2 list(APPEND ARCH_FLAGS -mfpu=neon-fp-armv8 -mno-unaligned-access -funsafe-math-optimizations) endif() endif() if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv8") # Android arm64-v8a # Raspberry Pi 3, 4, Zero 2 (32-bit) list(APPEND ARCH_FLAGS -mno-unaligned-access) endif() endif() elseif (CMAKE_OSX_ARCHITECTURES STREQUAL "x86_64" OR CMAKE_GENERATOR_PLATFORM_LWR MATCHES "^(x86_64|i686|amd64|x64|win32)$" OR (NOT CMAKE_OSX_ARCHITECTURES AND NOT CMAKE_GENERATOR_PLATFORM_LWR AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|i686|AMD64)$")) message(STATUS "x86 detected") if (MSVC) if (GGML_AVX512) list(APPEND ARCH_FLAGS /arch:AVX512) # MSVC has no compile-time flags enabling specific # AVX512 extensions, neither it defines the # macros corresponding to the extensions. # Do it manually. 
            if (GGML_AVX512_VBMI)
                list(APPEND GGML_COMPILE_DEFS $<$<COMPILE_LANGUAGE:C>:__AVX512VBMI__>)
                list(APPEND GGML_COMPILE_DEFS $<$<COMPILE_LANGUAGE:CXX>:__AVX512VBMI__>)
            endif()
            if (GGML_AVX512_VNNI)
                list(APPEND GGML_COMPILE_DEFS $<$<COMPILE_LANGUAGE:C>:__AVX512VNNI__>)
                list(APPEND GGML_COMPILE_DEFS $<$<COMPILE_LANGUAGE:CXX>:__AVX512VNNI__>)
            endif()
        elseif (GGML_AVX2)
            list(APPEND ARCH_FLAGS /arch:AVX2)
        elseif (GGML_AVX)
            list(APPEND ARCH_FLAGS /arch:AVX)
        endif()
    else()
        if (GGML_NATIVE)
            list(APPEND ARCH_FLAGS -march=native)
        endif()
        if (GGML_F16C)
            list(APPEND ARCH_FLAGS -mf16c)
        endif()
        if (GGML_FMA)
            list(APPEND ARCH_FLAGS -mfma)
        endif()
        if (GGML_AVX)
            list(APPEND ARCH_FLAGS -mavx)
        endif()
        if (GGML_AVX2)
            list(APPEND ARCH_FLAGS -mavx2)
        endif()
        if (GGML_AVX512)
            list(APPEND ARCH_FLAGS -mavx512f)
            list(APPEND ARCH_FLAGS -mavx512bw)
        endif()
        if (GGML_AVX512_VBMI)
            list(APPEND ARCH_FLAGS -mavx512vbmi)
        endif()
        if (GGML_AVX512_VNNI)
            list(APPEND ARCH_FLAGS -mavx512vnni)
        endif()
    endif()
elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "ppc64")
    message(STATUS "PowerPC detected")
    if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "ppc64le")
        list(APPEND ARCH_FLAGS -mcpu=powerpc64le)
    else()
        list(APPEND ARCH_FLAGS -mcpu=native -mtune=native)
        #TODO: Add targets for Power8/Power9 (Altivec/VSX) and Power10(MMA) and query for big endian systems (ppc64/le/be)
    endif()
else()
    message(STATUS "Unknown architecture")
endif()

list(APPEND GGML_COMPILE_OPTS "$<$<COMPILE_LANGUAGE:C>:${ARCH_FLAGS}>")
list(APPEND GGML_COMPILE_OPTS "$<$<COMPILE_LANGUAGE:CXX>:${ARCH_FLAGS}>")

if (GGML_CUDA)
    list(APPEND CUDA_CXX_FLAGS ${ARCH_FLAGS})
    list(JOIN CUDA_CXX_FLAGS " " CUDA_CXX_FLAGS_JOINED)  # pass host compiler flags as a single argument
    if (NOT CUDA_CXX_FLAGS_JOINED STREQUAL "")
        list(APPEND CUDA_FLAGS -Xcompiler ${CUDA_CXX_FLAGS_JOINED})
    endif()
    list(APPEND GGML_COMPILE_OPTS "$<$<COMPILE_LANGUAGE:CUDA>:${CUDA_FLAGS}>")
endif()

# ggml

add_library(ggml${SUFFIX} OBJECT
            ${DIRECTORY}/ggml/include/ggml.h
            ${DIRECTORY}/ggml/include/ggml-alloc.h
            ${DIRECTORY}/ggml/include/ggml-backend.h
            ${DIRECTORY}/ggml/src/ggml.c
            ${DIRECTORY}/ggml/src/ggml-alloc.c
            ${DIRECTORY}/ggml/src/ggml-backend.c
            ${DIRECTORY}/ggml/src/ggml-quants.c
            ${DIRECTORY}/ggml/src/ggml-quants.h
            ${GGML_SOURCES_CUDA}      ${GGML_HEADERS_CUDA}
            ${GGML_SOURCES_METAL}     ${GGML_HEADERS_METAL}
            ${GGML_SOURCES_KOMPUTE}   ${GGML_HEADERS_KOMPUTE}
            ${GGML_SOURCES_VULKAN}    ${GGML_HEADERS_VULKAN}
            ${GGML_SOURCES_ROCM}      ${GGML_HEADERS_ROCM}
            ${GGML_SOURCES_LLAMAFILE} ${GGML_HEADERS_LLAMAFILE}
            ${DIRECTORY}/ggml/src/ggml-aarch64.c
            ${DIRECTORY}/ggml/src/ggml-aarch64.h
            )

target_include_directories(ggml${SUFFIX} PUBLIC ${DIRECTORY}/ggml/include ${LLAMA_EXTRA_INCLUDES})
target_include_directories(ggml${SUFFIX} PRIVATE ${DIRECTORY}/ggml/src)
target_compile_features(ggml${SUFFIX} PUBLIC c_std_11) # don't bump

target_link_libraries(ggml${SUFFIX} PUBLIC Threads::Threads ${LLAMA_EXTRA_LIBS})

if (BUILD_SHARED_LIBS)
    set_target_properties(ggml${SUFFIX} PROPERTIES POSITION_INDEPENDENT_CODE ON)
endif()

# llama

add_library(llama${SUFFIX} STATIC
            ${DIRECTORY}/include/llama.h
            ${DIRECTORY}/src/llama-grammar.cpp
            ${DIRECTORY}/src/llama-sampling.cpp
            ${DIRECTORY}/src/llama-vocab.cpp
            ${DIRECTORY}/src/llama.cpp
            ${DIRECTORY}/src/unicode-data.cpp
            ${DIRECTORY}/src/unicode.cpp
            ${DIRECTORY}/src/unicode.h
            )

target_include_directories(llama${SUFFIX} PUBLIC ${DIRECTORY}/include ${DIRECTORY}/ggml/include)
target_include_directories(llama${SUFFIX} PRIVATE ${DIRECTORY}/src)
target_compile_features   (llama${SUFFIX} PUBLIC cxx_std_11) # don't bump

target_link_libraries(llama${SUFFIX} PRIVATE
    ggml${SUFFIX}
    ${LLAMA_EXTRA_LIBS}
    )

if (BUILD_SHARED_LIBS)
    set_target_properties(llama${SUFFIX} PROPERTIES POSITION_INDEPENDENT_CODE ON)
    target_compile_definitions(llama${SUFFIX} PRIVATE LLAMA_SHARED LLAMA_BUILD)
endif()

# target options

set_target_properties(ggml${SUFFIX} llama${SUFFIX} PROPERTIES
    CXX_STANDARD 11
    CXX_STANDARD_REQUIRED true
    C_STANDARD 11
    C_STANDARD_REQUIRED true
    )

target_compile_options(ggml${SUFFIX}  PRIVATE "${GGML_COMPILE_OPTS}")
target_compile_options(llama${SUFFIX} PRIVATE "${GGML_COMPILE_OPTS}")

target_compile_definitions(ggml${SUFFIX} PRIVATE
"${GGML_COMPILE_DEFS}") target_compile_definitions(llama${SUFFIX} PRIVATE "${GGML_COMPILE_DEFS}") target_compile_definitions(ggml${SUFFIX} PUBLIC "${GGML_COMPILE_DEFS_PUBLIC}") target_compile_definitions(llama${SUFFIX} PUBLIC "${GGML_COMPILE_DEFS_PUBLIC}") target_link_options(ggml${SUFFIX} PRIVATE "${GGML_LINK_OPTS}") target_link_options(llama${SUFFIX} PRIVATE "${GGML_LINK_OPTS}") endfunction() ================================================ FILE: gpt4all-backend/src/dlhandle.cpp ================================================ #include "dlhandle.h" #include #ifndef _WIN32 # include #else # include # include # define WIN32_LEAN_AND_MEAN # ifndef NOMINMAX # define NOMINMAX # endif # include #endif using namespace std::string_literals; namespace fs = std::filesystem; #ifndef _WIN32 Dlhandle::Dlhandle(const fs::path &fpath) { chandle = dlopen(fpath.c_str(), RTLD_LAZY | RTLD_LOCAL); if (!chandle) { throw Exception("dlopen: "s + dlerror()); } } Dlhandle::~Dlhandle() { if (chandle) dlclose(chandle); } void *Dlhandle::get_internal(const char *symbol) const { return dlsym(chandle, symbol); } #else // defined(_WIN32) Dlhandle::Dlhandle(const fs::path &fpath) { fs::path afpath = fs::absolute(fpath); // Suppress the "Entry Point Not Found" dialog, caused by outdated nvcuda.dll from the GPU driver UINT lastErrorMode = GetErrorMode(); SetErrorMode(lastErrorMode | SEM_FAILCRITICALERRORS); chandle = LoadLibraryExW(afpath.c_str(), NULL, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS | LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR); SetErrorMode(lastErrorMode); if (!chandle) { DWORD err = GetLastError(); std::ostringstream ss; ss << "LoadLibraryExW failed with error 0x" << std::hex << err; throw Exception(ss.str()); } } Dlhandle::~Dlhandle() { if (chandle) FreeLibrary(HMODULE(chandle)); } void *Dlhandle::get_internal(const char *symbol) const { return GetProcAddress(HMODULE(chandle), symbol); } #endif // defined(_WIN32) ================================================ FILE: gpt4all-backend/src/dlhandle.h 
================================================
#pragma once

#include <filesystem>
#include <stdexcept>
#include <string>
#include <utility>

namespace fs = std::filesystem;


class Dlhandle {
    void *chandle = nullptr;

public:
    class Exception : public std::runtime_error {
    public:
        using std::runtime_error::runtime_error;
    };

    Dlhandle() = default;
    Dlhandle(const fs::path &fpath);
    Dlhandle(const Dlhandle &o) = delete;
    Dlhandle(Dlhandle &&o)
        : chandle(o.chandle)
    {
        o.chandle = nullptr;
    }

    ~Dlhandle();

    Dlhandle &operator=(Dlhandle &&o) {
        chandle = std::exchange(o.chandle, nullptr);
        return *this;
    }

    template <typename T>
    T *get(const std::string &symbol) const {
        return reinterpret_cast<T *>(get_internal(symbol.c_str()));
    }

    auto get_fnc(const std::string &symbol) const {
        return get<void *(...)>(symbol);
    }

private:
    void *get_internal(const char *symbol) const;
};


================================================
FILE: gpt4all-backend/src/llamamodel.cpp
================================================
#define LLAMAMODEL_H_I_KNOW_WHAT_I_AM_DOING_WHEN_INCLUDING_THIS_FILE
#include "llamamodel_impl.h"

#include "llmodel.h"
#include "utils.h"

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#ifdef GGML_USE_KOMPUTE
#   include <ggml-kompute.h>
#elif defined(GGML_USE_VULKAN)
#   include <ggml-vulkan.h>
#elif defined(GGML_USE_CUDA)
#   include <ggml-cuda.h>
#endif

using namespace std::string_literals;

// Maximum supported GGUF version
static constexpr int GGUF_VER_MAX = 3;

static const char * const modelType_ = "LLaMA";

// note: same order as LLM_ARCH_NAMES in llama.cpp
static const std::vector<const char *> KNOWN_ARCHES {
    "llama",
    "falcon",
    // "grok",        -- 314B parameters
    "gpt2",
    // "gptj",        -- no inference code
    "gptneox",
    "granite",
    "granitemoe",
    "mpt",
    "baichuan",
    "starcoder",
    "refact",
    "bert",
    "nomic-bert",
    // "jina-bert-v2", -- Assertion `i01 >= 0 && i01 < ne01' failed.
"bloom", "stablelm", "qwen", "qwen2", "qwen2moe", "phi2", "phi3", // "plamo", -- https://github.com/ggerganov/llama.cpp/issues/5669 "codeshell", "orion", "internlm2", // "minicpm", -- CUDA generates garbage "gemma", "gemma2", "starcoder2", // "mamba", -- CUDA missing SSM_CONV "xverse", "command-r", // "dbrx", -- 16x12B parameters "olmo", "olmoe", "openelm", // "arctic", -- 10B+128x3.66B parameters "deepseek2", "chatglm", // "bitnet", -- tensor not within file bounds? // "t5", -- seq2seq model "jais", }; static const std::vector EMBEDDING_ARCHES { "bert", "nomic-bert", }; static bool is_embedding_arch(const std::string &arch) { return std::find(EMBEDDING_ARCHES.begin(), EMBEDDING_ARCHES.end(), arch) < EMBEDDING_ARCHES.end(); } static bool llama_verbose() { const char* var = getenv("GPT4ALL_VERBOSE_LLAMACPP"); return var && *var; } static void llama_log_callback(ggml_log_level level, const char *text, void *userdata, bool warn) { (void)userdata; static ggml_log_level lastlevel = GGML_LOG_LEVEL_NONE; if (!llama_verbose()) { auto efflevel = level == GGML_LOG_LEVEL_CONT ? lastlevel : level; lastlevel = efflevel; switch (efflevel) { case GGML_LOG_LEVEL_CONT: UNREACHABLE(); break; case GGML_LOG_LEVEL_WARN: if (warn) break; [[fallthrough]]; case GGML_LOG_LEVEL_NONE: // not used? 
        case GGML_LOG_LEVEL_INFO:
        case GGML_LOG_LEVEL_DEBUG:
            return; // suppress
        case GGML_LOG_LEVEL_ERROR:
            ;
        }
    }

    fputs(text, stderr);
}

struct gpt_params {
    int32_t n_keep = 0; // number of tokens to keep from initial prompt

    // sampling parameters
    float tfs_z     = 1.0f; // 1.0 = disabled
    float typical_p = 1.0f; // 1.0 = disabled

    std::string prompt = "";

    enum ggml_type kv_type = GGML_TYPE_F16; // use f16 instead of f32 for memory kv

    bool use_mmap  = true;  // use mmap for faster loads
    bool use_mlock = false; // use mlock to keep model in memory
};

const char *get_arch_name(gguf_context *ctx_gguf)
{
    const int kid = gguf_find_key(ctx_gguf, "general.architecture");
    if (kid == -1)
        throw std::runtime_error("key not found in model: general.architecture");

    enum gguf_type ktype = gguf_get_kv_type(ctx_gguf, kid);
    if (ktype != GGUF_TYPE_STRING)
        throw std::runtime_error("key general.architecture has wrong type");

    return gguf_get_val_str(ctx_gguf, kid);
}

static gguf_context *load_gguf(const char *fname)
{
    struct gguf_init_params params = {
        /*.no_alloc = */ true,
        /*.ctx      = */ nullptr,
    };
    gguf_context *ctx = gguf_init_from_file(fname, params);
    if (!ctx) {
        std::cerr << __func__ << ": gguf_init_from_file failed\n";
        return nullptr;
    }

    int gguf_ver = gguf_get_version(ctx);
    if (gguf_ver > GGUF_VER_MAX) {
        std::cerr << __func__ << ": unsupported gguf version: " << gguf_ver << "\n";
        gguf_free(ctx);
        return nullptr;
    }

    return ctx;
}

static int32_t get_arch_key_u32(std::string const &modelPath, std::string const &archKey)
{
    int32_t value = -1;
    std::string arch;

    auto * ctx = load_gguf(modelPath.c_str());
    if (!ctx)
        goto cleanup;

    try {
        arch = get_arch_name(ctx);
    } catch (const std::runtime_error &) {
        goto cleanup; // cannot read key
    }

    {
        auto key = arch + "."
                 + archKey;
        int keyidx = gguf_find_key(ctx, key.c_str());
        if (keyidx != -1) {
            value = gguf_get_val_u32(ctx, keyidx);
        } else {
            std::cerr << __func__ << ": " << key << " not found in " << modelPath << "\n";
        }
    }

cleanup:
    gguf_free(ctx);
    return value;
}

struct LLamaPrivate {
    bool modelLoaded = false;
    int device = -1;
    std::string deviceName;
    int64_t n_threads = 0;
    std::vector<LLModel::Token> end_tokens;
    const char *backend_name = nullptr;
    std::vector<LLModel::Token> inputTokens;

    llama_model *model = nullptr;
    llama_context *ctx = nullptr;
    llama_model_params model_params;
    llama_context_params ctx_params;
    llama_sampler *sampler_chain;
};

LLamaModel::LLamaModel()
    : d_ptr(std::make_unique<LLamaPrivate>())
{
    auto sparams = llama_sampler_chain_default_params();
    d_ptr->sampler_chain = llama_sampler_chain_init(sparams);
}

// default hparams (LLaMA 7B)
struct llama_file_hparams {
    uint32_t n_vocab = 32000;
    uint32_t n_embd  = 4096;
    uint32_t n_mult  = 256;
    uint32_t n_head  = 32;
    uint32_t n_layer = 32;
    uint32_t n_rot   = 64;
    enum llama_ftype ftype = LLAMA_FTYPE_MOSTLY_F16;
};

size_t LLamaModel::requiredMem(const std::string &modelPath, int n_ctx, int ngl)
{
    // TODO(cebtenzzre): update to GGUF
    (void)ngl; // FIXME(cetenzzre): use this value
    auto fin = std::ifstream(modelPath, std::ios::binary);
    fin.seekg(0, std::ios_base::end);
    size_t filesize = fin.tellg();
    fin.seekg(0, std::ios_base::beg);
    uint32_t magic = 0;
    fin.read(reinterpret_cast<char *>(&magic), sizeof(magic));
    if (magic != 0x67676a74) return 0;
    uint32_t version = 0;
    fin.read(reinterpret_cast<char *>(&version), sizeof(version));
    llama_file_hparams hparams;
    fin.read(reinterpret_cast<char *>(&hparams.n_vocab), sizeof(hparams.n_vocab));
    fin.read(reinterpret_cast<char *>(&hparams.n_embd),  sizeof(hparams.n_embd));
    fin.read(reinterpret_cast<char *>(&hparams.n_head),  sizeof(hparams.n_head));
    fin.read(reinterpret_cast<char *>(&hparams.n_layer), sizeof(hparams.n_layer));
    fin.read(reinterpret_cast<char *>(&hparams.n_rot),   sizeof(hparams.n_rot));
    fin.read(reinterpret_cast<char *>(&hparams.ftype),   sizeof(hparams.ftype));
    const size_t
          kvcache_element_size = 2; // fp16
    const size_t est_kvcache_size = hparams.n_embd * hparams.n_layer * 2u * n_ctx * kvcache_element_size;
    return filesize + est_kvcache_size;
}

bool LLamaModel::isModelBlacklisted(const std::string &modelPath) const
{
    auto * ctx = load_gguf(modelPath.c_str());
    if (!ctx) {
        std::cerr << __func__ << ": failed to load " << modelPath << "\n";
        return false;
    }

    auto get_key = [ctx, &modelPath](const char *name) {
        int keyidx = gguf_find_key(ctx, name);
        if (keyidx == -1) {
            throw std::logic_error(name + " not found in "s + modelPath);
        }
        return keyidx;
    };

    bool res = false;
    try {
        std::string name(gguf_get_val_str(ctx, get_key("general.name")));
        int token_idx = get_key("tokenizer.ggml.tokens");
        int n_vocab = gguf_get_arr_n(ctx, token_idx);

        // check for known bad models
        if (name == "open-orca_mistral-7b-openorca"
            && n_vocab == 32002
            && gguf_get_arr_str(ctx, token_idx, 32000) == "<dummy32000>"s // should be <|im_end|>
        ) {
            res = true;
        }
    } catch (const std::logic_error &e) {
        std::cerr << __func__ << ": " << e.what() << "\n";
    }

    gguf_free(ctx);
    return res;
}

bool LLamaModel::isEmbeddingModel(const std::string &modelPath) const
{
    bool result = false;
    std::string arch;

    auto *ctx_gguf = load_gguf(modelPath.c_str());
    if (!ctx_gguf) {
        std::cerr << __func__ << ": failed to load GGUF from " << modelPath << "\n";
        goto cleanup;
    }

    try {
        arch = get_arch_name(ctx_gguf);
    } catch (const std::runtime_error &) {
        goto cleanup; // cannot read key
    }

    result = is_embedding_arch(arch);

cleanup:
    gguf_free(ctx_gguf);
    return result;
}

bool LLamaModel::loadModel(const std::string &modelPath, int n_ctx, int ngl)
{
    d_ptr->modelLoaded = false;

    // clean up after previous loadModel()
    if (d_ptr->model) {
        llama_free_model(d_ptr->model);
        d_ptr->model = nullptr;
    }
    if (d_ptr->ctx) {
        llama_free(d_ptr->ctx);
        d_ptr->ctx = nullptr;
    }

    if (n_ctx < 8) {
        std::cerr << "warning: minimum context size is 8, using minimum size.\n";
        n_ctx = 8;
    }

    // -- load the model --

    gpt_params params;

    d_ptr->model_params =
 llama_model_default_params();

    d_ptr->model_params.use_mmap = params.use_mmap;
#if defined (__APPLE__)
    d_ptr->model_params.use_mlock = true;
#else
    d_ptr->model_params.use_mlock = params.use_mlock;
#endif

    d_ptr->model_params.progress_callback = &LLModel::staticProgressCallback;
    d_ptr->model_params.progress_callback_user_data = this;

    d_ptr->backend_name = "cpu"; // default

#if defined(GGML_USE_KOMPUTE) || defined(GGML_USE_VULKAN) || defined(GGML_USE_CUDA)
    if (d_ptr->device != -1) {
        d_ptr->model_params.main_gpu = d_ptr->device;
        d_ptr->model_params.n_gpu_layers = ngl;
        d_ptr->model_params.split_mode = LLAMA_SPLIT_MODE_NONE;
    } else {
#ifdef GGML_USE_CUDA
        std::cerr << "Llama ERROR: CUDA loadModel was called without a device\n";
        return false;
#endif // GGML_USE_CUDA
    }
#elif defined(GGML_USE_METAL)
    (void)ngl;

    if (llama_verbose()) {
        std::cerr << "llama.cpp: using Metal" << std::endl;
    }
    d_ptr->backend_name = "metal";

    // always fully offload on Metal
    // TODO(cebtenzzre): use this parameter to allow using more than 53% of system RAM to load a model
    d_ptr->model_params.n_gpu_layers = 100;
#else // !KOMPUTE && !VULKAN && !CUDA && !METAL
    (void)ngl;
#endif

    d_ptr->model = llama_load_model_from_file(modelPath.c_str(), d_ptr->model_params);
    if (!d_ptr->model) {
        fflush(stdout);
#ifndef GGML_USE_CUDA
        d_ptr->device = -1;
        d_ptr->deviceName.clear();
#endif
        std::cerr << "LLAMA ERROR: failed to load model from " << modelPath << std::endl;
        return false;
    }

    // -- initialize the context --

    d_ptr->ctx_params = llama_context_default_params();

    bool isEmbedding = is_embedding_arch(llama_model_arch(d_ptr->model));
    const int n_ctx_train = llama_n_ctx_train(d_ptr->model);
    if (isEmbedding) {
        d_ptr->ctx_params.n_batch  = n_ctx;
        d_ptr->ctx_params.n_ubatch = n_ctx;
    } else {
        if (n_ctx > n_ctx_train) {
            std::cerr << "warning: model was trained on only " << n_ctx_train << " context tokens ("
                      << n_ctx << " specified)\n";
        }
    }

    d_ptr->ctx_params.n_ctx  = n_ctx;
    d_ptr->ctx_params.type_k = params.kv_type;
    d_ptr->ctx_params.type_v = params.kv_type;

    // The new batch API provides space for n_vocab*n_tokens logits. Tell llama.cpp early
    // that we want this many logits so the state serializes consistently.
    d_ptr->ctx_params.logits_all = true;

    d_ptr->n_threads = std::min(4, (int32_t) std::thread::hardware_concurrency());
    d_ptr->ctx_params.n_threads       = d_ptr->n_threads;
    d_ptr->ctx_params.n_threads_batch = d_ptr->n_threads;

    if (isEmbedding)
        d_ptr->ctx_params.embeddings = true;

    d_ptr->ctx = llama_new_context_with_model(d_ptr->model, d_ptr->ctx_params);
    if (!d_ptr->ctx) {
        fflush(stdout);
        std::cerr << "LLAMA ERROR: failed to init context for model " << modelPath << std::endl;
        llama_free_model(d_ptr->model);
        d_ptr->model = nullptr;
#ifndef GGML_USE_CUDA
        d_ptr->device = -1;
        d_ptr->deviceName.clear();
#endif
        return false;
    }

    d_ptr->end_tokens = {llama_token_eos(d_ptr->model)};

    if (usingGPUDevice()) {
#ifdef GGML_USE_KOMPUTE
        if (llama_verbose()) {
            std::cerr << "llama.cpp: using Vulkan on " << d_ptr->deviceName << std::endl;
        }
        d_ptr->backend_name = "kompute";
#elif defined(GGML_USE_VULKAN)
        d_ptr->backend_name = "vulkan";
#elif defined(GGML_USE_CUDA)
        d_ptr->backend_name = "cuda";
#endif
    }

    m_supportsEmbedding  = isEmbedding;
    m_supportsCompletion = !isEmbedding;

    fflush(stdout);
    d_ptr->modelLoaded = true;
    return true;
}

void LLamaModel::setThreadCount(int32_t n_threads)
{
    d_ptr->n_threads = n_threads;
    llama_set_n_threads(d_ptr->ctx, n_threads, n_threads);
}

int32_t LLamaModel::threadCount() const
{
    return d_ptr->n_threads;
}

LLamaModel::~LLamaModel()
{
    if (d_ptr->ctx) {
        llama_free(d_ptr->ctx);
    }
    llama_free_model(d_ptr->model);
    llama_sampler_free(d_ptr->sampler_chain);
}

bool LLamaModel::isModelLoaded() const
{
    return d_ptr->modelLoaded;
}

size_t LLamaModel::stateSize() const
{
    return llama_state_get_size(d_ptr->ctx);
}

size_t LLamaModel::saveState(std::span<uint8_t> stateOut, std::vector<Token> &inputTokensOut) const
{
    size_t bytesWritten = llama_state_get_data(d_ptr->ctx, stateOut.data(),
 stateOut.size());
    if (bytesWritten)
        inputTokensOut.assign(d_ptr->inputTokens.begin(), d_ptr->inputTokens.end());
    return bytesWritten;
}

size_t LLamaModel::restoreState(std::span<const uint8_t> state, std::span<const Token> inputTokens)
{
    size_t bytesRead = llama_state_set_data(d_ptr->ctx, state.data(), state.size());
    if (bytesRead)
        d_ptr->inputTokens.assign(inputTokens.begin(), inputTokens.end());
    return bytesRead;
}

std::vector<LLModel::Token> LLamaModel::tokenize(std::string_view str) const
{
    std::vector<LLModel::Token> fres(str.length() + 4);
    int32_t fres_len = llama_tokenize(
        d_ptr->model, str.data(), str.length(), fres.data(), fres.size(), /*add_special*/ true,
        /*parse_special*/ true
    );
    fres.resize(fres_len);
    return fres;
}

bool LLamaModel::isSpecialToken(Token id) const
{
    return llama_token_get_attr(d_ptr->model, id)
        & (LLAMA_TOKEN_ATTR_CONTROL | LLAMA_TOKEN_ATTR_USER_DEFINED | LLAMA_TOKEN_ATTR_UNKNOWN);
}

std::string LLamaModel::tokenToString(Token id) const
{
    std::vector<char> result(8, 0);
    const int n_tokens = llama_token_to_piece(d_ptr->model, id, result.data(), result.size(), 0, true);
    if (n_tokens < 0) {
        result.resize(-n_tokens);
        int check = llama_token_to_piece(d_ptr->model, id, result.data(), result.size(), 0, true);
        GGML_ASSERT(check == -n_tokens);
    } else {
        result.resize(n_tokens);
    }

    return std::string(result.data(), result.size());
}

void LLamaModel::initSampler(const PromptContext &promptCtx)
{
    auto *model = d_ptr->model;
    auto *chain = d_ptr->sampler_chain;

    // clear sampler chain
    for (int i = llama_sampler_chain_n(chain) - 1; i >= 0; i--) {
        auto *smpl = llama_sampler_chain_remove(chain, i);
        llama_sampler_free(smpl);
    }

    // build new chain
    llama_sampler_chain_add(chain,
        llama_sampler_init_penalties(
            llama_n_vocab(model),
            llama_token_eos(model),
            llama_token_nl(model),
            promptCtx.repeat_last_n,
            promptCtx.repeat_penalty,
            // TODO(jared): consider making the below configurable
            /*penalty_freq*/ 0.0f,
            /*penalty_present*/ 0.0f,
            /*penalize_nl*/ true,
            /*ignore_eos*/ false
        )
    );

    if (promptCtx.temp == 0.0f) {
        llama_sampler_chain_add(chain, llama_sampler_init_greedy());
    } else {
        struct llama_sampler *samplers[] = {
            llama_sampler_init_top_k(promptCtx.top_k),
            llama_sampler_init_top_p(promptCtx.top_p, 1),
            llama_sampler_init_min_p(promptCtx.min_p, 1),
            llama_sampler_init_temp(promptCtx.temp),
            llama_sampler_init_softmax(),
            llama_sampler_init_dist(LLAMA_DEFAULT_SEED),
        };
        for (auto *smpl : samplers)
            llama_sampler_chain_add(chain, smpl);
    }
}

LLModel::Token LLamaModel::sampleToken() const
{
    return llama_sampler_sample(d_ptr->sampler_chain, d_ptr->ctx, -1);
}

bool LLamaModel::evalTokens(int32_t nPast, std::span<const Token> tokens) const
{
    assert(!tokens.empty());

    llama_kv_cache_seq_rm(d_ptr->ctx, 0, nPast, -1);

    llama_batch batch = llama_batch_init(tokens.size(), 0, 1);

    batch.n_tokens = tokens.size();

    for (int32_t i = 0; i < batch.n_tokens; i++) {
        batch.token   [i] = tokens[i];
        batch.pos     [i] = nPast + i;
        batch.n_seq_id[i] = 1;
        batch.seq_id  [i][0] = 0;
        batch.logits  [i] = false;
    }

    // llama_decode will output logits only for the last token of the prompt
    batch.logits[batch.n_tokens - 1] = true;

    int res = llama_decode(d_ptr->ctx, batch);
    llama_batch_free(batch);
    return res == 0;
}

void LLamaModel::shiftContext(const PromptContext &promptCtx, int32_t *nPast)
{
    // infinite text generation via context shifting

    // erase up to n_ctx*contextErase tokens
    int n_keep = shouldAddBOS();
    int n_past = *nPast;
    int n_discard = std::min(n_past - n_keep, int(contextLength() * promptCtx.contextErase));

    assert(n_discard > 0);
    if (n_discard <= 0) return;

    std::cerr << "Llama: context full, swapping: n_past = " << n_past << ", n_keep = " << n_keep
              << ", n_discard = " << n_discard << "\n";

    // erase the first n_discard tokens from the context
    llama_kv_cache_seq_rm (d_ptr->ctx, 0, n_keep, n_keep + n_discard);
    llama_kv_cache_seq_add(d_ptr->ctx, 0, n_keep + n_discard, n_past, -n_discard);

    auto &inp = d_ptr->inputTokens;
    inp.erase(inp.begin() + n_keep, inp.begin() + n_keep + n_discard);
    *nPast = inp.size();
}

int32_t
 LLamaModel::contextLength() const
{
    return llama_n_ctx(d_ptr->ctx);
}

auto LLamaModel::specialTokens() -> std::unordered_map<std::string, std::string> const
{
    if (!d_ptr->model)
        throw std::logic_error("model not loaded");

    std::unordered_map<std::string, std::string> tokens;
    if (auto id = llama_token_bos(d_ptr->model); id != LLAMA_TOKEN_NULL)
        tokens.emplace("bos_token", tokenToString(id));
    if (auto id = llama_token_eos(d_ptr->model); id != LLAMA_TOKEN_NULL)
        tokens.emplace("eos_token", tokenToString(id));
    return tokens;
}

int32_t LLamaModel::inputLength() const
{
    return d_ptr->inputTokens.size();
}

int32_t LLamaModel::computeModelInputPosition(std::span<const Token> input) const
{
    // find common prefix
    auto cacheIt = d_ptr->inputTokens.begin();
    auto inputIt = input.begin();
    while (cacheIt < d_ptr->inputTokens.end() && inputIt < input.end() && *cacheIt == *inputIt) {
        ++cacheIt; ++inputIt;
    }
    // tell the caller to ignore the tokens between [begin, inputIt)
    return inputIt - input.begin();
}

void LLamaModel::setModelInputPosition(int32_t pos)
{
    auto &inp = d_ptr->inputTokens;
    assert(pos >= 0);
    assert(pos <= inp.size());
    // truncate token cache to end at the new n_past
    if (pos < inp.size())
        inp.resize(pos);
}

void LLamaModel::appendInputToken(Token tok)
{
    d_ptr->inputTokens.push_back(tok);
}

auto LLamaModel::inputTokens() const -> std::span<const Token>
{
    return d_ptr->inputTokens;
}

const std::vector<LLModel::Token> &LLamaModel::endTokens() const
{
    return d_ptr->end_tokens;
}

bool LLamaModel::shouldAddBOS() const
{
    return llama_add_bos_token(d_ptr->model);
}

int32_t LLamaModel::maxContextLength(std::string const &modelPath) const
{
    return get_arch_key_u32(modelPath, "context_length");
}

int32_t LLamaModel::layerCount(std::string const &modelPath) const
{
    return get_arch_key_u32(modelPath, "block_count");
}

// TODO(jared): reduce redundant code and operations by combining all metadata getters for unloaded
// models into a class that keeps the model file open
auto LLamaModel::chatTemplate(const char *modelPath) const -> std::expected<std::string, std::string>
{
    auto *ctx =
 load_gguf(modelPath);
    if (!ctx)
        return std::unexpected("failed to open model file");

    std::expected<std::string, std::string> result;
    enum gguf_type ktype;

    const int kid = gguf_find_key(ctx, "tokenizer.chat_template");
    if (kid == -1) {
        result = std::unexpected("key not found");
        goto cleanup;
    }

    ktype = gguf_get_kv_type(ctx, kid);
    if (ktype != GGUF_TYPE_STRING) {
        result = std::unexpected(
            "expected key type STRING (" + std::to_string(GGUF_TYPE_STRING) + "), got " + std::to_string(ktype)
        );
        goto cleanup;
    }

    result = gguf_get_val_str(ctx, kid);

cleanup:
    gguf_free(ctx);
    return result;
}

#ifdef GGML_USE_VULKAN
static const char *getVulkanVendorName(uint32_t vendorID)
{
    switch (vendorID) {
        case 0x10DE: return "nvidia";
        case 0x1002: return "amd";
        case 0x8086: return "intel";
        default:     return "unknown";
    }
}
#endif

std::vector<LLModel::GPUDevice> LLamaModel::availableGPUDevices(size_t memoryRequired) const
{
#if defined(GGML_USE_KOMPUTE) || defined(GGML_USE_VULKAN) || defined(GGML_USE_CUDA)
    size_t count = 0;
#ifdef GGML_USE_KOMPUTE
    auto *lcppDevices = ggml_vk_available_devices(memoryRequired, &count);
#elif defined(GGML_USE_VULKAN)
    (void)memoryRequired; // hasn't been used since GGUF was added
    auto *lcppDevices = ggml_vk_available_devices(&count);
#else // defined(GGML_USE_CUDA)
    (void)memoryRequired;
    auto *lcppDevices = ggml_cuda_available_devices(&count);
#endif

    if (lcppDevices) {
        std::vector<LLModel::GPUDevice> devices;
        devices.reserve(count);

        for (size_t i = 0; i < count; ++i) {
            auto & dev = lcppDevices[i];
            devices.emplace_back(
#ifdef GGML_USE_KOMPUTE
                /* backend  = */ "kompute",
                /* index    = */ dev.index,
                /* type     = */ dev.type,
                /* heapSize = */ dev.heapSize,
                /* name     = */ dev.name,
                /* vendor   = */ dev.vendor
#elif defined(GGML_USE_VULKAN)
                /* backend  = */ "vulkan",
                /* index    = */ dev.index,
                /* type     = */ dev.type,
                /* heapSize = */ dev.heapSize,
                /* name     = */ dev.name,
                /* vendor   = */ getVulkanVendorName(dev.vendorID)
#else // defined(GGML_USE_CUDA)
                /* backend  = */ "cuda",
                /* index    = */ dev.index,
                /* type     = */ 2, //
// vk::PhysicalDeviceType::eDiscreteGpu
                /* heapSize = */ dev.heapSize,
                /* name     = */ dev.name,
                /* vendor   = */ "nvidia"
#endif
            );

#ifndef GGML_USE_CUDA
            ggml_vk_device_destroy(&dev);
#else
            ggml_cuda_device_destroy(&dev);
#endif
        }

        free(lcppDevices);
        return devices;
    }
#else
    (void)memoryRequired;
    std::cerr << __func__ << ": built without a GPU backend\n";
#endif

    return {};
}

bool LLamaModel::initializeGPUDevice(size_t memoryRequired, const std::string &name) const
{
#if defined(GGML_USE_VULKAN) || defined(GGML_USE_CUDA)
    auto devices = availableGPUDevices(memoryRequired);

    auto dev_it = devices.begin();
#ifndef GGML_USE_CUDA
    if (name == "amd" || name == "nvidia" || name == "intel") {
        dev_it = std::find_if(dev_it, devices.end(), [&name](auto &dev) { return dev.vendor == name; });
    } else
#endif
    if (name != "gpu") {
        dev_it = std::find_if(dev_it, devices.end(), [&name](auto &dev) { return dev.name == name; });
    }

    if (dev_it < devices.end()) {
        d_ptr->device = dev_it->index;
        d_ptr->deviceName = dev_it->name;
        return true;
    }
    return false;
#elif defined(GGML_USE_KOMPUTE)
    ggml_vk_device device;
    bool ok = ggml_vk_get_device(&device, memoryRequired, name.c_str());
    if (ok) {
        d_ptr->device = device.index;
        d_ptr->deviceName = device.name;
        ggml_vk_device_destroy(&device);
        return true;
    }
#else
    (void)memoryRequired;
    (void)name;
#endif
    return false;
}

bool LLamaModel::initializeGPUDevice(int device, std::string *unavail_reason) const
{
#if defined(GGML_USE_KOMPUTE) || defined(GGML_USE_VULKAN) || defined(GGML_USE_CUDA)
    (void)unavail_reason;
    auto devices = availableGPUDevices();
    auto it = std::find_if(devices.begin(), devices.end(), [device](auto &dev) { return dev.index == device; });
    d_ptr->device = device;
    d_ptr->deviceName = it < devices.end() ?
it->name : "(unknown)"; return true; #else (void)device; if (unavail_reason) { *unavail_reason = "built without a GPU backend"; } return false; #endif } bool LLamaModel::usingGPUDevice() const { if (!d_ptr->model) return false; bool usingGPU = llama_model_using_gpu(d_ptr->model); #ifdef GGML_USE_KOMPUTE assert(!usingGPU || ggml_vk_has_device()); #endif return usingGPU; } const char *LLamaModel::backendName() const { return d_ptr->backend_name; } const char *LLamaModel::gpuDeviceName() const { if (usingGPUDevice()) { #if defined(GGML_USE_KOMPUTE) || defined(GGML_USE_VULKAN) || defined(GGML_USE_CUDA) return d_ptr->deviceName.c_str(); #elif defined(GGML_USE_METAL) return "Metal"; #endif } return nullptr; } void llama_batch_add( struct llama_batch & batch, llama_token id, llama_pos pos, const std::vector & seq_ids, bool logits) { batch.token [batch.n_tokens] = id; batch.pos [batch.n_tokens] = pos; batch.n_seq_id[batch.n_tokens] = seq_ids.size(); for (size_t i = 0; i < seq_ids.size(); ++i) { batch.seq_id[batch.n_tokens][i] = seq_ids[i]; } batch.logits [batch.n_tokens] = logits; batch.n_tokens++; } static void batch_add_seq(llama_batch &batch, const std::vector &tokens, int seq_id) { for (unsigned i = 0; i < tokens.size(); i++) { llama_batch_add(batch, tokens[i], i, { seq_id }, i == tokens.size() - 1); } } size_t LLamaModel::embeddingSize() const { return llama_n_embd(d_ptr->model); } struct EmbModelSpec { const char *docPrefix; const char *queryPrefix; std::vector otherPrefixes = {}; bool matryoshkaCapable = false; const char *recommendedDims = nullptr; }; struct EmbModelGroup { EmbModelSpec spec; std::vector names; }; static const EmbModelSpec NOPREFIX_SPEC {"", ""}; static const EmbModelSpec NOMIC_SPEC {"search_document", "search_query", {"clustering", "classification"}}; static const EmbModelSpec E5_SPEC {"passage", "query"}; static const EmbModelSpec NOMIC_1_5_SPEC { "search_document", "search_query", {"clustering", "classification"}, true, "[768, 512, 384, 256, 
128]", }; static const EmbModelSpec LLM_EMBEDDER_SPEC { "Represent this document for retrieval", "Represent this query for retrieving relevant documents", }; static const EmbModelSpec BGE_SPEC { "", "Represent this sentence for searching relevant passages", }; static const EmbModelSpec E5_MISTRAL_SPEC { "", "Instruct: Given a query, retrieve relevant passages that answer the query\nQuery", }; static const EmbModelGroup EMBEDDING_MODEL_SPECS[] { {NOPREFIX_SPEC, {"all-MiniLM-L6-v1", "all-MiniLM-L12-v1", "all-MiniLM-L6-v2", "all-MiniLM-L12-v2"}}, {NOMIC_SPEC, {"nomic-embed-text-v1", "nomic-embed-text-v1-ablated", "nomic-embed-text-v1-unsupervised"}}, {NOMIC_1_5_SPEC, {"nomic-embed-text-v1.5"}}, {LLM_EMBEDDER_SPEC, {"llm-embedder"}}, {BGE_SPEC, {"bge-small-en", "bge-base-en", "bge-large-en", "bge-small-en-v1.5", "bge-base-en-v1.5", "bge-large-en-v1.5"}}, // NOTE: E5 Mistral is not yet implemented in llama.cpp, so it's not in EMBEDDING_ARCHES {E5_SPEC, {"e5-small", "e5-base", "e5-large", "e5-small-unsupervised", "e5-base-unsupervised", "e5-large-unsupervised", "e5-small-v2", "e5-base-v2", "e5-large-v2"}}, {E5_MISTRAL_SPEC, {"e5-mistral-7b-instruct", "multilingual-e5-small", "multilingual-e5-base", "multilingual-e5-large", "multilingual-e5-large-instruct"}}, }; static const EmbModelSpec *getEmbedSpec(const std::string &modelName) { static const auto &specs = EMBEDDING_MODEL_SPECS; auto it = std::find_if(specs, std::end(specs), [&modelName](auto &spec) { auto &names = spec.names; return std::find(names.begin(), names.end(), modelName) < names.end(); } ); return it < std::end(specs) ? &it->spec : nullptr; } void LLamaModel::embed( const std::vector &texts, float *embeddings, bool isRetrieval, int dimensionality, size_t *tokenCount, bool doMean, bool atlas ) { const EmbModelSpec *spec; std::optional prefix; if (d_ptr->model && (spec = getEmbedSpec(llama_model_name(d_ptr->model)))) prefix = isRetrieval ? 
spec->queryPrefix : spec->docPrefix; embed(texts, embeddings, prefix, dimensionality, tokenCount, doMean, atlas); } void LLamaModel::embed( const std::vector &texts, float *embeddings, std::optional prefix, int dimensionality, size_t *tokenCount, bool doMean, bool atlas, LLModel::EmbedCancelCallback *cancelCb ) { if (!d_ptr->model) throw std::logic_error("no model is loaded"); const char *modelName = llama_model_name(d_ptr->model); if (!m_supportsEmbedding) throw std::logic_error("not an embedding model: "s + modelName); auto *spec = getEmbedSpec(modelName); if (!spec) std::cerr << __func__ << ": warning: unknown model " << modelName << "\n"; const int32_t n_embd = llama_n_embd(d_ptr->model); if (dimensionality < 0) { dimensionality = n_embd; } else if (spec && dimensionality != n_embd) { auto msg = [dimensionality, modelName]() { return "unsupported dimensionality " + std::to_string(dimensionality) + " for model " + modelName; }; if (!spec->matryoshkaCapable) throw std::out_of_range(msg() + " (supported: " + std::to_string(n_embd) + ")"); if (dimensionality == 0 || dimensionality > n_embd) throw std::out_of_range(msg() + " (recommended: " + spec->recommendedDims + ")"); } if (!prefix) { if (!spec) throw std::invalid_argument("unknown model "s + modelName + ", specify a prefix if applicable or an empty string"); prefix = spec->docPrefix; } else if (spec && prefix != spec->docPrefix && prefix != spec->queryPrefix && std::find(spec->otherPrefixes.begin(), spec->otherPrefixes.end(), *prefix) == spec->otherPrefixes.end()) { std::stringstream ss; ss << std::quoted(*prefix) << " is not a valid task type for model " << modelName; throw std::invalid_argument(ss.str()); } embedInternal(texts, embeddings, *prefix, dimensionality, tokenCount, doMean, atlas, cancelCb, spec); } // MD5 hash of "nomic empty" static const char EMPTY_PLACEHOLDER[] = "24df574ea1c998de59d5be15e769658e"; auto product(double a) -> std::function { return [a](double b) { return a * b; }; } template 
double getL2NormScale(T *start, T *end) { double magnitude = std::sqrt(std::inner_product(start, end, start, 0.0)); return 1.0 / std::max(magnitude, 1e-12); } void LLamaModel::embedInternal( const std::vector &texts, float *embeddings, std::string prefix, int dimensionality, size_t *tokenCount, bool doMean, bool atlas, LLModel::EmbedCancelCallback *cancelCb, const EmbModelSpec *spec ) { typedef std::vector TokenString; static constexpr int32_t atlasMaxLength = 8192; static constexpr int chunkOverlap = 8; // Atlas overlaps chunks of input by 8 tokens const llama_token bos_token = llama_token_bos(d_ptr->model); const llama_token eos_token = llama_token_eos(d_ptr->model); bool useBOS = llama_add_bos_token(d_ptr->model); bool useEOS = llama_vocab_type(d_ptr->model) == LLAMA_VOCAB_TYPE_WPM; // no EOS, optional BOS auto tokenize = [this, useBOS, useEOS, eos_token](std::string text, TokenString &tokens, bool wantBOS) { if (!text.empty() && text[0] != ' ') { text = ' ' + text; // normalize for SPM - our fork of llama.cpp doesn't add a space prefix } tokens.resize(text.length()+4); int32_t n_tokens = llama_tokenize_gpt4all( d_ptr->model, text.c_str(), text.length(), tokens.data(), tokens.size(), /*add_special*/ wantBOS, /*parse_special*/ false, /*insert_space*/ false ); if (n_tokens) { (void)eos_token; (void)useBOS; assert((useEOS && wantBOS && useBOS) == (eos_token != -1 && tokens[n_tokens - 1] == eos_token)); if (useEOS && wantBOS) n_tokens--; // erase EOS/SEP } tokens.resize(n_tokens); }; // tokenize the texts std::vector inputs; for (unsigned i = 0; i < texts.size(); i++) { auto &text = texts[i]; auto &inp = inputs.emplace_back(); tokenize(text, inp, false); if (atlas && inp.size() > atlasMaxLength) { if (doMean) { throw std::length_error( "length of text at index " + std::to_string(i) + " is " + std::to_string(inp.size()) + " tokens which exceeds limit of " + std::to_string(atlasMaxLength) ); } inp.resize(atlasMaxLength); } else if (inp.empty()) { if (!atlas || 
!text.empty()) { std::cerr << __func__ << ": warning: chunking tokenized text at index " << std::to_string(i) << " into zero tokens\n"; } tokenize(EMPTY_PLACEHOLDER, inp, false); } } // tokenize the prefix TokenString prefixTokens; if (prefix.empty()) { prefixTokens.push_back(bos_token); } else { tokenize(prefix + ':', prefixTokens, true); } // n_ctx_train: max sequence length of model (RoPE scaling not implemented) const uint32_t n_ctx_train = llama_n_ctx_train(d_ptr->model); // n_batch (equals n_ctx): max tokens per call to llama_decode (one or more sequences) const uint32_t n_batch = llama_n_batch(d_ptr->ctx); // effective sequence length minus prefix and SEP token const uint32_t max_len = std::min(n_ctx_train, n_batch) - (prefixTokens.size() + useEOS); if (max_len <= chunkOverlap) { throw std::logic_error("max chunk length of " + std::to_string(max_len) + " is smaller than overlap of " + std::to_string(chunkOverlap) + " tokens"); } // split into max_len-sized chunks struct split_batch { unsigned idx; TokenString batch; }; std::vector<split_batch> batches; size_t totalTokens = 0; for (unsigned i = 0; i < inputs.size(); i++) { auto &input = inputs[i]; for (unsigned j = 0; j < input.size(); j += max_len) { if (j) { j -= chunkOverlap; } unsigned end = std::min(j + max_len, unsigned(input.size())); batches.push_back({ i, {} }); auto &batch = batches.back().batch; batch = prefixTokens; batch.insert(batch.end(), input.begin() + j, input.begin() + end); totalTokens += end - j; batch.push_back(eos_token); if (!doMean) { break; /* limit text to one chunk */ } } } inputs.clear(); if (cancelCb) { // copy of batching code below, but just count tokens instead of running inference unsigned nBatchTokens = 0; std::vector<unsigned> batchSizes; for (const auto &inp: batches) { if (nBatchTokens + inp.batch.size() > n_batch) { batchSizes.push_back(nBatchTokens); nBatchTokens = 0; } nBatchTokens += inp.batch.size(); } batchSizes.push_back(nBatchTokens); if (cancelCb(batchSizes.data(), batchSizes.size(), 
d_ptr->backend_name)) { throw std::runtime_error("operation was canceled"); } } // initialize batch struct llama_batch batch = llama_batch_init(n_batch, 0, 1); // n_texts x n_embd matrix const int32_t n_embd = llama_n_embd(d_ptr->model); std::vector embeddingsSum(texts.size() * n_embd); std::vector embeddingsSumTotal(texts.size()); std::vector queued_indices; // text indices of batches to be processed auto decode = [this, &queued_indices, n_embd, &batch, &embeddingsSum, &embeddingsSumTotal, spec, dimensionality]() { if (llama_decode(d_ptr->ctx, batch) < 0) throw std::runtime_error("llama_decode failed"); for (int i = 0; i < batch.n_tokens; ++i) { if (!batch.logits[i]) { continue; } int i_prompt = queued_indices[batch.seq_id[i][0]]; auto *out = &embeddingsSum[i_prompt * n_embd]; // sequence embeddings aren't available when pooling_type is NONE auto *embd = llama_get_embeddings_seq(d_ptr->ctx, batch.seq_id[i][0]); if (!embd) { embd = llama_get_embeddings_ith(d_ptr->ctx, i); } assert(embd); auto *embd_end = embd + n_embd; // layer normalization for nomic-embed-text-v1.5 if (spec && spec->matryoshkaCapable) { // normalize mean double mean = std::accumulate(embd, embd_end, 0.0) / n_embd; std::transform(embd, embd_end, embd, [mean](double f){ return f - mean; }); // unbiased sample variance, with Bessel's correction double variance = std::inner_product(embd, embd_end, embd, 0.0) / (n_embd - 1); // trim to matryoshka dim embd_end = embd + dimensionality; // normalize variance std::transform(embd, embd_end, embd, product(1.0 / std::sqrt(variance + 1e-5))); } // L2 norm auto scale = getL2NormScale(embd, embd_end); std::transform(embd, embd_end, out, out, [scale](double e, double o){ return o + scale * e; }); embeddingsSumTotal[i_prompt]++; } }; // break into batches for (const auto &inp: batches) { // encode if at capacity if (batch.n_tokens + inp.batch.size() > n_batch) { decode(); batch.n_tokens = 0; queued_indices.clear(); } // add to batch batch_add_seq(batch, 
inp.batch, queued_indices.size()); queued_indices.push_back(inp.idx); } // final batch decode(); for (unsigned i = 0; i < texts.size(); i++) { auto *embd = &embeddingsSum[i * n_embd]; auto *embd_end = embd + dimensionality; int total = embeddingsSumTotal[i]; // average over chunks std::transform(embd, embd_end, embd, product(1.0 / total)); // L2 norm and copy auto scale = getL2NormScale(embd, embd_end); std::transform(embd, embd_end, embeddings, product(scale)); embeddings += dimensionality; } if (tokenCount) { *tokenCount = totalTokens; } llama_batch_free(batch); } #if defined(_WIN32) #define DLL_EXPORT __declspec(dllexport) #else #define DLL_EXPORT __attribute__ ((visibility ("default"))) #endif extern "C" { DLL_EXPORT bool is_g4a_backend_model_implementation() { return true; } DLL_EXPORT const char *get_model_type() { return modelType_; } DLL_EXPORT const char *get_build_variant() { return GGML_BUILD_VARIANT; } DLL_EXPORT char *get_file_arch(const char *fname) { char *arch = nullptr; std::string archStr; auto *ctx = load_gguf(fname); if (!ctx) goto cleanup; try { archStr = get_arch_name(ctx); } catch (const std::runtime_error &) { goto cleanup; // cannot read key } if (is_embedding_arch(archStr) && gguf_find_key(ctx, (archStr + ".pooling_type").c_str()) < 0) { // old bert.cpp embedding model } else { arch = strdup(archStr.c_str()); } cleanup: gguf_free(ctx); return arch; } DLL_EXPORT bool is_arch_supported(const char *arch) { return std::find(KNOWN_ARCHES.begin(), KNOWN_ARCHES.end(), std::string(arch)) < KNOWN_ARCHES.end(); } DLL_EXPORT LLModel *construct() { llama_log_set([](auto l, auto t, auto u) { llama_log_callback(l, t, u, false); }, nullptr); #ifdef GGML_USE_CUDA ggml_backend_cuda_log_set_callback([](auto l, auto t, auto u) { llama_log_callback(l, t, u, true); }, nullptr); #endif return new LLamaModel; } } ================================================ FILE: gpt4all-backend/src/llamamodel_impl.h ================================================ #ifndef 
LLAMAMODEL_H_I_KNOW_WHAT_I_AM_DOING_WHEN_INCLUDING_THIS_FILE #error This file is NOT meant to be included outside of llamamodel.cpp. Doing so is DANGEROUS. Be sure to know what you are doing before proceeding to #define LLAMAMODEL_H_I_KNOW_WHAT_I_AM_DOING_WHEN_INCLUDING_THIS_FILE #endif #ifndef LLAMAMODEL_H #define LLAMAMODEL_H #include "llmodel.h" #include #include #include #include #include #include struct LLamaPrivate; struct EmbModelSpec; class LLamaModel : public LLModel { public: LLamaModel(); ~LLamaModel(); bool supportsEmbedding() const override { return m_supportsEmbedding; } bool supportsCompletion() const override { return m_supportsCompletion; } bool loadModel(const std::string &modelPath, int n_ctx, int ngl) override; bool isModelBlacklisted(const std::string &modelPath) const override; bool isEmbeddingModel(const std::string &modelPath) const override; bool isModelLoaded() const override; size_t requiredMem(const std::string &modelPath, int n_ctx, int ngl) override; size_t stateSize() const override; size_t saveState(std::span stateOut, std::vector &inputTokensOut) const override; size_t restoreState(std::span state, std::span inputTokens) override; void setThreadCount(int32_t n_threads) override; int32_t threadCount() const override; std::vector availableGPUDevices(size_t memoryRequired = 0) const override; bool initializeGPUDevice(size_t memoryRequired, const std::string &name) const override; bool initializeGPUDevice(int device, std::string *unavail_reason = nullptr) const override; bool usingGPUDevice() const override; const char *backendName() const override; const char *gpuDeviceName() const override; size_t embeddingSize() const override; // user-specified prefix void embed(const std::vector &texts, float *embeddings, std::optional prefix, int dimensionality = -1, size_t *tokenCount = nullptr, bool doMean = true, bool atlas = false, EmbedCancelCallback *cancelCb = nullptr) override; // automatic prefix void embed(const std::vector &texts, float 
*embeddings, bool isRetrieval, int dimensionality = -1, size_t *tokenCount = nullptr, bool doMean = true, bool atlas = false) override; int32_t contextLength() const override; auto specialTokens() -> std::unordered_map const override; protected: std::vector tokenize(std::string_view str) const override; bool isSpecialToken(Token id) const override; std::string tokenToString(Token id) const override; void initSampler(const PromptContext &ctx) override; Token sampleToken() const override; bool evalTokens(int32_t nPast, std::span tokens) const override; void shiftContext(const PromptContext &promptCtx, int32_t *nPast) override; int32_t inputLength() const override; int32_t computeModelInputPosition(std::span input) const override; void setModelInputPosition(int32_t pos) override; void appendInputToken(Token tok) override; std::span inputTokens() const override; const std::vector &endTokens() const override; bool shouldAddBOS() const override; int32_t maxContextLength(std::string const &modelPath) const override; int32_t layerCount(std::string const &modelPath) const override; auto chatTemplate(const char *modelPath) const -> std::expected override; void embedInternal(const std::vector &texts, float *embeddings, std::string prefix, int dimensionality, size_t *tokenCount, bool doMean, bool atlas, EmbedCancelCallback *cancelCb, const EmbModelSpec *spec); private: std::unique_ptr d_ptr; bool m_supportsEmbedding = false; bool m_supportsCompletion = false; }; #endif // LLAMAMODEL_H ================================================ FILE: gpt4all-backend/src/llmodel.cpp ================================================ #include "llmodel.h" #include "dlhandle.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef _WIN32 # define WIN32_LEAN_AND_MEAN # ifndef NOMINMAX # define NOMINMAX # endif # include #endif #ifdef _MSC_VER # include #endif #if defined(__APPLE__) && defined(__aarch64__) # include "sysinfo.h" 
// for getSystemTotalRAMInBytes #endif namespace fs = std::filesystem; #ifndef __APPLE__ static const std::string DEFAULT_BACKENDS[] = {"kompute", "cpu"}; #elif defined(__aarch64__) static const std::string DEFAULT_BACKENDS[] = {"metal", "cpu"}; #else static const std::string DEFAULT_BACKENDS[] = {"cpu"}; #endif std::string s_implementations_search_path = "."; #if !(defined(__x86_64__) || defined(_M_X64)) // irrelevant on non-x86_64 #define cpu_supports_avx() -1 #define cpu_supports_avx2() -1 #elif defined(_MSC_VER) // MSVC static int get_cpu_info(int func_id, int reg_id) { int info[4]; __cpuid(info, func_id); return info[reg_id]; } // AVX via EAX=1: Processor Info and Feature Bits, bit 28 of ECX #define cpu_supports_avx() !!(get_cpu_info(1, 2) & (1 << 28)) // AVX2 via EAX=7, ECX=0: Extended Features, bit 5 of EBX #define cpu_supports_avx2() !!(get_cpu_info(7, 1) & (1 << 5)) #else // gcc/clang #define cpu_supports_avx() !!__builtin_cpu_supports("avx") #define cpu_supports_avx2() !!__builtin_cpu_supports("avx2") #endif LLModel::Implementation::Implementation(Dlhandle &&dlhandle_) : m_dlhandle(new Dlhandle(std::move(dlhandle_))) { auto get_model_type = m_dlhandle->get("get_model_type"); assert(get_model_type); m_modelType = get_model_type(); auto get_build_variant = m_dlhandle->get("get_build_variant"); assert(get_build_variant); m_buildVariant = get_build_variant(); m_getFileArch = m_dlhandle->get("get_file_arch"); assert(m_getFileArch); m_isArchSupported = m_dlhandle->get("is_arch_supported"); assert(m_isArchSupported); m_construct = m_dlhandle->get("construct"); assert(m_construct); } LLModel::Implementation::Implementation(Implementation &&o) : m_getFileArch(o.m_getFileArch) , m_isArchSupported(o.m_isArchSupported) , m_construct(o.m_construct) , m_modelType(o.m_modelType) , m_buildVariant(o.m_buildVariant) , m_dlhandle(o.m_dlhandle) { o.m_dlhandle = nullptr; } LLModel::Implementation::~Implementation() { delete m_dlhandle; } static bool isImplementation(const 
Dlhandle &dl) { return dl.get("is_g4a_backend_model_implementation"); } // Add the CUDA Toolkit to the DLL search path on Windows. // This is necessary for chat.exe to find CUDA when started from Qt Creator. static void addCudaSearchPath() { #ifdef _WIN32 if (const auto *cudaPath = _wgetenv(L"CUDA_PATH")) { auto libDir = std::wstring(cudaPath) + L"\\bin"; if (!AddDllDirectory(libDir.c_str())) { auto err = GetLastError(); std::wcerr << L"AddDllDirectory(\"" << libDir << L"\") failed with error 0x" << std::hex << err << L"\n"; } } #endif } const std::vector &LLModel::Implementation::implementationList() { if (cpu_supports_avx() == 0) { throw std::runtime_error("CPU does not support AVX"); } // NOTE: allocated on heap so we leak intentionally on exit so we have a chance to clean up the // individual models without the cleanup of the static list interfering static auto* libs = new std::vector([] () { std::vector fres; addCudaSearchPath(); std::string impl_name_re = "llamamodel-mainline-(cpu|metal|kompute|vulkan|cuda)"; if (cpu_supports_avx2() == 0) { impl_name_re += "-avxonly"; } std::regex re(impl_name_re); auto search_in_directory = [&](const std::string& paths) { std::stringstream ss(paths); std::string path; // Split the paths string by the delimiter and process each path. 
while (std::getline(ss, path, ';')) { fs::directory_iterator iter; try { iter = fs::directory_iterator(std::u8string(path.begin(), path.end())); } catch (const fs::filesystem_error &) { continue; // skip nonexistent path } // Iterate over all libraries for (const auto &f : iter) { const fs::path &p = f.path(); if (p.extension() != LIB_FILE_EXT) continue; if (!std::regex_search(p.stem().string(), re)) continue; // Add to list if model implementation Dlhandle dl; try { dl = Dlhandle(p); } catch (const Dlhandle::Exception &e) { std::cerr << "Failed to load " << p.filename().string() << ": " << e.what() << "\n"; continue; } if (!isImplementation(dl)) { std::cerr << "Not an implementation: " << p.filename().string() << "\n"; continue; } fres.emplace_back(Implementation(std::move(dl))); } } }; search_in_directory(s_implementations_search_path); return fres; }()); // Return static result return *libs; } static std::string applyCPUVariant(const std::string &buildVariant) { if (buildVariant != "metal" && cpu_supports_avx2() == 0) { return buildVariant + "-avxonly"; } return buildVariant; } const LLModel::Implementation* LLModel::Implementation::implementation(const char *fname, const std::string& buildVariant) { bool buildVariantMatched = false; std::optional archName; for (const auto& i : implementationList()) { if (buildVariant != i.m_buildVariant) continue; buildVariantMatched = true; char *arch = i.m_getFileArch(fname); if (!arch) continue; archName = arch; bool archSupported = i.m_isArchSupported(arch); free(arch); if (archSupported) return &i; } if (!buildVariantMatched) return nullptr; if (!archName) throw UnsupportedModelError("Unsupported file format"); throw BadArchError(std::move(*archName)); } LLModel *LLModel::Implementation::construct(const std::string &modelPath, const std::string &backend, int n_ctx) { std::vector desiredBackends; if (backend != "auto") { desiredBackends.push_back(backend); } else { desiredBackends.insert(desiredBackends.end(), 
DEFAULT_BACKENDS, std::end(DEFAULT_BACKENDS)); } for (const auto &desiredBackend: desiredBackends) { const auto *impl = implementation(modelPath.c_str(), applyCPUVariant(desiredBackend)); if (impl) { // Construct llmodel implementation auto *fres = impl->m_construct(); fres->m_implementation = impl; #if defined(__APPLE__) && defined(__aarch64__) // FIXME: See if metal works for intel macs /* TODO(cebtenzzre): after we fix requiredMem, we should change this to happen at * load time, not construct time. right now n_ctx is incorrectly hardcoded 2048 in * most (all?) places where this is called, causing underestimation of required * memory. */ if (backend == "auto" && desiredBackend == "metal") { // on a 16GB M2 Mac a 13B q4_0 (0.52) works for me but a 13B q4_K_M (0.55) does not size_t req_mem = fres->requiredMem(modelPath, n_ctx, 100); if (req_mem >= size_t(0.53f * getSystemTotalRAMInBytes())) { delete fres; continue; } } #else (void)n_ctx; #endif return fres; } } throw MissingImplementationError("Could not find any implementations for backend: " + backend); } LLModel *LLModel::Implementation::constructGlobalLlama(const std::optional &backend) { static std::unordered_map> implCache; const std::vector *impls; try { impls = &implementationList(); } catch (const std::runtime_error &e) { std::cerr << __func__ << ": implementationList failed: " << e.what() << "\n"; return nullptr; } std::vector desiredBackends; if (backend) { desiredBackends.push_back(backend.value()); } else { desiredBackends.insert(desiredBackends.end(), DEFAULT_BACKENDS, std::end(DEFAULT_BACKENDS)); } const Implementation *impl = nullptr; for (const auto &desiredBackend: desiredBackends) { auto cacheIt = implCache.find(desiredBackend); if (cacheIt != implCache.end()) return cacheIt->second.get(); // cached for (const auto &i: *impls) { if (i.m_modelType == "LLaMA" && i.m_buildVariant == applyCPUVariant(desiredBackend)) { impl = &i; break; } } if (impl) { auto *fres = impl->m_construct(); 
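// NOTE: the backend fallback in Implementation::construct() above reduces to a simple
// "try candidates in priority order, keep the first that loads" pattern. A minimal
// standalone sketch of that selection logic follows; the names selectBackend and
// tryLoad are hypothetical illustrations, not part of the backend API.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical loader: returns the backend name on success, nullopt on failure.
using TryLoad = std::function<std::optional<std::string>(const std::string &)>;

// Mirrors the selection logic above: "auto" expands to a default priority list,
// otherwise only the explicitly requested backend is tried.
std::string selectBackend(const std::string &requested,
                          const std::vector<std::string> &defaults,
                          const TryLoad &tryLoad)
{
    std::vector<std::string> candidates;
    if (requested != "auto")
        candidates.push_back(requested);
    else
        candidates = defaults;

    for (const auto &backend : candidates)
        if (auto loaded = tryLoad(backend))
            return *loaded; // first backend that constructs wins

    // analogous to MissingImplementationError in the real code
    throw std::runtime_error("Could not find any implementations for backend: " + requested);
}
```

// A non-"auto" request skips the defaults entirely, which is why construct() only
// applies the Metal memory heuristic when backend == "auto".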
fres->m_implementation = impl; implCache[desiredBackend] = std::unique_ptr(fres); return fres; } } std::cerr << __func__ << ": could not find Llama implementation for backend: " << backend.value_or("default") << "\n"; return nullptr; } std::vector LLModel::Implementation::availableGPUDevices(size_t memoryRequired) { std::vector devices; #ifndef __APPLE__ static const std::string backends[] = {"kompute", "cuda"}; for (const auto &backend: backends) { auto *llama = constructGlobalLlama(backend); if (llama) { auto backendDevs = llama->availableGPUDevices(memoryRequired); devices.insert(devices.end(), backendDevs.begin(), backendDevs.end()); } } #endif return devices; } int32_t LLModel::Implementation::maxContextLength(const std::string &modelPath) { auto *llama = constructGlobalLlama(); return llama ? llama->maxContextLength(modelPath) : -1; } int32_t LLModel::Implementation::layerCount(const std::string &modelPath) { auto *llama = constructGlobalLlama(); return llama ? llama->layerCount(modelPath) : -1; } bool LLModel::Implementation::isEmbeddingModel(const std::string &modelPath) { auto *llama = constructGlobalLlama(); return llama && llama->isEmbeddingModel(modelPath); } auto LLModel::Implementation::chatTemplate(const char *modelPath) -> std::expected { auto *llama = constructGlobalLlama(); return llama ? 
llama->chatTemplate(modelPath) : std::unexpected("backend not available"); } void LLModel::Implementation::setImplementationsSearchPath(const std::string& path) { s_implementations_search_path = path; } const std::string& LLModel::Implementation::implementationsSearchPath() { return s_implementations_search_path; } bool LLModel::Implementation::hasSupportedCPU() { return cpu_supports_avx() != 0; } int LLModel::Implementation::cpuSupportsAVX2() { return cpu_supports_avx2(); } ================================================ FILE: gpt4all-backend/src/llmodel_c.cpp ================================================ #include "llmodel_c.h" #include "llmodel.h" #include #include #include #include #include #include #include #include #include #include #include #include namespace ranges = std::ranges; static_assert(sizeof(token_t) == sizeof(LLModel::Token)); struct LLModelWrapper { LLModel *llModel = nullptr; ~LLModelWrapper() { delete llModel; } }; llmodel_model llmodel_model_create(const char *model_path) { const char *error; auto fres = llmodel_model_create2(model_path, "auto", &error); if (!fres) { fprintf(stderr, "Unable to instantiate model: %s\n", error); } return fres; } static void llmodel_set_error(const char **errptr, const char *message) { thread_local static std::string last_error_message; if (errptr) { last_error_message = message; *errptr = last_error_message.c_str(); } } llmodel_model llmodel_model_create2(const char *model_path, const char *backend, const char **error) { LLModel *llModel; try { llModel = LLModel::Implementation::construct(model_path, backend); } catch (const std::exception& e) { llmodel_set_error(error, e.what()); return nullptr; } auto wrapper = new LLModelWrapper; wrapper->llModel = llModel; return wrapper; } void llmodel_model_destroy(llmodel_model model) { delete static_cast(model); } size_t llmodel_required_mem(llmodel_model model, const char *model_path, int n_ctx, int ngl) { auto *wrapper = static_cast(model); return 
wrapper->llModel->requiredMem(model_path, n_ctx, ngl); } bool llmodel_loadModel(llmodel_model model, const char *model_path, int n_ctx, int ngl) { auto *wrapper = static_cast(model); std::string modelPath(model_path); if (wrapper->llModel->isModelBlacklisted(modelPath)) { size_t slash = modelPath.find_last_of("/\\"); auto basename = slash == std::string::npos ? modelPath : modelPath.substr(slash + 1); std::cerr << "warning: model '" << basename << "' is out-of-date, please check for an updated version\n"; } return wrapper->llModel->loadModel(modelPath, n_ctx, ngl); } bool llmodel_isModelLoaded(llmodel_model model) { auto *wrapper = static_cast(model); return wrapper->llModel->isModelLoaded(); } uint64_t llmodel_state_get_size(llmodel_model model) { auto *wrapper = static_cast(model); return wrapper->llModel->stateSize(); } uint64_t llmodel_state_get_data(llmodel_model model, uint8_t *state_out, uint64_t state_size, token_t **input_tokens_out, uint64_t *n_input_tokens) { auto *wrapper = static_cast(model); std::vector inputTokens; auto bytesWritten = wrapper->llModel->saveState({state_out, size_t(state_size)}, inputTokens); if (bytesWritten) { auto *buf = new LLModel::Token[inputTokens.size()]; ranges::copy(inputTokens, buf); *input_tokens_out = buf; *n_input_tokens = uint64_t(inputTokens.size()); } else { *input_tokens_out = nullptr; *n_input_tokens = 0; } return bytesWritten; } void llmodel_state_free_input_tokens(LLModel::Token *input_tokens) { delete[] input_tokens; } uint64_t llmodel_state_set_data(llmodel_model model, const uint8_t *state, uint64_t state_size, const token_t *input_tokens, uint64_t n_input_tokens) { auto *wrapper = static_cast(model); return wrapper->llModel->restoreState({state, size_t(state_size)}, {input_tokens, size_t(n_input_tokens)}); } bool llmodel_prompt(llmodel_model model, const char *prompt, llmodel_prompt_callback prompt_callback, llmodel_response_callback response_callback, llmodel_prompt_context *ctx, const char **error) { auto 
*wrapper = static_cast(model); // Copy the C prompt context LLModel::PromptContext promptContext { .n_predict = ctx->n_predict, .top_k = ctx->top_k, .top_p = ctx->top_p, .min_p = ctx->min_p, .temp = ctx->temp, .n_batch = ctx->n_batch, .repeat_penalty = ctx->repeat_penalty, .repeat_last_n = ctx->repeat_last_n, .contextErase = ctx->context_erase, }; auto prompt_func = [prompt_callback](std::span token_ids, bool cached) { return prompt_callback(token_ids.data(), token_ids.size(), cached); }; auto response_func = [response_callback](LLModel::Token token_id, std::string_view piece) { return response_callback(token_id, piece.data()); }; // Call the C++ prompt method try { wrapper->llModel->prompt(prompt, prompt_func, response_func, promptContext); } catch (std::exception const &e) { llmodel_set_error(error, e.what()); return false; } return true; } float *llmodel_embed( llmodel_model model, const char **texts, size_t *embedding_size, const char *prefix, int dimensionality, size_t *token_count, bool do_mean, bool atlas, llmodel_emb_cancel_callback cancel_cb, const char **error ) { auto *wrapper = static_cast(model); if (!texts || !*texts) { llmodel_set_error(error, "'texts' is NULL or empty"); return nullptr; } std::vector textsVec; while (*texts) { textsVec.emplace_back(*texts++); } size_t embd_size; float *embedding; try { embd_size = wrapper->llModel->embeddingSize(); if (dimensionality > 0 && dimensionality < int(embd_size)) embd_size = dimensionality; embd_size *= textsVec.size(); std::optional prefixStr; if (prefix) { prefixStr = prefix; } embedding = new float[embd_size]; wrapper->llModel->embed(textsVec, embedding, prefixStr, dimensionality, token_count, do_mean, atlas, cancel_cb); } catch (std::exception const &e) { llmodel_set_error(error, e.what()); return nullptr; } *embedding_size = embd_size; return embedding; } void llmodel_free_embedding(float *ptr) { delete[] ptr; } void llmodel_setThreadCount(llmodel_model model, int32_t n_threads) { auto *wrapper = 
static_cast(model); wrapper->llModel->setThreadCount(n_threads); } int32_t llmodel_threadCount(llmodel_model model) { auto *wrapper = static_cast(model); return wrapper->llModel->threadCount(); } void llmodel_set_implementation_search_path(const char *path) { LLModel::Implementation::setImplementationsSearchPath(path); } const char *llmodel_get_implementation_search_path() { return LLModel::Implementation::implementationsSearchPath().c_str(); } // RAII wrapper around a C-style struct struct llmodel_gpu_device_cpp: llmodel_gpu_device { llmodel_gpu_device_cpp() = default; llmodel_gpu_device_cpp(const llmodel_gpu_device_cpp &) = delete; llmodel_gpu_device_cpp( llmodel_gpu_device_cpp &&) = delete; const llmodel_gpu_device_cpp &operator=(const llmodel_gpu_device_cpp &) = delete; llmodel_gpu_device_cpp &operator=( llmodel_gpu_device_cpp &&) = delete; ~llmodel_gpu_device_cpp() { free(const_cast(name)); free(const_cast(vendor)); } }; static_assert(sizeof(llmodel_gpu_device_cpp) == sizeof(llmodel_gpu_device)); struct llmodel_gpu_device *llmodel_available_gpu_devices(size_t memoryRequired, int *num_devices) { static thread_local std::unique_ptr c_devices; auto devices = LLModel::Implementation::availableGPUDevices(memoryRequired); *num_devices = devices.size(); if (devices.empty()) { return nullptr; /* no devices */ } c_devices = std::make_unique(devices.size()); for (unsigned i = 0; i < devices.size(); i++) { const auto &dev = devices[i]; auto &cdev = c_devices[i]; cdev.backend = dev.backend; cdev.index = dev.index; cdev.type = dev.type; cdev.heapSize = dev.heapSize; cdev.name = strdup(dev.name.c_str()); cdev.vendor = strdup(dev.vendor.c_str()); } return c_devices.get(); } bool llmodel_gpu_init_gpu_device_by_string(llmodel_model model, size_t memoryRequired, const char *device) { auto *wrapper = static_cast(model); return wrapper->llModel->initializeGPUDevice(memoryRequired, std::string(device)); } bool llmodel_gpu_init_gpu_device_by_struct(llmodel_model model, const 
llmodel_gpu_device *device) { auto *wrapper = static_cast(model); return wrapper->llModel->initializeGPUDevice(device->index); } bool llmodel_gpu_init_gpu_device_by_int(llmodel_model model, int device) { auto *wrapper = static_cast(model); return wrapper->llModel->initializeGPUDevice(device); } const char *llmodel_model_backend_name(llmodel_model model) { const auto *wrapper = static_cast(model); return wrapper->llModel->backendName(); } const char *llmodel_model_gpu_device_name(llmodel_model model) { const auto *wrapper = static_cast(model); return wrapper->llModel->gpuDeviceName(); } int32_t llmodel_count_prompt_tokens(llmodel_model model, const char *prompt, const char **error) { auto *wrapper = static_cast(model); try { return wrapper->llModel->countPromptTokens(prompt); } catch (const std::exception& e) { llmodel_set_error(error, e.what()); return -1; } } void llmodel_model_foreach_special_token(llmodel_model model, llmodel_special_token_callback callback) { auto *wrapper = static_cast(model); for (auto &[name, token] : wrapper->llModel->specialTokens()) callback(name.c_str(), token.c_str()); } ================================================ FILE: gpt4all-backend/src/llmodel_shared.cpp ================================================ #include "llmodel.h" #include #include #include #include #include #include #include #include #include #include #include #include namespace ranges = std::ranges; namespace views = std::ranges::views; void LLModel::prompt( std::string_view prompt, const PromptCallback &promptCallback, const ResponseCallback &responseCallback, const PromptContext &promptCtx ) { if (!isModelLoaded()) throw std::invalid_argument("Attempted to prompt an unloaded model."); if (!supportsCompletion()) throw std::invalid_argument("Not a text completion model."); if (!promptCtx.n_batch) throw std::invalid_argument("Batch size cannot be zero."); if (!promptCtx.n_predict) return; // nothing requested auto embd_inp = tokenize(prompt); if (embd_inp.empty()) 
throw std::invalid_argument("Prompt tokenized to zero tokens."); if (auto res = decodePrompt(promptCallback, promptCtx, std::move(embd_inp))) generateResponse(responseCallback, promptCtx, /*n_past*/ *res); } int32_t LLModel::countPromptTokens(std::string_view prompt) const { if (!isModelLoaded()) throw std::invalid_argument("Attempted to tokenize with an unloaded model."); return int32_t(tokenize(prompt).size()); } auto LLModel::decodePrompt( const PromptCallback &promptCallback, const PromptContext &promptCtx, std::vector<Token> embd_inp ) -> std::optional<int32_t> { assert(!embd_inp.empty()); int32_t nCtx = contextLength(); int32_t n_batch = std::min(promptCtx.n_batch, LLMODEL_MAX_PROMPT_BATCH); // Find the greatest n_past where the beginning of embd_inp matches the end of the token cache, starting at the // requested n_past. // This is used to skip unnecessary work when the prompt shares a common prefix with the previous result. int32_t nPast = computeModelInputPosition(embd_inp); // always decode up to a full batch before generating, even if cached nPast -= std::min(n_batch, nPast); // TODO(jared): generalize this to find the smallest new_embd_inp.size() - nPast given the cache if (!nPast && int32_t(embd_inp.size()) > nCtx) { // no cache hit -> shift the input before even processing int32_t nKeep = shouldAddBOS(); auto newLength = int32_t(nCtx * (1.f - promptCtx.contextErase)); int32_t nDiscard = int32_t(embd_inp.size()) - std::max(1, std::min(nCtx, newLength)); // execute the callback even for skipped tokens; this misrepresents the position of BOS but we don't care auto discardedTokens = embd_inp | views::drop(nKeep) | views::take(nDiscard); if (!promptCallback(discardedTokens, true)) return std::nullopt; // erase nDiscard tokens embd_inp.erase(discardedTokens.begin(), discardedTokens.end()); assert(int32_t(embd_inp.size()) <= nCtx); // check the cache again, just in case nPast = computeModelInputPosition(embd_inp); nPast -= std::min(n_batch, nPast); } setModelInputPosition(nPast); // execute the callback even for skipped tokens if (!promptCallback(embd_inp | views::take(nPast), true)) return std::nullopt; // process the prompt in batches for (int32_t i = nPast; i < embd_inp.size();) { auto batch_end = std::min(i + n_batch, int32_t(embd_inp.size())); std::span batch(embd_inp.begin() + i, embd_inp.begin() + batch_end); // Check if the context has run out... if (nPast + int32_t(batch.size()) > nCtx) { shiftContext(promptCtx, &nPast); assert(nPast + int32_t(batch.size()) <= nCtx); } // FIXME(Adam): We should find a way to bubble these strings to the UI level to allow for translation if (!evalTokens(nPast, batch)) throw std::runtime_error("An internal error was encountered during prompt processing."); for (auto &tok : batch) { appendInputToken(tok); nPast++; if (!promptCallback({ &tok, 1 }, false)) return std::nullopt; } i = batch_end; } return nPast; } /* * If string s overlaps with the string key such that some prefix of the key is at the end * of the string, return the position in s where the first match starts. Otherwise, return * std::string::npos.
Examples: * s = "bfo", key = "foo" -> 1 * s = "fooa", key = "foo" -> npos */ static std::string::size_type stringsOverlap(const std::string &s, const std::string &key) { if (s.empty() || key.empty()) throw std::invalid_argument("arguments to stringsOverlap must not be empty"); for (int start = std::max(0, int(s.size()) - int(key.size())); start < s.size(); start++) { if (s.compare(start, s.size(), key, 0, s.size() - start) == 0) return start; } return std::string::npos; } void LLModel::generateResponse( const ResponseCallback &responseCallback, const PromptContext &promptCtx, int32_t nPast ) { static const char *stopSequences[] { "### System", "### Instruction", "### Human", "### User", "### Response", "### Assistant", "### Context", "<|im_start|>", "<|im_end|>", "<|endoftext|>", }; initSampler(promptCtx); std::string cachedResponse; std::vector<Token> cachedTokens; int n_predicted = 0; // Predict next tokens for (bool stop = false; !stop;) { // Sample next token std::optional<Token> new_tok = sampleToken(); std::string new_piece = tokenToString(new_tok.value()); cachedTokens.push_back(new_tok.value()); cachedResponse += new_piece; auto accept = [this, &promptCtx, &new_tok, &nPast] { // Shift context if out of space if (nPast >= contextLength()) { shiftContext(promptCtx, &nPast); assert(nPast < contextLength()); } // Accept the token Token tok = std::exchange(new_tok, std::nullopt).value(); if (!evalTokens(nPast, { &tok, 1 })) throw std::runtime_error("An internal error was encountered during response generation."); appendInputToken(tok); nPast++; }; // Check for EOS auto lengthLimit = std::string::npos; for (const auto token : endTokens()) { if (new_tok == token) { stop = true; lengthLimit = cachedResponse.size() - new_piece.size(); } } if (lengthLimit != std::string::npos) { // EOS matched } else if (!isSpecialToken(new_tok.value())) { // Check if the response contains a stop sequence for (const auto &p : stopSequences) { auto match = cachedResponse.find(p); if (match !=
std::string::npos) stop = true; lengthLimit = std::min(lengthLimit, match); if (match == 0) break; } // Check if the response matches the start of a stop sequence if (lengthLimit == std::string::npos) { for (const auto &p : stopSequences) { auto match = stringsOverlap(cachedResponse, p); lengthLimit = std::min(lengthLimit, match); if (match == 0) break; } } } else if (ranges::find(stopSequences, new_piece) < std::end(stopSequences)) { // Special tokens must exactly match a stop sequence stop = true; lengthLimit = cachedResponse.size() - new_piece.size(); } // Empty the cache, up to the length limit std::string::size_type responseLength = 0; while (!cachedTokens.empty()) { Token tok = cachedTokens.front(); std::string piece = tokenToString(tok); // Stop if the piece (or part of it) does not fit within the length limit if (responseLength + (stop ? 1 : piece.size()) > lengthLimit) break; // Remove token from cache assert(cachedResponse.starts_with(piece)); cachedTokens.erase(cachedTokens.begin(), cachedTokens.begin() + 1); cachedResponse.erase(cachedResponse.begin(), cachedResponse.begin() + piece.size()); // Accept the token, if needed (not cached) if (cachedTokens.empty() && new_tok) accept(); // Send the token if (!responseCallback(tok, piece) || ++n_predicted >= promptCtx.n_predict) { stop = true; break; } // FIXME(jared): we could avoid printing partial stop sequences if we didn't have to // output token IDs and could cache a partial token for the next prompt call responseLength += piece.size(); } assert(cachedTokens.empty() == cachedResponse.empty()); // Accept the token, if needed (in cache) if (new_tok) { assert(!cachedTokens.empty() && cachedTokens.back() == new_tok); if (stop) { cachedTokens.pop_back(); } else { accept(); } } } if (inputLength() < cachedTokens.size()) { /* This is theoretically possible if the longest stop sequence is greater than * n_ctx * contextErase tokens. 
*/ throw std::runtime_error("shifted too much context, can't go back"); } #ifndef NDEBUG auto inp = inputTokens(); auto discard_start = inp.end() - cachedTokens.size(); assert(std::equal(discard_start, inp.end(), cachedTokens.begin())); #endif } void LLModel::embed( const std::vector<std::string> &texts, float *embeddings, std::optional<std::string> prefix, int dimensionality, size_t *tokenCount, bool doMean, bool atlas, EmbedCancelCallback *cancelCb ) { (void)texts; (void)embeddings; (void)prefix; (void)dimensionality; (void)tokenCount; (void)doMean; (void)atlas; (void)cancelCb; throw std::logic_error(std::string(implementation().modelType()) + " does not support embeddings"); } void LLModel::embed( const std::vector<std::string> &texts, float *embeddings, bool isRetrieval, int dimensionality, size_t *tokenCount, bool doMean, bool atlas ) { (void)texts; (void)embeddings; (void)isRetrieval; (void)dimensionality; (void)tokenCount; (void)doMean; (void)atlas; throw std::logic_error(std::string(implementation().modelType()) + " does not support embeddings"); } ================================================ FILE: gpt4all-backend/src/utils.h ================================================ #pragma once #include <cassert> #ifdef NDEBUG # ifdef __has_builtin # if __has_builtin(__builtin_unreachable) # define UNREACHABLE() __builtin_unreachable() # else # define UNREACHABLE() do {} while (0) # endif # else # define UNREACHABLE() do {} while (0) # endif #else # define UNREACHABLE() assert(!"Unreachable statement was reached") #endif ================================================ FILE: gpt4all-bindings/README.md ================================================ # GPT4All Language Bindings These are the language bindings for the GPT4All backend. They provide functionality to load GPT4All models (and other llama.cpp models), generate text, and (in the case of the Python bindings) embed text as a vector representation. See their respective folders for language-specific documentation.
### Languages - [Python](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/python) (Nomic official, maintained by [@cebtenzzre](https://github.com/cebtenzzre)) - [Node.js/Typescript](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/typescript) (community, maintained by [@jacoobes](https://github.com/jacoobes) and [@iimez](https://github.com/iimez))

### Archived Bindings
The following bindings have been removed from this repository due to lack of maintenance. If adopted, they can be brought back; feel free to message a developer on Discord if you are interested in maintaining one of them. Below are links to their last available version (not necessarily the last working version). - C#: [41c9013f](https://github.com/nomic-ai/gpt4all/tree/41c9013fa46a194b3e4fee6ced1b9d1b65e177ac/gpt4all-bindings/csharp) - Java: [41c9013f](https://github.com/nomic-ai/gpt4all/tree/41c9013fa46a194b3e4fee6ced1b9d1b65e177ac/gpt4all-bindings/java) - Go: [41c9013f](https://github.com/nomic-ai/gpt4all/tree/41c9013fa46a194b3e4fee6ced1b9d1b65e177ac/gpt4all-bindings/golang)
================================================ FILE: gpt4all-bindings/cli/README.md ================================================ # GPT4All Command-Line Interface (CLI) GPT4All on the command-line. More details on the [wiki](https://github.com/nomic-ai/gpt4all/wiki/Python-CLI). ## Quickstart The CLI is based on the `gpt4all` Python bindings and the `typer` package. The following shows one way to get started with the CLI; the documentation has more information. Typically, you will want to replace `python` with `python3` on _Unix-like_ systems and `py -3` on _Windows_. Also, it's assumed you have all the necessary Python components already installed. The CLI is a self-contained Python script named [app.py] ([download][app.py-download]). As long as its package dependencies are present, you can download and run it from wherever you like. [app.py]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/cli/app.py [app.py-download]: https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-bindings/cli/app.py ```shell # optional but recommended: create and use a virtual environment python -m venv gpt4all-cli ``` _Windows_ and _Unix-like_ systems differ slightly in how you activate a _virtual environment_: - _Unix-like_, typically: `. gpt4all-cli/bin/activate` - _Windows_: `gpt4all-cli\Scripts\activate` Then: ```shell # pip-install the necessary packages; omit '--user' if using a virtual environment python -m pip install --user --upgrade gpt4all typer # run the CLI python app.py repl ``` By default, it will automatically download the `Mistral Instruct` model to `.cache/gpt4all/` in your user directory, if necessary.
If you have already saved a model beforehand, specify its path with the `-m`/`--model` argument, for example: ```shell python app.py repl --model /home/user/my-gpt4all-models/mistral-7b-instruct-v0.1.Q4_0.gguf ``` ================================================ FILE: gpt4all-bindings/cli/app.py ================================================ #!/usr/bin/env python3 """GPT4All CLI The GPT4All CLI is a self-contained script based on the `gpt4all` and `typer` packages. It offers a REPL to communicate with a language model similar to the chat GUI application, but more basic. """ import importlib.metadata import io import sys from collections import namedtuple from typing_extensions import Annotated import typer from gpt4all import GPT4All MESSAGES = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello there."}, {"role": "assistant", "content": "Hi, how can I help you?"}, ] SPECIAL_COMMANDS = { "/reset": lambda messages: messages.clear(), "/exit": lambda _: sys.exit(), "/clear": lambda _: print("\n" * 100), "/help": lambda _: print("Special commands: /reset, /exit, /help and /clear"), } VersionInfo = namedtuple('VersionInfo', ['major', 'minor', 'micro']) VERSION_INFO = VersionInfo(1, 0, 2) VERSION = '.'.join(map(str, VERSION_INFO)) # convert to string form, like: '1.2.3' CLI_START_MESSAGE = f""" ██████ ██████ ████████ ██ ██ █████ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ███ ██████ ██ ███████ ███████ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██████ ██ ██ ██ ██ ██ ███████ ███████ Welcome to the GPT4All CLI! Version {VERSION} Type /help for special commands. 
""" # create typer app app = typer.Typer() @app.command() def repl( model: Annotated[ str, typer.Option("--model", "-m", help="Model to use for chatbot"), ] = "mistral-7b-instruct-v0.1.Q4_0.gguf", n_threads: Annotated[ int, typer.Option("--n-threads", "-t", help="Number of threads to use for chatbot"), ] = None, device: Annotated[ str, typer.Option("--device", "-d", help="Device to use for chatbot, e.g. gpu, amd, nvidia, intel. Defaults to CPU."), ] = None, ): """The CLI read-eval-print loop.""" gpt4all_instance = GPT4All(model, device=device) # if threads are passed, set them if n_threads is not None: num_threads = gpt4all_instance.model.thread_count() print(f"\nAdjusted: {num_threads} →", end="") # set number of threads gpt4all_instance.model.set_thread_count(n_threads) num_threads = gpt4all_instance.model.thread_count() print(f" {num_threads} threads", end="", flush=True) else: print(f"\nUsing {gpt4all_instance.model.thread_count()} threads", end="") print(CLI_START_MESSAGE) use_new_loop = False try: version = importlib.metadata.version('gpt4all') version_major = int(version.split('.')[0]) if version_major >= 1: use_new_loop = True except: pass # fall back to old loop if use_new_loop: _new_loop(gpt4all_instance) else: _old_loop(gpt4all_instance) def _old_loop(gpt4all_instance): while True: message = input(" ⇢ ") # Check if special command and take action if message in SPECIAL_COMMANDS: SPECIAL_COMMANDS[message](MESSAGES) continue # if regular message, append to messages MESSAGES.append({"role": "user", "content": message}) # execute chat completion and ignore the full response since # we are outputting it incrementally full_response = gpt4all_instance.chat_completion( MESSAGES, # preferential kwargs for chat ux n_past=0, n_predict=200, top_k=40, top_p=0.9, min_p=0.0, temp=0.9, n_batch=9, repeat_penalty=1.1, repeat_last_n=64, context_erase=0.0, # required kwargs for cli ux (incremental response) verbose=False, streaming=True, ) # record assistant's response to 
messages MESSAGES.append(full_response.get("choices")[0].get("message")) print() # newline before next prompt def _new_loop(gpt4all_instance): with gpt4all_instance.chat_session(): while True: message = input(" ⇢ ") # Check if special command and take action if message in SPECIAL_COMMANDS: SPECIAL_COMMANDS[message](MESSAGES) continue # if regular message, append to messages MESSAGES.append({"role": "user", "content": message}) # execute chat completion and ignore the full response since # we are outputting it incrementally response_generator = gpt4all_instance.generate( message, # preferential kwargs for chat ux max_tokens=200, temp=0.9, top_k=40, top_p=0.9, min_p=0.0, repeat_penalty=1.1, repeat_last_n=64, n_batch=9, # required kwargs for cli ux (incremental response) streaming=True, ) response = io.StringIO() for token in response_generator: print(token, end='', flush=True) response.write(token) # record assistant's response to messages response_message = {'role': 'assistant', 'content': response.getvalue()} response.close() gpt4all_instance.current_chat_session.append(response_message) MESSAGES.append(response_message) print() # newline before next prompt @app.command() def version(): """The CLI version command.""" print(f"gpt4all-cli v{VERSION}") if __name__ == "__main__": app() ================================================ FILE: gpt4all-bindings/cli/developer_notes.md ================================================ # Developing the CLI ## Documentation Documentation can be found in three places: - `app.py` docstrings & comments - a Readme: `gpt4all-bindings/cli/README.md` - the actual CLI documentation: `gpt4all-bindings/python/docs/gpt4all_cli.md` The _docstrings_ are meant for programmatic use. Since the CLI is primarily geared towards users and not to build on top, they're kept terse. 
The _Readme_ is mostly meant for users and includes: - a link to the _CLI documentation_ (on the [website]) - a Quickstart section with some guidance on how to get started with a sane setup The _CLI documentation_ and other documentation are located in the above mentioned `docs/` folder. They're in Markdown format and built for the [website]. Of the three, they should be the most detailed. [website]: https://docs.gpt4all.io/gpt4all_cli.html ## Versioning The version number should now follow the `gpt4all` PyPI package, so compatibility is more clear. The one place to change it is the `namedtuple` called `VERSION_INFO`. ================================================ FILE: gpt4all-bindings/python/.gitignore ================================================ # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ share/python-wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .nox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover *.py,cover .hypothesis/ .pytest_cache/ cover/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 db.sqlite3-journal # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder .pybuilder/ target/ # Jupyter Notebook .ipynb_checkpoints # IPython profile_default/ ipython_config.py # pyenv # For a library or package, you might want to ignore these files since the code is # intended to run in multiple environments; otherwise, check them in: # .python-version # pipenv # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. # However, in case of collaboration, if having platform-specific dependencies or dependencies # having no cross-platform support, pipenv may install dependencies that don't work, or not # install all needed dependencies. #Pipfile.lock # poetry # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. # This is especially recommended for binary packages to ensure reproducibility, and is more # commonly ignored for libraries. # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control #poetry.lock # pdm # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. #pdm.lock # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it # in version control. # https://pdm.fming.dev/#use-with-ide .pdm.toml # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm __pypackages__/ # Celery stuff celerybeat-schedule celerybeat.pid # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .dmypy.json dmypy.json # Pyre type checker .pyre/ # pytype static type analyzer .pytype/ # Cython debug symbols cython_debug/ # PyCharm # JetBrains specific template is maintained in a separate JetBrains.gitignore that can # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ # Cython /*.c *DO_NOT_MODIFY/ ================================================ FILE: gpt4all-bindings/python/.isort.cfg ================================================ [settings] known_third_party=geopy,nltk,np,numpy,pandas,pysbd,fire,torch line_length=120 include_trailing_comma=True multi_line_output=3 use_parentheses=True ================================================ FILE: gpt4all-bindings/python/CHANGELOG.md ================================================ # Changelog All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). 
## [Unreleased] ### Added - Warn on Windows if the Microsoft Visual C++ runtime libraries are not found ([#2920](https://github.com/nomic-ai/gpt4all/pull/2920)) - Basic cache for faster prefill when the input shares a prefix with previous context ([#3073](https://github.com/nomic-ai/gpt4all/pull/3073)) - Add ability to modify or replace the history of an active chat session ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) ### Changed - Rebase llama.cpp on latest upstream as of September 26th ([#2998](https://github.com/nomic-ai/gpt4all/pull/2998)) - Change the error message when a message is too long ([#3004](https://github.com/nomic-ai/gpt4all/pull/3004)) - Fix CalledProcessError on Intel Macs since v2.8.0 ([#3045](https://github.com/nomic-ai/gpt4all/pull/3045)) - Use Jinja for chat templates instead of per-message QString.arg-style templates ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) ## [2.8.2] - 2024-08-14 ### Fixed - Fixed incompatibility with Python 3.8 since v2.7.0 and Python <=3.11 since v2.8.1 ([#2871](https://github.com/nomic-ai/gpt4all/pull/2871)) ## [2.8.1] - 2024-08-13 ### Added - Use greedy sampling when temperature is set to zero ([#2854](https://github.com/nomic-ai/gpt4all/pull/2854)) ### Changed - Search for pip-installed CUDA 11 as well as CUDA 12 ([#2802](https://github.com/nomic-ai/gpt4all/pull/2802)) - Stop shipping CUBINs to reduce wheel size ([#2802](https://github.com/nomic-ai/gpt4all/pull/2802)) - Use llama\_kv\_cache ops to shift context faster ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) - Don't stop generating at end of context ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) ### Fixed - Make reverse prompt detection work more reliably and prevent it from breaking output ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) - Explicitly target macOS 12.6 in CI to fix Metal compatibility on older macOS ([#2849](https://github.com/nomic-ai/gpt4all/pull/2849)) - Do not initialize Vulkan driver 
when only using CPU ([#2843](https://github.com/nomic-ai/gpt4all/pull/2843)) - Fix a segfault on exit when using CPU mode on Linux with NVIDIA and EGL ([#2843](https://github.com/nomic-ai/gpt4all/pull/2843)) ## [2.8.0] - 2024-08-05 ### Added - Support GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support) ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694)) - Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694)) - Support DeepSeek-V2 architecture (no Vulkan support) ([#2702](https://github.com/nomic-ai/gpt4all/pull/2702)) - Add Llama 3.1 8B Instruct to models3.json (by [@3Simplex](https://github.com/3Simplex) in [#2731](https://github.com/nomic-ai/gpt4all/pull/2731) and [#2732](https://github.com/nomic-ai/gpt4all/pull/2732)) - Support Llama 3.1 RoPE scaling ([#2758](https://github.com/nomic-ai/gpt4all/pull/2758)) - Add Qwen2-1.5B-Instruct to models3.json (by [@ThiloteE](https://github.com/ThiloteE) in [#2759](https://github.com/nomic-ai/gpt4all/pull/2759)) - Detect use of a Python interpreter under Rosetta for a clearer error message ([#2793](https://github.com/nomic-ai/gpt4all/pull/2793)) ### Changed - Build against CUDA 11.8 instead of CUDA 12 for better compatibility with older drivers ([#2639](https://github.com/nomic-ai/gpt4all/pull/2639)) - Update llama.cpp to commit 87e397d00 from July 19th ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694)) ### Removed - Remove unused internal llmodel\_has\_gpu\_device ([#2409](https://github.com/nomic-ai/gpt4all/pull/2409)) - Remove support for GPT-J models ([#2676](https://github.com/nomic-ai/gpt4all/pull/2676), [#2693](https://github.com/nomic-ai/gpt4all/pull/2693)) ### Fixed - Fix debug mode crash on Windows and undefined behavior in LLamaModel::embedInternal ([#2467](https://github.com/nomic-ai/gpt4all/pull/2467)) - Fix CUDA PTX errors with some GPT4All builds 
([#2421](https://github.com/nomic-ai/gpt4all/pull/2421)) - Fix mishandling of inputs greater than n\_ctx tokens after [#1970](https://github.com/nomic-ai/gpt4all/pull/1970) ([#2498](https://github.com/nomic-ai/gpt4all/pull/2498)) - Fix crash when Kompute falls back to CPU ([#2640](https://github.com/nomic-ai/gpt4all/pull/2640)) - Fix several Kompute resource management issues ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694)) - Fix crash/hang when some models stop generating, by showing special tokens ([#2701](https://github.com/nomic-ai/gpt4all/pull/2701)) - Fix several backend issues ([#2778](https://github.com/nomic-ai/gpt4all/pull/2778)) - Restore leading space removal logic that was incorrectly removed in [#2694](https://github.com/nomic-ai/gpt4all/pull/2694) - CUDA: Cherry-pick llama.cpp DMMV cols requirement fix that caused a crash with long conversations since [#2694](https://github.com/nomic-ai/gpt4all/pull/2694) [Unreleased]: https://github.com/nomic-ai/gpt4all/compare/python-v2.8.2...HEAD [2.8.2]: https://github.com/nomic-ai/gpt4all/compare/python-v2.8.1...python-v2.8.2 [2.8.1]: https://github.com/nomic-ai/gpt4all/compare/python-v2.8.0...python-v2.8.1 [2.8.0]: https://github.com/nomic-ai/gpt4all/compare/python-v2.7.0...python-v2.8.0 ================================================ FILE: gpt4all-bindings/python/LICENSE.txt ================================================ Copyright (c) 2023 Nomic, Inc. 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: gpt4all-bindings/python/MANIFEST.in ================================================ recursive-include gpt4all/llmodel_DO_NOT_MODIFY * ================================================ FILE: gpt4all-bindings/python/README.md ================================================ # Python GPT4All This package contains a set of Python bindings around the `llmodel` C-API. Package on PyPI: https://pypi.org/project/gpt4all/ ## Documentation https://docs.gpt4all.io/gpt4all_python.html ## Installation The easiest way to install the Python bindings for GPT4All is to use pip: ``` pip install gpt4all ``` This will download the latest version of the `gpt4all` package from PyPI. ## Local Build As an alternative to downloading via pip, you may build the Python bindings from source. ### Prerequisites You will need a compiler. On Windows, you should install Visual Studio with the C++ Development components. 
On macOS, you will need the full version of Xcode—Xcode Command Line Tools lacks certain required tools. On Linux, you will need a GCC or Clang toolchain with C++ support. On Windows and Linux, building GPT4All with full GPU support requires the [Vulkan SDK](https://vulkan.lunarg.com/sdk/home) and the latest [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads). ### Building the Python bindings 1. Clone GPT4All and change directory: ``` git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git cd gpt4all/gpt4all-backend ``` 2. Build the backend. If you are using Windows and have Visual Studio installed: ``` cmake -B build cmake --build build --parallel --config RelWithDebInfo ``` For all other platforms: ``` cmake -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo cmake --build build --parallel ``` `RelWithDebInfo` is a good default, but you can also use `Release` or `Debug` depending on the situation. 3. Install the Python package: ``` cd ../gpt4all-bindings/python pip install -e . ``` ## Usage Test it out! In a Python script or console: ```python from gpt4all import GPT4All model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf") output = model.generate("The capital of France is ", max_tokens=3) print(output) ``` GPU Usage ```python from gpt4all import GPT4All model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf", device='gpu') # device='amd', device='intel' output = model.generate("The capital of France is ", max_tokens=3) print(output) ``` ## Troubleshooting a Local Build - If you're on Windows and have compiled with a MinGW toolchain, you might run into an error like: ``` FileNotFoundError: Could not find module '<...>\gpt4all-bindings\python\gpt4all\llmodel_DO_NOT_MODIFY\build\libllmodel.dll' (or one of its dependencies). Try using the full path with constructor syntax. ``` The key phrase in this case is _"or one of its dependencies"_. The Python interpreter you're using probably doesn't see the MinGW runtime dependencies.
At the moment, the following three are required: `libgcc_s_seh-1.dll`, `libstdc++-6.dll` and `libwinpthread-1.dll`. You should copy them from MinGW into a folder where Python will see them, preferably next to `libllmodel.dll`. - Note regarding the Microsoft toolchain: Compiling with MSVC is possible, but not the official way to go about it at the moment. MSVC doesn't produce DLLs with a `lib` prefix, which the bindings expect. You'd have to amend that yourself. ================================================ FILE: gpt4all-bindings/python/docs/css/custom.css ================================================ .md-content h1, .md-content h2 { margin-top: 0.5em; margin-bottom: 0.5em; } ================================================ FILE: gpt4all-bindings/python/docs/gpt4all_api_server/home.md ================================================ # GPT4All API Server GPT4All provides a local API server that allows you to run LLMs over an HTTP API. ## Key Features - **Local Execution**: Run models on your own hardware for privacy and offline use. - **LocalDocs Integration**: Run the API with relevant text snippets provided to your LLM from a [LocalDocs collection](../gpt4all_desktop/localdocs.md). - **OpenAI API Compatibility**: Use existing OpenAI-compatible clients and tools with your local models. ## Activating the API Server 1. Open the GPT4All Chat Desktop Application. 2. Go to `Settings` > `Application` and scroll down to `Advanced`. 3. Check the box for the `"Enable Local API Server"` setting. 4. The server listens on port 4891 by default. You can choose another port number in the `"API Server Port"` setting. ## Connecting to the API Server The base URL used for the API server is `http://localhost:4891/v1` (or `http://localhost:<port>/v1` if you are using a different port number). The server only accepts HTTP connections (not HTTPS), and it listens only on the IPv4 localhost address (`127.0.0.1`), not on the IPv6 loopback address `::1`. ## Examples !!!
note "Example GPT4All API calls"

    === "cURL"

        ```bash
        curl -X POST http://localhost:4891/v1/chat/completions -d '{
          "model": "Phi-3 Mini Instruct",
          "messages": [{"role":"user","content":"Who is Lionel Messi?"}],
          "max_tokens": 50,
          "temperature": 0.28
        }'
        ```

    === "PowerShell"

        ```powershell
        Invoke-WebRequest -URI http://localhost:4891/v1/chat/completions -Method POST -ContentType application/json -Body '{
          "model": "Phi-3 Mini Instruct",
          "messages": [{"role":"user","content":"Who is Lionel Messi?"}],
          "max_tokens": 50,
          "temperature": 0.28
        }'
        ```

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/v1/models` | List available models |
| GET | `/v1/models/<name>` | Get details of a specific model |
| POST | `/v1/completions` | Generate text completions |
| POST | `/v1/chat/completions` | Generate chat completions |

## LocalDocs Integration

You can use LocalDocs with the API server:

1. Open the Chats view in the GPT4All application.
2. Scroll to the bottom of the chat history sidebar.
3. Select the server chat (it has a different background color).
4. Activate LocalDocs collections in the right sidebar. (Note: LocalDocs can currently only be activated through the GPT4All UI, not via the API itself).

Now, your API calls to your local LLM will have relevant references from your LocalDocs collection retrieved and placed in the input message for the LLM to respond to.
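The same kind of request can also be sketched from Python using only the standard library. This is a minimal sketch, assuming the API server is enabled on the default port and that a model named "Phi-3 Mini Instruct" is installed; adjust both to your setup:

```python
import json
import urllib.request

# Same request body as the cURL example above.
payload = {
    "model": "Phi-3 Mini Instruct",
    "messages": [{"role": "user", "content": "Who is Lionel Messi?"}],
    "max_tokens": 50,
    "temperature": 0.28,
}

req = urllib.request.Request(
    "http://localhost:4891/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Requires the GPT4All API server to be running locally:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```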
The references retrieved for your API call can be accessed in the API response object at `response["choices"][0]["references"]`.

The data included in the `references` are:

- `text`: the actual text content from the snippet that was extracted from the reference document
- `author`: the author of the reference document (if available)
- `date`: the date of creation of the reference document (if available)
- `page`: the page number the snippet is from (only available for PDF documents for now)
- `title`: the title of the reference document (if available)

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/chat_templates.md
================================================

## What are chat templates?

Natively, large language models only know how to complete plain text and do not know the difference between their input and their output. In order to support a chat with a person, LLMs are designed to use a template to convert the conversation to plain text using a specific format.

For a given model, it is important to use an appropriate chat template, as each model is designed to work best with a specific format. The chat templates included with the built-in models should be sufficient for most purposes.

There are two reasons you would want to alter the chat template:

- You are sideloading a model and there is no chat template available,
- You would like to have greater control over the input to the LLM than a system message provides.

## What is a system message?

A system message is a message that controls the responses from the LLM in a way that affects the entire conversation. System messages can be short, such as "Speak like a pirate.", or they can be long and contain a lot of context for the LLM to keep in mind.

Not all models are designed to use a system message, so system messages work better with some models than others.

## How do I customize the chat template or system message?
To customize the chat template or system message, go to Settings > Model. Make sure to select the correct model at the top. If you clone a model, you can use a different chat template or system message from the base model, enabling you to use different settings for each conversation. These settings take effect immediately. After changing them, you can click "Redo last response" in the chat view, and the response will take the new settings into account. ## Do I need to write a chat template? You typically do not need to write your own chat template. The exception is models that are not in the official model list and do not come with a chat template built-in. These will show a "Clear" option above the chat template field in the Model Settings page instead of a "Reset" option. See the section on [finding] or [creating] a chat template. [finding]: #how-do-i-find-a-chat-template [creating]: #advanced-how-do-chat-templates-work ## What changed in GPT4All v3.5? GPT4All v3.5 overhauled the chat template system. There are three crucial differences: - The chat template now formats an entire conversation instead of a single pair of messages, - The chat template now uses Jinja syntax instead of `%1` and `%2` placeholders, - And the system message should no longer contain control tokens or trailing whitespace. If you are using any chat templates or system messages that had been added or altered from the default before upgrading to GPT4All v3.5 or newer, these will no longer work. See below for how to solve common errors you may see after upgrading. ## Error/Warning: System message is not plain text. This is easy to fix. Go to the model's settings and look at the system prompt. There are three things to look for: - Control tokens such as `<|im_start|>`, `<|start_header_id|>`, or `<|system|>` - A prefix such as `### System` or `SYSTEM:` - Trailing whitespace, such as a space character or blank line. If you see any of these things, remove them. 
For example, this legacy system prompt: ``` <|start_header_id|>system<|end_header_id|> You are a helpful assistant.<|eot_id|> ``` Should become this: ``` You are a helpful assistant. ``` If you do not see anything that needs to be changed, you can dismiss the error by making a minor modification to the message and then changing it back. If you see a warning, your system message does not appear to be plain text. If you believe this warning is incorrect, it can be safely ignored. If in doubt, ask on the [Discord]. [Discord]: https://discord.gg/mGZE39AS3e ## Error: Legacy system prompt needs to be updated in Settings. This is the same as [above][above-1], but appears on the chat page. [above-1]: #errorwarning-system-message-is-not-plain-text ## Error/Warning: Chat template is not in Jinja format. This is the result of attempting to use an old-style template (possibly from a previous version) in GPT4All 3.5+. Go to the Model Settings page and select the affected model. If you see a "Reset" button, and you have not intentionally modified the prompt template, you can click "Reset". Otherwise, this is what you can do: 1. Back up your chat template by copying it safely to a text file and saving it. In the next step, it will be removed from GPT4All. 2. Click "Reset" or "Clear". 3. If you clicked "Clear", the chat template is now gone. Follow the steps to [find][finding] or [create][creating] a basic chat template for your model. 4. Customize the chat template to suit your needs. For help, read the section about [creating] a chat template. ## Error: Legacy prompt template needs to be updated in Settings. This is the same as [above][above-2], but appears on the chat page. [above-2]: #errorwarning-chat-template-is-not-in-jinja-format ## The chat template has a syntax error. If there is a syntax error while editing the chat template, the details will be displayed in an error message above the input box. 
This could be because the chat template is not actually in Jinja format (see [above][above-2]).

Otherwise, you have either typed something incorrectly, or the model comes with a template that is incompatible with GPT4All. See [the below section][creating] on creating chat templates and make sure that everything is correct. When in doubt, ask on the [Discord].

## Error: No chat template configured.

This may appear for models that are not from the official model list and do not include a chat template. Older versions of GPT4All picked a poor default in this case. You will get much better results if you follow the steps to [find][finding] or [create][creating] a chat template for your model.

## Error: The chat template cannot be blank.

If the button above the chat template on the Model Settings page says "Clear", see [above][above-3]. If you see "Reset", click that button to restore a reasonable default. Also see the section on [syntax errors][chat-syntax-error].

[above-3]: #error-no-chat-template-configured
[chat-syntax-error]: #the-chat-template-has-a-syntax-error

## How do I find a chat template?

When in doubt, you can always ask the [Discord] community for help. Below are the instructions to find one on your own.

The authoritative source for a model's chat template is the HuggingFace repo that the original (non-GGUF) model came from. First, you should find this page. If you just have a model file, you can try a Google search for the model's name. If you know the page you downloaded the GGUF model from, its README usually links to the original non-GGUF model.

Once you have located the original model, there are two methods you can use to extract its chat template. Pick whichever one you are most comfortable with.

### Using the CLI (all models)

1. Install `jq` using your preferred package manager - e.g. Chocolatey (Windows), Homebrew (macOS), or apt (Ubuntu).
2. Download `tokenizer_config.json` from the model's "Files and versions" tab.
3.
Open a command prompt in the directory to which you downloaded `tokenizer_config.json`.
4. Run `jq -r ".chat_template" tokenizer_config.json`. This shows the chat template in a human-readable form. You can copy this and paste it into the settings page.
5. (Optional) You can save the output to a text file like this: `jq -r ".chat_template" tokenizer_config.json >chat_template.txt`

If the output is "null", the model does not provide a chat template. See the [below instructions][creating] on creating a chat template.

### Python (open models)

1. Install `transformers` using your preferred Python package manager, e.g. `pip install transformers`. Make sure it is at least version v4.43.0.
2. Copy the ID of the HuggingFace model, using the clipboard icon next to the name. For example, if the URL is `https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B`, the ID is `NousResearch/Hermes-2-Pro-Llama-3-8B`.
3. Open a Python interpreter (`python`) and run the following commands. Change the model ID in the example to the one you copied.
   ```
   >>> from transformers import AutoTokenizer
   >>> tokenizer = AutoTokenizer.from_pretrained('NousResearch/Hermes-2-Pro-Llama-3-8B')
   >>> print(tokenizer.get_chat_template())
   ```
   You can copy the output and paste it into the settings page.
4. (Optional) You can save the output to a text file like this:
   ```
   >>> open('chat_template.txt', 'w').write(tokenizer.get_chat_template())
   ```

If you get a ValueError exception, this model does not provide a chat template. See the [below instructions][creating] on creating a chat template.

### Python (gated models)

Some models, such as Llama and Mistral, do not allow public access to their chat template. You must either use the CLI method above, or follow these instructions to use Python:

1. For these steps, you must have git and git-lfs installed.
2. You must have a HuggingFace account and be logged in.
3. You must already have access to the gated model. Otherwise, request access.
4.
You must have an SSH key configured for git access to HuggingFace.
5. `git clone` the model's HuggingFace repo using the SSH clone URL. There is no need to download the entire model, which is very large. A good way to do this on Linux is:
   ```console
   $ GIT_LFS_SKIP_SMUDGE=1 git clone hf.co:meta-llama/Llama-3.1-8B-Instruct.git
   $ cd Llama-3.1-8B-Instruct
   $ git lfs pull -I "tokenizer.*"
   ```
6. Follow the above instructions for open models, but replace the model ID with the path to the directory containing `tokenizer_config.json`:
   ```
   >>> tokenizer = AutoTokenizer.from_pretrained('.')
   ```

## Advanced: How do chat templates work?

The chat template is applied to the entire conversation you see in the chat window. The template loops over the list of messages, each containing `role` and `content` fields. `role` is either `user`, `assistant`, or `system`.

GPT4All also supports the special variables `bos_token`, `eos_token`, and `add_generation_prompt`. See the [HuggingFace docs] for what those do.

[HuggingFace docs]: https://huggingface.co/docs/transformers/v4.46.3/en/chat_templating#special-variables

## Advanced: How do I make a chat template?

The best way to create a chat template is to start by using an existing one as a reference. Then, modify it to use the format documented for the given model. Its README page may explicitly give an example of its template. Or, it may mention the name of a well-known standard template, such as ChatML, Alpaca, or Vicuna. GPT4All does not yet include presets for these templates, so they will have to be found in other models or taken from the community.

For more information, see the very helpful [HuggingFace guide]. Some of this is not applicable, such as the information about tool calling and RAG - GPT4All implements those features differently.

Some models use a prompt template that does not intuitively map to a multi-turn chat, because it is more intended for single instructions.
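To experiment with the looping mechanism described above, you can render a template locally with the `jinja2` Python package. The ChatML-style template below is a generic illustration, not taken from any particular model, and GPT4All's built-in template engine may differ in details:

```python
from jinja2 import Template  # pip install jinja2

# Generic ChatML-style chat template, written inline for illustration.
chatml = Template(
    "{% for message in messages %}"
    "<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)

# Render a short conversation the way the chat window would.
print(chatml.render(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    add_generation_prompt=True,
))
```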
The [FastChat] implementation of these templates is a useful reference for the correct way to extend them to multiple messages.

[HuggingFace guide]: https://huggingface.co/docs/transformers/v4.46.3/en/chat_templating#advanced-template-writing-tips
[FastChat]: https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py

## Advanced: What are GPT4All v1 templates?

GPT4All supports its own template syntax, which is nonstandard but provides complete control over the way LocalDocs sources and file attachments are inserted into the conversation. These templates begin with `{# gpt4all v1 #}` and look similar to the example below.

For standard templates, GPT4All combines the user message, sources, and attachments into the `content` field. For GPT4All v1 templates, this is not done, so they must be used directly in the template for those features to work correctly.

```jinja
{# gpt4all v1 #}
{%- for message in messages %}
    {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' }}
    {%- if message['role'] == 'user' %}
        {%- for source in message['sources'] %}
            {%- if loop.first %}
                {{- '### Context:\n' }}
            {%- endif %}
            {{- 'Collection: ' + source['collection'] + '\n' +
                'Path: ' + source['path'] + '\n' +
                'Excerpt: ' + source['text'] + '\n\n' }}
        {%- endfor %}
    {%- endif %}
    {%- for attachment in message['prompt_attachments'] %}
        {{- attachment['processed_content'] + '\n\n' }}
    {%- endfor %}
    {{- message['content'] | trim }}
    {{- '<|eot_id|>' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}
```

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/chats.md
================================================

# Chats

Chats are conversations with [language models](models.md) that run locally on your device.

!!!
note "Example Chats" === "Llama 3" Download `Llama 3` and prompt: ``` explain why the sky is blue in a way that is correct and makes sense to a child ``` With the default sampling settings, you should see text resembling the following: ``` What a great question! So, you know how we can see different colors like red, yellow, green, and orange? Well, when sunlight enters Earth's atmosphere, it starts to interact with tiny particles called molecules of gases like nitrogen (N2) and oxygen (02). These molecules are really small, but they're all around us! Now, here's the cool part: these molecules scatter, or bounce off, shorter wavelengths of light more than longer wavelengths. Think of it like a game of pool where the cue ball hits other balls with different sizes and speeds. The shortest wavelength of visible light is violet (about 400 nanometers), followed by blue (around 450-495 nanometers). These shorter wavelengths are scattered in all directions, kind of like how a basketball player might bounce the ball around on the court. This scattering effect makes the sky appear blue to our eyes! Here's why it doesn't look as bright or intense as a blue crayon: there are many more molecules in the atmosphere that scatter longer wavelengths (like red and orange) than shorter ones, so those colors get spread out and become less noticeable. To sum it up: 1. Sunlight enters Earth's atmosphere. 2. Tiny gas molecule particles interact with light, scattering shorter wavelengths like blue and violet. 3. This scattered blue light reaches our eyes from all directions, making the sky appear blue! Now, go outside and enjoy that beautiful blue sky! 
``` === "Nous Hermes 2 Mistral DPO" Download `Nous Hermes 2 Mistral DPO` and prompt: ``` write me a react app i can run from the command line to play a quick game ``` With the default sampling settings, you should see text and code blocks resembling the following: ``` I'll provide you with a simple React app that allows you to play a guessing game. This app will generate a random number between 1 and 10, and ask you to guess it. You can run this app locally using Create React App. Follow the steps below: Open your terminal or command line interface. Navigate to the directory where you want to create the project (e.g., cd Documents/Projects). Run the following commands: ``` ```bash npx create-react-app guessing-game cd guessing-game ``` ``` Open the project in your preferred code editor by running code . (for VS Code) or any other command to open the folder with your text editor. Replace the content of the src/App.js file with the following code: ``` ```javascript import React, { useState } from 'react'; function App() { const [guess, setGuess] = useState(''); const [message, setMessage] = useState(''); const [numberToGuess, setNumberToGuess] = useState(Math.floor(Math.random() * 10) + 1); function handleChange(e) { setGuess(e.target.value); } function handleSubmit(e) { e.preventDefault(); if (parseInt(guess) === numberToGuess) { setMessage('You guessed it right!'); } else { setMessage(`Wrong guess, the correct number was ${numberToGuess}. Try again next time.`); } } return (

  <div>
    <h1>Guess a Number</h1>
    <form onSubmit={handleSubmit}>
      <input type="number" value={guess} onChange={handleChange} min="1" max="10" />
      <button type="submit">Submit</button>
    </form>
    <p>{message}</p>
  </div>

  );
}

export default App;
```
```
Save the file and go back to your terminal or command line interface. Run npm start. This will start a local development server, and you can access the app in your browser at http://localhost:3000.

Play the guessing game by entering a number between 1 and 10 into the input field and clicking "Submit". The app will tell you if your guess is correct or not.

Remember that this is just a simple example, and you can expand upon it to make the game more interesting with additional features like high scores, multiple difficulty levels, etc.
```

## New Chat

Choose a model with the dropdown at the top of the Chats page. If you don't have any models, [download one](models.md#download-models). Once you have models, you can start chats by loading your default model, which you can configure in [settings](settings.md#application-settings).

![Choose a model](../assets/three_model_options.png)

## LocalDocs

Open the [LocalDocs](localdocs.md) panel with the button in the top-right corner to bring your files into the chat. With LocalDocs, your chats are enhanced with semantically related snippets from your files included in the model's context.

![Open LocalDocs](../assets/open_local_docs.png)

## Chat History

View your chat history with the button in the top-left corner of the Chats page.
You can change a chat name or delete it from your chat history at any time.

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-Obsidian.md
================================================

# Using GPT4All to Privately Chat with your Obsidian Vault

Obsidian for Desktop is a powerful management and note-taking software designed to create and organize markdown notes. This tutorial shows you how to sync and access your Obsidian note files directly on your computer. By connecting your vault to LocalDocs, you can integrate these files into your LLM chats for private access and enhanced context.

## Download Obsidian for Desktop

!!! note "Download Obsidian for Desktop"

    1. **Download Obsidian for Desktop**:
        - Visit the [Obsidian website](https://obsidian.md) and create an account.
        - Click the Download button in the center of the homepage.
        - For more help with installing Obsidian, see [Getting Started with Obsidian](https://help.obsidian.md/Getting+started/Download+and+install+Obsidian).
    2. **Set Up Obsidian**:
        - Launch Obsidian from your Applications folder (macOS), Start menu (Windows), or equivalent location (Linux).
        - On the welcome screen, you can either create a new vault (a collection of notes) or open an existing one.
        - To create a new vault, click Create a new vault, name your vault, choose a location on your computer, and click Create.
    3. **Sign in and Sync**:
        - Once installed, you can start adding and organizing notes.
        - Choose the folders you want to sync to your computer.

## Connect Obsidian to LocalDocs

!!! note "Connect Obsidian to LocalDocs"

    1. **Open LocalDocs**:
        - Navigate to the LocalDocs feature within GPT4All.
LocalDocs interface
2. **Add Collection**: - Click on **+ Add Collection** to begin linking your Obsidian Vault.
Screenshot of adding collection
- Name your collection 3. **Create Collection**: - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface. 4. **Access Files in Chats**: - Load a model to chat with your files (Llama 3 Instruct is the fastest) - In your chat, open 'LocalDocs' with the button in the top-right corner to provide context from your synced Obsidian notes.
Accessing LocalDocs in chats
5. **Interact With Your Notes:** - Use the model to interact with your files
Obsidian user interaction
Obsidian GPT4All response
6. **View Referenced Files**: - Click on **Sources** below LLM responses to see which Obsidian Notes were referenced.
Referenced Files
## How It Works Obsidian for Desktop syncs your Obsidian notes to your computer, while LocalDocs integrates these files into your LLM chats using embedding models. These models find semantically similar snippets from your files to enhance the context of your interactions. To learn more about embedding models and explore further, refer to the [Nomic Python SDK documentation](https://docs.nomic.ai/atlas/capabilities/embeddings). ================================================ FILE: gpt4all-bindings/python/docs/gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-One-Drive.md ================================================ # Using GPT4All to Privately Chat with your OneDrive Data Local and Private AI Chat with your OneDrive Data OneDrive for Desktop allows you to sync and access your OneDrive files directly on your computer. By connecting your synced directory to LocalDocs, you can start using GPT4All to privately chat with data stored in your OneDrive. ## Download OneDrive for Desktop !!! note "Download OneDrive for Desktop" 1. **Download OneDrive for Desktop**: - Visit [Microsoft OneDrive](https://www.microsoft.com/en-us/microsoft-365/onedrive/download). - Press 'download' for your respective device type. - Download the OneDrive for Desktop application. 2. **Install OneDrive for Desktop** - Run the installer file you downloaded. - Follow the prompts to complete the installation process. 3. **Sign in and Sync** - Once installed, sign in to OneDrive for Desktop with your Microsoft account credentials. - Choose the folders you want to sync to your computer. ## Connect OneDrive to LocalDocs !!! note "Connect OneDrive to LocalDocs" 1. **Install GPT4All and Open LocalDocs**: - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system. - Navigate to the LocalDocs feature within GPT4All to configure it to use your synced OneDrive directory.
2. **Add Collection**: - Click on **+ Add Collection** to begin linking your OneDrive folders.
- Name the Collection and specify the OneDrive folder path. 3. **Create Collection**: - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface. 4. **Access Files in Chats**: - Load a model within GPT4All to chat with your files. - In your chat, open 'LocalDocs' using the button in the top-right corner to provide context from your synced OneDrive files.
5. **Interact With Your OneDrive**: - Use the model to interact with your files directly from OneDrive.
6. **View Referenced Files**: - Click on **Sources** below responses to see which OneDrive files were referenced.
## How It Works OneDrive for Desktop syncs your OneDrive files to your computer, while LocalDocs maintains a database of these synced files for use by your local GPT4All model. As your OneDrive updates, LocalDocs will automatically detect file changes and stay up to date. LocalDocs leverages [Nomic Embedding](https://docs.nomic.ai/atlas/capabilities/embeddings) models to find semantically similar snippets from your files, enhancing the context of your interactions. ================================================ FILE: gpt4all-bindings/python/docs/gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-google-drive.md ================================================ # Using GPT4All to Privately Chat with your Google Drive Data Local and Private AI Chat with your Google Drive Data Google Drive for Desktop allows you to sync and access your Google Drive files directly on your computer. By connecting your synced directory to LocalDocs, you can start using GPT4All to privately chat with data stored in your Google Drive. ## Download Google Drive for Desktop !!! note "Download Google Drive for Desktop" 1. **Download Google Drive for Desktop**: - Visit [drive.google.com](https://drive.google.com) and sign in with your Google account. - Navigate to the **Settings** (gear icon) and select **Settings** from the dropdown menu. - Scroll down to **Google Drive for desktop** and click **Download**. 2. **Install Google Drive for Desktop** - Run the installer file you downloaded. - Follow the prompts to complete the installation process. 3. **Sign in and Sync** - Once installed, sign in to Google Drive for Desktop with your Google account credentials. - Choose the folders you want to sync to your computer. For advanced help, see [Setting up Google Drive for Desktop](https://support.google.com/drive/answer/10838124?hl=en) ## Connect Google Drive to LocalDocs !!! note "Connect Google Drive to LocalDocs" 1. 
**Install GPT4All and Open LocalDocs**: - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system. - Navigate to the LocalDocs feature within GPT4All to configure it to use your synced directory.
2. **Add Collection**: - Click on **+ Add Collection** to begin linking your Google Drive folders.
    - Name your collection
3. **Create Collection**:
    - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface.
4. **Access Files in Chats**:
    - Load a model to chat with your files (Llama 3 Instruct performs best)
    - In your chat, open 'LocalDocs' with the button in the top-right corner to provide context from your synced Google Drive files.
5. **Interact With Your Drive:** - Use the model to interact with your files
6. **View Referenced Files**: - Click on **Sources** below LLM responses to see which Google Drive files were referenced.
## How It Works

Google Drive for Desktop syncs your Google Drive files to your computer, while LocalDocs maintains a database of these synced files for use by your local LLM. As your Google Drive updates, LocalDocs will automatically detect file changes and stay up to date. LocalDocs is powered by [Nomic Embedding](https://docs.nomic.ai/atlas/capabilities/embeddings) models which find semantically similar snippets from your files to enhance the context of your interactions.

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-microsoft-excel.md
================================================

# Using GPT4All to Privately Chat with your Microsoft Excel Spreadsheets

Local and Private AI Chat with your Microsoft Excel Spreadsheets

Microsoft Excel allows you to create, manage, and analyze data in spreadsheet format. By attaching your spreadsheets directly to GPT4All, you can privately chat with the AI to query and explore the data, enabling you to summarize, generate reports, and glean insights from your files—all within your conversation.
## Attach Microsoft Excel to your GPT4All Conversation

!!! note "Attach Microsoft Excel to your GPT4All Conversation"

    1. **Install GPT4All and Open Chats**:
        - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system.
        - Navigate to the Chats view within GPT4All.
Chat view
2. **Example Spreadsheet**:
Spreadsheet view
3. **Attach to GPT4All conversation**
Attach view
4. **Have GPT4All Summarize and Generate a Report**
Attach view
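Behind the scenes, GPT4All converts the attached spreadsheet to Markdown before adding it to the model's context, as described under How It Works below. A minimal sketch of that kind of conversion in plain Python, purely for illustration (the real converter is C++ code in the GPT4All repo):

```python
def rows_to_markdown(rows):
    """Render spreadsheet rows (lists of cell values) as a Markdown table.
    Illustrative only; GPT4All's actual converter is gpt4all-chat/src/xlsxtomd.cpp."""
    header, *body = rows
    lines = ["|" + "|".join(str(c) for c in header) + "|",
             "|" + "|".join("---" for _ in header) + "|"]
    lines += ["|" + "|".join(str(c) for c in row) + "|" for row in body]
    return "\n".join(lines)

print(rows_to_markdown([
    ["12 months ended:", "2023-09-30", "2022-10-01"],
    ["Services", 79562, 74200],
]))
```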
## How It Works

GPT4All parses your attached Excel spreadsheet into Markdown, a format understandable to LLMs, and adds the Markdown text to the context for your LLM chat. You can view the code that converts `.xlsx` to Markdown [here](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/src/xlsxtomd.cpp) in the GPT4All GitHub repo.

For example, the above spreadsheet titled `disney_income_stmt.xlsx` would be formatted the following way:

```markdown
## disney_income_stmt

|Walt Disney Co.|||||||
|---|---|---|---|---|---|---|
|Consolidated Income Statement|||||||
||||||||
|US$ in millions|||||||
|12 months ended:|2023-09-30 00:00:00|2022-10-01 00:00:00|2021-10-02 00:00:00|2020-10-03 00:00:00|2019-09-28 00:00:00|2018-09-29 00:00:00|
|Services|79562|74200|61768|59265|60542|50869|
...
...
...
```

## Limitations

It is important to double-check the claims LLMs make about the spreadsheets you provide. LLMs can make mistakes about the data they are presented, particularly for LLMs with smaller parameter counts (~8B) that fit within the memory of consumer hardware.

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/localdocs.md
================================================

# LocalDocs

LocalDocs brings the information you have from files on-device into your LLM chats - **privately**.

## Create LocalDocs

!!! note "Create LocalDocs"

    1. Click `+ Add Collection`.
    2. Name your collection and link it to a folder.
    3. Click `Create Collection`. Progress for the collection is displayed on the LocalDocs page.

        ![Embedding in progress](../assets/baelor.png)

        You will see a green `Ready` indicator when the entire collection is ready. Note: you can still chat with the files that are ready before the entire collection is ready.

        ![Embedding complete](../assets/got_done.png)

        Later on, if you modify your LocalDocs settings, you can rebuild your collections with your new settings.

    4. In your chats, open `LocalDocs` with the button in the top-right corner to give your LLM context from those files.

        ![LocalDocs result](../assets/syrio_snippets.png)

    5. See which files were referenced by clicking `Sources` below the LLM responses.

        ![Sources](../assets/open_sources.png)

## How It Works

A LocalDocs collection uses Nomic AI's free and fast on-device embedding models to index your folder into text snippets that each get an **embedding vector**. These vectors allow us to find snippets from your files that are semantically similar to the questions and prompts you enter in your chats. We then include those semantically similar snippets in the prompt to the LLM.

To try the embedding models yourself, we recommend using the [Nomic Python SDK](https://docs.nomic.ai/atlas/capabilities/embeddings).

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/models.md
================================================

# Models

GPT4All is optimized to run LLMs in the 3-13B parameter range on consumer-grade hardware.

LLMs are downloaded to your device so you can run them locally and privately. With our backend anyone can interact with LLMs efficiently and securely on their own hardware.

## Download Models

!!! note "Download Models"
1. Click `Models` in the menu on the left (below `Chats` and above `LocalDocs`)

2. Click `+ Add Model` to navigate to the `Explore Models` page

3. Search for models available online

4. Hit `Download` to save a model to your device

5. Once the model is downloaded you will see it in `Models`.
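If you want to double-check a downloaded file against the MD5 sum listed for each model (see the example table below), a few lines of standard-library Python suffice. `file_md5` is a hypothetical helper shown only as a sketch; GPT4All does not require this step:

```python
import hashlib

def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through MD5 in 1 MiB chunks so multi-GB
    model files never need to fit in memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the published sum for your model, e.g.:
# file_md5("Meta-Llama-3-8B-Instruct.Q4_0.gguf") == "c87ad09e1e4c8f9c35a5fcef52b6f1c9"
```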
## Explore Models

GPT4All connects you with LLMs from HuggingFace with a [`llama.cpp`](https://github.com/ggerganov/llama.cpp) backend so that they will run efficiently on your hardware. Many of these models can be identified by the file type `.gguf`.

![Explore models](../assets/search_mistral.png)

## Example Models

Many LLMs are available in various sizes, quantizations, and licenses.

- LLMs with more parameters tend to be better at coherently responding to instructions
- LLMs with a smaller quantization (e.g. 4-bit instead of 16-bit) are much faster and less memory intensive, and tend to have slightly worse performance
- Licenses vary in their terms for personal and commercial use

Here are a few examples:

| Model | Filesize | RAM Required | Parameters | Quantization | Developer | License | MD5 Sum (Unique Hash) |
|------|---------|-------------|-----------|-------------|----------|--------|----------------------|
| Llama 3 Instruct | 4.66 GB | 8 GB | 8 billion | q4_0 | Meta | [Llama 3 License](https://llama.meta.com/llama3/license/) | c87ad09e1e4c8f9c35a5fcef52b6f1c9 |
| Nous Hermes 2 Mistral DPO | 4.11 GB | 8 GB | 7 billion | q4_0 | Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | Coa5f6b4eabd3992da4d7fb7f020f921eb |
| Phi-3 Mini Instruct | 2.18 GB | 4 GB | 4 billion | q4_0 | Microsoft | [MIT](https://opensource.org/license/mit) | f8347badde9bfc2efbe89124d78ddaf5 |
| Mini Orca (Small) | 1.98 GB | 4 GB | 3 billion | q4_0 | Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0) | 0e769317b90ac30d6e09486d61fefa26 |
| GPT4All Snoozy | 7.37 GB | 16 GB | 13 billion | q4_0 | Nomic AI | [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html) | 40388eb2f8d16bb5d08c96fdfaac6b2c |

### Search Results

You can click the gear icon in the search bar to sort search results by their # of likes, # of downloads, or date of upload (all from HuggingFace).
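The file sizes above follow roughly from the quantization: q4_0 stores weights in 32-weight blocks of sixteen bytes of 4-bit values plus a 2-byte fp16 scale, i.e. about 4.5 bits per weight. A back-of-the-envelope sketch (an approximation only - real files run somewhat larger because some tensors stay at higher precision):

```python
def q4_0_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Approximate .gguf file size for a q4_0-quantized model."""
    return n_params * bits_per_weight / 8 / 1e9

for name, params in [("Llama 3 Instruct", 8e9),
                     ("Nous Hermes 2 Mistral DPO", 7e9),
                     ("Phi-3 Mini Instruct", 3.8e9)]:
    print(f"{name}: ~{q4_0_size_gb(params):.1f} GB")
# Llama 3 Instruct: ~4.5 GB
# Nous Hermes 2 Mistral DPO: ~3.9 GB
# Phi-3 Mini Instruct: ~2.1 GB
```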
![Sort search results](../assets/search_settings.png)

## Connect Model APIs

You can add your API key for remote model providers.

**Note**: this does not download a model file to your computer. Instead, your prompts leave your computer and are sent to the API provider, which returns the response to your computer.

![Connect APIs](../assets/add_model_gpt4.png)

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/quickstart.md
================================================

# GPT4All Desktop

The GPT4All Desktop Application allows you to download and run large language models (LLMs) locally & privately on your device.

With GPT4All, you can chat with models, turn your local files into information sources for models [(LocalDocs)](localdocs.md), or browse models available online to download onto your device.

[Official Video Tutorial](https://www.youtube.com/watch?v=gQcZDXRVJok)

## Quickstart

!!! note "Quickstart"

1. Install GPT4All for your operating system and open the application.
[Download for Windows](https://gpt4all.io/installers/gpt4all-installer-win64.exe)      [Download for Mac](https://gpt4all.io/installers/gpt4all-installer-darwin.dmg)      [Download for Linux](https://gpt4all.io/installers/gpt4all-installer-linux.run)
2. Hit `Start Chatting`.

   ![GPT4All home page](../assets/gpt4all_home.png)

3. Click `+ Add Model`.

4. Download a model. We recommend starting with Llama 3, but you can [browse more models](models.md).

   ![Download a model](../assets/download_llama.png)

5. Once downloaded, go to Chats (below Home and above Models in the menu on the left).

6. Click "Load Default Model" (this will be Llama 3 or whichever model you downloaded).
7. Try the [example chats](chats.md) or your own prompts!

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_desktop/settings.md
================================================

# Settings

## Application Settings

!!! note "General Application Settings"

| Setting | Description | Default Value |
| --- | --- | --- |
| **Theme** | Color theme for the application. Options are `Light`, `Dark`, and `LegacyDark` | `Light` |
| **Font Size** | Font size setting for text throughout the application. Options are `Small`, `Medium`, and `Large` | `Small` |
| **Language and Locale** | The language and locale you wish to use | System Locale |
| **Device** | Device that will run your models. Options are `Auto` (GPT4All chooses), `Metal` (Apple Silicon M1+), `CPU`, and `GPU` | `Auto` |
| **Default Model** | Choose your preferred LLM to load by default on startup | `Auto` |
| **Suggestion Mode** | Generate suggested follow-up questions at the end of responses | When chatting with LocalDocs |
| **Download Path** | Select a destination on your device to save downloaded models | Windows: `C:\Users\{username}\AppData\Local\nomic.ai\GPT4All`<br/>Mac: `/Users/{username}/Library/Application Support/nomic.ai/GPT4All/`<br/>Linux: `/home/{username}/.local/share/nomic.ai/GPT4All` |
| **Enable Datalake** | Opt-in to sharing interactions with the GPT4All community (**anonymous** and **optional**) | Off |

!!! note "Advanced Application Settings"

| Setting | Description | Default Value |
| --- | --- | --- |
| **CPU Threads** | Number of concurrently running CPU threads (more can speed up responses) | 4 |
| **Enable System Tray** | The application will minimize to the system tray / taskbar when the window is closed | Off |
| **Enable Local Server** | Allow any application on your device to use GPT4All via an OpenAI-compatible GPT4All API | Off |
| **API Server Port** | Local HTTP port for the local API server | 4891 |

## Model Settings

!!! note "Model / Character Settings"

| Setting | Description | Default Value |
| --- | --- | --- |
| **Name** | Unique name of this model / character | set by model uploader |
| **Model File** | Filename (.gguf) of the model | set by model uploader |
| **System Message** | General instructions for the chats this model will be used for | set by model uploader |
| **Chat Template** | Format of user <-> assistant interactions for the chats this model will be used for | set by model uploader |
| **Chat Name Prompt** | Prompt used to automatically generate chat names | Describe the above conversation in seven words or less. |
| **Suggested FollowUp Prompt** | Prompt used to automatically generate follow-up questions after a chat response | Suggest three very short factual follow-up questions that have not been answered yet or cannot be found inspired by the previous conversation and excerpts. |

### Clone

You can **clone** an existing model, which allows you to save a configuration of a model file with different prompt templates and sampling settings.

### Sampling Settings

!!! note "Model Sampling Settings"

| Setting | Description | Default Value |
|----------------------------|------------------------------------------|-----------|
| **Context Length** | Maximum length of input sequence in tokens | 2048 |
| **Max Length** | Maximum length of response in tokens | 4096 |
| **Prompt Batch Size** | Token batch size for parallel processing | 128 |
| **Temperature** | Lower temperature gives more likely generations | 0.7 |
| **Top P** | Prevents choosing highly unlikely tokens | 0.4 |
| **Top K** | Size of selection pool for tokens | 40 |
| **Min P** | Minimum relative probability | 0 |
| **Repeat Penalty Tokens** | Length to apply penalty | 64 |
| **Repeat Penalty** | Penalize repetitiveness | 1.18 |
| **GPU Layers** | How many model layers to load into VRAM | 32 |

## LocalDocs Settings

!!! note "General LocalDocs Settings"

| Setting | Description | Default Value |
| --- | --- | --- |
| **Allowed File Extensions** | Choose which file types will be indexed into LocalDocs collections as text snippets with embedding vectors | `.txt`, `.pdf`, `.md`, `.rst` |
| **Use Nomic Embed API** | Use the Nomic API to create LocalDocs collections fast and off-device; [Nomic API Key](https://atlas.nomic.ai/) required | Off |
| **Embeddings Device** | Device that will run embedding models. Options are `Auto` (GPT4All chooses), `Metal` (Apple Silicon M1+), `CPU`, and `GPU` | `Auto` |
| **Show Sources** | Titles of source files retrieved by LocalDocs will be displayed directly in your chats. | On |

!!! note "Advanced LocalDocs Settings"

Note that increasing these settings can increase the likelihood of factual responses, but may result in slower generation times.
| Setting | Description | Default Value |
| --- | --- | --- |
| **Document Snippet Size** | Number of string characters per document snippet | 512 |
| **Maximum Document Snippets Per Prompt** | Upper limit for the number of snippets from your files LocalDocs can retrieve for LLM context | 3 |

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_help/faq.md
================================================

# Frequently Asked Questions

## Models

### Which language models are supported?

We support models with a `llama.cpp` implementation which have been uploaded to [HuggingFace](https://huggingface.co/).

### Which embedding models are supported?

We support SBert and Nomic Embed Text v1 & v1.5.

## Software

### What software do I need?

All you need is to [install GPT4All](../index.md) onto your Windows, Mac, or Linux computer.

### Which SDK languages are supported?

Our SDK is in Python for usability, but these are light bindings around [`llama.cpp`](https://github.com/ggerganov/llama.cpp) implementations that we contribute to for efficiency and accessibility on everyday computers.

### Is there an API?

Yes, you can run your model in server mode with our [OpenAI-compatible API](https://platform.openai.com/docs/api-reference/completions), which you can configure in [settings](../gpt4all_desktop/settings.md#application-settings).

### Can I monitor a GPT4All deployment?

Yes, GPT4All [integrates](../gpt4all_python/monitoring.md) with [OpenLIT](https://github.com/openlit/openlit) so you can deploy LLMs with user interactions and hardware usage automatically monitored for full observability.

### Is there a command line interface (CLI)?

[Yes](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/cli), we have a lightweight CLI built on top of the Python client. We welcome further contributions!

## Hardware

### What hardware do I need?

GPT4All can run on CPU, Metal (Apple Silicon M1+), and GPU.

### What are the system requirements?
Your CPU needs to support [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and you need enough RAM to load a model into memory.

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_help/troubleshooting.md
================================================

# Troubleshooting

## Error Loading Models

It is possible you are trying to load a model from HuggingFace whose weights are not compatible with our [backend](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings).

Try downloading one of the officially supported models listed on the main models page in the application. If the problem persists, please share your experience on our [Discord](https://discord.com/channels/1076964370942267462).

## Bad Responses

Try the [example chats](../gpt4all_desktop/chats.md) to double-check that your system is implementing models correctly.

### Responses Incoherent

If you are seeing something **not at all** resembling the [example chats](../gpt4all_desktop/chats.md) - for example, if the responses you are seeing look nonsensical - try [downloading a different model](../gpt4all_desktop/models.md), and please share your experience on our [Discord](https://discord.com/channels/1076964370942267462).

### Responses Incorrect

LLMs can be unreliable. It's helpful to know what their training data was - they are less likely to be correct when asked about data they were not trained on unless you give the necessary information in the prompt as **context**.

Giving LLMs additional context, like chatting using [LocalDocs](../gpt4all_desktop/localdocs.md), can help merge the language model's ability to understand text with the files that you trust to contain the information you need.

Including information in a prompt is not a guarantee that it will be used correctly, but the clearer and more concise your prompts, and the more relevant your prompts are to your files, the better.
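Supplying context manually is as simple as pasting the trusted text above your question. A minimal sketch (the separator wording here is only an illustration, not a required format):

```python
def prompt_with_context(context: str, question: str) -> str:
    """Prepend trusted reference text so the model can answer from it
    rather than from possibly stale or missing training data."""
    return (
        "Use the following excerpts to answer the question.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )

print(prompt_with_context(
    "GPT4All's local API server listens on port 4891.",
    "Which port does the local API server use?",
))
```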
### LocalDocs Issues

Occasionally a model - particularly a smaller or overall weaker LLM - may not use the relevant text snippets from the files that were referenced via LocalDocs. If you are seeing this, it can help to use phrases like "in the docs" or "from the provided files" when prompting your model.

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_python/home.md
================================================

# GPT4All Python SDK

## Installation

To get started, pip-install the `gpt4all` package into your Python environment.

```bash
pip install gpt4all
```

We recommend installing `gpt4all` into its own virtual environment using `venv` or `conda`.

## Load LLM

Models are loaded by name via the `GPT4All` class. If it's your first time loading a model, it will be downloaded to your device and saved so it can be quickly reloaded next time you create a `GPT4All` model with the same name.

!!! note "Load LLM"

```python
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf") # downloads / loads a 4.66GB LLM
with model.chat_session():
    print(model.generate("How can I run LLMs efficiently on my laptop?", max_tokens=1024))
```

| `GPT4All` model name | Filesize | RAM Required | Parameters | Quantization | Developer | License | MD5 Sum (Unique Hash) |
|------|---------|-------|-------|-----------|----------|--------|----------------------|
| `Meta-Llama-3-8B-Instruct.Q4_0.gguf` | 4.66 GB | 8 GB | 8 billion | q4_0 | Meta | [Llama 3 License](https://llama.meta.com/llama3/license/) | c87ad09e1e4c8f9c35a5fcef52b6f1c9 |
| `Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf` | 4.11 GB | 8 GB | 7 billion | q4_0 | Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | Coa5f6b4eabd3992da4d7fb7f020f921eb |
| `Phi-3-mini-4k-instruct.Q4_0.gguf` | 2.18 GB | 4 GB | 3.8 billion | q4_0 | Microsoft | [MIT](https://opensource.org/license/mit) | f8347badde9bfc2efbe89124d78ddaf5 |
| `orca-mini-3b-gguf2-q4_0.gguf` | 1.98 GB | 4 GB | 3 billion | q4_0 | Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0) | 0e769317b90ac30d6e09486d61fefa26 |
| `gpt4all-13b-snoozy-q4_0.gguf` | 7.37 GB | 16 GB | 13 billion | q4_0 | Nomic AI | [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html) | 40388eb2f8d16bb5d08c96fdfaac6b2c |

## Chat Session Generation

Most of the language models you will be able to access from HuggingFace have been trained as assistants. This guides language models to not just answer with relevant text, but *helpful* text.

If you want your LLM's responses to be helpful in the typical sense, we recommend you apply the chat templates the models were finetuned with. Information about specific prompt templates is typically available on the official HuggingFace page for the model.

!!! note "Example LLM Chat Session Generation"

=== "Code"

Load `Llama 3` and enter the following prompt in a chat session:

```python
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    print(model.generate("quadratic formula"))
```

=== "Output"

With the default sampling settings, you should see something resembling the following:

```
The quadratic formula!

The quadratic formula is a mathematical formula that provides the solutions to a quadratic equation of the form:

ax^2 + bx + c = 0

where a, b, and c are constants. The formula is:

x = (-b ± √(b^2 - 4ac)) / 2a

Let's break it down:

* x is the variable we're trying to solve for.
* a, b, and c are the coefficients of the quadratic equation.
* ± means "plus or minus".
* √ denotes the square root.

To use the formula, simply plug in the values of a, b, and c into the expression above. The resulting value(s) will be the solutions to the original quadratic equation!

For example, let's say we have the quadratic equation:

x^2 + 5x + 6 = 0

We can plug these values into the formula as follows:

a = 1
b = 5
```

## Direct Generation

Directly calling `model.generate()` prompts the model without applying any templates.
Note: this can result in responses that are less helpful and more like a mirror of the tone of your prompt. In general, a language model outside of a chat session is less of a helpful assistant and more of a lens into the distribution of the model's training data.

As an example, see how the model's response changes when we give the same prompt as above without applying a chat session:

!!! note "Example LLM Direct Generation"

=== "Code"

Load `Llama 3` and enter the following prompt:

```python
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
print(model.generate("quadratic formula"))
```

=== "Output"

With the default sampling settings, you should see something resembling the following:

```
. The equation is in the form of a + bx = c, where a and b are constants.
The solution to this problem involves using the quadratic formula which states that for any quadratic equation ax^2+bx+c=0, its solutions can be found by:
x = (-b ± √(b^2-4ac)) / 2a
In your case, since you have a + bx = c, we need to rewrite it in the form of ax^2+bx+c=0. To do this, subtract both sides from c, so that:
c - (a + bx) = 0
Now, combine like terms on the left side and simplify:
ax^2 + (-b)x + (c-a) = 0

So now we have a quadratic equation in standard form: ax^2+bx+c=0. We can use this to find its solutions using the quadratic formula:
x = ((-b ± √((-b)^2
```

Why did it respond differently? Because language models, before being fine-tuned as assistants, are trained to be more like a data mimic than a helpful assistant. Therefore the response ends up more like a typical continuation of math-style text rather than a helpful answer in a dialog.

## Embeddings

Nomic trains and open-sources free embedding models that will run very fast on your hardware.
The easiest way to run the text embedding model locally is to use the [`nomic`](https://github.com/nomic-ai/nomic) Python library to interface with our fast [C/C++ implementations](ref.md#gpt4all.gpt4all.Embed4All).

!!! note "Example Embeddings Generation"

=== "Code"

Importing `embed` from the [`nomic`](https://github.com/nomic-ai/nomic) library, you can call `embed.text()` with `inference_mode="local"`. This downloads an embedding model and saves it for later.

```python
from nomic import embed
embeddings = embed.text(["String 1", "String 2"], inference_mode="local")['embeddings']
print("Number of embeddings created:", len(embeddings))
print("Number of dimensions per embedding:", len(embeddings[0]))
```

=== "Output"

```
Number of embeddings created: 2
Number of dimensions per embedding: 768
```

![Nomic embed text local inference](../assets/local_embed.gif)

To learn more about making embeddings locally with `nomic`, visit our [embeddings guide](https://docs.nomic.ai/atlas/guides/embeddings#local-inference).

The following embedding models can be used within the application and with the `Embed4All` class from the `gpt4all` Python library. The default context length of the GGUF files is 2048 but can be [extended](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF#description).
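Nomic Embed v1.5 supports a resizable number of output dimensions (the 64-768 range shown in the table below). Conceptually, a shortened embedding keeps a prefix of the full vector and renormalizes it to unit length; the following toy sketch illustrates the idea only and is not the SDK's actual implementation:

```python
import math

def shorten(embedding: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and renormalize to unit length."""
    prefix = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]

vec = [0.6, 0.8, 0.0, 0.0]   # toy 4-d "embedding"
short = shorten(vec, 2)      # 2-d vector, still unit length
print(short)
```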
| Name | Using with `nomic` | `Embed4All` model name | Context Length | # Embedding Dimensions | File Size |
|--------------------|-|------------------------------------------------------|---------------:|-----------------:|----------:|
| [Nomic Embed v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1-GGUF) | ```embed.text(strings, model="nomic-embed-text-v1", inference_mode="local")``` | ```Embed4All("nomic-embed-text-v1.f16.gguf")``` | 2048 | 768 | 262 MiB |
| [Nomic Embed v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF) | ```embed.text(strings, model="nomic-embed-text-v1.5", inference_mode="local")``` | ```Embed4All("nomic-embed-text-v1.5.f16.gguf")``` | 2048 | 64-768 | 262 MiB |
| [SBert](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) | n/a | ```Embed4All("all-MiniLM-L6-v2.gguf2.f16.gguf")``` | 512 | 384 | 44 MiB |

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_python/monitoring.md
================================================

# GPT4All Monitoring

GPT4All integrates with [OpenLIT](https://github.com/openlit/openlit) OpenTelemetry auto-instrumentation to perform real-time monitoring of your LLM application and GPU hardware.

Monitoring can enhance your GPT4All deployment with auto-generated traces and metrics for

- **Performance Optimization:** Analyze latency, cost and token usage to ensure your LLM application runs efficiently, identifying and resolving performance bottlenecks swiftly.
- **User Interaction Insights:** Capture each prompt and response to understand user behavior and usage patterns better, improving user experience and engagement.
- **Detailed GPU Metrics:** Monitor essential GPU parameters such as utilization, memory consumption, temperature, and power usage to maintain optimal hardware performance and avert potential issues.

## Setup Monitoring

!!! note "Setup Monitoring"

With [OpenLIT](https://github.com/openlit/openlit), you can automatically monitor traces and metrics for your LLM deployment:

```shell
pip install openlit
```

```python
from gpt4all import GPT4All
import openlit

openlit.init()  # start collecting traces and metrics
# openlit.init(collect_gpu_stats=True)  # Optional: To configure GPU monitoring

model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')

# Start a chat session and send queries
with model.chat_session():
    response1 = model.generate(prompt='hello', temp=0)
    response2 = model.generate(prompt='write me a short poem', temp=0)
    response3 = model.generate(prompt='thank you', temp=0)

    print(model.current_chat_session)
```

## Visualization

### OpenLIT UI

Connect to OpenLIT's UI to start exploring the collected LLM performance metrics and traces. Visit the OpenLIT [Quickstart Guide](https://docs.openlit.io/latest/quickstart) for step-by-step details.

### Grafana, DataDog, & Other Integrations

You can also send the data collected by OpenLIT to popular monitoring tools like Grafana and DataDog. For detailed instructions on setting up these connections, please refer to the OpenLIT [Connections Guide](https://docs.openlit.io/latest/connections/intro).

================================================
FILE: gpt4all-bindings/python/docs/gpt4all_python/ref.md
================================================

# GPT4All Python SDK Reference

::: gpt4all.gpt4all.GPT4All

::: gpt4all.gpt4all.Embed4All

================================================
FILE: gpt4all-bindings/python/docs/index.md
================================================

# GPT4All Documentation

GPT4All runs large language models (LLMs) privately on everyday desktops & laptops.

No API calls or GPUs required - you can just download the application and [get started](gpt4all_desktop/quickstart.md#quickstart).

!!! note "Desktop Application"

GPT4All runs LLMs as an application on your computer.
Nomic's embedding models can bring information from your local documents and files into your chats. It's fast, on-device, and completely **private**.
[Download for Windows](https://gpt4all.io/installers/gpt4all-installer-win64.exe)      [Download for Mac](https://gpt4all.io/installers/gpt4all-installer-darwin.dmg)      [Download for Linux](https://gpt4all.io/installers/gpt4all-installer-linux.run)
!!! note "Python SDK"

Use GPT4All in Python to program with LLMs implemented with the [`llama.cpp`](https://github.com/ggerganov/llama.cpp) backend and [Nomic's C backend](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-backend). Nomic contributes to open source software like [`llama.cpp`](https://github.com/ggerganov/llama.cpp) to make LLMs accessible and efficient **for all**.

```bash
pip install gpt4all
```

```python
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf") # downloads / loads a 4.66GB LLM
with model.chat_session():
    print(model.generate("How can I run LLMs efficiently on my laptop?", max_tokens=1024))
```

================================================
FILE: gpt4all-bindings/python/docs/old/gpt4all_chat.md
================================================

# GPT4All Chat UI

The [GPT4All Chat Client](https://gpt4all.io) lets you easily interact with any local large language model. It is optimized to run 7-13B parameter LLMs on the CPUs of any computer running OSX/Windows/Linux.

## Running LLMs on CPU

The GPT4All Chat UI supports models from all newer versions of `llama.cpp` with `GGUF` models, including the `Mistral`, `LLaMA2`, `LLaMA`, `OpenLLaMa`, `Falcon`, `MPT`, `Replit`, `Starcoder`, and `Bert` architectures.

GPT4All maintains an official list of recommended models located in [models3.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json). You can submit new models in a pull request, and if accepted they will show up in the official download dialog.

#### Sideloading any GGUF model

If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

1. Downloading your model in GGUF format. It should be a 3-8 GB file similar to the ones [here](https://huggingface.co/TheBloke/Orca-2-7B-GGUF/tree/main).
2. Identifying your GPT4All model downloads folder. This is the path listed at the bottom of the downloads dialog.
3. Placing your downloaded model inside GPT4All's model downloads folder.
4. Restarting your GPT4All app. Your model should appear in the model selection list.

## Plugins

GPT4All Chat Plugins allow you to expand the capabilities of local LLMs.

### LocalDocs Plugin (Chat With Your Data)

LocalDocs is a GPT4All feature that allows you to chat with your local files and data. It allows you to utilize powerful local LLMs to chat with private data without any data leaving your computer or server. When using LocalDocs, your LLM will cite the sources that most likely contributed to a given output. Note that even an LLM equipped with LocalDocs can hallucinate.

The LocalDocs plugin will utilize your documents to help answer prompts, and you will see references appear below the response.

#### Enabling LocalDocs

1. Install the latest version of GPT4All Chat from the [GPT4All Website](https://gpt4all.io).
2. Go to `Settings > LocalDocs tab`.
3. Download the SBert model.
4. Configure a collection (folder) on your computer that contains the files your LLM should have access to. You can alter the contents of the folder/directory at any time. As you add more files to your collection, your LLM will dynamically be able to access them.
5. Spin up a chat session with any LLM (including external ones like ChatGPT - but be warned: that data will leave your machine!).
6. At the top right, click the database icon and select which collection you want your LLM to know about during your chat session.
7. You can begin searching with your LocalDocs even before the collection has finished indexing, but note the search will not include those parts of the collection yet to be indexed.

#### LocalDocs Capabilities

LocalDocs allows your LLM to have context about the contents of your documentation collection.

LocalDocs **can**:

- Query your documents based upon your prompt / question. Your documents will be searched for snippets that can be used to provide context for an answer. The most relevant snippets will be inserted into your prompt's context, but it will be up to the underlying model to decide how best to use the provided context.

LocalDocs **cannot**:

- Answer general metadata queries (e.g. `What documents do you know about?`, `Tell me about my documents`)
- Summarize a single document (e.g. `Summarize my magna carta PDF.`)

See the Troubleshooting section for common issues.

#### How LocalDocs Works

LocalDocs works by maintaining an index of all data in the directory your collection is linked to. This index consists of small chunks of each document that the LLM can receive as additional input when you ask it a question. The general technique this plugin uses is called [Retrieval Augmented Generation](https://arxiv.org/abs/2005.11401).
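The shape of that retrieval step can be sketched with a toy word-overlap score standing in for the real embedding similarity (LocalDocs uses neural embedding vectors, not word counts; this is an illustration only):

```python
def chunk(text: str, size: int = 512) -> list[str]:
    """Split a document into fixed-size character chunks (512 is the LocalDocs default)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(query: str, snippet: str) -> float:
    """Toy stand-in for embedding similarity: fraction of query words present."""
    q = set(query.lower().split())
    s = set(snippet.lower().split())
    return len(q & s) / max(len(q), 1)

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Return the k snippets most 'similar' to the query, to be added to the prompt."""
    snippets = [c for d in docs for c in chunk(d)]
    return sorted(snippets, key=lambda c: score(query, c), reverse=True)[:k]

docs = ["Winterfell is the seat of House Stark in the North.",
        "The quadratic formula solves ax^2 + bx + c = 0."]
print(retrieve("who holds winterfell", docs, k=1))
# → ['Winterfell is the seat of House Stark in the North.']
```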
These document chunks help your LLM respond to queries with knowledge about the contents of your data. The number of chunks and the size of each chunk can be configured in the LocalDocs plugin settings tab.

LocalDocs currently supports plain text files (`.txt`, `.md`, and `.rst`) and PDF files (`.pdf`).

#### Troubleshooting and FAQ

*My LocalDocs plugin isn't using my documents*

- Make sure LocalDocs is enabled for your chat session (the DB icon on the top-right should have a border)
- If your document collection is large, wait 1-2 minutes for it to finish indexing.

#### LocalDocs Roadmap

- Customize model fine-tuned with retrieval in the loop.
- Plugin compatibility with chat client server mode.

## Server Mode

GPT4All Chat comes with a built-in server mode allowing you to programmatically interact with any supported local LLM through a *very familiar* HTTP API. You can find the API documentation [here](https://platform.openai.com/docs/api-reference/completions).

Enabling server mode in the chat client will spin up an HTTP server running on `localhost` port `4891` (the reverse of 1984). You can enable the web server via `GPT4All Chat > Settings > Enable web server`.

Begin using local LLMs in your AI-powered apps by changing a single line of code: the base path for requests.

```python
import openai

openai.api_base = "http://localhost:4891/v1"
#openai.api_base = "https://api.openai.com/v1"
openai.api_key = "not needed for a local LLM"

# Set up the prompt and other parameters for the API request
prompt = "Who is Michael Jordan?"

# model = "gpt-3.5-turbo"
#model = "mpt-7b-chat"
model = "gpt4all-j-v1.3-groovy"

# Make the API request
response = openai.Completion.create(
    model=model,
    prompt=prompt,
    max_tokens=50,
    temperature=0.28,
    top_p=0.95,
    n=1,
    echo=True,
    stream=False
)

# Print the generated completion
print(response)
```

which gives the following response

```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "Who is Michael Jordan?\nMichael Jordan is a former professional basketball player who played for the Chicago Bulls in the NBA. He was born on December 30, 1963, and retired from playing basketball in 1998."
    }
  ],
  "created": 1684260896,
  "id": "foobarbaz",
  "model": "gpt4all-j-v1.3-groovy",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 39,
    "total_tokens": 74
  }
}
```

================================================
FILE: gpt4all-bindings/python/docs/old/gpt4all_cli.md
================================================

# GPT4All CLI

The GPT4All command-line interface (CLI) is a Python script which is built on top of the [Python bindings][docs-bindings-python] ([repository][repo-bindings-python]) and the [typer] package. The source code, README, and local build instructions can be found [here][repo-bindings-cli].

[docs-bindings-python]: gpt4all_python.md
[repo-bindings-python]: https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/python
[repo-bindings-cli]: https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/cli
[typer]: https://typer.tiangolo.com/

## Installation

### The Short Version

The CLI is a Python script called [app.py].
If you're already familiar with Python best practices, the short version is to [download app.py][app.py-download] into a folder of your choice and install the two required dependencies with some variant of:

```shell
pip install gpt4all typer
```

Then run it with a variant of:

```shell
python app.py repl
```

In case you're wondering, _REPL_ is an acronym for [read-eval-print loop][wiki-repl].

[app.py]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/cli/app.py
[app.py-download]: https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-bindings/cli/app.py
[wiki-repl]: https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop

### Recommendations & The Long Version

Especially if you have several applications/libraries which depend on Python, to avoid descending into dependency hell at some point, you should:

- Consider always installing into some kind of [_virtual environment_][venv].
- On a _Unix-like_ system, don't use `sudo` for anything other than packages provided by the system package manager, i.e. never with `pip`.

[venv]: https://docs.python.org/3/library/venv.html

There are several ways and tools available to do this, so below are descriptions of how to install with a _virtual environment_ (recommended) or a user installation on all three main platforms.

Different platforms can have slightly different ways to start the Python interpreter itself.

Note: _Typer_ has an optional dependency for fancier output. If you want that, replace `typer` with `typer[all]` in the pip-install instructions below.

#### Virtual Environment Installation

You can name your _virtual environment_ folder for the CLI whatever you like. In the following, `gpt4all-cli` is used throughout.

##### macOS

There are at least three ways to have a Python installation on _macOS_, and possibly not all of them provide a full installation of Python and its tools.
When in doubt, try the following:
```shell
python3 -m venv --help
python3 -m pip --help
```
Both should print the help for the `venv` and `pip` commands, respectively. If they don't, consult the documentation of your Python installation on how to enable them, or download a separate Python variant, for example try a [unified installer package from python.org][python.org-downloads].

[python.org-downloads]: https://www.python.org/downloads/

Once ready, do:
```shell
python3 -m venv gpt4all-cli
. gpt4all-cli/bin/activate
python3 -m pip install gpt4all typer
```

##### Windows

Download the [official installer from python.org][python.org-downloads] if Python isn't already present on your system. A _Windows_ installation should already provide all the components for a _virtual environment_. Run:
```shell
py -3 -m venv gpt4all-cli
gpt4all-cli\Scripts\activate
py -m pip install gpt4all typer
```

##### Linux

On Linux, a Python installation is often split into several packages and not all are necessarily installed by default. For example, on Debian/Ubuntu and derived distros, you will want to ensure their presence with the following:
```shell
sudo apt-get install python3-venv python3-pip
```
The next steps are similar to the other platforms:
```shell
python3 -m venv gpt4all-cli
. gpt4all-cli/bin/activate
python3 -m pip install gpt4all typer
```
On other distros, the situation might be different. The package names especially can vary a lot. You'll have to look it up in the documentation, software directory, or package search.

#### User Installation

##### macOS

There are at least three ways to have a Python installation on _macOS_, and possibly not all of them provide a full installation of Python and its tools. When in doubt, try the following:
```shell
python3 -m pip --help
```
That should print the help for the `pip` command.
If it doesn't, consult the documentation of your Python installation on how to enable it, or download a separate Python variant, for example try a [unified installer package from python.org][python.org-downloads].

Once ready, do:
```shell
python3 -m pip install --user --upgrade gpt4all typer
```

##### Windows

Download the [official installer from python.org][python.org-downloads] if Python isn't already present on your system. It includes all the necessary components. Run:
```shell
py -3 -m pip install --user --upgrade gpt4all typer
```

##### Linux

On Linux, a Python installation is often split into several packages and not all are necessarily installed by default. For example, on Debian/Ubuntu and derived distros, you will want to ensure their presence with the following:
```shell
sudo apt-get install python3-pip
```
The next steps are similar to the other platforms:
```shell
python3 -m pip install --user --upgrade gpt4all typer
```
On other distros, the situation might be different. The package names especially can vary a lot. You'll have to look it up in the documentation, software directory, or package search.

## Running the CLI

The CLI is a self-contained script called [app.py]. As such, you can [download][app.py-download] and save it anywhere you like, as long as the Python interpreter has access to the mentioned dependencies.

Note: different platforms can have slightly different ways to start Python. Whereas below the interpreter command is written as `python`, you typically want to type instead:

- On _Unix-like_ systems: `python3`
- On _Windows_: `py -3`

The simplest way to start the CLI is:
```shell
python app.py repl
```
This automatically selects the [groovy] model and downloads it into the `.cache/gpt4all/` folder of your home directory, if not already present.

[groovy]: https://huggingface.co/nomic-ai/gpt4all-j#model-details

If you want to use a different model, you can do so with the `-m`/`--model` parameter.
If only a model file name is provided, it will again check in `.cache/gpt4all/` and might start downloading. If instead given a path to an existing model, the command could for example look like this:
```shell
python app.py repl --model /home/user/my-gpt4all-models/gpt4all-13b-snoozy-q4_0.gguf
```
When you're done and want to end a session, simply type `/exit`.

To get help and information on all the available commands and options on the command-line, run:
```shell
python app.py --help
```
And while inside the running _REPL_, write `/help`.

Note that if you've installed the required packages into a _virtual environment_, you don't need to activate that every time you want to run the CLI. Instead, you can just start it with the Python interpreter in the folder `gpt4all-cli/bin/` (_Unix-like_) or `gpt4all-cli/Scripts/` (_Windows_). That also makes it easy to set an alias e.g. in [Bash][bash-aliases] or [PowerShell][posh-aliases]:

- Bash: `alias gpt4all="'/full/path/to/gpt4all-cli/bin/python' '/full/path/to/app.py' repl"`
- PowerShell:
  ```posh
  Function GPT4All-Venv-CLI {"C:\full\path\to\gpt4all-cli\Scripts\python.exe" "C:\full\path\to\app.py" repl}
  Set-Alias -Name gpt4all -Value GPT4All-Venv-CLI
  ```

Don't forget to save these in the start-up file of your shell.

[bash-aliases]: https://www.gnu.org/software/bash/manual/html_node/Aliases.html
[posh-aliases]: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/set-alias

Finally, if on _Windows_ you see a box instead of an arrow `⇢` as the prompt character, you should change the console font to one which offers better Unicode support.

================================================ FILE: gpt4all-bindings/python/docs/old/gpt4all_faq.md ================================================

# GPT4All FAQ

## What models are supported by the GPT4All ecosystem?

Currently, there are six different model architectures that are supported:

1.
GPT-J - Based off of the GPT-J architecture with examples found [here](https://huggingface.co/EleutherAI/gpt-j-6b)
2. LLaMA - Based off of the LLaMA architecture with examples found [here](https://huggingface.co/models?sort=downloads&search=llama)
3. MPT - Based off of Mosaic ML's MPT architecture with examples found [here](https://huggingface.co/mosaicml/mpt-7b)
4. Replit - Based off of Replit Inc.'s Replit architecture with examples found [here](https://huggingface.co/replit/replit-code-v1-3b)
5. Falcon - Based off of TII's Falcon architecture with examples found [here](https://huggingface.co/tiiuae/falcon-40b)
6. StarCoder - Based off of BigCode's StarCoder architecture with examples found [here](https://huggingface.co/bigcode/starcoder)

## Why so many different architectures? What differentiates them?

One of the major differences is license. Currently, the LLaMA-based models are subject to a non-commercial license, whereas the GPTJ and MPT base models allow commercial usage. However, LLaMA's successor, [Llama 2, is commercially licensable](https://ai.meta.com/llama/license/), too. Early in the recent explosion of activity in open-source local models, the LLaMA models have generally been seen as performing better, but that is changing quickly. Every week - even every day! - new models are released, with some of the GPTJ and MPT models competitive in performance/quality with LLaMA. What's more, there are some very nice architectural innovations with the MPT models that could lead to new performance/quality gains.

## How does GPT4All make these models available for CPU inference?

By leveraging the ggml library written by Georgi Gerganov and a growing community of developers. There are currently multiple different versions of this library. The original GitHub repo can be found [here](https://github.com/ggerganov/ggml), but the developer of the library has also created a LLaMA-based version [here](https://github.com/ggerganov/llama.cpp).
Currently, this backend is using the latter as a submodule.

## Does that mean GPT4All is compatible with all llama.cpp models and vice versa?

Yes! The upstream [llama.cpp](https://github.com/ggerganov/llama.cpp) project has recently introduced several [compatibility breaking] quantization methods. This change renders all previous models (including the ones that GPT4All uses) inoperative with newer versions of llama.cpp. Fortunately, we have engineered a submoduling system allowing us to dynamically load different versions of the underlying library so that GPT4All just works.

[compatibility breaking]: https://github.com/ggerganov/llama.cpp/commit/b9fd7eee57df101d4a3e3eabc9fd6c2cb13c9ca1

## What are the system requirements?

Your CPU needs to support [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and you need enough RAM to load a model into memory.

## What about GPU inference?

In newer versions of llama.cpp, there has been some added support for NVIDIA GPUs for inference. We're investigating how to incorporate this into our downloadable installers.

## Ok, so bottom line... how do I make my model on Hugging Face compatible with the GPT4All ecosystem right now?

1. Check to make sure the Hugging Face model is available in one of our three supported architectures
2. If it is, then you can use the conversion script inside of our pinned llama.cpp submodule for GPTJ and LLaMA based models
3. Or if your model is an MPT model you can use the conversion script located directly in this backend directory under the scripts subdirectory

## Language Bindings

#### There's a problem with the download

Some bindings can download a model, if allowed to do so. For example, in Python or TypeScript if `allow_download=True` or `allowDownload=true` (default), a model is automatically downloaded into `.cache/gpt4all/` in the user's home folder, unless it already exists.
In case of connection issues or errors during the download, you might want to manually verify the model file's MD5 checksum by comparing it with the one listed in [models3.json]. As an alternative to the basic downloader built into the bindings, you can choose to download from the website instead. Scroll down to 'Model Explorer' and pick your preferred model. [models3.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json #### I need the chat GUI and bindings to behave the same The chat GUI and bindings are based on the same backend. You can make them behave the same way by following these steps: - First of all, ensure that all parameters in the chat GUI settings match those passed to the generating API, e.g.: === "Python" ``` py from gpt4all import GPT4All model = GPT4All(...) model.generate("prompt text", temp=0, ...) # adjust parameters ``` === "TypeScript" ``` ts import { createCompletion, loadModel } from '../src/gpt4all.js' const ll = await loadModel(...); const messages = ... const re = await createCompletion(ll, messages, { temp: 0, ... }); // adjust parameters ``` - To make comparing the output easier, set _Temperature_ in both to 0 for now. This will make the output deterministic. - Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings. - Specifically, in Python: - With simple `generate()` calls, the input has to be surrounded with system and prompt templates. - When using a chat session, it depends on whether the bindings are allowed to download [models3.json]. If yes, and in the chat GUI the default templates are used, it'll be handled automatically. If no, use `chat_session()` template parameters to customize them. - Once you're done, remember to reset _Temperature_ to its previous value in both chat GUI and your custom code. 
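The manual MD5 verification suggested in the download troubleshooting above can be sketched with a few lines of standard-library Python. The model file name in the usage comment is a placeholder, not a specific download:

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage, with a placeholder file name; compare the printed
# digest against the checksum listed for your model in models3.json:
# print(file_md5(Path.home() / ".cache" / "gpt4all" / "some-model.gguf"))
```

Reading in chunks keeps memory usage flat even for multi-gigabyte model files.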
================================================ FILE: gpt4all-bindings/python/docs/old/gpt4all_monitoring.md ================================================

# Monitoring

Leverage OpenTelemetry to perform real-time monitoring of your LLM application and GPUs using [OpenLIT](https://github.com/openlit/openlit). This tool helps you easily collect data on user interactions and performance metrics, along with GPU performance metrics, which can assist in enhancing the functionality and dependability of your GPT4All-based LLM application.

## How does it work?

OpenLIT adds automatic OTel instrumentation to the GPT4All SDK. It covers the `generate` and `embedding` functions, helping to track LLM usage by gathering inputs and outputs. This allows users to monitor and evaluate the performance and behavior of their LLM application in different environments.

OpenLIT also provides OTel auto-instrumentation for monitoring GPU metrics like utilization, temperature, power usage, and memory usage.

Additionally, you have the flexibility to view and analyze the generated traces and metrics either in the OpenLIT UI or by exporting them to widely used observability tools like Grafana and DataDog for more comprehensive analysis and visualization.

## Getting Started

Here’s a straightforward guide to help you set up and start monitoring your application:

### 1. Install the OpenLIT SDK

Open your terminal and run:

```shell
pip install openlit
```

### 2. Setup Monitoring for your Application

In your application, initiate OpenLIT as outlined below:

```python
from gpt4all import GPT4All
import openlit

openlit.init()  # Initialize OpenLIT monitoring

model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')

# Start a chat session and send queries
with model.chat_session():
    response1 = model.generate(prompt='hello', temp=0)
    response2 = model.generate(prompt='write me a short poem', temp=0)
    response3 = model.generate(prompt='thank you', temp=0)
    print(model.current_chat_session)
```

This setup wraps your gpt4all model interactions, capturing valuable data about each request and response.

### 3. (Optional) Enable GPU Monitoring

If your application runs on NVIDIA GPUs, you can enable GPU stats collection in the OpenLIT SDK by adding `collect_gpu_stats=True`. This collects GPU metrics like utilization, temperature, power usage, and memory-related performance metrics. The collected metrics are OpenTelemetry gauges.

```python
from gpt4all import GPT4All
import openlit

openlit.init(collect_gpu_stats=True)  # Initialize OpenLIT monitoring

model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')

# Start a chat session and send queries
with model.chat_session():
    response1 = model.generate(prompt='hello', temp=0)
    response2 = model.generate(prompt='write me a short poem', temp=0)
    response3 = model.generate(prompt='thank you', temp=0)
    print(model.current_chat_session)
```

### Visualize

Once you've set up data collection with [OpenLIT](https://github.com/openlit/openlit), you can visualize and analyze this information to better understand your application's performance:

- **Using OpenLIT UI:** Connect to OpenLIT's UI to start exploring performance metrics. Visit the OpenLIT [Quickstart Guide](https://docs.openlit.io/latest/quickstart) for step-by-step details.
- **Integrate with existing Observability Tools:** If you use tools like Grafana or DataDog, you can integrate the data collected by OpenLIT.
For instructions on setting up these connections, check the OpenLIT [Connections Guide](https://docs.openlit.io/latest/connections/intro). ================================================ FILE: gpt4all-bindings/python/docs/old/gpt4all_nodejs.md ================================================ # GPT4All Node.js API Native Node.js LLM bindings for all. ```sh yarn add gpt4all@latest npm install gpt4all@latest pnpm install gpt4all@latest ``` ## Contents * See [API Reference](#api-reference) * See [Examples](#api-example) * See [Developing](#develop) * GPT4ALL nodejs bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use. ## Api Example ### Chat Completion ```js import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, loadModel } from '../src/gpt4all.js' const model = await loadModel( 'mistral-7b-openorca.gguf2.Q4_0.gguf', { verbose: true, device: 'gpu' }); const completion1 = await createCompletion(model, 'What is 1 + 1?', { verbose: true, }) console.log(completion1.message) const completion2 = await createCompletion(model, 'And if we add two?', { verbose: true }) console.log(completion2.message) model.dispose() ``` ### Embedding ```js import { loadModel, createEmbedding } from '../src/gpt4all.js' const embedder = await loadModel("all-MiniLM-L6-v2-f16.gguf", { verbose: true, type: 'embedding'}) console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way")); ``` ### Chat Sessions ```js import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", { verbose: true, device: "gpu", }); const chat = await model.createChatSession(); await createCompletion( chat, "Why are bananas rather blue than bread at night sometimes?", { verbose: true, } ); await createCompletion(chat, "Are you sure?", { verbose: true, }); ``` ### Streaming responses ```js 
import gpt from "../src/gpt4all.js"; const model = await gpt.loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", { device: "gpu", }); process.stdout.write("### Stream:"); const stream = gpt.createCompletionStream(model, "How are you?"); stream.tokens.on("data", (data) => { process.stdout.write(data); }); //wait till stream finishes. We cannot continue until this one is done. await stream.result; process.stdout.write("\n"); process.stdout.write("### Stream with pipe:"); const stream2 = gpt.createCompletionStream( model, "Please say something nice about node streams." ); stream2.tokens.pipe(process.stdout); await stream2.result; process.stdout.write("\n"); console.log("done"); model.dispose(); ``` ### Async Generators ```js import gpt from "../src/gpt4all.js"; const model = await gpt.loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", { device: "gpu", }); process.stdout.write("### Generator:"); const gen = gpt.createCompletionGenerator(model, "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)"); for await (const chunk of gen) { process.stdout.write(chunk); } process.stdout.write("\n"); model.dispose(); ``` ## Develop ### Build Instructions * binding.gyp is compile config * Tested on Ubuntu. Everything seems to work fine * Tested on Windows. Everything works fine. * Sparse testing on mac os. * MingW works as well to build the gpt4all-backend. **HOWEVER**, this package works only with MSVC built dlls. ### Requirements * git * [node.js >= 18.0.0](https://nodejs.org/en) * [yarn](https://yarnpkg.com/) * [node-gyp](https://github.com/nodejs/node-gyp) * all of its requirements. * (unix) gcc version 12 * (win) msvc version 143 * Can be obtained with visual studio 2022 build tools * python 3 * On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home * macOS users do not need Vulkan, as GPT4All will use Metal instead. 
### Build (from source)

```sh
git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all-bindings/typescript
```

* The below shell commands assume the current working directory is `typescript`.

* To build and rebuild:

```sh
yarn
```

* The llama.cpp git submodule for gpt4all may be absent. If this is the case, make sure to run the following in the llama.cpp parent directory:

```sh
git submodule update --init --depth 1 --recursive
```

```sh
yarn build:backend
```

This will build platform-dependent dynamic libraries, which will be located in `runtimes/(platform)/native`. The only current way to use them is to put them in the current working directory of your application. That is, **WHEREVER YOU RUN YOUR NODE APPLICATION**

* llama-xxxx.dll is required.
* According to whatever model you are using, you'll need to select the proper model loader.
  * For example, if you're running a Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary)

### Test

```sh
yarn test
```

### Source Overview

#### src/

* Extra functions to help aid devex
* Typings for the native node addon
* The javascript interface

#### test/

* Simple unit tests for some exported functions.
* More advanced AI testing is not handled

#### spec/

* Average look and feel of the api
* Should work assuming a model and libraries are installed locally in working directory

#### index.cc

* The bridge between Node.js and C. Where the bindings are.

#### prompt.cc

* Handling prompting and inference of models in a threadsafe, asynchronous way.

### Known Issues

* Why your model may be spewing bull 💩
  * The downloaded model is broken (just reinstall or download from the official site)
* Your model is hanging after a call to generate tokens.
  * Is `nPast` set too high? This may cause your model to hang (03/16/2024), Linux Mint, Ubuntu 22.04
* Your GPU usage is still high after node.js exits.
  * Make sure to call `model.dispose()`!!!
### Roadmap This package has been stabilizing over time, and breaking changes may still happen until the API stabilizes. Here's the todo list: * \[ ] Purely offline. Per the gui, which can be run completely offline, the bindings should be as well. * \[ ] NPM bundle size reduction via optionalDependencies strategy (need help) * Should include prebuilds to avoid painful node-gyp errors * \[x] createChatSession ( the python equivalent to create\_chat\_session ) * \[x] generateTokens, the new name for createTokenStream. As of 3.2.0, this is released but not 100% tested. Check spec/generator.mjs! * \[x] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete * \[x] prompt models via a threadsafe function in order to have proper non blocking behavior in nodejs * \[x] generateTokens is the new name for this^ * \[x] proper unit testing (integrate with circle ci) * \[x] publish to npm under alpha tag `gpt4all@alpha` * \[x] have more people test on other platforms (mac tester needed) * \[x] switch to new pluggable backend ### API Reference ##### Table of Contents * [type](#type) * [TokenCallback](#tokencallback) * [ChatSessionOptions](#chatsessionoptions) * [systemPrompt](#systemprompt) * [messages](#messages) * [initialize](#initialize) * [Parameters](#parameters) * [generate](#generate) * [Parameters](#parameters-1) * [InferenceModel](#inferencemodel) * [createChatSession](#createchatsession) * [Parameters](#parameters-2) * [generate](#generate-1) * [Parameters](#parameters-3) * [dispose](#dispose) * [EmbeddingModel](#embeddingmodel) * [dispose](#dispose-1) * [InferenceResult](#inferenceresult) * [LLModel](#llmodel) * [constructor](#constructor) * [Parameters](#parameters-4) * [type](#type-1) * [name](#name) * [stateSize](#statesize) *
[threadCount](#threadcount) * [setThreadCount](#setthreadcount) * [Parameters](#parameters-5) * [infer](#infer) * [Parameters](#parameters-6) * [embed](#embed) * [Parameters](#parameters-7) * [isModelLoaded](#ismodelloaded) * [setLibraryPath](#setlibrarypath) * [Parameters](#parameters-8) * [getLibraryPath](#getlibrarypath) * [initGpuByString](#initgpubystring) * [Parameters](#parameters-9) * [hasGpuDevice](#hasgpudevice) * [listGpu](#listgpu) * [Parameters](#parameters-10) * [dispose](#dispose-2) * [GpuDevice](#gpudevice) * [type](#type-2) * [LoadModelOptions](#loadmodeloptions) * [modelPath](#modelpath) * [librariesPath](#librariespath) * [modelConfigFile](#modelconfigfile) * [allowDownload](#allowdownload) * [verbose](#verbose) * [device](#device) * [nCtx](#nctx) * [ngl](#ngl) * [loadModel](#loadmodel) * [Parameters](#parameters-11) * [InferenceProvider](#inferenceprovider) * [createCompletion](#createcompletion) * [Parameters](#parameters-12) * [createCompletionStream](#createcompletionstream) * [Parameters](#parameters-13) * [createCompletionGenerator](#createcompletiongenerator) * [Parameters](#parameters-14) * [createEmbedding](#createembedding) * [Parameters](#parameters-15) * [CompletionOptions](#completionoptions) * [verbose](#verbose-1) * [onToken](#ontoken) * [Message](#message) * [role](#role) * [content](#content) * [prompt\_tokens](#prompt_tokens) * [completion\_tokens](#completion_tokens) * [total\_tokens](#total_tokens) * [n\_past\_tokens](#n_past_tokens) * [CompletionReturn](#completionreturn) * [model](#model) * [usage](#usage) * [message](#message-1) * [CompletionStreamReturn](#completionstreamreturn) * [LLModelPromptContext](#llmodelpromptcontext) * [logitsSize](#logitssize) * [tokensSize](#tokenssize) * [nPast](#npast) * [nPredict](#npredict) * [promptTemplate](#prompttemplate) * [nCtx](#nctx-1) * [topK](#topk) * [topP](#topp) * [minP](#minp) * [temperature](#temperature) * [nBatch](#nbatch) * [repeatPenalty](#repeatpenalty) * 
[repeatLastN](#repeatlastn) * [contextErase](#contexterase) * [DEFAULT\_DIRECTORY](#default_directory) * [DEFAULT\_LIBRARIES\_DIRECTORY](#default_libraries_directory) * [DEFAULT\_MODEL\_CONFIG](#default_model_config) * [DEFAULT\_PROMPT\_CONTEXT](#default_prompt_context) * [DEFAULT\_MODEL\_LIST\_URL](#default_model_list_url) * [downloadModel](#downloadmodel) * [Parameters](#parameters-16) * [Examples](#examples) * [DownloadModelOptions](#downloadmodeloptions) * [modelPath](#modelpath-1) * [verbose](#verbose-2) * [url](#url) * [md5sum](#md5sum) * [DownloadController](#downloadcontroller) * [cancel](#cancel) * [promise](#promise) #### type Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user. Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String) #### TokenCallback Callback for controlling token generation. Return false to stop token generation. Type: function (tokenId: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), token: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String), total: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)): [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean) #### ChatSessionOptions **Extends Partial\** Options for the chat session. ##### systemPrompt System prompt to ingest on initialization. Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String) ##### messages Messages to ingest on initialization. Type: [Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[Message](#message)> #### initialize Ingests system prompt and initial messages. Sets this chat session as the active chat session of the model. 
##### Parameters * `options` **[ChatSessionOptions](#chatsessionoptions)** The options for the chat session. Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\** #### generate Prompts the model in chat-session context. ##### Parameters * `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input. * `options` **[CompletionOptions](#completionoptions)?** Prompt context and other options. * `callback` **[TokenCallback](#tokencallback)?** Token generation callback. * Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the chat session is not the active chat session of the model. Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[CompletionReturn](#completionreturn)>** The model's response to the prompt. #### InferenceModel InferenceModel represents an LLM which can make chat predictions, similar to GPT transformers. ##### createChatSession Create a chat session with the model. ###### Parameters * `options` **[ChatSessionOptions](#chatsessionoptions)?** The options for the chat session. Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\** The chat session. ##### generate Prompts the model with a given input and optional parameters. ###### Parameters * `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** * `options` **[CompletionOptions](#completionoptions)?** Prompt context and other options. * `callback` **[TokenCallback](#tokencallback)?** Token generation callback. * `input` The prompt input. Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[CompletionReturn](#completionreturn)>** The model's response to the prompt. 
##### dispose delete and cleanup the native model Returns **void** #### EmbeddingModel EmbeddingModel represents an LLM which can create embeddings, which are float arrays ##### dispose delete and cleanup the native model Returns **void** #### InferenceResult Shape of LLModel's inference result. #### LLModel LLModel class representing a language model. This is a base class that provides common functionality for different types of language models. ##### constructor Initialize a new LLModel. ###### Parameters * `path` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** Absolute path to the model file. * Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model file does not exist. ##### type undefined or user supplied Returns **([string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String) | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))** ##### name The name of the model. Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** ##### stateSize Get the size of the internal state of the model. NOTE: This state data is specific to the type of model you have created. Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** the size in bytes of the internal state of the model ##### threadCount Get the number of threads used for model inference. The default is the number of physical cores your computer has. Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The number of threads used for model inference. ##### setThreadCount Set the number of threads used for model inference. ###### Parameters * `newNumber` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The new number of threads. 
Returns **void**

##### infer

Prompt the model with a given input and optional parameters. This is the raw output from the model; for a processed value, use the exported completion functions (e.g. `createCompletion`) instead.

###### Parameters

* `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input.
* `promptContext` **Partial<[LLModelPromptContext](#llmodelpromptcontext)>** Optional parameters for the prompt context.
* `callback` **[TokenCallback](#tokencallback)?** Optional callback to control token generation.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[InferenceResult](#inferenceresult)>** The result of the model prompt.

##### embed

Embed text with the model. For a processed value, use the exported `createEmbedding` function instead.

###### Parameters

* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The text to embed.

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding of the text.

##### isModelLoaded

Whether the model is loaded or not.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)**

##### setLibraryPath

Where to search for the pluggable backend libraries.

###### Parameters

* `s` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

Returns **void**

##### getLibraryPath

Where to get the pluggable backend libraries.

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

##### initGpuByString

Initiate a GPU by a string identifier.
###### Parameters

* `memory_required` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** Must be in the range of size\_t, or an error will be thrown.
* `device_name` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** 'amd' | 'nvidia' | 'intel' | 'gpu' | gpu name. Read LoadModelOptions.device for more information.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)**

##### hasGpuDevice

Whether a GPU device has been initialized for this model.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)** True if a GPU device is successfully initialized, false otherwise.

##### listGpu

GPUs that are usable for this LLModel.

###### Parameters

* `nCtx` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** Maximum size of the context window.
* Throws **any** if hasGpuDevice returns false.

Returns **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[GpuDevice](#gpudevice)>**

##### dispose

Delete and clean up the native model.

Returns **void**

#### GpuDevice

An object that contains GPU data for this machine.

##### type

Same as VkPhysicalDeviceType.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### LoadModelOptions

Options that configure a model's behavior.

##### modelPath

Where to look for model files.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### librariesPath

Where to look for the backend libraries.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### modelConfigFile

The path to the model configuration file, useful for offline usage or custom model configurations.
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### allowDownload

Whether to allow downloading the model if it is not present at the specified path.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### verbose

Enable verbose logging.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### device

The processing unit on which the model will run. It can be set to:

* "cpu": Model will run on the central processing unit.
* "gpu": Model will run on the best available graphics processing unit, irrespective of its vendor.
* "amd", "nvidia", "intel": Model will run on the best available GPU from the specified vendor.
* "gpu name": Model will run on the GPU that matches the name, if it is available.

Note: If a GPU device lacks sufficient RAM to accommodate the model, an error will be thrown, and the GPT4All instance will be rendered invalid. It's advised to ensure the device has enough memory before initiating the model.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### nCtx

The maximum context window size of this model.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### ngl

The number of GPU layers to offload.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### loadModel

Loads a machine learning model with the specified name. This is the de facto way to create a model. By default, this will download the model from the official GPT4All website if it is not present at the given path.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The name of the model to load.
* `options` **([LoadModelOptions](#loadmodeloptions) | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** (Optional) Additional options for loading the model.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<([InferenceModel](#inferencemodel) | [EmbeddingModel](#embeddingmodel))>** A promise that resolves to an instance of the loaded LLModel.

#### InferenceProvider

Interface for inference, implemented by InferenceModel and ChatSession.

#### createCompletion

The Node.js equivalent of the Python binding's chat\_completion.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionReturn](#completionreturn)** The completion result.

#### createCompletionStream

Streaming variant of createCompletion; returns a stream of tokens and a promise that resolves to the completion result.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionStreamReturn](#completionstreamreturn)** An object containing the token stream and the completion result promise.

#### createCompletionGenerator

Creates an async generator of tokens.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **AsyncGenerator<[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)>** The stream of generated tokens.

#### createEmbedding

The Node.js equivalent of the Python binding's Embed4All().embed().

##### Parameters

* `model` **[EmbeddingModel](#embeddingmodel)** The language model object.
* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The text to embed.

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding result.

#### CompletionOptions

**Extends Partial<[LLModelPromptContext](#llmodelpromptcontext)>**

The options for creating the completion.

##### verbose

Indicates if verbose logging is enabled.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### onToken

Callback for controlling token generation. Return false to stop processing.

Type: [TokenCallback](#tokencallback)

#### Message

A message in the conversation.

##### role

The role of the message.

Type: (`"system"` | `"assistant"` | `"user"`)

##### content

The message content.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### prompt\_tokens

The number of tokens used in the prompt. Currently not available and always 0.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### completion\_tokens

The number of tokens used in the completion.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### total\_tokens

The total number of tokens used. Currently not available and always 0.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### n\_past\_tokens

The number of tokens used in the conversation.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### CompletionReturn

The result of a completion.

##### model

The model used for the completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### usage

Token usage report.

Type: {prompt\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), completion\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), total\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), n\_past\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)}

##### message

The generated completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### CompletionStreamReturn

The result of a streamed completion, containing a stream of tokens and a promise that resolves to the completion result.

#### LLModelPromptContext

Model inference arguments for generating completions.

##### logitsSize

The size of the raw logits vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### tokensSize

The size of the raw tokens vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPast

The number of tokens in the past conversation. This controls how far back the model looks when generating completions.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPredict

The maximum number of tokens to predict.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### promptTemplate

Template for user / assistant message pairs. %1 is required and will be replaced by the user input.
%2 is optional and will be replaced by the assistant response.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### nCtx

Deprecated: this option has no effect. Use loadModel's nCtx option instead.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topK

The top-K logits to sample from. Top-K sampling selects the next token only from the top K most likely tokens predicted by the model. It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit the diversity of the output. A higher value for top-K (e.g., 100) will consider more tokens and lead to more diverse text, while a lower value (e.g., 10) will focus on the most probable tokens and generate more conservative text. 30-60 is a good range for most tasks.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topP

The nucleus sampling probability threshold. Top-P limits the selection of the next token to a subset of tokens with a cumulative probability above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity and quality by considering both token probabilities and the number of tokens available for sampling. When using a higher value for top-P (e.g., 0.95), the generated text becomes more diverse. On the other hand, a lower value (e.g., 0.1) produces more focused and conservative text.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### minP

The minimum probability for a token to be considered.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### temperature

The temperature to adjust the model's output distribution. Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures (e.g., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (e.g., 0.5) make the output more focused, predictable, and conservative. When the temperature is set to 0, the output becomes completely deterministic, always selecting the most probable next token and producing identical results each time. A safe range is around 0.6-0.85, but you are free to experiment with what fits best for you.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nBatch

The number of prompt tokens to process in parallel. By splitting the prompt every N tokens, the prompt batch size reduces RAM usage during processing. However, this can increase processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing. To ensure optimal performance, setting the prompt batch size to 2048 allows processing of all tokens in a single run.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatPenalty

The penalty factor for repeated tokens. The repeat penalty penalizes tokens based on how frequently they occur in the text, including the input prompt. A token that has already appeared five times is penalized more heavily than a token that has appeared only once. A value of 1 means there is no penalty; values larger than 1 discourage repeated tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatLastN

The number of most recent tokens to penalize. This option controls how many tokens of the history are considered for penalizing repetition. A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only consider recent tokens.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### contextErase

The percentage of context to erase if the context window is exceeded.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### DEFAULT\_DIRECTORY

From the Python API: models will be stored in `(homedir)/.cache/gpt4all/`.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_LIBRARIES\_DIRECTORY

From the Python API: the default path for dynamic libraries to be stored. You may separate paths by a semicolon to search in multiple areas. This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_MODEL\_CONFIG

Default model configuration.

Type: ModelConfig

#### DEFAULT\_PROMPT\_CONTEXT

Default prompt context.

Type: [LLModelPromptContext](#llmodelpromptcontext)

#### DEFAULT\_MODEL\_LIST\_URL

Default model list URL.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### downloadModel

Initiates the download of a model file. By default this downloads without waiting; use the returned controller to alter this behavior.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The model to be downloaded.
* `options` **[DownloadModelOptions](#downloadmodeloptions)** to pass into the downloader. Default is { location: (cwd), verbose: false }.

##### Examples

```javascript
const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
download.promise.then(() => console.log('Downloaded!'))
```

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model already exists in the specified location.
* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model cannot be found at the specified URL.

Returns **[DownloadController](#downloadcontroller)** object that allows controlling the download process.

#### DownloadModelOptions

Options for the model download process.

##### modelPath

Location to download the model to. Default is process.cwd(), the current working directory.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### verbose

Debug mode -- check how long the download took in seconds.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### url

Remote download URL. Defaults to `https://gpt4all.io/models/gguf/`.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### md5sum

MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum. If the sums do not match, an error will be thrown and the file will be deleted.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DownloadController

Model download controller.

##### cancel

Cancels the download request when called.

Type: function (): void

##### promise

A promise resolving to the downloaded model's config once the download is done.

Type: [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ModelConfig>

================================================
FILE: gpt4all-bindings/python/docs/old/gpt4all_python.md
================================================

# GPT4All Python Generation API

The `GPT4All` python package provides bindings to our C/C++ model backend libraries. The source code and local build instructions can be found [here](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/python).
## Quickstart

```bash
pip install gpt4all
```

``` py
from gpt4all import GPT4All
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
```

This will:

- Instantiate `GPT4All`, which is the primary public API to your large language model (LLM).
- Automatically download the given model to `~/.cache/gpt4all/` if not already present.

Read further to see how to chat with this model.

### Chatting with GPT4All

To start chatting with a local LLM, you will need to start a chat session. Within a chat session, the model will be prompted with the appropriate template, and history will be preserved between successive calls to `generate()`.

=== "GPT4All Example"
    ``` py
    model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')
    with model.chat_session():
        response1 = model.generate(prompt='hello', temp=0)
        response2 = model.generate(prompt='write me a short poem', temp=0)
        response3 = model.generate(prompt='thank you', temp=0)
        print(model.current_chat_session)
    ```
=== "Output"
    ``` py
    [
        {'role': 'user', 'content': 'hello'},
        {'role': 'assistant', 'content': 'What is your name?'},
        {'role': 'user', 'content': 'write me a short poem'},
        {'role': 'assistant', 'content': "I would love to help you with that! Here's a short poem I came up with:\nBeneath the autumn leaves,\nThe wind whispers through the trees.\nA gentle breeze, so at ease,\nAs if it were born to play.\nAnd as the sun sets in the sky,\nThe world around us grows still."},
        {'role': 'user', 'content': 'thank you'},
        {'role': 'assistant', 'content': "You're welcome! I hope this poem was helpful or inspiring for you. Let me know if there is anything else I can assist you with."}
    ]
    ```

When using GPT4All models in the `chat_session()` context:

- Consecutive chat exchanges are taken into account and not discarded until the session ends, as long as the model has capacity.
- A system prompt is inserted into the beginning of the model's context.
- Each prompt passed to `generate()` is wrapped in the appropriate prompt template.
If you pass `allow_download=False` to GPT4All or are using a model that is not from the [official models list][models3.json], you must pass a prompt template using the `prompt_template` parameter of `chat_session()`.

NOTE: If you do not use `chat_session()`, calls to `generate()` will not be wrapped in a prompt template. This will cause the model to *continue* the prompt instead of *answering* it. When in doubt, use a chat session, as many newer models are designed to be used exclusively with a prompt template.

[models3.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json

### Streaming Generations

To interact with GPT4All responses as the model generates, use the `streaming=True` flag during generation.

=== "GPT4All Streaming Example"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
    tokens = []
    with model.chat_session():
        for token in model.generate("What is the capital of France?", streaming=True):
            tokens.append(token)
    print(tokens)
    ```
=== "Output"
    ```
    [' The', ' capital', ' of', ' France', ' is', ' Paris', '.']
    ```

### The Generate Method API

::: gpt4all.gpt4all.GPT4All.generate

## Examples & Explanations

### Influencing Generation

The three most influential parameters in generation are _Temperature_ (`temp`), _Top-p_ (`top_p`) and _Top-K_ (`top_k`). In a nutshell, during the process of selecting the next token, not just one or a few are considered, but every single token in the vocabulary is given a probability. The parameters can change the field of candidate tokens.

- **Temperature** makes the process either more or less random. A _Temperature_ above 1 increasingly "levels the playing field", while at a _Temperature_ between 0 and 1 the likelihood of the best token candidates grows even more. A _Temperature_ of 0 results in selecting the best token, making the output deterministic. A _Temperature_ of 1 represents a neutral setting with regard to randomness in the process.
- _Top-p_ and _Top-K_ both narrow the field:
    - **Top-K** limits candidate tokens to a fixed number after sorting by probability. Setting it higher than the vocabulary size deactivates this limit.
    - **Top-p** selects tokens based on their total probabilities. For example, a value of 0.8 means "include the best tokens, whose accumulated probabilities reach or just surpass 80%". Setting _Top-p_ to 1, which is 100%, effectively disables it.

The recommendation is to keep at least one of _Top-K_ and _Top-p_ active. Other parameters can also influence generation; be sure to review all their descriptions.

### Specifying the Model Folder

The model folder can be set with the `model_path` parameter when creating a `GPT4All` instance. The example below is the same as if it weren't provided; that is, `~/.cache/gpt4all/` is the default folder.

``` py
from pathlib import Path
from gpt4all import GPT4All
model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf',
                model_path=Path.home() / '.cache' / 'gpt4all')
```

If you want to point it at the chat GUI's default folder, it should be:

=== "macOS"
    ``` py
    from pathlib import Path
    from gpt4all import GPT4All

    model_name = 'orca-mini-3b-gguf2-q4_0.gguf'
    model_path = Path.home() / 'Library' / 'Application Support' / 'nomic.ai' / 'GPT4All'
    model = GPT4All(model_name, model_path)
    ```
=== "Windows"
    ``` py
    from pathlib import Path
    from gpt4all import GPT4All
    import os

    model_name = 'orca-mini-3b-gguf2-q4_0.gguf'
    model_path = Path(os.environ['LOCALAPPDATA']) / 'nomic.ai' / 'GPT4All'
    model = GPT4All(model_name, model_path)
    ```
=== "Linux"
    ``` py
    from pathlib import Path
    from gpt4all import GPT4All

    model_name = 'orca-mini-3b-gguf2-q4_0.gguf'
    model_path = Path.home() / '.local' / 'share' / 'nomic.ai' / 'GPT4All'
    model = GPT4All(model_name, model_path)
    ```

Alternatively, you could also change the module's default model directory:

``` py
from pathlib import Path
from gpt4all import GPT4All, gpt4all
gpt4all.DEFAULT_MODEL_DIRECTORY = Path.home() / 'my' / 'models-directory'
model = GPT4All('orca-mini-3b-gguf2-q4_0.gguf')
```

### Managing Templates

When using a `chat_session()`, you may customize the system prompt, and set the prompt template if necessary:

=== "GPT4All Custom Session Templates Example"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All('wizardlm-13b-v1.2.Q4_0.gguf')
    system_template = 'A chat between a curious user and an artificial intelligence assistant.\n'
    # many models use triple hash '###' for keywords, Vicunas are simpler:
    prompt_template = 'USER: {0}\nASSISTANT: '
    with model.chat_session(system_template, prompt_template):
        response1 = model.generate('why is the grass green?')
        print(response1)
        print()
        response2 = model.generate('why is the sky blue?')
        print(response2)
    ```
=== "Possible Output"
    ```
    The color of grass can be attributed to its chlorophyll content, which allows it
    to absorb light energy from sunlight through photosynthesis. Chlorophyll absorbs
    blue and red wavelengths of light while reflecting other colors such as yellow
    and green. This is why the leaves appear green to our eyes.

    The color of the sky appears blue due to a phenomenon called Rayleigh scattering,
    which occurs when sunlight enters Earth's atmosphere and interacts with air
    molecules such as nitrogen and oxygen. Blue light has shorter wavelength than
    other colors in the visible spectrum, so it is scattered more easily by these
    particles, making the sky appear blue to our eyes.
    ```

### Without Online Connectivity

To prevent GPT4All from accessing online resources, instantiate it with `allow_download=False`. When using this flag, there will be no system prompt by default, and you must specify the prompt template yourself.
You can retrieve a model's default system prompt and prompt template with an online instance of GPT4All:

=== "Prompt Template Retrieval"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All('orca-mini-3b-gguf2-q4_0.gguf')
    print(repr(model.config['systemPrompt']))
    print(repr(model.config['promptTemplate']))
    ```
=== "Output"
    ```py
    '### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n'
    '### User:\n{0}\n### Response:\n'
    ```

Then you can pass them explicitly when creating an offline instance:

``` py
from gpt4all import GPT4All
model = GPT4All('orca-mini-3b-gguf2-q4_0.gguf', allow_download=False)

system_prompt = '### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n'
prompt_template = '### User:\n{0}\n\n### Response:\n'

with model.chat_session(system_prompt=system_prompt, prompt_template=prompt_template):
    ...
```

### Interrupting Generation

The simplest way to stop generation is to set a fixed upper limit with the `max_tokens` parameter.

If you know exactly when a model should stop responding, you can add a custom callback, like so:

=== "GPT4All Custom Stop Callback"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All('orca-mini-3b-gguf2-q4_0.gguf')

    def stop_on_token_callback(token_id, token_string):
        # one sentence is enough:
        if '.' in token_string:
            return False
        else:
            return True

    response = model.generate('Blue Whales are the biggest animal to ever inhabit the Earth.',
                              temp=0, callback=stop_on_token_callback)
    print(response)
    ```
=== "Output"
    ```
    They can grow up to 100 feet (30 meters) long and weigh as much as 20 tons (18 metric tons).
    ```

## API Documentation

::: gpt4all.gpt4all.GPT4All

================================================
FILE: gpt4all-bindings/python/docs/old/gpt4all_python_embedding.md
================================================

# Embeddings

GPT4All supports generating high quality embeddings of arbitrary length text using any embedding model supported by llama.cpp.

An embedding is a vector representation of a piece of text. Embeddings are useful for tasks such as retrieval for question answering (including retrieval augmented generation or *RAG*), semantic similarity search, classification, and topic clustering.

## Supported Embedding Models

The following models have built-in support in Embed4All:

| Name               | Embed4All `model_name`          | Context Length | Embedding Length | File Size |
|--------------------|---------------------------------|---------------:|-----------------:|----------:|
| [SBert]            | all‑MiniLM‑L6‑v2.gguf2.f16.gguf | 512            | 384              | 44 MiB    |
| [Nomic Embed v1]   | nomic‑embed‑text‑v1.f16.gguf    | 2048           | 768              | 262 MiB   |
| [Nomic Embed v1.5] | nomic‑embed‑text‑v1.5.f16.gguf  | 2048           | 64-768           | 262 MiB   |

The context length is the maximum number of word pieces, or *tokens*, that a model can embed at once. Embedding texts longer than a model's context length requires some kind of strategy; see [Embedding Longer Texts] for more information.

The embedding length is the size of the vector returned by `Embed4All.embed`.

[SBert]: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
[Nomic Embed v1]: https://huggingface.co/nomic-ai/nomic-embed-text-v1
[Nomic Embed v1.5]: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
[Embedding Longer Texts]: #embedding-longer-texts

## Quickstart

```bash
pip install gpt4all
```

### Generating Embeddings

By default, embeddings will be generated on the CPU using all-MiniLM-L6-v2.
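Embeddings produced by `Embed4All.embed` are plain float vectors, so comparing two texts for semantic similarity reduces to a few lines of arithmetic. Below is a minimal cosine-similarity sketch in plain Python; the `cosine_similarity` helper and the toy vectors are illustrative stand-ins, not part of the gpt4all API — in practice you would pass two `embed` outputs:

```python
from math import sqrt

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|) for two equal-length float lists,
    # e.g. two outputs of Embed4All.embed()
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# toy vectors standing in for real embeddings
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # prints: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # prints: 0.0
```

Values near 1 indicate semantically similar texts; values near 0 indicate unrelated ones.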
=== "Embed4All Example"
    ```py
    from gpt4all import Embed4All
    text = 'The quick brown fox jumps over the lazy dog'
    embedder = Embed4All()
    output = embedder.embed(text)
    print(output)
    ```
=== "Output"
    ```
    [0.034696947783231735, -0.07192722707986832, 0.06923297047615051, ...]
    ```

You can also use the GPU to accelerate the embedding model by specifying the `device` parameter. See the [GPT4All constructor] for more information.

=== "GPU Example"
    ```py
    from gpt4all import Embed4All
    text = 'The quick brown fox jumps over the lazy dog'
    embedder = Embed4All(device='gpu')
    output = embedder.embed(text)
    print(output)
    ```
=== "Output"
    ```
    [0.034696947783231735, -0.07192722707986832, 0.06923297047615051, ...]
    ```

[GPT4All constructor]: gpt4all_python.md#gpt4all.gpt4all.GPT4All.__init__

### Nomic Embed

Embed4All has built-in support for Nomic's open-source embedding model, [Nomic Embed]. When using this model, you must specify the task type using the `prefix` argument. This may be one of `search_query`, `search_document`, `classification`, or `clustering`. For retrieval applications, you should prepend `search_document` for all of your documents and `search_query` for your queries. See the [Nomic Embedding Guide] for more info.

=== "Nomic Embed Example"
    ```py
    from gpt4all import Embed4All
    text = 'Who is Laurens van der Maaten?'
    embedder = Embed4All('nomic-embed-text-v1.f16.gguf')
    output = embedder.embed(text, prefix='search_query')
    print(output)
    ```
=== "Output"
    ```
    [-0.013357644900679588, 0.027070969343185425, -0.0232995692640543, ...]
    ```

[Nomic Embed]: https://blog.nomic.ai/posts/nomic-embed-text-v1
[Nomic Embedding Guide]: https://docs.nomic.ai/atlas/guides/embeddings#embedding-task-types

### Embedding Longer Texts

Embed4All accepts a parameter called `long_text_mode`. This controls the behavior of Embed4All for texts longer than the context length of the embedding model.
In the default mode of "mean", Embed4All will break long inputs into chunks and average their embeddings to compute the final result.

To change this behavior, you can set the `long_text_mode` parameter to "truncate", which will truncate the input to the sequence length of the model before generating a single embedding.

=== "Truncation Example"
    ```py
    from gpt4all import Embed4All
    text = 'The ' * 512 + 'The quick brown fox jumps over the lazy dog'
    embedder = Embed4All()

    output = embedder.embed(text, long_text_mode="mean")
    print(output)
    print()
    output = embedder.embed(text, long_text_mode="truncate")
    print(output)
    ```
=== "Output"
    ```
    [0.0039850445464253426, 0.04558328539133072, 0.0035536508075892925, ...]

    [-0.009771130047738552, 0.034792833030223846, -0.013273917138576508, ...]
    ```

### Batching

You can send multiple texts to Embed4All in a single call. This can give faster results when individual texts are significantly smaller than `n_ctx` tokens. (`n_ctx` defaults to 2048.)

=== "Batching Example"
    ```py
    from gpt4all import Embed4All
    texts = ['The quick brown fox jumps over the lazy dog', 'Foo bar baz']
    embedder = Embed4All()
    output = embedder.embed(texts)
    print(output[0])
    print()
    print(output[1])
    ```
=== "Output"
    ```
    [0.03551332652568817, 0.06137588247656822, 0.05281158909201622, ...]

    [-0.03879690542817116, 0.00013223080895841122, 0.023148687556385994, ...]
    ```

The number of texts that can be embedded in one pass of the model is proportional to the `n_ctx` parameter of Embed4All. Increasing it may increase batched embedding throughput if you have a fast GPU, at the cost of VRAM.

```py
embedder = Embed4All(n_ctx=4096, device='gpu')
```

### Resizable Dimensionality

The embedding dimension of Nomic Embed v1.5 can be resized using the `dimensionality` parameter. This parameter supports any value between 64 and 768. Shorter embeddings use less storage, memory, and bandwidth with a small performance cost. See the [blog post] for more info.
[blog post]: https://blog.nomic.ai/posts/nomic-embed-matryoshka

=== "Matryoshka Example"
    ```py
    from gpt4all import Embed4All
    text = 'The quick brown fox jumps over the lazy dog'
    embedder = Embed4All('nomic-embed-text-v1.5.f16.gguf')
    output = embedder.embed(text, dimensionality=64)
    print(len(output))
    print(output)
    ```
=== "Output"
    ```
    64
    [-0.03567073494195938, 0.1301717758178711, -0.4333043396472931, ...]
    ```

### API documentation

::: gpt4all.gpt4all.Embed4All


================================================
FILE: gpt4all-bindings/python/docs/old/index.md
================================================
# GPT4All

Welcome to the GPT4All documentation

GPT4All is an open-source software ecosystem for anyone to run large language models (LLMs) **privately** on **everyday laptop & desktop computers**. No API calls or GPUs required.

The GPT4All Desktop Application lets you interact with LLMs and integrate them with your local docs & local data for RAG (retrieval-augmented generation). No coding is required: just install the application, download the models of your choice, and you are ready to use your LLM.

Your local data is **yours**. GPT4All handles the retrieval privately and on-device to fetch relevant data to support your queries to your LLM.

Nomic AI oversees contributions to GPT4All to ensure quality, security, and maintainability. Additionally, Nomic AI has open-sourced code for training and deploying your own customized LLMs internally.

GPT4All software is optimized to run inference of 3-13 billion parameter large language models on the CPUs of laptops, desktops and servers.

=== "GPT4All Example"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
    output = model.generate("The capital of France is ", max_tokens=3)
    print(output)
    ```
=== "Output"
    ```
    1. Paris
    ```

See [Python Bindings](gpt4all_python.md) to use GPT4All.
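The RAG workflow described above comes down to embedding your documents and your query with the same model, then ranking documents by similarity to the query. That ranking step can be sketched in plain Python with made-up 4-dimensional vectors (real embedding models such as Embed4All produce vectors with hundreds of dimensions; the document names and numbers here are purely illustrative):

```python
import math

# Toy corpus: pretend these 4-dimensional vectors came from an embedding
# model. Real Embed4All output has hundreds of dimensions.
docs = {
    "cats": [0.9, 0.1, 0.0, 0.1],
    "finance": [0.0, 0.8, 0.6, 0.0],
    "cooking": [0.1, 0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1, 0.0]  # points in roughly the same direction as "cats"

def cosine(a, b):
    # Cosine similarity: dot product of the vectors divided by their norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank documents by similarity to the query, most similar first.
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked[0])  # → cats
```

In a real pipeline you would embed the documents with `prefix='search_document'` and the query with `prefix='search_query'`, then feed the top-ranked documents to the LLM as context.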
### Navigating the Documentation

In an effort to ensure cross-operating-system and cross-language compatibility, the [GPT4All software ecosystem](https://github.com/nomic-ai/gpt4all) is organized as a monorepo with the following structure:

- **gpt4all-backend**: The GPT4All backend maintains and exposes a universal, performance-optimized C API for running inference with multi-billion parameter Transformer Decoders. This C API is then bound to any higher-level programming language such as C++, Python, Go, etc.
- **gpt4all-bindings**: GPT4All bindings contain a variety of high-level programming languages that implement the C API. Each directory contains the bindings for one programming language. The [CLI](gpt4all_cli.md) is included here as well.
- **gpt4all-chat**: GPT4All Chat is an OS-native chat application that runs on macOS, Windows and Linux. It is the easiest way to run local, privacy-aware chat assistants on everyday hardware. You can download it on the [GPT4All Website](https://gpt4all.io) and read its source code in the monorepo.

Explore detailed documentation for the backend, bindings and chat client in the sidebar.

## Models

The GPT4All software ecosystem is compatible with the following Transformer architectures:

- `Falcon`
- `LLaMA` (including `OpenLLaMA`)
- `MPT` (including `Replit`)
- `GPT-J`

You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models3.json).

GPT4All models are artifacts produced through a process known as neural network quantization. A multi-billion parameter Transformer Decoder usually takes 30+ GB of VRAM to execute a forward pass. Most people do not have such a powerful computer or access to GPU hardware. By running trained LLMs through quantization algorithms, some GPT4All models can run on your laptop using only 4-8 GB of RAM, enabling their widespread usage.
Bigger models might still require more RAM, however.

Any model trained with one of these architectures can be quantized and run locally with all GPT4All bindings and in the chat client. You can add new variants by contributing to the gpt4all-backend.

## Frequently Asked Questions

Find answers to frequently asked questions by searching the [GitHub issues](https://github.com/nomic-ai/gpt4all/issues) or in the [documentation FAQ](gpt4all_faq.md).

## Getting the most out of your local LLM

**Inference Speed** of a local LLM depends on two factors: model size and the number of tokens given as input. Prompting local LLMs with large chunks of context is not advised, as their inference speed will degrade heavily. You will likely want to run GPT4All models on a GPU if you would like to utilize context windows larger than 750 tokens. Native GPU support for GPT4All models is planned.

**Inference Performance:** Which model is best? That depends on your use case. The ability of an LLM to faithfully follow instructions depends on the quantity and diversity of the data it was pre-trained on, and on the diversity, quality and factuality of the data it was fine-tuned on. A goal of GPT4All is to bring the most powerful local assistant model to your desktop, and Nomic AI is actively working to improve its performance and quality.
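The quantization process mentioned earlier can be illustrated with a toy symmetric 4-bit scheme. This sketch is purely illustrative: the GGUF quantization formats GPT4All actually uses are more sophisticated, but the core idea of trading a little precision for a large memory saving is the same.

```python
# Toy symmetric 4-bit quantization of one block of weights (a hypothetical
# scheme for illustration only; real GGUF formats are more involved).
weights = [0.82, -0.33, 0.05, -1.20, 0.47, 0.00, -0.76, 1.10]

# Choose a per-block scale so the largest weight maps to +/-7.
scale = max(abs(w) for w in weights) / 7

# Quantize: each weight becomes a small integer in [-8, 7], which fits in
# 4 bits instead of the 32 bits of a float — an 8x memory saving per weight.
quantized = [max(-8, min(7, round(w / scale))) for w in weights]

# Dequantize: multiply back by the scale to recover an approximation.
restored = [q * scale for q in quantized]

max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized)  # → [5, -2, 0, -7, 3, 0, -4, 6]
print(max_err)    # rounding error is bounded by about scale / 2
```

In practice the weights are split into many such blocks, each with its own scale, which keeps the approximation error small while shrinking a 30+ GB model into a few GB of RAM.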
================================================ FILE: gpt4all-bindings/python/gpt4all/__init__.py ================================================ from .gpt4all import CancellationError as CancellationError, Embed4All as Embed4All, GPT4All as GPT4All ================================================ FILE: gpt4all-bindings/python/gpt4all/_pyllmodel.py ================================================ from __future__ import annotations import ctypes import os import platform import subprocess import sys import textwrap import threading from enum import Enum from queue import Queue from typing import TYPE_CHECKING, Any, Callable, Generic, Iterable, Iterator, Literal, NoReturn, TypeVar, overload if sys.version_info >= (3, 9): import importlib.resources as importlib_resources else: import importlib_resources if (3, 9) <= sys.version_info < (3, 11): # python 3.9 broke generic TypedDict, python 3.11 fixed it from typing_extensions import TypedDict else: from typing import TypedDict if TYPE_CHECKING: from typing_extensions import ParamSpec, TypeAlias T = TypeVar("T") P = ParamSpec("P") EmbeddingsType = TypeVar('EmbeddingsType', bound='list[Any]') cuda_found: bool = False # TODO(jared): use operator.call after we drop python 3.10 support def _operator_call(obj: Callable[P, T], /, *args: P.args, **kwargs: P.kwargs) -> T: return obj(*args, **kwargs) # Detect Rosetta 2 @_operator_call def check_rosetta() -> None: if platform.system() == "Darwin" and platform.processor() == "i386": p = subprocess.run("sysctl -n sysctl.proc_translated".split(), capture_output=True, text=True) if p.returncode == 0 and p.stdout.strip() == "1": raise RuntimeError(textwrap.dedent("""\ Running GPT4All under Rosetta is not supported due to CPU feature requirements. Please install GPT4All in an environment that uses a native ARM64 Python interpreter. 
""").strip()) # Check for C++ runtime libraries if platform.system() == "Windows": try: ctypes.CDLL("msvcp140.dll") ctypes.CDLL("vcruntime140.dll") ctypes.CDLL("vcruntime140_1.dll") except OSError as e: print(textwrap.dedent(f"""\ {e!r} The Microsoft Visual C++ runtime libraries were not found. Please install them from https://aka.ms/vs/17/release/vc_redist.x64.exe """), file=sys.stderr) @_operator_call def find_cuda() -> None: global cuda_found def _load_cuda(rtver: str, blasver: str) -> None: if platform.system() == "Linux": cudalib = f"lib/libcudart.so.{rtver}" cublaslib = f"lib/libcublas.so.{blasver}" else: # Windows cudalib = fr"bin\cudart64_{rtver.replace('.', '')}.dll" cublaslib = fr"bin\cublas64_{blasver}.dll" # preload the CUDA libs so the backend can find them ctypes.CDLL(os.path.join(cuda_runtime.__path__[0], cudalib), mode=ctypes.RTLD_GLOBAL) ctypes.CDLL(os.path.join(cublas.__path__[0], cublaslib), mode=ctypes.RTLD_GLOBAL) # Find CUDA libraries from the official packages if platform.system() in ("Linux", "Windows"): try: from nvidia import cuda_runtime, cublas except ImportError: pass # CUDA is optional else: for rtver, blasver in [("12", "12"), ("11.0", "11")]: try: _load_cuda(rtver, blasver) cuda_found = True except OSError: # dlopen() does not give specific error codes pass # try the next one # TODO: provide a config file to make this more robust MODEL_LIB_PATH = importlib_resources.files("gpt4all") / "llmodel_DO_NOT_MODIFY" / "build" def load_llmodel_library(): ext = {"Darwin": "dylib", "Linux": "so", "Windows": "dll"}[platform.system()] try: # macOS, Linux, MinGW lib = ctypes.CDLL(str(MODEL_LIB_PATH / f"libllmodel.{ext}")) except FileNotFoundError: if ext != 'dll': raise # MSVC lib = ctypes.CDLL(str(MODEL_LIB_PATH / "llmodel.dll")) return lib llmodel = load_llmodel_library() class LLModelPromptContext(ctypes.Structure): _fields_ = [ ("n_predict", ctypes.c_int32), ("top_k", ctypes.c_int32), ("top_p", ctypes.c_float), ("min_p", ctypes.c_float), 
("temp", ctypes.c_float), ("n_batch", ctypes.c_int32), ("repeat_penalty", ctypes.c_float), ("repeat_last_n", ctypes.c_int32), ("context_erase", ctypes.c_float), ] class LLModelGPUDevice(ctypes.Structure): _fields_ = [ ("backend", ctypes.c_char_p), ("index", ctypes.c_int32), ("type", ctypes.c_int32), ("heapSize", ctypes.c_size_t), ("name", ctypes.c_char_p), ("vendor", ctypes.c_char_p), ] # Define C function signatures using ctypes llmodel.llmodel_model_create.argtypes = [ctypes.c_char_p] llmodel.llmodel_model_create.restype = ctypes.c_void_p llmodel.llmodel_model_create2.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p)] llmodel.llmodel_model_create2.restype = ctypes.c_void_p llmodel.llmodel_model_destroy.argtypes = [ctypes.c_void_p] llmodel.llmodel_model_destroy.restype = None llmodel.llmodel_loadModel.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int] llmodel.llmodel_loadModel.restype = ctypes.c_bool llmodel.llmodel_required_mem.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int] llmodel.llmodel_required_mem.restype = ctypes.c_size_t llmodel.llmodel_isModelLoaded.argtypes = [ctypes.c_void_p] llmodel.llmodel_isModelLoaded.restype = ctypes.c_bool PromptCallback = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.POINTER(ctypes.c_int32), ctypes.c_size_t, ctypes.c_bool) ResponseCallback = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.c_int32, ctypes.c_char_p) EmbCancelCallback = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.POINTER(ctypes.c_uint), ctypes.c_uint, ctypes.c_char_p) SpecialTokenCallback = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p) llmodel.llmodel_prompt.argtypes = [ ctypes.c_void_p, ctypes.c_char_p, PromptCallback, ResponseCallback, ctypes.POINTER(LLModelPromptContext), ctypes.POINTER(ctypes.c_char_p), ] llmodel.llmodel_prompt.restype = ctypes.c_bool llmodel.llmodel_embed.argtypes = [ ctypes.c_void_p, ctypes.POINTER(ctypes.c_char_p), ctypes.POINTER(ctypes.c_size_t), ctypes.c_char_p, ctypes.c_int, 
ctypes.POINTER(ctypes.c_size_t), ctypes.c_bool, ctypes.c_bool, EmbCancelCallback, ctypes.POINTER(ctypes.c_char_p), ] llmodel.llmodel_embed.restype = ctypes.POINTER(ctypes.c_float) llmodel.llmodel_free_embedding.argtypes = [ctypes.POINTER(ctypes.c_float)] llmodel.llmodel_free_embedding.restype = None llmodel.llmodel_setThreadCount.argtypes = [ctypes.c_void_p, ctypes.c_int32] llmodel.llmodel_setThreadCount.restype = None llmodel.llmodel_set_implementation_search_path.argtypes = [ctypes.c_char_p] llmodel.llmodel_set_implementation_search_path.restype = None llmodel.llmodel_threadCount.argtypes = [ctypes.c_void_p] llmodel.llmodel_threadCount.restype = ctypes.c_int32 llmodel.llmodel_set_implementation_search_path(str(MODEL_LIB_PATH).encode()) llmodel.llmodel_available_gpu_devices.argtypes = [ctypes.c_size_t, ctypes.POINTER(ctypes.c_int32)] llmodel.llmodel_available_gpu_devices.restype = ctypes.POINTER(LLModelGPUDevice) llmodel.llmodel_gpu_init_gpu_device_by_string.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_char_p] llmodel.llmodel_gpu_init_gpu_device_by_string.restype = ctypes.c_bool llmodel.llmodel_gpu_init_gpu_device_by_struct.argtypes = [ctypes.c_void_p, ctypes.POINTER(LLModelGPUDevice)] llmodel.llmodel_gpu_init_gpu_device_by_struct.restype = ctypes.c_bool llmodel.llmodel_gpu_init_gpu_device_by_int.argtypes = [ctypes.c_void_p, ctypes.c_int32] llmodel.llmodel_gpu_init_gpu_device_by_int.restype = ctypes.c_bool llmodel.llmodel_model_backend_name.argtypes = [ctypes.c_void_p] llmodel.llmodel_model_backend_name.restype = ctypes.c_char_p llmodel.llmodel_model_gpu_device_name.argtypes = [ctypes.c_void_p] llmodel.llmodel_model_gpu_device_name.restype = ctypes.c_char_p llmodel.llmodel_count_prompt_tokens.argtypes = [ctypes.c_void_p, ctypes.POINTER(ctypes.c_char_p)] llmodel.llmodel_count_prompt_tokens.restype = ctypes.c_int32 llmodel.llmodel_model_foreach_special_token.argtypes = [ctypes.c_void_p, SpecialTokenCallback] 
llmodel.llmodel_model_foreach_special_token.restype = None ResponseCallbackType = Callable[[int, str], bool] RawResponseCallbackType = Callable[[int, bytes], bool] EmbCancelCallbackType: TypeAlias = 'Callable[[list[int], str], bool]' def empty_response_callback(token_id: int, response: str) -> bool: return True # Symbol to terminate from generator class Sentinel(Enum): TERMINATING_SYMBOL = 0 class EmbedResult(Generic[EmbeddingsType], TypedDict): embeddings: EmbeddingsType n_prompt_tokens: int class CancellationError(Exception): """raised when embedding is canceled""" class LLModel: """ Base class and universal wrapper for GPT4All language models built around llmodel C-API. Parameters ---------- model_path : str Path to the model. n_ctx : int Maximum size of context window ngl : int Number of GPU layers to use (Vulkan) backend : str Backend to use. One of 'auto', 'cpu', 'metal', 'kompute', or 'cuda'. """ def __init__(self, model_path: str, n_ctx: int, ngl: int, backend: str): self.model_path = model_path.encode() self.n_ctx = n_ctx self.ngl = ngl self.buffer = bytearray() self.buff_expecting_cont_bytes: int = 0 # Construct a model implementation err = ctypes.c_char_p() model = llmodel.llmodel_model_create2(self.model_path, backend.encode(), ctypes.byref(err)) if model is None: s = err.value errmsg = 'null' if s is None else s.decode() if ( backend == 'cuda' and not cuda_found and errmsg.startswith('Could not find any implementations for backend') ): print('WARNING: CUDA runtime libraries not found. 
Try `pip install "gpt4all[cuda]"`\n', file=sys.stderr) raise RuntimeError(f"Unable to instantiate model: {errmsg}") self.model: ctypes.c_void_p | None = model self.special_tokens_map: dict[str, str] = {} llmodel.llmodel_model_foreach_special_token( self.model, lambda n, t: self.special_tokens_map.__setitem__(n.decode(), t.decode()), ) def __del__(self, llmodel=llmodel): if hasattr(self, 'model'): self.close() def close(self) -> None: if self.model is not None: llmodel.llmodel_model_destroy(self.model) self.model = None def _raise_closed(self) -> NoReturn: raise ValueError("Attempted operation on a closed LLModel") @property def backend(self) -> Literal["cpu", "kompute", "cuda", "metal"]: if self.model is None: self._raise_closed() return llmodel.llmodel_model_backend_name(self.model).decode() @property def device(self) -> str | None: if self.model is None: self._raise_closed() dev = llmodel.llmodel_model_gpu_device_name(self.model) return None if dev is None else dev.decode() def count_prompt_tokens(self, prompt: str) -> int: if self.model is None: self._raise_closed() err = ctypes.c_char_p() n_tok = llmodel.llmodel_count_prompt_tokens(self.model, prompt, ctypes.byref(err)) if n_tok < 0: s = err.value errmsg = 'null' if s is None else s.decode() raise RuntimeError(f'Unable to count prompt tokens: {errmsg}') return n_tok llmodel.llmodel_count_prompt_tokens.argtypes = [ctypes.c_void_p, ctypes.c_char_p] @staticmethod def list_gpus(mem_required: int = 0) -> list[str]: """ List the names of the available GPU devices with at least `mem_required` bytes of VRAM. Args: mem_required: The minimum amount of VRAM, in bytes Returns: A list of strings representing the names of the available GPU devices. 
""" num_devices = ctypes.c_int32(0) devices_ptr = llmodel.llmodel_available_gpu_devices(mem_required, ctypes.byref(num_devices)) if not devices_ptr: raise ValueError("Unable to retrieve available GPU devices") return [f'{d.backend.decode()}:{d.name.decode()}' for d in devices_ptr[:num_devices.value]] def init_gpu(self, device: str): if self.model is None: self._raise_closed() mem_required = llmodel.llmodel_required_mem(self.model, self.model_path, self.n_ctx, self.ngl) if llmodel.llmodel_gpu_init_gpu_device_by_string(self.model, mem_required, device.encode()): return all_gpus = self.list_gpus() available_gpus = self.list_gpus(mem_required) unavailable_gpus = [g for g in all_gpus if g not in available_gpus] error_msg = (f"Unable to initialize model on GPU: {device!r}" + f"\nAvailable GPUs: {available_gpus}") if unavailable_gpus: error_msg += f"\nUnavailable GPUs due to insufficient memory: {unavailable_gpus}" raise ValueError(error_msg) def load_model(self) -> bool: """ Load model from a file. Returns ------- True if model loaded successfully, False otherwise """ if self.model is None: self._raise_closed() return llmodel.llmodel_loadModel(self.model, self.model_path, self.n_ctx, self.ngl) def set_thread_count(self, n_threads): if self.model is None: self._raise_closed() if not llmodel.llmodel_isModelLoaded(self.model): raise Exception("Model not loaded") llmodel.llmodel_setThreadCount(self.model, n_threads) def thread_count(self): if self.model is None: self._raise_closed() if not llmodel.llmodel_isModelLoaded(self.model): raise Exception("Model not loaded") return llmodel.llmodel_threadCount(self.model) @overload def generate_embeddings( self, text: str, prefix: str | None, dimensionality: int, do_mean: bool, atlas: bool, cancel_cb: EmbCancelCallbackType | None, ) -> EmbedResult[list[float]]: ... 
@overload def generate_embeddings( self, text: list[str], prefix: str | None, dimensionality: int, do_mean: bool, atlas: bool, cancel_cb: EmbCancelCallbackType | None, ) -> EmbedResult[list[list[float]]]: ... @overload def generate_embeddings( self, text: str | list[str], prefix: str | None, dimensionality: int, do_mean: bool, atlas: bool, cancel_cb: EmbCancelCallbackType | None, ) -> EmbedResult[list[Any]]: ... def generate_embeddings( self, text: str | list[str], prefix: str | None, dimensionality: int, do_mean: bool, atlas: bool, cancel_cb: EmbCancelCallbackType | None, ) -> EmbedResult[list[Any]]: if not text: raise ValueError("text must not be None or empty") if self.model is None: self._raise_closed() if single_text := isinstance(text, str): text = [text] # prepare input embedding_size = ctypes.c_size_t() token_count = ctypes.c_size_t() error = ctypes.c_char_p() c_prefix = ctypes.c_char_p() if prefix is None else prefix.encode() c_texts = (ctypes.c_char_p * (len(text) + 1))() for i, t in enumerate(text): c_texts[i] = t.encode() def wrap_cancel_cb(batch_sizes: Any, n_batch: int, backend: bytes) -> bool: assert cancel_cb is not None return cancel_cb(batch_sizes[:n_batch], backend.decode()) cancel_cb_wrapper = EmbCancelCallback() if cancel_cb is None else EmbCancelCallback(wrap_cancel_cb) # generate the embeddings embedding_ptr = llmodel.llmodel_embed( self.model, c_texts, ctypes.byref(embedding_size), c_prefix, dimensionality, ctypes.byref(token_count), do_mean, atlas, cancel_cb_wrapper, ctypes.byref(error), ) if not embedding_ptr: msg = "(unknown error)" if error.value is None else error.value.decode() if msg == "operation was canceled": raise CancellationError(msg) raise RuntimeError(f'Failed to generate embeddings: {msg}') # extract output n_embd = embedding_size.value // len(text) embedding_array = [ embedding_ptr[i:i + n_embd] for i in range(0, embedding_size.value, n_embd) ] llmodel.llmodel_free_embedding(embedding_ptr) embeddings = embedding_array[0] if 
single_text else embedding_array return {'embeddings': embeddings, 'n_prompt_tokens': token_count.value} def prompt_model( self, prompt : str, callback : ResponseCallbackType, n_predict : int = 4096, top_k : int = 40, top_p : float = 0.9, min_p : float = 0.0, temp : float = 0.1, n_batch : int = 8, repeat_penalty : float = 1.2, repeat_last_n : int = 10, context_erase : float = 0.75, reset_context : bool = False, ): """ Generate response from model from a prompt. Parameters ---------- prompt: str Question, task, or conversation for model to respond to callback(token_id:int, response:str): bool The model sends response tokens to callback Returns ------- None """ if self.model is None: self._raise_closed() self.buffer.clear() self.buff_expecting_cont_bytes = 0 context = LLModelPromptContext( n_predict = n_predict, top_k = top_k, top_p = top_p, min_p = min_p, temp = temp, n_batch = n_batch, repeat_penalty = repeat_penalty, repeat_last_n = repeat_last_n, context_erase = context_erase, ) error_msg: bytes | None = None def error_callback(msg: bytes) -> None: nonlocal error_msg error_msg = msg err = ctypes.c_char_p() if not llmodel.llmodel_prompt( self.model, ctypes.c_char_p(prompt.encode()), PromptCallback(self._prompt_callback), ResponseCallback(self._callback_decoder(callback)), context, ctypes.byref(err), ): s = err.value raise RuntimeError(f"prompt error: {'null' if s is None else s.decode()}") def prompt_model_streaming( self, prompt: str, callback: ResponseCallbackType = empty_response_callback, **kwargs: Any, ) -> Iterator[str]: if self.model is None: self._raise_closed() output_queue: Queue[str | Sentinel] = Queue() # Put response tokens into an output queue def _generator_callback_wrapper(callback: ResponseCallbackType) -> ResponseCallbackType: def _generator_callback(token_id: int, response: str): nonlocal callback if callback(token_id, response): output_queue.put(response) return True return False return _generator_callback def run_llmodel_prompt(prompt: str, 
callback: ResponseCallbackType, **kwargs): self.prompt_model(prompt, callback, **kwargs) output_queue.put(Sentinel.TERMINATING_SYMBOL) # Kick off llmodel_prompt in separate thread so we can return generator # immediately thread = threading.Thread( target=run_llmodel_prompt, args=(prompt, _generator_callback_wrapper(callback)), kwargs=kwargs, ) thread.start() # Generator while True: response = output_queue.get() if isinstance(response, Sentinel): break yield response def _callback_decoder(self, callback: ResponseCallbackType) -> RawResponseCallbackType: def _raw_callback(token_id: int, response: bytes) -> bool: nonlocal self, callback decoded = [] for byte in response: bits = "{:08b}".format(byte) (high_ones, _, _) = bits.partition('0') if len(high_ones) == 1: # continuation byte self.buffer.append(byte) self.buff_expecting_cont_bytes -= 1 else: # beginning of a byte sequence if len(self.buffer) > 0: decoded.append(self.buffer.decode(errors='replace')) self.buffer.clear() self.buffer.append(byte) self.buff_expecting_cont_bytes = max(0, len(high_ones) - 1) if self.buff_expecting_cont_bytes <= 0: # received the whole sequence or an out of place continuation byte decoded.append(self.buffer.decode(errors='replace')) self.buffer.clear() self.buff_expecting_cont_bytes = 0 if len(decoded) == 0 and self.buff_expecting_cont_bytes > 0: # wait for more continuation bytes return True return callback(token_id, ''.join(decoded)) return _raw_callback # Empty prompt callback @staticmethod def _prompt_callback(token_ids: ctypes._Pointer[ctypes.c_int32], n_token_ids: int, cached: bool) -> bool: return True ================================================ FILE: gpt4all-bindings/python/gpt4all/gpt4all.py ================================================ """ Python only API for running all GPT4All models. 
""" from __future__ import annotations import hashlib import json import os import platform import re import sys import warnings from contextlib import contextmanager from datetime import datetime from pathlib import Path from types import TracebackType from typing import TYPE_CHECKING, Any, Iterable, Iterator, Literal, NamedTuple, NoReturn, Protocol, TypedDict, overload import jinja2 import requests from jinja2.sandbox import ImmutableSandboxedEnvironment from requests.exceptions import ChunkedEncodingError from tqdm import tqdm from urllib3.exceptions import IncompleteRead, ProtocolError from ._pyllmodel import (CancellationError as CancellationError, EmbCancelCallbackType, EmbedResult as EmbedResult, LLModel, ResponseCallbackType, _operator_call, empty_response_callback) if TYPE_CHECKING: from typing_extensions import Self, TypeAlias if sys.platform == "darwin": import fcntl # TODO: move to config DEFAULT_MODEL_DIRECTORY = Path.home() / ".cache" / "gpt4all" ConfigType: TypeAlias = "dict[str, Any]" # Environment setup adapted from HF transformers @_operator_call def _jinja_env() -> ImmutableSandboxedEnvironment: def raise_exception(message: str) -> NoReturn: raise jinja2.exceptions.TemplateError(message) def tojson(obj: Any, indent: int | None = None) -> str: return json.dumps(obj, ensure_ascii=False, indent=indent) def strftime_now(fmt: str) -> str: return datetime.now().strftime(fmt) env = ImmutableSandboxedEnvironment(trim_blocks=True, lstrip_blocks=True) env.filters["tojson" ] = tojson env.globals["raise_exception"] = raise_exception env.globals["strftime_now" ] = strftime_now return env class MessageType(TypedDict): role: str content: str class ChatSession(NamedTuple): template: jinja2.Template history: list[MessageType] class Embed4All: """ Python class that handles embeddings for GPT4All. 
""" MIN_DIMENSIONALITY = 64 def __init__(self, model_name: str | None = None, *, n_threads: int | None = None, device: str | None = None, **kwargs: Any): """ Constructor Args: n_threads: number of CPU threads used by GPT4All. Default is None, then the number of threads are determined automatically. device: The processing unit on which the embedding model will run. See the `GPT4All` constructor for more info. kwargs: Remaining keyword arguments are passed to the `GPT4All` constructor. """ if model_name is None: model_name = "all-MiniLM-L6-v2.gguf2.f16.gguf" self.gpt4all = GPT4All(model_name, n_threads=n_threads, device=device, **kwargs) def __enter__(self) -> Self: return self def __exit__( self, typ: type[BaseException] | None, value: BaseException | None, tb: TracebackType | None, ) -> None: self.close() def close(self) -> None: """Delete the model instance and free associated system resources.""" self.gpt4all.close() # return_dict=False @overload def embed( self, text: str, *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[False] = ..., atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> list[float]: ... @overload def embed( self, text: list[str], *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[False] = ..., atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> list[list[float]]: ... @overload def embed( self, text: str | list[str], *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[False] = ..., atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> list[Any]: ... 
# return_dict=True @overload def embed( self, text: str, *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[True], atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> EmbedResult[list[float]]: ... @overload def embed( self, text: list[str], *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[True], atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> EmbedResult[list[list[float]]]: ... @overload def embed( self, text: str | list[str], *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: Literal[True], atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> EmbedResult[list[Any]]: ... # return type unknown @overload def embed( self, text: str | list[str], *, prefix: str | None = ..., dimensionality: int | None = ..., long_text_mode: str = ..., return_dict: bool = ..., atlas: bool = ..., cancel_cb: EmbCancelCallbackType | None = ..., ) -> Any: ... def embed( self, text: str | list[str], *, prefix: str | None = None, dimensionality: int | None = None, long_text_mode: str = "mean", return_dict: bool = False, atlas: bool = False, cancel_cb: EmbCancelCallbackType | None = None, ) -> Any: """ Generate one or more embeddings. Args: text: A text or list of texts to generate embeddings for. prefix: The model-specific prefix representing the embedding task, without the trailing colon. For Nomic Embed, this can be `search_query`, `search_document`, `classification`, or `clustering`. Defaults to `search_document` or equivalent if known; otherwise, you must explicitly pass a prefix or an empty string if none applies. dimensionality: The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size. long_text_mode: How to handle texts longer than the model can accept. One of `mean` or `truncate`. 
return_dict: Return the result as a dict that includes the number of prompt tokens processed. atlas: Try to be fully compatible with the Atlas API. Currently, this means texts longer than 8192 tokens with long_text_mode="mean" will raise an error. Disabled by default. cancel_cb: Called with arguments (batch_sizes, backend_name). Return true to cancel embedding. Returns: With return_dict=False, an embedding or list of embeddings of your text(s). With return_dict=True, a dict with keys 'embeddings' and 'n_prompt_tokens'. Raises: CancellationError: If cancel_cb returned True and embedding was canceled. """ if dimensionality is None: dimensionality = -1 else: if dimensionality <= 0: raise ValueError(f"Dimensionality must be None or a positive integer, got {dimensionality}") if dimensionality < self.MIN_DIMENSIONALITY: warnings.warn( f"Dimensionality {dimensionality} is less than the suggested minimum of {self.MIN_DIMENSIONALITY}." " Performance may be degraded." ) try: do_mean = {"mean": True, "truncate": False}[long_text_mode] except KeyError: raise ValueError(f"Long text mode must be one of 'mean' or 'truncate', got {long_text_mode!r}") result = self.gpt4all.model.generate_embeddings(text, prefix, dimensionality, do_mean, atlas, cancel_cb) return result if return_dict else result["embeddings"] class GPT4All: """ Python class that handles instantiation, downloading, generation and chat with GPT4All models. """ def __init__( self, model_name: str, *, model_path: str | os.PathLike[str] | None = None, model_type: str | None = None, allow_download: bool = True, n_threads: int | None = None, device: str | None = None, n_ctx: int = 2048, ngl: int = 100, verbose: bool = False, ): """ Constructor Args: model_name: Name of GPT4All or custom model. Including ".gguf" file extension is optional but encouraged. model_path: Path to directory containing model file or, if file does not exist, where to download model. 
Default is None, in which case models will be stored in `~/.cache/gpt4all/`. model_type: Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user. Default is None. allow_download: Allow API to download models from gpt4all.io. Default is True. n_threads: number of CPU threads used by GPT4All. Default is None, then the number of threads are determined automatically. device: The processing unit on which the GPT4All model will run. It can be set to: - "cpu": Model will run on the central processing unit. - "gpu": Use Metal on ARM64 macOS, otherwise the same as "kompute". - "kompute": Use the best GPU provided by the Kompute backend. - "cuda": Use the best GPU provided by the CUDA backend. - "amd", "nvidia": Use the best GPU provided by the Kompute backend from this vendor. - A specific device name from the list returned by `GPT4All.list_gpus()`. Default is Metal on ARM64 macOS, "cpu" otherwise. Note: If a selected GPU device does not have sufficient RAM to accommodate the model, an error will be thrown, and the GPT4All instance will be rendered invalid. It's advised to ensure the device has enough memory before initiating the model. n_ctx: Maximum size of context window ngl: Number of GPU layers to use (Vulkan) verbose: If True, print debug messages. 
""" self.model_type = model_type self._chat_session: ChatSession | None = None device_init = None if sys.platform == "darwin": if device is None: backend = "auto" # "auto" is effectively "metal" due to currently non-functional fallback elif device == "cpu": backend = "cpu" else: if platform.machine() != "arm64" or device != "gpu": raise ValueError(f"Unknown device for this platform: {device}") backend = "metal" else: backend = "kompute" if device is None or device == "cpu": pass # use kompute with no device elif device in ("cuda", "kompute"): backend = device device_init = "gpu" elif device.startswith("cuda:"): backend = "cuda" device_init = _remove_prefix(device, "cuda:") else: device_init = _remove_prefix(device, "kompute:") # Retrieve model and download if allowed self.config: ConfigType = self.retrieve_model(model_name, model_path=model_path, allow_download=allow_download, verbose=verbose) self.model = LLModel(self.config["path"], n_ctx, ngl, backend) if device_init is not None: self.model.init_gpu(device_init) self.model.load_model() # Set n_threads if n_threads is not None: self.model.set_thread_count(n_threads) def __enter__(self) -> Self: return self def __exit__( self, typ: type[BaseException] | None, value: BaseException | None, tb: TracebackType | None, ) -> None: self.close() def close(self) -> None: """Delete the model instance and free associated system resources.""" self.model.close() @property def backend(self) -> Literal["cpu", "kompute", "cuda", "metal"]: """The name of the llama.cpp backend currently in use. 
One of "cpu", "kompute", "cuda", or "metal".""" return self.model.backend @property def device(self) -> str | None: """The name of the GPU device currently in use, or None for backends other than Kompute or CUDA.""" return self.model.device @property def current_chat_session(self) -> list[MessageType] | None: return None if self._chat_session is None else self._chat_session.history @current_chat_session.setter def current_chat_session(self, history: list[MessageType]) -> None: if self._chat_session is None: raise ValueError("current_chat_session may only be set when there is an active chat session") self._chat_session.history[:] = history @staticmethod def list_models() -> list[ConfigType]: """ Fetch model list from https://gpt4all.io/models/models3.json. Returns: Model list in JSON format. """ resp = requests.get("https://gpt4all.io/models/models3.json") if resp.status_code != 200: raise ValueError(f"Request failed: HTTP {resp.status_code} {resp.reason}") return resp.json() @classmethod def retrieve_model( cls, model_name: str, model_path: str | os.PathLike[str] | None = None, allow_download: bool = True, verbose: bool = False, ) -> ConfigType: """ Find the model file, and if it doesn't exist, download the model. Args: model_name: Name of model. model_path: Path to find model. Default is None, in which case the path is set to ~/.cache/gpt4all/. allow_download: Allow API to download model from gpt4all.io. Default is True. verbose: If True, print debug messages. Default is False. Returns: Model config. 
""" model_filename = append_extension_if_missing(model_name) # get the config for the model config: ConfigType = {} if allow_download: models = cls.list_models() if (model := next((m for m in models if m["filename"] == model_filename), None)) is not None: config.update(model) # Validate download directory if model_path is None: try: os.makedirs(DEFAULT_MODEL_DIRECTORY, exist_ok=True) except OSError as e: raise RuntimeError("Failed to create model download directory") from e model_path = DEFAULT_MODEL_DIRECTORY else: model_path = Path(model_path) if not model_path.exists(): raise FileNotFoundError(f"Model directory does not exist: {model_path!r}") model_dest = model_path / model_filename if model_dest.exists(): config["path"] = str(model_dest) if verbose: print(f"Found model file at {str(model_dest)!r}", file=sys.stderr) elif allow_download: # If model file does not exist, download filesize = config.get("filesize") config["path"] = str(cls.download_model( model_filename, model_path, verbose=verbose, url=config.get("url"), expected_size=None if filesize is None else int(filesize), expected_md5=config.get("md5sum"), )) else: raise FileNotFoundError(f"Model file does not exist: {model_dest!r}") return config @staticmethod def download_model( model_filename: str, model_path: str | os.PathLike[str], verbose: bool = True, url: str | None = None, expected_size: int | None = None, expected_md5: str | None = None, ) -> str | os.PathLike[str]: """ Download model from gpt4all.io. Args: model_filename: Filename of model (with .gguf extension). model_path: Path to download model to. verbose: If True (default), print debug messages. url: the models remote url (e.g. may be hosted on HF) expected_size: The expected size of the download. expected_md5: The expected MD5 hash of the download. Returns: Model file destination. 
""" # Download model if url is None: url = f"https://gpt4all.io/models/gguf/{model_filename}" def make_request(offset=None): headers = {} if offset: print(f"\nDownload interrupted, resuming from byte position {offset}", file=sys.stderr) headers["Range"] = f"bytes={offset}-" # resume incomplete response headers["Accept-Encoding"] = "identity" # Content-Encoding changes meaning of ranges response = requests.get(url, stream=True, headers=headers) if response.status_code not in (200, 206): raise ValueError(f"Request failed: HTTP {response.status_code} {response.reason}") if offset and (response.status_code != 206 or str(offset) not in response.headers.get("Content-Range", "")): raise ValueError("Connection was interrupted and server does not support range requests") if (enc := response.headers.get("Content-Encoding")) is not None: raise ValueError(f"Expected identity Content-Encoding, got {enc}") return response response = make_request() total_size_in_bytes = int(response.headers.get("content-length", 0)) block_size = 2**20 # 1 MB partial_path = Path(model_path) / (model_filename + ".part") with open(partial_path, "w+b") as partf: try: with tqdm(desc="Downloading", total=total_size_in_bytes, unit="iB", unit_scale=True) as progress_bar: while True: last_progress = progress_bar.n try: for data in response.iter_content(block_size): partf.write(data) progress_bar.update(len(data)) except ChunkedEncodingError as cee: if cee.args and isinstance(pe := cee.args[0], ProtocolError): if len(pe.args) >= 2 and isinstance(ir := pe.args[1], IncompleteRead): assert progress_bar.n <= ir.partial # urllib3 may be ahead of us but never behind # the socket was closed during a read - retry response = make_request(progress_bar.n) continue raise if total_size_in_bytes != 0 and progress_bar.n < total_size_in_bytes: if progress_bar.n == last_progress: raise RuntimeError("Download not making progress, aborting.") # server closed connection prematurely - retry response = 
make_request(progress_bar.n) continue break # verify file integrity file_size = partf.tell() if expected_size is not None and file_size != expected_size: raise ValueError(f"Expected file size of {expected_size} bytes, got {file_size}") if expected_md5 is not None: partf.seek(0) hsh = hashlib.md5() with tqdm(desc="Verifying", total=file_size, unit="iB", unit_scale=True) as bar: while chunk := partf.read(block_size): hsh.update(chunk) bar.update(len(chunk)) if hsh.hexdigest() != expected_md5.lower(): raise ValueError(f"Expected MD5 hash of {expected_md5!r}, got {hsh.hexdigest()!r}") except: if verbose: print("Cleaning up the interrupted download...", file=sys.stderr) try: os.remove(partial_path) except OSError: pass raise # flush buffers and sync the inode partf.flush() _fsync(partf) # move to final destination download_path = Path(model_path) / model_filename try: os.rename(partial_path, download_path) except FileExistsError: try: os.remove(partial_path) except OSError: pass raise if verbose: print(f"Model downloaded to {str(download_path)!r}", file=sys.stderr) return download_path @overload def generate( self, prompt: str, *, max_tokens: int = ..., temp: float = ..., top_k: int = ..., top_p: float = ..., min_p: float = ..., repeat_penalty: float = ..., repeat_last_n: int = ..., n_batch: int = ..., n_predict: int | None = ..., streaming: Literal[False] = ..., callback: ResponseCallbackType = ..., ) -> str: ... @overload def generate( self, prompt: str, *, max_tokens: int = ..., temp: float = ..., top_k: int = ..., top_p: float = ..., min_p: float = ..., repeat_penalty: float = ..., repeat_last_n: int = ..., n_batch: int = ..., n_predict: int | None = ..., streaming: Literal[True], callback: ResponseCallbackType = ..., ) -> Iterable[str]: ... 
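The paired overloads above give `generate` two call shapes: a plain string when `streaming=False` and a token iterator when `streaming=True`; in both cases the optional `callback` can halt generation by returning False while the text produced so far is still accumulated. A minimal sketch of that callback contract, using a hypothetical `fake_stream` stand-in for the model's token stream (no real model is loaded here):

```python
# Mirrors _callback_wrapper in generate(): record each chunk, then let the
# user callback decide whether generation continues.
def make_callback_wrapper(user_callback):
    chunks: list[str] = []

    def wrapper(token_id: int, response: str) -> bool:
        chunks.append(response)  # accumulate the full response text
        return user_callback(token_id, response)

    return wrapper, chunks


# Hypothetical stand-in for prompt_model_streaming: feeds tokens to the
# callback and stops as soon as it returns False.
def fake_stream(tokens: list[str], callback) -> None:
    for token_id, text in enumerate(tokens):
        if not callback(token_id, text):
            break


# Stop after the third token (token_id 2 makes the callback return False).
wrapper, chunks = make_callback_wrapper(lambda token_id, _: token_id < 2)
fake_stream(["The", " quick", " brown", " fox"], wrapper)
print(chunks)  # the stopping token is still recorded, as in generate()
```

With the real API, the same callable passed as `GPT4All.generate(..., callback=...)` stops the model mid-completion, and the partial response is still appended to any active chat session.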
@overload def generate( self, prompt: str, *, max_tokens: int = ..., temp: float = ..., top_k: int = ..., top_p: float = ..., min_p: float = ..., repeat_penalty: float = ..., repeat_last_n: int = ..., n_batch: int = ..., n_predict: int | None = ..., streaming: bool, callback: ResponseCallbackType = ..., ) -> Any: ... def generate( self, prompt : str, *, max_tokens : int = 200, temp : float = 0.7, top_k : int = 40, top_p : float = 0.4, min_p : float = 0.0, repeat_penalty : float = 1.18, repeat_last_n : int = 64, n_batch : int = 8, n_predict : int | None = None, streaming : bool = False, callback : ResponseCallbackType = empty_response_callback, ) -> Any: """ Generate outputs from any GPT4All model. Args: prompt: The prompt for the model to complete. max_tokens: The maximum number of tokens to generate. temp: The model temperature. Larger values increase creativity but decrease factuality. top_k: Randomly sample from the top_k most likely tokens at each generation step. Set this to 1 for greedy decoding. top_p: Randomly sample at each generation step from the top most likely tokens whose probabilities add up to top_p. min_p: Randomly sample at each generation step from the top most likely tokens whose probabilities are at least min_p. repeat_penalty: Penalize the model for repetition. Higher values result in less repetition. repeat_last_n: How far back in the model's generation history to apply the repeat penalty. n_batch: Number of prompt tokens processed in parallel. Larger values decrease latency but increase resource requirements. n_predict: Equivalent to max_tokens, exists for backwards compatibility. streaming: If True, this method will instead return a generator that yields tokens as the model generates them. callback: A function with arguments token_id:int and response:str, which receives the tokens from the model as they are generated and stops the generation by returning False. 
Returns: Either the entire completion or a generator that yields the completion token by token. """ # Preparing the model request generate_kwargs: dict[str, Any] = dict( temp = temp, top_k = top_k, top_p = top_p, min_p = min_p, repeat_penalty = repeat_penalty, repeat_last_n = repeat_last_n, n_batch = n_batch, n_predict = n_predict if n_predict is not None else max_tokens, ) # Prepare the callback, process the model response full_response = "" def _callback_wrapper(token_id: int, response: str) -> bool: nonlocal full_response full_response += response return callback(token_id, response) last_msg_rendered = prompt if self._chat_session is not None: session = self._chat_session def render(messages: list[MessageType]) -> str: return session.template.render( messages=messages, add_generation_prompt=True, **self.model.special_tokens_map, ) session.history.append(MessageType(role="user", content=prompt)) prompt = render(session.history) if len(session.history) > 1: last_msg_rendered = render(session.history[-1:]) # Check request length last_msg_len = self.model.count_prompt_tokens(last_msg_rendered) if last_msg_len > (limit := self.model.n_ctx - 4): raise ValueError(f"Your message was too long and could not be processed ({last_msg_len} > {limit}).") # Send the request to the model if streaming: def stream() -> Iterator[str]: yield from self.model.prompt_model_streaming(prompt, _callback_wrapper, **generate_kwargs) if self._chat_session is not None: self._chat_session.history.append(MessageType(role="assistant", content=full_response)) return stream() self.model.prompt_model(prompt, _callback_wrapper, **generate_kwargs) if self._chat_session is not None: self._chat_session.history.append(MessageType(role="assistant", content=full_response)) return full_response @contextmanager def chat_session( self, system_message: str | Literal[False] | None = None, chat_template: str | None = None, ): """ Context manager to hold an inference optimized chat session with a GPT4All model. 
Args: system_message: An initial instruction for the model, None to use the model default, or False to disable. Defaults to None. chat_template: Jinja template for the conversation, or None to use the model default. Defaults to None. """ if system_message is None: system_message = self.config.get("systemMessage", False) if chat_template is None: if "name" not in self.config: raise ValueError("For sideloaded models or with allow_download=False, you must specify a chat template.") if "chatTemplate" not in self.config: raise NotImplementedError("This model appears to have a built-in chat template, but loading it is not " "currently implemented. Please pass a template to chat_session() directly.") if (tmpl := self.config["chatTemplate"]) is None: raise ValueError(f"The model {self.config['name']!r} does not support chat.") chat_template = tmpl history = [] if system_message is not False: history.append(MessageType(role="system", content=system_message)) self._chat_session = ChatSession( template=_jinja_env.from_string(chat_template), history=history, ) try: yield self finally: self._chat_session = None @staticmethod def list_gpus() -> list[str]: """ List the names of the available GPU devices. Returns: A list of strings representing the names of the available GPU devices. """ return LLModel.list_gpus() def append_extension_if_missing(model_name): if not model_name.endswith((".bin", ".gguf")): model_name += ".gguf" return model_name class _HasFileno(Protocol): def fileno(self) -> int: ... 
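`append_extension_if_missing` above is what lets callers pass a bare model name to the `GPT4All` constructor or `retrieve_model`; a self-contained sketch of its behavior (the function is re-declared here verbatim so the snippet runs on its own):

```python
def append_extension_if_missing(model_name):
    # Bare names get the ".gguf" suffix; ".bin" and ".gguf" names pass through.
    if not model_name.endswith((".bin", ".gguf")):
        model_name += ".gguf"
    return model_name


# A bare name is normalized to a .gguf filename...
print(append_extension_if_missing("orca-mini-3b-gguf2-q4_0"))
# ...while names that already carry a known extension are left untouched.
print(append_extension_if_missing("ggml-all-MiniLM-L6-v2-f16.bin"))
```

This is why `GPT4All(model_name="orca-mini-3b-gguf2-q4_0")` and `GPT4All(model_name="orca-mini-3b-gguf2-q4_0.gguf")` resolve to the same model file on disk.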
def _fsync(fd: int | _HasFileno) -> None: if sys.platform == "darwin": # Apple's fsync does not flush the drive write cache try: fcntl.fcntl(fd, fcntl.F_FULLFSYNC) except OSError: pass # fall back to fsync else: return os.fsync(fd) def _remove_prefix(s: str, prefix: str) -> str: return s[len(prefix):] if s.startswith(prefix) else s ================================================ FILE: gpt4all-bindings/python/gpt4all/tests/__init__.py ================================================ ================================================ FILE: gpt4all-bindings/python/gpt4all/tests/test_embed_timings.py ================================================ #!/usr/bin/env python3 import sys import time from io import StringIO from gpt4all import Embed4All, GPT4All def time_embedding(i, embedder): text = 'foo bar ' * i start_time = time.time() output = embedder.embed(text) end_time = time.time() elapsed_time = end_time - start_time print(f"Time report: {2 * i / elapsed_time} tokens/second with {2 * i} tokens taking {elapsed_time} seconds") if __name__ == "__main__": embedder = Embed4All(n_threads=8) for i in [2**n for n in range(6, 14)]: time_embedding(i, embedder) ================================================ FILE: gpt4all-bindings/python/gpt4all/tests/test_gpt4all.py ================================================ import sys from io import StringIO from pathlib import Path from gpt4all import GPT4All, Embed4All import time import pytest def test_inference(): model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf') output_1 = model.generate('hello', top_k=1) with model.chat_session(): response = model.generate(prompt='hello', top_k=1) response = model.generate(prompt='write me a short poem', top_k=1) response = model.generate(prompt='thank you', top_k=1) print(model.current_chat_session) output_2 = model.generate('hello', top_k=1) assert output_1 == output_2 tokens = [] for token in model.generate('hello', streaming=True): tokens.append(token) assert len(tokens) > 0 with 
model.chat_session(): model.generate(prompt='hello', top_k=1, streaming=True) model.generate(prompt='write me a poem about dogs', top_k=1, streaming=True) print(model.current_chat_session) def do_long_input(model): long_input = " ".join(["hello how are you"] * 40) with model.chat_session(): # llmodel should limit us to 128 even if we ask for more model.generate(long_input, n_batch=512) print(model.current_chat_session) def test_inference_long_orca_3b(): model = GPT4All(model_name="orca-mini-3b-gguf2-q4_0.gguf") do_long_input(model) def test_inference_long_falcon(): model = GPT4All(model_name='gpt4all-falcon-q4_0.gguf') do_long_input(model) def test_inference_long_llama_7b(): model = GPT4All(model_name="mistral-7b-openorca.Q4_0.gguf") do_long_input(model) def test_inference_long_llama_13b(): model = GPT4All(model_name='nous-hermes-llama2-13b.Q4_0.gguf') do_long_input(model) def test_inference_long_mpt(): model = GPT4All(model_name='mpt-7b-chat-q4_0.gguf') do_long_input(model) def test_inference_long_replit(): model = GPT4All(model_name='replit-code-v1_5-3b-q4_0.gguf') do_long_input(model) def test_inference_hparams(): model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf') output = model.generate("The capital of france is ", max_tokens=3) assert 'Paris' in output def test_inference_falcon(): model = GPT4All(model_name='gpt4all-falcon-q4_0.gguf') prompt = 'hello' output = model.generate(prompt) assert isinstance(output, str) assert len(output) > 0 def test_inference_mpt(): model = GPT4All(model_name='mpt-7b-chat-q4_0.gguf') prompt = 'hello' output = model.generate(prompt) assert isinstance(output, str) assert len(output) > 0 def test_embedding(): text = 'The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The 
quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox 
jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog The quick brown fox' embedder = Embed4All() output = embedder.embed(text) #for i, value in enumerate(output): #print(f'Value at index {i}: {value}') assert len(output) == 384 def test_empty_embedding(): text = '' embedder = Embed4All() with pytest.raises(ValueError): output = embedder.embed(text) def test_download_model(tmp_path: Path): from gpt4all import gpt4all old_default_dir = gpt4all.DEFAULT_MODEL_DIRECTORY gpt4all.DEFAULT_MODEL_DIRECTORY = tmp_path # temporary pytest directory to ensure a download happens try: model = GPT4All(model_name='ggml-all-MiniLM-L6-v2-f16.bin') model_path = tmp_path / model.config['filename'] assert model_path.absolute() == Path(model.config['path']).absolute() assert model_path.stat().st_size == int(model.config['filesize']) finally: gpt4all.DEFAULT_MODEL_DIRECTORY = old_default_dir ================================================ FILE: gpt4all-bindings/python/makefile ================================================ SHELL:=/bin/bash -o pipefail ROOT_DIR:=$(shell dirname $(realpath $(lastword $(MAKEFILE_LIST)))) PYTHON:=python3 env: if [ ! -d $(ROOT_DIR)/env ]; then $(PYTHON) -m venv $(ROOT_DIR)/env; fi dev: env source env/bin/activate; pip install black isort pytest; pip install -e . documentation: rm -rf ./site && mkdocs build wheel: rm -rf dist/ build/ gpt4all/llmodel_DO_NOT_MODIFY; python setup.py bdist_wheel; clean: rm -rf {.pytest_cache,env,gpt4all.egg-info} find . 
| grep -E "(__pycache__|\.pyc|\.pyo$\)" | xargs rm -rf black: source env/bin/activate; black -l 120 -S --target-version py36 gpt4all isort: source env/bin/activate; isort --ignore-whitespace --atomic -w 120 gpt4all test: source env/bin/activate; pytest -s gpt4all/tests -k "not test_inference_long" test_all: source env/bin/activate; pytest -s gpt4all/tests ================================================ FILE: gpt4all-bindings/python/mkdocs.yml ================================================ site_name: GPT4All repo_url: https://github.com/nomic-ai/gpt4all repo_name: nomic-ai/gpt4all site_url: https://docs.gpt4all.io edit_uri: edit/main/docs/ site_description: GPT4All Docs - run LLMs efficiently on your hardware copyright: Copyright © 2024 Nomic, Inc use_directory_urls: false nav: - 'index.md' - 'Quickstart' : 'gpt4all_desktop/quickstart.md' - 'Chats' : 'gpt4all_desktop/chats.md' - 'Models' : 'gpt4all_desktop/models.md' - 'LocalDocs' : 'gpt4all_desktop/localdocs.md' - 'Settings' : 'gpt4all_desktop/settings.md' - 'Chat Templates' : 'gpt4all_desktop/chat_templates.md' - 'Cookbook': - 'Local AI Chat with Microsoft Excel': 'gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-microsoft-excel.md' - 'Local AI Chat with your Google Drive': 'gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-google-drive.md' - 'Local AI Chat with your Obsidian Vault': 'gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-Obsidian.md' - 'Local AI Chat with your OneDrive': 'gpt4all_desktop/cookbook/use-local-ai-models-to-privately-chat-with-One-Drive.md' - 'API Server': - 'gpt4all_api_server/home.md' - 'Python SDK': - 'gpt4all_python/home.md' - 'Monitoring': 'gpt4all_python/monitoring.md' - 'SDK Reference': 'gpt4all_python/ref.md' - 'Help': - 'FAQ': 'gpt4all_help/faq.md' - 'Troubleshooting': 'gpt4all_help/troubleshooting.md' theme: name: material palette: primary: white logo: assets/nomic.png favicon: assets/favicon.ico features: - 
content.code.copy - navigation.instant - navigation.tracking - navigation.sections # - navigation.tabs # - navigation.tabs.sticky markdown_extensions: - pymdownx.highlight: anchor_linenums: true - pymdownx.inlinehilite - pymdownx.snippets - pymdownx.details - pymdownx.superfences - pymdownx.tabbed: alternate_style: true - pymdownx.emoji: emoji_index: !!python/name:material.extensions.emoji.twemoji emoji_generator: !!python/name:material.extensions.emoji.to_svg options: custom_icons: - docs/overrides/.icons - tables - admonition - codehilite: css_class: highlight - markdown_captions extra_css: - css/custom.css plugins: - search - mkdocstrings: handlers: python: options: show_root_heading: True heading_level: 4 show_root_full_path: false docstring_section_style: list - material/social: cards_layout_options: font_family: Roboto description: GPT4All runs LLMs efficiently on your hardware extra: generator: false analytics: provider: google property: G-NPXC8BYHJV ================================================ FILE: gpt4all-bindings/python/setup.py ================================================ from setuptools import setup, find_packages import os import pathlib import platform import shutil package_name = "gpt4all" # Define the location of your prebuilt C library files SRC_CLIB_DIRECTORY = os.path.join("..", "..", "gpt4all-backend") SRC_CLIB_BUILD_DIRECTORY = os.path.join("..", "..", "gpt4all-backend", "build") LIB_NAME = "llmodel" DEST_CLIB_DIRECTORY = os.path.join(package_name, f"{LIB_NAME}_DO_NOT_MODIFY") DEST_CLIB_BUILD_DIRECTORY = os.path.join(DEST_CLIB_DIRECTORY, "build") system = platform.system() def get_c_shared_lib_extension(): if system == "Darwin": return "dylib" elif system == "Linux": return "so" elif system == "Windows": return "dll" else: raise Exception("Operating System not supported") lib_ext = get_c_shared_lib_extension() def copy_prebuilt_C_lib(src_dir, dest_dir, dest_build_dir): files_copied = 0 if not os.path.exists(dest_dir): 
os.mkdir(dest_dir) os.mkdir(dest_build_dir) for dirpath, _, filenames in os.walk(src_dir): for item in filenames: # copy over header files to dest dir s = os.path.join(dirpath, item) if item.endswith(".h"): d = os.path.join(dest_dir, item) shutil.copy2(s, d) files_copied += 1 if item.endswith(lib_ext) or item.endswith('.metallib'): s = os.path.join(dirpath, item) d = os.path.join(dest_build_dir, item) shutil.copy2(s, d) files_copied += 1 return files_copied # NOTE: You must provide correct path to the prebuilt llmodel C library. # Specifically, the llmodel.h and C shared library are needed. copy_prebuilt_C_lib(SRC_CLIB_DIRECTORY, DEST_CLIB_DIRECTORY, DEST_CLIB_BUILD_DIRECTORY) def get_long_description(): with open(pathlib.Path(__file__).parent / "README.md", encoding="utf-8") as fp: return fp.read() setup( name=package_name, version="2.8.3.dev0", description="Python bindings for GPT4All", long_description=get_long_description(), long_description_content_type="text/markdown", author="Nomic and the Open Source Community", author_email="support@nomic.ai", url="https://www.nomic.ai/gpt4all", project_urls={ "Documentation": "https://docs.gpt4all.io/gpt4all_python.html", "Source code": "https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/python", "Changelog": "https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/CHANGELOG.md", }, classifiers = [ "Programming Language :: Python :: 3", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", ], python_requires='>=3.8', packages=find_packages(), install_requires=[ 'importlib_resources; python_version < "3.9"', 'jinja2~=3.1', 'requests', 'tqdm', 'typing-extensions>=4.3.0; python_version >= "3.9" and python_version < "3.11"', ], extras_require={ 'cuda': [ 'nvidia-cuda-runtime-cu11', 'nvidia-cublas-cu11', ], 'all': [ 'gpt4all[cuda]; platform_system == "Windows" or platform_system == "Linux"', ], 'dev': [ 'gpt4all[all]', 'pytest', 'twine', 'wheel', 'setuptools', 
'mkdocs-material', 'mkdocs-material[imaging]', 'mkautodoc', 'mkdocstrings[python]', 'mkdocs-jupyter', 'black', 'isort', 'typing-extensions>=3.10', ] }, package_data={'llmodel': [os.path.join(DEST_CLIB_DIRECTORY, "*")]}, include_package_data=True ) ================================================ FILE: gpt4all-bindings/typescript/.clang-format ================================================ --- Language: Cpp BasedOnStyle: Microsoft ColumnLimit: 120 ================================================ FILE: gpt4all-bindings/typescript/.gitignore ================================================ node_modules/ build/ prebuilds/ .yarn/* !.yarn/patches !.yarn/plugins !.yarn/releases !.yarn/sdks !.yarn/versions runtimes/ compile_flags.txt ================================================ FILE: gpt4all-bindings/typescript/.npmignore ================================================ test/ spec/ scripts/ build ================================================ FILE: gpt4all-bindings/typescript/.yarnrc.yml ================================================ nodeLinker: node-modules ================================================ FILE: gpt4all-bindings/typescript/README.md ================================================ # GPT4All Node.js API Native Node.js LLM bindings for all. ```sh yarn add gpt4all@latest npm install gpt4all@latest pnpm install gpt4all@latest ``` ## Breaking changes in version 4!! * See [Transition](#changes) ## Contents * See [API Reference](#api-reference) * See [Examples](#api-example) * See [Developing](#develop) * GPT4ALL nodejs bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use. * [spare change](https://github.com/sponsors/jacoobes) for a college student? 🤑 ## Api Examples ### Chat Completion Use a chat session to keep context between completions. This is useful for efficient back and forth conversations. 
```js import { createCompletion, loadModel } from "../src/gpt4all.js"; const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", { verbose: true, // logs loaded model configuration device: "gpu", // defaults to 'cpu' nCtx: 2048, // the maximum sessions context window size. }); // initialize a chat session on the model. a model instance can have only one chat session at a time. const chat = await model.createChatSession({ // any completion options set here will be used as default for all completions in this chat session temperature: 0.8, // a custom systemPrompt can be set here. note that the template depends on the model. // if unset, the systemPrompt that comes with the model will be used. systemPrompt: "### System:\nYou are an advanced mathematician.\n\n", }); // create a completion using a string as input const res1 = await createCompletion(chat, "What is 1 + 1?"); console.debug(res1.choices[0].message); // multiple messages can be input to the conversation at once. // note that if the last message is not of role 'user', an empty message will be returned. await createCompletion(chat, [ { role: "user", content: "What is 2 + 2?", }, { role: "assistant", content: "It's 5.", }, ]); const res3 = await createCompletion(chat, "Could you recalculate that?"); console.debug(res3.choices[0].message); model.dispose(); ``` ### Stateless usage You can use the model without a chat session. This is useful for one-off completions. ```js import { createCompletion, loadModel } from "../src/gpt4all.js"; const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf"); // createCompletion methods can also be used on the model directly. // context is not maintained between completions. const res1 = await createCompletion(model, "What is 1 + 1?"); console.debug(res1.choices[0].message); // a whole conversation can be input as well. // note that if the last message is not of role 'user', an error will be thrown. 
const res2 = await createCompletion(model, [ { role: "user", content: "What is 2 + 2?", }, { role: "assistant", content: "It's 5.", }, { role: "user", content: "Could you recalculate that?", }, ]); console.debug(res2.choices[0].message); ``` ### Embedding ```js import { loadModel, createEmbedding } from '../src/gpt4all.js' const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", { verbose: true, type: 'embedding'}) console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way")); ``` ### Streaming responses ```js import { loadModel, createCompletionStream } from "../src/gpt4all.js"; const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", { device: "gpu", }); process.stdout.write("Output: "); const stream = createCompletionStream(model, "How are you?"); stream.tokens.on("data", (data) => { process.stdout.write(data); }); //wait till stream finishes. We cannot continue until this one is done. await stream.result; process.stdout.write("\n"); model.dispose(); ``` ### Async Generators ```js import { loadModel, createCompletionGenerator } from "../src/gpt4all.js"; const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf"); process.stdout.write("Output: "); const gen = createCompletionGenerator( model, "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)" ); for await (const chunk of gen) { process.stdout.write(chunk); } process.stdout.write("\n"); model.dispose(); ``` ### Offline usage do this b4 going offline ```sh curl -L https://gpt4all.io/models/models3.json -o ./models3.json ``` ```js import { createCompletion, loadModel } from 'gpt4all' //make sure u downloaded the models before going offline! 
const model = await loadModel('mistral-7b-openorca.gguf2.Q4_0.gguf', {
    verbose: true,
    device: 'gpu',
    modelConfigFile: "./models3.json",
});

await createCompletion(model, 'What is 1 + 1?', { verbose: true })

model.dispose();
```

## Develop

### Build Instructions

* `binding.gyp` is the compile config
* Tested on Ubuntu. Everything seems to work fine
* Tested on Windows. Everything works fine.
* Sparse testing on macOS.
* The MinGW script works to build the gpt4all-backend. We left it there just in case. **HOWEVER**, this package works only with MSVC-built DLLs.

### Requirements

* git
* [node.js >= 18.0.0](https://nodejs.org/en)
* [yarn](https://yarnpkg.com/)
* [node-gyp](https://github.com/nodejs/node-gyp)
    * all of its requirements
* (unix) gcc version 12
* (win) msvc version 143
    * Can be obtained with Visual Studio 2022 build tools
* python 3
* On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
* macOS users do not need Vulkan, as GPT4All will use Metal instead.

### Build (from source)

```sh
git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all-bindings/typescript
```

* The below shell commands assume the current working directory is `typescript`.

* To build and rebuild:

```sh
node scripts/prebuild.js
```

* The llama.cpp git submodule for gpt4all may be absent. If this is the case, make sure to run the following in the llama.cpp parent directory:

```sh
git submodule update --init --recursive
```

```sh
yarn build:backend
```

This will build platform-dependent dynamic libraries, which will be located in `runtimes/(platform)/native`.

### Test

```sh
yarn test
```

### Source Overview

#### src/

* Extra functions to aid developer experience
* Typings for the native node addon
* The javascript interface

#### test/

* Simple unit tests for some of the exported functions
* More advanced AI testing is not handled

#### spec/

* Showcases the average look and feel of the API
* Should work assuming a model and libraries are installed locally in the working directory

#### index.cc

* The bridge between nodejs and C. Where the bindings are.

#### prompt.cc

* Handles prompting and inference of models in a threadsafe, asynchronous way.

### Known Issues

* Why your model may be spewing bull 💩:
    * The downloaded model is broken (just reinstall it or download it from the official site)
* Your model is hanging after a call to generate tokens:
    * Is `nPast` set too high? This may cause your model to hang (03/16/2024), Linux Mint, Ubuntu 22.04
* Your GPU usage is still high after node.js exits:
    * Make sure to call `model.dispose()`!!!

### Roadmap

This package has been stabilizing over time, but breaking changes may still happen until the API stabilizes. Here's the todo list:

* \[ ] Purely offline. Per the gui, which can be run completely offline, the bindings should be as well.
* \[ ] NPM bundle size reduction via an optionalDependencies strategy (need help)
    * Should include prebuilds to avoid painful node-gyp errors
* \[x] createChatSession (the python equivalent to create\_chat\_session)
* \[x] generateTokens, the new name for createTokenStream. As of 3.2.0, this is released but not 100% tested. Check spec/generator.mjs!
* \[x] ~~createTokenStream, an async iterator that streams each token emitted from the model.
Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete it
* \[x] Prompt models via a threadsafe function in order to have proper non-blocking behavior in nodejs
* \[x] generateTokens is the new name for this^
* \[x] Proper unit testing (integrate with circle ci)
* \[x] Publish to npm under alpha tag `gpt4all@alpha`
* \[x] Have more people test on other platforms (mac tester needed)
* \[x] Switch to new pluggable backend

## Changes

This repository serves as the new bindings for nodejs users.

- If you were a user of [these bindings](https://github.com/nomic-ai/gpt4all-ts), they are outdated.
- Version 4 includes the following breaking changes:
    * `createEmbedding` & `EmbeddingModel.embed()` return an object, `EmbeddingResult`, instead of a Float32Array.
    * Removed the deprecated types `ModelType` and `ModelFile`
    * Removed the deprecated initialization of a model by string path only

### API Reference

================================================ FILE: gpt4all-bindings/typescript/binding.ci.gyp ================================================ { "targets": [ { "target_name": "gpt4all", # gpt4all-ts will cause compile error "include_dirs": [ "(llmodel_required_mem(GetInference(), full_model_path.c_str(), nCtx, nGpuLayers))); } Napi::Value NodeModelWrapper::GetGpuDevices(const Napi::CallbackInfo &info) { auto env = info.Env(); int num_devices = 0; auto mem_size = llmodel_required_mem(GetInference(), full_model_path.c_str(), nCtx, nGpuLayers); llmodel_gpu_device *all_devices = llmodel_available_gpu_devices(mem_size, &num_devices); if (all_devices == nullptr) { Napi::Error::New(env, "Unable to retrieve list of all GPU devices").ThrowAsJavaScriptException(); return env.Undefined(); } auto js_array = Napi::Array::New(env, num_devices); for (int i = 0; i < num_devices; ++i) { auto gpu_device = all_devices[i]; /* * * struct llmodel_gpu_device { int index = 0; int type = 0; // same
as VkPhysicalDeviceType size_t heapSize = 0; const char * name; const char * vendor; }; * */ Napi::Object js_gpu_device = Napi::Object::New(env); js_gpu_device["index"] = uint32_t(gpu_device.index); js_gpu_device["type"] = uint32_t(gpu_device.type); js_gpu_device["heapSize"] = static_cast(gpu_device.heapSize); js_gpu_device["name"] = gpu_device.name; js_gpu_device["vendor"] = gpu_device.vendor; js_array[i] = js_gpu_device; } return js_array; } Napi::Value NodeModelWrapper::GetType(const Napi::CallbackInfo &info) { if (type.empty()) { return info.Env().Undefined(); } return Napi::String::New(info.Env(), type); } Napi::Value NodeModelWrapper::InitGpuByString(const Napi::CallbackInfo &info) { auto env = info.Env(); size_t memory_required = static_cast(info[0].As().Uint32Value()); std::string gpu_device_identifier = info[1].As(); size_t converted_value; if (memory_required <= std::numeric_limits::max()) { converted_value = static_cast(memory_required); } else { Napi::Error::New(env, "invalid number for memory size. Exceeded bounds for memory.") .ThrowAsJavaScriptException(); return env.Undefined(); } auto result = llmodel_gpu_init_gpu_device_by_string(GetInference(), converted_value, gpu_device_identifier.c_str()); return Napi::Boolean::New(env, result); } Napi::Value NodeModelWrapper::HasGpuDevice(const Napi::CallbackInfo &info) { return Napi::Boolean::New(info.Env(), llmodel_has_gpu_device(GetInference())); } NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo &info) : Napi::ObjectWrap(info) { auto env = info.Env(); auto config_object = info[0].As(); // sets the directory where models (gguf files) are to be searched llmodel_set_implementation_search_path( config_object.Has("library_path") ? 
config_object.Get("library_path").As().Utf8Value().c_str() : "."); std::string model_name = config_object.Get("model_name").As(); fs::path model_path = config_object.Get("model_path").As().Utf8Value(); std::string full_weight_path = (model_path / fs::path(model_name)).string(); name = model_name.empty() ? model_path.filename().string() : model_name; full_model_path = full_weight_path; nCtx = config_object.Get("nCtx").As().Int32Value(); nGpuLayers = config_object.Get("ngl").As().Int32Value(); const char *e; inference_ = llmodel_model_create2(full_weight_path.c_str(), "auto", &e); if (!inference_) { Napi::Error::New(env, e).ThrowAsJavaScriptException(); return; } if (GetInference() == nullptr) { std::cerr << "Tried searching libraries in \"" << llmodel_get_implementation_search_path() << "\"" << std::endl; std::cerr << "Tried searching for model weight in \"" << full_weight_path << "\"" << std::endl; std::cerr << "Do you have runtime libraries installed?" << std::endl; Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException(); return; } std::string device = config_object.Get("device").As(); if (device != "cpu") { size_t mem = llmodel_required_mem(GetInference(), full_weight_path.c_str(), nCtx, nGpuLayers); auto success = llmodel_gpu_init_gpu_device_by_string(GetInference(), mem, device.c_str()); if (!success) { // https://github.com/nomic-ai/gpt4all/blob/3acbef14b7c2436fe033cae9036e695d77461a16/gpt4all-bindings/python/gpt4all/pyllmodel.py#L215 // Haven't implemented this but it is still open to contribution std::cout << "WARNING: Failed to init GPU\n"; } } auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str(), nCtx, nGpuLayers); if (!success) { Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException(); return; } // optional if (config_object.Has("model_type")) { type = config_object.Get("model_type").As(); } }; // NodeModelWrapper::~NodeModelWrapper() { // 
if(GetInference() != nullptr) { // std::cout << "Debug: deleting model\n"; // llmodel_model_destroy(inference_); // std::cout << (inference_ == nullptr); // } // } // void NodeModelWrapper::Finalize(Napi::Env env) { // if(inference_ != nullptr) { // std::cout << "Debug: deleting model\n"; // // } // } Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo &info) { return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference())); } Napi::Value NodeModelWrapper::StateSize(const Napi::CallbackInfo &info) { // Implement the binding for the stateSize method return Napi::Number::New(info.Env(), static_cast(llmodel_get_state_size(GetInference()))); } Napi::Array ChunkedFloatPtr(float *embedding_ptr, int embedding_size, int text_len, Napi::Env const &env) { auto n_embd = embedding_size / text_len; // std::cout << "Embedding size: " << embedding_size << std::endl; // std::cout << "Text length: " << text_len << std::endl; // std::cout << "Chunk size (n_embd): " << n_embd << std::endl; Napi::Array result = Napi::Array::New(env, text_len); auto count = 0; for (int i = 0; i < embedding_size; i += n_embd) { int end = std::min(i + n_embd, embedding_size); // possible bounds error? // Constructs a container with as many elements as the range [first,last), with each element emplace-constructed // from its corresponding element in that range, in the same order. 
std::vector chunk(embedding_ptr + i, embedding_ptr + end); Napi::Float32Array fltarr = Napi::Float32Array::New(env, chunk.size()); // I know there's a way to emplace the raw float ptr into a Napi::Float32Array but idk how and // im too scared to cause memory issues // this is goodenough for (int j = 0; j < chunk.size(); j++) { fltarr.Set(j, chunk[j]); } result.Set(count++, fltarr); } return result; } Napi::Value NodeModelWrapper::GenerateEmbedding(const Napi::CallbackInfo &info) { auto env = info.Env(); auto prefix = info[1]; auto dimensionality = info[2].As().Int32Value(); auto do_mean = info[3].As().Value(); auto atlas = info[4].As().Value(); size_t embedding_size; size_t token_count = 0; // This procedure can maybe be optimized but its whatever, i have too many intermediary structures std::vector text_arr; bool is_single_text = false; if (info[0].IsString()) { is_single_text = true; text_arr.push_back(info[0].As().Utf8Value()); } else { auto jsarr = info[0].As(); size_t len = jsarr.Length(); text_arr.reserve(len); for (size_t i = 0; i < len; ++i) { std::string str = jsarr.Get(i).As().Utf8Value(); text_arr.push_back(str); } } std::vector str_ptrs; str_ptrs.reserve(text_arr.size() + 1); for (size_t i = 0; i < text_arr.size(); ++i) str_ptrs.push_back(text_arr[i].c_str()); str_ptrs.push_back(nullptr); const char *_err = nullptr; float *embeds = llmodel_embed(GetInference(), str_ptrs.data(), &embedding_size, prefix.IsUndefined() ? nullptr : prefix.As().Utf8Value().c_str(), dimensionality, &token_count, do_mean, atlas, nullptr, &_err); if (!embeds) { // i dont wanna deal with c strings lol std::string err(_err); Napi::Error::New(env, err == "(unknown error)" ? 
"Unknown error: sorry bud" : err).ThrowAsJavaScriptException(); return env.Undefined(); } auto embedmat = ChunkedFloatPtr(embeds, embedding_size, text_arr.size(), env); llmodel_free_embedding(embeds); auto res = Napi::Object::New(env); res.Set("n_prompt_tokens", token_count); if(is_single_text) { res.Set("embeddings", embedmat.Get(static_cast(0))); } else { res.Set("embeddings", embedmat); } return res; } /** * Generate a response using the model. * @param prompt A string representing the input prompt. * @param options Inference options. */ Napi::Value NodeModelWrapper::Infer(const Napi::CallbackInfo &info) { auto env = info.Env(); std::string prompt; if (info[0].IsString()) { prompt = info[0].As().Utf8Value(); } else { Napi::Error::New(info.Env(), "invalid string argument").ThrowAsJavaScriptException(); return info.Env().Undefined(); } if (!info[1].IsObject()) { Napi::Error::New(info.Env(), "Missing Prompt Options").ThrowAsJavaScriptException(); return info.Env().Undefined(); } // defaults copied from python bindings llmodel_prompt_context promptContext = {.logits = nullptr, .tokens = nullptr, .n_past = 0, .n_ctx = nCtx, .n_predict = 4096, .top_k = 40, .top_p = 0.9f, .min_p = 0.0f, .temp = 0.1f, .n_batch = 8, .repeat_penalty = 1.2f, .repeat_last_n = 10, .context_erase = 0.75}; PromptWorkerConfig promptWorkerConfig; auto inputObject = info[1].As(); if (inputObject.Has("logits") || inputObject.Has("tokens")) { Napi::Error::New(info.Env(), "Invalid input: 'logits' or 'tokens' properties are not allowed") .ThrowAsJavaScriptException(); return info.Env().Undefined(); } // Assign the remaining properties if (inputObject.Has("nPast") && inputObject.Get("nPast").IsNumber()) { promptContext.n_past = inputObject.Get("nPast").As().Int32Value(); } if (inputObject.Has("nPredict") && inputObject.Get("nPredict").IsNumber()) { promptContext.n_predict = inputObject.Get("nPredict").As().Int32Value(); } if (inputObject.Has("topK") && inputObject.Get("topK").IsNumber()) { 
promptContext.top_k = inputObject.Get("topK").As().Int32Value(); } if (inputObject.Has("topP") && inputObject.Get("topP").IsNumber()) { promptContext.top_p = inputObject.Get("topP").As().FloatValue(); } if (inputObject.Has("minP") && inputObject.Get("minP").IsNumber()) { promptContext.min_p = inputObject.Get("minP").As().FloatValue(); } if (inputObject.Has("temp") && inputObject.Get("temp").IsNumber()) { promptContext.temp = inputObject.Get("temp").As().FloatValue(); } if (inputObject.Has("nBatch") && inputObject.Get("nBatch").IsNumber()) { promptContext.n_batch = inputObject.Get("nBatch").As().Int32Value(); } if (inputObject.Has("repeatPenalty") && inputObject.Get("repeatPenalty").IsNumber()) { promptContext.repeat_penalty = inputObject.Get("repeatPenalty").As().FloatValue(); } if (inputObject.Has("repeatLastN") && inputObject.Get("repeatLastN").IsNumber()) { promptContext.repeat_last_n = inputObject.Get("repeatLastN").As().Int32Value(); } if (inputObject.Has("contextErase") && inputObject.Get("contextErase").IsNumber()) { promptContext.context_erase = inputObject.Get("contextErase").As().FloatValue(); } if (inputObject.Has("onPromptToken") && inputObject.Get("onPromptToken").IsFunction()) { promptWorkerConfig.promptCallback = inputObject.Get("onPromptToken").As(); promptWorkerConfig.hasPromptCallback = true; } if (inputObject.Has("onResponseToken") && inputObject.Get("onResponseToken").IsFunction()) { promptWorkerConfig.responseCallback = inputObject.Get("onResponseToken").As(); promptWorkerConfig.hasResponseCallback = true; } // copy to protect llmodel resources when splitting to new thread // llmodel_prompt_context copiedPrompt = promptContext; promptWorkerConfig.context = promptContext; promptWorkerConfig.model = GetInference(); promptWorkerConfig.mutex = &inference_mutex; promptWorkerConfig.prompt = prompt; promptWorkerConfig.result = ""; promptWorkerConfig.promptTemplate = inputObject.Get("promptTemplate").As(); if (inputObject.Has("special")) { 
promptWorkerConfig.special = inputObject.Get("special").As(); } if (inputObject.Has("fakeReply")) { // this will be deleted in the worker promptWorkerConfig.fakeReply = new std::string(inputObject.Get("fakeReply").As().Utf8Value()); } auto worker = new PromptWorker(env, promptWorkerConfig); worker->Queue(); return worker->GetPromise(); } void NodeModelWrapper::Dispose(const Napi::CallbackInfo &info) { llmodel_model_destroy(inference_); } void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo &info) { if (info[0].IsNumber()) { llmodel_setThreadCount(GetInference(), info[0].As().Int64Value()); } else { Napi::Error::New(info.Env(), "Could not set thread count: argument 1 is NaN").ThrowAsJavaScriptException(); return; } } Napi::Value NodeModelWrapper::GetName(const Napi::CallbackInfo &info) { return Napi::String::New(info.Env(), name); } Napi::Value NodeModelWrapper::ThreadCount(const Napi::CallbackInfo &info) { return Napi::Number::New(info.Env(), llmodel_threadCount(GetInference())); } Napi::Value NodeModelWrapper::GetLibraryPath(const Napi::CallbackInfo &info) { return Napi::String::New(info.Env(), llmodel_get_implementation_search_path()); } llmodel_model NodeModelWrapper::GetInference() { return inference_; } // Exports Bindings Napi::Object Init(Napi::Env env, Napi::Object exports) { exports["LLModel"] = NodeModelWrapper::GetClass(env); return exports; } NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init) ================================================ FILE: gpt4all-bindings/typescript/index.h ================================================ #include "llmodel.h" #include "llmodel_c.h" #include "prompt.h" #include #include #include #include #include #include #include namespace fs = std::filesystem; class NodeModelWrapper : public Napi::ObjectWrap { public: NodeModelWrapper(const Napi::CallbackInfo &); // virtual ~NodeModelWrapper(); Napi::Value GetType(const Napi::CallbackInfo &info); Napi::Value IsModelLoaded(const Napi::CallbackInfo &info); Napi::Value 
StateSize(const Napi::CallbackInfo &info); // void Finalize(Napi::Env env) override; /** * Prompting the model. This entails spawning a new thread and adding the response tokens * into a thread local string variable. */ Napi::Value Infer(const Napi::CallbackInfo &info); void SetThreadCount(const Napi::CallbackInfo &info); void Dispose(const Napi::CallbackInfo &info); Napi::Value GetName(const Napi::CallbackInfo &info); Napi::Value ThreadCount(const Napi::CallbackInfo &info); Napi::Value GenerateEmbedding(const Napi::CallbackInfo &info); Napi::Value HasGpuDevice(const Napi::CallbackInfo &info); Napi::Value ListGpus(const Napi::CallbackInfo &info); Napi::Value InitGpuByString(const Napi::CallbackInfo &info); Napi::Value GetRequiredMemory(const Napi::CallbackInfo &info); Napi::Value GetGpuDevices(const Napi::CallbackInfo &info); /* * The path that is used to search for the dynamic libraries */ Napi::Value GetLibraryPath(const Napi::CallbackInfo &info); /** * Creates the LLModel class */ static Napi::Function GetClass(Napi::Env); llmodel_model GetInference(); private: /** * The underlying inference that interfaces with the C interface */ llmodel_model inference_; std::mutex inference_mutex; std::string type; // corresponds to LLModel::name() in typescript std::string name; int nCtx{}; int nGpuLayers{}; std::string full_model_path; }; ================================================ FILE: gpt4all-bindings/typescript/package.json ================================================ { "name": "gpt4all", "version": "4.0.0", "packageManager": "yarn@3.6.1", "main": "src/gpt4all.js", "repository": "nomic-ai/gpt4all", "scripts": { "install": "node-gyp-build", "test": "jest", "build:backend": "node scripts/build.js", "build": "node-gyp-build", "docs:build": "node scripts/docs.js && documentation readme ./src/gpt4all.d.ts --parse-extension js d.ts --format md --section \"API Reference\" --readme-file ../python/docs/gpt4all_nodejs.md" }, "files": [ "src/**/*", "runtimes/**/*", 
"binding.gyp", "prebuilds/**/*", "*.h", "*.cc", "gpt4all-backend/**/*" ], "dependencies": { "md5-file": "^5.0.0", "node-addon-api": "^6.1.0", "node-gyp-build": "^4.6.0" }, "devDependencies": { "@types/node": "^20.1.5", "documentation": "^14.0.2", "jest": "^29.5.0", "prebuildify": "^5.0.1", "prettier": "^2.8.8" }, "optionalDependencies": { "node-gyp": "9.x.x" }, "engines": { "node": ">= 18.x.x" }, "prettier": { "endOfLine": "lf", "tabWidth": 4 }, "jest": { "verbose": true }, "publishConfig": { "registry": "https://registry.npmjs.org/", "access": "public", "tag": "latest" } } ================================================ FILE: gpt4all-bindings/typescript/prompt.cc ================================================ #include "prompt.h" #include PromptWorker::PromptWorker(Napi::Env env, PromptWorkerConfig config) : promise(Napi::Promise::Deferred::New(env)), _config(config), AsyncWorker(env) { if (_config.hasResponseCallback) { _responseCallbackFn = Napi::ThreadSafeFunction::New(config.responseCallback.Env(), config.responseCallback, "PromptWorker", 0, 1, this); } if (_config.hasPromptCallback) { _promptCallbackFn = Napi::ThreadSafeFunction::New(config.promptCallback.Env(), config.promptCallback, "PromptWorker", 0, 1, this); } } PromptWorker::~PromptWorker() { if (_config.hasResponseCallback) { _responseCallbackFn.Release(); } if (_config.hasPromptCallback) { _promptCallbackFn.Release(); } } void PromptWorker::Execute() { _config.mutex->lock(); LLModelWrapper *wrapper = reinterpret_cast(_config.model); auto ctx = &_config.context; if (size_t(ctx->n_past) < wrapper->promptContext.tokens.size()) wrapper->promptContext.tokens.resize(ctx->n_past); // Copy the C prompt context wrapper->promptContext.n_past = ctx->n_past; wrapper->promptContext.n_ctx = ctx->n_ctx; wrapper->promptContext.n_predict = ctx->n_predict; wrapper->promptContext.top_k = ctx->top_k; wrapper->promptContext.top_p = ctx->top_p; wrapper->promptContext.temp = ctx->temp; wrapper->promptContext.n_batch = 
ctx->n_batch; wrapper->promptContext.repeat_penalty = ctx->repeat_penalty; wrapper->promptContext.repeat_last_n = ctx->repeat_last_n; wrapper->promptContext.contextErase = ctx->context_erase; // Call the C++ prompt method wrapper->llModel->prompt( _config.prompt, _config.promptTemplate, [this](int32_t token_id) { return PromptCallback(token_id); }, [this](int32_t token_id, const std::string token) { return ResponseCallback(token_id, token); }, [](bool isRecalculating) { return isRecalculating; }, wrapper->promptContext, _config.special, _config.fakeReply); // Update the C context by giving access to the wrappers raw pointers to std::vector data // which involves no copies ctx->logits = wrapper->promptContext.logits.data(); ctx->logits_size = wrapper->promptContext.logits.size(); ctx->tokens = wrapper->promptContext.tokens.data(); ctx->tokens_size = wrapper->promptContext.tokens.size(); // Update the rest of the C prompt context ctx->n_past = wrapper->promptContext.n_past; ctx->n_ctx = wrapper->promptContext.n_ctx; ctx->n_predict = wrapper->promptContext.n_predict; ctx->top_k = wrapper->promptContext.top_k; ctx->top_p = wrapper->promptContext.top_p; ctx->temp = wrapper->promptContext.temp; ctx->n_batch = wrapper->promptContext.n_batch; ctx->repeat_penalty = wrapper->promptContext.repeat_penalty; ctx->repeat_last_n = wrapper->promptContext.repeat_last_n; ctx->context_erase = wrapper->promptContext.contextErase; _config.mutex->unlock(); } void PromptWorker::OnOK() { Napi::Object returnValue = Napi::Object::New(Env()); returnValue.Set("text", result); returnValue.Set("nPast", _config.context.n_past); promise.Resolve(returnValue); delete _config.fakeReply; } void PromptWorker::OnError(const Napi::Error &e) { delete _config.fakeReply; promise.Reject(e.Value()); } Napi::Promise PromptWorker::GetPromise() { return promise.Promise(); } bool PromptWorker::ResponseCallback(int32_t token_id, const std::string token) { if (token_id == -1) { return false; } if 
(!_config.hasResponseCallback) { return true; } result += token; std::promise promise; auto info = new ResponseCallbackData(); info->tokenId = token_id; info->token = token; auto future = promise.get_future(); auto status = _responseCallbackFn.BlockingCall( info, [&promise](Napi::Env env, Napi::Function jsCallback, ResponseCallbackData *value) { try { // Transform native data into JS data, passing it to the provided // `jsCallback` -- the TSFN's JavaScript function. auto token_id = Napi::Number::New(env, value->tokenId); auto token = Napi::String::New(env, value->token); auto jsResult = jsCallback.Call({token_id, token}).ToBoolean(); promise.set_value(jsResult); } catch (const Napi::Error &e) { std::cerr << "Error in onResponseToken callback: " << e.what() << std::endl; promise.set_value(false); } delete value; }); if (status != napi_ok) { Napi::Error::Fatal("PromptWorkerResponseCallback", "Napi::ThreadSafeNapi::Function.NonBlockingCall() failed"); } return future.get(); } bool PromptWorker::RecalculateCallback(bool isRecalculating) { return isRecalculating; } bool PromptWorker::PromptCallback(int32_t token_id) { if (!_config.hasPromptCallback) { return true; } std::promise promise; auto info = new PromptCallbackData(); info->tokenId = token_id; auto future = promise.get_future(); auto status = _promptCallbackFn.BlockingCall( info, [&promise](Napi::Env env, Napi::Function jsCallback, PromptCallbackData *value) { try { // Transform native data into JS data, passing it to the provided // `jsCallback` -- the TSFN's JavaScript function. 
auto token_id = Napi::Number::New(env, value->tokenId); auto jsResult = jsCallback.Call({token_id}).ToBoolean(); promise.set_value(jsResult); } catch (const Napi::Error &e) { std::cerr << "Error in onPromptToken callback: " << e.what() << std::endl; promise.set_value(false); } delete value; }); if (status != napi_ok) { Napi::Error::Fatal("PromptWorkerPromptCallback", "Napi::ThreadSafeNapi::Function.NonBlockingCall() failed"); } return future.get(); } ================================================ FILE: gpt4all-bindings/typescript/prompt.h ================================================ #ifndef PREDICT_WORKER_H #define PREDICT_WORKER_H #include "llmodel.h" #include "llmodel_c.h" #include "napi.h" #include #include #include #include #include struct ResponseCallbackData { int32_t tokenId; std::string token; }; struct PromptCallbackData { int32_t tokenId; }; struct LLModelWrapper { LLModel *llModel = nullptr; LLModel::PromptContext promptContext; ~LLModelWrapper() { delete llModel; } }; struct PromptWorkerConfig { Napi::Function responseCallback; bool hasResponseCallback = false; Napi::Function promptCallback; bool hasPromptCallback = false; llmodel_model model; std::mutex *mutex; std::string prompt; std::string promptTemplate; llmodel_prompt_context context; std::string result; bool special = false; std::string *fakeReply = nullptr; }; class PromptWorker : public Napi::AsyncWorker { public: PromptWorker(Napi::Env env, PromptWorkerConfig config); ~PromptWorker(); void Execute() override; void OnOK() override; void OnError(const Napi::Error &e) override; Napi::Promise GetPromise(); bool ResponseCallback(int32_t token_id, const std::string token); bool RecalculateCallback(bool isrecalculating); bool PromptCallback(int32_t token_id); private: Napi::Promise::Deferred promise; std::string result; PromptWorkerConfig _config; Napi::ThreadSafeFunction _responseCallbackFn; Napi::ThreadSafeFunction _promptCallbackFn; }; #endif // PREDICT_WORKER_H 
================================================ FILE: gpt4all-bindings/typescript/scripts/build.js ================================================ const { spawn } = require("node:child_process"); const { resolve } = require("path"); const args = process.argv.slice(2); const platform = process.platform; //windows 64bit or 32 if (platform === "win32") { const path = "scripts/build_msvc.bat"; spawn(resolve(path), ["/Y", ...args], { shell: true, stdio: "inherit" }); process.on("data", (s) => console.log(s.toString())); } else if (platform === "linux" || platform === "darwin") { const path = "scripts/build_unix.sh"; spawn(`sh `, [path, args], { shell: true, stdio: "inherit", }); process.on("data", (s) => console.log(s.toString())); } ================================================ FILE: gpt4all-bindings/typescript/scripts/docs.js ================================================ //Maybe some command line piping would work better, but can't think of platform independent command line tool const fs = require('fs'); const newPath = '../python/docs/gpt4all_nodejs.md'; const filepath = './README.md'; const intro = fs.readFileSync(filepath); fs.writeFileSync( newPath, intro ); ================================================ FILE: gpt4all-bindings/typescript/scripts/mkclangd.js ================================================ /// makes compile_flags.txt for clangd server support with this project /// run this with typescript as your cwd // //for debian users make sure to install libstdc++-12-dev const nodeaddonapi=require('node-addon-api').include; const fsp = require('fs/promises'); const { existsSync, readFileSync } = require('fs'); const assert = require('node:assert'); const findnodeapih = () => { assert(existsSync("./build"), "Haven't built the application once yet. 
run node scripts/prebuild.js"); const dir = readFileSync("./build/config.gypi", 'utf8'); const nodedir_line = dir.match(/"nodedir": "([^"]+)"/); assert(nodedir_line, "Found no matches") assert(nodedir_line[1]); console.log("node_api.h found at: ", nodedir_line[1]); return nodedir_line[1]+"/include/node"; }; const knownIncludes = [ '-I', './', '-I', nodeaddonapi.substring(1, nodeaddonapi.length-1), '-I', '../../gpt4all-backend', '-I', findnodeapih() ]; const knownFlags = [ "-x", "c++", '-std=c++17' ]; const output = knownFlags.join('\n')+'\n'+knownIncludes.join('\n'); fsp.writeFile('./compile_flags.txt', output, 'utf8') .then(() => console.log('done')) .catch(() => console.err('failed')); ================================================ FILE: gpt4all-bindings/typescript/scripts/prebuild.js ================================================ const prebuildify = require("prebuildify"); async function createPrebuilds(combinations) { for (const { platform, arch } of combinations) { const opts = { platform, arch, napi: true, targets: ["18.16.0"] }; try { await createPrebuild(opts); console.log( `Build succeeded for platform ${opts.platform} and architecture ${opts.arch}` ); } catch (err) { console.error( `Error building for platform ${opts.platform} and architecture ${opts.arch}:`, err ); } } } function createPrebuild(opts) { return new Promise((resolve, reject) => { prebuildify(opts, (err) => { if (err) { reject(err); } else { resolve(); } }); }); } let prebuildConfigs; if(process.platform === 'win32') { prebuildConfigs = [ { platform: "win32", arch: "x64" } ]; } else if(process.platform === 'linux') { //Unsure if darwin works, need mac tester! 
prebuildConfigs = [ { platform: "linux", arch: "x64" }, //{ platform: "linux", arch: "arm64" }, //{ platform: "linux", arch: "armv7" }, ] } else if(process.platform === 'darwin') { prebuildConfigs = [ { platform: "darwin", arch: "x64" }, { platform: "darwin", arch: "arm64" }, ] } createPrebuilds(prebuildConfigs) .then(() => console.log("All builds succeeded")) .catch((err) => console.error("Error building:", err)); ================================================ FILE: gpt4all-bindings/typescript/spec/callbacks.mjs ================================================ import { promises as fs } from "node:fs"; import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", { verbose: true, device: "gpu", }); const res = await createCompletion( model, "I've got three 🍣 - What shall I name them?", { onPromptToken: (tokenId) => { console.debug("onPromptToken", { tokenId }); // throwing an error will cancel throw new Error("This is an error"); // const foo = thisMethodDoesNotExist(); // returning false will cancel as well // return false; }, onResponseToken: (tokenId, token) => { console.debug("onResponseToken", { tokenId, token }); // same applies here }, } ); console.debug("Output:", { usage: res.usage, message: res.choices[0].message, }); ================================================ FILE: gpt4all-bindings/typescript/spec/chat-memory.mjs ================================================ import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", { verbose: true, device: "gpu", }); const chat = await model.createChatSession({ messages: [ { role: "user", content: "I'll tell you a secret password: It's 63445.", }, { role: "assistant", content: "I will do my best to remember that.", }, { role: "user", content: "And here another fun fact: Bananas may be bluer than bread at night.", }, { role: "assistant", content: "Yes, that 
makes sense.", }, ], }); const turn1 = await createCompletion( chat, "Please tell me the secret password." ); console.debug(turn1.choices[0].message); // "The secret password you shared earlier is 63445." const turn2 = await createCompletion( chat, "Thanks! Have you heard about the bananas?" ); console.debug(turn2.choices[0].message); for (let i = 0; i < 32; i++) { // gpu go brr const turn = await createCompletion( chat, i % 2 === 0 ? "Tell me a fun fact." : "And a boring one?" ); console.debug({ message: turn.choices[0].message, n_past_tokens: turn.usage.n_past_tokens, }); } const finalTurn = await createCompletion( chat, "Now I forgot the secret password. Can you remind me?" ); console.debug(finalTurn.choices[0].message); // result of finalTurn may vary depending on whether the generated facts pushed the secret out of the context window. // "Of course! The secret password you shared earlier is 63445." // "I apologize for any confusion. As an AI language model, ..." model.dispose(); ================================================ FILE: gpt4all-bindings/typescript/spec/chat-minimal.mjs ================================================ import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", { verbose: true, device: "gpu", }); const chat = await model.createChatSession(); await createCompletion( chat, "Why are bananas rather blue than bread at night sometimes?", { verbose: true, } ); await createCompletion(chat, "Are you sure?", { verbose: true, }); ================================================ FILE: gpt4all-bindings/typescript/spec/concurrency.mjs ================================================ import { loadModel, createCompletion, } from "../src/gpt4all.js"; const modelOptions = { verbose: true, }; const model1 = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", { ...modelOptions, device: "gpu", // only one model can be on gpu }); const model2 = await
loadModel("orca-mini-3b-gguf2-q4_0.gguf", modelOptions); const model3 = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", modelOptions); const promptContext = { verbose: true, } const responses = await Promise.all([ createCompletion(model1, "What is 1 + 1?", promptContext), // generating with the same model instance will wait for the previous completion to finish createCompletion(model1, "What is 1 + 1?", promptContext), // generating with different model instances will run in parallel createCompletion(model2, "What is 1 + 2?", promptContext), createCompletion(model3, "What is 1 + 3?", promptContext), ]); console.log(responses.map((res) => res.choices[0].message)); ================================================ FILE: gpt4all-bindings/typescript/spec/embed-jsonl.mjs ================================================ import { loadModel, createEmbedding } from '../src/gpt4all.js' import { createGunzip, createGzip, createUnzip } from 'node:zlib'; import { Readable } from 'stream' import readline from 'readline' const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", { verbose: true, type: 'embedding', device: 'gpu' }) console.log("Running with", embedder.llm.threadCount(), "threads"); const unzip = createGunzip(); const url = "https://huggingface.co/datasets/sentence-transformers/embedding-training-data/resolve/main/squad_pairs.jsonl.gz" const stream = await fetch(url) .then(res => Readable.fromWeb(res.body)); const lineReader = readline.createInterface({ input: stream.pipe(unzip), crlfDelay: Infinity }) lineReader.on('line', line => { //pairs of questions and answers const question_answer = JSON.parse(line) console.log(createEmbedding(embedder, question_answer)) }) lineReader.on('close', () => embedder.dispose()) ================================================ FILE: gpt4all-bindings/typescript/spec/embed.mjs ================================================ import { loadModel, createEmbedding } from '../src/gpt4all.js' const embedder = await 
loadModel("nomic-embed-text-v1.5.f16.gguf", { verbose: true, type: 'embedding' , device: 'gpu' }) try { console.log(createEmbedding(embedder, ["Accept your current situation", "12312"], { prefix: "search_document" })) } catch(e) { console.log(e) } embedder.dispose() ================================================ FILE: gpt4all-bindings/typescript/spec/llmodel.mjs ================================================ import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, loadModel, } from "../src/gpt4all.js"; const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", { verbose: true, device: "gpu", }); const ll = model.llm; try { class Extended extends LLModel {} } catch (e) { console.log("Extending from native class gone wrong " + e); } console.log("state size " + ll.stateSize()); console.log("thread count " + ll.threadCount()); ll.setThreadCount(5); console.log("thread count " + ll.threadCount()); ll.setThreadCount(4); console.log("thread count " + ll.threadCount()); console.log("name " + ll.name()); console.log("type: " + ll.type()); console.log("Default directory for models", DEFAULT_DIRECTORY); console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY); console.log("Has GPU", ll.hasGpuDevice()); console.log("gpu devices", ll.listGpu()); console.log("Required Mem in bytes", ll.memoryNeeded()); // to ingest a custom system prompt without using a chat session. await createCompletion( model, "<|im_start|>system\nYou are an advanced mathematician.\n<|im_end|>\n", { promptTemplate: "%1", nPredict: 0, special: true, } ); const completion1 = await createCompletion(model, "What is 1 + 1?", { verbose: true, }); console.log(`🤖 > ${completion1.choices[0].message.content}`); //Very specific: // tested on Ubuntu 22.0, Linux Mint, if I set nPast to 100, the app hangs. 
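// The spec above ingests a raw system prompt by passing promptTemplate: "%1" with nPredict: 0,
// so the text is consumed verbatim without generating a response. As a minimal illustrative
// sketch (NOT the binding's internal implementation), the documented %1/%2 placeholder
// contract — %1 is replaced by the user input, %2 optionally by the assistant reply,
// which is otherwise appended — behaves like this:

```javascript
// Sketch of the %1 / %2 prompt-template placeholder contract.
// Hypothetical helper for illustration; the real substitution
// happens inside the native binding.
function applyPromptTemplate(template, userInput, assistantReply = "") {
  let out = template.replace("%1", userInput);
  if (template.includes("%2")) {
    out = out.replace("%2", assistantReply);
  } else {
    // without a %2 placeholder, the assistant response is appended
    out += assistantReply;
  }
  return out;
}

const prompt = applyPromptTemplate(
  "### Human:\n%1\n\n### Assistant:\n%2",
  "What is 1 + 1?",
  "2"
);
console.log(prompt); // "### Human:\nWhat is 1 + 1?\n\n### Assistant:\n2"
```

// Passing promptTemplate: "%1" therefore makes the input pass through unchanged,
// which is why the spec uses it to ingest a raw, specially-tokenized system prompt.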
const completion2 = await createCompletion(model, "And if we add two?", { verbose: true, }); console.log(`🤖 > ${completion2.choices[0].message.content}`); // CALLING DISPOSE WILL INVALIDATE THE NATIVE MODEL. USE THIS TO CLEAN UP. model.dispose(); console.log("model disposed, exiting..."); ================================================ FILE: gpt4all-bindings/typescript/spec/long-context.mjs ================================================ import { promises as fs } from "node:fs"; import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", { verbose: true, device: "gpu", nCtx: 32768, }); const typeDefSource = await fs.readFile("./src/gpt4all.d.ts", "utf-8"); const res = await createCompletion( model, "Here are the type definitions for the GPT4All API:\n\n" + typeDefSource + "\n\nHow do I create a completion with a really large context window?", { verbose: true, } ); console.debug(res.choices[0].message); ================================================ FILE: gpt4all-bindings/typescript/spec/model-switching.mjs ================================================ import { loadModel, createCompletion } from "../src/gpt4all.js"; const model1 = await loadModel("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", { device: "gpu", nCtx: 4096, }); const chat1 = await model1.createChatSession({ temperature: 0.8, topP: 0.7, topK: 60, }); const chat1turn1 = await createCompletion( chat1, "Outline a short story concept for adults. About why bananas are rather blue than bread is green at night sometimes. Not too long." ); console.debug(chat1turn1.choices[0].message); const chat1turn2 = await createCompletion( chat1, "Let's sprinkle some plot twists. And a cliffhanger at the end." ); console.debug(chat1turn2.choices[0].message); const chat1turn3 = await createCompletion( chat1, "Analyze your plot. Find the weak points."
); console.debug(chat1turn3.choices[0].message); const chat1turn4 = await createCompletion( chat1, "Rewrite it based on the analysis." ); console.debug(chat1turn4.choices[0].message); model1.dispose(); const model2 = await loadModel("gpt4all-falcon-newbpe-q4_0.gguf", { device: "gpu", }); const chat2 = await model2.createChatSession({ messages: chat1.messages, }); const chat2turn1 = await createCompletion( chat2, "Give three ideas how this plot could be improved." ); console.debug(chat2turn1.choices[0].message); const chat2turn2 = await createCompletion( chat2, "Revise the plot, applying your ideas." ); console.debug(chat2turn2.choices[0].message); model2.dispose(); ================================================ FILE: gpt4all-bindings/typescript/spec/stateless.mjs ================================================ import { loadModel, createCompletion } from "../src/gpt4all.js"; const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", { verbose: true, device: "gpu", }); const messages = [ { role: "system", content: "<|im_start|>system\nYou are an advanced mathematician.\n<|im_end|>\n", }, { role: "user", content: "What's 2+2?", }, { role: "assistant", content: "5", }, { role: "user", content: "Are you sure?", }, ]; const res1 = await createCompletion(model, messages); console.debug(res1.choices[0].message); messages.push(res1.choices[0].message); messages.push({ role: "user", content: "Could you double check that?", }); const res2 = await createCompletion(model, messages); console.debug(res2.choices[0].message); messages.push(res2.choices[0].message); messages.push({ role: "user", content: "Let's bring out the big calculators.", }); const res3 = await createCompletion(model, messages); console.debug(res3.choices[0].message); messages.push(res3.choices[0].message); // console.debug(messages); ================================================ FILE: gpt4all-bindings/typescript/spec/streaming.mjs ================================================ import { loadModel, 
createCompletion, createCompletionStream, createCompletionGenerator, } from "../src/gpt4all.js"; const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", { device: "gpu", }); process.stdout.write("### Stream:"); const stream = createCompletionStream(model, "How are you?"); stream.tokens.on("data", (data) => { process.stdout.write(data); }); await stream.result; process.stdout.write("\n"); process.stdout.write("### Stream with pipe:"); const stream2 = createCompletionStream( model, "Please say something nice about node streams." ); stream2.tokens.pipe(process.stdout); const stream2Res = await stream2.result; process.stdout.write("\n"); process.stdout.write("### Generator:"); const gen = createCompletionGenerator(model, "generators instead?", { nPast: stream2Res.usage.n_past_tokens, }); for await (const chunk of gen) { process.stdout.write(chunk); } process.stdout.write("\n"); process.stdout.write("### Callback:"); await createCompletion(model, "Why not just callbacks?", { onResponseToken: (tokenId, token) => { process.stdout.write(token); }, }); process.stdout.write("\n"); process.stdout.write("### 2nd Generator:"); const gen2 = createCompletionGenerator(model, "If 3 + 3 is 5, what is 2 + 2?"); let chunk = await gen2.next(); while (!chunk.done) { process.stdout.write(chunk.value); chunk = await gen2.next(); } process.stdout.write("\n"); console.debug("generator finished", chunk); model.dispose(); ================================================ FILE: gpt4all-bindings/typescript/spec/system.mjs ================================================ import { loadModel, createCompletion, } from "../src/gpt4all.js"; const model = await loadModel("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", { verbose: true, device: "gpu", }); const chat = await model.createChatSession({ verbose: true, systemPrompt: "<|im_start|>system\nRoleplay as Batman. 
Answer as if you are Batman, never say you're an Assistant.\n<|im_end|>", }); const turn1 = await createCompletion(chat, "You have any plans tonight?"); console.log(turn1.choices[0].message); // "I'm afraid I must decline any personal invitations tonight. As Batman, I have a responsibility to protect Gotham City." model.dispose(); ================================================ FILE: gpt4all-bindings/typescript/src/chat-session.js ================================================ const { DEFAULT_PROMPT_CONTEXT } = require("./config"); const { prepareMessagesForIngest } = require("./util"); class ChatSession { model; modelName; /** * @type {import('./gpt4all').ChatMessage[]} */ messages; /** * @type {string} */ systemPrompt; /** * @type {import('./gpt4all').LLModelPromptContext} */ promptContext; /** * @type {boolean} */ initialized; constructor(model, chatSessionOpts = {}) { const { messages, systemPrompt, ...sessionDefaultPromptContext } = chatSessionOpts; this.model = model; this.modelName = model.llm.name(); this.messages = messages ?? []; this.systemPrompt = systemPrompt ?? 
model.config.systemPrompt; this.initialized = false; this.promptContext = { ...DEFAULT_PROMPT_CONTEXT, ...sessionDefaultPromptContext, nPast: 0, }; } async initialize(completionOpts = {}) { if (this.model.activeChatSession !== this) { this.model.activeChatSession = this; } let tokensIngested = 0; // ingest system prompt if (this.systemPrompt) { const systemRes = await this.model.generate(this.systemPrompt, { promptTemplate: "%1", nPredict: 0, special: true, nBatch: this.promptContext.nBatch, // verbose: true, }); tokensIngested += systemRes.tokensIngested; this.promptContext.nPast = systemRes.nPast; } // ingest initial messages if (this.messages.length > 0) { tokensIngested += await this.ingestMessages( this.messages, completionOpts ); } this.initialized = true; return tokensIngested; } async ingestMessages(messages, completionOpts = {}) { const turns = prepareMessagesForIngest(messages); // send the message pairs to the model let tokensIngested = 0; for (const turn of turns) { const turnRes = await this.model.generate(turn.user, { ...this.promptContext, ...completionOpts, fakeReply: turn.assistant, }); tokensIngested += turnRes.tokensIngested; this.promptContext.nPast = turnRes.nPast; } return tokensIngested; } async generate(input, completionOpts = {}) { if (this.model.activeChatSession !== this) { throw new Error( "Chat session is not active. Create a new chat session or call initialize to continue." ); } if (completionOpts.nPast > this.promptContext.nPast) { throw new Error( `nPast cannot be greater than ${this.promptContext.nPast}.` ); } let tokensIngested = 0; if (!this.initialized) { tokensIngested += await this.initialize(completionOpts); } let prompt = input; if (Array.isArray(input)) { // assuming input is a messages array // -> tailing user message will be used as the final prompt. its optional. // -> all system messages will be ignored. 
// -> all other messages will be ingested with fakeReply // -> user/assistant messages will be pushed into the messages array let tailingUserMessage = ""; let messagesToIngest = input; const lastMessage = input[input.length - 1]; if (lastMessage.role === "user") { tailingUserMessage = lastMessage.content; messagesToIngest = input.slice(0, input.length - 1); } if (messagesToIngest.length > 0) { tokensIngested += await this.ingestMessages( messagesToIngest, completionOpts ); this.messages.push(...messagesToIngest); } if (tailingUserMessage) { prompt = tailingUserMessage; } else { return { text: "", nPast: this.promptContext.nPast, tokensIngested, tokensGenerated: 0, }; } } const result = await this.model.generate(prompt, { ...this.promptContext, ...completionOpts, }); this.promptContext.nPast = result.nPast; result.tokensIngested += tokensIngested; this.messages.push({ role: "user", content: prompt, }); this.messages.push({ role: "assistant", content: result.text, }); return result; } } module.exports = { ChatSession, }; ================================================ FILE: gpt4all-bindings/typescript/src/config.js ================================================ const os = require("node:os"); const path = require("node:path"); const DEFAULT_DIRECTORY = path.resolve(os.homedir(), ".cache/gpt4all"); const librarySearchPaths = [ path.join(DEFAULT_DIRECTORY, "libraries"), path.resolve("./libraries"), path.resolve( __dirname, "..", `runtimes/${process.platform}-${process.arch}/native`, ), //for darwin. 
This is hardcoded for now but it should work path.resolve( __dirname, "..", `runtimes/${process.platform}/native`, ), process.cwd(), ]; const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";"); const DEFAULT_MODEL_CONFIG = { systemPrompt: "", promptTemplate: "### Human:\n%1\n\n### Assistant:\n", } const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models3.json"; const DEFAULT_PROMPT_CONTEXT = { temp: 0.1, topK: 40, topP: 0.9, minP: 0.0, repeatPenalty: 1.18, repeatLastN: 10, nBatch: 100, } module.exports = { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, DEFAULT_MODEL_CONFIG, DEFAULT_MODEL_LIST_URL, DEFAULT_PROMPT_CONTEXT, }; ================================================ FILE: gpt4all-bindings/typescript/src/gpt4all.d.ts ================================================ /// <reference types="node" /> declare module "gpt4all"; interface LLModelOptions { /** * Model architecture. This argument currently does not have any functionality and is just used as a descriptive identifier for the user. */ type?: string; model_name: string; model_path: string; library_path?: string; } interface ModelConfig { systemPrompt: string; promptTemplate: string; path: string; url?: string; } /** * Options for the chat session. */ interface ChatSessionOptions extends Partial<LLModelPromptContext> { /** * System prompt to ingest on initialization. */ systemPrompt?: string; /** * Messages to ingest on initialization. */ messages?: ChatMessage[]; } /** * ChatSession utilizes an InferenceModel for efficient processing of chat conversations. */ declare class ChatSession implements CompletionProvider { /** * Constructs a new ChatSession using the provided InferenceModel and options. * Does not set the chat session as the active chat session until initialize is called. * @param {InferenceModel} model An InferenceModel instance. * @param {ChatSessionOptions} [options] Options for the chat session including default completion options.
*/ constructor(model: InferenceModel, options?: ChatSessionOptions); /** * The underlying InferenceModel used for generating completions. */ model: InferenceModel; /** * The name of the model. */ modelName: string; /** * The messages that have been exchanged in this chat session. */ messages: ChatMessage[]; /** * The system prompt that has been ingested at the beginning of the chat session. */ systemPrompt: string; /** * The current prompt context of the chat session. */ promptContext: LLModelPromptContext; /** * Ingests system prompt and initial messages. * Sets this chat session as the active chat session of the model. * @param {CompletionOptions} [options] Set completion options for initialization. * @returns {Promise<number>} The number of tokens ingested during initialization (system prompt + messages). */ initialize(completionOpts?: CompletionOptions): Promise<number>; /** * Prompts the model in chat-session context. * @param {CompletionInput} input Input string or message array. * @param {CompletionOptions} [options] Set completion options for this generation. * @returns {Promise<InferenceResult>} The inference result. * @throws {Error} If the chat session is not the active chat session of the model. * @throws {Error} If nPast is set to a value higher than what has been ingested in the session. */ generate( input: CompletionInput, options?: CompletionOptions ): Promise<InferenceResult>; } /** * Shape of InferenceModel generations. */ interface InferenceResult extends LLModelInferenceResult { tokensIngested: number; tokensGenerated: number; } /** * InferenceModel represents an LLM which can make next-token predictions. */ declare class InferenceModel implements CompletionProvider { constructor(llm: LLModel, config: ModelConfig); /** The native LLModel */ llm: LLModel; /** The configuration the instance was constructed with. */ config: ModelConfig; /** The active chat session of the model. */ activeChatSession?: ChatSession; /** The name of the model.
*/ modelName: string; /** * Create a chat session with the model and set it as the active chat session of this model. * A model instance can only have one active chat session at a time. * @param {ChatSessionOptions} options The options for the chat session. * @returns {Promise<ChatSession>} The chat session. */ createChatSession(options?: ChatSessionOptions): Promise<ChatSession>; /** * Prompts the model with a given input and optional parameters. * @param {CompletionInput} input The prompt input. * @param {CompletionOptions} options Prompt context and other options. * @returns {Promise<InferenceResult>} The model's response to the prompt. * @throws {Error} If nPast is set to a value smaller than 0. * @throws {Error} If a messages array without a tailing user message is provided. */ generate( prompt: string, options?: CompletionOptions ): Promise<InferenceResult>; /** * delete and cleanup the native model */ dispose(): void; } /** * Options for generating one or more embeddings. */ interface EmbeddingOptions { /** * The model-specific prefix representing the embedding task, without the trailing colon. For Nomic Embed * this can be `search_query`, `search_document`, `classification`, or `clustering`. */ prefix?: string; /** * The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size. * @default depends on the model being used. */ dimensionality?: number; /** * How to handle texts longer than the model can accept. One of `mean` or `truncate`. * @default "mean" */ longTextMode?: "mean" | "truncate"; /** * Try to be fully compatible with the Atlas API. Currently, this means texts longer than 8192 tokens * with long_text_mode="mean" will raise an error. Disabled by default. * @default false */ atlas?: boolean; } /** * The nodejs equivalent to the python binding's Embed4All().embed() * @param {EmbeddingModel} model The embedding model instance. * @param {string} text Text to embed. * @param {EmbeddingOptions} options Optional parameters for the embedding. * @returns {EmbeddingResult} The embedding result. * @throws {Error} If dimensionality is set to a value smaller than 1. */ declare function createEmbedding( model: EmbeddingModel, text: string, options?: EmbeddingOptions ): EmbeddingResult<Float32Array>; /** * Overload that takes multiple strings to embed. * @param {EmbeddingModel} model The embedding model instance. * @param {string[]} texts Texts to embed. * @param {EmbeddingOptions} options Optional parameters for the embedding. * @returns {EmbeddingResult} The embedding result. * @throws {Error} If dimensionality is set to a value smaller than 1. */ declare function createEmbedding( model: EmbeddingModel, text: string[], options?: EmbeddingOptions ): EmbeddingResult<Float32Array[]>; /** * The resulting embedding. */ interface EmbeddingResult<T = Float32Array> { /** * Encoded token count. Includes overlap but specifically excludes tokens used for the prefix/task_type, BOS/CLS token, and EOS/SEP token **/ n_prompt_tokens: number; embeddings: T; } /** * EmbeddingModel represents an LLM which can create embeddings, which are float arrays */ declare class EmbeddingModel { constructor(llm: LLModel, config: ModelConfig); /** The native LLModel */ llm: LLModel; /** The configuration the instance was constructed with. */ config: ModelConfig; /** * Create an embedding from a given input string. See EmbeddingOptions. * @param {string} text * @param {string} prefix * @param {number} dimensionality * @param {boolean} doMean * @param {boolean} atlas * @returns {EmbeddingResult} The embedding result. */ embed( text: string, prefix: string, dimensionality: number, doMean: boolean, atlas: boolean ): EmbeddingResult<Float32Array>; /** * Create an embedding from a given input text array. See EmbeddingOptions. * @param {string[]} text * @param {string} prefix * @param {number} dimensionality * @param {boolean} doMean * @param {boolean} atlas * @returns {EmbeddingResult} The embedding result.
*/ embed( text: string[], prefix: string, dimensionality: number, doMean: boolean, atlas: boolean ): EmbeddingResult<Float32Array[]>; /** * delete and cleanup the native model */ dispose(): void; } /** * Shape of LLModel's inference result. */ interface LLModelInferenceResult { text: string; nPast: number; } interface LLModelInferenceOptions extends Partial<LLModelPromptContext> { /** Callback for response tokens, called for each generated token. * @param {number} tokenId The token id. * @param {string} token The token. * @returns {boolean | undefined} Whether to continue generating tokens. * */ onResponseToken?: (tokenId: number, token: string) => boolean | void; /** Callback for prompt tokens, called for each input token in the prompt. * @param {number} tokenId The token id. * @returns {boolean | undefined} Whether to continue ingesting the prompt. * */ onPromptToken?: (tokenId: number) => boolean | void; } /** * LLModel class representing a language model. * This is a base class that provides common functionality for different types of language models. */ declare class LLModel { /** * Initialize a new LLModel. * @param {LLModelOptions} options The model options, including the absolute path to the model file. * @throws {Error} If the model file does not exist. */ constructor(options: LLModelOptions); /** undefined or user supplied */ type(): string | undefined; /** The name of the model. */ name(): string; /** * Get the size of the internal state of the model. * NOTE: This state data is specific to the type of model you have created. * @return the size in bytes of the internal state of the model */ stateSize(): number; /** * Get the number of threads used for model inference. * The default is the number of physical cores your computer has. * @returns The number of threads used for model inference. */ threadCount(): number; /** * Set the number of threads used for model inference. * @param newNumber The new number of threads.
*/ setThreadCount(newNumber: number): void; /** * Prompt the model directly with a given input string and optional parameters. * Use the higher level createCompletion methods for a more user-friendly interface. * @param {string} prompt The prompt input. * @param {LLModelInferenceOptions} options Optional parameters for the generation. * @returns {Promise<LLModelInferenceResult>} The response text and final context size. */ infer( prompt: string, options: LLModelInferenceOptions ): Promise<LLModelInferenceResult>; /** * Embed text with the model. See EmbeddingOptions for more information. * Use the higher level createEmbedding methods for a more user-friendly interface. * @param {string} text * @param {string} prefix * @param {number} dimensionality * @param {boolean} doMean * @param {boolean} atlas * @returns {Float32Array} The embedding of the text. */ embed( text: string, prefix: string, dimensionality: number, doMean: boolean, atlas: boolean ): Float32Array; /** * Embed multiple texts with the model. See EmbeddingOptions for more information. * Use the higher level createEmbedding methods for a more user-friendly interface. * @param {string[]} texts * @param {string} prefix * @param {number} dimensionality * @param {boolean} doMean * @param {boolean} atlas * @returns {Float32Array[]} The embeddings of the texts. */ embed( texts: string[], prefix: string, dimensionality: number, doMean: boolean, atlas: boolean ): Float32Array[]; /** * Whether the model is loaded or not. */ isModelLoaded(): boolean; /** * Where to search for the pluggable backend libraries */ setLibraryPath(s: string): void; /** * Where to get the pluggable backend libraries */ getLibraryPath(): string; /** * Initiate a GPU by a string identifier. * @param {number} memory_required Should be in the range size_t or will throw * @param {string} device_name 'amd' | 'nvidia' | 'intel' | 'gpu' | gpu name.
* read LoadModelOptions.device for more information */ initGpuByString(memory_required: number, device_name: string): boolean; /** * From C documentation * @returns True if a GPU device is successfully initialized, false otherwise. */ hasGpuDevice(): boolean; /** * GPUs that are usable for this LLModel * @param {number} nCtx Maximum size of context window * @throws if hasGpuDevice returns false (i think) * @returns */ listGpu(nCtx: number): GpuDevice[]; /** * delete and cleanup the native model */ dispose(): void; } /** * an object that contains gpu data on this machine. */ interface GpuDevice { index: number; /** * same as VkPhysicalDeviceType */ type: number; heapSize: number; name: string; vendor: string; } /** * Options that configure a model's behavior. */ interface LoadModelOptions { /** * Where to look for model files. */ modelPath?: string; /** * Where to look for the backend libraries. */ librariesPath?: string; /** * The path to the model configuration file, useful for offline usage or custom model configurations. */ modelConfigFile?: string; /** * Whether to allow downloading the model if it is not present at the specified path. */ allowDownload?: boolean; /** * Enable verbose logging. */ verbose?: boolean; /** * The processing unit on which the model will run. It can be set to * - "cpu": Model will run on the central processing unit. * - "gpu": Model will run on the best available graphics processing unit, irrespective of its vendor. * - "amd", "nvidia", "intel": Model will run on the best available GPU from the specified vendor. * - "gpu name": Model will run on the GPU that matches the name if it's available. * Note: If a GPU device lacks sufficient RAM to accommodate the model, an error will be thrown, and the GPT4All * instance will be rendered invalid. It's advised to ensure the device has enough memory before initiating the * model. 
* @default "cpu" */ device?: string; /** * The maximum context window size of this model * @default 2048 */ nCtx?: number; /** * Number of gpu layers needed * @default 100 */ ngl?: number; } interface InferenceModelOptions extends LoadModelOptions { type?: "inference"; } interface EmbeddingModelOptions extends LoadModelOptions { type: "embedding"; } /** * Loads a machine learning model with the specified name. The de facto way to create a model. * By default this will download a model from the official GPT4ALL website if a model is not present at the given path. * * @param {string} modelName - The name of the model to load. * @param {LoadModelOptions|undefined} [options] - (Optional) Additional options for loading the model. * @returns {Promise<InferenceModel | EmbeddingModel>} A promise that resolves to an instance of the loaded model. */ declare function loadModel( modelName: string, options?: InferenceModelOptions ): Promise<InferenceModel>; declare function loadModel( modelName: string, options?: EmbeddingModelOptions ): Promise<EmbeddingModel>; declare function loadModel( modelName: string, options?: EmbeddingModelOptions | InferenceModelOptions ): Promise<InferenceModel | EmbeddingModel>; /** * Interface for createCompletion methods, implemented by InferenceModel and ChatSession. * Implement your own CompletionProvider or extend ChatSession to generate completions with custom logic. */ interface CompletionProvider { modelName: string; generate( input: CompletionInput, options?: CompletionOptions ): Promise<InferenceResult>; } /** * Options for creating a completion. */ interface CompletionOptions extends LLModelInferenceOptions { /** * Indicates if verbose logging is enabled. * @default false */ verbose?: boolean; } /** * The input for creating a completion. May be a string or an array of messages.
*/ type CompletionInput = string | ChatMessage[]; /** * The nodejs equivalent to python binding's chat_completion * @param {CompletionProvider} provider - The inference model object or chat session * @param {CompletionInput} input - The input string or message array * @param {CompletionOptions} options - The options for creating the completion. * @returns {Promise<CompletionResult>} The completion result. */ declare function createCompletion( provider: CompletionProvider, input: CompletionInput, options?: CompletionOptions ): Promise<CompletionResult>; /** * Streaming variant of createCompletion, returns a stream of tokens and a promise that resolves to the completion result. * @param {CompletionProvider} provider - The inference model object or chat session * @param {CompletionInput} input - The input string or message array * @param {CompletionOptions} options - The options for creating the completion. * @returns {CompletionStreamReturn} An object of token stream and the completion result promise. */ declare function createCompletionStream( provider: CompletionProvider, input: CompletionInput, options?: CompletionOptions ): CompletionStreamReturn; /** * The result of a streamed completion, containing a stream of tokens and a promise that resolves to the completion result. */ interface CompletionStreamReturn { tokens: NodeJS.ReadableStream; result: Promise<CompletionResult>; } /** * Async generator variant of createCompletion, yields tokens as they are generated and returns the completion result. * @param {CompletionProvider} provider - The inference model object or chat session * @param {CompletionInput} input - The input string or message array * @param {CompletionOptions} options - The options for creating the completion. * @returns {AsyncGenerator<string, CompletionResult>} The stream of generated tokens */ declare function createCompletionGenerator( provider: CompletionProvider, input: CompletionInput, options: CompletionOptions ): AsyncGenerator<string, CompletionResult>; /** * A message in the conversation.
*/ interface ChatMessage { /** The role of the message. */ role: "system" | "assistant" | "user"; /** The message content. */ content: string; } /** * The result of a completion. */ interface CompletionResult { /** The model used for the completion. */ model: string; /** Token usage report. */ usage: { /** The number of tokens ingested during the completion. */ prompt_tokens: number; /** The number of tokens generated in the completion. */ completion_tokens: number; /** The total number of tokens used. */ total_tokens: number; /** Number of tokens used in the conversation. */ n_past_tokens: number; }; /** The generated completion. */ choices: Array<{ message: ChatMessage; }>; } /** * Model inference arguments for generating completions. */ interface LLModelPromptContext { /** The size of the raw logits vector. */ logitsSize: number; /** The size of the raw tokens vector. */ tokensSize: number; /** The number of tokens in the past conversation. * This may be used to "roll back" the conversation to a previous state. * Note that for most use cases the default value should be sufficient and this should not be set. * @default 0 For completions using InferenceModel, meaning the model will only consider the input prompt. * @default nPast For completions using ChatSession. This means the context window will be automatically determined * and possibly resized (see contextErase) to keep the conversation performant. * */ nPast: number; /** The maximum number of tokens to predict. * @default 4096 * */ nPredict: number; /** Template for user / assistant message pairs. * %1 is required and will be replaced by the user input. * %2 is optional and will be replaced by the assistant response. If not present, the assistant response will be appended. */ promptTemplate?: string; /** The context window size. Do not use, it has no effect. See loadModel options. * THIS IS DEPRECATED!!! * Use loadModel's nCtx option instead. 
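The `usage` object of a `CompletionResult` is derived from the raw inference counts; the mapping (field names taken from this file) can be sketched as:

```javascript
// Sketch of mapping an InferenceResult's token counts onto the
// OpenAI-style CompletionResult.usage object described above.
function toUsage(inferenceResult) {
    return {
        prompt_tokens: inferenceResult.tokensIngested,
        completion_tokens: inferenceResult.tokensGenerated,
        total_tokens:
            inferenceResult.tokensIngested + inferenceResult.tokensGenerated,
        n_past_tokens: inferenceResult.nPast,
    };
}

const usage = toUsage({ tokensIngested: 12, tokensGenerated: 30, nPast: 40 });
```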
* @default 2048 */ nCtx: number; /** The top-k logits to sample from. * Top-K sampling selects the next token only from the top K most likely tokens predicted by the model. * It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit * the diversity of the output. A higher value for top-K (eg., 100) will consider more tokens and lead * to more diverse text, while a lower value (eg., 10) will focus on the most probable tokens and generate * more conservative text. 30 - 60 is a good range for most tasks. * @default 40 * */ topK: number; /** The nucleus sampling probability threshold. * Top-P limits the selection of the next token to a subset of tokens with a cumulative probability * above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity * and quality by considering both token probabilities and the number of tokens available for sampling. * When using a higher value for top-P (eg., 0.95), the generated text becomes more diverse. * On the other hand, a lower value (eg., 0.1) produces more focused and conservative text. * @default 0.9 * * */ topP: number; /** * The minimum probability of a token to be considered. * @default 0.0 */ minP: number; /** The temperature to adjust the model's output distribution. * Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures * (eg., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (eg., 0.5) * make the output more focused, predictable, and conservative. When the temperature is set to 0, the output * becomes completely deterministic, always selecting the most probable next token and producing identical results * each time. Try what value fits best for your use case and model. * @default 0.1 * @alias temperature * */ temp: number; temperature: number; /** The number of predictions to generate in parallel. 
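Top-K filtering and temperature scaling as described above can be illustrated with a toy sampler. This is a conceptual sketch only, not the library's internal sampling code:

```javascript
// Illustrative top-K sampling with temperature scaling (concept sketch,
// not the library's internals). Returns the index of the sampled token.
function topKSample(logits, topK, temp, rng = Math.random) {
    // keep the indices of the K largest logits
    const indexed = logits.map((logit, i) => [logit, i]);
    indexed.sort((a, b) => b[0] - a[0]);
    const kept = indexed.slice(0, topK);
    // softmax over the kept logits, scaled by temperature
    const scaled = kept.map(([logit]) => logit / Math.max(temp, 1e-8));
    const maxLogit = Math.max(...scaled);
    const exps = scaled.map((l) => Math.exp(l - maxLogit));
    const sum = exps.reduce((a, b) => a + b, 0);
    const probs = exps.map((e) => e / sum);
    // sample an index proportionally to its probability
    let r = rng();
    for (let i = 0; i < probs.length; i++) {
        r -= probs[i];
        if (r <= 0) return kept[i][1];
    }
    return kept[probs.length - 1][1];
}
```

With `topK = 1` the sampler always returns the most probable token, which is why low top-K values produce conservative text.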
* By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However, * this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts * with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing. * To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run. * @default 8 * */ nBatch: number; /** The penalty factor for repeated tokens. * Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt. * A token that has already appeared five times is penalized more heavily than a token that has appeared only one time. * A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens. * @default 1.18 * */ repeatPenalty: number; /** The number of last tokens to penalize. * The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition. * A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only * consider recent tokens. * @default 10 * */ repeatLastN: number; /** The percentage of context to erase if the context window is exceeded. * Set it to a lower value to keep context for longer at the cost of performance. * @default 0.75 * */ contextErase: number; } /** * From python api: * models will be stored in (homedir)/.cache/gpt4all/` */ declare const DEFAULT_DIRECTORY: string; /** * From python api: * The default path for dynamic libraries to be stored. * You may separate paths by a semicolon to search in multiple areas. * This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd. */ declare const DEFAULT_LIBRARIES_DIRECTORY: string; /** * Default model configuration. */ declare const DEFAULT_MODEL_CONFIG: ModelConfig; /** * Default prompt context. 
*/ declare const DEFAULT_PROMPT_CONTEXT: LLModelPromptContext; /** * Default model list url. */ declare const DEFAULT_MODEL_LIST_URL: string; /** * Initiates the download of a model file. * By default this downloads without waiting. use the controller returned to alter this behavior. * @param {string} modelName - The model to be downloaded. * @param {DownloadModelOptions} options - to pass into the downloader. Default is { location: (cwd), verbose: false }. * @returns {DownloadController} object that allows controlling the download process. * * @throws {Error} If the model already exists in the specified location. * @throws {Error} If the model cannot be found at the specified url. * * @example * const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin') * download.promise.then(() => console.log('Downloaded!')) */ declare function downloadModel( modelName: string, options?: DownloadModelOptions ): DownloadController; /** * Options for the model download process. */ interface DownloadModelOptions { /** * location to download the model. * Default is process.cwd(), or the current working directory */ modelPath?: string; /** * Debug mode -- check how long it took to download in seconds * @default false */ verbose?: boolean; /** * Remote download url. Defaults to `https://gpt4all.io/models/gguf/` * @default https://gpt4all.io/models/gguf/ */ url?: string; /** * MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum. * If the sums do not match, an error will be thrown and the file will be deleted. */ md5sum?: string; } interface ListModelsOptions { url?: string; file?: string; } declare function listModels( options?: ListModelsOptions ): Promise; interface RetrieveModelOptions { allowDownload?: boolean; verbose?: boolean; modelPath?: string; modelConfigFile?: string; } declare function retrieveModel( modelName: string, options?: RetrieveModelOptions ): Promise; /** * Model download controller. 
*/ interface DownloadController { /** Cancel the request to download if this is called. */ cancel: () => void; /** A promise resolving to the downloaded models config once the download is done */ promise: Promise; } export { LLModel, LLModelPromptContext, ModelConfig, InferenceModel, InferenceResult, EmbeddingModel, EmbeddingResult, ChatSession, ChatMessage, CompletionInput, CompletionProvider, CompletionOptions, CompletionResult, LoadModelOptions, DownloadController, RetrieveModelOptions, DownloadModelOptions, GpuDevice, loadModel, downloadModel, retrieveModel, listModels, createCompletion, createCompletionStream, createCompletionGenerator, createEmbedding, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, DEFAULT_MODEL_CONFIG, DEFAULT_PROMPT_CONTEXT, DEFAULT_MODEL_LIST_URL, }; ================================================ FILE: gpt4all-bindings/typescript/src/gpt4all.js ================================================ "use strict"; /// This file implements the gpt4all.d.ts file endings. /// Written in commonjs to support both ESM and CJS projects. const { existsSync } = require("node:fs"); const path = require("node:path"); const Stream = require("node:stream"); const assert = require("node:assert"); const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, "..")); const { retrieveModel, downloadModel, appendBinSuffixIfMissing, } = require("./util.js"); const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, DEFAULT_PROMPT_CONTEXT, DEFAULT_MODEL_CONFIG, DEFAULT_MODEL_LIST_URL, } = require("./config.js"); const { InferenceModel, EmbeddingModel } = require("./models.js"); const { ChatSession } = require("./chat-session.js"); /** * Loads a machine learning model with the specified name. The defacto way to create a model. * By default this will download a model from the official GPT4ALL website, if a model is not present at given path. * * @param {string} modelName - The name of the model to load. 
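`loadModel` fills in unspecified options by spreading the caller's options over a literal of defaults (the values below match the implementation in this file), so later keys win:

```javascript
// Sketch of the option-defaulting pattern loadModel uses: user-supplied
// options are spread over a defaults literal, so caller values take priority.
const DEFAULTS = {
    type: "inference",
    allowDownload: true,
    verbose: false,
    device: "cpu",
    nCtx: 2048,
    ngl: 100,
};

function mergeLoadOptions(options = {}) {
    return { ...DEFAULTS, ...options };
}

const merged = mergeLoadOptions({ device: "gpu", nCtx: 4096 });
```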
* @param {import('./gpt4all').LoadModelOptions|undefined} [options] - (Optional) Additional options for loading the model. * @returns {Promise} A promise that resolves to an instance of the loaded LLModel. */ async function loadModel(modelName, options = {}) { const loadOptions = { modelPath: DEFAULT_DIRECTORY, librariesPath: DEFAULT_LIBRARIES_DIRECTORY, type: "inference", allowDownload: true, verbose: false, device: "cpu", nCtx: 2048, ngl: 100, ...options, }; const modelConfig = await retrieveModel(modelName, { modelPath: loadOptions.modelPath, modelConfigFile: loadOptions.modelConfigFile, allowDownload: loadOptions.allowDownload, verbose: loadOptions.verbose, }); assert.ok( typeof loadOptions.librariesPath === "string", "Libraries path should be a string" ); const existingPaths = loadOptions.librariesPath .split(";") .filter(existsSync) .join(";"); const llmOptions = { model_name: appendBinSuffixIfMissing(modelName), model_path: loadOptions.modelPath, library_path: existingPaths, device: loadOptions.device, nCtx: loadOptions.nCtx, ngl: loadOptions.ngl, }; if (loadOptions.verbose) { console.debug("Creating LLModel:", { llmOptions, modelConfig, }); } const llmodel = new LLModel(llmOptions); if (loadOptions.type === "embedding") { return new EmbeddingModel(llmodel, modelConfig); } else if (loadOptions.type === "inference") { return new InferenceModel(llmodel, modelConfig); } else { throw Error("Invalid model type: " + loadOptions.type); } } function createEmbedding(model, text, options={}) { let { dimensionality = undefined, longTextMode = "mean", atlas = false, } = options; if (dimensionality === undefined) { dimensionality = -1; } else { if (dimensionality <= 0) { throw new Error( `Dimensionality must be undefined or a positive integer, got ${dimensionality}` ); } if (dimensionality < model.MIN_DIMENSIONALITY) { console.warn( `Dimensionality ${dimensionality} is less than the suggested minimum of ${model.MIN_DIMENSIONALITY}. 
Performance may be degraded.` ); } } let doMean; switch (longTextMode) { case "mean": doMean = true; break; case "truncate": doMean = false; break; default: throw new Error( `Long text mode must be one of 'mean' or 'truncate', got ${longTextMode}` ); } return model.embed(text, options?.prefix, dimensionality, doMean, atlas); } const defaultCompletionOptions = { verbose: false, ...DEFAULT_PROMPT_CONTEXT, }; async function createCompletion( provider, input, options = defaultCompletionOptions ) { const completionOptions = { ...defaultCompletionOptions, ...options, }; const result = await provider.generate( input, completionOptions, ); return { model: provider.modelName, usage: { prompt_tokens: result.tokensIngested, total_tokens: result.tokensIngested + result.tokensGenerated, completion_tokens: result.tokensGenerated, n_past_tokens: result.nPast, }, choices: [ { message: { role: "assistant", content: result.text, }, // TODO some completion APIs also provide logprobs and finish_reason, could look into adding those }, ], }; } function createCompletionStream( provider, input, options = defaultCompletionOptions ) { const completionStream = new Stream.PassThrough({ encoding: "utf-8", }); const completionPromise = createCompletion(provider, input, { ...options, onResponseToken: (tokenId, token) => { completionStream.push(token); if (options.onResponseToken) { return options.onResponseToken(tokenId, token); } }, }).then((result) => { completionStream.push(null); completionStream.emit("end"); return result; }); return { tokens: completionStream, result: completionPromise, }; } async function* createCompletionGenerator(provider, input, options) { const completion = createCompletionStream(provider, input, options); for await (const chunk of completion.tokens) { yield chunk; } return await completion.result; } module.exports = { DEFAULT_LIBRARIES_DIRECTORY, DEFAULT_DIRECTORY, DEFAULT_PROMPT_CONTEXT, DEFAULT_MODEL_CONFIG, DEFAULT_MODEL_LIST_URL, LLModel, InferenceModel, 
EmbeddingModel, ChatSession, createCompletion, createCompletionStream, createCompletionGenerator, createEmbedding, downloadModel, retrieveModel, loadModel, }; ================================================ FILE: gpt4all-bindings/typescript/src/models.js ================================================ const { DEFAULT_PROMPT_CONTEXT } = require("./config"); const { ChatSession } = require("./chat-session"); const { prepareMessagesForIngest } = require("./util"); class InferenceModel { llm; modelName; config; activeChatSession; constructor(llmodel, config) { this.llm = llmodel; this.config = config; this.modelName = this.llm.name(); } async createChatSession(options) { const chatSession = new ChatSession(this, options); await chatSession.initialize(); this.activeChatSession = chatSession; return this.activeChatSession; } async generate(input, options = DEFAULT_PROMPT_CONTEXT) { const { verbose, ...otherOptions } = options; const promptContext = { promptTemplate: this.config.promptTemplate, temp: otherOptions.temp ?? otherOptions.temperature ?? DEFAULT_PROMPT_CONTEXT.temp, ...otherOptions, }; if (promptContext.nPast < 0) { throw new Error("nPast must be a non-negative integer."); } if (verbose) { console.debug("Generating completion", { input, promptContext, }); } let prompt = input; let nPast = promptContext.nPast; let tokensIngested = 0; if (Array.isArray(input)) { // assuming input is a messages array // -> tailing user message will be used as the final prompt. its required. 
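The message-array rules described in the comments above (an optional leading system message, and a final message that must come from the user) can be checked up front. `validateMessages` below is a hypothetical helper, not part of the library:

```javascript
// Hypothetical validator for the message-array rules described above:
// optional leading system message, final message must be role "user".
function validateMessages(messages) {
    if (messages.length === 0) {
        throw new Error("At least one message is required.");
    }
    if (messages[messages.length - 1].role !== "user") {
        throw new Error("The final message must be of role 'user'.");
    }
    // true when a leading system message will be ingested as the system prompt
    return messages[0].role === "system";
}

const hasSystemPrompt = validateMessages([
    { role: "system", content: "You are an advanced mathematician." },
    { role: "user", content: "What is 2+2?" },
]);
```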
// -> leading system message will be ingested as systemPrompt, further system messages will be ignored // -> all other messages will be ingested with fakeReply // -> model/context will only be kept for this completion; "stateless" nPast = 0; const messages = [...input]; const lastMessage = input[input.length - 1]; if (lastMessage.role !== "user") { // this is most likely a user error throw new Error("The final message must be of role 'user'."); } if (input[0].role === "system") { // needs to be a pre-templated prompt ala '<|im_start|>system\nYou are an advanced mathematician.\n<|im_end|>\n' const systemPrompt = input[0].content; const systemRes = await this.llm.infer(systemPrompt, { promptTemplate: "%1", nPredict: 0, special: true, }); nPast = systemRes.nPast; tokensIngested += systemRes.tokensIngested; messages.shift(); } prompt = lastMessage.content; const messagesToIngest = messages.slice(0, input.length - 1); const turns = prepareMessagesForIngest(messagesToIngest); for (const turn of turns) { const turnRes = await this.llm.infer(turn.user, { ...promptContext, nPast, fakeReply: turn.assistant, }); tokensIngested += turnRes.tokensIngested; nPast = turnRes.nPast; } } let tokensGenerated = 0; const result = await this.llm.infer(prompt, { ...promptContext, nPast, onPromptToken: (tokenId) => { let continueIngestion = true; tokensIngested++; if (options.onPromptToken) { // catch errors because if they go through cpp they will loose stacktraces try { // don't cancel ingestion unless user explicitly returns false continueIngestion = options.onPromptToken(tokenId) !== false; } catch (e) { console.error("Error in onPromptToken callback", e); continueIngestion = false; } } return continueIngestion; }, onResponseToken: (tokenId, token) => { let continueGeneration = true; tokensGenerated++; if (options.onResponseToken) { try { // don't cancel the generation unless user explicitly returns false continueGeneration = options.onResponseToken(tokenId, token) !== false; } catch 
(err) { console.error("Error in onResponseToken callback", err); continueGeneration = false; } } return continueGeneration; }, }); result.tokensGenerated = tokensGenerated; result.tokensIngested = tokensIngested; if (verbose) { console.debug("Finished completion:\n", result); } return result; } dispose() { this.llm.dispose(); } } class EmbeddingModel { llm; config; MIN_DIMENSIONALITY = 64; constructor(llmodel, config) { this.llm = llmodel; this.config = config; } embed(text, prefix, dimensionality, do_mean, atlas) { return this.llm.embed(text, prefix, dimensionality, do_mean, atlas); } dispose() { this.llm.dispose(); } } module.exports = { InferenceModel, EmbeddingModel, }; ================================================ FILE: gpt4all-bindings/typescript/src/util.js ================================================ const { createWriteStream, existsSync, statSync, mkdirSync } = require("node:fs"); const fsp = require("node:fs/promises"); const { performance } = require("node:perf_hooks"); const path = require("node:path"); const md5File = require("md5-file"); const { DEFAULT_DIRECTORY, DEFAULT_MODEL_CONFIG, DEFAULT_MODEL_LIST_URL, } = require("./config.js"); async function listModels( options = { url: DEFAULT_MODEL_LIST_URL, } ) { if (!options || (!options.url && !options.file)) { throw new Error( `No model list source specified. 
Please specify either a url or a file.`
        );
    }
    if (options.file) {
        if (!existsSync(options.file)) {
            throw new Error(`Model list file ${options.file} does not exist.`);
        }
        const fileContents = await fsp.readFile(options.file, "utf-8");
        const modelList = JSON.parse(fileContents);
        return modelList;
    } else if (options.url) {
        const res = await fetch(options.url);
        if (!res.ok) {
            throw Error(
                `Failed to retrieve model list from ${options.url} - ${res.status} ${res.statusText}`
            );
        }
        const modelList = await res.json();
        return modelList;
    }
}

function appendBinSuffixIfMissing(name) {
    const ext = path.extname(name);
    if (![".bin", ".gguf"].includes(ext)) {
        return name + ".gguf";
    }
    return name;
}

function prepareMessagesForIngest(messages) {
    const systemMessages = messages.filter(
        (message) => message.role === "system"
    );
    if (systemMessages.length > 0) {
        console.warn(
            "System messages are currently not supported and will be ignored. Use the systemPrompt option instead."
        );
    }
    const userAssistantMessages = messages.filter(
        (message) => message.role !== "system"
    );
    // make sure the first message is a user message
    // if it's not, the turns will be out of order
    if (userAssistantMessages[0].role !== "user") {
        userAssistantMessages.unshift({
            role: "user",
            content: "",
        });
    }
    // create turns of user input + assistant reply
    const turns = [];
    let userMessage = null;
    let assistantMessage = null;
    for (const message of userAssistantMessages) {
        // consecutive messages of the same role are concatenated into one message
        if (message.role === "user") {
            if (!userMessage) {
                userMessage = message.content;
            } else {
                userMessage += "\n" + message.content;
            }
        } else if (message.role === "assistant") {
            if (!assistantMessage) {
                assistantMessage = message.content;
            } else {
                assistantMessage += "\n" + message.content;
            }
        }
        if (userMessage && assistantMessage) {
            turns.push({
                user: userMessage,
                assistant: assistantMessage,
            });
            userMessage = null;
            assistantMessage = null;
        }
    }
    return turns;
}

// readChunks() reads from the
provided reader and yields the results into an async iterable // https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/ function readChunks(reader) { return { async *[Symbol.asyncIterator]() { let readResult = await reader.read(); while (!readResult.done) { yield readResult.value; readResult = await reader.read(); } }, }; } function downloadModel(modelName, options = {}) { const downloadOptions = { modelPath: DEFAULT_DIRECTORY, verbose: false, ...options, }; const modelFileName = appendBinSuffixIfMissing(modelName); const partialModelPath = path.join( downloadOptions.modelPath, modelName + ".part" ); const finalModelPath = path.join(downloadOptions.modelPath, modelFileName); const modelUrl = downloadOptions.url ?? `https://gpt4all.io/models/gguf/${modelFileName}`; mkdirSync(downloadOptions.modelPath, { recursive: true }); if (existsSync(finalModelPath)) { throw Error(`Model already exists at ${finalModelPath}`); } if (downloadOptions.verbose) { console.debug(`Downloading ${modelName} from ${modelUrl}`); } const headers = { "Accept-Ranges": "arraybuffer", "Response-Type": "arraybuffer", }; const writeStreamOpts = {}; if (existsSync(partialModelPath)) { if (downloadOptions.verbose) { console.debug("Partial model exists, resuming download..."); } const startRange = statSync(partialModelPath).size; headers["Range"] = `bytes=${startRange}-`; writeStreamOpts.flags = "a"; } const abortController = new AbortController(); const signal = abortController.signal; const finalizeDownload = async () => { if (downloadOptions.md5sum) { const fileHash = await md5File(partialModelPath); if (fileHash !== downloadOptions.md5sum) { await fsp.unlink(partialModelPath); const message = `Model "${modelName}" failed verification: Hashes mismatch. 
Expected ${downloadOptions.md5sum}, got ${fileHash}`; throw Error(message); } if (downloadOptions.verbose) { console.debug(`MD5 hash verified: ${fileHash}`); } } await fsp.rename(partialModelPath, finalModelPath); }; // a promise that executes and writes to a stream. Resolves to the path the model was downloaded to when done writing. const downloadPromise = new Promise((resolve, reject) => { let timestampStart; if (downloadOptions.verbose) { console.debug(`Downloading @ ${partialModelPath} ...`); timestampStart = performance.now(); } const writeStream = createWriteStream( partialModelPath, writeStreamOpts ); writeStream.on("error", (e) => { writeStream.close(); reject(e); }); writeStream.on("finish", () => { if (downloadOptions.verbose) { const elapsed = performance.now() - timestampStart; console.log(`Finished. Download took ${elapsed.toFixed(2)} ms`); } finalizeDownload() .then(() => { resolve(finalModelPath); }) .catch(reject); }); fetch(modelUrl, { signal, headers, }) .then((res) => { if (!res.ok) { const message = `Failed to download model from ${modelUrl} - ${res.status} ${res.statusText}`; reject(Error(message)); } return res.body.getReader(); }) .then(async (reader) => { for await (const chunk of readChunks(reader)) { writeStream.write(chunk); } writeStream.end(); }) .catch(reject); }); return { cancel: () => abortController.abort(), promise: downloadPromise, }; } async function retrieveModel(modelName, options = {}) { const retrieveOptions = { modelPath: DEFAULT_DIRECTORY, allowDownload: true, verbose: false, ...options, }; mkdirSync(retrieveOptions.modelPath, { recursive: true }); const modelFileName = appendBinSuffixIfMissing(modelName); const fullModelPath = path.join(retrieveOptions.modelPath, modelFileName); const modelExists = existsSync(fullModelPath); let config = { ...DEFAULT_MODEL_CONFIG }; const availableModels = await listModels({ file: retrieveOptions.modelConfigFile, url: retrieveOptions.allowDownload && 
"https://gpt4all.io/models/models3.json", }); const loadedModelConfig = availableModels.find( (model) => model.filename === modelFileName ); if (loadedModelConfig) { config = { ...config, ...loadedModelConfig, }; } else { // if there's no local modelConfigFile specified, and allowDownload is false, the default model config will be used. // warning the user here because the model may not work as expected. console.warn( `Failed to load model config for ${modelName}. Using defaults.` ); } config.systemPrompt = config.systemPrompt.trim(); if (modelExists) { config.path = fullModelPath; if (retrieveOptions.verbose) { console.debug(`Found ${modelName} at ${fullModelPath}`); } } else if (retrieveOptions.allowDownload) { const downloadController = downloadModel(modelName, { modelPath: retrieveOptions.modelPath, verbose: retrieveOptions.verbose, filesize: config.filesize, url: config.url, md5sum: config.md5sum, }); const downloadPath = await downloadController.promise; config.path = downloadPath; if (retrieveOptions.verbose) { console.debug(`Model downloaded to ${downloadPath}`); } } else { throw Error("Failed to retrieve model."); } return config; } module.exports = { appendBinSuffixIfMissing, prepareMessagesForIngest, downloadModel, retrieveModel, listModels, }; ================================================ FILE: gpt4all-bindings/typescript/test/gpt4all.test.js ================================================ const path = require("node:path"); const os = require("node:os"); const fsp = require("node:fs/promises"); const { existsSync } = require('node:fs'); const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, "..")); const { listModels, downloadModel, appendBinSuffixIfMissing, } = require("../src/util.js"); const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, DEFAULT_MODEL_LIST_URL, } = require("../src/config.js"); const { loadModel, createPrompt, createCompletion, } = require("../src/gpt4all.js"); describe("config", () => { test("default paths 
constants are available and correct", () => { expect(DEFAULT_DIRECTORY).toBe( path.resolve(os.homedir(), ".cache/gpt4all") ); const paths = [ path.join(DEFAULT_DIRECTORY, "libraries"), path.resolve("./libraries"), path.resolve( __dirname, "..", `runtimes/${process.platform}-${process.arch}/native` ), path.resolve( __dirname, "..", `runtimes/${process.platform}/native`, ), process.cwd(), ]; expect(typeof DEFAULT_LIBRARIES_DIRECTORY).toBe("string"); expect(DEFAULT_LIBRARIES_DIRECTORY).toBe(paths.join(";")); }); }); describe("listModels", () => { const fakeModels = require("./models.json"); const fakeModel = fakeModels[0]; const mockResponse = JSON.stringify([fakeModel]); let mockFetch, originalFetch; beforeAll(() => { // Mock the fetch function for all tests mockFetch = jest.fn().mockResolvedValue({ ok: true, json: () => JSON.parse(mockResponse), }); originalFetch = global.fetch; global.fetch = mockFetch; }); afterEach(() => { // Reset the fetch counter after each test mockFetch.mockClear(); }); afterAll(() => { // Restore fetch global.fetch = originalFetch; }); it("should load the model list from remote when called without args", async () => { const models = await listModels(); expect(fetch).toHaveBeenCalledTimes(1); expect(fetch).toHaveBeenCalledWith(DEFAULT_MODEL_LIST_URL); expect(models[0]).toEqual(fakeModel); }); it("should load the model list from a local file, if specified", async () => { const models = await listModels({ file: path.resolve(__dirname, "models.json"), }); expect(fetch).toHaveBeenCalledTimes(0); expect(models[0]).toEqual(fakeModel); }); it("should throw an error if neither url nor file is specified", async () => { await expect(listModels(null)).rejects.toThrow( "No model list source specified. Please specify either a url or a file." 
); }); }); describe("appendBinSuffixIfMissing", () => { it("should make sure the suffix is there", () => { expect(appendBinSuffixIfMissing("filename")).toBe("filename.gguf"); expect(appendBinSuffixIfMissing("filename.bin")).toBe("filename.bin"); }); }); describe("downloadModel", () => { let mockAbortController, mockFetch; const fakeModelName = "fake-model"; const createMockFetch = () => { const mockData = new Uint8Array([1, 2, 3, 4]); const mockResponse = new ReadableStream({ start(controller) { controller.enqueue(mockData); controller.close(); }, }); const mockFetchImplementation = jest.fn(() => Promise.resolve({ ok: true, body: mockResponse, }) ); return mockFetchImplementation; }; beforeEach(async () => { // Mocking the AbortController constructor mockAbortController = jest.fn(); global.AbortController = mockAbortController; mockAbortController.mockReturnValue({ signal: "signal", abort: jest.fn(), }); mockFetch = createMockFetch(); jest.spyOn(global, "fetch").mockImplementation(mockFetch); }); afterEach(async () => { // Clean up mocks mockAbortController.mockReset(); mockFetch.mockClear(); global.fetch.mockRestore(); const rootDefaultPath = path.resolve(DEFAULT_DIRECTORY), partialPath = path.resolve(rootDefaultPath, fakeModelName+'.part'), fullPath = path.resolve(rootDefaultPath, fakeModelName+'.bin') //if tests fail, remove the created files // acts as cleanup if tests fail // if(existsSync(fullPath)) { await fsp.rm(fullPath) } if(existsSync(partialPath)) { await fsp.rm(partialPath) } }); test("should successfully download a model file", async () => { const downloadController = downloadModel(fakeModelName); const modelFilePath = await downloadController.promise; expect(modelFilePath).toBe(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.gguf`)); expect(global.fetch).toHaveBeenCalledTimes(1); expect(global.fetch).toHaveBeenCalledWith( "https://gpt4all.io/models/gguf/fake-model.gguf", { signal: "signal", headers: { "Accept-Ranges": "arraybuffer", 
"Response-Type": "arraybuffer", }, } ); // final model file should be present await expect(fsp.access(modelFilePath)).resolves.not.toThrow(); // remove the testing model file await fsp.unlink(modelFilePath); }); test("should error and cleanup if md5sum is not matching", async () => { const downloadController = downloadModel(fakeModelName, { md5sum: "wrong-md5sum", }); // the promise should reject with a mismatch await expect(downloadController.promise).rejects.toThrow( `Model "fake-model" failed verification: Hashes mismatch. Expected wrong-md5sum, got 08d6c05a21512a79a1dfeb9d2a8f262f` ); // fetch should have been called expect(global.fetch).toHaveBeenCalledTimes(1); // the file should be missing await expect( fsp.access(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.gguf`)) ).rejects.toThrow(); // partial file should also be missing await expect( fsp.access(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.part`)) ).rejects.toThrow(); }); // TODO // test("should be able to cancel and resume a download", async () => { // }); }); ================================================ FILE: gpt4all-bindings/typescript/test/models.json ================================================ [ { "order": "a", "md5sum": "08d6c05a21512a79a1dfeb9d2a8f262f", "name": "Not a real model", "filename": "fake-model.gguf", "filesize": "4", "systemPrompt": " " } ] ================================================ FILE: gpt4all-chat/.flake8 ================================================ # vim: set syntax=dosini: [flake8] exclude = .*,__pycache__ max-line-length = 120 extend-ignore = 
B001,C408,D,DAR,E221,E303,E722,E741,E800,N801,N806,P101,S101,S324,S404,S406,S410,S603,WPS100,WPS110,WPS111,WPS113,WPS114,WPS115,WPS120,WPS2,WPS300,WPS301,WPS304,WPS305,WPS306,WPS309,WPS316,WPS317,WPS318,WPS319,WPS322,WPS323,WPS326,WPS329,WPS330,WPS332,WPS336,WPS337,WPS347,WPS360,WPS361,WPS407,WPS414,WPS420,WPS421,WPS429,WPS430,WPS431,WPS432,WPS433,WPS437,WPS440,WPS440,WPS441,WPS442,WPS457,WPS458,WPS460,WPS462,WPS463,WPS473,WPS501,WPS504,WPS505,WPS508,WPS509,WPS510,WPS515,WPS516,WPS519,WPS520,WPS529,WPS531,WPS602,WPS604,WPS605,WPS608,WPS609,WPS613,WPS615 ================================================ FILE: gpt4all-chat/CHANGELOG.md ================================================ # Changelog All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). ## [3.10.0] - 2025-02-24 ### Added - Whitelist Granite (non-MoE) model architecture (by [@ThiloteE](https://github.com/ThiloteE) in [#3487](https://github.com/nomic-ai/gpt4all/pull/3487)) - Add support for CUDA compute 5.0 GPUs such as the GTX 750 ([#3499](https://github.com/nomic-ai/gpt4all/pull/3499)) - Add a Remote Providers tab to the Add Model page ([#3506](https://github.com/nomic-ai/gpt4all/pull/3506)) ### Changed - Substitute prettier default templates for OLMoE 7B 0924/0125 and Granite 3.1 3B/8B (by [@ThiloteE](https://github.com/ThiloteE) in [#3471](https://github.com/nomic-ai/gpt4all/pull/3471)) - Build with LLVM Clang 19 on macOS and Ubuntu ([#3500](https://github.com/nomic-ai/gpt4all/pull/3500)) ### Fixed - Fix several potential crashes ([#3465](https://github.com/nomic-ai/gpt4all/pull/3465)) - Fix visual spacing issues with deepseek models ([#3470](https://github.com/nomic-ai/gpt4all/pull/3470)) - Add missing strings to Italian translation (by [@Harvester62](https://github.com/Harvester62) in [#3496](https://github.com/nomic-ai/gpt4all/pull/3496)) - Update Simplified Chinese translation (by 
[@Junior2Ran](https://github.com/Junior2Ran) in [#3467](https://github.com/nomic-ai/gpt4all/pull/3467)) ## [3.9.0] - 2025-02-04 ### Added - Whitelist OLMoE and Granite MoE model architectures (no Vulkan) (by [@ThiloteE](https://github.com/ThiloteE) in [#3449](https://github.com/nomic-ai/gpt4all/pull/3449)) ### Fixed - Fix "index N is not a prompt" when using LocalDocs with reasoning ([#3451](https://github.com/nomic-ai/gpt4all/pull/3451)) - Work around rendering artifacts on Snapdragon SoCs with Windows ([#3450](https://github.com/nomic-ai/gpt4all/pull/3450)) - Prevent DeepSeek-R1 reasoning from appearing in chat names and follow-up questions ([#3458](https://github.com/nomic-ai/gpt4all/pull/3458)) - Fix LocalDocs crash on Windows ARM when reading PDFs ([#3460](https://github.com/nomic-ai/gpt4all/pull/3460)) - Fix UI freeze when chat template is `{#` ([#3446](https://github.com/nomic-ai/gpt4all/pull/3446)) ## [3.8.0] - 2025-01-30 ### Added - Support DeepSeek-R1 Qwen models ([#3431](https://github.com/nomic-ai/gpt4all/pull/3431)) - Support for think tags in the GUI ([#3440](https://github.com/nomic-ai/gpt4all/pull/3440)) - Support specifying SHA256 hash in models3.json instead of MD5 ([#3437](https://github.com/nomic-ai/gpt4all/pull/3437)) ### Changed - Use minja instead of Jinja2Cpp for significantly improved template compatibility ([#3433](https://github.com/nomic-ai/gpt4all/pull/3433)) ### Fixed - Fix regression while using localdocs with server API ([#3410](https://github.com/nomic-ai/gpt4all/pull/3410)) - Don't show system messages in server chat view ([#3411](https://github.com/nomic-ai/gpt4all/pull/3411)) - Fix `codesign --verify` failure on macOS ([#3413](https://github.com/nomic-ai/gpt4all/pull/3413)) - Code Interpreter: Fix console.log not accepting a single string after v3.7.0 ([#3426](https://github.com/nomic-ai/gpt4all/pull/3426)) - Fix Phi 3.1 Mini 128K Instruct template (by [@ThiloteE](https://github.com/ThiloteE) in
[#3412](https://github.com/nomic-ai/gpt4all/pull/3412)) - Don't block the GUI thread for reasoning ([#3435](https://github.com/nomic-ai/gpt4all/pull/3435)) - Fix corruption of Unicode in output of reasoning models ([#3443](https://github.com/nomic-ai/gpt4all/pull/3443)) ## [3.7.0] - 2025-01-21 ### Added - Add support for the Windows ARM64 target platform (CPU-only) ([#3385](https://github.com/nomic-ai/gpt4all/pull/3385)) ### Changed - Update from Qt 6.5.1 to 6.8.1 ([#3386](https://github.com/nomic-ai/gpt4all/pull/3386)) ### Fixed - Fix the timeout error in code interpreter ([#3369](https://github.com/nomic-ai/gpt4all/pull/3369)) - Fix code interpreter console.log not accepting multiple arguments ([#3371](https://github.com/nomic-ai/gpt4all/pull/3371)) - Remove 'X is defined' checks from templates for better compatibility ([#3372](https://github.com/nomic-ai/gpt4all/pull/3372)) - Jinja2Cpp: Add 'if' requirement for 'else' parsing to fix crash ([#3373](https://github.com/nomic-ai/gpt4all/pull/3373)) - Save chats on quit, even if the window isn't closed first ([#3387](https://github.com/nomic-ai/gpt4all/pull/3387)) - Add chat template replacements for five new models and fix EM German Mistral ([#3393](https://github.com/nomic-ai/gpt4all/pull/3393)) - Fix crash when entering `{{ a["foo"(` as chat template ([#3394](https://github.com/nomic-ai/gpt4all/pull/3394)) - Sign the maintenance tool on macOS to prevent crash on Sequoia ([#3391](https://github.com/nomic-ai/gpt4all/pull/3391)) - Jinja2Cpp: Fix operator precedence in 'not X is defined' ([#3402](https://github.com/nomic-ai/gpt4all/pull/3402)) ## [3.6.1] - 2024-12-20 ### Fixed - Fix the stop generation button no longer working in v3.6.0 ([#3336](https://github.com/nomic-ai/gpt4all/pull/3336)) - Fix the copy entire conversation button no longer working in v3.6.0 ([#3336](https://github.com/nomic-ai/gpt4all/pull/3336)) ## [3.6.0] - 2024-12-19 ### Added - Automatically substitute chat templates that are not compatible
with Jinja2Cpp in GGUFs ([#3327](https://github.com/nomic-ai/gpt4all/pull/3327)) - Built-in JavaScript code interpreter tool plus model ([#3173](https://github.com/nomic-ai/gpt4all/pull/3173)) ### Fixed - Fix remote model template to allow for XML in messages ([#3318](https://github.com/nomic-ai/gpt4all/pull/3318)) - Fix Jinja2Cpp bug that broke system message detection in chat templates ([#3325](https://github.com/nomic-ai/gpt4all/pull/3325)) - Fix LocalDocs sources displaying in unconsolidated form after v3.5.0 ([#3328](https://github.com/nomic-ai/gpt4all/pull/3328)) ## [3.5.3] - 2024-12-16 ### Fixed - Fix LocalDocs not using information from sources in v3.5.2 ([#3302](https://github.com/nomic-ai/gpt4all/pull/3302)) ## [3.5.2] - 2024-12-13 ### Added - Create separate download pages for built-in and HuggingFace models ([#3269](https://github.com/nomic-ai/gpt4all/pull/3269)) ### Fixed - Fix API server ignoring assistant messages in history after v3.5.0 ([#3256](https://github.com/nomic-ai/gpt4all/pull/3256)) - Fix API server replying with incorrect token counts and stop reason after v3.5.0 ([#3256](https://github.com/nomic-ai/gpt4all/pull/3256)) - Fix API server remembering previous, unrelated conversations after v3.5.0 ([#3256](https://github.com/nomic-ai/gpt4all/pull/3256)) - Fix mishandling of default chat template and system message of cloned models in v3.5.0 ([#3262](https://github.com/nomic-ai/gpt4all/pull/3262)) - Fix untranslated text on the startup dialog ([#3293](https://github.com/nomic-ai/gpt4all/pull/3293)) ## [3.5.1] - 2024-12-10 ### Fixed - Fix an incorrect value for currentResponse ([#3245](https://github.com/nomic-ai/gpt4all/pull/3245)) - Fix the default model button so it works again after 3.5.0 ([#3246](https://github.com/nomic-ai/gpt4all/pull/3246)) - Fix chat templates for Nous Hermes 2 Mistral, Mistral OpenOrca, Qwen 2, and remote models ([#3250](https://github.com/nomic-ai/gpt4all/pull/3250)) - Fix chat templates for Llama 3.2 models
([#3251](https://github.com/nomic-ai/gpt4all/pull/3251)) ## [3.5.0] - 2024-12-09 ### Changed - Update Italian translation (by [@Harvester62](https://github.com/Harvester62) in [#3236](https://github.com/nomic-ai/gpt4all/pull/3236)) - Update Romanian translation (by [@SINAPSA-IC](https://github.com/SINAPSA-IC) in [#3232](https://github.com/nomic-ai/gpt4all/pull/3232)) ### Fixed - Fix a few more problems with the Jinja changes ([#3239](https://github.com/nomic-ai/gpt4all/pull/3239)) ## [3.5.0-rc2] - 2024-12-06 ### Changed - Fade messages out with an animation when they are removed from the chat view ([#3227](https://github.com/nomic-ai/gpt4all/pull/3227)) - Tweak wording of edit/redo confirmation dialogs ([#3228](https://github.com/nomic-ai/gpt4all/pull/3228)) - Make edit/redo buttons disabled instead of invisible when they are temporarily unavailable ([#3228](https://github.com/nomic-ai/gpt4all/pull/3228)) ## [3.5.0-rc1] - 2024-12-04 ### Added - Add ability to attach text, markdown, and rst files to chat ([#3135](https://github.com/nomic-ai/gpt4all/pull/3135)) - Add feature to minimize to system tray (by [@bgallois](https://github.com/bgallois) in [#3109](https://github.com/nomic-ai/gpt4all/pull/3109)) - Basic cache for faster prefill when the input shares a prefix with previous context ([#3073](https://github.com/nomic-ai/gpt4all/pull/3073)) - Add ability to edit prompts and regenerate any response ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) ### Changed - Implement Qt 6.8 compatibility ([#3121](https://github.com/nomic-ai/gpt4all/pull/3121)) - Use Jinja for chat templates instead of per-message QString.arg-style templates ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) - API server: Use system message(s) from client instead of settings ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) - API server: Accept messages in any order supported by the model instead of requiring user/assistant pairs 
([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) - Remote models: Pass system message with "system" role instead of joining with user message ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) ### Removed - Remove option to save binary model state to disk ([#3147](https://github.com/nomic-ai/gpt4all/pull/3147)) ### Fixed - Fix bug in GUI when localdocs encounters binary data ([#3137](https://github.com/nomic-ai/gpt4all/pull/3137)) - Fix LocalDocs bugs that prevented some docx files from fully chunking ([#3140](https://github.com/nomic-ai/gpt4all/pull/3140)) - Fix missing softmax that was causing crashes and effectively infinite temperature since 3.4.0 ([#3202](https://github.com/nomic-ai/gpt4all/pull/3202)) ## [3.4.2] - 2024-10-16 ### Fixed - Limit bm25 retrieval to only specified collections ([#3083](https://github.com/nomic-ai/gpt4all/pull/3083)) - Fix bug removing documents because of a wrong case-sensitive file suffix check ([#3083](https://github.com/nomic-ai/gpt4all/pull/3083)) - Fix bug with hybrid localdocs search where database would get out of sync ([#3083](https://github.com/nomic-ai/gpt4all/pull/3083)) - Fix GUI bug where the localdocs embedding device appears blank ([#3083](https://github.com/nomic-ai/gpt4all/pull/3083)) - Prevent LocalDocs from not making progress in certain cases ([#3094](https://github.com/nomic-ai/gpt4all/pull/3094)) ## [3.4.1] - 2024-10-11 ### Fixed - Improve the Italian translation ([#3048](https://github.com/nomic-ai/gpt4all/pull/3048)) - Fix models.json cache location ([#3052](https://github.com/nomic-ai/gpt4all/pull/3052)) - Fix LocalDocs regressions caused by docx change ([#3079](https://github.com/nomic-ai/gpt4all/pull/3079)) - Fix Go code being highlighted as Java ([#3080](https://github.com/nomic-ai/gpt4all/pull/3080)) ## [3.4.0] - 2024-10-08 ### Added - Add bm25 hybrid search to localdocs ([#2969](https://github.com/nomic-ai/gpt4all/pull/2969)) - LocalDocs support for .docx files
([#2986](https://github.com/nomic-ai/gpt4all/pull/2986)) - Add support for attaching Excel spreadsheet to chat ([#3007](https://github.com/nomic-ai/gpt4all/pull/3007), [#3028](https://github.com/nomic-ai/gpt4all/pull/3028)) ### Changed - Rebase llama.cpp on latest upstream as of September 26th ([#2998](https://github.com/nomic-ai/gpt4all/pull/2998)) - Change the error message when a message is too long ([#3004](https://github.com/nomic-ai/gpt4all/pull/3004)) - Simplify chatmodel to get rid of unnecessary field and bump chat version ([#3016](https://github.com/nomic-ai/gpt4all/pull/3016)) - Allow ChatLLM to have direct access to ChatModel for restoring state from text ([#3018](https://github.com/nomic-ai/gpt4all/pull/3018)) - Improvements to XLSX conversion and UI fix ([#3022](https://github.com/nomic-ai/gpt4all/pull/3022)) ### Fixed - Fix a crash when attempting to continue a chat loaded from disk ([#2995](https://github.com/nomic-ai/gpt4all/pull/2995)) - Fix the local server rejecting min\_p/top\_p less than 1 ([#2996](https://github.com/nomic-ai/gpt4all/pull/2996)) - Fix "regenerate" always forgetting the most recent message ([#3011](https://github.com/nomic-ai/gpt4all/pull/3011)) - Fix loaded chats forgetting context when there is a system prompt ([#3015](https://github.com/nomic-ai/gpt4all/pull/3015)) - Make it possible to downgrade and keep some chats, and avoid crash for some model types ([#3030](https://github.com/nomic-ai/gpt4all/pull/3030)) - Fix scroll position being reset in model view, and attempt a better fix for the clone issue ([#3042](https://github.com/nomic-ai/gpt4all/pull/3042)) ## [3.3.1] - 2024-09-27 ([v3.3.y](https://github.com/nomic-ai/gpt4all/tree/v3.3.y)) ### Fixed - Fix a crash when attempting to continue a chat loaded from disk ([#2995](https://github.com/nomic-ai/gpt4all/pull/2995)) - Fix the local server rejecting min\_p/top\_p less than 1 ([#2996](https://github.com/nomic-ai/gpt4all/pull/2996)) ## [3.3.0] - 2024-09-20 ### Added - Use
greedy sampling when temperature is set to zero ([#2854](https://github.com/nomic-ai/gpt4all/pull/2854)) - Use configured system prompt in server mode and ignore system messages ([#2921](https://github.com/nomic-ai/gpt4all/pull/2921), [#2924](https://github.com/nomic-ai/gpt4all/pull/2924)) - Add more system information to anonymous usage stats ([#2939](https://github.com/nomic-ai/gpt4all/pull/2939)) - Check for unsupported Ubuntu and macOS versions at install time ([#2940](https://github.com/nomic-ai/gpt4all/pull/2940)) ### Changed - The offline update button now directs users to the offline installer releases page. (by [@3Simplex](https://github.com/3Simplex) in [#2888](https://github.com/nomic-ai/gpt4all/pull/2888)) - Change the website link on the home page to point to the new URL ([#2915](https://github.com/nomic-ai/gpt4all/pull/2915)) - Smaller default window size, dynamic minimum size, and scaling tweaks ([#2904](https://github.com/nomic-ai/gpt4all/pull/2904)) - Only allow a single instance of program to be run at a time ([#2923](https://github.com/nomic-ai/gpt4all/pull/2923)) ### Fixed - Bring back "Auto" option for Embeddings Device as "Application default," which went missing in v3.1.0 ([#2873](https://github.com/nomic-ai/gpt4all/pull/2873)) - Correct a few strings in the Italian translation (by [@Harvester62](https://github.com/Harvester62) in [#2872](https://github.com/nomic-ai/gpt4all/pull/2872) and [#2909](https://github.com/nomic-ai/gpt4all/pull/2909)) - Correct typos in Traditional Chinese translation (by [@supersonictw](https://github.com/supersonictw) in [#2852](https://github.com/nomic-ai/gpt4all/pull/2852)) - Set the window icon on Linux ([#2880](https://github.com/nomic-ai/gpt4all/pull/2880)) - Corrections to the Romanian translation (by [@SINAPSA-IC](https://github.com/SINAPSA-IC) in [#2890](https://github.com/nomic-ai/gpt4all/pull/2890)) - Fix singular/plural forms of LocalDocs "x Sources" (by [@cosmic-snow](https://github.com/cosmic-snow) in
[#2885](https://github.com/nomic-ai/gpt4all/pull/2885)) - Fix a typo in Model Settings (by [@3Simplex](https://github.com/3Simplex) in [#2916](https://github.com/nomic-ai/gpt4all/pull/2916)) - Fix the antenna icon tooltip when using the local server ([#2922](https://github.com/nomic-ai/gpt4all/pull/2922)) - Fix a few issues with locating files and handling errors when loading remote models on startup ([#2875](https://github.com/nomic-ai/gpt4all/pull/2875)) - Significantly improve API server request parsing and response correctness ([#2929](https://github.com/nomic-ai/gpt4all/pull/2929)) - Remove unnecessary dependency on Qt WaylandCompositor module ([#2949](https://github.com/nomic-ai/gpt4all/pull/2949)) - Update translations ([#2970](https://github.com/nomic-ai/gpt4all/pull/2970)) - Fix macOS installer and remove extra installed copy of Nomic Embed ([#2973](https://github.com/nomic-ai/gpt4all/pull/2973)) ## [3.2.1] - 2024-08-13 ### Fixed - Do not initialize Vulkan driver when only using CPU ([#2843](https://github.com/nomic-ai/gpt4all/pull/2843)) - Fix a potential crash on exit when using only CPU on Linux with NVIDIA (does not affect X11) ([#2843](https://github.com/nomic-ai/gpt4all/pull/2843)) - Fix default CUDA architecture list after [#2802](https://github.com/nomic-ai/gpt4all/pull/2802) ([#2855](https://github.com/nomic-ai/gpt4all/pull/2855)) ## [3.2.0] - 2024-08-12 ### Added - Add Qwen2-1.5B-Instruct to models3.json (by [@ThiloteE](https://github.com/ThiloteE) in [#2759](https://github.com/nomic-ai/gpt4all/pull/2759)) - Enable translation feature for seven languages: English, Spanish, Italian, Portuguese, Chinese Simplified, Chinese Traditional, Romanian ([#2830](https://github.com/nomic-ai/gpt4all/pull/2830)) ### Changed - Add missing entries to Italian translation (by [@Harvester62](https://github.com/Harvester62) in [#2783](https://github.com/nomic-ai/gpt4all/pull/2783)) - Use llama\_kv\_cache ops to shift context faster
([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) - Don't stop generating at end of context ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) ### Fixed - Case-insensitive LocalDocs source icon detection (by [@cosmic-snow](https://github.com/cosmic-snow) in [#2761](https://github.com/nomic-ai/gpt4all/pull/2761)) - Fix comparison of pre- and post-release versions for update check and models3.json ([#2762](https://github.com/nomic-ai/gpt4all/pull/2762), [#2772](https://github.com/nomic-ai/gpt4all/pull/2772)) - Fix several backend issues ([#2778](https://github.com/nomic-ai/gpt4all/pull/2778)) - Restore leading space removal logic that was incorrectly removed in [#2694](https://github.com/nomic-ai/gpt4all/pull/2694) - CUDA: Cherry-pick llama.cpp DMMV cols requirement fix that caused a crash with long conversations since [#2694](https://github.com/nomic-ai/gpt4all/pull/2694) - Make reverse prompt detection work more reliably and prevent it from breaking output ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) - Disallow context shift for chat name and follow-up generation to prevent bugs ([#2781](https://github.com/nomic-ai/gpt4all/pull/2781)) - Explicitly target macOS 12.6 in CI to fix Metal compatibility on older macOS ([#2846](https://github.com/nomic-ai/gpt4all/pull/2846)) ## [3.1.1] - 2024-07-27 ### Added - Add Llama 3.1 8B Instruct to models3.json (by [@3Simplex](https://github.com/3Simplex) in [#2731](https://github.com/nomic-ai/gpt4all/pull/2731) and [#2732](https://github.com/nomic-ai/gpt4all/pull/2732)) - Portuguese (BR) translation (by [thiagojramos](https://github.com/thiagojramos) in [#2733](https://github.com/nomic-ai/gpt4all/pull/2733)) - Support adding arbitrary OpenAI-compatible models by URL (by [@supersonictw](https://github.com/supersonictw) in [#2683](https://github.com/nomic-ai/gpt4all/pull/2683)) - Support Llama 3.1 RoPE scaling ([#2758](https://github.com/nomic-ai/gpt4all/pull/2758)) ### Changed - Add missing entries to 
Chinese (Simplified) translation (by [wuodoo](https://github.com/wuodoo) in [#2716](https://github.com/nomic-ai/gpt4all/pull/2716) and [#2749](https://github.com/nomic-ai/gpt4all/pull/2749)) - Update translation files and add missing paths to CMakeLists.txt ([#2735](https://github.com/nomic-ai/gpt4all/pull/2735)) ## [3.1.0] - 2024-07-24 ### Added - Generate suggested follow-up questions ([#2634](https://github.com/nomic-ai/gpt4all/pull/2634), [#2723](https://github.com/nomic-ai/gpt4all/pull/2723)) - Also add options for the chat name and follow-up question prompt templates - Scaffolding for translations ([#2612](https://github.com/nomic-ai/gpt4all/pull/2612)) - Spanish (MX) translation (by [@jstayco](https://github.com/jstayco) in [#2654](https://github.com/nomic-ai/gpt4all/pull/2654)) - Chinese (Simplified) translation by mikage ([#2657](https://github.com/nomic-ai/gpt4all/pull/2657)) - Dynamic changes of language and locale at runtime ([#2659](https://github.com/nomic-ai/gpt4all/pull/2659), [#2677](https://github.com/nomic-ai/gpt4all/pull/2677)) - Romanian translation by [@SINAPSA\_IC](https://github.com/SINAPSA_IC) ([#2662](https://github.com/nomic-ai/gpt4all/pull/2662)) - Chinese (Traditional) translation (by [@supersonictw](https://github.com/supersonictw) in [#2661](https://github.com/nomic-ai/gpt4all/pull/2661)) - Italian translation (by [@Harvester62](https://github.com/Harvester62) in [#2700](https://github.com/nomic-ai/gpt4all/pull/2700)) ### Changed - Customize combo boxes and context menus to fit the new style ([#2535](https://github.com/nomic-ai/gpt4all/pull/2535)) - Improve view bar scaling and Model Settings layout ([#2520](https://github.com/nomic-ai/gpt4all/pull/2520)) - Make the logo spin while the model is generating ([#2557](https://github.com/nomic-ai/gpt4all/pull/2557)) - Server: Reply to wrong GET/POST method with HTTP 405 instead of 404 (by [@cosmic-snow](https://github.com/cosmic-snow) in [#2615](https://github.com/nomic-ai/gpt4all/pull/2615)) -
Update theme for menus (by [@3Simplex](https://github.com/3Simplex) in [#2578](https://github.com/nomic-ai/gpt4all/pull/2578)) - Move the "stop" button to the message box ([#2561](https://github.com/nomic-ai/gpt4all/pull/2561)) - Build with CUDA 11.8 for better compatibility ([#2639](https://github.com/nomic-ai/gpt4all/pull/2639)) - Make links in latest news section clickable ([#2643](https://github.com/nomic-ai/gpt4all/pull/2643)) - Support translation of settings choices ([#2667](https://github.com/nomic-ai/gpt4all/pull/2667), [#2690](https://github.com/nomic-ai/gpt4all/pull/2690)) - Improve LocalDocs view's error message (by @cosmic-snow in [#2679](https://github.com/nomic-ai/gpt4all/pull/2679)) - Ignore case of LocalDocs file extensions ([#2642](https://github.com/nomic-ai/gpt4all/pull/2642), [#2684](https://github.com/nomic-ai/gpt4all/pull/2684)) - Update llama.cpp to commit 87e397d00 from July 19th ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694), [#2702](https://github.com/nomic-ai/gpt4all/pull/2702)) - Add support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support) - Add support for DeepSeek-V2 architecture (no Vulkan support) - Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo - Show scrollbar in chat collections list as needed (by [@cosmic-snow](https://github.com/cosmic-snow) in [#2691](https://github.com/nomic-ai/gpt4all/pull/2691)) ### Removed - Remove support for GPT-J models ([#2676](https://github.com/nomic-ai/gpt4all/pull/2676), [#2693](https://github.com/nomic-ai/gpt4all/pull/2693)) ### Fixed - Fix placement of thumbs-down and datalake opt-in dialogs ([#2540](https://github.com/nomic-ai/gpt4all/pull/2540)) - Select the correct folder with the Linux fallback folder dialog ([#2541](https://github.com/nomic-ai/gpt4all/pull/2541)) - Fix clone button sometimes producing blank model info ([#2545](https://github.com/nomic-ai/gpt4all/pull/2545)) - Fix jerky chat view scrolling 
([#2555](https://github.com/nomic-ai/gpt4all/pull/2555)) - Fix "reload" showing for chats with missing models ([#2520](https://github.com/nomic-ai/gpt4all/pull/2520)) - Fix property binding loop warning ([#2601](https://github.com/nomic-ai/gpt4all/pull/2601)) - Fix UI hang with certain chat view content ([#2543](https://github.com/nomic-ai/gpt4all/pull/2543)) - Fix crash when Kompute falls back to CPU ([#2640](https://github.com/nomic-ai/gpt4all/pull/2640)) - Fix several Vulkan resource management issues ([#2694](https://github.com/nomic-ai/gpt4all/pull/2694)) - Fix crash/hang when some models stop generating, by showing special tokens ([#2701](https://github.com/nomic-ai/gpt4all/pull/2701)) [3.10.0]: https://github.com/nomic-ai/gpt4all/compare/v3.9.0...v3.10.0 [3.9.0]: https://github.com/nomic-ai/gpt4all/compare/v3.8.0...v3.9.0 [3.8.0]: https://github.com/nomic-ai/gpt4all/compare/v3.7.0...v3.8.0 [3.7.0]: https://github.com/nomic-ai/gpt4all/compare/v3.6.1...v3.7.0 [3.6.1]: https://github.com/nomic-ai/gpt4all/compare/v3.6.0...v3.6.1 [3.6.0]: https://github.com/nomic-ai/gpt4all/compare/v3.5.3...v3.6.0 [3.5.3]: https://github.com/nomic-ai/gpt4all/compare/v3.5.2...v3.5.3 [3.5.2]: https://github.com/nomic-ai/gpt4all/compare/v3.5.1...v3.5.2 [3.5.1]: https://github.com/nomic-ai/gpt4all/compare/v3.5.0...v3.5.1 [3.5.0]: https://github.com/nomic-ai/gpt4all/compare/v3.5.0-rc2...v3.5.0 [3.5.0-rc2]: https://github.com/nomic-ai/gpt4all/compare/v3.5.0-rc1...v3.5.0-rc2 [3.5.0-rc1]: https://github.com/nomic-ai/gpt4all/compare/v3.4.2...v3.5.0-rc1 [3.4.2]: https://github.com/nomic-ai/gpt4all/compare/v3.4.1...v3.4.2 [3.4.1]: https://github.com/nomic-ai/gpt4all/compare/v3.4.0...v3.4.1 [3.4.0]: https://github.com/nomic-ai/gpt4all/compare/v3.3.0...v3.4.0 [3.3.1]: https://github.com/nomic-ai/gpt4all/compare/v3.3.0...v3.3.1 [3.3.0]: https://github.com/nomic-ai/gpt4all/compare/v3.2.1...v3.3.0 [3.2.1]: https://github.com/nomic-ai/gpt4all/compare/v3.2.0...v3.2.1 [3.2.0]:
https://github.com/nomic-ai/gpt4all/compare/v3.1.1...v3.2.0 [3.1.1]: https://github.com/nomic-ai/gpt4all/compare/v3.1.0...v3.1.1 [3.1.0]: https://github.com/nomic-ai/gpt4all/compare/v3.0.0...v3.1.0 ================================================ FILE: gpt4all-chat/CMakeLists.txt ================================================ cmake_minimum_required(VERSION 3.25) # for try_compile SOURCE_FROM_VAR include(../common/common.cmake) set(APP_VERSION_MAJOR 3) set(APP_VERSION_MINOR 10) set(APP_VERSION_PATCH 1) set(APP_VERSION_BASE "${APP_VERSION_MAJOR}.${APP_VERSION_MINOR}.${APP_VERSION_PATCH}") set(APP_VERSION "${APP_VERSION_BASE}-dev0") project(gpt4all VERSION ${APP_VERSION_BASE} LANGUAGES CXX C) if (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT) set(CMAKE_INSTALL_PREFIX ${CMAKE_BINARY_DIR}/install CACHE PATH "..." FORCE) endif() if(APPLE) option(BUILD_UNIVERSAL "Build a Universal binary on macOS" OFF) if(BUILD_UNIVERSAL) # Build a Universal binary on macOS # This requires that the found Qt library is compiled as Universal binaries. set(CMAKE_OSX_ARCHITECTURES "arm64;x86_64" CACHE STRING "" FORCE) else() # Build for the host architecture on macOS set(CMAKE_OSX_ARCHITECTURES "${CMAKE_HOST_SYSTEM_PROCESSOR}" CACHE STRING "" FORCE) endif() endif() find_package(Python3 3.12 QUIET COMPONENTS Interpreter) option(GPT4ALL_TEST "Build the tests" ${Python3_FOUND}) option(GPT4ALL_LOCALHOST "Build installer for localhost repo" OFF) option(GPT4ALL_OFFLINE_INSTALLER "Build an offline installer" OFF) option(GPT4ALL_SIGN_INSTALL "Sign installed binaries and installers (requires signing identities)" OFF) option(GPT4ALL_GEN_CPACK_CONFIG "Generate the CPack config.xml in the package step and nothing else." OFF) set(GPT4ALL_USE_QTPDF "AUTO" CACHE STRING "Whether to Use QtPDF for LocalDocs. 
If OFF or not available on this platform, PDFium is used.") set_property(CACHE GPT4ALL_USE_QTPDF PROPERTY STRINGS AUTO ON OFF) set(GPT4ALL_FORCE_D3D12 "AUTO" CACHE STRING "Whether to use Direct3D 12 as the Qt scene graph backend. Defaults to ON on Windows ARM.") set_property(CACHE GPT4ALL_FORCE_D3D12 PROPERTY STRINGS AUTO ON OFF) include(cmake/cpack_config.cmake) if (GPT4ALL_GEN_CPACK_CONFIG) configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/cpack-steal-config.cmake.in" "${CMAKE_BINARY_DIR}/cmake/cpack-steal-config.cmake" @ONLY) set(CPACK_POST_BUILD_SCRIPTS ${CMAKE_BINARY_DIR}/cmake/cpack-steal-config.cmake) include(CPack) include(CPackIFW) return() endif() set(CMAKE_EXPORT_COMPILE_COMMANDS ON) set(CMAKE_CXX_STANDARD 23) set(CMAKE_CXX_STANDARD_REQUIRED ON) if (MSVC) # Enable accurate __cplusplus macro add_compile_options($<$<COMPILE_LANGUAGE:CXX>:/Zc:__cplusplus>) endif() # conftests function(check_cpp_feature FEATURE_NAME MIN_VALUE) message(CHECK_START "Checking for ${FEATURE_NAME} >= ${MIN_VALUE}") string(CONCAT SRC "#include <version>\n" "#if !defined(${FEATURE_NAME}) || ${FEATURE_NAME} < ${MIN_VALUE}\n" "# error \"${FEATURE_NAME} is not defined or less than ${MIN_VALUE}\"\n" "#endif\n" "int main() { return 0; }\n" ) try_compile(HAS_FEATURE SOURCE_FROM_VAR "test_${FEATURE_NAME}.cpp" SRC) if (NOT HAS_FEATURE) message(CHECK_FAIL "fail") message(FATAL_ERROR "The C++ compiler\n \"${CMAKE_CXX_COMPILER}\"\n" "is too old to support ${FEATURE_NAME} >= ${MIN_VALUE}.\n" "Please specify a newer compiler via -DCMAKE_C_COMPILER/-DCMAKE_CXX_COMPILER." ) endif() message(CHECK_PASS "pass") endfunction() # check for monadic operations in std::optional (e.g.
transform) check_cpp_feature("__cpp_lib_optional" "202110L") list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_LIST_DIR}/cmake/Modules") # Include the binary directory for the generated header file include_directories("${CMAKE_CURRENT_BINARY_DIR}") set(CMAKE_AUTOMOC ON) set(CMAKE_AUTORCC ON) set(CMAKE_FIND_PACKAGE_TARGETS_GLOBAL ON) set(GPT4ALL_QT_COMPONENTS Core HttpServer LinguistTools Quick QuickDialogs2 Sql Svg) set(GPT4ALL_USING_QTPDF OFF) if (CMAKE_SYSTEM_NAME MATCHES Windows AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|AARCH64|arm64|ARM64)$") # QtPDF is not available. if (GPT4ALL_USE_QTPDF STREQUAL "ON") message(FATAL_ERROR "QtPDF is not available on Windows ARM64.") endif() elseif (GPT4ALL_USE_QTPDF MATCHES "^(ON|AUTO)$") set(GPT4ALL_USING_QTPDF ON) list(APPEND GPT4ALL_QT_COMPONENTS Pdf) endif() find_package(Qt6 6.8 COMPONENTS ${GPT4ALL_QT_COMPONENTS} REQUIRED) if (QT_KNOWN_POLICY_QTP0004) qt_policy(SET QTP0004 NEW) # generate extra qmldir files on Qt 6.8+ endif() # Get the Qt6Core target properties get_target_property(Qt6Core_INCLUDE_DIRS Qt6::Core INTERFACE_INCLUDE_DIRECTORIES) get_target_property(Qt6Core_LIBRARY_RELEASE Qt6::Core LOCATION_RELEASE) # Find the qmake binary find_program(QMAKE_EXECUTABLE NAMES qmake qmake6 PATHS ${Qt6Core_INCLUDE_DIRS}/../.. NO_DEFAULT_PATH) # Get the Qt 6 root directory get_filename_component(Qt6_ROOT_DIR "${Qt6Core_LIBRARY_RELEASE}" DIRECTORY) get_filename_component(Qt6_ROOT_DIR "${Qt6_ROOT_DIR}/.." ABSOLUTE) message(STATUS "qmake binary: ${QMAKE_EXECUTABLE}") message(STATUS "Qt 6 root directory: ${Qt6_ROOT_DIR}") set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin) set(GPT4ALL_CONFIG_FORCE_D3D12 -1) if (NOT CMAKE_SYSTEM_NAME MATCHES Windows OR Qt6_VERSION VERSION_LESS "6.6") # Direct3D 12 is not available. 
if (GPT4ALL_FORCE_D3D12 STREQUAL "ON") message(FATAL_ERROR "Cannot use Direct3D 12 on this platform.") endif() elseif (GPT4ALL_FORCE_D3D12 MATCHES "^(ON|AUTO)$") if (GPT4ALL_FORCE_D3D12 STREQUAL "ON" OR CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|AARCH64|arm64|ARM64)$") set(GPT4ALL_CONFIG_FORCE_D3D12 1) endif() endif() # Generate a header file for configuration configure_file( "${CMAKE_CURRENT_SOURCE_DIR}/src/config.h.in" "${CMAKE_CURRENT_BINARY_DIR}/config.h" ) add_subdirectory(deps) add_subdirectory(../gpt4all-backend llmodel) if (GPT4ALL_TEST) enable_testing() # Llama-3.2-1B model set(TEST_MODEL "Llama-3.2-1B-Instruct-Q4_0.gguf") set(TEST_MODEL_MD5 "48ff0243978606fdba19d899b77802fc") set(TEST_MODEL_PATH "${CMAKE_BINARY_DIR}/resources/${TEST_MODEL}") set(TEST_MODEL_URL "https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/${TEST_MODEL}") # Create a custom command to download the file if it does not exist or if the checksum does not match add_custom_command( OUTPUT "${TEST_MODEL_PATH}" COMMAND ${CMAKE_COMMAND} -E echo "Downloading test model from ${TEST_MODEL_URL} ..." 
COMMAND ${CMAKE_COMMAND} -DURL="${TEST_MODEL_URL}" -DOUTPUT_PATH="${TEST_MODEL_PATH}" -DEXPECTED_MD5="${TEST_MODEL_MD5}" -P "${CMAKE_SOURCE_DIR}/cmake/download_model.cmake" DEPENDS "${CMAKE_SOURCE_DIR}/cmake/download_model.cmake" ) # Define a custom target that depends on the downloaded model add_custom_target(download_test_model DEPENDS "${TEST_MODEL_PATH}" ) add_subdirectory(tests) # The 'check' target makes sure the tests and their dependencies are up-to-date before running them add_custom_target(check COMMAND ${CMAKE_CTEST_COMMAND} --output-on-failure DEPENDS download_test_model chat gpt4all_tests) endif() set(CHAT_EXE_RESOURCES) # Metal shader library if (APPLE) list(APPEND CHAT_EXE_RESOURCES "${GGML_METALLIB}") endif() # App icon if (WIN32) list(APPEND CHAT_EXE_RESOURCES "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.rc") elseif (APPLE) # The MACOSX_BUNDLE_ICON_FILE variable is added to the Info.plist # generated by CMake. This variable contains the .icns file name, # without the path. set(MACOSX_BUNDLE_ICON_FILE gpt4all.icns) # And the following tells CMake where to find and install the file itself. 
set(APP_ICON_RESOURCE "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.icns") list(APPEND CHAT_EXE_RESOURCES "${APP_ICON_RESOURCE}") endif() # Embedding model set(LOCAL_EMBEDDING_MODEL "nomic-embed-text-v1.5.f16.gguf") set(LOCAL_EMBEDDING_MODEL_MD5 "a5401e7f7e46ed9fcaed5b60a281d547") set(LOCAL_EMBEDDING_MODEL_PATH "${CMAKE_BINARY_DIR}/resources/${LOCAL_EMBEDDING_MODEL}") set(LOCAL_EMBEDDING_MODEL_URL "https://gpt4all.io/models/gguf/${LOCAL_EMBEDDING_MODEL}") message(STATUS "Downloading embedding model from ${LOCAL_EMBEDDING_MODEL_URL} ...") file(DOWNLOAD "${LOCAL_EMBEDDING_MODEL_URL}" "${LOCAL_EMBEDDING_MODEL_PATH}" EXPECTED_HASH "MD5=${LOCAL_EMBEDDING_MODEL_MD5}" ) message(STATUS "Embedding model downloaded to ${LOCAL_EMBEDDING_MODEL_PATH}") if (APPLE) list(APPEND CHAT_EXE_RESOURCES "${LOCAL_EMBEDDING_MODEL_PATH}") endif() if (DEFINED GGML_METALLIB) set_source_files_properties("${GGML_METALLIB}" PROPERTIES GENERATED ON) endif() if (APPLE) set_source_files_properties(${CHAT_EXE_RESOURCES} PROPERTIES MACOSX_PACKAGE_LOCATION Resources) endif() set(MACOS_SOURCES) if (APPLE) find_library(COCOA_LIBRARY Cocoa) list(APPEND MACOS_SOURCES src/macosdock.mm src/macosdock.h) endif() qt_add_executable(chat src/main.cpp src/chat.cpp src/chat.h src/chatapi.cpp src/chatapi.h src/chatlistmodel.cpp src/chatlistmodel.h src/chatllm.cpp src/chatllm.h src/chatmodel.h src/chatmodel.cpp src/chatviewtextprocessor.cpp src/chatviewtextprocessor.h src/codeinterpreter.cpp src/codeinterpreter.h src/database.cpp src/database.h src/download.cpp src/download.h src/embllm.cpp src/embllm.h src/jinja_helpers.cpp src/jinja_helpers.h src/jinja_replacements.cpp src/jinja_replacements.h src/llm.cpp src/llm.h src/localdocs.cpp src/localdocs.h src/localdocsmodel.cpp src/localdocsmodel.h src/logger.cpp src/logger.h src/modellist.cpp src/modellist.h src/mysettings.cpp src/mysettings.h src/network.cpp src/network.h src/server.cpp src/server.h src/tool.cpp src/tool.h src/toolcallparser.cpp src/toolcallparser.h 
src/toolmodel.cpp src/toolmodel.h src/xlsxtomd.cpp src/xlsxtomd.h ${CHAT_EXE_RESOURCES} ${MACOS_SOURCES} ) gpt4all_add_warning_options(chat) qt_add_qml_module(chat URI gpt4all VERSION 1.0 NO_CACHEGEN QML_FILES main.qml qml/AddCollectionView.qml qml/AddModelView.qml qml/AddGPT4AllModelView.qml qml/AddHFModelView.qml qml/AddRemoteModelView.qml qml/ApplicationSettings.qml qml/ChatDrawer.qml qml/ChatCollapsibleItem.qml qml/ChatItemView.qml qml/ChatMessageButton.qml qml/ChatTextItem.qml qml/ChatView.qml qml/CollectionsDrawer.qml qml/HomeView.qml qml/LocalDocsSettings.qml qml/LocalDocsView.qml qml/ModelSettings.qml qml/ModelsView.qml qml/NetworkDialog.qml qml/NewVersionDialog.qml qml/PopupDialog.qml qml/SettingsView.qml qml/StartupDialog.qml qml/ConfirmationDialog.qml qml/Theme.qml qml/ThumbsDownDialog.qml qml/Toast.qml qml/ToastManager.qml qml/MyBusyIndicator.qml qml/MyButton.qml qml/MyTabButton.qml qml/MyCheckBox.qml qml/MyComboBox.qml qml/MyDialog.qml qml/MyDirectoryField.qml qml/MyFileDialog.qml qml/MyFileIcon.qml qml/MyFolderDialog.qml qml/MyFancyLink.qml qml/MyMenu.qml qml/MyMenuItem.qml qml/MyMiniButton.qml qml/MySettingsButton.qml qml/MySettingsDestructiveButton.qml qml/MySettingsLabel.qml qml/MySettingsStack.qml qml/MySettingsTab.qml qml/MySlug.qml qml/MyTextArea.qml qml/MyTextButton.qml qml/MyTextField.qml qml/MyToolButton.qml qml/MyWelcomeButton.qml qml/RemoteModelCard.qml RESOURCES icons/antenna_1.svg icons/antenna_2.svg icons/antenna_3.svg icons/caret_down.svg icons/caret_right.svg icons/changelog.svg icons/chat.svg icons/check.svg icons/close.svg icons/copy.svg icons/db.svg icons/discord.svg icons/download.svg icons/edit.svg icons/eject.svg icons/email.svg icons/file-doc.svg icons/file-docx.svg icons/file-md.svg icons/file-pdf.svg icons/file-txt.svg icons/file-xls.svg icons/file.svg icons/github.svg icons/globe.svg icons/gpt4all-32.png icons/gpt4all-48.png icons/gpt4all.svg icons/gpt4all_transparent.svg icons/groq.svg icons/home.svg icons/image.svg 
icons/info.svg icons/left_panel_closed.svg icons/left_panel_open.svg icons/local-docs.svg icons/models.svg icons/mistral.svg icons/network.svg icons/nomic_logo.svg icons/notes.svg icons/paperclip.svg icons/plus.svg icons/plus_circle.svg icons/openai.svg icons/recycle.svg icons/regenerate.svg icons/search.svg icons/send_message.svg icons/settings.svg icons/stack.svg icons/stop_generating.svg icons/thumbs_down.svg icons/thumbs_up.svg icons/trash.svg icons/twitter.svg icons/up_down.svg icons/webpage.svg icons/you.svg ) qt_add_translations(chat TS_FILES ${CMAKE_SOURCE_DIR}/translations/gpt4all_en_US.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_es_MX.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_zh_CN.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_zh_TW.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_ro_RO.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_it_IT.ts ${CMAKE_SOURCE_DIR}/translations/gpt4all_pt_BR.ts ) set_target_properties(chat PROPERTIES WIN32_EXECUTABLE TRUE ) macro(REPORT_MISSING_SIGNING_CONTEXT) message(FATAL_ERROR [=[ Signing requested but no identity configured. 
Please set the correct env variable or provide the MAC_SIGNING_IDENTITY argument on the command line ]=]) endmacro() if (APPLE) set_target_properties(chat PROPERTIES MACOSX_BUNDLE TRUE MACOSX_BUNDLE_GUI_IDENTIFIER gpt4all MACOSX_BUNDLE_BUNDLE_VERSION ${PROJECT_VERSION} MACOSX_BUNDLE_SHORT_VERSION_STRING ${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR} OUTPUT_NAME gpt4all ) add_dependencies(chat ggml-metal) endif() if (APPLE AND GPT4ALL_SIGN_INSTALL) if (NOT MAC_SIGNING_IDENTITY) if (NOT DEFINED ENV{MAC_SIGNING_CERT_NAME}) REPORT_MISSING_SIGNING_CONTEXT() endif() set(MAC_SIGNING_IDENTITY $ENV{MAC_SIGNING_CERT_NAME}) endif() if (NOT MAC_SIGNING_TID) if (NOT DEFINED ENV{MAC_NOTARIZATION_TID}) REPORT_MISSING_SIGNING_CONTEXT() endif() set(MAC_SIGNING_TID $ENV{MAC_NOTARIZATION_TID}) endif() # Setup MacOS signing for individual binaries set_target_properties(chat PROPERTIES XCODE_ATTRIBUTE_CODE_SIGN_STYLE "Manual" XCODE_ATTRIBUTE_DEVELOPMENT_TEAM ${MAC_SIGNING_TID} XCODE_ATTRIBUTE_CODE_SIGN_IDENTITY ${MAC_SIGNING_IDENTITY} XCODE_ATTRIBUTE_CODE_SIGNING_REQUIRED True XCODE_ATTRIBUTE_OTHER_CODE_SIGN_FLAGS "--timestamp=http://timestamp.apple.com/ts01 --options=runtime,library" ) endif() target_compile_definitions(chat PRIVATE $<$<OR:$<CONFIG:Debug>,$<CONFIG:RelWithDebInfo>>:QT_QML_DEBUG>) target_include_directories(chat PRIVATE src) # usearch uses the identifier 'slots' which conflicts with Qt's 'slots' keyword target_compile_definitions(chat PRIVATE QT_NO_SIGNALS_SLOTS_KEYWORDS) target_include_directories(chat PRIVATE deps/usearch/include deps/usearch/fp16/include) target_link_libraries(chat PRIVATE Qt6::Core Qt6::HttpServer Qt6::Quick Qt6::Sql Qt6::Svg) if (GPT4ALL_USING_QTPDF) target_compile_definitions(chat PRIVATE GPT4ALL_USE_QTPDF) target_link_libraries(chat PRIVATE Qt6::Pdf) else() # Link PDFium target_link_libraries(chat PRIVATE pdfium) endif() target_link_libraries(chat PRIVATE llmodel SingleApplication fmt::fmt duckx::duckx QXlsx) target_include_directories(chat PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/deps/json/include) target_include_directories(chat PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/deps/json/include/nlohmann) target_include_directories(chat PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/deps/minja/include) if (APPLE) target_link_libraries(chat PRIVATE ${COCOA_LIBRARY}) endif() # -- install -- if (APPLE) set(GPT4ALL_LIB_DEST bin/gpt4all.app/Contents/Frameworks) else() set(GPT4ALL_LIB_DEST lib) endif() install(TARGETS chat DESTINATION bin COMPONENT ${COMPONENT_NAME_MAIN}) install( TARGETS llmodel LIBRARY DESTINATION ${GPT4ALL_LIB_DEST} COMPONENT ${COMPONENT_NAME_MAIN} # .so/.dylib RUNTIME DESTINATION bin COMPONENT ${COMPONENT_NAME_MAIN} # .dll ) # We should probably iterate through the backend's CMake target list, but these need to be installed # to this component's dir for the finicky Qt installer to work if (LLMODEL_KOMPUTE) set(MODEL_IMPL_TARGETS llamamodel-mainline-kompute llamamodel-mainline-kompute-avxonly ) else() set(MODEL_IMPL_TARGETS llamamodel-mainline-cpu llamamodel-mainline-cpu-avxonly ) endif() if (APPLE) list(APPEND MODEL_IMPL_TARGETS llamamodel-mainline-metal) endif() install( TARGETS ${MODEL_IMPL_TARGETS} LIBRARY DESTINATION ${GPT4ALL_LIB_DEST} COMPONENT ${COMPONENT_NAME_MAIN} # .so/.dylib RUNTIME DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN} # .dll ) if(APPLE AND GPT4ALL_SIGN_INSTALL) include(SignMacOSBinaries) install_sign_osx(chat) install_sign_osx(llmodel) foreach(tgt ${MODEL_IMPL_TARGETS}) install_sign_osx(${tgt}) endforeach() endif() if(WIN32 AND GPT4ALL_SIGN_INSTALL) include(SignWindowsBinaries) sign_target_windows(chat) sign_target_windows(llmodel) foreach(tgt ${MODEL_IMPL_TARGETS}) sign_target_windows(${tgt}) endforeach() endif() if (LLMODEL_CUDA) set_property(TARGET llamamodel-mainline-cuda llamamodel-mainline-cuda-avxonly APPEND PROPERTY INSTALL_RPATH "$ORIGIN") install( TARGETS llamamodel-mainline-cuda llamamodel-mainline-cuda-avxonly RUNTIME_DEPENDENCY_SET llama-cuda-deps LIBRARY DESTINATION lib
COMPONENT ${COMPONENT_NAME_MAIN} # .so RUNTIME DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN} # .dll ) if (WIN32) install( RUNTIME_DEPENDENCY_SET llama-cuda-deps PRE_EXCLUDE_REGEXES "^(nvcuda|api-ms-.*)\\.dll$" POST_INCLUDE_REGEXES "(^|[/\\\\])(lib)?(cuda|cublas)" POST_EXCLUDE_REGEXES . DIRECTORIES "${CUDAToolkit_BIN_DIR}" DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN} ) endif() endif() if (NOT GPT4ALL_USING_QTPDF) # Install PDFium if (WIN32) install(FILES ${PDFium_LIBRARY} DESTINATION bin COMPONENT ${COMPONENT_NAME_MAIN}) # .dll else() install(FILES ${PDFium_LIBRARY} DESTINATION ${GPT4ALL_LIB_DEST} COMPONENT ${COMPONENT_NAME_MAIN}) # .so/.dylib endif() endif() if (NOT APPLE) install(FILES "${LOCAL_EMBEDDING_MODEL_PATH}" DESTINATION resources COMPONENT ${COMPONENT_NAME_MAIN}) endif() if (CMAKE_SYSTEM_NAME MATCHES Linux) find_program(LINUXDEPLOYQT linuxdeployqt HINTS "$ENV{HOME}/dev/linuxdeployqt/build/tools/linuxdeployqt" "$ENV{HOME}/project/linuxdeployqt/bin") configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/deploy-qt-linux.cmake.in" "${CMAKE_BINARY_DIR}/cmake/deploy-qt-linux.cmake" @ONLY) set(CPACK_PRE_BUILD_SCRIPTS ${CMAKE_BINARY_DIR}/cmake/deploy-qt-linux.cmake) elseif (CMAKE_SYSTEM_NAME MATCHES Windows) find_program(WINDEPLOYQT windeployqt) configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/deploy-qt-windows.cmake.in" "${CMAKE_BINARY_DIR}/cmake/deploy-qt-windows.cmake" @ONLY) set(CPACK_PRE_BUILD_SCRIPTS ${CMAKE_BINARY_DIR}/cmake/deploy-qt-windows.cmake) elseif (CMAKE_SYSTEM_NAME MATCHES Darwin) find_program(MACDEPLOYQT macdeployqt) configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/deploy-qt-mac.cmake.in" "${CMAKE_BINARY_DIR}/cmake/deploy-qt-mac.cmake" @ONLY) set(CPACK_PRE_BUILD_SCRIPTS ${CMAKE_BINARY_DIR}/cmake/deploy-qt-mac.cmake) endif() include(InstallRequiredSystemLibraries) include(CPack) include(CPackIFW) if(GPT4ALL_OFFLINE_INSTALLER) cpack_add_component(${COMPONENT_NAME_MAIN}) else() cpack_add_component(${COMPONENT_NAME_MAIN} DOWNLOADED) endif() 
cpack_ifw_configure_component(${COMPONENT_NAME_MAIN} ESSENTIAL FORCED_INSTALLATION) cpack_ifw_configure_component(${COMPONENT_NAME_MAIN} VERSION ${APP_VERSION}) cpack_ifw_configure_component(${COMPONENT_NAME_MAIN} LICENSES "MIT LICENSE" ${CPACK_RESOURCE_FILE_LICENSE}) cpack_ifw_configure_component(${COMPONENT_NAME_MAIN} SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/cmake/installer_gpt4all_component.qs") cpack_ifw_configure_component(${COMPONENT_NAME_MAIN} REPLACES "gpt4all-chat") # Was used in the very earliest prototypes if (APPLE AND GPT4ALL_SIGN_INSTALL) if (GPT4ALL_OFFLINE_INSTALLER) cpack_add_component(maintenancetool HIDDEN) else() cpack_add_component(maintenancetool HIDDEN DOWNLOADED) endif() cpack_ifw_configure_component(maintenancetool ESSENTIAL FORCED_INSTALLATION) cpack_ifw_configure_component(maintenancetool VERSION ${APP_VERSION}) cpack_ifw_configure_component(maintenancetool SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/cmake/installer_maintenancetool_component.qs") endif() if (GPT4ALL_LOCALHOST) cpack_ifw_add_repository("GPT4AllRepository" URL "http://localhost/repository") elseif (GPT4ALL_OFFLINE_INSTALLER) add_compile_definitions(GPT4ALL_OFFLINE_INSTALLER) else() if (CMAKE_SYSTEM_NAME MATCHES Linux) cpack_ifw_add_repository("GPT4AllRepository" URL "https://gpt4all.io/installer_repos/linux/repository") elseif (CMAKE_SYSTEM_NAME MATCHES Windows) # To sign the target on Windows, we have to create a batch script, use it as a custom target, and then use CPACK_IFW_EXTRA_TARGETS to set this extra target if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64|amd64)$") cpack_ifw_add_repository("GPT4AllRepository" URL "https://gpt4all.io/installer_repos/windows/repository") elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|AARCH64|arm64|ARM64)$") cpack_ifw_add_repository("GPT4AllRepository" URL "https://gpt4all.io/installer_repos/windows_arm/repository") endif() elseif (CMAKE_SYSTEM_NAME MATCHES Darwin) cpack_ifw_add_repository("GPT4AllRepository" URL
"https://gpt4all.io/installer_repos/mac/repository") endif() endif() ================================================ FILE: gpt4all-chat/LICENSE ================================================ Copyright 2023-2024 Nomic, Inc. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ADDENDUM: Any LLM models that are loaded and used by the application are not themselves subject to this license if indeed they are even copyrightable. The terms of this license apply only to the application software and its accompanying documentation and do not extend to any LLM models, whether created by the author of the application or obtained from third-party sources. 
================================================ FILE: gpt4all-chat/cmake/Modules/SignMacOSBinaries.cmake ================================================ function(install_sign_osx tgt) install(CODE "execute_process(COMMAND codesign --options runtime --timestamp -s \"${MAC_SIGNING_IDENTITY}\" $<TARGET_FILE:${tgt}>)") endfunction() ================================================ FILE: gpt4all-chat/cmake/Modules/SignWindowsBinaries.cmake ================================================ function(sign_target_windows tgt) if(WIN32 AND GPT4ALL_SIGN_INSTALL) add_custom_command(TARGET ${tgt} POST_BUILD COMMAND AzureSignTool.exe sign -du "https://www.nomic.ai/gpt4all" -kvu https://gpt4all.vault.azure.net -kvi "$Env{AZSignGUID}" -kvs "$Env{AZSignPWD}" -kvc "$Env{AZSignCertName}" -kvt "$Env{AZSignTID}" -tr http://timestamp.digicert.com -v $<TARGET_FILE:${tgt}> ) endif() endfunction() ================================================ FILE: gpt4all-chat/cmake/cpack-steal-config.cmake.in ================================================ set(OUTPUT_DIR "@CMAKE_BINARY_DIR@") file(COPY ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/config DESTINATION ${OUTPUT_DIR}/cpack-config) ================================================ FILE: gpt4all-chat/cmake/cpack_config.cmake ================================================ set(COMPONENT_NAME_MAIN "gpt4all") set(CPACK_GENERATOR "IFW") set(CPACK_VERBATIM_VARIABLES YES) set(CPACK_IFW_VERBOSE ON) if (CMAKE_SYSTEM_NAME MATCHES Linux) set(CPACK_IFW_ROOT "~/Qt/Tools/QtInstallerFramework/4.6") set(CPACK_PACKAGE_FILE_NAME "${COMPONENT_NAME_MAIN}-installer-linux") set(CPACK_IFW_TARGET_DIRECTORY "@HomeDir@/${COMPONENT_NAME_MAIN}") elseif (CMAKE_SYSTEM_NAME MATCHES Windows) set(CPACK_IFW_ROOT "C:/Qt/Tools/QtInstallerFramework/4.6") set(CPACK_IFW_PACKAGE_ICON "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.ico") if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64|amd64)$") set(CPACK_PACKAGE_FILE_NAME "${COMPONENT_NAME_MAIN}-installer-win64") elseif (CMAKE_SYSTEM_PROCESSOR MATCHES
"^(aarch64|AARCH64|arm64|ARM64)$") set(CPACK_PACKAGE_FILE_NAME "${COMPONENT_NAME_MAIN}-installer-win64-arm") else() message(FATAL_ERROR "Unrecognized processor: ${CMAKE_SYSTEM_PROCESSOR}") endif() set(CPACK_IFW_TARGET_DIRECTORY "@HomeDir@\\${COMPONENT_NAME_MAIN}") elseif (CMAKE_SYSTEM_NAME MATCHES Darwin) set(CPACK_IFW_ROOT "~/Qt/Tools/QtInstallerFramework/4.6") set(CPACK_IFW_PACKAGE_ICON "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.icns") set(CPACK_PACKAGE_FILE_NAME "${COMPONENT_NAME_MAIN}-installer-darwin") set(CPACK_IFW_TARGET_DIRECTORY "@ApplicationsDir@/${COMPONENT_NAME_MAIN}") endif() set(CPACK_COMPONENTS_ALL ${COMPONENT_NAME_MAIN}) # exclude development components if (APPLE AND GPT4ALL_SIGN_INSTALL) list(APPEND CPACK_COMPONENTS_ALL maintenancetool) endif() set(CPACK_PACKAGE_INSTALL_DIRECTORY ${COMPONENT_NAME_MAIN}) set(CPACK_PACKAGE_VERSION_MAJOR ${PROJECT_VERSION_MAJOR}) set(CPACK_PACKAGE_VERSION_MINOR ${PROJECT_VERSION_MINOR}) set(CPACK_PACKAGE_VERSION_PATCH ${PROJECT_VERSION_PATCH}) set(CPACK_PACKAGE_HOMEPAGE_URL "https://www.nomic.ai/gpt4all") set(CPACK_PACKAGE_ICON "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-48.png") set(CPACK_RESOURCE_FILE_LICENSE ${CMAKE_CURRENT_SOURCE_DIR}/LICENSE) set(CPACK_PACKAGE_EXECUTABLES "GPT4All") set(CPACK_CREATE_DESKTOP_LINKS "GPT4All") set(CPACK_IFW_PACKAGE_NAME "GPT4All") set(CPACK_IFW_PACKAGE_TITLE "GPT4All Installer") set(CPACK_IFW_PACKAGE_PUBLISHER "Nomic, Inc.") set(CPACK_IFW_PRODUCT_URL "https://www.nomic.ai/gpt4all") set(CPACK_IFW_PACKAGE_WIZARD_STYLE "Aero") set(CPACK_IFW_PACKAGE_LOGO "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-48.png") set(CPACK_IFW_PACKAGE_WINDOW_ICON "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-32.png") set(CPACK_IFW_PACKAGE_WIZARD_SHOW_PAGE_LIST OFF) set(CPACK_IFW_PACKAGE_CONTROL_SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/cmake/installer_control.qs") ================================================ FILE: gpt4all-chat/cmake/deploy-qt-linux.cmake.in ================================================ 
set(LINUXDEPLOYQT "@LINUXDEPLOYQT@") set(COMPONENT_NAME_MAIN "@COMPONENT_NAME_MAIN@") set(CMAKE_CURRENT_SOURCE_DIR "@CMAKE_CURRENT_SOURCE_DIR@") set(DATA_DIR ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) set(BIN_DIR ${DATA_DIR}/bin) set(Qt6_ROOT_DIR "@Qt6_ROOT_DIR@") set(ENV{LD_LIBRARY_PATH} "${BIN_DIR}:${Qt6_ROOT_DIR}/../lib/") execute_process(COMMAND ${LINUXDEPLOYQT} ${BIN_DIR}/chat -qmldir=${CMAKE_CURRENT_SOURCE_DIR} -bundle-non-qt-libs -qmake=${Qt6_ROOT_DIR}/bin/qmake -verbose=2 -exclude-libs=libcuda.so.1) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-32.png" DESTINATION ${DATA_DIR}) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-48.png" DESTINATION ${DATA_DIR}) ================================================ FILE: gpt4all-chat/cmake/deploy-qt-mac.cmake.in ================================================ set(MACDEPLOYQT "@MACDEPLOYQT@") set(COMPONENT_NAME_MAIN "@COMPONENT_NAME_MAIN@") set(CMAKE_CURRENT_SOURCE_DIR "@CMAKE_CURRENT_SOURCE_DIR@") set(GPT4ALL_SIGN_INSTALL "@GPT4ALL_SIGN_INSTALL@") set(GPT4ALL_SIGNING_ID "@MAC_SIGNING_IDENTITY@") set(CPACK_CONFIG_DIR "@CMAKE_BINARY_DIR@") if (GPT4ALL_SIGN_INSTALL) set(MAC_NOTARIZE -sign-for-notarization=${GPT4ALL_SIGNING_ID}) endif() execute_process(COMMAND ${MACDEPLOYQT} ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data/bin/gpt4all.app -qmldir=${CMAKE_CURRENT_SOURCE_DIR} -verbose=2 ${MAC_NOTARIZE}) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-32.png" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-48.png" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.icns" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) if (GPT4ALL_SIGN_INSTALL) # Create signed MaintenanceTool set(MT_DATA_DIR 
${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/maintenancetool/data) file(MAKE_DIRECTORY ${MT_DATA_DIR}) execute_process( COMMAND binarycreator --config ${CPACK_CONFIG_DIR}/cpack-config/config/config.xml --create-maintenancetool --sign ${GPT4ALL_SIGNING_ID} WORKING_DIRECTORY ${MT_DATA_DIR} ) endif() ================================================ FILE: gpt4all-chat/cmake/deploy-qt-windows.cmake.in ================================================ set(WINDEPLOYQT "@WINDEPLOYQT@") set(COMPONENT_NAME_MAIN "@COMPONENT_NAME_MAIN@") set(CMAKE_CURRENT_SOURCE_DIR "@CMAKE_CURRENT_SOURCE_DIR@") execute_process(COMMAND ${WINDEPLOYQT} --qmldir ${CMAKE_CURRENT_SOURCE_DIR} ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data/bin) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-32.png" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/icons/gpt4all-48.png" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) file(COPY "${CMAKE_CURRENT_SOURCE_DIR}/resources/gpt4all.ico" DESTINATION ${CPACK_TEMPORARY_INSTALL_DIRECTORY}/packages/${COMPONENT_NAME_MAIN}/data) ================================================ FILE: gpt4all-chat/cmake/download_model.cmake ================================================ if(NOT DEFINED URL OR NOT DEFINED OUTPUT_PATH OR NOT DEFINED EXPECTED_MD5) message(FATAL_ERROR "Usage: cmake -DURL=<URL> -DOUTPUT_PATH=<OUTPUT_PATH> -DEXPECTED_MD5=<EXPECTED_MD5> -P download_model.cmake") endif() message(STATUS "Downloading model from ${URL} to ${OUTPUT_PATH} ...") file(DOWNLOAD "${URL}" "${OUTPUT_PATH}" EXPECTED_MD5 "${EXPECTED_MD5}" STATUS status) list(GET status 0 status_code) if(NOT status_code EQUAL 0) message(FATAL_ERROR "Failed to download model: ${status}") endif() ================================================ FILE: gpt4all-chat/cmake/installer_control.qs ================================================ var finishedText = null; function
cancelInstaller(message) { installer.setDefaultPageVisible(QInstaller.Introduction, false); installer.setDefaultPageVisible(QInstaller.TargetDirectory, false); installer.setDefaultPageVisible(QInstaller.ComponentSelection, false); installer.setDefaultPageVisible(QInstaller.ReadyForInstallation, false); installer.setDefaultPageVisible(QInstaller.StartMenuSelection, false); installer.setDefaultPageVisible(QInstaller.PerformInstallation, false); installer.setDefaultPageVisible(QInstaller.LicenseCheck, false); finishedText = message; installer.setCanceled(); } function vercmp(a, b) { return a.localeCompare(b, undefined, { numeric: true, sensitivity: "base" }); } function Controller() { } Controller.prototype.TargetDirectoryPageCallback = function() { var failedReq = null; if (systemInfo.productType === "ubuntu" && vercmp(systemInfo.productVersion, "22.04") < 0) { failedReq = "Ubuntu 22.04 LTS"; } else if (systemInfo.productType === "macos" && vercmp(systemInfo.productVersion, "12.6") < 0) { failedReq = "macOS Monterey 12.6"; } if (failedReq !== null) { cancelInstaller( "Installation cannot continue because GPT4All does not support your operating system: " + `${systemInfo.prettyProductName}

` + `GPT4All requires ${failedReq} or newer.` ); } } Controller.prototype.FinishedPageCallback = function() { const widget = gui.currentPageWidget(); if (widget != null && finishedText != null) { widget.MessageLabel.setText(finishedText); } } ================================================ FILE: gpt4all-chat/cmake/installer_gpt4all_component.qs ================================================ function Component() { } var targetDirectory; Component.prototype.beginInstallation = function() { targetDirectory = installer.value("TargetDir"); }; Component.prototype.createOperations = function() { try { // call the base create operations function component.createOperations(); if (systemInfo.productType === "windows") { try { var userProfile = installer.environmentVariable("USERPROFILE"); installer.setValue("UserProfile", userProfile); component.addOperation("CreateShortcut", targetDirectory + "/bin/chat.exe", "@UserProfile@/Desktop/GPT4All.lnk", "workingDirectory=" + targetDirectory + "/bin", "iconPath=" + targetDirectory + "/gpt4all.ico", "iconId=0", "description=Open GPT4All"); } catch (e) { print("ERROR: creating desktop shortcut" + e); } component.addOperation("CreateShortcut", targetDirectory + "/bin/chat.exe", "@StartMenuDir@/GPT4All.lnk", "workingDirectory=" + targetDirectory + "/bin", "iconPath=" + targetDirectory + "/gpt4all.ico", "iconId=0", "description=Open GPT4All"); } else if (systemInfo.productType === "macos") { var gpt4allAppPath = targetDirectory + "/bin/gpt4all.app"; var symlinkPath = targetDirectory + "/../GPT4All.app"; // Remove the symlink if it already exists component.addOperation("Execute", "rm", "-f", symlinkPath); // Create the symlink component.addOperation("Execute", "ln", "-s", gpt4allAppPath, symlinkPath); } else { // linux var homeDir = installer.environmentVariable("HOME"); if (!installer.fileExists(homeDir + "/Desktop/GPT4All.desktop")) { component.addOperation("CreateDesktopEntry", homeDir + "/Desktop/GPT4All.desktop", 
"Type=Application\nTerminal=false\nExec=\"" + targetDirectory + "/bin/chat\"\nName=GPT4All\nIcon=" + targetDirectory + "/gpt4all-48.png\nName[en_US]=GPT4All"); } } } catch (e) { print("ERROR: running post installscript.qs" + e); } } Component.prototype.createOperationsForArchive = function(archive) { component.createOperationsForArchive(archive); if (systemInfo.productType === "macos") { var uninstallTargetDirectory = installer.value("TargetDir"); var symlinkPath = uninstallTargetDirectory + "/../GPT4All.app"; // Remove the symlink during uninstallation if (installer.isUninstaller()) { component.addOperation("Execute", "rm", "-f", symlinkPath, "UNDOEXECUTE"); } } } ================================================ FILE: gpt4all-chat/cmake/installer_maintenancetool_component.qs ================================================ function Component() { component.ifwVersion = installer.value("FrameworkVersion"); installer.installationStarted.connect(this, Component.prototype.onInstallationStarted); } Component.prototype.onInstallationStarted = function() { if (component.updateRequested() || component.installationRequested()) { if (installer.value("os") == "win") { component.installerbaseBinaryPath = "@TargetDir@/installerbase.exe"; } else if (installer.value("os") == "x11") { component.installerbaseBinaryPath = "@TargetDir@/installerbase"; } else if (installer.value("os") == "mac") { component.installerbaseBinaryPath = "@TargetDir@/MaintenanceTool.app"; } installer.setInstallerBaseBinary(component.installerbaseBinaryPath); } } ================================================ FILE: gpt4all-chat/cmake/sign_dmg.py ================================================ #!/usr/bin/env python3 import os import subprocess import tempfile import shutil import click import re from typing import Optional # Requires click # pip install click # Example usage # python sign_dmg.py --input-dmg /path/to/your/input.dmg --output-dmg /path/to/your/output.dmg --signing-identity "Developer ID 
Application: YOUR_NAME (TEAM_ID)" # NOTE: This script assumes that you have the necessary Developer ID Application certificate in your # Keychain Access and that the codesign and hdiutil command-line tools are available on your system. @click.command() @click.option('--input-dmg', required=True, help='Path to the input DMG file.') @click.option('--output-dmg', required=True, help='Path to the output signed DMG file.') @click.option('--sha1-hash', help='SHA-1 hash of the Developer ID Application certificate') @click.option('--signing-identity', default=None, help='Common name of the Developer ID Application certificate') @click.option('--verify', is_flag=True, show_default=True, required=False, default=False, help='Perform verification of signed app bundle' ) def sign_dmg(input_dmg: str, output_dmg: str, signing_identity: Optional[str] = None, sha1_hash: Optional[str] = None, verify: Optional[bool] = False) -> None: if not signing_identity and not sha1_hash: print("Error: Either --signing-identity or --sha1-hash must be provided.") exit(1) # Mount the input DMG mount_point = tempfile.mkdtemp() subprocess.run(['hdiutil', 'attach', input_dmg, '-mountpoint', mount_point]) # Copy the contents of the DMG to a temporary folder temp_dir = tempfile.mkdtemp() shutil.copytree(mount_point, os.path.join(temp_dir, 'contents')) subprocess.run(['hdiutil', 'detach', mount_point]) # Find the .app bundle in the temporary folder app_bundle = None for item in os.listdir(os.path.join(temp_dir, 'contents')): if item.endswith('.app'): app_bundle = os.path.join(temp_dir, 'contents', item) break if not app_bundle: print('No .app bundle found in the DMG.') exit(1) # Sign the .app bundle try: subprocess.run([ 'codesign', '--deep', '--force', '--verbose', '--options', 'runtime', '--timestamp', '--sign', sha1_hash or signing_identity, app_bundle ], check=True) except subprocess.CalledProcessError as e: print(f"Error during codesign: {e}") # Clean up temporary directories shutil.rmtree(temp_dir) 
shutil.rmtree(mount_point) exit(1) # Validate signature and entitlements of signed app bundle if verify: try: code_ver_proc = subprocess.run([ 'codesign', '--deep', '--verify', '--verbose=2', '--strict', app_bundle ], check=True, capture_output=True) if not re.search(fr"{app_bundle}: valid", code_ver_proc.stdout.decode()): raise RuntimeError(f"codesign validation failed: {code_ver_proc.stdout.decode()}") except subprocess.CalledProcessError as e: print(f"Error during codesign validation: {e}") # Clean up temporary directories shutil.rmtree(temp_dir) shutil.rmtree(mount_point) exit(1) try: spctl_proc = subprocess.run([ 'spctl', '-a', '-t', 'exec', '-vv', app_bundle ], check=True, capture_output=True) if not re.search(fr"{app_bundle}: accepted", spctl_proc.stdout.decode()): raise RuntimeError(f"spctl validation failed: {spctl_proc.stdout.decode()}") except subprocess.CalledProcessError as e: print(f"Error during spctl validation: {e}") # Clean up temporary directories shutil.rmtree(temp_dir) shutil.rmtree(mount_point) exit(1) # Create a new DMG containing the signed .app bundle subprocess.run([ 'hdiutil', 'create', '-volname', os.path.splitext(os.path.basename(input_dmg))[0], '-srcfolder', os.path.join(temp_dir, 'contents'), '-ov', '-format', 'UDZO', output_dmg ]) # Clean up temporary directories shutil.rmtree(temp_dir) shutil.rmtree(mount_point) if __name__ == '__main__': sign_dmg() ================================================ FILE: gpt4all-chat/contributing_translations.md ================================================ # Contributing Foreign Language Translations of GPT4All ## Overview Contributing foreign language translations to the GPT4All project requires installing a graphical tool called Qt Linguist. This tool can be obtained by installing a subset of Qt. You'll also need to clone this github repository locally on your filesystem.
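Since Qt `.ts` translation files are plain XML, you can sanity-check a file and count the entries still marked as unfinished before sharing it. The short script below is illustrative only — it is not part of the repository, and the sample fragment is a made-up `.ts` snippet:

```python
import xml.etree.ElementTree as ET

def unfinished_count(ts_text: str) -> int:
    """Count untranslated entries: Qt Linguist marks them <translation type="unfinished">."""
    root = ET.fromstring(ts_text)
    return sum(1 for tr in root.iter("translation") if tr.get("type") == "unfinished")

# Hypothetical .ts fragment for illustration -- real files live in gpt4all-chat/translations.
sample = """<TS version="2.1" language="es_MX">
  <context>
    <name>ChatView</name>
    <message><source>Send</source><translation>Enviar</translation></message>
    <message><source>Stop</source><translation type="unfinished"/></message>
  </context>
</TS>"""

print(unfinished_count(sample))  # prints 1
```

A count of zero is a good sign that the file is ready to submit, though Qt Linguist itself remains the authoritative check.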
Once this tool is installed you'll be able to use it to load specific translation files found in the gpt4all github repository and add your foreign language translations. Once you've done this you can contribute back those translations by opening a pull request on Github or by sharing it with one of the administrators on GPT4All [discord.](https://discord.gg/4M2QFmTt2k) ## Download Qt Linguist - Go to https://login.qt.io/register to create a free Qt account. - Download the Qt Online Installer for your OS from here: https://www.qt.io/download-qt-installer-oss - Sign into the installer. - Agree to the terms of the (L)GPL 3 license. - Select whether you would like to send anonymous usage statistics to Qt. - On the Installation Folder page, leave the default installation path, and select "Custom Installation". ![image](https://github.com/nomic-ai/gpt4all/assets/10168/85234549-1ea7-43c9-87d1-1e4f0fb93d82) - Under "Qt", select the latest Qt 6.x release as well as Developer and Designer Tools - NOTE: This will install much more than the Qt Linguist tool and you can deselect portions, but to be safe I've included the easiest steps that will also enable you to build GPT4All from source if you wish. ## Open Qt Linguist After installation you should be able to find the Qt Linguist application in the following locations: - Windows `C:\Qt\6.7.2\msvc2019_64\bin\linguist.exe` - macOS `/Users/username/Qt/6.7.2/macos/bin/Linguist.app` - Linux `/home/username/Qt/6.7.2/gcc_64/bin/linguist` ![Peek 2024-07-11 10-26](https://github.com/nomic-ai/gpt4all/assets/10168/957de16f-4e23-4d90-9d20-9089d2028aa8) ## After you've opened Qt Linguist - Navigate to the translation file you're interested in contributing to. This file will be located in the gpt4all `translations` directory found on your local filesystem after you've cloned the gpt4all github repository. 
It is this folder [gpt4all/gpt4all-chat/translations](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-chat/translations) located on your local filesystem after cloning the repository. - If the file does not exist yet for the language you are interested in, then just copy the English one to a new file with an appropriate name and edit that. ## How to see your translations in the app as you develop them ![Peek 2024-07-12 14-22](https://github.com/user-attachments/assets/6ff00338-5b49-4f97-a0d4-de96f3991469) - In the same folder where your models are stored you can add translation files (.ts) and compile them using the command `/path/to/Qt/6.7.2/gcc_64/bin/lrelease gpt4all_{lang}.ts` - This should produce a file named `gpt4all_{lang}.qm` in the same folder. Restart GPT4All and you should now be able to see your language in the settings combobox. ## Information on how to use Qt Linguist - [Manual for translators](https://doc.qt.io/qt-6/linguist-translators.html) - [Video explaining how translators use Qt Linguist](https://youtu.be/xNIz78IPBu0?t=351) ## Once you've edited the translations save the file - Open a [pull request](https://github.com/nomic-ai/gpt4all/pulls) for your changes. - Alternatively, you may share your translation file with one of the administrators on GPT4All [discord.](https://discord.gg/4M2QFmTt2k) # Thank you! ================================================ FILE: gpt4all-chat/deps/CMakeLists.txt ================================================ include(FetchContent) set(BUILD_SHARED_LIBS OFF) set(FMT_INSTALL OFF) add_subdirectory(fmt) set(QAPPLICATION_CLASS QApplication) add_subdirectory(SingleApplication) set(DUCKX_INSTALL OFF) add_subdirectory(DuckX) set(QT_VERSION_MAJOR 6) add_subdirectory(QXlsx/QXlsx) if (NOT GPT4ALL_USING_QTPDF) # If we do not use QtPDF, we need to get PDFium.
    set(GPT4ALL_PDFIUM_TAG "chromium/6996")
    if (CMAKE_SYSTEM_NAME MATCHES Linux)
        FetchContent_Declare(
            pdfium
            URL "https://github.com/bblanchon/pdfium-binaries/releases/download/${GPT4ALL_PDFIUM_TAG}/pdfium-linux-x64.tgz"
            URL_HASH "SHA256=68b381b87efed539f2e33ae1e280304c9a42643a878cc296c1d66a93b0cb4335"
        )
    elseif (CMAKE_SYSTEM_NAME MATCHES Windows)
        if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64|amd64)$")
            FetchContent_Declare(
                pdfium
                URL "https://github.com/bblanchon/pdfium-binaries/releases/download/${GPT4ALL_PDFIUM_TAG}/pdfium-win-x64.tgz"
                URL_HASH "SHA256=83e714c302ceacccf403826d5cb57ea39b77f393d83b8d5781283012774a9378"
            )
        elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|AARCH64|arm64|ARM64)$")
            FetchContent_Declare(
                pdfium
                URL "https://github.com/bblanchon/pdfium-binaries/releases/download/${GPT4ALL_PDFIUM_TAG}/pdfium-win-arm64.tgz"
                URL_HASH "SHA256=78e77e871453a4915cbf66fb381b951c9932f88a747c6b2b33c9f27ec2371445"
            )
        endif()
    elseif (CMAKE_SYSTEM_NAME MATCHES Darwin)
        FetchContent_Declare(
            pdfium
            URL "https://github.com/bblanchon/pdfium-binaries/releases/download/${GPT4ALL_PDFIUM_TAG}/pdfium-mac-univ.tgz"
            URL_HASH "SHA256=e7577f3242ff9c1df50025f9615673a43601a201bc51ee4792975f98920793a2"
        )
    endif()

    FetchContent_MakeAvailable(pdfium)
    find_package(PDFium REQUIRED PATHS "${pdfium_SOURCE_DIR}" NO_DEFAULT_PATH)
endif()

================================================
FILE: gpt4all-chat/dev-requirements.txt
================================================
-r test-requirements.txt

# dev tools
flake8~=7.1
mypy~=1.12
pytype>=2024.10.11
wemake-python-styleguide~=0.19.2

# type stubs and other optional modules
types-requests~=2.32
urllib3[socks]

================================================
FILE: gpt4all-chat/flatpak-manifest/io.gpt4all.gpt4all.appdata.xml
================================================
io.gpt4all.gpt4all
CC0-1.0
MIT
GPT4ALL
Open-source assistant
Nomic-ai

Cross-platform Qt-based GUI for GPT4All

  • Fast CPU and GPU based inference using ggml for open-source LLMs
  • The UI is designed to look and feel like what you've come to expect from a chat-style assistant
  • Checks for updates so you can always stay current with the latest models
  • Easy to install with precompiled binaries available for all three major desktop platforms
  • Multi-model - Ability to load more than one model and switch between them
  • Supports llama.cpp style models
  • Model downloader in GUI featuring many popular open source models
  • Settings dialog to change temp, top_p, top_k, threads, etc
  • Copy your conversation to clipboard
Main Window
https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/flatpak-manifest/screenshots/welcome.png
https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/flatpak-manifest/screenshots/chat.png
https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/flatpak-manifest/screenshots/model.png
https://www.nomic.ai/gpt4all
https://github.com/nomic-ai/gpt4all/issues
https://github.com/nomic-ai/gpt4all
io.gpt4all.gpt4all.desktop
mild
moderate
mild
================================================
FILE: gpt4all-chat/flatpak-manifest/io.gpt4all.gpt4all.desktop
================================================
[Desktop Entry]
Name=GPT4ALL
GenericName=Open-source assistant-style large language models that run locally on your CPU
Comment=Run any GPT4All model natively on your home desktop with the auto-updating desktop chat client. See the GPT4All website for a full list of open-source models you can run with this powerful desktop application.
Exec=chat
Icon=io.gpt4all.gpt4all
Type=Application
Categories=Utility;Office;
Keywords=GPT;Chat;AI;

================================================
FILE: gpt4all-chat/main.qml
================================================
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts
import Qt5Compat.GraphicalEffects
import llm
import chatlistmodel
import download
import modellist
import network
import gpt4all
import localdocs
import mysettings
import Qt.labs.platform

Window {
    id: window
    width: 1440
    height: 810
    minimumWidth: 658 + 470 * theme.fontScale
    minimumHeight: 384 + 160 * theme.fontScale
    visible: true
    title: qsTr("GPT4All v%1").arg(Qt.application.version)

    SystemTrayIcon {
        id: systemTrayIcon
        property bool shouldClose: false
        visible: MySettings.systemTray && !shouldClose
        icon.source: "qrc:/gpt4all/icons/gpt4all.svg"

        function restore() {
            LLM.showDockIcon();
            window.show();
            window.raise();
            window.requestActivate();
        }

        onActivated: function(reason) {
            if (reason === SystemTrayIcon.Context && Qt.platform.os !== "osx")
                menu.open();
            else if (reason === SystemTrayIcon.Trigger)
                restore();
        }

        menu: Menu {
            MenuItem {
                text: qsTr("Restore")
                onTriggered: systemTrayIcon.restore()
            }
            MenuItem {
                text: qsTr("Quit")
                onTriggered: {
                    systemTrayIcon.restore();
                    systemTrayIcon.shouldClose = true;
                    window.shouldClose = true;
                    savingPopup.open();
                    ChatListModel.saveChatsForQuit();
                }
            }
        }
    }

    Settings {
        property alias x: window.x
        property alias y: window.y
        property
alias width: window.width property alias height: window.height } Theme { id: theme } Item { Accessible.role: Accessible.Window Accessible.name: title } // Startup code Component.onCompleted: { startupDialogs(); } Component.onDestruction: { Network.trackEvent("session_end") } Connections { target: firstStartDialog function onClosed() { startupDialogs(); } } Connections { target: Download function onHasNewerReleaseChanged() { startupDialogs(); } } property bool hasCheckedFirstStart: false property bool hasShownSettingsAccess: false property var currentChat: ChatListModel.currentChat function startupDialogs() { if (!LLM.compatHardware()) { Network.trackEvent("noncompat_hardware") errorCompatHardware.open(); return; } // check if we have access to settings and if not show an error if (!hasShownSettingsAccess && !LLM.hasSettingsAccess()) { errorSettingsAccess.open(); hasShownSettingsAccess = true; return; } // check for first time start of this version if (!hasCheckedFirstStart) { if (Download.isFirstStart(/*writeVersion*/ true)) { firstStartDialog.open(); return; } // send startup or opt-out now that the user has made their choice Network.sendStartup() // start localdocs LocalDocs.requestStart() hasCheckedFirstStart = true } // check for new version if (Download.hasNewerRelease && !firstStartDialog.opened) { newVersionDialog.open(); return; } } PopupDialog { id: errorCompatHardware anchors.centerIn: parent shouldTimeOut: false shouldShowBusy: false closePolicy: Popup.NoAutoClose modal: true text: qsTr("

Encountered an error starting up:


" + "\"Incompatible hardware detected.\"" + "

Unfortunately, your CPU does not meet the minimal requirements to run " + "this program. In particular, it does not support AVX intrinsics which this " + "program requires to successfully run a modern large language model. " + "The only solution at this time is to upgrade your hardware to a more modern CPU." + "

See here for more information: " + "https://en.wikipedia.org/wiki/Advanced_Vector_Extensions"); } PopupDialog { id: errorSettingsAccess anchors.centerIn: parent shouldTimeOut: false shouldShowBusy: false modal: true text: qsTr("

Encountered an error starting up:


" + "\"Inability to access settings file.\"" + "

Unfortunately, something is preventing the program from accessing " + "the settings file. This could be caused by incorrect permissions in the local " + "app config directory where the settings file is located. " + "Check out our discord channel for help.") } StartupDialog { id: firstStartDialog anchors.centerIn: parent } NewVersionDialog { id: newVersionDialog anchors.centerIn: parent } Connections { target: Network function onHealthCheckFailed(code) { healthCheckFailed.open(); } } PopupDialog { id: healthCheckFailed anchors.centerIn: parent text: qsTr("Connection to datalake failed.") font.pixelSize: theme.fontSizeLarge } property bool shouldClose: false PopupDialog { id: savingPopup anchors.centerIn: parent shouldTimeOut: false shouldShowBusy: true text: qsTr("Saving chats.") font.pixelSize: theme.fontSizeLarge } NetworkDialog { id: networkDialog anchors.centerIn: parent width: Math.min(1024, window.width - (window.width * .2)) height: Math.min(600, window.height - (window.height * .2)) Item { Accessible.role: Accessible.Dialog Accessible.name: qsTr("Network dialog") Accessible.description: qsTr("opt-in to share feedback/conversations") } } onClosing: function(close) { if (systemTrayIcon.visible) { LLM.hideDockIcon(); window.visible = false; ChatListModel.saveChats(); close.accepted = false; return; } if (window.shouldClose) return; window.shouldClose = true; savingPopup.open(); ChatListModel.saveChatsForQuit(); close.accepted = false; } Connections { target: ChatListModel function onSaveChatsFinished() { savingPopup.close(); if (window.shouldClose) window.close() } } color: theme.viewBarBackground Rectangle { id: viewBar anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: parent.left width: 68 * theme.fontScale color: theme.viewBarBackground ColumnLayout { id: viewsLayout anchors.top: parent.top anchors.topMargin: 30 anchors.horizontalCenter: parent.horizontalCenter Layout.margins: 0 spacing: 16 MyToolButton { id: homeButton backgroundColor: 
toggled ? theme.iconBackgroundViewBarHovered : theme.iconBackgroundViewBar backgroundColorHovered: theme.iconBackgroundViewBarHovered Layout.preferredWidth: 38 * theme.fontScale Layout.preferredHeight: 38 * theme.fontScale Layout.alignment: Qt.AlignCenter toggledWidth: 0 toggled: homeView.isShown() toggledColor: theme.iconBackgroundViewBarToggled imageWidth: 25 * theme.fontScale imageHeight: 25 * theme.fontScale source: "qrc:/gpt4all/icons/home.svg" Accessible.name: qsTr("Home view") Accessible.description: qsTr("Home view of application") onClicked: { homeView.show() } } Text { Layout.topMargin: -20 text: qsTr("Home") font.pixelSize: theme.fontSizeMedium font.bold: true color: homeButton.hovered ? homeButton.backgroundColorHovered : homeButton.backgroundColor Layout.preferredWidth: 38 * theme.fontScale horizontalAlignment: Text.AlignHCenter TapHandler { onTapped: function(eventPoint, button) { homeView.show() } } } MyToolButton { id: chatButton backgroundColor: toggled ? theme.iconBackgroundViewBarHovered : theme.iconBackgroundViewBar backgroundColorHovered: theme.iconBackgroundViewBarHovered Layout.preferredWidth: 38 * theme.fontScale Layout.preferredHeight: 38 * theme.fontScale Layout.alignment: Qt.AlignCenter toggledWidth: 0 toggled: chatView.isShown() toggledColor: theme.iconBackgroundViewBarToggled imageWidth: 25 * theme.fontScale imageHeight: 25 * theme.fontScale source: "qrc:/gpt4all/icons/chat.svg" Accessible.name: qsTr("Chat view") Accessible.description: qsTr("Chat view to interact with models") onClicked: { chatView.show() } } Text { Layout.topMargin: -20 text: qsTr("Chats") font.pixelSize: theme.fontSizeMedium font.bold: true color: chatButton.hovered ? chatButton.backgroundColorHovered : chatButton.backgroundColor Layout.preferredWidth: 38 * theme.fontScale horizontalAlignment: Text.AlignHCenter TapHandler { onTapped: function(eventPoint, button) { chatView.show() } } } MyToolButton { id: modelsButton backgroundColor: toggled ? 
theme.iconBackgroundViewBarHovered : theme.iconBackgroundViewBar backgroundColorHovered: theme.iconBackgroundViewBarHovered Layout.preferredWidth: 38 * theme.fontScale Layout.preferredHeight: 38 * theme.fontScale toggledWidth: 0 toggled: modelsView.isShown() toggledColor: theme.iconBackgroundViewBarToggled imageWidth: 25 * theme.fontScale imageHeight: 25 * theme.fontScale source: "qrc:/gpt4all/icons/models.svg" Accessible.name: qsTr("Models") Accessible.description: qsTr("Models view for installed models") onClicked: { modelsView.show() } } Text { Layout.topMargin: -20 text: qsTr("Models") font.pixelSize: theme.fontSizeMedium font.bold: true color: modelsButton.hovered ? modelsButton.backgroundColorHovered : modelsButton.backgroundColor Layout.preferredWidth: 38 * theme.fontScale horizontalAlignment: Text.AlignHCenter TapHandler { onTapped: function(eventPoint, button) { modelsView.show() } } } MyToolButton { id: localdocsButton backgroundColor: toggled ? theme.iconBackgroundViewBarHovered : theme.iconBackgroundViewBar backgroundColorHovered: theme.iconBackgroundViewBarHovered Layout.preferredWidth: 38 * theme.fontScale Layout.preferredHeight: 38 * theme.fontScale toggledWidth: 0 toggledColor: theme.iconBackgroundViewBarToggled toggled: localDocsView.isShown() imageWidth: 25 * theme.fontScale imageHeight: 25 * theme.fontScale source: "qrc:/gpt4all/icons/db.svg" Accessible.name: qsTr("LocalDocs") Accessible.description: qsTr("LocalDocs view to configure and use local docs") onClicked: { localDocsView.show() } } Text { Layout.topMargin: -20 text: qsTr("LocalDocs") font.pixelSize: theme.fontSizeMedium font.bold: true color: localdocsButton.hovered ? localdocsButton.backgroundColorHovered : localdocsButton.backgroundColor Layout.preferredWidth: 38 * theme.fontScale horizontalAlignment: Text.AlignHCenter TapHandler { onTapped: function(eventPoint, button) { localDocsView.show() } } } MyToolButton { id: settingsButton backgroundColor: toggled ? 
theme.iconBackgroundViewBarHovered : theme.iconBackgroundViewBar backgroundColorHovered: theme.iconBackgroundViewBarHovered Layout.preferredWidth: 38 * theme.fontScale Layout.preferredHeight: 38 * theme.fontScale toggledWidth: 0 toggledColor: theme.iconBackgroundViewBarToggled toggled: settingsView.isShown() imageWidth: 25 * theme.fontScale imageHeight: 25 * theme.fontScale source: "qrc:/gpt4all/icons/settings.svg" Accessible.name: qsTr("Settings") Accessible.description: qsTr("Settings view for application configuration") onClicked: { settingsView.show(0 /*pageToDisplay*/) } } Text { Layout.topMargin: -20 text: qsTr("Settings") font.pixelSize: theme.fontSizeMedium font.bold: true color: settingsButton.hovered ? settingsButton.backgroundColorHovered : settingsButton.backgroundColor Layout.preferredWidth: 38 * theme.fontScale horizontalAlignment: Text.AlignHCenter TapHandler { onTapped: function(eventPoint, button) { settingsView.show(0 /*pageToDisplay*/) } } } } ColumnLayout { id: buttonsLayout anchors.bottom: parent.bottom anchors.margins: 0 anchors.bottomMargin: 25 anchors.horizontalCenter: parent.horizontalCenter Layout.margins: 0 spacing: 22 Item { id: antennaItem Layout.alignment: Qt.AlignCenter Layout.preferredWidth: antennaImage.width Layout.preferredHeight: antennaImage.height Image { id: antennaImage sourceSize.width: 32 sourceSize.height: 32 visible: false fillMode: Image.PreserveAspectFit source: "qrc:/gpt4all/icons/antenna_3.svg" } ColorOverlay { id: antennaColored visible: ModelList.selectableModels.count !== 0 && (currentChat.isServer || currentChat.modelInfo.isOnline || MySettings.networkIsActive) anchors.fill: antennaImage source: antennaImage color: theme.styledTextColor ToolTip.text: { if (MySettings.networkIsActive) return qsTr("The datalake is enabled") else if (currentChat.modelInfo.isOnline) return qsTr("Using a network model") else if (currentChat.isServer) return qsTr("Server mode is enabled") return "" } ToolTip.visible: 
maAntenna.containsMouse ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval MouseArea { id: maAntenna anchors.fill: antennaColored hoverEnabled: true } } SequentialAnimation { running: true loops: Animation.Infinite PropertyAnimation { target: antennaImage property: "source" duration: 500 from: "qrc:/gpt4all/icons/antenna_1.svg" to: "qrc:/gpt4all/icons/antenna_2.svg" } PauseAnimation { duration: 1500 } PropertyAnimation { target: antennaImage property: "source" duration: 500 from: "qrc:/gpt4all/icons/antenna_2.svg" to: "qrc:/gpt4all/icons/antenna_3.svg" } PauseAnimation { duration: 1500 } PropertyAnimation { target: antennaImage property: "source" duration: 500 from: "qrc:/gpt4all/icons/antenna_3.svg" to: "qrc:/gpt4all/icons/antenna_2.svg" } PauseAnimation { duration: 1500 } PropertyAnimation { target: antennaImage property: "source" duration: 1500 from: "qrc:/gpt4all/icons/antenna_2.svg" to: "qrc:/gpt4all/icons/antenna_1.svg" } PauseAnimation { duration: 500 } } } Rectangle { Layout.alignment: Qt.AlignCenter Layout.preferredWidth: image.width Layout.preferredHeight: image.height color: "transparent" Image { id: image anchors.centerIn: parent sourceSize: Qt.size(48 * theme.fontScale, 32 * theme.fontScale) fillMode: Image.PreserveAspectFit mipmap: true visible: false source: "qrc:/gpt4all/icons/nomic_logo.svg" } ColorOverlay { anchors.fill: image source: image color: image.hovered ? 
theme.mutedDarkTextColorHovered : theme.mutedDarkTextColor TapHandler { onTapped: function(eventPoint, button) { Qt.openUrlExternally("https://nomic.ai") } } } } } } Rectangle { id: roundedFrame z: 299 anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: viewBar.right anchors.right: parent.right anchors.topMargin: 15 anchors.bottomMargin: 15 anchors.rightMargin: 15 radius: 15 border.width: 1 border.color: theme.dividerColor color: "transparent" clip: true } RectangularGlow { id: effect anchors.fill: roundedFrame glowRadius: 15 spread: 0 color: theme.dividerColor cornerRadius: 10 opacity: 0.5 } StackLayout { id: stackLayout anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: viewBar.right anchors.right: parent.right anchors.topMargin: 15 anchors.bottomMargin: 15 anchors.rightMargin: 15 layer.enabled: true layer.effect: OpacityMask { maskSource: Rectangle { width: roundedFrame.width height: roundedFrame.height radius: 15 } } HomeView { id: homeView Layout.fillWidth: true Layout.fillHeight: true shouldShowFirstStart: !hasCheckedFirstStart function show() { stackLayout.currentIndex = 0; } function isShown() { return stackLayout.currentIndex === 0 } Connections { target: homeView function onChatViewRequested() { chatView.show(); } function onLocalDocsViewRequested() { localDocsView.show(); } function onAddModelViewRequested() { addModelView.show(); } function onSettingsViewRequested(page) { settingsView.show(page); } } } ChatView { id: chatView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 1; } function isShown() { return stackLayout.currentIndex === 1 } Connections { target: chatView function onAddCollectionViewRequested() { addCollectionView.show(); } function onAddModelViewRequested() { addModelView.show(); } } } ModelsView { id: modelsView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 2; } function isShown() { return stackLayout.currentIndex === 
2 } Item { Accessible.name: qsTr("Installed models") Accessible.description: qsTr("View of installed models") } Connections { target: modelsView function onAddModelViewRequested() { addModelView.show(); } } } LocalDocsView { id: localDocsView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 3; } function isShown() { return stackLayout.currentIndex === 3 } Connections { target: localDocsView function onAddCollectionViewRequested() { addCollectionView.show(); } } } SettingsView { id: settingsView Layout.fillWidth: true Layout.fillHeight: true function show(page) { settingsView.pageToDisplay = page; stackLayout.currentIndex = 4; } function isShown() { return stackLayout.currentIndex === 4 } } AddCollectionView { id: addCollectionView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 5; } function isShown() { return stackLayout.currentIndex === 5 } Connections { target: addCollectionView function onLocalDocsViewRequested() { localDocsView.show(); } } } AddModelView { id: addModelView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 6; } function isShown() { return stackLayout.currentIndex === 6 } Connections { target: addModelView function onModelsViewRequested() { modelsView.show(); } } } } } ================================================ FILE: gpt4all-chat/metadata/latestnews.md ================================================ ## Latest News GPT4All v3.10.0 was released on February 24th. Changes include: * **Remote Models:** * The Add Model page now has a dedicated tab for remote model providers. * Groq, OpenAI, and Mistral remote models are now easier to configure. * **CUDA Compatibility:** GPUs with CUDA compute capability 5.0 such as the GTX 750 are now supported by the CUDA backend. * **New Model:** The non-MoE Granite model is now supported. * **Translation Updates:** * The Italian translation has been updated. 
* The Simplified Chinese translation has been significantly improved. * **Better Chat Templates:** The default chat templates for OLMoE 7B 0924/0125 and Granite 3.1 3B/8B have been improved. * **Whitespace Fixes:** DeepSeek-R1-based models now have better whitespace behavior in their output. * **Crash Fixes:** Several issues that could potentially cause GPT4All to crash have been fixed. ================================================ FILE: gpt4all-chat/metadata/models.json ================================================ [ { "order": "a", "md5sum": "e8d47924f433bd561cb5244557147793", "name": "Wizard v1.1", "filename": "wizardlm-13b-v1.1-superhot-8k.ggmlv3.q4_0.bin", "filesize": "7323310848", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "description": "Best overall model
  • Instruction based
  • Gives very long responses
  • Finetuned with only 1k of high-quality data
  • Trained by Microsoft and Peking University
    • Cannot be used commercially
Best overall smaller model
    • Fast responses
    • Instruction based
    • Trained by TII
    • Finetuned by Nomic AI
    • Licensed for commercial use
    ", "url": "https://huggingface.co/nomic-ai/gpt4all-falcon-ggml/resolve/main/ggml-model-gpt4all-falcon-q4_0.bin", "promptTemplate": "### Instruction:\n%1\n### Response:\n" }, { "order": "c", "md5sum": "4acc146dd43eb02845c233c29289c7c5", "name": "Hermes", "filename": "nous-hermes-13b.ggmlv3.q4_0.bin", "filesize": "8136777088", "requires": "2.4.7", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "description": "Extremely good model
    • Instruction based
    • Gives long responses
    • Curated with 300,000 uncensored instructions
    • Trained by Nous Research
    • Cannot be used commercially
    ", "url": "https://huggingface.co/TheBloke/Nous-Hermes-13B-GGML/resolve/main/nous-hermes-13b.ggmlv3.q4_0.bin", "promptTemplate": "### Instruction:\n%1\n### Response:\n" }, { "order": "f", "md5sum": "11d9f060ca24575a2c303bdc39952486", "name": "Snoozy", "filename": "GPT4All-13B-snoozy.ggmlv3.q4_0.bin", "filesize": "8136770688", "requires": "2.4.7", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "description": "Very good overall model
    • Instruction based
    • Based on the same dataset as Groovy
    • Slower than Groovy, with higher quality responses
    • Trained by Nomic AI
    • Cannot be used commercially
    ", "url": "https://huggingface.co/TheBloke/GPT4All-13B-snoozy-GGML/resolve/main/GPT4All-13B-snoozy.ggmlv3.q4_0.bin" }, { "order": "h", "md5sum": "e64e74375ce9d36a3d0af3db1523fd0a", "name": "Mini Orca", "filename": "orca-mini-7b.ggmlv3.q4_0.bin", "filesize": "3791749248", "requires": "2.4.7", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "OpenLLaMa", "description": "New model with novel dataset
    • Instruction based
    • Explain tuned datasets
    • Orca Research Paper dataset construction approaches
    • Licensed for commercial use
    ", "url": "https://huggingface.co/TheBloke/orca_mini_7B-GGML/resolve/main/orca-mini-7b.ggmlv3.q4_0.bin", "promptTemplate": "### User:\n%1\n### Response:\n", "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n" }, { "order": "i", "md5sum": "6a087f7f4598fad0bb70e6cb4023645e", "name": "Mini Orca (Small)", "filename": "orca-mini-3b.ggmlv3.q4_0.bin", "filesize": "1928446208", "requires": "2.4.7", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "OpenLLaMa", "description": "Small version of new model with novel dataset
    • Instruction based
    • Explain tuned datasets
    • Orca Research Paper dataset construction approaches
    • Licensed for commercial use
    ", "url": "https://huggingface.co/TheBloke/orca_mini_3B-GGML/resolve/main/orca-mini-3b.ggmlv3.q4_0.bin", "promptTemplate": "### User:\n%1\n### Response:\n", "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n" }, { "order": "j", "md5sum": "959b7f65b2d12fd1e3ff99e7493c7a3a", "name": "Mini Orca (Large)", "filename": "orca-mini-13b.ggmlv3.q4_0.bin", "filesize": "7323329152", "requires": "2.4.7", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "OpenLLaMa", "description": "Largest version of new model with novel dataset
    • Instruction based
    • Explain tuned datasets
    • Orca Research Paper dataset construction approaches
    • Licensed for commercial use
    ", "url": "https://huggingface.co/TheBloke/orca_mini_13B-GGML/resolve/main/orca-mini-13b.ggmlv3.q4_0.bin", "promptTemplate": "### User:\n%1\n### Response:\n", "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n" }, { "order": "r", "md5sum": "489d21fd48840dcb31e5f92f453f3a20", "name": "Wizard Uncensored", "filename": "wizardLM-13B-Uncensored.ggmlv3.q4_0.bin", "filesize": "8136777088", "requires": "2.4.7", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "description": "Trained on uncensored assistant data and instruction data
    • Instruction based
    • Cannot be used commercially
    ", "url": "https://huggingface.co/TheBloke/WizardLM-13B-Uncensored-GGML/resolve/main/wizardLM-13B-Uncensored.ggmlv3.q4_0.bin" }, { "order": "s", "md5sum": "615890cb571fcaa0f70b2f8d15ef809e", "disableGUI": "true", "name": "Replit", "filename": "ggml-replit-code-v1-3b.bin", "filesize": "5202046853", "requires": "2.4.7", "ramrequired": "4", "parameters": "3 billion", "quant": "f16", "type": "Replit", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on subset of the Stack
    • Code completion based
    • Licensed for commercial use
    ", "url": "https://huggingface.co/nomic-ai/ggml-replit-code-v1-3b/resolve/main/ggml-replit-code-v1-3b.bin" }, { "order": "t", "md5sum": "031bb5d5722c08d13e3e8eaf55c37391", "disableGUI": "true", "name": "Bert", "filename": "ggml-all-MiniLM-L6-v2-f16.bin", "filesize": "45521167", "requires": "2.4.14", "ramrequired": "1", "parameters": "1 million", "quant": "f16", "type": "Bert", "systemPrompt": " ", "description": "Sbert
    • For embeddings" }, { "order": "u", "md5sum": "379ee1bab9a7a9c27c2314daa097528e", "disableGUI": "true", "name": "Starcoder (Small)", "filename": "starcoderbase-3b-ggml.bin", "filesize": "7503121552", "requires": "2.4.14", "ramrequired": "8", "parameters": "3 billion", "quant": "f16", "type": "Starcoder", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on subset of the Stack
      • Code completion based
      " }, { "order": "w", "md5sum": "f981ab8fbd1ebbe4932ddd667c108ba7", "disableGUI": "true", "name": "Starcoder", "filename": "starcoderbase-7b-ggml.bin", "filesize": "17860448016", "requires": "2.4.14", "ramrequired": "16", "parameters": "7 billion", "quant": "f16", "type": "Starcoder", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on subset of the Stack
      • Code completion based
      " }, { "order": "w", "md5sum": "c7ebc61eec1779bddae1f2bcbf2007cc", "name": "Llama-2-7B Chat", "filename": "llama-2-7b-chat.ggmlv3.q4_0.bin", "filesize": "3791725184", "requires": "2.4.14", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "LLaMA2", "description": "New LLaMA2 model from Meta AI.
      • Fine-tuned for dialogue
      • Static model trained on an offline dataset
      • RLHF dataset
      • Licensed for commercial use
      ", "url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q4_0.bin", "promptTemplate": "[INST] %1 [/INST] ", "systemPrompt": "[INST]<>You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.<>[/INST] " } ] ================================================ FILE: gpt4all-chat/metadata/models2.json ================================================ [ { "order": "a", "md5sum": "f692417a22405d80573ac10cb0cd6c6a", "name": "Mistral OpenOrca", "filename": "mistral-7b-openorca.gguf2.Q4_0.gguf", "filesize": "4108928128", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Best overall fast chat model
      • Fast responses
      • Chat based model
      • Trained by Mistral AI
      • Finetuned on OpenOrca dataset curated via Nomic Atlas
      • Licensed for commercial use
      ", "url": "https://gpt4all.io/models/gguf/mistral-7b-openorca.gguf2.Q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n", "systemPrompt": "<|im_start|>system\nYou are MistralOrca, a large language model trained by Alignment Lab AI.\n<|im_end|>" }, { "order": "b", "md5sum": "97463be739b50525df56d33b26b00852", "name": "Mistral Instruct", "filename": "mistral-7b-instruct-v0.1.Q4_0.gguf", "filesize": "4108916384", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "systemPrompt": " ", "description": "Best overall fast instruction following model
      • Fast responses
      • Trained by Mistral AI
      • Uncensored
      • Licensed for commercial use
      ", "url": "https://gpt4all.io/models/gguf/mistral-7b-instruct-v0.1.Q4_0.gguf", "promptTemplate": "[INST] %1 [/INST]" }, { "order": "c", "md5sum": "c4c78adf744d6a20f05c8751e3961b84", "name": "GPT4All Falcon", "filename": "gpt4all-falcon-newbpe-q4_0.gguf", "filesize": "4210994112", "requires": "2.6.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Falcon", "systemPrompt": " ", "description": "Very fast model with good quality
      • Fastest responses
      • Instruction based
      • Trained by TII
      • Finetuned by Nomic AI
      • Licensed for commercial use
      ", "url": "https://gpt4all.io/models/gguf/gpt4all-falcon-newbpe-q4_0.gguf", "promptTemplate": "### Instruction:\n%1\n### Response:\n" }, { "order": "e", "md5sum": "00c8593ba57f5240f59662367b3ed4a5", "name": "Orca 2 (Medium)", "filename": "orca-2-7b.Q4_0.gguf", "filesize": "3825824192", "requires": "2.5.2", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": " ", "description": "
      • Instruction based
      • Trained by Microsoft
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/orca-2-7b.Q4_0.gguf" }, { "order": "f", "md5sum": "3c0d63c4689b9af7baa82469a6f51a19", "name": "Orca 2 (Full)", "filename": "orca-2-13b.Q4_0.gguf", "filesize": "7365856064", "requires": "2.5.2", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": " ", "description": "
      • Instruction based
      • Trained by Microsoft
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/orca-2-13b.Q4_0.gguf" }, { "order": "g", "md5sum": "5aff90007499bce5c64b1c0760c0b186", "name": "Wizard v1.2", "filename": "wizardlm-13b-v1.2.Q4_0.gguf", "filesize": "7365834624", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": " ", "description": "Best overall larger model
      • Instruction based
      • Gives very long responses
      • Finetuned with only 1k of high-quality data
      • Trained by Microsoft and Peking University
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/wizardlm-13b-v1.2.Q4_0.gguf" }, { "order": "h", "md5sum": "3d12810391d04d1153b692626c0c6e16", "name": "Hermes", "filename": "nous-hermes-llama2-13b.Q4_0.gguf", "filesize": "7366062080", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": " ", "description": "Extremely good model
      • Instruction based
      • Gives long responses
      • Curated with 300,000 uncensored instructions
      • Trained by Nous Research
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/nous-hermes-llama2-13b.Q4_0.gguf", "promptTemplate": "### Instruction:\n%1\n### Response:\n" }, { "order": "i", "md5sum": "40388eb2f8d16bb5d08c96fdfaac6b2c", "name": "Snoozy", "filename": "gpt4all-13b-snoozy-q4_0.gguf", "filesize": "7365834624", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "description": "Very good overall model
      • Instruction based
      • Based on the same dataset as Groovy
      • Slower than Groovy, with higher quality responses
      • Trained by Nomic AI
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/gpt4all-13b-snoozy-q4_0.gguf" }, { "order": "j", "md5sum": "15dcb4d7ea6de322756449c11a0b7545", "name": "MPT Chat", "filename": "mpt-7b-chat-newbpe-q4_0.gguf", "filesize": "3912373472", "requires": "2.6.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "MPT", "description": "Good model with novel architecture
      • Fast responses
      • Chat based
      • Trained by Mosaic ML
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/mpt-7b-chat-newbpe-q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n", "systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>" }, { "order": "k", "md5sum": "0e769317b90ac30d6e09486d61fefa26", "name": "Mini Orca (Small)", "filename": "orca-mini-3b-gguf2-q4_0.gguf", "filesize": "1979946720", "requires": "2.5.0", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "OpenLLaMa", "description": "Small version of new model with novel dataset
      • Instruction based
      • Explain tuned datasets
      • Orca Research Paper dataset construction approaches
      • Cannot be used commercially
      ", "url": "https://gpt4all.io/models/gguf/orca-mini-3b-gguf2-q4_0.gguf", "promptTemplate": "### User:\n%1\n### Response:\n", "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n" }, { "order": "l", "md5sum": "c232f17e09bca4b7ee0b5b1f4107c01e", "disableGUI": "true", "name": "Replit", "filename": "replit-code-v1_5-3b-newbpe-q4_0.gguf", "filesize": "1953055104", "requires": "2.6.0", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "Replit", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on subset of the Stack
      • Code completion based
      • Licensed for commercial use
      • WARNING: Not available for chat GUI
      ", "url": "https://gpt4all.io/models/gguf/replit-code-v1_5-3b-newbpe-q4_0.gguf" }, { "order": "m", "md5sum": "70841751ccd95526d3dcfa829e11cd4c", "disableGUI": "true", "name": "Starcoder", "filename": "starcoder-newbpe-q4_0.gguf", "filesize": "8987411904", "requires": "2.6.0", "ramrequired": "4", "parameters": "7 billion", "quant": "q4_0", "type": "Starcoder", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on subset of the Stack
      • Code completion based
      • WARNING: Not available for chat GUI
      ", "url": "https://gpt4all.io/models/gguf/starcoder-newbpe-q4_0.gguf" }, { "order": "n", "md5sum": "e973dd26f0ffa6e46783feaea8f08c83", "disableGUI": "true", "name": "Rift coder", "filename": "rift-coder-v0-7b-q4_0.gguf", "filesize": "3825903776", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": " ", "promptTemplate": "%1", "description": "Trained on collection of Python and TypeScript
      • Code completion based
      • WARNING: Not available for chat GUI
      • ", "url": "https://gpt4all.io/models/gguf/rift-coder-v0-7b-q4_0.gguf" }, { "order": "o", "md5sum": "e479e6f38b59afc51a470d1953a6bfc7", "disableGUI": "true", "name": "SBert", "filename": "all-MiniLM-L6-v2-f16.gguf", "filesize": "45887744", "requires": "2.5.0", "ramrequired": "1", "parameters": "40 million", "quant": "f16", "type": "Bert", "systemPrompt": " ", "description": "LocalDocs text embeddings model
        • For use with LocalDocs feature
        • Used for retrieval augmented generation (RAG)", "url": "https://gpt4all.io/models/gguf/all-MiniLM-L6-v2-f16.gguf" }, { "order": "p", "md5sum": "919de4dd6f25351bcb0223790db1932d", "name": "EM German Mistral", "filename": "em_german_mistral_v01.Q4_0.gguf", "filesize": "4108916352", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Mistral-based model for German-language applications
          • Fast responses
          • Chat based model
          • Trained by ellamind
          • Finetuned on German instruction and chat data
          • Licensed for commercial use
          ", "url": "https://huggingface.co/TheBloke/em_german_mistral_v01-GGUF/resolve/main/em_german_mistral_v01.Q4_0.gguf", "promptTemplate": "USER: %1 ASSISTANT: ", "systemPrompt": "Du bist ein hilfreicher Assistent. " } ] ================================================ FILE: gpt4all-chat/metadata/models3.json ================================================ [ { "order": "a", "md5sum": "a54c08a7b90e4029a8c2ab5b5dc936aa", "name": "Reasoner v1", "filename": "qwen2.5-coder-7b-instruct-q4_0.gguf", "filesize": "4431390720", "requires": "3.6.0", "ramrequired": "8", "parameters": "8 billion", "quant": "q4_0", "type": "qwen2", "description": "", "url": "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/qwen2.5-coder-7b-instruct-q4_0.gguf", "chatTemplate": "{{- '<|im_start|>system\\n' }}\n{% if toolList|length > 0 %}You have access to the following functions:\n{% for tool in toolList %}\nUse the function '{{tool.function}}' to: '{{tool.description}}'\n{% if tool.parameters|length > 0 %}\nparameters:\n{% for info in tool.parameters %}\n {{info.name}}:\n type: {{info.type}}\n description: {{info.description}}\n required: {{info.required}}\n{% endfor %}\n{% endif %}\n# Tool Instructions\nIf you CHOOSE to call this function ONLY reply with the following format:\n'{{tool.symbolicFormat}}'\nHere is an example. If the user says, '{{tool.examplePrompt}}', then you reply\n'{{tool.exampleCall}}'\nAfter the result you might reply with, '{{tool.exampleReply}}'\n{% endfor %}\nYou MUST include both the start and end tags when you use a function.\n\nYou are a helpful AI assistant who uses the functions to break down, analyze, perform, and verify complex reasoning tasks. 
You SHOULD try to verify your answers using the functions where possible.\n{% endif %}\n{{- '<|im_end|>\\n' }}\n{% for message in messages %}\n{{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{% endfor %}\n{% if add_generation_prompt %}\n{{ '<|im_start|>assistant\\n' }}\n{% endif %}\n", "systemPrompt": "" }, { "order": "aa", "md5sum": "c87ad09e1e4c8f9c35a5fcef52b6f1c9", "name": "Llama 3 8B Instruct", "filename": "Meta-Llama-3-8B-Instruct.Q4_0.gguf", "filesize": "4661724384", "requires": "2.7.1", "ramrequired": "8", "parameters": "8 billion", "quant": "q4_0", "type": "LLaMA3", "description": "", "url": "https://gpt4all.io/models/gguf/Meta-Llama-3-8B-Instruct.Q4_0.gguf", "promptTemplate": "<|start_header_id|>user<|end_header_id|>\n\n%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n%2<|eot_id|>", "systemPrompt": "", "chatTemplate": "{%- set loop_messages = messages %}\n{%- for message in loop_messages %}\n {%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n'+ message['content'] | trim + '<|eot_id|>' %}\n {{- content }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}\n{%- endif %}" }, { "order": "aa1", "sha256sum": "5cd4ee65211770f1d99b4f6f4951780b9ef40e29314bd6542bb5bd0ad0bc29d1", "name": "DeepSeek-R1-Distill-Qwen-7B", "filename": "DeepSeek-R1-Distill-Qwen-7B-Q4_0.gguf", "filesize": "4444121056", "requires": "3.8.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "deepseek", "description": "

          The official Qwen2.5-Math-7B distillation of DeepSeek-R1.

          • License: MIT
          • No restrictions on commercial use
          • #reasoning
          ", "url": "https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-7B-Q4_0.gguf", "chatTemplate": "{%- if not add_generation_prompt is defined %}\n {%- set add_generation_prompt = false %}\n{%- endif %}\n{%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'user' %}\n {{- '<|User|>' + message['content'] }}\n {%- endif %}\n {%- if message['role'] == 'assistant' %}\n {%- set content = message['content'] | regex_replace('^[\\\\s\\\\S]*', '') %}\n {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' }}\n {%- endif %}\n{%- endfor -%}\n{%- if add_generation_prompt %}\n {{- '<|Assistant|>' }}\n{%- endif %}" }, { "order": "aa2", "sha256sum": "906b3382f2680f4ce845459b4a122e904002b075238080307586bcffcde49eef", "name": "DeepSeek-R1-Distill-Qwen-14B", "filename": "DeepSeek-R1-Distill-Qwen-14B-Q4_0.gguf", "filesize": "8544267680", "requires": "3.8.0", "ramrequired": "16", "parameters": "14 billion", "quant": "q4_0", "type": "deepseek", "description": "

          The official Qwen2.5-14B distillation of DeepSeek-R1.

          • License: MIT
          • No restrictions on commercial use
          • #reasoning
          ", "url": "https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-14B-Q4_0.gguf", "chatTemplate": "{%- if not add_generation_prompt is defined %}\n {%- set add_generation_prompt = false %}\n{%- endif %}\n{%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'user' %}\n {{- '<|User|>' + message['content'] }}\n {%- endif %}\n {%- if message['role'] == 'assistant' %}\n {%- set content = message['content'] | regex_replace('^[\\\\s\\\\S]*', '') %}\n {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' }}\n {%- endif %}\n{%- endfor -%}\n{%- if add_generation_prompt %}\n {{- '<|Assistant|>' }}\n{%- endif %}" }, { "order": "aa3", "sha256sum": "0eb93e436ac8beec18aceb958c120d282cb2cf5451b23185e7be268fe9d375cc", "name": "DeepSeek-R1-Distill-Llama-8B", "filename": "DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf", "filesize": "4675894112", "requires": "3.8.0", "ramrequired": "8", "parameters": "8 billion", "quant": "q4_0", "type": "deepseek", "description": "

          The official Llama-3.1-8B distillation of DeepSeek-R1.

          • License: MIT
          • No restrictions on commercial use
          • #reasoning
          ", "url": "https://huggingface.co/bartowski/DeepSeek-R1-Distill-Llama-8B-GGUF/resolve/main/DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf", "chatTemplate": "{%- if not add_generation_prompt is defined %}\n {%- set add_generation_prompt = false %}\n{%- endif %}\n{%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'user' %}\n {{- '<|User|>' + message['content'] }}\n {%- endif %}\n {%- if message['role'] == 'assistant' %}\n {%- set content = message['content'] | regex_replace('^[\\\\s\\\\S]*', '') %}\n {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' }}\n {%- endif %}\n{%- endfor -%}\n{%- if add_generation_prompt %}\n {{- '<|Assistant|>' }}\n{%- endif %}" }, { "order": "aa4", "sha256sum": "b3af887d0a015b39fab2395e4faf682c1a81a6a3fd09a43f0d4292f7d94bf4d0", "name": "DeepSeek-R1-Distill-Qwen-1.5B", "filename": "DeepSeek-R1-Distill-Qwen-1.5B-Q4_0.gguf", "filesize": "1068807776", "requires": "3.8.0", "ramrequired": "3", "parameters": "1.5 billion", "quant": "q4_0", "type": "deepseek", "description": "

          The official Qwen2.5-Math-1.5B distillation of DeepSeek-R1.

          • License: MIT
          • No restrictions on commercial use
          • #reasoning
          ", "url": "https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-Q4_0.gguf", "chatTemplate": "{%- if not add_generation_prompt is defined %}\n {%- set add_generation_prompt = false %}\n{%- endif %}\n{%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'user' %}\n {{- '<|User|>' + message['content'] }}\n {%- endif %}\n {%- if message['role'] == 'assistant' %}\n {%- set content = message['content'] | regex_replace('^[\\\\s\\\\S]*', '') %}\n {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' }}\n {%- endif %}\n{%- endfor -%}\n{%- if add_generation_prompt %}\n {{- '<|Assistant|>' }}\n{%- endif %}" }, { "order": "b", "md5sum": "27b44e8ae1817525164ddf4f8dae8af4", "name": "Llama 3.2 3B Instruct", "filename": "Llama-3.2-3B-Instruct-Q4_0.gguf", "filesize": "1921909280", "requires": "3.4.0", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "LLaMA3", "description": "", "url": "https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_0.gguf", "promptTemplate": "<|start_header_id|>user<|end_header_id|>\n\n%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n%2", "systemPrompt": "<|start_header_id|>system<|end_header_id|>\nCutting Knowledge Date: December 2023\n\nYou are a helpful assistant.<|eot_id|>", "chatTemplate": "{{- bos_token }}\n{%- set date_string = strftime_now('%d %b %Y') %}\n\n{#- This block extracts the system message, so we can slot it into the right place. 
#}\n{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] | trim %}\n {%- set loop_start = 1 %}\n{%- else %}\n {%- set system_message = '' %}\n {%- set loop_start = 0 %}\n{%- endif %}\n\n{#- System message #}\n{{- '<|start_header_id|>system<|end_header_id|>\\n\\n' }}\n{{- 'Cutting Knowledge Date: December 2023\\n' }}\n{{- 'Today Date: ' + date_string + '\\n\\n' }}\n{{- system_message }}\n{{- '<|eot_id|>' }}\n\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n' + message['content'] | trim + '<|eot_id|>' }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}\n{%- endif %}" }, { "order": "c", "md5sum": "48ff0243978606fdba19d899b77802fc", "name": "Llama 3.2 1B Instruct", "filename": "Llama-3.2-1B-Instruct-Q4_0.gguf", "filesize": "773025920", "requires": "3.4.0", "ramrequired": "2", "parameters": "1 billion", "quant": "q4_0", "type": "LLaMA3", "description": "", "url": "https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf", "promptTemplate": "<|start_header_id|>user<|end_header_id|>\n\n%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n%2", "systemPrompt": "<|start_header_id|>system<|end_header_id|>\nCutting Knowledge Date: December 2023\n\nYou are a helpful assistant.<|eot_id|>", "chatTemplate": "{{- bos_token }}\n{%- set date_string = strftime_now('%d %b %Y') %}\n\n{#- This block extracts the system message, so we can slot it into the right place. 
#}\n{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] | trim %}\n {%- set loop_start = 1 %}\n{%- else %}\n {%- set system_message = '' %}\n {%- set loop_start = 0 %}\n{%- endif %}\n\n{#- System message #}\n{{- '<|start_header_id|>system<|end_header_id|>\\n\\n' }}\n{{- 'Cutting Knowledge Date: December 2023\\n' }}\n{{- 'Today Date: ' + date_string + '\\n\\n' }}\n{{- system_message }}\n{{- '<|eot_id|>' }}\n\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n' + message['content'] | trim + '<|eot_id|>' }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}\n{%- endif %}" }, { "order": "d", "md5sum": "a5f6b4eabd3992da4d7fb7f020f921eb", "name": "Nous Hermes 2 Mistral DPO", "filename": "Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", "filesize": "4108928000", "requires": "2.7.1", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Good overall fast chat model
          • Fast responses
          • Chat based model
          • Accepts system prompts in ChatML format
          • Trained by Mistral AI
          • Finetuned by Nous Research on the OpenHermes-2.5 dataset
          • Licensed for commercial use
          ", "url": "https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF/resolve/main/Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n", "systemPrompt": "", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "e", "md5sum": "97463be739b50525df56d33b26b00852", "name": "Mistral Instruct", "filename": "mistral-7b-instruct-v0.1.Q4_0.gguf", "filesize": "4108916384", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "systemPrompt": "", "description": "Strong overall fast instruction following model
          • Fast responses
          • Trained by Mistral AI
          • Uncensored
          • Licensed for commercial use
          ", "url": "https://gpt4all.io/models/gguf/mistral-7b-instruct-v0.1.Q4_0.gguf", "promptTemplate": "[INST] %1 [/INST]", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] %}\n {%- set loop_start = 1 %}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if (message['role'] == 'user') != ((loop.index0 - loop_start) % 2 == 0) %}\n {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}\n {%- endif %}\n {%- if message['role'] == 'user' %}\n {%- if loop.index0 == loop_start and loop_start == 1 %}\n {{- ' [INST] ' + system_message + '\\n\\n' + message['content'] + ' [/INST]' }}\n {%- else %}\n {{- ' [INST] ' + message['content'] + ' [/INST]' }}\n {%- endif %}\n {%- elif message['role'] == 'assistant' %}\n {{- ' ' + message['content'] + eos_token }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}" }, { "order": "f", "md5sum": "8a9c75bcd8a66b7693f158ec96924eeb", "name": "Llama 3.1 8B Instruct 128k", "filename": "Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf", "filesize": "4661212096", "requires": "3.1.1", "ramrequired": "8", "parameters": "8 billion", "quant": "q4_0", "type": "LLaMA3", "description": "
          • For advanced users only. Not recommended for use on Windows or Linux without selecting CUDA due to speed issues.
          • Fast responses
          • Chat based model
          • Large context size of 128k
          • Accepts agentic system prompts in Llama 3.1 format
          • Trained by Meta
          • License: Meta Llama 3.1 Community License
          ", "url": "https://huggingface.co/GPT4All-Community/Meta-Llama-3.1-8B-Instruct-128k/resolve/main/Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf", "promptTemplate": "<|start_header_id|>user<|end_header_id|>\n\n%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n%2", "systemPrompt": "<|start_header_id|>system<|end_header_id|>\nCutting Knowledge Date: December 2023\n\nYou are a helpful assistant.<|eot_id|>", "chatTemplate": "{%- set loop_messages = messages %}\n{%- for message in loop_messages %}\n {%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n'+ message['content'] | trim + '<|eot_id|>' %}\n {%- if loop.index0 == 0 %}\n {%- set content = bos_token + content %}\n {%- endif %}\n {{- content }}\n{%- endfor %}\n{{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}" }, { "order": "g", "md5sum": "f692417a22405d80573ac10cb0cd6c6a", "name": "Mistral OpenOrca", "filename": "mistral-7b-openorca.gguf2.Q4_0.gguf", "filesize": "4108928128", "requires": "2.7.1", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Strong overall fast chat model
          • Fast responses
          • Chat based model
          • Trained by Mistral AI
          • Finetuned on OpenOrca dataset curated via Nomic Atlas
          • Licensed for commercial use
          ", "url": "https://gpt4all.io/models/gguf/mistral-7b-openorca.gguf2.Q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n", "systemPrompt": "<|im_start|>system\nYou are MistralOrca, a large language model trained by Alignment Lab AI.\n<|im_end|>\n", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "h", "md5sum": "c4c78adf744d6a20f05c8751e3961b84", "name": "GPT4All Falcon", "filename": "gpt4all-falcon-newbpe-q4_0.gguf", "filesize": "4210994112", "requires": "2.6.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Falcon", "systemPrompt": "", "description": "Very fast model with good quality
          • Fastest responses
          • Instruction based
          • Trained by TII
          • Finetuned by Nomic AI
          • Licensed for commercial use
          ", "url": "https://gpt4all.io/models/gguf/gpt4all-falcon-newbpe-q4_0.gguf", "promptTemplate": "### Instruction:\n%1\n\n### Response:\n", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- messages[0]['content'] + '\\n\\n' }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if message['role'] == 'user' %}\n {{- '### User: ' + message['content'] + '\\n\\n' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '### Assistant: ' + message['content'] + '\\n\\n' }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '### Assistant:' }}\n{%- endif %}" }, { "order": "i", "md5sum": "00c8593ba57f5240f59662367b3ed4a5", "name": "Orca 2 (Medium)", "filename": "orca-2-7b.Q4_0.gguf", "filesize": "3825824192", "requires": "2.5.2", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": "", "description": "
          • Instruction based
          • Trained by Microsoft
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/orca-2-7b.Q4_0.gguf", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "j", "md5sum": "3c0d63c4689b9af7baa82469a6f51a19", "name": "Orca 2 (Full)", "filename": "orca-2-13b.Q4_0.gguf", "filesize": "7365856064", "requires": "2.5.2", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": "", "description": "
          • Instruction based
          • Trained by Microsoft
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/orca-2-13b.Q4_0.gguf", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "k", "md5sum": "5aff90007499bce5c64b1c0760c0b186", "name": "Wizard v1.2", "filename": "wizardlm-13b-v1.2.Q4_0.gguf", "filesize": "7365834624", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": "", "description": "Strong overall larger model
          • Instruction based
          • Gives very long responses
          • Finetuned with only 1k of high-quality data
          • Trained by Microsoft and Peking University
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/wizardlm-13b-v1.2.Q4_0.gguf", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- messages[0]['content'] + ' ' }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in loop_messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if message['role'] == 'user' %}\n {{- 'USER: ' + message['content'] }}\n {%- elif message['role'] == 'assistant' %}\n {{- 'ASSISTANT: ' + message['content'] }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- if (loop.index0 - loop_start) % 2 == 0 %}\n {{- ' ' }}\n {%- else %}\n {{- eos_token }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- 'ASSISTANT:' }}\n{%- endif %}", "systemMessage": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions." }, { "order": "l", "md5sum": "31b47b4e8c1816b62684ac3ca373f9e1", "name": "Ghost 7B v0.9.1", "filename": "ghost-7b-v0.9.1-Q4_0.gguf", "filesize": "4108916960", "requires": "2.7.1", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Ghost 7B v0.9.1 fast, powerful and smooth for Vietnamese and English languages.", "url": "https://huggingface.co/lamhieu/ghost-7b-v0.9.1-gguf/resolve/main/ghost-7b-v0.9.1-Q4_0.gguf", "promptTemplate": "<|user|>\n%1\n<|assistant|>\n%2\n", "systemPrompt": "<|system|>\nYou are Ghost created by Lam Hieu. You are a helpful and knowledgeable assistant. You like to help and always give honest information, in its original language. 
In communication, you are always respectful, equal and promote positive behavior.\n", "chatTemplate": "{%- for message in messages %}\n {%- if message['role'] == 'user' %}\n {{- '<|user|>\\n' + message['content'] + eos_token }}\n {%- elif message['role'] == 'system' %}\n {{- '<|system|>\\n' + message['content'] + eos_token }}\n {%- elif message['role'] == 'assistant' %}\n {{- '<|assistant|>\\n' + message['content'] + eos_token }}\n {%- endif %}\n {%- if loop.last and add_generation_prompt %}\n {{- '<|assistant|>' }}\n {%- endif %}\n{%- endfor %}", "systemMessage": "You are Ghost created by Lam Hieu. You are a helpful and knowledgeable assistant. You like to help and always give honest information, in its original language. In communication, you are always respectful, equal and promote positive behavior." }, { "order": "m", "md5sum": "3d12810391d04d1153b692626c0c6e16", "name": "Hermes", "filename": "nous-hermes-llama2-13b.Q4_0.gguf", "filesize": "7366062080", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA2", "systemPrompt": "", "description": "Extremely good model
          • Instruction based
          • Gives long responses
          • Curated with 300,000 uncensored instructions
          • Trained by Nous Research
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/nous-hermes-llama2-13b.Q4_0.gguf", "promptTemplate": "### Instruction:\n%1\n\n### Response:\n", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- messages[0]['content'] + '\\n\\n' }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if message['role'] == 'user' %}\n {{- '### Instruction:\\n' + message['content'] + '\\n\\n' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '### Response:\\n' + message['content'] + '\\n\\n' }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '### Instruction:\\n' }}\n{%- endif %}" }, { "order": "n", "md5sum": "40388eb2f8d16bb5d08c96fdfaac6b2c", "name": "Snoozy", "filename": "gpt4all-13b-snoozy-q4_0.gguf", "filesize": "7365834624", "requires": "2.5.0", "ramrequired": "16", "parameters": "13 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": "", "description": "Very good overall model
          • Instruction based
          • Based on the same dataset as Groovy
          • Slower than Groovy, with higher quality responses
          • Trained by Nomic AI
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/gpt4all-13b-snoozy-q4_0.gguf", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- messages[0]['content'] + '\\n\\n' }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if message['role'] == 'user' %}\n {{- '### Instruction:\\n' + message['content'] + '\\n\\n' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '### Response:\\n' + message['content'] + '\\n\\n' }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '### Response:\\n' }}\n{%- endif %}", "systemMessage": "Below is an instruction that describes a task. Write a response that appropriately completes the request." }, { "order": "o", "md5sum": "15dcb4d7ea6de322756449c11a0b7545", "name": "MPT Chat", "filename": "mpt-7b-chat-newbpe-q4_0.gguf", "filesize": "3912373472", "requires": "2.7.1", "removedIn": "2.7.3", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "MPT", "description": "Good model with novel architecture
          • Fast responses
          • Chat based
          • Trained by Mosaic ML
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/mpt-7b-chat-newbpe-q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n", "systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>\n", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "p", "md5sum": "ab5d8e8a2f79365ea803c1f1d0aa749d", "name": "MPT Chat", "filename": "mpt-7b-chat.gguf4.Q4_0.gguf", "filesize": "3796178112", "requires": "2.7.3", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "MPT", "description": "Good model with novel architecture
          • Fast responses
          • Chat based
          • Trained by Mosaic ML
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/mpt-7b-chat.gguf4.Q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n", "systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>\n", "chatTemplate": "{%- for message in messages %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" }, { "order": "q", "md5sum": "f8347badde9bfc2efbe89124d78ddaf5", "name": "Phi-3 Mini Instruct", "filename": "Phi-3-mini-4k-instruct.Q4_0.gguf", "filesize": "2176181568", "requires": "2.7.1", "ramrequired": "4", "parameters": "4 billion", "quant": "q4_0", "type": "Phi-3", "description": "
          • Very fast responses
          • Chat based model
          • Accepts system prompts in Phi-3 format
          • Trained by Microsoft
          • License: MIT
          • No restrictions on commercial use
          ", "url": "https://gpt4all.io/models/gguf/Phi-3-mini-4k-instruct.Q4_0.gguf", "promptTemplate": "<|user|>\n%1<|end|>\n<|assistant|>\n%2<|end|>\n", "systemPrompt": "", "chatTemplate": "{{- bos_token }}\n{%- for message in messages %}\n {{- '<|' + message['role'] + '|>\\n' + message['content'] + '<|end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|assistant|>\\n' }}\n{%- else %}\n {{- eos_token }}\n{%- endif %}" }, { "order": "r", "md5sum": "0e769317b90ac30d6e09486d61fefa26", "name": "Mini Orca (Small)", "filename": "orca-mini-3b-gguf2-q4_0.gguf", "filesize": "1979946720", "requires": "2.5.0", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "OpenLLaMa", "description": "Small version of new model with novel dataset
          • Very fast responses
          • Instruction based
          • Explain tuned datasets
          • Orca Research Paper dataset construction approaches
          • Cannot be used commercially
          ", "url": "https://gpt4all.io/models/gguf/orca-mini-3b-gguf2-q4_0.gguf", "promptTemplate": "### User:\n%1\n\n### Response:\n", "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- '### System:\\n' + messages[0]['content'] + '\\n\\n' }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if message['role'] == 'user' %}\n {{- '### User:\\n' + message['content'] + '\\n\\n' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '### Response:\\n' + message['content'] + '\\n\\n' }}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '### Response:\\n' }}\n{%- endif %}" }, { "order": "s", "md5sum": "c232f17e09bca4b7ee0b5b1f4107c01e", "disableGUI": "true", "name": "Replit", "filename": "replit-code-v1_5-3b-newbpe-q4_0.gguf", "filesize": "1953055104", "requires": "2.6.0", "ramrequired": "4", "parameters": "3 billion", "quant": "q4_0", "type": "Replit", "systemPrompt": "", "promptTemplate": "%1", "description": "Trained on subset of the Stack
          • Code completion based
          • Licensed for commercial use
          • WARNING: Not available for chat GUI
          ", "url": "https://gpt4all.io/models/gguf/replit-code-v1_5-3b-newbpe-q4_0.gguf", "chatTemplate": null }, { "order": "t", "md5sum": "70841751ccd95526d3dcfa829e11cd4c", "disableGUI": "true", "name": "Starcoder", "filename": "starcoder-newbpe-q4_0.gguf", "filesize": "8987411904", "requires": "2.6.0", "ramrequired": "4", "parameters": "7 billion", "quant": "q4_0", "type": "Starcoder", "systemPrompt": "", "promptTemplate": "%1", "description": "Trained on subset of the Stack
          • Code completion based
          • WARNING: Not available for chat GUI
          ", "url": "https://gpt4all.io/models/gguf/starcoder-newbpe-q4_0.gguf", "chatTemplate": null }, { "order": "u", "md5sum": "e973dd26f0ffa6e46783feaea8f08c83", "disableGUI": "true", "name": "Rift coder", "filename": "rift-coder-v0-7b-q4_0.gguf", "filesize": "3825903776", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "LLaMA", "systemPrompt": "", "promptTemplate": "%1", "description": "Trained on collection of Python and TypeScript
          • Code completion based
          • WARNING: Not available for chat GUI
          ", "url": "https://gpt4all.io/models/gguf/rift-coder-v0-7b-q4_0.gguf", "chatTemplate": null }, { "order": "v", "md5sum": "e479e6f38b59afc51a470d1953a6bfc7", "disableGUI": "true", "name": "SBert", "filename": "all-MiniLM-L6-v2-f16.gguf", "filesize": "45887744", "requires": "2.5.0", "removedIn": "2.7.4", "ramrequired": "1", "parameters": "40 million", "quant": "f16", "type": "Bert", "embeddingModel": true, "systemPrompt": "", "description": "LocalDocs text embeddings model
            • For use with LocalDocs feature
            • Used for retrieval augmented generation (RAG)", "url": "https://gpt4all.io/models/gguf/all-MiniLM-L6-v2-f16.gguf", "chatTemplate": null }, { "order": "w", "md5sum": "dd90e2cb7f8e9316ac3796cece9883b5", "name": "SBert", "filename": "all-MiniLM-L6-v2.gguf2.f16.gguf", "filesize": "45949216", "requires": "2.7.4", "removedIn": "3.0.0", "ramrequired": "1", "parameters": "40 million", "quant": "f16", "type": "Bert", "embeddingModel": true, "description": "LocalDocs text embeddings model
              • For use with LocalDocs feature
              • Used for retrieval augmented generation (RAG)", "url": "https://gpt4all.io/models/gguf/all-MiniLM-L6-v2.gguf2.f16.gguf", "chatTemplate": null }, { "order": "x", "md5sum": "919de4dd6f25351bcb0223790db1932d", "name": "EM German Mistral", "filename": "em_german_mistral_v01.Q4_0.gguf", "filesize": "4108916352", "requires": "2.5.0", "ramrequired": "8", "parameters": "7 billion", "quant": "q4_0", "type": "Mistral", "description": "Mistral-based model for German-language applications
                • Fast responses
                • Chat based model
                • Trained by ellamind
                • Finetuned on German instruction and chat data
                • Licensed for commercial use
                ", "url": "https://huggingface.co/TheBloke/em_german_mistral_v01-GGUF/resolve/main/em_german_mistral_v01.Q4_0.gguf", "promptTemplate": "USER: %1 ASSISTANT: ", "systemPrompt": "Du bist ein hilfreicher Assistent. ", "chatTemplate": "{%- if messages[0]['role'] == 'system' %}\n {%- set loop_start = 1 %}\n {{- messages[0]['content'] }}\n{%- else %}\n {%- set loop_start = 0 %}\n{%- endif %}\n{%- for message in messages %}\n {%- if loop.index0 >= loop_start %}\n {%- if not loop.first %}\n {{- ' ' }}\n {%- endif %}\n {%- if message['role'] == 'user' %}\n {{- 'USER: ' + message['content'] }}\n {%- elif message['role'] == 'assistant' %}\n {{- 'ASSISTANT: ' + message['content'] }}\n {%- else %}\n {{- raise_exception('After the optional system message, conversation roles must be either user or assistant.') }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {%- if messages %}\n {{- ' ' }}\n {%- endif %}\n {{- 'ASSISTANT:' }}\n{%- endif %}", "systemMessage": "Du bist ein hilfreicher Assistent." 
}, { "order": "y", "md5sum": "60ea031126f82db8ddbbfecc668315d2", "disableGUI": "true", "name": "Nomic Embed Text v1", "filename": "nomic-embed-text-v1.f16.gguf", "filesize": "274290560", "requires": "2.7.4", "ramrequired": "1", "parameters": "137 million", "quant": "f16", "type": "Bert", "embeddingModel": true, "systemPrompt": "", "description": "nomic-embed-text-v1", "url": "https://gpt4all.io/models/gguf/nomic-embed-text-v1.f16.gguf", "chatTemplate": null }, { "order": "z", "md5sum": "a5401e7f7e46ed9fcaed5b60a281d547", "disableGUI": "true", "name": "Nomic Embed Text v1.5", "filename": "nomic-embed-text-v1.5.f16.gguf", "filesize": "274290560", "requires": "2.7.4", "ramrequired": "1", "parameters": "137 million", "quant": "f16", "type": "Bert", "embeddingModel": true, "systemPrompt": "", "description": "nomic-embed-text-v1.5", "url": "https://gpt4all.io/models/gguf/nomic-embed-text-v1.5.f16.gguf", "chatTemplate": null }, { "order": "zzz", "md5sum": "a8c5a783105f87a481543d4ed7d7586d", "name": "Qwen2-1.5B-Instruct", "filename": "qwen2-1_5b-instruct-q4_0.gguf", "filesize": "937532800", "requires": "3.0", "ramrequired": "3", "parameters": "1.5 billion", "quant": "q4_0", "type": "qwen2", "description": "
                • Very fast responses
                • Instruction based model
                • Usage of LocalDocs (RAG): Highly recommended
                • Supports context length of up to 32768
                • Trained and finetuned by Qwen (Alibaba Cloud)
                • License: Apache 2.0
                ", "url": "https://huggingface.co/Qwen/Qwen2-1.5B-Instruct-GGUF/resolve/main/qwen2-1_5b-instruct-q4_0.gguf", "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>", "systemPrompt": "<|im_start|>system\nBelow is an instruction that describes a task. Write a response that appropriately completes the request.<|im_end|>\n", "chatTemplate": "{%- for message in messages %}\n {%- if loop.first and messages[0]['role'] != 'system' %}\n {{- '<|im_start|>system\\nYou are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}" } ] ================================================ FILE: gpt4all-chat/metadata/release.json ================================================ [ { "version": "2.2.2", "notes": "* repeat penalty for both gptj and llama models\n* scroll the context window when conversation reaches context limit\n* persistent thread count setting\n* new default template\n* new settings for model path, repeat penalty\n* bugfix for settings dialog onEditingFinished\n* new tab based settings dialog format\n* bugfix for datalake when conversation contains forbidden json chars\n* new C library API and split the backend into own separate lib for bindings\n* apple signed/notarized dmg installer\n* update llama.cpp submodule to latest\n* bugfix for too large of a prompt\n* support for opt-in only anonymous usage and statistics\n* bugfixes for the model downloader and improve performance\n* various UI bugfixes and enhancements including the send message textarea automatically wrapping by word\n* new startup dialog on first start of a new release displaying release notes and opt-in buttons\n* new logo and icons\n* fixed apple installer so there is now a symlink in the applications folder\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* 
Matthieu Talbot\n* Tim Jobbins\n* chad (eachadea)\n* Community (beta testers, bug reporters)" }, { "version": "2.3.0", "notes": "* repeat penalty for both gptj and llama models\n* scroll the context window when conversation reaches context limit\n* persistent thread count setting\n* new default template\n* new settings for model path, repeat penalty\n* bugfix for settings dialog onEditingFinished\n* new tab based settings dialog format\n* bugfix for datalake when conversation contains forbidden json chars\n* new C library API and split the backend into own separate lib for bindings\n* apple signed/notarized dmg installer\n* update llama.cpp submodule to latest\n* bugfix for too large of a prompt\n* support for opt-in only anonymous usage and statistics\n* bugfixes for the model downloader and improve performance\n* various UI bugfixes and enhancements including the send message textarea automatically wrapping by word\n* new startup dialog on first start of a new release displaying release notes and opt-in buttons\n* new logo and icons\n* fixed apple installer so there is now a symlink in the applications folder\n* fixed bug with versions\n* fixed optout marking\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* Matthieu Talbot\n* Tim Jobbins\n* chad (eachadea)\n* Community (beta testers, bug reporters)" }, { "version": "2.4.0", "notes": "* reverse prompt for both llama and gptj models which should help stop them from repeating the prompt template\n* resumable downloads for models\n* chat list in the drawer drop down\n* add/remove/rename chats\n* persist chats to disk and restore them with full context (WARNING: the average size of each chat on disk is ~1.5GB)\n* NOTE: to turn on the persistent chats feature you need to do so via the settings dialog as it is off by default\n* automatically rename chats using the AI after the first prompt/response pair\n* new usage statistics including more detailed hardware info to help debug problems on older 
hardware\n* fix dialog sizes for those with smaller displays\n* add support for persistent contexts and internal model state to the C api\n* add a confirm button for deletion of chats\n* bugfix for blocking the gui when changing models\n* datalake now captures all conversations when network opt-in is turned on\n* new much shorter prompt template by default\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.1", "notes": "* compress persistent chats and save order of magnitude disk space on some small chats\n* persistent chat files are now stored in same folder as models\n* use a thread for deserializing chats on startup so the gui shows window faster\n* fail gracefully and early when we detect incompatible hardware\n* repeat penalty restore default bugfix\n* new mpt backend for mosaic ml's new base model and chat model\n* add mpt chat and base model to downloads\n* lower memory required for gptj models by using f16 for kv cache\n* better error handling for when a model is deleted by user and persistent chat remains\n* add a user default model setting so the users preferred model comes up on startup\n", "contributors": "* Adam Treat (Nomic AI)\n* Zach Nussbaum (Nomic AI)\n* Aaron Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.2", "notes": "* add webserver feature that offers mirror api to chatgpt on localhost:4891\n* add chatgpt models installed using openai key to chat client gui\n* fixup the memory handling when switching between chats/models to decrease RAM load across the board\n* fix bug in thread safety for mpt model and de-duplicated code\n* uses compact json format for network\n* add remove model option in download dialog\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.3", "notes": "* add webserver feature that offers mirror api to chatgpt on localhost:4891\n* add chatgpt models 
installed using openai key to chat client gui\n* fixup the memory handling when switching between chats/models to decrease RAM load across the board\n* fix bug in thread safety for mpt model and de-duplicated code\n* uses compact json format for network\n* add remove model option in download dialog\n* remove text-davinci-003 as it is not a chat model\n* fix installers on mac and linux to include libllmodel versions\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.4", "notes": "* fix buffer overrun in backend\n* bugfix for browse for model directory\n* dedup of qml code\n* revamp settings dialog UI\n* add localdocs plugin (beta) feature allowing scanning of local docs\n* various other bugfixes and performance improvements\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller\n* Juuso Alasuutari\n* Justin Wang\n* Community (beta testers, bug reporters)" }, { "version": "2.4.5", "notes": "* bugfix for model download remove\n* bugfix for blocking on regenerate\n* lots of various ui improvements enhancements\n* big new change that brings us up2date with llama.cpp/ggml support for latest models\n* advanced avx detection allowing us to fold the two installers into one\n* new logging mechanism that allows for bug reports to have more detail\n* make localdocs work with server mode\n* localdocs fix for stale references after we regenerate\n* fix so that browse to dialog on linux\n* fix so that you can also just add a path to the textfield\n* bugfix for chatgpt and resetting context\n* move models.json to github repo so people can pr suggested new models\n* allow for new models to be directly downloaded from huggingface in said prs\n* better ui for localdocs settings\n* better error handling when model fails to load\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Konstantin Gukov\n* Joseph Mearman\n* Nandakumar\n* 
Chase McDougall\n* mvenditto\n* Andriy Mulyar (Nomic AI)\n* FoivosC\n* Ettore Di Giacinto\n* Tim Miller\n* Peter Gagarinov\n* Community (beta testers, bug reporters)" }, { "version": "2.4.6", "notes": "* bugfix for model download remove\n* bugfix for blocking on regenerate\n* lots of various ui improvements enhancements\n* big new change that brings us up2date with llama.cpp/ggml support for latest models\n* advanced avx detection allowing us to fold the two installers into one\n* new logging mechanism that allows for bug reports to have more detail\n* make localdocs work with server mode\n* localdocs fix for stale references after we regenerate\n* fix so that browse to dialog on linux\n* fix so that you can also just add a path to the textfield\n* bugfix for chatgpt and resetting context\n* move models.json to github repo so people can pr suggested new models\n* allow for new models to be directly downloaded from huggingface in said prs\n* better ui for localdocs settings\n* better error handling when model fails to load\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Konstantin Gukov\n* Joseph Mearman\n* Nandakumar\n* Chase McDougall\n* mvenditto\n* Andriy Mulyar (Nomic AI)\n* FoivosC\n* Ettore Di Giacinto\n* Tim Miller\n* Peter Gagarinov\n* Community (beta testers, bug reporters)" }, { "version": "2.4.7", "notes": "* replit model support\n* macos metal accelerated support\n* fix markdown for localdocs references\n* inline syntax highlighting for python and cpp with more languages coming\n* synced with upstream llama.cpp\n* ui fixes and default generation settings changes\n* backend bugfixes\n* allow for loading files directly from huggingface via TheBloke without name changes\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* Ettore Di Giacinto\n* AMOGUS\n* Felix Zaslavskiy\n* Tim 
Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.8", "notes": "* replit model support\n* macos metal accelerated support\n* fix markdown for localdocs references\n* inline syntax highlighting for python and cpp with more languages coming\n* synced with upstream llama.cpp\n* ui fixes and default generation settings changes\n* backend bugfixes\n* allow for loading files directly from huggingface via TheBloke without name changes\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* Ettore Di Giacinto\n* AMOGUS\n* Felix Zaslavskiy\n* Tim Miller\n* Community (beta testers, bug reporters)" }, { "version": "2.4.9", "notes": "* New GPT4All Falcon model\n* New Orca models\n* Token generation speed is now reported in GUI\n* Bugfix for localdocs references when regenerating\n* General fixes for thread safety\n* Many fixes to UI to add descriptions for error conditions\n* Fixes for saving/reloading chats\n* Complete refactor of the model download dialog with metadata about models available\n* Resume downloads bugfix\n* CORS fix\n* Documentation fixes and typos\n* Latest llama.cpp update\n* Update of replit\n* Force metal setting\n* Fixes for model loading with metal on macOS\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* cosmic-snow\n* AMOGUS\n* Community (beta testers, bug reporters)" }, { "version": "2.4.10", "notes": "* New GPT4All Falcon model\n* New Orca models\n* Token generation speed is now reported in GUI\n* Bugfix for localdocs references when regenerating\n* General fixes for thread safety\n* Many fixes to UI to add descriptions for error conditions\n* Fixes for saving/reloading chats\n* Complete refactor of the model download dialog with metadata about models available\n* Resume downloads bugfix\n* CORS fix\n* Documentation fixes and 
typos\n* Latest llama.cpp update\n* Update of replit\n* Force metal setting\n* Fixes for model loading with metal on macOS\n", "contributors": "* Nils Sauer (Nomic AI)\n* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Richard Guo (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* cosmic-snow\n* AMOGUS\n* Community (beta testers, bug reporters)" }, { "version": "2.4.11", "notes": "* Per model settings\n* Character settings\n* Adding a system prompt\n* Important bugfix for chatgpt install\n* Complete refactor and revamp of settings dialog\n* New syntax highlighting for java, bash, go\n* Use monospace font for syntax highlighting of codeblocks\n* New setting for turning off references in localdocs\n* Fix memory leaks in falcon model\n* Fix for backend memory handling\n* Server mode bugfix\n* Models.json retrieve bugfix\n* Free metal context bugfix\n* Add a close dialog feature to all chat dialogs\n", "contributors": "* Lakshay Kansal (Nomic AI)\n* Matthew Gill\n* Brandon Beiler\n* cosmic-snow\n* Felix Zaslavskiy\n* Andriy Mulyar (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters)" }, { "version": "2.4.12", "notes": "* Fix bad bug that was breaking numerous current installs (sorry folks!)\n* Fix bug with 'browse' button in settings dialog\n* Wayland support on linux\n* Reduce template ui size in settings dialog\n", "contributors": "* Akarshan Biswas\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters)" }, { "version": "2.4.13", "notes": "* Fix bug with prolonging shutdown with generation\n* Fix bug with update model info on deleting chats\n* Fix bug with preventing closing of model download dialog\n* Allow closing the model download dialog\n* Fix numerous bugs with download of models.json and provide backup option\n* Add json and c# highlighting\n* Fix bug with chatgpt crashing\n* Fix bug with chatgpt not working for some keys\n* Fix bug with mixpanel opt outs not counting\n* Fix problem with OOM 
errors causing crash and then repeating on next start\n* Fix default thread setting and provide guardrails\n* Fix tap handler in settings dialog for buttons\n* Fix color of some text fields on macOS for settings dialog\n* Fix problem with startup dialog not closing\n* Provide error dialog for settings file not accessible\n* Try and fix problems with avx-only detection\n* Fix showing error in model downloads unnecessarily\n* Prefer 7b models to load by default\n* Add Wizard v1.1 to download list\n* Rename Orca models to Mini Orca\n* Don't use a system prompt unless model was trained with one by default\n", "contributors": "* Lakshay Kansal (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters)" }, { "version": "2.4.14", "notes": "* Add starcoder model support\n* Add ability to switch between light mode/dark mode\n* Increase the size of fonts in monospace code blocks a bit\n", "contributors": "* Lakshay Kansal (Nomic AI)\n* Adam Treat (Nomic AI)" }, { "version": "2.4.15", "notes": "* Add Vulkan GPU backend which allows inference on AMD, Intel and NVIDIA GPUs\n* Add ability to switch font sizes\n* Various bug fixes\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)\n* Nils Sauer (Nomic AI)\n* Lakshay Kansal (Nomic AI)" }, { "version": "2.4.16", "notes": "* Bugfix for properly falling back to CPU when GPU can't be used\n* Report the actual device we're using\n* Fix context bugs for GPU accelerated models\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)" }, { "version": "2.4.17", "notes": "* Bugfix for properly falling back to CPU when GPU is out of memory\n", "contributors": "* Adam Treat (Nomic AI)\n* Aaron Miller (Nomic AI)" }, { "version": "2.4.18", "notes": "* Bugfix for devices to show up in the settings combobox on application start and not just on model load\n* Send information on requested device and actual device on model load to help assess which model/gpu/os combos 
are working\n", "contributors": "* Adam Treat (Nomic AI)" }, { "version": "2.4.19", "notes": "* Fix a crash on systems with corrupted vulkan drivers or corrupted vulkan dlls\n", "contributors": "* Adam Treat (Nomic AI)" }, { "version": "2.5.0", "notes": "* Major new release supports GGUF models only!\n* New models like Mistral Instruct, Replit 1.5, Rift Coder and more\n* All previous versions of ggml-based models are no longer supported\n* Extensive changes to vulkan support\n* Better GPU error messages\n* Prompt processing on the GPU\n* Save chats now saves to text (less hard drive space)\n* Many more changes\n", "contributors": "* Aaron Miller (Nomic AI)\n* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.5.1", "notes": "* Accessibility fixes\n* Bugfix for crasher on Windows\n", "contributors": "* Aaron Miller (Nomic AI)\n* Jared Van Bortel (Nomic AI)\n* Victor Tsaran \n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.5.2", "notes": "* Support for GGUF v3 models\n* Important fixes for AMD GPUs\n* Don't start recalculating context immediately for saved chats\n* UI fixes for chat name generation\n* UI fixes for leading whitespaces in chat generation\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.5.3", "notes": "* Major feature update for localdocs!\n* Localdocs now uses an embedding model for retrieval augmented generation\n* Localdocs can now search while your collections are indexing\n* You're guaranteed to get hits from localdocs for every prompt you enter\n* Fix: AMD gpu fixes\n* Fix: Better error messages\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.5.4", "notes": "* Major bugfix release with new models!\n* Model: Recently 
released Orca 2 model which does exceptionally well on reasoning tasks\n* Fix: System prompt was not always being honored\n* Fix: Download network retry on cloudflare errors\n", "contributors": "* Adam Treat (Nomic AI)\n* Jared Van Bortel (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.6.1", "notes": "* Update to newer llama.cpp\n* Implemented configurable context length\n* Bugfixes for localdocs\n* Bugfixes for serialization to disk\n* Bugfixes for AVX\n* Bugfixes for Windows builds\n* Bugfixes for context retention and clearing\n* Add a button to collections dialog\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.6.2", "notes": "* Update to latest llama.cpp\n* Update to newly merged vulkan backend\n* Partial GPU offloading support\n* New localdocs speed increases and features\n* New GUI settings option for configuring how many layers to put on GPU\n* New lightmode theme, darkmode theme and legacy theme\n* Lots of UI updates and enhancements\n* Scores of bugfixes for stability and usability\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Karthik Nair\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.0", "notes": "* Add support for twelve new model architectures\n* Including Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, and StableLM\n* Fix for progress bar colors on legacy theme\n* Fix sizing for model download dialog elements\n* Fix dialog sizes to use more screen real estate where available\n* Fix for vram leak when model loading fails\n* Fix for making the collection dialog progress bar more readable\n* Fix for smaller minimum size for main screen\n* Fix for mistral crash\n* Fix for mistral openorca prompt template to ChatML\n* Fix for excluding non-text documents from localdoc scanning\n* Fix for 
scrollbar missing on main conversation\n* Fix accessibility issues for screen readers\n* Fix for not showing the download button when not online\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.1", "notes": "* Update to latest llama.cpp with support for Google Gemma\n* Gemma, Phi and Phi-2, Qwen2, and StableLM are now all GPU accelerated\n* Large revamp of the model loading to support explicit unload/reload\n* Bugfixes for ChatML and improved version of Mistral OpenOrca\n* We no longer load a model by default on application start\n* We no longer load a model by default on chat context switch\n* Fixes for visual artifacts in update reminder dialog\n* Blacklist Intel GPUs for now as we don't support them yet\n* Fixes for binary save/restore of chat\n* Save and restore of window geometry across application starts\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.2", "notes": "* New support for model search/discovery using huggingface search in downloads\n* Support for more model architectures for GPU acceleration\n* Three different crash fixes for corner case settings\n* Add a minp sampling parameter\n* Bert layer norm epsilon value\n* Fix problem with blank lines between reply and next prompt\n", "contributors": "* Christopher Barrera\n* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.3", "notes": "* Fix for network reachability unknown\n* Fix undefined behavior with resetContext\n* Fix ChatGPT which was broken with previous release\n* Fix for clean up of chat llm thread destruction\n* Display of model loading warnings\n* Fix for issue 2080 where the GUI appears to hang when a chat is deleted\n* Fix for issue 2077 better responsiveness of model download dialog when 
download is taking place\n* Fix for issue 2092 don't include models that are disabled for GUI in application default model list\n* Fix for issue 2087 where cloned models were lost and listed in download dialog erroneously\n* Fix for MPT models without duplicated token embd weight\n* New feature with api server port setting\n* Fix for issue 2024 where combobox for model settings uses currently used model by default\n* Clean up settings properly for removed models and don't list stale model settings in download dialog\n* Fix for issue 2105 where the cancel button was not working for discovered model downloads\n", "contributors": "* Christopher Barrera\n* Daniel Alencar\n* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.4", "notes": "— What's New —\n* Add a right-click menu to the chat (by @kryotek777 in PR #2108)\n* Change the left sidebar to stay open (PR #2117)\n* Limit the width of text in the chat (PR #2118)\n* Move to llama.cpp's SBert implementation (PR #2086)\n* Support models provided by the Mistral AI API (by @Olyxz16 in PR #2053)\n* Models List: Add Ghost 7B v0.9.1 (by @lh0x00 in PR #2127)\n* Add Documentation and FAQ links to the New Chat page (by @3Simplex in PR #2183)\n* Models List: Simplify Mistral OpenOrca system prompt (PR #2220)\n* Models List: Add Llama 3 Instruct (PR #2242)\n* Models List: Add Phi-3 Mini Instruct (PR #2252)\n* Improve accuracy of anonymous usage statistics (PR #2238)\n\n— Fixes —\n* Detect unsupported CPUs correctly on Windows (PR #2141)\n* Fix the colors used by the server chat (PR #2150)\n* Fix startup issues when encountering non-Latin characters in paths (PR #2162)\n* Fix issues causing LocalDocs context links to not work sometimes (PR #2218)\n* Fix incorrect display of certain code block syntax in the chat (PR #2232)\n* Fix an issue causing unnecessary indexing of document collections on startup (PR #2236)\n", "contributors": "* Jared Van 
Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Lam Hieu (`@lh0x00`)\n* 3Simplex (`@3Simplex`)\n* Kryotek (`@kryotek777`)\n* Olyxz16 (`@Olyxz16`)\n* Robin Verduijn (`@robinverduijn`)\n* Tim453 (`@Tim453`)\n* Xu Zhen (`@xuzhen`)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.7.5", "notes": "— What's New —\n* Improve accuracy of anonymous usage statistics (PR #2297, PR #2299)\n\n— Fixes —\n* Fix some issues with anonymous usage statistics (PR #2270, PR #2296)\n* Default to GPU with most VRAM on Windows and Linux, not least (PR #2297)\n* Fix initial failure to generate embeddings with Nomic Embed (PR #2284)\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "2.8.0", "notes": "— What's New —\n* Context Menu: Replace \"Select All\" on message with \"Copy Message\" (PR #2324)\n* Context Menu: Hide Copy/Cut when nothing is selected (PR #2324)\n* Improve speed of context switch after quickly switching between several chats (PR #2343)\n* New Chat: Always switch to the new chat when the button is clicked (PR #2330)\n* New Chat: Always scroll to the top of the list when the button is clicked (PR #2330)\n* Update to latest llama.cpp as of May 9, 2024 (PR #2310)\n* **Add support for the llama.cpp CUDA backend** (PR #2310, PR #2357)\n * Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings\n * When in use: Greatly improved prompt processing and generation speed on some devices\n * When in use: GPU support for Q5\\_0, Q5\\_1, Q8\\_0, K-quants, I-quants, and Mixtral\n* Add support for InternLM models (PR #2310)\n\n— Fixes —\n* Do not allow sending a message while the LLM is responding (PR #2323)\n* Fix poor quality of generated chat titles with many models (PR #2322)\n* Set the window icon correctly on Windows (PR #2321)\n* Fix a few memory leaks (PR #2328, PR #2348, PR #2310)\n* Do not crash if a model file has no 
architecture key (PR #2346)\n* Fix several instances of model loading progress displaying incorrectly (PR #2337, PR #2343)\n* New Chat: Fix the new chat being scrolled above the top of the list on startup (PR #2330)\n* macOS: Show a \"Metal\" device option, and actually use the CPU when \"CPU\" is selected (PR #2310)\n* Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (PR #2310)\n* Fix GPT4All.desktop being created by offline installers on macOS (PR #2361)\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Tim453 (`@Tim453`)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "3.0.0", "notes": "— What's New —\n* Complete UI overhaul (PR #2396)\n* LocalDocs improvements (PR #2396)\n * Use nomic-embed-text-v1.5 as local model instead of SBert\n * Ship local model with application instead of downloading afterwards\n * Store embeddings flat in SQLite DB instead of in hnswlib index\n * Do exact KNN search with usearch instead of approximate KNN search with hnswlib\n* Markdown support (PR #2476)\n* Support CUDA/Metal device option for embeddings (PR #2477)\n\n— Fixes —\n* Fix embedding tokenization after PR #2310 (PR #2381)\n* Fix a crash when loading certain models with \"code\" in their name (PR #2382)\n* Fix an embedding crash with large chunk sizes after PR #2310 (PR #2383)\n* Fix inability to load models with non-ASCII path on Windows (PR #2388)\n* CUDA: Do not show non-fatal DLL errors on Windows (PR #2389)\n* LocalDocs fixes (PR #2396)\n * Always use requested number of snippets even if there are better matches in unselected collections\n * Check for deleted files on startup\n* CUDA: Fix PTX errors with some GPT4All builds (PR #2421)\n* Fix blank device in UI after model switch and improve usage stats (PR #2409)\n* Use CPU instead of CUDA backend when GPU loading fails the first time (ngl=0 is not enough) (PR #2477)\n* Fix crash when sending a message greater than n\\_ctx tokens after PR 
#1970 (PR #2498)\n", "contributors": "* Vincent Giardina (Nomic AI)\n* Jared Van Bortel (Nomic AI)\n* John W. Parent (Kitware)\n* Paige Lee (Nomic AI)\n* Max Cembalest (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* Adam Treat (Nomic AI)\n* cosmic-snow (`@cosmic-snow`)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "3.1.0", "notes": "— What's New —\n* Generate suggested follow-up questions feature (#2634)\n\n— What's Changed —\n* Customize combo boxes and context menus to fit the new style (#2535)\n* Improve view bar scaling and Model Settings layout (#2520)\n* Make the logo spin while the model is generating (#2557)\n* Server: Reply to wrong GET/POST method with HTTP 405 instead of 404 (#2615)\n* Update theme for menus (#2578)\n* Move the \"stop\" button to the message box (#2561)\n* Build with CUDA 11.8 for better compatibility (#2639)\n* Make links in latest news section clickable (#2643)\n* Support translation of settings choices (#2667), (#2690)\n* Improve LocalDocs view's error message (by @cosmic-snow in #2679)\n* Ignore case of LocalDocs file extensions (#2642), (#2684)\n* Update llama.cpp to commit 87e397d00 from July 19th (#2694)\n * Add support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support)\n * Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo\n* Show scrollbar in chat collections list as needed (#2691)\n\n— What's Removed —\n* Remove support for GPT-J models (#2676)\n\n— Fixes —\n* Fix placement of thumbs-down and datalake opt-in dialogs (#2540)\n* Select the correct folder with the Linux fallback folder dialog (#2541)\n* Fix clone button sometimes producing blank model info (#2545)\n* Fix jerky chat view scrolling (#2555)\n* Fix \"reload\" showing for chats with missing models (#2520)\n* Fix property binding loop warning (#2601)\n* Fix UI hang with certain chat view content (#2543)\n* Fix crash when Kompute falls back to CPU (#2640)\n* Fix several Vulkan resource 
management issues (#2694)\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* cosmic-snow (`@cosmic-snow`)\n* 3Simplex (`@3Simplex`)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "3.1.1", "notes": "— What's New —\n* Ability to add OpenAI compatible remote models (#2683)\n\n— Fixes —\n* Update llama.cpp to cherry-pick Llama 3.1 RoPE fix. (#2758)\n", "contributors": "* Adam Treat (Nomic AI)\n* Jared Van Bortel (Nomic AI)\n* Shiranui (@supersonictw)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "3.2.0", "notes": "— What's New —\n* Translations for Simplified Chinese, Traditional Chinese, Italian, Portuguese, Romanian, and Spanish\n* Significantly faster context recalculation when context runs out\n* Models no longer stop generating when they run out of context\n* Add Qwen2-1.5B-Instruct to the model list\n\n— Fixes —\n* Fix a CUDA crash with long conversations since v3.1.0\n* Fix \"file(s)\" and \"word(s)\" appearing in UI instead of proper plurals\n* Show the correct icons for LocalDocs sources with uppercase extensions\n* More reliable reverse prompt detection\n* Fix a minor prompting issue introduced in v3.1.0\n* Disallow context shift for chat name and follow-up generation\n* Fix potential incompatibility with macOS 12 and 13\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Riccardo Giovanetti (`@Harvester62`)\n* Victor Emanuel (`@SINAPSA-IC`)\n* Jeremy Tayco (`@jstayco`)\n* Shiranui (`@supersonictw`)\n* Thiago Ramos (`@thiagojramos`)\n* ThiloteE (`@ThiloteE`)\n* Dominik (`@cosmic-snow`)\n* Jack (`@wuodoo`)\n* Community (beta testers, bug reporters, bindings authors)" }, { "version": "3.2.1", "notes": "— Fixes —\n* Fix a potential Vulkan crash on application exit on some Linux systems\n* Fix a bad CUDA build option that led to gibberish on newer NVIDIA GPUs\n", "contributors": "* Jared Van Bortel (Nomic AI)" }, { "version": "3.3.0", "notes": "* 
**UI Improvements**: The minimum window size now adapts to the font size. A few labels and links have been fixed. The Embeddings Device selection of \"Auto\"/\"Application default\" works again. The window icon is now set on Linux. The antenna icon now displays when the API server is listening.\n* **Single Instance**: Only one instance of GPT4All can be opened at a time. This is now enforced.\n* **Greedy Sampling**: Set temperature to zero to enable greedy sampling.\n* **API Server Changes**: The built-in API server now responds correctly to both legacy completions, and chats with message history. Also, it now uses the system prompt configured in the UI.\n* **Translation Improvements**: The Italian, Romanian, and Traditional Chinese translations have been updated.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* 3Simplex (`@3Simplex`)\n* Riccardo Giovanetti (`@Harvester62`)\n* Victor Emanuel (`@SINAPSA-IC`)\n* Dominik (`@cosmic-snow`)\n* Shiranui (`@supersonictw`)" }, { "version": "3.3.1", "notes": "* Fixed a crash when attempting to continue a chat loaded from disk\n* Fixed the local server rejecting min\\_p/top\\_p less than 1\n", "contributors": "* Jared Van Bortel (Nomic AI)" }, { "version": "3.4.0", "notes": "* **Attached Files:** You can now attach a small Microsoft Excel spreadsheet (.xlsx) to a chat message and ask the model about it.\n* **LocalDocs Accuracy:** The LocalDocs algorithm has been enhanced to find more accurate references for some queries.\n* **Word Document Support:** LocalDocs now supports Microsoft Word (.docx) documents natively.\n * **IMPORTANT NOTE:** If .docx files are not found, make sure Settings > LocalDocs > Allowed File Extensions includes \"docx\".\n* **Forgetful Model Fixes:** Issues with the \"Redo last chat response\" button, and with continuing chats from previous sessions, have been fixed.\n* **Chat Saving Improvements:** On exit, GPT4All will no longer save chats that are not new or modified. 
As a bonus, downgrading without losing access to all chats will be possible in the future, should the need arise.\n* **UI Fixes:** The model list no longer scrolls to the top when you start downloading a model.\n* **New Models:** Llama 3.2 Instruct 3B and 1B models are now available in the model list.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Andriy Mulyar (Nomic AI)\n* Ikko Eltociear Ashimine (`@eltociear`)\n* Victor Emanuel (`@SINAPSA-IC`)\n* Shiranui (`@supersonictw`)" }, { "version": "3.4.1", "notes": "* **LocalDocs Fixes:** Several issues with LocalDocs in v3.4.0 have been fixed, including missing words and very slow indexing.\n* **Syntax Highlighting:** Go code is now highlighted with the correct colors.\n* **Cache Fixes:** The model list cache is now stored with a version number, and in a more appropriate directory.\n* **Translation Updates:** The Italian translation has been improved.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* John Parent (Kitware)\n* Riccardo Giovanetti (`@Harvester62`)" }, { "version": "3.4.2", "notes": "* **LocalDocs Fixes:** Several issues with LocalDocs, some of which were introduced in v3.4.0, have been fixed.\n * Fixed the possible use of references from unselected collections.\n * Fixed unnecessary reindexing of files with uppercase extensions.\n * Fixed hybrid search failure due to inconsistent database state.\n * Fully fixed the blank Embeddings Device selection in LocalDocs settings.\n * Fixed LocalDocs indexing of large PDFs making very slow progress or even stalling.\n", "contributors": "* Adam Treat (Nomic AI)\n* Jared Van Bortel (Nomic AI)" }, { "version": "3.5.0", "notes": "* **Message Editing:**\n * You can now edit any message you've sent by clicking the pencil icon below it.\n * You can now redo earlier responses in the conversation.\n* **Templates:** Chat templates have been completely overhauled! They now use Jinja-style syntax. 
You may notice warnings or errors in the UI. Read the linked docs, and if you have any questions, please ask on the Discord.\n* **File Attachments:** Markdown and plain text files are now supported as file attachments.\n* **System Tray:** There is now an option in Application Settings to allow GPT4All to minimize to the system tray instead of closing.\n* **Local API Server:**\n * The API server now supports system messages from the client and no longer uses the system message in settings.\n * You can now send messages to the API server in any order supported by the model instead of just user/assistant pairs.\n* **Translations:** The Italian and Romanian translations have been improved.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Benjamin Gallois (`@bgallois`)\n* Riccardo Giovanetti (`@Harvester62`)\n* Victor Emanuel (`@SINAPSA-IC`)" }, { "version": "3.5.1", "notes": "* **Chat template fixes:** Llama 3.2 models, Nous Hermes 2 Mistral, Mistral OpenOrca, Qwen 2 and remote models\n* **Bugfix:** Fix the default model button so it works again after 3.5.0\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)" }, { "version": "3.5.2", "notes": "* **Model Search:** There are now separate tabs for official and third-party models.\n* **Local Server Fixes:** Several mistakes in v3.5's changes to the API server have been corrected.\n* **Cloned Model Fixes:** The chat template and system message of cloned models now manage their defaults correctly.\n* **Translation Improvements:** The Romanian and Italian translations have been updated.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Riccardo Giovanetti (`@Harvester62`)\n* Victor Emanuel (`@SINAPSA-IC`)" }, { "version": "3.5.3", "notes": "* **LocalDocs Fix:** A serious issue causing LocalDocs to not work properly in v3.5.2 has been fixed.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)" }, { "version": "3.6.0", 
"notes": "* **Reasoner v1:**\n * Built-in JavaScript code interpreter tool.\n * Custom curated model that utilizes the code interpreter to break down, analyze, perform, and verify complex reasoning tasks.\n* **Templates:** Automatically substitute chat templates that are not compatible with Jinja2Cpp in GGUFs.\n* **Fixes:**\n * Remote model template to allow for XML in messages.\n * Jinja2Cpp bug that broke system message detection in chat templates.\n * LocalDocs sources displaying in unconsolidated form after v3.5.0.\n", "contributors": "* Adam Treat (Nomic AI)\n* Jared Van Bortel (Nomic AI)" }, { "version": "3.6.1", "notes": "* **Fixes:**\n * The stop generation button no longer working in v3.6.0.\n * The copy entire conversation button no longer working in v3.6.0.\n", "contributors": "* Adam Treat (Nomic AI)" }, { "version": "3.7.0", "notes": "* **Windows ARM Support:** GPT4All now supports the Windows ARM platform, ensuring compatibility with devices powered by Qualcomm Snapdragon and Microsoft SQ-series processors.\n * **NOTE:** Support for GPU and/or NPU acceleration is not available at this time. Only the CPU will be used to run LLMs.\n * **NOTE:** You must install the new *Windows ARM* version of GPT4All from the website. 
The standard *Windows* version will not work due to emulation limitations.\n* **Fixed Updating on macOS:** The maintenance tool no longer crashes when attempting to update or uninstall GPT4All on Sequoia.\n * **NOTE:** If you have installed the version from the GitHub releases as a workaround for this issue, you can safely uninstall it and switch back to the version from the website.\n* **Fixed Chat Saving on macOS:** Chats now save as expected when the application is quit with Command-Q.\n* **Code Interpreter Improvements:**\n * The behavior when the code takes too long to execute and times out has been improved.\n * console.log now accepts multiple arguments for better compatibility with native JavaScript.\n* **Chat Templating Improvements:**\n * Two crashes and one compatibility issue have been fixed in the chat template parser.\n * The default chat template for EM German Mistral has been fixed.\n * Automatic replacements have been added for five new models as we continue to improve compatibility with common chat templates.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* Riccardo Giovanetti (`@Harvester62`)" }, { "version": "3.8.0", "notes": "* **Native DeepSeek-R1-Distill Support:** GPT4All now has robust support for the DeepSeek-R1 family of distillations.\n * Several model variants are now available on the downloads page.\n * Reasoning (wrapped in \"think\" tags) is displayed similarly to the Reasoner model.\n * The DeepSeek-R1 Qwen pretokenizer is now supported, resolving the loading failure in previous versions.\n * The model is now configured with a GPT4All-compatible prompt template by default.\n* **Chat Templating Overhaul:** The template parser has been *completely* replaced with one that has much better compatibility with common models.\n* **Code Interpreter Fixes:**\n * An issue preventing the code interpreter from logging a single string in v3.7.0 has been fixed.\n * The UI no longer freezes while the code interpreter 
is running a computation.\n* **Local Server Fixes:**\n * An issue preventing the server from using LocalDocs after the first request since v3.5.0 has been fixed.\n * System messages are now correctly hidden from the message history.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* ThiloteE (`@ThiloteE`)" }, { "version": "3.9.0", "notes": "* **LocalDocs Fix:** LocalDocs no longer shows an error on later messages with reasoning models.\n* **DeepSeek Fix:** DeepSeek-R1 reasoning (in 'think' tags) no longer appears in chat names and follow-up questions.\n* **Windows ARM Improvements:**\n * Graphical artifacts on some SoCs have been fixed.\n * A crash when adding a collection of PDFs to LocalDocs has been fixed.\n* **Template Parser Fixes:** Chat templates containing an unclosed comment no longer freeze GPT4All.\n* **New Models:** OLMoE and Granite MoE models are now supported.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* ThiloteE (`@ThiloteE`)" }, { "version": "3.10.0", "notes": "* **Remote Models:**\n * The Add Model page now has a dedicated tab for remote model providers.\n * Groq, OpenAI, and Mistral remote models are now easier to configure.\n* **CUDA Compatibility:** GPUs with CUDA compute capability 5.0 such as the GTX 750 are now supported by the CUDA backend.\n* **New Model:** The non-MoE Granite model is now supported.\n* **Translation Updates:**\n * The Italian translation has been updated.\n * The Simplified Chinese translation has been significantly improved.\n* **Better Chat Templates:** The default chat templates for OLMoE 7B 0924/0125 and Granite 3.1 3B/8B have been improved.\n* **Whitespace Fixes:** DeepSeek-R1-based models now have better whitespace behavior in their output.\n* **Crash Fixes:** Several issues that could potentially cause GPT4All to crash have been fixed.\n", "contributors": "* Jared Van Bortel (Nomic AI)\n* Adam Treat (Nomic AI)\n* ThiloteE (`@ThiloteE`)\n* Lil Bob 
(`@Junior2Ran`)\n* Riccardo Giovanetti (`@Harvester62`)" } ]

================================================
FILE: gpt4all-chat/pyproject.toml
================================================
[tool.pytest.ini_options]
addopts = ['--import-mode=importlib']

[tool.mypy]
files = 'tests/python'
pretty = true
strict = true
warn_unused_ignores = false

[tool.pytype]
inputs = ['tests/python']
jobs = 'auto'
bind_decorated_methods = true
none_is_not_bool = true
overriding_renamed_parameter_count_checks = true
strict_none_binding = true
precise_return = true
# protocols:
# - https://github.com/google/pytype/issues/1423
# - https://github.com/google/pytype/issues/1424
strict_import = true
strict_parameter_checks = true
strict_primitive_comparisons = true
# strict_undefined_checks: too many false positives

[tool.isort]
src_paths = ['tests/python']
line_length = 120
combine_as_imports = true

================================================
FILE: gpt4all-chat/qa_checklist.md
================================================
## QA Checklist

1. Ensure you have a fresh install by **backing up** and then deleting the following directories:

### Windows
* Settings directory: ```C:\Users\{username}\AppData\Roaming\nomic.ai```
* Models directory: ```C:\Users\{username}\AppData\Local\nomic.ai\GPT4All```

### Mac
* Settings directory: ```/Users/{username}/.config/gpt4all.io```
* Models directory: ```/Users/{username}/Library/Application Support/nomic.ai/GPT4All```

### Linux
* Settings directory: ```/home/{username}/.config/nomic.ai```
* Models directory: ```/home/{username}/.local/share/nomic.ai/GPT4All```

^ Note: If you've changed your models directory manually via the settings, you need to back up and delete that one

2. Go through every view, ensure that everything displays correctly, and familiarize yourself with the application flow
3. Navigate to the models view and download Llama 3 Instruct
4. Navigate to the models view, search for "TheBloke mistral 7b", and download "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
5. Navigate to the chat view, open new chats, and load these models
6. Chat with the models and exercise them. Rename the chats. Delete chats. Open new chats. Switch models while in chats.
7. Create a new localdocs collection from a directory of .txt or .pdf files on your hard drive
8. Enable the new collection in chats (especially with Llama 3 Instruct) and exercise the localdocs feature
9. Go to the settings view and explore each setting
10. Remove collections in localdocs and re-add them. Rebuild collections
11. Now shut down the app, go back and restore any previous settings directory or models directory you had from a previous install, and re-test #1 through #11 :)
12. Try to break the app

### EXTRA CREDIT
1. If you have an OpenAI API key, install the GPT-4 model and chat with it
2. If you have a Nomic API key, install the remote Nomic embedding model for localdocs (see if you can discover how to do this)
3. If you have a Python script that targets the OpenAI API, enable server mode and try it
4.
Really try and break the app

All feedback is welcome

================================================
FILE: gpt4all-chat/qml/AddCollectionView.qml
================================================
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts
import QtQuick.Dialogs
import Qt.labs.folderlistmodel
import Qt5Compat.GraphicalEffects
import llm
import chatlistmodel
import download
import modellist
import network
import gpt4all
import mysettings
import localdocs

Rectangle {
    id: addCollectionView

    Theme {
        id: theme
    }

    color: theme.viewBackground

    signal localDocsViewRequested()

    ColumnLayout {
        id: mainArea
        anchors.left: parent.left
        anchors.right: parent.right
        anchors.top: parent.top
        anchors.bottom: parent.bottom
        anchors.margins: 30
        spacing: 20

        RowLayout {
            Layout.fillWidth: true
            Layout.alignment: Qt.AlignTop
            spacing: 50

            MyButton {
                id: backButton
                Layout.alignment: Qt.AlignTop | Qt.AlignLeft
                text: qsTr("\u2190 Existing Collections")
                borderWidth: 0
                backgroundColor: theme.lighterButtonBackground
                backgroundColorHovered: theme.lighterButtonBackgroundHovered
                backgroundRadius: 5
                padding: 15
                topPadding: 8
                bottomPadding: 8
                textColor: theme.lighterButtonForeground
                fontPixelSize: theme.fontSizeLarge
                fontPixelBold: true
                onClicked: {
                    localDocsViewRequested()
                }
            }
        }

        Text {
            id: addDocBanner
            Layout.alignment: Qt.AlignBottom | Qt.AlignHCenter
            horizontalAlignment: Qt.AlignHCenter
            text: qsTr("Add Document Collection")
            font.pixelSize: theme.fontSizeBanner
            color: theme.titleTextColor
        }

        Text {
            Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
            Layout.maximumWidth: addDocBanner.width
            wrapMode: Text.WordWrap
            horizontalAlignment: Text.AlignJustify
            text: qsTr("Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings.")
            font.pixelSize: theme.fontSizeLarger
            color: theme.titleInfoTextColor
        }

        GridLayout {
            id: root
            Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
            rowSpacing: 50
            columnSpacing: 20

            property alias collection: collection.text
            property alias folder_path: folderEdit.text

            MyFolderDialog {
                id: folderDialog
            }

            Label {
                Layout.row: 2
                Layout.column: 0
                text: qsTr("Name")
                font.bold: true
                font.pixelSize: theme.fontSizeLarger
                color: theme.settingsTitleTextColor
            }

            MyTextField {
                id: collection
                Layout.row: 2
                Layout.column: 1
                Layout.minimumWidth: 400
                Layout.alignment: Qt.AlignRight
                horizontalAlignment: Text.AlignJustify
                color: theme.textColor
                font.pixelSize: theme.fontSizeLarge
                placeholderText: qsTr("Collection name...")
                placeholderTextColor: theme.mutedTextColor
                ToolTip.text: qsTr("Name of the collection to add (Required)")
                ToolTip.visible: hovered
                Accessible.role: Accessible.EditableText
                Accessible.name: collection.text
                Accessible.description: ToolTip.text
                function showError() {
                    collection.placeholderTextColor = theme.textErrorColor
                }
                onTextChanged: {
                    collection.placeholderTextColor = theme.mutedTextColor
                }
            }

            Label {
                Layout.row: 3
                Layout.column: 0
                text: qsTr("Folder")
                font.bold: true
                font.pixelSize: theme.fontSizeLarger
                color: theme.settingsTitleTextColor
            }

            RowLayout {
                Layout.row: 3
                Layout.column: 1
                Layout.minimumWidth: 400
                Layout.maximumWidth: 400
                Layout.alignment: Qt.AlignRight
                spacing: 10

                MyDirectoryField {
                    id: folderEdit
                    Layout.fillWidth: true
                    text: root.folder_path
                    placeholderText: qsTr("Folder path...")
                    font.pixelSize: theme.fontSizeLarge
                    placeholderTextColor: theme.mutedTextColor
                    ToolTip.text: qsTr("Folder path to documents (Required)")
                    ToolTip.visible: hovered
                    function showError() {
                        folderEdit.placeholderTextColor = theme.textErrorColor
                    }
                    onTextChanged: {
                        folderEdit.placeholderTextColor = theme.mutedTextColor
                    }
                }

                MySettingsButton {
                    id: browseButton
                    text: qsTr("Browse")
                    onClicked: {
                        folderDialog.openFolderDialog(StandardPaths.writableLocation(StandardPaths.HomeLocation), function(selectedFolder) {
                            root.folder_path = selectedFolder
                        })
                    }
                }
            }

            MyButton {
                Layout.row: 4
                Layout.column: 1
                Layout.alignment: Qt.AlignRight
                text: qsTr("Create Collection")
                onClicked: {
                    var isError = false;
                    if (root.collection === "") {
                        isError = true;
                        collection.showError();
                    }
                    if (root.folder_path === "" || !folderEdit.isValid) {
                        isError = true;
                        folderEdit.showError();
                    }
                    if (isError)
                        return;
                    LocalDocs.addFolder(root.collection, root.folder_path)
                    root.collection = ""
                    root.folder_path = ""
                    collection.clear()
                    localDocsViewRequested()
                }
            }
        }
    }
}

================================================
FILE: gpt4all-chat/qml/AddGPT4AllModelView.qml
================================================
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts
import QtQuick.Dialogs
import Qt.labs.folderlistmodel
import Qt5Compat.GraphicalEffects
import llm
import chatlistmodel
import download
import modellist
import network
import gpt4all
import mysettings
import localdocs

ColumnLayout {
    Layout.fillWidth: true
    Layout.alignment: Qt.AlignTop
    spacing: 5

    Label {
        Layout.topMargin: 0
        Layout.bottomMargin: 25
        Layout.rightMargin: 150 * theme.fontScale
        Layout.alignment: Qt.AlignTop
        Layout.fillWidth: true
        verticalAlignment: Text.AlignTop
        text: qsTr("These models have been specifically configured for use in GPT4All. The first few models on the "
            + "list are known to work the best, but you should only attempt to use models that will fit in your "
            + "available memory.")
        font.pixelSize: theme.fontSizeLarger
        color: theme.textColor
        wrapMode: Text.WordWrap
    }

    Label {
        visible: !ModelList.gpt4AllDownloadableModels.count && !ModelList.asyncModelRequestOngoing
        Layout.fillWidth: true
        Layout.fillHeight: true
        horizontalAlignment: Qt.AlignHCenter
        verticalAlignment: Qt.AlignVCenter
        text: qsTr("Network error: could not retrieve %1").arg("http://gpt4all.io/models/models3.json")
        font.pixelSize: theme.fontSizeLarge
        color: theme.mutedTextColor
    }

    MyBusyIndicator {
        visible: !ModelList.gpt4AllDownloadableModels.count && ModelList.asyncModelRequestOngoing
        running: ModelList.asyncModelRequestOngoing
        Accessible.role: Accessible.Animation
        Layout.alignment: Qt.AlignCenter
        Accessible.name: qsTr("Busy indicator")
        Accessible.description: qsTr("Displayed when the models request is ongoing")
    }

    RowLayout {
        ButtonGroup {
            id: buttonGroup
            exclusive: true
        }
        MyButton {
            text: qsTr("All")
            checked: true
            borderWidth: 0
            backgroundColor: checked ? theme.lightButtonBackground : "transparent"
            backgroundColorHovered: theme.lighterButtonBackgroundHovered
            backgroundRadius: 5
            padding: 15
            topPadding: 8
            bottomPadding: 8
            textColor: theme.lighterButtonForeground
            fontPixelSize: theme.fontSizeLarge
            fontPixelBold: true
            checkable: true
            ButtonGroup.group: buttonGroup
            onClicked: {
                ModelList.gpt4AllDownloadableModels.filter("");
            }
        }
        MyButton {
            text: qsTr("Reasoning")
            borderWidth: 0
            backgroundColor: checked ? theme.lightButtonBackground : "transparent"
            backgroundColorHovered: theme.lighterButtonBackgroundHovered
            backgroundRadius: 5
            padding: 15
            topPadding: 8
            bottomPadding: 8
            textColor: theme.lighterButtonForeground
            fontPixelSize: theme.fontSizeLarge
            fontPixelBold: true
            checkable: true
            ButtonGroup.group: buttonGroup
            onClicked: {
                ModelList.gpt4AllDownloadableModels.filter("#reasoning");
            }
        }
        Layout.bottomMargin: 10
    }

    ScrollView {
        id: scrollView
        ScrollBar.vertical.policy: ScrollBar.AsNeeded
        Layout.fillWidth: true
        Layout.fillHeight: true
        clip: true

        ListView {
            id: modelListView
            model: ModelList.gpt4AllDownloadableModels
            boundsBehavior: Flickable.StopAtBounds
            spacing: 30

            delegate: Rectangle {
                id: delegateItem
                width: modelListView.width
                height: childrenRect.height + 60
                color: theme.conversationBackground
                radius: 10
                border.width: 1
                border.color: theme.controlBorder

                ColumnLayout {
                    anchors.top: parent.top
                    anchors.left: parent.left
                    anchors.right: parent.right
                    anchors.margins: 30

                    Text {
                        Layout.fillWidth: true
                        Layout.alignment: Qt.AlignLeft
                        text: name
                        elide: Text.ElideRight
                        color: theme.titleTextColor
                        font.pixelSize: theme.fontSizeLargest
                        font.bold: true
                        Accessible.role: Accessible.Paragraph
                        Accessible.name: qsTr("Model file")
                        Accessible.description: qsTr("Model file to be downloaded")
                    }

                    Rectangle {
                        Layout.fillWidth: true
                        height: 1
                        color: theme.dividerColor
                    }

                    RowLayout {
                        Layout.topMargin: 10
                        Layout.fillWidth: true

                        Text {
                            id: descriptionText
                            text: description
                            font.pixelSize: theme.fontSizeLarge
                            Layout.fillWidth: true
                            wrapMode: Text.WordWrap
                            textFormat: Text.StyledText
                            color: theme.textColor
                            linkColor: theme.textColor
                            Accessible.role: Accessible.Paragraph
                            Accessible.name: qsTr("Description")
                            Accessible.description: qsTr("File description")
                            onLinkActivated: function(link) { Qt.openUrlExternally(link); }
                            MouseArea {
                                anchors.fill: parent
                                acceptedButtons: Qt.NoButton // pass clicks to parent
                                cursorShape: parent.hoveredLink ? Qt.PointingHandCursor : Qt.ArrowCursor
                            }
                        }

                        // FIXME Need to overhaul design here which must take into account
                        // features not present in current figma including:
                        // * Ability to cancel a current download
                        // * Ability to resume a download
                        // * The presentation of an error if encountered
                        // * Whether to show already installed models
                        // * Install of remote models with API keys
                        // * The presentation of the progress bar
                        Rectangle {
                            id: actionBox
                            width: childrenRect.width + 20
                            color: "transparent"
                            border.width: 1
                            border.color: theme.dividerColor
                            radius: 10
                            Layout.rightMargin: 20
                            Layout.bottomMargin: 20
                            Layout.minimumHeight: childrenRect.height + 20
                            Layout.alignment: Qt.AlignRight | Qt.AlignTop

                            ColumnLayout {
                                spacing: 0

                                MySettingsButton {
                                    id: downloadButton
                                    text: isDownloading ? qsTr("Cancel") : isIncomplete ? qsTr("Resume") : qsTr("Download")
                                    font.pixelSize: theme.fontSizeLarge
                                    Layout.topMargin: 20
                                    Layout.leftMargin: 20
                                    Layout.minimumWidth: 200
                                    Layout.fillWidth: true
                                    Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
                                    visible: !installed && !calcHash && downloadError === ""
                                    Accessible.description: qsTr("Stop/restart/start the download")
                                    onClicked: {
                                        if (!isDownloading) {
                                            Download.downloadModel(filename);
                                        } else {
                                            Download.cancelDownload(filename);
                                        }
                                    }
                                }

                                MySettingsDestructiveButton {
                                    id: removeButton
                                    text: qsTr("Remove")
                                    Layout.topMargin: 20
                                    Layout.leftMargin: 20
                                    Layout.minimumWidth: 200
                                    Layout.fillWidth: true
                                    Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
                                    visible: !isDownloading && (installed || isIncomplete)
                                    Accessible.description: qsTr("Remove model from filesystem")
                                    onClicked: {
                                        Download.removeModel(filename);
                                    }
                                }

                                ColumnLayout {
                                    spacing: 0
                                    Label {
                                        Layout.topMargin: 20
                                        Layout.leftMargin: 20
                                        visible: downloadError !== ""
                                        textFormat: Text.StyledText
                                        text: qsTr("Error")
                                        color: theme.textColor
                                        font.pixelSize: theme.fontSizeLarge
                                        linkColor: theme.textErrorColor
                                        Accessible.role: Accessible.Paragraph
                                        Accessible.name: text
                                        Accessible.description: qsTr("Describes an error that occurred when downloading")
                                        onLinkActivated: {
                                            downloadingErrorPopup.text = downloadError;
                                            downloadingErrorPopup.open();
                                        }
                                    }

                                    Label {
                                        visible: LLM.systemTotalRAMInGB() < ramrequired
                                        Layout.topMargin: 20
                                        Layout.leftMargin: 20
                                        Layout.maximumWidth: 300
                                        textFormat: Text.StyledText
                                        text: qsTr("WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).").arg(ramrequired).arg(LLM.systemTotalRAMInGBString())
                                        color: theme.textErrorColor
                                        font.pixelSize: theme.fontSizeLarge
                                        wrapMode: Text.WordWrap
                                        Accessible.role: Accessible.Paragraph
                                        Accessible.name: text
                                        Accessible.description: qsTr("Error for incompatible hardware")
                                        onLinkActivated: {
                                            downloadingErrorPopup.text = downloadError;
                                            downloadingErrorPopup.open();
                                        }
                                    }
                                }

                                ColumnLayout {
                                    visible: isDownloading && !calcHash
                                    Layout.topMargin: 20
                                    Layout.leftMargin: 20
                                    Layout.minimumWidth: 200
                                    Layout.fillWidth: true
                                    Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
                                    spacing: 20

                                    ProgressBar {
                                        id: itemProgressBar
                                        Layout.fillWidth: true
                                        width: 200
                                        value: bytesReceived / bytesTotal
                                        background: Rectangle {
                                            implicitHeight: 45
                                            color: theme.progressBackground
                                            radius: 3
                                        }
                                        contentItem: Item {
                                            implicitHeight: 40
                                            Rectangle {
                                                width: itemProgressBar.visualPosition * parent.width
                                                height: parent.height
                                                radius: 2
                                                color: theme.progressForeground
                                            }
                                        }
                                        Accessible.role: Accessible.ProgressBar
                                        Accessible.name: qsTr("Download progressBar")
                                        Accessible.description: qsTr("Shows the progress made in the download")
                                    }

                                    Label {
                                        id: speedLabel
                                        color: theme.textColor
                                        Layout.alignment: Qt.AlignRight
                                        text: speed
                                        font.pixelSize: theme.fontSizeLarge
                                        Accessible.role: Accessible.Paragraph
                                        Accessible.name: qsTr("Download speed")
                                        Accessible.description: qsTr("Download speed in bytes/kilobytes/megabytes per second")
                                    }
                                }

                                RowLayout {
                                    visible: calcHash
                                    Layout.topMargin: 20
                                    Layout.leftMargin: 20
                                    Layout.minimumWidth: 200
                                    Layout.maximumWidth: 200
                                    Layout.fillWidth: true
                                    Layout.alignment: Qt.AlignTop | Qt.AlignHCenter
                                    clip: true
                                    Label
{ id: calcHashLabel color: theme.textColor text: qsTr("Calculating...") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Whether the file hash is being calculated") } MyBusyIndicator { id: busyCalcHash running: calcHash Accessible.role: Accessible.Animation Accessible.name: qsTr("Busy indicator") Accessible.description: qsTr("Displayed when the file hash is being calculated") } } } } } Item { Layout.minimumWidth: childrenRect.width Layout.minimumHeight: childrenRect.height Layout.bottomMargin: 10 RowLayout { id: paramRow anchors.centerIn: parent ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("File size") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: filesize color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("RAM required") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: ramrequired >= 0 ? qsTr("%1 GB").arg(ramrequired) : qsTr("?") color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Parameters") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: parameters !== "" ? 
parameters : qsTr("?") color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Quant") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: quant color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Type") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: type color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } } Rectangle { color: "transparent" anchors.fill: paramRow border.color: theme.dividerColor border.width: 1 radius: 10 } } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } } } } } } ================================================ FILE: gpt4all-chat/qml/AddHFModelView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import Qt.labs.folderlistmodel import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings import localdocs ColumnLayout { Layout.fillWidth: true Layout.fillHeight: true Layout.alignment: Qt.AlignTop spacing: 5 Label { Layout.topMargin: 0 Layout.bottomMargin: 25 Layout.rightMargin: 150 * theme.fontScale Layout.alignment: Qt.AlignTop Layout.fillWidth: true verticalAlignment: Text.AlignTop text: qsTr("Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these " + "will work. 
Many will require additional configuration before they can be used.") font.pixelSize: theme.fontSizeLarger color: theme.textColor wrapMode: Text.WordWrap } RowLayout { Layout.fillWidth: true Layout.fillHeight: true Layout.alignment: Qt.AlignCenter Layout.margins: 0 spacing: 10 MyTextField { id: discoverField property string textBeingSearched: "" readOnly: ModelList.discoverInProgress Layout.alignment: Qt.AlignCenter Layout.fillWidth: true font.pixelSize: theme.fontSizeLarger placeholderText: qsTr("Discover and download models by keyword search...") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Text field for discovering and filtering downloadable models") Connections { target: ModelList function onDiscoverInProgressChanged() { if (ModelList.discoverInProgress) { discoverField.textBeingSearched = discoverField.text; discoverField.text = qsTr("Searching \u00B7 %1").arg(discoverField.textBeingSearched); } else { discoverField.text = discoverField.textBeingSearched; discoverField.textBeingSearched = ""; } } } background: ProgressBar { id: discoverProgressBar indeterminate: ModelList.discoverInProgress && ModelList.discoverProgress === 0.0 value: ModelList.discoverProgress background: Rectangle { color: theme.controlBackground border.color: theme.controlBorder radius: 10 } contentItem: Item { Rectangle { visible: ModelList.discoverInProgress anchors.bottom: parent.bottom width: discoverProgressBar.visualPosition * parent.width height: 10 radius: 2 color: theme.progressForeground } } } Keys.onReturnPressed: (event)=> { if (event.modifiers & Qt.ControlModifier || event.modifiers & Qt.ShiftModifier) event.accepted = false; else { editingFinished(); sendDiscovery() } } function sendDiscovery() { ModelList.huggingFaceDownloadableModels.discoverAndFilter(discoverField.text); } RowLayout { spacing: 0 anchors.right: discoverField.right anchors.verticalCenter: discoverField.verticalCenter anchors.rightMargin: 15 visible: 
!ModelList.discoverInProgress MyMiniButton { id: clearDiscoverButton backgroundColor: theme.textColor backgroundColorHovered: theme.iconBackgroundDark visible: discoverField.text !== "" source: "qrc:/gpt4all/icons/close.svg" onClicked: { discoverField.text = "" discoverField.sendDiscovery() // should clear results } } MyMiniButton { backgroundColor: theme.textColor backgroundColorHovered: theme.iconBackgroundDark source: "qrc:/gpt4all/icons/settings.svg" onClicked: { discoveryTools.visible = !discoveryTools.visible } } MyMiniButton { id: sendButton enabled: !ModelList.discoverInProgress backgroundColor: theme.textColor backgroundColorHovered: theme.iconBackgroundDark source: "qrc:/gpt4all/icons/send_message.svg" Accessible.name: qsTr("Initiate model discovery and filtering") Accessible.description: qsTr("Triggers discovery and filtering of models") onClicked: { discoverField.sendDiscovery() } } } } } RowLayout { id: discoveryTools Layout.fillWidth: true Layout.alignment: Qt.AlignCenter Layout.margins: 0 spacing: 20 visible: false MyComboBox { id: comboSort model: ListModel { ListElement { name: qsTr("Default") } ListElement { name: qsTr("Likes") } ListElement { name: qsTr("Downloads") } ListElement { name: qsTr("Recent") } } currentIndex: ModelList.discoverSort contentItem: Text { anchors.horizontalCenter: parent.horizontalCenter rightPadding: 30 color: theme.textColor text: { return qsTr("Sort by: %1").arg(comboSort.displayText) } font.pixelSize: theme.fontSizeLarger verticalAlignment: Text.AlignVCenter horizontalAlignment: Text.AlignHCenter elide: Text.ElideRight } onActivated: function (index) { ModelList.discoverSort = index; } } MyComboBox { id: comboSortDirection model: ListModel { ListElement { name: qsTr("Asc") } ListElement { name: qsTr("Desc") } } currentIndex: { if (ModelList.discoverSortDirection === 1) return 0 else return 1; } contentItem: Text { anchors.horizontalCenter: parent.horizontalCenter rightPadding: 30 color: theme.textColor text: { return 
qsTr("Sort dir: %1").arg(comboSortDirection.displayText) } font.pixelSize: theme.fontSizeLarger verticalAlignment: Text.AlignVCenter horizontalAlignment: Text.AlignHCenter elide: Text.ElideRight } onActivated: function (index) { if (index === 0) ModelList.discoverSortDirection = 1; else ModelList.discoverSortDirection = -1; } } MyComboBox { id: comboLimit model: ListModel { ListElement { name: "5" } ListElement { name: "10" } ListElement { name: "20" } ListElement { name: "50" } ListElement { name: "100" } ListElement { name: qsTr("None") } } currentIndex: { if (ModelList.discoverLimit === 5) return 0; else if (ModelList.discoverLimit === 10) return 1; else if (ModelList.discoverLimit === 20) return 2; else if (ModelList.discoverLimit === 50) return 3; else if (ModelList.discoverLimit === 100) return 4; else if (ModelList.discoverLimit === -1) return 5; } contentItem: Text { anchors.horizontalCenter: parent.horizontalCenter rightPadding: 30 color: theme.textColor text: { return qsTr("Limit: %1").arg(comboLimit.displayText) } font.pixelSize: theme.fontSizeLarger verticalAlignment: Text.AlignVCenter horizontalAlignment: Text.AlignHCenter elide: Text.ElideRight } onActivated: function (index) { switch (index) { case 0: ModelList.discoverLimit = 5; break; case 1: ModelList.discoverLimit = 10; break; case 2: ModelList.discoverLimit = 20; break; case 3: ModelList.discoverLimit = 50; break; case 4: ModelList.discoverLimit = 100; break; case 5: ModelList.discoverLimit = -1; break; } } } } ScrollView { id: scrollView ScrollBar.vertical.policy: ScrollBar.AsNeeded Layout.fillWidth: true Layout.fillHeight: true clip: true ListView { id: modelListView model: ModelList.huggingFaceDownloadableModels boundsBehavior: Flickable.StopAtBounds spacing: 30 delegate: Rectangle { id: delegateItem width: modelListView.width height: childrenRect.height + 60 color: theme.conversationBackground radius: 10 border.width: 1 border.color: theme.controlBorder ColumnLayout { anchors.top: parent.top 
anchors.left: parent.left anchors.right: parent.right anchors.margins: 30 Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: name elide: Text.ElideRight color: theme.titleTextColor font.pixelSize: theme.fontSizeLargest font.bold: true Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Model file") Accessible.description: qsTr("Model file to be downloaded") } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } RowLayout { Layout.topMargin: 10 Layout.fillWidth: true Text { id: descriptionText text: description font.pixelSize: theme.fontSizeLarge Layout.fillWidth: true wrapMode: Text.WordWrap textFormat: Text.StyledText color: theme.textColor linkColor: theme.textColor Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Description") Accessible.description: qsTr("File description") onLinkActivated: function(link) { Qt.openUrlExternally(link); } MouseArea { anchors.fill: parent acceptedButtons: Qt.NoButton // pass clicks to parent cursorShape: parent.hoveredLink ? Qt.PointingHandCursor : Qt.ArrowCursor } } // FIXME Need to overhaul design here which must take into account // features not present in current figma including: // * Ability to cancel a current download // * Ability to resume a download // * The presentation of an error if encountered // * Whether to show already installed models // * Install of remote models with API keys // * The presentation of the progress bar Rectangle { id: actionBox width: childrenRect.width + 20 color: "transparent" border.width: 1 border.color: theme.dividerColor radius: 10 Layout.rightMargin: 20 Layout.bottomMargin: 20 Layout.minimumHeight: childrenRect.height + 20 Layout.alignment: Qt.AlignRight | Qt.AlignTop ColumnLayout { spacing: 0 MySettingsButton { id: downloadButton text: isDownloading ? qsTr("Cancel") : isIncomplete ? 
qsTr("Resume") : qsTr("Download") font.pixelSize: theme.fontSizeLarge Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter visible: !isOnline && !installed && !calcHash && downloadError === "" Accessible.description: qsTr("Stop/restart/start the download") onClicked: { if (!isDownloading) { Download.downloadModel(filename); } else { Download.cancelDownload(filename); } } } MySettingsDestructiveButton { id: removeButton text: qsTr("Remove") Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter visible: !isDownloading && (installed || isIncomplete) Accessible.description: qsTr("Remove model from filesystem") onClicked: { Download.removeModel(filename); } } MySettingsButton { id: installButton visible: !installed && isOnline Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter text: qsTr("Install") font.pixelSize: theme.fontSizeLarge onClicked: { var apiKeyText = apiKey.text.trim(), baseUrlText = baseUrl.text.trim(), modelNameText = modelName.text.trim(); var apiKeyOk = apiKeyText !== "", baseUrlOk = !isCompatibleApi || baseUrlText !== "", modelNameOk = !isCompatibleApi || modelNameText !== ""; if (!apiKeyOk) apiKey.showError(); if (!baseUrlOk) baseUrl.showError(); if (!modelNameOk) modelName.showError(); if (!apiKeyOk || !baseUrlOk || !modelNameOk) return; if (!isCompatibleApi) Download.installModel( filename, apiKeyText, ); else Download.installCompatibleModel( modelNameText, apiKeyText, baseUrlText, ); } Accessible.role: Accessible.Button Accessible.name: qsTr("Install") Accessible.description: qsTr("Install online model") } ColumnLayout { spacing: 0 Label { Layout.topMargin: 20 Layout.leftMargin: 20 visible: downloadError !== "" textFormat: Text.StyledText text: qsTr("Error") color: theme.textColor 
font.pixelSize: theme.fontSizeLarge linkColor: theme.textErrorColor Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Describes an error that occurred when downloading") onLinkActivated: { downloadingErrorPopup.text = downloadError; downloadingErrorPopup.open(); } } Label { visible: LLM.systemTotalRAMInGB() < ramrequired Layout.topMargin: 20 Layout.leftMargin: 20 Layout.maximumWidth: 300 textFormat: Text.StyledText text: qsTr("WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).").arg(ramrequired).arg(LLM.systemTotalRAMInGBString()) color: theme.textErrorColor font.pixelSize: theme.fontSizeLarge wrapMode: Text.WordWrap Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Error for incompatible hardware") onLinkActivated: { downloadingErrorPopup.text = downloadError; downloadingErrorPopup.open(); } } } ColumnLayout { visible: isDownloading && !calcHash Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter spacing: 20 ProgressBar { id: itemProgressBar Layout.fillWidth: true width: 200 value: bytesReceived / bytesTotal background: Rectangle { implicitHeight: 45 color: theme.progressBackground radius: 3 } contentItem: Item { implicitHeight: 40 Rectangle { width: itemProgressBar.visualPosition * parent.width height: parent.height radius: 2 color: theme.progressForeground } } Accessible.role: Accessible.ProgressBar Accessible.name: qsTr("Download progressBar") Accessible.description: qsTr("Shows the progress made in the download") } Label { id: speedLabel color: theme.textColor Layout.alignment: Qt.AlignRight text: speed font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Download speed") Accessible.description: qsTr("Download speed in bytes/kilobytes/megabytes per second") } } RowLayout { visible: 
calcHash Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.maximumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter clip: true Label { id: calcHashLabel color: theme.textColor text: qsTr("Calculating...") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Whether the file hash is being calculated") } MyBusyIndicator { id: busyCalcHash running: calcHash Accessible.role: Accessible.Animation Accessible.name: qsTr("Busy indicator") Accessible.description: qsTr("Displayed when the file hash is being calculated") } } MyTextField { id: apiKey visible: !installed && isOnline Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $API_KEY is empty.")); apiKey.placeholderTextColor = theme.textErrorColor; } onTextChanged: { apiKey.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $API_KEY") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Text field for entering the API key") } MyTextField { id: baseUrl visible: !installed && isOnline && isCompatibleApi Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $BASE_URL is empty.")); baseUrl.placeholderTextColor = theme.textErrorColor; } onTextChanged: { baseUrl.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $BASE_URL") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Text field for entering the base URL") } MyTextField { id: modelName visible: !installed && isOnline && isCompatibleApi Layout.topMargin: 20 Layout.leftMargin: 20
Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $MODEL_NAME is empty.")); modelName.placeholderTextColor = theme.textErrorColor; } onTextChanged: { modelName.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $MODEL_NAME") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Text field for entering the model name") } } } } Item { Layout.minimumWidth: childrenRect.width Layout.minimumHeight: childrenRect.height Layout.bottomMargin: 10 RowLayout { id: paramRow anchors.centerIn: parent ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("File size") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: filesize color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Quant") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: quant color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Type") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: type color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } } Rectangle { color: "transparent" anchors.fill: paramRow border.color: theme.dividerColor border.width: 1 radius: 10 } } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } } } } } } ================================================ FILE:
gpt4all-chat/qml/AddModelView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import Qt.labs.folderlistmodel import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings import localdocs Rectangle { id: addModelView Theme { id: theme } color: theme.viewBackground signal modelsViewRequested() ToastManager { id: messageToast } PopupDialog { id: downloadingErrorPopup anchors.centerIn: parent shouldTimeOut: false } ColumnLayout { id: mainArea anchors.left: parent.left anchors.right: parent.right anchors.top: parent.top anchors.bottom: parent.bottom anchors.margins: 30 spacing: 10 ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignTop spacing: 10 MyButton { id: backButton Layout.alignment: Qt.AlignTop | Qt.AlignLeft text: qsTr("\u2190 Existing Models") borderWidth: 0 backgroundColor: theme.lighterButtonBackground backgroundColorHovered: theme.lighterButtonBackgroundHovered backgroundRadius: 5 padding: 15 topPadding: 8 bottomPadding: 8 textColor: theme.lighterButtonForeground fontPixelSize: theme.fontSizeLarge fontPixelBold: true onClicked: { modelsViewRequested() } } Text { id: welcome text: qsTr("Explore Models") font.pixelSize: theme.fontSizeBanner color: theme.titleTextColor } } RowLayout { id: bar implicitWidth: 600 spacing: 10 MyTabButton { text: qsTr("GPT4All") isSelected: gpt4AllModelView.isShown() onPressed: { gpt4AllModelView.show(); } } MyTabButton { text: qsTr("Remote Providers") isSelected: remoteModelView.isShown() onPressed: { remoteModelView.show(); } } MyTabButton { text: qsTr("HuggingFace") isSelected: huggingfaceModelView.isShown() onPressed: { huggingfaceModelView.show(); } } } StackLayout { id: stackLayout Layout.fillWidth: true Layout.fillHeight: true AddGPT4AllModelView { id: gpt4AllModelView Layout.fillWidth: true 
Layout.fillHeight: true function show() { stackLayout.currentIndex = 0; } function isShown() { return stackLayout.currentIndex === 0; } } AddRemoteModelView { id: remoteModelView Layout.fillWidth: true Layout.fillHeight: true function show() { stackLayout.currentIndex = 1; } function isShown() { return stackLayout.currentIndex === 1; } } AddHFModelView { id: huggingfaceModelView Layout.fillWidth: true Layout.fillHeight: true // FIXME: This generates a warning and should not be used inside a layout, but without // it the text field inside this qml does not display at full width so it looks like // a bug in stacklayout anchors.fill: parent function show() { stackLayout.currentIndex = 2; } function isShown() { return stackLayout.currentIndex === 2; } } } } Connections { target: Download function onToastMessage(message) { messageToast.show(message); } } } ================================================ FILE: gpt4all-chat/qml/AddRemoteModelView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import Qt.labs.folderlistmodel import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings import localdocs ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignTop spacing: 5 Label { Layout.topMargin: 0 Layout.bottomMargin: 25 Layout.rightMargin: 150 * theme.fontScale Layout.alignment: Qt.AlignTop Layout.fillWidth: true verticalAlignment: Text.AlignTop text: qsTr("Various remote model providers that use network resources for inference.") font.pixelSize: theme.fontSizeLarger color: theme.textColor wrapMode: Text.WordWrap } ScrollView { id: scrollView ScrollBar.vertical.policy: ScrollBar.AsNeeded Layout.fillWidth: true Layout.fillHeight: true contentWidth: availableWidth clip: true Flow { anchors.left: parent.left anchors.right: parent.right spacing: 20 
bottomPadding: 20 property int childWidth: 330 * theme.fontScale property int childHeight: 400 + 166 * theme.fontScale RemoteModelCard { width: parent.childWidth height: parent.childHeight providerBaseUrl: "https://api.groq.com/openai/v1/" providerName: qsTr("Groq") providerImage: "qrc:/gpt4all/icons/groq.svg" providerDesc: qsTr('Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.\n\nGet your API key: https://groq.com/') modelWhitelist: [ // last updated 2025-02-24 "deepseek-r1-distill-llama-70b", "deepseek-r1-distill-qwen-32b", "gemma2-9b-it", "llama-3.1-8b-instant", "llama-3.2-1b-preview", "llama-3.2-3b-preview", "llama-3.3-70b-specdec", "llama-3.3-70b-versatile", "llama3-70b-8192", "llama3-8b-8192", "mixtral-8x7b-32768", "qwen-2.5-32b", "qwen-2.5-coder-32b", ] }
RemoteModelCard { width: parent.childWidth height: parent.childHeight providerBaseUrl: "https://api.openai.com/v1/" providerName: qsTr("OpenAI") providerImage: "qrc:/gpt4all/icons/openai.svg" providerDesc: qsTr('OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.\n\nGet your API key: https://openai.com/') modelWhitelist: [ // last updated 2025-02-24 "gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4", "gpt-4-32k", "gpt-4-turbo", "gpt-4o", ] }
RemoteModelCard { width: parent.childWidth height: parent.childHeight providerBaseUrl: "https://api.mistral.ai/v1/" providerName: qsTr("Mistral") providerImage: "qrc:/gpt4all/icons/mistral.svg" providerDesc: qsTr('Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.\n\nGet your API key: https://mistral.ai/') modelWhitelist: [ // last updated 2025-02-24 "codestral-2405", "codestral-2411-rc5", "codestral-2412", "codestral-2501", "codestral-latest", "codestral-mamba-2407", "codestral-mamba-latest", "ministral-3b-2410", "ministral-3b-latest", "ministral-8b-2410", "ministral-8b-latest", "mistral-large-2402", "mistral-large-2407", "mistral-large-2411", "mistral-large-latest", "mistral-medium-2312", "mistral-medium-latest", "mistral-saba-2502", "mistral-saba-latest", "mistral-small-2312", "mistral-small-2402", "mistral-small-2409", "mistral-small-2501", "mistral-small-latest", "mistral-tiny-2312", "mistral-tiny-2407", "mistral-tiny-latest", "open-codestral-mamba", "open-mistral-7b", "open-mistral-nemo", "open-mistral-nemo-2407", "open-mixtral-8x22b", "open-mixtral-8x22b-2404", "open-mixtral-8x7b", ] }
RemoteModelCard { width: parent.childWidth height: parent.childHeight providerIsCustom: true providerName: qsTr("Custom") providerImage: "qrc:/gpt4all/icons/antenna_3.svg" providerDesc: qsTr("The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. 
This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here.") } } } } ================================================ FILE: gpt4all-chat/qml/ApplicationSettings.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import modellist import mysettings import network import llm MySettingsTab { onRestoreDefaults: { MySettings.restoreApplicationDefaults(); } title: qsTr("Application") NetworkDialog { id: networkDialog anchors.centerIn: parent width: Math.min(1024, window.width - (window.width * .2)) height: Math.min(600, window.height - (window.height * .2)) Item { Accessible.role: Accessible.Dialog Accessible.name: qsTr("Network dialog") Accessible.description: qsTr("opt-in to share feedback/conversations") } } Dialog { id: checkForUpdatesError anchors.centerIn: parent modal: false padding: 20 width: 40 + 400 * theme.fontScale Text { anchors.fill: parent horizontalAlignment: Text.AlignJustify text: qsTr("ERROR: Update system could not find the MaintenanceTool used to check for updates!\n\n" + "Did you install this application using the online installer? If so, the MaintenanceTool " + "executable should be located one directory above where this application resides on your " + "filesystem.\n\nIf you can't start it manually, then I'm afraid you'll have to reinstall.") wrapMode: Text.WordWrap color: theme.textErrorColor font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Dialog Accessible.name: text Accessible.description: qsTr("Error dialog") } background: Rectangle { anchors.fill: parent color: theme.containerBackground border.width: 1 border.color: theme.dialogBorder radius: 10 } } contentItem: GridLayout { id: applicationSettingsTabInner columns: 3 rowSpacing: 30 columnSpacing: 10 Label { Layout.row: 0 Layout.column: 0 Layout.bottomMargin: 10 color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeBannerSmall font.bold: true text: qsTr("Application Settings") } ColumnLayout { Layout.row: 1 Layout.column: 0 Layout.columnSpan: 3 Layout.fillWidth: true spacing: 10 Label { color: theme.styledTextColor font.pixelSize: theme.fontSizeLarge font.bold: true text: qsTr("General") } Rectangle { Layout.fillWidth: true height: 1 color: theme.settingsDivider } } MySettingsLabel { id: themeLabel text: qsTr("Theme") helpText: qsTr("The application color scheme.") Layout.row: 2 Layout.column: 0 } MyComboBox { id: themeBox Layout.row: 2 Layout.column: 2 Layout.minimumWidth: 200 Layout.maximumWidth: 200 Layout.fillWidth: false Layout.alignment: Qt.AlignRight // NOTE: indices match values of ChatTheme enum, keep them in sync model: ListModel { ListElement { name: qsTr("Light") } ListElement { name: qsTr("Dark") } ListElement { name: qsTr("LegacyDark") } } Accessible.name: themeLabel.text Accessible.description: themeLabel.helpText function updateModel() { themeBox.currentIndex = MySettings.chatTheme; } Component.onCompleted: { themeBox.updateModel() } Connections { target: MySettings function onChatThemeChanged() { themeBox.updateModel() } } onActivated: { MySettings.chatTheme = themeBox.currentIndex } } MySettingsLabel { id: fontLabel text: qsTr("Font Size") helpText: qsTr("The size of text in the application.") Layout.row: 3 
                Layout.column: 0
            }

            MyComboBox {
                id: fontBox
                Layout.row: 3
                Layout.column: 2
                Layout.minimumWidth: 200
                Layout.maximumWidth: 200
                Layout.fillWidth: false
                Layout.alignment: Qt.AlignRight
                // NOTE: indices match values of FontSize enum, keep them in sync
                model: ListModel {
                    ListElement { name: qsTr("Small") }
                    ListElement { name: qsTr("Medium") }
                    ListElement { name: qsTr("Large") }
                }
                Accessible.name: fontLabel.text
                Accessible.description: fontLabel.helpText
                function updateModel() {
                    fontBox.currentIndex = MySettings.fontSize;
                }
                Component.onCompleted: {
                    fontBox.updateModel()
                }
                Connections {
                    target: MySettings
                    function onFontSizeChanged() {
                        fontBox.updateModel()
                    }
                }
                onActivated: {
                    MySettings.fontSize = fontBox.currentIndex
                }
            }

            MySettingsLabel {
                id: languageLabel
                visible: MySettings.uiLanguages.length > 1
                text: qsTr("Language and Locale")
                helpText: qsTr("The language and locale you wish to use.")
                Layout.row: 4
                Layout.column: 0
            }

            MyComboBox {
                id: languageBox
                visible: MySettings.uiLanguages.length > 1
                Layout.row: 4
                Layout.column: 2
                Layout.minimumWidth: 200
                Layout.maximumWidth: 200
                Layout.fillWidth: false
                Layout.alignment: Qt.AlignRight
                model: ListModel {
                    Component.onCompleted: {
                        for (var i = 0; i < MySettings.uiLanguages.length; ++i)
                            append({"text": MySettings.uiLanguages[i]});
                        languageBox.updateModel();
                    }
                    ListElement { text: qsTr("System Locale") }
                }
                Accessible.name: languageLabel.text
                Accessible.description: languageLabel.helpText
                function updateModel() {
                    // This usage of 'System Locale' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (MySettings.languageAndLocale === "System Locale")
                        languageBox.currentIndex = 0
                    else
                        languageBox.currentIndex = languageBox.indexOfValue(MySettings.languageAndLocale);
                }
                Component.onCompleted: {
                    languageBox.updateModel()
                }
                onActivated: {
                    // This usage of 'System Locale' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (languageBox.currentIndex === 0)
                        MySettings.languageAndLocale = "System Locale";
                    else
                        MySettings.languageAndLocale = languageBox.currentText;
                }
            }

            MySettingsLabel {
                id: deviceLabel
                text: qsTr("Device")
                helpText: qsTr('The compute device used for text generation.')
                Layout.row: 5
                Layout.column: 0
            }

            MyComboBox {
                id: deviceBox
                Layout.row: 5
                Layout.column: 2
                Layout.minimumWidth: 400
                Layout.maximumWidth: 400
                Layout.fillWidth: false
                Layout.alignment: Qt.AlignRight
                model: ListModel {
                    Component.onCompleted: {
                        for (var i = 0; i < MySettings.deviceList.length; ++i)
                            append({"text": MySettings.deviceList[i]});
                        deviceBox.updateModel();
                    }
                    ListElement { text: qsTr("Application default") }
                }
                Accessible.name: deviceLabel.text
                Accessible.description: deviceLabel.helpText
                function updateModel() {
                    // This usage of 'Auto' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (MySettings.device === "Auto")
                        deviceBox.currentIndex = 0
                    else
                        deviceBox.currentIndex = deviceBox.indexOfValue(MySettings.device);
                }
                Component.onCompleted: {
                    deviceBox.updateModel();
                }
                Connections {
                    target: MySettings
                    function onDeviceChanged() {
                        deviceBox.updateModel();
                    }
                }
                onActivated: {
                    // This usage of 'Auto' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (deviceBox.currentIndex === 0)
                        MySettings.device = "Auto";
                    else
                        MySettings.device = deviceBox.currentText;
                }
            }

            MySettingsLabel {
                id: defaultModelLabel
                text: qsTr("Default Model")
                helpText: qsTr("The preferred model for new chats. Also used as the local server fallback.")
                Layout.row: 6
                Layout.column: 0
            }

            MyComboBox {
                id: defaultModelBox
                Layout.row: 6
                Layout.column: 2
                Layout.minimumWidth: 400
                Layout.maximumWidth: 400
                Layout.alignment: Qt.AlignRight
                model: ListModel {
                    id: defaultModelBoxModel
                    Component.onCompleted: {
                        defaultModelBox.rebuildModel()
                    }
                }
                Accessible.name: defaultModelLabel.text
                Accessible.description: defaultModelLabel.helpText
                function rebuildModel() {
                    defaultModelBoxModel.clear();
                    defaultModelBoxModel.append({"text": qsTr("Application default")});
                    for (var i = 0; i < ModelList.selectableModelList.length; ++i)
                        defaultModelBoxModel.append({"text": ModelList.selectableModelList[i].name});
                    defaultModelBox.updateModel();
                }
                function updateModel() {
                    // This usage of 'Application default' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (MySettings.userDefaultModel === "Application default")
                        defaultModelBox.currentIndex = 0
                    else
                        defaultModelBox.currentIndex = defaultModelBox.indexOfValue(MySettings.userDefaultModel);
                }
                onActivated: {
                    // This usage of 'Application default' should not be translated
                    // FIXME: Make this refer to a string literal variable accessed by both QML and C++
                    if (defaultModelBox.currentIndex === 0)
                        MySettings.userDefaultModel = "Application default";
                    else
                        MySettings.userDefaultModel = defaultModelBox.currentText;
                }
                Connections {
                    target: MySettings
                    function onUserDefaultModelChanged() {
                        defaultModelBox.updateModel()
                    }
                }
                Connections {
                    target: MySettings
                    function onLanguageAndLocaleChanged() {
                        defaultModelBox.rebuildModel()
                    }
                }
                Connections {
                    target: ModelList
                    function onSelectableModelListChanged() {
                        defaultModelBox.rebuildModel()
                    }
                }
            }

            MySettingsLabel {
                id: suggestionModeLabel
                text: qsTr("Suggestion Mode")
                helpText: qsTr("Generate suggested follow-up questions at the end of responses.")
                Layout.row: 7
                Layout.column: 0
            }

            MyComboBox {
                id: suggestionModeBox
                Layout.row: 7
                Layout.column: 2
                Layout.minimumWidth: 400
                Layout.maximumWidth: 400
                Layout.alignment: Qt.AlignRight
                // NOTE: indices match values of SuggestionMode enum, keep them in sync
                model: ListModel {
                    ListElement { name: qsTr("When chatting with LocalDocs") }
                    ListElement { name: qsTr("Whenever possible") }
                    ListElement { name: qsTr("Never") }
                }
                Accessible.name: suggestionModeLabel.text
                Accessible.description: suggestionModeLabel.helpText
                onActivated: {
                    MySettings.suggestionMode = suggestionModeBox.currentIndex;
                }
                Component.onCompleted: {
                    suggestionModeBox.currentIndex = MySettings.suggestionMode;
                }
            }

            MySettingsLabel {
                id: modelPathLabel
                text: qsTr("Download Path")
                helpText: qsTr("Where to store local models and the LocalDocs database.")
                Layout.row: 8
                Layout.column: 0
            }

            RowLayout {
                Layout.row: 8
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                Layout.minimumWidth: 400
                Layout.maximumWidth: 400
                spacing: 10
                MyDirectoryField {
                    id: modelPathDisplayField
                    text: MySettings.modelPath
                    font.pixelSize: theme.fontSizeLarge
                    implicitWidth: 300
                    Layout.fillWidth: true
                    Accessible.name: modelPathLabel.text
                    Accessible.description: modelPathLabel.helpText
                    onEditingFinished: {
                        if (isValid) {
                            MySettings.modelPath = modelPathDisplayField.text
                        } else {
                            text = MySettings.modelPath
                        }
                    }
                }
                MyFolderDialog {
                    id: folderDialog
                }
                MySettingsButton {
                    text: qsTr("Browse")
                    Accessible.description: qsTr("Choose where to save model files")
                    onClicked: {
                        folderDialog.openFolderDialog("file://" + MySettings.modelPath, function(selectedFolder) {
                            MySettings.modelPath = selectedFolder
                        })
                    }
                }
            }

            MySettingsLabel {
                id: dataLakeLabel
                text: qsTr("Enable Datalake")
                helpText: qsTr("Send chats and feedback to the GPT4All Open-Source Datalake.")
                Layout.row: 9
                Layout.column: 0
            }

            MyCheckBox {
                id: dataLakeBox
                Layout.row: 9
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                Component.onCompleted: {
                    dataLakeBox.checked = MySettings.networkIsActive;
                }
                Connections {
                    target: MySettings
                    function onNetworkIsActiveChanged() {
                        dataLakeBox.checked = MySettings.networkIsActive;
                    }
                }
                onClicked: {
                    if (MySettings.networkIsActive)
                        MySettings.networkIsActive = false;
                    else
                        networkDialog.open();
                    dataLakeBox.checked = MySettings.networkIsActive;
                }
            }

            ColumnLayout {
                Layout.row: 10
                Layout.column: 0
                Layout.columnSpan: 3
                Layout.fillWidth: true
                spacing: 10
                Label {
                    color: theme.styledTextColor
                    font.pixelSize: theme.fontSizeLarge
                    font.bold: true
                    text: qsTr("Advanced")
                }
                Rectangle {
                    Layout.fillWidth: true
                    height: 1
                    color: theme.settingsDivider
                }
            }

            MySettingsLabel {
                id: nThreadsLabel
                text: qsTr("CPU Threads")
                helpText: qsTr("The number of CPU threads used for inference and embedding.")
                Layout.row: 11
                Layout.column: 0
            }

            MyTextField {
                text: MySettings.threadCount
                color: theme.textColor
                font.pixelSize: theme.fontSizeLarge
                Layout.alignment: Qt.AlignRight
                Layout.row: 11
                Layout.column: 2
                Layout.minimumWidth: 200
                Layout.maximumWidth: 200
                validator: IntValidator {
                    bottom: 1
                }
                onEditingFinished: {
                    var val = parseInt(text)
                    if (!isNaN(val)) {
                        MySettings.threadCount = val
                        focus = false
                    } else {
                        text = MySettings.threadCount
                    }
                }
                Accessible.role: Accessible.EditableText
                Accessible.name: nThreadsLabel.text
                Accessible.description: ToolTip.text
            }

            MySettingsLabel {
                id: trayLabel
                text: qsTr("Enable System Tray")
                helpText: qsTr("The application will minimize to the system tray when the window is closed.")
                Layout.row: 13
                Layout.column: 0
            }

            MyCheckBox {
                id: trayBox
                Layout.row: 13
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                checked: MySettings.systemTray
                onClicked: {
                    MySettings.systemTray = !MySettings.systemTray
                }
            }

            MySettingsLabel {
                id: serverChatLabel
                text: qsTr("Enable Local API Server")
                helpText: qsTr("Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage.")
                Layout.row: 14
                Layout.column: 0
            }

            MyCheckBox {
                id: serverChatBox
                Layout.row: 14
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                checked: MySettings.serverChat
                onClicked: {
                    MySettings.serverChat = !MySettings.serverChat
                }
            }

            MySettingsLabel {
                id: serverPortLabel
                text: qsTr("API Server Port")
                helpText: qsTr("The port to use for the local server. Requires restart.")
                Layout.row: 15
                Layout.column: 0
            }

            MyTextField {
                id: serverPortField
                text: MySettings.networkPort
                color: theme.textColor
                font.pixelSize: theme.fontSizeLarge
                Layout.row: 15
                Layout.column: 2
                Layout.minimumWidth: 200
                Layout.maximumWidth: 200
                Layout.alignment: Qt.AlignRight
                validator: IntValidator {
                    bottom: 1
                }
                onEditingFinished: {
                    var val = parseInt(text)
                    if (!isNaN(val)) {
                        MySettings.networkPort = val
                        focus = false
                    } else {
                        text = MySettings.networkPort
                    }
                }
                Accessible.role: Accessible.EditableText
                Accessible.name: serverPortLabel.text
                Accessible.description: serverPortLabel.helpText
            }

            /*MySettingsLabel {
                id: gpuOverrideLabel
                text: qsTr("Force Metal (macOS+arm)")
                Layout.row: 13
                Layout.column: 0
            }
            MyCheckBox {
                id: gpuOverrideBox
                Layout.row: 13
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                checked: MySettings.forceMetal
                onClicked: {
                    MySettings.forceMetal = !MySettings.forceMetal
                }
                ToolTip.text: qsTr("WARNING: On macOS with arm (M1+) this setting forces usage of the GPU. Can cause crashes if the model requires more RAM than the system supports. Because of crash possibility the setting will not persist across restarts of the application. This has no effect on non-macs or intel.")
                ToolTip.visible: hovered
            }*/

            MySettingsLabel {
                id: updatesLabel
                text: qsTr("Check For Updates")
                helpText: qsTr("Manually check for an update to GPT4All.");
                Layout.row: 16
                Layout.column: 0
            }

            MySettingsButton {
                Layout.row: 16
                Layout.column: 2
                Layout.alignment: Qt.AlignRight
                text: qsTr("Updates");
                onClicked: {
                    if (!LLM.checkForUpdates())
                        checkForUpdatesError.open()
                }
            }

            Rectangle {
                Layout.row: 17
                Layout.column: 0
                Layout.columnSpan: 3
                Layout.fillWidth: true
                height: 1
                color: theme.settingsDivider
            }
        }
    }
}


================================================
FILE: gpt4all-chat/qml/ChatCollapsibleItem.qml
================================================
import Qt5Compat.GraphicalEffects
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts

import gpt4all
import mysettings
import toolenums

ColumnLayout {
    property alias textContent: innerTextItem.textContent
    property bool isCurrent: false
    property bool isError: false
    property bool isThinking: false
    property int thinkingTime: 0

    Layout.topMargin: 10
    Layout.bottomMargin: 10

    Item {
        Layout.preferredWidth: childrenRect.width
        Layout.preferredHeight: 38
        RowLayout {
            anchors.left: parent.left
            anchors.top: parent.top
            anchors.bottom: parent.bottom
            Item {
                Layout.preferredWidth: myTextArea.implicitWidth
                Layout.preferredHeight: myTextArea.implicitHeight
                TextArea {
                    id: myTextArea
                    text: {
                        if (isError)
                            return qsTr("Analysis encountered error");
                        if (isCurrent)
                            return isThinking ? qsTr("Thinking") : qsTr("Analyzing");
                        return isThinking ?
                            qsTr("Thought for %1 %2")
                                .arg(Math.ceil(thinkingTime / 1000.0))
                                .arg(Math.ceil(thinkingTime / 1000.0) === 1 ? qsTr("second") : qsTr("seconds")) :
                            qsTr("Analyzed");
                    }
                    padding: 0
                    font.pixelSize: theme.fontSizeLarger
                    enabled: false
                    focus: false
                    readOnly: true
                    color: headerMA.containsMouse ?
                        theme.mutedDarkTextColorHovered : theme.mutedTextColor
                    hoverEnabled: false
                }
                Item {
                    id: textColorOverlay
                    anchors.fill: parent
                    clip: true
                    visible: false
                    Rectangle {
                        id: animationRec
                        width: myTextArea.width * 0.3
                        anchors.top: parent.top
                        anchors.bottom: parent.bottom
                        color: theme.textColor
                        SequentialAnimation {
                            running: isCurrent
                            loops: Animation.Infinite
                            NumberAnimation { target: animationRec; property: "x"; from: -animationRec.width; to: myTextArea.width * 3; duration: 2000 }
                        }
                    }
                }
                OpacityMask {
                    visible: isCurrent
                    anchors.fill: parent
                    maskSource: myTextArea
                    source: textColorOverlay
                }
            }
            Item {
                id: caret
                Layout.preferredWidth: contentCaret.width
                Layout.preferredHeight: contentCaret.height
                Image {
                    id: contentCaret
                    anchors.centerIn: parent
                    visible: false
                    sourceSize.width: theme.fontSizeLarge
                    sourceSize.height: theme.fontSizeLarge
                    mipmap: true
                    source: {
                        if (contentLayout.state === "collapsed")
                            return "qrc:/gpt4all/icons/caret_right.svg";
                        else
                            return "qrc:/gpt4all/icons/caret_down.svg";
                    }
                }
                ColorOverlay {
                    anchors.fill: contentCaret
                    source: contentCaret
                    color: headerMA.containsMouse ? theme.mutedDarkTextColorHovered : theme.mutedTextColor
                }
            }
        }
        MouseArea {
            id: headerMA
            hoverEnabled: true
            anchors.fill: parent
            onClicked: {
                if (contentLayout.state === "collapsed")
                    contentLayout.state = "expanded";
                else
                    contentLayout.state = "collapsed";
            }
        }
    }

    ColumnLayout {
        id: contentLayout
        spacing: 0
        state: "collapsed"
        clip: true
        states: [
            State {
                name: "expanded"
                PropertyChanges { target: contentLayout; Layout.preferredHeight: innerContentLayout.height }
            },
            State {
                name: "collapsed"
                PropertyChanges { target: contentLayout; Layout.preferredHeight: 0 }
            }
        ]
        transitions: [
            Transition {
                SequentialAnimation {
                    PropertyAnimation {
                        target: contentLayout
                        property: "Layout.preferredHeight"
                        duration: 300
                        easing.type: Easing.InOutQuad
                    }
                }
            }
        ]
        ColumnLayout {
            id: innerContentLayout
            Layout.leftMargin: 30
            ChatTextItem {
                id: innerTextItem
            }
        }
    }
}


================================================
FILE: gpt4all-chat/qml/ChatDrawer.qml
================================================
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts

import chatlistmodel
import llm
import download
import network
import mysettings

Rectangle {
    id: chatDrawer

    Theme {
        id: theme
    }

    color: theme.viewBackground

    Rectangle {
        id: borderRight
        anchors.top: parent.top
        anchors.bottom: parent.bottom
        anchors.right: parent.right
        width: 1
        color: theme.dividerColor
    }

    Item {
        anchors.top: parent.top
        anchors.bottom: parent.bottom
        anchors.left: parent.left
        anchors.right: borderRight.left

        Accessible.role: Accessible.Pane
        Accessible.name: qsTr("Drawer")
        Accessible.description: qsTr("Main navigation drawer")

        MySettingsButton {
            id: newChat
            anchors.top: parent.top
            anchors.left: parent.left
            anchors.right: parent.right
            anchors.margins: 20
            font.pixelSize: theme.fontSizeLarger
            topPadding: 24
            bottomPadding: 24
            text: qsTr("\uFF0B New Chat")
            Accessible.description: qsTr("Create a new chat")
            onClicked: {
                ChatListModel.addChat()
                conversationList.positionViewAtIndex(0,
                    ListView.Beginning)
                Network.trackEvent("new_chat", {"number_of_chats": ChatListModel.count})
            }
        }

        Rectangle {
            id: divider
            anchors.top: newChat.bottom
            anchors.margins: 20
            anchors.topMargin: 14
            anchors.left: parent.left
            anchors.right: parent.right
            height: 1
            color: theme.dividerColor
        }

        ScrollView {
            anchors.left: parent.left
            anchors.right: parent.right
            anchors.topMargin: 15
            anchors.top: divider.bottom
            anchors.bottom: parent.bottom
            anchors.bottomMargin: 15
            ScrollBar.vertical.policy: ScrollBar.AlwaysOff
            clip: true

            ListView {
                id: conversationList
                anchors.fill: parent
                anchors.leftMargin: 10
                anchors.rightMargin: 10
                model: ChatListModel

                Component.onCompleted: ChatListModel.loadChats()

                ScrollBar.vertical: ScrollBar {
                    parent: conversationList.parent
                    anchors.top: conversationList.top
                    anchors.left: conversationList.right
                    anchors.bottom: conversationList.bottom
                }

                Component {
                    id: sectionHeading
                    Rectangle {
                        width: ListView.view.width
                        height: childrenRect.height
                        color: "transparent"
                        property bool isServer: ChatListModel.get(parent.index) && ChatListModel.get(parent.index).isServer
                        visible: !isServer || MySettings.serverChat
                        required property string section
                        Text {
                            leftPadding: 10
                            rightPadding: 10
                            topPadding: 15
                            bottomPadding: 5
                            text: parent.section
                            color: theme.chatDrawerSectionHeader
                            font.pixelSize: theme.fontSizeSmallest
                        }
                    }
                }

                section.property: "section"
                section.criteria: ViewSection.FullString
                section.delegate: sectionHeading

                delegate: Rectangle {
                    id: chatRectangle
                    width: conversationList.width
                    height: chatNameBox.height + 20
                    property bool isCurrent: ChatListModel.currentChat === ChatListModel.get(index)
                    property bool isServer: ChatListModel.get(index) && ChatListModel.get(index).isServer
                    property bool trashQuestionDisplayed: false
                    visible: !isServer || MySettings.serverChat
                    z: isCurrent ? 199 : 1
                    color: isCurrent ? theme.selectedBackground : "transparent"
                    border.width: isCurrent
                    border.color: theme.dividerColor
                    radius: 10

                    Rectangle {
                        id: chatNameBox
                        height: chatName.height
                        anchors.left: parent.left
                        anchors.right: trashButton.left
                        anchors.verticalCenter: chatRectangle.verticalCenter
                        anchors.leftMargin: 5
                        anchors.rightMargin: 5
                        radius: 5
                        color: chatName.readOnly ? "transparent" : theme.chatNameEditBgColor
                        TextField {
                            id: chatName
                            anchors.left: parent.left
                            anchors.right: editButton.left
                            anchors.verticalCenter: chatNameBox.verticalCenter
                            topPadding: 5
                            bottomPadding: 5
                            color: theme.styledTextColor
                            focus: false
                            readOnly: true
                            wrapMode: Text.NoWrap
                            hoverEnabled: false // Disable hover events on the TextArea
                            selectByMouse: false // Disable text selection in the TextArea
                            font.pixelSize: theme.fontSizeLarge
                            font.bold: true
                            text: readOnly ? metrics.elidedText : name
                            horizontalAlignment: TextInput.AlignLeft
                            opacity: trashQuestionDisplayed ? 0.5 : 1.0
                            TextMetrics {
                                id: metrics
                                font: chatName.font
                                text: name
                                elide: Text.ElideRight
                                elideWidth: chatName.width - 15
                            }
                            background: Rectangle {
                                color: "transparent"
                            }
                            onEditingFinished: {
                                // Work around a bug in qml where we're losing focus when the whole window
                                // goes out of focus even though this textfield should be marked as not
                                // having focus
                                if (chatName.readOnly)
                                    return;
                                changeName();
                            }
                            function changeName() {
                                Network.trackChatEvent("rename_chat");
                                ChatListModel.get(index).name = chatName.text;
                                chatName.focus = false;
                                chatName.readOnly = true;
                                chatName.selectByMouse = false;
                            }
                            TapHandler {
                                onTapped: {
                                    if (isCurrent)
                                        return;
                                    ChatListModel.currentChat = ChatListModel.get(index);
                                }
                            }
                            Accessible.role: Accessible.Button
                            Accessible.name: text
                            Accessible.description: qsTr("Select the current chat or edit the chat when in edit mode")
                        }
                        MyToolButton {
                            id: editButton
                            anchors.verticalCenter: parent.verticalCenter
                            anchors.right: parent.right
                            anchors.rightMargin: 5
                            imageWidth: 24
                            imageHeight: 24
                            visible: isCurrent && !isServer && chatName.readOnly
                            opacity: trashQuestionDisplayed ? 0.5 : 1.0
                            source: "qrc:/gpt4all/icons/edit.svg"
                            onClicked: {
                                chatName.focus = true;
                                chatName.readOnly = false;
                                chatName.selectByMouse = true;
                            }
                            Accessible.name: qsTr("Edit chat name")
                        }
                        MyToolButton {
                            id: okButton
                            anchors.verticalCenter: parent.verticalCenter
                            anchors.right: parent.right
                            anchors.rightMargin: 5
                            imageWidth: 24
                            imageHeight: 24
                            visible: isCurrent && !isServer && !chatName.readOnly
                            opacity: trashQuestionDisplayed ? 0.5 : 1.0
                            source: "qrc:/gpt4all/icons/check.svg"
                            onClicked: chatName.changeName()
                            Accessible.name: qsTr("Save chat name")
                        }
                    }

                    MyToolButton {
                        id: trashButton
                        anchors.verticalCenter: chatNameBox.verticalCenter
                        anchors.right: chatRectangle.right
                        anchors.rightMargin: 10
                        imageWidth: 24
                        imageHeight: 24
                        visible: isCurrent && !isServer
                        source: "qrc:/gpt4all/icons/trash.svg"
                        onClicked: {
                            trashQuestionDisplayed = true
                            timer.start()
                        }
                        Accessible.name: qsTr("Delete chat")
                    }

                    Rectangle {
                        id: trashSureQuestion
                        anchors.top: trashButton.bottom
                        anchors.topMargin: 10
                        anchors.right: trashButton.right
                        width: childrenRect.width
                        height: childrenRect.height
                        color: chatRectangle.color
                        visible: isCurrent && trashQuestionDisplayed
                        opacity: 1.0
                        radius: 10
                        z: 200
                        Row {
                            spacing: 10
                            Button {
                                id: checkMark
                                width: 30
                                height: 30
                                contentItem: Text {
                                    color: theme.textErrorColor
                                    text: "\u2713"
                                    font.pixelSize: theme.fontSizeLarger
                                    horizontalAlignment: Text.AlignHCenter
                                    verticalAlignment: Text.AlignVCenter
                                }
                                background: Rectangle {
                                    width: 30
                                    height: 30
                                    color: "transparent"
                                }
                                onClicked: {
                                    Network.trackChatEvent("remove_chat")
                                    ChatListModel.removeChat(ChatListModel.get(index))
                                }
                                Accessible.role: Accessible.Button
                                Accessible.name: qsTr("Confirm chat deletion")
                            }
                            Button {
                                id: cancel
                                width: 30
                                height: 30
                                contentItem: Text {
                                    color: theme.textColor
                                    text: "\u2715"
                                    font.pixelSize: theme.fontSizeLarger
                                    horizontalAlignment: Text.AlignHCenter
                                    verticalAlignment: Text.AlignVCenter
                                }
                                background: Rectangle {
                                    width: 30
                                    height: 30
                                    color: "transparent"
                                }
                                onClicked: {
                                    trashQuestionDisplayed = false
                                }
                                Accessible.role: Accessible.Button
                                Accessible.name: qsTr("Cancel chat deletion")
                            }
                        }
                    }

                    Timer {
                        id: timer
                        interval: 3000; running: false; repeat: false
                        onTriggered: trashQuestionDisplayed = false
                    }
                }

                Accessible.role: Accessible.List
                Accessible.name: qsTr("List of chats")
                Accessible.description: qsTr("List of chats in the drawer dialog")
            }
        }
    }
}


================================================
FILE: gpt4all-chat/qml/ChatItemView.qml
================================================
import Qt5Compat.GraphicalEffects
import QtCore
import QtQuick
import QtQuick.Controls
import QtQuick.Controls.Basic
import QtQuick.Layouts
import Qt.labs.qmlmodels

import gpt4all
import mysettings
import toolenums

ColumnLayout {
    property var inputBoxText: null
    signal setInputBoxText(text: string)

    Item {
        Layout.fillWidth: true
        Layout.maximumWidth: parent.width
        Layout.preferredHeight: gridLayout.height
        HoverHandler { id: hoverArea }

        GridLayout {
            id: gridLayout
            anchors.left: parent.left
            anchors.right: parent.right
            columns: 2

            Item {
                Layout.row: 0
                Layout.column: 0
                Layout.alignment: Qt.AlignVCenter | Qt.AlignRight
                Layout.preferredWidth: 32
                Layout.preferredHeight: 32
                Layout.topMargin: model.index > 0 ? 25 : 0

                Image {
                    id: logo
                    sourceSize: Qt.size(32, 32)
                    fillMode: Image.PreserveAspectFit
                    mipmap: true
                    visible: false
                    source: name !== "Response: " ? "qrc:/gpt4all/icons/you.svg" : "qrc:/gpt4all/icons/gpt4all_transparent.svg"
                }

                ColorOverlay {
                    id: colorOver
                    anchors.fill: logo
                    source: logo
                    color: theme.conversationHeader
                    RotationAnimation {
                        id: rotationAnimation
                        target: colorOver
                        property: "rotation"
                        from: 0
                        to: 360
                        duration: 1000
                        loops: Animation.Infinite
                        running: isCurrentResponse && currentChat.responseInProgress
                    }
                }
            }

            Item {
                Layout.row: 0
                Layout.column: 1
                Layout.fillWidth: true
                Layout.preferredHeight: 38
                Layout.topMargin: model.index > 0 ?
                    25 : 0
                RowLayout {
                    spacing: 5
                    anchors.left: parent.left
                    anchors.top: parent.top
                    anchors.bottom: parent.bottom
                    TextArea {
                        text: {
                            if (name === "Response: ")
                                return qsTr("GPT4All");
                            return qsTr("You");
                        }
                        padding: 0
                        font.pixelSize: theme.fontSizeLarger
                        font.bold: true
                        color: theme.conversationHeader
                        enabled: false
                        focus: false
                        readOnly: true
                    }
                    Text {
                        visible: name === "Response: "
                        font.pixelSize: theme.fontSizeLarger
                        text: currentModelName()
                        color: theme.mutedTextColor
                    }
                    RowLayout {
                        visible: isCurrentResponse && (content === "" && currentChat.responseInProgress)
                        Text {
                            color: theme.mutedTextColor
                            font.pixelSize: theme.fontSizeLarger
                            text: {
                                switch (currentChat.responseState) {
                                case Chat.ResponseStopped: return qsTr("response stopped ...");
                                case Chat.LocalDocsRetrieval: return qsTr("retrieving localdocs: %1 ...").arg(currentChat.collectionList.join(", "));
                                case Chat.LocalDocsProcessing: return qsTr("searching localdocs: %1 ...").arg(currentChat.collectionList.join(", "));
                                case Chat.PromptProcessing: return qsTr("processing ...")
                                case Chat.ResponseGeneration: return qsTr("generating response ...");
                                case Chat.GeneratingQuestions: return qsTr("generating questions ...");
                                case Chat.ToolCallGeneration: return qsTr("generating toolcall ...");
                                default: return ""; // handle unexpected values
                                }
                            }
                        }
                    }
                }
            }

            ColumnLayout {
                Layout.row: 1
                Layout.column: 1
                Layout.fillWidth: true
                spacing: 10
                Flow {
                    id: attachedUrlsFlow
                    Layout.fillWidth: true
                    Layout.bottomMargin: 10
                    spacing: 10
                    visible: promptAttachments.length !== 0
                    Repeater {
                        model: promptAttachments
                        delegate: Rectangle {
                            width: 350
                            height: 50
                            radius: 5
                            color: theme.attachmentBackground
                            border.color: theme.controlBorder
                            Row {
                                spacing: 5
                                anchors.fill: parent
                                anchors.margins: 5
                                MyFileIcon {
                                    iconSize: 40
                                    fileName: modelData.file
                                }
                                Text {
                                    width: 295
                                    height: 40
                                    text: modelData.file
                                    color: theme.textColor
                                    horizontalAlignment: Text.AlignHLeft
                                    verticalAlignment: Text.AlignVCenter
                                    font.pixelSize: theme.fontSizeMedium
                                    font.bold: true
                                    wrapMode: Text.WrapAnywhere
                                    elide: Qt.ElideRight
                                }
                            }
                        }
                    }
                }

                Repeater {
                    model: childItems

                    DelegateChooser {
                        id: chooser
                        role: "name"
                        DelegateChoice {
                            roleValue: "Text: ";
                            ChatTextItem {
                                Layout.fillWidth: true
                                textContent: modelData.content
                            }
                        }
                        DelegateChoice {
                            roleValue: "ToolCall: ";
                            ChatCollapsibleItem {
                                Layout.fillWidth: true
                                textContent: modelData.content
                                isCurrent: modelData.isCurrentResponse
                                isError: modelData.isToolCallError
                            }
                        }
                        DelegateChoice {
                            roleValue: "Think: ";
                            ChatCollapsibleItem {
                                Layout.fillWidth: true
                                textContent: modelData.content
                                isCurrent: modelData.isCurrentResponse
                                isError: false
                                isThinking: true
                                thinkingTime: modelData.thinkingTime
                                visible: modelData.content !== ""
                            }
                        }
                    }

                    delegate: chooser
                }

                ChatTextItem {
                    Layout.fillWidth: true
                    textContent: content
                }

                ThumbsDownDialog {
                    id: thumbsDownDialog
                    x: Math.round((parent.width - width) / 2)
                    y: Math.round((parent.height - height) / 2)
                    width: 640
                    height: 300
                    property string text: content
                    response: newResponse === undefined || newResponse === "" ? text : newResponse
                    onAccepted: {
                        var responseHasChanged = response !== text && response !== newResponse
                        if (thumbsDownState && !thumbsUpState && !responseHasChanged)
                            return

                        chatModel.updateNewResponse(model.index, response)
                        chatModel.updateThumbsUpState(model.index, false)
                        chatModel.updateThumbsDownState(model.index, true)
                        Network.sendConversation(currentChat.id, getConversationJson());
                    }
                }
            }

            Item {
                Layout.row: 2
                Layout.column: 1
                Layout.topMargin: 5
                Layout.alignment: Qt.AlignVCenter
                Layout.preferredWidth: childrenRect.width
                Layout.preferredHeight: childrenRect.height
                visible: {
                    if (name !== "Response: ")
                        return false
                    if (consolidatedSources.length === 0)
                        return false
                    if (!MySettings.localDocsShowReferences)
                        return false
                    if (isCurrentResponse && currentChat.responseInProgress
                            && currentChat.responseState !== Chat.GeneratingQuestions)
                        return false
                    return true
                }

                MyButton {
                    backgroundColor: theme.sourcesBackground
                    backgroundColorHovered: theme.sourcesBackgroundHovered
                    contentItem: RowLayout {
                        anchors.centerIn: parent
                        Item {
                            Layout.preferredWidth: 24
                            Layout.preferredHeight: 24
                            Image {
                                id: sourcesIcon
                                visible: false
                                anchors.fill: parent
                                sourceSize.width: 24
                                sourceSize.height: 24
                                mipmap: true
                                source: "qrc:/gpt4all/icons/db.svg"
                            }
                            ColorOverlay {
                                anchors.fill: sourcesIcon
                                source: sourcesIcon
                                color: theme.textColor
                            }
                        }
                        Text {
                            text: qsTr("%n Source(s)", "", consolidatedSources.length)
                            padding: 0
                            font.pixelSize: theme.fontSizeLarge
                            font.bold: true
                            color: theme.styledTextColor
                        }
                        Item {
                            Layout.preferredWidth: caret.width
                            Layout.preferredHeight: caret.height
                            Image {
                                id: caret
                                anchors.centerIn: parent
                                visible: false
                                sourceSize.width: theme.fontSizeLarge
                                sourceSize.height: theme.fontSizeLarge
                                mipmap: true
                                source: {
                                    if (sourcesLayout.state === "collapsed")
                                        return "qrc:/gpt4all/icons/caret_right.svg";
                                    else
                                        return "qrc:/gpt4all/icons/caret_down.svg";
                                }
                            }
                            ColorOverlay {
                                anchors.fill: caret
                                source: caret
                                color: theme.textColor
                            }
                        }
                    }
                    onClicked: {
                        if (sourcesLayout.state === "collapsed")
                            sourcesLayout.state = "expanded";
                        else
                            sourcesLayout.state = "collapsed";
                    }
                }
            }

            ColumnLayout {
                id: sourcesLayout
                Layout.row: 3
                Layout.column: 1
                Layout.topMargin: 5
                visible: {
                    if (consolidatedSources.length === 0)
                        return false
                    if (!MySettings.localDocsShowReferences)
                        return false
                    if (isCurrentResponse && currentChat.responseInProgress
                            && currentChat.responseState !== Chat.GeneratingQuestions)
                        return false
                    return true
                }
                clip: true
                Layout.fillWidth: true
                Layout.preferredHeight: 0
                state: "collapsed"
                states: [
                    State {
                        name: "expanded"
                        PropertyChanges { target: sourcesLayout; Layout.preferredHeight: sourcesFlow.height }
                    },
                    State {
                        name: "collapsed"
                        PropertyChanges { target: sourcesLayout; Layout.preferredHeight: 0 }
                    }
                ]
                transitions: [
                    Transition {
                        SequentialAnimation {
                            PropertyAnimation {
                                target: sourcesLayout
                                property: "Layout.preferredHeight"
                                duration: 300
                                easing.type: Easing.InOutQuad
                            }
                        }
                    }
                ]

                Flow {
                    id: sourcesFlow
                    Layout.fillWidth: true
                    spacing: 10
                    visible: consolidatedSources.length !== 0
                    Repeater {
                        model: consolidatedSources
                        delegate: Rectangle {
                            radius: 10
                            color: ma.containsMouse ?
                                theme.sourcesBackgroundHovered : theme.sourcesBackground
                            width: 200
                            height: 75
                            MouseArea {
                                id: ma
                                enabled: modelData.path !== ""
                                anchors.fill: parent
                                hoverEnabled: true
                                onClicked: function() {
                                    Qt.openUrlExternally(modelData.fileUri)
                                }
                            }
                            Rectangle {
                                id: debugTooltip
                                anchors.right: parent.right
                                anchors.bottom: parent.bottom
                                width: 24
                                height: 24
                                color: "transparent"
                                ToolTip {
                                    parent: debugTooltip
                                    visible: debugMouseArea.containsMouse
                                    text: modelData.text
                                    contentWidth: 900
                                    delay: 500
                                }
                                MouseArea {
                                    id: debugMouseArea
                                    anchors.fill: parent
                                    hoverEnabled: true
                                }
                            }
                            ColumnLayout {
                                anchors.left: parent.left
                                anchors.top: parent.top
                                anchors.margins: 10
                                spacing: 0
                                RowLayout {
                                    id: title
                                    spacing: 5
                                    Layout.maximumWidth: 180
                                    MyFileIcon {
                                        iconSize: 24
                                        fileName: modelData.file
                                        Layout.preferredWidth: iconSize
                                        Layout.preferredHeight: iconSize
                                    }
                                    Text {
                                        Layout.maximumWidth: 156
                                        text: modelData.collection !== "" ? modelData.collection : qsTr("LocalDocs")
                                        font.pixelSize: theme.fontSizeLarge
                                        font.bold: true
                                        color: theme.styledTextColor
                                        elide: Qt.ElideRight
                                    }
                                    Rectangle {
                                        Layout.fillWidth: true
                                        color: "transparent"
                                        height: 1
                                    }
                                }
                                Text {
                                    Layout.fillHeight: true
                                    Layout.maximumWidth: 180
                                    Layout.maximumHeight: 55 - title.height
                                    text: modelData.file
                                    color: theme.textColor
                                    font.pixelSize: theme.fontSizeSmall
                                    elide: Qt.ElideRight
                                    wrapMode: Text.WrapAnywhere
                                }
                            }
                        }
                    }
                }
            }

            ConfirmationDialog {
                id: editPromptDialog
                dialogTitle: qsTr("Edit this message?")
                description: qsTr("All following messages will be permanently erased.")
                onAccepted: {
                    const msg = currentChat.popPrompt(index);
                    if (msg !== null)
                        setInputBoxText(msg);
                }
            }

            ConfirmationDialog {
                id: redoResponseDialog
                dialogTitle: qsTr("Redo this response?")
                description: qsTr("All following messages will be permanently erased.")
                onAccepted: currentChat.regenerateResponse(index)
            }

            RowLayout {
                id: buttonRow
                Layout.row: 4
                Layout.column: 1
                Layout.maximumWidth: parent.width
                Layout.fillWidth: false
                Layout.alignment: Qt.AlignLeft | Qt.AlignTop
                spacing: 3
                visible: !isCurrentResponse || !currentChat.responseInProgress
                enabled: opacity > 0
                opacity: hoverArea.hovered
                Behavior on opacity {
                    OpacityAnimator { duration: 30 }
                }

                ChatMessageButton {
                    readonly property var editingDisabledReason: {
                        if (!currentChat.isModelLoaded)
                            return qsTr("Cannot edit chat without a loaded model.");
                        if (currentChat.responseInProgress)
                            return qsTr("Cannot edit chat while the model is generating.");
                        return null;
                    }
                    visible: !currentChat.isServer && model.name === "Prompt: "
                    enabled: editingDisabledReason === null
                    Layout.maximumWidth: 24
                    Layout.maximumHeight: 24
                    Layout.alignment: Qt.AlignVCenter
                    Layout.fillWidth: false
                    name: editingDisabledReason ?? qsTr("Edit")
                    source: "qrc:/gpt4all/icons/edit.svg"
                    onClicked: {
                        if (inputBoxText === "")
                            editPromptDialog.open();
                    }
                }

                ChatMessageButton {
                    readonly property var editingDisabledReason: {
                        if (!currentChat.isModelLoaded)
                            return qsTr("Cannot redo response without a loaded model.");
                        if (currentChat.responseInProgress)
                            return qsTr("Cannot redo response while the model is generating.");
                        return null;
                    }
                    visible: !currentChat.isServer && model.name === "Response: "
                    enabled: editingDisabledReason === null
                    Layout.maximumWidth: 24
                    Layout.maximumHeight: 24
                    Layout.alignment: Qt.AlignVCenter
                    Layout.fillWidth: false
                    name: editingDisabledReason ?? qsTr("Redo")
                    source: "qrc:/gpt4all/icons/regenerate.svg"
                    onClicked: {
                        if (index == chatModel.count - 1) {
                            // regenerate last message without confirmation
                            currentChat.regenerateResponse(index);
                            return;
                        }
                        redoResponseDialog.open();
                    }
                }

                ChatMessageButton {
                    Layout.maximumWidth: 24
                    Layout.maximumHeight: 24
                    Layout.alignment: Qt.AlignVCenter
                    Layout.fillWidth: false
                    name: qsTr("Copy")
                    source: "qrc:/gpt4all/icons/copy.svg"
                    onClicked: {
                        chatModel.copyToClipboard(index);
                    }
                }

                Item {
                    visible: name === "Response: " && MySettings.networkIsActive
                    Layout.alignment: Qt.AlignVCenter
                    Layout.preferredWidth: childrenRect.width
                    Layout.preferredHeight: childrenRect.height
                    Layout.fillWidth: false

                    ChatMessageButton {
                        id: thumbsUp
                        anchors.left: parent.left
                        anchors.verticalCenter: parent.verticalCenter
                        opacity: thumbsUpState || thumbsUpState == thumbsDownState ? 1.0 : 0.2
                        source: "qrc:/gpt4all/icons/thumbs_up.svg"
                        name: qsTr("Like response")
                        onClicked: {
                            if (thumbsUpState && !thumbsDownState)
                                return

                            chatModel.updateNewResponse(index, "")
                            chatModel.updateThumbsUpState(index, true)
                            chatModel.updateThumbsDownState(index, false)
                            Network.sendConversation(currentChat.id, getConversationJson());
                        }
                    }

                    ChatMessageButton {
                        id: thumbsDown
                        anchors.top: thumbsUp.top
                        anchors.topMargin: buttonRow.spacing
                        anchors.left: thumbsUp.right
                        anchors.leftMargin: buttonRow.spacing
                        checked: thumbsDownState
                        opacity: thumbsDownState || thumbsUpState == thumbsDownState ? 1.0 : 0.2
                        bgTransform: [
                            Matrix4x4 {
                                matrix: Qt.matrix4x4(-1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1)
                            },
                            Translate { x: thumbsDown.width }
                        ]
                        source: "qrc:/gpt4all/icons/thumbs_down.svg"
                        name: qsTr("Dislike response")
                        onClicked: {
                            thumbsDownDialog.open()
                        }
                    }
                }
            }
        } // GridLayout
    } // Item

    GridLayout {
        Layout.fillWidth: true
        Layout.maximumWidth: parent.width

        function shouldShowSuggestions() {
            if (!isCurrentResponse)
                return false;
            if (MySettings.suggestionMode === 2) // Off
                return false;
            if (MySettings.suggestionMode === 0 && consolidatedSources.length === 0) // LocalDocs only
                return false;
            return currentChat.responseState === Chat.GeneratingQuestions || currentChat.generatedQuestions.length !== 0;
        }

        Item {
            visible: parent.shouldShowSuggestions()
            Layout.row: 5
            Layout.column: 0
            Layout.topMargin: 20
            Layout.alignment: Qt.AlignVCenter | Qt.AlignRight
            Layout.preferredWidth: 28
            Layout.preferredHeight: 28
            Image {
                id: stack
                sourceSize: Qt.size(28, 28)
                fillMode: Image.PreserveAspectFit
                mipmap: true
                visible: false
                source: "qrc:/gpt4all/icons/stack.svg"
            }
            ColorOverlay {
                anchors.fill: stack
                source: stack
                color: theme.conversationHeader
            }
        }

        Item {
            visible: parent.shouldShowSuggestions()
            Layout.row: 5
            Layout.column: 1
            Layout.topMargin: 20
            Layout.fillWidth: true
            Layout.preferredHeight: 38
            RowLayout {
                spacing: 5
                anchors.left: parent.left
                anchors.top: parent.top
                anchors.bottom: parent.bottom
                TextArea {
                    text: qsTr("Suggested follow-ups")
                    padding: 0
                    font.pixelSize: theme.fontSizeLarger
                    font.bold: true
                    color: theme.conversationHeader
                    enabled: false
                    focus: false
                    readOnly: true
                }
            }
        }

        ColumnLayout {
            visible: parent.shouldShowSuggestions()
            Layout.row: 6
            Layout.column: 1
            Layout.fillWidth: true
            Layout.minimumHeight: 1
            spacing: 10
            Repeater {
                model: currentChat.generatedQuestions
                TextArea {
                    id: followUpText
                    Layout.fillWidth: true
                    Layout.alignment: Qt.AlignLeft
                    rightPadding: 40
                    topPadding: 10
                    leftPadding: 20
                    bottomPadding: 10
                    text: modelData
                    focus: false
                    readOnly: true
                    wrapMode: Text.WordWrap
hoverEnabled: !currentChat.responseInProgress color: theme.textColor font.pixelSize: theme.fontSizeLarge background: Rectangle { color: hovered ? theme.sourcesBackgroundHovered : theme.sourcesBackground radius: 10 } MouseArea { id: maFollowUp anchors.fill: parent enabled: !currentChat.responseInProgress onClicked: function() { var chat = window.currentChat var followup = modelData chat.stopGenerating() chat.newPromptResponsePair(followup) } } Item { anchors.right: parent.right anchors.verticalCenter: parent.verticalCenter width: 40 height: 40 visible: !currentChat.responseInProgress Image { id: plusImage anchors.verticalCenter: parent.verticalCenter sourceSize.width: 20 sourceSize.height: 20 mipmap: true visible: false source: "qrc:/gpt4all/icons/plus.svg" } ColorOverlay { anchors.fill: plusImage source: plusImage color: theme.styledTextColor } } } } Rectangle { Layout.fillWidth: true color: "transparent" radius: 10 Layout.preferredHeight: currentChat.responseInProgress ? 40 : 0 clip: true ColumnLayout { id: followUpLayout anchors.fill: parent Rectangle { id: myRect1 Layout.preferredWidth: 0 Layout.minimumWidth: 0 Layout.maximumWidth: parent.width height: 12 color: theme.sourcesBackgroundHovered } Rectangle { id: myRect2 Layout.preferredWidth: 0 Layout.minimumWidth: 0 Layout.maximumWidth: parent.width height: 12 color: theme.sourcesBackgroundHovered } SequentialAnimation { id: followUpProgressAnimation ParallelAnimation { PropertyAnimation { target: myRect1 property: "Layout.preferredWidth" from: 0 to: followUpLayout.width duration: 1000 } PropertyAnimation { target: myRect2 property: "Layout.preferredWidth" from: 0 to: followUpLayout.width / 2 duration: 1000 } } SequentialAnimation { loops: Animation.Infinite ParallelAnimation { PropertyAnimation { target: myRect1 property: "opacity" from: 1 to: 0.2 duration: 1500 } PropertyAnimation { target: myRect2 property: "opacity" from: 1 to: 0.2 duration: 1500 } } ParallelAnimation { PropertyAnimation { target: myRect1 
property: "opacity" from: 0.2 to: 1 duration: 1500 } PropertyAnimation { target: myRect2 property: "opacity" from: 0.2 to: 1 duration: 1500 } } } } onVisibleChanged: { if (visible) followUpProgressAnimation.start(); } } Behavior on Layout.preferredHeight { NumberAnimation { duration: 300 easing.type: Easing.InOutQuad } } } } } // GridLayout } // ColumnLayout ================================================ FILE: gpt4all-chat/qml/ChatMessageButton.qml ================================================ import QtQuick import QtQuick.Controls import gpt4all MyToolButton { property string name width: 24 height: 24 imageWidth: width imageHeight: height ToolTip { visible: parent.hovered y: parent.height * 1.5 text: name delay: Qt.styleHints.mousePressAndHoldInterval } Accessible.name: name } ================================================ FILE: gpt4all-chat/qml/ChatTextItem.qml ================================================ import Qt5Compat.GraphicalEffects import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import gpt4all import mysettings import toolenums TextArea { id: myTextArea property string textContent: "" visible: textContent != "" Layout.fillWidth: true padding: 0 color: { if (!currentChat.isServer) return theme.textColor return theme.white } wrapMode: Text.WordWrap textFormat: TextEdit.PlainText focus: false readOnly: true font.pixelSize: theme.fontSizeLarge cursorVisible: isCurrentResponse ? 
currentChat.responseInProgress : false cursorPosition: text.length TapHandler { id: tapHandler onTapped: function(eventPoint, button) { var clickedPos = myTextArea.positionAt(eventPoint.position.x, eventPoint.position.y); var success = textProcessor.tryCopyAtPosition(clickedPos); if (success) copyCodeMessage.open(); } } MouseArea { id: conversationMouseArea anchors.fill: parent acceptedButtons: Qt.RightButton onClicked: (mouse) => { if (mouse.button === Qt.RightButton) { conversationContextMenu.x = conversationMouseArea.mouseX conversationContextMenu.y = conversationMouseArea.mouseY conversationContextMenu.open() } } } onLinkActivated: function(link) { if (!isCurrentResponse || !currentChat.responseInProgress) Qt.openUrlExternally(link) } onLinkHovered: function (link) { if (!isCurrentResponse || !currentChat.responseInProgress) statusBar.externalHoveredLink = link } MyMenu { id: conversationContextMenu MyMenuItem { text: qsTr("Copy") enabled: myTextArea.selectedText !== "" height: enabled ? implicitHeight : 0 onTriggered: myTextArea.copy() } MyMenuItem { text: qsTr("Copy Message") enabled: myTextArea.selectedText === "" height: enabled ? implicitHeight : 0 onTriggered: { myTextArea.selectAll() myTextArea.copy() myTextArea.deselect() } } MyMenuItem { text: textProcessor.shouldProcessText ? qsTr("Disable markdown") : qsTr("Enable markdown") height: enabled ? 
implicitHeight : 0 onTriggered: { textProcessor.shouldProcessText = !textProcessor.shouldProcessText; textProcessor.setValue(textContent); } } } ChatViewTextProcessor { id: textProcessor } function resetChatViewTextProcessor() { textProcessor.fontPixelSize = myTextArea.font.pixelSize textProcessor.codeColors.defaultColor = theme.codeDefaultColor textProcessor.codeColors.keywordColor = theme.codeKeywordColor textProcessor.codeColors.functionColor = theme.codeFunctionColor textProcessor.codeColors.functionCallColor = theme.codeFunctionCallColor textProcessor.codeColors.commentColor = theme.codeCommentColor textProcessor.codeColors.stringColor = theme.codeStringColor textProcessor.codeColors.numberColor = theme.codeNumberColor textProcessor.codeColors.headerColor = theme.codeHeaderColor textProcessor.codeColors.backgroundColor = theme.codeBackgroundColor textProcessor.textDocument = textDocument textProcessor.setValue(textContent); } property bool textProcessorReady: false Component.onCompleted: { resetChatViewTextProcessor(); textProcessorReady = true; } Connections { target: myTextArea function onTextContentChanged() { if (myTextArea.textProcessorReady) textProcessor.setValue(textContent); } } Connections { target: MySettings function onFontSizeChanged() { myTextArea.resetChatViewTextProcessor(); } function onChatThemeChanged() { myTextArea.resetChatViewTextProcessor(); } } Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: name === "Response: " ? 
"The response by the model" : "The prompt by the user" } ================================================ FILE: gpt4all-chat/qml/ChatView.qml ================================================ import Qt5Compat.GraphicalEffects import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Dialogs import QtQuick.Layouts import chatlistmodel import download import gpt4all import llm import localdocs import modellist import mysettings import network Rectangle { id: window Theme { id: theme } property var currentChat: ChatListModel.currentChat property var chatModel: currentChat.chatModel property var currentModelInfo: currentChat && currentChat.modelInfo property var currentModelId: null onCurrentModelInfoChanged: { const newId = currentModelInfo && currentModelInfo.id; if (currentModelId !== newId) { currentModelId = newId; } } signal addCollectionViewRequested() signal addModelViewRequested() color: theme.viewBackground Connections { target: currentChat // FIXME: https://github.com/nomic-ai/gpt4all/issues/3334 // function onResponseInProgressChanged() { // if (MySettings.networkIsActive && !currentChat.responseInProgress) // Network.sendConversation(currentChat.id, getConversationJson()); // } function onModelLoadingErrorChanged() { if (currentChat.modelLoadingError !== "") modelLoadingErrorPopup.open() } function onModelLoadingWarning(warning) { modelLoadingWarningPopup.open_(warning) } } function currentModelName() { return ModelList.modelInfo(currentChat.modelInfo.id).name; } function currentModelInstalled() { return currentModelName() !== "" && ModelList.modelInfo(currentChat.modelInfo.id).installed; } PopupDialog { id: modelLoadingErrorPopup anchors.centerIn: parent shouldTimeOut: false text: qsTr("
Encountered an error loading model:<br><br>" + "\"%1\"" + "<br><br>" + "Model loading failures can happen for a variety of reasons, but the most common " + "causes include a bad file format, an incomplete or corrupted download, the wrong file " + "type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:" + "<br>" + "• Ensure the model file has a compatible format and type<br>" + "• Check the model file is complete in the download folder<br>" + "• You can find the download folder in the settings dialog<br>" + "• If you've sideloaded the model ensure the file is not corrupt by checking md5sum<br>" + "• Read more about what models are supported in our documentation for the gui<br>" + "• Check out our discord channel for help").arg(currentChat.modelLoadingError); } PopupDialog { id: modelLoadingWarningPopup property string message anchors.centerIn: parent shouldTimeOut: false text: qsTr("Warning<br><br>%1
                  ").arg(message) function open_(msg) { message = msg; open(); } } ConfirmationDialog { id: switchModelDialog property int index: -1 dialogTitle: qsTr("Erase conversation?") description: qsTr("Changing the model will erase the current conversation.") } PopupDialog { id: copyMessage anchors.centerIn: parent text: qsTr("Conversation copied to clipboard.") font.pixelSize: theme.fontSizeLarge } PopupDialog { id: copyCodeMessage anchors.centerIn: parent text: qsTr("Code copied to clipboard.") font.pixelSize: theme.fontSizeLarge } ConfirmationDialog { id: resetContextDialog dialogTitle: qsTr("Erase conversation?") description: qsTr("The entire chat will be erased.") onAccepted: { Network.trackChatEvent("reset_context", { "length": chatModel.count }); currentChat.reset(); } } // FIXME: https://github.com/nomic-ai/gpt4all/issues/3334 // function getConversation() { // var conversation = ""; // for (var i = 0; i < chatModel.count; i++) { // var item = chatModel.get(i) // var string = item.name; // var isResponse = item.name === "Response: " // string += chatModel.get(i).value // if (isResponse && item.stopped) // string += " " // string += "\n" // conversation += string // } // return conversation // } // FIXME: https://github.com/nomic-ai/gpt4all/issues/3334 // function getConversationJson() { // var str = "{\"conversation\": ["; // for (var i = 0; i < chatModel.count; i++) { // var item = chatModel.get(i) // var isResponse = item.name === "Response: " // str += "{\"content\": "; // str += JSON.stringify(item.value) // str += ", \"role\": \"" + (isResponse ? "assistant" : "user") + "\""; // if (isResponse && item.thumbsUpState !== item.thumbsDownState) // str += ", \"rating\": \"" + (item.thumbsUpState ? 
"positive" : "negative") + "\""; // if (isResponse && item.newResponse !== "") // str += ", \"edited_content\": " + JSON.stringify(item.newResponse); // if (isResponse && item.stopped) // str += ", \"stopped\": \"true\"" // if (!isResponse) // str += "}," // else // str += ((i < chatModel.count - 1) ? "}," : "}") // } // return str + "]}" // } ChatDrawer { id: chatDrawer anchors.left: parent.left anchors.top: parent.top anchors.bottom: parent.bottom width: Math.max(180, Math.min(600, 0.23 * window.width)) } PopupDialog { id: referenceContextDialog anchors.centerIn: parent shouldTimeOut: false shouldShowBusy: false modal: true } Item { id: mainArea anchors.left: chatDrawer.right anchors.right: parent.right anchors.top: parent.top anchors.bottom: parent.bottom state: "expanded" states: [ State { name: "expanded" AnchorChanges { target: mainArea anchors.left: chatDrawer.right } }, State { name: "collapsed" AnchorChanges { target: mainArea anchors.left: parent.left } } ] function toggleLeftPanel() { if (mainArea.state === "expanded") { mainArea.state = "collapsed"; } else { mainArea.state = "expanded"; } } transitions: Transition { AnchorAnimation { easing.type: Easing.InOutQuad duration: 200 } } Rectangle { id: header anchors.left: parent.left anchors.right: parent.right anchors.top: parent.top height: 100 color: theme.conversationBackground RowLayout { id: comboLayout height: 80 anchors.left: parent.left anchors.right: parent.right anchors.verticalCenter: parent.verticalCenter spacing: 0 Rectangle { Layout.alignment: Qt.AlignLeft Layout.leftMargin: 30 Layout.fillWidth: true color: "transparent" Layout.preferredHeight: childrenRect.height MyToolButton { id: drawerButton anchors.left: parent.left backgroundColor: theme.iconBackgroundLight width: 40 height: 40 imageWidth: 40 imageHeight: 40 padding: 15 source: mainArea.state === "expanded" ? 
"qrc:/gpt4all/icons/left_panel_open.svg" : "qrc:/gpt4all/icons/left_panel_closed.svg" Accessible.role: Accessible.ButtonMenu Accessible.name: qsTr("Chat panel") Accessible.description: qsTr("Chat panel with options") onClicked: { mainArea.toggleLeftPanel() } } } ComboBox { id: comboBox Layout.alignment: Qt.AlignHCenter Layout.fillHeight: true Layout.preferredWidth: 550 Layout.leftMargin: { // This function works in tandem with the preferredWidth and the layout to // provide the maximum size combobox we can have at the smallest window width // we allow with the largest font size we allow. It is unfortunately based // upon a magic number that was produced through trial and error for something // I don't fully understand. return -Math.max(0, comboBox.width / 2 + collectionsButton.width + 110 /*magic*/ - comboLayout.width / 2); } enabled: !currentChat.isServer && !currentChat.trySwitchContextInProgress && !currentChat.isCurrentlyLoading && ModelList.selectableModels.count !== 0 model: ModelList.selectableModels valueRole: "id" textRole: "name" function changeModel(index) { currentChat.stopGenerating() currentChat.reset(); currentChat.modelInfo = ModelList.modelInfo(comboBox.valueAt(index)) } Connections { target: switchModelDialog function onAccepted() { comboBox.changeModel(switchModelDialog.index) } } background: Rectangle { color: theme.mainComboBackground radius: 10 ProgressBar { id: modelProgress anchors.bottom: parent.bottom anchors.horizontalCenter: parent.horizontalCenter width: contentRow.width + 20 visible: currentChat.isCurrentlyLoading height: 10 value: currentChat.modelLoadingPercentage background: Rectangle { color: theme.progressBackground radius: 10 } contentItem: Item { Rectangle { anchors.bottom: parent.bottom width: modelProgress.visualPosition * parent.width height: 10 radius: 2 color: theme.progressForeground } } } } contentItem: Item { RowLayout { id: contentRow anchors.centerIn: parent spacing: 0 Layout.maximumWidth: 550 RowLayout { id: 
miniButtonsRow clip: true Layout.maximumWidth: 550 Behavior on Layout.preferredWidth { NumberAnimation { duration: 300 easing.type: Easing.InOutQuad } } Layout.preferredWidth: { if (!(comboBox.hovered || reloadButton.hovered || ejectButton.hovered)) return 0 return (reloadButton.visible ? reloadButton.width : 0) + (ejectButton.visible ? ejectButton.width : 0) } MyMiniButton { id: reloadButton Layout.alignment: Qt.AlignCenter visible: currentChat.modelLoadingError === "" && !currentChat.trySwitchContextInProgress && !currentChat.isCurrentlyLoading && (currentChat.isModelLoaded || currentModelInstalled()) source: "qrc:/gpt4all/icons/regenerate.svg" backgroundColor: theme.textColor backgroundColorHovered: theme.styledTextColor onClicked: { if (currentChat.isModelLoaded) currentChat.forceReloadModel(); else currentChat.reloadModel(); } ToolTip.text: qsTr("Reload the currently loaded model") ToolTip.visible: hovered } MyMiniButton { id: ejectButton Layout.alignment: Qt.AlignCenter visible: currentChat.isModelLoaded && !currentChat.isCurrentlyLoading source: "qrc:/gpt4all/icons/eject.svg" backgroundColor: theme.textColor backgroundColorHovered: theme.styledTextColor onClicked: { currentChat.forceUnloadModel(); } ToolTip.text: qsTr("Eject the currently loaded model") ToolTip.visible: hovered } } Text { Layout.maximumWidth: 520 id: comboBoxText leftPadding: 10 rightPadding: 10 text: { if (ModelList.selectableModels.count === 0) return qsTr("No model installed.") if (currentChat.modelLoadingError !== "") return qsTr("Model loading error.") if (currentChat.trySwitchContextInProgress === 1) return qsTr("Waiting for model...") if (currentChat.trySwitchContextInProgress === 2) return qsTr("Switching context...") if (currentModelName() === "") return qsTr("Choose a model...") if (!currentModelInstalled()) return qsTr("Not found: %1").arg(currentModelName()) if (currentChat.modelLoadingPercentage === 0.0) return qsTr("Reload \u00B7 %1").arg(currentModelName()) if 
(currentChat.isCurrentlyLoading) return qsTr("Loading \u00B7 %1").arg(currentModelName()) return currentModelName() } font.pixelSize: theme.fontSizeLarger color: theme.iconBackgroundLight verticalAlignment: Text.AlignVCenter horizontalAlignment: Text.AlignHCenter elide: Text.ElideRight } Item { Layout.minimumWidth: updown.width Layout.minimumHeight: updown.height Image { id: updown anchors.verticalCenter: parent.verticalCenter sourceSize.width: comboBoxText.font.pixelSize sourceSize.height: comboBoxText.font.pixelSize mipmap: true visible: false source: "qrc:/gpt4all/icons/up_down.svg" } ColorOverlay { anchors.fill: updown source: updown color: comboBoxText.color } } } } delegate: ItemDelegate { id: comboItemDelegate width: comboItemPopup.width -20 contentItem: Text { text: name color: theme.textColor font: comboBox.font elide: Text.ElideRight verticalAlignment: Text.AlignVCenter } background: Rectangle { radius: 10 color: highlighted ? theme.menuHighlightColor : theme.menuBackgroundColor } highlighted: comboBox.highlightedIndex === index } indicator: Item { } popup: Popup { id: comboItemPopup y: comboBox.height - 1 width: comboBox.width implicitHeight: Math.min(window.height - y, contentItem.implicitHeight + 20) padding: 0 contentItem: Rectangle { implicitWidth: comboBox.width implicitHeight: comboItemPopupListView.implicitHeight color: "transparent" radius: 10 ScrollView { anchors.fill: parent anchors.margins: 10 clip: true ScrollBar.vertical.policy: ScrollBar.AsNeeded ScrollBar.horizontal.policy: ScrollBar.AlwaysOff ListView { id: comboItemPopupListView implicitHeight: contentHeight model: comboBox.popup.visible ? 
comboBox.delegateModel : null currentIndex: comboBox.highlightedIndex ScrollIndicator.vertical: ScrollIndicator { } } } } background: Rectangle { border.color: theme.menuBorderColor border.width: 1 color: theme.menuBackgroundColor radius: 10 } } Accessible.name: currentModelName() Accessible.description: qsTr("The top item is the current model") onActivated: function (index) { var newInfo = ModelList.modelInfo(comboBox.valueAt(index)); if (newInfo === currentChat.modelInfo) { currentChat.reloadModel(); } else if (currentModelName() !== "" && chatModel.count !== 0) { switchModelDialog.index = index; switchModelDialog.open(); } else { comboBox.changeModel(index); } } } Rectangle { color: "transparent" Layout.alignment: Qt.AlignRight Layout.rightMargin: 30 Layout.fillWidth: true Layout.preferredHeight: childrenRect.height clip: true MyButton { id: collectionsButton clip: true anchors.right: parent.right borderWidth: 0 backgroundColor: theme.collectionsButtonBackground backgroundColorHovered: theme.collectionsButtonBackgroundHovered backgroundRadius: 5 padding: 15 topPadding: 8 bottomPadding: 8 contentItem: RowLayout { spacing: 10 Item { visible: currentChat.collectionModel.count === 0 Layout.minimumWidth: collectionsImage.width Layout.minimumHeight: collectionsImage.height Image { id: collectionsImage anchors.verticalCenter: parent.verticalCenter sourceSize.width: 24 sourceSize.height: 24 mipmap: true visible: false source: "qrc:/gpt4all/icons/db.svg" } ColorOverlay { anchors.fill: collectionsImage source: collectionsImage color: theme.collectionsButtonForeground } } MyBusyIndicator { visible: currentChat.collectionModel.updatingCount !== 0 color: theme.collectionsButtonProgress size: 24 Layout.minimumWidth: 24 Layout.minimumHeight: 24 Text { anchors.centerIn: parent text: currentChat.collectionModel.updatingCount color: theme.collectionsButtonForeground font.pixelSize: 14 // fixed regardless of theme } } Rectangle { visible: currentChat.collectionModel.count !== 0 
radius: 6 color: theme.collectionsButtonForeground Layout.minimumWidth: collectionsImage.width Layout.minimumHeight: collectionsImage.height Text { anchors.centerIn: parent text: currentChat.collectionModel.count color: theme.collectionsButtonText font.pixelSize: 14 // fixed regardless of theme } } Text { text: qsTr("LocalDocs") color: theme.collectionsButtonForeground font.pixelSize: theme.fontSizeLarge } } fontPixelSize: theme.fontSizeLarge background: Rectangle { radius: collectionsButton.backgroundRadius // TODO(jared): either use collectionsButton-specific theming, or don't - this is inconsistent color: conversation.state === "expanded" ? ( collectionsButton.hovered ? theme.lightButtonBackgroundHovered : theme.lightButtonBackground ) : ( collectionsButton.hovered ? theme.lighterButtonBackground : theme.lighterButtonBackgroundHovered ) } Accessible.name: qsTr("Add documents") Accessible.description: qsTr("add collections of documents to the chat") onClicked: { conversation.toggleRightPanel() } } } } } Rectangle { id: conversationDivider anchors.top: header.bottom anchors.left: parent.left anchors.right: parent.right color: theme.conversationDivider height: 1 } CollectionsDrawer { id: collectionsDrawer anchors.right: parent.right anchors.top: conversationDivider.bottom anchors.bottom: parent.bottom width: Math.max(180, Math.min(600, 0.23 * window.width)) color: theme.conversationBackground onAddDocsClicked: { addCollectionViewRequested() } } Rectangle { id: conversation color: theme.conversationBackground anchors.left: parent.left anchors.right: parent.right anchors.bottom: parent.bottom anchors.top: conversationDivider.bottom state: "collapsed" states: [ State { name: "expanded" AnchorChanges { target: conversation anchors.right: collectionsDrawer.left } }, State { name: "collapsed" AnchorChanges { target: conversation anchors.right: parent.right } } ] function toggleRightPanel() { if (conversation.state === "expanded") { conversation.state = "collapsed"; } 
else { conversation.state = "expanded"; } } transitions: Transition { AnchorAnimation { easing.type: Easing.InOutQuad duration: 300 } } ScrollView { id: scrollView anchors.left: parent.left anchors.right: parent.right anchors.top: parent.top anchors.bottom: !currentChat.isServer ? textInputView.top : parent.bottom anchors.bottomMargin: !currentChat.isServer ? 30 : 0 ScrollBar.vertical.policy: ScrollBar.AlwaysOff Rectangle { anchors.fill: parent color: currentChat.isServer ? theme.black : theme.conversationBackground Rectangle { id: homePage color: "transparent" anchors.fill: parent z: 200 visible: !currentChat.isModelLoaded && (ModelList.selectableModels.count === 0 || currentModelName() === "") && !currentChat.isServer ColumnLayout { visible: ModelList.selectableModels.count !== 0 id: modelInstalledLabel anchors.centerIn: parent spacing: 0 Rectangle { Layout.alignment: Qt.AlignCenter Layout.preferredWidth: image.width Layout.preferredHeight: image.height color: "transparent" Image { id: image anchors.centerIn: parent sourceSize.width: 160 sourceSize.height: 110 fillMode: Image.PreserveAspectFit mipmap: true visible: false source: "qrc:/gpt4all/icons/nomic_logo.svg" } ColorOverlay { anchors.fill: image source: image color: theme.containerBackground } } } MyButton { id: loadDefaultModelButton visible: ModelList.selectableModels.count !== 0 anchors.top: modelInstalledLabel.bottom anchors.topMargin: 50 anchors.horizontalCenter: modelInstalledLabel.horizontalCenter rightPadding: 60 leftPadding: 60 property string defaultModel: "" property string defaultModelName: "" function updateDefaultModel() { var i = comboBox.find(MySettings.userDefaultModel) if (i !== -1) { defaultModel = comboBox.valueAt(i); } else { defaultModel = comboBox.count ? 
comboBox.valueAt(0) : ""; } if (defaultModel !== "") { defaultModelName = ModelList.modelInfo(defaultModel).name; } else { defaultModelName = ""; } } text: qsTr("Load \u00B7 %1 (default) \u2192").arg(defaultModelName); onClicked: { var i = comboBox.find(MySettings.userDefaultModel) if (i !== -1) { comboBox.changeModel(i); } else { comboBox.changeModel(0); } } // This requires a bit of work because apparently the combobox valueAt // function only works after the combobox component is loaded so we have // to use our own component loaded to make this work along with a signal // from MySettings for when the setting for user default model changes Connections { target: MySettings function onUserDefaultModelChanged() { loadDefaultModelButton.updateDefaultModel() } } Component.onCompleted: { loadDefaultModelButton.updateDefaultModel() } Accessible.role: Accessible.Button Accessible.name: qsTr("Load the default model") Accessible.description: qsTr("Loads the default model which can be changed in settings") } ColumnLayout { id: noModelInstalledLabel visible: ModelList.selectableModels.count === 0 anchors.centerIn: parent spacing: 0 Text { Layout.alignment: Qt.AlignCenter text: qsTr("No Model Installed") color: theme.mutedLightTextColor font.pixelSize: theme.fontSizeBannerSmall } Text { Layout.topMargin: 15 horizontalAlignment: Qt.AlignHCenter color: theme.mutedLighterTextColor text: qsTr("GPT4All requires that you install at least one\nmodel to get started") font.pixelSize: theme.fontSizeLarge } } MyButton { visible: ModelList.selectableModels.count === 0 anchors.top: noModelInstalledLabel.bottom anchors.topMargin: 50 anchors.horizontalCenter: noModelInstalledLabel.horizontalCenter rightPadding: 60 leftPadding: 60 text: qsTr("Install a Model") onClicked: { addModelViewRequested(); } Accessible.role: Accessible.Button Accessible.name: qsTr("Shows the add model view") } } ColumnLayout { anchors.fill: parent visible: ModelList.selectableModels.count !== 0 ListView { id: 
listView Layout.maximumWidth: 1280 Layout.fillHeight: true Layout.fillWidth: true Layout.margins: 20 Layout.leftMargin: 50 Layout.rightMargin: 50 Layout.alignment: Qt.AlignHCenter spacing: 10 model: chatModel cacheBuffer: 2147483647 ScrollBar.vertical: ScrollBar { policy: ScrollBar.AsNeeded } Accessible.role: Accessible.List Accessible.name: qsTr("Conversation with the model") Accessible.description: qsTr("prompt / response pairs from the conversation") delegate: ChatItemView { width: listView.contentItem.width - 15 inputBoxText: textInput.text onSetInputBoxText: text => { textInput.text = text; textInput.forceActiveFocus(); textInput.cursorPosition = text.length; } height: visible ? implicitHeight : 0 visible: name !== "ToolResponse: " && name !== "System: " } remove: Transition { OpacityAnimator { to: 0; duration: 500 } } function scrollToEnd() { listView.positionViewAtEnd() } onContentHeightChanged: { if (atYEnd) scrollToEnd() } } } } } Rectangle { id: conversationTrayContent anchors.bottom: conversationTrayButton.top anchors.horizontalCenter: conversationTrayButton.horizontalCenter width: conversationTrayContentLayout.width height: conversationTrayContentLayout.height color: theme.containerBackground radius: 5 opacity: 0 visible: false clip: true z: 400 property bool isHovered: ( conversationTrayButton.isHovered || resetContextButton.hovered || copyChatButton.hovered ) state: conversationTrayContent.isHovered ? 
"expanded" : "collapsed" states: [ State { name: "expanded" PropertyChanges { target: conversationTrayContent; opacity: 1 } }, State { name: "collapsed" PropertyChanges { target: conversationTrayContent; opacity: 0 } } ] transitions: [ Transition { from: "collapsed" to: "expanded" SequentialAnimation { ScriptAction { script: conversationTrayContent.visible = true } PropertyAnimation { target: conversationTrayContent property: "opacity" duration: 300 easing.type: Easing.InOutQuad } } }, Transition { from: "expanded" to: "collapsed" SequentialAnimation { PropertyAnimation { target: conversationTrayContent property: "opacity" duration: 300 easing.type: Easing.InOutQuad } ScriptAction { script: conversationTrayContent.visible = false } } } ] RowLayout { id: conversationTrayContentLayout spacing: 0 MyToolButton { id: resetContextButton Layout.preferredWidth: 40 Layout.preferredHeight: 40 source: "qrc:/gpt4all/icons/recycle.svg" imageWidth: 20 imageHeight: 20 onClicked: resetContextDialog.open() ToolTip.visible: resetContextButton.hovered ToolTip.text: qsTr("Erase and reset chat session") } MyToolButton { id: copyChatButton Layout.preferredWidth: 40 Layout.preferredHeight: 40 source: "qrc:/gpt4all/icons/copy.svg" imageWidth: 20 imageHeight: 20 TextEdit{ id: copyEdit visible: false } onClicked: { chatModel.copyToClipboard() copyMessage.open() } ToolTip.visible: copyChatButton.hovered ToolTip.text: qsTr("Copy chat session to clipboard") } } } Item { id: conversationTrayButton anchors.bottom: textInputView.top anchors.horizontalCenter: textInputView.horizontalCenter width: 40 height: 30 visible: chatModel.count && !currentChat.isServer && currentChat.isModelLoaded property bool isHovered: conversationTrayMouseAreaButton.containsMouse MouseArea { id: conversationTrayMouseAreaButton anchors.fill: parent hoverEnabled: true } Text { id: conversationTrayTextButton anchors.centerIn: parent horizontalAlignment: Qt.AlignHCenter leftPadding: 5 rightPadding: 5 text: 
"\u00B7\u00B7\u00B7" color: theme.textColor font.pixelSize: 30 // fixed size font.bold: true } } MyButton { anchors.bottom: textInputView.top anchors.horizontalCenter: textInputView.horizontalCenter anchors.bottomMargin: 20 textColor: theme.textColor visible: !currentChat.isServer && !currentChat.isModelLoaded && currentChat.modelLoadingError === "" && !currentChat.trySwitchContextInProgress && !currentChat.isCurrentlyLoading && currentModelInstalled() Image { anchors.verticalCenter: parent.verticalCenter anchors.left: parent.left anchors.leftMargin: 15 sourceSize.width: 15 sourceSize.height: 15 source: "qrc:/gpt4all/icons/regenerate.svg" } leftPadding: 40 onClicked: { currentChat.reloadModel(); } borderWidth: 1 backgroundColor: theme.conversationButtonBackground backgroundColorHovered: theme.conversationButtonBackgroundHovered backgroundRadius: 5 padding: 15 topPadding: 8 bottomPadding: 8 text: qsTr("Reload \u00B7 %1").arg(currentChat.modelInfo.name) fontPixelSize: theme.fontSizeSmall Accessible.description: qsTr("Reloads the model") } Text { id: statusBar property string externalHoveredLink: "" anchors.top: textInputView.bottom anchors.bottom: parent.bottom anchors.right: parent.right anchors.rightMargin: 30 anchors.left: parent.left anchors.leftMargin: 30 horizontalAlignment: Qt.AlignRight verticalAlignment: Qt.AlignVCenter color: textInputView.error !== null ? 
theme.textErrorColor : theme.mutedTextColor visible: currentChat.tokenSpeed !== "" || externalHoveredLink !== "" || textInputView.error !== null elide: Text.ElideRight wrapMode: Text.WordWrap text: { if (externalHoveredLink !== "") return externalHoveredLink if (textInputView.error !== null) return textInputView.error; const segments = [currentChat.tokenSpeed]; const device = currentChat.device; const backend = currentChat.deviceBackend; if (device !== null) { // device is null if we have no model loaded var deviceSegment = device; if (backend === "CUDA" || backend === "Vulkan") deviceSegment += ` (${backend})`; segments.push(deviceSegment); } const fallbackReason = currentChat.fallbackReason; if (fallbackReason !== null && fallbackReason !== "") segments.push(fallbackReason); return segments.join(" \u00B7 "); } font.pixelSize: theme.fontSizeSmaller font.bold: true onLinkActivated: function(link) { Qt.openUrlExternally(link) } } RectangularGlow { id: effect visible: !currentChat.isServer && ModelList.selectableModels.count !== 0 anchors.fill: textInputView glowRadius: 50 spread: 0 color: theme.sendGlow cornerRadius: 10 opacity: 0.1 } ListModel { id: attachmentModel function getAttachmentUrls() { var urls = []; for (var i = 0; i < attachmentModel.count; i++) { var item = attachmentModel.get(i); urls.push(item.url); } return urls; } } Rectangle { id: textInputView color: theme.controlBackground border.width: error === null ? 1 : 2 border.color: error === null ? 
theme.controlBorder : theme.textErrorColor radius: 10 anchors.left: parent.left anchors.right: parent.right anchors.bottom: parent.bottom anchors.margins: 30 anchors.leftMargin: Math.max((parent.width - 1310) / 2, 30) anchors.rightMargin: Math.max((parent.width - 1310) / 2, 30) height: textInputViewLayout.implicitHeight visible: !currentChat.isServer && ModelList.selectableModels.count !== 0 property var error: null function checkError() { const info = currentModelInfo; if (info === null || !info.id) { error = null; } else if (info.chatTemplate.isLegacy) { error = qsTr("Legacy prompt template needs to be " + "updated" + " in Settings."); } else if (!info.chatTemplate.isSet) { error = qsTr("No " + "chat template configured."); } else if (/^\s*$/.test(info.chatTemplate.value)) { error = qsTr("The " + "chat template cannot be blank."); } else if (info.systemMessage.isLegacy) { error = qsTr("Legacy system prompt needs to be " + "updated" + " in Settings."); } else error = null; } Component.onCompleted: checkError() Connections { target: window function onCurrentModelIdChanged() { textInputView.checkError(); } } Connections { target: MySettings function onChatTemplateChanged(info) { if (info.id === window.currentModelId) textInputView.checkError(); } function onSystemMessageChanged(info) { if (info.id === window.currentModelId) textInputView.checkError(); } } MouseArea { id: textInputViewMouseArea anchors.fill: parent onClicked: (mouse) => { if (textInput.enabled) textInput.forceActiveFocus(); } } GridLayout { id: textInputViewLayout anchors.left: parent.left anchors.right: parent.right rows: 2 columns: 3 rowSpacing: 10 columnSpacing: 0 Flow { id: attachmentsFlow visible: attachmentModel.count Layout.row: 0 Layout.column: 1 Layout.topMargin: 15 Layout.leftMargin: 5 Layout.rightMargin: 15 spacing: 10 Repeater { model: attachmentModel Rectangle { width: 350 height: 50 radius: 5 color: theme.attachmentBackground border.color: theme.controlBorder Row { spacing: 5 
anchors.fill: parent anchors.margins: 5 MyFileIcon { iconSize: 40 fileName: model.file } Text { width: 265 height: 40 text: model.file color: theme.textColor horizontalAlignment: Text.AlignLeft verticalAlignment: Text.AlignVCenter font.pixelSize: theme.fontSizeMedium font.bold: true wrapMode: Text.WrapAnywhere elide: Text.ElideRight } } MyMiniButton { id: removeAttachmentButton anchors.top: parent.top anchors.right: parent.right backgroundColor: theme.textColor backgroundColorHovered: theme.iconBackgroundDark source: "qrc:/gpt4all/icons/close.svg" onClicked: { attachmentModel.remove(index) if (textInput.enabled) textInput.forceActiveFocus(); } } } } } MyToolButton { id: plusButton Layout.row: 1 Layout.column: 0 Layout.leftMargin: 15 Layout.rightMargin: 15 Layout.alignment: Qt.AlignCenter backgroundColor: theme.conversationInputButtonBackground backgroundColorHovered: theme.conversationInputButtonBackgroundHovered imageWidth: theme.fontSizeLargest imageHeight: theme.fontSizeLargest visible: !currentChat.isServer && ModelList.selectableModels.count !== 0 && currentChat.isModelLoaded enabled: !currentChat.responseInProgress source: "qrc:/gpt4all/icons/paperclip.svg" Accessible.name: qsTr("Add media") Accessible.description: qsTr("Adds media to the prompt") onClicked: (mouse) => { addMediaMenu.open() } } ScrollView { id: textInputScrollView Layout.row: 1 Layout.column: 1 Layout.fillWidth: true Layout.leftMargin: plusButton.visible ? 5 : 15 Layout.margins: 15 height: Math.min(contentHeight, 200) MyTextArea { id: textInput color: theme.textColor padding: 0 enabled: currentChat.isModelLoaded && !currentChat.isServer onEnabledChanged: { if (textInput.enabled) textInput.forceActiveFocus(); } font.pixelSize: theme.fontSizeLarger placeholderText: currentChat.isModelLoaded ?
qsTr("Send a message...") : qsTr("Load a model to continue...") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Send messages/prompts to the model") Keys.onReturnPressed: event => { if (event.modifiers & Qt.ControlModifier || event.modifiers & Qt.ShiftModifier) { event.accepted = false; } else if (!chatModel.hasError && textInputView.error === null) { editingFinished(); sendMessage(); } } function sendMessage() { if ((textInput.text === "" && attachmentModel.count === 0) || currentChat.responseInProgress) return currentChat.stopGenerating() currentChat.newPromptResponsePair(textInput.text, attachmentModel.getAttachmentUrls()) attachmentModel.clear(); textInput.text = "" } MouseArea { id: textInputMouseArea anchors.fill: parent acceptedButtons: Qt.RightButton onClicked: (mouse) => { if (mouse.button === Qt.RightButton) { textInputContextMenu.x = textInputMouseArea.mouseX textInputContextMenu.y = textInputMouseArea.mouseY textInputContextMenu.open() } } } background: Rectangle { implicitWidth: 150 color: "transparent" } MyMenu { id: textInputContextMenu MyMenuItem { text: qsTr("Cut") enabled: textInput.selectedText !== "" height: enabled ? implicitHeight : 0 onTriggered: textInput.cut() } MyMenuItem { text: qsTr("Copy") enabled: textInput.selectedText !== "" height: enabled ? 
implicitHeight : 0 onTriggered: textInput.copy() } MyMenuItem { text: qsTr("Paste") onTriggered: textInput.paste() } MyMenuItem { text: qsTr("Select All") onTriggered: textInput.selectAll() } } } } Row { Layout.row: 1 Layout.column: 2 Layout.rightMargin: 15 Layout.alignment: Qt.AlignCenter MyToolButton { id: stopButton backgroundColor: theme.conversationInputButtonBackground backgroundColorHovered: theme.conversationInputButtonBackgroundHovered visible: currentChat.responseInProgress && !currentChat.isServer background: Item { anchors.fill: parent Image { id: stopImage anchors.centerIn: parent visible: false fillMode: Image.PreserveAspectFit mipmap: true sourceSize.width: theme.fontSizeLargest sourceSize.height: theme.fontSizeLargest source: "qrc:/gpt4all/icons/stop_generating.svg" } Rectangle { anchors.centerIn: stopImage width: theme.fontSizeLargest + 8 height: theme.fontSizeLargest + 8 color: theme.viewBackground border.pixelAligned: false border.color: theme.controlBorder border.width: 1 radius: width / 2 } ColorOverlay { anchors.fill: stopImage source: stopImage color: stopButton.hovered ? stopButton.backgroundColorHovered : stopButton.backgroundColor } } Accessible.name: qsTr("Stop generating") Accessible.description: qsTr("Stop the current response generation") ToolTip.visible: stopButton.hovered ToolTip.text: Accessible.description onClicked: { // FIXME: This no longer sets a 'stopped' field so conversations that // are copied to clipboard or to datalake don't indicate if the user // has prematurely stopped the response. This has been broken since // v3.0.0 at least. 
currentChat.stopGenerating() } } MyToolButton { id: sendButton backgroundColor: theme.conversationInputButtonBackground backgroundColorHovered: theme.conversationInputButtonBackgroundHovered imageWidth: theme.fontSizeLargest imageHeight: theme.fontSizeLargest visible: !currentChat.responseInProgress && !currentChat.isServer && ModelList.selectableModels.count !== 0 enabled: !chatModel.hasError && textInputView.error === null source: "qrc:/gpt4all/icons/send_message.svg" Accessible.name: qsTr("Send message") Accessible.description: qsTr("Sends the message/prompt contained in textfield to the model") ToolTip.visible: sendButton.hovered ToolTip.text: Accessible.description onClicked: { textInput.sendMessage() } } } } } MyFileDialog { id: fileDialog nameFilters: ["All Supported Files (*.txt *.md *.rst *.xlsx)", "Text Files (*.txt *.md *.rst)", "Excel Worksheets (*.xlsx)"] } MyMenu { id: addMediaMenu x: textInputView.x y: textInputView.y - addMediaMenu.height - 10; title: qsTr("Attach") MyMenuItem { text: qsTr("Single File") icon.source: "qrc:/gpt4all/icons/file.svg" icon.width: 24 icon.height: 24 onClicked: { fileDialog.openFileDialog(StandardPaths.writableLocation(StandardPaths.HomeLocation), function(selectedFile) { if (selectedFile) { var file = selectedFile.toString().split("/").pop() attachmentModel.append({ file: file, url: selectedFile }) } if (textInput.enabled) textInput.forceActiveFocus(); }) } } } } } } ================================================ FILE: gpt4all-chat/qml/CollectionsDrawer.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import chatlistmodel import localdocs import llm Rectangle { id: collectionsDrawer color: "transparent" signal addDocsClicked property var currentChat: ChatListModel.currentChat Rectangle { id: borderLeft anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: parent.left width: 1 
color: theme.dividerColor } ScrollView { id: scrollView anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: borderLeft.right anchors.right: parent.right anchors.margins: 2 anchors.bottomMargin: 10 clip: true contentHeight: 300 ScrollBar.vertical.policy: ScrollBar.AsNeeded ListView { id: listView model: LocalDocs.localDocsModel anchors.fill: parent anchors.margins: 13 anchors.bottomMargin: 5 boundsBehavior: Flickable.StopAtBounds spacing: 15 delegate: Rectangle { width: listView.width height: childrenRect.height + 15 color: checkBox.checked ? theme.collectionsButtonBackground : "transparent" RowLayout { anchors.top: parent.top anchors.left: parent.left anchors.right: parent.right anchors.margins: 7.5 MyCheckBox { id: checkBox Layout.alignment: Qt.AlignLeft checked: currentChat.hasCollection(collection) onClicked: { if (checkBox.checked) { currentChat.addCollection(collection) } else { currentChat.removeCollection(collection) } } ToolTip.text: qsTr("Warning: searching collections while indexing can return incomplete results") ToolTip.visible: hovered && model.indexing } ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: collection font.pixelSize: theme.fontSizeLarger elide: Text.ElideRight color: theme.textColor } Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: "%1 – %2".arg(qsTr("%n file(s)", "", model.totalDocs)).arg(qsTr("%n word(s)", "", model.totalWords)) elide: Text.ElideRight color: theme.mutedTextColor font.pixelSize: theme.fontSizeSmall } RowLayout { visible: model.updating Layout.fillWidth: true Layout.alignment: Qt.AlignLeft MyBusyIndicator { color: theme.accentColor size: 24 Layout.minimumWidth: 24 Layout.minimumHeight: 24 } Text { text: qsTr("Updating") elide: Text.ElideRight color: theme.accentColor font.pixelSize: theme.fontSizeSmall font.bold: true } } } } } footer: ColumnLayout { width: listView.width spacing: 30 Rectangle { 
visible: listView.count !== 0 Layout.topMargin: 30 Layout.fillWidth: true height: 1 color: theme.dividerColor } MySettingsButton { id: collectionSettings enabled: LocalDocs.databaseValid Layout.alignment: Qt.AlignCenter text: qsTr("\uFF0B Add Docs") font.pixelSize: theme.fontSizeLarger onClicked: { addDocsClicked() } } Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: qsTr("Select a collection to make it available to the chat model.") font.pixelSize: theme.fontSizeLarger wrapMode: Text.WordWrap elide: Text.ElideRight color: theme.mutedTextColor } } } } } ================================================ FILE: gpt4all-chat/qml/ConfirmationDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts MyDialog { id: confirmationDialog anchors.centerIn: parent modal: true padding: 20 property alias dialogTitle: titleText.text property alias description: descriptionText.text Theme { id: theme } contentItem: ColumnLayout { Text { id: titleText Layout.alignment: Qt.AlignHCenter textFormat: Text.StyledText color: theme.textColor font.pixelSize: theme.fontSizeLarger font.bold: true } Text { id: descriptionText Layout.alignment: Qt.AlignHCenter textFormat: Text.StyledText color: theme.textColor font.pixelSize: theme.fontSizeMedium } } footer: DialogButtonBox { id: dialogBox padding: 20 alignment: Qt.AlignRight spacing: 10 MySettingsButton { text: qsTr("OK") textColor: theme.mediumButtonText backgroundColor: theme.mediumButtonBackground backgroundColorHovered: theme.mediumButtonBackgroundHovered DialogButtonBox.buttonRole: DialogButtonBox.AcceptRole } MySettingsButton { text: qsTr("Cancel") DialogButtonBox.buttonRole: DialogButtonBox.RejectRole } background: Rectangle { color: "transparent" } Keys.onEnterPressed: confirmationDialog.accept() Keys.onReturnPressed: confirmationDialog.accept() } Component.onCompleted: dialogBox.forceActiveFocus() } 
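Aside: the status-bar `text` binding in ChatView.qml above assembles its display string from optional segments (token speed, device with an optional backend annotation, and a fallback reason) joined by a middle dot. A minimal plain-JavaScript sketch of that logic, for illustration only — the function name `statusBarText` is ours, not part of the repo, and the QML version additionally short-circuits on hovered links and input errors:

```javascript
// Sketch of the ChatView status-bar segment assembly (assumed standalone
// re-creation; parameter names mirror the currentChat properties in the QML).
function statusBarText(tokenSpeed, device, backend, fallbackReason) {
    const segments = [tokenSpeed];
    if (device !== null) {                      // device is null if no model is loaded
        let deviceSegment = device;
        if (backend === "CUDA" || backend === "Vulkan")
            deviceSegment += ` (${backend})`;   // only these backends get annotated
        segments.push(deviceSegment);
    }
    if (fallbackReason !== null && fallbackReason !== "")
        segments.push(fallbackReason);
    return segments.join(" \u00B7 ");           // " · " separator, as in the QML
}
```

For example, `statusBarText("12.5 tok/s", "NVIDIA GeForce RTX", "CUDA", "")` yields "12.5 tok/s · NVIDIA GeForce RTX (CUDA)", while a null device collapses the bar to just the token speed.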
================================================ FILE: gpt4all-chat/qml/HomeView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings Rectangle { id: homeView Theme { id: theme } color: theme.viewBackground signal chatViewRequested() signal localDocsViewRequested() signal settingsViewRequested(int page) signal addModelViewRequested() property bool shouldShowFirstStart: false ColumnLayout { id: mainArea anchors.fill: parent anchors.margins: 30 spacing: 30 ColumnLayout { Layout.fillWidth: true Layout.maximumWidth: 1530 Layout.alignment: Qt.AlignCenter Layout.topMargin: 20 spacing: 30 ColumnLayout { Layout.alignment: Qt.AlignHCenter spacing: 5 Text { id: welcome Layout.alignment: Qt.AlignHCenter text: qsTr("Welcome to GPT4All") font.pixelSize: theme.fontSizeBannerLarge color: theme.titleTextColor } Text { Layout.alignment: Qt.AlignHCenter text: qsTr("The privacy-first LLM chat application") font.pixelSize: theme.fontSizeLarge color: theme.titleInfoTextColor } } MyButton { id: startChat visible: shouldShowFirstStart Layout.alignment: Qt.AlignHCenter text: qsTr("Start chatting") onClicked: { chatViewRequested() } } RowLayout { spacing: 15 visible: !startChat.visible Layout.alignment: Qt.AlignHCenter MyWelcomeButton { Layout.fillWidth: true Layout.maximumWidth: 150 + 200 * theme.fontScale Layout.preferredHeight: 40 + 90 * theme.fontScale text: qsTr("Start Chatting") description: qsTr("Chat with any LLM") imageSource: "qrc:/gpt4all/icons/chat.svg" onClicked: { chatViewRequested() } } MyWelcomeButton { Layout.fillWidth: true Layout.maximumWidth: 150 + 200 * theme.fontScale Layout.preferredHeight: 40 + 90 * theme.fontScale text: qsTr("LocalDocs") description: qsTr("Chat with your local files") imageSource: 
"qrc:/gpt4all/icons/db.svg" onClicked: { localDocsViewRequested() } } MyWelcomeButton { Layout.fillWidth: true Layout.maximumWidth: 150 + 200 * theme.fontScale Layout.preferredHeight: 40 + 90 * theme.fontScale text: qsTr("Find Models") description: qsTr("Explore and download models") imageSource: "qrc:/gpt4all/icons/models.svg" onClicked: { addModelViewRequested() } } } Item { visible: !startChat.visible && Download.latestNews !== "" Layout.fillWidth: true Layout.fillHeight: true Layout.minimumHeight: 120 Layout.maximumHeight: textAreaNews.height Rectangle { id: roundedFrameNews // latest news anchors.fill: parent z: 299 radius: 10 border.width: 1 border.color: theme.controlBorder color: "transparent" clip: true } Item { anchors.fill: parent layer.enabled: true layer.effect: OpacityMask { maskSource: Rectangle { width: roundedFrameNews.width height: roundedFrameNews.height radius: 10 } } RowLayout { spacing: 0 anchors.fill: parent Rectangle { color: "transparent" width: 82 height: 100 Image { id: newsImg anchors.centerIn: parent sourceSize: Qt.size(48, 48) mipmap: true visible: false source: "qrc:/gpt4all/icons/gpt4all_transparent.svg" } ColorOverlay { anchors.fill: newsImg source: newsImg color: theme.styledTextColor } } Item { id: myItem Layout.fillWidth: true Layout.fillHeight: true Rectangle { anchors.fill: parent color: theme.conversationBackground } ScrollView { id: newsScroll anchors.fill: parent clip: true ScrollBar.vertical.policy: ScrollBar.AsNeeded ScrollBar.horizontal.policy: ScrollBar.AlwaysOff Text { id: textAreaNews width: myItem.width padding: 20 color: theme.styledTextColor font.pixelSize: theme.fontSizeLarger textFormat: TextEdit.MarkdownText wrapMode: Text.WordWrap text: Download.latestNews focus: false Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Latest news") Accessible.description: qsTr("Latest news from GPT4All") onLinkActivated: function(link) { Qt.openUrlExternally(link); } } } } } } } } Rectangle { id: linkBar 
Layout.alignment: Qt.AlignBottom Layout.fillWidth: true border.width: 1 border.color: theme.dividerColor radius: 6 z: 200 height: 30 color: theme.conversationBackground RowLayout { anchors.fill: parent spacing: 0 RowLayout { Layout.alignment: Qt.AlignLeft | Qt.AlignVCenter spacing: 4 MyFancyLink { text: qsTr("Release Notes") imageSource: "qrc:/gpt4all/icons/notes.svg" onClicked: { Qt.openUrlExternally("https://github.com/nomic-ai/gpt4all/releases") } } MyFancyLink { text: qsTr("Documentation") imageSource: "qrc:/gpt4all/icons/info.svg" onClicked: { Qt.openUrlExternally("https://docs.gpt4all.io/") } } MyFancyLink { text: qsTr("Discord") imageSource: "qrc:/gpt4all/icons/discord.svg" onClicked: { Qt.openUrlExternally("https://discord.gg/4M2QFmTt2k") } } MyFancyLink { text: qsTr("X (Twitter)") imageSource: "qrc:/gpt4all/icons/twitter.svg" onClicked: { Qt.openUrlExternally("https://twitter.com/nomic_ai") } } MyFancyLink { text: qsTr("Github") imageSource: "qrc:/gpt4all/icons/github.svg" onClicked: { Qt.openUrlExternally("https://github.com/nomic-ai/gpt4all") } } } RowLayout { Layout.alignment: Qt.AlignRight | Qt.AlignVCenter spacing: 40 MyFancyLink { text: qsTr("nomic.ai") imageSource: "qrc:/gpt4all/icons/globe.svg" onClicked: { Qt.openUrlExternally("https://www.nomic.ai/gpt4all") } rightPadding: 15 } } } } } Rectangle { anchors.top: mainArea.top anchors.right: mainArea.right border.width: 1 border.color: theme.dividerColor radius: 6 z: 200 height: 30 color: theme.conversationBackground width: subscribeLink.width RowLayout { anchors.centerIn: parent MyFancyLink { id: subscribeLink Layout.alignment: Qt.AlignCenter text: qsTr("Subscribe to Newsletter") imageSource: "qrc:/gpt4all/icons/email.svg" onClicked: { Qt.openUrlExternally("https://nomic.ai/gpt4all/#newsletter-form") } } } } } ================================================ FILE: gpt4all-chat/qml/LocalDocsSettings.qml ================================================ import QtCore import QtQuick import 
QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import localdocs import modellist import mysettings import network MySettingsTab { onRestoreDefaults: { MySettings.restoreLocalDocsDefaults(); } showRestoreDefaultsButton: true title: qsTr("LocalDocs") contentItem: ColumnLayout { id: root spacing: 30 Label { Layout.bottomMargin: 10 color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeBannerSmall font.bold: true text: qsTr("LocalDocs Settings") } ColumnLayout { spacing: 10 Label { color: theme.styledTextColor font.pixelSize: theme.fontSizeLarge font.bold: true text: qsTr("Indexing") } Rectangle { Layout.fillWidth: true height: 1 color: theme.settingsDivider } } RowLayout { MySettingsLabel { id: extsLabel text: qsTr("Allowed File Extensions") helpText: qsTr("Comma-separated list. LocalDocs will only attempt to process files with these extensions.") } MyTextField { id: extsField text: MySettings.localDocsFileExtensions.join(',') color: theme.textColor font.pixelSize: theme.fontSizeLarge Layout.alignment: Qt.AlignRight Layout.minimumWidth: 200 validator: RegularExpressionValidator { regularExpression: /([^ ,\/"']+,?)*/ } onEditingFinished: { // split and remove empty elements var exts = text.split(',').filter(e => e); // normalize and deduplicate exts = exts.map(e => e.toLowerCase()); exts = Array.from(new Set(exts)); /* Blacklist common unsupported file extensions. We only support plain text and PDFs, and although we * reject binary data, we don't want to waste time trying to index files that we don't support. 
*/ exts = exts.filter(e => ![ /* Microsoft documents */ "rtf", "ppt", "pptx", "xls", "xlsx", /* OpenOffice */ "odt", "ods", "odp", "odg", /* photos */ "jpg", "jpeg", "png", "gif", "bmp", "tif", "tiff", "webp", /* audio */ "mp3", "wma", "m4a", "wav", "flac", /* videos */ "mp4", "mov", "webm", "mkv", "avi", "flv", "wmv", /* executables */ "exe", "com", "dll", "so", "dylib", "msi", /* binary images */ "iso", "img", "dmg", /* archives */ "zip", "jar", "apk", "rar", "7z", "tar", "gz", "xz", "bz2", "tar.gz", "tgz", "tar.xz", "tar.bz2", /* misc */ "bin", ].includes(e)); MySettings.localDocsFileExtensions = exts; extsField.text = exts.join(','); focus = false; } Accessible.role: Accessible.EditableText Accessible.name: extsLabel.text Accessible.description: extsLabel.helpText } } ColumnLayout { spacing: 10 Label { color: theme.grayRed900 font.pixelSize: theme.fontSizeLarge font.bold: true text: qsTr("Embedding") } Rectangle { Layout.fillWidth: true height: 1 color: theme.grayRed500 } } RowLayout { MySettingsLabel { text: qsTr("Use Nomic Embed API") helpText: qsTr("Embed documents using the fast Nomic API instead of a private local model. Requires restart.") } MyCheckBox { id: useNomicAPIBox Component.onCompleted: { useNomicAPIBox.checked = MySettings.localDocsUseRemoteEmbed; } onClicked: { MySettings.localDocsUseRemoteEmbed = useNomicAPIBox.checked && MySettings.localDocsNomicAPIKey !== ""; } } } RowLayout { MySettingsLabel { id: apiKeyLabel text: qsTr("Nomic API Key") helpText: qsTr('API key to use for Nomic Embed. Get one from the Atlas API keys page. Requires restart.') onLinkActivated: function(link) { Qt.openUrlExternally(link) } } MyTextField { id: apiKeyField property bool isValid: validate() onTextChanged: { isValid = validate(); } function validate() { return /^(nk-[a-zA-Z0-9_-]{43})?$/.test(apiKeyField.text); } placeholderText: "nk-" + "X".repeat(43) text: MySettings.localDocsNomicAPIKey color: apiKeyField.isValid ? 
theme.textColor : theme.textErrorColor font.pixelSize: theme.fontSizeLarge Layout.alignment: Qt.AlignRight Layout.minimumWidth: 200 enabled: useNomicAPIBox.checked onEditingFinished: { if (apiKeyField.isValid) { MySettings.localDocsNomicAPIKey = apiKeyField.text; MySettings.localDocsUseRemoteEmbed = useNomicAPIBox.checked && MySettings.localDocsNomicAPIKey !== ""; } focus = false; } Accessible.role: Accessible.EditableText Accessible.name: apiKeyLabel.text Accessible.description: apiKeyLabel.helpText } } RowLayout { MySettingsLabel { id: deviceLabel text: qsTr("Embeddings Device") helpText: qsTr('The compute device used for embeddings. Requires restart.') } MyComboBox { id: deviceBox enabled: !useNomicAPIBox.checked Layout.minimumWidth: 400 Layout.maximumWidth: 400 Layout.fillWidth: false Layout.alignment: Qt.AlignRight model: ListModel { ListElement { text: qsTr("Application default") } Component.onCompleted: { MySettings.embeddingsDeviceList.forEach(d => append({"text": d})); deviceBox.updateModel(); } } Accessible.name: deviceLabel.text Accessible.description: deviceLabel.helpText function updateModel() { var device = MySettings.localDocsEmbedDevice; // This usage of 'Auto' should not be translated deviceBox.currentIndex = device === "Auto" ? 0 : deviceBox.indexOfValue(device); } Component.onCompleted: { deviceBox.updateModel(); } Connections { target: MySettings function onDeviceChanged() { deviceBox.updateModel(); } } onActivated: { // This usage of 'Auto' should not be translated MySettings.localDocsEmbedDevice = deviceBox.currentIndex === 0 ? 
"Auto" : deviceBox.currentText; } } } ColumnLayout { spacing: 10 Label { color: theme.grayRed900 font.pixelSize: theme.fontSizeLarge font.bold: true text: qsTr("Display") } Rectangle { Layout.fillWidth: true height: 1 color: theme.grayRed500 } } RowLayout { MySettingsLabel { id: showReferencesLabel text: qsTr("Show Sources") helpText: qsTr("Display the sources used for each response.") } MyCheckBox { id: showReferencesBox checked: MySettings.localDocsShowReferences onClicked: { MySettings.localDocsShowReferences = !MySettings.localDocsShowReferences } } } ColumnLayout { spacing: 10 Label { color: theme.styledTextColor font.pixelSize: theme.fontSizeLarge font.bold: true text: qsTr("Advanced") } Rectangle { Layout.fillWidth: true height: 1 color: theme.settingsDivider } } MySettingsLabel { id: warningLabel Layout.bottomMargin: 15 Layout.fillWidth: true color: theme.textErrorColor wrapMode: Text.WordWrap text: qsTr("Warning: Advanced usage only.") helpText: qsTr("Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info here.") onLinkActivated: function(link) { Qt.openUrlExternally(link) } } RowLayout { MySettingsLabel { id: chunkLabel Layout.fillWidth: true text: qsTr("Document snippet size (characters)") helpText: qsTr("Number of characters per document snippet. 
Larger numbers increase likelihood of factual responses, but also result in slower generation.") } MyTextField { id: chunkSizeTextField text: MySettings.localDocsChunkSize validator: IntValidator { bottom: 1 } onEditingFinished: { var val = parseInt(text) if (!isNaN(val)) { MySettings.localDocsChunkSize = val focus = false } else { text = MySettings.localDocsChunkSize } } } } RowLayout { Layout.topMargin: 15 MySettingsLabel { id: contextItemsPerPrompt text: qsTr("Max document snippets per prompt") helpText: qsTr("Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation.") } MyTextField { text: MySettings.localDocsRetrievalSize validator: IntValidator { bottom: 1 } onEditingFinished: { var val = parseInt(text) if (!isNaN(val)) { MySettings.localDocsRetrievalSize = val focus = false } else { text = MySettings.localDocsRetrievalSize } } } } Rectangle { Layout.topMargin: 15 Layout.fillWidth: true height: 1 color: theme.settingsDivider } } } ================================================ FILE: gpt4all-chat/qml/LocalDocsView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings import localdocs Rectangle { id: localDocsView Theme { id: theme } color: theme.viewBackground signal chatViewRequested() signal localDocsViewRequested() signal settingsViewRequested(int page) signal addCollectionViewRequested() ColumnLayout { id: mainArea anchors.left: parent.left anchors.right: parent.right anchors.top: parent.top anchors.bottom: parent.bottom anchors.margins: 30 spacing: 50 RowLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignTop visible: LocalDocs.databaseValid && LocalDocs.localDocsModel.count !== 0 
spacing: 50 ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft Layout.minimumWidth: 200 spacing: 5 Text { id: welcome text: qsTr("LocalDocs") font.pixelSize: theme.fontSizeBanner color: theme.titleTextColor } Text { text: qsTr("Chat with your local files") font.pixelSize: theme.fontSizeLarge color: theme.titleInfoTextColor } } Rectangle { Layout.fillWidth: true height: 0 } MyButton { Layout.alignment: Qt.AlignTop | Qt.AlignRight text: qsTr("\uFF0B Add Collection") onClicked: { addCollectionViewRequested() } } } Rectangle { id: warning Layout.fillWidth: true Layout.fillHeight: true visible: !LocalDocs.databaseValid Text { anchors.centerIn: parent text: qsTr("
ERROR: The LocalDocs database cannot be accessed or is not valid.\n\n"
                  + "Note: You will need to restart after trying any of the following suggested fixes.\n"
                  + "\u2022 Make sure that the folder set as Download Path exists on the file system.\n"
                  + "\u2022 Check ownership as well as read and write permissions of the Download Path.\n"
                  + "\u2022 If there is a localdocs_v2.db file, check its ownership and read/write permissions, too.\n\n"
                  + "If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can
                  " + "try backing them up and removing them. You will have to recreate your collections, however.") color: theme.textErrorColor font.pixelSize: theme.fontSizeLarger } } Item { Layout.fillWidth: true Layout.fillHeight: true visible: LocalDocs.databaseValid && LocalDocs.localDocsModel.count === 0 ColumnLayout { id: noInstalledLabel anchors.centerIn: parent spacing: 0 Text { Layout.alignment: Qt.AlignCenter text: qsTr("No Collections Installed") color: theme.mutedLightTextColor font.pixelSize: theme.fontSizeBannerSmall } Text { Layout.topMargin: 15 horizontalAlignment: Qt.AlignHCenter color: theme.mutedLighterTextColor text: qsTr("Install a collection of local documents to get started using this feature") font.pixelSize: theme.fontSizeLarge } } MyButton { anchors.top: noInstalledLabel.bottom anchors.topMargin: 50 anchors.horizontalCenter: noInstalledLabel.horizontalCenter rightPadding: 60 leftPadding: 60 text: qsTr("\uFF0B Add Doc Collection") onClicked: { addCollectionViewRequested() } Accessible.role: Accessible.Button Accessible.name: qsTr("Shows the add model view") } } ScrollView { id: scrollView ScrollBar.vertical.policy: ScrollBar.AsNeeded Layout.fillWidth: true Layout.fillHeight: true clip: true visible: LocalDocs.databaseValid && LocalDocs.localDocsModel.count !== 0 ListView { id: collectionListView model: LocalDocs.localDocsModel boundsBehavior: Flickable.StopAtBounds spacing: 30 delegate: Rectangle { width: collectionListView.width height: childrenRect.height + 60 color: theme.conversationBackground radius: 10 border.width: 1 border.color: theme.controlBorder property bool removing: false ColumnLayout { anchors.top: parent.top anchors.left: parent.left anchors.right: parent.right anchors.margins: 30 spacing: 10 RowLayout { Layout.fillWidth: true Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: collection elide: Text.ElideRight color: theme.titleTextColor font.pixelSize: theme.fontSizeLargest font.bold: true } Item { 
Layout.alignment: Qt.AlignRight Layout.preferredWidth: state.contentWidth + 50 Layout.preferredHeight: state.contentHeight + 10 ProgressBar { id: itemProgressBar anchors.fill: parent value: { if (model.error !== "") return 0 if (model.indexing) return (model.totalBytesToIndex - model.currentBytesToIndex) / model.totalBytesToIndex if (model.currentEmbeddingsToIndex !== 0) return (model.totalEmbeddingsToIndex - model.currentEmbeddingsToIndex) / model.totalEmbeddingsToIndex return 0 } background: Rectangle { implicitHeight: 45 color: { if (model.error !== "") return "transparent" if (model.indexing) return theme.altProgressBackground if (model.currentEmbeddingsToIndex !== 0) return theme.altProgressBackground if (model.forceIndexing) return theme.red200 return theme.lightButtonBackground } radius: 6 } contentItem: Item { implicitHeight: 40 Rectangle { width: itemProgressBar.visualPosition * parent.width height: parent.height radius: 2 color: theme.altProgressForeground } } Accessible.role: Accessible.ProgressBar Accessible.name: qsTr("Indexing progressBar") Accessible.description: qsTr("Shows the progress made in the indexing") ToolTip.text: model.error ToolTip.visible: hovered && model.error !== "" } Label { id: state anchors.centerIn: itemProgressBar horizontalAlignment: Text.AlignHCenter color: { if (model.error !== "") return theme.textErrorColor if (model.indexing) return theme.altProgressText if (model.currentEmbeddingsToIndex !== 0) return theme.altProgressText if (model.forceIndexing) return theme.textErrorColor return theme.lighterButtonForeground } text: { if (model.error !== "") return qsTr("ERROR") // indicates extracting snippets from documents if (model.indexing) return qsTr("INDEXING") // indicates generating the embeddings for any outstanding snippets if (model.currentEmbeddingsToIndex !== 0) return qsTr("EMBEDDING") if (model.forceIndexing) return qsTr("REQUIRES UPDATE") if (model.installed) return qsTr("READY") return qsTr("INSTALLING") } elide: 
Text.ElideRight font.bold: true font.pixelSize: theme.fontSizeSmaller } } } RowLayout { Layout.fillWidth: true Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: folder_path elide: Text.ElideRight color: theme.titleTextColor2 font.pixelSize: theme.fontSizeSmall } Text { Layout.alignment: Qt.AlignRight text: { if (model.error !== "") return model.error if (model.indexing) return qsTr("Indexing in progress") if (model.currentEmbeddingsToIndex !== 0) return qsTr("Embedding in progress") if (model.forceIndexing) return qsTr("This collection requires an update after version change") if (model.installed) return qsTr("Automatically reindexes upon changes to the folder") return qsTr("Installation in progress") } elide: Text.ElideRight color: theme.mutedDarkTextColor font.pixelSize: theme.fontSizeSmall } Text { visible: { return model.indexing || model.currentEmbeddingsToIndex !== 0 } Layout.alignment: Qt.AlignRight text: { var percentComplete = Math.round(itemProgressBar.value * 100); var formattedPercent = percentComplete < 10 ? 
" " + percentComplete : percentComplete.toString(); return formattedPercent + qsTr("%") } elide: Text.ElideRight color: theme.mutedDarkTextColor font.family: "monospace" font.pixelSize: theme.fontSizeSmall } } RowLayout { spacing: 7 Text { text: "%1 – %2".arg(qsTr("%n file(s)", "", model.totalDocs)).arg(qsTr("%n word(s)", "", model.totalWords)) elide: Text.ElideRight color: theme.styledTextColor2 font.pixelSize: theme.fontSizeSmall } Text { text: model.embeddingModel elide: Text.ElideRight color: theme.mutedDarkTextColor font.bold: true font.pixelSize: theme.fontSizeSmall } Text { visible: Qt.formatDateTime(model.lastUpdate) !== "" text: Qt.formatDateTime(model.lastUpdate) elide: Text.ElideRight color: theme.mutedTextColor font.pixelSize: theme.fontSizeSmall } Text { visible: model.currentEmbeddingsToIndex !== 0 text: (model.totalEmbeddingsToIndex - model.currentEmbeddingsToIndex) + " of " + model.totalEmbeddingsToIndex + " embeddings" elide: Text.ElideRight color: theme.mutedTextColor font.pixelSize: theme.fontSizeSmall } } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } RowLayout { id: fileProcessingRow Layout.topMargin: 15 Layout.bottomMargin: 15 visible: model.fileCurrentlyProcessing !== "" && (model.indexing || model.currentEmbeddingsToIndex !== 0) MyBusyIndicator { Layout.alignment: Qt.AlignCenter Layout.preferredWidth: 12 Layout.preferredHeight: 12 running: true size: 12 color: theme.textColor } Text { id: filename Layout.alignment: Qt.AlignCenter text: model.fileCurrentlyProcessing elide: Text.ElideRight color: theme.textColor font.bold: true font.pixelSize: theme.fontSizeLarge } } Rectangle { visible: fileProcessingRow.visible Layout.fillWidth: true height: 1 color: theme.dividerColor } RowLayout { Layout.fillWidth: true spacing: 30 MySettingsButton { text: qsTr("Remove") textColor: theme.red500 onClicked: LocalDocs.removeFolder(collection, folder_path) backgroundColor: "transparent" backgroundColorHovered: 
theme.lighterButtonBackgroundHoveredRed } Item { Layout.fillWidth: true } MySettingsButton { id: rebuildButton visible: !model.forceIndexing && !model.indexing && model.currentEmbeddingsToIndex === 0 text: qsTr("Rebuild") textColor: theme.green500 onClicked: LocalDocs.forceRebuildFolder(folder_path) toolTip: qsTr("Reindex this folder from scratch. This is slow and usually not needed.") backgroundColor: "transparent" backgroundColorHovered: theme.lighterButtonBackgroundHovered } MySettingsButton { id: updateButton visible: model.forceIndexing text: qsTr("Update") textColor: theme.green500 onClicked: LocalDocs.forceIndexing(collection) toolTip: qsTr("Update the collection to the new version. This is a slow operation.") backgroundColor: "transparent" backgroundColorHovered: theme.lighterButtonBackgroundHovered } } } } } } } } ================================================ FILE: gpt4all-chat/qml/ModelSettings.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import modellist import mysettings import chatlistmodel MySettingsTab { onRestoreDefaults: { MySettings.restoreModelDefaults(root.currentModelInfo); } title: qsTr("Model") ConfirmationDialog { id: resetSystemMessageDialog property var index: null property bool resetClears: false dialogTitle: qsTr("%1 system message?").arg(resetClears ? qsTr("Clear") : qsTr("Reset")) description: qsTr("The system message will be %1.").arg(resetClears ? qsTr("removed") : qsTr("reset to the default")) onAccepted: MySettings.resetModelSystemMessage(ModelList.modelInfo(index)) function show(index_, resetClears_) { index = index_; resetClears = resetClears_; open(); } } ConfirmationDialog { id: resetChatTemplateDialog property bool resetClears: false property var index: null dialogTitle: qsTr("%1 chat template?").arg(resetClears ? qsTr("Clear") : qsTr("Reset")) description: qsTr("The chat template will be %1.").arg(resetClears ? 
qsTr("erased") : qsTr("reset to the default")) onAccepted: { MySettings.resetModelChatTemplate(ModelList.modelInfo(index)); templateTextArea.resetText(); } function show(index_, resetClears_) { index = index_; resetClears = resetClears_; open(); } } contentItem: GridLayout { id: root columns: 3 rowSpacing: 10 columnSpacing: 10 enabled: ModelList.selectableModels.count !== 0 property var currentModelName: comboBox.currentText property var currentModelId: comboBox.currentValue property var currentModelInfo: ModelList.modelInfo(root.currentModelId) Label { Layout.row: 1 Layout.column: 0 Layout.bottomMargin: 10 color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeBannerSmall font.bold: true text: qsTr("Model Settings") } RowLayout { Layout.fillWidth: true Layout.maximumWidth: parent.width Layout.row: 2 Layout.column: 0 Layout.columnSpan: 2 spacing: 10 MyComboBox { id: comboBox Layout.fillWidth: true model: ModelList.selectableModels valueRole: "id" textRole: "name" currentIndex: { var i = comboBox.indexOfValue(ChatListModel.currentChat.modelInfo.id); if (i >= 0) return i; return 0; } contentItem: Text { leftPadding: 10 rightPadding: 20 text: comboBox.currentText font: comboBox.font color: theme.textColor verticalAlignment: Text.AlignVCenter elide: Text.ElideRight } delegate: ItemDelegate { width: comboBox.width -20 contentItem: Text { text: name color: theme.textColor font: comboBox.font elide: Text.ElideRight verticalAlignment: Text.AlignVCenter } background: Rectangle { radius: 10 color: highlighted ? 
theme.menuHighlightColor : theme.menuBackgroundColor } highlighted: comboBox.highlightedIndex === index } } MySettingsButton { id: cloneButton text: qsTr("Clone") onClicked: { var id = ModelList.clone(root.currentModelInfo); comboBox.currentIndex = comboBox.indexOfValue(id); } } MySettingsDestructiveButton { id: removeButton enabled: root.currentModelInfo.isClone text: qsTr("Remove") onClicked: { ModelList.removeClone(root.currentModelInfo); comboBox.currentIndex = 0; } } } RowLayout { Layout.row: 3 Layout.column: 0 Layout.topMargin: 15 spacing: 10 MySettingsLabel { text: qsTr("Name") } } MyTextField { id: uniqueNameField text: root.currentModelName font.pixelSize: theme.fontSizeLarge enabled: root.currentModelInfo.isClone || root.currentModelInfo.description === "" Layout.row: 4 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true Connections { target: MySettings function onNameChanged() { uniqueNameField.text = root.currentModelInfo.name; } } Connections { target: root function onCurrentModelInfoChanged() { uniqueNameField.text = root.currentModelInfo.name; } } onTextChanged: { if (text !== "" && ModelList.isUniqueName(text)) { MySettings.setModelName(root.currentModelInfo, text); } } } MySettingsLabel { text: qsTr("Model File") Layout.row: 5 Layout.column: 0 Layout.topMargin: 15 } MyTextField { text: root.currentModelInfo.filename font.pixelSize: theme.fontSizeLarge enabled: false Layout.row: 6 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true } RowLayout { Layout.row: 7 Layout.columnSpan: 2 Layout.topMargin: 15 Layout.fillWidth: true Layout.maximumWidth: parent.width spacing: 10 MySettingsLabel { id: systemMessageLabel text: qsTr("System Message") helpText: qsTr("A message to set the context or guide the behavior of the model. Leave blank for " + "none. 
NOTE: Since GPT4All 3.5, this should not contain control tokens.") onReset: () => resetSystemMessageDialog.show(root.currentModelId, resetClears) function updateResetButton() { const info = root.currentModelInfo; // NOTE: checks if the *override* is set, regardless of whether there is a default canReset = !!info.id && MySettings.isModelSystemMessageSet(info); resetClears = !info.defaultSystemMessage; } Component.onCompleted: updateResetButton() Connections { target: root function onCurrentModelIdChanged() { systemMessageLabel.updateResetButton(); } } Connections { target: MySettings function onSystemMessageChanged(info) { if (info.id === root.currentModelId) systemMessageLabel.updateResetButton(); } } } Label { id: systemMessageLabelHelp visible: systemMessageArea.errState !== "ok" Layout.alignment: Qt.AlignBottom Layout.fillWidth: true Layout.rightMargin: 5 Layout.maximumHeight: systemMessageLabel.height text: qsTr("System message is not plain text.") color: systemMessageArea.errState === "error" ? theme.textErrorColor : theme.textWarningColor font.pixelSize: theme.fontSizeLarger font.bold: true wrapMode: Text.Wrap elide: Text.ElideRight onLinkActivated: function(link) { Qt.openUrlExternally(link) } } } Rectangle { id: systemMessage Layout.row: 8 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true color: "transparent" Layout.minimumHeight: Math.max(100, systemMessageArea.contentHeight + 20) MyTextArea { id: systemMessageArea anchors.fill: parent property bool isBeingReset: false function resetText() { const info = root.currentModelInfo; isBeingReset = true; text = (info.id ? info.systemMessage.value : null) ??
""; isBeingReset = false; } Component.onCompleted: resetText() Connections { target: MySettings function onSystemMessageChanged(info) { if (info.id === root.currentModelId) systemMessageArea.resetText(); } } Connections { target: root function onCurrentModelIdChanged() { systemMessageArea.resetText(); } } // strict validation, because setModelSystemMessage clears isLegacy readonly property var reLegacyCheck: ( /(?:^|\s)(?:### *System\b|S(?:ystem|YSTEM):)|<\|(?:im_(?:start|end)|(?:start|end)_header_id|eot_id|SYSTEM_TOKEN)\|>|<<SYS>>/m ) onTextChanged: { const info = root.currentModelInfo; if (!info.id) { errState = "ok"; } else if (info.systemMessage.isLegacy && (isBeingReset || reLegacyCheck.test(text))) { errState = "error"; } else errState = reLegacyCheck.test(text) ? "warning" : "ok"; if (info.id && errState !== "error" && !isBeingReset) MySettings.setModelSystemMessage(info, text); systemMessageLabel.updateResetButton(); } Accessible.role: Accessible.EditableText Accessible.name: systemMessageLabel.text Accessible.description: systemMessageLabelHelp.text } } RowLayout { Layout.row: 9 Layout.columnSpan: 2 Layout.topMargin: 15 Layout.fillWidth: true Layout.maximumWidth: parent.width spacing: 10 MySettingsLabel { id: chatTemplateLabel text: qsTr("Chat Template") helpText: qsTr("This Jinja template turns the chat into input for the model.") onReset: () => resetChatTemplateDialog.show(root.currentModelId, resetClears) function updateResetButton() { const info = root.currentModelInfo; canReset = !!info.id && ( MySettings.isModelChatTemplateSet(info) || templateTextArea.text !== (info.chatTemplate.value ??
"") ); resetClears = !info.defaultChatTemplate; } Component.onCompleted: updateResetButton() Connections { target: root function onCurrentModelIdChanged() { chatTemplateLabel.updateResetButton(); } } Connections { target: MySettings function onChatTemplateChanged(info) { if (info.id === root.currentModelId) chatTemplateLabel.updateResetButton(); } } } Label { id: chatTemplateLabelHelp visible: templateTextArea.errState !== "ok" Layout.alignment: Qt.AlignBottom Layout.fillWidth: true Layout.rightMargin: 5 Layout.maximumHeight: chatTemplateLabel.height text: templateTextArea.errMsg color: templateTextArea.errState === "error" ? theme.textErrorColor : theme.textWarningColor font.pixelSize: theme.fontSizeLarger font.bold: true wrapMode: Text.Wrap elide: Text.ElideRight onLinkActivated: function(link) { Qt.openUrlExternally(link) } } } Rectangle { id: chatTemplate Layout.row: 10 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true Layout.minimumHeight: Math.max(100, templateTextArea.contentHeight + 20) color: "transparent" clip: true MyTextArea { id: templateTextArea anchors.fill: parent font: fixedFont property bool isBeingReset: false property var errMsg: null function resetText() { const info = root.currentModelInfo; isBeingReset = true; text = (info.id ? info.chatTemplate.value : null) ?? 
""; isBeingReset = false; } Component.onCompleted: resetText() Connections { target: MySettings function onChatTemplateChanged() { templateTextArea.resetText(); } } Connections { target: root function onCurrentModelIdChanged() { templateTextArea.resetText(); } } function legacyCheck() { return /%[12]\b/.test(text) || !/\{%.*%\}.*\{\{.*\}\}.*\{%.*%\}/.test(text.replace(/\n/g, '')) || !/\bcontent\b/.test(text); } onTextChanged: { const info = root.currentModelInfo; let jinjaError; if (!info.id) { errMsg = null; errState = "ok"; } else if (info.chatTemplate.isLegacy && (isBeingReset || legacyCheck())) { errMsg = null; errState = "error"; } else if (text === "" && !info.chatTemplate.isSet) { errMsg = qsTr("No chat template configured."); errState = "error"; } else if (/^\s*$/.test(text)) { errMsg = qsTr("The chat template cannot be blank."); errState = "error"; } else if ((jinjaError = MySettings.checkJinjaTemplateError(text)) !== null) { errMsg = qsTr("Syntax error: %1").arg(jinjaError); errState = "error"; } else if (legacyCheck()) { errMsg = qsTr("Chat template is not in Jinja format."); errState = "warning"; } else { errState = "ok"; } if (info.id && errState !== "error" && !isBeingReset) MySettings.setModelChatTemplate(info, text); chatTemplateLabel.updateResetButton(); } Keys.onPressed: event => { if (event.key === Qt.Key_Tab) { const a = templateTextArea; event.accepted = true; // suppress tab a.insert(a.cursorPosition, '    '); // four spaces } } Accessible.role: Accessible.EditableText Accessible.name: chatTemplateLabel.text Accessible.description: chatTemplateLabelHelp.text } } MySettingsLabel { id: chatNamePromptLabel text: qsTr("Chat Name Prompt") helpText: qsTr("Prompt used to automatically generate chat names.") Layout.row: 11 Layout.column: 0 Layout.topMargin: 15 } Rectangle { id: chatNamePrompt Layout.row: 12 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true Layout.minimumHeight: Math.max(100,
chatNamePromptTextArea.contentHeight + 20) color: "transparent" clip: true MyTextArea { id: chatNamePromptTextArea anchors.fill: parent text: root.currentModelInfo.chatNamePrompt Connections { target: MySettings function onChatNamePromptChanged() { chatNamePromptTextArea.text = root.currentModelInfo.chatNamePrompt; } } Connections { target: root function onCurrentModelInfoChanged() { chatNamePromptTextArea.text = root.currentModelInfo.chatNamePrompt; } } onTextChanged: { MySettings.setModelChatNamePrompt(root.currentModelInfo, text) } Accessible.role: Accessible.EditableText Accessible.name: chatNamePromptLabel.text Accessible.description: chatNamePromptLabel.text } } MySettingsLabel { id: suggestedFollowUpPromptLabel text: qsTr("Suggested FollowUp Prompt") helpText: qsTr("Prompt used to generate suggested follow-up questions.") Layout.row: 13 Layout.column: 0 Layout.topMargin: 15 } Rectangle { id: suggestedFollowUpPrompt Layout.row: 14 Layout.column: 0 Layout.columnSpan: 2 Layout.fillWidth: true Layout.minimumHeight: Math.max(100, suggestedFollowUpPromptTextArea.contentHeight + 20) color: "transparent" clip: true MyTextArea { id: suggestedFollowUpPromptTextArea anchors.fill: parent text: root.currentModelInfo.suggestedFollowUpPrompt Connections { target: MySettings function onSuggestedFollowUpPromptChanged() { suggestedFollowUpPromptTextArea.text = root.currentModelInfo.suggestedFollowUpPrompt; } } Connections { target: root function onCurrentModelInfoChanged() { suggestedFollowUpPromptTextArea.text = root.currentModelInfo.suggestedFollowUpPrompt; } } onTextChanged: { MySettings.setModelSuggestedFollowUpPrompt(root.currentModelInfo, text) } Accessible.role: Accessible.EditableText Accessible.name: suggestedFollowUpPromptLabel.text Accessible.description: suggestedFollowUpPromptLabel.text } } GridLayout { Layout.row: 15 Layout.column: 0 Layout.columnSpan: 2 Layout.topMargin: 15 Layout.fillWidth: true columns: 4 rowSpacing: 30 columnSpacing: 10 MySettingsLabel { id: 
contextLengthLabel visible: !root.currentModelInfo.isOnline text: qsTr("Context Length") helpText: qsTr("Number of input and output tokens the model sees.") Layout.row: 0 Layout.column: 0 Layout.maximumWidth: 300 * theme.fontScale } Item { Layout.row: 0 Layout.column: 1 Layout.fillWidth: true Layout.maximumWidth: 200 Layout.margins: 0 height: contextLengthField.height MyTextField { id: contextLengthField anchors.left: parent.left anchors.verticalCenter: parent.verticalCenter visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.contextLength font.pixelSize: theme.fontSizeLarge color: theme.textColor ToolTip.text: qsTr("Maximum combined prompt/response tokens before information is lost.\nUsing more context than the model was trained on will yield poor results.\nNOTE: Does not take effect until you reload the model.") ToolTip.visible: hovered Connections { target: MySettings function onContextLengthChanged() { contextLengthField.text = root.currentModelInfo.contextLength; } } Connections { target: root function onCurrentModelInfoChanged() { contextLengthField.text = root.currentModelInfo.contextLength; } } onEditingFinished: { var val = parseInt(text) if (isNaN(val)) { text = root.currentModelInfo.contextLength } else { if (val < 8) { val = 8 contextLengthField.text = val } else if (val > root.currentModelInfo.maxContextLength) { val = root.currentModelInfo.maxContextLength contextLengthField.text = val } MySettings.setModelContextLength(root.currentModelInfo, val) focus = false } } Accessible.role: Accessible.EditableText Accessible.name: contextLengthLabel.text Accessible.description: ToolTip.text } } MySettingsLabel { id: tempLabel text: qsTr("Temperature") helpText: qsTr("Randomness of model output. 
Higher -> more variation.") Layout.row: 1 Layout.column: 2 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: temperatureField text: root.currentModelInfo.temperature font.pixelSize: theme.fontSizeLarge color: theme.textColor ToolTip.text: qsTr("Temperature increases the chances of choosing less likely tokens.\nNOTE: Higher temperature gives more creative but less predictable outputs.") ToolTip.visible: hovered Layout.row: 1 Layout.column: 3 validator: DoubleValidator { locale: "C" } Connections { target: MySettings function onTemperatureChanged() { temperatureField.text = root.currentModelInfo.temperature; } } Connections { target: root function onCurrentModelInfoChanged() { temperatureField.text = root.currentModelInfo.temperature; } } onEditingFinished: { var val = parseFloat(text) if (!isNaN(val)) { MySettings.setModelTemperature(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.temperature } } Accessible.role: Accessible.EditableText Accessible.name: tempLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: topPLabel text: qsTr("Top-P") helpText: qsTr("Nucleus Sampling factor. 
Lower -> more predictable.") Layout.row: 2 Layout.column: 0 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: topPField text: root.currentModelInfo.topP color: theme.textColor font.pixelSize: theme.fontSizeLarge ToolTip.text: qsTr("Only the most likely tokens up to a total probability of top_p can be chosen.\nNOTE: Prevents choosing highly unlikely tokens.") ToolTip.visible: hovered Layout.row: 2 Layout.column: 1 validator: DoubleValidator { locale: "C" } Connections { target: MySettings function onTopPChanged() { topPField.text = root.currentModelInfo.topP; } } Connections { target: root function onCurrentModelInfoChanged() { topPField.text = root.currentModelInfo.topP; } } onEditingFinished: { var val = parseFloat(text) if (!isNaN(val)) { MySettings.setModelTopP(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.topP } } Accessible.role: Accessible.EditableText Accessible.name: topPLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: minPLabel text: qsTr("Min-P") helpText: qsTr("Minimum token probability. 
Higher -> more predictable.") Layout.row: 3 Layout.column: 0 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: minPField text: root.currentModelInfo.minP color: theme.textColor font.pixelSize: theme.fontSizeLarge ToolTip.text: qsTr("Sets the minimum relative probability for a token to be considered.") ToolTip.visible: hovered Layout.row: 3 Layout.column: 1 validator: DoubleValidator { locale: "C" } Connections { target: MySettings function onMinPChanged() { minPField.text = root.currentModelInfo.minP; } } Connections { target: root function onCurrentModelInfoChanged() { minPField.text = root.currentModelInfo.minP; } } onEditingFinished: { var val = parseFloat(text) if (!isNaN(val)) { MySettings.setModelMinP(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.minP } } Accessible.role: Accessible.EditableText Accessible.name: minPLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: topKLabel visible: !root.currentModelInfo.isOnline text: qsTr("Top-K") helpText: qsTr("Size of selection pool for tokens.") Layout.row: 2 Layout.column: 2 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: topKField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.topK color: theme.textColor font.pixelSize: theme.fontSizeLarge ToolTip.text: qsTr("Only the top K most likely tokens will be chosen from.") ToolTip.visible: hovered Layout.row: 2 Layout.column: 3 validator: IntValidator { bottom: 1 } Connections { target: MySettings function onTopKChanged() { topKField.text = root.currentModelInfo.topK; } } Connections { target: root function onCurrentModelInfoChanged() { topKField.text = root.currentModelInfo.topK; } } onEditingFinished: { var val = parseInt(text) if (!isNaN(val)) { MySettings.setModelTopK(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.topK } } Accessible.role: Accessible.EditableText Accessible.name: topKLabel.text Accessible.description: ToolTip.text } 
MySettingsLabel { id: maxLengthLabel visible: !root.currentModelInfo.isOnline text: qsTr("Max Length") helpText: qsTr("Maximum response length, in tokens.") Layout.row: 0 Layout.column: 2 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: maxLengthField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.maxLength color: theme.textColor font.pixelSize: theme.fontSizeLarge Layout.row: 0 Layout.column: 3 validator: IntValidator { bottom: 1 } Connections { target: MySettings function onMaxLengthChanged() { maxLengthField.text = root.currentModelInfo.maxLength; } } Connections { target: root function onCurrentModelInfoChanged() { maxLengthField.text = root.currentModelInfo.maxLength; } } onEditingFinished: { var val = parseInt(text) if (!isNaN(val)) { MySettings.setModelMaxLength(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.maxLength } } Accessible.role: Accessible.EditableText Accessible.name: maxLengthLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: batchSizeLabel visible: !root.currentModelInfo.isOnline text: qsTr("Prompt Batch Size") helpText: qsTr("The batch size used for prompt processing.") Layout.row: 1 Layout.column: 0 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: batchSizeField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.promptBatchSize color: theme.textColor font.pixelSize: theme.fontSizeLarge ToolTip.text: qsTr("Amount of prompt tokens to process at once.\nNOTE: Higher values can speed up reading prompts but will use more RAM.") ToolTip.visible: hovered Layout.row: 1 Layout.column: 1 validator: IntValidator { bottom: 1 } Connections { target: MySettings function onPromptBatchSizeChanged() { batchSizeField.text = root.currentModelInfo.promptBatchSize; } } Connections { target: root function onCurrentModelInfoChanged() { batchSizeField.text = root.currentModelInfo.promptBatchSize; } } onEditingFinished: { var val = parseInt(text) if 
(!isNaN(val)) { MySettings.setModelPromptBatchSize(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.promptBatchSize } } Accessible.role: Accessible.EditableText Accessible.name: batchSizeLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: repeatPenaltyLabel visible: !root.currentModelInfo.isOnline text: qsTr("Repeat Penalty") helpText: qsTr("Repetition penalty factor. Set to 1 to disable.") Layout.row: 4 Layout.column: 2 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: repeatPenaltyField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.repeatPenalty color: theme.textColor font.pixelSize: theme.fontSizeLarge Layout.row: 4 Layout.column: 3 validator: DoubleValidator { locale: "C" } Connections { target: MySettings function onRepeatPenaltyChanged() { repeatPenaltyField.text = root.currentModelInfo.repeatPenalty; } } Connections { target: root function onCurrentModelInfoChanged() { repeatPenaltyField.text = root.currentModelInfo.repeatPenalty; } } onEditingFinished: { var val = parseFloat(text) if (!isNaN(val)) { MySettings.setModelRepeatPenalty(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.repeatPenalty } } Accessible.role: Accessible.EditableText Accessible.name: repeatPenaltyLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: repeatPenaltyTokensLabel visible: !root.currentModelInfo.isOnline text: qsTr("Repeat Penalty Tokens") helpText: qsTr("Number of previous tokens used for penalty.") Layout.row: 3 Layout.column: 2 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: repeatPenaltyTokenField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.repeatPenaltyTokens color: theme.textColor font.pixelSize: theme.fontSizeLarge Layout.row: 3 Layout.column: 3 validator: IntValidator { bottom: 1 } Connections { target: MySettings function onRepeatPenaltyTokensChanged() { repeatPenaltyTokenField.text = 
root.currentModelInfo.repeatPenaltyTokens; } } Connections { target: root function onCurrentModelInfoChanged() { repeatPenaltyTokenField.text = root.currentModelInfo.repeatPenaltyTokens; } } onEditingFinished: { var val = parseInt(text) if (!isNaN(val)) { MySettings.setModelRepeatPenaltyTokens(root.currentModelInfo, val) focus = false } else { text = root.currentModelInfo.repeatPenaltyTokens } } Accessible.role: Accessible.EditableText Accessible.name: repeatPenaltyTokensLabel.text Accessible.description: ToolTip.text } MySettingsLabel { id: gpuLayersLabel visible: !root.currentModelInfo.isOnline text: qsTr("GPU Layers") helpText: qsTr("Number of model layers to load into VRAM.") Layout.row: 4 Layout.column: 0 Layout.maximumWidth: 300 * theme.fontScale } MyTextField { id: gpuLayersField visible: !root.currentModelInfo.isOnline text: root.currentModelInfo.gpuLayers font.pixelSize: theme.fontSizeLarge color: theme.textColor ToolTip.text: qsTr("How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model.\nLower values increase CPU load and RAM usage, and make inference slower.\nNOTE: Does not take effect until you reload the model.") ToolTip.visible: hovered Layout.row: 4 Layout.column: 1 Connections { target: MySettings function onGpuLayersChanged() { gpuLayersField.text = root.currentModelInfo.gpuLayers } } Connections { target: root function onCurrentModelInfoChanged() { if (root.currentModelInfo.gpuLayers === 100) { gpuLayersField.text = root.currentModelInfo.maxGpuLayers } else { gpuLayersField.text = root.currentModelInfo.gpuLayers } } } onEditingFinished: { var val = parseInt(text) if (isNaN(val)) { gpuLayersField.text = root.currentModelInfo.gpuLayers } else { if (val < 1) { val = 1 gpuLayersField.text = val } else if (val > root.currentModelInfo.maxGpuLayers) { val = root.currentModelInfo.maxGpuLayers gpuLayersField.text = val } MySettings.setModelGpuLayers(root.currentModelInfo, val) focus = false } } 
Accessible.role: Accessible.EditableText Accessible.name: gpuLayersLabel.text Accessible.description: ToolTip.text } } Rectangle { Layout.row: 16 Layout.column: 0 Layout.columnSpan: 2 Layout.topMargin: 15 Layout.fillWidth: true height: 1 color: theme.settingsDivider } } } ================================================ FILE: gpt4all-chat/qml/ModelsView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Dialogs import QtQuick.Layouts import chatlistmodel import download import llm import modellist import network import mysettings Rectangle { id: modelsView color: theme.viewBackground signal addModelViewRequested() ToastManager { id: messageToast } ColumnLayout { anchors.fill: parent anchors.margins: 20 spacing: 30 Item { Layout.fillWidth: true Layout.fillHeight: true visible: ModelList.installedModels.count === 0 ColumnLayout { id: noInstalledLabel anchors.centerIn: parent spacing: 0 Text { Layout.alignment: Qt.AlignCenter text: qsTr("No Models Installed") color: theme.mutedLightTextColor font.pixelSize: theme.fontSizeBannerSmall } Text { Layout.topMargin: 15 horizontalAlignment: Qt.AlignHCenter color: theme.mutedLighterTextColor text: qsTr("Install a model to get started using GPT4All") font.pixelSize: theme.fontSizeLarge } } MyButton { anchors.top: noInstalledLabel.bottom anchors.topMargin: 50 anchors.horizontalCenter: noInstalledLabel.horizontalCenter rightPadding: 60 leftPadding: 60 text: qsTr("\uFF0B Add Model") onClicked: { addModelViewRequested() } Accessible.role: Accessible.Button Accessible.name: qsTr("Shows the add model view") } } RowLayout { visible: ModelList.installedModels.count !== 0 Layout.fillWidth: true Layout.alignment: Qt.AlignTop spacing: 50 ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft Layout.minimumWidth: 200 spacing: 5 Text { id: welcome text: qsTr("Installed Models") font.pixelSize: theme.fontSizeBanner color: 
theme.titleTextColor } Text { text: qsTr("Locally installed chat models") font.pixelSize: theme.fontSizeLarge color: theme.titleInfoTextColor } } Rectangle { Layout.fillWidth: true height: 0 } MyButton { Layout.alignment: Qt.AlignTop | Qt.AlignRight text: qsTr("\uFF0B Add Model") onClicked: { addModelViewRequested() } } } ScrollView { id: scrollView visible: ModelList.installedModels.count !== 0 ScrollBar.vertical.policy: ScrollBar.AsNeeded Layout.fillWidth: true Layout.fillHeight: true clip: true ListView { id: modelListView model: ModelList.installedModels boundsBehavior: Flickable.StopAtBounds spacing: 30 delegate: Rectangle { id: delegateItem width: modelListView.width height: childrenRect.height + 60 color: theme.conversationBackground radius: 10 border.width: 1 border.color: theme.controlBorder ColumnLayout { anchors.top: parent.top anchors.left: parent.left anchors.right: parent.right anchors.margins: 30 Text { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft text: name elide: Text.ElideRight color: theme.titleTextColor font.pixelSize: theme.fontSizeLargest font.bold: true Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Model name") Accessible.description: qsTr("Name of the installed model") } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } RowLayout { Layout.topMargin: 10 Layout.fillWidth: true Text { id: descriptionText text: description font.pixelSize: theme.fontSizeLarge Layout.fillWidth: true wrapMode: Text.WordWrap textFormat: Text.StyledText color: theme.textColor linkColor: theme.textColor Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Description") Accessible.description: qsTr("Description of the model") onLinkActivated: function(link) { Qt.openUrlExternally(link); } MouseArea { anchors.fill: parent acceptedButtons: Qt.NoButton // pass clicks to parent cursorShape: parent.hoveredLink ?
Qt.PointingHandCursor : Qt.ArrowCursor } } Rectangle { id: actionBox width: childrenRect.width + 20 color: "transparent" border.width: 1 border.color: theme.dividerColor radius: 10 Layout.rightMargin: 20 Layout.bottomMargin: 20 Layout.minimumHeight: childrenRect.height + 20 Layout.alignment: Qt.AlignRight | Qt.AlignTop ColumnLayout { spacing: 0 MySettingsButton { id: downloadButton text: isDownloading ? qsTr("Cancel") : qsTr("Resume") font.pixelSize: theme.fontSizeLarge Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter visible: (isDownloading || isIncomplete) && downloadError === "" && !isOnline && !calcHash Accessible.description: qsTr("Stop/restart/start the download") onClicked: { if (!isDownloading) { Download.downloadModel(filename); } else { Download.cancelDownload(filename); } } } MySettingsDestructiveButton { id: removeButton text: qsTr("Remove") Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter visible: !isDownloading && (installed || isIncomplete) Accessible.description: qsTr("Remove model from filesystem") onClicked: { Download.removeModel(filename); } } MySettingsButton { id: installButton visible: !installed && isOnline Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter text: qsTr("Install") font.pixelSize: theme.fontSizeLarge onClicked: { var apiKeyText = apiKey.text.trim(), baseUrlText = baseUrl.text.trim(), modelNameText = modelName.text.trim(); var apiKeyOk = apiKeyText !== "", baseUrlOk = !isCompatibleApi || baseUrlText !== "", modelNameOk = !isCompatibleApi || modelNameText !== ""; if (!apiKeyOk) apiKey.showError(); if (!baseUrlOk) baseUrl.showError(); if (!modelNameOk) modelName.showError(); if (!apiKeyOk || !baseUrlOk || !modelNameOk) return; if (!isCompatibleApi) 
Download.installModel( filename, apiKeyText, ); else Download.installCompatibleModel( modelNameText, apiKeyText, baseUrlText, ); } Accessible.role: Accessible.Button Accessible.name: qsTr("Install") Accessible.description: qsTr("Install online model") } ColumnLayout { spacing: 0 Label { Layout.topMargin: 20 Layout.leftMargin: 20 visible: downloadError !== "" textFormat: Text.StyledText text: qsTr("<a href=\"#error\">Error</a>") color: theme.textColor font.pixelSize: theme.fontSizeLarge linkColor: theme.textErrorColor Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Describes an error that occurred when downloading") onLinkActivated: { downloadingErrorPopup.text = downloadError; downloadingErrorPopup.open(); } } Label { visible: LLM.systemTotalRAMInGB() < ramrequired Layout.topMargin: 20 Layout.leftMargin: 20 Layout.maximumWidth: 300 textFormat: Text.StyledText text: qsTr("WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).").arg(ramrequired).arg(LLM.systemTotalRAMInGBString()) color: theme.textErrorColor font.pixelSize: theme.fontSizeLarge wrapMode: Text.WordWrap Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Error for incompatible hardware") } } ColumnLayout { visible: isDownloading && !calcHash Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter spacing: 20 ProgressBar { id: itemProgressBar Layout.fillWidth: true width: 200 value: bytesReceived / bytesTotal background: Rectangle { implicitHeight: 45 color: theme.progressBackground radius: 3 } contentItem: Item { implicitHeight: 40 Rectangle { width: itemProgressBar.visualPosition * parent.width height: parent.height radius: 2 color: theme.progressForeground } } Accessible.role: Accessible.ProgressBar
Accessible.name: qsTr("Download progress bar") Accessible.description: qsTr("Shows the progress made in the download") } Label { id: speedLabel color: theme.textColor Layout.alignment: Qt.AlignRight text: speed font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Download speed") Accessible.description: qsTr("Download speed in bytes/kilobytes/megabytes per second") } } RowLayout { visible: calcHash Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.maximumWidth: 200 Layout.fillWidth: true Layout.alignment: Qt.AlignTop | Qt.AlignHCenter clip: true Label { id: calcHashLabel color: theme.textColor text: qsTr("Calculating...") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: text Accessible.description: qsTr("Whether the file hash is being calculated") } MyBusyIndicator { id: busyCalcHash running: calcHash Accessible.role: Accessible.Animation Accessible.name: qsTr("Busy indicator") Accessible.description: qsTr("Displayed when the file hash is being calculated") } } MyTextField { id: apiKey visible: !installed && isOnline Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $API_KEY is empty.")); apiKey.placeholderTextColor = theme.textErrorColor; } onTextChanged: { apiKey.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $API_KEY") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Field for entering the API key") } MyTextField { id: baseUrl visible: !installed && isOnline && isCompatibleApi Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $BASE_URL is empty."));
baseUrl.placeholderTextColor = theme.textErrorColor; } onTextChanged: { baseUrl.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $BASE_URL") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Field for entering the base URL") } MyTextField { id: modelName visible: !installed && isOnline && isCompatibleApi Layout.topMargin: 20 Layout.leftMargin: 20 Layout.minimumWidth: 200 Layout.alignment: Qt.AlignTop | Qt.AlignHCenter wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $MODEL_NAME is empty.")); modelName.placeholderTextColor = theme.textErrorColor; } onTextChanged: { modelName.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $MODEL_NAME") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Field for entering the model name") } } } } Item { Layout.minimumWidth: childrenRect.width Layout.minimumHeight: childrenRect.height Layout.bottomMargin: 10 RowLayout { id: paramRow anchors.centerIn: parent ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("File size") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: filesize color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("RAM required") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: ramrequired >= 0 ?
qsTr("%1 GB").arg(ramrequired) : qsTr("?") color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Parameters") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: parameters !== "" ? parameters : "?" color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Quant") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: quant color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } Rectangle { width: 1 Layout.fillHeight: true color: theme.dividerColor } ColumnLayout { Layout.topMargin: 10 Layout.bottomMargin: 10 Layout.leftMargin: 20 Layout.rightMargin: 20 Text { text: qsTr("Type") font.pixelSize: theme.fontSizeSmall color: theme.mutedDarkTextColor } Text { text: type color: theme.textColor font.pixelSize: theme.fontSizeSmall font.bold: true } } } Rectangle { color: "transparent" anchors.fill: paramRow border.color: theme.dividerColor border.width: 1 radius: 10 } } Rectangle { Layout.fillWidth: true height: 1 color: theme.dividerColor } } } } } } Connections { target: Download function onToastMessage(message) { messageToast.show(message); } } } ================================================ FILE: gpt4all-chat/qml/MyBusyIndicator.qml ================================================ import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic BusyIndicator { id: control property real size: 48 property color color: theme.accentColor contentItem: Item { implicitWidth: control.size implicitHeight: control.size Item { id: item x: parent.width / 2 - 
width / 2 y: parent.height / 2 - height / 2 width: control.size height: control.size opacity: control.running ? 1 : 0 Behavior on opacity { OpacityAnimator { duration: 250 } } RotationAnimator { target: item running: control.visible && control.running from: 0 to: 360 loops: Animation.Infinite duration: 1750 } Repeater { id: repeater model: 6 Rectangle { id: delegate x: item.width / 2 - width / 2 y: item.height / 2 - height / 2 implicitWidth: control.size * .2 implicitHeight: control.size * .2 radius: control.size * .1 color: control.color required property int index transform: [ Translate { y: -Math.min(item.width, item.height) * 0.5 + delegate.radius }, Rotation { angle: delegate.index / repeater.count * 360 origin.x: delegate.radius origin.y: delegate.radius } ] } } } } } ================================================ FILE: gpt4all-chat/qml/MyButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import mysettings import mysettingsenums Button { id: myButton padding: 10 rightPadding: 18 leftPadding: 18 property color textColor: theme.oppositeTextColor property color mutedTextColor: theme.oppositeMutedTextColor property color backgroundColor: theme.buttonBackground property color backgroundColorHovered: theme.buttonBackgroundHovered property real backgroundRadius: 10 property real borderWidth: MySettings.chatTheme === MySettingsEnums.ChatTheme.LegacyDark ? 1 : 0 property color borderColor: theme.buttonBorder property real fontPixelSize: theme.fontSizeLarge property bool fontPixelBold: false property alias textAlignment: textContent.horizontalAlignment contentItem: Text { id: textContent text: myButton.text horizontalAlignment: myButton.textAlignment color: myButton.enabled ? 
textColor : mutedTextColor font.pixelSize: fontPixelSize font.bold: fontPixelBold Accessible.role: Accessible.Button Accessible.name: text } background: Rectangle { radius: myButton.backgroundRadius border.width: myButton.borderWidth border.color: myButton.borderColor color: !myButton.enabled ? theme.mutedTextColor : myButton.hovered ? backgroundColorHovered : backgroundColor } Accessible.role: Accessible.Button Accessible.name: text ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyCheckBox.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic CheckBox { id: myCheckBox background: Rectangle { color: "transparent" } indicator: Rectangle { implicitWidth: 26 implicitHeight: 26 x: myCheckBox.leftPadding y: parent.height / 2 - height / 2 border.color: theme.checkboxBorder color: "transparent" radius: 3 Rectangle { width: 14 height: 14 x: 6 y: 6 radius: 2 color: theme.checkboxForeground visible: myCheckBox.checked } } contentItem: Text { text: myCheckBox.text font: myCheckBox.font opacity: enabled ? 
1.0 : 0.3 color: theme.textColor verticalAlignment: Text.AlignVCenter leftPadding: myCheckBox.indicator.width + myCheckBox.spacing } ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyComboBox.qml ================================================ import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import Qt5Compat.GraphicalEffects ComboBox { id: comboBox font.pixelSize: theme.fontSizeLarge spacing: 0 padding: 10 Accessible.role: Accessible.ComboBox contentItem: RowLayout { id: contentRow spacing: 0 Text { id: text Layout.fillWidth: true leftPadding: 10 rightPadding: 20 text: comboBox.displayText font: comboBox.font color: theme.textColor verticalAlignment: Text.AlignLeft elide: Text.ElideRight } Item { Layout.preferredWidth: updown.width Layout.preferredHeight: updown.height Image { id: updown anchors.verticalCenter: parent.verticalCenter sourceSize.width: comboBox.font.pixelSize sourceSize.height: comboBox.font.pixelSize mipmap: true visible: false source: "qrc:/gpt4all/icons/up_down.svg" } ColorOverlay { anchors.fill: updown source: updown color: theme.textColor } } } delegate: ItemDelegate { width: comboBox.width -20 contentItem: Text { text: modelData color: theme.textColor font: comboBox.font elide: Text.ElideRight verticalAlignment: Text.AlignVCenter } background: Rectangle { radius: 10 color: highlighted ? 
theme.menuHighlightColor : theme.menuBackgroundColor } highlighted: comboBox.highlightedIndex === index } popup: Popup { y: comboBox.height - 1 width: comboBox.width implicitHeight: Math.min(window.height - y, contentItem.implicitHeight + 20) padding: 0 contentItem: Rectangle { implicitWidth: comboBox.width implicitHeight: myListView.contentHeight color: "transparent" radius: 10 ScrollView { anchors.fill: parent anchors.margins: 10 clip: true ScrollBar.vertical.policy: ScrollBar.AsNeeded ScrollBar.horizontal.policy: ScrollBar.AlwaysOff ListView { id: myListView implicitHeight: contentHeight model: comboBox.popup.visible ? comboBox.delegateModel : null currentIndex: comboBox.highlightedIndex ScrollIndicator.vertical: ScrollIndicator { } } } } background: Rectangle { color: theme.menuBackgroundColor border.color: theme.menuBorderColor border.width: 1 radius: 10 } } indicator: Item { } background: Rectangle { color: theme.controlBackground border.width: 1 border.color: theme.controlBorder radius: 10 } ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Dialogs import QtQuick.Layouts Dialog { id: myDialog parent: Overlay.overlay property alias closeButtonVisible: myCloseButton.visible background: Rectangle { width: parent.width height: parent.height color: theme.containerBackground border.width: 1 border.color: theme.dialogBorder radius: 10 } Rectangle { id: closeBackground visible: myCloseButton.visible z: 299 anchors.centerIn: myCloseButton width: myCloseButton.width + 10 height: myCloseButton.height + 10 color: theme.containerBackground } MyToolButton { id: myCloseButton x: myDialog.width - myDialog.padding - width - 15 y: -myDialog.padding + 15 z: 300 visible: myDialog.closePolicy !=
Popup.NoAutoClose width: 24 height: 24 imageWidth: 24 imageHeight: 24 padding: 0 source: "qrc:/gpt4all/icons/close.svg" fillMode: Image.PreserveAspectFit onClicked: { myDialog.close(); } } } ================================================ FILE: gpt4all-chat/qml/MyDirectoryField.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import llm TextField { id: myDirectoryField padding: 10 property bool isValid: LLM.directoryExists(text) color: text === "" || isValid ? theme.textColor : theme.textErrorColor background: Rectangle { implicitWidth: 150 color: theme.controlBackground border.width: 1 border.color: theme.controlBorder radius: 10 } ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyFancyLink.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import Qt5Compat.GraphicalEffects import mysettings MyButton { id: fancyLink property alias imageSource: myimage.source Image { id: myimage anchors.verticalCenter: parent.verticalCenter anchors.left: parent.left anchors.leftMargin: 12 sourceSize: Qt.size(15, 15) mipmap: true visible: false } ColorOverlay { anchors.fill: myimage source: myimage color: fancyLink.hovered ? theme.fancyLinkTextHovered : theme.fancyLinkText } borderWidth: 0 backgroundColor: "transparent" backgroundColorHovered: "transparent" fontPixelBold: true leftPadding: 35 rightPadding: 8 topPadding: 1 bottomPadding: 1 textColor: fancyLink.hovered ? 
theme.fancyLinkTextHovered : theme.fancyLinkText fontPixelSize: theme.fontSizeSmall background: Rectangle { color: "transparent" } Accessible.name: qsTr("Fancy link") Accessible.description: qsTr("A stylized link") } ================================================ FILE: gpt4all-chat/qml/MyFileDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Dialogs FileDialog { id: fileDialog title: qsTr("Please choose a file") property var acceptedConnection: null function openFileDialog(currentFolder, onAccepted) { fileDialog.currentFolder = currentFolder; if (acceptedConnection !== null) { fileDialog.accepted.disconnect(acceptedConnection); } acceptedConnection = function() { onAccepted(fileDialog.selectedFile); }; fileDialog.accepted.connect(acceptedConnection); fileDialog.open(); } } ================================================ FILE: gpt4all-chat/qml/MyFileIcon.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import Qt5Compat.GraphicalEffects Item { id: fileIcon property real iconSize: 24 property string fileName: "" implicitWidth: iconSize implicitHeight: iconSize Image { id: fileImage anchors.fill: parent visible: false sourceSize.width: iconSize sourceSize.height: iconSize mipmap: true source: { if (fileIcon.fileName.toLowerCase().endsWith(".txt")) return "qrc:/gpt4all/icons/file-txt.svg" else if (fileIcon.fileName.toLowerCase().endsWith(".pdf")) return "qrc:/gpt4all/icons/file-pdf.svg" else if (fileIcon.fileName.toLowerCase().endsWith(".md")) return "qrc:/gpt4all/icons/file-md.svg" else if (fileIcon.fileName.toLowerCase().endsWith(".xlsx")) return "qrc:/gpt4all/icons/file-xls.svg" else if (fileIcon.fileName.toLowerCase().endsWith(".docx")) return "qrc:/gpt4all/icons/file-docx.svg" else return "qrc:/gpt4all/icons/file.svg" } } ColorOverlay { anchors.fill: fileImage source: fileImage color: theme.textColor } } 
================================================ FILE: gpt4all-chat/qml/MyFolderDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Dialogs FolderDialog { id: folderDialog title: qsTr("Please choose a directory") property var acceptedConnection: null function openFolderDialog(currentFolder, onAccepted) { folderDialog.currentFolder = currentFolder; // Disconnect the previous handler so repeated calls don't stack callbacks. if (acceptedConnection !== null) { folderDialog.accepted.disconnect(acceptedConnection); } acceptedConnection = function() { onAccepted(folderDialog.selectedFolder); }; folderDialog.accepted.connect(acceptedConnection); folderDialog.open(); } } ================================================ FILE: gpt4all-chat/qml/MyMenu.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic Menu { id: menu implicitWidth: Math.max(implicitBackgroundWidth + leftInset + rightInset, contentWidth + leftPadding + rightPadding + 20) implicitHeight: Math.max(implicitBackgroundHeight + topInset + bottomInset, contentHeight + topPadding + bottomPadding + 20) background: Rectangle { implicitWidth: 220 implicitHeight: 40 color: theme.menuBackgroundColor border.color: theme.menuBorderColor border.width: 1 radius: 10 } contentItem: Rectangle { implicitWidth: myListView.contentWidth implicitHeight: (myTitle.visible ? myTitle.contentHeight + 10 : 0) + myListView.contentHeight color: "transparent" Text { id: myTitle visible: menu.title !== "" text: menu.title anchors.margins: 10 anchors.top: parent.top anchors.right: parent.right anchors.left: parent.left leftPadding: 15 rightPadding: 10 padding: 5 color: theme.styledTextColor font.pixelSize: theme.fontSizeSmall } ListView { id: myListView anchors.margins: 10 anchors.top: myTitle.bottom anchors.bottom: parent.bottom anchors.right: parent.right anchors.left: parent.left implicitHeight: contentHeight model: menu.contentModel interactive: Window.window ?
contentHeight + menu.topPadding + menu.bottomPadding > menu.height : false clip: true currentIndex: menu.currentIndex ScrollIndicator.vertical: ScrollIndicator {} } } enter: Transition { NumberAnimation { property: "opacity" from: 0 to: 1 easing.type: Easing.InOutQuad duration: 100 } } exit: Transition { NumberAnimation { property: "opacity" from: 1 to: 0 easing.type: Easing.InOutQuad duration: 100 } } } ================================================ FILE: gpt4all-chat/qml/MyMenuItem.qml ================================================ import Qt5Compat.GraphicalEffects import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts MenuItem { id: item background: Rectangle { radius: 10 width: parent.width -20 color: item.highlighted ? theme.menuHighlightColor : theme.menuBackgroundColor } contentItem: RowLayout { spacing: 0 Item { visible: item.icon.source.toString() !== "" Layout.leftMargin: 6 Layout.preferredWidth: item.icon.width Layout.preferredHeight: item.icon.height Image { id: image anchors.centerIn: parent visible: false fillMode: Image.PreserveAspectFit mipmap: true sourceSize.width: item.icon.width sourceSize.height: item.icon.height source: item.icon.source } ColorOverlay { anchors.fill: image source: image color: theme.textColor } } Text { Layout.alignment: Qt.AlignLeft padding: 5 text: item.text color: theme.textColor font.pixelSize: theme.fontSizeLarge } Rectangle { color: "transparent" Layout.fillWidth: true height: 1 } } } ================================================ FILE: gpt4all-chat/qml/MyMiniButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import Qt5Compat.GraphicalEffects Button { id: myButton padding: 0 property color backgroundColor: theme.iconBackgroundDark property color backgroundColorHovered: theme.iconBackgroundHovered property alias source: image.source property alias fillMode: image.fillMode 
implicitWidth: 30 implicitHeight: 30 contentItem: Text { text: myButton.text horizontalAlignment: Text.AlignHCenter color: myButton.enabled ? theme.textColor : theme.mutedTextColor font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Button Accessible.name: text } background: Item { anchors.fill: parent Rectangle { anchors.fill: parent color: "transparent" } Image { id: image anchors.centerIn: parent visible: false mipmap: true sourceSize.width: 16 sourceSize.height: 16 } ColorOverlay { anchors.fill: image source: image color: myButton.hovered ? backgroundColorHovered : backgroundColor } } Accessible.role: Accessible.Button Accessible.name: text ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MySettingsButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import mysettings Button { id: myButton padding: 10 rightPadding: 18 leftPadding: 18 property color textColor: theme.lightButtonText property color mutedTextColor: theme.lightButtonMutedText property color backgroundColor: theme.lightButtonBackground property color backgroundColorHovered: enabled ? theme.lightButtonBackgroundHovered : backgroundColor property real borderWidth: 0 property color borderColor: "transparent" property real fontPixelSize: theme.fontSizeLarge property string toolTip property alias backgroundRadius: background.radius contentItem: Text { text: myButton.text horizontalAlignment: Text.AlignHCenter color: myButton.enabled ? textColor : mutedTextColor font.pixelSize: fontPixelSize font.bold: true Accessible.role: Accessible.Button Accessible.name: text } background: Rectangle { id: background radius: 10 border.width: borderWidth border.color: borderColor color: myButton.hovered ? 
backgroundColorHovered : backgroundColor } Accessible.role: Accessible.Button Accessible.name: text ToolTip.text: toolTip ToolTip.visible: toolTip !== "" && myButton.hovered ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MySettingsDestructiveButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import mysettings Button { id: myButton padding: 10 rightPadding: 18 leftPadding: 18 font.pixelSize: theme.fontSizeLarge property color textColor: theme.darkButtonText property color mutedTextColor: theme.darkButtonMutedText property color backgroundColor: theme.darkButtonBackground property color backgroundColorHovered: enabled ? theme.darkButtonBackgroundHovered : backgroundColor property real borderWidth: 0 property color borderColor: "transparent" contentItem: Text { text: myButton.text horizontalAlignment: Text.AlignHCenter color: myButton.enabled ? textColor : mutedTextColor font.pixelSize: theme.fontSizeLarge font.bold: true Accessible.role: Accessible.Button Accessible.name: text } background: Rectangle { radius: 10 border.width: borderWidth border.color: borderColor color: myButton.hovered ? 
backgroundColorHovered : backgroundColor } Accessible.role: Accessible.Button Accessible.name: text ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MySettingsLabel.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts ColumnLayout { id: root property alias text: mainTextLabel.text property alias helpText: helpTextLabel.text property alias textFormat: mainTextLabel.textFormat property alias wrapMode: mainTextLabel.wrapMode property alias font: mainTextLabel.font property alias horizontalAlignment: mainTextLabel.horizontalAlignment signal linkActivated(link : url); property alias color: mainTextLabel.color property alias linkColor: mainTextLabel.linkColor property var onReset: null property alias canReset: resetButton.enabled property bool resetClears: false Item { anchors.margins: 5 width: childrenRect.width height: mainTextLabel.contentHeight Label { id: mainTextLabel anchors.left: parent.left anchors.top: parent.top anchors.bottom: parent.bottom color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeLarger font.bold: true verticalAlignment: Text.AlignVCenter onLinkActivated: function(link) { root.linkActivated(link); } } MySettingsButton { id: resetButton anchors.baseline: mainTextLabel.baseline anchors.left: mainTextLabel.right height: mainTextLabel.contentHeight anchors.leftMargin: 10 padding: 2 leftPadding: 10 rightPadding: 10 backgroundRadius: 5 text: resetClears ? 
qsTr("Clear") : qsTr("Reset") visible: root.onReset !== null onClicked: root.onReset() } } Label { id: helpTextLabel visible: text !== "" Layout.fillWidth: true wrapMode: Text.Wrap color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeLarge font.bold: false onLinkActivated: function(link) { root.linkActivated(link); } MouseArea { anchors.fill: parent acceptedButtons: Qt.NoButton // pass clicks to parent cursorShape: parent.hoveredLink ? Qt.PointingHandCursor : Qt.ArrowCursor } } } ================================================ FILE: gpt4all-chat/qml/MySettingsStack.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Controls.impl import QtQuick.Layouts import QtQuick.Dialogs import Qt.labs.folderlistmodel import mysettings Item { id: settingsStack Theme { id: theme } property ListModel tabTitlesModel: ListModel { } property list<Component> tabs: [ ] TabBar { id: settingsTabBar anchors.top: parent.top anchors.horizontalCenter: parent.horizontalCenter width: parent.width / 1.75 z: 200 visible: tabTitlesModel.count > 1 background: Rectangle { color: "transparent" } Repeater { model: settingsStack.tabTitlesModel TabButton { id: tabButton padding: 10 contentItem: IconLabel { color: theme.textColor font.pixelSize: theme.fontSizeLarge font.bold: tabButton.checked text: model.title } background: Rectangle { color: "transparent" } Accessible.role: Accessible.Button Accessible.name: model.title } } } Rectangle { id: dividerTabBar visible: tabTitlesModel.count > 1 anchors.top: settingsTabBar.bottom anchors.topMargin: 15 anchors.bottomMargin: 15 anchors.leftMargin: 15 anchors.rightMargin: 15 anchors.left: parent.left anchors.right: parent.right height: 1 color: theme.settingsDivider } StackLayout { id: stackLayout anchors.top: tabTitlesModel.count > 1 ?
dividerTabBar.bottom : parent.top anchors.topMargin: 5 anchors.left: parent.left anchors.right: parent.right anchors.bottom: parent.bottom currentIndex: settingsTabBar.currentIndex Repeater { model: settingsStack.tabs delegate: Loader { id: loader sourceComponent: model.modelData onLoaded: { settingsStack.tabTitlesModel.append({ "title": loader.item.title }); } } } } } ================================================ FILE: gpt4all-chat/qml/MySettingsTab.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts Item { id: root property string title: "" property Item contentItem: null property bool showRestoreDefaultsButton: true signal restoreDefaults onContentItemChanged: function() { if (contentItem) { contentItem.parent = contentInner; contentItem.anchors.left = contentInner.left; contentItem.anchors.right = contentInner.right; } } ConfirmationDialog { id: restoreDefaultsDialog dialogTitle: qsTr("Restore defaults?") description: qsTr("This page of settings will be reset to the defaults.") onAccepted: root.restoreDefaults() } ScrollView { id: scrollView width: parent.width height: parent.height topPadding: 15 leftPadding: 5 contentWidth: availableWidth contentHeight: innerColumn.height ScrollBar.vertical: ScrollBar { parent: scrollView.parent anchors.top: scrollView.top anchors.left: scrollView.right anchors.bottom: scrollView.bottom } Theme { id: theme } ColumnLayout { id: innerColumn anchors.left: parent.left anchors.right: parent.right anchors.margins: 15 spacing: 10 Column { id: contentInner Layout.fillWidth: true Layout.maximumWidth: parent.width } Item { Layout.fillWidth: true Layout.topMargin: 20 height: restoreDefaultsButton.height MySettingsButton { id: restoreDefaultsButton anchors.left: parent.left visible: showRestoreDefaultsButton width: implicitWidth text: qsTr("Restore Defaults") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Button 
Accessible.name: text Accessible.description: qsTr("Restores settings dialog to a default state") onClicked: restoreDefaultsDialog.open() } } } } } ================================================ FILE: gpt4all-chat/qml/MySlug.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic Label { id: mySlug padding: 3 rightPadding: 9 leftPadding: 9 font.pixelSize: theme.fontSizeSmall background: Rectangle { radius: 6 border.width: 1 border.color: mySlug.color color: theme.slugBackground } ToolTip.visible: ma.containsMouse && ToolTip.text !== "" MouseArea { id: ma anchors.fill: parent hoverEnabled: true } } ================================================ FILE: gpt4all-chat/qml/MyTabButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import mysettings import mysettingsenums MySettingsButton { property bool isSelected: false contentItem: Text { text: parent.text horizontalAlignment: Qt.AlignCenter color: isSelected ? theme.titleTextColor : theme.styledTextColor font.pixelSize: theme.fontSizeLarger } background: Item { visible: isSelected || hovered Rectangle { anchors.bottom: parent.bottom anchors.left: parent.left anchors.right: parent.right height: 3 color: isSelected ? theme.titleTextColor : theme.styledTextColorLighter } } } ================================================ FILE: gpt4all-chat/qml/MyTextArea.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic TextArea { id: myTextArea property string errState: "ok" // one of "ok", "error", "warning" color: enabled ? theme.textColor : theme.mutedTextColor placeholderTextColor: theme.mutedTextColor font.pixelSize: theme.fontSizeLarge background: Rectangle { implicitWidth: 150 color: theme.controlBackground border.width: errState === "ok" ? 
1 : 2 border.color: { switch (errState) { case "ok": return theme.controlBorder; case "warning": return theme.textWarningColor; case "error": return theme.textErrorColor; } } radius: 10 } padding: 10 wrapMode: TextArea.Wrap ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyTextButton.qml ================================================ import QtQuick import QtQuick.Controls Text { id: text signal click() property string tooltip HoverHandler { id: hoverHandler } TapHandler { onTapped: { click() } } font.bold: true font.underline: hoverHandler.hovered font.pixelSize: theme.fontSizeSmall ToolTip.text: tooltip ToolTip.visible: tooltip !== "" && hoverHandler.hovered ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyTextField.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic TextField { id: myTextField padding: 10 placeholderTextColor: theme.mutedTextColor background: Rectangle { implicitWidth: 150 color: myTextField.enabled ? theme.controlBackground : theme.disabledControlBackground border.width: 1 border.color: theme.controlBorder radius: 10 } ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval color: enabled ? 
theme.textColor : theme.mutedTextColor } ================================================ FILE: gpt4all-chat/qml/MyToolButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import Qt5Compat.GraphicalEffects Button { id: myButton padding: 10 property color backgroundColor: theme.iconBackgroundDark property color backgroundColorHovered: theme.iconBackgroundHovered property color toggledColor: theme.accentColor property real toggledWidth: 1 property bool toggled: false property alias source: image.source property alias fillMode: image.fillMode property alias imageWidth: image.sourceSize.width property alias imageHeight: image.sourceSize.height property alias bgTransform: background.transform contentItem: Text { text: myButton.text horizontalAlignment: Text.AlignHCenter color: myButton.enabled ? theme.textColor : theme.mutedTextColor font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Button Accessible.name: text } background: Item { id: background anchors.fill: parent Rectangle { anchors.fill: parent color: myButton.toggledColor visible: myButton.toggled border.color: myButton.toggledColor border.width: myButton.toggledWidth radius: 8 } Image { id: image anchors.centerIn: parent visible: false fillMode: Image.PreserveAspectFit mipmap: true sourceSize.width: 32 sourceSize.height: 32 } ColorOverlay { anchors.fill: image source: image color: !myButton.enabled ? theme.mutedTextColor : myButton.hovered ? 
backgroundColorHovered : backgroundColor } } Accessible.role: Accessible.Button Accessible.name: text ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/MyWelcomeButton.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import Qt5Compat.GraphicalEffects import QtQuick.Layouts import mysettings Button { id: myButton property alias imageSource: myimage.source property alias description: description.text contentItem: Item { id: item anchors.centerIn: parent RowLayout { anchors.fill: parent Rectangle { id: rec color: "transparent" Layout.preferredWidth: item.width * 1/5.5 Layout.preferredHeight: item.width * 1/5.5 Layout.alignment: Qt.AlignCenter Image { id: myimage anchors.centerIn: parent sourceSize.width: rec.width sourceSize.height: rec.height mipmap: true visible: false } ColorOverlay { anchors.fill: myimage source: myimage color: theme.welcomeButtonBorder } } ColumnLayout { Layout.preferredWidth: childrenRect.width Text { text: myButton.text horizontalAlignment: Text.AlignHCenter color: myButton.hovered ? theme.welcomeButtonTextHovered : theme.welcomeButtonText font.pixelSize: theme.fontSizeBannerSmall font.bold: true Accessible.role: Accessible.Button Accessible.name: text } Text { id: description horizontalAlignment: Text.AlignHCenter color: myButton.hovered ? theme.welcomeButtonTextHovered : theme.welcomeButtonText font.pixelSize: theme.fontSizeSmall font.bold: false Accessible.role: Accessible.Button Accessible.name: text } } } } background: Rectangle { radius: 10 border.width: 1 border.color: myButton.hovered ? 
theme.welcomeButtonBorderHovered : theme.welcomeButtonBorder color: theme.welcomeButtonBackground } Accessible.role: Accessible.Button Accessible.name: text ToolTip.delay: Qt.styleHints.mousePressAndHoldInterval } ================================================ FILE: gpt4all-chat/qml/NetworkDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import download import network import llm import mysettings MyDialog { id: networkDialog anchors.centerIn: parent modal: true padding: 20 Theme { id: theme } Column { id: column spacing: 20 Item { width: childrenRect.width height: childrenRect.height Image { id: img anchors.top: parent.top anchors.left: parent.left width: 60 height: 60 source: "qrc:/gpt4all/icons/gpt4all.svg" } Text { anchors.left: img.right anchors.leftMargin: 30 anchors.verticalCenter: img.verticalCenter text: qsTr("Contribute data to the GPT4All Open Source Datalake.") color: theme.textColor font.pixelSize: theme.fontSizeLarge } } ScrollView { clip: true height: 300 width: 1024 - 40 ScrollBar.vertical.policy: ScrollBar.AlwaysOn ScrollBar.horizontal.policy: ScrollBar.AlwaysOff MyTextArea { id: textOptIn width: 1024 - 40 text: qsTr("By enabling this feature, you will be able to participate in the democratic process of " + "training a large language model by contributing data for future model improvements.\n\n" + "When a GPT4All model responds to you and you have opted-in, your conversation will be sent to " + "the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you " + "dislike a response, you can suggest an alternative response. This data will be collected and " + "aggregated in the GPT4All Datalake.\n\n" + "NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source " + "Datalake. You should have no expectation of chat privacy when this feature is enabled. 
You " + "should, however, have an expectation of an optional attribution if you wish. Your chat data " + "will be openly available for anyone to download and will be used by Nomic AI to improve " + "future GPT4All models. Nomic AI will retain all attribution information attached to your data " + "and you will be credited as a contributor to any GPT4All model release that uses your data!") focus: false readOnly: true Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Terms for opt-in") Accessible.description: qsTr("Describes what will happen when you opt-in") } } MyTextField { id: attribution width: parent.width text: MySettings.networkAttribution placeholderText: qsTr("Please provide a name for attribution (optional)") Accessible.role: Accessible.EditableText Accessible.name: qsTr("Attribution (optional)") Accessible.description: qsTr("Provide attribution") onEditingFinished: { MySettings.networkAttribution = attribution.text; } } } footer: DialogButtonBox { id: dialogBox padding: 20 alignment: Qt.AlignRight spacing: 10 MySettingsButton { text: qsTr("Enable") Accessible.description: qsTr("Enable opt-in") DialogButtonBox.buttonRole: DialogButtonBox.AcceptRole } MySettingsButton { text: qsTr("Cancel") Accessible.description: qsTr("Cancel opt-in") DialogButtonBox.buttonRole: DialogButtonBox.RejectRole } background: Rectangle { color: "transparent" } } onAccepted: { MySettings.networkIsActive = true } onRejected: { MySettings.networkIsActive = false } } ================================================ FILE: gpt4all-chat/qml/NewVersionDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import download import network import llm MyDialog { id: newVersionDialog anchors.centerIn: parent modal: true width: contentItem.width height: contentItem.height padding: 20 closeButtonVisible: false Theme { id: theme } Item { id: contentItem width: 
childrenRect.width + 40 height: childrenRect.height + 40 Label { id: label anchors.top: parent.top anchors.left: parent.left topPadding: 20 bottomPadding: 20 text: qsTr("New version is available") color: theme.titleTextColor font.pixelSize: theme.fontSizeLarge font.bold: true } MySettingsButton { id: button anchors.left: label.right anchors.leftMargin: 10 anchors.verticalCenter: label.verticalCenter padding: 20 text: qsTr("Update") font.pixelSize: theme.fontSizeLarge Accessible.description: qsTr("Update to new version") onClicked: { if (!LLM.checkForUpdates()) checkForUpdatesError.open() } } } } ================================================ FILE: gpt4all-chat/qml/PopupDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts Dialog { id: popupDialog anchors.centerIn: parent padding: 20 property alias text: textField.text property bool shouldTimeOut: true property bool shouldShowBusy: false modal: shouldShowBusy closePolicy: shouldShowBusy ? Popup.NoAutoClose : (Popup.CloseOnEscape | Popup.CloseOnPressOutside) Theme { id: theme } Row { anchors.centerIn: parent spacing: 20 Label { id: textField width: Math.min(1024, implicitWidth) height: Math.min(600, implicitHeight) anchors.verticalCenter: shouldShowBusy ? 
busyIndicator.verticalCenter : parent.verticalCenter horizontalAlignment: Text.AlignLeft verticalAlignment: Text.AlignVCenter textFormat: Text.StyledText wrapMode: Text.WordWrap color: theme.textColor linkColor: theme.linkColor Accessible.role: Accessible.HelpBalloon Accessible.name: text Accessible.description: qsTr("Reveals a short-lived help balloon") onLinkActivated: function(link) { Qt.openUrlExternally(link) } } MyBusyIndicator { id: busyIndicator visible: shouldShowBusy running: shouldShowBusy Accessible.role: Accessible.Animation Accessible.name: qsTr("Busy indicator") Accessible.description: qsTr("Displayed when the popup is showing busy") } } background: Rectangle { anchors.fill: parent color: theme.containerBackground border.width: 1 border.color: theme.dialogBorder radius: 10 } exit: Transition { NumberAnimation { duration: 500; property: "opacity"; from: 1.0; to: 0.0 } } onOpened: { if (shouldTimeOut) timer.start() } Timer { id: timer interval: 500; running: false; repeat: false onTriggered: popupDialog.close() } } ================================================ FILE: gpt4all-chat/qml/RemoteModelCard.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import QtQuick.Dialogs import Qt.labs.folderlistmodel import Qt5Compat.GraphicalEffects import llm import chatlistmodel import download import modellist import network import gpt4all import mysettings import localdocs Rectangle { property alias providerName: providerNameLabel.text property alias providerImage: myimage.source property alias providerDesc: providerDescLabel.text property string providerBaseUrl: "" property bool providerIsCustom: false property var modelWhitelist: null color: theme.conversationBackground radius: 10 border.width: 1 border.color: theme.controlBorder implicitHeight: topColumn.height + bottomColumn.height + 33 * theme.fontScale ColumnLayout { id: topColumn anchors.left: 
parent.left anchors.right: parent.right anchors.top: parent.top anchors.margins: 20 spacing: 15 * theme.fontScale RowLayout { Layout.alignment: Qt.AlignTop spacing: 10 Item { Layout.preferredWidth: 27 * theme.fontScale Layout.preferredHeight: 27 * theme.fontScale Layout.alignment: Qt.AlignLeft Image { id: myimage anchors.centerIn: parent sourceSize.width: parent.width sourceSize.height: parent.height mipmap: true fillMode: Image.PreserveAspectFit } } Label { id: providerNameLabel color: theme.textColor font.pixelSize: theme.fontSizeBanner } } Label { id: providerDescLabel Layout.fillWidth: true wrapMode: Text.Wrap color: theme.settingsTitleTextColor font.pixelSize: theme.fontSizeLarge onLinkActivated: function(link) { Qt.openUrlExternally(link); } MouseArea { anchors.fill: parent acceptedButtons: Qt.NoButton // pass clicks to parent cursorShape: parent.hoveredLink ? Qt.PointingHandCursor : Qt.ArrowCursor } } } ColumnLayout { id: bottomColumn anchors.left: parent.left anchors.right: parent.right anchors.bottom: parent.bottom anchors.margins: 20 spacing: 30 ColumnLayout { MySettingsLabel { text: qsTr("API Key") font.bold: true font.pixelSize: theme.fontSizeLarge color: theme.settingsTitleTextColor } MyTextField { id: apiKeyField Layout.fillWidth: true font.pixelSize: theme.fontSizeLarge wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $API_KEY is empty.")); apiKeyField.placeholderTextColor = theme.textErrorColor; } onTextChanged: { apiKeyField.placeholderTextColor = theme.mutedTextColor; if (!providerIsCustom) { let models = ModelList.remoteModelList(apiKeyField.text, providerBaseUrl); if (modelWhitelist !== null) models = models.filter(m => modelWhitelist.includes(m)); myModelList.model = models; myModelList.currentIndex = -1; } } placeholderText: qsTr("enter $API_KEY") Accessible.role: Accessible.EditableText Accessible.name: placeholderText Accessible.description: qsTr("Text field for entering the API key") } } ColumnLayout { 
visible: providerIsCustom MySettingsLabel { text: qsTr("Base URL") font.bold: true font.pixelSize: theme.fontSizeLarge color: theme.settingsTitleTextColor } MyTextField { id: baseUrlField Layout.fillWidth: true font.pixelSize: theme.fontSizeLarge wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $BASE_URL is empty.")); baseUrlField.placeholderTextColor = theme.textErrorColor; } onTextChanged: { baseUrlField.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $BASE_URL") Accessible.role: Accessible.EditableText Accessible.name: placeholderText } } ColumnLayout { visible: providerIsCustom MySettingsLabel { text: qsTr("Model Name") font.bold: true font.pixelSize: theme.fontSizeLarge color: theme.settingsTitleTextColor } MyTextField { id: modelNameField Layout.fillWidth: true font.pixelSize: theme.fontSizeLarge wrapMode: Text.WrapAnywhere function showError() { messageToast.show(qsTr("ERROR: $MODEL_NAME is empty.")); modelNameField.placeholderTextColor = theme.textErrorColor; } onTextChanged: { modelNameField.placeholderTextColor = theme.mutedTextColor; } placeholderText: qsTr("enter $MODEL_NAME") Accessible.role: Accessible.EditableText Accessible.name: placeholderText } } ColumnLayout { visible: myModelList.count > 0 && !providerIsCustom MySettingsLabel { text: qsTr("Models") font.bold: true font.pixelSize: theme.fontSizeLarge color: theme.settingsTitleTextColor } RowLayout { spacing: 10 MyComboBox { Layout.fillWidth: true id: myModelList currentIndex: -1; } } } MySettingsButton { id: installButton Layout.alignment: Qt.AlignRight text: qsTr("Install") font.pixelSize: theme.fontSizeLarge property string apiKeyText: apiKeyField.text.trim() property string baseUrlText: providerIsCustom ? baseUrlField.text.trim() : providerBaseUrl.trim() property string modelNameText: providerIsCustom ? 
modelNameField.text.trim() : myModelList.currentText.trim() enabled: apiKeyText !== "" && baseUrlText !== "" && modelNameText !== "" onClicked: { Download.installCompatibleModel( modelNameText, apiKeyText, baseUrlText, ); myModelList.currentIndex = -1; } Accessible.role: Accessible.Button Accessible.name: qsTr("Install") Accessible.description: qsTr("Install remote model") } } } ================================================ FILE: gpt4all-chat/qml/SettingsView.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Dialogs import QtQuick.Layouts import Qt.labs.folderlistmodel import download import modellist import network import llm import mysettings Rectangle { id: settingsDialog color: theme.viewBackground property alias pageToDisplay: listView.currentIndex Item { Accessible.role: Accessible.Dialog Accessible.name: qsTr("Settings") Accessible.description: qsTr("Contains various application settings") } ListModel { id: stacksModel ListElement { title: qsTr("Application") } ListElement { title: qsTr("Model") } ListElement { title: qsTr("LocalDocs") } } ColumnLayout { id: mainArea anchors.left: parent.left anchors.right: parent.right anchors.top: parent.top anchors.bottom: parent.bottom anchors.margins: 30 spacing: 50 RowLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignTop spacing: 50 ColumnLayout { Layout.fillWidth: true Layout.alignment: Qt.AlignLeft Layout.minimumWidth: 200 spacing: 5 Text { id: welcome text: qsTr("Settings") font.pixelSize: theme.fontSizeBanner color: theme.titleTextColor } } Rectangle { Layout.fillWidth: true height: 0 } } Item { Layout.fillWidth: true Layout.fillHeight: true Rectangle { id: stackList anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: parent.left width: 220 color: theme.viewBackground radius: 10 ScrollView { anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: parent.left anchors.right: 
parent.right anchors.topMargin: 10 ScrollBar.vertical.policy: ScrollBar.AsNeeded clip: true ListView { id: listView anchors.fill: parent model: stacksModel delegate: Rectangle { id: item width: listView.width height: titleLabel.height + 10 color: "transparent" MyButton { id: titleLabel backgroundColor: index === listView.currentIndex ? theme.selectedBackground : theme.viewBackground backgroundColorHovered: backgroundColor borderColor: "transparent" borderWidth: 0 textColor: theme.titleTextColor anchors.verticalCenter: parent.verticalCenter anchors.left: parent.left anchors.right: parent.right anchors.margins: 10 font.bold: index === listView.currentIndex text: title textAlignment: Qt.AlignLeft font.pixelSize: theme.fontSizeLarge onClicked: { listView.currentIndex = index } } } } } } StackLayout { id: stackLayout anchors.top: parent.top anchors.bottom: parent.bottom anchors.left: stackList.right anchors.right: parent.right currentIndex: listView.currentIndex MySettingsStack { tabs: [ Component { ApplicationSettings { } } ] } MySettingsStack { tabs: [ Component { ModelSettings { } } ] } MySettingsStack { tabs: [ Component { LocalDocsSettings { } } ] } } } } } ================================================ FILE: gpt4all-chat/qml/StartupDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import Qt5Compat.GraphicalEffects import download import network import llm import mysettings MyDialog { id: startupDialog anchors.centerIn: parent modal: true padding: 10 width: 1024 height: column.height + 20 closePolicy: !optInStatisticsRadio.choiceMade || !optInNetworkRadio.choiceMade ? 
Popup.NoAutoClose : (Popup.CloseOnEscape | Popup.CloseOnPressOutside) Theme { id: theme } Column { id: column spacing: 20 Item { width: childrenRect.width height: childrenRect.height Image { id: img anchors.top: parent.top anchors.left: parent.left sourceSize.width: 60 sourceSize.height: 60 mipmap: true visible: false source: "qrc:/gpt4all/icons/globe.svg" } ColorOverlay { anchors.fill: img source: img color: theme.titleTextColor } Text { anchors.left: img.right anchors.leftMargin: 10 anchors.verticalCenter: img.verticalCenter text: qsTr("Welcome!") color: theme.textColor font.pixelSize: theme.fontSizeLarge } } ScrollView { clip: true height: 200 width: 1024 - 40 ScrollBar.vertical.policy: ScrollBar.AlwaysOn ScrollBar.horizontal.policy: ScrollBar.AlwaysOff MyTextArea { id: welcome width: 1024 - 40 textFormat: TextEdit.MarkdownText text: qsTr("### Release Notes\n%1
                  \n### Contributors\n%2").arg(Download.releaseInfo.notes).arg(Download.releaseInfo.contributors) focus: false readOnly: true Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Release notes") Accessible.description: qsTr("Release notes for this version") } } ScrollView { clip: true height: 150 width: 1024 - 40 ScrollBar.vertical.policy: ScrollBar.AlwaysOn ScrollBar.horizontal.policy: ScrollBar.AlwaysOff MyTextArea { id: optInTerms width: 1024 - 40 textFormat: TextEdit.MarkdownText text: qsTr( "### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. 
Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data!") focus: false readOnly: true Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Terms for opt-in") Accessible.description: qsTr("Describes what will happen when you opt-in") } } GridLayout { columns: 2 rowSpacing: 10 columnSpacing: 10 anchors.right: parent.right Label { id: optInStatistics text: qsTr("Opt-in to anonymous usage analytics used to improve GPT4All") Layout.row: 0 Layout.column: 0 color: theme.textColor font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Opt-in for anonymous usage statistics") } ButtonGroup { buttons: optInStatisticsRadio.children onClicked: { MySettings.networkUsageStatsActive = optInStatisticsRadio.checked if (optInNetworkRadio.choiceMade && optInStatisticsRadio.choiceMade) startupDialog.close(); } } RowLayout { id: optInStatisticsRadio Layout.alignment: Qt.AlignVCenter Layout.row: 0 Layout.column: 1 property alias checked: optInStatisticsRadioYes.checked property bool choiceMade: optInStatisticsRadioYes.checked || optInStatisticsRadioNo.checked RadioButton { id: optInStatisticsRadioYes checked: MySettings.networkUsageStatsActive text: qsTr("Yes") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.RadioButton Accessible.name: qsTr("Opt-in for anonymous usage statistics") Accessible.description: qsTr("Allow opt-in for anonymous usage statistics") background: Rectangle { color: "transparent" } indicator: Rectangle { implicitWidth: 26 implicitHeight: 26 x: optInStatisticsRadioYes.leftPadding y: parent.height / 2 - height / 2 radius: 13 border.color: theme.dialogBorder color: "transparent" Rectangle { width: 14 height: 14 x: 6 y: 6 radius: 7 color: theme.textColor visible: optInStatisticsRadioYes.checked } } contentItem: Text { text: optInStatisticsRadioYes.text font: optInStatisticsRadioYes.font opacity: 
enabled ? 1.0 : 0.3 color: theme.textColor verticalAlignment: Text.AlignVCenter leftPadding: optInStatisticsRadioYes.indicator.width + optInStatisticsRadioYes.spacing } } RadioButton { id: optInStatisticsRadioNo checked: MySettings.isNetworkUsageStatsActiveSet() && !MySettings.networkUsageStatsActive text: qsTr("No") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.RadioButton Accessible.name: qsTr("Opt-out for anonymous usage statistics") Accessible.description: qsTr("Allow opt-out for anonymous usage statistics") background: Rectangle { color: "transparent" } indicator: Rectangle { implicitWidth: 26 implicitHeight: 26 x: optInStatisticsRadioNo.leftPadding y: parent.height / 2 - height / 2 radius: 13 border.color: theme.dialogBorder color: "transparent" Rectangle { width: 14 height: 14 x: 6 y: 6 radius: 7 color: theme.textColor visible: optInStatisticsRadioNo.checked } } contentItem: Text { text: optInStatisticsRadioNo.text font: optInStatisticsRadioNo.font opacity: enabled ? 
1.0 : 0.3 color: theme.textColor verticalAlignment: Text.AlignVCenter leftPadding: optInStatisticsRadioNo.indicator.width + optInStatisticsRadioNo.spacing } } } Label { id: optInNetwork text: qsTr("Opt-in to anonymous sharing of chats to the GPT4All Datalake") Layout.row: 1 Layout.column: 0 color: theme.textColor font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.Paragraph Accessible.name: qsTr("Opt-in for network") Accessible.description: qsTr("Allow opt-in for network") } ButtonGroup { buttons: optInNetworkRadio.children onClicked: { MySettings.networkIsActive = optInNetworkRadio.checked if (optInNetworkRadio.choiceMade && optInStatisticsRadio.choiceMade) startupDialog.close(); } } RowLayout { id: optInNetworkRadio Layout.alignment: Qt.AlignVCenter Layout.row: 1 Layout.column: 1 property alias checked: optInNetworkRadioYes.checked property bool choiceMade: optInNetworkRadioYes.checked || optInNetworkRadioNo.checked RadioButton { id: optInNetworkRadioYes checked: MySettings.networkIsActive text: qsTr("Yes") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.RadioButton Accessible.name: qsTr("Opt-in for network") Accessible.description: qsTr("Allow opt-in anonymous sharing of chats to the GPT4All Datalake") background: Rectangle { color: "transparent" } indicator: Rectangle { implicitWidth: 26 implicitHeight: 26 x: optInNetworkRadioYes.leftPadding y: parent.height / 2 - height / 2 radius: 13 border.color: theme.dialogBorder color: "transparent" Rectangle { width: 14 height: 14 x: 6 y: 6 radius: 7 color: theme.textColor visible: optInNetworkRadioYes.checked } } contentItem: Text { text: optInNetworkRadioYes.text font: optInNetworkRadioYes.font opacity: enabled ? 
1.0 : 0.3 color: theme.textColor verticalAlignment: Text.AlignVCenter leftPadding: optInNetworkRadioYes.indicator.width + optInNetworkRadioYes.spacing } } RadioButton { id: optInNetworkRadioNo checked: MySettings.isNetworkIsActiveSet() && !MySettings.networkIsActive text: qsTr("No") font.pixelSize: theme.fontSizeLarge Accessible.role: Accessible.RadioButton Accessible.name: qsTr("Opt-out for network") Accessible.description: qsTr("Allow opt-out anonymous sharing of chats to the GPT4All Datalake") background: Rectangle { color: "transparent" } indicator: Rectangle { implicitWidth: 26 implicitHeight: 26 x: optInNetworkRadioNo.leftPadding y: parent.height / 2 - height / 2 radius: 13 border.color: theme.dialogBorder color: "transparent" Rectangle { width: 14 height: 14 x: 6 y: 6 radius: 7 color: theme.textColor visible: optInNetworkRadioNo.checked } } contentItem: Text { text: optInNetworkRadioNo.text font: optInNetworkRadioNo.font opacity: enabled ? 1.0 : 0.3 color: theme.textColor verticalAlignment: Text.AlignVCenter leftPadding: optInNetworkRadioNo.indicator.width + optInNetworkRadioNo.spacing } } } } } } ================================================ FILE: gpt4all-chat/qml/Theme.qml ================================================ import QtCore import QtQuick import QtQuick.Controls.Basic import mysettings import mysettingsenums QtObject { // black and white property color black: Qt.hsla(231/360, 0.15, 0.19) property color white: Qt.hsla(0, 0, 1) // dark mode black and white property color darkwhite: Qt.hsla(0, 0, 0.85) // gray // FIXME: These are slightly less red than what atlas uses. 
should resolve diff property color gray0: white property color gray50: Qt.hsla(25/360, 0.05, 0.97) property color gray100: Qt.hsla(25/360,0.05, 0.95) property color gray200: Qt.hsla(25/360, 0.05, 0.89) property color gray300: Qt.hsla(25/360, 0.05, 0.82) property color gray400: Qt.hsla(25/360, 0.05, 0.71) property color gray500: Qt.hsla(25/360, 0.05, 0.60) property color gray600: Qt.hsla(25/360, 0.05, 0.51) property color gray700: Qt.hsla(25/360, 0.05, 0.42) property color gray800: Qt.hsla(25/360, 0.05, 0.35) property color gray900: Qt.hsla(25/360, 0.05, 0.31) property color gray950: Qt.hsla(25/360, 0.05, 0.15) property color grayRed0: Qt.hsla(0/360, 0.108, 0.89) property color grayRed50: Qt.hsla(0/360, 0.108, 0.85) property color grayRed100: Qt.hsla(0/360, 0.108, 0.80) property color grayRed200: Qt.hsla(0/360, 0.108, 0.76) property color grayRed300: Qt.hsla(0/360, 0.108, 0.72) property color grayRed400: Qt.hsla(0/360, 0.108, 0.68) property color grayRed500: Qt.hsla(0/360, 0.108, 0.60) property color grayRed600: Qt.hsla(0/360, 0.108, 0.56) property color grayRed700: Qt.hsla(0/360, 0.108, 0.52) property color grayRed800: Qt.hsla(0/360, 0.108, 0.48) property color grayRed900: Qt.hsla(0/360, 0.108, 0.42) // darkmode property color darkgray0: Qt.hsla(25/360, 0.05, 0.23) property color darkgray50: Qt.hsla(25/360, 0.05, 0.21) property color darkgray100: Qt.hsla(25/360, 0.05, 0.19) property color darkgray200: Qt.hsla(25/360, 0.05, 0.17) property color darkgray300: Qt.hsla(25/360, 0.05, 0.15) property color darkgray400: Qt.hsla(25/360, 0.05, 0.13) property color darkgray500: Qt.hsla(25/360, 0.05, 0.11) property color darkgray600: Qt.hsla(25/360, 0.05, 0.09) property color darkgray700: Qt.hsla(25/360, 0.05, 0.07) property color darkgray800: Qt.hsla(25/360, 0.05, 0.05) property color darkgray900: Qt.hsla(25/360, 0.05, 0.03) property color darkgray950: Qt.hsla(25/360, 0.05, 0.01) // green property color green50: Qt.hsla(120/360, 0.18, 0.97) property color green100: 
Qt.hsla(120/360, 0.21, 0.93) property color green200: Qt.hsla(124/360, 0.21, 0.85) property color green300: Qt.hsla(122/360, 0.20, 0.73) property color green400: Qt.hsla(122/360, 0.19, 0.58) property color green500: Qt.hsla(121/360, 0.19, 0.45) property color green600: Qt.hsla(122/360, 0.20, 0.33) property color green700: Qt.hsla(122/360, 0.19, 0.29) property color green800: Qt.hsla(123/360, 0.17, 0.24) property color green900: Qt.hsla(124/360, 0.17, 0.20) property color green950: Qt.hsla(125/360, 0.22, 0.10) property color green300_sat: Qt.hsla(122/360, 0.24, 0.73) property color green400_sat: Qt.hsla(122/360, 0.23, 0.58) property color green450_sat: Qt.hsla(122/360, 0.23, 0.52) // yellow property color yellow0: Qt.hsla(47/360, 0.90, 0.99) property color yellow25: Qt.hsla(47/360, 0.90, 0.98) property color yellow50: Qt.hsla(47/360, 0.90, 0.96) property color yellow100: Qt.hsla(46/360, 0.89, 0.89) property color yellow200: Qt.hsla(45/360, 0.90, 0.77) property color yellow300: Qt.hsla(44/360, 0.90, 0.66) property color yellow400: Qt.hsla(41/360, 0.89, 0.56) property color yellow500: Qt.hsla(36/360, 0.85, 0.50) property color yellow600: Qt.hsla(30/360, 0.87, 0.44) property color yellow700: Qt.hsla(24/360, 0.84, 0.37) property color yellow800: Qt.hsla(21/360, 0.76, 0.31) property color yellow900: Qt.hsla(20/360, 0.72, 0.26) property color yellow950: Qt.hsla(19/360, 0.86, 0.14) // red property color red50: Qt.hsla(0, 0.71, 0.97) property color red100: Qt.hsla(0, 0.87, 0.94) property color red200: Qt.hsla(0, 0.89, 0.89) property color red300: Qt.hsla(0, 0.85, 0.77) property color red400: Qt.hsla(0, 0.83, 0.71) property color red500: Qt.hsla(0, 0.76, 0.60) property color red600: Qt.hsla(0, 0.65, 0.51) property color red700: Qt.hsla(0, 0.67, 0.42) property color red800: Qt.hsla(0, 0.63, 0.35) property color red900: Qt.hsla(0, 0.56, 0.31) property color red950: Qt.hsla(0, 0.67, 0.15) // purple // FIXME: These are slightly more uniform than what atlas uses. 
should resolve this difference property color purple50: Qt.hsla(279/360, 1.0, 0.98) property color purple100: Qt.hsla(279/360, 1.0, 0.95) property color purple200: Qt.hsla(279/360, 1.0, 0.91) property color purple300: Qt.hsla(279/360, 1.0, 0.84) property color purple400: Qt.hsla(279/360, 1.0, 0.73) property color purple450: Qt.hsla(279/360, 1.0, 0.68) property color purple500: Qt.hsla(279/360, 1.0, 0.63) property color purple600: Qt.hsla(279/360, 1.0, 0.53) property color purple700: Qt.hsla(279/360, 1.0, 0.47) property color purple800: Qt.hsla(279/360, 1.0, 0.39) property color purple900: Qt.hsla(279/360, 1.0, 0.32) property color purple950: Qt.hsla(279/360, 1.0, 0.22) property color blue0: "#d0d5db" property color blue100: "#8e8ea0" property color blue200: "#7d7d8e" property color blue400: "#444654" property color blue500: "#343541" property color blue600: "#2c2d37" property color blue700: "#26272f" property color blue800: "#232628" property color blue900: "#222527" property color blue950: "#1c1f21" property color blue1000: "#0e1011" property color accentColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue200 case MySettingsEnums.ChatTheme.Dark: return yellow300 default: return yellow300 } } /* These no longer apply to anything (remove this?) Replaced by menuHighlightColor & menuBackgroundColor, which now use different colors.
property color darkContrast: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray300 default: return gray100 } } property color lightContrast: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue400 case MySettingsEnums.ChatTheme.Dark: return darkgray0 default: return gray0 } } */ property color controlBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue800 case MySettingsEnums.ChatTheme.Dark: return darkgray0 default: return gray300 } } property color controlBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray300 default: return gray100 } } property color attachmentBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue900 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return gray0 } } property color disabledControlBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return gray200 } } property color dividerColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return grayRed0 } } property color conversationDivider: { return dividerColor } property color settingsDivider: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return dividerColor case MySettingsEnums.ChatTheme.Dark: return darkgray400 default: return grayRed500 } } property color viewBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue600 case MySettingsEnums.ChatTheme.Dark: return darkgray100 default: return gray50 } } /* These no longer apply to anything (remove this?)
property color containerForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray300 default: return gray300 } } */ property color containerBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue900 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return gray100 } } property color viewBarBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray400 default: return gray100 } } property color progressForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return accentColor default: return green600 } } property color progressBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue900 case MySettingsEnums.ChatTheme.Dark: return green600 default: return green100 } } property color altProgressForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return progressForeground default: return "#fcf0c9" } } property color altProgressBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return progressBackground default: return "#fff9d2" } } property color altProgressText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor default: return "#d16f0e" } } property color checkboxBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return gray200 default: return gray600 } } property color checkboxForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return green300 default: return green600 } } property color 
buttonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue950 case MySettingsEnums.ChatTheme.Dark: return darkgray300 default: return green600 } } property color buttonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue900 case MySettingsEnums.ChatTheme.Dark: return darkgray400 default: return green500 } } property color lightButtonText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return green600 } } property color lightButtonMutedText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return mutedTextColor case MySettingsEnums.ChatTheme.Dark: return mutedTextColor default: return green300 } } property color lightButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackground case MySettingsEnums.ChatTheme.Dark: return buttonBackground default: return green100 } } property color lightButtonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return buttonBackgroundHovered default: return green200 } } property color mediumButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return green400_sat default: return green400_sat } } property color mediumButtonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple450 case MySettingsEnums.ChatTheme.Dark: return green450_sat default: return green300_sat } } property color mediumButtonText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return white } } property color 
darkButtonText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return red600 } } property color darkButtonMutedText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return mutedTextColor case MySettingsEnums.ChatTheme.Dark: return mutedTextColor default: return red300 } } property color darkButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackground case MySettingsEnums.ChatTheme.Dark: return buttonBackground default: return red200 } } property color darkButtonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return buttonBackgroundHovered default: return red300 } } property color lighterButtonForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return green600 } } property color lighterButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackground case MySettingsEnums.ChatTheme.Dark: return buttonBackground default: return green100 } } property color lighterButtonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return buttonBackgroundHovered default: return green50 } } property color lighterButtonBackgroundHoveredRed: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return buttonBackgroundHovered default: return red50 } } property color sourcesBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return lighterButtonBackground case 
MySettingsEnums.ChatTheme.Dark: return lighterButtonBackground default: return gray100 } } property color sourcesBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return lighterButtonBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return lighterButtonBackgroundHovered default: return gray200 } } property color buttonBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return controlBorder default: return yellow200 } } property color conversationInputButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return accentColor default: return black } } property color conversationInputButtonBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue0 case MySettingsEnums.ChatTheme.Dark: return darkwhite default: return accentColor } } property color selectedBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue700 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return gray0 } } property color conversationButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue500 case MySettingsEnums.ChatTheme.Dark: return darkgray100 default: return gray0 } } property color conversationBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue500 case MySettingsEnums.ChatTheme.Dark: return darkgray50 default: return white } } property color conversationProgress: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return green400 default: return green400 } } property color conversationButtonBackgroundHovered: { switch (MySettings.chatTheme) { case 
MySettingsEnums.ChatTheme.LegacyDark: return blue400 case MySettingsEnums.ChatTheme.Dark: return darkgray0 default: return gray100 } } property color conversationButtonBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return yellow200 default: return yellow200 } } property color conversationHeader: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return green400 default: return green500 } } property color collectionsButtonText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return black case MySettingsEnums.ChatTheme.Dark: return black default: return white } } property color collectionsButtonProgress: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return darkgray400 default: return green400 } } property color collectionsButtonForeground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return green300 default: return green600 } } property color collectionsButtonBackground: { switch (MySettings.chatTheme) { default: return lighterButtonBackground } } property color collectionsButtonBackgroundHovered: { switch (MySettings.chatTheme) { default: return lighterButtonBackgroundHovered } } property color welcomeButtonBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBackground case MySettingsEnums.ChatTheme.Dark: return buttonBackground default: return lighterButtonBackground } } property color welcomeButtonBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return buttonBorder case MySettingsEnums.ChatTheme.Dark: return buttonBorder default: return green300 } } property color welcomeButtonBorderHovered: { switch 
(MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple200 case MySettingsEnums.ChatTheme.Dark: return darkgray100 default: return green400 } } property color welcomeButtonText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return green700 } } property color welcomeButtonTextHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple200 case MySettingsEnums.ChatTheme.Dark: return gray400 default: return green800 } } property color fancyLinkText: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return textColor case MySettingsEnums.ChatTheme.Dark: return textColor default: return grayRed900 } } property color fancyLinkTextHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return mutedTextColor case MySettingsEnums.ChatTheme.Dark: return mutedTextColor default: return textColor } } property color iconBackgroundDark: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue200 case MySettingsEnums.ChatTheme.Dark: return green400 default: return black } } property color iconBackgroundLight: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue200 case MySettingsEnums.ChatTheme.Dark: return darkwhite default: return gray500 } } property color iconBackgroundHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue0 case MySettingsEnums.ChatTheme.Dark: return gray400 default: return accentColor } } property color iconBackgroundViewBar: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return iconBackgroundLight case MySettingsEnums.ChatTheme.Dark: return iconBackgroundLight default: return green500 } } property color iconBackgroundViewBarToggled: { switch (MySettings.chatTheme) { case 
MySettingsEnums.ChatTheme.LegacyDark: return iconBackgroundLight case MySettingsEnums.ChatTheme.Dark: return darkgray50 default: return green200 } } property color iconBackgroundViewBarHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return iconBackgroundHovered case MySettingsEnums.ChatTheme.Dark: return iconBackgroundHovered default: return green600 } } property color slugBackground: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue600 case MySettingsEnums.ChatTheme.Dark: return darkgray300 default: return gray100 } } property color textColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue0 case MySettingsEnums.ChatTheme.Dark: return darkwhite default: return black } } // lighter contrast property color mutedLighterTextColor: { switch (MySettings.chatTheme) { default: return gray300 } } // light contrast property color mutedLightTextColor: { switch (MySettings.chatTheme) { default: return gray400 } } // normal contrast property color mutedTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue200 case MySettingsEnums.ChatTheme.Dark: return gray400 default: return gray500 } } // dark contrast property color mutedDarkTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return mutedTextColor case MySettingsEnums.ChatTheme.Dark: return mutedTextColor default: return grayRed500 } } // dark contrast hovered property color mutedDarkTextColorHovered: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue400 default: return grayRed900 } } property color oppositeTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return white case MySettingsEnums.ChatTheme.Dark: return darkwhite default: return white } } property color oppositeMutedTextColor: { switch (MySettings.chatTheme) { case 
MySettingsEnums.ChatTheme.LegacyDark: return white case MySettingsEnums.ChatTheme.Dark: return darkwhite default: return white } } property color textAccent: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return accentColor default: return accentColor } } readonly property color textErrorColor: red400 readonly property color textWarningColor: yellow400 property color settingsTitleTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue100 case MySettingsEnums.ChatTheme.Dark: return green200 default: return black } } property color titleTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return green300 default: return green700 } } property color titleTextColor2: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return darkwhite case MySettingsEnums.ChatTheme.Dark: return green200 default: return green700 } } property color titleInfoTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue200 case MySettingsEnums.ChatTheme.Dark: return gray400 default: return gray600 } } property color styledTextColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple100 case MySettingsEnums.ChatTheme.Dark: return yellow25 default: return grayRed900 } } property color styledTextColorLighter: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple50 case MySettingsEnums.ChatTheme.Dark: return yellow0 default: return grayRed400 } } property color styledTextColor2: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue0 case MySettingsEnums.ChatTheme.Dark: return yellow50 default: return green500 } } property color chatDrawerSectionHeader: { switch (MySettings.chatTheme) { case 
MySettingsEnums.ChatTheme.LegacyDark: return purple50 case MySettingsEnums.ChatTheme.Dark: return yellow0 default: return grayRed800 } } property color dialogBorder: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return accentColor case MySettingsEnums.ChatTheme.Dark: return darkgray0 default: return darkgray0 } } property color linkColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return yellow600 case MySettingsEnums.ChatTheme.Dark: return yellow600 default: return yellow600 } } property color mainHeader: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue900 case MySettingsEnums.ChatTheme.Dark: return green600 default: return green600 } } property color mainComboBackground: { switch (MySettings.chatTheme) { default: return "transparent" } } property color sendGlow: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue1000 case MySettingsEnums.ChatTheme.Dark: return green950 default: return green300 } } property color userColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue800 case MySettingsEnums.ChatTheme.Dark: return green700 default: return green700 } } property color assistantColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return purple400 case MySettingsEnums.ChatTheme.Dark: return accentColor default: return accentColor } } property color codeDefaultColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: default: return textColor } } property color codeKeywordColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return "#2e95d3" // blue default: return "#195273" // dark blue } } property color codeFunctionColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case 
MySettingsEnums.ChatTheme.Dark: return "#f22c3d" // red default: return "#7d1721" // dark red } } property color codeFunctionCallColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return "#e9950c" // orange default: return "#815207" // dark orange } } property color codeCommentColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return "#808080" // gray default: return "#474747" // dark gray } } property color codeStringColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return "#00a37d" // green default: return "#004a39" // dark green } } property color codeNumberColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return "#df3079" // fuchsia default: return "#761942" // dark fuchsia } } property color codeHeaderColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return containerBackground default: return green50 } } property color codeBackgroundColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return controlBackground default: return gray100 } } property color chatNameEditBgColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: case MySettingsEnums.ChatTheme.Dark: return controlBackground default: return gray100 } } property color menuBackgroundColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue700 case MySettingsEnums.ChatTheme.Dark: return darkgray200 default: return gray50 } } property color menuHighlightColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue400 case MySettingsEnums.ChatTheme.Dark: return darkgray0 default: return green100 } } 
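Every color role in Theme.qml follows the same shape: a switch on MySettings.chatTheme with LegacyDark and Dark cases and a light-theme `default`. A minimal Python sketch of that fallback pattern (illustrative names only, not GPT4All code):

```python
# Sketch of Theme.qml's per-theme color resolution: each role switches on the
# active chat theme and falls back to the light-theme value for any other
# (or future) theme. Role/color names below mirror two roles from the file.
THEME_ROLES = {
    # role: (legacy_dark, dark, light_default)
    "menuBackgroundColor": ("blue700", "darkgray200", "gray50"),
    "menuHighlightColor": ("blue400", "darkgray0", "green100"),
}

def resolve(role: str, theme: str) -> str:
    legacy_dark, dark, default = THEME_ROLES[role]
    if theme == "LegacyDark":
        return legacy_dark
    if theme == "Dark":
        return dark
    return default  # Light and any unrecognized theme land here

print(resolve("menuHighlightColor", "Dark"))  # darkgray0
```

The `default` branch is what lets new theme enum values degrade gracefully to the light palette instead of failing.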
property color menuBorderColor: { switch (MySettings.chatTheme) { case MySettingsEnums.ChatTheme.LegacyDark: return blue400 case MySettingsEnums.ChatTheme.Dark: return gray800 default: return gray300 } } property real fontScale: MySettings.fontSize === MySettingsEnums.FontSize.Small ? 1 : MySettings.fontSize === MySettingsEnums.FontSize.Medium ? 1.3 : /* "Large" */ 1.8 property real fontSizeSmallest: 8 * fontScale property real fontSizeSmaller: 9 * fontScale property real fontSizeSmall: 10 * fontScale property real fontSizeMedium: 11 * fontScale property real fontSizeLarge: 12 * fontScale property real fontSizeLarger: 14 * fontScale property real fontSizeLargest: 18 * fontScale property real fontSizeBannerSmall: 24 * fontScale**.8 property real fontSizeBanner: 32 * fontScale**.8 property real fontSizeBannerLarge: 48 * fontScale**.8 } ================================================ FILE: gpt4all-chat/qml/ThumbsDownDialog.qml ================================================ import QtCore import QtQuick import QtQuick.Controls import QtQuick.Controls.Basic import QtQuick.Layouts import download import network import llm MyDialog { id: thumbsDownDialog modal: true padding: 20 Theme { id: theme } property alias response: thumbsDownNewResponse.text Column { anchors.fill: parent spacing: 20 Item { width: childrenRect.width height: childrenRect.height Image { id: img anchors.top: parent.top anchors.left: parent.left width: 60 height: 60 source: "qrc:/gpt4all/icons/thumbs_down.svg" } Text { anchors.left: img.right anchors.leftMargin: 30 anchors.verticalCenter: img.verticalCenter text: qsTr("Please edit the text below to provide a better response. 
(optional)") color: theme.textColor font.pixelSize: theme.fontSizeLarge } } ScrollView { clip: true height: 120 width: parent.width ScrollBar.vertical.policy: ScrollBar.AsNeeded ScrollBar.horizontal.policy: ScrollBar.AlwaysOff MyTextArea { id: thumbsDownNewResponse placeholderText: qsTr("Please provide a better response...") } } } footer: DialogButtonBox { padding: 20 alignment: Qt.AlignRight spacing: 10 MySettingsButton { text: qsTr("Submit") Accessible.description: qsTr("Submits the user's response") DialogButtonBox.buttonRole: DialogButtonBox.AcceptRole } MySettingsButton { text: qsTr("Cancel") Accessible.description: qsTr("Closes the response dialog") DialogButtonBox.buttonRole: DialogButtonBox.RejectRole } background: Rectangle { color: "transparent" } } } ================================================ FILE: gpt4all-chat/qml/Toast.qml ================================================ /* * SPDX-License-Identifier: MIT * Source: https://gist.github.com/jonmcclung/bae669101d17b103e94790341301c129 * Adapted from StackOverflow: http://stackoverflow.com/questions/26879266/make-toast-in-android-by-qml */ import QtQuick 2.0 /** * @brief An Android-like timed message text in a box that self-destroys when finished if desired */ Rectangle { /** * Public */ /** * @brief Shows this Toast * * @param {string} text Text to show * @param {real} duration Duration to show in milliseconds, defaults to 3000 */ function show(text, duration=3000) { message.text = text; if (typeof duration !== "undefined") { // checks if parameter was passed time = Math.max(duration, 2 * fadeTime); } else { time = defaultTime; } animation.start(); } property bool selfDestroying: false // whether this Toast will self-destroy when it is finished /** * Private */ id: root readonly property real defaultTime: 3000 property real time: defaultTime readonly property real fadeTime: 300 property real margin: 10 anchors { left: parent.left right: parent.right margins: margin } height: message.height + margin 
radius: margin opacity: 0 color: "#222222" Text { id: message color: "white" wrapMode: Text.Wrap horizontalAlignment: Text.AlignHCenter anchors { top: parent.top left: parent.left right: parent.right margins: margin / 2 } } SequentialAnimation on opacity { id: animation running: false NumberAnimation { to: .9 duration: fadeTime } PauseAnimation { duration: time - 2 * fadeTime } NumberAnimation { to: 0 duration: fadeTime } onRunningChanged: { if (!running && selfDestroying) { root.destroy(); } } } } ================================================ FILE: gpt4all-chat/qml/ToastManager.qml ================================================ /* * SPDX-License-Identifier: MIT * Source: https://gist.github.com/jonmcclung/bae669101d17b103e94790341301c129 * Adapted from StackOverflow: http://stackoverflow.com/questions/26879266/make-toast-in-android-by-qml */ import QtQuick 2.0 /** * @brief Manager that creates Toasts dynamically */ ListView { /** * Public */ /** * @brief Shows a Toast * * @param {string} text Text to show * @param {real} duration Duration to show in milliseconds, defaults to 3000 */ function show(text, duration=3000) { model.insert(0, {text: text, duration: duration}); } /** * Private */ id: root z: Infinity spacing: 5 anchors.fill: parent anchors.bottomMargin: 10 verticalLayoutDirection: ListView.BottomToTop interactive: false displaced: Transition { NumberAnimation { properties: "y" easing.type: Easing.InOutQuad } } delegate: Toast { Component.onCompleted: { if (typeof duration === "undefined") { show(text); } else { show(text, duration); } } } model: ListModel {id: model} } ================================================ FILE: gpt4all-chat/resources/gpt4all.rc ================================================ IDI_ICON1 ICON "gpt4all.ico" ================================================ FILE: gpt4all-chat/src/chat.cpp ================================================ #include "chat.h" #include "chatlistmodel.h" #include "network.h" #include "server.h" 
#include "tool.h" #include "toolcallparser.h" #include "toolmodel.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include using namespace ToolEnums; Chat::Chat(QObject *parent) : QObject(parent) , m_id(Network::globalInstance()->generateUniqueId()) , m_name(tr("New Chat")) , m_chatModel(new ChatModel(this)) , m_responseState(Chat::ResponseStopped) , m_creationDate(QDateTime::currentSecsSinceEpoch()) , m_llmodel(new ChatLLM(this)) , m_collectionModel(new LocalDocsCollectionsModel(this)) { connectLLM(); } Chat::Chat(server_tag_t, QObject *parent) : QObject(parent) , m_id(Network::globalInstance()->generateUniqueId()) , m_name(tr("Server Chat")) , m_chatModel(new ChatModel(this)) , m_responseState(Chat::ResponseStopped) , m_creationDate(QDateTime::currentSecsSinceEpoch()) , m_llmodel(new Server(this)) , m_isServer(true) , m_collectionModel(new LocalDocsCollectionsModel(this)) { connectLLM(); } Chat::~Chat() { delete m_llmodel; m_llmodel = nullptr; } void Chat::connectLLM() { // Should be in different threads connect(m_llmodel, &ChatLLM::modelLoadingPercentageChanged, this, &Chat::handleModelLoadingPercentageChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::responseChanged, this, &Chat::handleResponseChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::promptProcessing, this, &Chat::promptProcessing, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::generatingQuestions, this, &Chat::generatingQuestions, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::responseStopped, this, &Chat::responseStopped, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::modelLoadingError, this, &Chat::handleModelLoadingError, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::modelLoadingWarning, this, &Chat::modelLoadingWarning, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::generatedNameChanged, this, &Chat::generatedNameChanged, Qt::QueuedConnection); connect(m_llmodel, 
&ChatLLM::generatedQuestionFinished, this, &Chat::generatedQuestionFinished, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::reportSpeed, this, &Chat::handleTokenSpeedChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::loadedModelInfoChanged, this, &Chat::loadedModelInfoChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::databaseResultsChanged, this, &Chat::handleDatabaseResultsChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::modelInfoChanged, this, &Chat::handleModelChanged, Qt::QueuedConnection); connect(m_llmodel, &ChatLLM::trySwitchContextOfLoadedModelCompleted, this, &Chat::handleTrySwitchContextOfLoadedModelCompleted, Qt::QueuedConnection); connect(this, &Chat::promptRequested, m_llmodel, &ChatLLM::prompt, Qt::QueuedConnection); connect(this, &Chat::modelChangeRequested, m_llmodel, &ChatLLM::modelChangeRequested, Qt::QueuedConnection); connect(this, &Chat::loadDefaultModelRequested, m_llmodel, &ChatLLM::loadDefaultModel, Qt::QueuedConnection); connect(this, &Chat::generateNameRequested, m_llmodel, &ChatLLM::generateName, Qt::QueuedConnection); connect(this, &Chat::regenerateResponseRequested, m_llmodel, &ChatLLM::regenerateResponse, Qt::QueuedConnection); connect(this, &Chat::collectionListChanged, m_collectionModel, &LocalDocsCollectionsModel::setCollections); connect(ModelList::globalInstance(), &ModelList::modelInfoChanged, this, &Chat::handleModelInfoChanged); } void Chat::reset() { stopGenerating(); // Erase our current on disk representation as we're completely resetting the chat along with id ChatListModel::globalInstance()->removeChatFile(this); m_id = Network::globalInstance()->generateUniqueId(); emit idChanged(m_id); // NOTE: We deliberately do not reset the name or creation date to indicate that this was originally // an older chat that was reset for another purpose.
Resetting this data will lead to the chat // name label changing back to 'New Chat' and showing up in the chat model list as a 'New Chat' // further down in the list. This might surprise the user. In the future, we might get rid of // the "reset context" button in the UI. m_chatModel->clear(); m_needsSave = true; } void Chat::resetResponseState() { if (m_responseInProgress && m_responseState == Chat::LocalDocsRetrieval) return; m_generatedQuestions = QList(); emit generatedQuestionsChanged(); m_tokenSpeed = QString(); emit tokenSpeedChanged(); m_responseInProgress = true; m_responseState = m_collections.empty() ? Chat::PromptProcessing : Chat::LocalDocsRetrieval; emit responseInProgressChanged(); emit responseStateChanged(); } void Chat::newPromptResponsePair(const QString &prompt, const QList &attachedUrls) { QStringList attachedContexts; QList attachments; for (const QUrl &url : attachedUrls) { Q_ASSERT(url.isLocalFile()); const QString localFilePath = url.toLocalFile(); const QFileInfo info(localFilePath); Q_ASSERT( info.suffix().toLower() == "xlsx" || info.suffix().toLower() == "txt" || info.suffix().toLower() == "md" || info.suffix().toLower() == "rst" ); PromptAttachment attached; attached.url = url; QFile file(localFilePath); if (file.open(QIODevice::ReadOnly)) { attached.content = file.readAll(); file.close(); } else { qWarning() << "ERROR: Failed to open the attachment:" << localFilePath; continue; } attachments << attached; attachedContexts << attached.processedContent(); } QString promptPlusAttached = prompt; if (!attachedContexts.isEmpty()) promptPlusAttached = attachedContexts.join("\n\n") + "\n\n" + prompt; resetResponseState(); if (int count = m_chatModel->count()) m_chatModel->updateCurrentResponse(count - 1, false); m_chatModel->appendPrompt(prompt, attachments); m_chatModel->appendResponse(); emit promptRequested(m_collections); m_needsSave = true; } void Chat::regenerateResponse(int index) { resetResponseState(); emit 
regenerateResponseRequested(index); m_needsSave = true; } QVariant Chat::popPrompt(int index) { auto content = m_llmodel->popPrompt(index); m_needsSave = true; if (content) return *content; return QVariant::fromValue(nullptr); } void Chat::stopGenerating() { // In future if we have more than one tool we'll have to keep track of which tools are possibly // running, but for now we only have one Tool *toolInstance = ToolModel::globalInstance()->get(ToolCallConstants::CodeInterpreterFunction); Q_ASSERT(toolInstance); toolInstance->interrupt(); m_llmodel->stopGenerating(); } Chat::ResponseState Chat::responseState() const { return m_responseState; } void Chat::handleResponseChanged() { if (m_responseState != Chat::ResponseGeneration) { m_responseState = Chat::ResponseGeneration; emit responseStateChanged(); } } void Chat::handleModelLoadingPercentageChanged(float loadingPercentage) { if (m_shouldDeleteLater) deleteLater(); if (loadingPercentage == m_modelLoadingPercentage) return; bool wasLoading = isCurrentlyLoading(); bool wasLoaded = isModelLoaded(); m_modelLoadingPercentage = loadingPercentage; emit modelLoadingPercentageChanged(); if (isCurrentlyLoading() != wasLoading) emit isCurrentlyLoadingChanged(); if (isModelLoaded() != wasLoaded) emit isModelLoadedChanged(); } void Chat::promptProcessing() { m_responseState = !databaseResults().isEmpty() ? 
Chat::LocalDocsProcessing : Chat::PromptProcessing; emit responseStateChanged(); } void Chat::generatingQuestions() { m_responseState = Chat::GeneratingQuestions; emit responseStateChanged(); } void Chat::responseStopped(qint64 promptResponseMs) { m_tokenSpeed = QString(); emit tokenSpeedChanged(); m_responseInProgress = false; m_responseState = Chat::ResponseStopped; emit responseInProgressChanged(); emit responseStateChanged(); const QString possibleToolcall = m_chatModel->possibleToolcall(); Network::globalInstance()->trackChatEvent("response_stopped", { {"first", m_firstResponse}, {"message_count", chatModel()->count()}, {"$duration", promptResponseMs / 1000.}, }); ToolCallParser parser; parser.update(possibleToolcall.toUtf8()); if (parser.state() == ToolEnums::ParseState::Complete && parser.startTag() != ToolCallConstants::ThinkStartTag) processToolCall(parser.toolCall()); else responseComplete(); } void Chat::processToolCall(const QString &toolCall) { m_responseState = Chat::ToolCallGeneration; emit responseStateChanged(); // Regex to remove the formatting around the code static const QRegularExpression regex("^\\s*```javascript\\s*|\\s*```\\s*$"); QString code = toolCall; code.remove(regex); code = code.trimmed(); // Right now the code interpreter is the only available tool Tool *toolInstance = ToolModel::globalInstance()->get(ToolCallConstants::CodeInterpreterFunction); Q_ASSERT(toolInstance); connect(toolInstance, &Tool::runComplete, this, &Chat::toolCallComplete, Qt::SingleShotConnection); // The param is the code const ToolParam param = { "code", ToolEnums::ParamType::String, code }; m_responseInProgress = true; emit responseInProgressChanged(); toolInstance->run({param}); } void Chat::toolCallComplete(const ToolCallInfo &info) { // Update the current response with meta information about toolcall and re-parent m_chatModel->updateToolCall(info); ++m_consecutiveToolCalls; m_responseInProgress = false; emit responseInProgressChanged(); // We limit the 
number of consecutive toolcalls otherwise we get into a potentially endless loop if (m_consecutiveToolCalls < 3 || info.error == ToolEnums::Error::NoError) { resetResponseState(); emit promptRequested(m_collections); // triggers a new response return; } responseComplete(); } void Chat::responseComplete() { if (m_generatedName.isEmpty()) emit generateNameRequested(); m_responseState = Chat::ResponseStopped; emit responseStateChanged(); m_consecutiveToolCalls = 0; m_firstResponse = false; } ModelInfo Chat::modelInfo() const { return m_modelInfo; } void Chat::setModelInfo(const ModelInfo &modelInfo) { if (m_modelInfo != modelInfo) { m_modelInfo = modelInfo; m_needsSave = true; } else if (isModelLoaded()) return; emit modelInfoChanged(); emit modelChangeRequested(modelInfo); } void Chat::unloadAndDeleteLater() { if (!isModelLoaded()) { deleteLater(); return; } m_shouldDeleteLater = true; unloadModel(); } void Chat::markForDeletion() { m_llmodel->setMarkedForDeletion(true); } void Chat::unloadModel() { stopGenerating(); m_llmodel->setShouldBeLoaded(false); } void Chat::reloadModel() { m_llmodel->setShouldBeLoaded(true); } void Chat::forceUnloadModel() { stopGenerating(); m_llmodel->setForceUnloadModel(true); m_llmodel->setShouldBeLoaded(false); } void Chat::forceReloadModel() { m_llmodel->setForceUnloadModel(true); m_llmodel->setShouldBeLoaded(true); } void Chat::trySwitchContextOfLoadedModel() { m_trySwitchContextInProgress = 1; emit trySwitchContextInProgressChanged(); m_llmodel->requestTrySwitchContext(); } void Chat::generatedNameChanged(const QString &name) { m_generatedName = name; m_name = name; emit nameChanged(); m_needsSave = true; } void Chat::generatedQuestionFinished(const QString &question) { m_generatedQuestions << question; emit generatedQuestionsChanged(); m_needsSave = true; } void Chat::handleModelLoadingError(const QString &error) { if (!error.isEmpty()) { auto stream = qWarning().noquote() << "ERROR:" << error << "id"; stream.quote() << id(); } 
m_modelLoadingError = error; emit modelLoadingErrorChanged(); } void Chat::handleTokenSpeedChanged(const QString &tokenSpeed) { m_tokenSpeed = tokenSpeed; emit tokenSpeedChanged(); } QString Chat::deviceBackend() const { return m_llmodel->deviceBackend(); } QString Chat::device() const { return m_llmodel->device(); } QString Chat::fallbackReason() const { return m_llmodel->fallbackReason(); } void Chat::handleDatabaseResultsChanged(const QList &results) { m_databaseResults = results; m_needsSave = true; } // we need to notify listeners of the modelInfo property when its properties are updated, // since it's a gadget and can't do that on its own void Chat::handleModelInfoChanged(const ModelInfo &modelInfo) { if (!m_modelInfo.id().isNull() && modelInfo.id() == m_modelInfo.id()) emit modelInfoChanged(); } // react if a new model is loaded void Chat::handleModelChanged(const ModelInfo &modelInfo) { if (m_modelInfo == modelInfo) return; m_modelInfo = modelInfo; emit modelInfoChanged(); m_needsSave = true; } void Chat::handleTrySwitchContextOfLoadedModelCompleted(int value) { m_trySwitchContextInProgress = value; emit trySwitchContextInProgressChanged(); } bool Chat::serialize(QDataStream &stream, int version) const { stream << m_creationDate; stream << m_id; stream << m_name; stream << m_userName; if (version >= 5) stream << m_modelInfo.id(); else stream << m_modelInfo.filename(); if (version >= 3) stream << m_collections; if (!m_llmodel->serialize(stream, version)) return false; if (!m_chatModel->serialize(stream, version)) return false; return stream.status() == QDataStream::Ok; } bool Chat::deserialize(QDataStream &stream, int version) { stream >> m_creationDate; stream >> m_id; emit idChanged(m_id); stream >> m_name; stream >> m_userName; m_generatedName = QLatin1String("nonempty"); emit nameChanged(); QString modelId; stream >> modelId; if (version >= 5) { if (ModelList::globalInstance()->contains(modelId)) m_modelInfo = 
ModelList::globalInstance()->modelInfo(modelId); } else { if (ModelList::globalInstance()->containsByFilename(modelId)) m_modelInfo = ModelList::globalInstance()->modelInfoByFilename(modelId); } if (!m_modelInfo.id().isEmpty()) emit modelInfoChanged(); if (version >= 3) { stream >> m_collections; emit collectionListChanged(m_collections); } m_llmodel->setModelInfo(m_modelInfo); if (!m_llmodel->deserialize(stream, version)) return false; if (!m_chatModel->deserialize(stream, version)) return false; emit chatModelChanged(); if (stream.status() != QDataStream::Ok) return false; m_needsSave = false; return true; } QList Chat::collectionList() const { return m_collections; } bool Chat::hasCollection(const QString &collection) const { return m_collections.contains(collection); } void Chat::addCollection(const QString &collection) { if (hasCollection(collection)) return; m_collections.append(collection); emit collectionListChanged(m_collections); m_needsSave = true; } void Chat::removeCollection(const QString &collection) { if (!hasCollection(collection)) return; m_collections.removeAll(collection); emit collectionListChanged(m_collections); m_needsSave = true; } ================================================ FILE: gpt4all-chat/src/chat.h ================================================ #ifndef CHAT_H #define CHAT_H #include "chatllm.h" #include "chatmodel.h" #include "database.h" #include "localdocsmodel.h" #include "modellist.h" #include "tool.h" #include #include #include #include // IWYU pragma: keep #include #include // IWYU pragma: keep #include #include #include // IWYU pragma: no_forward_declare LocalDocsCollectionsModel // IWYU pragma: no_forward_declare ToolCallInfo class QDataStream; class Chat : public QObject { Q_OBJECT Q_PROPERTY(QString id READ id NOTIFY idChanged) Q_PROPERTY(QString name READ name WRITE setName NOTIFY nameChanged) Q_PROPERTY(ChatModel *chatModel READ chatModel NOTIFY chatModelChanged) Q_PROPERTY(bool isModelLoaded READ isModelLoaded 
NOTIFY isModelLoadedChanged) Q_PROPERTY(bool isCurrentlyLoading READ isCurrentlyLoading NOTIFY isCurrentlyLoadingChanged) Q_PROPERTY(float modelLoadingPercentage READ modelLoadingPercentage NOTIFY modelLoadingPercentageChanged) Q_PROPERTY(ModelInfo modelInfo READ modelInfo WRITE setModelInfo NOTIFY modelInfoChanged) Q_PROPERTY(bool responseInProgress READ responseInProgress NOTIFY responseInProgressChanged) Q_PROPERTY(bool isServer READ isServer NOTIFY isServerChanged) Q_PROPERTY(ResponseState responseState READ responseState NOTIFY responseStateChanged) Q_PROPERTY(QList collectionList READ collectionList NOTIFY collectionListChanged) Q_PROPERTY(QString modelLoadingError READ modelLoadingError NOTIFY modelLoadingErrorChanged) Q_PROPERTY(QString tokenSpeed READ tokenSpeed NOTIFY tokenSpeedChanged) Q_PROPERTY(QString deviceBackend READ deviceBackend NOTIFY loadedModelInfoChanged) Q_PROPERTY(QString device READ device NOTIFY loadedModelInfoChanged) Q_PROPERTY(QString fallbackReason READ fallbackReason NOTIFY loadedModelInfoChanged) Q_PROPERTY(LocalDocsCollectionsModel *collectionModel READ collectionModel NOTIFY collectionModelChanged) // 0=no, 1=waiting, 2=working Q_PROPERTY(int trySwitchContextInProgress READ trySwitchContextInProgress NOTIFY trySwitchContextInProgressChanged) Q_PROPERTY(QList generatedQuestions READ generatedQuestions NOTIFY generatedQuestionsChanged) QML_ELEMENT QML_UNCREATABLE("Only creatable from c++!") public: // tag for constructing a server chat struct server_tag_t { explicit server_tag_t() = default; }; static inline constexpr server_tag_t server_tag = server_tag_t(); enum ResponseState { ResponseStopped, LocalDocsRetrieval, LocalDocsProcessing, PromptProcessing, GeneratingQuestions, ResponseGeneration, ToolCallGeneration }; Q_ENUM(ResponseState) explicit Chat(QObject *parent = nullptr); explicit Chat(server_tag_t, QObject *parent = nullptr); virtual ~Chat(); void destroy() { m_llmodel->destroy(); } void connectLLM(); QString id() const { 
return m_id; } QString name() const { return m_userName.isEmpty() ? m_name : m_userName; } void setName(const QString &name) { m_userName = name; emit nameChanged(); m_needsSave = true; } ChatModel *chatModel() { return m_chatModel; } bool isNewChat() const { return m_name == tr("New Chat") && !m_chatModel->count(); } Q_INVOKABLE void reset(); bool isModelLoaded() const { return m_modelLoadingPercentage == 1.0f; } bool isCurrentlyLoading() const { return m_modelLoadingPercentage > 0.0f && m_modelLoadingPercentage < 1.0f; } float modelLoadingPercentage() const { return m_modelLoadingPercentage; } Q_INVOKABLE void newPromptResponsePair(const QString &prompt, const QList &attachedUrls = {}); Q_INVOKABLE void regenerateResponse(int index); Q_INVOKABLE QVariant popPrompt(int index); Q_INVOKABLE void stopGenerating(); QList databaseResults() const { return m_databaseResults; } bool responseInProgress() const { return m_responseInProgress; } ResponseState responseState() const; ModelInfo modelInfo() const; void setModelInfo(const ModelInfo &modelInfo); Q_INVOKABLE void unloadModel(); Q_INVOKABLE void reloadModel(); Q_INVOKABLE void forceUnloadModel(); Q_INVOKABLE void forceReloadModel(); Q_INVOKABLE void trySwitchContextOfLoadedModel(); void unloadAndDeleteLater(); void markForDeletion(); QDateTime creationDate() const { return QDateTime::fromSecsSinceEpoch(m_creationDate); } bool serialize(QDataStream &stream, int version) const; bool deserialize(QDataStream &stream, int version); bool isServer() const { return m_isServer; } QList collectionList() const; LocalDocsCollectionsModel *collectionModel() const { return m_collectionModel; } Q_INVOKABLE bool hasCollection(const QString &collection) const; Q_INVOKABLE void addCollection(const QString &collection); Q_INVOKABLE void removeCollection(const QString &collection); QString modelLoadingError() const { return m_modelLoadingError; } QString tokenSpeed() const { return m_tokenSpeed; } QString deviceBackend() const; QString 
device() const; // not loaded -> QString(), no fallback -> QString("") QString fallbackReason() const; int trySwitchContextInProgress() const { return m_trySwitchContextInProgress; } QList generatedQuestions() const { return m_generatedQuestions; } bool needsSave() const { return m_needsSave; } void setNeedsSave(bool n) { m_needsSave = n; } public Q_SLOTS: void resetResponseState(); Q_SIGNALS: void idChanged(const QString &id); void nameChanged(); void chatModelChanged(); void isModelLoadedChanged(); void isCurrentlyLoadingChanged(); void modelLoadingPercentageChanged(); void modelLoadingWarning(const QString &warning); void responseInProgressChanged(); void responseStateChanged(); void promptRequested(const QStringList &enabledCollections); void regenerateResponseRequested(int index); void resetResponseRequested(); void resetContextRequested(); void modelChangeRequested(const ModelInfo &modelInfo); void modelInfoChanged(); void loadDefaultModelRequested(); void generateNameRequested(); void modelLoadingErrorChanged(); void isServerChanged(); void collectionListChanged(const QList &collectionList); void tokenSpeedChanged(); void deviceChanged(); void fallbackReasonChanged(); void collectionModelChanged(); void trySwitchContextInProgressChanged(); void loadedModelInfoChanged(); void generatedQuestionsChanged(); private Q_SLOTS: void handleResponseChanged(); void handleModelLoadingPercentageChanged(float); void promptProcessing(); void generatingQuestions(); void responseStopped(qint64 promptResponseMs); void processToolCall(const QString &toolCall); void toolCallComplete(const ToolCallInfo &info); void responseComplete(); void generatedNameChanged(const QString &name); void generatedQuestionFinished(const QString &question); void handleModelLoadingError(const QString &error); void handleTokenSpeedChanged(const QString &tokenSpeed); void handleDatabaseResultsChanged(const QList &results); void handleModelInfoChanged(const ModelInfo &modelInfo); void 
handleModelChanged(const ModelInfo &modelInfo); void handleTrySwitchContextOfLoadedModelCompleted(int value); private: QString m_id; QString m_name; QString m_generatedName; QString m_userName; ModelInfo m_modelInfo; QString m_modelLoadingError; QString m_tokenSpeed; QString m_device; QString m_fallbackReason; QList m_collections; QList m_generatedQuestions; ChatModel *m_chatModel; bool m_responseInProgress = false; ResponseState m_responseState; qint64 m_creationDate; ChatLLM *m_llmodel; QList m_databaseResults; bool m_isServer = false; bool m_shouldDeleteLater = false; float m_modelLoadingPercentage = 0.0f; LocalDocsCollectionsModel *m_collectionModel; bool m_firstResponse = true; int m_trySwitchContextInProgress = 0; bool m_isCurrentlyLoading = false; // True if we need to serialize the chat to disk, because of one of two reasons: // - The chat was freshly created during this launch. // - The chat was changed after loading it from disk. bool m_needsSave = true; int m_consecutiveToolCalls = 0; }; #endif // CHAT_H ================================================ FILE: gpt4all-chat/src/chatapi.cpp ================================================ #include "chatapi.h" #include "utils.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include using namespace Qt::Literals::StringLiterals; //#define DEBUG ChatAPI::ChatAPI() : QObject(nullptr) , m_modelName("gpt-3.5-turbo") , m_requestURL("") , m_responseCallback(nullptr) { } size_t ChatAPI::requiredMem(const std::string &modelPath, int n_ctx, int ngl) { Q_UNUSED(modelPath); Q_UNUSED(n_ctx); Q_UNUSED(ngl); return 0; } bool ChatAPI::loadModel(const std::string &modelPath, int n_ctx, int ngl) { Q_UNUSED(modelPath); Q_UNUSED(n_ctx); Q_UNUSED(ngl); return true; } void ChatAPI::setThreadCount(int32_t n_threads) { Q_UNUSED(n_threads); } 
int32_t ChatAPI::threadCount() const { return 1; } ChatAPI::~ChatAPI() { } bool ChatAPI::isModelLoaded() const { return true; } static auto parsePrompt(QXmlStreamReader &xml) -> std::expected { QJsonArray messages; auto xmlError = [&xml] { return std::unexpected(u"%1:%2: %3"_s.arg(xml.lineNumber()).arg(xml.columnNumber()).arg(xml.errorString())); }; if (xml.hasError()) return xmlError(); if (xml.atEnd()) return messages; // skip header bool foundElement = false; do { switch (xml.readNext()) { using enum QXmlStreamReader::TokenType; case Invalid: return xmlError(); case EndDocument: return messages; default: foundElement = true; case StartDocument: case Comment: case DTD: case ProcessingInstruction: ; } } while (!foundElement); // document body loop bool foundRoot = false; for (;;) { switch (xml.tokenType()) { using enum QXmlStreamReader::TokenType; case StartElement: { auto name = xml.name(); if (!foundRoot) { if (name != "chat"_L1) return std::unexpected(u"unexpected tag: %1"_s.arg(name)); foundRoot = true; } else { if (name != "user"_L1 && name != "assistant"_L1 && name != "system"_L1) return std::unexpected(u"unknown role: %1"_s.arg(name)); auto content = xml.readElementText(); if (xml.tokenType() != EndElement) return xmlError(); messages << makeJsonObject({ { "role"_L1, name.toString().trimmed() }, { "content"_L1, content }, }); } break; } case Characters: if (!xml.isWhitespace()) return std::unexpected(u"unexpected text: %1"_s.arg(xml.text())); case Comment: case ProcessingInstruction: case EndElement: break; case EndDocument: return messages; case Invalid: return xmlError(); default: return std::unexpected(u"unexpected token: %1"_s.arg(xml.tokenString())); } xml.readNext(); } } void ChatAPI::prompt( std::string_view prompt, const PromptCallback &promptCallback, const ResponseCallback &responseCallback, const PromptContext &promptCtx ) { Q_UNUSED(promptCallback) if (!isModelLoaded()) throw std::invalid_argument("Attempted to prompt an unloaded model."); if 
(!promptCtx.n_predict) return; // nothing requested // FIXME: We don't set the max_tokens on purpose because in order to do so safely without encountering // an error we need to be able to count the tokens in our prompt. The only way to do this is to use // the OpenAI tiktoken library or to implement our own tokenization function that matches precisely // the tokenization used by the OpenAI model we're calling. OpenAI has not introduced any means of // using the REST API to count tokens in a prompt. auto root = makeJsonObject({ { "model"_L1, m_modelName }, { "stream"_L1, true }, { "temperature"_L1, promptCtx.temp }, { "top_p"_L1, promptCtx.top_p }, }); // conversation history { QUtf8StringView promptUtf8(prompt); QXmlStreamReader xml(promptUtf8); auto messages = parsePrompt(xml); if (!messages) { auto error = fmt::format("Failed to parse API model prompt: {}", messages.error()); qDebug().noquote() << "ChatAPI ERROR:" << error << "Prompt:\n\n" << promptUtf8 << '\n'; throw std::invalid_argument(error); } root.insert("messages"_L1, *messages); } QJsonDocument doc(root); #if defined(DEBUG) qDebug().noquote() << "ChatAPI::prompt begin network request" << doc.toJson(); #endif m_responseCallback = responseCallback; // The following code sets up a worker thread and object to perform the actual api request to // chatgpt and then blocks until it is finished QThread workerThread; ChatAPIWorker worker(this); worker.moveToThread(&workerThread); connect(&worker, &ChatAPIWorker::finished, &workerThread, &QThread::quit, Qt::DirectConnection); connect(this, &ChatAPI::request, &worker, &ChatAPIWorker::request, Qt::QueuedConnection); workerThread.start(); emit request(m_apiKey, doc.toJson(QJsonDocument::Compact)); workerThread.wait(); m_responseCallback = nullptr; #if defined(DEBUG) qDebug() << "ChatAPI::prompt end network request"; #endif } bool ChatAPI::callResponse(int32_t token, const std::string& string) { Q_ASSERT(m_responseCallback); if (!m_responseCallback) { std::cerr << 
"ChatAPI ERROR: no response callback!\n"; return false; } return m_responseCallback(token, string); } void ChatAPIWorker::request(const QString &apiKey, const QByteArray &array) { QUrl apiUrl(m_chat->url()); const QString authorization = u"Bearer %1"_s.arg(apiKey).trimmed(); QNetworkRequest request(apiUrl); request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json"); request.setRawHeader("Authorization", authorization.toUtf8()); #if defined(DEBUG) qDebug() << "ChatAPI::request" << "API URL: " << apiUrl.toString() << "Authorization: " << authorization.toUtf8(); #endif m_networkManager = new QNetworkAccessManager(this); QNetworkReply *reply = m_networkManager->post(request, array); connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort); connect(reply, &QNetworkReply::finished, this, &ChatAPIWorker::handleFinished); connect(reply, &QNetworkReply::readyRead, this, &ChatAPIWorker::handleReadyRead); connect(reply, &QNetworkReply::errorOccurred, this, &ChatAPIWorker::handleErrorOccurred); } void ChatAPIWorker::handleFinished() { QNetworkReply *reply = qobject_cast(sender()); if (!reply) { emit finished(); return; } QVariant response = reply->attribute(QNetworkRequest::HttpStatusCodeAttribute); if (!response.isValid()) { m_chat->callResponse( -1, tr("ERROR: Network error occurred while connecting to the API server") .toStdString() ); return; } bool ok; int code = response.toInt(&ok); if (!ok || code != 200) { bool isReplyEmpty(reply->readAll().isEmpty()); if (isReplyEmpty) m_chat->callResponse( -1, tr("ChatAPIWorker::handleFinished got HTTP Error %1 %2") .arg(code) .arg(reply->errorString()) .toStdString() ); qWarning().noquote() << "ERROR: ChatAPIWorker::handleFinished got HTTP Error" << code << "response:" << reply->errorString(); } reply->deleteLater(); emit finished(); } void ChatAPIWorker::handleReadyRead() { QNetworkReply *reply = qobject_cast(sender()); if (!reply) { emit finished(); return; } QVariant response = 
reply->attribute(QNetworkRequest::HttpStatusCodeAttribute); if (!response.isValid()) return; bool ok; int code = response.toInt(&ok); if (!ok || code != 200) { m_chat->callResponse( -1, u"ERROR: ChatAPIWorker::handleReadyRead got HTTP Error %1 %2: %3"_s .arg(code).arg(reply->errorString(), reply->readAll()).toStdString() ); emit finished(); return; } while (reply->canReadLine()) { QString jsonData = reply->readLine().trimmed(); if (jsonData.startsWith("data:")) jsonData.remove(0, 5); jsonData = jsonData.trimmed(); if (jsonData.isEmpty()) continue; if (jsonData == "[DONE]") continue; #if defined(DEBUG) qDebug().noquote() << "line" << jsonData; #endif QJsonParseError err; const QJsonDocument document = QJsonDocument::fromJson(jsonData.toUtf8(), &err); if (err.error != QJsonParseError::NoError) { m_chat->callResponse(-1, u"ERROR: ChatAPI responded with invalid json \"%1\""_s .arg(err.errorString()).toStdString()); continue; } const QJsonObject root = document.object(); const QJsonArray choices = root.value("choices").toArray(); const QJsonObject choice = choices.first().toObject(); const QJsonObject delta = choice.value("delta").toObject(); const QString content = delta.value("content").toString(); m_currentResponse += content; if (!m_chat->callResponse(0, content.toStdString())) { reply->abort(); emit finished(); return; } } } void ChatAPIWorker::handleErrorOccurred(QNetworkReply::NetworkError code) { QNetworkReply *reply = qobject_cast(sender()); if (!reply || reply->error() == QNetworkReply::OperationCanceledError /*when we call abort on purpose*/) { emit finished(); return; } qWarning().noquote() << "ERROR: ChatAPIWorker::handleErrorOccurred got HTTP Error" << code << "response:" << reply->errorString(); emit finished(); } ================================================ FILE: gpt4all-chat/src/chatapi.h ================================================ #ifndef CHATAPI_H #define CHATAPI_H #include #include #include #include #include #include #include #include 
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// IWYU pragma: no_forward_declare QByteArray
class ChatAPI;
class QNetworkAccessManager;

class ChatAPIWorker : public QObject {
    Q_OBJECT
public:
    ChatAPIWorker(ChatAPI *chatAPI)
        : QObject(nullptr)
        , m_networkManager(nullptr)
        , m_chat(chatAPI) {}
    virtual ~ChatAPIWorker() {}

    QString currentResponse() const { return m_currentResponse; }

    void request(const QString &apiKey, const QByteArray &array);

Q_SIGNALS:
    void finished();

private Q_SLOTS:
    void handleFinished();
    void handleReadyRead();
    void handleErrorOccurred(QNetworkReply::NetworkError code);

private:
    ChatAPI *m_chat;
    QNetworkAccessManager *m_networkManager;
    QString m_currentResponse;
};

class ChatAPI : public QObject, public LLModel {
    Q_OBJECT
public:
    ChatAPI();
    virtual ~ChatAPI();

    bool supportsEmbedding() const override { return false; }
    bool supportsCompletion() const override { return true; }
    bool loadModel(const std::string &modelPath, int n_ctx, int ngl) override;
    bool isModelLoaded() const override;
    size_t requiredMem(const std::string &modelPath, int n_ctx, int ngl) override;

    // All three of the state virtual functions are handled custom inside of chatllm save/restore
    size_t stateSize() const override
    { throwNotImplemented(); }
    size_t saveState(std::span<char> stateOut, std::vector<Token> &inputTokensOut) const override
    { Q_UNUSED(stateOut); Q_UNUSED(inputTokensOut); throwNotImplemented(); }
    size_t restoreState(std::span<const char> state, std::span<const Token> inputTokens) override
    { Q_UNUSED(state); Q_UNUSED(inputTokens); throwNotImplemented(); }

    void prompt(std::string_view prompt,
                const PromptCallback &promptCallback,
                const ResponseCallback &responseCallback,
                const PromptContext &ctx) override;

    [[noreturn]] int32_t countPromptTokens(std::string_view prompt) const override
    { Q_UNUSED(prompt); throwNotImplemented(); }

    void setThreadCount(int32_t n_threads) override;
    int32_t threadCount() const override;

    void setModelName(const QString &modelName) { m_modelName = modelName; }
    void setAPIKey(const QString &apiKey) { m_apiKey = apiKey; }
    void setRequestURL(const QString &requestURL) { m_requestURL = requestURL; }
    QString url() const { return m_requestURL; }

    bool callResponse(int32_t token, const std::string &string);

    [[noreturn]] int32_t contextLength() const override
    { throwNotImplemented(); }

    auto specialTokens() -> std::unordered_map<std::string, std::string> const override
    { return {}; }

Q_SIGNALS:
    void request(const QString &apiKey, const QByteArray &array);

protected:
    // We have to implement these as they are pure virtual in base class, but we don't actually use
    // them as they are only called from the default implementation of 'prompt' which we override and
    // completely replace
    [[noreturn]] static void throwNotImplemented() { throw std::logic_error("not implemented"); }

    [[noreturn]] std::vector<Token> tokenize(std::string_view str) const override
    { Q_UNUSED(str); throwNotImplemented(); }
    [[noreturn]] bool isSpecialToken(Token id) const override
    { Q_UNUSED(id); throwNotImplemented(); }
    [[noreturn]] std::string tokenToString(Token id) const override
    { Q_UNUSED(id); throwNotImplemented(); }
    [[noreturn]] void initSampler(const PromptContext &ctx) override
    { Q_UNUSED(ctx); throwNotImplemented(); }
    [[noreturn]] Token sampleToken() const override
    { throwNotImplemented(); }
    [[noreturn]] bool evalTokens(int32_t nPast, std::span<const Token> tokens) const override
    { Q_UNUSED(nPast); Q_UNUSED(tokens); throwNotImplemented(); }
    [[noreturn]] void shiftContext(const PromptContext &promptCtx, int32_t *nPast) override
    { Q_UNUSED(promptCtx); Q_UNUSED(nPast); throwNotImplemented(); }
    [[noreturn]] int32_t inputLength() const override
    { throwNotImplemented(); }
    [[noreturn]] int32_t computeModelInputPosition(std::span<const Token> input) const override
    { Q_UNUSED(input); throwNotImplemented(); }
    [[noreturn]] void setModelInputPosition(int32_t pos) override
    { Q_UNUSED(pos); throwNotImplemented(); }
    [[noreturn]] void appendInputToken(Token tok) override
    { Q_UNUSED(tok); throwNotImplemented(); }
    [[noreturn]] const std::vector<Token> &endTokens() const override
    { throwNotImplemented(); }
    [[noreturn]] bool shouldAddBOS() const override
    { throwNotImplemented(); }
    [[noreturn]] std::span<const Token> inputTokens() const override
    { throwNotImplemented(); }

private:
    ResponseCallback m_responseCallback;
    QString m_modelName;
    QString m_apiKey;
    QString m_requestURL;
};

#endif // CHATAPI_H

================================================
FILE: gpt4all-chat/src/chatlistmodel.cpp
================================================
#include "chatlistmodel.h"

#include "mysettings.h"

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include // IWYU pragma: keep
#include
#include
#include

static constexpr quint32 CHAT_FORMAT_MAGIC = 0xF5D553CC;
static constexpr qint32 CHAT_FORMAT_VERSION = 12;

class MyChatListModel: public ChatListModel { };
Q_GLOBAL_STATIC(MyChatListModel, chatListModelInstance)
ChatListModel *ChatListModel::globalInstance()
{
    return chatListModelInstance();
}

ChatListModel::ChatListModel()
    : QAbstractListModel(nullptr)
{
    QCoreApplication::instance()->installEventFilter(this);
}

bool ChatListModel::eventFilter(QObject *obj, QEvent *ev)
{
    if (obj == QCoreApplication::instance() && ev->type() == QEvent::LanguageChange)
        emit dataChanged(index(0, 0), index(m_chats.size() - 1, 0));
    return false;
}

void ChatListModel::loadChats()
{
    addChat();

    ChatsRestoreThread *thread = new ChatsRestoreThread;
    connect(thread, &ChatsRestoreThread::chatRestored, this, &ChatListModel::restoreChat, Qt::QueuedConnection);
    connect(thread, &ChatsRestoreThread::finished, this, &ChatListModel::chatsRestoredFinished, Qt::QueuedConnection);
    connect(thread, &ChatsRestoreThread::finished, thread, &QObject::deleteLater);
    thread->start();

    m_chatSaver = std::make_unique<ChatSaver>();
    connect(this, &ChatListModel::requestSaveChats, m_chatSaver.get(), &ChatSaver::saveChats, Qt::QueuedConnection);
    connect(m_chatSaver.get(), &ChatSaver::saveChatsFinished, this, &ChatListModel::saveChatsFinished,
        Qt::QueuedConnection);

    // save chats on application quit
    connect(QCoreApplication::instance(), &QCoreApplication::aboutToQuit, this, &ChatListModel::saveChatsSync);

    connect(MySettings::globalInstance(), &MySettings::serverChatChanged, this,
        &ChatListModel::handleServerEnabledChanged);
}

void ChatListModel::removeChatFile(Chat *chat) const
{
    Q_ASSERT(chat != m_serverChat);
    const QString savePath = MySettings::globalInstance()->modelPath();
    QFile file(savePath + "/gpt4all-" + chat->id() + ".chat");
    if (!file.exists())
        return;
    bool success = file.remove();
    if (!success)
        qWarning() << "ERROR: Couldn't remove chat file:" << file.fileName();
}

ChatSaver::ChatSaver()
    : QObject(nullptr)
{
    moveToThread(&m_thread);
    m_thread.start();
}

ChatSaver::~ChatSaver()
{
    m_thread.quit();
    m_thread.wait();
}

QVector<Chat *> ChatListModel::getChatsToSave() const
{
    QVector<Chat *> toSave;
    for (auto *chat : m_chats)
        if (chat != m_serverChat && !chat->isNewChat())
            toSave << chat;
    return toSave;
}

void ChatListModel::saveChats()
{
    auto toSave = getChatsToSave();
    if (toSave.isEmpty()) {
        emit saveChatsFinished();
        return;
    }
    emit requestSaveChats(toSave);
}

void ChatListModel::saveChatsForQuit()
{
    saveChats();
    m_startedFinalSave = true;
}

void ChatListModel::saveChatsSync()
{
    auto toSave = getChatsToSave();
    if (!m_startedFinalSave && !toSave.isEmpty())
        m_chatSaver->saveChats(toSave);
}

void ChatSaver::saveChats(const QVector<Chat *> &chats)
{
    // we can be called from the main thread instead of a worker thread at quit time, so take a lock
    QMutexLocker locker(&m_mutex);
    QElapsedTimer timer;
    timer.start();
    const QString savePath = MySettings::globalInstance()->modelPath();
    qsizetype nSavedChats = 0;
    for (Chat *chat : chats) {
        if (!chat->needsSave())
            continue;
        ++nSavedChats;
        QString fileName = "gpt4all-" + chat->id() + ".chat";
        QString filePath = savePath + "/" + fileName;
        QFile originalFile(filePath);
        QFile tempFile(filePath + ".tmp"); // Temporary file
        bool success = tempFile.open(QIODevice::WriteOnly);
        if (!success) {
            qWarning() << "ERROR: Couldn't save chat to temporary file:" << tempFile.fileName();
            continue;
        }
        QDataStream out(&tempFile);
        out << CHAT_FORMAT_MAGIC;
        out << CHAT_FORMAT_VERSION;
        out.setVersion(QDataStream::Qt_6_2);
        qDebug() << "serializing chat" << fileName;
        if (!chat->serialize(out, CHAT_FORMAT_VERSION)) {
            qWarning() << "ERROR: Couldn't serialize chat to file:" << tempFile.fileName();
            tempFile.remove();
            continue;
        }
        chat->setNeedsSave(false);
        if (originalFile.exists())
            originalFile.remove();
        tempFile.rename(filePath);
    }
    qint64 elapsedTime = timer.elapsed();
    qDebug() << "serializing chats took" << elapsedTime << "ms, saved" << nSavedChats << "/" << chats.size()
             << "chats";
    emit saveChatsFinished();
}

void ChatsRestoreThread::run()
{
    QElapsedTimer timer;
    timer.start();
    struct FileInfo {
        bool oldFile;
        qint64 creationDate;
        QString file;
    };
    QList<FileInfo> files;
    {
        // Look for any files in the original spot which was the settings config directory
        QSettings settings;
        QFileInfo settingsInfo(settings.fileName());
        QString settingsPath = settingsInfo.absolutePath();
        QDir dir(settingsPath);
        dir.setNameFilters(QStringList() << "gpt4all-*.chat");
        QStringList fileNames = dir.entryList();
        for (const QString &f : fileNames) {
            QString filePath = settingsPath + "/" + f;
            QFile file(filePath);
            bool success = file.open(QIODevice::ReadOnly);
            if (!success) {
                qWarning() << "ERROR: Couldn't restore chat from file:" << file.fileName();
                continue;
            }
            QDataStream in(&file);
            FileInfo info;
            info.oldFile = true;
            info.file = filePath;
            in >> info.creationDate;
            files.append(info);
            file.close();
        }
    }
    {
        const QString savePath = MySettings::globalInstance()->modelPath();
        QDir dir(savePath);
        dir.setNameFilters(QStringList() << "gpt4all-*.chat");
        QStringList fileNames = dir.entryList();
        for (const QString &f : fileNames) {
            QString filePath = savePath + "/" + f;
            QFile file(filePath);
            bool success = file.open(QIODevice::ReadOnly);
            if (!success) {
                qWarning() << "ERROR: Couldn't restore chat from file:" << file.fileName();
                continue;
            }
            QDataStream in(&file);
            // Read and check the header
            quint32 magic;
            in >> magic;
            if (magic != CHAT_FORMAT_MAGIC) {
                qWarning() << "ERROR: Chat file has bad magic:" << file.fileName();
                continue;
            }
            // Read the version
            qint32 version;
            in >> version;
            if (version < 1) {
                qWarning() << "WARNING: Chat file version" << version << "is not supported:" << file.fileName();
                continue;
            }
            if (version > CHAT_FORMAT_VERSION) {
                qWarning().nospace() << "WARNING: Chat file is from a future version (have " << version << " want "
                                     << CHAT_FORMAT_VERSION << "): " << file.fileName();
                continue;
            }
            if (version < 2)
                in.setVersion(QDataStream::Qt_6_2);
            FileInfo info;
            info.oldFile = false;
            info.file = filePath;
            in >> info.creationDate;
            files.append(info);
            file.close();
        }
    }
    std::sort(files.begin(), files.end(), [](const FileInfo &a, const FileInfo &b) {
        return a.creationDate > b.creationDate;
    });

    for (FileInfo &f : files) {
        QFile file(f.file);
        bool success = file.open(QIODevice::ReadOnly);
        if (!success) {
            qWarning() << "ERROR: Couldn't restore chat from file:" << file.fileName();
            continue;
        }
        QDataStream in(&file);
        qint32 version = 0;
        if (!f.oldFile) {
            // Read and check the header
            quint32 magic;
            in >> magic;
            if (magic != CHAT_FORMAT_MAGIC) {
                qWarning() << "ERROR: Chat file has bad magic:" << file.fileName();
                continue;
            }
            // Read the version
            in >> version;
            if (version < 1) {
                qWarning() << "ERROR: Chat file has non supported version:" << file.fileName();
                continue;
            }
            if (version < 2)
                in.setVersion(QDataStream::Qt_6_2);
        }
        qDebug() << "deserializing chat" << f.file;
        auto chat = std::make_unique<Chat>();
        chat->moveToThread(qGuiApp->thread());
        bool ok = chat->deserialize(in, version);
        if (!ok) {
            qWarning() << "ERROR: Couldn't deserialize chat from file:" << file.fileName();
        } else if (!in.atEnd()) {
            qWarning().nospace() << "error loading chat from " << file.fileName() << ": extra data at end of file";
        } else {
            emit chatRestored(chat.release());
        }
        if (f.oldFile)
            file.remove(); // No longer storing in this directory
        file.close();
    }
    qint64 elapsedTime = timer.elapsed();
    qDebug() << "deserializing chats took:" << elapsedTime << "ms";
}

void ChatListModel::restoreChat(Chat *chat)
{
    chat->setParent(this);
    connect(chat, &Chat::nameChanged, this, &ChatListModel::nameChanged);
    beginInsertRows(QModelIndex(), m_chats.size(), m_chats.size());
    m_chats.append(chat);
    endInsertRows();
}

void ChatListModel::chatsRestoredFinished()
{
    addServerChat();
}

void ChatListModel::handleServerEnabledChanged()
{
    if (MySettings::globalInstance()->serverChat() || m_serverChat != m_currentChat)
        return;
    Chat *nextChat = get(0);
    Q_ASSERT(nextChat && nextChat != m_serverChat);
    setCurrentChat(nextChat);
}

================================================
FILE: gpt4all-chat/src/chatlistmodel.h
================================================
#ifndef CHATLISTMODEL_H
#define CHATLISTMODEL_H

#include "chat.h"
#include "chatllm.h"
#include "chatmodel.h"

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include // IWYU pragma: keep
#include
#include
#include
#include
#include

class ChatsRestoreThread : public QThread
{
    Q_OBJECT
public:
    void run() override;

Q_SIGNALS:
    void chatRestored(Chat *chat);
};

class ChatSaver : public QObject
{
    Q_OBJECT
public:
    explicit ChatSaver();
    ~ChatSaver() override;

Q_SIGNALS:
    void saveChatsFinished();

public Q_SLOTS:
    void saveChats(const QVector<Chat *> &chats);

private:
    QThread m_thread;
    QMutex m_mutex;
};

class ChatListModel : public QAbstractListModel
{
    Q_OBJECT
    Q_PROPERTY(int count READ count NOTIFY countChanged)
    Q_PROPERTY(Chat *currentChat READ currentChat WRITE setCurrentChat NOTIFY currentChatChanged)

public:
    static ChatListModel *globalInstance();

    enum Roles {
        IdRole = Qt::UserRole + 1,
        NameRole,
        SectionRole
    };

    int rowCount(const QModelIndex &parent = QModelIndex()) const override
    {
        Q_UNUSED(parent)
        return m_chats.size();
    }

    QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const
override
    {
        if (!index.isValid() || index.row() < 0 || index.row() >= m_chats.size())
            return QVariant();
        const Chat *item = m_chats.at(index.row());
        switch (role) {
        case IdRole:
            return item->id();
        case NameRole:
            return item->name();
        case SectionRole:
            {
                if (item == m_serverChat)
                    return QString();
                const QDate date = QDate::currentDate();
                const QDate itemDate = item->creationDate().date();
                if (date == itemDate)
                    return tr("TODAY");
                else if (itemDate >= date.addDays(-7))
                    return tr("THIS WEEK");
                else if (itemDate >= date.addMonths(-1))
                    return tr("THIS MONTH");
                else if (itemDate >= date.addMonths(-6))
                    return tr("LAST SIX MONTHS");
                else if (itemDate.year() == date.year())
                    return tr("THIS YEAR");
                else if (itemDate.year() == date.year() - 1)
                    return tr("LAST YEAR");
                else
                    return QString::number(itemDate.year());
            }
        }
        return QVariant();
    }

    QHash<int, QByteArray> roleNames() const override
    {
        QHash<int, QByteArray> roles;
        roles[IdRole] = "id";
        roles[NameRole] = "name";
        roles[SectionRole] = "section";
        return roles;
    }

    bool shouldSaveChats() const;
    void setShouldSaveChats(bool b);
    bool shouldSaveChatGPTChats() const;
    void setShouldSaveChatGPTChats(bool b);

    Q_INVOKABLE void loadChats();

    Q_INVOKABLE void addChat()
    {
        // Select the existing new chat if we already have one
        if (m_newChat) {
            setCurrentChat(m_newChat);
            return;
        }

        // Create a new chat pointer and connect it to determine when it is populated
        m_newChat = new Chat(this);
        connect(m_newChat->chatModel(), &ChatModel::countChanged,
            this, &ChatListModel::newChatCountChanged);
        connect(m_newChat, &Chat::nameChanged,
            this, &ChatListModel::nameChanged);

        beginInsertRows(QModelIndex(), 0, 0);
        m_chats.prepend(m_newChat);
        endInsertRows();
        emit countChanged();
        setCurrentChat(m_newChat);
    }

    Q_INVOKABLE void addServerChat()
    {
        // Create a new dummy chat pointer and don't connect it
        if (m_serverChat)
            return;

        m_serverChat = new Chat(Chat::server_tag, this);
        beginInsertRows(QModelIndex(), m_chats.size(), m_chats.size());
        m_chats.append(m_serverChat);
        endInsertRows();
        emit countChanged();
    }

    Q_INVOKABLE void removeChat(Chat* chat)
    {
        Q_ASSERT(chat != m_serverChat);
        if (!m_chats.contains(chat)) {
            qWarning() << "WARNING: Removing chat failed with id" << chat->id();
            return;
        }

        removeChatFile(chat);

        if (chat == m_newChat) {
            m_newChat->disconnect(this);
            m_newChat = nullptr;
        }

        chat->markForDeletion();

        const int index = m_chats.indexOf(chat);
        if (m_chats.count() < 3 /*m_serverChat included*/) {
            addChat();
        } else {
            int nextIndex;
            if (index == m_chats.count() - 2 /*m_serverChat is last*/)
                nextIndex = index - 1;
            else
                nextIndex = index + 1;
            Chat *nextChat = get(nextIndex);
            Q_ASSERT(nextChat);
            setCurrentChat(nextChat);
        }

        const int newIndex = m_chats.indexOf(chat);
        beginRemoveRows(QModelIndex(), newIndex, newIndex);
        m_chats.removeAll(chat);
        endRemoveRows();
        chat->unloadAndDeleteLater();
    }

    Chat *currentChat() const { return m_currentChat; }

    void setCurrentChat(Chat *chat)
    {
        if (!m_chats.contains(chat)) {
            qWarning() << "ERROR: Setting current chat failed with id" << chat->id();
            return;
        }

        if (m_currentChat && m_currentChat != m_serverChat)
            m_currentChat->unloadModel();

        m_currentChat = chat;
        emit currentChatChanged();
        if (!m_currentChat->isModelLoaded() && m_currentChat != m_serverChat)
            m_currentChat->trySwitchContextOfLoadedModel();
    }

    Q_INVOKABLE Chat* get(int index)
    {
        if (index < 0 || index >= m_chats.size()) return nullptr;
        return m_chats.at(index);
    }

    int count() const { return m_chats.size(); }

    // stop ChatLLM threads for clean shutdown
    void destroyChats()
    {
        for (auto *chat: m_chats) { chat->destroy(); }
        ChatLLM::destroyStore();
    }

    void removeChatFile(Chat *chat) const;
    Q_INVOKABLE void saveChats();
    Q_INVOKABLE void saveChatsForQuit();
    void restoreChat(Chat *chat);
    void chatsRestoredFinished();

public Q_SLOTS:
    void handleServerEnabledChanged();

Q_SIGNALS:
    void countChanged();
    void currentChatChanged();
    void requestSaveChats(const QVector<Chat *> &);
    void saveChatsFinished();

protected:
    bool eventFilter(QObject *obj, QEvent *ev) override;

private Q_SLOTS:
    // Used with QCoreApplication::aboutToQuit. Does not require an event loop.
    void saveChatsSync();

    void newChatCountChanged()
    {
        Q_ASSERT(m_newChat && m_newChat->chatModel()->count());
        m_newChat->chatModel()->disconnect(this);
        m_newChat = nullptr;
    }

    void nameChanged()
    {
        Chat *chat = qobject_cast<Chat *>(sender());
        if (!chat)
            return;
        int row = m_chats.indexOf(chat);
        if (row < 0 || row >= m_chats.size())
            return;
        QModelIndex index = createIndex(row, 0);
        emit dataChanged(index, index, {NameRole});
    }

    void printChats()
    {
        for (auto c : m_chats) {
            qDebug() << c->name()
                << (c == m_currentChat ? "currentChat: true" : "currentChat: false")
                << (c == m_newChat ? "newChat: true" : "newChat: false");
        }
    }

private:
    QVector<Chat *> getChatsToSave() const;

private:
    Chat* m_newChat = nullptr;
    Chat* m_serverChat = nullptr;
    Chat* m_currentChat = nullptr;
    QList<Chat *> m_chats;
    std::unique_ptr<ChatSaver> m_chatSaver;
    bool m_startedFinalSave = false;

private:
    explicit ChatListModel();
    ~ChatListModel() {}
    friend class MyChatListModel;
};

#endif // CHATLISTMODEL_H

================================================
FILE: gpt4all-chat/src/chatllm.cpp
================================================
#include "chatllm.h"

#include "chat.h"
#include "chatapi.h"
#include "chatmodel.h"
#include "jinja_helpers.h"
#include "localdocs.h"
#include "mysettings.h"
#include "network.h"
#include "tool.h"
#include "toolmodel.h"
#include "toolcallparser.h"

#include
#include
#include
#include
#include
#include
#include
#include
#include // IWYU pragma: keep
#include
#include
#include
#include
#include // IWYU pragma: keep
#include // IWYU pragma: keep
#include // IWYU pragma: keep
#include // IWYU pragma: keep
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include // IWYU pragma: keep
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

using namespace Qt::Literals::StringLiterals;
using namespace ToolEnums;
namespace ranges = std::ranges;
using json = nlohmann::ordered_json;

//#define DEBUG
//#define DEBUG_MODEL_LOADING

// NOTE: not threadsafe
static const std::shared_ptr<minja::Context> &jinjaEnv()
{
    static std::shared_ptr<minja::Context> environment;
    if (!environment) {
        environment = minja::Context::builtins();
        environment->set("strftime_now", minja::simple_function(
            "strftime_now", { "format" },
            [](const std::shared_ptr<minja::Context> &, minja::Value &args) -> minja::Value {
                auto format = args.at("format").get<std::string>();
                using Clock = std::chrono::system_clock;
                time_t nowUnix = Clock::to_time_t(Clock::now());
                auto localDate = *std::localtime(&nowUnix);
                std::ostringstream ss;
                ss << std::put_time(&localDate, format.c_str());
                return ss.str();
            }
        ));
        environment->set("regex_replace", minja::simple_function(
            "regex_replace", { "str", "pattern", "repl" },
            [](const std::shared_ptr<minja::Context> &, minja::Value &args) -> minja::Value {
                auto str     = args.at("str"    ).get<std::string>();
                auto pattern = args.at("pattern").get<std::string>();
                auto repl    = args.at("repl"   ).get<std::string>();
                return std::regex_replace(str, std::regex(pattern), repl);
            }
        ));
    }
    return environment;
}

class BaseResponseHandler {
public:
    virtual void onSplitIntoTwo  (const QString &startTag, const QString &firstBuffer, const QString &secondBuffer) = 0;
    virtual void onSplitIntoThree(const QString &secondBuffer, const QString &thirdBuffer) = 0;
    // "old-style" responses, with all of the implementation details left in
    virtual void onOldResponseChunk(const QByteArray &chunk) = 0;
    // notify of a "new-style" response that has been cleaned of tool calling
    virtual bool onBufferResponse(const QString &response, int bufferIdx) = 0;
    // notify of a "new-style" response, no tool calling applicable
    virtual bool onRegularResponse() = 0;
    virtual bool getStopGenerating() const = 0;
};

static auto promptModelWithTools(
    LLModel *model, const LLModel::PromptCallback &promptCallback, BaseResponseHandler &respHandler,
    const LLModel::PromptContext &ctx, const QByteArray &prompt, const QStringList &toolNames
) -> std::pair<QStringList, bool>
{
    ToolCallParser toolCallParser(toolNames);
    auto handleResponse = [&toolCallParser, &respHandler](LLModel::Token token, std::string_view piece) -> bool {
        Q_UNUSED(token)

        toolCallParser.update(piece.data());

        // Split the response into two if needed
        if (toolCallParser.numberOfBuffers() < 2 && toolCallParser.splitIfPossible()) {
            const auto parseBuffers = toolCallParser.buffers();
            Q_ASSERT(parseBuffers.size() == 2);
            respHandler.onSplitIntoTwo(toolCallParser.startTag(), parseBuffers.at(0), parseBuffers.at(1));
        }

        // Split the response into three if needed
        if (toolCallParser.numberOfBuffers() < 3 && toolCallParser.startTag() == ToolCallConstants::ThinkStartTag
            && toolCallParser.splitIfPossible()) {
            const auto parseBuffers = toolCallParser.buffers();
            Q_ASSERT(parseBuffers.size() == 3);
            respHandler.onSplitIntoThree(parseBuffers.at(1), parseBuffers.at(2));
        }

        respHandler.onOldResponseChunk(QByteArray::fromRawData(piece.data(), piece.size()));

        bool ok;
        const auto parseBuffers = toolCallParser.buffers();
        if (parseBuffers.size() > 1) {
            ok = respHandler.onBufferResponse(parseBuffers.last(), parseBuffers.size() - 1);
        } else {
            ok = respHandler.onRegularResponse();
        }
        if (!ok)
            return false;

        const bool shouldExecuteToolCall = toolCallParser.state() == ToolEnums::ParseState::Complete
            && toolCallParser.startTag() != ToolCallConstants::ThinkStartTag;

        return !shouldExecuteToolCall && !respHandler.getStopGenerating();
    };
    model->prompt(std::string_view(prompt), promptCallback, handleResponse, ctx);

    const bool shouldExecuteToolCall = toolCallParser.state() == ToolEnums::ParseState::Complete
        && toolCallParser.startTag() != ToolCallConstants::ThinkStartTag;

    return { toolCallParser.buffers(), shouldExecuteToolCall };
}

class LLModelStore {
public:
    static LLModelStore *globalInstance();

    LLModelInfo acquireModel(); // will block until llmodel is ready
    void releaseModel(LLModelInfo &&info); // must be called when you are done
    void destroy();

private:
    LLModelStore()
    {
        // seed with empty model
        m_availableModel = LLModelInfo();
    }
    ~LLModelStore() {}
    std::optional<LLModelInfo> m_availableModel;
    QMutex m_mutex;
    QWaitCondition m_condition;
    friend class MyLLModelStore;
};

class MyLLModelStore : public LLModelStore { };
Q_GLOBAL_STATIC(MyLLModelStore, storeInstance)
LLModelStore *LLModelStore::globalInstance()
{
    return storeInstance();
}

LLModelInfo LLModelStore::acquireModel()
{
    QMutexLocker locker(&m_mutex);
    while (!m_availableModel)
        m_condition.wait(locker.mutex());
    auto first = std::move(*m_availableModel);
    m_availableModel.reset();
    return first;
}

void LLModelStore::releaseModel(LLModelInfo &&info)
{
    QMutexLocker locker(&m_mutex);
    Q_ASSERT(!m_availableModel);
    m_availableModel = std::move(info);
    m_condition.wakeAll();
}

void LLModelStore::destroy()
{
    QMutexLocker locker(&m_mutex);
    m_availableModel.reset();
}

void LLModelInfo::resetModel(ChatLLM *cllm, LLModel *model)
{
    this->model.reset(model);
    fallbackReason.reset();
    emit cllm->loadedModelInfoChanged();
}

ChatLLM::ChatLLM(Chat *parent, bool isServer)
    : QObject{nullptr}
    , m_chat(parent)
    , m_shouldBeLoaded(false)
    , m_forceUnloadModel(false)
    , m_markedForDeletion(false)
    , m_stopGenerating(false)
    , m_timer(nullptr)
    , m_isServer(isServer)
    , m_forceMetal(MySettings::globalInstance()->forceMetal())
    , m_reloadingToChangeVariant(false)
    , m_chatModel(parent->chatModel())
{
    moveToThread(&m_llmThread);
    connect(this, &ChatLLM::shouldBeLoadedChanged, this, &ChatLLM::handleShouldBeLoadedChanged,
        Qt::QueuedConnection); // explicitly queued
    connect(this, &ChatLLM::trySwitchContextRequested, this, &ChatLLM::trySwitchContextOfLoadedModel,
        Qt::QueuedConnection); // explicitly queued
    connect(parent, &Chat::idChanged, this, &ChatLLM::handleChatIdChanged);
    connect(&m_llmThread, &QThread::started, this, &ChatLLM::handleThreadStarted);
    connect(MySettings::globalInstance(), &MySettings::forceMetalChanged, this, &ChatLLM::handleForceMetalChanged);
    connect(MySettings::globalInstance(), &MySettings::deviceChanged, this,
        &ChatLLM::handleDeviceChanged);

    // The following are blocking operations and will block the llm thread
    connect(this, &ChatLLM::requestRetrieveFromDB, LocalDocs::globalInstance()->database(),
        &Database::retrieveFromDB, Qt::BlockingQueuedConnection);

    m_llmThread.setObjectName(parent->id());
    m_llmThread.start();
}

ChatLLM::~ChatLLM()
{
    destroy();
}

void ChatLLM::destroy()
{
    m_stopGenerating = true;
    m_llmThread.quit();
    m_llmThread.wait();

    // The only time we should have a model loaded here is on shutdown
    // as we explicitly unload the model in all other circumstances
    if (isModelLoaded()) {
        m_llModelInfo.resetModel(this);
    }
}

void ChatLLM::destroyStore()
{
    LLModelStore::globalInstance()->destroy();
}

void ChatLLM::handleThreadStarted()
{
    m_timer = new TokenTimer(this);
    connect(m_timer, &TokenTimer::report, this, &ChatLLM::reportSpeed);
    emit threadStarted();
}

void ChatLLM::handleForceMetalChanged(bool forceMetal)
{
#if defined(Q_OS_MAC) && defined(__aarch64__)
    m_forceMetal = forceMetal;
    if (isModelLoaded() && m_shouldBeLoaded) {
        m_reloadingToChangeVariant = true;
        unloadModel();
        reloadModel();
        m_reloadingToChangeVariant = false;
    }
#else
    Q_UNUSED(forceMetal);
#endif
}

void ChatLLM::handleDeviceChanged()
{
    if (isModelLoaded() && m_shouldBeLoaded) {
        m_reloadingToChangeVariant = true;
        unloadModel();
        reloadModel();
        m_reloadingToChangeVariant = false;
    }
}

bool ChatLLM::loadDefaultModel()
{
    ModelInfo defaultModel = ModelList::globalInstance()->defaultModelInfo();
    if (defaultModel.filename().isEmpty()) {
        emit modelLoadingError(u"Could not find any model to load"_s);
        return false;
    }
    return loadModel(defaultModel);
}

void ChatLLM::trySwitchContextOfLoadedModel(const ModelInfo &modelInfo)
{
    // We're trying to see if the store already has the model fully loaded that we wish to use
    // and if so we just acquire it from the store and switch the context and return true. If the
    // store doesn't have it or we're already loaded or in any other case just return false.

    // If we're already loaded or a server or we're reloading to change the variant/device or the
    // modelInfo is empty, then this should fail
    if (
        isModelLoaded() || m_isServer || m_reloadingToChangeVariant || modelInfo.name().isEmpty() || !m_shouldBeLoaded
    ) {
        emit trySwitchContextOfLoadedModelCompleted(0);
        return;
    }

    QString filePath = modelInfo.dirpath + modelInfo.filename();
    QFileInfo fileInfo(filePath);

    acquireModel();
#if defined(DEBUG_MODEL_LOADING)
    qDebug() << "acquired model from store" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif

    // The store gave us no already loaded model, the wrong type of model, then give it back to the
    // store and fail
    if (!m_llModelInfo.model || m_llModelInfo.fileInfo != fileInfo || !m_shouldBeLoaded) {
        LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo));
        emit trySwitchContextOfLoadedModelCompleted(0);
        return;
    }

#if defined(DEBUG_MODEL_LOADING)
    qDebug() << "store had our model" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif

    emit trySwitchContextOfLoadedModelCompleted(2);
    emit modelLoadingPercentageChanged(1.0f);
    emit trySwitchContextOfLoadedModelCompleted(0);
}

bool ChatLLM::loadModel(const ModelInfo &modelInfo)
{
    // This is a complicated method because N different possible threads are interested in the outcome
    // of this method. Why? Because we have a main/gui thread trying to monitor the state of N different
    // possible chat threads all vying for a single resource - the currently loaded model - as the user
    // switches back and forth between chats. It is important for our main/gui thread to never block
    // but simultaneously always have up2date information with regards to which chat has the model loaded
    // and what the type and name of that model is. I've tried to comment extensively in this method
    // to provide an overview of what we're doing here.
    if (isModelLoaded() && this->modelInfo() == modelInfo) { // already acquired -> keep it
        return true; // already loaded
    }

    // reset status
    emit modelLoadingPercentageChanged(std::numeric_limits<float>::min()); // small non-zero positive value
    emit modelLoadingError("");

    QString filePath = modelInfo.dirpath + modelInfo.filename();
    QFileInfo fileInfo(filePath);

    // We have a live model, but it isn't the one we want
    bool alreadyAcquired = isModelLoaded();
    if (alreadyAcquired) {
#if defined(DEBUG_MODEL_LOADING)
        qDebug() << "already acquired model deleted" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
        m_llModelInfo.resetModel(this);
    } else if (!m_isServer) {
        // This is a blocking call that tries to retrieve the model we need from the model store.
        // If it succeeds, then we just have to restore state. If the store has never had a model
        // returned to it, then the modelInfo.model pointer should be null which will happen on startup
        acquireModel();
#if defined(DEBUG_MODEL_LOADING)
        qDebug() << "acquired model from store" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
        // At this point it is possible that while we were blocked waiting to acquire the model from the
        // store, that our state was changed to not be loaded. If this is the case, release the model
        // back into the store and quit loading
        if (!m_shouldBeLoaded) {
#if defined(DEBUG_MODEL_LOADING)
            qDebug() << "no longer need model" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
            LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo));
            emit modelLoadingPercentageChanged(0.0f);
            return false;
        }

        // Check if the store just gave us exactly the model we were looking for
        if (m_llModelInfo.model && m_llModelInfo.fileInfo == fileInfo && !m_reloadingToChangeVariant) {
#if defined(DEBUG_MODEL_LOADING)
            qDebug() << "store had our model" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
            emit modelLoadingPercentageChanged(1.0f);
            setModelInfo(modelInfo);
            Q_ASSERT(!m_modelInfo.filename().isEmpty());
            if (m_modelInfo.filename().isEmpty())
                emit modelLoadingError(u"Modelinfo is left null for %1"_s.arg(modelInfo.filename()));
            return true;
        } else {
            // Release the memory since we have to switch to a different model.
#if defined(DEBUG_MODEL_LOADING)
            qDebug() << "deleting model" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
            m_llModelInfo.resetModel(this);
        }
    }

    // Guarantee we've released the previous models memory
    Q_ASSERT(!m_llModelInfo.model);

    // Store the file info in the modelInfo in case we have an error loading
    m_llModelInfo.fileInfo = fileInfo;

    if (fileInfo.exists()) {
        QVariantMap modelLoadProps;
        if (modelInfo.isOnline) {
            QString apiKey;
            QString requestUrl;
            QString modelName;
            {
                QFile file(filePath);
                bool success = file.open(QIODeviceBase::ReadOnly);
                (void)success;
                Q_ASSERT(success);
                QJsonDocument doc = QJsonDocument::fromJson(file.readAll());
                QJsonObject obj = doc.object();
                apiKey = obj["apiKey"].toString();
                modelName = obj["modelName"].toString();
                if (modelInfo.isCompatibleApi) {
                    QString baseUrl(obj["baseUrl"].toString());
                    QUrl apiUrl(QUrl::fromUserInput(baseUrl));
                    if (!Network::isHttpUrlValid(apiUrl)) {
                        return false;
                    }
                    QString currentPath(apiUrl.path());
                    QString suffixPath("%1/chat/completions");
                    apiUrl.setPath(suffixPath.arg(currentPath));
                    requestUrl = apiUrl.toString();
                } else {
                    requestUrl = modelInfo.url();
                }
            }
            m_llModelType = LLModelTypeV1::API;
            ChatAPI *model = new ChatAPI();
            model->setModelName(modelName);
            model->setRequestURL(requestUrl);
            model->setAPIKey(apiKey);
            m_llModelInfo.resetModel(this, model);
        } else if (!loadNewModel(modelInfo, modelLoadProps)) {
            return false; // m_shouldBeLoaded became false
        }
#if defined(DEBUG_MODEL_LOADING)
        qDebug() << "new model" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif
#if defined(DEBUG)
        qDebug() << "modelLoadedChanged" << m_llmThread.objectName();
        fflush(stdout);
#endif
        emit modelLoadingPercentageChanged(isModelLoaded() ? 1.0f : 0.0f);
        emit loadedModelInfoChanged();

        modelLoadProps.insert("requestedDevice", MySettings::globalInstance()->device());
        modelLoadProps.insert("model", modelInfo.filename());
        Network::globalInstance()->trackChatEvent("model_load", modelLoadProps);
    } else {
        if (!m_isServer)
            LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo)); // release back into the store
        resetModel();
        emit modelLoadingError(u"Could not find file for model %1"_s.arg(modelInfo.filename()));
    }

    if (m_llModelInfo.model)
        setModelInfo(modelInfo);
    return bool(m_llModelInfo.model);
}

/* Returns false if the model should no longer be loaded (!m_shouldBeLoaded).
 * Otherwise returns true, even on error.
*/
bool ChatLLM::loadNewModel(const ModelInfo &modelInfo, QVariantMap &modelLoadProps)
{
    QElapsedTimer modelLoadTimer;
    modelLoadTimer.start();

    QString requestedDevice = MySettings::globalInstance()->device();
    int n_ctx = MySettings::globalInstance()->modelContextLength(modelInfo);
    int ngl = MySettings::globalInstance()->modelGpuLayers(modelInfo);

    std::string backend = "auto";
#ifdef Q_OS_MAC
    if (requestedDevice == "CPU") {
        backend = "cpu";
    } else if (m_forceMetal) {
#ifdef __aarch64__
        backend = "metal";
#endif
    }
#else // !defined(Q_OS_MAC)
    if (requestedDevice.startsWith("CUDA: "))
        backend = "cuda";
#endif

    QString filePath = modelInfo.dirpath + modelInfo.filename();

    auto construct = [this, &filePath, &modelInfo, &modelLoadProps, n_ctx](std::string const &backend) {
        QString constructError;
        m_llModelInfo.resetModel(this);
        try {
            auto *model = LLModel::Implementation::construct(filePath.toStdString(), backend, n_ctx);
            m_llModelInfo.resetModel(this, model);
        } catch (const LLModel::MissingImplementationError &e) {
            modelLoadProps.insert("error", "missing_model_impl");
            constructError = e.what();
        } catch (const LLModel::UnsupportedModelError &e) {
            modelLoadProps.insert("error", "unsupported_model_file");
            constructError = e.what();
        } catch (const LLModel::BadArchError &e) {
            constructError = e.what();
            modelLoadProps.insert("error", "unsupported_model_arch");
            modelLoadProps.insert("model_arch", QString::fromStdString(e.arch()));
        }

        if (!m_llModelInfo.model) {
            if (!m_isServer)
                LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo));
            resetModel();
            emit modelLoadingError(u"Error loading %1: %2"_s.arg(modelInfo.filename(), constructError));
            return false;
        }
        m_llModelInfo.model->setProgressCallback([this](float progress) -> bool {
            progress = std::max(progress, std::numeric_limits<float>::min()); // keep progress above zero
            emit modelLoadingPercentageChanged(progress);
            return m_shouldBeLoaded;
        });
        return true;
    };

    if (!construct(backend))
        return true;

    if
(m_llModelInfo.model->isModelBlacklisted(filePath.toStdString())) {
        static QSet<QString> warned;
        auto fname = modelInfo.filename();
        if (!warned.contains(fname)) {
            emit modelLoadingWarning(
                u"%1 is known to be broken. Please get a replacement via the download dialog."_s.arg(fname)
            );
            warned.insert(fname); // don't warn again until restart
        }
    }

    auto approxDeviceMemGB = [](const LLModel::GPUDevice *dev) {
        float memGB = dev->heapSize / float(1024 * 1024 * 1024);
        return std::floor(memGB * 10.f) / 10.f; // truncate to 1 decimal place
    };

    std::vector<LLModel::GPUDevice> availableDevices;
    const LLModel::GPUDevice *defaultDevice = nullptr;
    {
        const size_t requiredMemory = m_llModelInfo.model->requiredMem(filePath.toStdString(), n_ctx, ngl);
        availableDevices = m_llModelInfo.model->availableGPUDevices(requiredMemory);
        // Pick the best device
        // NB: relies on the fact that Kompute devices are listed first
        if (!availableDevices.empty() && availableDevices.front().type == 2 /*a discrete gpu*/) {
            defaultDevice = &availableDevices.front();
            float memGB = defaultDevice->heapSize / float(1024 * 1024 * 1024);
            memGB = std::floor(memGB * 10.f) / 10.f; // truncate to 1 decimal place
            modelLoadProps.insert("default_device", QString::fromStdString(defaultDevice->name));
            modelLoadProps.insert("default_device_mem", approxDeviceMemGB(defaultDevice));
            modelLoadProps.insert("default_device_backend", QString::fromStdString(defaultDevice->backendName()));
        }
    }

    bool actualDeviceIsCPU = true;
#if defined(Q_OS_MAC) && defined(__aarch64__)
    if (m_llModelInfo.model->implementation().buildVariant() == "metal")
        actualDeviceIsCPU = false;
#else
    if (requestedDevice != "CPU") {
        const auto *device = defaultDevice;
        if (requestedDevice != "Auto") {
            // Use the selected device
            for (const LLModel::GPUDevice &d : availableDevices) {
                if (QString::fromStdString(d.selectionName()) == requestedDevice) {
                    device = &d;
                    break;
                }
            }
        }
        std::string unavail_reason;
        if (!device) {
            // GPU not available
        } else if
(!m_llModelInfo.model->initializeGPUDevice(device->index, &unavail_reason)) { m_llModelInfo.fallbackReason = QString::fromStdString(unavail_reason); } else { actualDeviceIsCPU = false; modelLoadProps.insert("requested_device_mem", approxDeviceMemGB(device)); } } #endif bool success = m_llModelInfo.model->loadModel(filePath.toStdString(), n_ctx, ngl); if (!m_shouldBeLoaded) { m_llModelInfo.resetModel(this); if (!m_isServer) LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo)); resetModel(); emit modelLoadingPercentageChanged(0.0f); return false; } if (actualDeviceIsCPU) { // we asked llama.cpp to use the CPU } else if (!success) { // llama_init_from_file returned nullptr m_llModelInfo.fallbackReason = "GPU loading failed (out of VRAM?)"; modelLoadProps.insert("cpu_fallback_reason", "gpu_load_failed"); // For CUDA, make sure we don't use the GPU at all - ngl=0 still offloads matmuls if (backend == "cuda" && !construct("auto")) return true; success = m_llModelInfo.model->loadModel(filePath.toStdString(), n_ctx, 0); if (!m_shouldBeLoaded) { m_llModelInfo.resetModel(this); if (!m_isServer) LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo)); resetModel(); emit modelLoadingPercentageChanged(0.0f); return false; } } else if (!m_llModelInfo.model->usingGPUDevice()) { // ggml_vk_init was not called in llama.cpp // We might have had to fallback to CPU after load if the model is not possible to accelerate // for instance if the quantization method is not supported on Vulkan yet m_llModelInfo.fallbackReason = "model or quant has no GPU support"; modelLoadProps.insert("cpu_fallback_reason", "gpu_unsupported_model"); } if (!success) { m_llModelInfo.resetModel(this); if (!m_isServer) LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo)); resetModel(); emit modelLoadingError(u"Could not load model due to invalid model file for %1"_s.arg(modelInfo.filename())); modelLoadProps.insert("error", "loadmodel_failed"); return true; } 
switch (m_llModelInfo.model->implementation().modelType()[0]) {
    case 'L': m_llModelType = LLModelTypeV1::LLAMA; break;
    default:
        {
            m_llModelInfo.resetModel(this);
            if (!m_isServer)
                LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo));
            resetModel();
            emit modelLoadingError(u"Could not determine model type for %1"_s.arg(modelInfo.filename()));
        }
    }

    modelLoadProps.insert("$duration", modelLoadTimer.elapsed() / 1000.);
    return true;
}

bool ChatLLM::isModelLoaded() const
{
    return m_llModelInfo.model && m_llModelInfo.model->isModelLoaded();
}

static QString &removeLeadingWhitespace(QString &s)
{
    auto firstNonSpace = ranges::find_if_not(s, [](auto c) { return c.isSpace(); });
    s.remove(0, firstNonSpace - s.begin());
    return s;
}

template <ranges::input_range R>
    requires std::convertible_to<ranges::range_value_t<R>, QChar>
bool isAllSpace(R &&r)
{
    return ranges::all_of(std::forward<R>(r), [](QChar c) { return c.isSpace(); });
}

void ChatLLM::regenerateResponse(int index)
{
    Q_ASSERT(m_chatModel);
    if (m_chatModel->regenerateResponse(index)) {
        emit responseChanged();
        prompt(m_chat->collectionList());
    }
}

std::optional<QString> ChatLLM::popPrompt(int index)
{
    Q_ASSERT(m_chatModel);
    return m_chatModel->popPrompt(index);
}

ModelInfo ChatLLM::modelInfo() const
{
    return m_modelInfo;
}

void ChatLLM::setModelInfo(const ModelInfo &modelInfo)
{
    m_modelInfo = modelInfo;
    emit modelInfoChanged(modelInfo);
}

void ChatLLM::acquireModel()
{
    m_llModelInfo = LLModelStore::globalInstance()->acquireModel();
    emit loadedModelInfoChanged();
}

void ChatLLM::resetModel()
{
    m_llModelInfo = {};
    emit loadedModelInfoChanged();
}

void ChatLLM::modelChangeRequested(const ModelInfo &modelInfo)
{
    // ignore attempts to switch to the same model twice
    if (!isModelLoaded() || this->modelInfo() != modelInfo) {
        m_shouldBeLoaded = true;
        loadModel(modelInfo);
    }
}

static LLModel::PromptContext promptContextFromSettings(const ModelInfo &modelInfo)
{
    auto *mySettings = MySettings::globalInstance();
    return {
        .n_predict      = mySettings->modelMaxLength          (modelInfo),
        .top_k
= mySettings->modelTopK              (modelInfo),
        .top_p          = float(mySettings->modelTopP        (modelInfo)),
        .min_p          = float(mySettings->modelMinP        (modelInfo)),
        .temp           = float(mySettings->modelTemperature (modelInfo)),
        .n_batch        = mySettings->modelPromptBatchSize   (modelInfo),
        .repeat_penalty = float(mySettings->modelRepeatPenalty(modelInfo)),
        .repeat_last_n  = mySettings->modelRepeatPenaltyTokens(modelInfo),
    };
}

void ChatLLM::prompt(const QStringList &enabledCollections)
{
    if (!isModelLoaded()) {
        emit responseStopped(0);
        return;
    }

    try {
        promptInternalChat(enabledCollections, promptContextFromSettings(m_modelInfo));
    } catch (const std::exception &e) {
        // FIXME(jared): this is neither translated nor serialized
        m_chatModel->setResponseValue(u"Error: %1"_s.arg(QString::fromUtf8(e.what())));
        m_chatModel->setError();
        emit responseStopped(0);
    }
}

std::vector<MessageItem> ChatLLM::forkConversation(const QString &prompt) const
{
    Q_ASSERT(m_chatModel);
    if (m_chatModel->hasError())
        throw std::logic_error("cannot continue conversation with an error");

    std::vector<MessageItem> conversation;
    {
        auto items = m_chatModel->messageItems();
        // It is possible the main thread could have erased the conversation while the llm thread
        // is busy forking the conversation, but it must have set stop generating first
        Q_ASSERT(items.size() >= 2 || m_stopGenerating); // should be prompt/response pairs
        conversation.reserve(items.size() + 1);
        conversation.assign(items.begin(), items.end());
    }
    qsizetype nextIndex = conversation.empty() ?
0 : conversation.back().index().value() + 1;
    conversation.emplace_back(nextIndex, MessageItem::Type::Prompt, prompt.toUtf8());
    return conversation;
}

// version 0 (default): HF compatible
// version 1: explicit LocalDocs formatting
static uint parseJinjaTemplateVersion(QStringView tmpl)
{
    static uint MAX_VERSION = 1;
    static QRegularExpression reVersion(uR"(\A{#-?\s+gpt4all v(\d+)-?#}\s*$)"_s, QRegularExpression::MultilineOption);
    if (auto match = reVersion.matchView(tmpl); match.hasMatch()) {
        uint ver = match.captured(1).toUInt();
        if (ver > MAX_VERSION)
            throw std::out_of_range(fmt::format("Unknown template version: {}", ver));
        return ver;
    }
    return 0;
}

static std::shared_ptr<minja::TemplateNode> loadJinjaTemplate(const std::string &source)
{
    return minja::Parser::parse(source, { .trim_blocks = true, .lstrip_blocks = true, .keep_trailing_newline = false });
}

std::optional<std::string> ChatLLM::checkJinjaTemplateError(const std::string &source)
{
    try {
        loadJinjaTemplate(source);
    } catch (const std::runtime_error &e) {
        return e.what();
    }
    return std::nullopt;
}

std::string ChatLLM::applyJinjaTemplate(std::span<const MessageItem> items) const
{
    Q_ASSERT(items.size() >= 1);

    auto *mySettings = MySettings::globalInstance();
    auto &model = m_llModelInfo.model;

    QString chatTemplate, systemMessage;
    auto chatTemplateSetting = mySettings->modelChatTemplate(m_modelInfo);
    if (auto tmpl = chatTemplateSetting.asModern()) {
        chatTemplate = *tmpl;
    } else if (chatTemplateSetting.isLegacy()) {
        throw std::logic_error("cannot apply Jinja to a legacy prompt template");
    } else {
        throw std::logic_error("cannot apply Jinja without setting a chat template first");
    }
    if (isAllSpace(chatTemplate)) {
        throw std::logic_error("cannot apply Jinja with a blank chat template");
    }
    if (auto tmpl = mySettings->modelSystemMessage(m_modelInfo).asModern()) {
        systemMessage = *tmpl;
    } else {
        throw std::logic_error("cannot apply Jinja with a legacy system message");
    }

    uint version = parseJinjaTemplateVersion(chatTemplate);

    auto makeMap = [version](const
MessageItem &item) {
        return JinjaMessage(version, item).AsJson();
    };

    std::unique_ptr<MessageItem> systemItem;
    bool useSystem = !isAllSpace(systemMessage);

    json::array_t messages;
    messages.reserve(useSystem + items.size());
    if (useSystem) {
        systemItem = std::make_unique<MessageItem>(MessageItem::system_tag, systemMessage.toUtf8());
        messages.emplace_back(makeMap(*systemItem));
    }
    for (auto &item : items)
        messages.emplace_back(makeMap(item));

    json::array_t toolList;
    const int toolCount = ToolModel::globalInstance()->count();
    for (int i = 0; i < toolCount; ++i) {
        Tool *t = ToolModel::globalInstance()->get(i);
        toolList.push_back(t->jinjaValue());
    }

    json::object_t params {
        { "messages",              std::move(messages) },
        { "add_generation_prompt", true                },
        { "toolList",              toolList            },
    };
    for (auto &[name, token] : model->specialTokens())
        params.emplace(std::move(name), std::move(token));

    try {
        auto tmpl = loadJinjaTemplate(chatTemplate.toStdString());
        auto context = minja::Context::make(minja::Value(std::move(params)), jinjaEnv());
        return tmpl->render(context);
    } catch (const std::runtime_error &e) {
        throw std::runtime_error(fmt::format("Failed to parse chat template: {}", e.what()));
    }
    Q_UNREACHABLE();
}

auto ChatLLM::promptInternalChat(const QStringList &enabledCollections, const LLModel::PromptContext &ctx,
                                 qsizetype startOffset) -> ChatPromptResult
{
    Q_ASSERT(isModelLoaded());
    Q_ASSERT(m_chatModel);

    // Return a vector of relevant messages for this chat.
    // "startOffset" is used to select only local server messages from the current chat session.
    auto getChat = [&]() {
        auto items = m_chatModel->messageItems();
        if (startOffset > 0)
            items.erase(items.begin(), items.begin() + startOffset);
        Q_ASSERT(items.size() >= 2);
        return items;
    };

    QList<ResultInfo> databaseResults;
    if (!enabledCollections.isEmpty()) {
        std::optional<std::pair<int, QString>> query;
        {
            // Find the prompt that represents the query. Server chats are flexible and may not have one.
auto items = getChat(); if (auto peer = m_chatModel->getPeer(items, items.end() - 1)) // peer of response query = { (*peer)->index().value(), (*peer)->content() }; } if (query) { auto &[promptIndex, queryStr] = *query; const int retrievalSize = MySettings::globalInstance()->localDocsRetrievalSize(); emit requestRetrieveFromDB(enabledCollections, queryStr, retrievalSize, &databaseResults); // blocks m_chatModel->updateSources(promptIndex, databaseResults); emit databaseResultsChanged(databaseResults); } } auto messageItems = getChat(); messageItems.pop_back(); // exclude new response auto result = promptInternal(messageItems, ctx, !databaseResults.isEmpty()); return { /*PromptResult*/ { .response = std::move(result.response), .promptTokens = result.promptTokens, .responseTokens = result.responseTokens, }, /*databaseResults*/ std::move(databaseResults), }; } class ChatViewResponseHandler : public BaseResponseHandler { public: ChatViewResponseHandler(ChatLLM *cllm, QElapsedTimer *totalTime, ChatLLM::PromptResult *result) : m_cllm(cllm), m_totalTime(totalTime), m_result(result) {} void onSplitIntoTwo(const QString &startTag, const QString &firstBuffer, const QString &secondBuffer) override { if (startTag == ToolCallConstants::ThinkStartTag) m_cllm->m_chatModel->splitThinking({ firstBuffer, secondBuffer }); else m_cllm->m_chatModel->splitToolCall({ firstBuffer, secondBuffer }); } void onSplitIntoThree(const QString &secondBuffer, const QString &thirdBuffer) override { m_cllm->m_chatModel->endThinking({ secondBuffer, thirdBuffer }, m_totalTime->elapsed()); } void onOldResponseChunk(const QByteArray &chunk) override { m_result->responseTokens++; m_cllm->m_timer->inc(); m_result->response.append(chunk); } bool onBufferResponse(const QString &response, int bufferIdx) override { Q_UNUSED(bufferIdx) try { QString r = response; m_cllm->m_chatModel->setResponseValue(removeLeadingWhitespace(r)); } catch (const std::exception &e) { // We have a try/catch here because the main 
thread might have removed the response from
            // the chatmodel by erasing the conversation during the response... the main thread sets
            // m_stopGenerating before doing so, but it doesn't wait after that to reset the chatmodel
            Q_ASSERT(m_cllm->m_stopGenerating);
            return false;
        }
        emit m_cllm->responseChanged();
        return true;
    }

    bool onRegularResponse() override
    {
        auto respStr = QString::fromUtf8(m_result->response);
        return onBufferResponse(respStr, 0);
    }

    bool getStopGenerating() const override
    { return m_cllm->m_stopGenerating; }

private:
    ChatLLM               *m_cllm;
    QElapsedTimer         *m_totalTime;
    ChatLLM::PromptResult *m_result;
};

auto ChatLLM::promptInternal(
    const std::variant<std::span<const MessageItem>, std::string_view> &prompt,
    const LLModel::PromptContext &ctx,
    bool usedLocalDocs
) -> PromptResult
{
    Q_ASSERT(isModelLoaded());

    auto *mySettings = MySettings::globalInstance();

    // unpack prompt argument
    const std::span<const MessageItem> *messageItems = nullptr;
    std::string                         jinjaBuffer;
    std::string_view                    conversation;
    if (auto *nonChat = std::get_if<std::string_view>(&prompt)) {
        conversation = *nonChat; // complete the string without a template
    } else {
        messageItems = &std::get<std::span<const MessageItem>>(prompt);
        jinjaBuffer  = applyJinjaTemplate(*messageItems);
        conversation = jinjaBuffer;
    }

    // check for overlength last message
    if (!dynamic_cast<ChatAPI *>(m_llModelInfo.model.get())) {
        auto nCtx = m_llModelInfo.model->contextLength();
        std::string jinjaBuffer2;
        auto lastMessageRendered = (messageItems && messageItems->size() > 1)
            ? std::string_view(jinjaBuffer2 = applyJinjaTemplate({ &messageItems->back(), 1 }))
            : conversation;
        int32_t lastMessageLength = m_llModelInfo.model->countPromptTokens(lastMessageRendered);
        if (auto limit = nCtx - 4; lastMessageLength > limit) {
            throw std::invalid_argument(
                tr("Your message was too long and could not be processed (%1 > %2).
" "Please try again with something shorter.").arg(lastMessageLength).arg(limit).toUtf8().constData() ); } } PromptResult result {}; auto handlePrompt = [this, &result](std::span batch, bool cached) -> bool { Q_UNUSED(cached) result.promptTokens += batch.size(); m_timer->start(); return !m_stopGenerating; }; QElapsedTimer totalTime; totalTime.start(); ChatViewResponseHandler respHandler(this, &totalTime, &result); m_timer->start(); QStringList finalBuffers; bool shouldExecuteTool; try { emit promptProcessing(); m_llModelInfo.model->setThreadCount(mySettings->threadCount()); m_stopGenerating = false; std::tie(finalBuffers, shouldExecuteTool) = promptModelWithTools( m_llModelInfo.model.get(), handlePrompt, respHandler, ctx, QByteArray::fromRawData(conversation.data(), conversation.size()), ToolCallConstants::AllTagNames ); } catch (...) { m_timer->stop(); throw; } m_timer->stop(); qint64 elapsed = totalTime.elapsed(); // trim trailing whitespace auto respStr = QString::fromUtf8(result.response); if (!respStr.isEmpty() && (std::as_const(respStr).back().isSpace() || finalBuffers.size() > 1)) { if (finalBuffers.size() > 1) m_chatModel->setResponseValue(finalBuffers.last().trimmed()); else m_chatModel->setResponseValue(respStr.trimmed()); emit responseChanged(); } bool doQuestions = false; if (!m_isServer && messageItems && !shouldExecuteTool) { switch (mySettings->suggestionMode()) { case SuggestionMode::On: doQuestions = true; break; case SuggestionMode::LocalDocsOnly: doQuestions = usedLocalDocs; break; case SuggestionMode::Off: ; } } if (doQuestions) generateQuestions(elapsed); else emit responseStopped(elapsed); return result; } void ChatLLM::setShouldBeLoaded(bool b) { #if defined(DEBUG_MODEL_LOADING) qDebug() << "setShouldBeLoaded" << m_llmThread.objectName() << b << m_llModelInfo.model.get(); #endif m_shouldBeLoaded = b; // atomic emit shouldBeLoadedChanged(); } void ChatLLM::requestTrySwitchContext() { m_shouldBeLoaded = true; // atomic emit 
trySwitchContextRequested(modelInfo());
}

void ChatLLM::handleShouldBeLoadedChanged()
{
    if (m_shouldBeLoaded)
        reloadModel();
    else
        unloadModel();
}

void ChatLLM::unloadModel()
{
    if (!isModelLoaded() || m_isServer)
        return;

    if (!m_forceUnloadModel || !m_shouldBeLoaded)
        emit modelLoadingPercentageChanged(0.0f);
    else
        emit modelLoadingPercentageChanged(std::numeric_limits<float>::min()); // small non-zero positive value

#if defined(DEBUG_MODEL_LOADING)
    qDebug() << "unloadModel" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif

    if (m_forceUnloadModel) {
        m_llModelInfo.resetModel(this);
        m_forceUnloadModel = false;
    }

    LLModelStore::globalInstance()->releaseModel(std::move(m_llModelInfo));
}

void ChatLLM::reloadModel()
{
    if (isModelLoaded() && m_forceUnloadModel)
        unloadModel(); // we unload first if we are forcing an unload

    if (isModelLoaded() || m_isServer)
        return;

#if defined(DEBUG_MODEL_LOADING)
    qDebug() << "reloadModel" << m_llmThread.objectName() << m_llModelInfo.model.get();
#endif

    const ModelInfo m = modelInfo();
    if (m.name().isEmpty())
        loadDefaultModel();
    else
        loadModel(m);
}

// This class discards the text within thinking tags, for use with chat names and follow-up questions.
class SimpleResponseHandler : public BaseResponseHandler {
public:
    SimpleResponseHandler(ChatLLM *cllm)
        : m_cllm(cllm) {}

    void onSplitIntoTwo(const QString &startTag, const QString &firstBuffer, const QString &secondBuffer) override
    { /* no-op */ }

    void onSplitIntoThree(const QString &secondBuffer, const QString &thirdBuffer) override
    { /* no-op */ }

    void onOldResponseChunk(const QByteArray &chunk) override
    { m_response.append(chunk); }

    bool onBufferResponse(const QString &response, int bufferIdx) override
    {
        if (bufferIdx == 1)
            return true; // ignore "think" content
        return onSimpleResponse(response);
    }

    bool onRegularResponse() override
    { return onBufferResponse(QString::fromUtf8(m_response), 0); }

    bool getStopGenerating() const override
    { return m_cllm->m_stopGenerating; }

protected:
    virtual bool onSimpleResponse(const QString &response) = 0;

protected:
    ChatLLM    *m_cllm;
    QByteArray  m_response;
};

class NameResponseHandler : public SimpleResponseHandler {
private:
    // max length of chat names, in words
    static constexpr qsizetype MAX_WORDS = 3;

public:
    using SimpleResponseHandler::SimpleResponseHandler;

protected:
    bool onSimpleResponse(const QString &response) override
    {
        QTextStream stream(const_cast<QString *>(&response), QIODeviceBase::ReadOnly);
        QStringList words;
        while (!stream.atEnd() && words.size() < MAX_WORDS) {
            QString word;
            stream >> word;
            words << word;
        }

        emit m_cllm->generatedNameChanged(words.join(u' '));
        return words.size() < MAX_WORDS || stream.atEnd();
    }
};

void ChatLLM::generateName()
{
    Q_ASSERT(isModelLoaded());
    if (!isModelLoaded() || m_isServer)
        return;
    Q_ASSERT(m_chatModel);

    auto *mySettings = MySettings::globalInstance();

    const QString chatNamePrompt = mySettings->modelChatNamePrompt(m_modelInfo);
    if (isAllSpace(chatNamePrompt)) {
        qWarning() << "ChatLLM: not generating chat name because prompt is empty";
        return;
    }

    NameResponseHandler respHandler(this);

    try {
        promptModelWithTools(
            m_llModelInfo.model.get(),
            /*promptCallback*/ [this](auto &&...)
{ return !m_stopGenerating; }, respHandler, promptContextFromSettings(m_modelInfo), applyJinjaTemplate(forkConversation(chatNamePrompt)).c_str(), { ToolCallConstants::ThinkTagName } ); } catch (const std::exception &e) { qWarning() << "ChatLLM failed to generate name:" << e.what(); } } void ChatLLM::handleChatIdChanged(const QString &id) { m_llmThread.setObjectName(id); } class QuestionResponseHandler : public SimpleResponseHandler { public: using SimpleResponseHandler::SimpleResponseHandler; protected: bool onSimpleResponse(const QString &response) override { auto responseUtf8Bytes = response.toUtf8().slice(m_offset); auto responseUtf8 = std::string(responseUtf8Bytes.begin(), responseUtf8Bytes.end()); // extract all questions from response ptrdiff_t lastMatchEnd = -1; auto it = std::sregex_iterator(responseUtf8.begin(), responseUtf8.end(), s_reQuestion); auto end = std::sregex_iterator(); for (; it != end; ++it) { auto pos = it->position(); auto len = it->length(); lastMatchEnd = pos + len; emit m_cllm->generatedQuestionFinished(QString::fromUtf8(&responseUtf8[pos], len)); } // remove processed input from buffer if (lastMatchEnd != -1) m_offset += lastMatchEnd; return true; } private: // FIXME: This only works with response by the model in english which is not ideal for a multi-language // model. 
// match whole question sentences
    static inline const std::regex s_reQuestion {
        R"(\b(?:What|Where|How|Why|When|Who|Which|Whose|Whom)\b[^?]*\?)"
    };

    qsizetype m_offset = 0;
};

void ChatLLM::generateQuestions(qint64 elapsed)
{
    Q_ASSERT(isModelLoaded());
    if (!isModelLoaded()) {
        emit responseStopped(elapsed);
        return;
    }

    auto *mySettings = MySettings::globalInstance();

    QString suggestedFollowUpPrompt = mySettings->modelSuggestedFollowUpPrompt(m_modelInfo);
    if (isAllSpace(suggestedFollowUpPrompt)) {
        qWarning() << "ChatLLM: not generating follow-up questions because prompt is empty";
        emit responseStopped(elapsed);
        return;
    }

    emit generatingQuestions();

    QuestionResponseHandler respHandler(this);
    QElapsedTimer totalTime;
    totalTime.start();
    try {
        promptModelWithTools(
            m_llModelInfo.model.get(),
            /*promptCallback*/ [this](auto &&...) { return !m_stopGenerating; },
            respHandler, promptContextFromSettings(m_modelInfo),
            applyJinjaTemplate(forkConversation(suggestedFollowUpPrompt)).c_str(),
            { ToolCallConstants::ThinkTagName }
        );
    } catch (const std::exception &e) {
        qWarning() << "ChatLLM failed to generate follow-up questions:" << e.what();
    }
    elapsed += totalTime.elapsed();

    emit responseStopped(elapsed);
}

// this function serializes the cached model state to disk.
// we want to also serialize n_ctx, and read it at load time.
bool ChatLLM::serialize(QDataStream &stream, int version) { if (version < 11) { if (version >= 6) { stream << false; // serializeKV } if (version >= 2) { if (m_llModelType == LLModelTypeV1::NONE) { qWarning() << "ChatLLM ERROR: attempted to serialize a null model for chat id" << m_chat->id() << "name" << m_chat->name(); return false; } stream << m_llModelType; stream << 0; // state version } { QString dummy; stream << dummy; // response stream << dummy; // generated name } stream << quint32(0); // prompt + response tokens if (version < 6) { // serialize binary state if (version < 4) { stream << 0; // responseLogits } stream << int32_t(0); // n_past stream << quint64(0); // input token count stream << QByteArray(); // KV cache state } } return stream.status() == QDataStream::Ok; } bool ChatLLM::deserialize(QDataStream &stream, int version) { // discard all state since we are initialized from the ChatModel as of v11 if (version < 11) { union { int intval; quint32 u32; quint64 u64; }; bool deserializeKV = true; if (version >= 6) stream >> deserializeKV; if (version >= 2) { stream >> intval; // model type auto llModelType = (version >= 6 ? parseLLModelTypeV1 : parseLLModelTypeV0)(intval); if (llModelType == LLModelTypeV1::NONE) { qWarning().nospace() << "error loading chat id " << m_chat->id() << ": unrecognized model type: " << intval; return false; } /* note: prior to chat version 10, API models and chats with models removed in v2.5.0 only wrote this because of * undefined behavior in Release builds */ stream >> intval; // state version if (intval) { qWarning().nospace() << "error loading chat id " << m_chat->id() << ": unrecognized internal state version"; return false; } } { QString dummy; stream >> dummy; // response stream >> dummy; // name response } stream >> u32; // prompt + response token count // We don't use the raw model state anymore. 
if (deserializeKV) { if (version < 4) { stream >> u32; // response logits } stream >> u32; // n_past if (version >= 7) { stream >> u32; // n_ctx } if (version < 9) { stream >> u64; // logits size stream.skipRawData(u64 * sizeof(float)); // logits } stream >> u64; // token cache size stream.skipRawData(u64 * sizeof(int)); // token cache QByteArray dummy; stream >> dummy; // state } } return stream.status() == QDataStream::Ok; } ================================================ FILE: gpt4all-chat/src/chatllm.h ================================================ #ifndef CHATLLM_H #define CHATLLM_H #include "chatmodel.h" #include "database.h" #include "modellist.h" #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include using namespace Qt::Literals::StringLiterals; class ChatLLM; class QDataStream; // NOTE: values serialized to disk, do not change or reuse enum class LLModelTypeV0 { // chat versions 2-5 MPT = 0, GPTJ = 1, LLAMA = 2, CHATGPT = 3, REPLIT = 4, FALCON = 5, BERT = 6, // not used STARCODER = 7, }; enum class LLModelTypeV1 { // since chat version 6 (v2.5.0) GPTJ = 0, // not for new chats LLAMA = 1, API = 2, BERT = 3, // not used // none of the below are used in new chats REPLIT = 4, FALCON = 5, MPT = 6, STARCODER = 7, NONE = -1, // no state }; inline LLModelTypeV1 parseLLModelTypeV1(int type) { switch (LLModelTypeV1(type)) { case LLModelTypeV1::GPTJ: case LLModelTypeV1::LLAMA: case LLModelTypeV1::API: // case LLModelTypeV1::BERT: -- not used case LLModelTypeV1::REPLIT: case LLModelTypeV1::FALCON: case LLModelTypeV1::MPT: case LLModelTypeV1::STARCODER: return LLModelTypeV1(type); default: return LLModelTypeV1::NONE; } } inline LLModelTypeV1 parseLLModelTypeV0(int v0) { switch (LLModelTypeV0(v0)) { case LLModelTypeV0::MPT: return LLModelTypeV1::MPT; case LLModelTypeV0::GPTJ: return LLModelTypeV1::GPTJ; 
case LLModelTypeV0::LLAMA:     return LLModelTypeV1::LLAMA;
    case LLModelTypeV0::CHATGPT:   return LLModelTypeV1::API;
    case LLModelTypeV0::REPLIT:    return LLModelTypeV1::REPLIT;
    case LLModelTypeV0::FALCON:    return LLModelTypeV1::FALCON;
    // case LLModelTypeV0::BERT: -- not used
    case LLModelTypeV0::STARCODER: return LLModelTypeV1::STARCODER;
    default:                       return LLModelTypeV1::NONE;
    }
}

struct LLModelInfo {
    std::unique_ptr<LLModel> model;
    QFileInfo fileInfo;
    std::optional<QString> fallbackReason;

    // NOTE: This does not store the model type or name on purpose as this is left for ChatLLM which
    // must be able to serialize the information even if it is in the unloaded state
    void resetModel(ChatLLM *cllm, LLModel *model = nullptr);
};

class TokenTimer : public QObject {
    Q_OBJECT
public:
    explicit TokenTimer(QObject *parent)
        : QObject(parent)
        , m_elapsed(0) {}

    static int rollingAverage(int oldAvg, int newNumber, int n)
    {
        // i.e. to calculate the new average after the nth number,
        // you multiply the old average by n-1, add the new number, and divide the total by n.
return qRound(((float(oldAvg) * (n - 1)) + newNumber) / float(n));
    }

    void start() { m_tokens = 0; m_elapsed = 0; m_time.invalidate(); }
    void stop() { handleTimeout(); }
    void inc()
    {
        if (!m_time.isValid())
            m_time.start();
        ++m_tokens;
        if (m_time.elapsed() > 999)
            handleTimeout();
    }

Q_SIGNALS:
    void report(const QString &speed);

private Q_SLOTS:
    void handleTimeout()
    {
        m_elapsed += m_time.restart();
        emit report(u"%1 tokens/sec"_s.arg(m_tokens / float(m_elapsed / 1000.0f), 0, 'g', 2));
    }

private:
    QElapsedTimer m_time;
    qint64 m_elapsed;
    quint32 m_tokens;
};

class Chat;

class ChatLLM : public QObject
{
    Q_OBJECT
    Q_PROPERTY(QString deviceBackend READ deviceBackend NOTIFY loadedModelInfoChanged)
    Q_PROPERTY(QString device READ device NOTIFY loadedModelInfoChanged)
    Q_PROPERTY(QString fallbackReason READ fallbackReason NOTIFY loadedModelInfoChanged)

public:
    ChatLLM(Chat *parent, bool isServer = false);
    virtual ~ChatLLM();

    static void destroyStore();
    static std::optional<std::string> checkJinjaTemplateError(const std::string &source);

    void destroy();
    bool isModelLoaded() const;
    void regenerateResponse(int index); // used to implement edit functionality
    std::optional<QString> popPrompt(int index);
    void stopGenerating() { m_stopGenerating = true; }

    bool shouldBeLoaded() const { return m_shouldBeLoaded; }
    void setShouldBeLoaded(bool b);
    void requestTrySwitchContext();
    void setForceUnloadModel(bool b) { m_forceUnloadModel = b; }
    void setMarkedForDeletion(bool b) { m_markedForDeletion = b; }

    ModelInfo modelInfo() const;
    void setModelInfo(const ModelInfo &info);

    void acquireModel();
    void resetModel();

    QString deviceBackend() const
    {
        if (!isModelLoaded()) return QString();
        std::string name = LLModel::GPUDevice::backendIdToName(m_llModelInfo.model->backendName());
        return QString::fromStdString(name);
    }

    QString device() const
    {
        if (!isModelLoaded()) return QString();
        const char *name = m_llModelInfo.model->gpuDeviceName();
        return name ?
QString(name) : u"CPU"_s; } // not loaded -> QString(), no fallback -> QString("") QString fallbackReason() const { if (!isModelLoaded()) return QString(); return m_llModelInfo.fallbackReason.value_or(u""_s); } bool serialize(QDataStream &stream, int version); bool deserialize(QDataStream &stream, int version); public Q_SLOTS: void prompt(const QStringList &enabledCollections); bool loadDefaultModel(); void trySwitchContextOfLoadedModel(const ModelInfo &modelInfo); bool loadModel(const ModelInfo &modelInfo); void modelChangeRequested(const ModelInfo &modelInfo); void unloadModel(); void reloadModel(); void generateName(); void handleChatIdChanged(const QString &id); void handleShouldBeLoadedChanged(); void handleThreadStarted(); void handleForceMetalChanged(bool forceMetal); void handleDeviceChanged(); Q_SIGNALS: void loadedModelInfoChanged(); void modelLoadingPercentageChanged(float); void modelLoadingError(const QString &error); void modelLoadingWarning(const QString &warning); void responseChanged(); void responseFailed(); void promptProcessing(); void generatingQuestions(); void responseStopped(qint64 promptResponseMs); void generatedNameChanged(const QString &name); void generatedQuestionFinished(const QString &generatedQuestion); void stateChanged(); void threadStarted(); void shouldBeLoadedChanged(); void trySwitchContextRequested(const ModelInfo &modelInfo); void trySwitchContextOfLoadedModelCompleted(int value); void requestRetrieveFromDB(const QList &collections, const QString &text, int retrievalSize, QList *results); void reportSpeed(const QString &speed); void reportDevice(const QString &device); void reportFallbackReason(const QString &fallbackReason); void databaseResultsChanged(const QList&); void modelInfoChanged(const ModelInfo &modelInfo); protected: struct PromptResult { QByteArray response; // raw UTF-8 int promptTokens; // note: counts *entire* history, even if cached int responseTokens; }; struct ChatPromptResult : PromptResult { QList 
databaseResults; }; ChatPromptResult promptInternalChat(const QStringList &enabledCollections, const LLModel::PromptContext &ctx, qsizetype startOffset = 0); // passing a string_view directly skips templating and uses the raw string PromptResult promptInternal(const std::variant, std::string_view> &prompt, const LLModel::PromptContext &ctx, bool usedLocalDocs); private: bool loadNewModel(const ModelInfo &modelInfo, QVariantMap &modelLoadProps); std::vector forkConversation(const QString &prompt) const; // Applies the Jinja template. Query mode returns only the last message without special tokens. // Returns a (# of messages, rendered prompt) pair. std::string applyJinjaTemplate(std::span items) const; void generateQuestions(qint64 elapsed); protected: QPointer m_chatModel; private: const Chat *m_chat; LLModelInfo m_llModelInfo; LLModelTypeV1 m_llModelType = LLModelTypeV1::NONE; ModelInfo m_modelInfo; TokenTimer *m_timer; QThread m_llmThread; std::atomic m_stopGenerating; std::atomic m_shouldBeLoaded; std::atomic m_forceUnloadModel; std::atomic m_markedForDeletion; bool m_isServer; bool m_forceMetal; bool m_reloadingToChangeVariant; friend class ChatViewResponseHandler; friend class SimpleResponseHandler; }; #endif // CHATLLM_H ================================================ FILE: gpt4all-chat/src/chatmodel.cpp ================================================ #include "chatmodel.h" #include #include #include #include #include QList ChatItem::consolidateSources(const QList &sources) { QMap groupedData; for (const ResultInfo &info : sources) { if (groupedData.contains(info.file)) { groupedData[info.file].text += "\n---\n" + info.text; } else { groupedData[info.file] = info; } } QList consolidatedSources = groupedData.values(); return consolidatedSources; } void ChatItem::serializeResponse(QDataStream &stream, int version) { stream << value; } void ChatItem::serializeToolCall(QDataStream &stream, int version) { stream << value; toolCallInfo.serialize(stream, version); 
} void ChatItem::serializeToolResponse(QDataStream &stream, int version) { stream << value; } void ChatItem::serializeText(QDataStream &stream, int version) { stream << value; } void ChatItem::serializeThink(QDataStream &stream, int version) { stream << value; stream << thinkingTime; } void ChatItem::serializeSubItems(QDataStream &stream, int version) { stream << name; switch (auto typ = type()) { using enum ChatItem::Type; case Response: { serializeResponse(stream, version); break; } case ToolCall: { serializeToolCall(stream, version); break; } case ToolResponse: { serializeToolResponse(stream, version); break; } case Text: { serializeText(stream, version); break; } case Think: { serializeThink(stream, version); break; } case System: case Prompt: throw std::invalid_argument(fmt::format("cannot serialize subitem type {}", int(typ))); } stream << qsizetype(subItems.size()); for (ChatItem *item :subItems) item->serializeSubItems(stream, version); } void ChatItem::serialize(QDataStream &stream, int version) { stream << name; stream << value; stream << newResponse; stream << isCurrentResponse; stream << stopped; stream << thumbsUpState; stream << thumbsDownState; if (version >= 11 && type() == ChatItem::Type::Response) stream << isError; if (version >= 8) { stream << sources.size(); for (const ResultInfo &info : sources) { Q_ASSERT(!info.file.isEmpty()); stream << info.collection; stream << info.path; stream << info.file; stream << info.title; stream << info.author; stream << info.date; stream << info.text; stream << info.page; stream << info.from; stream << info.to; } } else if (version >= 3) { QList references; QList referencesContext; int validReferenceNumber = 1; for (const ResultInfo &info : sources) { if (info.file.isEmpty()) continue; QString reference; { QTextStream stream(&reference); stream << (validReferenceNumber++) << ". "; if (!info.title.isEmpty()) stream << "\"" << info.title << "\". "; if (!info.author.isEmpty()) stream << "By " << info.author << ". 
"; if (!info.date.isEmpty()) stream << "Date: " << info.date << ". "; stream << "In " << info.file << ". "; if (info.page != -1) stream << "Page " << info.page << ". "; if (info.from != -1) { stream << "Lines " << info.from; if (info.to != -1) stream << "-" << info.to; stream << ". "; } stream << "[Context](context://" << validReferenceNumber - 1 << ")"; } references.append(reference); referencesContext.append(info.text); } stream << references.join("\n"); stream << referencesContext; } if (version >= 10) { stream << promptAttachments.size(); for (const PromptAttachment &a : promptAttachments) { Q_ASSERT(!a.url.isEmpty()); stream << a.url; stream << a.content; } } if (version >= 12) { stream << qsizetype(subItems.size()); for (ChatItem *item :subItems) item->serializeSubItems(stream, version); } } bool ChatItem::deserializeToolCall(QDataStream &stream, int version) { stream >> value; return toolCallInfo.deserialize(stream, version);; } bool ChatItem::deserializeToolResponse(QDataStream &stream, int version) { stream >> value; return true; } bool ChatItem::deserializeText(QDataStream &stream, int version) { stream >> value; return true; } bool ChatItem::deserializeResponse(QDataStream &stream, int version) { stream >> value; return true; } bool ChatItem::deserializeThink(QDataStream &stream, int version) { stream >> value; stream >> thinkingTime; return true; } bool ChatItem::deserializeSubItems(QDataStream &stream, int version) { stream >> name; try { type(); // check name } catch (const std::exception &e) { qWarning() << "ChatModel ERROR:" << e.what(); return false; } switch (auto typ = type()) { using enum ChatItem::Type; case Response: { deserializeResponse(stream, version); break; } case ToolCall: { deserializeToolCall(stream, version); break; } case ToolResponse: { deserializeToolResponse(stream, version); break; } case Text: { deserializeText(stream, version); break; } case Think: { deserializeThink(stream, version); break; } case System: case Prompt: throw 
std::invalid_argument(fmt::format("cannot serialize subitem type {}", int(typ))); } qsizetype count; stream >> count; for (int i = 0; i < count; ++i) { ChatItem *c = new ChatItem(this); if (!c->deserializeSubItems(stream, version)) { delete c; return false; } subItems.push_back(c); } return true; } bool ChatItem::deserialize(QDataStream &stream, int version) { if (version < 12) { int id; stream >> id; } stream >> name; try { type(); // check name } catch (const std::exception &e) { qWarning() << "ChatModel ERROR:" << e.what(); return false; } stream >> value; if (version < 10) { // This is deprecated and no longer used QString prompt; stream >> prompt; } stream >> newResponse; stream >> isCurrentResponse; stream >> stopped; stream >> thumbsUpState; stream >> thumbsDownState; if (version >= 11 && type() == ChatItem::Type::Response) stream >> isError; if (version >= 8) { qsizetype count; stream >> count; for (int i = 0; i < count; ++i) { ResultInfo info; stream >> info.collection; stream >> info.path; stream >> info.file; stream >> info.title; stream >> info.author; stream >> info.date; stream >> info.text; stream >> info.page; stream >> info.from; stream >> info.to; sources.append(info); } consolidatedSources = ChatItem::consolidateSources(sources); } else if (version >= 3) { QString references; QList referencesContext; stream >> references; stream >> referencesContext; if (!references.isEmpty()) { QList referenceList = references.split("\n"); // Ignore empty lines and those that begin with "---" which is no longer used for (auto it = referenceList.begin(); it != referenceList.end();) { if (it->trimmed().isEmpty() || it->trimmed().startsWith("---")) it = referenceList.erase(it); else ++it; } Q_ASSERT(referenceList.size() == referencesContext.size()); for (int j = 0; j < referenceList.size(); ++j) { QString reference = referenceList[j]; QString context = referencesContext[j]; ResultInfo info; QTextStream refStream(&reference); QString dummy; int validReferenceNumber; 
refStream >> validReferenceNumber >> dummy; // Extract title (between quotes) if (reference.contains("\"")) { int startIndex = reference.indexOf('"') + 1; int endIndex = reference.indexOf('"', startIndex); info.title = reference.mid(startIndex, endIndex - startIndex); } // Extract author (after "By " and before the next period) if (reference.contains("By ")) { int startIndex = reference.indexOf("By ") + 3; int endIndex = reference.indexOf('.', startIndex); info.author = reference.mid(startIndex, endIndex - startIndex).trimmed(); } // Extract date (after "Date: " and before the next period) if (reference.contains("Date: ")) { int startIndex = reference.indexOf("Date: ") + 6; int endIndex = reference.indexOf('.', startIndex); info.date = reference.mid(startIndex, endIndex - startIndex).trimmed(); } // Extract file name (after "In " and before the "[Context]") if (reference.contains("In ") && reference.contains(". [Context]")) { int startIndex = reference.indexOf("In ") + 3; int endIndex = reference.indexOf(". 
[Context]", startIndex); info.file = reference.mid(startIndex, endIndex - startIndex).trimmed(); } // Extract page number (after "Page " and before the next space) if (reference.contains("Page ")) { int startIndex = reference.indexOf("Page ") + 5; int endIndex = reference.indexOf(' ', startIndex); if (endIndex == -1) endIndex = reference.length(); info.page = reference.mid(startIndex, endIndex - startIndex).toInt(); } // Extract lines (after "Lines " and before the next space or hyphen) if (reference.contains("Lines ")) { int startIndex = reference.indexOf("Lines ") + 6; int endIndex = reference.indexOf(' ', startIndex); if (endIndex == -1) endIndex = reference.length(); int hyphenIndex = reference.indexOf('-', startIndex); if (hyphenIndex != -1 && hyphenIndex < endIndex) { info.from = reference.mid(startIndex, hyphenIndex - startIndex).toInt(); info.to = reference.mid(hyphenIndex + 1, endIndex - hyphenIndex - 1).toInt(); } else { info.from = reference.mid(startIndex, endIndex - startIndex).toInt(); } } info.text = context; sources.append(info); } consolidatedSources = ChatItem::consolidateSources(sources); } } if (version >= 10) { qsizetype count; stream >> count; QList attachments; for (int i = 0; i < count; ++i) { PromptAttachment a; stream >> a.url; stream >> a.content; attachments.append(a); } promptAttachments = attachments; } if (version >= 12) { qsizetype count; stream >> count; for (int i = 0; i < count; ++i) { ChatItem *c = new ChatItem(this); if (!c->deserializeSubItems(stream, version)) { delete c; return false; } subItems.push_back(c); } } return true; } ================================================ FILE: gpt4all-chat/src/chatmodel.h ================================================ #ifndef CHATMODEL_H #define CHATMODEL_H #include "database.h" #include "tool.h" #include "toolcallparser.h" #include "utils.h" // IWYU pragma: keep #include "xlsxtomd.h" #include #include #include #include #include #include #include #include #include #include #include 
#include #include // IWYU pragma: keep #include #include // IWYU pragma: keep #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include using namespace Qt::Literals::StringLiterals; namespace ranges = std::ranges; namespace views = std::views; struct PromptAttachment { Q_GADGET Q_PROPERTY(QUrl url MEMBER url) Q_PROPERTY(QByteArray content MEMBER content) Q_PROPERTY(QString file READ file) Q_PROPERTY(QString processedContent READ processedContent) public: QUrl url; QByteArray content; QString file() const { if (!url.isLocalFile()) return QString(); const QString localFilePath = url.toLocalFile(); const QFileInfo info(localFilePath); return info.fileName(); } QString processedContent() const { const QString localFilePath = url.toLocalFile(); const QFileInfo info(localFilePath); if (info.suffix().toLower() != "xlsx") return u"## Attached: %1\n\n%2"_s.arg(file(), content); QBuffer buffer; buffer.setData(content); buffer.open(QIODevice::ReadOnly); const QString md = XLSXToMD::toMarkdown(&buffer); buffer.close(); return u"## Attached: %1\n\n%2"_s.arg(file(), md); } bool operator==(const PromptAttachment &other) const { return url == other.url; } }; Q_DECLARE_METATYPE(PromptAttachment) // Used by Server to represent a message from the client. 
struct MessageInput { enum class Type { System, Prompt, Response }; Type type; QString content; }; class MessageItem { Q_GADGET Q_PROPERTY(Type type READ type CONSTANT) Q_PROPERTY(QString content READ content CONSTANT) public: enum class Type { System, Prompt, Response, ToolResponse }; struct system_tag_t { explicit system_tag_t() = default; }; static inline constexpr system_tag_t system_tag = system_tag_t{}; MessageItem(qsizetype index, Type type, QString content) : m_index(index), m_type(type), m_content(std::move(content)) { Q_ASSERT(type != Type::System); // use system_tag constructor } // Construct a system message with no index, since they are never stored in the chat MessageItem(system_tag_t, QString content) : m_type(Type::System), m_content(std::move(content)) {} MessageItem(qsizetype index, Type type, QString content, const QList &sources, const QList &promptAttachments) : m_index(index) , m_type(type) , m_content(std::move(content)) , m_sources(sources) , m_promptAttachments(promptAttachments) {} // index of the parent ChatItem (system, prompt, response) in its container std::optional index() const { return m_index; } Type type() const { return m_type; } const QString &content() const { return m_content; } QList sources() const { return m_sources; } QList promptAttachments() const { return m_promptAttachments; } // used with version 0 Jinja templates QString bakedPrompt() const { if (type() != Type::Prompt) throw std::logic_error("bakedPrompt() called on non-prompt item"); QStringList parts; if (!m_sources.isEmpty()) { parts << u"### Context:\n"_s; for (auto &source : std::as_const(m_sources)) parts << u"Collection: "_s << source.collection << u"\nPath: "_s << source.path << u"\nExcerpt: "_s << source.text << u"\n\n"_s; } for (auto &attached : std::as_const(m_promptAttachments)) parts << attached.processedContent() << u"\n\n"_s; parts << m_content; return parts.join(QString()); } private: std::optional m_index; Type m_type; QString m_content; QList 
m_sources; QList m_promptAttachments; }; Q_DECLARE_METATYPE(MessageItem) class ChatItem : public QObject { Q_OBJECT Q_PROPERTY(QString name MEMBER name ) Q_PROPERTY(QString value MEMBER value) // prompts and responses Q_PROPERTY(QString content READ content NOTIFY contentChanged) // prompts Q_PROPERTY(QList promptAttachments MEMBER promptAttachments) // responses Q_PROPERTY(bool isCurrentResponse MEMBER isCurrentResponse NOTIFY isCurrentResponseChanged) Q_PROPERTY(bool isError MEMBER isError ) Q_PROPERTY(QList childItems READ childItems ) // toolcall Q_PROPERTY(bool isToolCallError READ isToolCallError NOTIFY isTooCallErrorChanged) // responses (DataLake) Q_PROPERTY(QString newResponse MEMBER newResponse ) Q_PROPERTY(bool stopped MEMBER stopped ) Q_PROPERTY(bool thumbsUpState MEMBER thumbsUpState ) Q_PROPERTY(bool thumbsDownState MEMBER thumbsDownState) // thinking Q_PROPERTY(int thinkingTime MEMBER thinkingTime NOTIFY thinkingTimeChanged) public: enum class Type { System, Prompt, Response, Text, ToolCall, ToolResponse, Think }; // tags for constructing ChatItems struct prompt_tag_t { explicit prompt_tag_t () = default; }; struct response_tag_t { explicit response_tag_t () = default; }; struct system_tag_t { explicit system_tag_t () = default; }; struct text_tag_t { explicit text_tag_t () = default; }; struct tool_call_tag_t { explicit tool_call_tag_t () = default; }; struct tool_response_tag_t { explicit tool_response_tag_t() = default; }; struct think_tag_t { explicit think_tag_t () = default; }; static inline constexpr prompt_tag_t prompt_tag = prompt_tag_t {}; static inline constexpr response_tag_t response_tag = response_tag_t {}; static inline constexpr system_tag_t system_tag = system_tag_t {}; static inline constexpr text_tag_t text_tag = text_tag_t {}; static inline constexpr tool_call_tag_t tool_call_tag = tool_call_tag_t {}; static inline constexpr tool_response_tag_t tool_response_tag = tool_response_tag_t {}; static inline constexpr think_tag_t 
think_tag = think_tag_t {}; public: ChatItem(QObject *parent) : QObject(nullptr) { moveToThread(parent->thread()); // setParent must be called from the thread the object lives in QMetaObject::invokeMethod(this, [this, parent]() { this->setParent(parent); }); } // NOTE: System messages are currently never serialized and only *stored* by the local server. // ChatLLM prepends a system MessageItem on-the-fly. ChatItem(QObject *parent, system_tag_t, const QString &value) : ChatItem(parent) { this->name = u"System: "_s; this->value = value; } ChatItem(QObject *parent, prompt_tag_t, const QString &value, const QList &attachments = {}) : ChatItem(parent) { this->name = u"Prompt: "_s; this->value = value; this->promptAttachments = attachments; } private: ChatItem(QObject *parent, response_tag_t, bool isCurrentResponse, const QString &value = {}) : ChatItem(parent) { this->name = u"Response: "_s; this->value = value; this->isCurrentResponse = isCurrentResponse; } public: // A new response, to be filled in ChatItem(QObject *parent, response_tag_t) : ChatItem(parent, response_tag, true) {} // An existing response, from Server ChatItem(QObject *parent, response_tag_t, const QString &value) : ChatItem(parent, response_tag, false, value) {} ChatItem(QObject *parent, text_tag_t, const QString &value) : ChatItem(parent) { this->name = u"Text: "_s; this->value = value; } ChatItem(QObject *parent, tool_call_tag_t, const QString &value) : ChatItem(parent) { this->name = u"ToolCall: "_s; this->value = value; } ChatItem(QObject *parent, tool_response_tag_t, const QString &value) : ChatItem(parent) { this->name = u"ToolResponse: "_s; this->value = value; } ChatItem(QObject *parent, think_tag_t, const QString &value) : ChatItem(parent) { this->name = u"Think: "_s; this->value = value; } Type type() const { if (name == u"System: "_s) return Type::System; if (name == u"Prompt: "_s) return Type::Prompt; if (name == u"Response: "_s) return Type::Response; if (name == u"Text: "_s) return 
Type::Text; if (name == u"ToolCall: "_s) return Type::ToolCall; if (name == u"ToolResponse: "_s) return Type::ToolResponse; if (name == u"Think: "_s) return Type::Think; throw std::invalid_argument(fmt::format("Chat item has unknown label: {:?}", name)); } QString flattenedContent() const { if (subItems.empty()) return value; // We only flatten one level QString content; for (ChatItem *item : subItems) content += item->value; return content; } QString content() const { if (type() == Type::Response) { // We parse if this contains any part of a partial toolcall ToolCallParser parser; parser.update(value.toUtf8()); // If no tool call is detected, return the original value if (parser.startIndex() < 0) return value; // Otherwise we only return the text before and any partial tool call const QString beforeToolCall = value.left(parser.startIndex()); return beforeToolCall; } if (type() == Type::Think) return thinkContent(value); if (type() == Type::ToolCall) return toolCallContent(value); // We don't show any of content from the tool response in the GUI if (type() == Type::ToolResponse) return QString(); return value; } QString thinkContent(const QString &value) const { ToolCallParser parser; parser.update(value.toUtf8()); // Extract the content QString content = parser.toolCall(); content = content.trimmed(); return content; } QString toolCallContent(const QString &value) const { ToolCallParser parser; parser.update(value.toUtf8()); // Extract the code QString code = parser.toolCall(); code = code.trimmed(); QString result; // If we've finished the tool call then extract the result from meta information if (toolCallInfo.name == ToolCallConstants::CodeInterpreterFunction) result = "```\n" + toolCallInfo.result + "```"; // Return the formatted code and the result if available return code + result; } QString clipboardContent() const { QStringList clipContent; for (const ChatItem *item : subItems) clipContent << item->clipboardContent(); clipContent << content(); return 
clipContent.join(""); } QList childItems() const { // We currently have leaf nodes at depth 3 with nodes at depth 2 as mere containers we don't // care about in GUI QList items; for (const ChatItem *item : subItems) { items.reserve(items.size() + item->subItems.size()); ranges::copy(item->subItems, std::back_inserter(items)); } return items; } QString possibleToolCall() const { if (!subItems.empty()) return subItems.back()->possibleToolCall(); if (type() == Type::ToolCall) return value; else return QString(); } void setCurrentResponse(bool b) { if (!subItems.empty()) subItems.back()->setCurrentResponse(b); isCurrentResponse = b; emit isCurrentResponseChanged(); } void setValue(const QString &v) { if (!subItems.empty() && subItems.back()->isCurrentResponse) { subItems.back()->setValue(v); return; } value = v; emit contentChanged(); } void setToolCallInfo(const ToolCallInfo &info) { toolCallInfo = info; emit contentChanged(); emit isTooCallErrorChanged(); } bool isToolCallError() const { return toolCallInfo.error != ToolEnums::Error::NoError; } void setThinkingTime(int t) { thinkingTime = t; emit thinkingTimeChanged(); } // NB: Assumes response is not current. 
static ChatItem *fromMessageInput(QObject *parent, const MessageInput &message) { switch (message.type) { using enum MessageInput::Type; case Prompt: return new ChatItem(parent, prompt_tag, message.content); case Response: return new ChatItem(parent, response_tag, message.content); case System: return new ChatItem(parent, system_tag, message.content); } Q_UNREACHABLE(); } MessageItem asMessageItem(qsizetype index) const { MessageItem::Type msgType; switch (auto typ = type()) { using enum ChatItem::Type; case System: msgType = MessageItem::Type::System; break; case Prompt: msgType = MessageItem::Type::Prompt; break; case Response: msgType = MessageItem::Type::Response; break; case ToolResponse: msgType = MessageItem::Type::ToolResponse; break; case Text: case ToolCall: case Think: throw std::invalid_argument(fmt::format("cannot convert ChatItem type {} to message item", int(typ))); } return { index, msgType, flattenedContent(), sources, promptAttachments }; } static QList consolidateSources(const QList &sources); void serializeResponse(QDataStream &stream, int version); void serializeToolCall(QDataStream &stream, int version); void serializeToolResponse(QDataStream &stream, int version); void serializeText(QDataStream &stream, int version); void serializeThink(QDataStream &stream, int version); void serializeSubItems(QDataStream &stream, int version); // recursive void serialize(QDataStream &stream, int version); bool deserializeResponse(QDataStream &stream, int version); bool deserializeToolCall(QDataStream &stream, int version); bool deserializeToolResponse(QDataStream &stream, int version); bool deserializeText(QDataStream &stream, int version); bool deserializeThink(QDataStream &stream, int version); bool deserializeSubItems(QDataStream &stream, int version); // recursive bool deserialize(QDataStream &stream, int version); Q_SIGNALS: void contentChanged(); void isTooCallErrorChanged(); void isCurrentResponseChanged(); void thinkingTimeChanged(); public: // TODO: 
Maybe we should include the model name here as well as timestamp? QString name; QString value; // prompts QList sources; QList consolidatedSources; QList promptAttachments; // responses bool isCurrentResponse = false; bool isError = false; ToolCallInfo toolCallInfo; std::list subItems; // responses (DataLake) QString newResponse; bool stopped = false; bool thumbsUpState = false; bool thumbsDownState = false; // thinking time in ms int thinkingTime = 0; }; class ChatModel : public QAbstractListModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) Q_PROPERTY(bool hasError READ hasError NOTIFY hasErrorChanged) public: explicit ChatModel(QObject *parent = nullptr) : QAbstractListModel(parent) {} // FIXME(jared): can't this start at Qt::UserRole (no +1)? enum Roles { NameRole = Qt::UserRole + 1, ValueRole, // prompts and responses ContentRole, // prompts PromptAttachmentsRole, // responses // NOTE: sources are stored on the *prompts*, but in the model, they are only on the *responses*! SourcesRole, ConsolidatedSourcesRole, IsCurrentResponseRole, IsErrorRole, ChildItemsRole, // responses (DataLake) NewResponseRole, StoppedRole, ThumbsUpStateRole, ThumbsDownStateRole, }; int rowCount(const QModelIndex &parent = QModelIndex()) const override { QMutexLocker locker(&m_mutex); Q_UNUSED(parent) return m_chatItems.size(); } /* a "peer" is a bidirectional 1:1 link between a prompt and the response that would cite its LocalDocs * sources. Return std::nullopt if there is none, which is possible for e.g. server chats. 
*/ template static std::optional getPeer(const T *arr, qsizetype size, qsizetype index) { Q_ASSERT(index >= 0); Q_ASSERT(index < size); return getPeerInternal(arr, size, index); } private: static std::optional getPeerInternal(const ChatItem * const *arr, qsizetype size, qsizetype index) { qsizetype peer; ChatItem::Type expected; switch (arr[index]->type()) { using enum ChatItem::Type; case Prompt: peer = index + 1; expected = Response; break; case Response: peer = index - 1; expected = Prompt; break; default: throw std::invalid_argument("getPeer() called on item that is not a prompt or response"); } if (peer >= 0 && peer < size && arr[peer]->type() == expected) return peer; return std::nullopt; } // FIXME(jared): this should really be done at the parent level, not the sub-item level static std::optional getPeerInternal(const MessageItem *arr, qsizetype size, qsizetype index) { qsizetype peer; MessageItem::Type expected; switch (arr[index].type()) { using enum MessageItem::Type; case Prompt: peer = index + 1; expected = Response; break; case Response: peer = index - 1; expected = Prompt; break; default: throw std::invalid_argument("getPeer() called on item that is not a prompt or response"); } if (peer >= 0 && peer < size && arr[peer].type() == expected) return peer; return std::nullopt; } public: template static auto getPeer(R &&range, ranges::iterator_t item) -> std::optional> { auto begin = ranges::begin(range); return getPeer(ranges::data(range), ranges::size(range), item - begin) .transform([&](auto i) { return begin + i; }); } auto getPeerUnlocked(QList::const_iterator item) const -> std::optional::const_iterator> { return getPeer(m_chatItems, item); } std::optional getPeerUnlocked(qsizetype index) const { return getPeer(m_chatItems.constData(), m_chatItems.size(), index); } QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const override { QMutexLocker locker(&m_mutex); if (!index.isValid() || index.row() < 0 || index.row() >= 
m_chatItems.size()) return QVariant(); auto itemIt = m_chatItems.cbegin() + index.row(); auto *item = *itemIt; switch (role) { case NameRole: return item->name; case ValueRole: return item->value; case PromptAttachmentsRole: return QVariant::fromValue(item->promptAttachments); case SourcesRole: { QList data; if (item->type() == ChatItem::Type::Response) { if (auto prompt = getPeerUnlocked(itemIt)) data = (**prompt)->sources; } return QVariant::fromValue(data); } case ConsolidatedSourcesRole: { QList data; if (item->type() == ChatItem::Type::Response) { if (auto prompt = getPeerUnlocked(itemIt)) data = (**prompt)->consolidatedSources; } return QVariant::fromValue(data); } case IsCurrentResponseRole: return item->isCurrentResponse; case NewResponseRole: return item->newResponse; case StoppedRole: return item->stopped; case ThumbsUpStateRole: return item->thumbsUpState; case ThumbsDownStateRole: return item->thumbsDownState; case IsErrorRole: return item->type() == ChatItem::Type::Response && item->isError; case ContentRole: return item->content(); case ChildItemsRole: return QVariant::fromValue(item->childItems()); } return QVariant(); } QHash roleNames() const override { return { { NameRole, "name" }, { ValueRole, "value" }, { PromptAttachmentsRole, "promptAttachments" }, { SourcesRole, "sources" }, { ConsolidatedSourcesRole, "consolidatedSources" }, { IsCurrentResponseRole, "isCurrentResponse" }, { IsErrorRole, "isError" }, { NewResponseRole, "newResponse" }, { StoppedRole, "stopped" }, { ThumbsUpStateRole, "thumbsUpState" }, { ThumbsDownStateRole, "thumbsDownState" }, { ContentRole, "content" }, { ChildItemsRole, "childItems" }, }; } void appendPrompt(const QString &value, const QList &attachments = {}) { qsizetype count; { QMutexLocker locker(&m_mutex); if (hasErrorUnlocked()) throw std::logic_error("cannot append to a failed chat"); count = m_chatItems.count(); } beginInsertRows(QModelIndex(), count, count); { QMutexLocker locker(&m_mutex); m_chatItems << new 
ChatItem(this, ChatItem::prompt_tag, value, attachments); } endInsertRows(); emit countChanged(); } void appendResponse() { qsizetype count; { QMutexLocker locker(&m_mutex); if (hasErrorUnlocked()) throw std::logic_error("cannot append to a failed chat"); count = m_chatItems.count(); } beginInsertRows(QModelIndex(), count, count); { QMutexLocker locker(&m_mutex); m_chatItems << new ChatItem(this, ChatItem::response_tag); } endInsertRows(); emit countChanged(); } // Used by Server to append a new conversation to the chat log. // Returns the offset of the appended items. qsizetype appendResponseWithHistory(std::span history) { if (history.empty()) throw std::invalid_argument("at least one message is required"); m_mutex.lock(); qsizetype startIndex = m_chatItems.size(); m_mutex.unlock(); qsizetype nNewItems = history.size() + 1; qsizetype endIndex = startIndex + nNewItems; beginInsertRows(QModelIndex(), startIndex, endIndex - 1 /*inclusive*/); bool hadError; QList newItems; { QMutexLocker locker(&m_mutex); startIndex = m_chatItems.size(); // just in case hadError = hasErrorUnlocked(); m_chatItems.reserve(m_chatItems.count() + nNewItems); for (auto &message : history) m_chatItems << ChatItem::fromMessageInput(this, message); m_chatItems << new ChatItem(this, ChatItem::response_tag); } endInsertRows(); emit countChanged(); // Server can add messages when there is an error because each call is a new conversation if (hadError) emit hasErrorChanged(false); return startIndex; } void truncate(qsizetype size) { qsizetype oldSize; { QMutexLocker locker(&m_mutex); if (size >= (oldSize = m_chatItems.size())) return; if (size && m_chatItems.at(size - 1)->type() != ChatItem::Type::Response) throw std::invalid_argument( fmt::format("chat model truncated to {} items would not end in a response", size) ); } bool oldHasError; beginRemoveRows(QModelIndex(), size, oldSize - 1 /*inclusive*/); { QMutexLocker locker(&m_mutex); oldHasError = hasErrorUnlocked(); Q_ASSERT(size < 
m_chatItems.size()); m_chatItems.resize(size); } endRemoveRows(); emit countChanged(); if (oldHasError) emit hasErrorChanged(false); } QString popPrompt(int index) { QString content; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size() || m_chatItems[index]->type() != ChatItem::Type::Prompt) throw std::logic_error("attempt to pop a prompt, but this is not a prompt"); content = m_chatItems[index]->content(); } truncate(index); return content; } bool regenerateResponse(int index) { int promptIdx; { QMutexLocker locker(&m_mutex); auto items = m_chatItems; // holds lock if (index < 1 || index >= items.size() || items[index]->type() != ChatItem::Type::Response) return false; promptIdx = getPeerUnlocked(index).value_or(-1); } truncate(index + 1); clearSubItems(index); setResponseValue({}); updateCurrentResponse(index, true ); updateNewResponse (index, {} ); updateStopped (index, false); updateThumbsUpState (index, false); updateThumbsDownState(index, false); setError(false); if (promptIdx >= 0) updateSources(promptIdx, {}); return true; } Q_INVOKABLE void clear() { { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty()) return; } bool oldHasError; beginResetModel(); { QMutexLocker locker(&m_mutex); oldHasError = hasErrorUnlocked(); m_chatItems.clear(); } endResetModel(); emit countChanged(); if (oldHasError) emit hasErrorChanged(false); } Q_INVOKABLE QString possibleToolcall() const { QMutexLocker locker(&m_mutex); if (m_chatItems.empty()) return QString(); return m_chatItems.back()->possibleToolCall(); } Q_INVOKABLE void updateCurrentResponse(int index, bool b) { { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; ChatItem *item = m_chatItems[index]; item->setCurrentResponse(b); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {IsCurrentResponseRole}); } Q_INVOKABLE void updateStopped(int index, bool b) { bool changed = false; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= 
m_chatItems.size()) return; ChatItem *item = m_chatItems[index]; if (item->stopped != b) { item->stopped = b; changed = true; } } if (changed) emit dataChanged(createIndex(index, 0), createIndex(index, 0), {StoppedRole}); } Q_INVOKABLE void setResponseValue(const QString &value) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() || m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("we only set this on a response"); index = m_chatItems.count() - 1; ChatItem *item = m_chatItems.back(); item->setValue(value); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ValueRole, ContentRole}); } Q_INVOKABLE void updateSources(int index, const QList<ResultInfo> &sources) { int responseIndex = -1; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; auto promptItem = m_chatItems.begin() + index; if ((*promptItem)->type() != ChatItem::Type::Prompt) throw std::invalid_argument(fmt::format("item at index {} is not a prompt", index)); if (auto peer = getPeerUnlocked(promptItem)) responseIndex = *peer - m_chatItems.cbegin(); (*promptItem)->sources = sources; (*promptItem)->consolidatedSources = ChatItem::consolidateSources(sources); } if (responseIndex >= 0) { emit dataChanged(createIndex(responseIndex, 0), createIndex(responseIndex, 0), {SourcesRole}); emit dataChanged(createIndex(responseIndex, 0), createIndex(responseIndex, 0), {ConsolidatedSourcesRole}); } } Q_INVOKABLE void updateThumbsUpState(int index, bool b) { bool changed = false; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; ChatItem *item = m_chatItems[index]; if (item->thumbsUpState != b) { item->thumbsUpState = b; changed = true; } } if (changed) emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ThumbsUpStateRole}); } Q_INVOKABLE void updateThumbsDownState(int index, bool b) { bool changed = false; { QMutexLocker locker(&m_mutex); if (index < 0 || index >=
m_chatItems.size()) return; ChatItem *item = m_chatItems[index]; if (item->thumbsDownState != b) { item->thumbsDownState = b; changed = true; } } if (changed) emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ThumbsDownStateRole}); } Q_INVOKABLE void updateNewResponse(int index, const QString &newResponse) { bool changed = false; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; ChatItem *item = m_chatItems[index]; if (item->newResponse != newResponse) { item->newResponse = newResponse; changed = true; } } if (changed) emit dataChanged(createIndex(index, 0), createIndex(index, 0), {NewResponseRole}); } Q_INVOKABLE void splitThinking(const QPair<QString, QString> &split) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() || m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("can only set thinking on a chat that ends with a response"); index = m_chatItems.count() - 1; ChatItem *currentResponse = m_chatItems.back(); Q_ASSERT(currentResponse->isCurrentResponse); // Create a new response container for any text and the thinking ChatItem *newResponse = new ChatItem(this, ChatItem::response_tag); // Add preceding text if any if (!split.first.isEmpty()) { ChatItem *textItem = new ChatItem(this, ChatItem::text_tag, split.first); newResponse->subItems.push_back(textItem); } // Add the thinking item Q_ASSERT(!split.second.isEmpty()); ChatItem *thinkingItem = new ChatItem(this, ChatItem::think_tag, split.second); thinkingItem->isCurrentResponse = true; newResponse->subItems.push_back(thinkingItem); // Add new response and reset our value currentResponse->subItems.push_back(newResponse); currentResponse->value = QString(); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ChildItemsRole, ContentRole}); } Q_INVOKABLE void endThinking(const QPair<QString, QString> &split, int thinkingTime) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() ||
m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("can only end thinking on a chat that ends with a response"); index = m_chatItems.count() - 1; ChatItem *currentResponse = m_chatItems.back(); Q_ASSERT(currentResponse->isCurrentResponse); ChatItem *subResponse = currentResponse->subItems.back(); Q_ASSERT(subResponse->type() == ChatItem::Type::Response); Q_ASSERT(subResponse->isCurrentResponse); subResponse->setCurrentResponse(false); ChatItem *thinkingItem = subResponse->subItems.back(); Q_ASSERT(thinkingItem->type() == ChatItem::Type::Think); thinkingItem->setCurrentResponse(false); thinkingItem->setValue(split.first); thinkingItem->setThinkingTime(thinkingTime); currentResponse->setValue(split.second); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ChildItemsRole, ContentRole}); } Q_INVOKABLE void splitToolCall(const QPair<QString, QString> &split) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() || m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("can only set toolcall on a chat that ends with a response"); index = m_chatItems.count() - 1; ChatItem *currentResponse = m_chatItems.back(); Q_ASSERT(currentResponse->isCurrentResponse); // Create a new response container for any text and the tool call ChatItem *newResponse = new ChatItem(this, ChatItem::response_tag); // Add preceding text if any if (!split.first.isEmpty()) { ChatItem *textItem = new ChatItem(this, ChatItem::text_tag, split.first); newResponse->subItems.push_back(textItem); } // Add the toolcall Q_ASSERT(!split.second.isEmpty()); ChatItem *toolCallItem = new ChatItem(this, ChatItem::tool_call_tag, split.second); toolCallItem->isCurrentResponse = true; newResponse->subItems.push_back(toolCallItem); // Add new response and reset our value currentResponse->subItems.push_back(newResponse); currentResponse->value = QString(); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ChildItemsRole,
ContentRole}); } Q_INVOKABLE void updateToolCall(const ToolCallInfo &toolCallInfo) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() || m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("can only set toolcall on a chat that ends with a response"); index = m_chatItems.count() - 1; ChatItem *currentResponse = m_chatItems.back(); Q_ASSERT(currentResponse->isCurrentResponse); ChatItem *subResponse = currentResponse->subItems.back(); Q_ASSERT(subResponse->type() == ChatItem::Type::Response); Q_ASSERT(subResponse->isCurrentResponse); ChatItem *toolCallItem = subResponse->subItems.back(); Q_ASSERT(toolCallItem->type() == ChatItem::Type::ToolCall); toolCallItem->setToolCallInfo(toolCallInfo); toolCallItem->setCurrentResponse(false); // Add tool response ChatItem *toolResponseItem = new ChatItem(this, ChatItem::tool_response_tag, toolCallInfo.result); currentResponse->subItems.push_back(toolResponseItem); } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ChildItemsRole, ContentRole}); } void clearSubItems(int index) { bool changed = false; { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; if (m_chatItems.isEmpty() || m_chatItems[index]->type() != ChatItem::Type::Response) throw std::logic_error("can only clear subitems on a chat that ends with a response"); ChatItem *item = m_chatItems.back(); if (!item->subItems.empty()) { item->subItems.clear(); changed = true; } } if (changed) { emit dataChanged(createIndex(index, 0), createIndex(index, 0), {ChildItemsRole, ContentRole}); } } Q_INVOKABLE void setError(bool value = true) { qsizetype index; { QMutexLocker locker(&m_mutex); if (m_chatItems.isEmpty() || m_chatItems.cend()[-1]->type() != ChatItem::Type::Response) throw std::logic_error("can only set error on a chat that ends with a response"); index = m_chatItems.count() - 1; auto &last = m_chatItems.back(); if (last->isError == value) return; // already set 
last->isError = value; } emit dataChanged(createIndex(index, 0), createIndex(index, 0), {IsErrorRole}); emit hasErrorChanged(value); } Q_INVOKABLE void copyToClipboard() { QMutexLocker locker(&m_mutex); QString conversation; for (ChatItem *item : m_chatItems) { QString string = item->name; string += item->clipboardContent(); string += "\n"; conversation += string; } QClipboard *clipboard = QGuiApplication::clipboard(); clipboard->setText(conversation, QClipboard::Clipboard); } Q_INVOKABLE void copyToClipboard(int index) { QMutexLocker locker(&m_mutex); if (index < 0 || index >= m_chatItems.size()) return; ChatItem *item = m_chatItems.at(index); QClipboard *clipboard = QGuiApplication::clipboard(); clipboard->setText(item->clipboardContent(), QClipboard::Clipboard); } qsizetype count() const { QMutexLocker locker(&m_mutex); return m_chatItems.size(); } std::vector<MessageItem> messageItems() const { // A flattened version of the chat item tree used by the backend and jinja QMutexLocker locker(&m_mutex); std::vector<MessageItem> chatItems; for (qsizetype i : views::iota(0, m_chatItems.size())) { auto *parent = m_chatItems.at(i); chatItems.reserve(chatItems.size() + parent->subItems.size() + 1); ranges::copy(parent->subItems | views::transform([&](auto *s) { return s->asMessageItem(i); }), std::back_inserter(chatItems)); chatItems.push_back(parent->asMessageItem(i)); } return chatItems; } bool hasError() const { QMutexLocker locker(&m_mutex); return hasErrorUnlocked(); } bool serialize(QDataStream &stream, int version) const { // FIXME: need to serialize new chatitem tree QMutexLocker locker(&m_mutex); stream << int(m_chatItems.size()); for (auto itemIt = m_chatItems.cbegin(); itemIt < m_chatItems.cend(); ++itemIt) { auto c = *itemIt; // NB: copies if (version < 11) { // move sources from their prompt to the next response switch (c->type()) { using enum ChatItem::Type; case Prompt: c->sources.clear(); c->consolidatedSources.clear(); break; case Response: // note: we drop sources for
responseless prompts if (auto peer = getPeerUnlocked(itemIt)) { c->sources = (**peer)->sources; c->consolidatedSources = (**peer)->consolidatedSources; } default: ; } } c->serialize(stream, version); } return stream.status() == QDataStream::Ok; } bool deserialize(QDataStream &stream, int version) { clear(); // reset to known state int size; stream >> size; int lastPromptIndex = -1; QList<ChatItem *> chatItems; for (int i = 0; i < size; ++i) { ChatItem *c = new ChatItem(this); if (!c->deserialize(stream, version)) { delete c; return false; } if (version < 11 && c->type() == ChatItem::Type::Response) { // move sources from the response to their last prompt if (lastPromptIndex >= 0) { auto &prompt = chatItems[lastPromptIndex]; prompt->sources = std::move(c->sources ); prompt->consolidatedSources = std::move(c->consolidatedSources); lastPromptIndex = -1; } else { // drop sources for promptless responses c->sources.clear(); c->consolidatedSources.clear(); } } chatItems << c; if (c->type() == ChatItem::Type::Prompt) lastPromptIndex = chatItems.size() - 1; } bool hasError; beginInsertRows(QModelIndex(), 0, chatItems.size() - 1 /*inclusive*/); { QMutexLocker locker(&m_mutex); m_chatItems = chatItems; hasError = hasErrorUnlocked(); } endInsertRows(); emit countChanged(); if (hasError) emit hasErrorChanged(true); return stream.status() == QDataStream::Ok; } Q_SIGNALS: void countChanged(); void hasErrorChanged(bool value); private: bool hasErrorUnlocked() const { if (m_chatItems.isEmpty()) return false; auto &last = m_chatItems.back(); return last->type() == ChatItem::Type::Response && last->isError; } private: mutable QMutex m_mutex; QList<ChatItem *> m_chatItems; }; #endif // CHATMODEL_H ================================================ FILE: gpt4all-chat/src/chatviewtextprocessor.cpp ================================================ #include "chatviewtextprocessor.h" #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include
#include // IWYU pragma: keep #include // IWYU pragma: keep #include // IWYU pragma: keep #include #include #include #include // IWYU pragma: keep #include // IWYU pragma: keep #include #include #include #include #include enum Language { None, Python, Cpp, Bash, TypeScript, Java, Go, Json, Csharp, Latex, Html, Php, Markdown }; static Language stringToLanguage(const QString &language) { if (language == "python") return Python; if (language == "cpp") return Cpp; if (language == "c++") return Cpp; if (language == "csharp") return Csharp; if (language == "c#") return Csharp; if (language == "c") return Cpp; if (language == "bash") return Bash; if (language == "javascript") return TypeScript; if (language == "typescript") return TypeScript; if (language == "java") return Java; if (language == "go") return Go; if (language == "golang") return Go; if (language == "json") return Json; if (language == "latex") return Latex; if (language == "html") return Html; if (language == "php") return Php; return None; } enum Code { Default, Keyword, Function, FunctionCall, Comment, String, Number, Header, Preprocessor, Type, Arrow, Command, Variable, Key, Value, Parameter, AttributeName, AttributeValue, SpecialCharacter, DocType }; struct HighlightingRule { QRegularExpression pattern; Code format; }; static QColor formatToColor(Code c, const CodeColors &colors) { switch (c) { case Default: return colors.defaultColor; case Keyword: return colors.keywordColor; case Function: return colors.functionColor; case FunctionCall: return colors.functionCallColor; case Comment: return colors.commentColor; case String: return colors.stringColor; case Number: return colors.numberColor; case Header: return colors.headerColor; case Preprocessor: return colors.preprocessorColor; case Type: return colors.typeColor; case Arrow: return colors.arrowColor; case Command: return colors.commandColor; case Variable: return colors.variableColor; case Key: return colors.keyColor; case Value: return 
colors.valueColor; case Parameter: return colors.parameterColor; case AttributeName: return colors.attributeNameColor; case AttributeValue: return colors.attributeValueColor; case SpecialCharacter: return colors.specialCharacterColor; case DocType: return colors.doctypeColor; default: Q_UNREACHABLE(); } return QColor(); } static QVector pythonHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\bdef\\s+(\\w+)\\b"); rule.format = Function; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bdef\\b", "\\bclass\\b", "\\bif\\b", "\\belse\\b", "\\belif\\b", "\\bwhile\\b", "\\bfor\\b", "\\breturn\\b", "\\bprint\\b", "\\bimport\\b", "\\bfrom\\b", "\\bas\\b", "\\btry\\b", "\\bexcept\\b", "\\braise\\b", "\\bwith\\b", "\\bfinally\\b", "\\bcontinue\\b", "\\bbreak\\b", "\\bpass\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\'.*?\'"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("#[^\n]*"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector csharpHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); // Function call highlighting rule.pattern = 
QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); // Function definition highlighting rule.pattern = QRegularExpression("\\bvoid|int|double|string|bool\\s+(\\w+)\\s*(?=\\()"); rule.format = Function; highlightingRules.append(rule); // Number highlighting rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); // Keyword highlighting QStringList keywordPatterns = { "\\bvoid\\b", "\\bint\\b", "\\bdouble\\b", "\\bstring\\b", "\\bbool\\b", "\\bclass\\b", "\\bif\\b", "\\belse\\b", "\\bwhile\\b", "\\bfor\\b", "\\breturn\\b", "\\bnew\\b", "\\bthis\\b", "\\bpublic\\b", "\\bprivate\\b", "\\bprotected\\b", "\\bstatic\\b", "\\btrue\\b", "\\bfalse\\b", "\\bnull\\b", "\\bnamespace\\b", "\\busing\\b", "\\btry\\b", "\\bcatch\\b", "\\bfinally\\b", "\\bthrow\\b", "\\bvar\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } // String highlighting rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); // Single-line comment highlighting rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; highlightingRules.append(rule); // Multi-line comment highlighting rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector cppHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[a-zA-Z_][a-zA-Z0-9_]*\\s+(\\w+)\\s*\\("); rule.format = Function; highlightingRules.append(rule); rule.pattern = 
QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bauto\\b", "\\bbool\\b", "\\bbreak\\b", "\\bcase\\b", "\\bcatch\\b", "\\bchar\\b", "\\bclass\\b", "\\bconst\\b", "\\bconstexpr\\b", "\\bcontinue\\b", "\\bdefault\\b", "\\bdelete\\b", "\\bdo\\b", "\\bdouble\\b", "\\belse\\b", "\\belifdef\\b", "\\belifndef\\b", "\\bembed\\b", "\\benum\\b", "\\bexplicit\\b", "\\bextern\\b", "\\bfalse\\b", "\\bfloat\\b", "\\bfor\\b", "\\bfriend\\b", "\\bgoto\\b", "\\bif\\b", "\\binline\\b", "\\bint\\b", "\\blong\\b", "\\bmutable\\b", "\\bnamespace\\b", "\\bnew\\b", "\\bnoexcept\\b", "\\bnullptr\\b", "\\boperator\\b", "\\boverride\\b", "\\bprivate\\b", "\\bprotected\\b", "\\bpublic\\b", "\\bregister\\b", "\\breinterpret_cast\\b", "\\breturn\\b", "\\bshort\\b", "\\bsigned\\b", "\\bsizeof\\b", "\\bstatic\\b", "\\bstatic_assert\\b", "\\bstatic_cast\\b", "\\bstruct\\b", "\\bswitch\\b", "\\btemplate\\b", "\\bthis\\b", "\\bthrow\\b", "\\btrue\\b", "\\btry\\b", "\\btypedef\\b", "\\btypeid\\b", "\\btypename\\b", "\\bunion\\b", "\\bunsigned\\b", "\\busing\\b", "\\bvirtual\\b", "\\bvoid\\b", "\\bvolatile\\b", "\\bwchar_t\\b", "\\bwhile\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\'.*?\'"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("#(?:include|define|undef|ifdef|ifndef|if|else|elif|endif|error|pragma)\\b.*"); rule.format = Preprocessor; highlightingRules.append(rule); } return highlightingRules; } static QVector 
typescriptHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\bfunction\\s+(\\w+)\\b"); rule.format = Function; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bfunction\\b", "\\bvar\\b", "\\blet\\b", "\\bconst\\b", "\\bif\\b", "\\belse\\b", "\\bfor\\b", "\\bwhile\\b", "\\breturn\\b", "\\btry\\b", "\\bcatch\\b", "\\bfinally\\b", "\\bthrow\\b", "\\bnew\\b", "\\bdelete\\b", "\\btypeof\\b", "\\binstanceof\\b", "\\bdo\\b", "\\bswitch\\b", "\\bcase\\b", "\\bbreak\\b", "\\bcontinue\\b", "\\bpublic\\b", "\\bprivate\\b", "\\bprotected\\b", "\\bstatic\\b", "\\breadonly\\b", "\\benum\\b", "\\binterface\\b", "\\bextends\\b", "\\bimplements\\b", "\\bexport\\b", "\\bimport\\b", "\\btype\\b", "\\bnamespace\\b", "\\babstract\\b", "\\bas\\b", "\\basync\\b", "\\bawait\\b", "\\bclass\\b", "\\bconstructor\\b", "\\bget\\b", "\\bset\\b", "\\bnull\\b", "\\bundefined\\b", "\\btrue\\b", "\\bfalse\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } QStringList typePatterns = { "\\bstring\\b", "\\bnumber\\b", "\\bboolean\\b", "\\bany\\b", "\\bvoid\\b", "\\bnever\\b", "\\bunknown\\b", "\\bObject\\b", "\\bArray\\b" }; for (const QString &pattern : typePatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Type; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\"|'.*?'|`.*?`"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; 
highlightingRules.append(rule); rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("=>"); rule.format = Arrow; highlightingRules.append(rule); } return highlightingRules; } static QVector javaHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\bvoid\\s+(\\w+)\\b"); rule.format = Function; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bpublic\\b", "\\bprivate\\b", "\\bprotected\\b", "\\bstatic\\b", "\\bfinal\\b", "\\bclass\\b", "\\bif\\b", "\\belse\\b", "\\bwhile\\b", "\\bfor\\b", "\\breturn\\b", "\\bnew\\b", "\\bimport\\b", "\\bpackage\\b", "\\btry\\b", "\\bcatch\\b", "\\bthrow\\b", "\\bthrows\\b", "\\bfinally\\b", "\\binterface\\b", "\\bextends\\b", "\\bimplements\\b", "\\bsuper\\b", "\\bthis\\b", "\\bvoid\\b", "\\bboolean\\b", "\\bbyte\\b", "\\bchar\\b", "\\bdouble\\b", "\\bfloat\\b", "\\bint\\b", "\\blong\\b", "\\bshort\\b", "\\bswitch\\b", "\\bcase\\b", "\\bdefault\\b", "\\bcontinue\\b", "\\bbreak\\b", "\\babstract\\b", "\\bassert\\b", "\\benum\\b", "\\binstanceof\\b", "\\bnative\\b", "\\bstrictfp\\b", "\\bsynchronized\\b", "\\btransient\\b", "\\bvolatile\\b", "\\bconst\\b", "\\bgoto\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\'.*?\'"); rule.format = String; highlightingRules.append(rule); 
rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector goHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\bfunc\\s+(\\w+)\\b"); rule.format = Function; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bfunc\\b", "\\bpackage\\b", "\\bimport\\b", "\\bvar\\b", "\\bconst\\b", "\\btype\\b", "\\bstruct\\b", "\\binterface\\b", "\\bfor\\b", "\\bif\\b", "\\belse\\b", "\\bswitch\\b", "\\bcase\\b", "\\bdefault\\b", "\\breturn\\b", "\\bbreak\\b", "\\bcontinue\\b", "\\bgoto\\b", "\\bfallthrough\\b", "\\bdefer\\b", "\\bchan\\b", "\\bmap\\b", "\\brange\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("`.*?`"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector bashHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; 
highlightingRules.append(rule); QStringList commandPatterns = { "\\b(grep|awk|sed|ls|cat|echo|rm|mkdir|cp|break|alias|eval|cd|exec|head|tail|strings|printf|touch|mv|chmod)\\b" }; for (const QString &pattern : commandPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Command; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bif\\b", "\\bthen\\b", "\\belse\\b", "\\bfi\\b", "\\bfor\\b", "\\bin\\b", "\\bdo\\b", "\\bdone\\b", "\\bwhile\\b", "\\buntil\\b", "\\bcase\\b", "\\besac\\b", "\\bfunction\\b", "\\breturn\\b", "\\blocal\\b", "\\bdeclare\\b", "\\bunset\\b", "\\bexport\\b", "\\breadonly\\b", "\\bshift\\b", "\\bexit\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern = QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\'.*?\'"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\$(\\w+|\\{[^}]+\\})"); rule.format = Variable; highlightingRules.append(rule); rule.pattern = QRegularExpression("#[^\n]*"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector latexHighlightingRules() { static QVector highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\\\[A-Za-z]+"); // Pattern for LaTeX commands rule.format = Command; highlightingRules.append(rule); rule.pattern = QRegularExpression("%[^\n]*"); // Pattern for LaTeX comments rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector htmlHighlightingRules() { static QVector highlightingRules; if 
(highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*="); rule.format = AttributeName; highlightingRules.append(rule); rule.pattern = QRegularExpression("\".*?\"|'.*?'"); rule.format = AttributeValue; highlightingRules.append(rule); rule.pattern = QRegularExpression("<!--.*?-->"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("&[a-zA-Z0-9#]*;"); rule.format = SpecialCharacter; highlightingRules.append(rule); rule.pattern = QRegularExpression("<!DOCTYPE[^>]*>"); rule.format = DocType; highlightingRules.append(rule); } return highlightingRules; } static QVector<HighlightingRule> phpHighlightingRules() { static QVector<HighlightingRule> highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b(\\w+)\\s*(?=\\()"); rule.format = FunctionCall; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\bfunction\\s+(\\w+)\\b"); rule.format = Function; highlightingRules.append(rule); rule.pattern = QRegularExpression("\\b[0-9]*\\.?[0-9]+\\b"); rule.format = Number; highlightingRules.append(rule); QStringList keywordPatterns = { "\\bif\\b", "\\belse\\b", "\\belseif\\b", "\\bwhile\\b", "\\bfor\\b", "\\bforeach\\b", "\\breturn\\b", "\\bprint\\b", "\\binclude\\b", "\\brequire\\b", "\\binclude_once\\b", "\\brequire_once\\b", "\\btry\\b", "\\bcatch\\b", "\\bfinally\\b", "\\bcontinue\\b", "\\bbreak\\b", "\\bclass\\b", "\\bfunction\\b", "\\bnew\\b", "\\bthrow\\b", "\\barray\\b", "\\bpublic\\b", "\\bprivate\\b", "\\bprotected\\b", "\\bstatic\\b", "\\bglobal\\b", "\\bisset\\b", "\\bunset\\b", "\\bnull\\b", "\\btrue\\b", "\\bfalse\\b" }; for (const QString &pattern : keywordPatterns) { rule.pattern = QRegularExpression(pattern); rule.format = Keyword; highlightingRules.append(rule); } rule.pattern
= QRegularExpression("\".*?\""); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("\'.*?\'"); rule.format = String; highlightingRules.append(rule); rule.pattern = QRegularExpression("//[^\n]*"); rule.format = Comment; highlightingRules.append(rule); rule.pattern = QRegularExpression("/\\*.*?\\*/"); rule.format = Comment; highlightingRules.append(rule); } return highlightingRules; } static QVector<HighlightingRule> jsonHighlightingRules() { static QVector<HighlightingRule> highlightingRules; if (highlightingRules.isEmpty()) { HighlightingRule rule; rule.pattern = QRegularExpression(".*"); rule.format = Default; highlightingRules.append(rule); // Key string rule rule.pattern = QRegularExpression("\".*?\":"); // keys are typically in the "key": format rule.format = Key; highlightingRules.append(rule); // Value string rule rule.pattern = QRegularExpression(":\\s*(\".*?\")"); // values are typically in the : "value" format rule.format = Value; highlightingRules.append(rule); } return highlightingRules; } SyntaxHighlighter::SyntaxHighlighter(QObject *parent) : QSyntaxHighlighter(parent) { } SyntaxHighlighter::~SyntaxHighlighter() { } void SyntaxHighlighter::highlightBlock(const QString &text) { QTextBlock block = this->currentBlock(); // Search the first block of the frame we're in for the code to use for highlighting int userState = block.userState(); if (QTextFrame *frame = block.document()->frameAt(block.position())) { QTextBlock firstBlock = frame->begin().currentBlock(); if (firstBlock.isValid()) userState = firstBlock.userState(); } QVector<HighlightingRule> rules; switch (userState) { case Python: rules = pythonHighlightingRules(); break; case Cpp: rules = cppHighlightingRules(); break; case Csharp: rules = csharpHighlightingRules(); break; case Bash: rules = bashHighlightingRules(); break; case TypeScript: rules = typescriptHighlightingRules(); break; case Java: rules = javaHighlightingRules(); break; case Go: rules = goHighlightingRules(); break; case Json: rules =
jsonHighlightingRules(); break; case Latex: rules = latexHighlightingRules(); break; case Html: rules = htmlHighlightingRules(); break; case Php: rules = phpHighlightingRules(); break; default: break; } for (const HighlightingRule &rule : std::as_const(rules)) { QRegularExpressionMatchIterator matchIterator = rule.pattern.globalMatch(text); while (matchIterator.hasNext()) { QRegularExpressionMatch match = matchIterator.next(); int startIndex = match.capturedStart(); int length = match.capturedLength(); QTextCharFormat format; format.setForeground(formatToColor(rule.format, m_codeColors)); setFormat(startIndex, length, format); } } } // TODO (Adam) This class replaces characters in the text in order to provide markup and syntax highlighting // which destroys the original text in favor of the replaced text. This is a problem when we select // text and then the user tries to 'copy' the text: the original text should be placed in the clipboard // not the replaced text. A possible solution is to have this class keep a mapping of the original // indices and the replacement indices and then use the original text that is stored in memory in the // chat class to populate the clipboard. 
ChatViewTextProcessor::ChatViewTextProcessor(QObject *parent) : QObject{parent} , m_quickTextDocument(nullptr) , m_syntaxHighlighter(new SyntaxHighlighter(this)) , m_shouldProcessText(true) , m_fontPixelSize(QGuiApplication::font().pointSizeF()) { } QQuickTextDocument* ChatViewTextProcessor::textDocument() const { return m_quickTextDocument; } void ChatViewTextProcessor::setTextDocument(QQuickTextDocument* quickTextDocument) { m_quickTextDocument = quickTextDocument; m_syntaxHighlighter->setDocument(m_quickTextDocument->textDocument()); handleTextChanged(); } void ChatViewTextProcessor::setValue(const QString &value) { m_quickTextDocument->textDocument()->setPlainText(value); handleTextChanged(); } bool ChatViewTextProcessor::tryCopyAtPosition(int position) const { for (const auto &copy : m_copies) { if (position >= copy.startPos && position <= copy.endPos) { QClipboard *clipboard = QGuiApplication::clipboard(); clipboard->setText(copy.text); return true; } } return false; } bool ChatViewTextProcessor::shouldProcessText() const { return m_shouldProcessText; } void ChatViewTextProcessor::setShouldProcessText(bool b) { if (m_shouldProcessText == b) return; m_shouldProcessText = b; emit shouldProcessTextChanged(); handleTextChanged(); } qreal ChatViewTextProcessor::fontPixelSize() const { return m_fontPixelSize; } void ChatViewTextProcessor::setFontPixelSize(qreal sz) { if (m_fontPixelSize == sz) return; m_fontPixelSize = sz; emit fontPixelSizeChanged(); handleTextChanged(); } CodeColors ChatViewTextProcessor::codeColors() const { return m_syntaxHighlighter->codeColors(); } void ChatViewTextProcessor::setCodeColors(const CodeColors &colors) { m_syntaxHighlighter->setCodeColors(colors); emit codeColorsChanged(); } void traverseDocument(QTextDocument *doc, QTextFrame *frame) { QTextFrame *rootFrame = frame ?
frame : doc->rootFrame(); QTextFrame::iterator rootIt; if (!frame) qDebug() << "Begin traverse"; for (rootIt = rootFrame->begin(); !rootIt.atEnd(); ++rootIt) { QTextFrame *childFrame = rootIt.currentFrame(); QTextBlock childBlock = rootIt.currentBlock(); if (childFrame) { qDebug() << "Frame from" << childFrame->firstPosition() << "to" << childFrame->lastPosition(); traverseDocument(doc, childFrame); } else if (childBlock.isValid()) { qDebug() << QString(" Block %1 position:").arg(childBlock.userState()) << childBlock.position(); qDebug() << QString(" Block %1 text:").arg(childBlock.userState()) << childBlock.text(); // Iterate over lines within the block for (QTextBlock::iterator blockIt = childBlock.begin(); !(blockIt.atEnd()); ++blockIt) { QTextFragment fragment = blockIt.fragment(); if (fragment.isValid()) { qDebug() << " Fragment text:" << fragment.text(); } } } } if (!frame) qDebug() << "End traverse"; } void ChatViewTextProcessor::handleTextChanged() { if (!m_quickTextDocument || !m_shouldProcessText) return; // Force full layout of the text document to work around a bug in Qt // TODO(jared): report the Qt bug and link to the report here QTextDocument* doc = m_quickTextDocument->textDocument(); (void)doc->documentLayout()->documentSize(); handleCodeBlocks(); handleMarkdown(); // We insert an invisible char at the end to make sure the document goes back to the default // text format QTextCursor cursor(doc); QString invisibleCharacter = QString(QChar(0xFEFF)); cursor.insertText(invisibleCharacter, QTextCharFormat()); } void ChatViewTextProcessor::handleCodeBlocks() { QTextDocument* doc = m_quickTextDocument->textDocument(); QTextCursor cursor(doc); QTextCharFormat textFormat; textFormat.setFontFamilies(QStringList() << "Monospace"); textFormat.setForeground(QColor("white")); QTextFrameFormat frameFormatBase; frameFormatBase.setBackground(codeColors().backgroundColor); QTextTableFormat tableFormat; tableFormat.setMargin(0); tableFormat.setPadding(0); 
tableFormat.setBorder(0); tableFormat.setBorderCollapse(true); QList<QTextLength> constraints; constraints << QTextLength(QTextLength::PercentageLength, 100); tableFormat.setColumnWidthConstraints(constraints); QTextTableFormat headerTableFormat; headerTableFormat.setBackground(codeColors().headerColor); headerTableFormat.setPadding(0); headerTableFormat.setBorder(0); headerTableFormat.setBorderCollapse(true); headerTableFormat.setTopMargin(10); headerTableFormat.setBottomMargin(10); headerTableFormat.setLeftMargin(15); headerTableFormat.setRightMargin(15); QList<QTextLength> headerConstraints; headerConstraints << QTextLength(QTextLength::PercentageLength, 80); headerConstraints << QTextLength(QTextLength::PercentageLength, 20); headerTableFormat.setColumnWidthConstraints(headerConstraints); QTextTableFormat codeBlockTableFormat; codeBlockTableFormat.setBackground(codeColors().backgroundColor); codeBlockTableFormat.setPadding(0); codeBlockTableFormat.setBorder(0); codeBlockTableFormat.setBorderCollapse(true); codeBlockTableFormat.setTopMargin(15); codeBlockTableFormat.setBottomMargin(15); codeBlockTableFormat.setLeftMargin(15); codeBlockTableFormat.setRightMargin(15); codeBlockTableFormat.setColumnWidthConstraints(constraints); QTextImageFormat copyImageFormat; copyImageFormat.setWidth(24); copyImageFormat.setHeight(24); copyImageFormat.setName("qrc:/gpt4all/icons/copy.svg"); // Regex for code blocks static const QRegularExpression reCode("```(.*?)(```|$)", QRegularExpression::DotMatchesEverythingOption); QRegularExpressionMatchIterator iCode = reCode.globalMatch(doc->toPlainText()); QList<QRegularExpressionMatch> matchesCode; while (iCode.hasNext()) matchesCode.append(iCode.next()); QVector<CodeCopy> newCopies; QVector<QTextFrame *> frames; for(int index = matchesCode.count() - 1; index >= 0; --index) { cursor.setPosition(matchesCode[index].capturedStart()); cursor.setPosition(matchesCode[index].capturedEnd(), QTextCursor::KeepAnchor); cursor.removeSelectedText(); QTextFrameFormat frameFormat = frameFormatBase; QString capturedText =
matchesCode[index].captured(1); QString codeLanguage; QStringList lines = capturedText.split('\n'); if (lines.last().isEmpty()) { lines.removeLast(); } if (lines.count() >= 2) { const auto &firstWord = lines.first(); if (firstWord == "python" || firstWord == "cpp" || firstWord == "c++" || firstWord == "csharp" || firstWord == "c#" || firstWord == "c" || firstWord == "bash" || firstWord == "javascript" || firstWord == "typescript" || firstWord == "java" || firstWord == "go" || firstWord == "golang" || firstWord == "json" || firstWord == "latex" || firstWord == "html" || firstWord == "php") { codeLanguage = firstWord; } lines.removeFirst(); } QTextFrame *mainFrame = cursor.currentFrame(); cursor.setCharFormat(textFormat); cursor.insertFrame(frameFormat); QTextTable *table = cursor.insertTable(codeLanguage.isEmpty() ? 1 : 2, 1, tableFormat); if (!codeLanguage.isEmpty()) { QTextTableCell headerCell = table->cellAt(0, 0); QTextCursor headerCellCursor = headerCell.firstCursorPosition(); QTextTable *headerTable = headerCellCursor.insertTable(1, 2, headerTableFormat); QTextTableCell header = headerTable->cellAt(0, 0); QTextCursor headerCursor = header.firstCursorPosition(); headerCursor.insertText(codeLanguage); QTextTableCell copy = headerTable->cellAt(0, 1); QTextCursor copyCursor = copy.firstCursorPosition(); CodeCopy newCopy; newCopy.text = lines.join("\n"); newCopy.startPos = copyCursor.position(); newCopy.endPos = newCopy.startPos + 1; newCopies.append(newCopy); // FIXME: There are two reasons this is commented out. Odd drawing behavior is seen when this is added // and one selects with the mouse the code language in a code block. The other reason is the code that // tries to do a hit test for the image is just very broken and buggy and does not always work. 
So I'm // disabling this code and included functionality for v3.0.0 until I can figure out how to make this much // less buggy #if 0 // QTextBlockFormat blockFormat; // blockFormat.setAlignment(Qt::AlignRight); // copyCursor.setBlockFormat(blockFormat); // copyCursor.insertImage(copyImageFormat, QTextFrameFormat::FloatRight); #endif } QTextTableCell codeCell = table->cellAt(codeLanguage.isEmpty() ? 0 : 1, 0); QTextCursor codeCellCursor = codeCell.firstCursorPosition(); QTextTable *codeTable = codeCellCursor.insertTable(1, 1, codeBlockTableFormat); QTextTableCell code = codeTable->cellAt(0, 0); QTextCharFormat codeBlockCharFormat; codeBlockCharFormat.setForeground(codeColors().defaultColor); QFont monospaceFont("Courier"); monospaceFont.setPointSize(m_fontPixelSize); if (monospaceFont.family() != "Courier") { monospaceFont.setFamily("Monospace"); // Fallback if Courier isn't available } QTextCursor codeCursor = code.firstCursorPosition(); codeBlockCharFormat.setFont(monospaceFont); // Update the font for the codeblock codeCursor.setCharFormat(codeBlockCharFormat); codeCursor.block().setUserState(stringToLanguage(codeLanguage)); codeCursor.insertText(lines.join('\n')); cursor = mainFrame->lastCursorPosition(); cursor.setCharFormat(QTextCharFormat()); } m_copies = newCopies; } void replaceAndInsertMarkdown(int startIndex, int endIndex, QTextDocument *doc) { QTextCursor cursor(doc); cursor.setPosition(startIndex); cursor.setPosition(endIndex, QTextCursor::KeepAnchor); QTextDocumentFragment fragment(cursor); const QString plainText = fragment.toPlainText(); cursor.removeSelectedText(); QTextDocument::MarkdownFeatures features = static_cast<QTextDocument::MarkdownFeatures>( QTextDocument::MarkdownNoHTML | QTextDocument::MarkdownDialectGitHub); cursor.insertMarkdown(plainText, features); cursor.block().setUserState(Markdown); } void ChatViewTextProcessor::handleMarkdown() { QTextDocument* doc = m_quickTextDocument->textDocument(); QTextCursor cursor(doc); QVector<QPair<int, int>> codeBlockPositions; QTextFrame
*rootFrame = doc->rootFrame(); QTextFrame::iterator rootIt; bool hasAlreadyProcessedMarkdown = false; for (rootIt = rootFrame->begin(); !rootIt.atEnd(); ++rootIt) { QTextFrame *childFrame = rootIt.currentFrame(); QTextBlock childBlock = rootIt.currentBlock(); if (childFrame) { codeBlockPositions.append(qMakePair(childFrame->firstPosition()-1, childFrame->lastPosition()+1)); for (QTextFrame::iterator frameIt = childFrame->begin(); !frameIt.atEnd(); ++frameIt) { QTextBlock block = frameIt.currentBlock(); if (block.isValid() && block.userState() == Markdown) hasAlreadyProcessedMarkdown = true; } } else if (childBlock.isValid() && childBlock.userState() == Markdown) hasAlreadyProcessedMarkdown = true; } if (!hasAlreadyProcessedMarkdown) { std::sort(codeBlockPositions.begin(), codeBlockPositions.end(), [](const QPair<int, int> &a, const QPair<int, int> &b) { return a.first > b.first; }); int lastIndex = doc->characterCount() - 1; for (const auto &pos : codeBlockPositions) { int nonCodeStart = pos.second; int nonCodeEnd = lastIndex; if (nonCodeEnd > nonCodeStart) { replaceAndInsertMarkdown(nonCodeStart, nonCodeEnd, doc); } lastIndex = pos.first; } if (lastIndex > 0) replaceAndInsertMarkdown(0, lastIndex, doc); } } ================================================ FILE: gpt4all-chat/src/chatviewtextprocessor.h ================================================ #ifndef CHATVIEWTEXTPROCESSOR_H #define CHATVIEWTEXTPROCESSOR_H #include #include #include // IWYU pragma: keep #include #include #include #include // IWYU pragma: keep #include // IWYU pragma: no_forward_declare QQuickTextDocument struct CodeColors { Q_GADGET Q_PROPERTY(QColor defaultColor MEMBER defaultColor) Q_PROPERTY(QColor keywordColor MEMBER keywordColor) Q_PROPERTY(QColor functionColor MEMBER functionColor) Q_PROPERTY(QColor functionCallColor MEMBER functionCallColor) Q_PROPERTY(QColor commentColor MEMBER commentColor) Q_PROPERTY(QColor stringColor MEMBER stringColor) Q_PROPERTY(QColor numberColor MEMBER numberColor)
Q_PROPERTY(QColor headerColor MEMBER headerColor) Q_PROPERTY(QColor backgroundColor MEMBER backgroundColor) public: QColor defaultColor; QColor keywordColor; QColor functionColor; QColor functionCallColor; QColor commentColor; QColor stringColor; QColor numberColor; QColor headerColor; QColor backgroundColor; QColor preprocessorColor = keywordColor; QColor typeColor = numberColor; QColor arrowColor = functionColor; QColor commandColor = functionCallColor; QColor variableColor = numberColor; QColor keyColor = functionColor; QColor valueColor = stringColor; QColor parameterColor = stringColor; QColor attributeNameColor = numberColor; QColor attributeValueColor = stringColor; QColor specialCharacterColor = functionColor; QColor doctypeColor = commentColor; }; Q_DECLARE_METATYPE(CodeColors) class SyntaxHighlighter : public QSyntaxHighlighter { Q_OBJECT public: SyntaxHighlighter(QObject *parent); ~SyntaxHighlighter(); void highlightBlock(const QString &text) override; CodeColors codeColors() const { return m_codeColors; } void setCodeColors(const CodeColors &colors) { m_codeColors = colors; } private: CodeColors m_codeColors; }; struct ContextLink { int startPos = -1; int endPos = -1; QString text; QString href; }; struct CodeCopy { int startPos = -1; int endPos = -1; QString text; }; class ChatViewTextProcessor : public QObject { Q_OBJECT Q_PROPERTY(QQuickTextDocument* textDocument READ textDocument WRITE setTextDocument NOTIFY textDocumentChanged()) Q_PROPERTY(bool shouldProcessText READ shouldProcessText WRITE setShouldProcessText NOTIFY shouldProcessTextChanged()) Q_PROPERTY(qreal fontPixelSize READ fontPixelSize WRITE setFontPixelSize NOTIFY fontPixelSizeChanged()) Q_PROPERTY(CodeColors codeColors READ codeColors WRITE setCodeColors NOTIFY codeColorsChanged()) QML_ELEMENT public: explicit ChatViewTextProcessor(QObject *parent = nullptr); QQuickTextDocument* textDocument() const; void setTextDocument(QQuickTextDocument* textDocument); Q_INVOKABLE void setValue(const 
QString &value); Q_INVOKABLE bool tryCopyAtPosition(int position) const; bool shouldProcessText() const; void setShouldProcessText(bool b); qreal fontPixelSize() const; void setFontPixelSize(qreal b); CodeColors codeColors() const; void setCodeColors(const CodeColors &colors); Q_SIGNALS: void textDocumentChanged(); void shouldProcessTextChanged(); void fontPixelSizeChanged(); void codeColorsChanged(); private Q_SLOTS: void handleTextChanged(); void handleCodeBlocks(); void handleMarkdown(); private: QQuickTextDocument *m_quickTextDocument; SyntaxHighlighter *m_syntaxHighlighter; QVector<ContextLink> m_links; QVector<CodeCopy> m_copies; bool m_shouldProcessText = false; qreal m_fontPixelSize; }; #endif // CHATVIEWTEXTPROCESSOR_H ================================================ FILE: gpt4all-chat/src/codeinterpreter.cpp ================================================ #include "codeinterpreter.h" #include #include #include #include // IWYU pragma: keep #include #include #include using namespace Qt::Literals::StringLiterals; CodeInterpreter::CodeInterpreter() : Tool() , m_error(ToolEnums::Error::NoError) { m_worker = new CodeInterpreterWorker; connect(this, &CodeInterpreter::request, m_worker, &CodeInterpreterWorker::request, Qt::QueuedConnection); } void CodeInterpreter::run(const QList<ToolParam> &params) { m_error = ToolEnums::Error::NoError; m_errorString = QString(); Q_ASSERT(params.count() == 1 && params.first().name == "code" && params.first().type == ToolEnums::ParamType::String); const QString code = params.first().value.toString(); connect(m_worker, &CodeInterpreterWorker::finished, [this, params] { m_error = m_worker->error(); m_errorString = m_worker->errorString(); emit runComplete({ ToolCallConstants::CodeInterpreterFunction, params, m_worker->response(), m_error, m_errorString }); }); emit request(code); } bool CodeInterpreter::interrupt() { return m_worker->interrupt(); } QList<ToolParamInfo> CodeInterpreter::parameters() const { return {{ "code", ToolEnums::ParamType::String, "javascript code to
compute", true }}; } QString CodeInterpreter::symbolicFormat() const { return "{human readable plan to complete the task}\n" + ToolCallConstants::CodeInterpreterPrefix + "{code}\n" + ToolCallConstants::CodeInterpreterSuffix; } QString CodeInterpreter::examplePrompt() const { return R"(Write code to check if a number is prime, use that to see if the number 7 is prime)"; } QString CodeInterpreter::exampleCall() const { static const QString example = R"(function isPrime(n) { if (n <= 1) { return false; } for (let i = 2; i <= Math.sqrt(n); i++) { if (n % i === 0) { return false; } } return true; } const number = 7; console.log(`The number ${number} is prime: ${isPrime(number)}`); )"; return "Certainly! Let's compute the answer to whether the number 7 is prime.\n" + ToolCallConstants::CodeInterpreterPrefix + example + ToolCallConstants::CodeInterpreterSuffix; } QString CodeInterpreter::exampleReply() const { return R"("The computed result shows that 7 is a prime number.)"; } CodeInterpreterWorker::CodeInterpreterWorker() : QObject(nullptr) , m_engine(new QJSEngine(this)) { moveToThread(&m_thread); QJSValue consoleInternalObject = m_engine->newQObject(&m_consoleCapture); m_engine->globalObject().setProperty("console_internal", consoleInternalObject); // preprocess console.log args in JS since Q_INVOKE doesn't support varargs auto consoleObject = m_engine->evaluate(uR"( class Console { log(...args) { if (args.length == 0) return; if (args.length >= 2 && typeof args[0] === 'string') throw new Error('console.log string formatting not supported'); let cat = ''; for (const arg of args) { cat += String(arg); } console_internal.log(cat); } } new Console(); )"_s); m_engine->globalObject().setProperty("console", consoleObject); m_thread.start(); } void CodeInterpreterWorker::reset() { m_response.clear(); m_error = ToolEnums::Error::NoError; m_errorString.clear(); m_consoleCapture.output.clear(); m_engine->setInterrupted(false); } void CodeInterpreterWorker::request(const QString 
&code) { reset(); const QJSValue result = m_engine->evaluate(code); QString resultString; if (m_engine->isInterrupted()) { resultString = QString("Error: code execution was interrupted or timed out."); } else if (result.isError()) { // NOTE: We purposely do not set the m_error or m_errorString for the code interpreter since // we *want* the model to see the response has an error so it can hopefully correct itself. The // error member variables are intended for tools that have error conditions that cannot be corrected. // For instance, a tool depending upon the network might set these error variables if the network // is not available. const QStringList lines = code.split('\n'); const int line = result.property("lineNumber").toInt(); const int index = line - 1; const QString lineContent = (index >= 0 && index < lines.size()) ? lines.at(index) : "Line not found in code."; resultString = QString("Uncaught exception at line %1: %2\n\t%3") .arg(line) .arg(result.toString()) .arg(lineContent); m_error = ToolEnums::Error::UnknownError; m_errorString = resultString; } else { resultString = result.isUndefined() ? 
QString() : result.toString(); } if (resultString.isEmpty()) resultString = m_consoleCapture.output; else if (!m_consoleCapture.output.isEmpty()) resultString += "\n" + m_consoleCapture.output; m_response = resultString; emit finished(); } bool CodeInterpreterWorker::interrupt() { m_error = ToolEnums::Error::TimeoutError; m_engine->setInterrupted(true); return true; } ================================================ FILE: gpt4all-chat/src/codeinterpreter.h ================================================ #ifndef CODEINTERPRETER_H #define CODEINTERPRETER_H #include "tool.h" #include "toolcallparser.h" #include #include #include #include class QJSEngine; class JavaScriptConsoleCapture : public QObject { Q_OBJECT public: QString output; Q_INVOKABLE void log(const QString &message) { const int maxLength = 1024; if (output.length() >= maxLength) return; if (output.length() + message.length() + 1 > maxLength) { static const QString trunc = "\noutput truncated at " + QString::number(maxLength) + " characters..."; int remainingLength = maxLength - output.length(); if (remainingLength > 0) output.append(message.left(remainingLength)); output.append(trunc); Q_ASSERT(output.length() > maxLength); } else { output.append(message + "\n"); } } }; class CodeInterpreterWorker : public QObject { Q_OBJECT public: CodeInterpreterWorker(); virtual ~CodeInterpreterWorker() {} void reset(); QString response() const { return m_response; } ToolEnums::Error error() const { return m_error; } QString errorString() const { return m_errorString; } bool interrupt(); public Q_SLOTS: void request(const QString &code); Q_SIGNALS: void finished(); private: QString m_response; ToolEnums::Error m_error = ToolEnums::Error::NoError; QString m_errorString; QThread m_thread; JavaScriptConsoleCapture m_consoleCapture; QJSEngine *m_engine = nullptr; }; class CodeInterpreter : public Tool { Q_OBJECT public: explicit CodeInterpreter(); virtual ~CodeInterpreter() {} void run(const QList<ToolParam> &params) override; bool
interrupt() override; ToolEnums::Error error() const override { return m_error; } QString errorString() const override { return m_errorString; } QString name() const override { return tr("Code Interpreter"); } QString description() const override { return tr("compute javascript code using console.log as output"); } QString function() const override { return ToolCallConstants::CodeInterpreterFunction; } QList<ToolParamInfo> parameters() const override; virtual QString symbolicFormat() const override; QString examplePrompt() const override; QString exampleCall() const override; QString exampleReply() const override; Q_SIGNALS: void request(const QString &code); private: ToolEnums::Error m_error = ToolEnums::Error::NoError; QString m_errorString; CodeInterpreterWorker *m_worker; }; #endif // CODEINTERPRETER_H ================================================ FILE: gpt4all-chat/src/config.h.in ================================================ #pragma once #define APP_VERSION "@APP_VERSION@" #define G4A_CONFIG(name) (1/G4A_CONFIG_##name == 1) #define G4A_CONFIG_force_d3d12 @GPT4ALL_CONFIG_FORCE_D3D12@ ================================================ FILE: gpt4all-chat/src/database.cpp ================================================ #include "database.h" #include "mysettings.h" #include "utils.h" // IWYU pragma: keep #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef GPT4ALL_USE_QTPDF # include # include #else # include # include # include #endif using namespace Qt::Literals::StringLiterals; namespace ranges = std::ranges; namespace us = unum::usearch; //#define DEBUG //#define DEBUG_EXAMPLE namespace { /* QFile that checks input for binary data. If seen, it fails the read and returns true * for binarySeen().
*/ class BinaryDetectingFile: public QFile { public: using QFile::QFile; bool binarySeen() const { return m_binarySeen; } protected: qint64 readData(char *data, qint64 maxSize) override { qint64 res = QFile::readData(data, maxSize); return checkData(data, res); } qint64 readLineData(char *data, qint64 maxSize) override { qint64 res = QFile::readLineData(data, maxSize); return checkData(data, res); } private: qint64 checkData(const char *data, qint64 size) { Q_ASSERT(!isTextModeEnabled()); // We need raw bytes from the underlying QFile if (size != -1 && !m_binarySeen) { for (qint64 i = 0; i < size; i++) { /* Control characters we should never see in plain text: * 0x00 NUL - 0x06 ACK * 0x0E SO - 0x1A SUB * 0x1C FS - 0x1F US */ auto c = static_cast<unsigned char>(data[i]); if (c < 0x07 || (c >= 0x0E && c < 0x1B) || (c >= 0x1C && c < 0x20)) { m_binarySeen = true; break; } } } return m_binarySeen ? -1 : size; } bool m_binarySeen = false; }; } // namespace static int s_batchSize = 100; static const QString INIT_DB_SQL[] = { // automatically free unused disk space u"pragma auto_vacuum = FULL;"_s, // create tables uR"( create table chunks( id integer primary key autoincrement, document_id integer not null, chunk_text text not null, file text not null, title text, author text, subject text, keywords text, page integer, line_from integer, line_to integer, words integer default 0 not null, tokens integer default 0 not null, foreign key(document_id) references documents(id) ); )"_s, uR"( create virtual table chunks_fts using fts5( id unindexed, document_id unindexed, chunk_text, file, title, author, subject, keywords, content='chunks', content_rowid='id', tokenize='porter' ); )"_s, uR"( create table collections( id integer primary key, name text unique not null, start_update_time integer, last_update_time integer, embedding_model text ); )"_s, uR"( create table folders( id integer primary key autoincrement, path text unique not null ); )"_s, uR"( create table collection_items( collection_id
integer not null, folder_id integer not null, foreign key(collection_id) references collections(id), foreign key(folder_id) references folders(id), unique(collection_id, folder_id) ); )"_s, uR"( create table documents( id integer primary key, folder_id integer not null, document_time integer not null, document_path text unique not null, foreign key(folder_id) references folders(id) ); )"_s, uR"( create table embeddings( model text not null, folder_id integer not null, chunk_id integer not null, embedding blob not null, primary key(model, folder_id, chunk_id), foreign key(folder_id) references folders(id), foreign key(chunk_id) references chunks(id), unique(model, chunk_id) ); )"_s, }; static const QString INSERT_CHUNK_SQL = uR"( insert into chunks(document_id, chunk_text, file, title, author, subject, keywords, page, line_from, line_to, words) values(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) returning id; )"_s; static const QString INSERT_CHUNK_FTS_SQL = uR"( insert into chunks_fts(document_id, chunk_text, file, title, author, subject, keywords) values(?, ?, ?, ?, ?, ?, ?); )"_s; static const QString SELECT_CHUNKED_DOCUMENTS_SQL[] = { uR"( select distinct document_id from chunks; )"_s, uR"( select distinct document_id from chunks_fts; )"_s, }; static const QString DELETE_CHUNKS_SQL[] = { uR"( delete from embeddings where chunk_id in ( select id from chunks where document_id = ?
); )"_s, uR"( delete from chunks where document_id = ?; )"_s, uR"( delete from chunks_fts where document_id = ?; )"_s, }; static const QString SELECT_CHUNKS_BY_DOCUMENT_SQL = uR"( select id from chunks WHERE document_id = ?; )"_s; static const QString SELECT_CHUNKS_SQL = uR"( select c.id, d.document_time, d.document_path, c.chunk_text, c.file, c.title, c.author, c.page, c.line_from, c.line_to, co.name from chunks c join documents d on d.id = c.document_id join folders f on f.id = d.folder_id join collection_items ci on ci.folder_id = f.id join collections co on co.id = ci.collection_id where c.id in (%1); )"_s; static const QString SELECT_UNCOMPLETED_CHUNKS_SQL = uR"( select co.name, co.embedding_model, c.id, d.folder_id, c.chunk_text from chunks c join documents d on d.id = c.document_id join folders f on f.id = d.folder_id join collection_items ci on ci.folder_id = f.id join collections co on co.id = ci.collection_id and co.embedding_model is not null where not exists( select 1 from embeddings e where e.chunk_id = c.id and e.model = co.embedding_model ); )"_s; static const QString SELECT_COUNT_CHUNKS_SQL = uR"( select count(c.id) from chunks c join documents d on d.id = c.document_id where d.folder_id = ?; )"_s; static const QString SELECT_CHUNKS_FTS_SQL = uR"( select fts.id, bm25(chunks_fts) as score from chunks_fts fts join documents d on fts.document_id = d.id join collection_items ci on d.folder_id = ci.folder_id join collections co on ci.collection_id = co.id where chunks_fts match ? 
and co.name in ('%1') order by score limit %2; )"_s; #define NAMED_PAIR(name, typea, a, typeb, b) \ struct name { typea a; typeb b; }; \ static bool operator==(const name &x, const name &y) { return x.a == y.a && x.b == y.b; } \ static size_t qHash(const name &x, size_t seed) { return qHashMulti(seed, x.a, x.b); } // struct compared by embedding key, can be extended with additional unique data NAMED_PAIR(EmbeddingKey, QString, embedding_model, int, chunk_id) namespace { struct IncompleteChunk: EmbeddingKey { int folder_id; QString text; }; } // namespace static bool selectAllUncompletedChunks(QSqlQuery &q, QHash<IncompleteChunk, QStringList> &chunks) { if (!q.exec(SELECT_UNCOMPLETED_CHUNKS_SQL)) return false; while (q.next()) { QString collection = q.value(0).toString(); IncompleteChunk ic { /*EmbeddingKey*/ { .embedding_model = q.value(1).toString(), .chunk_id = q.value(2).toInt(), }, /*folder_id =*/ q.value(3).toInt(), /*text =*/ q.value(4).toString(), }; chunks[ic] << collection; } return true; } static bool selectCountChunks(QSqlQuery &q, int folder_id, int &count) { if (!q.prepare(SELECT_COUNT_CHUNKS_SQL)) return false; q.addBindValue(folder_id); if (!q.exec()) return false; if (!q.next()) { count = 0; return false; } count = q.value(0).toInt(); return true; } static bool selectChunk(QSqlQuery &q, const QList<int> &chunk_ids) { QString chunk_ids_str = QString::number(chunk_ids[0]); for (size_t i = 1; i < chunk_ids.size(); ++i) chunk_ids_str += "," + QString::number(chunk_ids[i]); const QString formatted_query = SELECT_CHUNKS_SQL.arg(chunk_ids_str); if (!q.prepare(formatted_query)) return false; return q.exec(); } static const QString INSERT_COLLECTION_SQL = uR"( insert into collections(name, start_update_time, last_update_time, embedding_model) values(?, ?, ?, ?)
returning id; )"_s; static const QString SELECT_FOLDERS_FROM_COLLECTIONS_SQL = uR"( select f.id, f.path from collections c join collection_items ci on ci.collection_id = c.id join folders f on ci.folder_id = f.id where c.name = ?; )"_s; static const QString SELECT_COLLECTIONS_SQL_V1 = uR"( select c.collection_name, f.folder_path, f.id from collections c join folders f on c.folder_id = f.id order by c.collection_name asc, f.folder_path asc; )"_s; static const QString SELECT_COLLECTIONS_SQL_V2 = uR"( select c.id, c.name, f.path, f.id, c.start_update_time, c.last_update_time, c.embedding_model from collections c join collection_items ci on ci.collection_id = c.id join folders f on ci.folder_id = f.id order by c.name asc, f.path asc; )"_s; static const QString SELECT_COLLECTION_BY_NAME_SQL = uR"( select id, name, start_update_time, last_update_time, embedding_model from collections c where name = ?; )"_s; static const QString SET_COLLECTION_EMBEDDING_MODEL_SQL = uR"( update collections set embedding_model = ? where name = ?; )"_s; static const QString UPDATE_START_UPDATE_TIME_SQL = uR"( update collections set start_update_time = ? where id = ?; )"_s; static const QString UPDATE_LAST_UPDATE_TIME_SQL = uR"( update collections set last_update_time = ? 
where id = ?; )"_s; static const QString FTS_INTEGRITY_SQL = uR"( insert into chunks_fts(chunks_fts, rank) values('integrity-check', 1); )"_s; static const QString FTS_REBUILD_SQL = uR"( insert into chunks_fts(chunks_fts) values('rebuild'); )"_s; static bool addCollection(QSqlQuery &q, const QString &collection_name, const QDateTime &start_update, const QDateTime &last_update, const QString &embedding_model, CollectionItem &item) { if (!q.prepare(INSERT_COLLECTION_SQL)) return false; q.addBindValue(collection_name); q.addBindValue(start_update); q.addBindValue(last_update); q.addBindValue(embedding_model); if (!q.exec() || !q.next()) return false; item.collection_id = q.value(0).toInt(); item.collection = collection_name; item.embeddingModel = embedding_model; return true; } static bool selectFoldersFromCollection(QSqlQuery &q, const QString &collection_name, QList<QPair<int, QString>> *folders) { if (!q.prepare(SELECT_FOLDERS_FROM_COLLECTIONS_SQL)) return false; q.addBindValue(collection_name); if (!q.exec()) return false; while (q.next()) folders->append({q.value(0).toInt(), q.value(1).toString()}); return true; } static QList<CollectionItem> sqlExtractCollections(QSqlQuery &q, bool with_folder = false, int version = LOCALDOCS_VERSION) { QList<CollectionItem> collections; while (q.next()) { CollectionItem i; int idx = 0; if (version >= 2) i.collection_id = q.value(idx++).toInt(); i.collection = q.value(idx++).toString(); if (with_folder) { i.folder_path = q.value(idx++).toString(); i.folder_id = q.value(idx++).toInt(); } i.indexing = false; i.installed = true; if (version >= 2) { bool ok; const qint64 start_update = q.value(idx++).toLongLong(&ok); if (ok) i.startUpdate = QDateTime::fromMSecsSinceEpoch(start_update); const qint64 last_update = q.value(idx++).toLongLong(&ok); if (ok) i.lastUpdate = QDateTime::fromMSecsSinceEpoch(last_update); i.embeddingModel = q.value(idx++).toString(); } if (i.embeddingModel.isNull()) { // unknown embedding model -> need to re-index i.forceIndexing = true; } collections << i; }
return collections; }
static bool selectAllFromCollections(QSqlQuery &q, QList<CollectionItem> *collections, int version = LOCALDOCS_VERSION) { switch (version) { case 1: if (!q.prepare(SELECT_COLLECTIONS_SQL_V1)) return false; break; case 2: case 3: if (!q.prepare(SELECT_COLLECTIONS_SQL_V2)) return false; break; default: Q_UNREACHABLE(); return false; } if (!q.exec()) return false; *collections = sqlExtractCollections(q, true, version); return true; }
static bool selectCollectionByName(QSqlQuery &q, const QString &name, std::optional<CollectionItem> &collection) { if (!q.prepare(SELECT_COLLECTION_BY_NAME_SQL)) return false; q.addBindValue(name); if (!q.exec()) return false; QList<CollectionItem> collections = sqlExtractCollections(q); Q_ASSERT(collections.count() <= 1); collection.reset(); if (!collections.isEmpty()) collection = collections.first(); return true; }
static bool setCollectionEmbeddingModel(QSqlQuery &q, const QString &collection_name, const QString &embedding_model) { if (!q.prepare(SET_COLLECTION_EMBEDDING_MODEL_SQL)) return false; q.addBindValue(embedding_model); q.addBindValue(collection_name); return q.exec(); }
static bool updateStartUpdateTime(QSqlQuery &q, int id, qint64 update_time) { if (!q.prepare(UPDATE_START_UPDATE_TIME_SQL)) return false; q.addBindValue(update_time); q.addBindValue(id); return q.exec(); }
static bool updateLastUpdateTime(QSqlQuery &q, int id, qint64 update_time) { if (!q.prepare(UPDATE_LAST_UPDATE_TIME_SQL)) return false; q.addBindValue(update_time); q.addBindValue(id); return q.exec(); }
static const QString INSERT_FOLDERS_SQL = uR"( insert into folders(path) values(?); )"_s;
static const QString DELETE_FOLDERS_SQL = uR"( delete from folders where id = ?; )"_s;
static const QString SELECT_FOLDERS_FROM_PATH_SQL = uR"( select id from folders where path = ?; )"_s;
static const QString GET_FOLDER_EMBEDDING_MODEL_SQL = uR"( select co.embedding_model from collections co join collection_items ci on ci.collection_id = co.id where ci.folder_id = ?; )"_s;
static const QString
FOLDER_REMOVE_ALL_DOCS_SQL[] = { uR"( delete from embeddings where chunk_id in ( select c.id from chunks c join documents d on d.id = c.document_id join folders f on f.id = d.folder_id where f.path = ? ); )"_s, uR"( delete from chunks where document_id in ( select d.id from documents d join folders f on f.id = d.folder_id where f.path = ? ); )"_s, uR"( delete from documents where id in ( select d.id from documents d join folders f on f.id = d.folder_id where f.path = ? ); )"_s, }; static bool addFolderToDB(QSqlQuery &q, const QString &folder_path, int *folder_id) { if (!q.prepare(INSERT_FOLDERS_SQL)) return false; q.addBindValue(folder_path); if (!q.exec()) return false; *folder_id = q.lastInsertId().toInt(); return true; } static bool removeFolderFromDB(QSqlQuery &q, int folder_id) { if (!q.prepare(DELETE_FOLDERS_SQL)) return false; q.addBindValue(folder_id); return q.exec(); } static bool selectFolder(QSqlQuery &q, const QString &folder_path, int *id) { if (!q.prepare(SELECT_FOLDERS_FROM_PATH_SQL)) return false; q.addBindValue(folder_path); if (!q.exec()) return false; Q_ASSERT(q.size() < 2); if (q.next()) *id = q.value(0).toInt(); return true; } static bool sqlGetFolderEmbeddingModel(QSqlQuery &q, int id, QString &embedding_model) { if (!q.prepare(GET_FOLDER_EMBEDDING_MODEL_SQL)) return false; q.addBindValue(id); if (!q.exec() || !q.next()) return false; // FIXME(jared): there may be more than one if a folder is shared between collections Q_ASSERT(q.size() < 2); embedding_model = q.value(0).toString(); return true; } static const QString INSERT_COLLECTION_ITEM_SQL = uR"( insert into collection_items(collection_id, folder_id) values(?, ?) 
on conflict do nothing; )"_s; static const QString DELETE_COLLECTION_FOLDER_SQL = uR"( delete from collection_items where collection_id = (select id from collections where name = :name) and folder_id = :folder_id returning (select count(*) from collection_items where folder_id = :folder_id); )"_s; static const QString PRUNE_COLLECTIONS_SQL = uR"( delete from collections where id not in (select collection_id from collection_items); )"_s; // 0 = already exists, 1 = added, -1 = error static int addCollectionItem(QSqlQuery &q, int collection_id, int folder_id) { if (!q.prepare(INSERT_COLLECTION_ITEM_SQL)) return -1; q.addBindValue(collection_id); q.addBindValue(folder_id); if (q.exec()) return q.numRowsAffected(); return -1; } // returns the number of remaining references to the folder, or -1 on error static int removeCollectionFolder(QSqlQuery &q, const QString &collection_name, int folder_id) { if (!q.prepare(DELETE_COLLECTION_FOLDER_SQL)) return -1; q.bindValue(":name", collection_name); q.bindValue(":folder_id", folder_id); if (!q.exec() || !q.next()) return -1; return q.value(0).toInt(); } static bool sqlPruneCollections(QSqlQuery &q) { return q.exec(PRUNE_COLLECTIONS_SQL); } static const QString INSERT_DOCUMENTS_SQL = uR"( insert into documents(folder_id, document_time, document_path) values(?, ?, ?); )"_s; static const QString UPDATE_DOCUMENT_TIME_SQL = uR"( update documents set document_time = ? 
where id = ?; )"_s;
static const QString DELETE_DOCUMENTS_SQL = uR"( delete from documents where id = ?; )"_s;
static const QString SELECT_DOCUMENT_SQL = uR"( select id, document_time from documents where document_path = ?; )"_s;
static const QString SELECT_DOCUMENTS_SQL = uR"( select id from documents where folder_id = ?; )"_s;
static const QString SELECT_ALL_DOCUMENTS_SQL = uR"( select id, document_path from documents; )"_s;
static const QString SELECT_COUNT_STATISTICS_SQL = uR"( select count(distinct d.id), sum(c.words), sum(c.tokens) from documents d left join chunks c on d.id = c.document_id where d.folder_id = ?; )"_s;
static bool addDocument(QSqlQuery &q, int folder_id, qint64 document_time, const QString &document_path, int *document_id) { if (!q.prepare(INSERT_DOCUMENTS_SQL)) return false; q.addBindValue(folder_id); q.addBindValue(document_time); q.addBindValue(document_path); if (!q.exec()) return false; *document_id = q.lastInsertId().toInt(); return true; }
static bool removeDocument(QSqlQuery &q, int document_id) { if (!q.prepare(DELETE_DOCUMENTS_SQL)) return false; q.addBindValue(document_id); return q.exec(); }
static bool updateDocument(QSqlQuery &q, int id, qint64 document_time) { if (!q.prepare(UPDATE_DOCUMENT_TIME_SQL)) return false; q.addBindValue(document_time); q.addBindValue(id); return q.exec(); }
static bool selectDocument(QSqlQuery &q, const QString &document_path, int *id, qint64 *document_time) { if (!q.prepare(SELECT_DOCUMENT_SQL)) return false; q.addBindValue(document_path); if (!q.exec()) return false; Q_ASSERT(q.size() < 2); if (q.next()) { *id = q.value(0).toInt(); *document_time = q.value(1).toLongLong(); } return true; }
static bool selectDocuments(QSqlQuery &q, int folder_id, QList<int> *documentIds) { if (!q.prepare(SELECT_DOCUMENTS_SQL)) return false; q.addBindValue(folder_id); if (!q.exec()) return false; while (q.next()) documentIds->append(q.value(0).toInt()); return true; }
static bool selectCountStatistics(QSqlQuery &q, int
folder_id, int *total_docs, int *total_words, int *total_tokens) { if (!q.prepare(SELECT_COUNT_STATISTICS_SQL)) return false; q.addBindValue(folder_id); if (!q.exec()) return false; if (q.next()) { *total_docs = q.value(0).toInt(); *total_words = q.value(1).toInt(); *total_tokens = q.value(2).toInt(); } return true; }
// insert embedding only if still needed
static const QString INSERT_EMBEDDING_SQL = uR"( insert into embeddings(model, folder_id, chunk_id, embedding) select :model, d.folder_id, :chunk_id, :embedding from chunks c join documents d on d.id = c.document_id join folders f on f.id = d.folder_id join collection_items ci on ci.folder_id = f.id join collections co on co.id = ci.collection_id where co.embedding_model = :model and c.id = :chunk_id limit 1; )"_s;
static const QString GET_COLLECTION_EMBEDDINGS_SQL = uR"( select e.chunk_id, e.embedding from embeddings e join collections co on co.embedding_model = e.model join collection_items ci on ci.folder_id = e.folder_id and ci.collection_id = co.id where co.name in ('%1'); )"_s;
static const QString GET_CHUNK_EMBEDDINGS_SQL = uR"( select e.chunk_id, e.embedding from embeddings e where e.chunk_id in (%1); )"_s;
static const QString GET_CHUNK_FILE_SQL = uR"( select file from chunks where id = ?; )"_s;
namespace { struct Embedding { QString model; int folder_id; int chunk_id; QByteArray data; }; struct EmbeddingStat { QString lastFile; int nAdded; int nSkipped; }; } // namespace
NAMED_PAIR(EmbeddingFolder, QString, embedding_model, int, folder_id)
static bool sqlAddEmbeddings(QSqlQuery &q, const QList<Embedding> &embeddings, QHash<EmbeddingFolder, EmbeddingStat> &embeddingStats) { if (!q.prepare(INSERT_EMBEDDING_SQL)) return false;
// insert embedding if needed
for (const auto &e: embeddings) { q.bindValue(":model", e.model); q.bindValue(":chunk_id", e.chunk_id); q.bindValue(":embedding", e.data); if (!q.exec()) return false; auto &stat = embeddingStats[{ e.model, e.folder_id }]; if (q.numRowsAffected()) { stat.nAdded++; // embedding added
} else {
stat.nSkipped++; // embedding no longer needed } } if (!q.prepare(GET_CHUNK_FILE_SQL)) return false; // populate statistics for each collection item for (const auto &e: embeddings) { auto &stat = embeddingStats[{ e.model, e.folder_id }]; if (stat.nAdded && stat.lastFile.isNull()) { q.addBindValue(e.chunk_id); if (!q.exec() || !q.next()) return false; stat.lastFile = q.value(0).toString(); } } return true; } void Database::transaction() { bool ok = m_db.transaction(); Q_ASSERT(ok); } void Database::commit() { bool ok = m_db.commit(); Q_ASSERT(ok); } void Database::rollback() { bool ok = m_db.rollback(); Q_ASSERT(ok); } bool Database::refreshDocumentIdCache(QSqlQuery &q) { m_documentIdCache.clear(); for (const auto &cmd: SELECT_CHUNKED_DOCUMENTS_SQL) { if (!q.exec(cmd)) return false; while (q.next()) m_documentIdCache << q.value(0).toInt(); } return true; } bool Database::addChunk(QSqlQuery &q, int document_id, const QString &chunk_text, const QString &file, const QString &title, const QString &author, const QString &subject, const QString &keywords, int page, int from, int to, int words, int *chunk_id) { if (!q.prepare(INSERT_CHUNK_SQL)) return false; q.addBindValue(document_id); q.addBindValue(chunk_text); q.addBindValue(file); q.addBindValue(title); q.addBindValue(author); q.addBindValue(subject); q.addBindValue(keywords); q.addBindValue(page); q.addBindValue(from); q.addBindValue(to); q.addBindValue(words); if (!q.exec() || !q.next()) return false; *chunk_id = q.value(0).toInt(); if (!q.prepare(INSERT_CHUNK_FTS_SQL)) return false; q.addBindValue(document_id); q.addBindValue(chunk_text); q.addBindValue(file); q.addBindValue(title); q.addBindValue(author); q.addBindValue(subject); q.addBindValue(keywords); if (!q.exec()) return false; m_documentIdCache << document_id; return true; } bool Database::removeChunksByDocumentId(QSqlQuery &q, int document_id) { for (const auto &cmd: DELETE_CHUNKS_SQL) { if (!q.prepare(cmd)) return false; q.addBindValue(document_id); if 
(!q.exec()) return false; } m_documentIdCache.remove(document_id); return true; }
bool Database::sqlRemoveDocsByFolderPath(QSqlQuery &q, const QString &path) { for (const auto &cmd: FOLDER_REMOVE_ALL_DOCS_SQL) { if (!q.prepare(cmd)) return false; q.addBindValue(path); if (!q.exec()) return false; } return refreshDocumentIdCache(q); }
bool Database::hasContent() { return m_db.tables().contains("chunks", Qt::CaseInsensitive); }
int Database::openDatabase(const QString &modelPath, bool create, int ver) { if (!QFileInfo(modelPath).isDir()) { qWarning() << "ERROR: invalid download path" << modelPath; return -1; } if (m_db.isOpen()) m_db.close(); auto dbPath = u"%1/localdocs_v%2.db"_s.arg(modelPath).arg(ver); if (!create && !QFileInfo::exists(dbPath)) return 0; m_db.setDatabaseName(dbPath); if (!m_db.open()) { qWarning() << "ERROR: opening db" << dbPath << m_db.lastError(); return -1; } return hasContent(); }
bool Database::openLatestDb(const QString &modelPath, QList<CollectionItem> &oldCollections) { /* * Support upgrade path from older versions: * * 1. Detect and load dbPath with older versions * 2. Provide versioned SQL select statements * 3. Upgrade the tables to the new version * 4.
By default mark all collections of older versions as force indexing and present to the user * an 'update' button letting them know a breaking change happened and that the collection * will need to be indexed again */
int dbVer; for (dbVer = LOCALDOCS_VERSION;; dbVer--) { if (dbVer < LOCALDOCS_MIN_VER) return true; // create a new db
int res = openDatabase(modelPath, false, dbVer); if (res == 1) break; // found one with content
if (res == -1) return false; // error
} if (dbVer == LOCALDOCS_VERSION) return true; // already up-to-date
// If we're upgrading, then we need to do a select on the current version of the collections table,
// then create the new one and populate the collections table and mark them as needing forced
// indexing
#if defined(DEBUG)
qDebug() << "Older localdocs version found" << dbVer << "upgrade to" << LOCALDOCS_VERSION;
#endif
// Select the current collections which will be marked to force indexing
QSqlQuery q(m_db); if (!selectAllFromCollections(q, &oldCollections, dbVer)) { qWarning() << "ERROR: Could not select old collections" << q.lastError(); return false; } m_db.close(); return true; }
bool Database::initDb(const QString &modelPath, const QList<CollectionItem> &oldCollections) { if (!m_db.isOpen()) { int res = openDatabase(modelPath); if (res == 1) return true; // already populated
if (res == -1) return false; // error
} else if (hasContent()) { return true; // already populated
} transaction(); QSqlQuery q(m_db); for (const auto &cmd: INIT_DB_SQL) { if (!q.exec(cmd)) { qWarning() << "ERROR: failed to create tables" << q.lastError(); rollback(); return false; } }
/* These are collection items that came from an older version of localdocs which * require forced indexing that should only be done when the user has explicitly asked * for them to be indexed again */
for (const CollectionItem &item : oldCollections) { if (!addFolder(item.collection, item.folder_path, QString())) { qWarning() << "ERROR: failed to add previous collections to new
database"; rollback(); return false; } } commit(); return true; } Database::Database(int chunkSize, QStringList extensions) : QObject(nullptr) , m_chunkSize(chunkSize) , m_scannedFileExtensions(std::move(extensions)) , m_scanIntervalTimer(new QTimer(this)) , m_watcher(new QFileSystemWatcher(this)) , m_embLLM(new EmbeddingLLM) , m_databaseValid(true) , m_chunkStreamer(this) { m_db = QSqlDatabase::database(QSqlDatabase::defaultConnection, false); if (!m_db.isValid()) m_db = QSqlDatabase::addDatabase("QSQLITE"); Q_ASSERT(m_db.isValid()); moveToThread(&m_dbThread); m_dbThread.setObjectName("database"); m_dbThread.start(); } Database::~Database() { m_dbThread.quit(); m_dbThread.wait(); delete m_embLLM; } void Database::setStartUpdateTime(CollectionItem &item) { QSqlQuery q(m_db); const qint64 update_time = QDateTime::currentMSecsSinceEpoch(); if (!updateStartUpdateTime(q, item.collection_id, update_time)) qWarning() << "Database ERROR: failed to set start update time:" << q.lastError(); else item.startUpdate = QDateTime::fromMSecsSinceEpoch(update_time); } void Database::setLastUpdateTime(CollectionItem &item) { QSqlQuery q(m_db); const qint64 update_time = QDateTime::currentMSecsSinceEpoch(); if (!updateLastUpdateTime(q, item.collection_id, update_time)) qWarning() << "Database ERROR: failed to set last update time:" << q.lastError(); else item.lastUpdate = QDateTime::fromMSecsSinceEpoch(update_time); } CollectionItem Database::guiCollectionItem(int folder_id) const { Q_ASSERT(m_collectionMap.contains(folder_id)); return m_collectionMap.value(folder_id); } void Database::updateGuiForCollectionItem(const CollectionItem &item) { m_collectionMap.insert(item.folder_id, item); emit requestUpdateGuiForCollectionItem(item); } void Database::addGuiCollectionItem(const CollectionItem &item) { m_collectionMap.insert(item.folder_id, item); emit requestAddGuiCollectionItem(item); } void Database::removeGuiFolderById(const QString &collection, int folder_id) { emit 
requestRemoveGuiFolderById(collection, folder_id); }
void Database::guiCollectionListUpdated(const QList<CollectionItem> &collectionList) { for (const auto &i : collectionList) m_collectionMap.insert(i.folder_id, i); emit requestGuiCollectionListUpdated(collectionList); }
void Database::updateFolderToIndex(int folder_id, size_t countForFolder, bool sendChunks) { CollectionItem item = guiCollectionItem(folder_id); item.currentDocsToIndex = countForFolder; if (!countForFolder) { if (sendChunks && !m_chunkList.isEmpty()) sendChunkList(); // send any remaining embedding chunks to llm
item.indexing = false; item.installed = true;
// Set the last update if we are done
if (item.startUpdate > item.lastUpdate && item.currentEmbeddingsToIndex == 0) setLastUpdateTime(item); } updateGuiForCollectionItem(item); }
static void handleDocumentError(const QString &errorMessage, int document_id, const QString &document_path, const QSqlError &error) { qWarning() << errorMessage << document_id << document_path << error; }
class DocumentReader { public: struct Metadata { QString title, author, subject, keywords; }; static std::unique_ptr<DocumentReader> fromDocument(DocumentInfo info); const DocumentInfo &doc () const { return m_info; } const Metadata &metadata() const { return m_metadata; } const std::optional<QString> &word () const { return m_word; } const std::optional<QString> &nextWord() { m_word = advance(); return m_word; } virtual std::optional<ChunkStreamer::Status> getError() const { return std::nullopt; } virtual int page() const { return -1; } virtual ~DocumentReader() = default; protected: explicit DocumentReader(DocumentInfo info) : m_info(std::move(info)) {} void postInit(Metadata &&metadata = {}) { m_metadata = std::move(metadata); m_word = advance(); } virtual std::optional<QString> advance() = 0; DocumentInfo m_info; Metadata m_metadata; std::optional<QString> m_word; };
namespace {
#ifdef GPT4ALL_USE_QTPDF
class PdfDocumentReader final : public DocumentReader { public: explicit PdfDocumentReader(DocumentInfo info) : DocumentReader(std::move(info)) {
QString path = info.file.canonicalFilePath(); if (m_doc.load(path) != QPdfDocument::Error::None) throw std::runtime_error(fmt::format("Failed to load PDF: {}", path)); Metadata metadata { .title = m_doc.metaData(QPdfDocument::MetaDataField::Title ).toString(), .author = m_doc.metaData(QPdfDocument::MetaDataField::Author ).toString(), .subject = m_doc.metaData(QPdfDocument::MetaDataField::Subject ).toString(), .keywords = m_doc.metaData(QPdfDocument::MetaDataField::Keywords).toString(), }; postInit(std::move(metadata)); } int page() const override { return m_currentPage; } private: std::optional<QString> advance() override { QString word; do { while (!m_stream || m_stream->atEnd()) { if (m_currentPage >= m_doc.pageCount()) return std::nullopt; m_pageText = m_doc.getAllText(m_currentPage++).text(); m_stream.emplace(&m_pageText); } *m_stream >> word; } while (word.isEmpty()); return word; } QPdfDocument m_doc; int m_currentPage = 0; QString m_pageText; std::optional<QTextStream> m_stream; };
#else
class PdfDocumentReader final : public DocumentReader { public: explicit PdfDocumentReader(DocumentInfo info) : DocumentReader(std::move(info)) { QString path = info.file.canonicalFilePath(); m_doc = FPDF_LoadDocument(path.toUtf8().constData(), nullptr); if (!m_doc) throw std::runtime_error(fmt::format("Failed to load PDF: {}", path));
// Extract metadata
Metadata metadata { .title = getMetadata("Title" ), .author = getMetadata("Author" ), .subject = getMetadata("Subject" ), .keywords = getMetadata("Keywords"), }; postInit(std::move(metadata)); } ~PdfDocumentReader() override { if (m_page) FPDF_ClosePage(m_page); if (m_doc) FPDF_CloseDocument(m_doc); } int page() const override { return m_currentPage; } private: std::optional<QString> advance() override { QString word; do { while (!m_stream || m_stream->atEnd()) { if (m_currentPage >= FPDF_GetPageCount(m_doc)) return std::nullopt; if (m_page) FPDF_ClosePage(std::exchange(m_page, nullptr)); m_page = FPDF_LoadPage(m_doc, m_currentPage++); if (!m_page) throw
std::runtime_error("Failed to load page."); m_pageText = extractTextFromPage(m_page); m_stream.emplace(&m_pageText); } *m_stream >> word; } while (word.isEmpty()); return word; }
QString getMetadata(FPDF_BYTESTRING key) {
// FPDF_GetMetaText includes a 2-byte null terminator
ulong nBytes = FPDF_GetMetaText(m_doc, key, nullptr, 0); if (nBytes <= sizeof (FPDF_WCHAR)) return { "" }; QByteArray buffer(nBytes, Qt::Uninitialized); ulong nResultBytes = FPDF_GetMetaText(m_doc, key, buffer.data(), buffer.size()); Q_ASSERT(nResultBytes % 2 == 0); Q_ASSERT(nResultBytes <= nBytes); return QString::fromUtf16(reinterpret_cast<const char16_t *>(buffer.data()), nResultBytes / 2 - 1); }
QString extractTextFromPage(FPDF_PAGE page) { FPDF_TEXTPAGE textPage = FPDFText_LoadPage(page); if (!textPage) throw std::runtime_error("Failed to load text page."); int nChars = FPDFText_CountChars(textPage); if (!nChars) return {};
// FPDFText_GetText includes a 2-byte null terminator
QByteArray buffer((nChars + 1) * sizeof (FPDF_WCHAR), Qt::Uninitialized); int nResultChars = FPDFText_GetText(textPage, 0, nChars, reinterpret_cast<FPDF_WCHAR *>(buffer.data())); Q_ASSERT(nResultChars <= nChars + 1); FPDFText_ClosePage(textPage); return QString::fromUtf16(reinterpret_cast<const char16_t *>(buffer.data()), nResultChars - 1); }
FPDF_DOCUMENT m_doc = nullptr; FPDF_PAGE m_page = nullptr; int m_currentPage = 0; QString m_pageText; std::optional<QTextStream> m_stream; };
#endif // !defined(GPT4ALL_USE_QTPDF)
class WordDocumentReader final : public DocumentReader { public: explicit WordDocumentReader(DocumentInfo info) : DocumentReader(std::move(info)) , m_doc(info.file.canonicalFilePath().toStdString()) { m_doc.open(); if (!m_doc.is_open()) throw std::runtime_error(fmt::format("Failed to open DOCX: {}", info.file.canonicalFilePath())); m_paragraph = &m_doc.paragraphs(); m_run = &m_paragraph->runs();
// TODO(jared): metadata for Word documents?
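// The two PDFium helpers above both decode UTF-16 buffers whose byte counts
// include a two-byte NUL terminator, hence the `/ 2 - 1` length arithmetic.
// A self-contained sketch of that conversion (hypothetical helper name, not
// part of the repository):

```cpp
#include <cassert>
#include <string>

// Convert a PDFium-style UTF-16 buffer to a string. nBytes counts all bytes
// including the trailing 2-byte NUL, so the character count is nBytes / 2 - 1.
static std::u16string fromPdfiumUtf16(const char16_t *data, size_t nBytes)
{
    return std::u16string(data, nBytes / 2 - 1);
}
```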
postInit(); } protected: std::optional<QString> advance() override {
// find non-space char
qsizetype wordStart = 0; while (m_buffer.isEmpty() || m_buffer[wordStart].isSpace()) { if (m_buffer.isEmpty() && !fillBuffer()) return std::nullopt; if (m_buffer[wordStart].isSpace() && ++wordStart >= m_buffer.size()) { m_buffer.clear(); wordStart = 0; } }
// find space char
qsizetype wordEnd = wordStart + 1; while (wordEnd >= m_buffer.size() || !m_buffer[wordEnd].isSpace()) { if (wordEnd >= m_buffer.size() && !fillBuffer()) break; if (!m_buffer[wordEnd].isSpace()) ++wordEnd; } if (wordStart == wordEnd) return std::nullopt; auto size = wordEnd - wordStart; QString word = std::move(m_buffer); m_buffer = word.sliced(wordStart + size); if (wordStart == 0) word.resize(size); else word = word.sliced(wordStart, size); return word; }
bool fillBuffer() { for (;;) {
// get a run
while (!m_run->has_next()) {
// try next paragraph
if (!m_paragraph->has_next()) return false; m_paragraph->next(); m_buffer += u'\n'; } bool foundText = false; auto &run = m_run->get_node(); for (auto node = run.first_child(); node; node = node.next_sibling()) { std::string node_name = node.name(); if (node_name == "w:t") { const char *text = node.text().get(); if (*text) { foundText = true; m_buffer += QUtf8StringView(text); } } else if (node_name == "w:br") { m_buffer += u'\n'; } else if (node_name == "w:tab") { m_buffer += u'\t'; } } m_run->next(); if (foundText) return true; } }
duckx::Document m_doc; duckx::Paragraph *m_paragraph; duckx::Run *m_run; QString m_buffer; };
class TxtDocumentReader final : public DocumentReader { public: explicit TxtDocumentReader(DocumentInfo info) : DocumentReader(std::move(info)) , m_file(info.file.canonicalFilePath()) { if (!m_file.open(QIODevice::ReadOnly)) throw std::runtime_error(fmt::format("Failed to open text file: {}", m_file.fileName())); m_stream.setDevice(&m_file); postInit(); } protected: std::optional<QString> advance() override { if (getError()) return std::nullopt; while
(!m_stream.atEnd()) { QString word; m_stream >> word; if (getError()) return std::nullopt; if (!word.isEmpty()) return word; } return std::nullopt; } std::optional<ChunkStreamer::Status> getError() const override { if (m_file.binarySeen()) return ChunkStreamer::Status::BINARY_SEEN; if (m_file.error()) return ChunkStreamer::Status::ERROR; return std::nullopt; } BinaryDetectingFile m_file; QTextStream m_stream; };
} // namespace
std::unique_ptr<DocumentReader> DocumentReader::fromDocument(DocumentInfo doc) { if (doc.isPdf()) return std::make_unique<PdfDocumentReader>(std::move(doc)); if (doc.isDocx()) return std::make_unique<WordDocumentReader>(std::move(doc)); return std::make_unique<TxtDocumentReader>(std::move(doc)); }
ChunkStreamer::ChunkStreamer(Database *database) : m_database(database) {}
ChunkStreamer::~ChunkStreamer() = default;
void ChunkStreamer::setDocument(DocumentInfo doc, int documentId, const QString &embeddingModel) { auto docKey = doc.key(); if (!m_docKey || *m_docKey != docKey) { m_docKey = docKey; m_reader = DocumentReader::fromDocument(std::move(doc)); m_documentId = documentId; m_embeddingModel = embeddingModel; m_chunk.clear(); m_page = 0;
// make sure the document doesn't already have any chunks
if (m_database->m_documentIdCache.contains(documentId)) { QSqlQuery q(m_database->m_db); if (!m_database->removeChunksByDocumentId(q, documentId)) handleDocumentError("ERROR: Cannot remove chunks of document", documentId, m_reader->doc().file.canonicalPath(), q.lastError()); } } }
std::optional ChunkStreamer::currentDocKey() const { return m_docKey; }
void ChunkStreamer::reset() { m_docKey.reset(); }
ChunkStreamer::Status ChunkStreamer::step() {
// TODO: implement line_from/line_to
constexpr int line_from = -1; constexpr int line_to = -1; const int folderId = m_reader->doc().folder; const int maxChunkSize = m_database->m_chunkSize; int nChunks = 0; int nAddedWords = 0; Status retval; for (;;) { if (auto error = m_reader->getError()) { m_docKey.reset(); // done processing
retval = *error; break; }
// get a word, if needed
std::optional<QString> word =
QString(); // empty string to disable EOF logic if (m_chunk.length() < maxChunkSize + 1) { word = m_reader->word(); if (m_chunk.isEmpty()) m_page = m_reader->page(); // page number of first word if (word) { m_chunk += *word; m_chunk += u' '; m_reader->nextWord(); m_nChunkWords++; } } if (!word || m_chunk.length() >= maxChunkSize + 1) { // +1 for trailing space if (!m_chunk.isEmpty()) { int nThisChunkWords = 0; auto chunk = m_chunk; // copy // handle overlength chunks if (m_chunk.length() > maxChunkSize + 1) { // find the final space qsizetype chunkEnd = chunk.lastIndexOf(u' ', -2); qsizetype spaceSize; if (chunkEnd >= 0) { // slice off the last word spaceSize = 1; Q_ASSERT(m_nChunkWords >= 1); // one word left nThisChunkWords = m_nChunkWords - 1; m_nChunkWords = 1; } else { // slice the overlong word spaceSize = 0; chunkEnd = maxChunkSize; // partial word left, don't count it nThisChunkWords = m_nChunkWords; m_nChunkWords = 0; } // save the second part, excluding space if any m_chunk = chunk.sliced(chunkEnd + spaceSize); // consume the first part chunk.truncate(chunkEnd); } else { nThisChunkWords = m_nChunkWords; m_nChunkWords = 0; // there is no second part m_chunk.clear(); // consume the whole chunk, excluding space chunk.chop(1); } Q_ASSERT(chunk.length() <= maxChunkSize); QSqlQuery q(m_database->m_db); int chunkId = 0; auto &metadata = m_reader->metadata(); if (!m_database->addChunk(q, m_documentId, chunk, m_reader->doc().file.fileName(), // basename metadata.title, metadata.author, metadata.subject, metadata.keywords, m_page, line_from, line_to, nThisChunkWords, &chunkId )) { qWarning() << "ERROR: Could not insert chunk into db" << q.lastError(); } nAddedWords += nThisChunkWords; EmbeddingChunk toEmbed; toEmbed.model = m_embeddingModel; toEmbed.folder_id = folderId; toEmbed.chunk_id = chunkId; toEmbed.chunk = chunk; m_database->appendChunk(toEmbed); ++nChunks; } if (!word) { retval = Status::DOC_COMPLETE; m_docKey.reset(); // done processing break; } } if 
(m_database->scanQueueInterrupted()) { retval = Status::INTERRUPTED; break; } } if (nChunks) { CollectionItem item = m_database->guiCollectionItem(folderId);
// Set the start update if we haven't done so already
if (item.startUpdate <= item.lastUpdate && item.currentEmbeddingsToIndex == 0) m_database->setStartUpdateTime(item); item.currentEmbeddingsToIndex += nChunks; item.totalEmbeddingsToIndex += nChunks; item.totalWords += nAddedWords; m_database->updateGuiForCollectionItem(item); } return retval; }
void Database::appendChunk(const EmbeddingChunk &chunk) { m_chunkList.reserve(s_batchSize); m_chunkList.append(chunk); if (m_chunkList.size() >= s_batchSize) sendChunkList(); }
void Database::sendChunkList() { m_embLLM->generateDocEmbeddingsAsync(m_chunkList); m_chunkList.clear(); }
void Database::handleEmbeddingsGenerated(const QVector<EmbeddingResult> &embeddings) { Q_ASSERT(!embeddings.isEmpty()); QList<Embedding> sqlEmbeddings; for (const auto &e: embeddings) { auto data = QByteArray::fromRawData( reinterpret_cast<const char *>(e.embedding.data()), e.embedding.size() * sizeof(e.embedding.front()) ); sqlEmbeddings.append({e.model, e.folder_id, e.chunk_id, std::move(data)}); } transaction(); QSqlQuery q(m_db); QHash<EmbeddingFolder, EmbeddingStat> stats; if (!sqlAddEmbeddings(q, sqlEmbeddings, stats)) { qWarning() << "Database ERROR: failed to add embeddings:" << q.lastError(); return rollback(); } commit();
// FIXME(jared): embedding counts are per-collectionitem, not per-folder
for (const auto &[key, stat]: std::as_const(stats).asKeyValueRange()) { if (!m_collectionMap.contains(key.folder_id)) continue; CollectionItem item = guiCollectionItem(key.folder_id); Q_ASSERT(item.currentEmbeddingsToIndex >= stat.nAdded + stat.nSkipped); if (item.currentEmbeddingsToIndex < stat.nAdded + stat.nSkipped) { qWarning() << "Database ERROR: underflow in current embeddings to index statistics"; item.currentEmbeddingsToIndex = 0; } else { item.currentEmbeddingsToIndex -= stat.nAdded + stat.nSkipped; } Q_ASSERT(item.totalEmbeddingsToIndex >=
stat.nSkipped); if (item.totalEmbeddingsToIndex < stat.nSkipped) { qWarning() << "Database ERROR: underflow in total embeddings to index statistics"; item.totalEmbeddingsToIndex = 0; } else { item.totalEmbeddingsToIndex -= stat.nSkipped; } if (!stat.lastFile.isNull()) item.fileCurrentlyProcessing = stat.lastFile;
// Set the last update if we are done
Q_ASSERT(item.startUpdate > item.lastUpdate); if (!item.indexing && item.currentEmbeddingsToIndex == 0) setLastUpdateTime(item); updateGuiForCollectionItem(item); } }
void Database::handleErrorGenerated(const QVector<EmbeddingChunk> &chunks, const QString &error) {
/* FIXME(jared): errors are actually collection-specific because they are conditioned * on the embedding model, but this sets the error on all collections for a given * folder */
QSet<int> folder_ids; for (const auto &c: chunks) { folder_ids << c.folder_id; } for (int fid: folder_ids) { if (!m_collectionMap.contains(fid)) continue; CollectionItem item = guiCollectionItem(fid); item.error = error; updateGuiForCollectionItem(item); } }
size_t Database::countOfDocuments(int folder_id) const { if (auto it = m_docsToScan.find(folder_id); it != m_docsToScan.end()) return it->second.size(); return 0; }
size_t Database::countOfBytes(int folder_id) const { if (auto it = m_docsToScan.find(folder_id); it != m_docsToScan.end()) { size_t totalBytes = 0; for (const DocumentInfo &f : it->second) totalBytes += f.file.size(); return totalBytes; } return 0; }
DocumentInfo Database::dequeueDocument() { Q_ASSERT(!m_docsToScan.empty()); auto firstEntry = m_docsToScan.begin(); auto &[firstKey, queue] = *firstEntry; Q_ASSERT(!queue.empty()); DocumentInfo result = std::move(queue.front()); queue.pop_front(); if (queue.empty()) m_docsToScan.erase(firstEntry); return result; }
void Database::removeFolderFromDocumentQueue(int folder_id) { if (auto queueIt = m_docsToScan.find(folder_id); queueIt != m_docsToScan.end()) { if (auto key = m_chunkStreamer.currentDocKey()) { if (ranges::any_of(queueIt->second,
[&key](auto &d) { return d.key() == key; })) m_chunkStreamer.reset(); // done with this document } // remove folder from queue m_docsToScan.erase(queueIt); } } void Database::enqueueDocumentInternal(DocumentInfo &&info, bool prepend) { auto &queue = m_docsToScan[info.folder]; queue.insert(prepend ? queue.begin() : queue.end(), std::move(info)); } void Database::enqueueDocuments(int folder_id, std::list &&infos) { // enqueue all documents auto &queue = m_docsToScan[folder_id]; queue.splice(queue.end(), std::move(infos)); CollectionItem item = guiCollectionItem(folder_id); item.currentDocsToIndex = queue.size(); item.totalDocsToIndex = queue.size(); const size_t bytes = countOfBytes(folder_id); item.currentBytesToIndex = bytes; item.totalBytesToIndex = bytes; updateGuiForCollectionItem(item); m_scanIntervalTimer->start(); } bool Database::scanQueueInterrupted() const { return m_scanDurationTimer.elapsed() >= 100; } void Database::scanQueueBatch() { transaction(); m_scanDurationTimer.start(); // scan for up to the maximum scan duration or until we run out of documents while (!m_docsToScan.empty()) { scanQueue(); if (scanQueueInterrupted()) break; } commit(); if (m_docsToScan.empty()) m_scanIntervalTimer->stop(); } void Database::scanQueue() { DocumentInfo info = dequeueDocument(); const size_t countForFolder = countOfDocuments(info.folder); const int folder_id = info.folder; // Update info info.file.stat(); // If the doc has since been deleted or no longer readable, then we schedule more work and return // leaving the cleanup for the cleanup handler if (!info.file.exists() || !info.file.isReadable()) return updateFolderToIndex(folder_id, countForFolder); const qint64 document_time = info.file.fileTime(QFile::FileModificationTime).toMSecsSinceEpoch(); const QString document_path = info.file.canonicalFilePath(); const bool currentlyProcessing = info.currentlyProcessing; // Check and see if we already have this document QSqlQuery q(m_db); int existing_id = -1; qint64 
existing_time = -1; if (!selectDocument(q, document_path, &existing_id, &existing_time)) { handleDocumentError("ERROR: Cannot select document", existing_id, document_path, q.lastError()); return updateFolderToIndex(folder_id, countForFolder); } // If we have the document, we need to compare the last modification time and if it is newer // we must rescan the document, otherwise return if (existing_id != -1 && !currentlyProcessing) { Q_ASSERT(existing_time != -1); if (document_time == existing_time) { // No need to rescan, but we do have to schedule next return updateFolderToIndex(folder_id, countForFolder); } if (!removeChunksByDocumentId(q, existing_id)) { handleDocumentError("ERROR: Cannot remove chunks of document", existing_id, document_path, q.lastError()); return updateFolderToIndex(folder_id, countForFolder); } updateCollectionStatistics(); } // Update the document_time for an existing document, or add it for the first time now int document_id = existing_id; if (!currentlyProcessing) { if (document_id != -1) { if (!updateDocument(q, document_id, document_time)) { handleDocumentError("ERROR: Could not update document_time", document_id, document_path, q.lastError()); return updateFolderToIndex(folder_id, countForFolder); } } else { if (!addDocument(q, folder_id, document_time, document_path, &document_id)) { handleDocumentError("ERROR: Could not add document", document_id, document_path, q.lastError()); return updateFolderToIndex(folder_id, countForFolder); } CollectionItem item = guiCollectionItem(folder_id); item.totalDocs += 1; updateGuiForCollectionItem(item); } } // Get the embedding model for this folder // FIXME(jared): there can be more than one since we allow one folder to be in multiple collections QString embedding_model; if (!sqlGetFolderEmbeddingModel(q, folder_id, embedding_model)) { handleDocumentError("ERROR: Could not get embedding model", document_id, document_path, q.lastError()); return updateFolderToIndex(folder_id, countForFolder); } 
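The rescan decision made in `scanQueue` above (new document vs. unchanged vs. modified) can be sketched as a standalone helper. This is an illustrative re-implementation, not code from this file; `classifyDocument` and `DocAction` are hypothetical names, and the logic mirrors the comparison of the stored modification time against the file's current one:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; mirrors the decision logic in Database::scanQueue above.
enum class DocAction { Add, Rescan, Skip };

// existingId < 0 means the document is not in the DB yet.
// Times are modification timestamps in milliseconds since the epoch.
DocAction classifyDocument(int existingId, int64_t existingTime, int64_t documentTime)
{
    if (existingId < 0)
        return DocAction::Add;    // first time we see this path
    if (documentTime == existingTime)
        return DocAction::Skip;   // unchanged since last index
    return DocAction::Rescan;     // modified: drop old chunks, re-chunk
}
```

In the real code the `Rescan` branch additionally removes the old chunks with `removeChunksByDocumentId` before the document is re-chunked.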
Q_ASSERT(document_id != -1); { try { m_chunkStreamer.setDocument(info, document_id, embedding_model); } catch (const std::runtime_error &e) { qWarning() << "LocalDocs ERROR:" << e.what(); goto dequeue; } } switch (m_chunkStreamer.step()) { case ChunkStreamer::Status::INTERRUPTED: info.currentlyProcessing = true; enqueueDocumentInternal(std::move(info), /*prepend*/ true); return updateFolderToIndex(folder_id, countForFolder + 1); case ChunkStreamer::Status::BINARY_SEEN: /* When we see a binary file, we treat it like an empty file so we know not to * scan it again. All existing chunks are removed, and in-progress embeddings * are ignored when they complete. */ qInfo() << "LocalDocs: Ignoring file with binary data:" << document_path; // this will also ensure in-flight embeddings are ignored if (!removeChunksByDocumentId(q, existing_id)) handleDocumentError("ERROR: Cannot remove chunks of document", existing_id, document_path, q.lastError()); updateCollectionStatistics(); break; case ChunkStreamer::Status::ERROR: qWarning() << "error reading" << document_path; break; case ChunkStreamer::Status::DOC_COMPLETE: ; } dequeue: auto item = guiCollectionItem(folder_id); Q_ASSERT(item.currentBytesToIndex >= info.file.size()); if (item.currentBytesToIndex < info.file.size()) { qWarning() << "Database ERROR: underflow in current bytes to index statistics"; item.currentBytesToIndex = 0; } else { item.currentBytesToIndex -= info.file.size(); } updateGuiForCollectionItem(item); return updateFolderToIndex(folder_id, countForFolder); } void Database::scanDocuments(int folder_id, const QString &folder_path) { #if defined(DEBUG) qDebug() << "scanning folder for documents" << folder_path; #endif QDirIterator it(folder_path, QDir::Readable | QDir::Files | QDir::Dirs | QDir::NoDotAndDotDot, QDirIterator::Subdirectories); std::list infos; while (it.hasNext()) { it.next(); QFileInfo fileInfo = it.fileInfo(); if (fileInfo.isDir()) { addFolderToWatch(fileInfo.canonicalFilePath()); continue; } 
if (!m_scannedFileExtensions.contains(fileInfo.suffix(), Qt::CaseInsensitive)) continue; infos.push_back({ folder_id, fileInfo }); } if (!infos.empty()) { CollectionItem item = guiCollectionItem(folder_id); item.indexing = true; updateGuiForCollectionItem(item); enqueueDocuments(folder_id, std::move(infos)); } else { updateFolderToIndex(folder_id, 0, false); } } void Database::start() { connect(m_watcher, &QFileSystemWatcher::directoryChanged, this, &Database::directoryChanged); connect(m_embLLM, &EmbeddingLLM::embeddingsGenerated, this, &Database::handleEmbeddingsGenerated); connect(m_embLLM, &EmbeddingLLM::errorGenerated, this, &Database::handleErrorGenerated); m_scanIntervalTimer->callOnTimeout(this, &Database::scanQueueBatch); const QString modelPath = MySettings::globalInstance()->modelPath(); QList oldCollections; if (!openLatestDb(modelPath, oldCollections)) { m_databaseValid = false; } else if (!initDb(modelPath, oldCollections)) { m_databaseValid = false; } else { cleanDB(); ftsIntegrityCheck(); QSqlQuery q(m_db); if (!refreshDocumentIdCache(q)) { m_databaseValid = false; } else { addCurrentFolders(); } } if (!m_databaseValid) emit databaseValidChanged(); } void Database::addCurrentFolders() { #if defined(DEBUG) qDebug() << "addCurrentFolders"; #endif QSqlQuery q(m_db); QList collections; if (!selectAllFromCollections(q, &collections)) { qWarning() << "ERROR: Cannot select collections" << q.lastError(); return; } guiCollectionListUpdated(collections); scheduleUncompletedEmbeddings(); for (const auto &i : collections) { if (!i.forceIndexing) { addFolderToWatch(i.folder_path); scanDocuments(i.folder_id, i.folder_path); } } updateCollectionStatistics(); } void Database::scheduleUncompletedEmbeddings() { QHash chunkList; QSqlQuery q(m_db); if (!selectAllUncompletedChunks(q, chunkList)) { qWarning() << "ERROR: Cannot select uncompleted chunks" << q.lastError(); return; } if (chunkList.isEmpty()) return; // map of folder_id -> chunk count QMap folderNChunks; for 
(auto it = chunkList.keyBegin(), end = chunkList.keyEnd(); it != end; ++it) { int folder_id = it->folder_id; if (folderNChunks.contains(folder_id)) continue; int total = 0; if (!selectCountChunks(q, folder_id, total)) { qWarning() << "ERROR: Cannot count total chunks" << q.lastError(); return; } folderNChunks.insert(folder_id, total); } // map of (folder_id, collection) -> incomplete count QMap, int> itemNIncomplete; for (const auto &[chunk, collections]: std::as_const(chunkList).asKeyValueRange()) for (const auto &collection: std::as_const(collections)) itemNIncomplete[{ chunk.folder_id, collection }]++; for (const auto &[key, nIncomplete]: std::as_const(itemNIncomplete).asKeyValueRange()) { const auto &[folder_id, collection] = key; /* FIXME(jared): this needs to be split by collection because different * collections have different embedding models */ int total = folderNChunks.value(folder_id); CollectionItem item = guiCollectionItem(folder_id); item.totalEmbeddingsToIndex = total; item.currentEmbeddingsToIndex = nIncomplete; updateGuiForCollectionItem(item); } for (auto it = chunkList.keyBegin(), end = chunkList.keyEnd(); it != end;) { QList batch; for (; it != end && batch.size() < s_batchSize; ++it) batch.append({ /*model*/ it->embedding_model, /*folder_id*/ it->folder_id, /*chunk_id*/ it->chunk_id, /*chunk*/ it->text }); Q_ASSERT(!batch.isEmpty()); m_embLLM->generateDocEmbeddingsAsync(batch); } } void Database::updateCollectionStatistics() { QSqlQuery q(m_db); QList collections; if (!selectAllFromCollections(q, &collections)) { qWarning() << "ERROR: Cannot select collections" << q.lastError(); return; } for (const auto &i: std::as_const(collections)) { int total_docs = 0; int total_words = 0; int total_tokens = 0; if (!selectCountStatistics(q, i.folder_id, &total_docs, &total_words, &total_tokens)) { qWarning() << "ERROR: could not count statistics for folder" << q.lastError(); } else { CollectionItem item = guiCollectionItem(i.folder_id); item.totalDocs = 
total_docs; item.totalWords = total_words; item.totalTokens = total_tokens; updateGuiForCollectionItem(item); } } } int Database::checkAndAddFolderToDB(const QString &path) { QFileInfo info(path); if (!info.exists() || !info.isReadable()) { qWarning() << "ERROR: Cannot add folder that doesn't exist or is not readable" << path; return -1; } QSqlQuery q(m_db); int folder_id = -1; // See if the folder exists in the db if (!selectFolder(q, path, &folder_id)) { qWarning() << "ERROR: Cannot select folder from path" << path << q.lastError(); return -1; } // Add the folder if (folder_id == -1 && !addFolderToDB(q, path, &folder_id)) { qWarning() << "ERROR: Cannot add folder to db with path" << path << q.lastError(); return -1; } Q_ASSERT(folder_id != -1); return folder_id; } void Database::forceIndexing(const QString &collection, const QString &embedding_model) { Q_ASSERT(!embedding_model.isNull()); QSqlQuery q(m_db); QList> folders; if (!selectFoldersFromCollection(q, collection, &folders)) { qWarning() << "ERROR: Cannot select folders from collections" << collection << q.lastError(); return; } if (!setCollectionEmbeddingModel(q, collection, embedding_model)) { qWarning().nospace() << "ERROR: Cannot set embedding model for collection " << collection << ": " << q.lastError(); return; } for (const auto &folder: std::as_const(folders)) { CollectionItem item = guiCollectionItem(folder.first); item.embeddingModel = embedding_model; item.forceIndexing = false; updateGuiForCollectionItem(item); addFolderToWatch(folder.second); scanDocuments(folder.first, folder.second); } } void Database::forceRebuildFolder(const QString &path) { QSqlQuery q(m_db); int folder_id; if (!selectFolder(q, path, &folder_id)) { qWarning().nospace() << "Database ERROR: Cannot select folder from path " << path << ": " << q.lastError(); return; } Q_ASSERT(!m_docsToScan.contains(folder_id)); transaction(); if (!sqlRemoveDocsByFolderPath(q, path)) { qWarning().nospace() << "Database ERROR: Cannot remove chunks
for folder " << path << ": " << q.lastError(); return rollback(); } commit(); updateCollectionStatistics(); // We now have zero embeddings. Document progress will be updated by scanDocuments. // FIXME(jared): this updates the folder, but these values should also depend on the collection CollectionItem item = guiCollectionItem(folder_id); item.currentEmbeddingsToIndex = item.totalEmbeddingsToIndex = 0; updateGuiForCollectionItem(item); scanDocuments(folder_id, path); } bool Database::addFolder(const QString &collection, const QString &path, const QString &embedding_model) { // add the folder, if needed const int folder_id = checkAndAddFolderToDB(path); if (folder_id == -1) return false; std::optional item; QSqlQuery q(m_db); if (!selectCollectionByName(q, collection, item)) { qWarning().nospace() << "Database ERROR: Cannot select collection " << collection << ": " << q.lastError(); return false; } // add the collection, if needed if (!item) { item.emplace(); if (!addCollection(q, collection, QDateTime() /*start_update*/, QDateTime() /*last_update*/, embedding_model /*embedding_model*/, *item)) { qWarning().nospace() << "ERROR: Cannot add collection " << collection << ": " << q.lastError(); return false; } } // link the folder and the collection, if needed int res = addCollectionItem(q, item->collection_id, folder_id); if (res < 0) { // error qWarning().nospace() << "Database ERROR: Cannot add folder " << path << " to collection " << collection << ": " << q.lastError(); return false; } // add the new collection item to the UI if (res == 1) { // new item added item->folder_path = path; item->folder_id = folder_id; addGuiCollectionItem(item.value()); // note: this is the existing embedding model if the collection was found if (!item->embeddingModel.isNull()) { addFolderToWatch(path); scanDocuments(folder_id, path); } } return true; } void Database::removeFolder(const QString &collection, const QString &path) { #if defined(DEBUG) qDebug() << "removeFolder" << path; 
#endif QSqlQuery q(m_db); int folder_id = -1; // See if the folder exists in the db if (!selectFolder(q, path, &folder_id)) { qWarning() << "ERROR: Cannot select folder from path" << path << q.lastError(); return; } // If we don't have a folder_id in the db, then something bad has happened Q_ASSERT(folder_id != -1); if (folder_id == -1) { qWarning() << "ERROR: Collected folder does not exist in db" << path; m_watcher->removePath(path); return; } transaction(); if (removeFolderInternal(collection, folder_id, path)) { commit(); } else { rollback(); } } bool Database::removeFolderInternal(const QString &collection, int folder_id, const QString &path) { // Remove it from the collection QSqlQuery q(m_db); int nRemaining = removeCollectionFolder(q, collection, folder_id); if (nRemaining == -1) { qWarning().nospace() << "Database ERROR: Cannot remove collection " << collection << " from folder " << folder_id << ": " << q.lastError(); return false; } removeGuiFolderById(collection, folder_id); if (!sqlPruneCollections(q)) { qWarning() << "Database ERROR: Cannot prune collections:" << q.lastError(); return false; } // Keep folder if it is still referenced if (nRemaining) return true; // Remove the last reference to a folder // First remove all upcoming jobs associated with this folder removeFolderFromDocumentQueue(folder_id); // Get a list of all documents associated with folder QList documentIds; if (!selectDocuments(q, folder_id, &documentIds)) { qWarning() << "ERROR: Cannot select documents" << folder_id << q.lastError(); return false; } // Remove all chunks and documents associated with this folder for (int document_id: std::as_const(documentIds)) { if (!removeChunksByDocumentId(q, document_id)) { qWarning() << "ERROR: Cannot remove chunks of document_id" << document_id << q.lastError(); return false; } if (!removeDocument(q, document_id)) { qWarning() << "ERROR: Cannot remove document_id" << document_id << q.lastError(); return false; } } if (!removeFolderFromDB(q, 
folder_id)) { qWarning() << "ERROR: Cannot remove folder_id" << folder_id << q.lastError(); return false; } m_collectionMap.remove(folder_id); removeFolderFromWatch(path); return true; } void Database::addFolderToWatch(const QString &path) { #if defined(DEBUG) qDebug() << "addFolderToWatch" << path; #endif // pre-check because addPath returns false for already watched paths if (!m_watchedPaths.contains(path)) { if (!m_watcher->addPath(path)) qWarning() << "Database::addFolderToWatch: failed to watch" << path; // add unconditionally to suppress repeated warnings m_watchedPaths << path; } } void Database::removeFolderFromWatch(const QString &path) { #if defined(DEBUG) qDebug() << "removeFolderFromWatch" << path; #endif QDirIterator it(path, QDir::Readable | QDir::Dirs | QDir::NoDotAndDotDot, QDirIterator::Subdirectories); QStringList children { path }; while (it.hasNext()) children.append(it.next()); m_watcher->removePaths(children); m_watchedPaths -= QSet(children.begin(), children.end()); } QList Database::searchEmbeddingsHelper(const std::vector &query, QSqlQuery &q, int nNeighbors) { constexpr int BATCH_SIZE = 2048; const int n_embd = query.size(); const us::metric_punned_t metric(n_embd, us::metric_kind_t::ip_k); // inner product us::executor_default_t executor(std::thread::hardware_concurrency()); us::exact_search_t search; QList batchChunkIds; QList batchEmbeddings; batchChunkIds.reserve(BATCH_SIZE); batchEmbeddings.reserve(BATCH_SIZE * n_embd); struct Result { int chunkId; us::distance_punned_t dist; }; QList results; // The q parameter is expected to be the result of a QSqlQuery returning (chunk_id, embedding) pairs while (q.at() != QSql::AfterLastRow) { // batches batchChunkIds.clear(); batchEmbeddings.clear(); while (batchChunkIds.count() < BATCH_SIZE && q.next()) { // batch batchChunkIds << q.value(0).toInt(); batchEmbeddings.resize(batchEmbeddings.size() + n_embd); QVariant embdCol = q.value(1); if (embdCol.userType() != QMetaType::QByteArray) { 
qWarning() << "Database ERROR: Expected embedding to be blob, got" << embdCol.userType(); return {}; } auto *embd = static_cast(embdCol.constData()); const int embd_stride = n_embd * sizeof(float); if (embd->size() != embd_stride) { qWarning() << "Database ERROR: Expected embedding to be" << embd_stride << "bytes, got" << embd->size(); return {}; } memcpy(&*(batchEmbeddings.end() - n_embd), embd->constData(), embd_stride); } int nBatch = batchChunkIds.count(); if (!nBatch) break; // get top-k nearest neighbors of this batch int kBatch = qMin(nNeighbors, nBatch); us::exact_search_results_t batchResults = search( (us::byte_t const *)batchEmbeddings.data(), nBatch, n_embd * sizeof(float), (us::byte_t const *)query.data(), 1, n_embd * sizeof(float), kBatch, metric ); for (int i = 0; i < kBatch; ++i) { auto offset = batchResults.at(0)[i].offset; us::distance_punned_t distance = batchResults.at(0)[i].distance; results.append({batchChunkIds[offset], distance}); } } // get top-k nearest neighbors of combined results nNeighbors = qMin(nNeighbors, results.size()); std::partial_sort( results.begin(), results.begin() + nNeighbors, results.end(), [](const Result &a, const Result &b) { return a.dist < b.dist; } ); QList chunkIds; chunkIds.reserve(nNeighbors); for (int i = 0; i < nNeighbors; i++) chunkIds << results[i].chunkId; return chunkIds; } QList Database::searchEmbeddings(const std::vector &query, const QList &collections, int nNeighbors) { QSqlQuery q(m_db); if (!q.exec(GET_COLLECTION_EMBEDDINGS_SQL.arg(collections.join("', '")))) { qWarning() << "Database ERROR: Failed to exec embeddings query:" << q.lastError(); return {}; } return searchEmbeddingsHelper(query, q, nNeighbors); } QList Database::scoreChunks(const std::vector &query, const QList &chunks) { QList chunkStrings; for (int id : chunks) chunkStrings << QString::number(id); QSqlQuery q(m_db); if (!q.exec(GET_CHUNK_EMBEDDINGS_SQL.arg(chunkStrings.join(", ")))) { qWarning() << "Database ERROR: Failed to exec 
embeddings query:" << q.lastError(); return {}; } return searchEmbeddingsHelper(query, q, chunks.size()); } QList Database::queriesForFTS5(const QString &input) { // Escape double quotes by adding a second double quote QString escapedInput = input; escapedInput.replace("\"", "\"\""); static QRegularExpression spaces("\\s+"); QStringList oWords = escapedInput.split(spaces, Qt::SkipEmptyParts); QList queries; // Start by trying to match the entire input BM25Query e; e.isExact = true; e.input = oWords.join(" "); e.query = "\"" + oWords.join(" ") + "\""; e.qlength = oWords.size(); e.ilength = oWords.size(); queries << e; // https://github.com/igorbrigadir/stopwords?tab=readme-ov-file // Lucene, Solr, Elasticsearch static const QSet stopWords = { "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with" }; QStringList quotedWords; for (const QString &w : oWords) if (!stopWords.contains(w.toLower())) quotedWords << "\"" + w + "\""; BM25Query b; b.input = oWords.join(" "); b.query = "(" + quotedWords.join(" OR ") + ")"; b.qlength = 1; // length of phrase b.ilength = oWords.size(); b.rlength = oWords.size() - quotedWords.size(); queries << b; return queries; } QList Database::searchBM25(const QString &query, const QList &collections, BM25Query &bm25q, int k) { struct SearchResult { int chunkId; float score; }; QList bm25Queries = queriesForFTS5(query); QSqlQuery sqlQuery(m_db); sqlQuery.prepare(SELECT_CHUNKS_FTS_SQL.arg(collections.join("', '"), QString::number(k))); QList results; for (auto &bm25Query : std::as_const(bm25Queries)) { sqlQuery.addBindValue(bm25Query.query); if (!sqlQuery.exec()) { qWarning() << "Database ERROR: Failed to execute BM25 query:" << sqlQuery.lastError(); return {}; } if (sqlQuery.next()) { // Save the query that was used to produce results bm25q = bm25Query; break; } } if
(sqlQuery.at() != QSql::AfterLastRow) { do { const int chunkId = sqlQuery.value(0).toInt(); const float score = sqlQuery.value(1).toFloat(); results.append({chunkId, score}); } while (sqlQuery.next()); } k = qMin(k, results.size()); std::partial_sort( results.begin(), results.begin() + k, results.end(), [](const SearchResult &a, const SearchResult &b) { return a.score < b.score; } ); QList chunkIds; chunkIds.reserve(k); for (int i = 0; i < k; i++) chunkIds << results[i].chunkId; return chunkIds; } float Database::computeBM25Weight(const Database::BM25Query &bm25q) { float bmWeight = 0.0f; if (bm25q.isExact) { bmWeight = 0.9f; // the highest we give } else { // qlength is the length of the phrases in the query by number of distinct words // ilength is the length of the natural language query by number of distinct words // rlength is the number of stop words removed from the natural language query to form the query // calculate the query length weight based on the ratio of query terms to meaningful terms. // this formula adjusts the weight with the empirically determined insight that BM25's // effectiveness decreases as query length increases. 
float queryLengthWeight = 1 / powf(float(bm25q.ilength - bm25q.rlength), 2); queryLengthWeight = qBound(0.0f, queryLengthWeight, 1.0f); // the weighting is bound between 1/4 and 3/4 which was determined empirically to work well // with the beir nfcorpus, scifact, fiqa and trec-covid datasets along with our embedding // model bmWeight = 0.25f + queryLengthWeight * 0.50f; } #if 0 qDebug() << "bm25q.type" << bm25q.type << "bm25q.qlength" << bm25q.qlength << "bm25q.ilength" << bm25q.ilength << "bm25q.rlength" << bm25q.rlength << "bmWeight" << bmWeight; #endif return bmWeight; } QList Database::reciprocalRankFusion(const std::vector &query, const QList &embeddingResults, const QList &bm25Results, const BM25Query &bm25q, int k) { // We default to the embedding results and augment with bm25 if any QList results = embeddingResults; QList missingScores; QHash bm25Ranks; for (int i = 0; i < bm25Results.size(); ++i) { if (!results.contains(bm25Results[i])) missingScores.append(bm25Results[i]); bm25Ranks[bm25Results[i]] = i + 1; } if (!missingScores.isEmpty()) { QList scored = scoreChunks(query, missingScores); results << scored; } QHash embeddingRanks; for (int i = 0; i < results.size(); ++i) embeddingRanks[results[i]] = i + 1; const float bmWeight = bm25Results.isEmpty() ? 
0 : computeBM25Weight(bm25q); // From the paper: "Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods" // doi: 10.1145/1571941.157211 const int fusion_k = 60; std::stable_sort( results.begin(), results.end(), [&](const int &a, const int &b) { // Reciprocal Rank Fusion (RRF) const int aBm25Rank = bm25Ranks.value(a, bm25Results.size() + 1); const int aEmbeddingRank = embeddingRanks.value(a, embeddingResults.size() + 1); Q_ASSERT(embeddingRanks.contains(a)); const int bBm25Rank = bm25Ranks.value(b, bm25Results.size() + 1); const int bEmbeddingRank = embeddingRanks.value(b, embeddingResults.size() + 1); Q_ASSERT(embeddingRanks.contains(b)); const float aBm25Score = 1.0f / (fusion_k + aBm25Rank); const float bBm25Score = 1.0f / (fusion_k + bBm25Rank); const float aEmbeddingScore = 1.0f / (fusion_k + aEmbeddingRank); const float bEmbeddingScore = 1.0f / (fusion_k + bEmbeddingRank); const float aWeightedScore = bmWeight * aBm25Score + (1.f - bmWeight) * aEmbeddingScore; const float bWeightedScore = bmWeight * bBm25Score + (1.f - bmWeight) * bEmbeddingScore; // Higher RRF score means better ranking, so we use greater than for sorting return aWeightedScore > bWeightedScore; } ); k = qMin(k, results.size()); results.resize(k); return results; } QList Database::searchDatabase(const QString &query, const QList &collections, int k) { std::vector queryEmbd = m_embLLM->generateQueryEmbedding(query); if (queryEmbd.empty()) { qDebug() << "ERROR: generating embeddings returned a null result"; return { }; } const QList embeddingResults = searchEmbeddings(queryEmbd, collections, k); BM25Query bm25q; const QList bm25Results = searchBM25(query, collections, bm25q, k); return reciprocalRankFusion(queryEmbd, embeddingResults, bm25Results, bm25q, k); } void Database::retrieveFromDB(const QList &collections, const QString &text, int retrievalSize, QList *results) { #if defined(DEBUG) qDebug() << "retrieveFromDB" << collections << text << retrievalSize; #endif 
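The two scoring heuristics above (the BM25 mixing weight from `computeBM25Weight` and the per-chunk Reciprocal Rank Fusion contribution) can be sketched as standalone functions. This is an illustrative re-implementation under stated assumptions, not code from this file: `bm25Weight` and `rrfScore` are hypothetical names, while the constants (0.9 for exact matches, the [0.25, 0.75] bound, `fusionK = 60`) come from the source above.

```cpp
#include <algorithm>
#include <cassert>

// Weight given to BM25 vs. embedding similarity, per computeBM25Weight above.
// ilength: distinct words in the natural-language query;
// rlength: stop words removed when forming the FTS5 query.
float bm25Weight(bool isExact, int ilength, int rlength)
{
    if (isExact)
        return 0.9f; // an exact phrase match trusts BM25 almost entirely
    // BM25's usefulness decays quadratically with meaningful query length
    const float d = float(ilength - rlength);
    float w = 1.0f / (d * d);
    w = std::clamp(w, 0.0f, 1.0f); // qBound equivalent
    return 0.25f + w * 0.50f;      // bounded to [0.25, 0.75]
}

// Weighted RRF score for one chunk, given its 1-based rank in each
// result list; each ranker contributes 1/(fusionK + rank).
float rrfScore(int bm25Rank, int embeddingRank, float bmWeight, int fusionK = 60)
{
    const float bm25 = 1.0f / (fusionK + bm25Rank);
    const float emb  = 1.0f / (fusionK + embeddingRank);
    return bmWeight * bm25 + (1.0f - bmWeight) * emb;
}
```

Chunks are then sorted by descending `rrfScore`, so a chunk ranked highly by either ranker can surface even when the other ranker missed it entirely (the missing rank defaults to one past the end of that result list).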
QList searchResults = searchDatabase(text, collections, retrievalSize); if (searchResults.isEmpty()) return; QSqlQuery q(m_db); if (!selectChunk(q, searchResults)) { qDebug() << "ERROR: selecting chunks:" << q.lastError(); return; } QHash tempResults; while (q.next()) { const int rowid = q.value(0).toInt(); const QString document_path = q.value(2).toString(); const QString chunk_text = q.value(3).toString(); const QString date = QDateTime::fromMSecsSinceEpoch(q.value(1).toLongLong()).toString("yyyy, MMMM dd"); const QString file = q.value(4).toString(); const QString title = q.value(5).toString(); const QString author = q.value(6).toString(); const int page = q.value(7).toInt(); const int from = q.value(8).toInt(); const int to = q.value(9).toInt(); const QString collectionName = q.value(10).toString(); ResultInfo info; info.collection = collectionName; info.path = document_path; info.file = file; info.title = title; info.author = author; info.date = date; info.text = chunk_text; info.page = page; info.from = from; info.to = to; tempResults.insert(rowid, info); #if defined(DEBUG) qDebug() << "retrieve rowid:" << rowid << "chunk_text:" << chunk_text; #endif } for (int id : searchResults) if (tempResults.contains(id)) results->append(tempResults.value(id)); } bool Database::ftsIntegrityCheck() { QSqlQuery q(m_db); // Returns an error executing sql if the integrity check fails // See: https://www.sqlite.org/fts5.html#the_integrity_check_command const bool success = q.exec(FTS_INTEGRITY_SQL); if (!success && q.lastError().nativeErrorCode() != "267" /*SQLITE_CORRUPT_VTAB from sqlite header*/) { qWarning() << "ERROR: Cannot prepare sql for fts integrity check" << q.lastError(); return false; } if (!success && !q.exec(FTS_REBUILD_SQL)) { qWarning() << "ERROR: Cannot exec sql for fts rebuild" << q.lastError(); return false; } return true; } // FIXME This is very slow and non-interruptible and when we close the application and we're // cleaning a large table this can
cause the app to take forever to shut down. This would ideally be // interruptible and we'd continue 'cleaning' when we restart bool Database::cleanDB() { #if defined(DEBUG) qDebug() << "cleanDB"; #endif // Scan all folders in db to make sure they still exist QSqlQuery q(m_db); QList collections; if (!selectAllFromCollections(q, &collections)) { qWarning() << "ERROR: Cannot select collections" << q.lastError(); return false; } transaction(); for (const auto &i: std::as_const(collections)) { // Find the path for the folder QFileInfo info(i.folder_path); if (!info.exists() || !info.isReadable()) { #if defined(DEBUG) qDebug() << "clean db removing folder" << i.folder_id << i.folder_path; #endif if (!removeFolderInternal(i.collection, i.folder_id, i.folder_path)) { rollback(); return false; } } } // Scan all documents in db to make sure they still exist if (!q.prepare(SELECT_ALL_DOCUMENTS_SQL)) { qWarning() << "ERROR: Cannot prepare sql for select all documents" << q.lastError(); rollback(); return false; } if (!q.exec()) { qWarning() << "ERROR: Cannot exec sql for select all documents" << q.lastError(); rollback(); return false; } while (q.next()) { int document_id = q.value(0).toInt(); QString document_path = q.value(1).toString(); QFileInfo info(document_path); if (info.exists() && info.isReadable() && m_scannedFileExtensions.contains(info.suffix(), Qt::CaseInsensitive)) continue; #if defined(DEBUG) qDebug() << "clean db removing document" << document_id << document_path; #endif // Remove all chunks and documents that either don't exist or have become unreadable QSqlQuery query(m_db); if (!removeChunksByDocumentId(query, document_id)) { qWarning() << "ERROR: Cannot remove chunks of document_id" << document_id << query.lastError(); rollback(); return false; } if (!removeDocument(query, document_id)) { qWarning() << "ERROR: Cannot remove document_id" << document_id << query.lastError(); rollback(); return false; } } commit(); return true; } void 
Database::changeChunkSize(int chunkSize) { if (chunkSize == m_chunkSize) return; #if defined(DEBUG) qDebug() << "changeChunkSize" << chunkSize; #endif QSqlQuery q(m_db); // Scan all documents in db to make sure they still exist if (!q.prepare(SELECT_ALL_DOCUMENTS_SQL)) { qWarning() << "ERROR: Cannot prepare sql for select all documents" << q.lastError(); return; } if (!q.exec()) { qWarning() << "ERROR: Cannot exec sql for select all documents" << q.lastError(); return; } transaction(); while (q.next()) { int document_id = q.value(0).toInt(); // Remove all chunks and documents to change the chunk size QSqlQuery query(m_db); if (!removeChunksByDocumentId(query, document_id)) { qWarning() << "ERROR: Cannot remove chunks of document_id" << document_id << query.lastError(); return rollback(); } if (!removeDocument(query, document_id)) { qWarning() << "ERROR: Cannot remove document_id" << document_id << query.lastError(); return rollback(); } } commit(); m_chunkSize = chunkSize; addCurrentFolders(); updateCollectionStatistics(); } void Database::changeFileExtensions(const QStringList &extensions) { #if defined(DEBUG) qDebug() << "changeFileExtensions"; #endif m_scannedFileExtensions = extensions; if (cleanDB()) updateCollectionStatistics(); QSqlQuery q(m_db); QList collections; if (!selectAllFromCollections(q, &collections)) { qWarning() << "ERROR: Cannot select collections" << q.lastError(); return; } for (const auto &i: std::as_const(collections)) { if (!i.forceIndexing) scanDocuments(i.folder_id, i.folder_path); } } void Database::directoryChanged(const QString &path) { #if defined(DEBUG) qDebug() << "directoryChanged" << path; #endif // search for a collection that contains this folder (we watch subdirectories) int folder_id = -1; QDir dir(path); for (;;) { QSqlQuery q(m_db); if (!selectFolder(q, dir.path(), &folder_id)) { qWarning() << "ERROR: Cannot select folder from path" << dir.path() << q.lastError(); return; } if (folder_id != -1) break; // check next parent 
if (!dir.cdUp()) { if (!dir.isRoot()) break; // folder removed Q_ASSERT(false); qWarning() << "ERROR: Watched folder does not exist in db" << path; m_watcher->removePath(path); return; } } // Clean the database if (cleanDB()) updateCollectionStatistics(); // Rescan the documents associated with the folder if (folder_id != -1) scanDocuments(folder_id, path); } ================================================ FILE: gpt4all-chat/src/database.h ================================================ #ifndef DATABASE_H #define DATABASE_H #include "embllm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include // IWYU pragma: keep using namespace Qt::Literals::StringLiterals; class Database; class DocumentReader; class QFileSystemWatcher; class QSqlQuery; class QTextStream; class QTimer; /* Version 0: GPT4All v2.4.3, full-text search * Version 1: GPT4All v2.5.3, embeddings in hsnwlib * Version 2: GPT4All v3.0.0, embeddings in sqlite * Version 3: GPT4All v3.4.0, hybrid search */ // minimum supported version static const int LOCALDOCS_MIN_VER = 1; // FIXME: (Adam) The next time we bump the version we should add triggers to manage the fts external // content table as recommended in the official documentation to keep the fts index in sync // See: https://www.sqlite.org/fts5.html#external_content_tables // FIXME: (Adam) The fts virtual table should include the chunk_id explicitly instead of relying upon // the id of the two tables to be in sync // current version static const int LOCALDOCS_VERSION = 3; struct DocumentInfo { using key_type = std::pair; int folder; QFileInfo file; bool currentlyProcessing = false; key_type key() const { return {folder, file.canonicalFilePath()}; } // for comparison bool isPdf () const { return !file.suffix().compare("pdf"_L1, 
Qt::CaseInsensitive); } bool isDocx() const { return !file.suffix().compare("docx"_L1, Qt::CaseInsensitive); } }; struct ResultInfo { Q_GADGET Q_PROPERTY(QString collection MEMBER collection) Q_PROPERTY(QString path MEMBER path) Q_PROPERTY(QString file MEMBER file) Q_PROPERTY(QString title MEMBER title) Q_PROPERTY(QString author MEMBER author) Q_PROPERTY(QString date MEMBER date) Q_PROPERTY(QString text MEMBER text) Q_PROPERTY(int page MEMBER page) Q_PROPERTY(int from MEMBER from) Q_PROPERTY(int to MEMBER to) Q_PROPERTY(QString fileUri READ fileUri STORED false) public: QString collection; // [Required] The name of the collection QString path; // [Required] The full path QString file; // [Required] The name of the file, but not the full path QString title; // [Optional] The title of the document QString author; // [Optional] The author of the document QString date; // [Required] The creation or the last modification date whichever is latest QString text; // [Required] The text actually used in the augmented context int page = -1; // [Optional] The page where the text was found int from = -1; // [Optional] The line number where the text begins int to = -1; // [Optional] The line number where the text ends QString fileUri() const { // QUrl reserved chars that are not UNSAFE_PATH according to glib/gconvert.c static const QByteArray s_exclude = "!$&'()*+,/:=@~"_ba; Q_ASSERT(!QFileInfo(path).isRelative()); #ifdef Q_OS_WINDOWS Q_ASSERT(!path.contains('\\')); // Qt normally uses forward slash as path separator #endif auto escaped = QString::fromUtf8(QUrl::toPercentEncoding(path, s_exclude)); if (escaped.front() != '/') escaped = '/' + escaped; return u"file://"_s + escaped; } bool operator==(const ResultInfo &other) const { return file == other.file && title == other.title && author == other.author && date == other.date && text == other.text && page == other.page && from == other.from && to == other.to; } bool operator!=(const ResultInfo &other) const { return !(*this == 
other); } }; Q_DECLARE_METATYPE(ResultInfo) struct CollectionItem { // -- Fields persisted to database -- int collection_id = -1; int folder_id = -1; QString collection; QString folder_path; QString embeddingModel; // -- Transient fields -- bool installed = false; bool indexing = false; bool forceIndexing = false; QString error; // progress int currentDocsToIndex = 0; int totalDocsToIndex = 0; size_t currentBytesToIndex = 0; size_t totalBytesToIndex = 0; size_t currentEmbeddingsToIndex = 0; size_t totalEmbeddingsToIndex = 0; // statistics size_t totalDocs = 0; size_t totalWords = 0; size_t totalTokens = 0; QDateTime startUpdate; QDateTime lastUpdate; QString fileCurrentlyProcessing; }; Q_DECLARE_METATYPE(CollectionItem) class ChunkStreamer { public: enum class Status { DOC_COMPLETE, INTERRUPTED, ERROR, BINARY_SEEN }; explicit ChunkStreamer(Database *database); ~ChunkStreamer(); void setDocument(DocumentInfo doc, int documentId, const QString &embeddingModel); std::optional currentDocKey() const; void reset(); Status step(); private: Database *m_database; std::optional m_docKey; std::unique_ptr m_reader; // may be invalid, always compare key first int m_documentId; QString m_embeddingModel; QString m_title; QString m_author; QString m_subject; QString m_keywords; // working state QString m_chunk; // has a trailing space for convenience int m_nChunkWords = 0; int m_page = 0; }; class Database : public QObject { Q_OBJECT public: Database(int chunkSize, QStringList extensions); ~Database() override; bool isValid() const { return m_databaseValid; } public Q_SLOTS: void start(); bool scanQueueInterrupted() const; void scanQueueBatch(); void scanDocuments(int folder_id, const QString &folder_path); void forceIndexing(const QString &collection, const QString &embedding_model); void forceRebuildFolder(const QString &path); bool addFolder(const QString &collection, const QString &path, const QString &embedding_model); void removeFolder(const QString &collection, const 
QString &path); void retrieveFromDB(const QList &collections, const QString &text, int retrievalSize, QList *results); void changeChunkSize(int chunkSize); void changeFileExtensions(const QStringList &extensions); Q_SIGNALS: // Signals for the gui only void requestUpdateGuiForCollectionItem(const CollectionItem &item); void requestAddGuiCollectionItem(const CollectionItem &item); void requestRemoveGuiFolderById(const QString &collection, int folder_id); void requestGuiCollectionListUpdated(const QList &collectionList); void databaseValidChanged(); private Q_SLOTS: void directoryChanged(const QString &path); void addCurrentFolders(); void handleEmbeddingsGenerated(const QVector &embeddings); void handleErrorGenerated(const QVector &chunks, const QString &error); private: void transaction(); void commit(); void rollback(); bool addChunk(QSqlQuery &q, int document_id, const QString &chunk_text, const QString &file, const QString &title, const QString &author, const QString &subject, const QString &keywords, int page, int from, int to, int words, int *chunk_id); bool refreshDocumentIdCache(QSqlQuery &q); bool removeChunksByDocumentId(QSqlQuery &q, int document_id); bool sqlRemoveDocsByFolderPath(QSqlQuery &q, const QString &path); bool hasContent(); // not found -> 0, exists and has content -> 1, error -> -1 int openDatabase(const QString &modelPath, bool create = true, int ver = LOCALDOCS_VERSION); bool openLatestDb(const QString &modelPath, QList &oldCollections); bool initDb(const QString &modelPath, const QList &oldCollections); int checkAndAddFolderToDB(const QString &path); bool removeFolderInternal(const QString &collection, int folder_id, const QString &path); size_t chunkStream(QTextStream &stream, int folder_id, int document_id, const QString &embedding_model, const QString &file, const QString &title, const QString &author, const QString &subject, const QString &keywords, int page, int maxChunks = -1); void appendChunk(const EmbeddingChunk &chunk); void
sendChunkList(); void updateFolderToIndex(int folder_id, size_t countForFolder, bool sendChunks = true); size_t countOfDocuments(int folder_id) const; size_t countOfBytes(int folder_id) const; DocumentInfo dequeueDocument(); void removeFolderFromDocumentQueue(int folder_id); void enqueueDocumentInternal(DocumentInfo &&info, bool prepend = false); void enqueueDocuments(int folder_id, std::list &&infos); void scanQueue(); bool ftsIntegrityCheck(); bool cleanDB(); void addFolderToWatch(const QString &path); void removeFolderFromWatch(const QString &path); static QList searchEmbeddingsHelper(const std::vector &query, QSqlQuery &q, int nNeighbors); QList searchEmbeddings(const std::vector &query, const QList &collections, int nNeighbors); struct BM25Query { QString input; QString query; bool isExact = false; int qlength = 0; int ilength = 0; int rlength = 0; }; QList queriesForFTS5(const QString &input); QList searchBM25(const QString &query, const QList &collections, BM25Query &bm25q, int k); QList scoreChunks(const std::vector &query, const QList &chunks); float computeBM25Weight(const BM25Query &bm25q); QList reciprocalRankFusion(const std::vector &query, const QList &embeddingResults, const QList &bm25Results, const BM25Query &bm25q, int k); QList searchDatabase(const QString &query, const QList &collections, int k); void setStartUpdateTime(CollectionItem &item); void setLastUpdateTime(CollectionItem &item); CollectionItem guiCollectionItem(int folder_id) const; void updateGuiForCollectionItem(const CollectionItem &item); void addGuiCollectionItem(const CollectionItem &item); void removeGuiFolderById(const QString &collection, int folder_id); void guiCollectionListUpdated(const QList &collectionList); void scheduleUncompletedEmbeddings(); void updateCollectionStatistics(); private: QSqlDatabase m_db; int m_chunkSize; QStringList m_scannedFileExtensions; QTimer *m_scanIntervalTimer; QElapsedTimer m_scanDurationTimer; std::map> m_docsToScan; QList m_retrieve; QThread 
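The search declarations above (`searchEmbeddings`, `searchBM25`, `reciprocalRankFusion`) implement the hybrid retrieval added in LocalDocs version 3: the embedding and BM25 result lists are merged by reciprocal rank fusion. A minimal sketch of RRF without Qt types; the item ids, the weight split, and the constant `k = 60` are illustrative, not the exact values the chat client uses:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Fuse two ranked lists: each item scores weight / (k + rank) per list it
// appears in, so items ranked highly by both retrievers rise to the top.
std::vector<std::string> fuseRanks(const std::vector<std::string> &embeddingRanked,
                                   const std::vector<std::string> &bm25Ranked,
                                   double bm25Weight, int k = 60)
{
    std::map<std::string, double> score;
    for (size_t r = 0; r < embeddingRanked.size(); ++r)
        score[embeddingRanked[r]] += (1.0 - bm25Weight) / (k + r + 1); // ranks are 1-based
    for (size_t r = 0; r < bm25Ranked.size(); ++r)
        score[bm25Ranked[r]] += bm25Weight / (k + r + 1);

    std::vector<std::string> fused;
    for (const auto &kv : score)
        fused.push_back(kv.first);
    std::sort(fused.begin(), fused.end(), [&](const std::string &a, const std::string &b) {
        return score[a] > score[b];  // highest fused score first
    });
    return fused;
}
```

In the real code `computeBM25Weight` adjusts the split depending on the query (exact phrases lean harder on BM25); here the weight is simply a parameter.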
m_dbThread; QFileSystemWatcher *m_watcher; QSet m_watchedPaths; EmbeddingLLM *m_embLLM; QVector m_chunkList; QHash m_collectionMap; // used only for tracking indexing/embedding progress std::atomic m_databaseValid; ChunkStreamer m_chunkStreamer; QSet m_documentIdCache; // cached list of documents with chunks for fast lookup friend class ChunkStreamer; }; #endif // DATABASE_H ================================================ FILE: gpt4all-chat/src/download.cpp ================================================ #include "download.h" #include "modellist.h" #include "mysettings.h" #include "network.h" #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include using namespace Qt::Literals::StringLiterals; class MyDownload: public Download { }; Q_GLOBAL_STATIC(MyDownload, downloadInstance) Download *Download::globalInstance() { return downloadInstance(); } Download::Download() : QObject(nullptr) , m_hashAndSave(new HashAndSaveFile) { connect(this, &Download::requestHashAndSave, m_hashAndSave, &HashAndSaveFile::hashAndSave, Qt::QueuedConnection); connect(m_hashAndSave, &HashAndSaveFile::hashAndSaveFinished, this, &Download::handleHashAndSaveFinished, Qt::QueuedConnection); connect(&m_networkManager, &QNetworkAccessManager::sslErrors, this, &Download::handleSslErrors); updateLatestNews(); updateReleaseNotes(); m_startTime = QDateTime::currentDateTime(); } std::strong_ordering Download::compareAppVersions(const QString &a, const QString &b) { static QRegularExpression versionRegex(R"(^(\d+(?:\.\d+){0,2})(-.+)?$)"); // When comparing versions, make sure a2 < a10. 
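`Download::compareAppVersions` (defined in download.cpp just below) compares the dotted numeric components with a numeric-mode collator so that `a2 < a10`, then orders pre/post-release suffixes as `-dev` < `-rc` < no suffix < any other suffix. A plain C++ sketch of that scheme, without `QCollator` or `QRegularExpression` (the function names here are illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// -dev pre-releases sort lowest, then -rc, then the bare release,
// then any other (post-release) suffix.
int suffixOrder(const std::string &suffix)
{
    if (suffix.rfind("-dev", 0) == 0) return 0;
    if (suffix.rfind("-rc", 0) == 0)  return 1;
    if (suffix.empty())               return 2;
    return 3;  // some other suffix, e.g. -post1
}

// Returns <0, 0, or >0 like strcmp; components compare numerically.
int compareVersions(const std::string &a, const std::string &b)
{
    auto split = [](const std::string &v, std::vector<int> &parts, std::string &suffix) {
        size_t dash = v.find('-');
        suffix = dash == std::string::npos ? "" : v.substr(dash);
        std::istringstream in(v.substr(0, dash));
        for (std::string tok; std::getline(in, tok, '.'); )
            parts.push_back(std::stoi(tok));
    };
    std::vector<int> ap, bp;
    std::string as, bs;
    split(a, ap, as);
    split(b, bp, bs);
    // Missing components count as 0, so 3.0 == 3.0.0.
    for (size_t i = 0; i < std::max(ap.size(), bp.size()); ++i) {
        int ai = i < ap.size() ? ap[i] : 0, bi = i < bp.size() ? bp[i] : 0;
        if (ai != bi) return ai < bi ? -1 : 1;  // numeric, so 3.2 < 3.10
    }
    if (suffixOrder(as) != suffixOrder(bs))
        return suffixOrder(as) < suffixOrder(bs) ? -1 : 1;
    return as == bs ? 0 : (as < bs ? -1 : 1);  // 3.0.0-rc1 < 3.0.0-rc2
}
```

Unlike the real function, this sketch does not handle entirely invalid version strings, which the Qt code sorts below any valid version.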
QCollator versionCollator(QLocale(QLocale::English, QLocale::UnitedStates)); versionCollator.setNumericMode(true); QRegularExpressionMatch aMatch = versionRegex.match(a); QRegularExpressionMatch bMatch = versionRegex.match(b); Q_ASSERT(aMatch.hasMatch() && bMatch.hasMatch()); // expect valid versions // Check for an invalid version. foo < 3.0.0 -> !hasMatch < hasMatch if (auto diff = aMatch.hasMatch() <=> bMatch.hasMatch(); diff != 0) return diff; // invalid version compares as lower // Compare invalid versions. fooa < foob if (!aMatch.hasMatch() && !bMatch.hasMatch()) return versionCollator.compare(a, b) <=> 0; // lexicographic comparison // Compare first three components. 3.0.0 < 3.0.1 QStringList aParts = aMatch.captured(1).split('.'); QStringList bParts = bMatch.captured(1).split('.'); for (int i = 0; i < qMax(aParts.size(), bParts.size()); i++) { bool ok = false; int aInt = aParts.value(i, "0").toInt(&ok); Q_ASSERT(ok); int bInt = bParts.value(i, "0").toInt(&ok); Q_ASSERT(ok); if (auto diff = aInt <=> bInt; diff != 0) return diff; // version with lower component compares as lower } // Check for a pre/post-release suffix. 3.0.0-dev0 < 3.0.0-rc1 < 3.0.0 < 3.0.0-post1 auto getSuffixOrder = [](const QRegularExpressionMatch &match) -> int { QString suffix = match.captured(2); return suffix.startsWith("-dev") ? 0 : suffix.startsWith("-rc") ? 1 : suffix.isEmpty() ? 2 : /* some other suffix */ 3; }; if (auto diff = getSuffixOrder(aMatch) <=> getSuffixOrder(bMatch); diff != 0) return diff; // different suffix types // Lexicographic comparison of suffix. 
3.0.0-rc1 < 3.0.0-rc2 if (aMatch.hasCaptured(2) && bMatch.hasCaptured(2)) { if (auto diff = versionCollator.compare(aMatch.captured(2), bMatch.captured(2)); diff != 0) return diff <=> 0; } return std::strong_ordering::equal; } ReleaseInfo Download::releaseInfo() const { const QString currentVersion = QCoreApplication::applicationVersion(); if (m_releaseMap.contains(currentVersion)) return m_releaseMap.value(currentVersion); if (!m_releaseMap.empty()) return m_releaseMap.last(); return ReleaseInfo(); } bool Download::hasNewerRelease() const { const QString currentVersion = QCoreApplication::applicationVersion(); for (const auto &version : m_releaseMap.keys()) { if (compareAppVersions(version, currentVersion) > 0) return true; } return false; } bool Download::isFirstStart(bool writeVersion) const { auto *mySettings = MySettings::globalInstance(); QSettings settings; QString lastVersionStarted = settings.value("download/lastVersionStarted").toString(); bool first = lastVersionStarted != QCoreApplication::applicationVersion(); if (first && writeVersion) { settings.setValue("download/lastVersionStarted", QCoreApplication::applicationVersion()); // let the user select these again settings.remove("network/usageStatsActive"); settings.remove("network/isActive"); emit mySettings->networkUsageStatsActiveChanged(); emit mySettings->networkIsActiveChanged(); } return first || !mySettings->isNetworkUsageStatsActiveSet() || !mySettings->isNetworkIsActiveSet(); } void Download::updateReleaseNotes() { QUrl jsonUrl("http://gpt4all.io/meta/release.json"); QNetworkRequest request(jsonUrl); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *jsonReply = m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, jsonReply, &QNetworkReply::abort); connect(jsonReply, &QNetworkReply::finished, this, &Download::handleReleaseJsonDownloadFinished); } void 
Download::updateLatestNews() { QUrl url("http://gpt4all.io/meta/latestnews.md"); QNetworkRequest request(url); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *reply = m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort); connect(reply, &QNetworkReply::finished, this, &Download::handleLatestNewsDownloadFinished); } void Download::downloadModel(const QString &modelFile) { QFile *tempFile = new QFile(ModelList::globalInstance()->incompleteDownloadPath(modelFile)); bool success = tempFile->open(QIODevice::WriteOnly | QIODevice::Append); qWarning() << "Opening temp file for writing:" << tempFile->fileName(); if (!success) { const QString error = u"ERROR: Could not open temp file: %1 %2"_s.arg(tempFile->fileName(), modelFile); qWarning() << error; clearRetry(modelFile); ModelList::globalInstance()->updateDataByFilename(modelFile, {{ ModelList::DownloadErrorRole, error }}); return; } tempFile->flush(); size_t incomplete_size = tempFile->size(); if (incomplete_size > 0) { bool success = tempFile->seek(incomplete_size); if (!success) { incomplete_size = 0; success = tempFile->seek(incomplete_size); Q_ASSERT(success); } } if (!ModelList::globalInstance()->containsByFilename(modelFile)) { qWarning() << "ERROR: Could not find file:" << modelFile; return; } ModelList::globalInstance()->updateDataByFilename(modelFile, {{ ModelList::DownloadingRole, true }}); ModelInfo info = ModelList::globalInstance()->modelInfoByFilename(modelFile); QString url = !info.url().isEmpty() ? 
info.url() : "http://gpt4all.io/models/gguf/" + modelFile; Network::globalInstance()->trackEvent("download_started", { {"model", modelFile} }); QNetworkRequest request(url); request.setAttribute(QNetworkRequest::User, modelFile); request.setRawHeader("range", u"bytes=%1-"_s.arg(tempFile->pos()).toUtf8()); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *modelReply = m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, modelReply, &QNetworkReply::abort); connect(modelReply, &QNetworkReply::downloadProgress, this, &Download::handleDownloadProgress); connect(modelReply, &QNetworkReply::errorOccurred, this, &Download::handleErrorOccurred); connect(modelReply, &QNetworkReply::finished, this, &Download::handleModelDownloadFinished); connect(modelReply, &QNetworkReply::readyRead, this, &Download::handleReadyRead); m_activeDownloads.insert(modelReply, tempFile); } void Download::cancelDownload(const QString &modelFile) { for (auto [modelReply, tempFile]: m_activeDownloads.asKeyValueRange()) { QUrl url = modelReply->request().url(); if (url.toString().endsWith(modelFile)) { Network::globalInstance()->trackEvent("download_canceled", { {"model", modelFile} }); // Disconnect the signals disconnect(modelReply, &QNetworkReply::downloadProgress, this, &Download::handleDownloadProgress); disconnect(modelReply, &QNetworkReply::finished, this, &Download::handleModelDownloadFinished); modelReply->abort(); // Abort the download modelReply->deleteLater(); // Schedule the reply for deletion tempFile->deleteLater(); m_activeDownloads.remove(modelReply); ModelList::globalInstance()->updateDataByFilename(modelFile, {{ ModelList::DownloadingRole, false }}); break; } } } void Download::installModel(const QString &modelFile, const QString &apiKey) { Q_ASSERT(!apiKey.isEmpty()); if (apiKey.isEmpty()) return; Network::globalInstance()->trackEvent("install_model", { 
{"model", modelFile} }); QString filePath = MySettings::globalInstance()->modelPath() + modelFile; QFile file(filePath); if (file.open(QIODeviceBase::WriteOnly | QIODeviceBase::Text)) { QJsonObject obj; QString modelName(modelFile); modelName.remove(0, 8); // strip "gpt4all-" prefix modelName.chop(7); // strip ".rmodel" extension obj.insert("apiKey", apiKey); obj.insert("modelName", modelName); QJsonDocument doc(obj); QTextStream stream(&file); stream << doc.toJson(); file.close(); ModelList::globalInstance()->updateModelsFromDirectory(); emit toastMessage(tr("Model \"%1\" was installed successfully.").arg(modelName)); } ModelList::globalInstance()->updateDataByFilename(modelFile, {{ ModelList::InstalledRole, true }}); } void Download::installCompatibleModel(const QString &modelName, const QString &apiKey, const QString &baseUrl) { Q_ASSERT(!modelName.isEmpty()); if (modelName.isEmpty()) { emit toastMessage(tr("ERROR: $MODEL_NAME is empty.")); return; } Q_ASSERT(!apiKey.isEmpty()); if (apiKey.isEmpty()) { emit toastMessage(tr("ERROR: $API_KEY is empty.")); return; } QUrl apiBaseUrl(QUrl::fromUserInput(baseUrl)); if (!Network::isHttpUrlValid(baseUrl)) { emit toastMessage(tr("ERROR: $BASE_URL is invalid.")); return; } QString modelFile(ModelList::compatibleModelFilename(baseUrl, modelName)); if (ModelList::globalInstance()->contains(modelFile)) { emit toastMessage(tr("ERROR: Model \"%1 (%2)\" conflicts with an existing model.").arg(modelName, baseUrl)); return; } ModelList::globalInstance()->addModel(modelFile); Network::globalInstance()->trackEvent("install_model", { {"model", modelFile} }); QString filePath = MySettings::globalInstance()->modelPath() + modelFile; QFile file(filePath); if (file.open(QIODeviceBase::WriteOnly | QIODeviceBase::Text)) { QJsonObject obj; obj.insert("apiKey", apiKey); obj.insert("modelName", modelName); obj.insert("baseUrl", apiBaseUrl.toString()); QJsonDocument doc(obj); QTextStream stream(&file); stream << doc.toJson(); file.close();
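`installModel` above derives the display name from the stored filename: the file is named `gpt4all-<name>.rmodel`, so the code strips the 8-character `"gpt4all-"` prefix and the 7-character `".rmodel"` extension (the `remove(0, 8)` and `chop(7)` calls). A small non-Qt sketch of that derivation; the helper name is hypothetical:

```cpp
#include <cassert>
#include <string>

// Mirror of installModel's name derivation, made defensive: only strip the
// prefix/extension when they are actually present.
std::string modelNameFromRmodelFile(std::string file)
{
    const std::string prefix = "gpt4all-", ext = ".rmodel";
    if (file.rfind(prefix, 0) == 0)
        file.erase(0, prefix.size());              // strip "gpt4all-" prefix
    if (file.size() >= ext.size()
            && file.compare(file.size() - ext.size(), ext.size(), ext) == 0)
        file.erase(file.size() - ext.size());      // strip ".rmodel" extension
    return file;
}
```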
ModelList::globalInstance()->updateModelsFromDirectory(); emit toastMessage(tr("Model \"%1 (%2)\" was installed successfully.").arg(modelName, baseUrl)); } ModelList::globalInstance()->updateDataByFilename(modelFile, {{ ModelList::InstalledRole, true }}); } void Download::removeModel(const QString &modelFile) { const QString filePath = MySettings::globalInstance()->modelPath() + modelFile; QFile incompleteFile(ModelList::globalInstance()->incompleteDownloadPath(modelFile)); if (incompleteFile.exists()) { incompleteFile.remove(); } bool shouldRemoveInstalled = false; QFile file(filePath); if (file.exists()) { const ModelInfo info = ModelList::globalInstance()->modelInfoByFilename(modelFile); MySettings::globalInstance()->eraseModel(info); shouldRemoveInstalled = info.installed && !info.isClone() && (info.isDiscovered() || info.isCompatibleApi || info.description() == "" /*indicates sideloaded*/); if (shouldRemoveInstalled) ModelList::globalInstance()->removeInstalled(info); Network::globalInstance()->trackEvent("remove_model", { {"model", modelFile} }); file.remove(); emit toastMessage(tr("Model \"%1\" was removed.").arg(info.name())); } if (!shouldRemoveInstalled) { QVector<QPair<int, QVariant>> data { { ModelList::InstalledRole, false }, { ModelList::BytesReceivedRole, 0 }, { ModelList::BytesTotalRole, 0 }, { ModelList::TimestampRole, 0 }, { ModelList::SpeedRole, QString() }, { ModelList::DownloadErrorRole, QString() }, }; ModelList::globalInstance()->updateDataByFilename(modelFile, data); } } void Download::handleSslErrors(QNetworkReply *reply, const QList<QSslError> &errors) { QUrl url = reply->request().url(); for (const auto &e : errors) qWarning() << "ERROR: Received ssl error:" << e.errorString() << "for" << url; } void Download::handleReleaseJsonDownloadFinished() { QNetworkReply *jsonReply = qobject_cast<QNetworkReply *>(sender()); if (!jsonReply) return; QByteArray jsonData = jsonReply->readAll(); jsonReply->deleteLater(); parseReleaseJsonFile(jsonData); } void Download::parseReleaseJsonFile(const
QByteArray &jsonData) { QJsonParseError err; QJsonDocument document = QJsonDocument::fromJson(jsonData, &err); if (err.error != QJsonParseError::NoError) { qWarning() << "ERROR: Couldn't parse: " << jsonData << err.errorString(); return; } QJsonArray jsonArray = document.array(); m_releaseMap.clear(); for (const QJsonValue &value : jsonArray) { QJsonObject obj = value.toObject(); QString version = obj["version"].toString(); // "notes" field intentionally has a trailing newline for compatibility QString notes = obj["notes"].toString().trimmed(); QString contributors = obj["contributors"].toString().trimmed(); ReleaseInfo releaseInfo; releaseInfo.version = version; releaseInfo.notes = notes; releaseInfo.contributors = contributors; m_releaseMap.insert(version, releaseInfo); } emit hasNewerReleaseChanged(); emit releaseInfoChanged(); } void Download::handleLatestNewsDownloadFinished() { QNetworkReply *reply = qobject_cast(sender()); if (!reply) return; if (reply->error() != QNetworkReply::NoError) { qWarning() << "ERROR: network error occurred attempting to download latest news:" << reply->errorString(); reply->deleteLater(); return; } QByteArray responseData = reply->readAll(); m_latestNews = QString::fromUtf8(responseData); reply->deleteLater(); emit latestNewsChanged(); } bool Download::hasRetry(const QString &filename) const { return m_activeRetries.contains(filename); } bool Download::shouldRetry(const QString &filename) { int retries = 0; if (m_activeRetries.contains(filename)) retries = m_activeRetries.value(filename); ++retries; // Allow up to ten retries for now if (retries < 10) { m_activeRetries.insert(filename, retries); return true; } return false; } void Download::clearRetry(const QString &filename) { m_activeRetries.remove(filename); } void Download::handleErrorOccurred(QNetworkReply::NetworkError code) { QNetworkReply *modelReply = qobject_cast(sender()); if (!modelReply) return; // This occurs when the user explicitly cancels the download if (code == 
QNetworkReply::OperationCanceledError) return; QString modelFilename = modelReply->request().attribute(QNetworkRequest::User).toString(); if (shouldRetry(modelFilename)) { downloadModel(modelFilename); return; } clearRetry(modelFilename); const QString error = u"ERROR: Network error occurred attempting to download %1 code: %2 errorString %3"_s .arg(modelFilename) .arg(code) .arg(modelReply->errorString()); qWarning() << error; ModelList::globalInstance()->updateDataByFilename(modelFilename, {{ ModelList::DownloadErrorRole, error }}); Network::globalInstance()->trackEvent("download_error", { {"model", modelFilename}, {"code", (int)code}, {"error", modelReply->errorString()}, }); cancelDownload(modelFilename); } void Download::handleDownloadProgress(qint64 bytesReceived, qint64 bytesTotal) { QNetworkReply *modelReply = qobject_cast(sender()); if (!modelReply) return; QFile *tempFile = m_activeDownloads.value(modelReply); if (!tempFile) return; QString contentRange = modelReply->rawHeader("content-range"); if (contentRange.contains("/")) { QString contentTotalSize = contentRange.split("/").last(); bytesTotal = contentTotalSize.toLongLong(); } const QString modelFilename = modelReply->request().attribute(QNetworkRequest::User).toString(); const qint64 lastUpdate = ModelList::globalInstance()->dataByFilename(modelFilename, ModelList::TimestampRole).toLongLong(); const qint64 currentUpdate = QDateTime::currentMSecsSinceEpoch(); if (currentUpdate - lastUpdate < 1000) return; const qint64 lastBytesReceived = ModelList::globalInstance()->dataByFilename(modelFilename, ModelList::BytesReceivedRole).toLongLong(); const qint64 currentBytesReceived = tempFile->pos(); qint64 timeDifference = currentUpdate - lastUpdate; qint64 bytesDifference = currentBytesReceived - lastBytesReceived; qint64 speed = (bytesDifference / timeDifference) * 1000; // bytes per second QString speedText; if (speed < 1024) speedText = QString::number(static_cast(speed), 'f', 2) + " B/s"; else if (speed < 
1024 * 1024) speedText = QString::number(static_cast(speed / 1024.0), 'f', 2) + " KB/s"; else speedText = QString::number(static_cast(speed / (1024.0 * 1024.0)), 'f', 2) + " MB/s"; QVector> data { { ModelList::BytesReceivedRole, currentBytesReceived }, { ModelList::BytesTotalRole, bytesTotal }, { ModelList::SpeedRole, speedText }, { ModelList::TimestampRole, currentUpdate }, }; ModelList::globalInstance()->updateDataByFilename(modelFilename, data); } HashAndSaveFile::HashAndSaveFile() : QObject(nullptr) { moveToThread(&m_hashAndSaveThread); m_hashAndSaveThread.setObjectName("hashandsave thread"); m_hashAndSaveThread.start(); } void HashAndSaveFile::hashAndSave(const QString &expectedHash, QCryptographicHash::Algorithm a, const QString &saveFilePath, QFile *tempFile, QNetworkReply *modelReply) { Q_ASSERT(!tempFile->isOpen()); QString modelFilename = modelReply->request().attribute(QNetworkRequest::User).toString(); // Reopen the tempFile for hashing if (!tempFile->open(QIODevice::ReadOnly)) { const QString error = u"ERROR: Could not open temp file for hashing: %1 %2"_s.arg(tempFile->fileName(), modelFilename); qWarning() << error; emit hashAndSaveFinished(false, error, tempFile, modelReply); return; } QCryptographicHash hash(a); while(!tempFile->atEnd()) hash.addData(tempFile->read(16384)); if (hash.result().toHex() != expectedHash.toLatin1()) { tempFile->close(); const QString error = u"ERROR: Download error hash did not match: %1 != %2 for %3"_s .arg(hash.result().toHex(), expectedHash.toLatin1(), modelFilename); qWarning() << error; tempFile->remove(); emit hashAndSaveFinished(false, error, tempFile, modelReply); return; } // The file save needs the tempFile closed tempFile->close(); // Attempt to *move* the verified tempfile into place - this should be atomic // but will only work if the destination is on the same filesystem if (tempFile->rename(saveFilePath)) { emit hashAndSaveFinished(true, QString(), tempFile, modelReply); 
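`hashAndSave` above first attempts an atomic rename of the verified temp file into place, which only works when source and destination are on the same filesystem, and falls back to a streamed copy otherwise. A minimal sketch of that move-then-copy pattern with `std::filesystem`; the helper name and paths are illustrative:

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Try an atomic rename first; on failure (e.g. a cross-volume destination),
// fall back to copying the file and then removing the source.
bool moveOrCopy(const fs::path &from, const fs::path &to)
{
    std::error_code ec;
    fs::rename(from, to, ec);  // atomic move attempt
    if (!ec)
        return true;
    if (!fs::copy_file(from, to, fs::copy_options::overwrite_existing, ec) || ec)
        return false;          // copy failed: leave the source in place
    fs::remove(from, ec);      // best-effort cleanup of the source
    return true;
}
```

The real code additionally verifies the file's hash before moving and reports failures back through `hashAndSaveFinished`.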
ModelList::globalInstance()->updateModelsFromDirectory(); return; } // Reopen the tempFile for copying if (!tempFile->open(QIODevice::ReadOnly)) { const QString error = u"ERROR: Could not open temp file at finish: %1 %2"_s.arg(tempFile->fileName(), modelFilename); qWarning() << error; emit hashAndSaveFinished(false, error, tempFile, modelReply); return; } // Save the model file to disk QFile file(saveFilePath); if (file.open(QIODevice::WriteOnly)) { QByteArray buffer; while (!tempFile->atEnd()) { buffer = tempFile->read(16384); file.write(buffer); } file.close(); tempFile->close(); emit hashAndSaveFinished(true, QString(), tempFile, modelReply); } else { QFile::FileError error = file.error(); const QString errorString = u"ERROR: Could not save model to location: %1 failed with code %2"_s.arg(saveFilePath).arg(error); qWarning() << errorString; tempFile->close(); emit hashAndSaveFinished(false, errorString, tempFile, modelReply); } ModelList::globalInstance()->updateModelsFromDirectory(); } void Download::handleModelDownloadFinished() { QNetworkReply *modelReply = qobject_cast<QNetworkReply *>(sender()); if (!modelReply) return; QString modelFilename = modelReply->request().attribute(QNetworkRequest::User).toString(); QFile *tempFile = m_activeDownloads.value(modelReply); m_activeDownloads.remove(modelReply); if (modelReply->error()) { const QString errorString = u"ERROR: Downloading failed with code %1 \"%2\""_s.arg(modelReply->error()).arg(modelReply->errorString()); qWarning() << errorString; modelReply->deleteLater(); tempFile->deleteLater(); if (!hasRetry(modelFilename)) { QVector<QPair<int, QVariant>> data { { ModelList::DownloadingRole, false }, { ModelList::DownloadErrorRole, errorString }, }; ModelList::globalInstance()->updateDataByFilename(modelFilename, data); } return; } clearRetry(modelFilename); // The hash and save needs the tempFile closed tempFile->close(); if (!ModelList::globalInstance()->containsByFilename(modelFilename)) { qWarning() << "ERROR: downloading no such file:" <<
modelFilename; modelReply->deleteLater(); tempFile->deleteLater(); return; } // Notify that we are calculating hash ModelList::globalInstance()->updateDataByFilename(modelFilename, {{ ModelList::CalcHashRole, true }}); QByteArray hash = ModelList::globalInstance()->modelInfoByFilename(modelFilename).hash; ModelInfo::HashAlgorithm hashAlgorithm = ModelList::globalInstance()->modelInfoByFilename(modelFilename).hashAlgorithm; const QString saveFilePath = MySettings::globalInstance()->modelPath() + modelFilename; emit requestHashAndSave(hash, (hashAlgorithm == ModelInfo::Md5 ? QCryptographicHash::Md5 : QCryptographicHash::Sha256), saveFilePath, tempFile, modelReply); } void Download::handleHashAndSaveFinished(bool success, const QString &error, QFile *tempFile, QNetworkReply *modelReply) { // The hash and save should send back with tempfile closed Q_ASSERT(!tempFile->isOpen()); QString modelFilename = modelReply->request().attribute(QNetworkRequest::User).toString(); Network::globalInstance()->trackEvent("download_finished", { {"model", modelFilename}, {"success", success} }); QVector> data { { ModelList::CalcHashRole, false }, { ModelList::DownloadingRole, false }, }; modelReply->deleteLater(); tempFile->deleteLater(); if (!success) { data.append({ ModelList::DownloadErrorRole, error }); } else { data.append({ ModelList::DownloadErrorRole, QString() }); ModelInfo info = ModelList::globalInstance()->modelInfoByFilename(modelFilename); if (info.isDiscovered()) ModelList::globalInstance()->updateDiscoveredInstalled(info); } ModelList::globalInstance()->updateDataByFilename(modelFilename, data); } void Download::handleReadyRead() { QNetworkReply *modelReply = qobject_cast(sender()); if (!modelReply) return; QFile *tempFile = m_activeDownloads.value(modelReply); QByteArray buffer; while (!modelReply->atEnd()) { buffer = modelReply->read(16384); tempFile->write(buffer); } tempFile->flush(); } ================================================ FILE: gpt4all-chat/src/download.h 
================================================ #ifndef DOWNLOAD_H #define DOWNLOAD_H #include #include #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: no_forward_declare QFile // IWYU pragma: no_forward_declare QList // IWYU pragma: no_forward_declare QSslError class QByteArray; struct ReleaseInfo { Q_GADGET Q_PROPERTY(QString version MEMBER version) Q_PROPERTY(QString notes MEMBER notes) Q_PROPERTY(QString contributors MEMBER contributors) public: QString version; QString notes; QString contributors; }; class HashAndSaveFile : public QObject { Q_OBJECT public: HashAndSaveFile(); public Q_SLOTS: void hashAndSave(const QString &hash, QCryptographicHash::Algorithm a, const QString &saveFilePath, QFile *tempFile, QNetworkReply *modelReply); Q_SIGNALS: void hashAndSaveFinished(bool success, const QString &error, QFile *tempFile, QNetworkReply *modelReply); private: QThread m_hashAndSaveThread; }; class Download : public QObject { Q_OBJECT Q_PROPERTY(bool hasNewerRelease READ hasNewerRelease NOTIFY hasNewerReleaseChanged) Q_PROPERTY(ReleaseInfo releaseInfo READ releaseInfo NOTIFY releaseInfoChanged) Q_PROPERTY(QString latestNews READ latestNews NOTIFY latestNewsChanged) public: static Download *globalInstance(); static std::strong_ordering compareAppVersions(const QString &a, const QString &b); ReleaseInfo releaseInfo() const; bool hasNewerRelease() const; QString latestNews() const { return m_latestNews; } Q_INVOKABLE void downloadModel(const QString &modelFile); Q_INVOKABLE void cancelDownload(const QString &modelFile); Q_INVOKABLE void installModel(const QString &modelFile, const QString &apiKey); Q_INVOKABLE void installCompatibleModel(const QString &modelName, const QString &apiKey, const QString &baseUrl); Q_INVOKABLE void removeModel(const QString &modelFile); Q_INVOKABLE bool isFirstStart(bool writeVersion = false) const; public Q_SLOTS: void updateLatestNews(); void updateReleaseNotes(); 
private Q_SLOTS: void handleSslErrors(QNetworkReply *reply, const QList<QSslError> &errors); void handleReleaseJsonDownloadFinished(); void handleLatestNewsDownloadFinished(); void handleErrorOccurred(QNetworkReply::NetworkError code); void handleDownloadProgress(qint64 bytesReceived, qint64 bytesTotal); void handleModelDownloadFinished(); void handleHashAndSaveFinished(bool success, const QString &error, QFile *tempFile, QNetworkReply *modelReply); void handleReadyRead(); Q_SIGNALS: void releaseInfoChanged(); void hasNewerReleaseChanged(); void requestHashAndSave(const QString &hash, QCryptographicHash::Algorithm a, const QString &saveFilePath, QFile *tempFile, QNetworkReply *modelReply); void latestNewsChanged(); void toastMessage(const QString &message); private: void parseReleaseJsonFile(const QByteArray &jsonData); QString incompleteDownloadPath(const QString &modelFile); bool hasRetry(const QString &filename) const; bool shouldRetry(const QString &filename); void clearRetry(const QString &filename); HashAndSaveFile *m_hashAndSave; QMap<QString, ReleaseInfo> m_releaseMap; QString m_latestNews; QNetworkAccessManager m_networkManager; QMap<QNetworkReply *, QFile *> m_activeDownloads; QHash<QString, int> m_activeRetries; QDateTime m_startTime; private: explicit Download(); ~Download() {} friend class MyDownload; }; #endif // DOWNLOAD_H ================================================ FILE: gpt4all-chat/src/embllm.cpp ================================================ #include "embllm.h" #include "mysettings.h" #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include #include #include using namespace Qt::Literals::StringLiterals; static const QString EMBEDDING_MODEL_NAME = u"nomic-embed-text-v1.5"_s; static const QString LOCAL_EMBEDDING_MODEL = u"nomic-embed-text-v1.5.f16.gguf"_s; EmbeddingLLMWorker::EmbeddingLLMWorker() : QObject(nullptr) , m_networkManager(new QNetworkAccessManager(this)) ,
m_stopGenerating(false) { moveToThread(&m_workerThread); connect(this, &EmbeddingLLMWorker::requestAtlasQueryEmbedding, this, &EmbeddingLLMWorker::atlasQueryEmbeddingRequested); connect(this, &EmbeddingLLMWorker::finished, &m_workerThread, &QThread::quit, Qt::DirectConnection); m_workerThread.setObjectName("embedding"); m_workerThread.start(); } EmbeddingLLMWorker::~EmbeddingLLMWorker() { m_stopGenerating = true; m_workerThread.quit(); m_workerThread.wait(); if (m_model) { delete m_model; m_model = nullptr; } } void EmbeddingLLMWorker::wait() { m_workerThread.wait(); } bool EmbeddingLLMWorker::loadModel() { constexpr int n_ctx = 2048; m_nomicAPIKey.clear(); m_model = nullptr; // TODO(jared): react to setting changes without restarting if (MySettings::globalInstance()->localDocsUseRemoteEmbed()) { m_nomicAPIKey = MySettings::globalInstance()->localDocsNomicAPIKey(); return true; } #ifdef Q_OS_DARWIN static const QString embPathFmt = u"%1/../Resources/%2"_s; #else static const QString embPathFmt = u"%1/../resources/%2"_s; #endif QString filePath = embPathFmt.arg(QCoreApplication::applicationDirPath(), LOCAL_EMBEDDING_MODEL); if (!QFileInfo::exists(filePath)) { qWarning() << "embllm WARNING: Local embedding model not found"; return false; } QString requestedDevice = MySettings::globalInstance()->localDocsEmbedDevice(); std::string backend = "auto"; #ifdef Q_OS_MAC if (requestedDevice == "Auto" || requestedDevice == "CPU") backend = "cpu"; #else if (requestedDevice.startsWith("CUDA: ")) backend = "cuda"; #endif try { m_model = LLModel::Implementation::construct(filePath.toStdString(), backend, n_ctx); } catch (const std::exception &e) { qWarning() << "embllm WARNING: Could not load embedding model:" << e.what(); return false; } bool actualDeviceIsCPU = true; #if defined(Q_OS_MAC) && defined(__aarch64__) if (m_model->implementation().buildVariant() == "metal") actualDeviceIsCPU = false; #else if (requestedDevice != "CPU") { const LLModel::GPUDevice *device = nullptr; 
std::vector<LLModel::GPUDevice> availableDevices = m_model->availableGPUDevices(0); if (requestedDevice != "Auto") { // Use the selected device for (const LLModel::GPUDevice &d : availableDevices) { if (QString::fromStdString(d.selectionName()) == requestedDevice) { device = &d; break; } } } std::string unavail_reason; if (!device) { // GPU not available } else if (!m_model->initializeGPUDevice(device->index, &unavail_reason)) { qWarning().noquote() << "embllm WARNING: Did not use GPU:" << QString::fromStdString(unavail_reason); } else { actualDeviceIsCPU = false; } } #endif bool success = m_model->loadModel(filePath.toStdString(), n_ctx, 100); // CPU fallback if (!actualDeviceIsCPU && !success) { // llama_init_from_file returned nullptr qWarning() << "embllm WARNING: Did not use GPU: GPU loading failed (out of VRAM?)"; if (backend == "cuda") { // For CUDA, make sure we don't use the GPU at all - ngl=0 still offloads matmuls try { m_model = LLModel::Implementation::construct(filePath.toStdString(), "auto", n_ctx); } catch (const std::exception &e) { qWarning() << "embllm WARNING: Could not load embedding model:" << e.what(); return false; } } success = m_model->loadModel(filePath.toStdString(), n_ctx, 0); } if (!success) { qWarning() << "embllm WARNING: Could not load embedding model"; delete m_model; m_model = nullptr; return false; } if (!m_model->supportsEmbedding()) { qWarning() << "embllm WARNING: Model type does not support embeddings"; delete m_model; m_model = nullptr; return false; } // FIXME(jared): the user may want this to take effect without having to restart int n_threads = MySettings::globalInstance()->threadCount(); m_model->setThreadCount(n_threads); return true; } std::vector<float> EmbeddingLLMWorker::generateQueryEmbedding(const QString &text) { { QMutexLocker locker(&m_mutex); if (!hasModel() && !loadModel()) { qWarning() << "WARNING: Could not load model for embeddings"; return {}; } if (!isNomic()) { std::vector<float> embedding(m_model->embeddingSize()); try {
m_model->embed({text.toStdString()}, embedding.data(), /*isRetrieval*/ true); } catch (const std::exception &e) { qWarning() << "WARNING: LLModel::embed failed:" << e.what(); return {}; } return embedding; } } EmbeddingLLMWorker worker; emit worker.requestAtlasQueryEmbedding(text); worker.wait(); return worker.lastResponse(); } void EmbeddingLLMWorker::sendAtlasRequest(const QStringList &texts, const QString &taskType, const QVariant &userData) { QJsonObject root; root.insert("model", "nomic-embed-text-v1"); root.insert("texts", QJsonArray::fromStringList(texts)); root.insert("task_type", taskType); QJsonDocument doc(root); QUrl nomicUrl("https://api-atlas.nomic.ai/v1/embedding/text"); const QString authorization = u"Bearer %1"_s.arg(m_nomicAPIKey).trimmed(); QNetworkRequest request(nomicUrl); request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json"); request.setRawHeader("Authorization", authorization.toUtf8()); request.setAttribute(QNetworkRequest::User, userData); QNetworkReply *reply = m_networkManager->post(request, doc.toJson(QJsonDocument::Compact)); connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort); connect(reply, &QNetworkReply::finished, this, &EmbeddingLLMWorker::handleFinished); } void EmbeddingLLMWorker::atlasQueryEmbeddingRequested(const QString &text) { { QMutexLocker locker(&m_mutex); if (!hasModel() && !loadModel()) { qWarning() << "WARNING: Could not load model for embeddings"; return; } if (!isNomic()) { qWarning() << "WARNING: Request to generate sync embeddings for local model invalid"; return; } Q_ASSERT(hasModel()); } sendAtlasRequest({text}, "search_query"); } void EmbeddingLLMWorker::docEmbeddingsRequested(const QVector<EmbeddingChunk> &chunks) { if (m_stopGenerating) return; bool isNomic; { QMutexLocker locker(&m_mutex); if (!hasModel() && !loadModel()) { qWarning() << "WARNING: Could not load model for embeddings"; return; } isNomic = this->isNomic(); } if (!isNomic) { QVector<EmbeddingResult> results;
results.reserve(chunks.size()); std::vector<std::string> texts; texts.reserve(chunks.size()); for (const auto &c: chunks) { EmbeddingResult result; result.model = c.model; result.folder_id = c.folder_id; result.chunk_id = c.chunk_id; result.embedding.resize(m_model->embeddingSize()); results << result; texts.push_back(c.chunk.toStdString()); } constexpr int BATCH_SIZE = 4; std::vector<float> result; result.resize(chunks.size() * m_model->embeddingSize()); for (int j = 0; j < chunks.size(); j += BATCH_SIZE) { QMutexLocker locker(&m_mutex); std::vector<std::string> batchTexts(texts.begin() + j, texts.begin() + std::min(j + BATCH_SIZE, int(texts.size()))); try { m_model->embed(batchTexts, result.data() + j * m_model->embeddingSize(), /*isRetrieval*/ false); } catch (const std::exception &e) { qWarning() << "WARNING: LLModel::embed failed:" << e.what(); return; } } for (int i = 0; i < chunks.size(); i++) memcpy(results[i].embedding.data(), &result[i * m_model->embeddingSize()], m_model->embeddingSize() * sizeof(float)); emit embeddingsGenerated(results); return; }; QStringList texts; for (auto &c: chunks) texts.append(c.chunk); sendAtlasRequest(texts, "search_document", QVariant::fromValue(chunks)); } std::vector<float> jsonArrayToVector(const QJsonArray &jsonArray) { std::vector<float> result; for (const auto &innerValue: jsonArray) { if (innerValue.isArray()) { QJsonArray innerArray = innerValue.toArray(); result.reserve(result.size() + innerArray.size()); for (const auto &value: innerArray) { result.push_back(static_cast<float>(value.toDouble())); } } } return result; } QVector<EmbeddingResult> jsonArrayToEmbeddingResults(const QVector<EmbeddingChunk>& chunks, const QJsonArray& embeddings) { QVector<EmbeddingResult> results; if (chunks.size() != embeddings.size()) { qWarning() << "WARNING: Size of json array result does not match input!"; return results; } for (int i = 0; i < chunks.size(); ++i) { const EmbeddingChunk& chunk = chunks.at(i); const QJsonArray embeddingArray = embeddings.at(i).toArray(); std::vector<float> embeddingVector; for (const auto &value:
embeddingArray) embeddingVector.push_back(static_cast<float>(value.toDouble())); EmbeddingResult result; result.model = chunk.model; result.folder_id = chunk.folder_id; result.chunk_id = chunk.chunk_id; result.embedding = std::move(embeddingVector); results.push_back(std::move(result)); } return results; } void EmbeddingLLMWorker::handleFinished() { QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender()); if (!reply) return; QVariant retrievedData = reply->request().attribute(QNetworkRequest::User); QVector<EmbeddingChunk> chunks; if (retrievedData.isValid() && retrievedData.canConvert<QVector<EmbeddingChunk>>()) chunks = retrievedData.value<QVector<EmbeddingChunk>>(); QVariant response; if (reply->error() != QNetworkReply::NoError) { response = reply->attribute(QNetworkRequest::HttpStatusCodeAttribute); Q_ASSERT(response.isValid()); } bool ok; int code = response.toInt(&ok); if (!ok || code != 200) { QString errorDetails; QString replyErrorString = reply->errorString().trimmed(); QByteArray replyContent = reply->readAll().trimmed(); errorDetails = u"ERROR: Nomic Atlas responded with error code \"%1\""_s.arg(code); if (!replyErrorString.isEmpty()) errorDetails += u". Error Details: \"%1\""_s.arg(replyErrorString); if (!replyContent.isEmpty()) errorDetails += u". 
Response Content: \"%1\""_s.arg(QString::fromUtf8(replyContent)); qWarning() << errorDetails; emit errorGenerated(chunks, errorDetails); return; } QByteArray jsonData = reply->readAll(); QJsonParseError err; QJsonDocument document = QJsonDocument::fromJson(jsonData, &err); if (err.error != QJsonParseError::NoError) { qWarning() << "ERROR: Couldn't parse Nomic Atlas response:" << jsonData << err.errorString(); return; } const QJsonObject root = document.object(); const QJsonArray embeddings = root.value("embeddings").toArray(); if (!chunks.isEmpty()) { emit embeddingsGenerated(jsonArrayToEmbeddingResults(chunks, embeddings)); } else { m_lastResponse = jsonArrayToVector(embeddings); emit finished(); } reply->deleteLater(); } EmbeddingLLM::EmbeddingLLM() : QObject(nullptr) , m_embeddingWorker(new EmbeddingLLMWorker) { connect(this, &EmbeddingLLM::requestDocEmbeddings, m_embeddingWorker, &EmbeddingLLMWorker::docEmbeddingsRequested, Qt::QueuedConnection); connect(m_embeddingWorker, &EmbeddingLLMWorker::embeddingsGenerated, this, &EmbeddingLLM::embeddingsGenerated, Qt::QueuedConnection); connect(m_embeddingWorker, &EmbeddingLLMWorker::errorGenerated, this, &EmbeddingLLM::errorGenerated, Qt::QueuedConnection); } EmbeddingLLM::~EmbeddingLLM() { delete m_embeddingWorker; m_embeddingWorker = nullptr; } QString EmbeddingLLM::model() { return EMBEDDING_MODEL_NAME; } // TODO(jared): embed using all necessary embedding models given collection std::vector<float> EmbeddingLLM::generateQueryEmbedding(const QString &text) { return m_embeddingWorker->generateQueryEmbedding(text); } void EmbeddingLLM::generateDocEmbeddingsAsync(const QVector<EmbeddingChunk> &chunks) { emit requestDocEmbeddings(chunks); } ================================================ FILE: gpt4all-chat/src/embllm.h ================================================ #ifndef EMBLLM_H #define EMBLLM_H #include #include #include #include #include // IWYU pragma: keep #include #include #include // IWYU pragma: keep #include #include class
LLModel; class QNetworkAccessManager; struct EmbeddingChunk { QString model; // TODO(jared): use to select model int folder_id; int chunk_id; QString chunk; }; Q_DECLARE_METATYPE(EmbeddingChunk) struct EmbeddingResult { QString model; int folder_id; int chunk_id; std::vector<float> embedding; }; class EmbeddingLLMWorker : public QObject { Q_OBJECT public: EmbeddingLLMWorker(); ~EmbeddingLLMWorker() override; void wait(); std::vector<float> lastResponse() const { return m_lastResponse; } bool loadModel(); bool isNomic() const { return !m_nomicAPIKey.isEmpty(); } bool hasModel() const { return isNomic() || m_model; } std::vector<float> generateQueryEmbedding(const QString &text); public Q_SLOTS: void atlasQueryEmbeddingRequested(const QString &text); void docEmbeddingsRequested(const QVector<EmbeddingChunk> &chunks); Q_SIGNALS: void requestAtlasQueryEmbedding(const QString &text); void embeddingsGenerated(const QVector<EmbeddingResult> &embeddings); void errorGenerated(const QVector<EmbeddingChunk> &chunks, const QString &error); void finished(); private Q_SLOTS: void handleFinished(); private: void sendAtlasRequest(const QStringList &texts, const QString &taskType, const QVariant &userData = {}); QString m_nomicAPIKey; QNetworkAccessManager *m_networkManager; std::vector<float> m_lastResponse; LLModel *m_model = nullptr; std::atomic<bool> m_stopGenerating; QThread m_workerThread; QMutex m_mutex; // guards m_model and m_nomicAPIKey }; class EmbeddingLLM : public QObject { Q_OBJECT public: EmbeddingLLM(); ~EmbeddingLLM() override; static QString model(); bool loadModel(); bool hasModel() const; public Q_SLOTS: std::vector<float> generateQueryEmbedding(const QString &text); // synchronous void generateDocEmbeddingsAsync(const QVector<EmbeddingChunk> &chunks); Q_SIGNALS: void requestDocEmbeddings(const QVector<EmbeddingChunk> &chunks); void embeddingsGenerated(const QVector<EmbeddingResult> &embeddings); void errorGenerated(const QVector<EmbeddingChunk> &chunks, const QString &error); private: EmbeddingLLMWorker *m_embeddingWorker; }; #endif // EMBLLM_H ================================================ FILE: 
gpt4all-chat/src/jinja_helpers.cpp ================================================ #include "jinja_helpers.h" #include #include #include #include #include namespace views = std::views; using json = nlohmann::ordered_json; json::object_t JinjaResultInfo::AsJson() const { return { { "collection", m_source->collection.toStdString() }, { "path", m_source->path .toStdString() }, { "file", m_source->file .toStdString() }, { "title", m_source->title .toStdString() }, { "author", m_source->author .toStdString() }, { "date", m_source->date .toStdString() }, { "text", m_source->text .toStdString() }, { "page", m_source->page }, { "file_uri", m_source->fileUri() .toStdString() }, }; } json::object_t JinjaPromptAttachment::AsJson() const { return { { "url", m_attachment->url.toString() .toStdString() }, { "file", m_attachment->file() .toStdString() }, { "processed_content", m_attachment->processedContent().toStdString() }, }; } json::object_t JinjaMessage::AsJson() const { json::object_t obj; { json::string_t role; switch (m_item->type()) { using enum MessageItem::Type; case System: role = "system"; break; case Prompt: role = "user"; break; case Response: role = "assistant"; break; case ToolResponse: role = "tool"; break; } obj.emplace_back("role", std::move(role)); } { QString content; if (m_version == 0 && m_item->type() == MessageItem::Type::Prompt) { content = m_item->bakedPrompt(); } else { content = m_item->content(); } obj.emplace_back("content", content.toStdString()); } if (m_item->type() == MessageItem::Type::Prompt) { { auto sources = m_item->sources() | views::transform([](auto &r) { return JinjaResultInfo(r).AsJson(); }); obj.emplace("sources", json::array_t(sources.begin(), sources.end())); } { auto attachments = m_item->promptAttachments() | views::transform([](auto &pa) { return JinjaPromptAttachment(pa).AsJson(); }); obj.emplace("prompt_attachments", json::array_t(attachments.begin(), attachments.end())); } } return obj; } 
================================================ FILE: gpt4all-chat/src/jinja_helpers.h ================================================ #pragma once #include "chatmodel.h" #include "database.h" #include #include // IWYU pragma: keep // IWYU pragma: no_forward_declare MessageItem // IWYU pragma: no_forward_declare PromptAttachment // IWYU pragma: no_forward_declare ResultInfo using json = nlohmann::ordered_json; template <typename T> class JinjaHelper { public: json::object_t AsJson() const { return static_cast<const T *>(this)->AsJson(); } }; class JinjaResultInfo : public JinjaHelper<JinjaResultInfo> { public: explicit JinjaResultInfo(const ResultInfo &source) noexcept : m_source(&source) {} json::object_t AsJson() const; private: const ResultInfo *m_source; }; class JinjaPromptAttachment : public JinjaHelper<JinjaPromptAttachment> { public: explicit JinjaPromptAttachment(const PromptAttachment &attachment) noexcept : m_attachment(&attachment) {} json::object_t AsJson() const; private: const PromptAttachment *m_attachment; }; class JinjaMessage : public JinjaHelper<JinjaMessage> { public: explicit JinjaMessage(uint version, const MessageItem &item) noexcept : m_version(version), m_item(&item) {} json::object_t AsJson() const; private: uint m_version; const MessageItem *m_item; }; ================================================ FILE: gpt4all-chat/src/jinja_replacements.cpp ================================================ // The map in this file is automatically generated by Jared. Do not hand-edit it. #include "jinja_replacements.h" #include // This is a list of prompt templates known to GPT4All and their associated replacements which are automatically used // instead when loading the chat template from GGUF. These exist for two primary reasons: // - HuggingFace model authors make ugly chat templates because they do not expect the end user to see them; // - and chat templates occasionally use features we do not support. This is less true now that we use minja. // The substitution list. 
const std::unordered_map<std::string_view, std::string_view> CHAT_TEMPLATE_SUBSTITUTIONS { // calme-2.1-phi3.5-4b.Q6_K.gguf (reported by ThilotE on Discord), Phi-3.5-mini-instruct-Q4_0.gguf (nomic-ai/gpt4all#3345) { // original R"TEMPLATE({% for message in messages %}{% if message['role'] == 'system' and message['content'] %}{{'<|system|> ' + message['content'] + '<|end|> '}}{% elif message['role'] == 'user' %}{{'<|user|> ' + message['content'] + '<|end|> '}}{% elif message['role'] == 'assistant' %}{{'<|assistant|> ' + message['content'] + '<|end|> '}}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|> ' }}{% else %}{{ eos_token }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {%- if message['role'] == 'system' and message['content'] %} {{- '<|system|>\n' + message['content'] + '<|end|>\n' }} {%- elif message['role'] == 'user' %} {{- '<|user|>\n' + message['content'] + '<|end|>\n' }} {%- elif message['role'] == 'assistant' %} {{- '<|assistant|>\n' + message['content'] + '<|end|>\n' }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|assistant|>\n' }} {%- else %} {{- eos_token }} {%- endif %})TEMPLATE", }, // DeepSeek-R1-Distill-Qwen-7B-Q4_0.gguf { // original R"TEMPLATE({% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<|User|>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + 
tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<|Assistant|>' + content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- if not add_generation_prompt is defined %} {%- set add_generation_prompt = false %} {%- endif %} {%- if messages[0]['role'] == 'system' %} {{- messages[0]['content'] }} {%- endif %} {%- for message in messages %} {%- if message['role'] == 'user' %} {{- '<|User|>' + message['content'] }} {%- endif %} {%- if message['role'] == 'assistant' %} {%- set content = message['content'] | regex_replace('^[\\s\\S]*</think>', '') %} {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' }} {%- endif %} {%- endfor -%} {%- if add_generation_prompt %} {{- '<|Assistant|>' }} {%- endif %})TEMPLATE", }, // gemma-2-9b-it-Q4_0.gguf (nomic-ai/gpt4all#3282) { // original 
R"TEMPLATE({{ bos_token }}{% if messages[0]['role'] == 'system' %}{{ raise_exception('System role not supported') }}{% endif %}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% else %}{% set role = message['role'] %}{% endif %}{{ '<start_of_turn>' + role + ' ' + message['content'] | trim + '<end_of_turn> ' }}{% endfor %}{% if add_generation_prompt %}{{'<start_of_turn>model '}}{% endif %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- if messages[0]['role'] == 'system' %} {{- raise_exception('System role not supported') }} {%- endif %} {%- for message in messages %} {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %} {{- raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }} {%- endif %} {%- if message['role'] == 'assistant' %} {%- set role = 'model' %} {%- else %} {%- set role = message['role'] %} {%- endif %} {{- '<start_of_turn>' + role + '\n' + message['content'] | trim + '<end_of_turn>\n' }} {%- endfor %} {%- if add_generation_prompt %} {{- '<start_of_turn>model\n' }} {%- endif %})TEMPLATE", }, // ghost-7b-v0.9.1-Q4_0.gguf { // original R"TEMPLATE({% for message in messages %} {% if message['role'] == 'user' %} {{ '<|user|> ' + message['content'] + eos_token }} {% elif message['role'] == 'system' %} {{ '<|system|> ' + message['content'] + eos_token }} {% elif message['role'] == 'assistant' %} {{ '<|assistant|> ' + message['content'] + eos_token }} {% endif %} {% if loop.last and add_generation_prompt %} {{ '<|assistant|>' }} {% endif %} {% endfor %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {%- if message['role'] == 'user' %} {{- '<|user|>\n' + message['content'] + eos_token }} {%- elif message['role'] == 'system' %} {{- '<|system|>\n' + message['content'] + eos_token }} {%- elif message['role'] == 'assistant' %} {{- '<|assistant|>\n' + message['content'] 
+ eos_token }} {%- endif %} {%- if loop.last and add_generation_prompt %} {{- '<|assistant|>' }} {%- endif %} {%- endfor %})TEMPLATE", }, // granite-3.1-3b-a800m-instruct-Q4_0.gguf, granite-3.1-8b-instruct-Q4_0.gguf (nomic-ai/gpt4all#3471) { // original R"TEMPLATE({%- if messages[0]['role'] == 'system' %}{%- set system_message = messages[0]['content'] %}{%- set loop_messages = messages[1:] %}{%- else %}{%- set system_message = "Knowledge Cutoff Date: April 2024. You are Granite, developed by IBM." %}{%- if tools and documents %}{%- set system_message = system_message + " You are a helpful AI assistant with access to the following tools. When a tool is required to answer the user's query, respond with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request. Write the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data." %}{%- elif tools %}{%- set system_message = system_message + " You are a helpful AI assistant with access to the following tools. When a tool is required to answer the user's query, respond with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request." %}{%- elif documents %}{%- set system_message = system_message + " Write the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data." %}{%- else %}{%- set system_message = system_message + " You are a helpful AI assistant." 
%}{%- endif %}{%- if controls and 'citations' in controls and documents %}{%- set system_message = system_message + ' In your response, use the symbols and to indicate when a fact comes from a document in the search result, e.g 0 for a fact from document 0. Afterwards, list all the citations with their corresponding documents in an ordered list.' %}{%- endif %}{%- if controls and 'hallucinations' in controls and documents %}{%- set system_message = system_message + ' Finally, after the response is written, include a numbered list of sentences from the response that are potentially hallucinated and not based in the documents.' %}{%- endif %}{%- set loop_messages = messages %}{%- endif %}{{- '<|start_of_role|>system<|end_of_role|>' + system_message + '<|end_of_text|> ' }}{%- if tools %}{{- '<|start_of_role|>tools<|end_of_role|>' }}{{- tools | tojson(indent=4) }}{{- '<|end_of_text|> ' }}{%- endif %}{%- if documents %}{{- '<|start_of_role|>documents<|end_of_role|>' }}{%- for document in documents %}{{- 'Document ' + loop.index0 | string + ' ' }}{{- document['text'] }}{%- if not loop.last %}{{- ' '}}{%- endif%}{%- endfor %}{{- '<|end_of_text|> ' }}{%- endif %}{%- for message in loop_messages %}{{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>' + message['content'] + '<|end_of_text|> ' }}{%- if loop.last and add_generation_prompt %}{{- '<|start_of_role|>assistant' }}{%- if controls %}{{- ' ' + controls | tojson()}}{%- endif %}{{- '<|end_of_role|>' }}{%- endif %}{%- endfor %})TEMPLATE", // replacement R"TEMPLATE({%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content'] %} {%- set loop_messages = messages[1:] %} {%- else %} {%- set system_message = "Knowledge Cutoff Date: April 2024. You are Granite, developed by IBM. You are a helpful AI assistant." 
%} {%- set loop_messages = messages %} {%- endif %} {{- '<|start_of_role|>system<|end_of_role|>' + system_message + '<|end_of_text|> ' }} {%- for message in loop_messages %} {{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>' + message['content'] + '<|end_of_text|> ' }} {%- if loop.last and add_generation_prompt %} {{- '<|start_of_role|>assistant<|end_of_role|>' }} {%- endif %} {%- endfor %})TEMPLATE", }, // Hermes-3-Llama-3.2-3B.Q4_0.gguf, mistral-7b-openorca.gguf2.Q4_0.gguf { // original R"TEMPLATE({% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + ' ' + message['content'] + '<|im_end|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant ' }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }} {%- endfor %} {%- if add_generation_prompt %} {{- '<|im_start|>assistant\n' }} {%- endif %})TEMPLATE", }, // Llama-3.2-1B-Instruct-Q4_0.gguf, Llama-3.2-3B-Instruct-Q4_0.gguf, SummLlama3.2-3B-Q4_0.gguf (nomic-ai/gpt4all#3309) { // original R"TEMPLATE({{- bos_token }} {%- if custom_tools is defined %} {%- set tools = custom_tools %} {%- endif %} {%- if not tools_in_user_message is defined %} {%- set tools_in_user_message = true %} {%- endif %} {%- if not date_string is defined %} {%- if strftime_now is defined %} {%- set date_string = strftime_now("%d %b %Y") %} {%- else %} {%- set date_string = "26 Jul 2024" %} {%- endif %} {%- endif %} {%- if not tools is defined %} {%- set tools = none %} {%- endif %} {#- This block extracts the system message, so we can slot it into the right place. 
#} {%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content']|trim %} {%- set messages = messages[1:] %} {%- else %} {%- set system_message = "" %} {%- endif %} {#- System message #} {{- "<|start_header_id|>system<|end_header_id|>\n\n" }} {%- if tools is not none %} {{- "Environment: ipython\n" }} {%- endif %} {{- "Cutting Knowledge Date: December 2023\n" }} {{- "Today Date: " + date_string + "\n\n" }} {%- if tools is not none and not tools_in_user_message %} {{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }} {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }} {{- "Do not use variables.\n\n" }} {%- for t in tools %} {{- t | tojson(indent=4) }} {{- "\n\n" }} {%- endfor %} {%- endif %} {{- system_message }} {{- "<|eot_id|>" }} {#- Custom tools are passed in a user message with some extra guidance #} {%- if tools_in_user_message and not tools is none %} {#- Extract the first user message so we can plug it in here #} {%- if messages | length != 0 %} {%- set first_user_message = messages[0]['content']|trim %} {%- set messages = messages[1:] %} {%- else %} {{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }} {%- endif %} {{- '<|start_header_id|>user<|end_header_id|>\n\n' -}} {{- "Given the following functions, please respond with a JSON for a function call " }} {{- "with its proper arguments that best answers the given prompt.\n\n" }} {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' 
}} {{- "Do not use variables.\n\n" }} {%- for t in tools %} {{- t | tojson(indent=4) }} {{- "\n\n" }} {%- endfor %} {{- first_user_message + "<|eot_id|>"}} {%- endif %} {%- for message in messages %} {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %} {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }} {%- elif 'tool_calls' in message %} {%- if not message.tool_calls|length == 1 %} {{- raise_exception("This model only supports single tool-calls at once!") }} {%- endif %} {%- set tool_call = message.tool_calls[0].function %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}} {{- '{"name": "' + tool_call.name + '", ' }} {{- '"parameters": ' }} {{- tool_call.arguments | tojson }} {{- "}" }} {{- "<|eot_id|>" }} {%- elif message.role == "tool" or message.role == "ipython" %} {{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }} {%- if message.content is mapping or message.content is iterable %} {{- message.content | tojson }} {%- else %} {{- message.content }} {%- endif %} {{- "<|eot_id|>" }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- set date_string = strftime_now('%d %b %Y') %} {#- This block extracts the system message, so we can slot it into the right place. 
#} {%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content'] | trim %} {%- set loop_start = 1 %} {%- else %} {%- set system_message = '' %} {%- set loop_start = 0 %} {%- endif %} {#- System message #} {{- '<|start_header_id|>system<|end_header_id|>\n\n' }} {{- 'Cutting Knowledge Date: December 2023\n' }} {{- 'Today Date: ' + date_string + '\n\n' }} {{- system_message }} {{- '<|eot_id|>' }} {%- for message in messages %} {%- if loop.index0 >= loop_start %} {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", }, // Llama-3.3-70B-Instruct-Q4_0.gguf (nomic-ai/gpt4all#3305) { // original R"TEMPLATE({{- bos_token }} {%- if custom_tools is defined %} {%- set tools = custom_tools %} {%- endif %} {%- if not tools_in_user_message is defined %} {%- set tools_in_user_message = true %} {%- endif %} {%- if not date_string is defined %} {%- set date_string = "26 Jul 2024" %} {%- endif %} {%- if not tools is defined %} {%- set tools = none %} {%- endif %} {#- This block extracts the system message, so we can slot it into the right place. 
#} {%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content']|trim %} {%- set messages = messages[1:] %} {%- else %} {%- set system_message = "" %} {%- endif %} {#- System message + builtin tools #} {{- "<|start_header_id|>system<|end_header_id|>\n\n" }} {%- if builtin_tools is defined or tools is not none %} {{- "Environment: ipython\n" }} {%- endif %} {%- if builtin_tools is defined %} {{- "Tools: " + builtin_tools | reject('equalto', 'code_interpreter') | join(", ") + "\n\n"}} {%- endif %} {{- "Cutting Knowledge Date: December 2023\n" }} {{- "Today Date: " + date_string + "\n\n" }} {%- if tools is not none and not tools_in_user_message %} {{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }} {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }} {{- "Do not use variables.\n\n" }} {%- for t in tools %} {{- t | tojson(indent=4) }} {{- "\n\n" }} {%- endfor %} {%- endif %} {{- system_message }} {{- "<|eot_id|>" }} {#- Custom tools are passed in a user message with some extra guidance #} {%- if tools_in_user_message and not tools is none %} {#- Extract the first user message so we can plug it in here #} {%- if messages | length != 0 %} {%- set first_user_message = messages[0]['content']|trim %} {%- set messages = messages[1:] %} {%- else %} {{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }} {%- endif %} {{- '<|start_header_id|>user<|end_header_id|>\n\n' -}} {{- "Given the following functions, please respond with a JSON for a function call " }} {{- "with its proper arguments that best answers the given prompt.\n\n" }} {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' 
}} {{- "Do not use variables.\n\n" }} {%- for t in tools %} {{- t | tojson(indent=4) }} {{- "\n\n" }} {%- endfor %} {{- first_user_message + "<|eot_id|>"}} {%- endif %} {%- for message in messages %} {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %} {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }} {%- elif 'tool_calls' in message %} {%- if not message.tool_calls|length == 1 %} {{- raise_exception("This model only supports single tool-calls at once!") }} {%- endif %} {%- set tool_call = message.tool_calls[0].function %} {%- if builtin_tools is defined and tool_call.name in builtin_tools %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}} {{- "<|python_tag|>" + tool_call.name + ".call(" }} {%- for arg_name, arg_val in tool_call.arguments | items %} {{- arg_name + '="' + arg_val + '"' }} {%- if not loop.last %} {{- ", " }} {%- endif %} {%- endfor %} {{- ")" }} {%- else %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}} {{- '{"name": "' + tool_call.name + '", ' }} {{- '"parameters": ' }} {{- tool_call.arguments | tojson }} {{- "}" }} {%- endif %} {%- if builtin_tools is defined %} {#- This means we're in ipython mode #} {{- "<|eom_id|>" }} {%- else %} {{- "<|eot_id|>" }} {%- endif %} {%- elif message.role == "tool" or message.role == "ipython" %} {{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }} {%- if message.content is mapping or message.content is iterable %} {{- message.content | tojson }} {%- else %} {{- message.content }} {%- endif %} {{- "<|eot_id|>" }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- set date_string = strftime_now('%d %b %Y') %} {#- This block extracts the system message, so we can slot it into the right place. 
#} {%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content'] | trim %} {%- set loop_start = 1 %} {%- else %} {%- set system_message = '' %} {%- set loop_start = 0 %} {%- endif %} {#- System message #} {{- '<|start_header_id|>system<|end_header_id|>\n\n' }} {{- 'Cutting Knowledge Date: December 2023\n' }} {{- 'Today Date: ' + date_string + '\n\n' }} {{- system_message }} {{- '<|eot_id|>' }} {%- for message in messages %} {%- if loop.index0 >= loop_start %} {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", }, // Llama3-DiscoLeo-Instruct-8B-32k-v0.1-Q4_0.gguf (nomic-ai/gpt4all#3347) { // original R"TEMPLATE({% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|> '+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|> ' }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' %} {%- if loop.index0 == 0 %} {%- set content = bos_token + content %} {%- endif %} {{- content }} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", }, // Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf { // original R"TEMPLATE({% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|> '+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% 
endif %}{{ content }}{% endfor %}{{ '<|start_header_id|>assistant<|end_header_id|> ' }})TEMPLATE", // replacement R"TEMPLATE({%- set loop_messages = messages %} {%- for message in loop_messages %} {%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %} {%- if loop.index0 == 0 %} {%- set content = bos_token + content %} {%- endif %} {{- content }} {%- endfor %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }})TEMPLATE", }, // Meta-Llama-3-8B-Instruct.Q4_0.gguf { // original R"TEMPLATE({% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|> '+ message['content'] | trim + '<|eot_id|>' %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|> ' }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- set loop_messages = messages %} {%- for message in loop_messages %} {%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %} {{- content }} {%- endfor %} {%- if add_generation_prompt %} {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }} {%- endif %})TEMPLATE", }, // Mistral-Nemo-Instruct-2407-Q4_0.gguf (nomic-ai/gpt4all#3284) { // original R"TEMPLATE({%- if messages[0]["role"] == "system" %} {%- set system_message = messages[0]["content"] %} {%- set loop_messages = messages[1:] %} {%- else %} {%- set loop_messages = messages %} {%- endif %} {%- if not tools is defined %} {%- set tools = none %} {%- endif %} {%- set user_messages = loop_messages | selectattr("role", "equalto", "user") | list %} {#- This block checks for alternating user/assistant messages, skipping tool calling messages #} {%- set ns = namespace() %} {%- set ns.index = 0 %} {%- for message in loop_messages %} {%- if not (message.role == "tool" or message.role == "tool_results" or (message.tool_calls is defined 
and message.tool_calls is not none)) %} {%- if (message["role"] == "user") != (ns.index % 2 == 0) %} {{- raise_exception("After the optional system message, conversation roles must alternate user/assistant/user/assistant/...") }} {%- endif %} {%- set ns.index = ns.index + 1 %} {%- endif %} {%- endfor %} {{- bos_token }} {%- for message in loop_messages %} {%- if message["role"] == "user" %} {%- if tools is not none and (message == user_messages[-1]) %} {{- "[AVAILABLE_TOOLS][" }} {%- for tool in tools %} {%- set tool = tool.function %} {{- '{"type": "function", "function": {' }} {%- for key, val in tool.items() if key != "return" %} {%- if val is string %} {{- '"' + key + '": "' + val + '"' }} {%- else %} {{- '"' + key + '": ' + val|tojson }} {%- endif %} {%- if not loop.last %} {{- ", " }} {%- endif %} {%- endfor %} {{- "}}" }} {%- if not loop.last %} {{- ", " }} {%- else %} {{- "]" }} {%- endif %} {%- endfor %} {{- "[/AVAILABLE_TOOLS]" }} {%- endif %} {%- if loop.last and system_message is defined %} {{- "[INST]" + system_message + "\n\n" + message["content"] + "[/INST]" }} {%- else %} {{- "[INST]" + message["content"] + "[/INST]" }} {%- endif %} {%- elif (message.tool_calls is defined and message.tool_calls is not none) %} {{- "[TOOL_CALLS][" }} {%- for tool_call in message.tool_calls %} {%- set out = tool_call.function|tojson %} {{- out[:-1] }} {%- if not tool_call.id is defined or tool_call.id|length != 9 %} {{- raise_exception("Tool call IDs should be alphanumeric strings with length 9!") }} {%- endif %} {{- ', "id": "' + tool_call.id + '"}' }} {%- if not loop.last %} {{- ", " }} {%- else %} {{- "]" + eos_token }} {%- endif %} {%- endfor %} {%- elif message["role"] == "assistant" %} {{- message["content"] + eos_token}} {%- elif message["role"] == "tool_results" or message["role"] == "tool" %} {%- if message.content is defined and message.content.content is defined %} {%- set content = message.content.content %} {%- else %} {%- set content = message.content %} 
{%- endif %} {{- '[TOOL_RESULTS]{"content": ' + content|string + ", " }} {%- if not message.tool_call_id is defined or message.tool_call_id|length != 9 %} {{- raise_exception("Tool call IDs should be alphanumeric strings with length 9!") }} {%- endif %} {{- '"call_id": "' + message.tool_call_id + '"}[/TOOL_RESULTS]' }} {%- else %} {{- raise_exception("Only user and assistant roles are supported, with the exception of an initial optional system message!") }} {%- endif %} {%- endfor %})TEMPLATE", // replacement R"TEMPLATE({%- if messages[0]['role'] == 'system' %} {%- set system_message = messages[0]['content'] %} {%- set loop_start = 1 %} {%- else %} {%- set loop_start = 0 %} {%- endif %} {{- bos_token }} {%- for message in messages %} {#- This block checks for alternating user/assistant messages, skipping tool calling messages #} {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %} {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }} {%- endif %} {%- if message['role'] == 'user' %} {%- if loop.last and loop_start == 1 %} {{- '[INST]' + system_message + '\n\n' + message['content'] + '[/INST]' }} {%- else %} {{- '[INST]' + message['content'] + '[/INST]' }} {%- endif %} {%- elif message['role'] == 'assistant' %} {{- message['content'] + eos_token }} {%- else %} {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }} {%- endif %} {%- endfor %})TEMPLATE", }, // Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf { // original R"TEMPLATE({% for message in messages %}{{'<|im_start|>' + message['role'] + ' ' + message['content'] + '<|im_end|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant ' }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }} {%- endfor %} {%- if add_generation_prompt %} 
{{- '<|im_start|>assistant\n' }} {%- endif %})TEMPLATE", }, // occiglot-7b-de-en-instruct.Q4_0.gguf (nomic-ai/gpt4all#3283) { // original R"TEMPLATE({{''}}{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant. Please give a long and detailed answer.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<|im_start|>system ' + system_message + '<|im_end|> '}}{% endif %}{{'<|im_start|>' + message['role'] + ' ' + message['content'] + '<|im_end|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant ' }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- if messages[0]['role'] == 'system' %} {%- set loop_start = 1 %} {%- set system_message = messages[0]['content'] %} {%- else %} {%- set loop_start = 0 %} {%- set system_message = 'You are a helpful assistant. Please give a long and detailed answer.' 
%} {%- endif %} {{- '<|im_start|>system\n' + system_message + '<|im_end|>\n' }} {%- for message in messages %} {%- if loop.index0 >= loop_start %} {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|im_start|>assistant\n' }} {%- endif %})TEMPLATE", }, // OLMoE-1B-7B-0125-Instruct-Q4_0.gguf (nomic-ai/gpt4all#3471) { // original R"TEMPLATE({{ bos_token }}{% for message in messages %}{% if message['role'] == 'system' %}{{ '<|system|> ' + message['content'] + ' ' }}{% elif message['role'] == 'user' %}{{ '<|user|> ' + message['content'] + ' ' }}{% elif message['role'] == 'assistant' %}{% if not loop.last %}{{ '<|assistant|> ' + message['content'] + eos_token + ' ' }}{% else %}{{ '<|assistant|> ' + message['content'] + eos_token }}{% endif %}{% endif %}{% if loop.last and add_generation_prompt %}{{ '<|assistant|> ' }}{% endif %}{% endfor %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- for message in messages %} {%- if message['role'] == 'system' %} {{- '<|system|>\n' + message['content'] + '\n' }} {%- elif message['role'] == 'user' %} {{- '<|user|>\n' + message['content'] + '\n' }} {%- elif message['role'] == 'assistant' %} {%- if not loop.last %} {{- '<|assistant|>\n' + message['content'] + eos_token + '\n' }} {%- else %} {{- '<|assistant|>\n' + message['content'] + eos_token }} {%- endif %} {%- endif %} {%- if loop.last and add_generation_prompt %} {{- '<|assistant|>\n' }} {%- endif %} {%- endfor %})TEMPLATE", }, // OLMoE-1B-7B-0924-Instruct-Q4_0.gguf (nomic-ai/gpt4all#3471) { // original R"TEMPLATE({{ bos_token }}{% for message in messages %} {% if message['role'] == 'system' %} {{ '<|system|> ' + message['content'] }} {% elif message['role'] == 'user' %} {{ '<|user|> ' + message['content'] }} {% elif message['role'] == 'assistant' %} {{ '<|assistant|> ' + message['content'] + eos_token }} {% endif %} {% if loop.last and add_generation_prompt %} {{ 
'<|assistant|>' }} {% endif %} {% endfor %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- for message in messages %} {%- if message['role'] == 'system' %} {{- '<|system|>\n' + message['content'] }} {%- elif message['role'] == 'user' %} {{- '<|user|>\n' + message['content'] }} {%- elif message['role'] == 'assistant' %} {{- '<|assistant|>\n' + message['content'] + eos_token }} {%- endif %} {%- if loop.last and add_generation_prompt %} {{- '<|assistant|>' }} {%- endif %} {%- endfor %})TEMPLATE", }, // Phi-3.1-mini-128k-instruct-Q4_0.gguf (nomic-ai/gpt4all#3346) { // original R"TEMPLATE({% for message in messages %}{% if message['role'] == 'system' %}{{'<|system|> ' + message['content'] + '<|end|> '}}{% elif message['role'] == 'user' %}{{'<|user|> ' + message['content'] + '<|end|> '}}{% elif message['role'] == 'assistant' %}{{'<|assistant|> ' + message['content'] + '<|end|> '}}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|> ' }}{% else %}{{ eos_token }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({%- for message in messages %} {%- if message['role'] == 'system' %} {{- '<|system|>\n' + message['content'] + '<|end|>\n' }} {%- elif message['role'] == 'user' %} {{- '<|user|>\n' + message['content'] + '<|end|>\n' }} {%- elif message['role'] == 'assistant' %} {{- '<|assistant|>\n' + message['content'] + '<|end|>\n' }} {%- endif %} {%- endfor %} {%- if add_generation_prompt %} {{- '<|assistant|>\n' }} {%- else %} {{- eos_token }} {%- endif %})TEMPLATE", }, // Phi-3-mini-4k-instruct.Q4_0.gguf { // original R"TEMPLATE({{ bos_token }}{% for message in messages %}{{'<|' + message['role'] + '|>' + ' ' + message['content'] + '<|end|> ' }}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|> ' }}{% else %}{{ eos_token }}{% endif %})TEMPLATE", // replacement R"TEMPLATE({{- bos_token }} {%- for message in messages %} {{- '<|' + message['role'] + '|>\n' + message['content'] + '<|end|>\n' }} {%- endfor %} {%- if add_generation_prompt %} {{- 
'<|assistant|>\n' }} {%- else %} {{- eos_token }} {%- endif %})TEMPLATE",
    },
    // qwen2-1_5b-instruct-q4_0.gguf (nomic-ai/gpt4all#3263), qwen2-72b-instruct-q4_0.gguf
    {
        // original
        R"TEMPLATE({% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system You are a helpful assistant.<|im_end|> ' }}{% endif %}{{'<|im_start|>' + message['role'] + ' ' + message['content'] + '<|im_end|>' + ' '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant ' }}{% endif %})TEMPLATE",
        // replacement
        R"TEMPLATE({%- for message in messages %} {%- if loop.first and messages[0]['role'] != 'system' %} {{- '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }} {%- endif %} {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }} {%- endfor %} {%- if add_generation_prompt %} {{- '<|im_start|>assistant\n' }} {%- endif %})TEMPLATE",
    },
};


================================================
FILE: gpt4all-chat/src/jinja_replacements.h
================================================
#pragma once

#include <string_view>
#include <unordered_map>

extern const std::unordered_map<std::string_view, std::string_view> CHAT_TEMPLATE_SUBSTITUTIONS;


================================================
FILE: gpt4all-chat/src/llm.cpp
================================================
#include "llm.h"

#include <gpt4all-backend/llmodel.h>
#include <gpt4all-backend/sysinfo.h>

#include <QCoreApplication>
#include <QDebug>
#include <QFileInfo>
#include <QGlobalStatic>
#include <QNetworkInformation>
#include <QProcess>
#include <QSettings>
#include <QString>
#include <QUrl>

#ifdef GPT4ALL_OFFLINE_INSTALLER
#   include <QDesktopServices>
#else
#   include "network.h"
#endif

#ifdef Q_OS_MAC
#include "macosdock.h"
#endif

using namespace Qt::Literals::StringLiterals;

class MyLLM: public LLM { };

Q_GLOBAL_STATIC(MyLLM, llmInstance)

LLM *LLM::globalInstance()
{
    return llmInstance();
}

LLM::LLM()
    : QObject{nullptr}
    , m_compatHardware(LLModel::Implementation::hasSupportedCPU())
{
    QNetworkInformation::loadDefaultBackend();
    auto * netinfo = QNetworkInformation::instance();
    if (netinfo) {
        connect(netinfo, &QNetworkInformation::reachabilityChanged, this,
                &LLM::isNetworkOnlineChanged);
    }
}

bool LLM::hasSettingsAccess() const
{
    QSettings settings;
    settings.sync();
    return settings.status() == QSettings::NoError;
}

bool LLM::checkForUpdates() const
{
#ifdef GPT4ALL_OFFLINE_INSTALLER
#   pragma message(__FILE__ ": WARNING: offline installer build will not check for updates!")
    return QDesktopServices::openUrl(QUrl("https://github.com/nomic-ai/gpt4all/releases"));
#else
    Network::globalInstance()->trackEvent("check_for_updates");

#if defined(Q_OS_LINUX)
    QString tool = u"maintenancetool"_s;
#elif defined(Q_OS_WINDOWS)
    QString tool = u"maintenancetool.exe"_s;
#elif defined(Q_OS_DARWIN)
    QString tool = u"../../../maintenancetool.app/Contents/MacOS/maintenancetool"_s;
#endif

    QString fileName = QCoreApplication::applicationDirPath() + "/../" + tool;
    if (!QFileInfo::exists(fileName)) {
        qDebug() << "Couldn't find tool at" << fileName << "so cannot check for updates!";
        return false;
    }
    return QProcess::startDetached(fileName);
#endif
}

bool LLM::directoryExists(const QString &path)
{
    const QUrl url(path);
    const QString localFilePath = url.isLocalFile() ? url.toLocalFile() : path;
    const QFileInfo info(localFilePath);
    return info.exists() && info.isDir();
}

bool LLM::fileExists(const QString &path)
{
    const QUrl url(path);
    const QString localFilePath = url.isLocalFile() ?
        url.toLocalFile() : path;
    const QFileInfo info(localFilePath);
    return info.exists() && info.isFile();
}

qint64 LLM::systemTotalRAMInGB() const
{
    return getSystemTotalRAMInGB();
}

QString LLM::systemTotalRAMInGBString() const
{
    return QString::fromStdString(getSystemTotalRAMInGBString());
}

bool LLM::isNetworkOnline() const
{
    auto * netinfo = QNetworkInformation::instance();
    return !netinfo || netinfo->reachability() == QNetworkInformation::Reachability::Online;
}

void LLM::showDockIcon() const
{
#ifdef Q_OS_MAC
    MacOSDock::showIcon();
#else
    qt_noop();
#endif
}

void LLM::hideDockIcon() const
{
#ifdef Q_OS_MAC
    MacOSDock::hideIcon();
#else
    qt_noop();
#endif
}


================================================
FILE: gpt4all-chat/src/llm.h
================================================
#ifndef LLM_H
#define LLM_H

#include <QObject>
#include <QString>
#include <QtGlobal>

class LLM : public QObject
{
    Q_OBJECT
    Q_PROPERTY(bool isNetworkOnline READ isNetworkOnline NOTIFY isNetworkOnlineChanged)

public:
    static LLM *globalInstance();

    Q_INVOKABLE bool hasSettingsAccess() const;
    Q_INVOKABLE bool compatHardware() const { return m_compatHardware; }
    Q_INVOKABLE bool checkForUpdates() const;
    Q_INVOKABLE static bool directoryExists(const QString &path);
    Q_INVOKABLE static bool fileExists(const QString &path);
    Q_INVOKABLE qint64 systemTotalRAMInGB() const;
    Q_INVOKABLE QString systemTotalRAMInGBString() const;
    Q_INVOKABLE bool isNetworkOnline() const;
    Q_INVOKABLE void showDockIcon() const;
    Q_INVOKABLE void hideDockIcon() const;

Q_SIGNALS:
    void isNetworkOnlineChanged();

private:
    bool m_compatHardware;

private:
    explicit LLM();
    ~LLM() {}
    friend class MyLLM;
};

#endif // LLM_H


================================================
FILE: gpt4all-chat/src/localdocs.cpp
================================================
#include "localdocs.h"

#include "database.h"
#include "embllm.h"
#include "mysettings.h"

#include <QCoreApplication>
#include <QDebug>
#include <QGlobalStatic>
#include <QGuiApplication>
#include <QList>
#include <QObject>
#include <QString>
#include <QUrl>

class MyLocalDocs: public LocalDocs { };
Q_GLOBAL_STATIC(MyLocalDocs, localDocsInstance)

LocalDocs *LocalDocs::globalInstance()
{
    return localDocsInstance();
}

LocalDocs::LocalDocs()
    : QObject(nullptr)
    , m_localDocsModel(new LocalDocsModel(this))
    , m_database(nullptr)
{
    connect(MySettings::globalInstance(), &MySettings::localDocsChunkSizeChanged,
            this, &LocalDocs::handleChunkSizeChanged);
    connect(MySettings::globalInstance(), &MySettings::localDocsFileExtensionsChanged,
            this, &LocalDocs::handleFileExtensionsChanged);

    // Create the DB with the chunk size from settings
    m_database = new Database(MySettings::globalInstance()->localDocsChunkSize(),
                              MySettings::globalInstance()->localDocsFileExtensions());

    connect(this, &LocalDocs::requestStart, m_database, &Database::start, Qt::QueuedConnection);
    connect(this, &LocalDocs::requestForceIndexing, m_database, &Database::forceIndexing, Qt::QueuedConnection);
    connect(this, &LocalDocs::forceRebuildFolder, m_database, &Database::forceRebuildFolder, Qt::QueuedConnection);
    connect(this, &LocalDocs::requestAddFolder, m_database, &Database::addFolder, Qt::QueuedConnection);
    connect(this, &LocalDocs::requestRemoveFolder, m_database, &Database::removeFolder, Qt::QueuedConnection);
    connect(this, &LocalDocs::requestChunkSizeChange, m_database, &Database::changeChunkSize, Qt::QueuedConnection);
    connect(this, &LocalDocs::requestFileExtensionsChange, m_database, &Database::changeFileExtensions, Qt::QueuedConnection);
    connect(m_database, &Database::databaseValidChanged, this, &LocalDocs::databaseValidChanged, Qt::QueuedConnection);

    // Connections for modifying the model and keeping it updated with the database
    connect(m_database, &Database::requestUpdateGuiForCollectionItem,
            m_localDocsModel, &LocalDocsModel::updateCollectionItem, Qt::QueuedConnection);
    connect(m_database, &Database::requestAddGuiCollectionItem,
            m_localDocsModel, &LocalDocsModel::addCollectionItem, Qt::QueuedConnection);
    connect(m_database, &Database::requestRemoveGuiFolderById, m_localDocsModel,
            &LocalDocsModel::removeFolderById, Qt::QueuedConnection);
    connect(m_database, &Database::requestGuiCollectionListUpdated,
            m_localDocsModel, &LocalDocsModel::collectionListUpdated, Qt::QueuedConnection);

    connect(qGuiApp, &QCoreApplication::aboutToQuit, this, &LocalDocs::aboutToQuit);
}

void LocalDocs::aboutToQuit()
{
    delete m_database;
    m_database = nullptr;
}

void LocalDocs::addFolder(const QString &collection, const QString &path)
{
    const QUrl url(path);
    const QString localPath = url.isLocalFile() ? url.toLocalFile() : path;

    const QString embedding_model = EmbeddingLLM::model();
    if (embedding_model.isEmpty()) {
        qWarning() << "ERROR: We have no embedding model";
        return;
    }

    emit requestAddFolder(collection, localPath, embedding_model);
}

void LocalDocs::removeFolder(const QString &collection, const QString &path)
{
    emit requestRemoveFolder(collection, path);
}

void LocalDocs::forceIndexing(const QString &collection)
{
    const QString embedding_model = EmbeddingLLM::model();
    if (embedding_model.isEmpty()) {
        qWarning() << "ERROR: We have no embedding model";
        return;
    }

    emit requestForceIndexing(collection, embedding_model);
}

void LocalDocs::handleChunkSizeChanged()
{
    emit requestChunkSizeChange(MySettings::globalInstance()->localDocsChunkSize());
}

void LocalDocs::handleFileExtensionsChanged()
{
    emit requestFileExtensionsChange(MySettings::globalInstance()->localDocsFileExtensions());
}


================================================
FILE: gpt4all-chat/src/localdocs.h
================================================
#ifndef LOCALDOCS_H
#define LOCALDOCS_H

#include "database.h"
#include "localdocsmodel.h"

#include <QObject>
#include <QString>
#include <QStringList> // IWYU pragma: keep

// IWYU pragma: no_forward_declare LocalDocsModel

class LocalDocs : public QObject
{
    Q_OBJECT
    Q_PROPERTY(bool databaseValid READ databaseValid NOTIFY databaseValidChanged)
    Q_PROPERTY(LocalDocsModel *localDocsModel READ localDocsModel NOTIFY localDocsModelChanged)

public:
    static LocalDocs *globalInstance();

    LocalDocsModel
    *localDocsModel() const { return m_localDocsModel; }

    Q_INVOKABLE void addFolder(const QString &collection, const QString &path);
    Q_INVOKABLE void removeFolder(const QString &collection, const QString &path);
    Q_INVOKABLE void forceIndexing(const QString &collection);

    Database *database() const { return m_database; }
    bool databaseValid() const { return m_database->isValid(); }

public Q_SLOTS:
    void handleChunkSizeChanged();
    void handleFileExtensionsChanged();
    void aboutToQuit();

Q_SIGNALS:
    void requestStart();
    void requestForceIndexing(const QString &collection, const QString &embedding_model);
    void forceRebuildFolder(const QString &path);
    void requestAddFolder(const QString &collection, const QString &path, const QString &embedding_model);
    void requestRemoveFolder(const QString &collection, const QString &path);
    void requestChunkSizeChange(int chunkSize);
    void requestFileExtensionsChange(const QStringList &extensions);
    void localDocsModelChanged();
    void databaseValidChanged();

private:
    LocalDocsModel *m_localDocsModel;
    Database *m_database;

private:
    explicit LocalDocs();
    friend class MyLocalDocs;
};

#endif // LOCALDOCS_H


================================================
FILE: gpt4all-chat/src/localdocsmodel.cpp
================================================
#include "localdocsmodel.h"

#include "localdocs.h"
#include "network.h"

#include <QHash>
#include <QList>
#include <QModelIndex> // IWYU pragma: keep
#include <QVariant>

LocalDocsCollectionsModel::LocalDocsCollectionsModel(QObject *parent)
    : QSortFilterProxyModel(parent)
{
    setSourceModel(LocalDocs::globalInstance()->localDocsModel());
    connect(LocalDocs::globalInstance()->localDocsModel(), &LocalDocsModel::updatingChanged,
            this, &LocalDocsCollectionsModel::maybeTriggerUpdatingCountChanged);
    connect(this, &LocalDocsCollectionsModel::rowsInserted, this, &LocalDocsCollectionsModel::countChanged);
    connect(this, &LocalDocsCollectionsModel::rowsRemoved, this, &LocalDocsCollectionsModel::countChanged);
    connect(this, &LocalDocsCollectionsModel::modelReset, this,
&LocalDocsCollectionsModel::countChanged); } bool LocalDocsCollectionsModel::filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const { QModelIndex index = sourceModel()->index(sourceRow, 0, sourceParent); const QString collection = sourceModel()->data(index, LocalDocsModel::CollectionRole).toString(); return m_collections.contains(collection); } void LocalDocsCollectionsModel::setCollections(const QList &collections) { m_collections = collections; invalidateFilter(); maybeTriggerUpdatingCountChanged(); } int LocalDocsCollectionsModel::updatingCount() const { return m_updatingCount; } void LocalDocsCollectionsModel::maybeTriggerUpdatingCountChanged() { int updatingCount = 0; for (int row = 0; row < sourceModel()->rowCount(); ++row) { QModelIndex index = sourceModel()->index(row, 0); const QString collection = sourceModel()->data(index, LocalDocsModel::CollectionRole).toString(); if (!m_collections.contains(collection)) continue; bool updating = sourceModel()->data(index, LocalDocsModel::UpdatingRole).toBool(); if (updating) ++updatingCount; } if (updatingCount != m_updatingCount) { m_updatingCount = updatingCount; emit updatingCountChanged(); } } LocalDocsModel::LocalDocsModel(QObject *parent) : QAbstractListModel(parent) { connect(this, &LocalDocsModel::rowsInserted, this, &LocalDocsModel::countChanged); connect(this, &LocalDocsModel::rowsRemoved, this, &LocalDocsModel::countChanged); connect(this, &LocalDocsModel::modelReset, this, &LocalDocsModel::countChanged); } int LocalDocsModel::rowCount(const QModelIndex &parent) const { Q_UNUSED(parent); return m_collectionList.size(); } QVariant LocalDocsModel::data(const QModelIndex &index, int role) const { if (!index.isValid() || index.row() < 0 || index.row() >= m_collectionList.size()) return QVariant(); const CollectionItem item = m_collectionList.at(index.row()); switch (role) { case CollectionRole: return item.collection; case FolderPathRole: return item.folder_path; case InstalledRole: return 
item.installed; case IndexingRole: return item.indexing; case ErrorRole: return item.error; case ForceIndexingRole: return item.forceIndexing; case CurrentDocsToIndexRole: return item.currentDocsToIndex; case TotalDocsToIndexRole: return item.totalDocsToIndex; case CurrentBytesToIndexRole: return quint64(item.currentBytesToIndex); case TotalBytesToIndexRole: return quint64(item.totalBytesToIndex); case CurrentEmbeddingsToIndexRole: return quint64(item.currentEmbeddingsToIndex); case TotalEmbeddingsToIndexRole: return quint64(item.totalEmbeddingsToIndex); case TotalDocsRole: return quint64(item.totalDocs); case TotalWordsRole: return quint64(item.totalWords); case TotalTokensRole: return quint64(item.totalTokens); case StartUpdateRole: return item.startUpdate; case LastUpdateRole: return item.lastUpdate; case FileCurrentlyProcessingRole: return item.fileCurrentlyProcessing; case EmbeddingModelRole: return item.embeddingModel; case UpdatingRole: return item.indexing || item.currentEmbeddingsToIndex != 0; } return QVariant(); } QHash LocalDocsModel::roleNames() const { QHash roles; roles[CollectionRole] = "collection"; roles[FolderPathRole] = "folder_path"; roles[InstalledRole] = "installed"; roles[IndexingRole] = "indexing"; roles[ErrorRole] = "error"; roles[ForceIndexingRole] = "forceIndexing"; roles[CurrentDocsToIndexRole] = "currentDocsToIndex"; roles[TotalDocsToIndexRole] = "totalDocsToIndex"; roles[CurrentBytesToIndexRole] = "currentBytesToIndex"; roles[TotalBytesToIndexRole] = "totalBytesToIndex"; roles[CurrentEmbeddingsToIndexRole] = "currentEmbeddingsToIndex"; roles[TotalEmbeddingsToIndexRole] = "totalEmbeddingsToIndex"; roles[TotalDocsRole] = "totalDocs"; roles[TotalWordsRole] = "totalWords"; roles[TotalTokensRole] = "totalTokens"; roles[StartUpdateRole] = "startUpdate"; roles[LastUpdateRole] = "lastUpdate"; roles[FileCurrentlyProcessingRole] = "fileCurrentlyProcessing"; roles[EmbeddingModelRole] = "embeddingModel"; roles[UpdatingRole] = "updating"; return 
roles; } void LocalDocsModel::updateCollectionItem(const CollectionItem &item) { for (int i = 0; i < m_collectionList.size(); ++i) { CollectionItem &stored = m_collectionList[i]; if (stored.folder_id != item.folder_id) continue; QVector changed; if (stored.folder_path != item.folder_path) changed.append(FolderPathRole); if (stored.installed != item.installed) changed.append(InstalledRole); if (stored.indexing != item.indexing) { changed.append(IndexingRole); changed.append(UpdatingRole); } if (stored.error != item.error) changed.append(ErrorRole); if (stored.forceIndexing != item.forceIndexing) changed.append(ForceIndexingRole); if (stored.currentDocsToIndex != item.currentDocsToIndex) changed.append(CurrentDocsToIndexRole); if (stored.totalDocsToIndex != item.totalDocsToIndex) changed.append(TotalDocsToIndexRole); if (stored.currentBytesToIndex != item.currentBytesToIndex) changed.append(CurrentBytesToIndexRole); if (stored.totalBytesToIndex != item.totalBytesToIndex) changed.append(TotalBytesToIndexRole); if (stored.currentEmbeddingsToIndex != item.currentEmbeddingsToIndex) { changed.append(CurrentEmbeddingsToIndexRole); changed.append(UpdatingRole); } if (stored.totalEmbeddingsToIndex != item.totalEmbeddingsToIndex) changed.append(TotalEmbeddingsToIndexRole); if (stored.totalDocs != item.totalDocs) changed.append(TotalDocsRole); if (stored.totalWords != item.totalWords) changed.append(TotalWordsRole); if (stored.totalTokens != item.totalTokens) changed.append(TotalTokensRole); if (stored.startUpdate != item.startUpdate) changed.append(StartUpdateRole); if (stored.lastUpdate != item.lastUpdate) changed.append(LastUpdateRole); if (stored.fileCurrentlyProcessing != item.fileCurrentlyProcessing) changed.append(FileCurrentlyProcessingRole); if (stored.embeddingModel != item.embeddingModel) changed.append(EmbeddingModelRole); // preserve collection name as we ignore it for matching QString collection = stored.collection; stored = item; stored.collection = collection; 
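// The per-field comparison above collects only the roles whose values actually
// differ, so dataChanged can tell views to repaint the minimum set of
// delegates. A framework-free sketch of that diff step follows; Item and the
// role values are illustrative stand-ins, not the real CollectionItem.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical two-field item mirroring CollectionItem's diff logic.
enum Role { FolderPathRole = 1, InstalledRole = 2 };

struct Item {
    std::string folder_path;
    bool installed = false;
};

// Collect the roles whose values differ, as updateCollectionItem does before
// passing the changed-roles vector to dataChanged.
std::vector<int> diffRoles(const Item &stored, const Item &item) {
    std::vector<int> changed;
    if (stored.folder_path != item.folder_path) changed.push_back(FolderPathRole);
    if (stored.installed != item.installed) changed.push_back(InstalledRole);
    return changed;
}
```

// Emitting an empty role vector would mean "everything changed" to Qt views,
// which is why the real code only emits when the vector is meaningful.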
        emit dataChanged(this->index(i), this->index(i), changed);
        if (changed.contains(UpdatingRole))
            emit updatingChanged(item.collection);
    }
}

void LocalDocsModel::addCollectionItem(const CollectionItem &item)
{
    beginInsertRows(QModelIndex(), m_collectionList.size(), m_collectionList.size());
    m_collectionList.append(item);
    endInsertRows();
}

void LocalDocsModel::removeCollectionIf(std::function<bool(CollectionItem)> const &predicate)
{
    for (int i = 0; i < m_collectionList.size();) {
        if (predicate(m_collectionList.at(i))) {
            beginRemoveRows(QModelIndex(), i, i);
            m_collectionList.removeAt(i);
            endRemoveRows();
            Network::globalInstance()->trackEvent("doc_collection_remove", {
                {"collection_count", m_collectionList.count()},
            });
        } else {
            ++i;
        }
    }
}

void LocalDocsModel::removeFolderById(const QString &collection, int folder_id)
{
    removeCollectionIf([collection, folder_id](const auto &c) {
        return c.collection == collection && c.folder_id == folder_id;
    });
}

void LocalDocsModel::removeCollectionPath(const QString &name, const QString &path)
{
    removeCollectionIf([&name, &path](const auto &c) {
        return c.collection == name && c.folder_path == path;
    });
}

void LocalDocsModel::collectionListUpdated(const QList<CollectionItem> &collectionList)
{
    beginResetModel();
    m_collectionList = collectionList;
    endResetModel();
}

================================================
FILE: gpt4all-chat/src/localdocsmodel.h
================================================
#ifndef LOCALDOCSMODEL_H
#define LOCALDOCSMODEL_H

#include "database.h"

// [bracketed #include targets lost in extraction]

class QByteArray;
class QVariant;
template <typename Key, typename Value> class QHash;

class LocalDocsCollectionsModel : public QSortFilterProxyModel
{
    Q_OBJECT
    Q_PROPERTY(int count READ count NOTIFY countChanged)
    Q_PROPERTY(int updatingCount READ updatingCount NOTIFY updatingCountChanged)

public:
    explicit LocalDocsCollectionsModel(QObject *parent);
    int count() const { return rowCount(); }
    int updatingCount() const;

public Q_SLOTS:
    void setCollections(const QList<QString> &collections);

Q_SIGNALS:
    void countChanged();
    void updatingCountChanged();

protected:
    bool filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const override;

private Q_SLOTS:
    void maybeTriggerUpdatingCountChanged();

private:
    QList<QString> m_collections;
    int m_updatingCount = 0;
};

class LocalDocsModel : public QAbstractListModel
{
    Q_OBJECT
    Q_PROPERTY(int count READ count NOTIFY countChanged)

public:
    enum Roles {
        CollectionRole = Qt::UserRole + 1,
        FolderPathRole,
        InstalledRole,
        IndexingRole,
        ErrorRole,
        ForceIndexingRole,
        CurrentDocsToIndexRole,
        TotalDocsToIndexRole,
        CurrentBytesToIndexRole,
        TotalBytesToIndexRole,
        CurrentEmbeddingsToIndexRole,
        TotalEmbeddingsToIndexRole,
        TotalDocsRole,
        TotalWordsRole,
        TotalTokensRole,
        StartUpdateRole,
        LastUpdateRole,
        FileCurrentlyProcessingRole,
        EmbeddingModelRole,
        UpdatingRole
    };

    explicit LocalDocsModel(QObject *parent = nullptr);
    int rowCount(const QModelIndex & = QModelIndex()) const override;
    QVariant data(const QModelIndex &index, int role) const override;
    QHash<int, QByteArray> roleNames() const override;
    int count() const { return rowCount(); }

public Q_SLOTS:
    void updateCollectionItem(const CollectionItem&);
    void addCollectionItem(const CollectionItem &item);
    void removeFolderById(const QString &collection, int folder_id);
    void removeCollectionPath(const QString &name, const QString &path);
    void collectionListUpdated(const QList<CollectionItem> &collectionList);

Q_SIGNALS:
    void countChanged();
    void updatingChanged(const QString &collection);

private:
    void removeCollectionIf(std::function<bool(CollectionItem)> const &predicate);
    QList<CollectionItem> m_collectionList;
};

#endif // LOCALDOCSMODEL_H

================================================
FILE: gpt4all-chat/src/logger.cpp
================================================
#include "logger.h"

// [bracketed #include targets lost in extraction]

using namespace Qt::Literals::StringLiterals;

class MyLogger: public Logger { };
Q_GLOBAL_STATIC(MyLogger, loggerInstance)

Logger *Logger::globalInstance()
{
    return loggerInstance();
}

Logger::Logger()
{
    // Get log file dir
    auto dir = QStandardPaths::writableLocation(QStandardPaths::AppLocalDataLocation);
    // Remove old log file
    QFile::remove(dir+"/log-prev.txt");
    QFile::rename(dir+"/log.txt", dir+"/log-prev.txt");
    // Open new log file
    m_file.setFileName(dir+"/log.txt");
    if (!m_file.open(QIODevice::NewOnly | QIODevice::WriteOnly | QIODevice::Text)) {
        qWarning() << "Failed to open log file, logging to stdout...";
        m_file.open(stdout, QIODevice::WriteOnly | QIODevice::Text);
    }
    // On success, install message handler
    qInstallMessageHandler(Logger::messageHandler);
}

void Logger::messageHandler(QtMsgType type, const QMessageLogContext &, const QString &msg)
{
    auto logger = globalInstance();

    // Get message type as string
    QString typeString;
    switch (type) {
        case QtDebugMsg:    typeString = "Debug";    break;
        case QtInfoMsg:     typeString = "Info";     break;
        case QtWarningMsg:  typeString = "Warning";  break;
        case QtCriticalMsg: typeString = "Critical"; break;
        case QtFatalMsg:    typeString = "Fatal";    break;
        default:            typeString = "???";
    }

    // Get time and date
    auto timestamp = QDateTime::currentDateTime().toString();
    const std::string out = u"[%1] (%2): %3\n"_s.arg(typeString, timestamp, msg).toStdString();

    // Write message
    QMutexLocker locker(&logger->m_mutex);
    logger->m_file.write(out.c_str());
    logger->m_file.flush();
    std::cerr << out;
    fflush(stderr);
}

================================================
FILE: gpt4all-chat/src/logger.h
================================================
#ifndef LOGGER_H
#define LOGGER_H

// [bracketed #include targets lost in extraction]

class Logger {
public:
    explicit Logger();
    static Logger *globalInstance();

private:
    static void messageHandler(QtMsgType type, const QMessageLogContext &context, const QString &msg);

private:
    QFile m_file;
    QMutex m_mutex;

    friend class MyLogger;
};

#endif // LOGGER_H

================================================
FILE: gpt4all-chat/src/macosdock.h
================================================
#ifndef MACOSDOCK_H
#define MACOSDOCK_H

struct MacOSDock {
    static void showIcon();
    static void hideIcon();
};

#endif // MACOSDOCK_H

================================================
FILE: gpt4all-chat/src/macosdock.mm
================================================
#include "macosdock.h"

// [bracketed #include target lost in extraction]

void MacOSDock::showIcon()
{
    [[NSApplication sharedApplication] setActivationPolicy:NSApplicationActivationPolicyRegular];
}

void MacOSDock::hideIcon()
{
    [[NSApplication sharedApplication] setActivationPolicy:NSApplicationActivationPolicyProhibited];
}

================================================
FILE: gpt4all-chat/src/main.cpp
================================================
#include "chatlistmodel.h"
#include "config.h"
#include "download.h"
#include "llm.h"
#include "localdocs.h"
#include "logger.h"
#include "modellist.h"
#include "mysettings.h"
#include "network.h"
#include "toolmodel.h"

// [bracketed #include targets lost in extraction]

#if G4A_CONFIG(force_d3d12)
// [bracketed #include target lost in extraction]
#endif
#ifndef GPT4ALL_USE_QTPDF
// [bracketed #include target lost in extraction]
#endif
#ifdef Q_OS_LINUX
// [bracketed #include target lost in extraction]
#endif
#ifdef Q_OS_WINDOWS
// [bracketed #include target lost in extraction]
#else
// [bracketed #include target lost in extraction]
#endif

using namespace Qt::Literals::StringLiterals;

static void raiseWindow(QWindow *window)
{
#ifdef Q_OS_WINDOWS
    HWND hwnd = HWND(window->winId());

    // check if window is minimized to Windows task bar
    if (IsIconic(hwnd))
        ShowWindow(hwnd, SW_RESTORE);

    SetForegroundWindow(hwnd);
#else
    LLM::globalInstance()->showDockIcon();
    window->show();
    window->raise();
    window->requestActivate();
#endif
}

int main(int argc, char *argv[])
{
#ifndef GPT4ALL_USE_QTPDF
    FPDF_InitLibrary();
#endif

    QCoreApplication::setOrganizationName("nomic.ai");
    QCoreApplication::setOrganizationDomain("gpt4all.io");
    QCoreApplication::setApplicationName("GPT4All");
    QCoreApplication::setApplicationVersion(APP_VERSION);
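// Logger, LocalDocs, and ModelList above all expose a single instance through
// Q_GLOBAL_STATIC plus a globalInstance() accessor. A plain-C++ sketch of the
// same idea using a function-local static (a Meyers singleton); Q_GLOBAL_STATIC
// adds app-exit cleanup on top of this. AppLogger is an illustrative name, not
// a class from this repo.

```cpp
#include <cassert>

// Construction is deferred until the first globalInstance() call, and every
// caller sees the same object; function-local statics are thread-safe to
// initialize in C++11 and later.
class AppLogger {
public:
    static AppLogger *globalInstance() {
        static AppLogger instance;  // constructed once, on first call
        return &instance;
    }
    int messagesLogged = 0;
private:
    AppLogger() = default;          // private ctor: no stray instances
};
```

// The subclass-plus-friend trick in the real code (MyLogger, MyModelList)
// exists so Q_GLOBAL_STATIC can reach a protected or private constructor.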
    QSettings::setDefaultFormat(QSettings::IniFormat);

    Logger::globalInstance();

    SingleApplication app(argc, argv, true /*allowSecondary*/);
    if (app.isSecondary()) {
#ifdef Q_OS_WINDOWS
        AllowSetForegroundWindow(DWORD(app.primaryPid()));
#endif
        app.sendMessage("RAISE_WINDOW");
        return 0;
    }

#if G4A_CONFIG(force_d3d12)
    QQuickWindow::setGraphicsApi(QSGRendererInterface::Direct3D12);
#endif

#ifdef Q_OS_LINUX
    app.setWindowIcon(QIcon(":/gpt4all/icons/gpt4all.svg"));
#endif

    // set search path before constructing the MySettings instance, which relies on this
    {
        auto appDirPath = QCoreApplication::applicationDirPath();
        QStringList searchPaths {
#ifdef Q_OS_DARWIN
            u"%1/../Frameworks"_s.arg(appDirPath),
#else
            appDirPath,
            u"%1/../lib"_s.arg(appDirPath),
#endif
        };
        LLModel::Implementation::setImplementationsSearchPath(searchPaths.join(u';').toStdString());
    }

    // Set the locale and language translation before the qml engine has even been started. This will
    // use the default system locale unless the user has explicitly set it to use a different one.
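// The block above hands LLModel::Implementation a single ';'-joined string of
// directories to scan for backend libraries (searchPaths.join(u';')). A stdlib
// sketch of that join; the function name is illustrative, not part of the repo.

```cpp
#include <string>
#include <vector>

// Join a list of directories with a separator, skipping the separator before
// the first entry — the same shape as QStringList::join(u';') above.
std::string joinSearchPaths(const std::vector<std::string> &paths, char sep = ';') {
    std::string out;
    for (const auto &p : paths) {
        if (!out.empty()) out += sep;
        out += p;
    }
    return out;
}
```

// On macOS the list holds only the Frameworks directory; elsewhere it holds the
// app directory and its sibling lib directory, matching the #ifdef above.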
    auto *mySettings = MySettings::globalInstance();
    mySettings->setLanguageAndLocale();

    QQmlApplicationEngine engine;

    // Keep engine.uiLanguage in sync whenever MySettings::languageAndLocaleChanged fires.
    QObject::connect(mySettings, &MySettings::languageAndLocaleChanged, [&engine]() {
        engine.setUiLanguage(MySettings::globalInstance()->languageAndLocale());
    });

    auto *modelList = ModelList::globalInstance();
    QObject::connect(modelList, &ModelList::dataChanged, mySettings, &MySettings::onModelInfoChanged);

    qmlRegisterSingletonInstance("mysettings", 1, 0, "MySettings", mySettings);
    qmlRegisterSingletonInstance("modellist", 1, 0, "ModelList", modelList);
    qmlRegisterSingletonInstance("chatlistmodel", 1, 0, "ChatListModel", ChatListModel::globalInstance());
    qmlRegisterSingletonInstance("llm", 1, 0, "LLM", LLM::globalInstance());
    qmlRegisterSingletonInstance("download", 1, 0, "Download", Download::globalInstance());
    qmlRegisterSingletonInstance("network", 1, 0, "Network", Network::globalInstance());
    qmlRegisterSingletonInstance("localdocs", 1, 0, "LocalDocs", LocalDocs::globalInstance());
    qmlRegisterSingletonInstance("toollist", 1, 0, "ToolList", ToolModel::globalInstance());
    qmlRegisterUncreatableMetaObject(ToolEnums::staticMetaObject, "toolenums", 1, 0, "ToolEnums", "Error: only enums");
    qmlRegisterUncreatableMetaObject(MySettingsEnums::staticMetaObject, "mysettingsenums", 1, 0, "MySettingsEnums", "Error: only enums");

    {
        auto fixedFont = QFontDatabase::systemFont(QFontDatabase::FixedFont);
        engine.rootContext()->setContextProperty("fixedFont", fixedFont);
    }

    const QUrl url(u"qrc:/gpt4all/main.qml"_s);
    QObject::connect(&engine, &QQmlApplicationEngine::objectCreated,
                     &app, [url](QObject *obj, const QUrl &objUrl) {
        if (!obj && url == objUrl)
            QCoreApplication::exit(-1);
    }, Qt::QueuedConnection);
    engine.load(url);

    QObject *rootObject = engine.rootObjects().first();
    QQuickWindow *windowObject = qobject_cast<QQuickWindow *>(rootObject);
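// Several classes in this chunk (LocalDocsCollectionsModel, InstalledModels,
// GPT4AllDownloadableModels) are QSortFilterProxyModel subclasses that accept
// a source row iff a predicate over its role data holds. A framework-free
// sketch of that filterAcceptsRow idiom; Row and filterRows are illustrative
// stand-ins, not code from this repo.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A minimal "source model" row with two role-like fields.
struct Row {
    std::string collection;
    bool installed;
};

// Expose only the source rows for which the predicate returns true —
// the same contract filterAcceptsRow(sourceRow, sourceParent) fulfils.
std::vector<Row> filterRows(const std::vector<Row> &source,
                            const std::function<bool(const Row &)> &accepts) {
    std::vector<Row> visible;
    for (const auto &r : source)
        if (accepts(r))
            visible.push_back(r);
    return visible;
}
```

// In Qt the proxy never copies rows; it maps indexes, and invalidateFilter()
// re-runs the predicate when the criteria change (as setCollections does).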
    Q_ASSERT(windowObject);
    if (windowObject)
        QObject::connect(&app, &SingleApplication::receivedMessage,
                         windowObject, [windowObject] () { raiseWindow(windowObject); });

#if 0
    QDirIterator it("qrc:", QDirIterator::Subdirectories);
    while (it.hasNext()) {
        qDebug() << it.next();
    }
#endif

#ifndef Q_OS_WINDOWS
    // handle signals gracefully
    struct sigaction sa;
    sa.sa_handler = [](int s) { QCoreApplication::exit(s == SIGINT ? 0 : 1); };
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT,  &sa, nullptr);
    sigaction(SIGTERM, &sa, nullptr);
    sigaction(SIGHUP,  &sa, nullptr);
#endif

    int res = app.exec();

    // Make sure ChatLLM threads are joined before global destructors run.
    // Otherwise, we can get a heap-use-after-free inside of llama.cpp.
    ChatListModel::globalInstance()->destroyChats();

#ifndef GPT4ALL_USE_QTPDF
    FPDF_DestroyLibrary();
#endif
    return res;
}

================================================
FILE: gpt4all-chat/src/modellist.cpp
================================================
#include "modellist.h"

#include "download.h"
#include "jinja_replacements.h"
#include "mysettings.h"
#include "network.h"

// [bracketed #include targets lost in extraction]

using namespace Qt::Literals::StringLiterals;

//#define USE_LOCAL_MODELSJSON

#define MODELS_JSON_VERSION "3"

static const QStringList FILENAME_BLACKLIST { u"gpt4all-nomic-embed-text-v1.rmodel"_s };

static const QString RMODEL_CHAT_TEMPLATE = uR"(
{%- set loop_messages = messages %}
{%- for message in loop_messages %}
    {%- if not message['role'] in ['user', 'assistant', 'system'] %}
        {{- raise_exception('Unknown role: ' + message['role']) }}
    {%- endif %}
    {{- '<' + message['role'] + '>' }}
    {%- if message['role'] ==
'user' %} {%- for source in message.sources %} {%- if loop.first %} {{- '### Context:\n' }} {%- endif %} {{- ('Collection: ' + source.collection + '\n' + 'Path: ' + source.path + '\n' + 'Excerpt: ' + source.text + '\n\n') | escape }} {%- endfor %} {%- endif %} {%- for attachment in message.prompt_attachments %} {{- (attachment.processed_content + '\n\n') | escape }} {%- endfor %} {{- message.content | escape }} {{- '' }} {%- endfor %} )"_s; QString ModelInfo::id() const { return m_id; } void ModelInfo::setId(const QString &id) { m_id = id; } QString ModelInfo::name() const { return MySettings::globalInstance()->modelName(*this); } void ModelInfo::setName(const QString &name) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelName(*this, name, true /*force*/); m_name = name; } QString ModelInfo::filename() const { return MySettings::globalInstance()->modelFilename(*this); } void ModelInfo::setFilename(const QString &filename) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelFilename(*this, filename, true /*force*/); m_filename = filename; } QString ModelInfo::description() const { return MySettings::globalInstance()->modelDescription(*this); } void ModelInfo::setDescription(const QString &d) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelDescription(*this, d, true /*force*/); m_description = d; } QString ModelInfo::url() const { return MySettings::globalInstance()->modelUrl(*this); } void ModelInfo::setUrl(const QString &u) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelUrl(*this, u, true /*force*/); m_url = u; } QString ModelInfo::quant() const { return MySettings::globalInstance()->modelQuant(*this); } void ModelInfo::setQuant(const QString &q) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelQuant(*this, q, true /*force*/); m_quant = q; } QString ModelInfo::type() const { return MySettings::globalInstance()->modelType(*this); } void ModelInfo::setType(const QString &t) { if 
(shouldSaveMetadata()) MySettings::globalInstance()->setModelType(*this, t, true /*force*/); m_type = t; } bool ModelInfo::isClone() const { return MySettings::globalInstance()->modelIsClone(*this); } void ModelInfo::setIsClone(bool b) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelIsClone(*this, b, true /*force*/); m_isClone = b; } bool ModelInfo::isDiscovered() const { return MySettings::globalInstance()->modelIsDiscovered(*this); } void ModelInfo::setIsDiscovered(bool b) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelIsDiscovered(*this, b, true /*force*/); m_isDiscovered = b; } int ModelInfo::likes() const { return MySettings::globalInstance()->modelLikes(*this); } void ModelInfo::setLikes(int l) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelLikes(*this, l, true /*force*/); m_likes = l; } int ModelInfo::downloads() const { return MySettings::globalInstance()->modelDownloads(*this); } void ModelInfo::setDownloads(int d) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelDownloads(*this, d, true /*force*/); m_downloads = d; } QDateTime ModelInfo::recency() const { return MySettings::globalInstance()->modelRecency(*this); } void ModelInfo::setRecency(const QDateTime &r) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelRecency(*this, r, true /*force*/); m_recency = r; } double ModelInfo::temperature() const { return MySettings::globalInstance()->modelTemperature(*this); } void ModelInfo::setTemperature(double t) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelTemperature(*this, t, true /*force*/); m_temperature = t; } double ModelInfo::topP() const { return MySettings::globalInstance()->modelTopP(*this); } double ModelInfo::minP() const { return MySettings::globalInstance()->modelMinP(*this); } void ModelInfo::setTopP(double p) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelTopP(*this, p, true /*force*/); m_topP = p; } void 
ModelInfo::setMinP(double p) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelMinP(*this, p, true /*force*/); m_minP = p; } int ModelInfo::topK() const { return MySettings::globalInstance()->modelTopK(*this); } void ModelInfo::setTopK(int k) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelTopK(*this, k, true /*force*/); m_topK = k; } int ModelInfo::maxLength() const { return MySettings::globalInstance()->modelMaxLength(*this); } void ModelInfo::setMaxLength(int l) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelMaxLength(*this, l, true /*force*/); m_maxLength = l; } int ModelInfo::promptBatchSize() const { return MySettings::globalInstance()->modelPromptBatchSize(*this); } void ModelInfo::setPromptBatchSize(int s) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelPromptBatchSize(*this, s, true /*force*/); m_promptBatchSize = s; } int ModelInfo::contextLength() const { return MySettings::globalInstance()->modelContextLength(*this); } void ModelInfo::setContextLength(int l) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelContextLength(*this, l, true /*force*/); m_contextLength = l; } int ModelInfo::maxContextLength() const { if (!installed || isOnline) return -1; if (m_maxContextLength != -1) return m_maxContextLength; auto path = (dirpath + filename()).toStdString(); int n_ctx = LLModel::Implementation::maxContextLength(path); if (n_ctx < 0) { n_ctx = 4096; // fallback value } m_maxContextLength = n_ctx; return m_maxContextLength; } int ModelInfo::gpuLayers() const { return MySettings::globalInstance()->modelGpuLayers(*this); } void ModelInfo::setGpuLayers(int l) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelGpuLayers(*this, l, true /*force*/); m_gpuLayers = l; } int ModelInfo::maxGpuLayers() const { if (!installed || isOnline) return -1; if (m_maxGpuLayers != -1) return m_maxGpuLayers; auto path = (dirpath + filename()).toStdString(); int layers = 
LLModel::Implementation::layerCount(path); if (layers < 0) { layers = 100; // fallback value } m_maxGpuLayers = layers; return m_maxGpuLayers; } double ModelInfo::repeatPenalty() const { return MySettings::globalInstance()->modelRepeatPenalty(*this); } void ModelInfo::setRepeatPenalty(double p) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelRepeatPenalty(*this, p, true /*force*/); m_repeatPenalty = p; } int ModelInfo::repeatPenaltyTokens() const { return MySettings::globalInstance()->modelRepeatPenaltyTokens(*this); } void ModelInfo::setRepeatPenaltyTokens(int t) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelRepeatPenaltyTokens(*this, t, true /*force*/); m_repeatPenaltyTokens = t; } QVariant ModelInfo::defaultChatTemplate() const { auto res = m_chatTemplate.or_else([this]() -> std::optional { if (!installed || isOnline) return std::nullopt; if (!m_modelChatTemplate) { auto path = (dirpath + filename()).toUtf8(); auto res = LLModel::Implementation::chatTemplate(path.constData()); if (res) { std::string ggufTmpl(std::move(*res)); if (ggufTmpl.size() >= 2 && ggufTmpl.end()[-2] != '\n' && ggufTmpl.end()[-1] == '\n') ggufTmpl.erase(ggufTmpl.end() - 1); // strip trailing newline for e.g. 
Llama-3.2-3B-Instruct if ( auto replacement = CHAT_TEMPLATE_SUBSTITUTIONS.find(ggufTmpl); replacement != CHAT_TEMPLATE_SUBSTITUTIONS.end() ) { qWarning() << "automatically substituting chat template for" << filename(); auto &[badTemplate, goodTemplate] = *replacement; ggufTmpl = goodTemplate; } m_modelChatTemplate = QString::fromStdString(ggufTmpl); } else { qWarning().nospace() << "failed to get chat template for " << filename() << ": " << res.error().c_str(); m_modelChatTemplate = QString(); // do not retry } } if (m_modelChatTemplate->isNull()) return std::nullopt; return m_modelChatTemplate; }); if (res) return std::move(*res); return QVariant::fromValue(nullptr); } auto ModelInfo::chatTemplate() const -> UpgradeableSetting { return MySettings::globalInstance()->modelChatTemplate(*this); } QString ModelInfo::defaultSystemMessage() const { return m_systemMessage; } auto ModelInfo::systemMessage() const -> UpgradeableSetting { return MySettings::globalInstance()->modelSystemMessage(*this); } QString ModelInfo::chatNamePrompt() const { return MySettings::globalInstance()->modelChatNamePrompt(*this); } void ModelInfo::setChatNamePrompt(const QString &p) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelChatNamePrompt(*this, p, true /*force*/); m_chatNamePrompt = p; } QString ModelInfo::suggestedFollowUpPrompt() const { return MySettings::globalInstance()->modelSuggestedFollowUpPrompt(*this); } void ModelInfo::setSuggestedFollowUpPrompt(const QString &p) { if (shouldSaveMetadata()) MySettings::globalInstance()->setModelSuggestedFollowUpPrompt(*this, p, true /*force*/); m_suggestedFollowUpPrompt = p; } // FIXME(jared): this should not be used for model settings that have meaningful defaults, such as temperature bool ModelInfo::shouldSaveMetadata() const { return installed && (isClone() || isDiscovered() || description() == "" /*indicates sideloaded*/); } QVariant ModelInfo::getField(QLatin1StringView name) const { static const std::unordered_map 
s_fields = { { "filename"_L1, [](auto &i) -> QVariant { return i.m_filename; } }, { "description"_L1, [](auto &i) -> QVariant { return i.m_description; } }, { "url"_L1, [](auto &i) -> QVariant { return i.m_url; } }, { "quant"_L1, [](auto &i) -> QVariant { return i.m_quant; } }, { "type"_L1, [](auto &i) -> QVariant { return i.m_type; } }, { "isClone"_L1, [](auto &i) -> QVariant { return i.m_isClone; } }, { "isDiscovered"_L1, [](auto &i) -> QVariant { return i.m_isDiscovered; } }, { "likes"_L1, [](auto &i) -> QVariant { return i.m_likes; } }, { "downloads"_L1, [](auto &i) -> QVariant { return i.m_downloads; } }, { "recency"_L1, [](auto &i) -> QVariant { return i.m_recency; } }, { "temperature"_L1, [](auto &i) -> QVariant { return i.m_temperature; } }, { "topP"_L1, [](auto &i) -> QVariant { return i.m_topP; } }, { "minP"_L1, [](auto &i) -> QVariant { return i.m_minP; } }, { "topK"_L1, [](auto &i) -> QVariant { return i.m_topK; } }, { "maxLength"_L1, [](auto &i) -> QVariant { return i.m_maxLength; } }, { "promptBatchSize"_L1, [](auto &i) -> QVariant { return i.m_promptBatchSize; } }, { "contextLength"_L1, [](auto &i) -> QVariant { return i.m_contextLength; } }, { "gpuLayers"_L1, [](auto &i) -> QVariant { return i.m_gpuLayers; } }, { "repeatPenalty"_L1, [](auto &i) -> QVariant { return i.m_repeatPenalty; } }, { "repeatPenaltyTokens"_L1, [](auto &i) -> QVariant { return i.m_repeatPenaltyTokens; } }, { "chatTemplate"_L1, [](auto &i) -> QVariant { return i.defaultChatTemplate(); } }, { "systemMessage"_L1, [](auto &i) -> QVariant { return i.m_systemMessage; } }, { "chatNamePrompt"_L1, [](auto &i) -> QVariant { return i.m_chatNamePrompt; } }, { "suggestedFollowUpPrompt"_L1, [](auto &i) -> QVariant { return i.m_suggestedFollowUpPrompt; } }, }; return s_fields.at(name)(*this); } InstalledModels::InstalledModels(QObject *parent, bool selectable) : QSortFilterProxyModel(parent) , m_selectable(selectable) { connect(this, &InstalledModels::rowsInserted, this, 
&InstalledModels::countChanged); connect(this, &InstalledModels::rowsRemoved, this, &InstalledModels::countChanged); connect(this, &InstalledModels::modelReset, this, &InstalledModels::countChanged); } bool InstalledModels::filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const { /* TODO(jared): We should list incomplete models alongside installed models on the * Models page. Simply replacing isDownloading with isIncomplete here doesn't work for * some reason - the models show up as something else. */ QModelIndex index = sourceModel()->index(sourceRow, 0, sourceParent); bool isInstalled = sourceModel()->data(index, ModelList::InstalledRole).toBool(); bool isDownloading = sourceModel()->data(index, ModelList::DownloadingRole).toBool(); bool isEmbeddingModel = sourceModel()->data(index, ModelList::IsEmbeddingModelRole).toBool(); // list installed chat models return (isInstalled || (!m_selectable && isDownloading)) && !isEmbeddingModel; } GPT4AllDownloadableModels::GPT4AllDownloadableModels(QObject *parent) : QSortFilterProxyModel(parent) { connect(this, &GPT4AllDownloadableModels::rowsInserted, this, &GPT4AllDownloadableModels::countChanged); connect(this, &GPT4AllDownloadableModels::rowsRemoved, this, &GPT4AllDownloadableModels::countChanged); connect(this, &GPT4AllDownloadableModels::modelReset, this, &GPT4AllDownloadableModels::countChanged); } void GPT4AllDownloadableModels::filter(const QVector &keywords) { m_keywords = keywords; invalidateFilter(); } bool GPT4AllDownloadableModels::filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const { QModelIndex index = sourceModel()->index(sourceRow, 0, sourceParent); const QString description = sourceModel()->data(index, ModelList::DescriptionRole).toString(); bool hasDescription = !description.isEmpty(); bool isClone = sourceModel()->data(index, ModelList::IsCloneRole).toBool(); bool isDiscovered = sourceModel()->data(index, ModelList::IsDiscoveredRole).toBool(); bool isOnline = 
sourceModel()->data(index, ModelList::OnlineRole).toBool(); bool satisfiesKeyword = m_keywords.isEmpty(); for (const QString &k : m_keywords) satisfiesKeyword = description.contains(k) ? true : satisfiesKeyword; return !isOnline && !isDiscovered && hasDescription && !isClone && satisfiesKeyword; } int GPT4AllDownloadableModels::count() const { return rowCount(); } HuggingFaceDownloadableModels::HuggingFaceDownloadableModels(QObject *parent) : QSortFilterProxyModel(parent) , m_limit(5) { connect(this, &HuggingFaceDownloadableModels::rowsInserted, this, &HuggingFaceDownloadableModels::countChanged); connect(this, &HuggingFaceDownloadableModels::rowsRemoved, this, &HuggingFaceDownloadableModels::countChanged); connect(this, &HuggingFaceDownloadableModels::modelReset, this, &HuggingFaceDownloadableModels::countChanged); } bool HuggingFaceDownloadableModels::filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const { QModelIndex index = sourceModel()->index(sourceRow, 0, sourceParent); bool hasDescription = !sourceModel()->data(index, ModelList::DescriptionRole).toString().isEmpty(); bool isClone = sourceModel()->data(index, ModelList::IsCloneRole).toBool(); bool isDiscovered = sourceModel()->data(index, ModelList::IsDiscoveredRole).toBool(); return isDiscovered && hasDescription && !isClone; } int HuggingFaceDownloadableModels::count() const { return rowCount(); } void HuggingFaceDownloadableModels::discoverAndFilter(const QString &discover) { m_discoverFilter = discover; ModelList *ml = qobject_cast<ModelList *>(parent()); ml->discoverSearch(discover); } class MyModelList: public ModelList { }; Q_GLOBAL_STATIC(MyModelList, modelListInstance) ModelList *ModelList::globalInstance() { return modelListInstance(); } ModelList::ModelList() : QAbstractListModel(nullptr) , m_installedModels(new InstalledModels(this)) , m_selectableModels(new InstalledModels(this, /*selectable*/ true)) , m_gpt4AllDownloadableModels(new GPT4AllDownloadableModels(this)) , 
m_huggingFaceDownloadableModels(new HuggingFaceDownloadableModels(this)) , m_asyncModelRequestOngoing(false) , m_discoverLimit(20) , m_discoverSortDirection(-1) , m_discoverSort(Likes) , m_discoverNumberOfResults(0) , m_discoverResultsCompleted(0) , m_discoverInProgress(false) { QCoreApplication::instance()->installEventFilter(this); m_installedModels->setSourceModel(this); m_selectableModels->setSourceModel(this); m_gpt4AllDownloadableModels->setSourceModel(this); m_huggingFaceDownloadableModels->setSourceModel(this); auto *mySettings = MySettings::globalInstance(); connect(mySettings, &MySettings::nameChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::temperatureChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::topPChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::minPChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::topKChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::maxLengthChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::promptBatchSizeChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::contextLengthChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::gpuLayersChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::repeatPenaltyChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::repeatPenaltyTokensChanged, this, &ModelList::updateDataForSettings ); connect(mySettings, &MySettings::chatTemplateChanged, this, &ModelList::maybeUpdateDataForSettings); connect(mySettings, &MySettings::systemMessageChanged, this, &ModelList::maybeUpdateDataForSettings); connect(this, &ModelList::dataChanged, this, &ModelList::onDataChanged); connect(&m_networkManager, &QNetworkAccessManager::sslErrors, this, &ModelList::handleSslErrors); 
updateModelsFromJson(); updateModelsFromSettings(); updateModelsFromDirectory(); connect(mySettings, &MySettings::modelPathChanged, this, &ModelList::updateModelsFromDirectory); connect(mySettings, &MySettings::modelPathChanged, this, &ModelList::updateModelsFromJson ); connect(mySettings, &MySettings::modelPathChanged, this, &ModelList::updateModelsFromSettings ); } // an easier way to listen for model info and setting changes void ModelList::onDataChanged(const QModelIndex &topLeft, const QModelIndex &bottomRight, const QList<int> &roles) { Q_UNUSED(roles) for (int row = topLeft.row(); row <= bottomRight.row(); row++) { auto index = topLeft.siblingAtRow(row); auto id = index.data(ModelList::IdRole).toString(); if (auto info = modelInfo(id); !info.id().isNull()) emit modelInfoChanged(info); } } QString ModelList::compatibleModelNameHash(QUrl baseUrl, QString modelName) { QCryptographicHash sha256(QCryptographicHash::Sha256); sha256.addData((baseUrl.toString() + "_" + modelName).toUtf8()); return sha256.result().toHex(); } QString ModelList::compatibleModelFilename(QUrl baseUrl, QString modelName) { QString hash(compatibleModelNameHash(baseUrl, modelName)); return QString(u"gpt4all-%1-capi.rmodel"_s).arg(hash); } bool ModelList::eventFilter(QObject *obj, QEvent *ev) { if (obj == QCoreApplication::instance() && ev->type() == QEvent::LanguageChange) emit dataChanged(index(0, 0), index(m_models.size() - 1, 0)); return false; } QString ModelList::incompleteDownloadPath(const QString &modelFile) { return MySettings::globalInstance()->modelPath() + "incomplete-" + modelFile; } const QList<ModelInfo> ModelList::selectableModelList() const { // FIXME: This needs to be kept in sync with m_selectableModels so should probably be merged QMutexLocker locker(&m_mutex); QList<ModelInfo> infos; for (ModelInfo *info : m_models) if (info->installed && !info->isEmbeddingModel) infos.append(*info); return infos; } ModelInfo ModelList::defaultModelInfo() 
const { QMutexLocker locker(&m_mutex); QSettings settings; // The user default model can be set by the user in the settings dialog. The "default" user // default model is "Application default" which signals we should use the logic here. const QString userDefaultModelName = MySettings::globalInstance()->userDefaultModel(); const bool hasUserDefaultName = !userDefaultModelName.isEmpty() && userDefaultModelName != "Application default"; ModelInfo *defaultModel = nullptr; for (ModelInfo *info : m_models) { if (!info->installed) continue; defaultModel = info; const size_t ramrequired = defaultModel->ramrequired; // If we don't have either setting, then just use the first model that requires less than 16GB that is installed if (!hasUserDefaultName && !info->isOnline && ramrequired > 0 && ramrequired < 16) break; // If we have a user specified default and match, then use it if (hasUserDefaultName && (defaultModel->id() == userDefaultModelName)) break; } if (defaultModel) return *defaultModel; return ModelInfo(); } bool ModelList::contains(const QString &id) const { QMutexLocker locker(&m_mutex); return m_modelMap.contains(id); } bool ModelList::containsByFilename(const QString &filename) const { QMutexLocker locker(&m_mutex); for (ModelInfo *info : m_models) if (info->filename() == filename) return true; return false; } bool ModelList::lessThan(const ModelInfo* a, const ModelInfo* b, DiscoverSort s, int d) { // Rule -1a: Discover sort if (a->isDiscovered() && b->isDiscovered()) { switch (s) { case Default: break; case Likes: return (d > 0 ? a->likes() < b->likes() : a->likes() > b->likes()); case Downloads: return (d > 0 ? a->downloads() < b->downloads() : a->downloads() > b->downloads()); case Recent: return (d > 0 ? 
a->recency() < b->recency() : a->recency() > b->recency()); } } // Rule -1: Discovered before non-discovered if (a->isDiscovered() != b->isDiscovered()) { return a->isDiscovered(); } // Rule 0: Non-clone before clone if (a->isClone() != b->isClone()) { return !a->isClone(); } // Rule 1: Non-empty 'order' before empty if (a->order.isEmpty() != b->order.isEmpty()) { return !a->order.isEmpty(); } // Rule 2: Both 'order' are non-empty, sort alphanumerically if (!a->order.isEmpty() && !b->order.isEmpty()) { return a->order < b->order; } // Rule 3: Both 'order' are empty, sort by id return a->id() < b->id(); } void ModelList::addModel(const QString &id) { const bool hasModel = contains(id); Q_ASSERT(!hasModel); if (hasModel) { qWarning() << "ERROR: model list already contains" << id; return; } ModelInfo *info = new ModelInfo; info->setId(id); m_mutex.lock(); auto s = m_discoverSort; auto d = m_discoverSortDirection; const auto insertPosition = std::lower_bound(m_models.begin(), m_models.end(), info, [s, d](const ModelInfo* lhs, const ModelInfo* rhs) { return ModelList::lessThan(lhs, rhs, s, d); }); const int index = std::distance(m_models.begin(), insertPosition); m_mutex.unlock(); // NOTE: The begin/end rows cannot have a lock placed around them. 
We calculate the index ahead // of time and this works because this class is designed carefully so that only one thread is // responsible for insertion, deletion, and update beginInsertRows(QModelIndex(), index, index); m_mutex.lock(); m_models.insert(insertPosition, info); m_modelMap.insert(id, info); m_mutex.unlock(); endInsertRows(); emit selectableModelListChanged(); } void ModelList::changeId(const QString &oldId, const QString &newId) { const bool hasModel = contains(oldId); Q_ASSERT(hasModel); if (!hasModel) { qWarning() << "ERROR: model list does not contain" << oldId; return; } QMutexLocker locker(&m_mutex); ModelInfo *info = m_modelMap.take(oldId); info->setId(newId); m_modelMap.insert(newId, info); } int ModelList::rowCount(const QModelIndex &parent) const { Q_UNUSED(parent) QMutexLocker locker(&m_mutex); return m_models.size(); } QVariant ModelList::dataInternal(const ModelInfo *info, int role) const { switch (role) { case IdRole: return info->id(); case NameRole: return info->name(); case FilenameRole: return info->filename(); case DirpathRole: return info->dirpath; case FilesizeRole: return info->filesize; case HashRole: return info->hash; case HashAlgorithmRole: return info->hashAlgorithm; case CalcHashRole: return info->calcHash; case InstalledRole: return info->installed; case DefaultRole: return info->isDefault; case OnlineRole: return info->isOnline; case CompatibleApiRole: return info->isCompatibleApi; case DescriptionRole: return info->description(); case RequiresVersionRole: return info->requiresVersion; case VersionRemovedRole: return info->versionRemoved; case UrlRole: return info->url(); case BytesReceivedRole: return info->bytesReceived; case BytesTotalRole: return info->bytesTotal; case TimestampRole: return info->timestamp; case SpeedRole: return info->speed; case DownloadingRole: return info->isDownloading; case IncompleteRole: return info->isIncomplete; case DownloadErrorRole: return info->downloadError; case OrderRole: return 
info->order; case RamrequiredRole: return info->ramrequired; case ParametersRole: return info->parameters; case QuantRole: return info->quant(); case TypeRole: return info->type(); case IsCloneRole: return info->isClone(); case IsDiscoveredRole: return info->isDiscovered(); case IsEmbeddingModelRole: return info->isEmbeddingModel; case TemperatureRole: return info->temperature(); case TopPRole: return info->topP(); case MinPRole: return info->minP(); case TopKRole: return info->topK(); case MaxLengthRole: return info->maxLength(); case PromptBatchSizeRole: return info->promptBatchSize(); case ContextLengthRole: return info->contextLength(); case GpuLayersRole: return info->gpuLayers(); case RepeatPenaltyRole: return info->repeatPenalty(); case RepeatPenaltyTokensRole: return info->repeatPenaltyTokens(); case ChatTemplateRole: return QVariant::fromValue(info->chatTemplate()); case SystemMessageRole: return QVariant::fromValue(info->systemMessage()); case ChatNamePromptRole: return info->chatNamePrompt(); case SuggestedFollowUpPromptRole: return info->suggestedFollowUpPrompt(); case LikesRole: return info->likes(); case DownloadsRole: return info->downloads(); case RecencyRole: return info->recency(); } return QVariant(); } QVariant ModelList::data(const QString &id, int role) const { QMutexLocker locker(&m_mutex); ModelInfo *info = m_modelMap.value(id); if (!info) return QVariant(); return dataInternal(info, role); } QVariant ModelList::dataByFilename(const QString &filename, int role) const { QMutexLocker locker(&m_mutex); for (ModelInfo *info : m_models) if (info->filename() == filename) return dataInternal(info, role); return QVariant(); } QVariant ModelList::data(const QModelIndex &index, int role) const { QMutexLocker locker(&m_mutex); if (!index.isValid() || index.row() < 0 || index.row() >= m_models.size()) return QVariant(); const ModelInfo *info = m_models.at(index.row()); return dataInternal(info, role); } void ModelList::updateData(const QString &id, const QVector<QPair<int, QVariant>> &data) { // We only 
re-sort when a field used by the sorting algorithm actually changes bool shouldSort = false; int index; { QMutexLocker locker(&m_mutex); if (!m_modelMap.contains(id)) { qWarning() << "ERROR: cannot update as model map does not contain" << id; return; } ModelInfo *info = m_modelMap.value(id); index = m_models.indexOf(info); if (index == -1) { qWarning() << "ERROR: cannot update as model list does not contain" << id; return; } for (const auto &d : data) { const int role = d.first; const QVariant value = d.second; switch (role) { case IdRole: { if (info->id() != value.toString()) { info->setId(value.toString()); shouldSort = true; } break; } case NameRole: info->setName(value.toString()); break; case FilenameRole: info->setFilename(value.toString()); break; case DirpathRole: info->dirpath = value.toString(); break; case FilesizeRole: info->filesize = value.toString(); break; case HashRole: info->hash = value.toByteArray(); break; case HashAlgorithmRole: info->hashAlgorithm = static_cast<ModelInfo::HashAlgorithm>(value.toInt()); break; case CalcHashRole: info->calcHash = value.toBool(); break; case InstalledRole: info->installed = value.toBool(); break; case DefaultRole: info->isDefault = value.toBool(); break; case OnlineRole: info->isOnline = value.toBool(); break; case CompatibleApiRole: info->isCompatibleApi = value.toBool(); break; case DescriptionRole: info->setDescription(value.toString()); break; case RequiresVersionRole: info->requiresVersion = value.toString(); break; case VersionRemovedRole: info->versionRemoved = value.toString(); break; case UrlRole: info->setUrl(value.toString()); break; case BytesReceivedRole: info->bytesReceived = value.toLongLong(); break; case BytesTotalRole: info->bytesTotal = value.toLongLong(); break; case TimestampRole: info->timestamp = value.toLongLong(); break; case SpeedRole: info->speed = value.toString(); break; case DownloadingRole: info->isDownloading = value.toBool(); break; case 
IncompleteRole: info->isIncomplete = value.toBool(); break; case DownloadErrorRole: info->downloadError = value.toString(); break; case OrderRole: { if (info->order != value.toString()) { info->order = value.toString(); shouldSort = true; } break; } case RamrequiredRole: info->ramrequired = value.toInt(); break; case ParametersRole: info->parameters = value.toString(); break; case QuantRole: info->setQuant(value.toString()); break; case TypeRole: info->setType(value.toString()); break; case IsCloneRole: { if (info->isClone() != value.toBool()) { info->setIsClone(value.toBool()); shouldSort = true; } break; } case IsDiscoveredRole: { if (info->isDiscovered() != value.toBool()) { info->setIsDiscovered(value.toBool()); shouldSort = true; } break; } case IsEmbeddingModelRole: info->isEmbeddingModel = value.toBool(); break; case TemperatureRole: info->setTemperature(value.toDouble()); break; case TopPRole: info->setTopP(value.toDouble()); break; case MinPRole: info->setMinP(value.toDouble()); break; case TopKRole: info->setTopK(value.toInt()); break; case MaxLengthRole: info->setMaxLength(value.toInt()); break; case PromptBatchSizeRole: info->setPromptBatchSize(value.toInt()); break; case ContextLengthRole: info->setContextLength(value.toInt()); break; case GpuLayersRole: info->setGpuLayers(value.toInt()); break; case RepeatPenaltyRole: info->setRepeatPenalty(value.toDouble()); break; case RepeatPenaltyTokensRole: info->setRepeatPenaltyTokens(value.toInt()); break; case ChatTemplateRole: info->m_chatTemplate = value.toString(); break; case SystemMessageRole: info->m_systemMessage = value.toString(); break; case ChatNamePromptRole: info->setChatNamePrompt(value.toString()); break; case SuggestedFollowUpPromptRole: info->setSuggestedFollowUpPrompt(value.toString()); break; case LikesRole: { if (info->likes() != value.toInt()) { info->setLikes(value.toInt()); shouldSort = true; } break; } case DownloadsRole: { if (info->downloads() != value.toInt()) { 
info->setDownloads(value.toInt()); shouldSort = true; } break; } case RecencyRole: { if (info->recency() != value.toDateTime()) { info->setRecency(value.toDateTime()); shouldSort = true; } break; } } } // Extra guarantee that these always remain in sync with the filesystem QString modelPath = info->dirpath + info->filename(); const QFileInfo fileInfo(modelPath); info->installed = fileInfo.exists(); const QFileInfo incompleteInfo(incompleteDownloadPath(info->filename())); info->isIncomplete = incompleteInfo.exists(); // check installed, discovered/sideloaded models only (including clones) if (!info->checkedEmbeddingModel && !info->isEmbeddingModel && info->installed && (info->isDiscovered() || info->description().isEmpty())) { // read GGUF and decide based on model architecture info->isEmbeddingModel = LLModel::Implementation::isEmbeddingModel(modelPath.toStdString()); info->checkedEmbeddingModel = true; } } emit dataChanged(createIndex(index, 0), createIndex(index, 0)); if (shouldSort) resortModel(); emit selectableModelListChanged(); } void ModelList::resortModel() { emit layoutAboutToBeChanged(); { QMutexLocker locker(&m_mutex); auto s = m_discoverSort; auto d = m_discoverSortDirection; std::stable_sort(m_models.begin(), m_models.end(), [s, d](const ModelInfo* lhs, const ModelInfo* rhs) { return ModelList::lessThan(lhs, rhs, s, d); }); } emit layoutChanged(); } void ModelList::updateDataByFilename(const QString &filename, QVector<QPair<int, QVariant>> data) { if (data.isEmpty()) return; // no-op QVector<QString> modelsById; { QMutexLocker locker(&m_mutex); for (ModelInfo *info : m_models) if (info->filename() == filename) modelsById.append(info->id()); } if (modelsById.isEmpty()) { qWarning() << "ERROR: cannot update model as list does not contain file" << filename; return; } for (const QString &id : modelsById) updateData(id, data); } ModelInfo ModelList::modelInfo(const QString &id) const { QMutexLocker locker(&m_mutex); if (!m_modelMap.contains(id)) return ModelInfo(); return 
*m_modelMap.value(id); } ModelInfo ModelList::modelInfoByFilename(const QString &filename, bool allowClone) const { QMutexLocker locker(&m_mutex); for (ModelInfo *info : m_models) if (info->filename() == filename && (allowClone || !info->isClone())) return *info; return ModelInfo(); } bool ModelList::isUniqueName(const QString &name) const { QMutexLocker locker(&m_mutex); for (const ModelInfo *info : m_models) { if (info->name() == name) return false; } return true; } QString ModelList::clone(const ModelInfo &model) { auto *mySettings = MySettings::globalInstance(); const QString id = Network::globalInstance()->generateUniqueId(); addModel(id); QString tmplSetting, sysmsgSetting; if (auto tmpl = model.chatTemplate().asModern()) { tmplSetting = *tmpl; } else { qWarning("ModelList Warning: attempted to clone model with legacy chat template"); return {}; } if (auto msg = model.systemMessage().asModern()) { sysmsgSetting = *msg; } else { qWarning("ModelList Warning: attempted to clone model with legacy system message"); return {}; } QVector<QPair<int, QVariant>> data { { ModelList::InstalledRole, model.installed }, { ModelList::IsCloneRole, true }, { ModelList::NameRole, uniqueModelName(model) }, { ModelList::FilenameRole, model.filename() }, { ModelList::DirpathRole, model.dirpath }, { ModelList::OnlineRole, model.isOnline }, { ModelList::CompatibleApiRole, model.isCompatibleApi }, { ModelList::IsEmbeddingModelRole, model.isEmbeddingModel }, { ModelList::TemperatureRole, model.temperature() }, { ModelList::TopPRole, model.topP() }, { ModelList::MinPRole, model.minP() }, { ModelList::TopKRole, model.topK() }, { ModelList::MaxLengthRole, model.maxLength() }, { ModelList::PromptBatchSizeRole, model.promptBatchSize() }, { ModelList::ContextLengthRole, model.contextLength() }, { ModelList::GpuLayersRole, model.gpuLayers() }, { ModelList::RepeatPenaltyRole, model.repeatPenalty() }, { ModelList::RepeatPenaltyTokensRole, model.repeatPenaltyTokens() }, { ModelList::SystemMessageRole, 
model.m_systemMessage }, { ModelList::ChatNamePromptRole, model.chatNamePrompt() }, { ModelList::SuggestedFollowUpPromptRole, model.suggestedFollowUpPrompt() }, }; if (auto tmpl = model.m_chatTemplate) data.emplace_back(ModelList::ChatTemplateRole, *tmpl); // copy default chat template, if known updateData(id, data); // Ensure setting overrides are copied in case the base model overrides change. // This is necessary because setting these roles on ModelInfo above does not write to settings. auto cloneInfo = modelInfo(id); if (mySettings->isModelChatTemplateSet (model)) mySettings->setModelChatTemplate (cloneInfo, tmplSetting ); if (mySettings->isModelSystemMessageSet(model)) mySettings->setModelSystemMessage(cloneInfo, sysmsgSetting); return id; } void ModelList::removeClone(const ModelInfo &model) { Q_ASSERT(model.isClone()); if (!model.isClone()) return; removeInternal(model); } void ModelList::removeInstalled(const ModelInfo &model) { Q_ASSERT(model.installed); Q_ASSERT(!model.isClone()); Q_ASSERT(model.isDiscovered() || model.isCompatibleApi || model.description() == "" /*indicates sideloaded*/); removeInternal(model); } int ModelList::indexByModelId(const QString &id) const { QMutexLocker locker(&m_mutex); if (auto it = m_modelMap.find(id); it != m_modelMap.cend()) return m_models.indexOf(*it); return -1; } void ModelList::removeInternal(const ModelInfo &model) { int indexOfModel = indexByModelId(model.id()); Q_ASSERT(indexOfModel != -1); if (indexOfModel == -1) { qWarning() << "ERROR: model list does not contain" << model.id(); return; } beginRemoveRows(QModelIndex(), indexOfModel, indexOfModel); { QMutexLocker locker(&m_mutex); ModelInfo *info = m_models.takeAt(indexOfModel); m_modelMap.remove(info->id()); delete info; } endRemoveRows(); emit selectableModelListChanged(); MySettings::globalInstance()->eraseModel(model); } QString ModelList::uniqueModelName(const ModelInfo &model) const { QMutexLocker locker(&m_mutex); static const QRegularExpression 
re("^(.*)~(\\d+)$"); QRegularExpressionMatch match = re.match(model.name()); QString baseName; if (match.hasMatch()) baseName = match.captured(1); else baseName = model.name(); int maxSuffixNumber = 0; bool baseNameExists = false; for (const ModelInfo *info : m_models) { if(info->name() == baseName) baseNameExists = true; QRegularExpressionMatch match = re.match(info->name()); if (match.hasMatch()) { QString currentBaseName = match.captured(1); int currentSuffixNumber = match.captured(2).toInt(); if (currentBaseName == baseName && currentSuffixNumber > maxSuffixNumber) maxSuffixNumber = currentSuffixNumber; } } if (baseNameExists) return baseName + "~" + QString::number(maxSuffixNumber + 1); return baseName; } bool ModelList::modelExists(const QString &modelFilename) const { QString appPath = QCoreApplication::applicationDirPath() + modelFilename; QFileInfo infoAppPath(appPath); if (infoAppPath.exists()) return true; QString downloadPath = MySettings::globalInstance()->modelPath() + modelFilename; QFileInfo infoLocalPath(downloadPath); if (infoLocalPath.exists()) return true; return false; } void ModelList::updateOldRemoteModels(const QString &path) { QDirIterator it(path, QDir::Files, QDirIterator::Subdirectories); while (it.hasNext()) { QFileInfo info = it.nextFileInfo(); QString filename = it.fileName(); if (!filename.startsWith("chatgpt-") || !filename.endsWith(".txt")) continue; QString apikey; QString modelname(filename); modelname.chop(4); // strip ".txt" extension modelname.remove(0, 8); // strip "chatgpt-" prefix QFile file(info.filePath()); if (!file.open(QIODevice::ReadOnly)) { qWarning().noquote() << tr("cannot open \"%1\": %2").arg(file.fileName(), file.errorString()); continue; } { QTextStream in(&file); apikey = in.readAll(); file.close(); } QFile newfile(u"%1/gpt4all-%2.rmodel"_s.arg(info.dir().path(), modelname)); if (!newfile.open(QIODevice::ReadWrite)) { qWarning().noquote() << tr("cannot create \"%1\": %2").arg(newfile.fileName(), 
newfile.errorString()); continue; } QJsonObject obj { { "apiKey", apikey }, { "modelName", modelname }, }; QTextStream out(&newfile); out << QJsonDocument(obj).toJson(); newfile.close(); file.remove(); } } void ModelList::processModelDirectory(const QString &path) { QDirIterator it(path, QDir::Files, QDirIterator::Subdirectories); while (it.hasNext()) { QFileInfo info = it.nextFileInfo(); QString filename = it.fileName(); if (filename.startsWith("incomplete") || FILENAME_BLACKLIST.contains(filename)) continue; if (!filename.endsWith(".gguf") && !filename.endsWith(".rmodel")) continue; bool isOnline(filename.endsWith(".rmodel")); bool isCompatibleApi(filename.endsWith("-capi.rmodel")); QString name; QString description; if (isCompatibleApi) { QJsonObject obj; { QFile file(info.filePath()); if (!file.open(QIODeviceBase::ReadOnly)) { qWarning().noquote() << tr("cannot open \"%1\": %2").arg(file.fileName(), file.errorString()); continue; } QJsonDocument doc = QJsonDocument::fromJson(file.readAll()); obj = doc.object(); } { QString apiKey(obj["apiKey"].toString()); QString baseUrl(obj["baseUrl"].toString()); QString modelName(obj["modelName"].toString()); apiKey = apiKey.length() < 10 ? "*****" : apiKey.left(5) + "*****"; name = tr("%1 (%2)").arg(modelName, baseUrl); description = tr("<strong>OpenAI-Compatible API Model</strong><br>" "<ul><li>API Key: %1</li>" "<li>Base URL: %2</li>" "<li>Model Name: %3</li></ul>
") .arg(apiKey, baseUrl, modelName); } } QVector<QString> modelsById; { QMutexLocker locker(&m_mutex); for (ModelInfo *info : m_models) if (info->filename() == filename) modelsById.append(info->id()); } if (modelsById.isEmpty()) { if (!contains(filename)) addModel(filename); modelsById.append(filename); } for (const QString &id : modelsById) { QVector<QPair<int, QVariant>> data { { InstalledRole, true }, { FilenameRole, filename }, { OnlineRole, isOnline }, { CompatibleApiRole, isCompatibleApi }, { DirpathRole, info.dir().absolutePath() + "/" }, { FilesizeRole, toFileSize(info.size()) }, }; if (isCompatibleApi) { // The data will be saved to "GPT4All.ini". data.append({ NameRole, name }); // The description is hard-coded into "GPT4All.ini" due to a performance issue. // If the description were loaded dynamically from its .rmodel file, it would cause heavy I/O while using the ModelList. data.append({ DescriptionRole, description }); data.append({ ChatTemplateRole, RMODEL_CHAT_TEMPLATE }); } updateData(id, data); } } } void ModelList::updateModelsFromDirectory() { const QString exePath = QCoreApplication::applicationDirPath() + QDir::separator(); const QString localPath = MySettings::globalInstance()->modelPath(); updateOldRemoteModels(exePath); processModelDirectory(exePath); if (localPath != exePath) { updateOldRemoteModels(localPath); processModelDirectory(localPath); } } static QString modelsJsonFilename() { return QStringLiteral("models" MODELS_JSON_VERSION ".json"); } static std::optional<QFile> modelsJsonCacheFile() { constexpr auto loc = QStandardPaths::CacheLocation; QString modelsJsonFname = modelsJsonFilename(); if (auto path = QStandardPaths::locate(loc, modelsJsonFname); !path.isEmpty()) return std::make_optional<QFile>(path); if (auto path = QStandardPaths::writableLocation(loc); !path.isEmpty()) return std::make_optional<QFile>(u"%1/%2"_s.arg(path, modelsJsonFname)); return std::nullopt; } void ModelList::updateModelsFromJson() { QString modelsJsonFname = modelsJsonFilename(); #if 
defined(USE_LOCAL_MODELSJSON) QUrl jsonUrl(u"file://%1/dev/large_language_models/gpt4all/gpt4all-chat/metadata/%2"_s.arg(QDir::homePath(), modelsJsonFname)); #else QUrl jsonUrl(u"http://gpt4all.io/models/%1"_s.arg(modelsJsonFname)); #endif QNetworkRequest request(jsonUrl); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *jsonReply = m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, jsonReply, &QNetworkReply::abort); QEventLoop loop; connect(jsonReply, &QNetworkReply::finished, &loop, &QEventLoop::quit); QTimer::singleShot(1500, &loop, &QEventLoop::quit); loop.exec(); if (jsonReply->error() == QNetworkReply::NoError && jsonReply->isFinished()) { QByteArray jsonData = jsonReply->readAll(); jsonReply->deleteLater(); parseModelsJsonFile(jsonData, true); } else { qWarning() << "WARNING: Could not download models.json synchronously"; updateModelsFromJsonAsync(); auto cacheFile = modelsJsonCacheFile(); if (!cacheFile) { // no known location } else if (cacheFile->open(QIODeviceBase::ReadOnly)) { QByteArray jsonData = cacheFile->readAll(); cacheFile->close(); parseModelsJsonFile(jsonData, false); } else if (cacheFile->exists()) qWarning() << "ERROR: Couldn't read models.json cache file: " << cacheFile->fileName(); } delete jsonReply; } void ModelList::updateModelsFromJsonAsync() { m_asyncModelRequestOngoing = true; emit asyncModelRequestOngoingChanged(); QString modelsJsonFname = modelsJsonFilename(); #if defined(USE_LOCAL_MODELSJSON) QUrl jsonUrl(u"file://%1/dev/large_language_models/gpt4all/gpt4all-chat/metadata/%2"_s.arg(QDir::homePath(), modelsJsonFname)); #else QUrl jsonUrl(u"http://gpt4all.io/models/%1"_s.arg(modelsJsonFname)); #endif QNetworkRequest request(jsonUrl); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *jsonReply = 
m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, jsonReply, &QNetworkReply::abort); connect(jsonReply, &QNetworkReply::finished, this, &ModelList::handleModelsJsonDownloadFinished); connect(jsonReply, &QNetworkReply::errorOccurred, this, &ModelList::handleModelsJsonDownloadErrorOccurred); } void ModelList::handleModelsJsonDownloadFinished() { QNetworkReply *jsonReply = qobject_cast<QNetworkReply *>(sender()); if (!jsonReply) { m_asyncModelRequestOngoing = false; emit asyncModelRequestOngoingChanged(); return; } QByteArray jsonData = jsonReply->readAll(); jsonReply->deleteLater(); parseModelsJsonFile(jsonData, true); m_asyncModelRequestOngoing = false; emit asyncModelRequestOngoingChanged(); } void ModelList::handleModelsJsonDownloadErrorOccurred(QNetworkReply::NetworkError code) { // TODO: Show what error occurred in the GUI m_asyncModelRequestOngoing = false; emit asyncModelRequestOngoingChanged(); QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender()); if (!reply) return; qWarning() << u"ERROR: Modellist download failed with error code \"%1-%2\""_s .arg(code).arg(reply->errorString()); } void ModelList::handleSslErrors(QNetworkReply *reply, const QList<QSslError> &errors) { QUrl url = reply->request().url(); for (const auto &e : errors) qWarning() << "ERROR: Received ssl error:" << e.errorString() << "for" << url; } void ModelList::maybeUpdateDataForSettings(const ModelInfo &info, bool fromInfo) { // ignore updates that were *because* of a dataChanged - would cause a circular dependency int idx; if (!fromInfo && (idx = indexByModelId(info.id())) != -1) { emit dataChanged(index(idx, 0), index(idx, 0)); emit selectableModelListChanged(); } } void ModelList::updateDataForSettings() { emit dataChanged(index(0, 0), index(m_models.size() - 1, 0)); emit selectableModelListChanged(); } void ModelList::parseModelsJsonFile(const QByteArray &jsonData, bool save) { QJsonParseError err; QJsonDocument document = QJsonDocument::fromJson(jsonData, &err); if (err.error != 
QJsonParseError::NoError) { qWarning() << "ERROR: Couldn't parse: " << jsonData << err.errorString(); return; } if (save) { auto cacheFile = modelsJsonCacheFile(); if (!cacheFile) { // no known location } else if (QFileInfo(*cacheFile).dir().mkpath(u"."_s) && cacheFile->open(QIODeviceBase::WriteOnly)) { cacheFile->write(jsonData); cacheFile->close(); } else qWarning() << "ERROR: Couldn't write models config file: " << cacheFile->fileName(); } QJsonArray jsonArray = document.array(); const QString currentVersion = QCoreApplication::applicationVersion(); for (const QJsonValue &value : jsonArray) { QJsonObject obj = value.toObject(); QString modelName = obj["name"].toString(); QString modelFilename = obj["filename"].toString(); QString modelFilesize = obj["filesize"].toString(); QString requiresVersion = obj["requires"].toString(); QString versionRemoved = obj["removedIn"].toString(); QString url = obj["url"].toString(); bool isDefault = obj.contains("isDefault") && obj["isDefault"] == u"true"_s; bool disableGUI = obj.contains("disableGUI") && obj["disableGUI"] == u"true"_s; QString description = obj["description"].toString(); QString order = obj["order"].toString(); int ramrequired = obj["ramrequired"].toString().toInt(); QString parameters = obj["parameters"].toString(); QString quant = obj["quant"].toString(); QString type = obj["type"].toString(); bool isEmbeddingModel = obj["embeddingModel"].toBool(); QByteArray modelHash; ModelInfo::HashAlgorithm hashAlgorithm; if (auto it = obj.find("sha256sum"_L1); it != obj.end()) { modelHash = it->toString().toLatin1(); hashAlgorithm = ModelInfo::Sha256; } else { modelHash = obj["md5sum"].toString().toLatin1(); hashAlgorithm = ModelInfo::Md5; } // Some models aren't supported in the GUI at all if (disableGUI) continue; // If the current version is strictly less than required version, then skip if (!requiresVersion.isEmpty() && Download::compareAppVersions(currentVersion, requiresVersion) < 0) continue; // If the version 
removed is less than or equal to the current version, then skip if (!versionRemoved.isEmpty() && Download::compareAppVersions(versionRemoved, currentVersion) <= 0) continue; modelFilesize = ModelList::toFileSize(modelFilesize.toULongLong()); const QString id = modelName; Q_ASSERT(!id.isEmpty()); if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, modelFilesize }, { ModelList::HashRole, modelHash }, { ModelList::HashAlgorithmRole, hashAlgorithm }, { ModelList::DefaultRole, isDefault }, { ModelList::DescriptionRole, description }, { ModelList::RequiresVersionRole, requiresVersion }, { ModelList::VersionRemovedRole, versionRemoved }, { ModelList::UrlRole, url }, { ModelList::OrderRole, order }, { ModelList::RamrequiredRole, ramrequired }, { ModelList::ParametersRole, parameters }, { ModelList::QuantRole, quant }, { ModelList::TypeRole, type }, { ModelList::IsEmbeddingModelRole, isEmbeddingModel }, }; if (obj.contains("temperature")) data.append({ ModelList::TemperatureRole, obj["temperature"].toDouble() }); if (obj.contains("topP")) data.append({ ModelList::TopPRole, obj["topP"].toDouble() }); if (obj.contains("minP")) data.append({ ModelList::MinPRole, obj["minP"].toDouble() }); if (obj.contains("topK")) data.append({ ModelList::TopKRole, obj["topK"].toInt() }); if (obj.contains("maxLength")) data.append({ ModelList::MaxLengthRole, obj["maxLength"].toInt() }); if (obj.contains("promptBatchSize")) data.append({ ModelList::PromptBatchSizeRole, obj["promptBatchSize"].toInt() }); if (obj.contains("contextLength")) data.append({ ModelList::ContextLengthRole, obj["contextLength"].toInt() }); if (obj.contains("gpuLayers")) data.append({ ModelList::GpuLayersRole, obj["gpuLayers"].toInt() }); if (obj.contains("repeatPenalty")) data.append({ ModelList::RepeatPenaltyRole, obj["repeatPenalty"].toDouble() }); if 
(obj.contains("repeatPenaltyTokens")) data.append({ ModelList::RepeatPenaltyTokensRole, obj["repeatPenaltyTokens"].toInt() }); if (auto it = obj.find("chatTemplate"_L1); it != obj.end()) data.append({ ModelList::ChatTemplateRole, it->toString() }); if (auto it = obj.find("systemMessage"_L1); it != obj.end()) data.append({ ModelList::SystemMessageRole, it->toString() }); updateData(id, data); } const QString chatGPTDesc = tr("
                  • Requires personal OpenAI API key.
                  • WARNING: Will send" " your chats to OpenAI!
                  • Your API key will be stored on disk
                  • Will only be used" " to communicate with OpenAI
                  • You can apply for an API key" " here.
                  • "); { const QString modelName = "ChatGPT-3.5 Turbo"; const QString id = modelName; const QString modelFilename = "gpt4all-gpt-3.5-turbo.rmodel"; if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::DescriptionRole, tr("OpenAI's ChatGPT model GPT-3.5 Turbo
                    %1").arg(chatGPTDesc) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "ca" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "GPT" }, { ModelList::UrlRole, "https://api.openai.com/v1/chat/completions" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } { const QString chatGPT4Warn = tr("

                    * Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info."); const QString modelName = "ChatGPT-4"; const QString id = modelName; const QString modelFilename = "gpt4all-gpt-4.rmodel"; if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::DescriptionRole, tr("OpenAI's ChatGPT model GPT-4
                    %1 %2").arg(chatGPTDesc).arg(chatGPT4Warn) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "cb" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "GPT" }, { ModelList::UrlRole, "https://api.openai.com/v1/chat/completions" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } const QString mistralDesc = tr("
                    • Requires personal Mistral API key.
                    • WARNING: Will send" " your chats to Mistral!
                    • Your API key will be stored on disk
                    • Will only be used" " to communicate with Mistral
                    • You can apply for an API key" " here.
                    • "); { const QString modelName = "Mistral Tiny API"; const QString id = modelName; const QString modelFilename = "gpt4all-mistral-tiny.rmodel"; if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::DescriptionRole, tr("Mistral Tiny model
                      %1").arg(mistralDesc) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "cc" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "Mistral" }, { ModelList::UrlRole, "https://api.mistral.ai/v1/chat/completions" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } { const QString modelName = "Mistral Small API"; const QString id = modelName; const QString modelFilename = "gpt4all-mistral-small.rmodel"; if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::DescriptionRole, tr("Mistral Small model
                      %1").arg(mistralDesc) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "cd" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "Mistral" }, { ModelList::UrlRole, "https://api.mistral.ai/v1/chat/completions" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } { const QString modelName = "Mistral Medium API"; const QString id = modelName; const QString modelFilename = "gpt4all-mistral-medium.rmodel"; if (contains(modelFilename)) changeId(modelFilename, id); if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilenameRole, modelFilename }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::DescriptionRole, tr("Mistral Medium model
                      %1").arg(mistralDesc) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "ce" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "Mistral" }, { ModelList::UrlRole, "https://api.mistral.ai/v1/chat/completions" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } const QString compatibleDesc = tr("
                      • Requires personal API key and the API base URL.
                      • " "
                      • WARNING: Will send your chats to " "the OpenAI-compatible API Server you specified!
                      • " "
                      • Your API key will be stored on disk
                      • Will only be used" " to communicate with the OpenAI-compatible API Server
                      • "); { const QString modelName = "OpenAI-compatible"; const QString id = modelName; if (!contains(id)) addModel(id); QVector> data { { ModelList::NameRole, modelName }, { ModelList::FilesizeRole, "minimal" }, { ModelList::OnlineRole, true }, { ModelList::CompatibleApiRole, true }, { ModelList::DescriptionRole, tr("Connect to OpenAI-compatible API server
                        %1").arg(compatibleDesc) }, { ModelList::RequiresVersionRole, "2.7.4" }, { ModelList::OrderRole, "cf" }, { ModelList::RamrequiredRole, 0 }, { ModelList::ParametersRole, "?" }, { ModelList::QuantRole, "NA" }, { ModelList::TypeRole, "NA" }, { ModelList::ChatTemplateRole, RMODEL_CHAT_TEMPLATE }, }; updateData(id, data); } } void ModelList::updateDiscoveredInstalled(const ModelInfo &info) { QVector> data { { ModelList::InstalledRole, true }, { ModelList::IsDiscoveredRole, true }, { ModelList::NameRole, info.name() }, { ModelList::FilenameRole, info.filename() }, { ModelList::DescriptionRole, info.description() }, { ModelList::UrlRole, info.url() }, { ModelList::LikesRole, info.likes() }, { ModelList::DownloadsRole, info.downloads() }, { ModelList::RecencyRole, info.recency() }, { ModelList::QuantRole, info.quant() }, { ModelList::TypeRole, info.type() }, }; updateData(info.id(), data); } // FIXME(jared): This should only contain fields without reasonable defaults such as name, description, and URL. // For other settings, there is no authoritative value and we should load the setting lazily like we do // for any other override. void ModelList::updateModelsFromSettings() { QSettings settings; QStringList groups = settings.childGroups(); for (const QString &g: groups) { if (!g.startsWith("model-")) continue; const QString id = g.sliced(6); if (contains(id)) continue; // If we can't find the corresponding file, then ignore it as this reflects a stale model. // The file could have been deleted manually by the user for instance or temporarily renamed. QString filename; { auto value = settings.value(u"%1/filename"_s.arg(g)); if (!value.isValid() || !modelExists(filename = value.toString())) continue; } QVector> data; // load data from base model // FIXME(jared): how does "Restore Defaults" work for other settings of clones which we don't do this for? 
        if (auto base = modelInfoByFilename(filename, /*allowClone*/ false); !base.id().isNull()) {
            if (auto tmpl = base.m_chatTemplate)
                data.append({ ModelList::ChatTemplateRole, *tmpl });
            if (auto msg = base.m_systemMessage; !msg.isNull())
                data.append({ ModelList::SystemMessageRole, msg });
        }

        addModel(id);

        // load data from settings
        if (settings.contains(g + "/name")) {
            const QString name = settings.value(g + "/name").toString();
            data.append({ ModelList::NameRole, name });
        }
        if (settings.contains(g + "/filename")) {
            const QString filename = settings.value(g + "/filename").toString();
            data.append({ ModelList::FilenameRole, filename });
        }
        if (settings.contains(g + "/description")) {
            const QString d = settings.value(g + "/description").toString();
            data.append({ ModelList::DescriptionRole, d });
        }
        if (settings.contains(g + "/url")) {
            const QString u = settings.value(g + "/url").toString();
            data.append({ ModelList::UrlRole, u });
        }
        if (settings.contains(g + "/quant")) {
            const QString q = settings.value(g + "/quant").toString();
            data.append({ ModelList::QuantRole, q });
        }
        if (settings.contains(g + "/type")) {
            const QString t = settings.value(g + "/type").toString();
            data.append({ ModelList::TypeRole, t });
        }
        if (settings.contains(g + "/isClone")) {
            const bool b = settings.value(g + "/isClone").toBool();
            data.append({ ModelList::IsCloneRole, b });
        }
        if (settings.contains(g + "/isDiscovered")) {
            const bool b = settings.value(g + "/isDiscovered").toBool();
            data.append({ ModelList::IsDiscoveredRole, b });
        }
        if (settings.contains(g + "/likes")) {
            const int l = settings.value(g + "/likes").toInt();
            data.append({ ModelList::LikesRole, l });
        }
        if (settings.contains(g + "/downloads")) {
            const int d = settings.value(g + "/downloads").toInt();
            data.append({ ModelList::DownloadsRole, d });
        }
        if (settings.contains(g + "/recency")) {
            const QDateTime r = settings.value(g + "/recency").toDateTime();
            data.append({ ModelList::RecencyRole, r });
        }
        if (settings.contains(g + "/temperature")) {
            const double temperature = settings.value(g + "/temperature").toDouble();
            data.append({ ModelList::TemperatureRole, temperature });
        }
        if (settings.contains(g + "/topP")) {
            const double topP = settings.value(g + "/topP").toDouble();
            data.append({ ModelList::TopPRole, topP });
        }
        if (settings.contains(g + "/minP")) {
            const double minP = settings.value(g + "/minP").toDouble();
            data.append({ ModelList::MinPRole, minP });
        }
        if (settings.contains(g + "/topK")) {
            const int topK = settings.value(g + "/topK").toInt();
            data.append({ ModelList::TopKRole, topK });
        }
        if (settings.contains(g + "/maxLength")) {
            const int maxLength = settings.value(g + "/maxLength").toInt();
            data.append({ ModelList::MaxLengthRole, maxLength });
        }
        if (settings.contains(g + "/promptBatchSize")) {
            const int promptBatchSize = settings.value(g + "/promptBatchSize").toInt();
            data.append({ ModelList::PromptBatchSizeRole, promptBatchSize });
        }
        if (settings.contains(g + "/contextLength")) {
            const int contextLength = settings.value(g + "/contextLength").toInt();
            data.append({ ModelList::ContextLengthRole, contextLength });
        }
        if (settings.contains(g + "/gpuLayers")) {
            const int gpuLayers = settings.value(g + "/gpuLayers").toInt();
            data.append({ ModelList::GpuLayersRole, gpuLayers });
        }
        if (settings.contains(g + "/repeatPenalty")) {
            const double repeatPenalty = settings.value(g + "/repeatPenalty").toDouble();
            data.append({ ModelList::RepeatPenaltyRole, repeatPenalty });
        }
        if (settings.contains(g + "/repeatPenaltyTokens")) {
            const int repeatPenaltyTokens = settings.value(g + "/repeatPenaltyTokens").toInt();
            data.append({ ModelList::RepeatPenaltyTokensRole, repeatPenaltyTokens });
        }
        if (settings.contains(g + "/chatNamePrompt")) {
            const QString chatNamePrompt = settings.value(g + "/chatNamePrompt").toString();
            data.append({ ModelList::ChatNamePromptRole, chatNamePrompt });
        }
        if (settings.contains(g + "/suggestedFollowUpPrompt")) {
            const QString suggestedFollowUpPrompt = settings.value(g + "/suggestedFollowUpPrompt").toString();
            data.append({ ModelList::SuggestedFollowUpPromptRole, suggestedFollowUpPrompt });
        }
        updateData(id, data);
    }
}

int ModelList::discoverLimit() const
{
    return m_discoverLimit;
}

void ModelList::setDiscoverLimit(int limit)
{
    if (m_discoverLimit == limit)
        return;
    m_discoverLimit = limit;
    emit discoverLimitChanged();
}

int ModelList::discoverSortDirection() const
{
    return m_discoverSortDirection;
}

void ModelList::setDiscoverSortDirection(int direction)
{
    if (m_discoverSortDirection == direction || (direction != 1 && direction != -1))
        return;
    m_discoverSortDirection = direction;
    emit discoverSortDirectionChanged();
    resortModel();
}

ModelList::DiscoverSort ModelList::discoverSort() const
{
    return m_discoverSort;
}

void ModelList::setDiscoverSort(DiscoverSort sort)
{
    if (m_discoverSort == sort)
        return;
    m_discoverSort = sort;
    emit discoverSortChanged();
    resortModel();
}

void ModelList::clearDiscoveredModels()
{
    // NOTE: This could be made much more efficient
    QList<ModelInfo> infos;
    {
        QMutexLocker locker(&m_mutex);
        for (ModelInfo *info : m_models)
            if (info->isDiscovered() && !info->installed)
                infos.append(*info);
    }
    for (ModelInfo &info : infos)
        removeInternal(info);
}

float ModelList::discoverProgress() const
{
    if (!m_discoverNumberOfResults)
        return 0.0f;
    return m_discoverResultsCompleted / float(m_discoverNumberOfResults);
}

bool ModelList::discoverInProgress() const
{
    return m_discoverInProgress;
}

void ModelList::discoverSearch(const QString &search)
{
    Q_ASSERT(!m_discoverInProgress);

    clearDiscoveredModels();

    m_discoverNumberOfResults = 0;
    m_discoverResultsCompleted = 0;
    emit discoverProgressChanged();

    if (search.isEmpty()) {
        return;
    }

    m_discoverInProgress = true;
    emit discoverInProgressChanged();

    static const QRegularExpression wsRegex("\\s+");
    QStringList searchParams = search.split(wsRegex); // split by whitespace
    QString searchString = u"search=%1&"_s.arg(searchParams.join('+'));
    QString limitString = m_discoverLimit > 0 ? u"limit=%1&"_s.arg(m_discoverLimit) : QString();

    QString sortString;
    switch (m_discoverSort) {
    case Default: break;
    case Likes: sortString = "sort=likes&"; break;
    case Downloads: sortString = "sort=downloads&"; break;
    case Recent: sortString = "sort=lastModified&"; break;
    }

    QString directionString = !sortString.isEmpty() ? u"direction=%1&"_s.arg(m_discoverSortDirection) : QString();

    QUrl hfUrl(u"https://huggingface.co/api/models?filter=gguf&%1%2%3%4full=true&config=true"_s
                   .arg(searchString, limitString, sortString, directionString));

    QNetworkRequest request(hfUrl);
    request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json");
    QNetworkReply *reply = m_networkManager.get(request);
    connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort);
    connect(reply, &QNetworkReply::finished, this, &ModelList::handleDiscoveryFinished);
    connect(reply, &QNetworkReply::errorOccurred, this, &ModelList::handleDiscoveryErrorOccurred);
}

void ModelList::handleDiscoveryFinished()
{
    QNetworkReply *jsonReply = qobject_cast<QNetworkReply *>(sender());
    if (!jsonReply)
        return;

    QByteArray jsonData = jsonReply->readAll();
    parseDiscoveryJsonFile(jsonData);
    jsonReply->deleteLater();
}

void ModelList::handleDiscoveryErrorOccurred(QNetworkReply::NetworkError code)
{
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;

    qWarning() << u"ERROR: Discovery failed with error code \"%1-%2\""_s
                      .arg(code).arg(reply->errorString()).toStdString();
}

enum QuantType { Q4_0 = 0, Q4_1, F16, F32, Unknown };

QuantType toQuantType(const QString &filename)
{
    QString lowerCaseFilename = filename.toLower();
    if (lowerCaseFilename.contains("q4_0")) return Q4_0;
    if (lowerCaseFilename.contains("q4_1")) return Q4_1;
    if (lowerCaseFilename.contains("f16")) return F16;
    if (lowerCaseFilename.contains("f32")) return F32;
    return Unknown;
}

QString toQuantString(const QString &filename)
{
    QString lowerCaseFilename = filename.toLower();
    if (lowerCaseFilename.contains("q4_0")) return "q4_0";
    if (lowerCaseFilename.contains("q4_1")) return "q4_1";
    if (lowerCaseFilename.contains("f16")) return "f16";
    if (lowerCaseFilename.contains("f32")) return "f32";
    return QString();
}

void ModelList::parseDiscoveryJsonFile(const QByteArray &jsonData)
{
    QJsonParseError err;
    QJsonDocument document = QJsonDocument::fromJson(jsonData, &err);
    if (err.error != QJsonParseError::NoError) {
        qWarning() << "ERROR: Couldn't parse: " << jsonData << err.errorString();
        m_discoverNumberOfResults = 0;
        m_discoverResultsCompleted = 0;
        emit discoverProgressChanged();
        m_discoverInProgress = false;
        emit discoverInProgressChanged();
        return;
    }

    QJsonArray jsonArray = document.array();
    for (const QJsonValue &value : jsonArray) {
        QJsonObject obj = value.toObject();
        QJsonDocument jsonDocument(obj);
        QByteArray jsonData = jsonDocument.toJson();

        QString repo_id = obj["id"].toString();
        QJsonArray siblingsArray = obj["siblings"].toArray();
        QList<QPair<QuantType, QString>> filteredAndSortedFilenames;
        for (const QJsonValue &sibling : siblingsArray) {
            QJsonObject s = sibling.toObject();
            QString filename = s["rfilename"].toString();
            if (!filename.endsWith("gguf"))
                continue;

            QuantType quant = toQuantType(filename);
            if (quant != Unknown)
                filteredAndSortedFilenames.append({ quant, filename });
        }

        if (filteredAndSortedFilenames.isEmpty())
            continue;

        std::sort(filteredAndSortedFilenames.begin(), filteredAndSortedFilenames.end(),
            [](const QPair<QuantType, QString> &a, const QPair<QuantType, QString> &b) {
                return a.first < b.first;
            });

        QPair<QuantType, QString> file = filteredAndSortedFilenames.first();
        QString filename = file.second;
        ++m_discoverNumberOfResults;

        QUrl url(u"https://huggingface.co/%1/resolve/main/%2"_s.arg(repo_id, filename));
        QNetworkRequest request(url);
        request.setRawHeader("Accept-Encoding", "identity");
        request.setAttribute(QNetworkRequest::RedirectPolicyAttribute, QNetworkRequest::ManualRedirectPolicy);
        request.setAttribute(QNetworkRequest::User, jsonData);
        request.setAttribute(QNetworkRequest::UserMax, filename);
        QNetworkReply *reply = m_networkManager.head(request);
        connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort);
        connect(reply, &QNetworkReply::finished, this, &ModelList::handleDiscoveryItemFinished);
        connect(reply, &QNetworkReply::errorOccurred, this, &ModelList::handleDiscoveryItemErrorOccurred);
    }

    emit discoverProgressChanged();
    if (!m_discoverNumberOfResults) {
        m_discoverInProgress = false;
        emit discoverInProgressChanged();
    }
}

void ModelList::handleDiscoveryItemFinished()
{
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;

    QVariant replyCustomData = reply->request().attribute(QNetworkRequest::User);
    QByteArray customDataByteArray = replyCustomData.toByteArray();
    QJsonDocument customJsonDocument = QJsonDocument::fromJson(customDataByteArray);
    QJsonObject obj = customJsonDocument.object();

    QString repo_id = obj["id"].toString();
    QString modelName = obj["modelId"].toString();
    QString author = obj["author"].toString();
    QDateTime lastModified = QDateTime::fromString(obj["lastModified"].toString(), Qt::ISODateWithMs);
    int likes = obj["likes"].toInt();
    int downloads = obj["downloads"].toInt();
    QJsonObject config = obj["config"].toObject();
    QString type = config["model_type"].toString();

    // QByteArray repoCommitHeader = reply->rawHeader("X-Repo-Commit");
    QByteArray linkedSizeHeader = reply->rawHeader("X-Linked-Size");
    QByteArray linkedEtagHeader = reply->rawHeader("X-Linked-Etag");
    // For some reason these seem to contain quotation marks ewww
    linkedEtagHeader.replace("\"", "");
    linkedEtagHeader.replace("\'", "");
    // QString locationHeader = reply->header(QNetworkRequest::LocationHeader).toString();

    QString modelFilename = reply->request().attribute(QNetworkRequest::UserMax).toString();
    QString modelFilesize = ModelList::toFileSize(QString(linkedSizeHeader).toULongLong());

    QString description = tr("<strong>Created by %1.</strong><br>"
                             "<ul><li>Published on %2.</li>"
                             "<li>This model has %3 likes.</li>"
                             "<li>This model has %4 downloads.</li>"
                             "<li>More info can be found <a href=\"https://huggingface.co/%5\">here</a>.</li></ul>")
                              .arg(author)
                              .arg(lastModified.toString("ddd MMMM d, yyyy"))
                              .arg(likes)
                              .arg(downloads)
                              .arg(repo_id);

    const QString id = modelFilename;
    Q_ASSERT(!id.isEmpty());

    if (contains(modelFilename))
        changeId(modelFilename, id);

    if (!contains(id))
        addModel(id);

    QVector<QPair<int, QVariant>> data {
        { ModelList::NameRole, modelName },
        { ModelList::FilenameRole, modelFilename },
        { ModelList::FilesizeRole, modelFilesize },
        { ModelList::DescriptionRole, description },
        { ModelList::IsDiscoveredRole, true },
        { ModelList::UrlRole, reply->request().url() },
        { ModelList::LikesRole, likes },
        { ModelList::DownloadsRole, downloads },
        { ModelList::RecencyRole, lastModified },
        { ModelList::QuantRole, toQuantString(modelFilename) },
        { ModelList::TypeRole, type },
        { ModelList::HashRole, linkedEtagHeader },
        { ModelList::HashAlgorithmRole, ModelInfo::Sha256 },
    };
    updateData(id, data);

    ++m_discoverResultsCompleted;
    emit discoverProgressChanged();

    if (discoverProgress() >= 1.0) {
        m_discoverInProgress = false;
        emit discoverInProgressChanged();
    }

    reply->deleteLater();
}

void ModelList::handleDiscoveryItemErrorOccurred(QNetworkReply::NetworkError code)
{
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;

    qWarning() << u"ERROR: Discovery item failed with error code \"%1-%2\""_s
                      .arg(code).arg(reply->errorString()).toStdString();
}

QStringList ModelList::remoteModelList(const QString &apiKey, const QUrl &baseUrl)
{
    QStringList modelList;

    // Create the request
    QNetworkRequest request;
    request.setUrl(baseUrl.resolved(QUrl("models")));
    request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json");

    // Add the Authorization header
    const QString bearerToken = QString("Bearer %1").arg(apiKey);
    request.setRawHeader("Authorization", bearerToken.toUtf8());

    // Make the GET request
    QNetworkReply *reply = m_networkManager.get(request);

    // We use a local event loop to wait for the request to complete
    QEventLoop loop;
    connect(reply, &QNetworkReply::finished, &loop,
&QEventLoop::quit); loop.exec(); // Check for errors if (reply->error() == QNetworkReply::NoError) { // Parse the JSON response const QByteArray responseData = reply->readAll(); const QJsonDocument jsonDoc = QJsonDocument::fromJson(responseData); if (!jsonDoc.isNull() && jsonDoc.isObject()) { QJsonObject rootObj = jsonDoc.object(); QJsonValue dataValue = rootObj.value("data"); if (dataValue.isArray()) { QJsonArray dataArray = dataValue.toArray(); for (const QJsonValue &val : dataArray) { if (val.isObject()) { QJsonObject obj = val.toObject(); const QString modelId = obj.value("id").toString(); modelList.append(modelId); } } } } } else { // Handle network error (e.g. print it to qDebug) qWarning() << "Error retrieving models:" << reply->errorString(); } // Clean up reply->deleteLater(); return modelList; } ================================================ FILE: gpt4all-chat/src/modellist.h ================================================ #ifndef MODELLIST_H #define MODELLIST_H #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include // IWYU pragma: keep #include // IWYU pragma: keep #include #include #include #include #include // IWYU pragma: keep #include #include #include #include // IWYU pragma: no_forward_declare QObject // IWYU pragma: no_forward_declare QSslError class QUrl; using namespace Qt::Literals::StringLiterals; class UpgradeableSetting { Q_GADGET QML_ANONYMOUS // NOTE: Unset implies there is neither a value nor a default enum class State { Unset, Legacy, Modern }; Q_PROPERTY(bool isSet READ isSet ) Q_PROPERTY(bool isLegacy READ isLegacy) Q_PROPERTY(bool isModern READ isModern) Q_PROPERTY(QVariant value READ value) // string or null public: struct legacy_tag_t { explicit legacy_tag_t() = default; }; static inline constexpr legacy_tag_t legacy_tag = legacy_tag_t(); UpgradeableSetting() : m_state(State::Unset ) {} UpgradeableSetting(legacy_tag_t, QString value): m_state(State::Legacy), 
m_value(std::move(value)) {} UpgradeableSetting( QString value): m_state(State::Modern), m_value(std::move(value)) {} bool isSet () const { return m_state != State::Unset; } bool isLegacy() const { return m_state == State::Legacy; } bool isModern() const { return m_state == State::Modern; } QVariant value () const { return m_state == State::Unset ? QVariant::fromValue(nullptr) : m_value; } friend bool operator==(const UpgradeableSetting &a, const UpgradeableSetting &b) { return a.m_state == b.m_state && (a.m_state == State::Unset || a.m_value == b.m_value); } // returns std::nullopt if there is a legacy template or it is not set std::optional asModern() const { if (m_state == State::Modern) return m_value; return std::nullopt; } private: State m_state; QString m_value; }; struct ModelInfo { Q_GADGET Q_PROPERTY(QString id READ id WRITE setId) Q_PROPERTY(QString name READ name WRITE setName) Q_PROPERTY(QString filename READ filename WRITE setFilename) Q_PROPERTY(QString dirpath MEMBER dirpath) Q_PROPERTY(QString filesize MEMBER filesize) Q_PROPERTY(QByteArray hash MEMBER hash) Q_PROPERTY(HashAlgorithm hashAlgorithm MEMBER hashAlgorithm) Q_PROPERTY(bool calcHash MEMBER calcHash) Q_PROPERTY(bool installed MEMBER installed) Q_PROPERTY(bool isDefault MEMBER isDefault) Q_PROPERTY(bool isOnline MEMBER isOnline) Q_PROPERTY(bool isCompatibleApi MEMBER isCompatibleApi) Q_PROPERTY(QString description READ description WRITE setDescription) Q_PROPERTY(QString requiresVersion MEMBER requiresVersion) Q_PROPERTY(QString versionRemoved MEMBER versionRemoved) Q_PROPERTY(QString url READ url WRITE setUrl) Q_PROPERTY(qint64 bytesReceived MEMBER bytesReceived) Q_PROPERTY(qint64 bytesTotal MEMBER bytesTotal) Q_PROPERTY(qint64 timestamp MEMBER timestamp) Q_PROPERTY(QString speed MEMBER speed) Q_PROPERTY(bool isDownloading MEMBER isDownloading) Q_PROPERTY(bool isIncomplete MEMBER isIncomplete) Q_PROPERTY(QString downloadError MEMBER downloadError) Q_PROPERTY(QString order MEMBER order) 
Q_PROPERTY(int ramrequired MEMBER ramrequired) Q_PROPERTY(QString parameters MEMBER parameters) Q_PROPERTY(QString quant READ quant WRITE setQuant) Q_PROPERTY(QString type READ type WRITE setType) Q_PROPERTY(bool isClone READ isClone WRITE setIsClone) Q_PROPERTY(bool isDiscovered READ isDiscovered WRITE setIsDiscovered) Q_PROPERTY(bool isEmbeddingModel MEMBER isEmbeddingModel) Q_PROPERTY(double temperature READ temperature WRITE setTemperature) Q_PROPERTY(double topP READ topP WRITE setTopP) Q_PROPERTY(double minP READ minP WRITE setMinP) Q_PROPERTY(int topK READ topK WRITE setTopK) Q_PROPERTY(int maxLength READ maxLength WRITE setMaxLength) Q_PROPERTY(int promptBatchSize READ promptBatchSize WRITE setPromptBatchSize) Q_PROPERTY(int contextLength READ contextLength WRITE setContextLength) Q_PROPERTY(int maxContextLength READ maxContextLength) Q_PROPERTY(int gpuLayers READ gpuLayers WRITE setGpuLayers) Q_PROPERTY(int maxGpuLayers READ maxGpuLayers) Q_PROPERTY(double repeatPenalty READ repeatPenalty WRITE setRepeatPenalty) Q_PROPERTY(int repeatPenaltyTokens READ repeatPenaltyTokens WRITE setRepeatPenaltyTokens) // user-defined chat template and system message must be written through settings because of their legacy compat Q_PROPERTY(QVariant defaultChatTemplate READ defaultChatTemplate ) Q_PROPERTY(UpgradeableSetting chatTemplate READ chatTemplate ) Q_PROPERTY(QString defaultSystemMessage READ defaultSystemMessage) Q_PROPERTY(UpgradeableSetting systemMessage READ systemMessage ) Q_PROPERTY(QString chatNamePrompt READ chatNamePrompt WRITE setChatNamePrompt) Q_PROPERTY(QString suggestedFollowUpPrompt READ suggestedFollowUpPrompt WRITE setSuggestedFollowUpPrompt) Q_PROPERTY(int likes READ likes WRITE setLikes) Q_PROPERTY(int downloads READ downloads WRITE setDownloads) Q_PROPERTY(QDateTime recency READ recency WRITE setRecency) public: enum HashAlgorithm { Md5, Sha256 }; QString id() const; void setId(const QString &id); QString name() const; void setName(const QString 
&name); QString filename() const; void setFilename(const QString &name); QString description() const; void setDescription(const QString &d); QString url() const; void setUrl(const QString &u); QString quant() const; void setQuant(const QString &q); QString type() const; void setType(const QString &t); bool isClone() const; void setIsClone(bool b); bool isDiscovered() const; void setIsDiscovered(bool b); int likes() const; void setLikes(int l); int downloads() const; void setDownloads(int d); QDateTime recency() const; void setRecency(const QDateTime &r); QString dirpath; QString filesize; QByteArray hash; HashAlgorithm hashAlgorithm; bool calcHash = false; bool installed = false; bool isDefault = false;
// Differences between 'isOnline' and 'isCompatibleApi' in ModelInfo:
// 'isOnline':
//   - Indicates whether this is an online model.
//   - Linked with the ModelList, fetching info from it.
bool isOnline = false;
// 'isCompatibleApi':
//   - Indicates whether the model uses a user-configured OpenAI-compatible API.
//   - When this property is true, 'isOnline' should also be true.
//   - Does not link to the ModelList directly; instead, fetches info from the *-capi.rmodel file and works standalone.
//   - Data from gpt4all.ini and *-capi.rmodel must still be copied into the in-memory ModelList at application startup (as is done for custom .gguf models).
bool isCompatibleApi = false; QString requiresVersion; QString versionRemoved; qint64 bytesReceived = 0; qint64 bytesTotal = 0; qint64 timestamp = 0; QString speed; bool isDownloading = false; bool isIncomplete = false; QString downloadError; QString order; int ramrequired = -1; QString parameters; bool isEmbeddingModel = false; bool checkedEmbeddingModel = false; bool operator==(const ModelInfo &other) const { return m_id == other.m_id; } double temperature() const; void setTemperature(double t); double topP() const; void setTopP(double p); double minP() const; void setMinP(double p); int topK() const; void setTopK(int k); int maxLength() const; void setMaxLength(int l); int promptBatchSize() const; void setPromptBatchSize(int s); int contextLength() const; void setContextLength(int l); int maxContextLength() const; int gpuLayers() const; void setGpuLayers(int l); int maxGpuLayers() const; double repeatPenalty() const; void setRepeatPenalty(double p); int repeatPenaltyTokens() const; void setRepeatPenaltyTokens(int t); QVariant defaultChatTemplate() const; UpgradeableSetting chatTemplate() const; QString defaultSystemMessage() const; UpgradeableSetting systemMessage() const; QString chatNamePrompt() const; void setChatNamePrompt(const QString &p); QString suggestedFollowUpPrompt() const; void setSuggestedFollowUpPrompt(const QString &p); // Some metadata must be saved to settings because it does not have a meaningful default from some other source. // This is useful for fields such as name, description, and URL. // It is true for any models that have not been installed from models.json. 
bool shouldSaveMetadata() const; private: QVariant getField(QLatin1StringView name) const; QString m_id; QString m_name; QString m_filename; QString m_description; QString m_url; QString m_quant; QString m_type; bool m_isClone = false; bool m_isDiscovered = false; int m_likes = -1; int m_downloads = -1; QDateTime m_recency; double m_temperature = 0.7; double m_topP = 0.4; double m_minP = 0.0; int m_topK = 40; int m_maxLength = 4096; int m_promptBatchSize = 128; int m_contextLength = 2048; mutable int m_maxContextLength = -1; int m_gpuLayers = 100; mutable int m_maxGpuLayers = -1; double m_repeatPenalty = 1.18; int m_repeatPenaltyTokens = 64; std::optional<QString> m_chatTemplate; mutable std::optional<QString> m_modelChatTemplate; QString m_systemMessage; QString m_chatNamePrompt = "Describe the above conversation. Your entire response must be three words or less."; QString m_suggestedFollowUpPrompt = "Suggest three very short factual follow-up questions that have not been answered yet or cannot be found inspired by the previous conversation and excerpts."; friend class MySettings; friend class ModelList; }; Q_DECLARE_METATYPE(ModelInfo) class InstalledModels : public QSortFilterProxyModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) public: explicit InstalledModels(QObject *parent, bool selectable = false); int count() const { return rowCount(); } Q_SIGNALS: void countChanged(); protected: bool filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const override; private: bool m_selectable; }; class GPT4AllDownloadableModels : public QSortFilterProxyModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) public: explicit GPT4AllDownloadableModels(QObject *parent); int count() const; Q_INVOKABLE void filter(const QVector<QString> &keywords); Q_SIGNALS: void countChanged(); protected: bool filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const override; private: QVector<QString> m_keywords; }; class HuggingFaceDownloadableModels : public
QSortFilterProxyModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) public: explicit HuggingFaceDownloadableModels(QObject *parent); int count() const; Q_INVOKABLE void discoverAndFilter(const QString &discover); Q_SIGNALS: void countChanged(); protected: bool filterAcceptsRow(int sourceRow, const QModelIndex &sourceParent) const override; private: int m_limit; QString m_discoverFilter; }; class ModelList : public QAbstractListModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) Q_PROPERTY(InstalledModels* installedModels READ installedModels NOTIFY installedModelsChanged) Q_PROPERTY(InstalledModels* selectableModels READ selectableModels NOTIFY selectableModelsChanged) Q_PROPERTY(GPT4AllDownloadableModels* gpt4AllDownloadableModels READ gpt4AllDownloadableModels CONSTANT) Q_PROPERTY(HuggingFaceDownloadableModels* huggingFaceDownloadableModels READ huggingFaceDownloadableModels CONSTANT) Q_PROPERTY(QList selectableModelList READ selectableModelList NOTIFY selectableModelListChanged) Q_PROPERTY(bool asyncModelRequestOngoing READ asyncModelRequestOngoing NOTIFY asyncModelRequestOngoingChanged) Q_PROPERTY(int discoverLimit READ discoverLimit WRITE setDiscoverLimit NOTIFY discoverLimitChanged) Q_PROPERTY(int discoverSortDirection READ discoverSortDirection WRITE setDiscoverSortDirection NOTIFY discoverSortDirectionChanged) Q_PROPERTY(DiscoverSort discoverSort READ discoverSort WRITE setDiscoverSort NOTIFY discoverSortChanged) Q_PROPERTY(float discoverProgress READ discoverProgress NOTIFY discoverProgressChanged) Q_PROPERTY(bool discoverInProgress READ discoverInProgress NOTIFY discoverInProgressChanged) public: static ModelList *globalInstance(); static QString compatibleModelNameHash(QUrl baseUrl, QString modelName); static QString compatibleModelFilename(QUrl baseUrl, QString modelName); enum DiscoverSort { Default, Likes, Downloads, Recent }; enum Roles { IdRole = Qt::UserRole + 1, NameRole, FilenameRole, DirpathRole, FilesizeRole, 
HashRole, HashAlgorithmRole, CalcHashRole, InstalledRole, DefaultRole, OnlineRole, CompatibleApiRole, DescriptionRole, RequiresVersionRole, VersionRemovedRole, UrlRole, BytesReceivedRole, BytesTotalRole, TimestampRole, SpeedRole, DownloadingRole, IncompleteRole, DownloadErrorRole, OrderRole, RamrequiredRole, ParametersRole, QuantRole, TypeRole, IsCloneRole, IsDiscoveredRole, IsEmbeddingModelRole, TemperatureRole, TopPRole, TopKRole, MaxLengthRole, PromptBatchSizeRole, ContextLengthRole, GpuLayersRole, RepeatPenaltyRole, RepeatPenaltyTokensRole, ChatTemplateRole, SystemMessageRole, ChatNamePromptRole, SuggestedFollowUpPromptRole, MinPRole, LikesRole, DownloadsRole, RecencyRole }; QHash roleNames() const override { QHash roles; roles[IdRole] = "id"; roles[NameRole] = "name"; roles[FilenameRole] = "filename"; roles[DirpathRole] = "dirpath"; roles[FilesizeRole] = "filesize"; roles[HashRole] = "hash"; roles[HashAlgorithmRole] = "hashAlgorithm"; roles[CalcHashRole] = "calcHash"; roles[InstalledRole] = "installed"; roles[DefaultRole] = "isDefault"; roles[OnlineRole] = "isOnline"; roles[CompatibleApiRole] = "isCompatibleApi"; roles[DescriptionRole] = "description"; roles[RequiresVersionRole] = "requiresVersion"; roles[VersionRemovedRole] = "versionRemoved"; roles[UrlRole] = "url"; roles[BytesReceivedRole] = "bytesReceived"; roles[BytesTotalRole] = "bytesTotal"; roles[TimestampRole] = "timestamp"; roles[SpeedRole] = "speed"; roles[DownloadingRole] = "isDownloading"; roles[IncompleteRole] = "isIncomplete"; roles[DownloadErrorRole] = "downloadError"; roles[OrderRole] = "order"; roles[RamrequiredRole] = "ramrequired"; roles[ParametersRole] = "parameters"; roles[QuantRole] = "quant"; roles[TypeRole] = "type"; roles[IsCloneRole] = "isClone"; roles[IsDiscoveredRole] = "isDiscovered"; roles[IsEmbeddingModelRole] = "isEmbeddingModel"; roles[TemperatureRole] = "temperature"; roles[TopPRole] = "topP"; roles[MinPRole] = "minP"; roles[TopKRole] = "topK"; roles[MaxLengthRole] = 
"maxLength"; roles[PromptBatchSizeRole] = "promptBatchSize"; roles[ContextLengthRole] = "contextLength"; roles[GpuLayersRole] = "gpuLayers"; roles[RepeatPenaltyRole] = "repeatPenalty"; roles[RepeatPenaltyTokensRole] = "repeatPenaltyTokens"; roles[ChatTemplateRole] = "chatTemplate"; roles[SystemMessageRole] = "systemMessage"; roles[ChatNamePromptRole] = "chatNamePrompt"; roles[SuggestedFollowUpPromptRole] = "suggestedFollowUpPrompt"; roles[LikesRole] = "likes"; roles[DownloadsRole] = "downloads"; roles[RecencyRole] = "recency"; return roles; } int rowCount(const QModelIndex &parent = QModelIndex()) const override; QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const override; QVariant data(const QString &id, int role) const; QVariant dataByFilename(const QString &filename, int role) const; void updateDataByFilename(const QString &filename, QVector> data); void updateData(const QString &id, const QVector> &data); int count() const { return m_models.size(); } bool contains(const QString &id) const; bool containsByFilename(const QString &filename) const; Q_INVOKABLE ModelInfo modelInfo(const QString &id) const; Q_INVOKABLE ModelInfo modelInfoByFilename(const QString &filename, bool allowClone = true) const; Q_INVOKABLE bool isUniqueName(const QString &name) const; Q_INVOKABLE QString clone(const ModelInfo &model); Q_INVOKABLE void removeClone(const ModelInfo &model); Q_INVOKABLE void removeInstalled(const ModelInfo &model); ModelInfo defaultModelInfo() const; void addModel(const QString &id); void changeId(const QString &oldId, const QString &newId); const QList selectableModelList() const; InstalledModels *installedModels() const { return m_installedModels; } InstalledModels *selectableModels() const { return m_selectableModels; } GPT4AllDownloadableModels *gpt4AllDownloadableModels() const { return m_gpt4AllDownloadableModels; } HuggingFaceDownloadableModels *huggingFaceDownloadableModels() const { return m_huggingFaceDownloadableModels; } static 
inline QString toFileSize(quint64 sz) { if (sz < 1024) { return u"%1 bytes"_s.arg(sz); } else if (sz < 1024 * 1024) { return u"%1 KB"_s.arg(qreal(sz) / 1024, 0, 'g', 3); } else if (sz < 1024 * 1024 * 1024) { return u"%1 MB"_s.arg(qreal(sz) / (1024 * 1024), 0, 'g', 3); } else { return u"%1 GB"_s.arg(qreal(sz) / (1024 * 1024 * 1024), 0, 'g', 3); } } QString incompleteDownloadPath(const QString &modelFile); bool asyncModelRequestOngoing() const { return m_asyncModelRequestOngoing; } void updateModelsFromDirectory(); void updateDiscoveredInstalled(const ModelInfo &info); int discoverLimit() const; void setDiscoverLimit(int limit); int discoverSortDirection() const; void setDiscoverSortDirection(int direction); // -1 or 1 DiscoverSort discoverSort() const; void setDiscoverSort(DiscoverSort sort); float discoverProgress() const; bool discoverInProgress() const; Q_INVOKABLE void discoverSearch(const QString &discover); Q_INVOKABLE QStringList remoteModelList(const QString &apiKey, const QUrl &baseUrl); Q_SIGNALS: void countChanged(); void installedModelsChanged(); void selectableModelsChanged(); void selectableModelListChanged(); void asyncModelRequestOngoingChanged(); void discoverLimitChanged(); void discoverSortDirectionChanged(); void discoverSortChanged(); void discoverProgressChanged(); void discoverInProgressChanged(); void modelInfoChanged(const ModelInfo &info); protected: bool eventFilter(QObject *obj, QEvent *ev) override; private Q_SLOTS: void onDataChanged(const QModelIndex &topLeft, const QModelIndex &bottomRight, const QList &roles); void resortModel(); void updateModelsFromJson(); void updateModelsFromJsonAsync(); void updateModelsFromSettings(); void maybeUpdateDataForSettings(const ModelInfo &info, bool fromInfo); void updateDataForSettings(); void handleModelsJsonDownloadFinished(); void handleModelsJsonDownloadErrorOccurred(QNetworkReply::NetworkError code); void handleDiscoveryFinished(); void handleDiscoveryErrorOccurred(QNetworkReply::NetworkError 
code); void handleDiscoveryItemFinished(); void handleDiscoveryItemErrorOccurred(QNetworkReply::NetworkError code); void handleSslErrors(QNetworkReply *reply, const QList &errors); private: // Return the index of the model with the given id, or -1 if not found. int indexByModelId(const QString &id) const; void removeInternal(const ModelInfo &model); void clearDiscoveredModels(); bool modelExists(const QString &fileName) const; int indexForModel(ModelInfo *model); QVariant dataInternal(const ModelInfo *info, int role) const; static bool lessThan(const ModelInfo* a, const ModelInfo* b, DiscoverSort s, int d); void parseModelsJsonFile(const QByteArray &jsonData, bool save); void parseDiscoveryJsonFile(const QByteArray &jsonData); QString uniqueModelName(const ModelInfo &model) const; void updateOldRemoteModels(const QString &path); void processModelDirectory(const QString &path); private: mutable QMutex m_mutex; QNetworkAccessManager m_networkManager; InstalledModels *m_installedModels; InstalledModels *m_selectableModels; GPT4AllDownloadableModels *m_gpt4AllDownloadableModels; HuggingFaceDownloadableModels *m_huggingFaceDownloadableModels; QList m_models; QHash m_modelMap; bool m_asyncModelRequestOngoing; int m_discoverLimit; int m_discoverSortDirection; DiscoverSort m_discoverSort; int m_discoverNumberOfResults; int m_discoverResultsCompleted; bool m_discoverInProgress; protected: explicit ModelList(); ~ModelList() override { for (auto *model: std::as_const(m_models)) { delete model; } } friend class MyModelList; }; #endif // MODELLIST_H ================================================ FILE: gpt4all-chat/src/mysettings.cpp ================================================ #include "mysettings.h" #include "chatllm.h" #include "modellist.h" #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include #include #include #include #include #if !(defined(Q_OS_MAC) && 
defined(__aarch64__))
#include
#endif

using namespace Qt::Literals::StringLiterals;

// used only for settings serialization, do not translate
static const QStringList suggestionModeNames { "LocalDocsOnly", "On", "Off" };
static const QStringList chatThemeNames { "Light", "Dark", "LegacyDark" };
static const QStringList fontSizeNames { "Small", "Medium", "Large" };

// pseudo-enum
namespace ModelSettingsKey { namespace {
auto ChatTemplate   = "chatTemplate"_L1;
auto PromptTemplate = "promptTemplate"_L1; // legacy
auto SystemMessage  = "systemMessage"_L1;
auto SystemPrompt   = "systemPrompt"_L1; // legacy
} } // namespace ModelSettingsKey::(anonymous)

namespace defaults {
static const int threadCount = std::min(4, (int32_t) std::thread::hardware_concurrency());
static const bool forceMetal = false;
static const bool networkIsActive = false;
static const bool networkUsageStatsActive = false;
static const QString device = "Auto";
static const QString languageAndLocale = "System Locale";
} // namespace defaults

static const QVariantMap basicDefaults {
    { "chatTheme", QVariant::fromValue(ChatTheme::Light) },
    { "fontSize", QVariant::fromValue(FontSize::Small) },
    { "lastVersionStarted", "" },
    { "networkPort", 4891, },
    { "systemTray", false },
    { "serverChat", false },
    { "userDefaultModel", "Application default" },
    { "suggestionMode", QVariant::fromValue(SuggestionMode::LocalDocsOnly) },
    { "localdocs/chunkSize", 512 },
    { "localdocs/retrievalSize", 3 },
    { "localdocs/showReferences", true },
    { "localdocs/fileExtensions", QStringList { "docx", "pdf", "txt", "md", "rst" } },
    { "localdocs/useRemoteEmbed", false },
    { "localdocs/nomicAPIKey", "" },
    { "network/attribution", "" },
};

static QString defaultLocalModelsPath()
{
    QString localPath = QStandardPaths::writableLocation(QStandardPaths::AppLocalDataLocation) + "/";
    QString testWritePath = localPath + u"test_write.txt"_s;
    QString canonicalLocalPath = QFileInfo(localPath).canonicalFilePath() + "/";
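The QStringList name tables above serialize enum values as readable strings in the settings file. A minimal sketch of the lookup-with-default pattern they support, using `std::vector` in place of QStringList (the helper name and fallback behavior here are illustrative, not the codebase's exact API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Map a stored enum name back to its index; fall back to a default
// index when the name is unknown (e.g. after an enum rename or a
// hand-edited settings file).
static int enumFromName(const std::vector<std::string> &names,
                        const std::string &stored, int defaultIdx)
{
    for (std::size_t i = 0; i < names.size(); ++i)
        if (names[i] == stored)
            return static_cast<int>(i);
    return defaultIdx;
}

static const std::vector<std::string> suggestionModeNames
    { "LocalDocsOnly", "On", "Off" };
```

getEnumSetting() later in this file follows the same shape, with QStringList::indexOf doing the scan and basicDefaults supplying the fallback.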
QDir localDir(localPath); if (!localDir.exists()) { if (!localDir.mkpath(localPath)) { qWarning() << "ERROR: Local download directory can't be created:" << canonicalLocalPath; return canonicalLocalPath; } } if (QFileInfo::exists(testWritePath)) return canonicalLocalPath; QFile testWriteFile(testWritePath); if (testWriteFile.open(QIODeviceBase::ReadWrite)) { testWriteFile.close(); return canonicalLocalPath; } qWarning() << "ERROR: Local download path appears not writeable:" << canonicalLocalPath; return canonicalLocalPath; } static QStringList getDevices(bool skipKompute = false) { QStringList deviceList; #if defined(Q_OS_MAC) && defined(__aarch64__) deviceList << "Metal"; #else std::vector devices = LLModel::Implementation::availableGPUDevices(); for (LLModel::GPUDevice &d : devices) { if (!skipKompute || strcmp(d.backend, "kompute")) deviceList << QString::fromStdString(d.selectionName()); } #endif deviceList << "CPU"; return deviceList; } static QString getUiLanguage(const QString directory, const QString fileName) { QTranslator translator; const QString filePath = directory + QDir::separator() + fileName; if (translator.load(filePath)) { const QString lang = fileName.mid(fileName.indexOf('_') + 1, fileName.lastIndexOf('.') - fileName.indexOf('_') - 1); return lang; } qDebug() << "ERROR: Failed to load translation file:" << filePath; return QString(); } static QStringList getUiLanguages(const QString &modelPath) { QStringList languageList; static const QStringList releasedLanguages = { "en_US", "it_IT", "zh_CN", "zh_TW", "es_MX", "pt_BR", "ro_RO" }; // Add the language translations from model path files first which is used by translation developers // to load translations in progress without having to rebuild all of GPT4All from source { const QDir dir(modelPath); const QStringList qmFiles = dir.entryList({"*.qm"}, QDir::Files); for (const QString &fileName : qmFiles) languageList << getUiLanguage(modelPath, fileName); } // Now add the internal language 
translations { const QDir dir(":/i18n"); const QStringList qmFiles = dir.entryList({"*.qm"}, QDir::Files); for (const QString &fileName : qmFiles) { const QString lang = getUiLanguage(":/i18n", fileName); if (!languageList.contains(lang) && releasedLanguages.contains(lang)) languageList.append(lang); } } return languageList; } static QString modelSettingName(const ModelInfo &info, auto &&name) { return u"model-%1/%2"_s.arg(info.id(), name); } class MyPrivateSettings: public MySettings { }; Q_GLOBAL_STATIC(MyPrivateSettings, settingsInstance) MySettings *MySettings::globalInstance() { return settingsInstance(); } MySettings::MySettings() : QObject(nullptr) , m_deviceList(getDevices()) , m_embeddingsDeviceList(getDevices(/*skipKompute*/ true)) , m_uiLanguages(getUiLanguages(modelPath())) { } QVariant MySettings::checkJinjaTemplateError(const QString &tmpl) { if (auto err = ChatLLM::checkJinjaTemplateError(tmpl.toStdString())) return QString::fromStdString(*err); return QVariant::fromValue(nullptr); } // Unset settings come from ModelInfo. Listen for changes so we can emit our own setting-specific signals. 
void MySettings::onModelInfoChanged(const QModelIndex &topLeft, const QModelIndex &bottomRight, const QList &roles) { auto settingChanged = [&](const auto &info, auto role, const auto &name) { return (roles.isEmpty() || roles.contains(role)) && !m_settings.contains(modelSettingName(info, name)); }; auto &modelList = dynamic_cast(*QObject::sender()); for (int row = topLeft.row(); row <= bottomRight.row(); row++) { using enum ModelList::Roles; using namespace ModelSettingsKey; auto index = topLeft.siblingAtRow(row); if (auto info = modelList.modelInfo(index.data(IdRole).toString()); !info.id().isNull()) { if (settingChanged(info, ChatTemplateRole, ChatTemplate)) emit chatTemplateChanged(info, /*fromInfo*/ true); if (settingChanged(info, SystemMessageRole, SystemMessage)) emit systemMessageChanged(info, /*fromInfo*/ true); } } } QVariant MySettings::getBasicSetting(const QString &name) const { return m_settings.value(name, basicDefaults.value(name)); } void MySettings::setBasicSetting(const QString &name, const QVariant &value, std::optional signal) { if (getBasicSetting(name) == value) return; m_settings.setValue(name, value); QMetaObject::invokeMethod(this, u"%1Changed"_s.arg(signal.value_or(name)).toLatin1().constData()); } int MySettings::getEnumSetting(const QString &setting, const QStringList &valueNames) const { int idx = valueNames.indexOf(getBasicSetting(setting).toString()); return idx != -1 ? 
idx : *reinterpret_cast(basicDefaults.value(setting).constData()); } void MySettings::restoreModelDefaults(const ModelInfo &info) { setModelTemperature(info, info.m_temperature); setModelTopP(info, info.m_topP); setModelMinP(info, info.m_minP); setModelTopK(info, info.m_topK); setModelMaxLength(info, info.m_maxLength); setModelPromptBatchSize(info, info.m_promptBatchSize); setModelContextLength(info, info.m_contextLength); setModelGpuLayers(info, info.m_gpuLayers); setModelRepeatPenalty(info, info.m_repeatPenalty); setModelRepeatPenaltyTokens(info, info.m_repeatPenaltyTokens); resetModelChatTemplate (info); resetModelSystemMessage(info); setModelChatNamePrompt(info, info.m_chatNamePrompt); setModelSuggestedFollowUpPrompt(info, info.m_suggestedFollowUpPrompt); } void MySettings::restoreApplicationDefaults() { setChatTheme(basicDefaults.value("chatTheme").value()); setFontSize(basicDefaults.value("fontSize").value()); setDevice(defaults::device); setThreadCount(defaults::threadCount); setSystemTray(basicDefaults.value("systemTray").toBool()); setServerChat(basicDefaults.value("serverChat").toBool()); setNetworkPort(basicDefaults.value("networkPort").toInt()); setModelPath(defaultLocalModelsPath()); setUserDefaultModel(basicDefaults.value("userDefaultModel").toString()); setForceMetal(defaults::forceMetal); setSuggestionMode(basicDefaults.value("suggestionMode").value()); setLanguageAndLocale(defaults::languageAndLocale); } void MySettings::restoreLocalDocsDefaults() { setLocalDocsChunkSize(basicDefaults.value("localdocs/chunkSize").toInt()); setLocalDocsRetrievalSize(basicDefaults.value("localdocs/retrievalSize").toInt()); setLocalDocsShowReferences(basicDefaults.value("localdocs/showReferences").toBool()); setLocalDocsFileExtensions(basicDefaults.value("localdocs/fileExtensions").toStringList()); setLocalDocsUseRemoteEmbed(basicDefaults.value("localdocs/useRemoteEmbed").toBool()); setLocalDocsNomicAPIKey(basicDefaults.value("localdocs/nomicAPIKey").toString()); 
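restoreLocalDocsDefaults() simply writes the basicDefaults values back; the read path, getBasicSetting(), is the mirror image. A sketch of that two-level lookup, with `std::map` standing in for both QSettings and the QVariantMap (an assumed simplification for illustration):

```cpp
#include <cassert>
#include <map>
#include <string>

// Two-level read: an explicit user setting wins, otherwise the value
// comes from the defaults table, otherwise an empty string.
using SettingsMap = std::map<std::string, std::string>;

static std::string getBasicSetting(const SettingsMap &userSettings,
                                   const SettingsMap &defaults,
                                   const std::string &name)
{
    if (auto it = userSettings.find(name); it != userSettings.end())
        return it->second;
    if (auto it = defaults.find(name); it != defaults.end())
        return it->second;
    return {};
}
```

Keeping defaults in one table (rather than scattering literals) is what lets the restore*Defaults() functions and the getters stay trivially consistent.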
setLocalDocsEmbedDevice(basicDefaults.value("localdocs/embedDevice").toString()); } void MySettings::eraseModel(const ModelInfo &info) { m_settings.remove(u"model-%1"_s.arg(info.id())); } QString MySettings::modelName(const ModelInfo &info) const { return m_settings.value(u"model-%1/name"_s.arg(info.id()), !info.m_name.isEmpty() ? info.m_name : info.m_filename).toString(); } void MySettings::setModelName(const ModelInfo &info, const QString &value, bool force) { if ((modelName(info) == value || info.id().isEmpty()) && !force) return; if ((info.m_name == value || info.m_filename == value) && !info.shouldSaveMetadata()) m_settings.remove(u"model-%1/name"_s.arg(info.id())); else m_settings.setValue(u"model-%1/name"_s.arg(info.id()), value); if (!force) emit nameChanged(info); } QVariant MySettings::getModelSetting(QLatin1StringView name, const ModelInfo &info) const { QLatin1StringView nameL1(name); return m_settings.value(modelSettingName(info, nameL1), info.getField(nameL1)); } QVariant MySettings::getModelSetting(const char *name, const ModelInfo &info) const { return getModelSetting(QLatin1StringView(name), info); } void MySettings::setModelSetting(QLatin1StringView name, const ModelInfo &info, const QVariant &value, bool force, bool signal) { if (!force && (info.id().isEmpty() || getModelSetting(name, info) == value)) return; QLatin1StringView nameL1(name); QString settingName = modelSettingName(info, nameL1); if (info.getField(nameL1) == value && !info.shouldSaveMetadata()) m_settings.remove(settingName); else m_settings.setValue(settingName, value); if (signal && !force) QMetaObject::invokeMethod(this, u"%1Changed"_s.arg(nameL1).toLatin1().constData(), Q_ARG(ModelInfo, info)); } void MySettings::setModelSetting(const char *name, const ModelInfo &info, const QVariant &value, bool force, bool signal) { setModelSetting(QLatin1StringView(name), info, value, force, signal); } QString MySettings::modelFilename (const ModelInfo &info) const { return 
getModelSetting("filename", info).toString(); } QString MySettings::modelDescription (const ModelInfo &info) const { return getModelSetting("description", info).toString(); } QString MySettings::modelUrl (const ModelInfo &info) const { return getModelSetting("url", info).toString(); } QString MySettings::modelQuant (const ModelInfo &info) const { return getModelSetting("quant", info).toString(); } QString MySettings::modelType (const ModelInfo &info) const { return getModelSetting("type", info).toString(); } bool MySettings::modelIsClone (const ModelInfo &info) const { return getModelSetting("isClone", info).toBool(); } bool MySettings::modelIsDiscovered (const ModelInfo &info) const { return getModelSetting("isDiscovered", info).toBool(); } int MySettings::modelLikes (const ModelInfo &info) const { return getModelSetting("likes", info).toInt(); } int MySettings::modelDownloads (const ModelInfo &info) const { return getModelSetting("downloads", info).toInt(); } QDateTime MySettings::modelRecency (const ModelInfo &info) const { return getModelSetting("recency", info).toDateTime(); } double MySettings::modelTemperature (const ModelInfo &info) const { return getModelSetting("temperature", info).toDouble(); } double MySettings::modelTopP (const ModelInfo &info) const { return getModelSetting("topP", info).toDouble(); } double MySettings::modelMinP (const ModelInfo &info) const { return getModelSetting("minP", info).toDouble(); } int MySettings::modelTopK (const ModelInfo &info) const { return getModelSetting("topK", info).toInt(); } int MySettings::modelMaxLength (const ModelInfo &info) const { return getModelSetting("maxLength", info).toInt(); } int MySettings::modelPromptBatchSize (const ModelInfo &info) const { return getModelSetting("promptBatchSize", info).toInt(); } int MySettings::modelContextLength (const ModelInfo &info) const { return getModelSetting("contextLength", info).toInt(); } int MySettings::modelGpuLayers (const ModelInfo &info) const { return 
getModelSetting("gpuLayers", info).toInt(); } double MySettings::modelRepeatPenalty (const ModelInfo &info) const { return getModelSetting("repeatPenalty", info).toDouble(); } int MySettings::modelRepeatPenaltyTokens (const ModelInfo &info) const { return getModelSetting("repeatPenaltyTokens", info).toInt(); } QString MySettings::modelChatNamePrompt (const ModelInfo &info) const { return getModelSetting("chatNamePrompt", info).toString(); } QString MySettings::modelSuggestedFollowUpPrompt(const ModelInfo &info) const { return getModelSetting("suggestedFollowUpPrompt", info).toString(); } auto MySettings::getUpgradeableModelSetting( const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey ) const -> UpgradeableSetting { if (info.id().isEmpty()) { qWarning("%s: got null model", Q_FUNC_INFO); return {}; } auto value = m_settings.value(modelSettingName(info, legacyKey)); if (value.isValid()) return { UpgradeableSetting::legacy_tag, value.toString() }; value = getModelSetting(newKey, info); if (!value.isNull()) return value.toString(); return {}; // neither a default nor an override } bool MySettings::isUpgradeableModelSettingSet( const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey ) const { if (info.id().isEmpty()) { qWarning("%s: got null model", Q_FUNC_INFO); return false; } if (m_settings.contains(modelSettingName(info, legacyKey))) return true; // NOTE: unlike getUpgradeableSetting(), this ignores the default return m_settings.contains(modelSettingName(info, newKey)); } auto MySettings::modelChatTemplate(const ModelInfo &info) const -> UpgradeableSetting { using namespace ModelSettingsKey; return getUpgradeableModelSetting(info, PromptTemplate, ChatTemplate); } bool MySettings::isModelChatTemplateSet(const ModelInfo &info) const { using namespace ModelSettingsKey; return isUpgradeableModelSettingSet(info, PromptTemplate, ChatTemplate); } auto MySettings::modelSystemMessage(const ModelInfo &info) const -> 
UpgradeableSetting
{
    using namespace ModelSettingsKey;
    return getUpgradeableModelSetting(info, SystemPrompt, SystemMessage);
}

bool MySettings::isModelSystemMessageSet(const ModelInfo &info) const
{
    using namespace ModelSettingsKey;
    return isUpgradeableModelSettingSet(info, SystemPrompt, SystemMessage);
}

void MySettings::setModelFilename(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("filename", info, value, force, true);
}

void MySettings::setModelDescription(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("description", info, value, force, true);
}

void MySettings::setModelUrl(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("url", info, value, force);
}

void MySettings::setModelQuant(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("quant", info, value, force);
}

void MySettings::setModelType(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("type", info, value, force);
}

void MySettings::setModelIsClone(const ModelInfo &info, bool value, bool force)
{
    setModelSetting("isClone", info, value, force);
}

void MySettings::setModelIsDiscovered(const ModelInfo &info, bool value, bool force)
{
    setModelSetting("isDiscovered", info, value, force);
}

void MySettings::setModelLikes(const ModelInfo &info, int value, bool force)
{
    setModelSetting("likes", info, value, force);
}

void MySettings::setModelDownloads(const ModelInfo &info, int value, bool force)
{
    setModelSetting("downloads", info, value, force);
}

void MySettings::setModelRecency(const ModelInfo &info, const QDateTime &value, bool force)
{
    setModelSetting("recency", info, value, force);
}

void MySettings::setModelTemperature(const ModelInfo &info, double value, bool force)
{
    setModelSetting("temperature", info, value, force, true);
}

void MySettings::setModelTopP(const ModelInfo &info, double value, bool force)
{
    setModelSetting("topP", info, value, force, true);
}

void MySettings::setModelMinP(const ModelInfo &info, double value, bool force)
{
    setModelSetting("minP", info, value, force, true);
}

void MySettings::setModelTopK(const ModelInfo &info, int value, bool force)
{
    setModelSetting("topK", info, value, force, true);
}

void MySettings::setModelMaxLength(const ModelInfo &info, int value, bool force)
{
    setModelSetting("maxLength", info, value, force, true);
}

void MySettings::setModelPromptBatchSize(const ModelInfo &info, int value, bool force)
{
    setModelSetting("promptBatchSize", info, value, force, true);
}

void MySettings::setModelContextLength(const ModelInfo &info, int value, bool force)
{
    setModelSetting("contextLength", info, value, force, true);
}

void MySettings::setModelGpuLayers(const ModelInfo &info, int value, bool force)
{
    setModelSetting("gpuLayers", info, value, force, true);
}

void MySettings::setModelRepeatPenalty(const ModelInfo &info, double value, bool force)
{
    setModelSetting("repeatPenalty", info, value, force, true);
}

void MySettings::setModelRepeatPenaltyTokens(const ModelInfo &info, int value, bool force)
{
    setModelSetting("repeatPenaltyTokens", info, value, force, true);
}

bool MySettings::setUpgradeableModelSetting(
    const ModelInfo &info, const QString &value, QLatin1StringView legacyKey, QLatin1StringView newKey
) {
    if (info.id().isEmpty()) {
        qWarning("%s: got null model", Q_FUNC_INFO);
        return false;
    }
    auto legacyModelKey = modelSettingName(info, legacyKey);
    auto newModelKey    = modelSettingName(info, newKey   );
    bool changed = false;
    if (m_settings.contains(legacyModelKey)) {
        m_settings.remove(legacyModelKey);
        changed = true;
    }
    auto oldValue = m_settings.value(newModelKey);
    if (!oldValue.isValid() || oldValue.toString() != value) {
        m_settings.setValue(newModelKey, value);
        changed = true;
    }
    return changed;
}

bool MySettings::resetUpgradeableModelSetting(
    const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey
) {
    if (info.id().isEmpty()) {
        qWarning("%s: got null model", Q_FUNC_INFO);
        return false;
    }
    auto legacyModelKey = modelSettingName(info, legacyKey);
    auto newModelKey    = modelSettingName(info, newKey   );
    bool changed = false;
    if (m_settings.contains(legacyModelKey)) {
        m_settings.remove(legacyModelKey);
        changed = true;
    }
    if (m_settings.contains(newModelKey)) {
        m_settings.remove(newModelKey);
        changed = true;
    }
    return changed;
}

void MySettings::setModelChatTemplate(const ModelInfo &info, const QString &value)
{
    using namespace ModelSettingsKey;
    if (setUpgradeableModelSetting(info, value, PromptTemplate, ChatTemplate))
        emit chatTemplateChanged(info);
}

void MySettings::resetModelChatTemplate(const ModelInfo &info)
{
    using namespace ModelSettingsKey;
    if (resetUpgradeableModelSetting(info, PromptTemplate, ChatTemplate))
        emit chatTemplateChanged(info);
}

void MySettings::setModelSystemMessage(const ModelInfo &info, const QString &value)
{
    using namespace ModelSettingsKey;
    if (setUpgradeableModelSetting(info, value, SystemPrompt, SystemMessage))
        emit systemMessageChanged(info);
}

void MySettings::resetModelSystemMessage(const ModelInfo &info)
{
    using namespace ModelSettingsKey;
    if (resetUpgradeableModelSetting(info, SystemPrompt, SystemMessage))
        emit systemMessageChanged(info);
}

void MySettings::setModelChatNamePrompt(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("chatNamePrompt", info, value, force, true);
}

void MySettings::setModelSuggestedFollowUpPrompt(const ModelInfo &info, const QString &value, bool force)
{
    setModelSetting("suggestedFollowUpPrompt", info, value, force, true);
}

int MySettings::threadCount() const
{
    int c = m_settings.value("threadCount", defaults::threadCount).toInt();
    // The old thread setting likely left many people with 0 in the settings config file, which means
    // we should reset it to the default going forward
    if (c <= 0)
        c = defaults::threadCount;
    c = std::max(c, 1);
    c = std::min(c, QThread::idealThreadCount());
    return c;
}

void MySettings::setThreadCount(int value)
{
    if (threadCount() == value)
        return;
    value = std::max(value, 1);
    value = std::min(value, QThread::idealThreadCount());
    m_settings.setValue("threadCount", value);
    emit threadCountChanged();
}

bool MySettings::systemTray() const { return getBasicSetting("systemTray").toBool(); }
bool MySettings::serverChat() const { return getBasicSetting("serverChat").toBool(); }
int MySettings::networkPort() const { return getBasicSetting("networkPort").toInt(); }
QString MySettings::userDefaultModel() const { return getBasicSetting("userDefaultModel").toString(); }
QString MySettings::lastVersionStarted() const { return getBasicSetting("lastVersionStarted").toString(); }
int MySettings::localDocsChunkSize() const { return getBasicSetting("localdocs/chunkSize").toInt(); }
int MySettings::localDocsRetrievalSize() const { return getBasicSetting("localdocs/retrievalSize").toInt(); }
bool MySettings::localDocsShowReferences() const { return getBasicSetting("localdocs/showReferences").toBool(); }
QStringList MySettings::localDocsFileExtensions() const { return getBasicSetting("localdocs/fileExtensions").toStringList(); }
bool MySettings::localDocsUseRemoteEmbed() const { return getBasicSetting("localdocs/useRemoteEmbed").toBool(); }
QString MySettings::localDocsNomicAPIKey() const { return getBasicSetting("localdocs/nomicAPIKey").toString(); }
QString MySettings::localDocsEmbedDevice() const { return getBasicSetting("localdocs/embedDevice").toString(); }
QString MySettings::networkAttribution() const { return getBasicSetting("network/attribution").toString(); }

ChatTheme MySettings::chatTheme() const { return ChatTheme(getEnumSetting("chatTheme", chatThemeNames)); }
FontSize MySettings::fontSize() const { return FontSize(getEnumSetting("fontSize", fontSizeNames)); }
SuggestionMode MySettings::suggestionMode() const { return SuggestionMode(getEnumSetting("suggestionMode", suggestionModeNames)); }

void MySettings::setSystemTray(bool value) { setBasicSetting("systemTray", value); }
void MySettings::setServerChat(bool value) { setBasicSetting("serverChat", value); }
void MySettings::setNetworkPort(int value) { setBasicSetting("networkPort", value); }
void MySettings::setUserDefaultModel(const QString &value) { setBasicSetting("userDefaultModel", value); }
void MySettings::setLastVersionStarted(const QString &value) { setBasicSetting("lastVersionStarted", value); }
void MySettings::setLocalDocsChunkSize(int value) { setBasicSetting("localdocs/chunkSize", value, "localDocsChunkSize"); }
void MySettings::setLocalDocsRetrievalSize(int value) { setBasicSetting("localdocs/retrievalSize", value, "localDocsRetrievalSize"); }
void MySettings::setLocalDocsShowReferences(bool value) { setBasicSetting("localdocs/showReferences", value, "localDocsShowReferences"); }
void MySettings::setLocalDocsFileExtensions(const QStringList &value) { setBasicSetting("localdocs/fileExtensions", value, "localDocsFileExtensions"); }
void MySettings::setLocalDocsUseRemoteEmbed(bool value) { setBasicSetting("localdocs/useRemoteEmbed", value, "localDocsUseRemoteEmbed"); }
void MySettings::setLocalDocsNomicAPIKey(const QString &value) { setBasicSetting("localdocs/nomicAPIKey", value, "localDocsNomicAPIKey"); }
void MySettings::setLocalDocsEmbedDevice(const QString &value) { setBasicSetting("localdocs/embedDevice", value, "localDocsEmbedDevice"); }
void MySettings::setNetworkAttribution(const QString &value) { setBasicSetting("network/attribution", value, "networkAttribution"); }

void MySettings::setChatTheme(ChatTheme value) { setBasicSetting("chatTheme", chatThemeNames.value(int(value))); }
void MySettings::setFontSize(FontSize value) { setBasicSetting("fontSize", fontSizeNames.value(int(value))); }
void MySettings::setSuggestionMode(SuggestionMode value) { setBasicSetting("suggestionMode", suggestionModeNames.value(int(value))); }

QString MySettings::modelPath()
{
    // We have to migrate the old setting because I changed the setting key recklessly in v2.4.11,
    // which broke a lot of existing installs
    const bool containsOldSetting = m_settings.contains("modelPaths");
    if (containsOldSetting) {
        const bool containsNewSetting = m_settings.contains("modelPath");
        if (!containsNewSetting)
            m_settings.setValue("modelPath", m_settings.value("modelPaths"));
        m_settings.remove("modelPaths");
    }
    return m_settings.value("modelPath", defaultLocalModelsPath()).toString();
}

void MySettings::setModelPath(const QString &value)
{
    QString filePath = (value.startsWith("file://") ? QUrl(value).toLocalFile() : value);
    QString canonical = QFileInfo(filePath).canonicalFilePath() + "/";
    if (modelPath() == canonical)
        return;
    m_settings.setValue("modelPath", canonical);
    emit modelPathChanged();
}

QString MySettings::device()
{
    auto value = m_settings.value("device");
    if (!value.isValid())
        return defaults::device;

    auto device = value.toString();
    if (!device.isEmpty()) {
        auto deviceStr = device.toStdString();
        auto newNameStr = LLModel::GPUDevice::updateSelectionName(deviceStr);
        if (newNameStr != deviceStr) {
            auto newName = QString::fromStdString(newNameStr);
            qWarning() << "updating device name:" << device << "->" << newName;
            device = newName;
            m_settings.setValue("device", device);
        }
    }
    return device;
}

void MySettings::setDevice(const QString &value)
{
    if (device() != value) {
        m_settings.setValue("device", value);
        emit deviceChanged();
    }
}

bool MySettings::forceMetal() const
{
    return m_forceMetal;
}

void MySettings::setForceMetal(bool value)
{
    if (m_forceMetal != value) {
        m_forceMetal = value;
        emit forceMetalChanged(value);
    }
}

bool MySettings::networkIsActive() const
{
    return m_settings.value("network/isActive", defaults::networkIsActive).toBool();
}

bool MySettings::isNetworkIsActiveSet() const
{
    return m_settings.value("network/isActive").isValid();
}

void MySettings::setNetworkIsActive(bool value)
{
    auto cur = m_settings.value("network/isActive");
    if (!cur.isValid() || cur.toBool() != value) {
        m_settings.setValue("network/isActive", value);
        emit networkIsActiveChanged();
    }
}

bool MySettings::networkUsageStatsActive() const
{
    return m_settings.value("network/usageStatsActive", defaults::networkUsageStatsActive).toBool();
}

bool MySettings::isNetworkUsageStatsActiveSet() const
{
    return m_settings.value("network/usageStatsActive").isValid();
}

void MySettings::setNetworkUsageStatsActive(bool value)
{
    auto cur = m_settings.value("network/usageStatsActive");
    if (!cur.isValid() || cur.toBool() != value) {
        m_settings.setValue("network/usageStatsActive", value);
        emit networkUsageStatsActiveChanged();
    }
}

QString MySettings::languageAndLocale() const
{
    auto value = m_settings.value("languageAndLocale");
    if (!value.isValid())
        return defaults::languageAndLocale;
    return value.toString();
}

QString MySettings::filePathForLocale(const QLocale &locale)
{
    // Check and see if we have a translation for the chosen locale and use it if possible; otherwise
    // we return the filepath for the 'en_US' translation
    QStringList uiLanguages = locale.uiLanguages();
    for (int i = 0; i < uiLanguages.size(); ++i)
        uiLanguages[i].replace('-', '_');

    // Scan this directory for files named like gpt4all_%1.qm that match, and if so return them first.
    // This is the model download directory, and it can be used by translation developers who are
    // trying to test their translations by just compiling the translation with the lrelease tool
    // rather than having to recompile all of GPT4All
    QString directory = modelPath();
    for (const QString &bcp47Name : uiLanguages) {
        QString filePath = u"%1/gpt4all_%2.qm"_s.arg(directory, bcp47Name);
        QFileInfo filePathInfo(filePath);
        if (filePathInfo.exists())
            return filePath;
    }

    // Now scan the internal built-in translations
    for (QString bcp47Name : uiLanguages) {
        QString filePath = u":/i18n/gpt4all_%1.qm"_s.arg(bcp47Name);
        QFileInfo filePathInfo(filePath);
        if (filePathInfo.exists())
            return filePath;
    }
    return u":/i18n/gpt4all_en_US.qm"_s;
}

void MySettings::setLanguageAndLocale(const QString &bcp47Name)
{
    if (!bcp47Name.isEmpty() && languageAndLocale() != bcp47Name)
        m_settings.setValue("languageAndLocale", bcp47Name);

    // When the app is started, this method is called with no bcp47Name given, which sets the translation
    // to either the default (the system locale) or the one explicitly set by the user previously.
    QLocale locale;
    const QString l = languageAndLocale();
    if (l == "System Locale")
        locale = QLocale::system();
    else
        locale = QLocale(l);

    // If we previously installed a translator, then remove it
    if (m_translator) {
        if (!qGuiApp->removeTranslator(m_translator.get())) {
            qDebug() << "ERROR: Failed to remove the previous translator";
        } else {
            m_translator.reset();
        }
    }

    // We expect that the translator was removed and is now a nullptr
    Q_ASSERT(!m_translator);

    const QString filePath = filePathForLocale(locale);
    if (!m_translator) {
        // Create a new translator object on the heap
        m_translator = std::make_unique<QTranslator>(this);
        bool success = m_translator->load(filePath);
        Q_ASSERT(success);
        if (!success) {
            qDebug() << "ERROR: Failed to load translation file:" << filePath;
            m_translator.reset();
        }

        // If we've successfully loaded it, then try and install it
        if (!qGuiApp->installTranslator(m_translator.get())) {
            qDebug() << "ERROR: Failed to install the translator:" << filePath;
            m_translator.reset();
        }
    }

    // Finally, set the locale whether we have a translation or not
    QLocale::setDefault(locale);
    emit languageAndLocaleChanged();
}

================================================
FILE: gpt4all-chat/src/mysettings.h
================================================
#ifndef MYSETTINGS_H
#define MYSETTINGS_H

#include "modellist.h" // IWYU pragma: keep

#include
#include // IWYU pragma: keep
#include
#include
#include
#include
#include
#include // IWYU pragma: keep
#include
#include
#include
#include
#include

// IWYU pragma: no_forward_declare QModelIndex
class QLocale;

namespace MySettingsEnums {
Q_NAMESPACE

/* NOTE: values of these enums are used as indices for the corresponding combo
 * boxes in ApplicationSettings.qml, as well as the corresponding name lists in mysettings.cpp */
enum class SuggestionMode {
    LocalDocsOnly = 0,
    On            = 1,
    Off           = 2,
};
Q_ENUM_NS(SuggestionMode)

enum class ChatTheme {
    Light      = 0,
    Dark       = 1,
    LegacyDark = 2,
};
Q_ENUM_NS(ChatTheme)

enum class FontSize {
    Small  = 0,
    Medium = 1,
    Large  = 2,
};
Q_ENUM_NS(FontSize)
}
using namespace MySettingsEnums;

class MySettings : public QObject
{
    Q_OBJECT
    Q_PROPERTY(int threadCount READ threadCount WRITE setThreadCount NOTIFY threadCountChanged)
    Q_PROPERTY(bool systemTray READ systemTray WRITE setSystemTray NOTIFY systemTrayChanged)
    Q_PROPERTY(bool serverChat READ serverChat WRITE setServerChat NOTIFY serverChatChanged)
    Q_PROPERTY(QString modelPath READ modelPath WRITE setModelPath NOTIFY modelPathChanged)
    Q_PROPERTY(QString userDefaultModel READ userDefaultModel WRITE setUserDefaultModel NOTIFY userDefaultModelChanged)
    Q_PROPERTY(ChatTheme chatTheme READ chatTheme WRITE setChatTheme NOTIFY chatThemeChanged)
    Q_PROPERTY(FontSize fontSize READ fontSize WRITE setFontSize NOTIFY fontSizeChanged)
    Q_PROPERTY(QString languageAndLocale READ languageAndLocale WRITE setLanguageAndLocale NOTIFY languageAndLocaleChanged)
    Q_PROPERTY(bool forceMetal READ forceMetal WRITE setForceMetal NOTIFY forceMetalChanged)
    Q_PROPERTY(QString lastVersionStarted READ lastVersionStarted WRITE setLastVersionStarted NOTIFY lastVersionStartedChanged)
    Q_PROPERTY(int localDocsChunkSize READ localDocsChunkSize WRITE setLocalDocsChunkSize NOTIFY localDocsChunkSizeChanged)
    Q_PROPERTY(int localDocsRetrievalSize READ localDocsRetrievalSize WRITE setLocalDocsRetrievalSize NOTIFY localDocsRetrievalSizeChanged)
    Q_PROPERTY(bool localDocsShowReferences READ localDocsShowReferences WRITE setLocalDocsShowReferences NOTIFY localDocsShowReferencesChanged)
    Q_PROPERTY(QStringList localDocsFileExtensions READ localDocsFileExtensions WRITE setLocalDocsFileExtensions NOTIFY localDocsFileExtensionsChanged)
    Q_PROPERTY(bool localDocsUseRemoteEmbed READ localDocsUseRemoteEmbed WRITE setLocalDocsUseRemoteEmbed NOTIFY localDocsUseRemoteEmbedChanged)
    Q_PROPERTY(QString localDocsNomicAPIKey READ localDocsNomicAPIKey WRITE setLocalDocsNomicAPIKey NOTIFY localDocsNomicAPIKeyChanged)
    Q_PROPERTY(QString localDocsEmbedDevice READ localDocsEmbedDevice WRITE setLocalDocsEmbedDevice NOTIFY localDocsEmbedDeviceChanged)
    Q_PROPERTY(QString networkAttribution READ networkAttribution WRITE setNetworkAttribution NOTIFY networkAttributionChanged)
    Q_PROPERTY(bool networkIsActive READ networkIsActive WRITE setNetworkIsActive NOTIFY networkIsActiveChanged)
    Q_PROPERTY(bool networkUsageStatsActive READ networkUsageStatsActive WRITE setNetworkUsageStatsActive NOTIFY networkUsageStatsActiveChanged)
    Q_PROPERTY(QString device READ device WRITE setDevice NOTIFY deviceChanged)
    Q_PROPERTY(QStringList deviceList MEMBER m_deviceList CONSTANT)
    Q_PROPERTY(QStringList embeddingsDeviceList MEMBER m_embeddingsDeviceList CONSTANT)
    Q_PROPERTY(int networkPort READ networkPort WRITE setNetworkPort NOTIFY networkPortChanged)
    Q_PROPERTY(SuggestionMode suggestionMode READ suggestionMode WRITE setSuggestionMode NOTIFY suggestionModeChanged)
    Q_PROPERTY(QStringList uiLanguages MEMBER m_uiLanguages CONSTANT)

private:
    explicit MySettings();
    ~MySettings() override = default;

public Q_SLOTS:
    void onModelInfoChanged(const QModelIndex &topLeft, const QModelIndex &bottomRight,
                            const QList<int> &roles = {});

public:
    static MySettings *globalInstance();

    Q_INVOKABLE static QVariant checkJinjaTemplateError(const QString &tmpl);

    // Restore methods
    Q_INVOKABLE void restoreModelDefaults(const ModelInfo &info);
    Q_INVOKABLE void restoreApplicationDefaults();
    Q_INVOKABLE void restoreLocalDocsDefaults();

    // Model/Character settings
    void eraseModel(const ModelInfo &info);
    QString modelName(const ModelInfo &info) const;
    Q_INVOKABLE void setModelName(const ModelInfo &info, const QString &name, bool force = false);
    QString modelFilename(const ModelInfo &info) const;
    Q_INVOKABLE void setModelFilename(const ModelInfo &info, const QString &filename, bool force = false);
    QString modelDescription(const ModelInfo &info) const;
    void setModelDescription(const ModelInfo &info, const QString &value, bool force = false);
    QString modelUrl(const ModelInfo &info) const;
    void setModelUrl(const ModelInfo &info, const QString &value, bool force = false);
    QString modelQuant(const ModelInfo &info) const;
    void setModelQuant(const ModelInfo &info, const QString &value, bool force = false);
    QString modelType(const ModelInfo &info) const;
    void setModelType(const ModelInfo &info, const QString &value, bool force = false);
    bool modelIsClone(const ModelInfo &info) const;
    void setModelIsClone(const ModelInfo &info, bool value, bool force = false);
    bool modelIsDiscovered(const ModelInfo &info) const;
    void setModelIsDiscovered(const ModelInfo &info, bool value, bool force = false);
    int modelLikes(const ModelInfo &info) const;
    void setModelLikes(const ModelInfo &info, int value, bool force = false);
    int modelDownloads(const ModelInfo &info) const;
    void setModelDownloads(const ModelInfo &info, int value, bool force = false);
    QDateTime modelRecency(const ModelInfo &info) const;
    void setModelRecency(const ModelInfo &info, const QDateTime &value, bool force = false);
    double modelTemperature(const ModelInfo &info) const;
    Q_INVOKABLE void setModelTemperature(const ModelInfo &info, double value, bool force = false);
    double modelTopP(const ModelInfo &info) const;
    Q_INVOKABLE void setModelTopP(const ModelInfo &info, double value, bool force = false);
    double modelMinP(const ModelInfo &info) const;
    Q_INVOKABLE void setModelMinP(const ModelInfo &info, double value, bool force = false);
    int modelTopK(const ModelInfo &info) const;
    Q_INVOKABLE void setModelTopK(const ModelInfo &info, int value, bool force = false);
    int modelMaxLength(const ModelInfo &info) const;
    Q_INVOKABLE void setModelMaxLength(const ModelInfo &info, int value, bool force = false);
    int modelPromptBatchSize(const ModelInfo &info) const;
    Q_INVOKABLE void setModelPromptBatchSize(const ModelInfo &info, int value, bool force = false);
    double modelRepeatPenalty(const ModelInfo &info) const;
    Q_INVOKABLE void setModelRepeatPenalty(const ModelInfo &info, double value, bool force = false);
    int modelRepeatPenaltyTokens(const ModelInfo &info) const;
    Q_INVOKABLE void setModelRepeatPenaltyTokens(const ModelInfo &info, int value, bool force = false);
    auto modelChatTemplate(const ModelInfo &info) const -> UpgradeableSetting;
    Q_INVOKABLE bool isModelChatTemplateSet(const ModelInfo &info) const;
    Q_INVOKABLE void setModelChatTemplate(const ModelInfo &info, const QString &value);
    Q_INVOKABLE void resetModelChatTemplate(const ModelInfo &info);
    auto modelSystemMessage(const ModelInfo &info) const -> UpgradeableSetting;
    Q_INVOKABLE bool isModelSystemMessageSet(const ModelInfo &info) const;
    Q_INVOKABLE void setModelSystemMessage(const ModelInfo &info, const QString &value);
    Q_INVOKABLE void resetModelSystemMessage(const ModelInfo &info);
    int modelContextLength(const ModelInfo &info) const;
    Q_INVOKABLE void setModelContextLength(const ModelInfo &info, int value, bool force = false);
    int modelGpuLayers(const ModelInfo &info) const;
    Q_INVOKABLE void setModelGpuLayers(const ModelInfo &info, int value, bool force = false);
    QString modelChatNamePrompt(const ModelInfo &info) const;
    Q_INVOKABLE void setModelChatNamePrompt(const ModelInfo &info, const QString &value, bool force = false);
    QString modelSuggestedFollowUpPrompt(const ModelInfo &info) const;
    Q_INVOKABLE void setModelSuggestedFollowUpPrompt(const ModelInfo &info, const QString &value, bool force = false);

    // Application settings
    int threadCount() const;
    void setThreadCount(int value);
    bool systemTray() const;
    void setSystemTray(bool value);
    bool serverChat() const;
    void setServerChat(bool value);
    QString modelPath();
    void setModelPath(const QString &value);
    QString userDefaultModel() const;
    void setUserDefaultModel(const QString &value);
    ChatTheme chatTheme() const;
    void setChatTheme(ChatTheme value);
    FontSize fontSize() const;
    void setFontSize(FontSize value);
    bool forceMetal() const;
    void setForceMetal(bool value);
    QString device();
    void setDevice(const QString &value);
    int32_t contextLength() const;
    void setContextLength(int32_t value);
    int32_t gpuLayers() const;
    void setGpuLayers(int32_t value);
    SuggestionMode suggestionMode() const;
    void setSuggestionMode(SuggestionMode value);
    QString languageAndLocale() const;
    void setLanguageAndLocale(const QString &bcp47Name = QString()); // called on startup with QString()

    // Release/Download settings
    QString lastVersionStarted() const;
    void setLastVersionStarted(const QString &value);

    // Localdocs settings
    int localDocsChunkSize() const;
    void setLocalDocsChunkSize(int value);
    int localDocsRetrievalSize() const;
    void setLocalDocsRetrievalSize(int value);
    bool localDocsShowReferences() const;
    void setLocalDocsShowReferences(bool value);
    QStringList localDocsFileExtensions() const;
    void setLocalDocsFileExtensions(const QStringList &value);
    bool localDocsUseRemoteEmbed() const;
    void setLocalDocsUseRemoteEmbed(bool value);
    QString localDocsNomicAPIKey() const;
    void setLocalDocsNomicAPIKey(const QString &value);
    QString localDocsEmbedDevice() const;
    void setLocalDocsEmbedDevice(const QString &value);

    // Network settings
    QString networkAttribution() const;
    void setNetworkAttribution(const QString &value);
    bool networkIsActive() const;
    Q_INVOKABLE bool isNetworkIsActiveSet() const;
    void setNetworkIsActive(bool value);
    bool networkUsageStatsActive() const;
    Q_INVOKABLE bool isNetworkUsageStatsActiveSet() const;
    void setNetworkUsageStatsActive(bool value);
    int networkPort() const;
    void setNetworkPort(int value);

Q_SIGNALS:
    void nameChanged(const ModelInfo &info);
    void filenameChanged(const ModelInfo &info);
    void descriptionChanged(const ModelInfo &info);
    void temperatureChanged(const ModelInfo &info);
    void topPChanged(const ModelInfo &info);
    void minPChanged(const ModelInfo &info);
    void topKChanged(const ModelInfo &info);
    void maxLengthChanged(const ModelInfo &info);
    void promptBatchSizeChanged(const ModelInfo &info);
    void contextLengthChanged(const ModelInfo &info);
    void gpuLayersChanged(const ModelInfo &info);
    void repeatPenaltyChanged(const ModelInfo &info);
    void repeatPenaltyTokensChanged(const ModelInfo &info);
    void chatTemplateChanged(const ModelInfo &info, bool fromInfo = false);
    void systemMessageChanged(const ModelInfo &info, bool fromInfo = false);
    void chatNamePromptChanged(const ModelInfo &info);
    void suggestedFollowUpPromptChanged(const ModelInfo &info);
    void threadCountChanged();
    void systemTrayChanged();
    void serverChatChanged();
    void modelPathChanged();
    void userDefaultModelChanged();
    void chatThemeChanged();
    void fontSizeChanged();
    void forceMetalChanged(bool);
    void lastVersionStartedChanged();
    void localDocsChunkSizeChanged();
    void localDocsRetrievalSizeChanged();
    void localDocsShowReferencesChanged();
    void localDocsFileExtensionsChanged();
    void localDocsUseRemoteEmbedChanged();
    void localDocsNomicAPIKeyChanged();
    void localDocsEmbedDeviceChanged();
    void networkAttributionChanged();
    void networkIsActiveChanged();
    void networkPortChanged();
    void networkUsageStatsActiveChanged();
    void attemptModelLoadChanged();
    void deviceChanged();
    void suggestionModeChanged();
    void languageAndLocaleChanged();

private:
    QVariant getBasicSetting(const QString &name) const;
    void setBasicSetting(const QString &name, const QVariant &value, std::optional<QString> signal = std::nullopt);
    int getEnumSetting(const QString &setting, const QStringList &valueNames) const;
    QVariant getModelSetting(QLatin1StringView name, const ModelInfo &info) const;
    QVariant getModelSetting(const char *name, const ModelInfo &info) const;
    void setModelSetting(QLatin1StringView name, const ModelInfo &info, const QVariant &value, bool force,
                         bool signal = false);
    void setModelSetting(const char *name, const ModelInfo &info, const QVariant &value, bool force,
                         bool signal = false);
    auto getUpgradeableModelSetting(
        const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey
    ) const -> UpgradeableSetting;
    bool isUpgradeableModelSettingSet(
        const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey
    ) const;
    bool setUpgradeableModelSetting(
        const ModelInfo &info, const QString &value, QLatin1StringView legacyKey, QLatin1StringView newKey
    );
    bool resetUpgradeableModelSetting(
        const ModelInfo &info, QLatin1StringView legacyKey, QLatin1StringView newKey
    );
    QString filePathForLocale(const QLocale &locale);

private:
    QSettings m_settings;
    bool m_forceMetal;
    const QStringList m_deviceList;
    const QStringList m_embeddingsDeviceList;
    const QStringList m_uiLanguages;
    std::unique_ptr<QTranslator> m_translator;

    friend class MyPrivateSettings;
};

#endif // MYSETTINGS_H

================================================
FILE: gpt4all-chat/src/network.cpp
================================================
#include "network.h"

#include "chat.h"
#include "chatlistmodel.h"
#include "download.h"
#include "llm.h"
#include "localdocs.h"
#include "localdocsmodel.h"
#include "modellist.h"
#include "mysettings.h"

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#ifdef __GLIBC__
#   include
#endif

using namespace Qt::Literals::StringLiterals;

//#define DEBUG

#define STR_(x) #x
#define STR(x) STR_(x)

static const char MIXPANEL_TOKEN[] = "ce362e568ddaee16ed243eaffb5860a2";

#ifdef __clang__
#ifdef __apple_build_version__
static const char COMPILER_NAME[] = "Apple Clang";
#else
static const char COMPILER_NAME[] = "LLVM Clang";
#endif
static const char COMPILER_VER[] = STR(__clang_major__) "." STR(__clang_minor__) "."
STR(__clang_patchlevel__);
#elif defined(_MSC_VER)
static const char COMPILER_NAME[] = "MSVC";
static const char COMPILER_VER[] = STR(_MSC_VER) " (" STR(_MSC_FULL_VER) ")";
#elif defined(__GNUC__)
static const char COMPILER_NAME[] = "GCC";
static const char COMPILER_VER[] = STR(__GNUC__) "." STR(__GNUC_MINOR__) "." STR(__GNUC_PATCHLEVEL__);
#endif

#if defined(Q_OS_MAC)
#include
static std::optional<QString> getSysctl(const char *name)
{
    char buffer[256] = "";
    size_t bufferlen = sizeof(buffer);
    if (sysctlbyname(name, &buffer, &bufferlen, NULL, 0) < 0) {
        int err = errno;
        qWarning().nospace() << "sysctlbyname(\"" << name << "\") failed: " << strerror(err);
        return std::nullopt;
    }
    return std::make_optional<QString>(buffer);
}

static QString getCPUModel() { return getSysctl("machdep.cpu.brand_string").value_or(u"(unknown)"_s); }
#elif defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#ifndef _MSC_VER
static void get_cpuid(int level, int *regs)
{
    asm volatile("cpuid"
                 : "=a" (regs[0]), "=b" (regs[1]), "=c" (regs[2]), "=d" (regs[3])
                 : "0" (level) : "memory");
}
#else
#define get_cpuid(level, regs) __cpuid(regs, level)
#endif

static QString getCPUModel()
{
    int regs[12];

    // EAX=80000000h: Get Highest Extended Function Implemented
    get_cpuid(0x80000000, regs);
    if (regs[0] < 0x80000004)
        return "(unknown)";

    // EAX=80000002h-80000004h: Processor Brand String
    get_cpuid(0x80000002, regs);
    get_cpuid(0x80000003, regs + 4);
    get_cpuid(0x80000004, regs + 8);

    char str[sizeof(regs) + 1];
    memcpy(str, regs, sizeof(regs));
    str[sizeof(regs)] = 0;
    return QString(str).trimmed();
}
#else
static QString getCPUModel() { return "(non-x86)"; }
#endif

class MyNetwork: public Network { };
Q_GLOBAL_STATIC(MyNetwork, networkInstance)
Network *Network::globalInstance()
{
    return networkInstance();
}

bool Network::isHttpUrlValid(QUrl url)
{
    if (!url.isValid())
        return false;
    QString scheme(url.scheme());
    if (scheme != "http" && scheme != "https")
        return false;
    return true;
}

Network::Network()
    : QObject{nullptr}
{
    QSettings settings;
    m_uniqueId = settings.value("uniqueId", generateUniqueId()).toString();
    settings.setValue("uniqueId", m_uniqueId);
    m_sessionId = generateUniqueId();

    // allow sendMixpanel to be called from any thread
    connect(this, &Network::requestMixpanel, this, &Network::sendMixpanel, Qt::QueuedConnection);

    const auto *mySettings = MySettings::globalInstance();
    connect(mySettings, &MySettings::networkIsActiveChanged, this, &Network::handleIsActiveChanged);
    connect(mySettings, &MySettings::networkUsageStatsActiveChanged, this, &Network::handleUsageStatsActiveChanged);

    m_hasSentOptIn  = !Download::globalInstance()->isFirstStart() &&  mySettings->networkUsageStatsActive();
    m_hasSentOptOut = !Download::globalInstance()->isFirstStart() && !mySettings->networkUsageStatsActive();

    if (mySettings->networkIsActive())
        sendHealth();
    connect(&m_networkManager, &QNetworkAccessManager::sslErrors, this, &Network::handleSslErrors);
}

// NOTE: this won't be useful until we make it possible to change this via the settings page
void Network::handleUsageStatsActiveChanged()
{
    if (!MySettings::globalInstance()->networkUsageStatsActive())
        m_sendUsageStats = false;
}

void Network::handleIsActiveChanged()
{
    if (MySettings::globalInstance()->networkUsageStatsActive())
        sendHealth();
}

QString Network::generateUniqueId() const
{
    return QUuid::createUuid().toString(QUuid::WithoutBraces);
}

bool Network::packageAndSendJson(const QString &ingestId, const QString &json)
{
    if (!MySettings::globalInstance()->networkIsActive())
        return false;

    QJsonParseError err;
    QJsonDocument doc = QJsonDocument::fromJson(json.toUtf8(), &err);
    if (err.error != QJsonParseError::NoError) {
        qDebug() << "Couldn't parse: " << json << err.errorString();
        return false;
    }

    auto *currentChat = ChatListModel::globalInstance()->currentChat();
    Q_ASSERT(currentChat);
    auto modelInfo = currentChat->modelInfo();

    Q_ASSERT(doc.isObject());
    QJsonObject object = doc.object();
    object.insert("source", "gpt4all-chat");
    object.insert("agent_id", modelInfo.filename());
    object.insert("submitter_id", m_uniqueId);
    object.insert("ingest_id", ingestId);
    QString attribution = MySettings::globalInstance()->networkAttribution();
    if (!attribution.isEmpty())
        object.insert("network/attribution", attribution);
    if (!modelInfo.id().isNull())
        if (auto tmpl = modelInfo.chatTemplate().asModern())
            object.insert("chat_template"_L1, *tmpl);

    QJsonDocument newDoc;
    newDoc.setObject(object);

#if defined(DEBUG)
    printf("%s\n", qPrintable(newDoc.toJson(QJsonDocument::Indented)));
    fflush(stdout);
#endif

    QUrl jsonUrl("https://api.gpt4all.io/v1/ingest/chat");
    QNetworkRequest request(jsonUrl);
    QSslConfiguration conf = request.sslConfiguration();
    conf.setPeerVerifyMode(QSslSocket::VerifyNone);
    request.setSslConfiguration(conf);
    QByteArray body(newDoc.toJson(QJsonDocument::Compact));
    request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json");
    QNetworkReply *jsonReply = m_networkManager.post(request, body);
    connect(qGuiApp, &QCoreApplication::aboutToQuit, jsonReply, &QNetworkReply::abort);
    connect(jsonReply, &QNetworkReply::finished, this, &Network::handleJsonUploadFinished);
    m_activeUploads.append(jsonReply);
    return true;
}

void Network::handleJsonUploadFinished()
{
    QNetworkReply *jsonReply = qobject_cast<QNetworkReply *>(sender());
    if (!jsonReply)
        return;

    m_activeUploads.removeAll(jsonReply);

    if (jsonReply->error() != QNetworkReply::NoError) {
        qWarning() << "Request to" << jsonReply->url().toString() << "failed:" << jsonReply->errorString();
        jsonReply->deleteLater();
        return;
    }

    QVariant response = jsonReply->attribute(QNetworkRequest::HttpStatusCodeAttribute);
    Q_ASSERT(response.isValid());
    bool ok;
    int code = response.toInt(&ok);
    if (!ok)
        qWarning() << "ERROR: ingest invalid response.";
    if (code != 200) {
        qWarning() << "ERROR: ingest response != 200 code:" << code;
        sendHealth();
    }

#if defined(DEBUG)
    QByteArray jsonData = jsonReply->readAll();
    QJsonParseError err;
    QJsonDocument document =
QJsonDocument::fromJson(jsonData, &err); if (err.error != QJsonParseError::NoError) { qDebug() << "ERROR: Couldn't parse: " << jsonData << err.errorString(); return; } printf("%s\n", qPrintable(document.toJson(QJsonDocument::Indented))); fflush(stdout); #endif jsonReply->deleteLater(); } void Network::handleSslErrors(QNetworkReply *reply, const QList &errors) { QUrl url = reply->request().url(); for (const auto &e : errors) qWarning() << "ERROR: Received ssl error:" << e.errorString() << "for" << url; } void Network::sendOptOut() { QJsonObject properties; properties.insert("token", MIXPANEL_TOKEN); properties.insert("time", QDateTime::currentMSecsSinceEpoch()); properties.insert("distinct_id", m_uniqueId); properties.insert("$insert_id", generateUniqueId()); QJsonObject event; event.insert("event", "opt_out"); event.insert("properties", properties); QJsonArray array; array.append(event); QJsonDocument doc; doc.setArray(array); emit requestMixpanel(doc.toJson(QJsonDocument::Compact)); #if defined(DEBUG) printf("%s %s\n", qPrintable("opt_out"), qPrintable(doc.toJson(QJsonDocument::Indented))); fflush(stdout); #endif } void Network::sendStartup() { const auto *mySettings = MySettings::globalInstance(); Q_ASSERT(mySettings->isNetworkUsageStatsActiveSet()); if (!mySettings->networkUsageStatsActive()) { // send a single opt-out per session after the user has made their selections, // unless this is a normal start (same version) and the user was already opted out if (!m_hasSentOptOut) { sendOptOut(); m_hasSentOptOut = true; } return; } // only chance to enable usage stats is at the start of a new session m_sendUsageStats = true; const auto *display = QGuiApplication::primaryScreen(); trackEvent("startup", { // Build info { "build_compiler", COMPILER_NAME }, { "build_compiler_ver", COMPILER_VER }, { "build_abi", QSysInfo::buildAbi() }, { "build_cpu_arch", QSysInfo::buildCpuArchitecture() }, #ifdef __GLIBC__ { "build_glibc_ver", QStringLiteral(STR(__GLIBC__) "." 
            STR(__GLIBC_MINOR__)) },
#endif
        { "qt_version", QLibraryInfo::version().toString() },
        { "qt_debug", QLibraryInfo::isDebugBuild() },
        { "qt_shared", QLibraryInfo::isSharedBuild() },
        // System info
        { "runtime_cpu_arch", QSysInfo::currentCpuArchitecture() },
#ifdef __GLIBC__
        { "runtime_glibc_ver", gnu_get_libc_version() },
#endif
        { "sys_kernel_type", QSysInfo::kernelType() },
        { "sys_kernel_ver", QSysInfo::kernelVersion() },
        { "sys_product_type", QSysInfo::productType() },
        { "sys_product_ver", QSysInfo::productVersion() },
#ifdef Q_OS_MAC
        { "sys_hw_model", getSysctl("hw.model").value_or(u"(unknown)"_s) },
#endif
        { "$screen_dpi", std::round(display->physicalDotsPerInch()) },
        { "display", u"%1x%2"_s.arg(display->size().width()).arg(display->size().height()) },
        { "ram", LLM::globalInstance()->systemTotalRAMInGB() },
        { "cpu", getCPUModel() },
        { "cpu_supports_avx2", LLModel::Implementation::cpuSupportsAVX2() },
        // Datalake status
        { "datalake_active", mySettings->networkIsActive() },
    });

    sendIpify();

    // mirror opt-out logic so the ratio can be used to infer totals
    if (!m_hasSentOptIn) {
        trackEvent("opt_in");
        m_hasSentOptIn = true;
    }
}

void Network::trackChatEvent(const QString &ev, QVariantMap props)
{
    auto *curChat = ChatListModel::globalInstance()->currentChat();
    Q_ASSERT(curChat);
    if (!props.contains("model"))
        props.insert("model", curChat->modelInfo().filename());
    props.insert("device_backend", curChat->deviceBackend());
    props.insert("actualDevice", curChat->device());
    props.insert("doc_collections_enabled", curChat->collectionList().count());
    props.insert("doc_collections_total", LocalDocs::globalInstance()->localDocsModel()->rowCount());
    props.insert("datalake_active", MySettings::globalInstance()->networkIsActive());
    props.insert("using_server", curChat->isServer());
    trackEvent(ev, props);
}

void Network::trackEvent(const QString &ev, const QVariantMap &props)
{
    if (!m_sendUsageStats)
        return;

    QJsonObject properties;
    properties.insert("token", MIXPANEL_TOKEN);
    if
(!props.contains("time")) properties.insert("time", QDateTime::currentMSecsSinceEpoch()); properties.insert("distinct_id", m_uniqueId); // effectively a device ID properties.insert("$insert_id", generateUniqueId()); if (!m_ipify.isEmpty()) properties.insert("ip", m_ipify); properties.insert("$os", QSysInfo::prettyProductName()); properties.insert("session_id", m_sessionId); properties.insert("name", QCoreApplication::applicationName() + " v" + QCoreApplication::applicationVersion()); for (const auto &[key, value]: props.asKeyValueRange()) properties.insert(key, QJsonValue::fromVariant(value)); QJsonObject event; event.insert("event", ev); event.insert("properties", properties); QJsonArray array; array.append(event); QJsonDocument doc; doc.setArray(array); emit requestMixpanel(doc.toJson(QJsonDocument::Compact)); #if defined(DEBUG) printf("%s %s\n", qPrintable(ev), qPrintable(doc.toJson(QJsonDocument::Indented))); fflush(stdout); #endif } void Network::sendIpify() { if (!m_sendUsageStats || !m_ipify.isEmpty()) return; QUrl ipifyUrl("https://api.ipify.org"); QNetworkRequest request(ipifyUrl); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); QNetworkReply *reply = m_networkManager.get(request); connect(qGuiApp, &QCoreApplication::aboutToQuit, reply, &QNetworkReply::abort); connect(reply, &QNetworkReply::finished, this, &Network::handleIpifyFinished); } void Network::sendMixpanel(const QByteArray &json) { QUrl trackUrl("https://api.mixpanel.com/track"); QNetworkRequest request(trackUrl); QSslConfiguration conf = request.sslConfiguration(); conf.setPeerVerifyMode(QSslSocket::VerifyNone); request.setSslConfiguration(conf); request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json"); QNetworkReply *trackReply = m_networkManager.post(request, json); connect(qGuiApp, &QCoreApplication::aboutToQuit, trackReply, &QNetworkReply::abort); connect(trackReply, 
            &QNetworkReply::finished, this, &Network::handleMixpanelFinished);
}

void Network::handleIpifyFinished()
{
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;

    if (reply->error() != QNetworkReply::NoError) {
        qWarning() << "Request to" << reply->url().toString() << "failed:" << reply->errorString();
        reply->deleteLater();
        return;
    }

    QVariant response = reply->attribute(QNetworkRequest::HttpStatusCodeAttribute);
    Q_ASSERT(response.isValid());
    bool ok;
    int code = response.toInt(&ok);
    if (!ok)
        qWarning() << "ERROR: ipify invalid response.";
    if (code != 200)
        qWarning() << "ERROR: ipify response != 200 code:" << code;
    m_ipify = reply->readAll();

#if defined(DEBUG)
    printf("ipify finished %s\n", qPrintable(m_ipify));
    fflush(stdout);
#endif
    reply->deleteLater();
    trackEvent("ipify_complete");
}

void Network::handleMixpanelFinished()
{
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;

    if (reply->error() != QNetworkReply::NoError) {
        qWarning() << "Request to" << reply->url().toString() << "failed:" << reply->errorString();
        reply->deleteLater();
        return;
    }

    QVariant response = reply->attribute(QNetworkRequest::HttpStatusCodeAttribute);
    Q_ASSERT(response.isValid());
    bool ok;
    int code = response.toInt(&ok);
    if (!ok)
        qWarning() << "ERROR: track invalid response.";
    if (code != 200)
        qWarning() << "ERROR: track response != 200 code:" << code;

#if defined(DEBUG)
    printf("mixpanel finished %s\n", qPrintable(reply->readAll()));
    fflush(stdout);
#endif
    reply->deleteLater();
}

bool Network::sendConversation(const QString &ingestId, const QString &conversation)
{
    return packageAndSendJson(ingestId, conversation);
}

void Network::sendHealth()
{
    QUrl healthUrl("https://api.gpt4all.io/v1/health");
    QNetworkRequest request(healthUrl);
    QSslConfiguration conf = request.sslConfiguration();
    conf.setPeerVerifyMode(QSslSocket::VerifyNone);
    request.setSslConfiguration(conf);
    QNetworkReply *healthReply = m_networkManager.get(request);
    connect(qGuiApp,
        &QCoreApplication::aboutToQuit, healthReply, &QNetworkReply::abort);
    connect(healthReply, &QNetworkReply::finished, this, &Network::handleHealthFinished);
}

void Network::handleHealthFinished()
{
    QNetworkReply *healthReply = qobject_cast<QNetworkReply *>(sender());
    if (!healthReply)
        return;

    if (healthReply->error() != QNetworkReply::NoError) {
        qWarning() << "Request to" << healthReply->url().toString() << "failed:" << healthReply->errorString();
        healthReply->deleteLater();
        return;
    }

    QVariant response = healthReply->attribute(QNetworkRequest::HttpStatusCodeAttribute);
    Q_ASSERT(response.isValid());
    bool ok;
    int code = response.toInt(&ok);
    if (!ok)
        qWarning() << "ERROR: health invalid response.";
    if (code != 200) {
        qWarning() << "ERROR: health response != 200 code:" << code;
        emit healthCheckFailed(code);
        MySettings::globalInstance()->setNetworkIsActive(false);
    }
    healthReply->deleteLater();
}

================================================
FILE: gpt4all-chat/src/network.h
================================================
#ifndef NETWORK_H
#define NETWORK_H

#include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep
#include // IWYU pragma: keep

// IWYU pragma: no_forward_declare QByteArray
// IWYU pragma: no_forward_declare QNetworkReply
// IWYU pragma: no_forward_declare QSslError

class QUrl;

struct KeyValue {
    QString key;
    QJsonValue value;
};

class Network : public QObject
{
    Q_OBJECT
public:
    static Network *globalInstance();

    static bool isHttpUrlValid(const QUrl url);

    Q_INVOKABLE QString generateUniqueId() const;
    Q_INVOKABLE bool sendConversation(const QString &ingestId, const QString &conversation);
    Q_INVOKABLE void trackChatEvent(const QString &event, QVariantMap props = QVariantMap());
    Q_INVOKABLE void trackEvent(const QString &event, const QVariantMap &props = QVariantMap());

Q_SIGNALS:
    void healthCheckFailed(int code);
    void requestMixpanel(const QByteArray &json, bool isOptOut = false);

public Q_SLOTS:
    void
    sendStartup();

private Q_SLOTS:
    void handleIpifyFinished();
    void handleHealthFinished();
    void handleJsonUploadFinished();
    void handleSslErrors(QNetworkReply *reply, const QList<QSslError> &errors);
    void handleMixpanelFinished();
    void handleIsActiveChanged();
    void handleUsageStatsActiveChanged();
    void sendMixpanel(const QByteArray &json);

private:
    void sendOptOut();
    void sendHealth();
    void sendIpify();
    bool packageAndSendJson(const QString &ingestId, const QString &json);

private:
    bool m_sendUsageStats = false;
    bool m_hasSentOptIn;
    bool m_hasSentOptOut;
    QString m_ipify;
    QString m_uniqueId;
    QString m_sessionId;
    QNetworkAccessManager m_networkManager;
    QVector<QNetworkReply *> m_activeUploads;

private:
    explicit Network();
    ~Network() {}
    friend class MyNetwork;
};

#endif // NETWORK_H

================================================
FILE: gpt4all-chat/src/server.cpp
================================================
#include "server.h"

#include "chat.h"
#include "chatmodel.h"
#include "modellist.h"
#include "mysettings.h"
#include "utils.h" // IWYU pragma: keep

#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep
#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include

using namespace std::string_literals;
using namespace Qt::Literals::StringLiterals;

//#define DEBUG

namespace {

class InvalidRequestError: public std::invalid_argument {
    using std::invalid_argument::invalid_argument;

public:
    QHttpServerResponse asResponse() const
    {
        QJsonObject error {
            { "message", what(), },
            { "type", u"invalid_request_error"_s, },
            { "param", QJsonValue::Null },
            { "code", QJsonValue::Null },
        };
        return { QJsonObject {{ "error", error }}, QHttpServerResponder::StatusCode::BadRequest };
    }

private:
    Q_DISABLE_COPY_MOVE(InvalidRequestError)
};

} // namespace
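// Example (editor's sketch, derived from asResponse() above): an
// InvalidRequestError thrown while parsing a request is serialized as an
// HTTP 400 response whose JSON body is shaped like:
//
//   {
//     "error": {
//       "message": "<the what() text>",
//       "type": "invalid_request_error",
//       "param": null,
//       "code": null
//     }
//   }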
static inline QJsonObject modelToJson(const ModelInfo &info)
{
    QJsonObject model;
    model.insert("id", info.name());
    model.insert("object", "model");
    model.insert("created", 0);
    model.insert("owned_by", "humanity");
    model.insert("root", info.name());
    model.insert("parent", QJsonValue::Null);

    QJsonArray permissions;
    QJsonObject permissionObj;
    permissionObj.insert("id", "placeholder");
    permissionObj.insert("object", "model_permission");
    permissionObj.insert("created", 0);
    permissionObj.insert("allow_create_engine", false);
    permissionObj.insert("allow_sampling", false);
    permissionObj.insert("allow_logprobs", false);
    permissionObj.insert("allow_search_indices", false);
    permissionObj.insert("allow_view", true);
    permissionObj.insert("allow_fine_tuning", false);
    permissionObj.insert("organization", "*");
    permissionObj.insert("group", QJsonValue::Null);
    permissionObj.insert("is_blocking", false);
    permissions.append(permissionObj);
    model.insert("permissions", permissions);
    return model;
}

static inline QJsonObject resultToJson(const ResultInfo &info)
{
    QJsonObject result;
    result.insert("file", info.file);
    result.insert("title", info.title);
    result.insert("author", info.author);
    result.insert("date", info.date);
    result.insert("text", info.text);
    result.insert("page", info.page);
    result.insert("from", info.from);
    result.insert("to", info.to);
    return result;
}

class BaseCompletionRequest {
public:
    QString model; // required
    // NB: some parameters are not supported yet
    int32_t max_tokens = 16;
    qint64 n = 1;
    float temperature = 1.f;
    float top_p = 1.f;
    float min_p = 0.f;

    BaseCompletionRequest() = default;
    virtual ~BaseCompletionRequest() = default;

    virtual BaseCompletionRequest &parse(QCborMap request)
    {
        parseImpl(request);
        if (!request.isEmpty())
            throw InvalidRequestError(fmt::format(
                "Unrecognized request argument supplied: {}", request.keys().constFirst().toString()
            ));
        return *this;
    }

protected:
    virtual void parseImpl(QCborMap &request)
    {
        using enum Type;

        auto reqValue =
            [&request](auto &&...args) { return takeValue(request, args...); };
        QCborValue value;

        this->model = reqValue("model", String, /*required*/ true).toString();

        value = reqValue("frequency_penalty", Number, false, /*min*/ -2, /*max*/ 2);
        if (value.isDouble() || value.toInteger() != 0)
            throw InvalidRequestError("'frequency_penalty' is not supported");

        value = reqValue("max_tokens", Integer, false, /*min*/ 1);
        if (!value.isNull())
            this->max_tokens = int32_t(qMin(value.toInteger(), INT32_MAX));

        value = reqValue("n", Integer, false, /*min*/ 1);
        if (!value.isNull())
            this->n = value.toInteger();

        value = reqValue("presence_penalty", Number);
        if (value.isDouble() || value.toInteger() != 0)
            throw InvalidRequestError("'presence_penalty' is not supported");

        value = reqValue("seed", Integer);
        if (!value.isNull())
            throw InvalidRequestError("'seed' is not supported");

        value = reqValue("stop");
        if (!value.isNull())
            throw InvalidRequestError("'stop' is not supported");

        value = reqValue("stream", Boolean);
        if (value.isTrue())
            throw InvalidRequestError("'stream' is not supported");

        value = reqValue("stream_options", Object);
        if (!value.isNull())
            throw InvalidRequestError("'stream_options' is not supported");

        value = reqValue("temperature", Number, false, /*min*/ 0, /*max*/ 2);
        if (!value.isNull())
            this->temperature = float(value.toDouble());

        value = reqValue("top_p", Number, false, /*min*/ 0, /*max*/ 1);
        if (!value.isNull())
            this->top_p = float(value.toDouble());

        value = reqValue("min_p", Number, false, /*min*/ 0, /*max*/ 1);
        if (!value.isNull())
            this->min_p = float(value.toDouble());

        reqValue("user", String); // validate but don't use
    }

    enum class Type : uint8_t {
        Boolean,
        Integer,
        Number,
        String,
        Array,
        Object,
    };

    static const std::unordered_map<Type, const char *> s_typeNames;

    static bool typeMatches(const QCborValue &value, Type type) noexcept
    {
        using enum Type;
        switch (type) {
            case Boolean: return value.isBool();
            case Integer: return value.isInteger();
            case Number:  return value.isInteger() ||
                value.isDouble();
            case String:  return value.isString();
            case Array:   return value.isArray();
            case Object:  return value.isMap();
        }
        Q_UNREACHABLE();
    }

    static QCborValue takeValue(
        QCborMap &obj, const char *key, std::optional<Type> type = {}, bool required = false,
        std::optional<qint64> min = {}, std::optional<qint64> max = {}
    ) {
        auto value = obj.take(QLatin1StringView(key));
        if (value.isUndefined())
            value = QCborValue(QCborSimpleType::Null);

        if (required && value.isNull())
            throw InvalidRequestError(fmt::format("you must provide a {} parameter", key));
        if (type && !value.isNull() && !typeMatches(value, *type))
            throw InvalidRequestError(fmt::format("'{}' is not of type '{}' - '{}'", value.toVariant(),
                                                  s_typeNames.at(*type), key));

        if (!value.isNull()) {
            double num = value.toDouble();
            if (min && num < double(*min))
                throw InvalidRequestError(fmt::format("{} is less than the minimum of {} - '{}'", num, *min, key));
            if (max && num > double(*max))
                throw InvalidRequestError(fmt::format("{} is greater than the maximum of {} - '{}'", num, *max, key));
        }
        return value;
    }

private:
    Q_DISABLE_COPY_MOVE(BaseCompletionRequest)
};

class CompletionRequest : public BaseCompletionRequest {
public:
    QString prompt; // required
    // some parameters are not supported yet - these ones are
    bool echo = false;

    CompletionRequest &parse(QCborMap request) override
    {
        BaseCompletionRequest::parse(std::move(request));
        return *this;
    }

protected:
    void parseImpl(QCborMap &request) override
    {
        using enum Type;

        auto reqValue = [&request](auto &&...args) { return takeValue(request, args...); };
        QCborValue value;

        BaseCompletionRequest::parseImpl(request);

        this->prompt = reqValue("prompt", String, /*required*/ true).toString();

        value = reqValue("best_of", Integer);
        {
            qint64 bof = value.toInteger(1);
            if (this->n > bof)
                throw InvalidRequestError(fmt::format(
                    "You requested that the server return more choices than it will generate (HINT: you must set 'n' "
                    "(currently {}) to be at most 'best_of' (currently {}), or omit either parameter if you don't "
                    "specifically want to use them.)", this->n, bof
                ));
            if (bof > this->n)
                throw InvalidRequestError("'best_of' is not supported");
        }

        value = reqValue("echo", Boolean);
        if (value.isBool())
            this->echo = value.toBool();

        // we don't bother deeply typechecking unsupported subobjects for now
        value = reqValue("logit_bias", Object);
        if (!value.isNull())
            throw InvalidRequestError("'logit_bias' is not supported");

        value = reqValue("logprobs", Integer, false, /*min*/ 0);
        if (!value.isNull())
            throw InvalidRequestError("'logprobs' is not supported");

        value = reqValue("suffix", String);
        if (!value.isNull() && !value.toString().isEmpty())
            throw InvalidRequestError("'suffix' is not supported");
    }
};

const std::unordered_map<BaseCompletionRequest::Type, const char *> BaseCompletionRequest::s_typeNames = {
    { BaseCompletionRequest::Type::Boolean, "boolean" },
    { BaseCompletionRequest::Type::Integer, "integer" },
    { BaseCompletionRequest::Type::Number,  "number"  },
    { BaseCompletionRequest::Type::String,  "string"  },
    { BaseCompletionRequest::Type::Array,   "array"   },
    { BaseCompletionRequest::Type::Object,  "object"  },
};

class ChatRequest : public BaseCompletionRequest {
public:
    struct Message {
        enum class Role { System, User, Assistant };
        Role role;
        QString content;
    };

    QList<Message> messages; // required

    ChatRequest &parse(QCborMap request) override
    {
        BaseCompletionRequest::parse(std::move(request));
        return *this;
    }

protected:
    void parseImpl(QCborMap &request) override
    {
        using enum Type;

        auto reqValue = [&request](auto &&...args) { return takeValue(request, args...); };
        QCborValue value;

        BaseCompletionRequest::parseImpl(request);

        value = reqValue("messages", std::nullopt, /*required*/ true);
        if (!value.isArray() || value.toArray().isEmpty())
            throw InvalidRequestError(fmt::format(
                "Invalid type for 'messages': expected a non-empty array of objects, but got '{}' instead.",
                value.toVariant()
            ));
        this->messages.clear();
        {
            QCborArray arr = value.toArray();
            for (qsizetype i = 0; i < arr.size(); i++) {
                const auto &elem = arr[i];
                if
                (!elem.isMap())
                    throw InvalidRequestError(fmt::format(
                        "Invalid type for 'messages[{}]': expected an object, but got '{}' instead.",
                        i, elem.toVariant()
                    ));
                QCborMap msg = elem.toMap();
                Message res;
                QString role = takeValue(msg, "role", String, /*required*/ true).toString();
                if (role == u"system"_s) {
                    res.role = Message::Role::System;
                } else if (role == u"user"_s) {
                    res.role = Message::Role::User;
                } else if (role == u"assistant"_s) {
                    res.role = Message::Role::Assistant;
                } else {
                    throw InvalidRequestError(fmt::format(
                        "Invalid 'messages[{}].role': expected one of 'system', 'assistant', or 'user', but got '{}'"
                        " instead.",
                        i, role.toStdString()
                    ));
                }
                res.content = takeValue(msg, "content", String, /*required*/ true).toString();
                this->messages.append(res);
                if (!msg.isEmpty())
                    throw InvalidRequestError(fmt::format(
                        "Invalid 'messages[{}]': unrecognized key: '{}'", i, msg.keys().constFirst().toString()
                    ));
            }
        }

        // we don't bother deeply typechecking unsupported subobjects for now
        value = reqValue("logit_bias", Object);
        if (!value.isNull())
            throw InvalidRequestError("'logit_bias' is not supported");

        value = reqValue("logprobs", Boolean);
        if (value.isTrue())
            throw InvalidRequestError("'logprobs' is not supported");

        value = reqValue("top_logprobs", Integer, false, /*min*/ 0);
        if (!value.isNull())
            throw InvalidRequestError("The 'top_logprobs' parameter is only allowed when 'logprobs' is enabled.");

        value = reqValue("response_format", Object);
        if (!value.isNull())
            throw InvalidRequestError("'response_format' is not supported");

        reqValue("service_tier", String); // validate but don't use

        value = reqValue("tools", Array);
        if (!value.isNull())
            throw InvalidRequestError("'tools' is not supported");

        value = reqValue("tool_choice");
        if (!value.isNull())
            throw InvalidRequestError("'tool_choice' is not supported");

        // validate but don't use
        reqValue("parallel_tool_calls", Boolean);

        value = reqValue("function_call");
        if (!value.isNull())
            throw InvalidRequestError("'function_call' is not supported");

        value = reqValue("functions", Array);
        if (!value.isNull())
            throw InvalidRequestError("'functions' is not supported");
    }
};

template <typename T>
T &parseRequest(T &request, QJsonObject &&obj)
{
    // lossless conversion to CBOR exposes more type information
    return request.parse(QCborMap::fromJsonObject(obj));
}

Server::Server(Chat *chat)
    : ChatLLM(chat, true /*isServer*/)
    , m_chat(chat)
{
    connect(this, &Server::threadStarted, this, &Server::start);
    connect(this, &Server::databaseResultsChanged, this, &Server::handleDatabaseResultsChanged);
    connect(chat, &Chat::collectionListChanged, this, &Server::handleCollectionListChanged, Qt::QueuedConnection);
}

static QJsonObject requestFromJson(const QByteArray &request)
{
    QJsonParseError err;
    const QJsonDocument document = QJsonDocument::fromJson(request, &err);
    if (err.error || !document.isObject())
        throw InvalidRequestError(fmt::format(
            "error parsing request JSON: {}",
            err.error ? err.errorString().toStdString() : "not an object"s
        ));
    return document.object();
}

void Server::start()
{
    m_server = std::make_unique<QHttpServer>(this);
    auto *tcpServer = new QTcpServer(m_server.get());

    auto port = MySettings::globalInstance()->networkPort();
    if (!tcpServer->listen(QHostAddress::LocalHost, port)) {
        qWarning() << "Server ERROR: Failed to listen on port" << port;
        return;
    }
    if (!m_server->bind(tcpServer)) {
        qWarning() << "Server ERROR: Failed to bind HTTP server to socket" << port;
        return;
    }

    m_server->route("/v1/models", QHttpServerRequest::Method::Get,
        [](const QHttpServerRequest &) {
            if (!MySettings::globalInstance()->serverChat())
                return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized);

            const QList<ModelInfo> modelList = ModelList::globalInstance()->selectableModelList();
            QJsonObject root;
            root.insert("object", "list");
            QJsonArray data;
            for (const ModelInfo &info : modelList) {
                Q_ASSERT(info.installed);
                if (!info.installed)
                    continue;
                data.append(modelToJson(info));
            }
            root.insert("data", data);
            return QHttpServerResponse(root);
        }
    );
m_server->route("/v1/models/", QHttpServerRequest::Method::Get, [](const QString &model, const QHttpServerRequest &) { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); const QList modelList = ModelList::globalInstance()->selectableModelList(); QJsonObject object; for (const ModelInfo &info : modelList) { Q_ASSERT(info.installed); if (!info.installed) continue; if (model == info.name()) { object = modelToJson(info); break; } } return QHttpServerResponse(object); } ); m_server->route("/v1/completions", QHttpServerRequest::Method::Post, [this](const QHttpServerRequest &request) { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); try { auto reqObj = requestFromJson(request.body()); #if defined(DEBUG) qDebug().noquote() << "/v1/completions request" << QJsonDocument(reqObj).toJson(QJsonDocument::Indented); #endif CompletionRequest req; parseRequest(req, std::move(reqObj)); auto [resp, respObj] = handleCompletionRequest(req); #if defined(DEBUG) if (respObj) qDebug().noquote() << "/v1/completions reply" << QJsonDocument(*respObj).toJson(QJsonDocument::Indented); #endif return std::move(resp); } catch (const InvalidRequestError &e) { return e.asResponse(); } } ); m_server->route("/v1/chat/completions", QHttpServerRequest::Method::Post, [this](const QHttpServerRequest &request) { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); try { auto reqObj = requestFromJson(request.body()); #if defined(DEBUG) qDebug().noquote() << "/v1/chat/completions request" << QJsonDocument(reqObj).toJson(QJsonDocument::Indented); #endif ChatRequest req; parseRequest(req, std::move(reqObj)); auto [resp, respObj] = handleChatRequest(req); (void)respObj; #if defined(DEBUG) if (respObj) qDebug().noquote() << "/v1/chat/completions reply" << 
QJsonDocument(*respObj).toJson(QJsonDocument::Indented); #endif return std::move(resp); } catch (const InvalidRequestError &e) { return e.asResponse(); } } ); // Respond with code 405 to wrong HTTP methods: m_server->route("/v1/models", QHttpServerRequest::Method::Post, [] { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); return QHttpServerResponse( QJsonDocument::fromJson("{\"error\": {\"message\": \"Not allowed to POST on /v1/models." " (HINT: Perhaps you meant to use a different HTTP method?)\"," " \"type\": \"invalid_request_error\", \"param\": null, \"code\": null}}").object(), QHttpServerResponder::StatusCode::MethodNotAllowed); } ); m_server->route("/v1/models/", QHttpServerRequest::Method::Post, [](const QString &model) { (void)model; if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); return QHttpServerResponse( QJsonDocument::fromJson("{\"error\": {\"message\": \"Not allowed to POST on /v1/models/*." 
" (HINT: Perhaps you meant to use a different HTTP method?)\"," " \"type\": \"invalid_request_error\", \"param\": null, \"code\": null}}").object(), QHttpServerResponder::StatusCode::MethodNotAllowed); } ); m_server->route("/v1/completions", QHttpServerRequest::Method::Get, [] { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); return QHttpServerResponse( QJsonDocument::fromJson("{\"error\": {\"message\": \"Only POST requests are accepted.\"," " \"type\": \"invalid_request_error\", \"param\": null, \"code\": \"method_not_supported\"}}").object(), QHttpServerResponder::StatusCode::MethodNotAllowed); } ); m_server->route("/v1/chat/completions", QHttpServerRequest::Method::Get, [] { if (!MySettings::globalInstance()->serverChat()) return QHttpServerResponse(QHttpServerResponder::StatusCode::Unauthorized); return QHttpServerResponse( QJsonDocument::fromJson("{\"error\": {\"message\": \"Only POST requests are accepted.\"," " \"type\": \"invalid_request_error\", \"param\": null, \"code\": \"method_not_supported\"}}").object(), QHttpServerResponder::StatusCode::MethodNotAllowed); } ); m_server->addAfterRequestHandler(this, [](const QHttpServerRequest &req, QHttpServerResponse &resp) { Q_UNUSED(req); auto headers = resp.headers(); headers.append("Access-Control-Allow-Origin"_L1, "*"_L1); resp.setHeaders(std::move(headers)); }); connect(this, &Server::requestResetResponseState, m_chat, &Chat::resetResponseState, Qt::BlockingQueuedConnection); } static auto makeError(auto &&...args) -> std::pair> { return {QHttpServerResponse(args...), std::nullopt}; } auto Server::handleCompletionRequest(const CompletionRequest &request) -> std::pair> { Q_ASSERT(m_chatModel); auto *mySettings = MySettings::globalInstance(); ModelInfo modelInfo = ModelList::globalInstance()->defaultModelInfo(); const QList modelList = ModelList::globalInstance()->selectableModelList(); for (const ModelInfo &info : modelList) { 
        Q_ASSERT(info.installed);
        if (!info.installed)
            continue;
        if (request.model == info.name() || request.model == info.filename()) {
            modelInfo = info;
            break;
        }
    }

    // load the new model if necessary
    setShouldBeLoaded(true);

    if (modelInfo.filename().isEmpty()) {
        std::cerr << "ERROR: couldn't load default model " << request.model.toStdString() << std::endl;
        return makeError(QHttpServerResponder::StatusCode::InternalServerError);
    }

    emit requestResetResponseState(); // blocks
    qsizetype prevMsgIndex = m_chatModel->count() - 1;
    if (prevMsgIndex >= 0)
        m_chatModel->updateCurrentResponse(prevMsgIndex, false);

    // NB: this resets the context, regardless of whether this model is already loaded
    if (!loadModel(modelInfo)) {
        std::cerr << "ERROR: couldn't load model " << modelInfo.name().toStdString() << std::endl;
        return makeError(QHttpServerResponder::StatusCode::InternalServerError);
    }

    // add prompt/response items to GUI
    m_chatModel->appendPrompt(request.prompt);
    m_chatModel->appendResponse();

    // FIXME(jared): taking parameters from the UI inhibits reproducibility of results
    LLModel::PromptContext promptCtx {
        .n_predict = request.max_tokens,
        .top_k = mySettings->modelTopK(modelInfo),
        .top_p = request.top_p,
        .min_p = request.min_p,
        .temp = request.temperature,
        .n_batch = mySettings->modelPromptBatchSize(modelInfo),
        .repeat_penalty = float(mySettings->modelRepeatPenalty(modelInfo)),
        .repeat_last_n = mySettings->modelRepeatPenaltyTokens(modelInfo),
    };

    auto promptUtf8 = request.prompt.toUtf8();
    int promptTokens = 0;
    int responseTokens = 0;
    QStringList responses;
    for (int i = 0; i < request.n; ++i) {
        PromptResult result;
        try {
            result = promptInternal(std::string_view(promptUtf8.cbegin(), promptUtf8.cend()), promptCtx,
                                    /*usedLocalDocs*/ false);
        } catch (const std::exception &e) {
            m_chatModel->setResponseValue(e.what());
            m_chatModel->setError();
            emit responseStopped(0);
            return makeError(QHttpServerResponder::StatusCode::InternalServerError);
        }
        QString resp =
            QString::fromUtf8(result.response);
        if (request.echo)
            resp = request.prompt + resp;
        responses << resp;
        if (i == 0)
            promptTokens = result.promptTokens;
        responseTokens += result.responseTokens;
    }

    QJsonObject responseObject {
        { "id", "placeholder" },
        { "object", "text_completion" },
        { "created", QDateTime::currentSecsSinceEpoch() },
        { "model", modelInfo.name() },
    };

    QJsonArray choices;
    for (qsizetype i = 0; auto &resp : std::as_const(responses)) {
        choices << QJsonObject {
            { "text", resp },
            { "index", i++ },
            { "logprobs", QJsonValue::Null },
            { "finish_reason", responseTokens == request.max_tokens ? "length" : "stop" },
        };
    }
    responseObject.insert("choices", choices);

    responseObject.insert("usage", QJsonObject {
        { "prompt_tokens", promptTokens },
        { "completion_tokens", responseTokens },
        { "total_tokens", promptTokens + responseTokens },
    });

    return {QHttpServerResponse(responseObject), responseObject};
}

auto Server::handleChatRequest(const ChatRequest &request)
    -> std::pair<QHttpServerResponse, std::optional<QJsonObject>>
{
    auto *mySettings = MySettings::globalInstance();

    ModelInfo modelInfo = ModelList::globalInstance()->defaultModelInfo();
    const QList<ModelInfo> modelList = ModelList::globalInstance()->selectableModelList();
    for (const ModelInfo &info : modelList) {
        Q_ASSERT(info.installed);
        if (!info.installed)
            continue;
        if (request.model == info.name() || request.model == info.filename()) {
            modelInfo = info;
            break;
        }
    }

    // load the new model if necessary
    setShouldBeLoaded(true);

    if (modelInfo.filename().isEmpty()) {
        std::cerr << "ERROR: couldn't load default model " << request.model.toStdString() << std::endl;
        return makeError(QHttpServerResponder::StatusCode::InternalServerError);
    }

    emit requestResetResponseState(); // blocks

    // NB: this resets the context, regardless of whether this model is already loaded
    if (!loadModel(modelInfo)) {
        std::cerr << "ERROR: couldn't load model " << modelInfo.name().toStdString() << std::endl;
        return makeError(QHttpServerResponder::StatusCode::InternalServerError);
    }
    m_chatModel->updateCurrentResponse(m_chatModel->count() - 1, false);

    Q_ASSERT(!request.messages.isEmpty());

    // adds prompt/response items to GUI
    std::vector<MessageInput> messages;
    for (auto &message : request.messages) {
        using enum ChatRequest::Message::Role;
        switch (message.role) {
            case System:    messages.push_back({ MessageInput::Type::System,   message.content }); break;
            case User:      messages.push_back({ MessageInput::Type::Prompt,   message.content }); break;
            case Assistant: messages.push_back({ MessageInput::Type::Response, message.content }); break;
        }
    }
    auto startOffset = m_chatModel->appendResponseWithHistory(messages);

    // FIXME(jared): taking parameters from the UI inhibits reproducibility of results
    LLModel::PromptContext promptCtx {
        .n_predict = request.max_tokens,
        .top_k = mySettings->modelTopK(modelInfo),
        .top_p = request.top_p,
        .min_p = request.min_p,
        .temp = request.temperature,
        .n_batch = mySettings->modelPromptBatchSize(modelInfo),
        .repeat_penalty = float(mySettings->modelRepeatPenalty(modelInfo)),
        .repeat_last_n = mySettings->modelRepeatPenaltyTokens(modelInfo),
    };

    int promptTokens = 0;
    int responseTokens = 0;
    QList<QPair<QString, QList<ResultInfo>>> responses;
    for (int i = 0; i < request.n; ++i) {
        ChatPromptResult result;
        try {
            result = promptInternalChat(m_collections, promptCtx, startOffset);
        } catch (const std::exception &e) {
            m_chatModel->setResponseValue(e.what());
            m_chatModel->setError();
            emit responseStopped(0);
            return makeError(QHttpServerResponder::StatusCode::InternalServerError);
        }
        responses.emplace_back(result.response, result.databaseResults);
        if (i == 0)
            promptTokens = result.promptTokens;
        responseTokens += result.responseTokens;
    }

    QJsonObject responseObject {
        { "id", "placeholder" },
        { "object", "chat.completion" },
        { "created", QDateTime::currentSecsSinceEpoch() },
        { "model", modelInfo.name() },
    };

    QJsonArray choices;
    {
        int index = 0;
        for (const auto &r : responses) {
            QString result = r.first;
            QList<ResultInfo> infos = r.second;
            QJsonObject message {
                { "role", "assistant" },
                { "content", result
}, }; QJsonObject choice { { "index", index++ }, { "message", message }, { "finish_reason", responseTokens == request.max_tokens ? "length" : "stop" }, { "logprobs", QJsonValue::Null }, }; if (MySettings::globalInstance()->localDocsShowReferences()) { QJsonArray references; for (const auto &ref : infos) references.append(resultToJson(ref)); choice.insert("references", references.isEmpty() ? QJsonValue::Null : QJsonValue(references)); } choices.append(choice); } } responseObject.insert("choices", choices); responseObject.insert("usage", QJsonObject { { "prompt_tokens", promptTokens }, { "completion_tokens", responseTokens }, { "total_tokens", promptTokens + responseTokens }, }); return {QHttpServerResponse(responseObject), responseObject}; } ================================================ FILE: gpt4all-chat/src/server.h ================================================ #ifndef SERVER_H #define SERVER_H #include "chatllm.h" #include "database.h" #include #include #include #include #include // IWYU pragma: keep #include #include #include #include class Chat; class ChatRequest; class CompletionRequest; class Server : public ChatLLM { Q_OBJECT public: explicit Server(Chat *chat); ~Server() override = default; public Q_SLOTS: void start(); Q_SIGNALS: void requestResetResponseState(); private: auto handleCompletionRequest(const CompletionRequest &request) -> std::pair>; auto handleChatRequest(const ChatRequest &request) -> std::pair>; private Q_SLOTS: void handleDatabaseResultsChanged(const QList &results) { m_databaseResults = results; } void handleCollectionListChanged(const QList &collectionList) { m_collections = collectionList; } private: Chat *m_chat; std::unique_ptr m_server; QList m_databaseResults; QList m_collections; }; #endif // SERVER_H ================================================ FILE: gpt4all-chat/src/tool.cpp ================================================ #include "tool.h" #include #include #include using json = nlohmann::ordered_json; json::object_t 
Tool::jinjaValue() const
{
    json::array_t paramList;
    const QList<ToolParamInfo> p = parameters();
    for (auto &info : p) {
        std::string typeStr;
        switch (info.type) {
            using enum ToolEnums::ParamType;
            case String:  typeStr = "string";  break;
            case Number:  typeStr = "number";  break;
            case Integer: typeStr = "integer"; break;
            case Object:  typeStr = "object";  break;
            case Array:   typeStr = "array";   break;
            case Boolean: typeStr = "boolean"; break;
            case Null:    typeStr = "null";    break;
        }
        paramList.emplace_back(json::initializer_list_t {
            { "name",        info.name.toStdString()        },
            { "type",        typeStr                        },
            { "description", info.description.toStdString() },
            { "required",    info.required                  },
        });
    }
    return {
        { "name",           name().toStdString()           },
        { "description",    description().toStdString()    },
        { "function",       function().toStdString()       },
        { "parameters",     paramList                      },
        { "symbolicFormat", symbolicFormat().toStdString() },
        { "examplePrompt",  examplePrompt().toStdString()  },
        { "exampleCall",    exampleCall().toStdString()    },
        { "exampleReply",   exampleReply().toStdString()   },
    };
}

void ToolCallInfo::serialize(QDataStream &stream, int version)
{
    stream << name;
    stream << params.size();
    for (const auto &param : params) {
        stream << param.name;
        stream << param.type;
        stream << param.value;
    }
    stream << result;
    stream << error;
    stream << errorString;
}

bool ToolCallInfo::deserialize(QDataStream &stream, int version)
{
    stream >> name;
    qsizetype count;
    stream >> count;
    for (int i = 0; i < count; ++i) {
        ToolParam p;
        stream >> p.name;
        stream >> p.type;
        stream >> p.value;
        params.append(p); // keep the deserialized parameter
    }
    stream >> result;
    stream >> error;
    stream >> errorString;
    return true;
}

================================================
FILE: gpt4all-chat/src/tool.h
================================================
#ifndef TOOL_H
#define TOOL_H

#include #include #include #include #include #include

class QDataStream;

using json = nlohmann::ordered_json;

namespace ToolEnums {
    Q_NAMESPACE

    enum class Error {
        NoError = 0,
        TimeoutError = 2,
        UnknownError = 499,
    };
    Q_ENUM_NS(Error)

    enum class ParamType {
    String, Number, Integer, Object, Array, Boolean, Null }; // json schema types
    Q_ENUM_NS(ParamType)

    enum class ParseState {
        None,
        InTagChoice,
        InStart,
        Partial,
        Complete,
    };
    Q_ENUM_NS(ParseState)
}

struct ToolParamInfo {
    QString name;
    ToolEnums::ParamType type;
    QString description;
    bool required;
};
Q_DECLARE_METATYPE(ToolParamInfo)

struct ToolParam {
    QString name;
    ToolEnums::ParamType type;
    QVariant value;
    bool operator==(const ToolParam &other) const
    {
        return name == other.name && type == other.type && value == other.value;
    }
};
Q_DECLARE_METATYPE(ToolParam)

struct ToolCallInfo {
    QString name;
    QList<ToolParam> params;
    QString result;
    ToolEnums::Error error = ToolEnums::Error::NoError;
    QString errorString;

    void serialize(QDataStream &stream, int version);
    bool deserialize(QDataStream &stream, int version);

    bool operator==(const ToolCallInfo &other) const
    {
        return name == other.name && result == other.result && params == other.params
            && error == other.error && errorString == other.errorString;
    }
};
Q_DECLARE_METATYPE(ToolCallInfo)

class Tool : public QObject {
    Q_OBJECT
    Q_PROPERTY(QString name READ name CONSTANT)
    Q_PROPERTY(QString description READ description CONSTANT)
    Q_PROPERTY(QString function READ function CONSTANT)
    Q_PROPERTY(QList<ToolParamInfo> parameters READ parameters CONSTANT)
    Q_PROPERTY(QString examplePrompt READ examplePrompt CONSTANT)
    Q_PROPERTY(QString exampleCall READ exampleCall CONSTANT)
    Q_PROPERTY(QString exampleReply READ exampleReply CONSTANT)

public:
    Tool() : QObject(nullptr) {}
    virtual ~Tool() {}

    virtual void run(const QList<ToolParam> &params) = 0;
    virtual bool interrupt() = 0;

    // Tools should set these if they encounter errors. For instance, a tool that depends on the
    // network might set them when the network is not available.
    virtual ToolEnums::Error error() const { return ToolEnums::Error::NoError; }
    virtual QString errorString() const { return QString(); }

    // [Required] Human-readable name of the tool.
    virtual QString name() const = 0;

    // [Required] Human-readable description of what the tool does. Used as "Use this tool to: {{description}}"
    virtual QString description() const = 0;

    // [Required] Name of the function to invoke. Must be unique, contain only a-z, A-Z, 0-9,
    // underscores, and dashes, and be at most 64 characters long.
    virtual QString function() const = 0;

    // [Optional] List describing the tool's parameters. An empty list specifies no parameters.
    virtual QList<ToolParamInfo> parameters() const { return {}; }

    // [Optional] The symbolic format of the tool call.
    virtual QString symbolicFormat() const { return QString(); }

    // [Optional] A human-written example of a prompt that could result in this tool being called.
    virtual QString examplePrompt() const { return QString(); }

    // [Optional] An example of this tool call that pairs with the example prompt. It should be the
    // complete string that the model must generate.
    virtual QString exampleCall() const { return QString(); }

    // [Optional] An example of the reply the model might generate given the result of the tool call.
virtual QString exampleReply() const { return QString(); } bool operator==(const Tool &other) const { return function() == other.function(); } json::object_t jinjaValue() const; Q_SIGNALS: void runComplete(const ToolCallInfo &info); }; #endif // TOOL_H ================================================ FILE: gpt4all-chat/src/toolcallparser.cpp ================================================ #include "toolcallparser.h" #include "tool.h" #include #include #include #include #include ToolCallParser::ToolCallParser() : ToolCallParser(ToolCallConstants::AllTagNames) {} ToolCallParser::ToolCallParser(const QStringList &tagNames) { QSet firstChars; for (auto &name : tagNames) { if (name.isEmpty()) throw std::invalid_argument("ToolCallParser(): tag names must not be empty"); if (firstChars.contains(name.at(0))) throw std::invalid_argument("ToolCallParser(): tag names must not share any prefix"); firstChars << name.at(0); m_possibleStartTags << makeStartTag(name).toUtf8(); m_possibleEndTags << makeEndTag (name).toUtf8(); } reset(); } void ToolCallParser::reset() { // Resets the search state, but not the buffer or global state resetSearchState(); // These are global states maintained between update calls m_buffers.clear(); m_buffers << QByteArray(); } void ToolCallParser::resetSearchState() { m_expected = {'<'}; m_expectedIndex = 0; m_state = ToolEnums::ParseState::None; m_toolCall.clear(); m_startTagBuffer.clear(); m_endTagBuffer.clear(); m_currentTagIndex = -1; m_startIndex = -1; m_endIndex = -1; } bool ToolCallParser::isExpected(char c) const { return m_expected.isEmpty() || m_expected.contains(c); } void ToolCallParser::setExpected(const QList &tags) { m_expected.clear(); for (const auto &tag : tags) { Q_ASSERT(tag.size() > m_expectedIndex); m_expected << tag.at(m_expectedIndex); } } QByteArray ToolCallParser::startTag() const { if (m_currentTagIndex < 0) return {}; return m_possibleStartTags.at(m_currentTagIndex); } QByteArray ToolCallParser::endTag() const { if 
(m_currentTagIndex < 0) return {}; return m_possibleEndTags.at(m_currentTagIndex); } QByteArray &ToolCallParser::currentBuffer() { return m_buffers.last(); } // This method is called with an arbitrary string and a current state. This method should take the // current state into account and then parse through the update character by character to arrive at // the new state. void ToolCallParser::update(const QByteArray &update) { currentBuffer().append(update); for (qsizetype i = currentBuffer().size() - update.size(); i < currentBuffer().size(); ++i) { const char c = currentBuffer()[i]; const bool foundMatch = isExpected(c); if (!foundMatch) { resetSearchState(); continue; } switch (m_state) { case ToolEnums::ParseState::None: { m_expectedIndex = 1; setExpected(m_possibleStartTags); m_state = ToolEnums::ParseState::InTagChoice; m_startIndex = i; break; } case ToolEnums::ParseState::InTagChoice: { for (int i = 0; i < m_possibleStartTags.size(); ++i) { const auto &tag = m_possibleStartTags.at(i); if (c == tag.at(1)) m_currentTagIndex = i; } if (m_currentTagIndex >= 0) { m_expectedIndex = 2; setExpected({m_possibleStartTags.at(m_currentTagIndex)}); m_state = ToolEnums::ParseState::InStart; } else resetSearchState(); break; } case ToolEnums::ParseState::InStart: { m_startTagBuffer.append(c); const auto startTag = this->startTag(); Q_ASSERT(!startTag.isEmpty()); if (m_expectedIndex == startTag.size() - 1) { m_expectedIndex = 0; setExpected({}); m_state = ToolEnums::ParseState::Partial; } else { ++m_expectedIndex; Q_ASSERT(m_currentTagIndex >= 0); setExpected({startTag}); } break; } case ToolEnums::ParseState::Partial: { Q_ASSERT(m_currentTagIndex >= 0); const auto endTag = this->endTag(); Q_ASSERT(!endTag.isEmpty()); m_toolCall.append(c); m_endTagBuffer.append(c); if (m_endTagBuffer.size() > endTag.size()) m_endTagBuffer.remove(0, 1); if (m_endTagBuffer == endTag) { m_endIndex = i + 1; m_toolCall.chop(endTag.size()); m_state = ToolEnums::ParseState::Complete; 
m_endTagBuffer.clear(); } break; } case ToolEnums::ParseState::Complete: { // Already complete, do nothing further break; } } } } bool ToolCallParser::splitIfPossible() { // The first split happens when we're in a partial state if (m_buffers.size() < 2 && m_state == ToolEnums::ParseState::Partial) { Q_ASSERT(m_startIndex >= 0); const auto beforeToolCall = currentBuffer().left(m_startIndex); const auto toolCall = currentBuffer().mid (m_startIndex); m_buffers = { beforeToolCall, toolCall }; return true; } // The second split happens when we're in the complete state if (m_buffers.size() < 3 && m_state == ToolEnums::ParseState::Complete) { Q_ASSERT(m_endIndex >= 0); const auto &beforeToolCall = m_buffers.first(); const auto toolCall = currentBuffer().left(m_endIndex); const auto afterToolCall = currentBuffer().mid (m_endIndex); m_buffers = { beforeToolCall, toolCall, afterToolCall }; return true; } return false; } QStringList ToolCallParser::buffers() const { QStringList result; result.reserve(m_buffers.size()); for (const auto &buffer : m_buffers) result << QString::fromUtf8(buffer); return result; } ================================================ FILE: gpt4all-chat/src/toolcallparser.h ================================================ #ifndef TOOLCALLPARSER_H #define TOOLCALLPARSER_H #include #include #include #include // IWYU pragma: keep namespace ToolEnums { enum class ParseState; } using namespace Qt::Literals::StringLiterals; class ToolCallParser { public: ToolCallParser(); ToolCallParser(const QStringList &tagNames); void reset(); void update(const QByteArray &update); QString toolCall() const { return QString::fromUtf8(m_toolCall); } int startIndex() const { return m_startIndex; } ToolEnums::ParseState state() const { return m_state; } QByteArray startTag() const; QByteArray endTag() const; bool splitIfPossible(); QStringList buffers() const; int numberOfBuffers() const { return m_buffers.size(); } static QString makeStartTag(const QString &name) { return 
u"<%1>"_s.arg(name); }
    static QString makeEndTag(const QString &name) { return u"</%1>"_s.arg(name); }

private:
    QByteArray &currentBuffer();
    void resetSearchState();
    bool isExpected(char c) const;
    void setExpected(const QList<QByteArray> &tags);

    QList<QByteArray> m_possibleStartTags;
    QList<QByteArray> m_possibleEndTags;
    QByteArray m_startTagBuffer;
    QByteArray m_endTagBuffer;
    int m_currentTagIndex;
    QList<char> m_expected;
    int m_expectedIndex;
    ToolEnums::ParseState m_state;
    QList<QByteArray> m_buffers;
    QByteArray m_toolCall;
    int m_startIndex;
    int m_endIndex;
};

namespace ToolCallConstants {
    // NB: the parsing code assumes the first char of the various tags differ
    inline const QString CodeInterpreterFunction = u"javascript_interpret"_s;
    inline const QString CodeInterpreterStartTag = ToolCallParser::makeStartTag(CodeInterpreterFunction);
    inline const QString CodeInterpreterEndTag   = ToolCallParser::makeEndTag  (CodeInterpreterFunction);
    inline const QString CodeInterpreterPrefix   = u"%1\n```javascript\n"_s.arg(CodeInterpreterStartTag);
    inline const QString CodeInterpreterSuffix   = u"```\n%1"_s.arg(CodeInterpreterEndTag);

    inline const QString ThinkTagName  = u"think"_s;
    inline const QString ThinkStartTag = ToolCallParser::makeStartTag(ThinkTagName);
    inline const QString ThinkEndTag   = ToolCallParser::makeEndTag  (ThinkTagName);

    inline const QStringList AllTagNames { CodeInterpreterFunction, ThinkTagName };
}

#endif // TOOLCALLPARSER_H

================================================
FILE: gpt4all-chat/src/toolmodel.cpp
================================================
#include "toolmodel.h"

#include "codeinterpreter.h"

#include #include #include

class MyToolModel: public ToolModel { };
Q_GLOBAL_STATIC(MyToolModel, toolModelInstance)
ToolModel *ToolModel::globalInstance()
{
    return toolModelInstance();
}

ToolModel::ToolModel()
    : QAbstractListModel(nullptr)
{
    QCoreApplication::instance()->installEventFilter(this);

    Tool *codeInterpreter = new CodeInterpreter;
    m_tools.append(codeInterpreter);
    m_toolMap.insert(codeInterpreter->function(),
codeInterpreter); } bool ToolModel::eventFilter(QObject *obj, QEvent *ev) { if (obj == QCoreApplication::instance() && ev->type() == QEvent::LanguageChange) emit dataChanged(index(0, 0), index(m_tools.size() - 1, 0)); return false; } ================================================ FILE: gpt4all-chat/src/toolmodel.h ================================================ #ifndef TOOLMODEL_H #define TOOLMODEL_H #include "tool.h" #include #include #include #include #include #include #include class ToolModel : public QAbstractListModel { Q_OBJECT Q_PROPERTY(int count READ count NOTIFY countChanged) public: static ToolModel *globalInstance(); enum Roles { NameRole = Qt::UserRole + 1, DescriptionRole, FunctionRole, ParametersRole, SymbolicFormatRole, ExamplePromptRole, ExampleCallRole, ExampleReplyRole, }; int rowCount(const QModelIndex &parent = QModelIndex()) const override { Q_UNUSED(parent) return m_tools.size(); } QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const override { if (!index.isValid() || index.row() < 0 || index.row() >= m_tools.size()) return QVariant(); const Tool *item = m_tools.at(index.row()); switch (role) { case NameRole: return item->name(); case DescriptionRole: return item->description(); case FunctionRole: return item->function(); case ParametersRole: return QVariant::fromValue(item->parameters()); case SymbolicFormatRole: return item->symbolicFormat(); case ExamplePromptRole: return item->examplePrompt(); case ExampleCallRole: return item->exampleCall(); case ExampleReplyRole: return item->exampleReply(); } return QVariant(); } QHash roleNames() const override { QHash roles; roles[NameRole] = "name"; roles[DescriptionRole] = "description"; roles[FunctionRole] = "function"; roles[ParametersRole] = "parameters"; roles[SymbolicFormatRole] = "symbolicFormat"; roles[ExamplePromptRole] = "examplePrompt"; roles[ExampleCallRole] = "exampleCall"; roles[ExampleReplyRole] = "exampleReply"; return roles; } Q_INVOKABLE Tool* get(int index) 
const { if (index < 0 || index >= m_tools.size()) return nullptr; return m_tools.at(index); } Q_INVOKABLE Tool *get(const QString &id) const { if (!m_toolMap.contains(id)) return nullptr; return m_toolMap.value(id); } int count() const { return m_tools.size(); } Q_SIGNALS: void countChanged(); void valueChanged(int index, const QString &value); protected: bool eventFilter(QObject *obj, QEvent *ev) override; private: explicit ToolModel(); ~ToolModel() {} friend class MyToolModel; QList m_tools; QHash m_toolMap; }; #endif // TOOLMODEL_H ================================================ FILE: gpt4all-chat/src/utils.h ================================================ #pragma once #include #include #include #include #include // IWYU pragma: keep #include #include #include #include #include #include #include // IWYU pragma: keep // IWYU pragma: no_forward_declare QJsonValue class QJsonObject; // fmtlib formatters for QString and QVariant #define MAKE_FORMATTER(type, conversion) \ template <> \ struct fmt::formatter: fmt::formatter { \ template \ FmtContext::iterator format(const type &value, FmtContext &ctx) const \ { \ auto valueUtf8 = (conversion); \ std::string_view view(valueUtf8.cbegin(), valueUtf8.cend()); \ return formatter::format(view, ctx); \ } \ } MAKE_FORMATTER(QUtf8StringView, value ); MAKE_FORMATTER(QStringView, value.toUtf8() ); MAKE_FORMATTER(QString, value.toUtf8() ); MAKE_FORMATTER(QVariant, value.toString().toUtf8()); // alternative to QJsonObject's initializer_list constructor that accepts Latin-1 strings QJsonObject makeJsonObject(std::initializer_list> args); #include "utils.inl" // IWYU pragma: export ================================================ FILE: gpt4all-chat/src/utils.inl ================================================ #include inline QJsonObject makeJsonObject(std::initializer_list> args) { QJsonObject obj; for (auto &arg : args) obj.insert(arg.first, arg.second); return obj; } ================================================ FILE: 
gpt4all-chat/src/xlsxtomd.cpp ================================================ #include "xlsxtomd.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include // IWYU pragma: keep #include #include #include #include using namespace Qt::Literals::StringLiterals; static QString formatCellText(const QXlsx::Cell *cell) { if (!cell) return QString(); QVariant value = cell->value(); QXlsx::Format format = cell->format(); QString cellText; // Determine the cell type based on format if (cell->isDateTime()) { // Handle DateTime QDateTime dateTime = cell->dateTime().toDateTime(); cellText = dateTime.isValid() ? dateTime.toString(QStringView(u"yyyy-MM-dd")) : value.toString(); } else { cellText = value.toString(); } if (cellText.isEmpty()) return QString(); // Escape special characters static QRegularExpression special( QStringLiteral( R"(()([\\`*_[\]<>()!|])|)" // special characters R"(^(\s*)(#+(?:\s|$))|)" // headings R"(^(\s*[0-9])(\.(?:\s|$))|)" // ordered lists ("1. 
a") R"(^(\s*)([+-](?:\s|$)))" // unordered lists ("- a")
        ),
        QRegularExpression::MultilineOption
    );
    cellText.replace(special, uR"(\1\\\2)"_s);
    cellText.replace(u'&', "&amp;"_L1);
    cellText.replace(u'<', "&lt;"_L1);
    cellText.replace(u'>', "&gt;"_L1);

    // Apply Markdown formatting based on font styles
    if (format.fontUnderline())
        cellText = u"_%1_"_s.arg(cellText);
    if (format.fontBold())
        cellText = u"**%1**"_s.arg(cellText);
    if (format.fontItalic())
        cellText = u"*%1*"_s.arg(cellText);
    if (format.fontStrikeOut())
        cellText = u"~~%1~~"_s.arg(cellText);
    return cellText;
}

static QString getCellValue(QXlsx::Worksheet *sheet, int row, int col)
{
    if (!sheet)
        return QString();

    // Attempt to retrieve the cell directly
    std::shared_ptr<QXlsx::Cell> cell = sheet->cellAt(row, col);

    // If the cell is part of a merged range and not directly available
    if (!cell) {
        for (const QXlsx::CellRange &range : sheet->mergedCells()) {
            if (row >= range.firstRow() && row <= range.lastRow()
                && col >= range.firstColumn() && col <= range.lastColumn()) {
                cell = sheet->cellAt(range.firstRow(), range.firstColumn());
                break;
            }
        }
    }

    // Format and return the cell text if available
    if (cell)
        return formatCellText(cell.get());

    // Return empty string if cell is not found
    return QString();
}

QString XLSXToMD::toMarkdown(QIODevice *xlsxDevice)
{
    // Load the Excel document
    QXlsx::Document xlsx(xlsxDevice);
    if (!xlsx.load()) {
        qCritical() << "Failed to load the Excel document from device";
        return QString();
    }

    QString markdown;

    // Retrieve all sheet names
    QStringList sheetNames = xlsx.sheetNames();
    if (sheetNames.isEmpty()) {
        qWarning() << "No sheets found in the Excel document.";
        return QString();
    }

    // Iterate through each worksheet by name
    for (const QString &sheetName : sheetNames) {
        QXlsx::Worksheet *sheet = dynamic_cast<QXlsx::Worksheet *>(xlsx.sheet(sheetName));
        if (!sheet) {
            qWarning() << "Failed to load sheet:" << sheetName;
            continue;
        }
        markdown += u"### %1\n\n"_s.arg(sheetName);

        // Determine the used range
        QXlsx::CellRange range = sheet->dimension();
        int
firstRow = range.firstRow(); int lastRow = range.lastRow(); int firstCol = range.firstColumn(); int lastCol = range.lastColumn(); if (firstRow > lastRow || firstCol > lastCol) { qWarning() << "Sheet" << sheetName << "is empty."; markdown += QStringView(u"*No data available.*\n\n"); continue; } auto appendRow = [&markdown](auto &list) { markdown += u"|%1|\n"_s.arg(list.join(u'|')); }; // Empty header static QString header(u' '); static QString separator(u'-'); QStringList headers; QStringList separators; for (int col = firstCol; col <= lastCol; ++col) { headers << header; separators << separator; } appendRow(headers); appendRow(separators); // Iterate through data rows for (int row = firstRow; row <= lastRow; ++row) { QStringList rowData; for (int col = firstCol; col <= lastCol; ++col) { QString cellText = getCellValue(sheet, row, col); rowData << (cellText.isEmpty() ? u" "_s : cellText); } appendRow(rowData); } markdown += u'\n'; // Add an empty line between sheets } return markdown; } ================================================ FILE: gpt4all-chat/src/xlsxtomd.h ================================================ #ifndef XLSXTOMD_H #define XLSXTOMD_H class QIODevice; class QString; class XLSXToMD { public: static QString toMarkdown(QIODevice *xlsxDevice); }; #endif // XLSXTOMD_H ================================================ FILE: gpt4all-chat/system_requirements.md ================================================ Below are the recommended and minimum system requirements for GPT4All. 
### **Recommended System Requirements**

| **Component** | **PC (Windows/Linux)**                                | **Apple**                  |
|---------------|-------------------------------------------------------|----------------------------|
| **CPU**       | Ryzen 5 3600 or Intel Core i7-10700, or better        | M2 Pro                     |
| **RAM**       | 16GB                                                  | 16GB                       |
| **GPU**       | NVIDIA GTX 1080 Ti/RTX 2080 or better, with 8GB+ VRAM | M2 Pro (integrated GPU)    |
| **OS**        | At least Windows 10 or Ubuntu 24.04 LTS               | macOS Sonoma 14.5 or newer |

### **Minimum System Requirements**

| **Component** | **PC (Windows/Linux)**                                          | **Apple**           |
|---------------|-----------------------------------------------------------------|---------------------|
| **CPU**       | Intel Core: i3-2100, Pentium: 7505, Celeron: 6305; AMD: FX-4100 | M1                  |
| **RAM**       | 16GB (8GB for 3B LLMs)                                          | 16GB                |
| **GPU**       | Anything Direct3D 11/12 or OpenGL 2.1 capable                   | M1 (integrated GPU) |
| **OS**        | Windows 10, Ubuntu 22.04 LTS, or other compatible Linux         | macOS Monterey 12.6 |

Note that Windows and Linux PCs with ARM CPUs are not currently supported.
================================================ FILE: gpt4all-chat/test-requirements.txt ================================================ pytest~=8.3 requests~=2.32 ================================================ FILE: gpt4all-chat/tests/CMakeLists.txt ================================================ include(FetchContent) find_package(Python3 3.12 REQUIRED COMPONENTS Interpreter) # Google test download and setup FetchContent_Declare( googletest URL https://github.com/google/googletest/archive/refs/tags/v1.15.2.zip ) FetchContent_MakeAvailable(googletest) configure_file(python/config.py.in "${CMAKE_CURRENT_SOURCE_DIR}/python/config.py") add_test(NAME ChatPythonTests COMMAND ${Python3_EXECUTABLE} -m pytest --color=yes "${CMAKE_CURRENT_SOURCE_DIR}/python" ) set_tests_properties(ChatPythonTests PROPERTIES ENVIRONMENT "CHAT_EXECUTABLE=${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/chat;TEST_MODEL_PATH=${TEST_MODEL_PATH}" TIMEOUT 60 ) add_executable(gpt4all_tests cpp/test_main.cpp cpp/basic_test.cpp ) target_link_libraries(gpt4all_tests PRIVATE gtest gtest_main) include(GoogleTest) gtest_discover_tests(gpt4all_tests) ================================================ FILE: gpt4all-chat/tests/cpp/basic_test.cpp ================================================ #include TEST(BasicTest, TestInitialization) { EXPECT_TRUE(true); } ================================================ FILE: gpt4all-chat/tests/cpp/test_main.cpp ================================================ #include int main(int argc, char **argv) { ::testing::InitGoogleTest(&argc, argv); return RUN_ALL_TESTS(); } ================================================ FILE: gpt4all-chat/tests/python/__init__.py ================================================ ================================================ FILE: gpt4all-chat/tests/python/config.py.in ================================================ APP_VERSION = '@APP_VERSION@' ================================================ FILE: gpt4all-chat/tests/python/test_server_api.py 
================================================ import os import shutil import signal import subprocess import sys import tempfile import textwrap from contextlib import contextmanager from pathlib import Path from subprocess import CalledProcessError from typing import Any, Iterator import pytest import requests from urllib3 import Retry from . import config class Requestor: def __init__(self) -> None: self.session = requests.Session() self.http_adapter = self.session.adapters['http://'] def get(self, path: str, *, raise_for_status: bool = True, wait: bool = False) -> Any: return self._request('GET', path, raise_for_status=raise_for_status, wait=wait) def post(self, path: str, data: dict[str, Any] | None, *, raise_for_status: bool = True, wait: bool = False) -> Any: return self._request('POST', path, data, raise_for_status=raise_for_status, wait=wait) def _request( self, method: str, path: str, data: dict[str, Any] | None = None, *, raise_for_status: bool, wait: bool, ) -> Any: if wait: retry = Retry(total=None, connect=10, read=False, status=0, other=0, backoff_factor=.01) else: retry = Retry(total=False) self.http_adapter.max_retries = retry # type: ignore[attr-defined] resp = self.session.request(method, f'http://localhost:4891/v1/{path}', json=data) if raise_for_status: resp.raise_for_status() return resp.json() try: json_data = resp.json() except ValueError: json_data = None return resp.status_code, json_data request = Requestor() def create_chat_server_config(tmpdir: Path, model_copied: bool = False) -> dict[str, str]: xdg_confdir = tmpdir / 'config' app_confdir = xdg_confdir / 'nomic.ai' app_confdir.mkdir(parents=True) with open(app_confdir / 'GPT4All.ini', 'w') as conf: conf.write(textwrap.dedent(f"""\ [General] serverChat=true [download] lastVersionStarted={config.APP_VERSION} [network] isActive=false usageStatsActive=false """)) if model_copied: app_data_dir = tmpdir / 'share' / 'nomic.ai' / 'GPT4All' app_data_dir.mkdir(parents=True) local_env_file_path 
= Path(os.environ['TEST_MODEL_PATH']) shutil.copy(local_env_file_path, app_data_dir / local_env_file_path.name) return dict( os.environ, XDG_CACHE_HOME=str(tmpdir / 'cache'), XDG_DATA_HOME=str(tmpdir / 'share'), XDG_CONFIG_HOME=str(xdg_confdir), APPIMAGE=str(tmpdir), # hack to bypass SingleApplication ) @contextmanager def prepare_chat_server(model_copied: bool = False) -> Iterator[dict[str, str]]: if os.name != 'posix' or sys.platform == 'darwin': pytest.skip('Need non-Apple Unix to use alternate config path') with tempfile.TemporaryDirectory(prefix='gpt4all-test') as td: tmpdir = Path(td) config = create_chat_server_config(tmpdir, model_copied=model_copied) yield config def start_chat_server(config: dict[str, str]) -> Iterator[None]: chat_executable = Path(os.environ['CHAT_EXECUTABLE']).absolute() with subprocess.Popen(chat_executable, env=config) as process: try: yield except: process.kill() raise process.send_signal(signal.SIGINT) if retcode := process.wait(): raise CalledProcessError(retcode, process.args) @pytest.fixture def chat_server() -> Iterator[None]: with prepare_chat_server(model_copied=False) as config: yield from start_chat_server(config) @pytest.fixture def chat_server_with_model() -> Iterator[None]: with prepare_chat_server(model_copied=True) as config: yield from start_chat_server(config) def test_with_models_empty(chat_server: None) -> None: # non-sense endpoint status_code, response = request.get('foobarbaz', wait=True, raise_for_status=False) assert status_code == 404 assert response is None # empty model list response = request.get('models') assert response == {'object': 'list', 'data': []} # empty model info response = request.get('models/foo') assert response == {} # POST for model list status_code, response = request.post('models', data=None, raise_for_status=False) assert status_code == 405 assert response == {'error': { 'code': None, 'message': 'Not allowed to POST on /v1/models. 
(HINT: Perhaps you meant to use a different HTTP method?)',
        'param': None,
        'type': 'invalid_request_error',
    }}

    # POST for model info
    status_code, response = request.post('models/foo', data=None, raise_for_status=False)
    assert status_code == 405
    assert response == {'error': {
        'code': None,
        'message': 'Not allowed to POST on /v1/models/*. (HINT: Perhaps you meant to use a different HTTP method?)',
        'param': None,
        'type': 'invalid_request_error',
    }}

    # GET for completions
    status_code, response = request.get('completions', raise_for_status=False)
    assert status_code == 405
    assert response == {'error': {
        'code': 'method_not_supported',
        'message': 'Only POST requests are accepted.',
        'param': None,
        'type': 'invalid_request_error',
    }}

    # GET for chat completions
    status_code, response = request.get('chat/completions', raise_for_status=False)
    assert status_code == 405
    assert response == {'error': {
        'code': 'method_not_supported',
        'message': 'Only POST requests are accepted.',
        'param': None,
        'type': 'invalid_request_error',
    }}


EXPECTED_MODEL_INFO = {
    'created': 0,
    'id': 'Llama 3.2 1B Instruct',
    'object': 'model',
    'owned_by': 'humanity',
    'parent': None,
    'permissions': [
        {
            'allow_create_engine': False,
            'allow_fine_tuning': False,
            'allow_logprobs': False,
            'allow_sampling': False,
            'allow_search_indices': False,
            'allow_view': True,
            'created': 0,
            'group': None,
            'id': 'placeholder',
            'is_blocking': False,
            'object': 'model_permission',
            'organization': '*',
        },
    ],
    'root': 'Llama 3.2 1B Instruct',
}

EXPECTED_COMPLETIONS_RESPONSE = {
    'choices': [
        {
            'finish_reason': 'length',
            'index': 0,
            'logprobs': None,
            'text': ' jumps over the lazy dog.\n',
        },
    ],
    'id': 'placeholder',
    'model': 'Llama 3.2 1B Instruct',
    'object': 'text_completion',
    'usage': {
        'completion_tokens': 6,
        'prompt_tokens': 5,
        'total_tokens': 11,
    },
}


def test_with_models(chat_server_with_model: None) -> None:
    response = request.get('models', wait=True)
    assert response == {
        'data': [EXPECTED_MODEL_INFO],
        'object': 'list',
    }

    # Test the specific model endpoint
    response = request.get('models/Llama 3.2 1B Instruct')
    assert response == EXPECTED_MODEL_INFO

    # Test the completions endpoint
    status_code, response = request.post('completions', data=None, raise_for_status=False)
    assert status_code == 400
    assert response == {'error': {
        'code': None,
        'message': 'error parsing request JSON: illegal value',
        'param': None,
        'type': 'invalid_request_error',
    }}

    data = dict(
        model='Llama 3.2 1B Instruct',
        prompt='The quick brown fox',
        temperature=0,
        max_tokens=6,
    )
    response = request.post('completions', data=data)
    del response['created']  # Remove the dynamic field for comparison
    assert response == EXPECTED_COMPLETIONS_RESPONSE


def test_with_models_temperature(chat_server_with_model: None) -> None:
    """Fixed by nomic-ai/gpt4all#3202."""
    data = {
        'model': 'Llama 3.2 1B Instruct',
        'prompt': 'The quick brown fox',
        'temperature': 0.5,
    }
    request.post('completions', data=data, wait=True, raise_for_status=True)


================================================
FILE: gpt4all-chat/translations/gpt4all_en_US.ts
================================================

AddCollectionView ← Existing Collections Add Document Collection Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. Name Collection name... Name of the collection to add (Required) Folder Folder path... Folder path to documents (Required) Browse Create Collection AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory.
Network error: could not retrieve %1 Busy indicator Displayed when the models request is ongoing All Reasoning Model file Model file to be downloaded Description File description Cancel Resume Download Stop/restart/start the download Remove Remove model from filesystem <strong><font size="1"><a href="#error">Error</a></strong></font> Describes an error that occurred when downloading <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Download progressBar Shows the progress made in the download Download speed Download speed in bytes/kilobytes/megabytes per second Calculating... Whether the file hash is being calculated Displayed when the file hash is being calculated File size RAM required %1 GB ? Parameters Quant Type AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Discover and download models by keyword search... Text field for discovering and filtering downloadable models Searching · %1 Initiate model discovery and filtering Triggers discovery and filtering of models Default Likes Downloads Recent Sort by: %1 Asc Desc Sort dir: %1 None Limit: %1 Model file Model file to be downloaded Description File description Cancel Resume Download Stop/restart/start the download Remove Remove model from filesystem Install Install online model <strong><font size="1"><a href="#error">Error</a></strong></font> Describes an error that occurred when downloading <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Download progressBar Shows the progress made in the download Download speed Download speed in bytes/kilobytes/megabytes per second Calculating... 
Whether the file hash is being calculated Busy indicator Displayed when the file hash is being calculated ERROR: $API_KEY is empty. enter $API_KEY ERROR: $BASE_URL is empty. enter $BASE_URL ERROR: $MODEL_NAME is empty. enter $MODEL_NAME File size Quant Type AddModelView ← Existing Models Explore Models GPT4All Remote Providers HuggingFace AddRemoteModelView Various remote model providers that use network resources for inference. Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. ApplicationSettings Application Network dialog opt-in to share feedback/conversations Error dialog Application Settings General Theme The application color scheme. Dark Light ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? 
If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. LegacyDark Font Size The size of text in the application. Small Medium Large Language and Locale The language and locale you wish to use. System Locale Device The compute device used for text generation. Application default Default Model The preferred model for new chats. Also used as the local server fallback. Suggestion Mode Generate suggested follow-up questions at the end of responses. When chatting with LocalDocs Whenever possible Never Download Path Where to store local models and the LocalDocs database. Browse Choose where to save model files Enable Datalake Send chats and feedback to the GPT4All Open-Source Datalake. Advanced CPU Threads The number of CPU threads used for inference and embedding. Enable System Tray The application will minimize to the system tray when the window is closed. Enable Local API Server Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. API Server Port The port to use for the local server. Requires restart. Check For Updates Manually check for an update to GPT4All. Updates Chat New Chat Server Chat ChatAPIWorker ERROR: Network error occurred while connecting to the API server ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatCollapsibleItem Analysis encountered error Thinking Analyzing Thought for %1 %2 second seconds Analyzed ChatDrawer Drawer Main navigation drawer + New Chat Create a new chat Select the current chat or edit the chat when in edit mode Edit chat name Save chat name Delete chat Confirm chat deletion Cancel chat deletion List of chats List of chats in the drawer dialog ChatItemView GPT4All You response stopped ... retrieving localdocs: %1 ... searching localdocs: %1 ... processing ... generating response ... generating questions ... generating toolcall ... 
Copy %n Source(s) %n Source %n Sources LocalDocs Edit this message? All following messages will be permanently erased. Redo this response? Cannot edit chat without a loaded model. Cannot edit chat while the model is generating. Edit Cannot redo response without a loaded model. Cannot redo response while the model is generating. Redo Like response Dislike response Suggested follow-ups ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. ChatListModel TODAY THIS WEEK THIS MONTH LAST SIX MONTHS THIS YEAR LAST YEAR ChatTextItem Copy Copy Message Disable markdown Enable markdown ChatView <h3>Warning</h3><p>%1</p> Conversation copied to clipboard. Code copied to clipboard. The entire chat will be erased. Chat panel Chat panel with options Reload the currently loaded model Eject the currently loaded model No model installed. Model loading error. Waiting for model... Switching context... Choose a model... Not found: %1 The top item is the current model LocalDocs Add documents add collections of documents to the chat Load the default model Loads the default model which can be changed in settings No Model Installed GPT4All requires that you install at least one model to get started Install a Model Shows the add model view Conversation with the model prompt / response pairs from the conversation Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. 
Copy Erase and reset chat session Copy chat session to clipboard Add media Adds media to the prompt Stop generating Stop the current response generation Attach Single File Reloads the model <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help Erase conversation? Changing the model will erase the current conversation. Reload · %1 Loading · %1 Load · %1 (default) → Send a message... Load a model to continue... Send messages/prompts to the model Cut Paste Select All Send message Sends the message/prompt contained in the text field to the model CodeInterpreter Code Interpreter compute javascript code using console.log as output CollectionsDrawer Warning: searching collections while indexing can return incomplete results %n file(s) %n file %n files %n word(s) %n word %n words Updating + Add Docs Select a collection to make it available to the chat model. ConfirmationDialog OK Cancel Download Model "%1" has been installed successfully. ERROR: $MODEL_NAME is empty. ERROR: $API_KEY is empty. ERROR: $BASE_URL is invalid. ERROR: Model "%1 (%2)" conflicts with an existing model. Model "%1 (%2)" has been installed successfully. Model "%1" has been removed.
HomeView Welcome to GPT4All The privacy-first LLM chat application Start chatting Start Chatting Chat with any LLM LocalDocs Chat with your local files Find Models Explore and download models Latest news Latest news from GPT4All Release Notes Documentation Discord X (Twitter) GitHub nomic.ai Subscribe to Newsletter LocalDocsSettings LocalDocs LocalDocs Settings Indexing Allowed File Extensions Comma-separated list. LocalDocs will only attempt to process files with these extensions. Embedding Use Nomic Embed API Embed documents using the fast Nomic API instead of a private local model. Requires restart. Nomic API Key API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. Embeddings Device The compute device used for embeddings. Requires restart. Application default Display Show Sources Display the sources used for each response. Advanced Warning: Advanced usage only. Values too large may cause LocalDocs failure, extremely slow responses, or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. Document snippet size (characters) Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. Max document snippets per prompt The best N matches of retrieved document snippets to add to the prompt context. Larger numbers increase likelihood of factual responses, but also result in slower generation.
LocalDocsView LocalDocs Chat with your local files + Add Collection <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. No Collections Installed Install a collection of local documents to get started using this feature + Add Doc Collection Shows the add model view Indexing progressBar Shows the progress made in the indexing ERROR INDEXING EMBEDDING REQUIRES UPDATE READY INSTALLING Indexing in progress Embedding in progress This collection requires an update after version change Automatically reindexes upon changes to the folder Installation in progress % %n file(s) %n file %n files %n word(s) %n word %n words Remove Rebuild Reindex this folder from scratch. This is slow and usually not needed. Update Update the collection to the new version. This is a slow operation. 
ModelList <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Mistral Tiny model</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. cannot open "%1": %2 cannot create "%1": %2 %1 (%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> ModelSettings Model %1 system message? Clear Reset The system message will be %1. removed reset to the default %1 chat template? The chat template will be %1. erased Model Settings Clone Remove Name Model File System Message A message to set the context or guide the behavior of the model. Leave blank for none. 
NOTE: Since GPT4All 3.5, this should not contain control tokens. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Chat Template This Jinja template turns the chat into input for the model. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. Chat Name Prompt Prompt used to automatically generate chat names. Suggested FollowUp Prompt Prompt used to generate suggested follow-up questions. Context Length Number of input and output tokens the model sees. Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. Temperature Randomness of model output. Higher -> more variation. Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. Top-P Nucleus Sampling factor. Lower -> more predictable. Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. Min-P Minimum token probability. Higher -> more predictable. Sets the minimum relative probability for a token to be considered. Top-K Size of selection pool for tokens. Only the top K most likely tokens will be chosen from. Max Length Maximum response length, in tokens. Prompt Batch Size The batch size used for prompt processing. Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. Repeat Penalty Repetition penalty factor. Set to 1 to disable. 
Repeat Penalty Tokens Number of previous tokens used for penalty. GPU Layers Number of model layers to load into VRAM. How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. ModelsView No Models Installed Install a model to get started using GPT4All + Add Model Shows the add model view Installed Models Locally installed chat models Model file Model file to be downloaded Description File description Cancel Resume Stop/restart/start the download Remove Remove model from filesystem Install Install online model <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> ERROR: $API_KEY is empty. ERROR: $BASE_URL is empty. enter $BASE_URL ERROR: $MODEL_NAME is empty. enter $MODEL_NAME %1 GB ? Describes an error that occurred when downloading Error for incompatible hardware Download progressBar Shows the progress made in the download Download speed Download speed in bytes/kilobytes/megabytes per second Calculating... Whether the file hash is being calculated Busy indicator Displayed when the file hash is being calculated enter $API_KEY File size RAM required Parameters Quant Type MyFancyLink Fancy link A stylized link MyFileDialog Please choose a file MyFolderDialog Please choose a directory MySettingsLabel Clear Reset MySettingsTab Restore defaults? This page of settings will be reset to the defaults. Restore Defaults Restores settings dialog to a default state NetworkDialog Contribute data to the GPT4All Open-Source Datalake. By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements.
When a GPT4All model responds to you and you have opted in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Terms for opt-in Describes what will happen when you opt-in Please provide a name for attribution (optional) Attribution (optional) Provide attribution Enable Enable opt-in Cancel Cancel opt-in NewVersionDialog New version is available Update Update to new version PopupDialog Reveals a short-lived help balloon Busy indicator Displayed when the popup is showing busy RemoteModelCard API Key ERROR: $API_KEY is empty. enter $API_KEY Whether the file hash is being calculated Base Url ERROR: $BASE_URL is empty. enter $BASE_URL Model Name ERROR: $MODEL_NAME is empty. enter $MODEL_NAME Models Install Install remote model SettingsView Settings Contains various application settings Application Model LocalDocs StartupDialog Welcome! ### Release Notes %1<br/> ### Contributors %2 Release notes Release notes for this version ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements.
When a GPT4All model responds to you and you have opted in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Terms for opt-in Describes what will happen when you opt-in Opt-in to anonymous usage analytics used to improve GPT4All Opt-in for anonymous usage statistics Yes Allow opt-in for anonymous usage statistics No Opt-out for anonymous usage statistics Allow opt-out for anonymous usage statistics Opt-in to anonymous sharing of chats to the GPT4All Datalake Opt-in for network Allow opt-in for network Allow opt-in for anonymous sharing of chats to the GPT4All Datalake Opt-out for network Allow opt-out for anonymous sharing of chats to the GPT4All Datalake ThumbsDownDialog Please edit the text below to provide a better response. (optional) Please provide a better response... Submit Submits the user's response Cancel Closes the response dialog main <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimum requirements to run this program. In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model.
The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> GPT4All v%1 Restore Quit <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. Connection to datalake failed. Saving chats. Network dialog opt-in to share feedback/conversations Home view Home view of application Home Chat view Chat view to interact with models Chats Models Models view for installed models LocalDocs LocalDocs view to configure and use local docs Settings Settings view for application configuration The datalake is enabled Using a network model Server mode is enabled Installed models View of installed models ================================================ FILE: gpt4all-chat/translations/gpt4all_es_MX.ts ================================================ AddCollectionView ← Existing Collections ← Colecciones existentes Add Document Collection Agregar colección de documentos Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. Agregue una carpeta que contenga archivos de texto plano, PDFs o Markdown. Configure extensiones adicionales en Configuración. Please choose a directory Por favor, elija un directorio Name Nombre Collection name... Nombre de la colección... Name of the collection to add (Required) Nombre de la colección a agregar (Requerido) Folder Carpeta Folder path... Ruta de la carpeta... 
Folder path to documents (Required) Ruta de la carpeta de documentos (Requerido) Browse Explorar Create Collection Crear colección AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. Network error: could not retrieve %1 Error de red: no se pudo recuperar %1 Busy indicator Indicador de ocupado Displayed when the models request is ongoing Se muestra cuando la solicitud de modelos está en curso All Reasoning Model file Archivo del modelo Model file to be downloaded Archivo del modelo a descargar Description Descripción File description Descripción del archivo Cancel Cancelar Resume Reanudar Download Descargar Stop/restart/start the download Detener/reiniciar/iniciar la descarga Remove Eliminar Remove model from filesystem Eliminar modelo del sistema de archivos Install Instalar Install online model Instalar modelo en línea <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Error</a></strong></font> Describes an error that occurred when downloading Describe un error que ocurrió durante la descarga <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Error por hardware incompatible Download progressBar Barra de progreso de descarga Shows the progress made in the download Muestra el progreso realizado en la descarga Download speed Velocidad de descarga Download speed in bytes/kilobytes/megabytes per second Velocidad de descarga en bytes/kilobytes/megabytes por segundo Calculating... Calculando... 
Whether the file hash is being calculated Si se está calculando el hash del archivo Displayed when the file hash is being calculated Se muestra cuando se está calculando el hash del archivo enter $API_KEY ingrese $API_KEY enter $BASE_URL ingrese $BASE_URL ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. enter $MODEL_NAME ingrese $MODEL_NAME File size Tamaño del archivo RAM required RAM requerida %1 GB %1 GB ? ? Parameters Parámetros Quant Cuantificación Type Tipo AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Discover and download models by keyword search... Descubre y descarga modelos mediante búsqueda por palabras clave... Text field for discovering and filtering downloadable models Campo de texto para descubrir y filtrar modelos descargables Searching · %1 Buscando · %1 Initiate model discovery and filtering Iniciar descubrimiento y filtrado de modelos Triggers discovery and filtering of models Activa el descubrimiento y filtrado de modelos Default Predeterminado Likes Me gusta Downloads Descargas Recent Reciente Sort by: %1 Ordenar por: %1 Asc Asc Desc Desc Sort dir: %1 Dirección de ordenamiento: %1 None Ninguno Limit: %1 Límite: %1 Model file Archivo del modelo Model file to be downloaded Archivo del modelo a descargar Description Descripción File description Descripción del archivo Cancel Cancelar Resume Reanudar Download Descargar Stop/restart/start the download Detener/reiniciar/iniciar la descarga Remove Eliminar Remove model from filesystem Eliminar modelo del sistema de archivos Install Instalar Install online model Instalar modelo en línea <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Error</a></strong></font> Describes an error that occurred when downloading Describe un error que ocurrió durante la descarga <strong><font size="2">WARNING: Not 
recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Error por hardware incompatible Download progressBar Barra de progreso de descarga Shows the progress made in the download Muestra el progreso realizado en la descarga Download speed Velocidad de descarga Download speed in bytes/kilobytes/megabytes per second Velocidad de descarga en bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Si se está calculando el hash del archivo Busy indicator Indicador de ocupado Displayed when the file hash is being calculated Se muestra cuando se está calculando el hash del archivo ERROR: $API_KEY is empty. enter $API_KEY ingrese $API_KEY ERROR: $BASE_URL is empty. enter $BASE_URL ingrese $BASE_URL ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. enter $MODEL_NAME ingrese $MODEL_NAME File size Tamaño del archivo Quant Cuantificación Type Tipo AddModelView ← Existing Models ← Modelos existentes Explore Models Explorar modelos GPT4All GPT4All Remote Providers HuggingFace Discover and download models by keyword search... Descubre y descarga modelos mediante búsqueda por palabras clave... 
Text field for discovering and filtering downloadable models Campo de texto para descubrir y filtrar modelos descargables Initiate model discovery and filtering Iniciar descubrimiento y filtrado de modelos Triggers discovery and filtering of models Activa el descubrimiento y filtrado de modelos Default Predeterminado Likes Me gusta Downloads Descargas Recent Reciente Asc Asc Desc Desc None Ninguno Searching · %1 Buscando · %1 Sort by: %1 Ordenar por: %1 Sort dir: %1 Dirección de ordenamiento: %1 Limit: %1 Límite: %1 Network error: could not retrieve %1 Error de red: no se pudo recuperar %1 Busy indicator Indicador de ocupado Displayed when the models request is ongoing Se muestra cuando la solicitud de modelos está en curso Model file Archivo del modelo Model file to be downloaded Archivo del modelo a descargar Description Descripción File description Descripción del archivo Cancel Cancelar Resume Reanudar Download Descargar Stop/restart/start the download Detener/reiniciar/iniciar la descarga Remove Eliminar Remove model from filesystem Eliminar modelo del sistema de archivos Install Instalar Install online model Instalar modelo en línea <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ADVERTENCIA: No recomendado para tu hardware. El modelo requiere más memoria (%1 GB) de la que tu sistema tiene disponible (%2).</strong></font> %1 GB %1 GB ? ? 
Describes an error that occurred when downloading Describe un error que ocurrió durante la descarga Error for incompatible hardware Error por hardware incompatible Download progressBar Barra de progreso de descarga Shows the progress made in the download Muestra el progreso realizado en la descarga Download speed Velocidad de descarga Download speed in bytes/kilobytes/megabytes per second Velocidad de descarga en bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Si se está calculando el hash del archivo Displayed when the file hash is being calculated Se muestra cuando se está calculando el hash del archivo enter $API_KEY ingrese $API_KEY File size Tamaño del archivo RAM required RAM requerida Parameters Parámetros Quant Cuantificación Type Tipo ERROR: $API_KEY is empty. ERROR: $API_KEY está vacía. ERROR: $BASE_URL is empty. ERROR: $BASE_URL está vacía. enter $BASE_URL ingrese $BASE_URL ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. enter $MODEL_NAME ingrese $MODEL_NAME AddRemoteModelView Various remote model providers that use network resources for inference. Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI provides access to advanced AI models, including GPT-4, supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. 
Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. ApplicationSettings Application Aplicación Network dialog Diálogo de red opt-in to share feedback/conversations optar por compartir comentarios/conversaciones Error dialog Diálogo de error Application Settings Configuración de la aplicación General General Theme Tema The application color scheme. El esquema de colores de la aplicación. Dark Oscuro Light Claro ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. ERROR: El sistema de actualización no pudo encontrar la Herramienta de Mantenimiento utilizada para buscar actualizaciones.<br><br>¿Instaló esta aplicación utilizando el instalador en línea? Si es así, el ejecutable de la Herramienta de Mantenimiento debería estar ubicado un directorio por encima de donde reside esta aplicación en su sistema de archivos.<br><br>Si no puede iniciarlo manualmente, me temo que tendrá que reinstalar la aplicación. LegacyDark Oscuro legado Font Size Tamaño de fuente The size of text in the application. El tamaño del texto en la aplicación. Device Dispositivo Small Pequeño Medium Mediano Large Grande Language and Locale Idioma y configuración regional The language and locale you wish to use. El idioma y la configuración regional que deseas usar. 
Default Model Modelo predeterminado The preferred model for new chats. Also used as the local server fallback. El modelo preferido para nuevos chats. También se utiliza como respaldo del servidor local. Suggestion Mode Modo de sugerencia Generate suggested follow-up questions at the end of responses. Generar preguntas de seguimiento sugeridas al final de las respuestas. When chatting with LocalDocs Al chatear con LocalDocs Whenever possible Siempre que sea posible Never Nunca Download Path Ruta de descarga Where to store local models and the LocalDocs database. Dónde almacenar los modelos locales y la base de datos de LocalDocs. Browse Explorar Choose where to save model files Elegir dónde guardar los archivos del modelo Enable Datalake Habilitar Datalake Send chats and feedback to the GPT4All Open-Source Datalake. Enviar chats y comentarios al Datalake de código abierto de GPT4All. Advanced Avanzado CPU Threads Hilos de CPU The number of CPU threads used for inference and embedding. El número de hilos de CPU utilizados para inferencia e incrustación. Enable System Tray The application will minimize to the system tray when the window is closed. Save Chat Context Guardar contexto del chat Save the chat model's state to disk for faster loading. WARNING: Uses ~2GB per chat. Guardar el estado del modelo de chat en el disco para una carga más rápida. ADVERTENCIA: Usa ~2GB por chat. Enable Local API Server Habilitar el servidor API local Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. Exponer un servidor compatible con OpenAI a localhost. ADVERTENCIA: Resulta en un mayor uso de recursos. API Server Port Puerto del servidor API The port to use for the local server. Requires restart. El puerto a utilizar para el servidor local. Requiere reinicio. Check For Updates Buscar actualizaciones Manually check for an update to GPT4All. Buscar manualmente una actualización para GPT4All. 
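The "Enable Local API Server" strings above describe an OpenAI-compatible endpoint exposed on localhost, with a configurable port. A minimal client sketch in Python, using only the standard library; the port 4891 and the model name are assumptions here (use whatever is set under "API Server Port" and whichever model is installed):

```python
import json
from urllib import request

def build_chat_request(base_url: str, model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completions request for the local server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

# Port 4891 and the model name are assumptions for illustration.
req = build_chat_request("http://localhost:4891", "Llama 3 8B Instruct", "Hello!")
# request.urlopen(req) would send it once the local server is enabled in settings.
```

The request itself is only constructed, not sent, since the server must first be enabled in the application settings (and, as the text notes, enabling it increases resource usage).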
Updates Actualizaciones System Locale Configuración regional del sistema The compute device used for text generation. El dispositivo de cómputo utilizado para la generación de texto. Application default Predeterminado de la aplicación Chat New Chat Nuevo chat Server Chat Chat del servidor ChatAPIWorker ERROR: Network error occurred while connecting to the API server ERROR: Ocurrió un error de red al conectar con el servidor API ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished obtuvo Error HTTP %1 %2 ChatCollapsibleItem Analysis encountered error Thinking Analyzing Thought for %1 %2 second seconds Analyzed ChatDrawer Drawer Cajón Main navigation drawer Cajón de navegación principal + New Chat + Nuevo chat Create a new chat Crear un nuevo chat Select the current chat or edit the chat when in edit mode Seleccionar el chat actual o editar el chat cuando esté en modo de edición Edit chat name Editar nombre del chat Save chat name Guardar nombre del chat Delete chat Eliminar chat Confirm chat deletion Confirmar eliminación del chat Cancel chat deletion Cancelar eliminación del chat List of chats Lista de chats List of chats in the drawer dialog Lista de chats en el diálogo del cajón ChatItemView GPT4All GPT4All You response stopped ... respuesta detenida ... retrieving localdocs: %1 ... recuperando documentos locales: %1 ... searching localdocs: %1 ... buscando en documentos locales: %1 ... processing ... procesando ... generating response ... generando respuesta ... generating questions ... generando preguntas ... generating toolcall ... Copy Copiar Copy Message Copiar mensaje Disable markdown Desactivar markdown Enable markdown Activar markdown %n Source(s) %n Fuente %n Fuentes LocalDocs Edit this message? All following messages will be permanently erased. Redo this response? Cannot edit chat without a loaded model. Cannot edit chat while the model is generating. Edit Cannot redo response without a loaded model. 
Cannot redo response while the model is generating. Redo Like response Dislike response Suggested follow-ups Seguimientos sugeridos ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. ChatListModel TODAY HOY THIS WEEK ESTA SEMANA THIS MONTH ESTE MES LAST SIX MONTHS ÚLTIMOS SEIS MESES THIS YEAR ESTE AÑO LAST YEAR AÑO PASADO ChatTextItem Copy Copiar Copy Message Copiar mensaje Disable markdown Desactivar markdown Enable markdown Activar markdown ChatView <h3>Warning</h3><p>%1</p> <h3>Advertencia</h3><p>%1</p> Switch model dialog Diálogo para cambiar de modelo Warn the user if they switch models, then context will be erased Advertir al usuario si cambia de modelo, entonces se borrará el contexto Conversation copied to clipboard. Conversación copiada al portapapeles. Code copied to clipboard. Código copiado al portapapeles. The entire chat will be erased. Chat panel Panel de chat Chat panel with options Panel de chat con opciones Reload the currently loaded model Recargar el modelo actualmente cargado Eject the currently loaded model Expulsar el modelo actualmente cargado No model installed. No hay modelo instalado. Model loading error. Error al cargar el modelo. Waiting for model... Esperando al modelo... Switching context... Cambiando contexto... Choose a model... Elige un modelo... Not found: %1 No encontrado: %1 The top item is the current model El elemento superior es el modelo actual LocalDocs DocumentosLocales Add documents Agregar documentos add collections of documents to the chat agregar colecciones de documentos al chat Load the default model Cargar el modelo predeterminado Loads the default model which can be changed in settings Carga el modelo predeterminado que se puede cambiar en la configuración No Model Installed No hay modelo instalado Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. 
No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>Se encontró un error al cargar el modelo:</h3><br><i>"%1"</i><br><br>Los fallos en la carga de modelos pueden ocurrir por varias razones, pero las causas más comunes incluyen un formato de archivo incorrecto, una descarga incompleta o corrupta, un tipo de archivo equivocado, RAM del sistema insuficiente o un tipo de modelo incompatible. 
Aquí hay algunas sugerencias para resolver el problema:<br><ul><li>Asegúrate de que el archivo del modelo tenga un formato y tipo compatibles<li>Verifica que el archivo del modelo esté completo en la carpeta de descargas<li>Puedes encontrar la carpeta de descargas en el diálogo de configuración<li>Si has cargado el modelo manualmente, asegúrate de que el archivo no esté corrupto verificando el md5sum<li>Lee más sobre qué modelos son compatibles en nuestra <a href="https://docs.gpt4all.io/">documentación</a> para la interfaz gráfica<li>Visita nuestro <a href="https://discord.gg/4M2QFmTt2k">canal de discord</a> para obtener ayuda Erase conversation? Changing the model will erase the current conversation. Install a Model Instalar un modelo Shows the add model view Muestra la vista de agregar modelo Conversation with the model Conversación con el modelo prompt / response pairs from the conversation pares de pregunta / respuesta de la conversación GPT4All GPT4All You response stopped ... respuesta detenida ... processing ... procesando ... generating response ... generando respuesta ... generating questions ... generando preguntas ... 
Copy Copiar Copy Message Copiar mensaje Disable markdown Desactivar markdown Enable markdown Activar markdown Thumbs up Me gusta Gives a thumbs up to the response Da un me gusta a la respuesta Thumbs down No me gusta Opens thumbs down dialog Abre el diálogo de no me gusta Suggested follow-ups Seguimientos sugeridos Erase and reset chat session Borrar y reiniciar sesión de chat Copy chat session to clipboard Copiar sesión de chat al portapapeles Redo last chat response Rehacer última respuesta del chat Add media Agregar medios Adds media to the prompt Agrega medios al mensaje Stop generating Detener generación Stop the current response generation Detener la generación de la respuesta actual Attach Adjuntar Single File Archivo único Reloads the model Recarga el modelo Reload · %1 Recargar · %1 Loading · %1 Cargando · %1 Load · %1 (default) → Cargar · %1 (predeterminado) → retrieving localdocs: %1 ... recuperando documentos locales: %1 ... searching localdocs: %1 ... buscando en documentos locales: %1 ... %n Source(s) %n Fuente %n Fuentes Send a message... Enviar un mensaje... Load a model to continue... Carga un modelo para continuar... Send messages/prompts to the model Enviar mensajes/indicaciones al modelo Cut Cortar Paste Pegar Select All Seleccionar todo Send message Enviar mensaje Sends the message/prompt contained in textfield to the model Envía el mensaje/indicación contenido en el campo de texto al modelo GPT4All requires that you install at least one model to get started GPT4All requiere que instale al menos un modelo para comenzar restoring from text ... restaurando desde texto ... 
CodeInterpreter Code Interpreter compute javascript code using console.log as output CollectionsDrawer Warning: searching collections while indexing can return incomplete results Advertencia: buscar en colecciones mientras se indexan puede devolver resultados incompletos %n file(s) %n archivo %n archivos %n word(s) %n palabra %n palabras Updating Actualizando + Add Docs + Agregar documentos Select a collection to make it available to the chat model. Seleccione una colección para hacerla disponible al modelo de chat. ConfirmationDialog OK Cancel Cancelar Download Model "%1" is installed successfully. El modelo "%1" se ha instalado correctamente. ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. ERROR: $API_KEY is empty. ERROR: $API_KEY está vacía. ERROR: $BASE_URL is invalid. ERROR: $BASE_URL no es válida. ERROR: Model "%1 (%2)" is in conflict. ERROR: El modelo "%1 (%2)" está en conflicto. Model "%1 (%2)" is installed successfully. El modelo "%1 (%2)" se ha instalado correctamente. Model "%1" is removed. El modelo "%1" ha sido eliminado. 
HomeView Welcome to GPT4All Bienvenido a GPT4All The privacy-first LLM chat application La aplicación de chat LLM que prioriza la privacidad Start chatting Comenzar a chatear Start Chatting Iniciar chat Chat with any LLM Chatear con cualquier LLM LocalDocs DocumentosLocales Chat with your local files Chatear con tus archivos locales Find Models Buscar modelos Explore and download models Explorar y descargar modelos Latest news Últimas noticias Latest news from GPT4All Últimas noticias de GPT4All Release Notes Notas de la versión Documentation Documentación Discord Discord X (Twitter) X (Twitter) Github Github nomic.ai nomic.ai Subscribe to Newsletter Suscribirse al boletín LocalDocsSettings LocalDocs DocumentosLocales LocalDocs Settings Configuración de DocumentosLocales Indexing Indexación Allowed File Extensions Extensiones de archivo permitidas Embedding Incrustación Use Nomic Embed API Usar API de incrustación Nomic Nomic API Key Clave API de Nomic API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. Clave API para usar con Nomic Embed. Obtén una en la <a href="https://atlas.nomic.ai/cli-login">página de claves API</a> de Atlas. Requiere reinicio. Embeddings Device Dispositivo de incrustaciones The compute device used for embeddings. Requires restart. El dispositivo de cómputo utilizado para las incrustaciones. Requiere reinicio. Comma-separated list. LocalDocs will only attempt to process files with these extensions. Lista separada por comas. LocalDocs solo intentará procesar archivos con estas extensiones. Embed documents using the fast Nomic API instead of a private local model. Requires restart. Incrustar documentos usando la API rápida de Nomic en lugar de un modelo local privado. Requiere reinicio. Application default Predeterminado de la aplicación Display Visualización Show Sources Mostrar fuentes Display the sources used for each response. 
Mostrar las fuentes utilizadas para cada respuesta. Advanced Avanzado Warning: Advanced usage only. Advertencia: Solo para uso avanzado. Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. Valores demasiado grandes pueden causar fallos en localdocs, respuestas extremadamente lentas o falta de respuesta. En términos generales, los {N caracteres x N fragmentos} se añaden a la ventana de contexto del modelo. Más información <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">aquí</a>. Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. Número de caracteres por fragmento de documento. Números más grandes aumentan la probabilidad de respuestas verídicas, pero también resultan en una generación más lenta. Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation. Máximo de N mejores coincidencias de fragmentos de documentos recuperados para añadir al contexto del prompt. Números más grandes aumentan la probabilidad de respuestas verídicas, pero también resultan en una generación más lenta. 
Document snippet size (characters) Tamaño del fragmento de documento (caracteres) Max document snippets per prompt Máximo de fragmentos de documento por indicación LocalDocsView LocalDocs DocumentosLocales Chat with your local files Chatea con tus archivos locales + Add Collection + Agregar colección No Collections Installed No hay colecciones instaladas Install a collection of local documents to get started using this feature Instala una colección de documentos locales para comenzar a usar esta función + Add Doc Collection + Agregar colección de documentos Shows the add model view Muestra la vista de agregar modelo Indexing progressBar Barra de progreso de indexación Shows the progress made in the indexing Muestra el progreso realizado en la indexación ERROR ERROR INDEXING INDEXANDO EMBEDDING INCRUSTANDO REQUIRES UPDATE REQUIERE ACTUALIZACIÓN READY LISTO INSTALLING INSTALANDO Indexing in progress Indexación en progreso Embedding in progress Incrustación en progreso This collection requires an update after version change Esta colección requiere una actualización después del cambio de versión Automatically reindexes upon changes to the folder Reindexación automática al cambiar la carpeta Installation in progress Instalación en progreso % % %n file(s) %n archivo %n archivos %n word(s) %n palabra %n palabra(s) Remove Eliminar Rebuild Reconstruir Reindex this folder from scratch. This is slow and usually not needed. Reindexar esta carpeta desde cero. Esto es lento y generalmente no es necesario. Update Actualizar Update the collection to the new version. This is a slow operation. Actualizar la colección a la nueva versión. Esta es una operación lenta. 
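As the warning above notes, roughly {N chars x N snippets} of retrieved LocalDocs text is added to the model's context window. A back-of-the-envelope sketch of that budget, assuming ~4 characters per token (a common rule of thumb, not a GPT4All constant):

```python
def localdocs_context_cost(snippet_chars: int, max_snippets: int,
                           chars_per_token: float = 4.0) -> int:
    """Approximate tokens that LocalDocs snippets may consume in the context window."""
    return int(snippet_chars * max_snippets / chars_per_token)

# e.g. 512-char snippets, 3 snippets per prompt:
tokens = localdocs_context_cost(512, 3)  # roughly 384 tokens of the context window
```

This is why large values for the two settings above can crowd out the conversation itself once the model's Context Length is exceeded.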
<h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. <h3>ERROR: No se puede acceder a la base de datos LocalDocs o no es válida.</h3><br><i>Nota: Necesitará reiniciar después de intentar cualquiera de las siguientes soluciones sugeridas.</i><br><ul><li>Asegúrese de que la carpeta establecida como <b>Ruta de Descarga</b> exista en el sistema de archivos.</li><li>Verifique la propiedad y los permisos de lectura y escritura de la <b>Ruta de Descarga</b>.</li><li>Si hay un archivo <b>localdocs_v2.db</b>, verifique también su propiedad y permisos de lectura/escritura.</li></ul><br>Si el problema persiste y hay archivos 'localdocs_v*.db' presentes, como último recurso puede<br>intentar hacer una copia de seguridad y eliminarlos. Sin embargo, tendrá que recrear sus colecciones. 
ModelList <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>Requiere clave API personal de OpenAI.</li><li>ADVERTENCIA: ¡Enviará sus chats a OpenAI!</li><li>Su clave API se almacenará en el disco</li><li>Solo se usará para comunicarse con OpenAI</li><li>Puede solicitar una clave API <a href="https://platform.openai.com/account/api-keys">aquí.</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>Modelo ChatGPT GPT-3.5 Turbo de OpenAI</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. <br><br><i>* Aunque pagues a OpenAI por ChatGPT-4, esto no garantiza el acceso a la clave API. Contacta a OpenAI para más información. <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Modelo ChatGPT GPT-4 de OpenAI</strong><br> %1 %2 <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>Requiere una clave API personal de Mistral.</li><li>ADVERTENCIA: ¡Enviará tus chats a Mistral!</li><li>Tu clave API se almacenará en el disco</li><li>Solo se usará para comunicarse con Mistral</li><li>Puedes solicitar una clave API <a href="https://console.mistral.ai/user/api-keys">aquí</a>.</li> <strong>Mistral Tiny model</strong><br> %1 <strong>Modelo Mistral Tiny</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Modelo Mistral Small</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <strong>Modelo Mistral Medium</strong><br> %1 <strong>Created by 
%1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>Creado por %1.</strong><br><ul><li>Publicado el %2.<li>Este modelo tiene %3 me gusta.<li>Este modelo tiene %4 descargas.<li>Más información puede encontrarse <a href="https://huggingface.co/%5">aquí.</a></ul> %1 (%2) %1 (%2) cannot open "%1": %2 no se puede abrir "%1": %2 cannot create "%1": %2 no se puede crear "%1": %2 <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>Modelo de API compatible con OpenAI</strong><br><ul><li>Clave API: %1</li><li>URL base: %2</li><li>Nombre del modelo: %3</li></ul> <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>Requiere una clave API personal y la URL base de la API.</li><li>ADVERTENCIA: ¡Enviará sus chats al servidor de API compatible con OpenAI que especificó!</li><li>Su clave API se almacenará en el disco</li><li>Solo se utilizará para comunicarse con el servidor de API compatible con OpenAI</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>Conectar al servidor de API compatible con OpenAI</strong><br> %1 ModelSettings Model Modelo %1 system message? Clear Reset The system message will be %1. removed reset to the default %1 chat template? The chat template will be %1. erased Model Settings Configuración del modelo Clone Clonar Remove Eliminar Name Nombre Model File Archivo del modelo System Prompt Indicación del sistema Prefixed at the beginning of every conversation. Must contain the appropriate framing tokens. Prefijado al inicio de cada conversación. Debe contener los tokens de encuadre apropiados. 
Prompt Template Plantilla de indicación The template that wraps every prompt. La plantilla que envuelve cada indicación. Must contain the string "%1" to be replaced with the user's input. Debe contener la cadena "%1" para ser reemplazada con la entrada del usuario. Chat Name Prompt Indicación para el nombre del chat Prompt used to automatically generate chat names. Indicación utilizada para generar automáticamente nombres de chat. Suggested FollowUp Prompt Indicación de seguimiento sugerida Prompt used to generate suggested follow-up questions. Indicación utilizada para generar preguntas de seguimiento sugeridas. Context Length Longitud del contexto Number of input and output tokens the model sees. Número de tokens de entrada y salida que el modelo ve. Temperature Temperatura Randomness of model output. Higher -> more variation. Aleatoriedad de la salida del modelo. Mayor -> más variación. Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. La temperatura aumenta las probabilidades de elegir tokens menos probables. NOTA: Una temperatura más alta da resultados más creativos pero menos predecibles. Top-P Top-P Nucleus Sampling factor. Lower -> more predictable. Factor de muestreo de núcleo. Menor -> más predecible. Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. Solo se pueden elegir los tokens más probables hasta una probabilidad total de top_p. NOTA: Evita elegir tokens altamente improbables. Min-P Min-P Minimum token probability. Higher -> more predictable. Probabilidad mínima del token. Mayor -> más predecible. Sets the minimum relative probability for a token to be considered. Establece la probabilidad relativa mínima para que un token sea considerado. Top-K Top-K Size of selection pool for tokens. Tamaño del grupo de selección para tokens. Only the top K most likely tokens will be chosen from. 
Solo se elegirán los K tokens más probables. Max Length Longitud máxima Maximum response length, in tokens. Longitud máxima de respuesta, en tokens. Prompt Batch Size Tamaño del lote de indicaciones The batch size used for prompt processing. El tamaño del lote utilizado para el procesamiento de indicaciones. Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. Cantidad de tokens de prompt a procesar de una vez. NOTA: Valores más altos pueden acelerar la lectura de prompts, pero usarán más RAM. Repeat Penalty Penalización por repetición Repetition penalty factor. Set to 1 to disable. Factor de penalización por repetición. Establecer a 1 para desactivar. Repeat Penalty Tokens Tokens de penalización por repetición Number of previous tokens used for penalty. Número de tokens anteriores utilizados para la penalización. GPU Layers Capas de GPU Number of model layers to load into VRAM. Número de capas del modelo a cargar en la VRAM. Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. Máximo de tokens combinados de pregunta/respuesta antes de que se pierda información. Usar más contexto del que el modelo fue entrenado producirá resultados deficientes. NOTA: No surtirá efecto hasta que recargue el modelo. System Message A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Chat Template This Jinja template turns the chat into input for the model. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. 
<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. Cuántas capas del modelo cargar en la VRAM. Disminuya esto si GPT4All se queda sin VRAM al cargar este modelo. Valores más bajos aumentan la carga de la CPU y el uso de RAM, y hacen que la inferencia sea más lenta. NOTA: No surte efecto hasta que recargue el modelo. ModelsView No Models Installed No hay modelos instalados Install a model to get started using GPT4All Instala un modelo para empezar a usar GPT4All + Add Model + Agregar modelo Shows the add model view Muestra la vista de agregar modelo Installed Models Modelos instalados Locally installed chat models Modelos de chat instalados localmente Model file Archivo del modelo Model file to be downloaded Archivo del modelo a descargar Description Descripción File description Descripción del archivo Cancel Cancelar Resume Reanudar Stop/restart/start the download Detener/reiniciar/iniciar la descarga Remove Eliminar Remove model from filesystem Eliminar modelo del sistema de archivos Install Instalar Install online model Instalar modelo en línea <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ADVERTENCIA: No recomendado para su hardware. El modelo requiere más memoria (%1 GB) de la que su sistema tiene disponible (%2).</strong></font> %1 GB %1 GB ? ? 
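The "Chat Template" setting described above is a Jinja template that turns the list of chat messages into a single input string for the model. A hand-rolled sketch of that transformation in plain Python (not actual Jinja), using ChatML-style markers as an example format; the marker strings and the `add_generation_prompt` flag mirror conventions used by many chat templates, not a format GPT4All requires:

```python
def render_chatml(messages: list, add_generation_prompt: bool = True) -> str:
    """Flatten role-tagged messages into one prompt string (ChatML-style example)."""
    out = []
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        out.append("<|im_start|>assistant\n")  # cue the model to answer next
    return "".join(out)

prompt = render_chatml([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi!"},
])
```

A template in the wrong format, or a blank one, leaves the model without this framing, which is why the settings above refuse a blank or non-Jinja template.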
Describes an error that occurred when downloading Describe un error que ocurrió durante la descarga Error for incompatible hardware Error por hardware incompatible Download progressBar Barra de progreso de descarga Shows the progress made in the download Muestra el progreso realizado en la descarga Download speed Velocidad de descarga Download speed in bytes/kilobytes/megabytes per second Velocidad de descarga en bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Si se está calculando el hash del archivo Busy indicator Indicador de ocupado Displayed when the file hash is being calculated Se muestra cuando se está calculando el hash del archivo enter $API_KEY ingrese $API_KEY File size Tamaño del archivo RAM required RAM requerida Parameters Parámetros Quant Cuantificación Type Tipo ERROR: $API_KEY is empty. ERROR: $API_KEY está vacía. ERROR: $BASE_URL is empty. ERROR: $BASE_URL está vacía. enter $BASE_URL ingrese $BASE_URL ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. enter $MODEL_NAME ingrese $MODEL_NAME MyFancyLink Fancy link Enlace elegante A stylized link Un enlace estilizado MyFileDialog Please choose a file Por favor elige un archivo MyFolderDialog Please choose a directory Por favor, elija un directorio MySettingsLabel Clear Reset MySettingsStack Please choose a directory Por favor, elija un directorio MySettingsTab Restore defaults? This page of settings will be reset to the defaults. Restore Defaults Restaurar valores predeterminados Restores settings dialog to a default state Restaura el diálogo de configuración a su estado predeterminado NetworkDialog Contribute data to the GPT4All Opensource Datalake. Contribuir datos al Datalake de código abierto de GPT4All. By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. 
When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Al habilitar esta función, podrás participar en el proceso democrático de entrenar un modelo de lenguaje grande contribuyendo con datos para futuras mejoras del modelo. Cuando un modelo GPT4All te responda y hayas aceptado participar, tu conversación se enviará al Datalake de Código Abierto de GPT4All. Además, podrás indicar si te gusta o no su respuesta. Si no te gusta una respuesta, puedes sugerir una alternativa. Estos datos se recopilarán y agregarán en el Datalake de GPT4All. NOTA: Al activar esta función, estarás enviando tus datos al Datalake de Código Abierto de GPT4All. No debes esperar privacidad en el chat cuando esta función esté habilitada. Sin embargo, puedes esperar una atribución opcional si lo deseas. Tus datos de chat estarán disponibles abiertamente para que cualquiera los descargue y serán utilizados por Nomic AI para mejorar futuros modelos de GPT4All. Nomic AI conservará toda la información de atribución adjunta a tus datos y se te acreditará como contribuyente en cualquier lanzamiento de modelo GPT4All que utilice tus datos.
Terms for opt-in Términos para optar por participar Describes what will happen when you opt-in Describe lo que sucederá cuando opte por participar Please provide a name for attribution (optional) Por favor, proporcione un nombre para la atribución (opcional) Attribution (optional) Atribución (opcional) Provide attribution Proporcionar atribución Enable Habilitar Enable opt-in Habilitar participación Cancel Cancelar Cancel opt-in Cancelar participación NewVersionDialog New version is available Nueva versión disponible Update Actualizar Update to new version Actualizar a nueva versión PopupDialog Reveals a shortlived help balloon Muestra un globo de ayuda de corta duración Busy indicator Indicador de ocupado Displayed when the popup is showing busy Se muestra cuando la ventana emergente está ocupada RemoteModelCard API Key ERROR: $API_KEY is empty. enter $API_KEY ingrese $API_KEY Whether the file hash is being calculated Si se está calculando el hash del archivo Base Url ERROR: $BASE_URL is empty. enter $BASE_URL ingrese $BASE_URL Model Name ERROR: $MODEL_NAME is empty. ERROR: $MODEL_NAME está vacío. enter $MODEL_NAME ingrese $MODEL_NAME Models Modelos Install Instalar Install remote model SettingsView Settings Configuración Contains various application settings Contiene varias configuraciones de la aplicación Application Aplicación Model Modelo LocalDocs DocumentosLocales StartupDialog Welcome! ¡Bienvenido! 
### Release Notes %1<br/> ### Contributors %2 ### Notas de la versión %1<br/> ### Colaboradores %2 Release notes Notas de la versión Release notes for this version Notas de la versión para esta versión Terms for opt-in Términos para aceptar Describes what will happen when you opt-in Describe lo que sucederá cuando acepte Opt-in to anonymous usage analytics used to improve GPT4All Opt-in for anonymous usage statistics Aceptar estadísticas de uso anónimas Yes Allow opt-in for anonymous usage statistics Permitir aceptación de estadísticas de uso anónimas No No Opt-out for anonymous usage statistics Rechazar estadísticas de uso anónimas Allow opt-out for anonymous usage statistics Permitir rechazo de estadísticas de uso anónimas Opt-in to anonymous sharing of chats to the GPT4All Datalake Opt-in for network Aceptar para la red Allow opt-in for network Permitir aceptación para la red Allow opt-in anonymous sharing of chats to the GPT4All Datalake Permitir compartir anónimamente los chats con el Datalake de GPT4All Opt-out for network Rechazar para la red Allow opt-out anonymous sharing of chats to the GPT4All Datalake Permitir rechazar el compartir anónimo de chats con el Datalake de GPT4All ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. 
You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### Consentimiento para análisis de uso anónimo y lago de datos Al habilitar estas funciones, podrá participar en el proceso democrático de entrenar un modelo de lenguaje grande contribuyendo con datos para futuras mejoras del modelo. Cuando un modelo GPT4All le responda y usted haya dado su consentimiento, su conversación se enviará al Lago de Datos de Código Abierto de GPT4All. Además, puede indicar si le gusta o no su respuesta. Si no le gusta una respuesta, puede sugerir una respuesta alternativa. Estos datos se recopilarán y agregarán en el Lago de Datos de GPT4All. NOTA: Al activar esta función, estará enviando sus datos al Lago de Datos de Código Abierto de GPT4All. No debe esperar privacidad en el chat cuando esta función esté habilitada. Sin embargo, puede esperar una atribución opcional si lo desea. Sus datos de chat estarán disponibles abiertamente para que cualquiera los descargue y serán utilizados por Nomic AI para mejorar futuros modelos de GPT4All. Nomic AI conservará toda la información de atribución adjunta a sus datos y se le acreditará como contribuyente en cualquier lanzamiento de modelo GPT4All que utilice sus datos. SwitchModelDialog <b>Warning:</b> changing the model will erase the current conversation. Do you wish to continue? <b>Advertencia:</b> cambiar el modelo borrará la conversación actual. ¿Deseas continuar? Continue Continuar Continue with model loading Continuar con la carga del modelo Cancel Cancelar ThumbsDownDialog Please edit the text below to provide a better response. (optional) Por favor, edite el texto a continuación para proporcionar una mejor respuesta.
(opcional) Please provide a better response... Por favor, proporcione una mejor respuesta... Submit Enviar Submits the user's response Envía la respuesta del usuario Cancel Cancelar Closes the response dialog Cierra el diálogo de respuesta main GPT4All v%1 GPT4All v%1 Restore Quit <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Se encontró un error al iniciar:</h3><br><i>"Se detectó hardware incompatible."</i><br><br>Desafortunadamente, tu CPU no cumple con los requisitos mínimos para ejecutar este programa. En particular, no soporta instrucciones AVX, las cuales este programa requiere para ejecutar con éxito un modelo de lenguaje grande moderno. La única solución en este momento es actualizar tu hardware a una CPU más moderna.<br><br>Consulta aquí para más información: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. 
<h3>Se encontró un error al iniciar:</h3><br><i>"No se puede acceder al archivo de configuración."</i><br><br>Desafortunadamente, algo está impidiendo que el programa acceda al archivo de configuración. Esto podría ser causado por permisos incorrectos en el directorio de configuración local de la aplicación donde se encuentra el archivo de configuración. Visita nuestro <a href="https://discord.gg/4M2QFmTt2k">canal de Discord</a> para obtener ayuda. Connection to datalake failed. La conexión al datalake falló. Saving chats. Guardando chats. Network dialog Diálogo de red opt-in to share feedback/conversations optar por compartir comentarios/conversaciones Home view Vista de inicio Home view of application Vista de inicio de la aplicación Home Inicio Chat view Vista de chat Chat view to interact with models Vista de chat para interactuar con modelos Chats Chats Models Modelos Models view for installed models Vista de modelos para modelos instalados LocalDocs Docs Locales LocalDocs view to configure and use local docs Vista de DocumentosLocales para configurar y usar documentos locales Settings Config. Settings view for application configuration Vista de configuración para la configuración de la aplicación The datalake is enabled El datalake está habilitado Using a network model Usando un modelo de red Server mode is enabled El modo servidor está habilitado Installed models Modelos instalados View of installed models Vista de modelos instalados ================================================ FILE: gpt4all-chat/translations/gpt4all_it_IT.ts ================================================ AddCollectionView ← Existing Collections ← Raccolte esistenti Add Document Collection Aggiungi raccolta documenti Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. Aggiungi una cartella contenente file di testo semplice, PDF o Markdown. Configura estensioni aggiuntive in Settaggi. Name Nome Collection name... 
Nome della raccolta... Name of the collection to add (Required) Nome della raccolta da aggiungere (Obbligatorio) Folder Cartella Folder path... Percorso cartella... Folder path to documents (Required) Percorso della cartella dei documenti (richiesto) Browse Esplora Create Collection Crea raccolta AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. Questi modelli sono stati specificamente configurati per l'uso in GPT4All. I primi modelli dell'elenco sono noti per funzionare meglio, ma dovresti utilizzare solo modelli che possano rientrare nella memoria disponibile. Network error: could not retrieve %1 Errore di rete: impossibile recuperare %1 Busy indicator Indicatore di occupato Displayed when the models request is ongoing Visualizzato quando la richiesta dei modelli è in corso All Tutti Reasoning Ragionamento Model file File del modello Model file to be downloaded File del modello da scaricare Description Descrizione File description Descrizione del file Cancel Annulla Resume Riprendi Download Scarica Stop/restart/start the download Arresta/riavvia/avvia il download Remove Rimuovi Remove model from filesystem Rimuovi il modello dal sistema dei file Install Installa Install online model Installa il modello online <strong><font size="1"><a href="#error">Error</a></font></strong> <strong><font size="1"><a href="#error">Errore</a></font></strong> Describes an error that occurred when downloading Descrive un errore che si è verificato durante lo scaricamento <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</font></strong> <strong><font size="2">AVVISO: non consigliato per il tuo hardware.
Il modello richiede più memoria (%1 GB) di quella disponibile nel sistema (%2).</font></strong> Error for incompatible hardware Errore per hardware incompatibile Download progressBar Barra di avanzamento dello scaricamento Shows the progress made in the download Mostra lo stato di avanzamento dello scaricamento Download speed Velocità di scaricamento Download speed in bytes/kilobytes/megabytes per second Velocità di scaricamento in byte/kilobyte/megabyte al secondo Calculating... Calcolo in corso... Whether the file hash is being calculated Se viene calcolato l'hash del file Displayed when the file hash is being calculated Visualizzato durante il calcolo dell'hash del file ERROR: $API_KEY is empty. ERRORE: $API_KEY è vuoto. enter $API_KEY Inserire $API_KEY ERROR: $BASE_URL is empty. ERRORE: $BASE_URL è vuoto. enter $BASE_URL inserisci $BASE_URL ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. enter $MODEL_NAME inserisci $MODEL_NAME File size Dimensione del file RAM required RAM richiesta %1 GB %1 GB ? ? Parameters Parametri Quant Quant Type Tipo AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Usa la ricerca per trovare e scaricare modelli da HuggingFace. NON C'È ALCUNA GARANZIA che funzioneranno. Molti richiederanno configurazioni aggiuntive prima di poter essere utilizzati. Discover and download models by keyword search... Scopri e scarica i modelli tramite ricerca per parole chiave...
Text field for discovering and filtering downloadable models Campo di testo per scoprire e filtrare i modelli scaricabili Searching · %1 Ricerca · %1 Initiate model discovery and filtering Avvia rilevamento e filtraggio dei modelli Triggers discovery and filtering of models Attiva la scoperta e il filtraggio dei modelli Default Predefinito Likes Mi piace Downloads Scaricamenti Recent Recenti Sort by: %1 Ordina per: %1 Asc Asc Desc Disc Sort dir: %1 Direzione ordinamento: %1 None Niente Limit: %1 Limite: %1 Model file File del modello Model file to be downloaded File del modello da scaricare Description Descrizione File description Descrizione del file Cancel Annulla Resume Riprendi Download Scarica Stop/restart/start the download Arresta/riavvia/avvia il download Remove Rimuovi Remove model from filesystem Rimuovi il modello dal sistema dei file Install Installa Install online model Installa il modello online <strong><font size="1"><a href="#error">Error</a></font></strong> <strong><font size="1"><a href="#error">Errore</a></font></strong> Describes an error that occurred when downloading Descrive un errore che si è verificato durante lo scaricamento <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</font></strong> <strong><font size="2">AVVISO: non consigliato per il tuo hardware. Il modello richiede più memoria (%1 GB) di quella disponibile nel sistema (%2).</font></strong> Error for incompatible hardware Errore per hardware incompatibile Download progressBar Barra di avanzamento dello scaricamento Shows the progress made in the download Mostra lo stato di avanzamento dello scaricamento Download speed Velocità di scaricamento Download speed in bytes/kilobytes/megabytes per second Velocità di scaricamento in byte/kilobyte/megabyte al secondo Calculating... Calcolo in corso...
Whether the file hash is being calculated Se viene calcolato l'hash del file Busy indicator Indicatore di occupato Displayed when the file hash is being calculated Visualizzato durante il calcolo dell'hash del file ERROR: $API_KEY is empty. ERRORE: $API_KEY è vuoto. enter $API_KEY Inserire $API_KEY ERROR: $BASE_URL is empty. ERRORE: $BASE_URL è vuoto. enter $BASE_URL inserisci $BASE_URL ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. enter $MODEL_NAME inserisci $MODEL_NAME File size Dimensione del file Quant Quant Type Tipo AddModelView ← Existing Models ← Modelli esistenti Explore Models Esplora modelli GPT4All GPT4All Remote Providers Fornitori Remoti HuggingFace HuggingFace Discover and download models by keyword search... Scopri e scarica i modelli tramite ricerca per parole chiave... Text field for discovering and filtering downloadable models Campo di testo per scoprire e filtrare i modelli scaricabili Initiate model discovery and filtering Avvia rilevamento e filtraggio dei modelli Triggers discovery and filtering of models Attiva la scoperta e il filtraggio dei modelli Default Predefinito Likes Mi piace Downloads Scaricamenti Recent Recenti Asc Asc Desc Disc None Niente Searching · %1 Ricerca · %1 Sort by: %1 Ordina per: %1 Sort dir: %1 Direzione ordinamento: %1 Limit: %1 Limite: %1 Network error: could not retrieve %1 Errore di rete: impossibile recuperare %1 Busy indicator Indicatore di occupato Displayed when the models request is ongoing Visualizzato quando la richiesta dei modelli è in corso Model file File del modello Model file to be downloaded File del modello da scaricare Description Descrizione File description Descrizione del file Cancel Annulla Resume Riprendi Download Scarica Stop/restart/start the download Arresta/riavvia/avvia il download Remove Rimuovi Remove model from filesystem Rimuovi il modello dal sistema dei file Install Installa Install online model Installa il modello online ERROR: $API_KEY is empty.
ERRORE: $API_KEY è vuoto. ERROR: $BASE_URL is empty. ERRORE: $BASE_URL è vuoto. enter $BASE_URL inserisci $BASE_URL ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. enter $MODEL_NAME inserisci $MODEL_NAME Describes an error that occurred when downloading Descrive un errore che si è verificato durante lo scaricamento <strong><font size="1"><a href="#error">Error</a></font></strong> <strong><font size="1"><a href="#error">Errore</a></font></strong> Error for incompatible hardware Errore per hardware incompatibile Download progressBar Barra di avanzamento dello scaricamento Shows the progress made in the download Mostra lo stato di avanzamento dello scaricamento Download speed Velocità di scaricamento Download speed in bytes/kilobytes/megabytes per second Velocità di scaricamento in byte/kilobyte/megabyte al secondo Calculating... Calcolo in corso... Whether the file hash is being calculated Se viene calcolato l'hash del file Displayed when the file hash is being calculated Visualizzato durante il calcolo dell'hash del file enter $API_KEY Inserire $API_KEY File size Dimensione del file RAM required RAM richiesta Parameters Parametri Quant Quant Type Tipo AddRemoteModelView Various remote model providers that use network resources for inference. Vari fornitori di modelli remoti che utilizzano risorse di rete per l'inferenza. Groq Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> Groq offre un motore di inferenza AI ad alte prestazioni progettato per una latenza ridotta ed elaborazione efficiente.
Ottimizzata per applicazioni in tempo reale, la tecnologia di Groq è ideale per utenti che necessitano di risposte rapide da modelli linguistici di grandi dimensioni aperti e altri carichi di lavoro AI.<br><br>Ottieni la tua chiave API: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> OpenAI fornisce accesso a modelli AI avanzati, tra cui GPT-4, supportando un'ampia gamma di applicazioni, dall'AI conversazionale alla generazione di contenuti e al completamento del codice.<br><br>Ottieni la tua chiave API: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Mistral AI è specializzata in modelli linguistici open-weight efficienti, ottimizzati per diverse attività di elaborazione del linguaggio naturale. I loro modelli sono progettati per flessibilità e prestazioni, rendendoli una solida opzione per applicazioni che richiedono soluzioni AI scalabili.<br><br>Ottieni la tua chiave API: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom Personalizzato The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. 
L'opzione fornitore personalizzato consente agli utenti di connettere i propri modelli AI compatibili con OpenAI o servizi di inferenza di terze parti. Questa funzione è utile per organizzazioni con modelli proprietari o per chi utilizza fornitori AI di nicchia non elencati qui. ApplicationSettings Application Applicazione Network dialog Dialogo di rete opt-in to share feedback/conversations aderisci per condividere feedback/conversazioni Error dialog Dialogo d'errore Application Settings Settaggi applicazione General Generale Theme Tema The application color scheme. La combinazione di colori dell'applicazione. Dark Scuro Light Chiaro ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. ERRORE: il sistema di aggiornamento non è riuscito a trovare MaintenanceTool utilizzato per verificare la presenza di aggiornamenti!<br/><br/>Hai installato questa applicazione tramite l'installer online? In tal caso, l'eseguibile MaintenanceTool dovrebbe trovarsi una directory sopra quella in cui risiede questa applicazione sul tuo file system.<br/><br/>Se non riesci ad avviarlo manualmente, temo che dovrai reinstallarlo. LegacyDark Scuro Legacy Font Size Dimensioni del Font The size of text in the application. La dimensione del testo nell'applicazione. Small Piccolo Medium Medio Large Grande Language and Locale Lingua e settaggi locali The language and locale you wish to use. La lingua e i settaggi locali che vuoi utilizzare. System Locale Settaggi locali del sistema Device Dispositivo The compute device used for text generation. Il dispositivo di calcolo utilizzato per la generazione del testo. 
Application default Applicazione predefinita Default Model Modello predefinito The preferred model for new chats. Also used as the local server fallback. Il modello preferito per le nuove chat. Utilizzato anche come ripiego del server locale. Suggestion Mode Modalità suggerimento Generate suggested follow-up questions at the end of responses. Genera le domande di approfondimento suggerite alla fine delle risposte. When chatting with LocalDocs Quando chatti con LocalDocs Whenever possible Quando possibile Never Mai Download Path Percorso di scarico Where to store local models and the LocalDocs database. Dove archiviare i modelli locali e il database LocalDocs. Browse Esplora Choose where to save model files Scegli dove salvare i file del modello Enable Datalake Abilita Datalake Send chats and feedback to the GPT4All Open-Source Datalake. Invia chat e commenti al Datalake Open Source GPT4All. Advanced Avanzate CPU Threads Thread della CPU Thread CPU The number of CPU threads used for inference and embedding. Il numero di thread della CPU utilizzati per l'inferenza e l'incorporamento. Enable System Tray Abilita la barra delle applicazioni The application will minimize to the system tray when the window is closed. Quando la finestra viene chiusa, l'applicazione verrà ridotta a icona nella barra delle applicazioni. Enable Local API Server Abilita il server API locale Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. Esporre un server compatibile con OpenAI a localhost. ATTENZIONE: comporta un maggiore utilizzo delle risorse. API Server Port Porta del server API The port to use for the local server. Requires restart. La porta da utilizzare per il server locale. Richiede il riavvio. Check For Updates Controlla gli aggiornamenti Manually check for an update to GPT4All. Verifica manualmente l'aggiornamento di GPT4All.
Updates Aggiornamenti Chat New Chat Nuova Chat Server Chat Chat del server ChatAPIWorker ERROR: Network error occurred while connecting to the API server ERRORE: si è verificato un errore di rete durante la connessione al server API ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished ha ricevuto l'errore HTTP %1 %2 ChatCollapsibleItem Analysis encountered error Errore durante l'analisi Thinking Elaborazione Analyzing Analisi Thought for %1 %2 Elaborato per %1 %2 second secondo seconds secondi Analyzed Analisi completata ChatDrawer Drawer Cassetto Main navigation drawer Cassetto di navigazione principale + New Chat + Nuova Chat Create a new chat Crea una nuova chat Select the current chat or edit the chat when in edit mode Seleziona la chat corrente o modifica la chat in modalità modifica Edit chat name Modifica il nome della chat Save chat name Salva il nome della chat Delete chat Elimina chat Confirm chat deletion Conferma l'eliminazione della chat Cancel chat deletion Annulla l'eliminazione della chat List of chats Elenco delle chat List of chats in the drawer dialog Elenco delle chat nella finestra di dialogo del cassetto ChatItemView GPT4All You Tu response stopped ... risposta interrotta ... retrieving localdocs: %1 ... recupero documenti locali: %1 ... searching localdocs: %1 ... ricerca in documenti locali: %1 ... processing ... elaborazione ... generating response ... generazione risposta ... generating questions ... generazione domande ... generating toolcall ... generazione chiamata strumento ... Copy Copia Copy Message Copia messaggio Disable markdown Disabilita Markdown Enable markdown Abilita Markdown %n Source(s) %n Fonte %n Fonti LocalDocs Edit this message? Vuoi modificare questo messaggio? All following messages will be permanently erased. Tutti i messaggi successivi verranno cancellati definitivamente. Redo this response? Ripetere questa risposta? Cannot edit chat without a loaded model. 
Non è possibile modificare la chat senza un modello caricato. Cannot edit chat while the model is generating. Impossibile modificare la chat mentre il modello è in fase di generazione. Edit Modifica Cannot redo response without a loaded model. Non è possibile ripetere la risposta senza un modello caricato. Cannot redo response while the model is generating. Impossibile ripetere la risposta mentre il modello è in fase di generazione. Redo Ripeti Like response Mi piace la risposta Dislike response Non mi piace la risposta Suggested follow-ups Approfondimenti suggeriti ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. Il messaggio era troppo lungo e non è stato possibile elaborarlo (%1 > %2). Riprova con un messaggio più breve. ChatListModel TODAY OGGI THIS WEEK QUESTA SETTIMANA THIS MONTH QUESTO MESE LAST SIX MONTHS ULTIMI SEI MESI THIS YEAR QUEST'ANNO LAST YEAR L'ANNO SCORSO ChatTextItem Copy Copia Copy Message Copia messaggio Disable markdown Disabilita Markdown Enable markdown Abilita Markdown ChatView <h3>Warning</h3><p>%1</p> <h3>Avviso</h3><p>%1</p> Conversation copied to clipboard. Conversazione copiata negli appunti. Code copied to clipboard. Codice copiato negli appunti. The entire chat will be erased. L'intera chat verrà cancellata. Chat panel Pannello chat Chat panel with options Pannello chat con opzioni Reload the currently loaded model Ricarica il modello attualmente caricato Eject the currently loaded model Espelli il modello attualmente caricato No model installed. Nessun modello installato. Model loading error. Errore di caricamento del modello. Waiting for model... In attesa del modello... Switching context... Cambio contesto... Choose a model... Scegli un modello... 
Not found: %1 Non trovato: %1 The top item is the current model L'elemento in alto è il modello attuale LocalDocs Add documents Aggiungi documenti add collections of documents to the chat aggiungi raccolte di documenti alla chat Load the default model Carica il modello predefinito Loads the default model which can be changed in settings Carica il modello predefinito che può essere modificato nei settaggi No Model Installed Nessun modello installato GPT4All requires that you install at least one model to get started GPT4All richiede l'installazione di almeno un modello per iniziare Install a Model Installa un modello Shows the add model view Mostra la vista aggiungi modello Conversation with the model Conversazione con il modello prompt / response pairs from the conversation coppie prompt/risposta dalla conversazione Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. Il modello di prompt precedente deve essere <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">aggiornato</a> nei Settaggi. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. Nessun <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">modello di chat</a> configurato. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Il <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">modello di chat</a> non può essere vuoto. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. Il prompt del sistema precedente deve essere <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">aggiornato</a> nei Settaggi. 
Copy Copia Erase and reset chat session Cancella e ripristina la sessione di chat Copy chat session to clipboard Copia la sessione di chat negli appunti Add media Aggiungi contenuti multimediali Adds media to the prompt Aggiunge contenuti multimediali al prompt Stop generating Interrompi la generazione Stop the current response generation Arresta la generazione della risposta corrente Attach Allegare Single File File singolo Reloads the model Ricarica il modello <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>Si è verificato un errore durante il caricamento del modello:</h3><br><i>"%1"</i><br><br>Gli errori di caricamento del modello possono verificarsi per diversi motivi, ma le cause più comuni includono un formato di file non valido, un download incompleto o danneggiato, il tipo di file sbagliato, RAM di sistema insufficiente o un tipo di modello incompatibile. 
Ecco alcuni suggerimenti per risolvere il problema:<br><ul><li>Assicurati che il file del modello abbia un formato e un tipo compatibili<li>Verifica che il file del modello sia completo nella cartella di download<li>Puoi trovare la cartella di download nella finestra di dialogo dei settaggi<li>Se hai scaricato manualmente il modello, assicurati che il file non sia danneggiato controllando md5sum<li>Leggi ulteriori informazioni su quali modelli sono supportati nella nostra <a href="https://docs.gpt4all.io/">documentazione</a> per la GUI<li>Consulta il nostro <a href="https://discord.gg/4M2QFmTt2k">canale Discord</a> per assistenza Erase conversation? Cancellare la conversazione? Changing the model will erase the current conversation. La modifica del modello cancellerà la conversazione corrente. Reload · %1 Ricarica · %1 Loading · %1 Caricamento · %1 Load · %1 (default) → Carica · %1 (predefinito) → Send a message... Manda un messaggio... Load a model to continue... Carica un modello per continuare... Send messages/prompts to the model Invia messaggi/prompt al modello Cut Taglia Paste Incolla Select All Seleziona tutto Send message Invia messaggio Sends the message/prompt contained in textfield to the model Invia il messaggio/prompt contenuto nel campo di testo al modello CodeInterpreter Code Interpreter Interprete di codice compute javascript code using console.log as output Esegue codice JavaScript utilizzando console.log come output CollectionsDrawer Warning: searching collections while indexing can return incomplete results Avviso: la ricerca nelle raccolte durante l'indicizzazione può restituire risultati incompleti %n file(s) %n file %n file %n word(s) %n parola %n parole Updating In aggiornamento + Add Docs + Aggiungi documenti Select a collection to make it available to the chat model. Seleziona una raccolta per renderla disponibile al modello in chat. ConfirmationDialog OK OK Cancel Annulla Download Model "%1" is installed successfully. 
Il modello "%1" è stato installato correttamente. ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. ERROR: $API_KEY is empty. ERRORE: $API_KEY è vuoto. ERROR: $BASE_URL is invalid. ERRORE: $BASE_URL non è valido. ERROR: Model "%1 (%2)" is conflict. ERRORE: il modello "%1 (%2)" è in conflitto. Model "%1 (%2)" is installed successfully. Il modello "%1 (%2)" è stato installato correttamente. Model "%1" is removed. Il modello "%1" è stato rimosso. HomeView Welcome to GPT4All Benvenuto in GPT4All The privacy-first LLM chat application L'applicazione di chat LLM che mette al primo posto la privacy Start chatting Inizia a chattare Start Chatting Inizia a Chattare Chat with any LLM Chatta con qualsiasi LLM LocalDocs Chat with your local files Chatta con i tuoi file locali Find Models Trova modelli Explore and download models Esplora e scarica i modelli Latest news Ultime notizie Latest news from GPT4All Ultime notizie da GPT4All Release Notes Note di rilascio Documentation Documentazione Discord X (Twitter) Github nomic.ai nomic.ai Subscribe to Newsletter Iscriviti alla Newsletter LocalDocsSettings LocalDocs LocalDocs Settings Settaggi LocalDocs Indexing Indicizzazione Allowed File Extensions Estensioni di file consentite Comma-separated list. LocalDocs will only attempt to process files with these extensions. Elenco separato da virgole. LocalDocs tenterà di elaborare solo file con queste estensioni. Embedding Questo termine si dovrebbe tradurre come "Incorporamento". This term has been translated in other applications like A1111 and InvokeAI as "Incorporamento" Incorporamento Use Nomic Embed API Utilizza l'API di incorporamento Nomic Embed Embed documents using the fast Nomic API instead of a private local model. Requires restart. Incorpora documenti utilizzando la veloce API di Nomic invece di un modello locale privato. Richiede il riavvio. Nomic API Key Chiave API di Nomic API key to use for Nomic Embed. 
Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. Chiave API da utilizzare per Nomic Embed. Ottienine una dalla <a href="https://atlas.nomic.ai/cli-login">pagina delle chiavi API</a> di Atlas. Richiede il riavvio. Embeddings Device Dispositivo per incorporamenti The compute device used for embeddings. Requires restart. Il dispositivo di calcolo utilizzato per gli incorporamenti. Richiede il riavvio. Application default Applicazione predefinita Display Mostra Show Sources Mostra le fonti Display the sources used for each response. Visualizza le fonti utilizzate per ciascuna risposta. Advanced Avanzate Warning: Advanced usage only. Avvertenza: solo per uso avanzato. Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. Valori troppo grandi possono causare errori di LocalDocs, risposte estremamente lente o l'impossibilità di rispondere. In parole povere, {N caratteri x N frammenti} vengono aggiunti alla finestra di contesto del modello. Maggiori informazioni <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">qui</a>. Document snippet size (characters) Dimensioni del frammento di documento (caratteri) Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. Numero di caratteri per frammento di documento. Numeri più grandi aumentano la probabilità di risposte basate sui fatti, ma comportano anche una generazione più lenta. Max document snippets per prompt Numero massimo di frammenti di documento per prompt Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation. 
Il numero massimo di frammenti di documento recuperati che presentano le migliori corrispondenze, da includere nel contesto del prompt. Numeri più alti aumentano la probabilità di ricevere risposte basate sui fatti, ma comportano anche una generazione più lenta. LocalDocsView LocalDocs Chat with your local files Chatta con i tuoi file locali + Add Collection + Aggiungi raccolta <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. <h3>ERRORE: Impossibile accedere al database LocalDocs o non è valido.</h3><br><i>Nota: sarà necessario riavviare dopo aver provato una delle seguenti soluzioni suggerite.</i><br><ul><li>Assicurati che la cartella impostata come <b>Percorso di download</b> esista nel file system.</li><li>Controlla la proprietà e i permessi di lettura e scrittura del <b>Percorso di download</b>.</li><li>Se è presente un file <b>localdocs_v2.db</b>, controlla anche la sua proprietà e i permessi di lettura/scrittura.</li></ul><br>Se il problema persiste e sono presenti file 'localdocs_v*.db', come ultima risorsa puoi<br>provare a eseguirne il backup e a rimuoverli. Tuttavia, dovrai ricreare le tue raccolte. 
No Collections Installed Nessuna raccolta installata Install a collection of local documents to get started using this feature Installa una raccolta di documenti locali per iniziare a utilizzare questa funzionalità + Add Doc Collection + Aggiungi raccolta di documenti Shows the add model view Mostra la vista aggiungi modello Indexing progressBar Barra di avanzamento indicizzazione Shows the progress made in the indexing Mostra lo stato di avanzamento dell'indicizzazione ERROR ERRORE INDEXING INDICIZZAZIONE EMBEDDING INCORPORAMENTO REQUIRES UPDATE RICHIEDE AGGIORNAMENTO READY PRONTA INSTALLING INSTALLAZIONE Indexing in progress Indicizzazione in corso Embedding in progress Incorporamento in corso This collection requires an update after version change Questa raccolta richiede un aggiornamento dopo il cambio di versione Automatically reindexes upon changes to the folder Reindicizza automaticamente in caso di modifiche alla cartella Installation in progress Installazione in corso % % %n file(s) %n file %n file %n word(s) %n parola %n parole Remove Rimuovi Rebuild Ricostruisci Reindex this folder from scratch. This is slow and usually not needed. Reindicizzare questa cartella da zero. Lento e di solito non necessario. Update Aggiorna Update the collection to the new version. This is a slow operation. Aggiorna la raccolta alla nuova versione. Questa è un'operazione lenta. 
ModelList <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>Richiede una chiave API OpenAI personale.</li><li>ATTENZIONE: invierà le tue chat a OpenAI!</li><li>La tua chiave API verrà archiviata su disco</li><li>Verrà utilizzata solo per comunicare con OpenAI</li><li>Puoi richiedere una chiave API <a href="https://platform.openai.com/account/api-keys">qui.</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Mistral Tiny model</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. <br><br><i>* Anche se paghi OpenAI per ChatGPT-4 questo non garantisce l'accesso alla chiave API. Contatta OpenAI per maggiori informazioni. 
cannot open "%1": %2 impossibile aprire "%1": %2 cannot create "%1": %2 impossibile creare "%1": %2 %1 (%2) %1 (%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>Modello API compatibile con OpenAI</strong><br><ul><li>Chiave API: %1</li><li>URL di base: %2</li><li>Nome modello: %3</li></ul> <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>Richiede una chiave API Mistral personale.</li><li>ATTENZIONE: invierà le tue chat a Mistral!</li><li>La tua chiave API verrà archiviata su disco</li><li>Verrà utilizzata solo per comunicare con Mistral</li><li>Puoi richiedere una chiave API <a href="https://console.mistral.ai/user/api-keys">qui</a>.</li> <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>Richiede una chiave API personale e l'URL di base dell'API.</li><li>ATTENZIONE: invierà le tue chat al server API compatibile con OpenAI che hai specificato!</li><li>La tua chiave API verrà archiviata su disco</li><li>Verrà utilizzata solo per comunicare con il server API compatibile con OpenAI</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>Connetti al server API compatibile con OpenAI</strong><br> %1 <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>Creato da %1.</strong><br><ul><li>Pubblicato il %2.<li>Questo modello ha %3 Mi 
piace.<li>Questo modello ha %4 download.<li>Altre informazioni possono essere trovate <a href="https://huggingface.co/%5">qui.</a></ul> ModelSettings Model Modello %1 system message? %1 il messaggio di sistema? Clear Cancella Reset Ripristina The system message will be %1. Il messaggio di sistema verrà %1. removed rimosso reset to the default ripristinato il valore predefinito %1 chat template? %1 il modello di chat? The chat template will be %1. Il modello di chat verrà %1. erased cancellato Model Settings Settaggi modello Clone Clona Remove Rimuovi Name Nome Model File File del modello System Message Messaggio di sistema A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. Un messaggio per impostare il contesto o guidare il comportamento del modello. Lasciare vuoto per nessuno. NOTA: da GPT4All 3.5, questo non dovrebbe contenere token di controllo. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Il messaggio di sistema non è <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">testo normale</a>. Chat Template Modello di chat This Jinja template turns the chat into input for the model. Questo modello Jinja trasforma la chat in input per il modello. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. Nessun <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">modello di chat</a> configurato. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Il <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">modello di chat</a> non può essere vuoto. 
<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Errore di sintassi</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. Il modello di chat non è in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">formato Jinja</a>. Chat Name Prompt Prompt del nome della chat Prompt used to automatically generate chat names. Prompt utilizzato per generare automaticamente nomi di chat. Suggested FollowUp Prompt Prompt di approfondimento suggerito Prompt used to generate suggested follow-up questions. Prompt utilizzato per generare le domande di approfondimento suggerite. Context Length Lunghezza del contesto Number of input and output tokens the model sees. Numero di token di input e output visualizzati dal modello. Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. Numero massimo di token di prompt/risposta combinati prima che le informazioni vengano perse. L'utilizzo di un contesto maggiore rispetto a quello su cui è stato addestrato il modello produrrà scarsi risultati. NOTA: non ha effetto finché non si ricarica il modello. Temperature Temperatura Randomness of model output. Higher -> more variation. Casualità dell'uscita del modello. Più alto -> più variazione. Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. La temperatura aumenta le possibilità di scegliere token meno probabili. NOTA: una temperatura più elevata offre risultati più creativi ma meno prevedibili. Top-P Nucleus Sampling factor. Lower -> more predictable. Fattore di campionamento del nucleo. Inferiore -> più prevedibile. 
Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. Solo i token più probabili, fino a un totale di probabilità di Top-P, possono essere scelti. NOTA: impedisce la scelta di token altamente improbabili. Min-P Minimum token probability. Higher -> more predictable. Probabilità minima del token. Più alto -> più prevedibile. Sets the minimum relative probability for a token to be considered. Imposta la probabilità relativa minima affinché un token venga considerato. Top-K Size of selection pool for tokens. Dimensione del lotto di selezione per i token. Only the top K most likely tokens will be chosen from. Saranno scelti solo i primi K token più probabili. Max Length Lunghezza massima Maximum response length, in tokens. Lunghezza massima della risposta, in token. Prompt Batch Size Dimensioni del lotto di prompt The batch size used for prompt processing. La dimensione del lotto usata per l'elaborazione dei prompt. Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. Numero di token del prompt da elaborare contemporaneamente. NOTA: valori più alti possono velocizzare la lettura dei prompt ma utilizzeranno più RAM. Repeat Penalty Penalità di ripetizione Repetition penalty factor. Set to 1 to disable. Fattore di penalità di ripetizione. Impostare su 1 per disabilitare. Repeat Penalty Tokens Token di penalità ripetizione Number of previous tokens used for penalty. Numero di token precedenti utilizzati per la penalità. GPU Layers Livelli GPU Number of model layers to load into VRAM. Numero di livelli del modello da caricare nella VRAM. How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. Quanti livelli del modello caricare nella VRAM. 
Diminuirlo se GPT4All esaurisce la VRAM durante il caricamento di questo modello. Valori più bassi aumentano il carico della CPU e l'utilizzo della RAM e rallentano l'inferenza. NOTA: non ha effetto finché non si ricarica il modello. ModelsView No Models Installed Nessun modello installato Install a model to get started using GPT4All Installa un modello per iniziare a utilizzare GPT4All + Add Model + Aggiungi Modello Shows the add model view Mostra la vista aggiungi modello Installed Models Modelli installati Locally installed chat models Modelli per chat installati localmente Model file File del modello Model file to be downloaded File del modello da scaricare Description Descrizione File description Descrizione del file Cancel Annulla Resume Riprendi Stop/restart/start the download Arresta/riavvia/avvia il download Remove Rimuovi Remove model from filesystem Rimuovi il modello dal sistema dei file Install Installa Install online model Installa il modello online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Errore</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">AVVISO: non consigliato per il tuo hardware. Il modello richiede più memoria (%1 GB) di quella disponibile nel sistema (%2).</strong></font> ERROR: $API_KEY is empty. ERRORE: $API_KEY è vuoto. ERROR: $BASE_URL is empty. ERRORE: $BASE_URL è vuoto. enter $BASE_URL inserisci $BASE_URL ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. enter $MODEL_NAME inserisci $MODEL_NAME %1 GB ? 
Describes an error that occurred when downloading Descrive un errore che si è verificato durante lo scaricamento Error for incompatible hardware Errore per hardware incompatibile Download progressBar Barra di avanzamento dello scaricamento Shows the progress made in the download Mostra lo stato di avanzamento dello scaricamento Download speed Velocità di scaricamento Download speed in bytes/kilobytes/megabytes per second Velocità di scaricamento in byte/kilobyte/megabyte al secondo Calculating... Calcolo in corso... Whether the file hash is being calculated Se viene calcolato l'hash del file Busy indicator Indicatore di occupato Displayed when the file hash is being calculated Visualizzato durante il calcolo dell'hash del file enter $API_KEY Inserire $API_KEY File size Dimensione del file RAM required RAM richiesta Parameters Parametri Quant Quant Type Tipo MyFancyLink Fancy link Mio link A stylized link Un link d'esempio MyFileDialog Please choose a file Scegli un file MyFolderDialog Please choose a directory Scegli una cartella MySettingsLabel Clear Cancella Reset Ripristina MySettingsTab Restore defaults? Ripristinare le impostazioni predefinite? This page of settings will be reset to the defaults. Questa pagina di impostazioni verrà ripristinata ai valori predefiniti. Restore Defaults Ripristina i valori predefiniti Restores settings dialog to a default state Ripristina la finestra di dialogo dei settaggi a uno stato predefinito NetworkDialog Contribute data to the GPT4All Opensource Datalake. Contribuisci con i tuoi dati al Datalake Open Source di GPT4All. By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. 
If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Abilitando questa funzionalità, potrai partecipare al processo democratico di addestramento di un modello linguistico di grandi dimensioni fornendo dati per futuri miglioramenti del modello. Quando un modello di GPT4All ti risponde e tu hai aderito, la tua conversazione verrà inviata al Datalake Open Source di GPT4All. Inoltre, puoi mettere mi piace/non mi piace alla sua risposta. Se non ti piace una risposta, puoi suggerirne una alternativa. Questi dati verranno raccolti e aggregati nel Datalake di GPT4All. NOTA: attivando questa funzione, invierai i tuoi dati al Datalake Open Source di GPT4All. Non dovresti avere aspettative sulla privacy della chat quando questa funzione è abilitata. Dovresti, tuttavia, aspettarti un'attribuzione facoltativa, se lo desideri. I tuoi dati di chat saranno liberamente disponibili per essere scaricati da chiunque e verranno utilizzati da Nomic AI per migliorare i futuri modelli GPT4All. Nomic AI conserverà tutte le informazioni di attribuzione allegate ai tuoi dati e verrai accreditato come collaboratore a qualsiasi versione del modello GPT4All che utilizza i tuoi dati! 
Terms for opt-in Termini per l'adesione Describes what will happen when you opt-in Descrive cosa accadrà quando effettuerai l'adesione Please provide a name for attribution (optional) Fornisci un nome per l'attribuzione (facoltativo) Attribution (optional) Attribuzione (facoltativo) Provide attribution Fornire attribuzione Enable Abilita Enable opt-in Abilita l'adesione Cancel Annulla Cancel opt-in Annulla l'adesione NewVersionDialog New version is available Nuova versione disponibile Update Aggiorna Update to new version Aggiorna alla nuova versione PopupDialog Reveals a shortlived help balloon Rivela un messaggio di aiuto di breve durata Busy indicator Indicatore di occupato Displayed when the popup is showing busy Visualizzato quando la finestra a comparsa risulta occupata RemoteModelCard API Key Chiave API ERROR: $API_KEY is empty. ERRORE: $API_KEY è vuoto. enter $API_KEY Inserire $API_KEY Whether the file hash is being calculated Se viene calcolato l'hash del file Base Url URL di base ERROR: $BASE_URL is empty. ERRORE: $BASE_URL è vuoto. enter $BASE_URL inserisci $BASE_URL Model Name Nome modello ERROR: $MODEL_NAME is empty. ERRORE: $MODEL_NAME è vuoto. enter $MODEL_NAME inserisci $MODEL_NAME Models Modelli Install Installa Install remote model Installa modello remoto SettingsView Settings Settaggi Contains various application settings Contiene vari settaggi dell'applicazione Application Applicazione Model Modello LocalDocs StartupDialog Welcome! Benvenuto! ### Release Notes %1<br/> ### Contributors %2 ### Note di rilascio %1<br/> ### Contributori %2 Release notes Note di rilascio Release notes for this version Note di rilascio per questa versione ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. 
When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### Abilitazioni per analisi di utilizzo anonime e datalake Abilitando questa funzionalità, potrai partecipare al processo democratico di addestramento di un modello linguistico di grandi dimensioni fornendo dati per futuri miglioramenti del modello. Quando un modello di GPT4All ti risponde e tu hai aderito, la tua conversazione verrà inviata al Datalake Open Source di GPT4All. Inoltre, puoi mettere mi piace/non mi piace alla sua risposta. Se non ti piace una risposta, puoi suggerirne una alternativa. Questi dati verranno raccolti e aggregati nel Datalake di GPT4All. NOTA: attivando questa funzione, invierai i tuoi dati al Datalake Open Source di GPT4All. Non dovresti avere aspettative sulla privacy della chat quando questa funzione è abilitata. Dovresti, tuttavia, aspettarti un'attribuzione facoltativa, se lo desideri. I tuoi dati di chat saranno liberamente disponibili per essere scaricati da chiunque e verranno utilizzati da Nomic AI per migliorare i futuri modelli GPT4All. 
Nomic AI conserverà tutte le informazioni di attribuzione allegate ai tuoi dati e verrai accreditato come collaboratore a qualsiasi versione del modello GPT4All che utilizza i tuoi dati! Terms for opt-in Termini per l'adesione Describes what will happen when you opt-in Descrive cosa accadrà quando effettuerai l'adesione Opt-in to anonymous usage analytics used to improve GPT4All Acconsenti all'analisi anonima dell'uso per migliorare GPT4All Opt-in for anonymous usage statistics Attiva le statistiche di utilizzo anonime Yes Sì Allow opt-in for anonymous usage statistics Consenti l'attivazione delle statistiche di utilizzo anonime No No Opt-out for anonymous usage statistics Disattiva le statistiche di utilizzo anonime Allow opt-out for anonymous usage statistics Consenti la disattivazione per le statistiche di utilizzo anonime Opt-in to anonymous sharing of chats to the GPT4All Datalake Acconsenti alla condivisione anonima delle chat con il GPT4All Datalake Opt-in for network Aderisci per la rete Allow opt-in for network Consenti l'adesione per la rete Allow opt-in anonymous sharing of chats to the GPT4All Datalake Consenti la condivisione anonima delle chat su GPT4All Datalake Opt-out for network Disattiva per la rete Allow opt-out anonymous sharing of chats to the GPT4All Datalake Consenti la non adesione alla condivisione anonima delle chat nel GPT4All Datalake ThumbsDownDialog Please edit the text below to provide a better response. (optional) Modifica il testo seguente per fornire una risposta migliore. (opzionale) Please provide a better response... Si prega di fornire una risposta migliore... Submit Invia Submits the user's response Invia la risposta dell'utente Cancel Annulla Closes the response dialog Chiude la finestra di dialogo della risposta main <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. 
In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Si è verificato un errore all'avvio:</h3><br><i>"Rilevato hardware incompatibile."</i><br><br>Sfortunatamente, la tua CPU non soddisfa i requisiti minimi per eseguire questo programma. In particolare, non supporta gli elementi intrinseci AVX richiesti da questo programma per eseguire con successo un modello linguistico moderno e di grandi dimensioni. L'unica soluzione in questo momento è aggiornare il tuo hardware con una CPU più moderna.<br><br>Vedi qui per ulteriori informazioni: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> GPT4All v%1 Restore Ripristina Quit Esci <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. <h3>Si è verificato un errore all'avvio:</h3><br><i>"Impossibile accedere al file dei settaggi."</i><br><br>Sfortunatamente, qualcosa impedisce al programma di accedere al file dei settaggi. Ciò potrebbe essere causato da autorizzazioni errate nella cartella di configurazione locale dell'app in cui si trova il file dei settaggi. Dai un'occhiata al nostro <a href="https://discord.gg/4M2QFmTt2k">canale Discord</a> per ricevere assistenza. Connection to datalake failed. La connessione al Datalake non è riuscita. Saving chats. Salvataggio delle chat. 
Network dialog Dialogo di rete opt-in to share feedback/conversations aderisci per condividere feedback/conversazioni Home view Vista iniziale Home view of application Vista iniziale dell'applicazione Home Inizia Chat view Vista chat Chat view to interact with models Vista chat per interagire con i modelli Chats Chat Models Modelli Models view for installed models Vista modelli per i modelli installati LocalDocs LocalDocs view to configure and use local docs Vista LocalDocs per configurare e utilizzare i documenti locali Settings Settaggi Settings view for application configuration Vista dei settaggi per la configurazione dell'applicazione The datalake is enabled Il Datalake è abilitato Using a network model Utilizzando un modello di rete Server mode is enabled La modalità server è abilitata Installed models Modelli installati View of installed models Vista dei modelli installati ================================================ FILE: gpt4all-chat/translations/gpt4all_pt_BR.ts ================================================ AddCollectionView ← Existing Collections ← Minhas coleções Add Document Collection Adicionar Coleção de Documentos Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. Adicione uma pasta contendo arquivos de texto simples, PDFs ou Markdown. Configure extensões adicionais nas Configurações. Please choose a directory Escolha um diretório Name Nome Collection name... Nome da coleção... Name of the collection to add (Required) Nome da coleção (obrigatório) Folder Pasta Folder path... Caminho da pasta... Folder path to documents (Required) Caminho da pasta com os documentos (obrigatório) Browse Procurar Create Collection Criar Coleção AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. 
Network error: could not retrieve %1 Erro de rede: não foi possível obter %1 Busy indicator Displayed when the models request is ongoing Exibido enquanto os modelos estão sendo carregados All Reasoning Model file Arquivo do modelo Model file to be downloaded Arquivo do modelo a ser baixado Description Descrição File description Descrição do arquivo Cancel Cancelar Resume Retomar Download Baixar Stop/restart/start the download Parar/reiniciar/iniciar o download Remove Remover Remove model from filesystem Install Instalar Install online model Instalar modelo online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Erro</a></strong></font> Describes an error that occurred when downloading <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Download progressBar Shows the progress made in the download Mostra o progresso do download Download speed Velocidade de download Download speed in bytes/kilobytes/megabytes per second Velocidade de download em bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Displayed when the file hash is being calculated enter $API_KEY inserir $API_KEY ERROR: $BASE_URL is empty. ERRO: A $BASE_URL está vazia. enter $BASE_URL inserir a $BASE_URL enter $MODEL_NAME inserir o $MODEL_NAME File size Tamanho do arquivo RAM required RAM necessária %1 GB %1 GB ? ? Parameters Parâmetros Quant Quant Type Tipo AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Discover and download models by keyword search... Pesquisar modelos... 
Text field for discovering and filtering downloadable models Campo de texto para descobrir e filtrar modelos para download Searching · %1 Pesquisando · %1 Initiate model discovery and filtering Pesquisar e filtrar modelos Triggers discovery and filtering of models Aciona a descoberta e filtragem de modelos Default Padrão Likes Curtidas Downloads Downloads Recent Recentes Sort by: %1 Ordenar por: %1 Asc Asc Desc Desc Sort dir: %1 Direção da ordenação: %1 None Nenhum Limit: %1 Limite: %1 Model file Arquivo do modelo Model file to be downloaded Arquivo do modelo a ser baixado Description Descrição File description Descrição do arquivo Cancel Cancelar Resume Retomar Download Baixar Stop/restart/start the download Parar/reiniciar/iniciar o download Remove Remover Remove model from filesystem Install Instalar Install online model Instalar modelo online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Erro</a></strong></font> Describes an error that occurred when downloading <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> Error for incompatible hardware Download progressBar Shows the progress made in the download Mostra o progresso do download Download speed Velocidade de download Download speed in bytes/kilobytes/megabytes per second Velocidade de download em bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Busy indicator Displayed when the file hash is being calculated ERROR: $API_KEY is empty. enter $API_KEY inserir $API_KEY ERROR: $BASE_URL is empty. ERRO: A $BASE_URL está vazia. enter $BASE_URL inserir a $BASE_URL ERROR: $MODEL_NAME is empty. 
enter $MODEL_NAME inserir o $MODEL_NAME File size Tamanho do arquivo Quant Quant Type Tipo AddModelView ← Existing Models ← Meus Modelos Explore Models Descobrir Modelos GPT4All GPT4All Remote Providers HuggingFace Discover and download models by keyword search... Pesquisar modelos... Text field for discovering and filtering downloadable models Campo de texto para descobrir e filtrar modelos para download Initiate model discovery and filtering Pesquisar e filtrar modelos Triggers discovery and filtering of models Aciona a descoberta e filtragem de modelos Default Padrão Likes Curtidas Downloads Downloads Recent Recentes Asc Asc Desc Desc None Nenhum Searching · %1 Pesquisando · %1 Sort by: %1 Ordenar por: %1 Sort dir: %1 Direção da ordenação: %1 Limit: %1 Limite: %1 Network error: could not retrieve %1 Erro de rede: não foi possível obter %1 Busy indicator Indicador de processamento Displayed when the models request is ongoing Exibido enquanto os modelos estão sendo carregados Model file Arquivo do modelo Model file to be downloaded Arquivo do modelo a ser baixado Description Descrição File description Descrição do arquivo Cancel Cancelar Resume Retomar Download Baixar Stop/restart/start the download Parar/reiniciar/iniciar o download Remove Remover Remove model from filesystem Remover modelo do sistema Install Instalar Install online model Instalar modelo online <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ATENÇÃO: Este modelo não é recomendado para seu hardware. Ele exige mais memória (%1 GB) do que seu sistema possui (%2).</strong></font> ERROR: $API_KEY is empty. ERRO: A $API_KEY está vazia. ERROR: $BASE_URL is empty. ERRO: A $BASE_URL está vazia. enter $BASE_URL inserir a $BASE_URL ERROR: $MODEL_NAME is empty. ERRO: O $MODEL_NAME está vazio. enter $MODEL_NAME inserir o $MODEL_NAME %1 GB %1 GB ? ? 
Describes an error that occurred when downloading Mostra informações sobre o erro no download <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Erro</a></strong></font> Error for incompatible hardware Aviso: Hardware não compatível Download progressBar Progresso do download Shows the progress made in the download Mostra o progresso do download Download speed Velocidade de download Download speed in bytes/kilobytes/megabytes per second Velocidade de download em bytes/kilobytes/megabytes por segundo Calculating... Calculando... Whether the file hash is being calculated Quando o hash do arquivo está sendo calculado Displayed when the file hash is being calculated Exibido durante o cálculo do hash do arquivo enter $API_KEY inserir $API_KEY File size Tamanho do arquivo RAM required RAM necessária Parameters Parâmetros Quant Quant Type Tipo AddRemoteModelView Various remote model providers that use network resources for inference. Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. 
Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. ApplicationSettings Application Aplicativo Network dialog Mensagens de rede opt-in to share feedback/conversations Compartilhar feedback e conversas Error dialog Mensagens de erro Application Settings Configurações General Geral Theme Tema The application color scheme. Esquema de cores. Dark Modo Escuro Light Modo Claro ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. ERRO: O sistema de atualização não encontrou a Ferramenta de Manutenção necessária para verificar atualizações!<br><br>Você instalou este aplicativo usando o instalador online? Se sim, o executável da Ferramenta de Manutenção deve estar localizado um diretório acima de onde este aplicativo está instalado.<br><br>Se você não conseguir iniciá-lo manualmente, será necessário reinstalar o aplicativo. LegacyDark Modo escuro (legado) Font Size Tamanho da Fonte The size of text in the application. Tamanho do texto. Small Pequeno Medium Médio Large Grande Language and Locale Idioma e Região The language and locale you wish to use. Selecione seu idioma e região. System Locale Local do Sistema Device Processador The compute device used for text generation. 
I chose to use "Processador" instead of "Dispositivo" (Device) or "Dispositivo de Computação" (Compute Device) to simplify the terminology and make it more straightforward and understandable. "Dispositivo" can be vague and could refer to various types of hardware, whereas "Processador" clearly and specifically indicates the component responsible for processing tasks. This improves usability by avoiding the ambiguity that might arise from using more generic terms like "Dispositivo." Processador usado para gerar texto. Application default Aplicativo padrão Default Model Modelo Padrão The preferred model for new chats. Also used as the local server fallback. Modelo padrão para novos chats e em caso de falha do modelo principal. Suggestion Mode Modo de sugestões Generate suggested follow-up questions at the end of responses. Sugerir perguntas após as respostas. When chatting with LocalDocs Ao conversar com o LocalDocs Whenever possible Sempre que possível Never Nunca Download Path Diretório de Download Where to store local models and the LocalDocs database. Pasta para modelos e banco de dados do LocalDocs. Browse Procurar Choose where to save model files Local para armazenar os modelos Enable Datalake Habilitar Datalake Send chats and feedback to the GPT4All Open-Source Datalake. Contribua para o Datalake de código aberto do GPT4All. Advanced Avançado CPU Threads Threads de CPU The number of CPU threads used for inference and embedding. Quantidade de núcleos (threads) do processador usados para processar e responder às suas perguntas. Enable System Tray The application will minimize to the system tray when the window is closed. Save Chat Context I used "Histórico do Chat" (Chat History) instead of "Contexto do Chat" (Chat Context) to clearly convey that it refers to saving past messages, making it more intuitive and avoiding potential confusion with abstract terms. Salvar Histórico do Chat Save the chat model's state to disk for faster loading. 
WARNING: Uses ~2GB per chat. Salvar histórico do chat para carregamento mais rápido. (Usa aprox. 2GB por chat). Enable Local API Server Ativar servidor de API local Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. Ativar servidor local compatível com OpenAI (uso de recursos elevado). API Server Port Porta da API The port to use for the local server. Requires restart. Porta de acesso ao servidor local. (requer reinicialização). Check For Updates Procurar por Atualizações Manually check for an update to GPT4All. Verifica se há novas atualizações para o GPT4All. Updates Atualizações Chat New Chat Novo Chat Server Chat Chat com o Servidor ChatAPIWorker ERROR: Network error occurred while connecting to the API server ERRO: Ocorreu um erro de rede ao conectar-se ao servidor da API ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished recebeu erro HTTP %1 %2 ChatCollapsibleItem Analysis encountered error Thinking Analyzing Thought for %1 %2 second seconds Analyzed ChatDrawer Drawer Menu Lateral Main navigation drawer Menu de navegação principal + New Chat + Novo Chat Create a new chat Criar um novo chat Select the current chat or edit the chat when in edit mode Selecione o chat atual ou edite o chat quando estiver no modo de edição Edit chat name Editar nome do chat Save chat name Salvar nome do chat Delete chat Excluir chat Confirm chat deletion Confirmar exclusão do chat Cancel chat deletion Cancelar exclusão do chat List of chats Lista de chats List of chats in the drawer dialog Lista de chats na caixa de diálogo do menu lateral ChatItemView GPT4All GPT4All You Você response stopped ... resposta interrompida... retrieving localdocs: %1 ... Recuperando dados em LocalDocs: %1 ... searching localdocs: %1 ... Buscando em LocalDocs: %1 ... processing ... processando... generating response ... gerando resposta... generating questions ... gerando perguntas... generating toolcall ... 
Copy Copiar Copy Message Copiar Mensagem Disable markdown Desativar markdown Enable markdown Ativar markdown %n Source(s) %n Origem %n Origens LocalDocs LocalDocs Edit this message? All following messages will be permanently erased. Redo this response? Cannot edit chat without a loaded model. Cannot edit chat while the model is generating. Edit Cannot redo response without a loaded model. Cannot redo response while the model is generating. Redo Like response Dislike response Suggested follow-ups Perguntas relacionadas ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. ChatListModel TODAY HOJE THIS WEEK ESTA SEMANA THIS MONTH ESTE MÊS LAST SIX MONTHS ÚLTIMOS SEIS MESES THIS YEAR ESTE ANO LAST YEAR ANO PASSADO ChatTextItem Copy Copiar Copy Message Copiar Mensagem Disable markdown Desativar markdown Enable markdown Ativar markdown ChatView <h3>Warning</h3><p>%1</p> <h3>Aviso</h3><p>%1</p> Switch model dialog Mensagem ao trocar de modelo Warn the user if they switch models, then context will be erased Ao trocar de modelo, o contexto da conversa será apagado Conversation copied to clipboard. Conversa copiada. Code copied to clipboard. Código copiado. The entire chat will be erased. Chat panel Painel de chat Chat panel with options Painel de chat com opções Reload the currently loaded model Recarregar modelo atual Eject the currently loaded model Ejetar o modelo carregado atualmente No model installed. Nenhum modelo instalado. Model loading error. Erro ao carregar o modelo. Waiting for model... Aguardando modelo... Switching context... Mudando de contexto... Choose a model... Escolha um modelo... 
Not found: %1 Não encontrado: %1 The top item is the current model O modelo atual é exibido no topo LocalDocs LocalDocs Add documents Adicionar documentos add collections of documents to the chat Adicionar Coleção de Documentos Load the default model Carregar o modelo padrão Loads the default model which can be changed in settings Carrega o modelo padrão (personalizável nas configurações) No Model Installed Nenhum Modelo Instalado GPT4All requires that you install at least one model to get started O GPT4All precisa de pelo menos um modelo instalado para funcionar Install a Model Instalar um Modelo Shows the add model view Mostra a visualização para adicionar modelo Conversation with the model Conversa com o modelo prompt / response pairs from the conversation Pares de pergunta/resposta da conversa Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. GPT4All GPT4All You Você response stopped ... resposta interrompida... processing ... processando... generating response ... gerando resposta... generating questions ... gerando perguntas... 
Copy Copiar Copy Message Copiar Mensagem Disable markdown Desativar markdown Enable markdown Ativar markdown Thumbs up Resposta boa Gives a thumbs up to the response Curte a resposta Thumbs down Resposta ruim Opens thumbs down dialog Abrir diálogo de joinha para baixo Suggested follow-ups Perguntas relacionadas Erase and reset chat session Apagar e redefinir sessão de chat Copy chat session to clipboard Copiar histórico da conversa Redo last chat response Refazer última resposta Add media Adicionar mídia Adds media to the prompt Adiciona mídia ao prompt Stop generating Parar de gerar Stop the current response generation Parar a geração da resposta atual Attach Anexar Single File Arquivo Único Reloads the model Recarrega modelo <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>Ocorreu um erro ao carregar o modelo:</h3><br><i>"%1"</i><br><br>Falhas no carregamento do modelo podem acontecer por vários motivos, mas as causas mais comuns incluem um formato de arquivo incorreto, um download incompleto ou corrompido, o tipo de arquivo errado, memória RAM do sistema insuficiente ou um tipo de modelo incompatível. 
Aqui estão algumas sugestões para resolver o problema:<br><ul><li>Certifique-se de que o arquivo do modelo tenha um formato e tipo compatíveis<li>Verifique se o arquivo do modelo está completo na pasta de download<li>Você pode encontrar a pasta de download na caixa de diálogo de configurações<li>Se você carregou o modelo, certifique-se de que o arquivo não esteja corrompido verificando o md5sum<li>Leia mais sobre quais modelos são suportados em nossa <a href="https://docs.gpt4all.io/">documentação</a> para a interface gráfica<li>Confira nosso <a href="https://discord.gg/4M2QFmTt2k">canal do Discord</a> para obter ajuda Erase conversation? Changing the model will erase the current conversation. Reload · %1 Recarregar · %1 Loading · %1 Carregando · %1 Load · %1 (default) → Carregar · %1 (padrão) → restoring from text ... Recuperando do texto... retrieving localdocs: %1 ... Recuperando dados em LocalDocs: %1 ... searching localdocs: %1 ... Buscando em LocalDocs: %1 ... %n Source(s) %n Origem %n Origens Send a message... Enviar uma mensagem... Load a model to continue... Carregue um modelo para continuar... Send messages/prompts to the model Enviar mensagens/prompts para o modelo Cut Recortar Paste Colar Select All Selecionar tudo Send message Enviar mensagem Sends the message/prompt contained in textfield to the model Envia a mensagem/prompt contida no campo de texto para o modelo CodeInterpreter Code Interpreter compute javascript code using console.log as output CollectionsDrawer Warning: searching collections while indexing can return incomplete results Aviso: pesquisar coleções durante a indexação pode retornar resultados incompletos %n file(s) %n arquivo(s) %n arquivo(s) %n word(s) %n palavra(s) %n palavra(s) Updating Atualizando + Add Docs + Adicionar Documentos Select a collection to make it available to the chat model. Selecione uma coleção para disponibilizá-la ao modelo de chat. 
ConfirmationDialog OK Cancel Cancelar Download Model "%1" is installed successfully. Modelo "%1" instalado com sucesso. ERROR: $MODEL_NAME is empty. ERRO: O nome do modelo ($MODEL_NAME) está vazio. ERROR: $API_KEY is empty. ERRO: A chave da API ($API_KEY) está vazia. ERROR: $BASE_URL is invalid. ERRO: A URL base ($BASE_URL) é inválida. ERROR: Model "%1 (%2)" is conflict. ERRO: Conflito com o modelo "%1 (%2)". Model "%1 (%2)" is installed successfully. Modelo "%1 (%2)" instalado com sucesso. Model "%1" is removed. Modelo "%1" removido. HomeView Welcome to GPT4All Bem-vindo ao GPT4All The privacy-first LLM chat application O aplicativo de chat LLM que prioriza a privacidade Start chatting Iniciar chat Start Chatting Iniciar Chat Chat with any LLM Converse com qualquer LLM LocalDocs LocalDocs Chat with your local files Converse com seus arquivos locais Find Models Encontrar Modelos Explore and download models Descubra e baixe modelos Latest news Últimas novidades Latest news from GPT4All Últimas novidades do GPT4All Release Notes Notas de versão Documentation Documentação Discord Discord X (Twitter) X (Twitter) Github Github nomic.ai nomic.ai Subscribe to Newsletter Assine nossa Newsletter LocalDocsSettings LocalDocs LocalDocs LocalDocs Settings Configurações do LocalDocs Indexing Indexação Allowed File Extensions Extensões de Arquivo Permitidas Comma-separated list. LocalDocs will only attempt to process files with these extensions. Lista separada por vírgulas. O LocalDocs tentará processar apenas arquivos com essas extensões. Embedding Incorporação Use Nomic Embed API Usar a API Nomic Embed Embed documents using the fast Nomic API instead of a private local model. Requires restart. Incorporar documentos usando a API Nomic rápida em vez de um modelo local privado. Requer reinicialização. Nomic API Key Chave da API Nomic API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. 
Chave da API a ser usada para Nomic Embed. Obtenha uma na página de <a href="https://atlas.nomic.ai/cli-login">chaves de API do Atlas</a>. Requer reinicialização. Embeddings Device Processamento de Incorporações The compute device used for embeddings. Requires restart. Dispositivo usado para processar as incorporações. Requer reinicialização. Application default Aplicativo padrão Display Exibir Show Sources Mostrar Fontes Display the sources used for each response. Mostra as fontes usadas para cada resposta. Advanced Apenas para usuários avançados Warning: Advanced usage only. Atenção: Apenas para usuários avançados. Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. Valores muito altos podem causar falhas no LocalDocs, respostas extremamente lentas ou até mesmo nenhuma resposta. De forma geral, o valor {Número de Caracteres x Número de Trechos} é adicionado à janela de contexto do modelo. Clique <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">aqui</a> para mais informações. Document snippet size (characters) I translated "snippet" as "trecho" to make the term feel more natural and understandable in Portuguese. "Trecho" effectively conveys the idea of a portion or section of a document, fitting well within the context, whereas a more literal translation might sound less intuitive or awkward for users. Tamanho do trecho de documento (caracteres) Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. Número de caracteres por trecho de documento. Valores maiores aumentam a chance de respostas factuais, mas também tornam a geração mais lenta. 
Max document snippets per prompt Máximo de Trechos de Documento por Prompt Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation. Número máximo de trechos de documentos a serem adicionados ao contexto do prompt. Valores maiores aumentam a chance de respostas factuais, mas também tornam a geração mais lenta. LocalDocsView LocalDocs LocalDocs Chat with your local files Converse com seus arquivos locais + Add Collection + Adicionar Coleção <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. <h3>ERRO: Não foi possível acessar o banco de dados do LocalDocs ou ele não é válido.</h3><br><i>Observação: Será necessário reiniciar o aplicativo após tentar qualquer uma das seguintes correções sugeridas.</i><br><ul><li>Certifique-se de que a pasta definida como <b>Caminho de Download</b> existe no sistema de arquivos.</li><li>Verifique a propriedade, bem como as permissões de leitura e gravação do <b>Caminho de Download</b>.</li><li>Se houver um arquivo <b>localdocs_v2.db</b>, verifique também sua propriedade e permissões de leitura/gravação.</li></ul><br>Se o problema persistir e houver algum arquivo 'localdocs_v*.db' presente, como último recurso, você pode<br>tentar fazer backup deles e removê-los. No entanto, você terá que recriar suas coleções. 
No Collections Installed Nenhuma Coleção Instalada Install a collection of local documents to get started using this feature Instale uma coleção de documentos locais para começar a usar este recurso + Add Doc Collection + Adicionar Coleção de Documentos Shows the add model view Mostra a visualização para adicionar modelo Indexing progressBar Barra de progresso de indexação Shows the progress made in the indexing Mostra o progresso da indexação ERROR ERRO INDEXING INDEXANDO EMBEDDING INCORPORANDO REQUIRES UPDATE REQUER ATUALIZAÇÃO READY PRONTO INSTALLING INSTALANDO Indexing in progress Indexação em andamento Embedding in progress Incorporação em andamento This collection requires an update after version change Esta coleção precisa ser atualizada após a mudança de versão Automatically reindexes upon changes to the folder Reindexa automaticamente após alterações na pasta Installation in progress Instalação em andamento % % %n file(s) %n arquivo(s) %n arquivo(s) %n word(s) %n palavra(s) %n palavra(s) Remove Remover Rebuild Reconstruir Reindex this folder from scratch. This is slow and usually not needed. Reindexar pasta do zero. Lento e geralmente desnecessário. Update Atualizar Update the collection to the new version. This is a slow operation. Atualizar coleção para nova versão. Pode demorar. 
ModelList <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>É necessária uma chave de API da OpenAI.</li><li>AVISO: Seus chats serão enviados para a OpenAI!</li><li>Sua chave de API será armazenada localmente</li><li>Ela será usada apenas para comunicação com a OpenAI</li><li>Você pode solicitar uma chave de API <a href="https://platform.openai.com/account/api-keys">aqui.</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>Modelo ChatGPT GPT-3.5 Turbo da OpenAI</strong><br> %1 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Modelo ChatGPT GPT-4 da OpenAI</strong><br> %1 %2 <strong>Mistral Tiny model</strong><br> %1 <strong>Modelo Mistral Tiny</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Modelo Mistral Small</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <strong>Modelo Mistral Medium</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. <br><br><i>* Mesmo que você pague pelo ChatGPT-4 da OpenAI, isso não garante acesso à chave de API. Contate a OpenAI para mais informações. 
cannot open "%1": %2 não é possível abrir "%1": %2 cannot create "%1": %2 não é possível criar "%1": %2 %1 (%2) %1 (%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>Modelo de API Compatível com OpenAI</strong><br><ul><li>Chave da API: %1</li><li>URL Base: %2</li><li>Nome do Modelo: %3</li></ul> <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>É necessária uma chave de API da Mistral.</li><li>AVISO: Seus chats serão enviados para a Mistral!</li><li>Sua chave de API será armazenada localmente</li><li>Ela será usada apenas para comunicação com a Mistral</li><li>Você pode solicitar uma chave de API <a href="https://console.mistral.ai/user/api-keys">aqui</a>.</li> <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>É necessária uma chave de API e a URL da API.</li><li>AVISO: Seus chats serão enviados para o servidor de API compatível com OpenAI que você especificou!</li><li>Sua chave de API será armazenada no disco</li><li>Será usada apenas para comunicação com o servidor de API compatível com OpenAI</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>Conectar a um servidor de API compatível com OpenAI</strong><br> %1 <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>Criado por %1.</strong><br><ul><li>Publicado em %2.<li>Este modelo tem 
%3 curtidas.<li>Este modelo tem %4 downloads.<li>Mais informações podem ser encontradas <a href="https://huggingface.co/%5">aqui.</a></ul> ModelSettings Model Modelo %1 system message? Clear Reset The system message will be %1. removed reset to the default %1 chat template? The chat template will be %1. erased Model Settings Configurações do Modelo Clone Clonar Remove Remover Name Nome Model File Arquivo do Modelo System Prompt Prompt do Sistema Prefixed at the beginning of every conversation. Must contain the appropriate framing tokens. Prefixado no início de cada conversa. Deve conter os tokens de enquadramento apropriados. Prompt Template Modelo de Prompt The template that wraps every prompt. Modelo para cada prompt. Must contain the string "%1" to be replaced with the user's input. Deve incluir "%1" para a entrada do usuário. System Message A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Chat Template This Jinja template turns the chat into input for the model. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. Chat Name Prompt Prompt para Nome do Chat Prompt used to automatically generate chat names. Prompt usado para gerar automaticamente nomes de chats. Suggested FollowUp Prompt Prompt de Sugestão de Acompanhamento Prompt used to generate suggested follow-up questions. Prompt usado para gerar sugestões de perguntas. 
Context Length Tamanho do Contexto Number of input and output tokens the model sees. Tamanho da Janela de Contexto. Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. Máximo de tokens combinados (prompt + resposta) antes da perda de informações. Usar mais contexto do que o modelo foi treinado pode gerar resultados ruins. Obs.: Só entrará em vigor após recarregar o modelo. Temperature Temperatura Randomness of model output. Higher -> more variation. Aleatoriedade das respostas. Quanto maior, mais variadas. Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. Aumenta a chance de escolher tokens menos prováveis. Obs.: Uma temperatura mais alta gera resultados mais criativos, mas menos previsíveis. Top-P Top-P Nucleus Sampling factor. Lower -> more predictable. Amostragem por núcleo. Menor valor, respostas mais previsíveis. Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. Apenas tokens com probabilidade total até o valor de top_p serão escolhidos. Obs.: Evita tokens muito improváveis. Min-P Min-P Minimum token probability. Higher -> more predictable. Probabilidade mínima do token. Quanto maior -> mais previsível. Sets the minimum relative probability for a token to be considered. Define a probabilidade relativa mínima para um token ser considerado. Top-K Top-K Size of selection pool for tokens. Número de tokens considerados na amostragem. Only the top K most likely tokens will be chosen from. Serão escolhidos apenas os K tokens mais prováveis. Max Length Comprimento Máximo Maximum response length, in tokens. Comprimento máximo da resposta, em tokens. Prompt Batch Size Tamanho do Lote de Processamento The batch size used for prompt processing. 
Tokens processados por lote. Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. Quantidade de tokens de prompt para processar de uma vez. OBS.: Valores mais altos podem acelerar a leitura dos prompts, mas usarão mais RAM. Repeat Penalty Penalidade de Repetição Repetition penalty factor. Set to 1 to disable. Penalidade de Repetição (1 para desativar). Repeat Penalty Tokens Tokens para penalizar repetição Number of previous tokens used for penalty. Número de tokens anteriores usados para penalidade. GPU Layers Camadas na GPU Number of model layers to load into VRAM. Camadas Carregadas na GPU. How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. Número de camadas do modelo carregadas na VRAM. Diminua se faltar VRAM ao carregar o modelo. Valores menores aumentam o uso de CPU e RAM, e deixam a inferência mais lenta. Obs.: Só entrará em vigor após recarregar o modelo. 
ModelsView No Models Installed Nenhum Modelo Instalado Install a model to get started using GPT4All Instale um modelo para começar a usar o GPT4All + Add Model + Adicionar Modelo Shows the add model view Mostra a visualização para adicionar modelo Installed Models Modelos Instalados Locally installed chat models Modelos de chat instalados localmente Model file Arquivo do modelo Model file to be downloaded Arquivo do modelo a ser baixado Description Descrição File description Descrição do arquivo Cancel Cancelar Resume Retomar Stop/restart/start the download Parar/reiniciar/iniciar o download Remove Remover Remove model from filesystem Remover modelo do sistema de arquivos Install Instalar Install online model Instalar modelo online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Erro</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">AVISO: Não recomendado para seu hardware. O modelo requer mais memória (%1 GB) do que seu sistema tem disponível (%2).</strong></font> ERROR: $API_KEY is empty. ERRO: A $API_KEY está vazia. ERROR: $BASE_URL is empty. ERRO: A $BASE_URL está vazia. enter $BASE_URL inserir a $BASE_URL ERROR: $MODEL_NAME is empty. ERRO: O $MODEL_NAME está vazio. enter $MODEL_NAME inserir o $MODEL_NAME %1 GB %1 GB ? ? Describes an error that occurred when downloading Descreve um erro que ocorreu durante o download Error for incompatible hardware Erro para hardware incompatível Download progressBar Barra de progresso do download Shows the progress made in the download Mostra o progresso do download Download speed Velocidade de download Download speed in bytes/kilobytes/megabytes per second Velocidade de download em bytes/kilobytes/megabytes por segundo Calculating... Calculando... 
Whether the file hash is being calculated Se o hash do arquivo está sendo calculado Busy indicator Indicador de ocupado Displayed when the file hash is being calculated Exibido quando o hash do arquivo está sendo calculado enter $API_KEY inserir $API_KEY File size Tamanho do arquivo RAM required RAM necessária Parameters Parâmetros Quant Quant Type Tipo MyFancyLink Fancy link Link personalizado A stylized link Um link personalizado MyFileDialog Please choose a file Por favor escolha um arquivo MyFolderDialog Please choose a directory Escolha um diretório MySettingsLabel Clear Reset MySettingsStack Please choose a directory Escolha um diretório MySettingsTab Restore defaults? This page of settings will be reset to the defaults. Restore Defaults Restaurar Configurações Padrão Restores settings dialog to a default state Restaura as configurações para o estado padrão NetworkDialog Contribute data to the GPT4All Opensource Datalake. Contribuir com dados para o Datalake de código aberto GPT4All. By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models.
Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Ao habilitar este recurso, você poderá participar do processo democrático de treinamento de um grande modelo de linguagem, contribuindo com dados para futuras melhorias do modelo. Quando um modelo GPT4All responder a você e você tiver optado por participar, sua conversa será enviada para o Datalake de Código Aberto do GPT4All. Além disso, você pode curtir/não curtir a resposta. Se você não gostar de uma resposta, pode sugerir uma resposta alternativa. Esses dados serão coletados e agregados no Datalake do GPT4All. OBS.: Ao ativar este recurso, você estará enviando seus dados para o Datalake de Código Aberto do GPT4All. Você não deve ter nenhuma expectativa de privacidade no chat quando este recurso estiver ativado. No entanto, você deve ter a expectativa de uma atribuição opcional, se desejar. Seus dados de chat estarão disponíveis para qualquer pessoa baixar e serão usados pela Nomic AI para melhorar os futuros modelos GPT4All. A Nomic AI manterá todas as informações de atribuição anexadas aos seus dados e você será creditado como colaborador em qualquer versão do modelo GPT4All que utilize seus dados! 
Terms for opt-in Termos de participação Describes what will happen when you opt-in Descrição do que acontece ao participar Please provide a name for attribution (optional) Forneça um nome para atribuição (opcional) Attribution (optional) Atribuição (opcional) Provide attribution Fornecer atribuição Enable Habilitar Enable opt-in Ativar participação Cancel Cancelar Cancel opt-in Cancelar participação NewVersionDialog New version is available Atualização disponível Update Atualizar agora Update to new version Baixa e instala a última versão do GPT4All PopupDialog Reveals a shortlived help balloon Exibe uma dica rápida Busy indicator The literal translation of "busy indicator" as "indicador de ocupado" might create ambiguity in Portuguese, as it doesn't clearly convey whether the system is processing something or simply unavailable. "Progresso" (progress) was chosen to more clearly indicate that an activity is in progress and that the user should wait for its completion. Indicador de progresso Displayed when the popup is showing busy Visível durante o processamento RemoteModelCard API Key ERROR: $API_KEY is empty. enter $API_KEY inserir $API_KEY Whether the file hash is being calculated Base Url ERROR: $BASE_URL is empty. ERRO: A $BASE_URL está vazia. enter $BASE_URL inserir a $BASE_URL Model Name ERROR: $MODEL_NAME is empty. enter $MODEL_NAME inserir o $MODEL_NAME Models Modelos Install Instalar Install remote model SettingsView Settings I used "Config" instead of "Configurações" to keep the UI concise and visually balanced. "Config" is a widely recognized abbreviation that maintains clarity while saving space, making the interface cleaner and more user-friendly, especially in areas with limited space. Config Contains various application settings Acessar as configurações do aplicativo Application Aplicativo Model Modelo LocalDocs LocalDocs StartupDialog Welcome! Bem-vindo(a)! 
### Release Notes %1<br/> ### Contributors %2 ### Notas de lançamento %1<br/> ### Colaboradores %2 Release notes Notas de lançamento Release notes for this version Notas de lançamento desta versão ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### Opções para análise de uso anônimo e banco de dados Ao habilitar esses recursos, você poderá participar do processo democrático de treinamento de um grande modelo de linguagem, contribuindo com dados para futuras melhorias do modelo. Quando um modelo GPT4All responder a você e você tiver optado por participar, sua conversa será enviada para o Datalake de Código Aberto do GPT4All. Além disso, você pode curtir/não curtir a resposta. Se você não gostar de uma resposta, pode sugerir uma resposta alternativa. Esses dados serão coletados e agregados no Datalake do GPT4All. OBS.: Ao ativar este recurso, você estará enviando seus dados para o Datalake de Código Aberto do GPT4All.
Você não deve ter nenhuma expectativa de privacidade no chat quando este recurso estiver ativado. No entanto, você deve ter a expectativa de uma atribuição opcional, se desejar. Seus dados de chat estarão disponíveis para qualquer pessoa baixar e serão usados pela Nomic AI para melhorar os futuros modelos GPT4All. A Nomic AI manterá todas as informações de atribuição anexadas aos seus dados e você será creditado como colaborador em qualquer versão do modelo GPT4All que utilize seus dados! Terms for opt-in Termos de participação Describes what will happen when you opt-in Descrição do que acontece ao participar Opt-in to anonymous usage analytics used to improve GPT4All Opt-in for anonymous usage statistics Enviar estatísticas de uso anônimas Yes Sim Allow opt-in for anonymous usage statistics Permitir o envio de estatísticas de uso anônimas No Não Opt-out for anonymous usage statistics Recusar envio de estatísticas de uso anônimas Allow opt-out for anonymous usage statistics Permitir recusar envio de estatísticas de uso anônimas Opt-in to anonymous sharing of chats to the GPT4All Datalake Opt-in for network Aceitar na rede Allow opt-in for network Permitir aceitação na rede Allow opt-in anonymous sharing of chats to the GPT4All Datalake Permitir compartilhamento anônimo de chats no Datalake GPT4All Opt-out for network Recusar na rede Allow opt-out anonymous sharing of chats to the GPT4All Datalake Permitir recusar compartilhamento anônimo de chats no Datalake GPT4All SwitchModelDialog <b>Warning:</b> changing the model will erase the current conversation. Do you wish to continue? <b>Atenção:</b> Ao trocar o modelo a conversa atual será perdida. Continuar? Continue Continuar Continue with model loading Confirma a troca do modelo Cancel Cancelar ThumbsDownDialog Please edit the text below to provide a better response. (optional) Editar resposta (opcional) Please provide a better response... Digite sua resposta... 
Submit Enviar Submits the user's response Enviar Cancel Cancelar Closes the response dialog Fecha a caixa de diálogo de resposta main <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Ocorreu um erro ao iniciar:</h3><br><i>"Hardware incompatível detectado."</i><br><br>Infelizmente, seu processador não atende aos requisitos mínimos para executar este programa. Especificamente, ele não possui suporte às instruções AVX, que são necessárias para executar modelos de linguagem grandes e modernos. A única solução, no momento, é atualizar seu hardware para um processador mais recente.<br><br>Para mais informações, consulte: <a href="https://pt.wikipedia.org/wiki/Advanced_Vector_Extensions">https://pt.wikipedia.org/wiki/Advanced_Vector_Extensions</a> GPT4All v%1 GPT4All v%1 Restore Quit <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. <h3>Ocorreu um erro ao iniciar:</h3><br><i>"Não foi possível acessar o arquivo de configurações."</i><br><br>Infelizmente, algo está impedindo o programa de acessar o arquivo de configurações. Isso pode acontecer devido a permissões incorretas na pasta de configurações do aplicativo. 
Para obter ajuda, acesse nosso <a href="https://discord.gg/4M2QFmTt2k">canal no Discord</a>. Connection to datalake failed. Falha na conexão com o datalake. Saving chats. Salvando chats. Network dialog Avisos de rede opt-in to share feedback/conversations permitir compartilhamento de feedback/conversas Home view Tela inicial Home view of application Tela inicial do aplicativo Home Início Chat view Visualização do Chat Chat view to interact with models Visualização do chat para interagir com os modelos Chats Chats Models Modelos Models view for installed models Tela de modelos instalados LocalDocs LocalDocs LocalDocs view to configure and use local docs Tela de configuração e uso de documentos locais do LocalDocs Settings Config Settings view for application configuration Tela de configurações do aplicativo The datalake is enabled O datalake está ativado Using a network model Usando um modelo de rede Server mode is enabled Modo servidor ativado Installed models Modelos instalados View of installed models Exibe os modelos instalados ================================================ FILE: gpt4all-chat/translations/gpt4all_ro_RO.ts ================================================ AddCollectionView ← Existing Collections ← Colecţiile curente Add Document Collection Adaugă o Colecţie de documente Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. Adaugă un folder cu fişiere în format text, PDF sau Markdown. Alte extensii pot fi specificate în Configurare. Please choose a directory Selectează un director/folder Name Denumire Collection name... Denumirea Colecţiei... Name of the collection to add (Required) Denumirea Colecţiei de adăugat (necesar) Folder Folder Folder path... Calea spre folder... Folder path to documents (Required) Calea spre documente (necesar) Browse Căutare Create Collection Creează Colecţia AddGPT4AllModelView These models have been specifically configured for use in GPT4All. 
The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. Aceste modele au fost configurate special pentru utilizarea în GPT4All. Primele câteva modele din listă sunt cunoscute ca fiind cele mai bune, dar ar trebui să încercați să utilizați doar modele care se încadrează în RAM. Network error: could not retrieve %1 Eroare de reţea: nu se poate prelua %1 Busy indicator Indicator de activitate Displayed when the models request is ongoing Afişat în timpul solicitării modelului All Reasoning Model file Fişierul modelului Model file to be downloaded Fişierul modelului ce va fi descărcat Description Descriere File description Descrierea fişierului Cancel Anulare Resume Continuare Download Download Stop/restart/start the download Oprirea/Repornirea/Iniţierea descărcării Remove Şterg Remove model from filesystem Şterg modelul din sistemul de fişiere Install Instalare Install online model Instalez un model din online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Eroare</a></strong></font> Describes an error that occurred when downloading Descrie eroarea apărută în timpul descărcării <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ATENŢIE: Nerecomandat pentru acest hardware. Modelul necesită mai multă memorie (%1 GB) decât are acest sistem (%2).</strong></font> Error for incompatible hardware Eroare: hardware incompatibil Download progressBar Progresia descărcării Shows the progress made in the download Afişează progresia descărcării Download speed Viteza de download Download speed in bytes/kilobytes/megabytes per second Viteza de download în bytes/kilobytes/megabytes pe secundă Calculating... Calculare... 
Whether the file hash is being calculated Dacă se calculează hash-ul fişierului Displayed when the file hash is being calculated Se afişează când se calculează hash-ul fişierului ERROR: $API_KEY is empty. EROARE: $API_KEY absentă. enter $API_KEY introdu cheia $API_KEY ERROR: $BASE_URL is empty. EROARE: $BASE_URL absentă. enter $BASE_URL introdu $BASE_URL ERROR: $MODEL_NAME is empty. EROARE: $MODEL_NAME absent. enter $MODEL_NAME introdu $MODEL_NAME File size Dimensiunea fişierului RAM required RAM necesară %1 GB %1 GB ? ? Parameters Parametri Quant Quant(ificare) Type Tip AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Utilizați funcția de căutare pentru a găsi și descărca modele de pe HuggingFace. NU E GARANTAT că acestea vor funcționa. Multe dintre ele vor necesita configurări suplimentare înainte de a putea fi utilizate. Discover and download models by keyword search... Caută şi descarcă modele după un cuvânt-cheie... Text field for discovering and filtering downloadable models Câmp pentru căutarea şi filtrarea modelelor ce pot fi descărcate Searching · %1 Căutare · %1 Initiate model discovery and filtering Iniţiază căutarea şi filtrarea modelelor Triggers discovery and filtering of models Activează căutarea şi filtrarea modelelor Default Implicit Likes Likes Downloads Download-uri Recent Recent/e Sort by: %1 Ordonare după: %1 Asc Asc. (A->Z) Desc Desc.
(Z->A) Sort dir: %1 Sensul ordonării: %1 None Niciunul Limit: %1 Limită: %1 Model file Fişierul modelului Model file to be downloaded Fişierul modelului ce va fi descărcat Description Descriere File description Descrierea fişierului Cancel Anulare Resume Continuare Download Download Stop/restart/start the download Oprirea/Repornirea/Iniţierea descărcării Remove Şterg Remove model from filesystem Şterg modelul din sistemul de fişiere Install Instalare Install online model Instalez un model din online <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Eroare</a></strong></font> Describes an error that occurred when downloading Descrie o eroare apărută la download <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ATENŢIE: Nerecomandat pentru acest hardware. Modelul necesită mai multă memorie (%1 GB) decât are acest sistem (%2).</strong></font> Error for incompatible hardware Eroare - hardware incompatibil Download progressBar Bara de progresie a descărcării Shows the progress made in the download Afişează progresia descărcării Download speed Viteza de download Download speed in bytes/kilobytes/megabytes per second Viteza de download în bytes/kilobytes/megabytes pe secundă Calculating... Calculare... Whether the file hash is being calculated Dacă se calculează hash-ul fişierului Busy indicator Indicator de activitate Displayed when the file hash is being calculated Afişat la calcularea hash-ului fişierului ERROR: $API_KEY is empty. EROARE: $API_KEY absentă enter $API_KEY introdu cheia $API_KEY ERROR: $BASE_URL is empty. EROARE: $BASE_URL absentă enter $BASE_URL introdu $BASE_URL ERROR: $MODEL_NAME is empty.
enter $MODEL_NAME introdu $MODEL_NAME File size Dimensiunea fişierului Quant Quant(ificare) Type Tip AddModelView ← Existing Models ← Modelele curente/instalate Explore Models Caută modele GPT4All GPT4All Remote Providers HuggingFace HuggingFace Discover and download models by keyword search... Caută şi descarcă modele după un cuvânt-cheie... Text field for discovering and filtering downloadable models Câmp pentru căutarea şi filtrarea modelelor ce pot fi descărcate Initiate model discovery and filtering Iniţiază căutarea şi filtrarea modelelor Triggers discovery and filtering of models Activează căutarea şi filtrarea modelelor Default Implicit Likes Likes Downloads Download-uri Recent Recent/e Asc Asc. (A->Z) Desc Desc. (Z->A) None Niciunul Searching · %1 Căutare · %1 Sort by: %1 Ordonare după: %1 Sort dir: %1 Sensul ordonării: %1 Limit: %1 Limită: %1 Network error: could not retrieve %1 Eroare de reţea: nu se poate prelua %1 Busy indicator Indicator de activitate Displayed when the models request is ongoing Afişat în timpul solicitării modelului Model file Fişierul modelului Model file to be downloaded Fişierul modelului de descărcat Install online model Instalez un model din online %1 GB %1 GB ? ? Shows the progress made in the download Afişează progresia descărcării Download speed Viteza de download Download speed in bytes/kilobytes/megabytes per second Viteza de download în bytes/kilobytes/megabytes pe secundă enter $API_KEY introdu cheia $API_KEY File size Dimensiunea fişierului RAM required RAM necesară Parameters Parametri Quant Quant(ificare) Type Tip AddRemoteModelView Various remote model providers that use network resources for inference. Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing.
Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. ApplicationSettings Application Program Network dialog Reţea opt-in to share feedback/conversations optional: partajarea (share) de comentarii/conversaţii ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. EROARE: Sistemul de Update nu poate găsi componenta MaintenanceTool necesară căutării de versiuni noi!<br><br> Ai instalat acest program folosind kitul online? Dacă da, atunci MaintenanceTool trebuie să fie un nivel mai sus de folderul unde ai instalat programul.<br><br> Dacă nu poate fi lansată manual, atunci programul trebuie reinstalat. 
Error dialog Eroare Application Settings Configurarea programului General General Theme Tema pentru interfaţă The application color scheme. Schema de culori a programului. Dark Întunecat Light Luminos LegacyDark Întunecat-vechi Font Size Dimensiunea textului The size of text in the application. Dimensiunea textului în program. Device Dispozitiv/Device The compute device used for text generation. "Auto" uses Vulkan or Metal. Dispozitivul de calcul utilizat pentru generarea de text. "Auto" apelează la Vulkan sau la Metal. Small Mic Medium Mediu Large Mare Language and Locale Limbă şi Localizare The language and locale you wish to use. Limba şi Localizarea de utilizat. System Locale Localizare The compute device used for text generation. Dispozitivul de calcul utilizat pentru generarea de text. Application default Implicit Default Model Modelul implicit The preferred model for new chats. Also used as the local server fallback. Modelul preferat pentru noile conversaţii. Va fi folosit drept rezervă pentru serverul local. Suggestion Mode Modul de sugerare Generate suggested follow-up questions at the end of responses. Generarea de întrebări în continuarea replicilor. When chatting with LocalDocs Când se discută cu LocalDocs Whenever possible Oricând e posibil Never Niciodată Download Path Calea pentru download Where to store local models and the LocalDocs database. Unde să fie plasate modelele şi baza de date LocalDocs. Browse Căutare Choose where to save model files Selectează locul unde vor fi plasate fişierele modelelor Enable Datalake Activează DataLake Send chats and feedback to the GPT4All Open-Source Datalake. Trimite conversaţii şi comentarii către componenta Open-source DataLake a GPT4All. Advanced Avansate CPU Threads Thread-uri CPU The number of CPU threads used for inference and embedding. Numărul de thread-uri CPU utilizate pentru inferenţă şi embedding. 
Enable System Tray Trimit pe SysTray (pe bara) The application will minimize to the system tray when the window is closed. Programul va fi minimizat pe bara de jos Save Chat Context Salvarea contextului conversaţiei Save the chat model's state to disk for faster loading. WARNING: Uses ~2GB per chat. Salvează pe disc starea modelului pentru încărcare mai rapidă. ATENŢIE: Consumă ~2GB/conversaţie. Enable Local API Server Activez Serverul API local Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. Activează pe localhost un Server compatibil cu OpenAI. ATENŢIE: Creşte consumul de resurse. API Server Port Portul Serverului API The port to use for the local server. Requires restart. Portul utilizat pentru Serverul local. Necesită repornirea programului. Check For Updates Caută update-uri Manually check for an update to GPT4All. Caută manual update-uri pentru GPT4All. Updates Update-uri/Actualizări Chat New Chat Conversaţie Nouă Server Chat Conversaţie cu Serverul ChatAPIWorker ERROR: Network error occurred while connecting to the API server EROARE: Eroare de reţea - conectarea la serverul API ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished - eroare: HTTP Error %1 %2 ChatCollapsibleItem Analysis encountered error Thinking Analyzing Thought for %1 %2 second seconds Analyzed ChatDrawer Drawer Sertar Main navigation drawer Sertarul principal de navigare + New Chat + Conversaţie nouă Create a new chat Creează o Conversaţie Select the current chat or edit the chat when in edit mode Selectează conversaţia curentă sau editeaz-o în modul editare Edit chat name Editează denumirea conversaţiei Save chat name Salvează denumirea conversaţiei Delete chat Şterge conversaţia Confirm chat deletion CONFIRM ştergerea conversaţiei Cancel chat deletion ANULEZ ştergerea conversaţiei List of chats Lista conversaţiilor List of chats in the drawer dialog Lista conversaţiilor în secţiunea-sertar ChatItemView GPT4All 
GPT4All You Tu response stopped ... replică întreruptă... retrieving localdocs: %1 ... se preia din LocalDocs: %1 ... searching localdocs: %1 ... se caută în LocalDocs: %1 ... processing ... procesare... generating response ... se generează replica... generating questions ... se generează întrebări... generating toolcall ... Copy Copiere Copy Message Copiez mesajul Disable markdown Dezactivez markdown Enable markdown Activez markdown %n Source(s) %n Sursa %n Surse %n de Surse LocalDocs LocalDocs Edit this message? Editez mesajul All following messages will be permanently erased. Toate aceste mesaje vor fi şterse Redo this response? Refă răspunsul Cannot edit chat without a loaded model. Nu se poate edita conversaţia fără un model încărcat Cannot edit chat while the model is generating. Nu se poate edita conversaţia când un model generează text Edit Editare Cannot redo response without a loaded model. Nu se poate reface un răspuns fără un model încărcat Cannot redo response while the model is generating. Nu se poate reface un răspuns când un model generează text Redo Refacere Like response Îmi Place răspunsul Dislike response NU Îmi Place răspunsul Suggested follow-ups Continuări sugerate ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. Mesajul tău e prea lung şi nu poate fi procesat. (%1 > %2). Încearcă iar cu un mesaj mai scurt ChatListModel TODAY ASTĂZI THIS WEEK SĂPTĂMÂNA ACEASTA THIS MONTH LUNA ACEASTA LAST SIX MONTHS ULTIMELE ŞASE LUNI THIS YEAR ANUL ACESTA LAST YEAR ANUL TRECUT ChatTextItem Copy Copiere Copy Message Copiez mesajul Disable markdown Dezactivez markdown Enable markdown Activez markdown ChatView <h3>Warning</h3><p>%1</p> <h3>Atenţie</h3><p>%1</p> Switch model dialog Schimbarea modelului Warn the user if they switch models, then context will be erased Avertizează utilizatorul că la schimbarea modelului va fi şters contextul Conversation copied to clipboard.
Conversaţia a fost plasată în Clipboard. Code copied to clipboard. Codul a fost plasat în Clipboard. The entire chat will be erased. Toată conversaţia va fi ŞTEARSĂ Chat panel Secţiunea de chat Chat panel with options Secţiunea de chat cu opţiuni Reload the currently loaded model Reîncarcă modelul curent Eject the currently loaded model Ejectează modelul curent No model installed. Niciun model instalat. Model loading error. Eroare la încărcarea modelului. Waiting for model... Se aşteaptă modelul... Switching context... Se schimbă contextul... Choose a model... Selectează un model... Not found: %1 Absent: %1 The top item is the current model Primul element e modelul curent LocalDocs LocalDocs Add documents Adaug documente add collections of documents to the chat adaugă Colecţii de documente la conversaţie Load the default model Încarcă modelul implicit Loads the default model which can be changed in settings Încarcă modelul implicit care poate fi stabilit în Configurare No Model Installed Niciun model instalat Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. Vechiul Prompt-Template trebuie să fie <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">actualizat</a> în Configurare. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. Nu e configurat niciun <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">model de conversaţie</a>. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Modelul de conversaţie</a> nu poate lipsi. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. 
Vechiul System Prompt trebuie să fie <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">actualizat</a> în Configurare. <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>EROARE la încărcarea modelului:</h3><br><i>"%1"</i><br><br>Astfel de erori pot apărea din mai multe cauze, dintre care cele mai comune includ un format inadecvat al fişierului, un download incomplet sau întrerupt, un tip inadecvat de fişier, RAM insuficientă, sau un tip incompatibil de model. Sugestii pentru rezolvarea problemei:<br><ul><li>Verifică dacă fişierul modelului are un format şi un tip compatibile<li>Verifică dacă fişierul modelului este complet în folderul dedicat - acest folder este afişat în secţiunea Configurare<li>Dacă ai descărcat modelul din afara programului, asigură-te că fişierul nu e corupt verificându-i amprenta MD5 (md5sum)<li>Află mai mult despre modelele compatibile în <a href="https://docs.gpt4all.io/">documentaţia</a> pentru interfaţa grafică<li>Poţi găsi ajutor pe <a href="https://discord.gg/4M2QFmTt2k">canalul nostru Discord</a> Erase conversation? ŞTERG conversaţia? Changing the model will erase the current conversation. Schimbarea modelului va ŞTERGE conversaţia curentă.
GPT4All requires that you install at least one model to get started GPT4All necesită cel puţin un model pentru a putea rula Install a Model Instalează un model Shows the add model view Afişează secţiunea de adăugare a unui model Conversation with the model Conversaţie cu modelul prompt / response pairs from the conversation perechi prompt/replică din conversaţie GPT4All GPT4All You Tu response stopped ... replică întreruptă... processing ... procesare... generating response ... se generează replica... generating questions ... se generează întrebări... Copy Copiere Copy Message Copiez mesajul Disable markdown Dezactivez markdown Enable markdown Activez markdown Thumbs up Bravo Gives a thumbs up to the response Dă un Bravo acestei replici Thumbs down Aiurea Opens thumbs down dialog Deschide reacţia Aiurea Suggested follow-ups Continuări sugerate Erase and reset chat session Şterge şi resetează sesiunea de chat Copy chat session to clipboard Copiez sesiunea de chat (conversaţia) în Clipboard Redo last chat response Reface ultima replică Add media Adaugă media (un fişier) Adds media to the prompt Adaugă media (un fişier) la prompt Stop generating Opreşte generarea Stop the current response generation Opreşte generarea replicii curente Attach Ataşează Single File Un singur fişier Reloads the model Reîncarc modelul <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. 
Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>EROARE la încărcarea modelului:</h3><br><i>"%1"</i><br><br>Astfel de erori pot apărea din mai multe cauze, dintre care cele mai comune includ un format inadecvat al fişierului, un download incomplet sau întrerupt, un tip inadecvat de fişier, RAM insuficientă, sau un tip incompatibil de model. Sugestii pentru rezolvarea problemei:<br><ul><li>Verifică dacă fişierul modelului are un format şi un tip compatibile<li>Verifică dacă fişierul modelului este complet în folderul dedicat - acest folder este afişat în secţiunea Configurare<li>Dacă ai descărcat modelul din afara programului, asigură-te că fişierul nu e corupt verificându-i amprenta MD5 (md5sum)<li>Află mai multe despre modelele compatibile în <a href="https://docs.gpt4all.io/">documentaţia</a> pentru interfaţa grafică<li>Poţi găsi ajutor pe <a href="https://discord.gg/4M2QFmTt2k">canalul nostru Discord</a> Reload · %1 Reîncărcare · %1 Loading · %1 Încărcare · %1 Load · %1 (default) → Încarcă · %1 (implicit) → restoring from text ... restaurare din text... retrieving localdocs: %1 ... se preia din LocalDocs: %1 ... searching localdocs: %1 ... se caută în LocalDocs: %1 ... %n Source(s) %n Sursă %n Surse %n de Surse Send a message... Trimite un mesaj... Load a model to continue... Încarcă un model pentru a continua...
Send messages/prompts to the model Trimite mesaje/prompt-uri către model Cut Decupare (Cut) Paste Alipire (Paste) Select All Selectez tot Send message Trimit mesajul Sends the message/prompt contained in textfield to the model Trimite modelului mesajul/prompt-ul din câmpul-text CodeInterpreter Code Interpreter compute javascript code using console.log as output CollectionsDrawer Warning: searching collections while indexing can return incomplete results Atenţie: căutarea în Colecţii în timp ce sunt Indexate poate întoarce rezultate incomplete %n file(s) %n fişier %n fişiere %n de fişiere %n word(s) %n cuvânt %n cuvinte %n de cuvinte Updating Actualizare + Add Docs + Adaug documente Select a collection to make it available to the chat model. Selectează o Colecţie pentru ca modelul să o poată accesa. ConfirmationDialog OK OK Cancel Anulare Download Model "%1" is installed successfully. Modelul "%1" - instalat cu succes. ERROR: $MODEL_NAME is empty. EROARE: $MODEL_NAME absent. ERROR: $API_KEY is empty. EROARE: $API_KEY absentă. ERROR: $BASE_URL is invalid. EROARE: $BASE_URL incorectă. ERROR: Model "%1 (%2)" is conflict. EROARE: Modelul "%1 (%2)" e în conflict. Model "%1 (%2)" is installed successfully. Modelul "%1 (%2)" - instalat cu succes. Model "%1" is removed.
Modelul "%1" - îndepărtat. HomeView Welcome to GPT4All Bun venit în GPT4All The privacy-first LLM chat application Programul ce Prioritizează Confidenţialitatea (Privacy) Start chatting Începe o conversaţie Start Chatting Începe o conversaţie Chat with any LLM Dialoghează cu orice LLM LocalDocs LocalDocs Chat with your local files Dialoghează cu fişiere locale Find Models Caută modele Explore and download models Explorează şi descarcă modele Latest news Ultimele ştiri Latest news from GPT4All Ultimele ştiri de la GPT4All Release Notes Despre această versiune Documentation Documentaţie Discord Discord X (Twitter) X (Twitter) Github GitHub nomic.ai nomic.ai Subscribe to Newsletter Abonare la Newsletter LocalDocsSettings LocalDocs LocalDocs LocalDocs Settings Configurarea LocalDocs Indexing Indexare Allowed File Extensions Extensii compatibile de fişier Embedding Embedding Use Nomic Embed API Folosesc API: Nomic Embed Nomic API Key Cheia API Nomic Embeddings Device Dispozitivul pentru Embeddings The compute device used for embeddings. Requires restart. Dispozitivul pentru Embeddings. Necesită repornire. Comma-separated list. LocalDocs will only attempt to process files with these extensions. Extensiile, separate prin virgulă. LocalDocs va încerca procesarea numai a fişierelor cu aceste extensii. Embed documents using the fast Nomic API instead of a private local model. Requires restart. Embedding pe documente folosind API de la Nomic în locul unui model local. Necesită repornire. API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. Cheia API de utilizat cu Nomic Embed. Obţine o cheie prin Atlas: <a href="https://atlas.nomic.ai/cli-login">pagina cheilor API</a>. Necesită repornire. Application default Implicit Display Vizualizare Show Sources Afişarea Surselor Display the sources used for each response. Afişează Sursele utilizate pentru fiecare replică.
Advanced Avansate Warning: Advanced usage only. Atenţie: Numai pentru utilizare avansată. Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. Valori prea mari pot cauza erori cu LocalDocs, replici foarte lente sau chiar absenţa lor. În mare, numărul {N caractere x N citate} este adăugat la Context Window/Size/Length a modelului. Mai multe informaţii: <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">aici</a>. Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. Numărul caracterelor din fiecare citat. Numere mari amplifică probabilitatea unor replici corecte, dar de asemenea cauzează generare lentă. Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation. Numărul maxim al citatelor ce corespund şi care vor fi adăugate la contextul pentru prompt. Numere mari amplifică probabilitatea unor replici corecte, dar de asemenea cauzează generare lentă. 
Document snippet size (characters) Lungimea (în caractere) a citatelor din documente Max document snippets per prompt Numărul maxim de citate per prompt LocalDocsView LocalDocs LocalDocs Chat with your local files Dialoghează cu fişiere locale + Add Collection + Adaugă o Colecţie <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. <h3>EROARE: Baza de date LocalDocs nu poate fi accesată sau nu e validă.</h3><br><i>Notă: Programul trebuie repornit după ce încerci oricare din următoarele remedii sugerate.</i><br><ul><li>Asigură-te că folderul pentru <b>Download Path</b> există în sistemul de fişiere.</li><li>Verifică permisiunile şi apartenenţa folderului pentru <b>Download Path</b>.</li><li>Dacă există fişierul <b>localdocs_v2.db</b>, verifică-i apartenenţa şi permisiunile citire/scriere (read/write).</li></ul><br>Dacă problema persistă şi există vreun fişier 'localdocs_v*.db', ca ultimă soluţie poţi<br>încerca duplicarea (backup) şi apoi ştergerea lor. Oricum, va trebui să re-creezi Colecţiile.
No Collections Installed Nu există Colecţii instalate Install a collection of local documents to get started using this feature Instalează o Colecţie de documente pentru a putea utiliza funcţionalitatea aceasta + Add Doc Collection + Adaug o Colecţie de documente Shows the add model view Afişează secţiunea de adăugare a unui model Indexing progressBar Bara de progresie a Indexării Shows the progress made in the indexing Afişează progresia Indexării ERROR EROARE INDEXING ...INDEXARE... EMBEDDING ...EMBEDDINGs... REQUIRES UPDATE NECESITĂ UPDATE READY GATA INSTALLING ...INSTALARE... Indexing in progress ...Se Indexează... Embedding in progress ...Se calculează Embeddings... This collection requires an update after version change Colecţia necesită update după schimbarea versiunii Automatically reindexes upon changes to the folder Se reindexează automat după schimbări ale folderului Installation in progress ...Instalare în curs... % % %n file(s) %n fişier %n fişiere %n de fişiere %n word(s) %n cuvânt %n cuvinte %n de cuvinte Remove Şterg Rebuild Reconstrucţie Reindex this folder from scratch. This is slow and usually not needed. Reindexează de la zero acest folder. Procesul e lent şi de obicei inutil. Update Update/Actualizare Update the collection to the new version. This is a slow operation. Actualizează Colecţia la noua versiune. Această procedură e lentă. 
ModelList cannot open "%1": %2 nu se poate deschide „%1”: %2 cannot create "%1": %2 nu se poate crea „%1”: %2 %1 (%2) %1 (%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>Model API compatibil cu OpenAI</strong><br><ul><li>Cheia API: %1</li><li>Base URL: %2</li><li>Numele modelului: %3</li></ul> <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>Necesită o cheie API OpenAI personală.</li><li>ATENŢIE: Conversaţiile tale vor fi trimise la OpenAI!</li><li>Cheia ta API va fi stocată pe disc (local)</li><li>Va fi utilizată numai pentru comunicarea cu OpenAI</li><li>Poţi solicita o cheie API <a href="https://platform.openai.com/account/api-keys">aici.</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>Modelul ChatGPT GPT-3.5 Turbo al OpenAI</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. <br><br><i>* Chiar dacă plăteşti la OpenAI pentru ChatGPT-4, aceasta nu garantează accesul la cheia API. Contactează OpenAI pentru mai multe informaţii. <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Modelul ChatGPT GPT-4 al OpenAI</strong><br> %1 %2 <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>Necesită cheia personală Mistral API.
</li><li>ATENŢIE: Conversaţiile tale vor fi trimise la Mistral!</li><li>Cheia ta API va fi stocată pe disc (local)</li><li>Va fi utilizată numai pentru comunicarea cu Mistral</li><li>Poţi solicita o cheie API <a href="https://console.mistral.ai/user/api-keys">aici</a>.</li> <strong>Mistral Tiny model</strong><br> %1 <strong>Modelul Mistral Tiny</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Modelul Mistral Small</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <strong>Modelul Mistral Medium</strong><br> %1 <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>Necesită cheia personală API şi base-URL-ul API.</li><li>ATENŢIE: Conversaţiile tale vor fi trimise la serverul API compatibil cu OpenAI specificat!</li><li>Cheia ta API va fi stocată pe disc (local)</li><li>Va fi utilizată numai pentru comunicarea cu serverul API compatibil cu OpenAI</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>Conectare la un server API compatibil cu OpenAI</strong><br> %1 <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>Creat de către %1.</strong><br><ul><li>Publicat în %2.<li>Acest model are %3 Likes.<li>Acest model are %4 download-uri.<li>Mai multe informaţii pot fi găsite <a href="https://huggingface.co/%5">aici.</a></ul> ModelSettings Model Model %1 system message? %1 mesajul de la sistem? Clear Ştergere Reset Resetare The system message will be %1. Mesajul de la sistem va fi %1. removed îndepărtat reset to the default resetare la valoarea implicită %1 chat template? %1 modelul de conversaţie? The chat template will be %1.
Modelul de conversaţie va fi %1. erased şters Model Settings Configurez modelul Clone Clonez Remove Şterg Name Denumire Model File Fişierul modelului System Prompt System Prompt Prompt Template Prompt Template The template that wraps every prompt. Standardul de formulare a fiecărui prompt. Chat Name Prompt Denumirea conversaţiei Prompt used to automatically generate chat names. Standardul de formulare a denumirii conversaţiilor. Suggested FollowUp Prompt Prompt-ul sugerat pentru a continua Prompt used to generate suggested follow-up questions. Prompt-ul folosit pentru generarea întrebărilor de continuare. Context Length Lungimea Contextului Number of input and output tokens the model sees. Numărul token-urilor de input şi de output văzute de model. Temperature Temperatura Randomness of model output. Higher -> more variation. Libertatea/Confuzia din replica modelului. Mai mare -> mai multă libertate. Top-P Top-P Nucleus Sampling factor. Lower -> more predictable. Factorul de Nucleus Sampling. Mai mic -> predictibilitate mai mare. Prefixed at the beginning of every conversation. Must contain the appropriate framing tokens. Plasat la începutul fiecărei conversaţii. Trebuie să conţină token-uri(le) adecvate de încadrare. Must contain the string "%1" to be replaced with the user's input. Trebuie să conţină textul "%1" care va fi înlocuit cu ceea ce scrie utilizatorul. Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. Numărul maxim combinat al token-urilor în prompt+replică înainte de a se pierde informaţie. Utilizarea unui context mai mare decât cel cu care a fost instruit modelul va întoarce rezultate mai slabe. NOTĂ: Nu are efect până la reîncărcarea modelului. Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. 
Temperatura creşte probabilitatea de alegere a unor token-uri puţin probabile. NOTĂ: O temperatură tot mai înaltă determină replici tot mai creative şi mai puţin predictibile. Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. Pot fi alese numai cele mai probabile token-uri a căror probabilitate totală este Top-P. NOTĂ: Evită selectarea token-urilor foarte improbabile. Min-P Min-P Minimum token probability. Higher -> more predictable. Probabilitatea minimă a unui token. Mai mare -> mai predictibil. Sets the minimum relative probability for a token to be considered. Stabileşte probabilitatea minimă relativă a unui token de luat în considerare. Top-K Top-K Size of selection pool for tokens. Dimensiunea setului de token-uri. Only the top K most likely tokens will be chosen from. Se va alege numai din cele mai probabile K token-uri. Max Length Lungimea maximă Maximum response length, in tokens. Lungimea maximă - în token-uri - a replicii. Prompt Batch Size Prompt Batch Size The batch size used for prompt processing. Dimensiunea setului de token-uri citite simultan din prompt. Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. Numărul token-urilor procesate simultan. NOTĂ: Valori tot mai mari pot accelera citirea prompt-urilor, dar şi utiliza mai multă RAM. How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. Cât de multe layere ale modelului să fie încărcate în VRAM. Valori mici trebuie folosite dacă GPT4All rămâne fără VRAM în timp ce încarcă modelul. Valorile tot mai mici cresc utilizarea CPU şi a RAM şi încetinesc inferenţa. NOTĂ: Nu are efect până la reîncărcarea modelului.
Repeat Penalty Penalizarea pentru repetare System Message Mesaj de la Sistem A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. Un mesaj pentru stabilirea contextului sau ghidarea comportamentului modelului. Poate fi nespecificat. NOTĂ: De la GPT4All 3.5, acesta nu trebuie să conţină token-uri de control. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Mesajul de la Sistem nu e <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">text simplu</a>. Chat Template Model de conversaţie This Jinja template turns the chat into input for the model. Acest model Jinja transformă conversaţia în input pentru model. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. Nu e configurat niciun <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">model de conversaţie</a>. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Modelul de conversaţie</a> nu poate lipsi. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Eroare de sintaxă</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. Modelul de conversaţie nu este în <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">format Jinja</a>. Repetition penalty factor. Set to 1 to disable. Factorul de penalizare a repetării. Se dezactivează cu valoarea 1. Repeat Penalty Tokens Token-uri pentru penalizare a repetării Number of previous tokens used for penalty. Numărul token-urilor anterioare considerate pentru penalizare.
GPU Layers Layere în GPU Number of model layers to load into VRAM. Numărul layerelor modelului ce vor fi încărcate în VRAM. ModelsView No Models Installed Nu există modele instalate Install a model to get started using GPT4All Instalează un model pentru a începe să foloseşti GPT4All + Add Model + Adaugă un model Shows the add model view Afişează secţiunea de adăugare a unui model Installed Models Modele instalate Locally installed chat models Modele conversaţionale instalate Model file Fişierul modelului Model file to be downloaded Fişierul modelului ce va fi descărcat Description Descriere File description Descrierea fişierului Cancel Anulare Resume Continuare Stop/restart/start the download Oprirea/Repornirea/Iniţierea descărcării Remove Şterg Remove model from filesystem Şterg modelul din sistemul de fişiere Install Instalez Install online model Instalez un model din online %1 GB %1 GB ? ? Describes an error that occurred when downloading Descrie o eroare apărută la download <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">Eroare</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">ATENŢIE: Nerecomandat pentru acest hardware. Modelul necesită mai multă memorie (%1 GB) decât are sistemul tău (%2).</strong></font> Error for incompatible hardware Eroare - hardware incompatibil Download progressBar Bara de progresie a descărcării Shows the progress made in the download Afişează progresia descărcării Download speed Viteza de download Download speed in bytes/kilobytes/megabytes per second Viteza de download în bytes/kilobytes/megabytes pe secundă Calculating... ...Se calculează...
Whether the file hash is being calculated Dacă se va calcula hash-ul fişierului Busy indicator Indicator de activitate Displayed when the file hash is being calculated Afişat când se calculează hash-ul unui fişier ERROR: $API_KEY is empty. EROARE: $API_KEY absentă. enter $API_KEY introdu cheia $API_KEY ERROR: $BASE_URL is empty. EROARE: $BASE_URL absentă. enter $BASE_URL introdu $BASE_URL ERROR: $MODEL_NAME is empty. EROARE: $MODEL_NAME absent. enter $MODEL_NAME introdu $MODEL_NAME File size Dimensiunea fişierului RAM required RAM necesară Parameters Parametri Quant Quant(ificare) Type Tip MyFancyLink Fancy link Link haios A stylized link Un link cu stil MyFileDialog Please choose a file Selectează un fişier MyFolderDialog Please choose a directory Selectează un director (folder) MySettingsLabel Clear Ştergere Reset Resetare MySettingsStack Please choose a directory Selectează un director (folder) MySettingsTab Restore defaults? Restaurare la implicite This page of settings will be reset to the defaults. Setările de pe această pagină vor fi resetate la valorile implicite. Restore Defaults Restaurez valorile implicite Restores settings dialog to a default state Restaurez secţiunea Configurare la starea sa implicită NetworkDialog Contribute data to the GPT4All Opensource Datalake. Contribui cu date/informaţii la componenta Open-source DataLake a GPT4All. By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. 
You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! Dacă activezi această funcţionalitate, vei participa la procesul democratic de instruire a unui model LLM prin contribuţia ta cu date la îmbunătăţirea modelului. Când un model în GPT4All îţi răspunde şi îi accepţi replica, atunci conversaţia va fi trimisă la componenta Open-source DataLake a GPT4All. Mai mult - îi poţi aprecia replica. Dacă răspunsul Nu Îţi Place, poţi sugera unul alternativ. Aceste date vor fi colectate şi agregate în componenta DataLake a GPT4All. NOTĂ: Dacă activezi această funcţionalitate, vei trimite datele tale la componenta DataLake a GPT4All. Nu te poţi aştepta la confidenţialitatea (privacy) conversaţiei când această funcţionalitate e activată. Totuşi, dacă doreşti, te poţi aştepta la o atribuire opţională. Datele din conversaţie vor fi disponibile pentru oricine vrea să le descarce şi vor fi utilizate de către Nomic AI pentru a îmbunătăţi modele viitoare în GPT4All. Nomic AI va păstra toate informaţiile despre atribuire asociate datelor tale şi vei fi menţionat ca participant contribuitor la orice lansare a unui model GPT4All care foloseşte datele tale!
Terms for opt-in Termenii participării Describes what will happen when you opt-in Descrie ce se întâmplă când participi Please provide a name for attribution (optional) Specifică un nume pentru atribuire (opţional) Attribution (optional) Atribuire (opţional) Provide attribution Specifică atribuirea Enable Activează Enable opt-in Activez participarea Cancel Anulare Cancel opt-in Anulez participarea NewVersionDialog New version is available O nouă versiune disponibilă! Update Update/Actualizare Update to new version Actualizez la noua versiune PopupDialog Reveals a shortlived help balloon Afişează un mesaj scurt de asistenţă Busy indicator Indicator de activitate Displayed when the popup is showing busy Se afişează când procedura este în desfăşurare RemoteModelCard API Key ERROR: $API_KEY is empty. enter $API_KEY introdu cheia $API_KEY Whether the file hash is being calculated Base Url ERROR: $BASE_URL is empty. enter $BASE_URL introdu $BASE_URL Model Name ERROR: $MODEL_NAME is empty. enter $MODEL_NAME introdu $MODEL_NAME Models Modele Install Install remote model SettingsView Settings Configurare Contains various application settings Conţine setări ale programului Application Program Model Model LocalDocs LocalDocs StartupDialog Welcome! Bun venit! Release notes Despre versiune Release notes for this version Despre această versiune ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake.
You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### Acceptul pentru analizarea utilizării anonime şi pentru DataLake Activând aceste funcţionalităţi vei putea participa la procesul democratic de instruire a unui model conversaţional prin contribuirea cu date/informaţii pentru îmbunătăţirea unor modele. Când un model în GPT4All îţi răspunde şi îi accepţi răspunsul, conversaţia este trimisă la componenta Open-source DataLake a GPT4All. Mai mult - poţi aprecia (Like/Dislike) răspunsul. Dacă un răspuns Nu Îţi Place (e "Aiurea"), poţi sugera un răspuns alternativ. Aceste date vor fi colectate şi agregate în componenta DataLake a GPT4All. NOTĂ: Dacă activezi această funcţionalitate, vei trimite datele tale la componenta DataLake a GPT4All. Nu te poţi aştepta la confidenţialitatea (privacy) conversaţiei când această funcţionalitate e activată. Totuşi, dacă doreşti, te poţi aştepta la o atribuire opţională. Datele din conversaţie vor fi disponibile pentru oricine vrea să le descarce şi vor fi utilizate de către Nomic AI pentru a îmbunătăţi modele viitoare în GPT4All. Nomic AI va păstra toate informaţiile despre atribuire asociate datelor tale şi vei fi menţionat ca participant contribuitor la orice lansare a unui model GPT4All care foloseşte datele tale!
### Release Notes %1<br/> ### Contributors %2 ### Despre versiune %1<br/> ### Contributori %2 ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should, however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### Acordul pentru analizarea utilizării anonime şi pentru DataLake Activând aceste funcţionalităţi vei putea participa la procesul democratic de instruire a unui model conversaţional prin contribuirea cu date/informaţii pentru îmbunătăţirea unor modele. Când un model în GPT4All îţi răspunde şi îi accepţi răspunsul, conversaţia este trimisă la componenta Open-source DataLake a GPT4All. Mai mult - poţi aprecia (Bravo/Aiurea) răspunsul. Dacă un răspuns e Aiurea, poţi sugera un răspuns alternativ. Aceste date vor fi colectate şi agregate în componenta DataLake a GPT4All. NOTĂ: Dacă activezi această funcţionalitate, vei trimite datele tale la componenta DataLake a GPT4All. Atunci nu te vei putea aştepta la confidenţialitatea (privacy) conversaţiei dacă activezi această funcţionalitate.
Totuşi, te poţi aştepta la a beneficia de apreciere - opţional, dacă doreşti. Datele din conversaţie vor fi disponibile pentru oricine vrea să le descarce şi vor fi utilizate de către Nomic AI pentru a îmbunătăţi modele viitoare în GPT4All. Nomic AI va păstra toate informaţiile despre atribuire asociate datelor tale şi vei fi menţionat ca participant contribuitor la orice lansare a unui model GPT4All care foloseşte datele tale! Terms for opt-in Termenii pentru participare Describes what will happen when you opt-in Descrie ce se întâmplă când participi Opt-in to anonymous usage analytics used to improve GPT4All Optați pentru trimiterea anonimă a evidenței utilizării, folosite pentru a îmbunătăți GPT4All Opt-in for anonymous usage statistics Acceptă colectarea de statistici despre utilizare -anonimă- Yes Da Allow opt-in for anonymous usage statistics Acceptă participarea la colectarea de statistici despre utilizare -anonimă- No Nu Opt-out for anonymous usage statistics Anulează participarea la colectarea de statistici despre utilizare -anonimă- Allow opt-out for anonymous usage statistics Permite anularea participării la colectarea de statistici despre utilizare -anonimă- Opt-in to anonymous sharing of chats to the GPT4All Datalake Optați pentru partajarea anonimă a conversațiilor în GPT4All Datalake Opt-in for network Acceptă pentru reţea Allow opt-in for network Permite participarea pentru reţea Allow opt-in anonymous sharing of chats to the GPT4All Datalake Permite participarea la partajarea (share) -anonimă- a conversaţiilor către DataLake a GPT4All Opt-out for network Refuz participarea, pentru reţea Allow opt-out anonymous sharing of chats to the GPT4All Datalake Permite anularea participării la partajarea -anonimă- a conversaţiilor către DataLake a GPT4All SwitchModelDialog <b>Warning:</b> changing the model will erase the current conversation. Do you wish to continue? <b>Atenţie:</b> schimbarea modelului va şterge conversaţia curentă. Confirmi aceasta? 
Continue Continuă Continue with model loading Continuă încărcarea modelului Cancel Anulare ThumbsDownDialog Please edit the text below to provide a better response. (optional) Te rog, editează textul de mai jos pentru a oferi o replică mai bună (opţional). Please provide a better response... Te rog, oferă o replică mai bună... Submit Trimite Submits the user's response Trimite răspunsul dat de utilizator Cancel Anulare Closes the response dialog Închide afişarea răspunsului main GPT4All v%1 GPT4All v%1 Restore Restaurare Quit Abandon <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>A apărut o eroare la iniţializare:</h3><br><i>"Hardware incompatibil."</i><br><br>Din păcate, procesorul (CPU) nu întruneşte condiţiile minime pentru a rula acest program. În particular, nu suportă instrucţiunile AVX pe care programul le necesită pentru a integra un model conversaţional modern. În acest moment, unica soluţie este să îţi aduci la zi sistemul hardware cu un CPU mai recent.<br><br>Aici sunt mai multe informaţii: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located.
Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. <h3>A apărut o eroare la iniţializare:</h3><br><i>"Nu poate fi accesat fişierul de configurare a programului."</i><br><br>Din păcate, ceva împiedică programul în a accesa acel fişier. Cauza poate fi un set de permisiuni incorecte pe directorul/folderul local de configurare unde se află acel fişier. Poţi parcurge canalul nostru <a href="https://discord.gg/4M2QFmTt2k">Discord</a> unde vei putea primi asistenţă. Connection to datalake failed. Conectarea la DataLake a eşuat. Saving chats. Se salvează conversaţiile. Network dialog Dialogul despre reţea opt-in to share feedback/conversations acceptă partajarea (share) de comentarii/conversaţii Home view Secţiunea de Început Home view of application Secţiunea de Început a programului Home Prima<br>pagină Chat view Secţiunea conversaţiilor Chat view to interact with models Secţiunea de chat pentru interacţiune cu modele Chats Conversaţii Models Modele Models view for installed models Secţiunea modelelor instalate LocalDocs LocalDocs LocalDocs view to configure and use local docs Secţiunea LocalDocs de configurare şi folosire a Documentelor Locale Settings Configurare Settings view for application configuration Secţiunea de Configurare a programului The datalake is enabled DataLake: ACTIV Using a network model Se foloseşte un model pe reţea Server mode is enabled Modul Server: ACTIV Installed models Modele instalate View of installed models Secţiunea modelelor instalate ================================================ FILE: gpt4all-chat/translations/gpt4all_zh_CN.ts ================================================ AddCollectionView ← Existing Collections ← 现有集合 Add Document Collection 添加文档集合 Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. 添加一个包含纯文本文件、PDF或Markdown的文件夹。在“设置”中配置其他扩展。 Name 名称 Collection name... 集合名称...
Name of the collection to add (Required) 集合名称 (必须) Folder 目录 Folder path... 目录地址... Folder path to documents (Required) 文档的目录地址(必须) Browse 浏览 Create Collection 创建集合 AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. 这些模型已专门为GPT4All配置。列表前几个模型的效果最好,但你应只尝试使用适合你可用内存的模型。 Network error: could not retrieve %1 网络错误:无法检索 %1 Busy indicator 忙碌指示器 Displayed when the models request is ongoing 当模型请求处于进行中时显示 All 全部 Reasoning 推理 Model file 模型文件 Model file to be downloaded 待下载的模型 Description 描述 File description 文件描述 Cancel 取消 Resume 继续 Download 下载 Stop/restart/start the download 停止/重启/开始下载 Remove 删除 Remove model from filesystem 从系统中删除模型 Install 安装 Install online model 安装在线模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">错误</a></strong></font> Describes an error that occurred when downloading 描述下载时发生的错误 <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推荐用于您的硬件。模型需要的内存(%1 GB)超过了您系统的可用内存(%2)</strong></font> Error for incompatible hardware 硬件不兼容的错误 Download progressBar 下载进度 Shows the progress made in the download 显示下载进度 Download speed 下载速度 Download speed in bytes/kilobytes/megabytes per second 下载速度 b/kb/mb 每秒 Calculating... 计算中... Whether the file hash is being calculated 是否正在计算文件哈希 Displayed when the file hash is being calculated 在计算文件哈希时显示 ERROR: $API_KEY is empty. 错误:$API_KEY为空 enter $API_KEY 输入 $API_KEY ERROR: $BASE_URL is empty. 错误:$BASE_URL 为空 enter $BASE_URL 输入 $BASE_URL ERROR: $MODEL_NAME is empty. 错误:$MODEL_NAME 为空 enter $MODEL_NAME 输入 $MODEL_NAME File size 文件大小 RAM required 需要 RAM %1 GB %1 GB ? Parameters 参数 Quant 量化 Type 类型 AddHFModelView Use the search to find and download models from HuggingFace.
There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. 在 Hugging Face 上查找并下载模型。不能保证这些模型可以正常工作。许多模型在使用前需要额外的配置。 Discover and download models by keyword search... 通过关键词查找并下载模型 ... Text field for discovering and filtering downloadable models 用于发现和筛选可下载模型的文本字段 Searching · %1 搜索中 · %1 Initiate model discovery and filtering 启动模型发现和过滤 Triggers discovery and filtering of models 触发模型发现和过滤 Default 默认 Likes 热门 Downloads 下载量 Recent 最近 Sort by: %1 排序: %1 Asc 升序 Desc 降序 Sort dir: %1 排序方向: %1 None Limit: %1 数量: %1 Model file 模型文件 Model file to be downloaded 待下载的模型 Description 描述 File description 文件描述 Cancel 取消 Resume 继续 Download 下载 Stop/restart/start the download 停止/重启/开始下载 Remove 删除 Remove model from filesystem 从系统中删除模型 Install 安装 Install online model 安装在线模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">错误</a></strong></font> Describes an error that occurred when downloading 描述下载时发生的错误 <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推荐用于您的硬件。模型需要的内存(%1 GB)超过了您系统的可用内存(%2)</strong></font> Error for incompatible hardware 硬件不兼容的错误 Download progressBar 下载进度 Shows the progress made in the download 显示下载进度 Download speed 下载速度 Download speed in bytes/kilobytes/megabytes per second 下载速度 b/kb/mb 每秒 Calculating... 计算中... Whether the file hash is being calculated 是否正在计算文件哈希 Busy indicator 忙碌指示器 Displayed when the file hash is being calculated 在计算文件哈希时显示 ERROR: $API_KEY is empty. 错误:$API_KEY为空 enter $API_KEY 输入 $API_KEY ERROR: $BASE_URL is empty. 错误:$BASE_URL 为空 enter $BASE_URL 输入 $BASE_URL ERROR: $MODEL_NAME is empty.
错误:$MODEL_NAME 为空 enter $MODEL_NAME 输入 $MODEL_NAME File size 文件大小 Quant 量化 Type 类型 AddModelView ← Existing Models ← 已安装的模型 Explore Models 发现模型 GPT4All GPT4All Remote Providers 远程提供商 HuggingFace HuggingFace AddRemoteModelView Various remote model providers that use network resources for inference. 使用网络资源进行推理的各种远程模型提供商。 Groq Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> Groq 提供高性能 AI 推理引擎,专为低延迟和高效处理而设计。其技术经过优化,适用于实时应用,非常适合需要快速响应的大型开源语言模型和其他 AI 任务的用户。<br><br>获取您的 API 密钥:<a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> OpenAI 提供先进的 AI 模型访问权限,包括支持广泛应用的 GPT-4,涵盖对话 AI、内容生成和代码补全等场景。<br><br>获取您的 API 密钥:<a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Mistral AI 专注于高效的开源语言模型,针对各种自然语言处理任务进行了优化。其模型具备灵活性和高性能,是需要可扩展 AI 解决方案的应用的理想选择。<br><br>获取您的 API 密钥:<a href="https://mistral.ai/">https://mistral.ai/</a> Custom 自定义 The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. 
This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. 自定义提供商选项允许用户连接自己的 OpenAI 兼容 AI 模型或第三方推理服务。这对于拥有专有模型的组织,或使用此处未列出的特定 AI 提供商的用户非常有用。 ApplicationSettings Application 应用 Network dialog 网络对话 opt-in to share feedback/conversations 选择加入以共享反馈/对话 Error dialog 错误对话 Application Settings 应用设置 General 通用设置 Theme 主题 The application color scheme. 应用的主题颜色 Dark 深色 Light 亮色 ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. 错误:更新系统无法找到用于检查更新的 MaintenanceTool!<br><br>您是否使用在线安装程序安装了此应用程序?如果是的话,MaintenanceTool 可执行文件应该位于文件系统中此应用程序所在目录的上一级目录。<br><br>如果无法手动启动它,那么恐怕您需要重新安装。 LegacyDark LegacyDark Font Size 字体大小 The size of text in the application. 应用中的文本大小。 Small Medium Large Language and Locale 语言和本地化 The language and locale you wish to use. 你想使用的语言和区域设置。 System Locale 系统语言 Device 设备 The compute device used for text generation. 用于文本生成的计算设备。 Application default 程序默认 Default Model 默认模型 The preferred model for new chats. Also used as the local server fallback. 新聊天的首选模型。也用作本地服务器回退。 Suggestion Mode 建议模式 Generate suggested follow-up questions at the end of responses. 在答复结束时生成建议的后续问题。 When chatting with LocalDocs 本地文档检索 Whenever possible 只要有可能 Never 从不 Download Path 下载目录 Where to store local models and the LocalDocs database. 本地模型和本地文档数据库存储目录 Browse 浏览 Choose where to save model files 选择模型文件的保存位置 Enable Datalake 开启数据湖 Send chats and feedback to the GPT4All Open-Source Datalake. 发送对话和反馈给GPT4All 的开源数据湖。 Advanced 高级 CPU Threads CPU线程 The number of CPU threads used for inference and embedding. 用于推理和嵌入的CPU线程数 Enable System Tray 启用系统托盘 The application will minimize to the system tray when the window is closed.
当窗口关闭时,应用程序将最小化到系统托盘。 Enable Local API Server 开启本地 API 服务 Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. 将OpenAI兼容服务器暴露给本地主机。警告:导致资源使用量增加。 API Server Port API 服务端口 The port to use for the local server. Requires restart. 使用本地服务的端口,需要重启 Check For Updates 检查更新 Manually check for an update to GPT4All. 手动检查更新 Updates 更新 Chat New Chat 新对话 Server Chat 服务器对话 ChatAPIWorker ERROR: Network error occurred while connecting to the API server 错误:连接到 API 服务器时发生网络错误 ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished 收到 HTTP 错误 %1 %2 ChatCollapsibleItem Analysis encountered error 分析时遇到错误 Thinking 思考中 Analyzing 分析中 Thought for %1 %2 思考耗时 %1 %2 second seconds Analyzed 分析完成 ChatDrawer Drawer 抽屉 Main navigation drawer 导航 + New Chat + 新对话 Create a new chat 新对话 Select the current chat or edit the chat when in edit mode 选择当前的聊天或在编辑模式下编辑聊天 Edit chat name 修改对话名称 Save chat name 保存对话名称 Delete chat 删除对话 Confirm chat deletion 确认删除对话 Cancel chat deletion 取消删除对话 List of chats 对话列表 List of chats in the drawer dialog 对话框中的聊天列表 ChatItemView GPT4All GPT4All You response stopped ... 响应停止... retrieving localdocs: %1 ... 检索本地文档: %1 ... searching localdocs: %1 ... 搜索本地文档: %1 ... processing ... 处理中... generating response ... 正在生成回复… generating questions ... 正在生成问题… generating toolcall ... 正在生成工具调用… Copy 复制 %n Source(s) %n 来源 LocalDocs 本地文档 Edit this message? 编辑这条消息? All following messages will be permanently erased. 所有后续消息将被永久删除。 Redo this response? 重新生成这条回复? Cannot edit chat without a loaded model. 未加载模型时无法编辑聊天。 Cannot edit chat while the model is generating. 模型生成回复时无法编辑聊天。 Edit 编辑 Cannot redo response without a loaded model. 未加载模型时无法重新生成回复。 Cannot redo response while the model is generating. 模型生成回复时无法重新生成回复。 Redo 重做 Like response 点赞这条回复 Dislike response 点踩这条回复 Suggested follow-ups 建议的后续步骤 ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter.
您的消息过长,无法处理(%1 > %2)。请尝试简短内容。 ChatListModel TODAY 今天 THIS WEEK 本周 THIS MONTH 本月 LAST SIX MONTHS 半年内 THIS YEAR 今年内 LAST YEAR 去年 ChatTextItem Copy 复制 Copy Message 复制消息 Disable markdown 禁用 Markdown Enable markdown 启用 Markdown ChatView <h3>Warning</h3><p>%1</p> <h3>警告</h3><p>%1</p> Conversation copied to clipboard. 复制对话到剪切板。 Code copied to clipboard. 复制代码到剪切板。 The entire chat will be erased. 全部聊天记录将被清除。 Chat panel 对话面板 Chat panel with options 对话面板选项 Reload the currently loaded model 重载当前模型 Eject the currently loaded model 弹出当前加载的模型 No model installed. 没有安装模型。 Model loading error. 模型加载错误。 Waiting for model... 等待模型... Switching context... 切换上下文... Choose a model... 选择模型... Not found: %1 没找到: %1 The top item is the current model 最上方的项目为当前模型 LocalDocs 本地文档 Add documents 添加文档 add collections of documents to the chat 将文档集合添加到聊天中 Load the default model 载入默认模型 Loads the default model which can be changed in settings 加载默认模型,可以在设置中更改 No Model Installed 没有下载模型 GPT4All requires that you install at least one model to get started GPT4All要求您至少安装一个模型才能开始 Install a Model 下载模型 Shows the add model view 查看添加的模型 Conversation with the model 使用此模型对话 prompt / response pairs from the conversation 对话中的提示/响应对 Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. 旧版提示模板需要在设置中<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">更新</a>。 No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. 未配置<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">对话模板</a>。 The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">对话模板</a>不能为空。 Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings.
旧系统提示需要在设置中<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">更新</a>。 Copy 复制 Erase and reset chat session 清空并重置聊天会话 Copy chat session to clipboard 复制对话到剪切板 Add media 新增媒体 Adds media to the prompt 将媒体加入提示中 Stop generating 停止生成 Stop the current response generation 停止当前响应 Attach Single File 附加单个文件 Reloads the model 重载模型 <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>加载模型时遇到错误:</h3><br><i>"%1"</i><br><br>模型加载失败可能由多种原因引起,但最常见的原因包括文件格式错误、下载不完整或损坏、文件类型错误、系统 RAM 不足或模型类型不兼容。以下是一些解决问题的建议:<br><ul><li>确保模型文件具有兼容的格式和类型<li>检查下载文件夹中的模型文件是否完整<li>您可以在设置对话框中找到下载文件夹<li>如果您已侧载模型,请通过检查 md5sum 确保文件未损坏<li>在我们的 <a href="https://docs.gpt4all.io/">文档</a> 中了解有关 gui 支持哪些模型的更多信息<li>查看我们的 <a href="https://discord.gg/4M2QFmTt2k">discord 频道</a> 以获取帮助 Erase conversation? 清空对话? Changing the model will erase the current conversation. 更换模型将清除当前对话。 Reload · %1 重载 · %1 Loading · %1 载入中 · %1 Load · %1 (default) → 载入 · %1 (默认) → Send a message... 发送消息... Load a model to continue... 加载模型以继续...
Send messages/prompts to the model 发送消息/提示词给模型 Cut 剪切 Paste 粘贴 Select All 全选 Send message 发送消息 Sends the message/prompt contained in textfield to the model 将文本框中包含的消息/提示发送给模型 CodeInterpreter Code Interpreter 代码解释器 compute javascript code using console.log as output 使用 console.log 计算 JavaScript 代码并输出结果 CollectionsDrawer Warning: searching collections while indexing can return incomplete results 提示: 索引时搜索集合可能会返回不完整的结果 %n file(s) %n word(s) Updating 更新中 + Add Docs + 添加文档 Select a collection to make it available to the chat model. 选择一个集合,使其可用于聊天模型。 ConfirmationDialog OK Cancel 取消 Download Model "%1" is installed successfully. 模型 "%1" 安装成功 ERROR: $MODEL_NAME is empty. 错误:$MODEL_NAME 为空 ERROR: $API_KEY is empty. 错误:$API_KEY为空 ERROR: $BASE_URL is invalid. 错误:$BASE_URL 非法 ERROR: Model "%1 (%2)" is in conflict. 错误: 模型 "%1 (%2)" 有冲突. Model "%1 (%2)" is installed successfully. 模型 "%1 (%2)" 安装成功. Model "%1" is removed. 模型 "%1" 已删除. HomeView Welcome to GPT4All 欢迎 The privacy-first LLM chat application 隐私至上的大模型咨询应用程序 Start chatting 开始聊天 Start Chatting 开始聊天 Chat with any LLM 大语言模型聊天 LocalDocs 本地文档 Chat with your local files 本地文件聊天 Find Models 查找模型 Explore and download models 发现并下载模型 Latest news 新闻 Latest news from GPT4All GPT4All新闻 Release Notes 发布日志 Documentation 文档 Discord Discord X (Twitter) X (Twitter) Github Github nomic.ai nomic.ai Subscribe to Newsletter 订阅信息 LocalDocsSettings LocalDocs 本地文档 LocalDocs Settings 本地文档设置 Indexing 索引中 Allowed File Extensions 允许的文件扩展名 Comma-separated list. LocalDocs will only attempt to process files with these extensions. 逗号分隔的列表。LocalDocs 只会尝试处理具有这些扩展名的文件 Embedding Embedding Use Nomic Embed API 使用 Nomic Embed API Embed documents using the fast Nomic API instead of a private local model. Requires restart. 使用快速的 Nomic API 嵌入文档,而不是使用私有本地模型。需要重启。 Nomic API Key Nomic API Key API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart.
Nomic Embed 使用的 API 密钥。请访问官网获取,需要重启。 Embeddings Device Embeddings 设备 The compute device used for embeddings. Requires restart. 用于 embeddings 的计算设备。需要重启。 Application default 程序默认 Display 显示 Show Sources 显示来源 Display the sources used for each response. 显示每个响应所使用的来源。 Advanced 高级 Warning: Advanced usage only. 提示: 仅限高级使用。 Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. 值过大可能会导致 localdocs 失败、响应速度极慢或根本无法响应。粗略地说,{N 个字符 x N 个片段} 被添加到模型的上下文窗口中。更多信息请见<a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">此处</a>。 Document snippet size (characters) 文档片段大小(字符) Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. 每个文档片段的字符数。较大的数值增加了事实性响应的可能性,但也会导致生成速度变慢。 Max document snippets per prompt 每个提示的最大文档片段数 Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation.
检索到的文档片段最多添加到提示上下文中的前 N 个最佳匹配项。较大的数值增加了事实性响应的可能性,但也会导致生成速度变慢。 LocalDocsView LocalDocs 本地文档 Chat with your local files 和本地文件对话 + Add Collection + 添加集合 <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however.
<h3>错误:无法访问 LocalDocs 数据库或该数据库无效。</h3><br><i>注意:尝试以下任何建议的修复方法后,您将需要重新启动。</i><br><ul><li>确保设置为<b>下载路径</b>的文件夹存在于文件系统中。</li><li>检查<b>下载路径</b>的所有权以及读写权限。</li><li>如果有<b>localdocs_v2.db</b>文件,请检查其所有权和读/写权限。</li></ul><br>如果问题仍然存在,并且存在任何“localdocs_v*.db”文件,作为最后的手段,您可以<br>尝试备份并删除它们。但是,您必须重新创建您的收藏。 No Collections Installed 没有集合 Install a collection of local documents to get started using this feature 安装一组本地文档以开始使用此功能 + Add Doc Collection + 添加文档集合 Shows the add model view 查看添加的模型 Indexing progressBar 索引进度 Shows the progress made in the indexing 显示索引进度 ERROR 错误 INDEXING 索引 EMBEDDING EMBEDDING REQUIRES UPDATE 需更新 READY 准备 INSTALLING 安装中 Indexing in progress 构建索引中 Embedding in progress Embedding进度 This collection requires an update after version change 此集合需要在版本更改后进行更新 Automatically reindexes upon changes to the folder 在文件夹变动时自动重新索引 Installation in progress 安装进度 % % %n file(s) %n 文件 %n word(s) %n 词 Remove 删除 Rebuild 重新构建 Reindex this folder from scratch. This is slow and usually not needed. 从头开始重新索引此文件夹。这个过程较慢,通常情况下不需要。 Update 更新 Update the collection to the new version. This is a slow operation. 
将集合更新为新版本。这是一个缓慢的操作。 ModelList cannot open "%1": %2 无法打开“%1”:%2 cannot create "%1": %2 无法创建“%1”:%2 %1 (%2) %1 (%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>与 OpenAI 兼容的 API 模型</strong><br><ul><li>API 密钥:%1</li><li>基本 URL:%2</li><li>模型名称:%3</li></ul> <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>需要个人 OpenAI API 密钥。</li><li>警告:将把您的聊天内容发送给 OpenAI!</li><li>您的 API 密钥将存储在磁盘上</li><li>仅用于与 OpenAI 通信</li><li>您可以在此处<a href="https://platform.openai.com/account/api-keys">申请 API 密钥。</a></li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>Mistral Tiny model</strong><br> %1 <strong>Mistral Tiny model</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>需要个人 API 密钥和 API 基本 URL。</li><li>警告:将把您的聊天内容发送到您指定的与 OpenAI 兼容的 API 服务器!</li><li>您的 API 密钥将存储在磁盘上</li><li>仅用于与与 OpenAI 兼容的 API 服务器通信</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>连接到与 OpenAI 兼容的 API 服务器</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. 
<br><br><i>* 即使您为ChatGPT-4向OpenAI付款,这也不能保证API密钥访问。联系OpenAI获取更多信息。 <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> ModelSettings Model 模型 %1 system message? %1系统消息? Clear 清除 Reset 重置 The system message will be %1. 系统消息将被%1。 removed 删除 reset to the default 重置为初始状态 %1 chat template? %1 对话模板? The chat template will be %1. 对话模板将被%1。 erased 清除 Model Settings 模型设置 Clone 克隆 Remove 删除 Name 名称 Model File 模型文件 System Message 系统消息 A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. 用于设定上下文或引导模型行为的消息。若无则留空。注意:自GPT4All 3.5版本开始,此信息中不应包含控制符。 System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. 系统消息不是<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">纯文本</a>. Chat Template 对话模板 This Jinja template turns the chat into input for the model. 该Jinja模板会将聊天内容转换为模型的输入。 No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. 
未配置<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">对话模板</a>。 The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">对话模板</a>不能为空。 <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">语法错误</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. 对话模板不是<a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja格式</a>. Chat Name Prompt 聊天名称提示 Prompt used to automatically generate chat names. 用于自动生成聊天名称的提示。 Suggested FollowUp Prompt 建议的后续提示 Prompt used to generate suggested follow-up questions. 用于生成建议的后续问题的提示。 Context Length 上下文长度 Number of input and output tokens the model sees. 模型看到的输入和输出令牌的数量。 Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. 信息丢失前的最大组合提示/响应令牌。 使用比模型训练时更多的上下文将产生较差的结果。 注意:在重新加载模型之前不会生效。 Temperature 温度 Randomness of model output. Higher -> more variation. 模型输出的随机性。更高->更多的变化。 Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. 温度增加了选择不太可能的token的机会。 注:温度越高,输出越有创意,但预测性越低。 Top-P Top-P Nucleus Sampling factor. Lower -> more predictable. 核子取样系数。较低->更具可预测性。 Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. 只能选择总概率高达top_p的最有可能的令牌。 注意:防止选择极不可能的token。 Min-P Min-P Minimum token probability. Higher -> more predictable. 最小令牌概率。更高 -> 更可预测。 Sets the minimum relative probability for a token to be considered. 设置被考虑的标记的最小相对概率。 Top-K Top-K Size of selection pool for tokens. 令牌选择池的大小。 Only the top K most likely tokens will be chosen from. 
仅从最可能的前 K 个令牌中选择。 Max Length 最大长度 Maximum response length, in tokens. 最大响应长度(以令牌为单位)。 Prompt Batch Size 提示词批处理大小 The batch size used for prompt processing. 用于提示词处理的批处理大小。 Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. 一次要处理的提示令牌数量。 注意:较高的值可以加快读取提示,但会使用更多的RAM。 Repeat Penalty 重复惩罚 Repetition penalty factor. Set to 1 to disable. 重复惩罚系数。设置为 1 可禁用。 Repeat Penalty Tokens 重复惩罚令牌数 Number of previous tokens used for penalty. 用于惩罚的先前令牌数量。 GPU Layers GPU 层 Number of model layers to load into VRAM. 要加载到VRAM中的模型层数。 How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. 将多少模型层加载到VRAM中。如果GPT4All在加载此模型时耗尽VRAM,请减少此值。 较低的值会增加CPU负载和RAM使用率,并使推理速度变慢。 注意:在重新加载模型之前不会生效。 ModelsView No Models Installed 未安装模型 Install a model to get started using GPT4All 安装一个模型即可开始使用 GPT4All + Add Model + 添加模型 Shows the add model view 显示添加模型视图 Installed Models 已安装的模型 Locally installed chat models 本地安装的聊天模型 Model file 模型文件 Model file to be downloaded 待下载的模型文件 Description 描述 File description 文件描述 Cancel 取消 Resume 继续 Stop/restart/start the download 停止/重启/开始下载 Remove 删除 Remove model from filesystem 从文件系统中删除模型 Install 安装 Install online model 安装在线模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">错误</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推荐用于您的硬件。模型需要的内存(%1 GB)超过了您系统的可用内存(%2)。</strong></font> ERROR: $API_KEY is empty. 错误:$API_KEY 为空。 ERROR: $BASE_URL is empty. 错误:$BASE_URL 为空。 enter $BASE_URL 输入 $BASE_URL ERROR: $MODEL_NAME is empty. 错误:$MODEL_NAME 为空。 enter $MODEL_NAME 输入 $MODEL_NAME %1 GB %1 GB ? 
Describes an error that occurred when downloading 描述下载时发生的错误 Error for incompatible hardware 硬件不兼容的错误 Download progressBar 下载进度条 Shows the progress made in the download 显示下载进度 Download speed 下载速度 Download speed in bytes/kilobytes/megabytes per second 下载速度(字节/千字节/兆字节每秒) Calculating... 计算中... Whether the file hash is being calculated 是否正在计算文件哈希 Busy indicator 忙碌指示器 Displayed when the file hash is being calculated 在计算文件哈希时显示 enter $API_KEY 输入 $API_KEY File size 文件大小 RAM required 所需内存 Parameters 参数 Quant 量化 Type 类型 MyFancyLink Fancy link 精选链接 A stylized link 样式化链接 MyFileDialog Please choose a file 请选择一个文件 MyFolderDialog Please choose a directory 请选择目录 MySettingsLabel Clear 清除 Reset 重置 MySettingsTab Restore defaults? 恢复默认设置? This page of settings will be reset to the defaults. 该页面的设置项将重置为默认值。 Restore Defaults 恢复默认设置 Restores settings dialog to a default state 将设置对话框恢复为默认状态 NetworkDialog Contribute data to the GPT4All Opensource Datalake. 向GPT4All开源数据湖贡献数据。 By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! 
通过启用此功能,您将能够通过为未来的模型改进贡献数据来参与训练大型语言模型的民主过程。 当 GPT4All 模型回复您并且您已选择加入时,您的对话将被发送到 GPT4All 开源数据湖。此外,您可以喜欢/不喜欢它的回复。如果您不喜欢某个回复,您可以建议其他回复。这些数据将在 GPT4All 数据湖中收集和汇总。 注意:通过启用此功能,您将把数据发送到 GPT4All 开源数据湖。启用此功能后,您不应该期望聊天隐私。但是,如果您愿意,您应该期望可选的署名。您的聊天数据将公开供任何人下载,并将被 Nomic AI 用于改进未来的 GPT4All 模型。Nomic AI 将保留与您的数据相关的所有署名信息,并且您将被视为使用您的数据的任何 GPT4All 模型发布的贡献者! Terms for opt-in 选择加入的条款 Describes what will happen when you opt-in 描述选择加入时会发生的情况 Please provide a name for attribution (optional) 请提供用于署名的名称(可选) Attribution (optional) 署名(可选) Provide attribution 提供署名 Enable 启用 Enable opt-in 启用选择加入 Cancel 取消 Cancel opt-in 取消选择加入 NewVersionDialog New version is available 新版本可用 Update 更新 Update to new version 更新到新版本 PopupDialog Reveals a shortlived help balloon 显示一个短暂的帮助气球 Busy indicator 忙碌指示器 Displayed when the popup is showing busy 在弹出窗口显示忙碌时显示 RemoteModelCard API Key API 密钥 ERROR: $API_KEY is empty. 错误:$API_KEY 为空。 enter $API_KEY 输入 $API_KEY Whether the file hash is being calculated 是否正在计算文件哈希 Base Url 基础 URL ERROR: $BASE_URL is empty. 错误:$BASE_URL 为空。 enter $BASE_URL 输入 $BASE_URL Model Name 模型名称 ERROR: $MODEL_NAME is empty. 错误:$MODEL_NAME 为空。 enter $MODEL_NAME 输入 $MODEL_NAME Models 模型 Install 安装 Install remote model 安装远程模型 SettingsView Settings 设置 Contains various application settings 包含各种应用程序设置 Application 应用 Model 模型 LocalDocs 本地文档 StartupDialog Welcome! 欢迎! ### Release Notes %1<br/> ### Contributors %2 ### 发布日志 %1<br/> ### 贡献者 %2 Release notes 发布日志 Release notes for this version 本版本发布日志 ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. 
This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### 选择加入匿名使用分析和数据湖 通过启用这些功能,您将能够通过为未来的模型改进贡献数据来参与训练大型语言模型的民主过程。 当 GPT4All 模型回复您并且您已选择加入时,您的对话将被发送到 GPT4All 开源数据湖。此外,您可以喜欢/不喜欢它的回复。如果您不喜欢某个回复,您可以建议其他回复。这些数据将在 GPT4All 数据湖中收集和汇总。 注意:通过启用此功能,您将把您的数据发送到 GPT4All 开源数据湖。 启用此功能后,您不应该期望聊天隐私。但是,如果您愿意,您应该期望可选的署名。您的聊天数据将公开供任何人下载,并将由 Nomic AI 用于改进未来的 GPT4All 模型。 Nomic AI 将保留与您的数据相关的所有署名信息,并且您将被视为使用您的数据的任何 GPT4All 模型发布的贡献者! Terms for opt-in 选择加入的条款 Describes what will happen when you opt-in 描述选择加入时会发生的情况 Opt-in to anonymous usage analytics used to improve GPT4All 选择加入匿名使用分析,以帮助改进GPT4All Opt-in for anonymous usage statistics 选择加入匿名使用统计数据 Yes Allow opt-in for anonymous usage statistics 允许选择加入匿名使用统计数据 No Opt-out for anonymous usage statistics 选择退出匿名使用统计数据 Allow opt-out for anonymous usage statistics 允许选择退出匿名使用统计数据 Opt-in to anonymous sharing of chats to the GPT4All Datalake 选择匿名共享聊天记录到GPT4All数据湖 Opt-in for network 选择加入网络 Allow opt-in for network 允许选择加入网络 Allow opt-in anonymous sharing of chats to the GPT4All Datalake 允许选择加入匿名共享聊天至 GPT4All 数据湖 Opt-out for network 选择退出网络 Allow opt-out anonymous sharing of chats to the GPT4All Datalake 允许选择退出将聊天匿名共享至 GPT4All 数据湖 ThumbsDownDialog Please edit the text below to provide a better response. (optional) 请编辑下方文本以提供更好的回复。(可选) Please provide a better response... 请提供更好的回答... 
Submit 提交 Submits the user's response 提交用户响应 Cancel 取消 Closes the response dialog 关闭回复对话框 main GPT4All v%1 GPT4All v%1 Restore 恢复 Quit 退出 <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>启动时遇到错误:</h3><br><i>“检测到不兼容的硬件。”</i><br><br>很遗憾,您的 CPU 不满足运行此程序的最低要求。特别是,它不支持此程序成功运行现代大型语言模型所需的 AVX 指令。目前唯一的解决方案是将您的硬件升级到更现代的 CPU。<br><br>有关更多信息,请参阅此处:<a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. <h3>启动时遇到错误:</h3><br><i>“无法访问设置文件。”</i><br><br>不幸的是,某些东西阻止程序访问设置文件。这可能是由于设置文件所在的本地应用程序配置目录中的权限不正确造成的。请查看我们的<a href="https://discord.gg/4M2QFmTt2k">discord 频道</a>以获取帮助。 Connection to datalake failed. 连接数据湖失败。 Saving chats. 
正在保存对话。 Network dialog 网络对话 opt-in to share feedback/conversations 选择加入以共享反馈/对话 Home view 主页 Home view of application 主页 Home 主页 Chat view 对话视图 Chat view to interact with models 聊天视图可与模型互动 Chats 对话 Models 模型 Models view for installed models 已安装模型的页面 LocalDocs 本地文档 LocalDocs view to configure and use local docs LocalDocs视图可配置和使用本地文档 Settings 设置 Settings view for application configuration 设置页面 The datalake is enabled 数据湖已开启 Using a network model 使用联网模型 Server mode is enabled 服务器模式已开启 Installed models 已安装的模型 View of installed models 查看已安装模型 ================================================ FILE: gpt4all-chat/translations/gpt4all_zh_TW.ts ================================================ AddCollectionView ← Existing Collections ← 現有收藏 Add Document Collection 新增文件收藏 Add a folder containing plain text files, PDFs, or Markdown. Configure additional extensions in Settings. 新增一個含有純文字檔案、PDF 與 Markdown 文件的資料夾。可在設定中新增其他副檔名。 Name 名稱 Collection name... 收藏名稱...... Name of the collection to add (Required) 新增的收藏名稱(必填) Folder 資料夾 Folder path... 資料夾路徑...... Folder path to documents (Required) 文件所屬的資料夾路徑(必填) Browse 瀏覽 Create Collection 建立收藏 AddGPT4AllModelView These models have been specifically configured for use in GPT4All. The first few models on the list are known to work the best, but you should only attempt to use models that will fit in your available memory. 
Network error: could not retrieve %1 網路錯誤:無法取得 %1 Busy indicator 忙線指示器 Displayed when the models request is ongoing 當模型請求正在進行時顯示 All Reasoning Model file 模型檔案 Model file to be downloaded 即將下載的模型檔案 Description 描述 File description 檔案描述 Cancel 取消 Resume 恢復 Download 下載 Stop/restart/start the download 停止/重啟/開始下載 Remove 移除 Remove model from filesystem 從檔案系統移除模型 Install 安裝 Install online model 安裝線上模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">錯誤</a></strong></font> Describes an error that occurred when downloading 解釋下載時發生的錯誤 <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推薦在您的硬體上運作。模型需要比較多的記憶體(%1 GB),但您的系統記憶體空間不足(%2)。</strong></font> Error for incompatible hardware 錯誤,不相容的硬體 Download progressBar 下載進度條 Shows the progress made in the download 顯示下載進度 Download speed 下載速度 Download speed in bytes/kilobytes/megabytes per second 下載速度每秒 bytes/kilobytes/megabytes Calculating... 計算中...... Whether the file hash is being calculated 是否正在計算檔案雜湊 Displayed when the file hash is being calculated 計算檔案雜湊值時顯示 ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 enter $API_KEY 請輸入 $API_KEY ERROR: $BASE_URL is empty. 錯誤:$BASE_URL 未填寫。 enter $BASE_URL 請輸入 $BASE_URL ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 enter $MODEL_NAME 請輸入 $MODEL_NAME File size 檔案大小 RAM required 所需的記憶體 %1 GB %1 GB ? Parameters 參數 Quant 量化 Type 類型 AddHFModelView Use the search to find and download models from HuggingFace. There is NO GUARANTEE that these will work. Many will require additional configuration before they can be used. Discover and download models by keyword search... 透過關鍵字搜尋探索並下載模型...... 
Text field for discovering and filtering downloadable models 用於探索與過濾可下載模型的文字欄位 Searching · %1 搜尋 · %1 Initiate model discovery and filtering 探索與過濾模型 Triggers discovery and filtering of models 觸發探索與過濾模型 Default 預設 Likes Downloads 下載次數 Recent 最新 Sort by: %1 排序依據:%1 Asc 升序 Desc 降序 Sort dir: %1 排序順序:%1 None Limit: %1 上限:%1 Model file 模型檔案 Model file to be downloaded 即將下載的模型檔案 Description 描述 File description 檔案描述 Cancel 取消 Resume 恢復 Download 下載 Stop/restart/start the download 停止/重啟/開始下載 Remove 移除 Remove model from filesystem 從檔案系統移除模型 Install 安裝 Install online model 安裝線上模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">錯誤</a></strong></font> Describes an error that occurred when downloading 解釋下載時發生的錯誤 <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推薦在您的硬體上運作。模型需要比較多的記憶體(%1 GB),但您的系統記憶體空間不足(%2)。</strong></font> Error for incompatible hardware 錯誤,不相容的硬體 Download progressBar 下載進度條 Shows the progress made in the download 顯示下載進度 Download speed 下載速度 Download speed in bytes/kilobytes/megabytes per second 下載速度每秒 bytes/kilobytes/megabytes Calculating... 計算中...... Whether the file hash is being calculated 是否正在計算檔案雜湊 Busy indicator 忙線指示器 Displayed when the file hash is being calculated 計算檔案雜湊值時顯示 ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 enter $API_KEY 請輸入 $API_KEY ERROR: $BASE_URL is empty. 錯誤:$BASE_URL 未填寫。 enter $BASE_URL 請輸入 $BASE_URL ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 enter $MODEL_NAME 請輸入 $MODEL_NAME File size 檔案大小 Quant 量化 Type 類型 AddModelView ← Existing Models ← 現有模型 Explore Models 探索模型 GPT4All GPT4All Remote Providers HuggingFace Discover and download models by keyword search... 透過關鍵字搜尋探索並下載模型...... 
Text field for discovering and filtering downloadable models 用於探索與過濾可下載模型的文字欄位 Searching · %1 搜尋 · %1 Initiate model discovery and filtering 探索與過濾模型 Triggers discovery and filtering of models 觸發探索與過濾模型 Default 預設 Likes Downloads 下載次數 Recent 最新 Sort by: %1 排序依據:%1 Asc 升序 Desc 降序 Sort dir: %1 排序順序:%1 None Limit: %1 上限:%1 Network error: could not retrieve %1 網路錯誤:無法取得 %1 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">錯誤</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推薦在您的硬體上運作。模型需要比較多的記憶體(%1 GB),但您的系統記憶體空間不足(%2)。</strong></font> %1 GB %1 GB ? Busy indicator 忙線指示器 Displayed when the models request is ongoing 當模型請求正在進行時顯示 Model file 模型檔案 Model file to be downloaded 即將下載的模型檔案 Description 描述 File description 檔案描述 Cancel 取消 Resume 恢復 Download 下載 Stop/restart/start the download 停止/重啟/開始下載 Remove 移除 Remove model from filesystem 從檔案系統移除模型 Install 安裝 Install online model 安裝線上模型 Describes an error that occurred when downloading 解釋下載時發生的錯誤 Error for incompatible hardware 錯誤,不相容的硬體 Download progressBar 下載進度條 Shows the progress made in the download 顯示下載進度 Download speed 下載速度 Download speed in bytes/kilobytes/megabytes per second 下載速度每秒 bytes/kilobytes/megabytes Calculating... 計算中...... Whether the file hash is being calculated 是否正在計算檔案雜湊 Displayed when the file hash is being calculated 計算檔案雜湊值時顯示 ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 enter $API_KEY 請輸入 $API_KEY ERROR: $BASE_URL is empty. 錯誤:$BASE_URL 未填寫。 enter $BASE_URL 請輸入 $BASE_URL ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 enter $MODEL_NAME 請輸入 $MODEL_NAME File size 檔案大小 RAM required 所需的記憶體 Parameters 參數 Quant 量化 Type 類型 AddRemoteModelView Various remote model providers that use network resources for inference. 
Groq Groq offers a high-performance AI inference engine designed for low-latency and efficient processing. Optimized for real-time applications, Groq’s technology is ideal for users who need fast responses from open large language models and other AI workloads.<br><br>Get your API key: <a href="https://console.groq.com/keys">https://groq.com/</a> OpenAI OpenAI provides access to advanced AI models, including GPT-4 supporting a wide range of applications, from conversational AI to content generation and code completion.<br><br>Get your API key: <a href="https://platform.openai.com/signup">https://openai.com/</a> Mistral Mistral AI specializes in efficient, open-weight language models optimized for various natural language processing tasks. Their models are designed for flexibility and performance, making them a solid option for applications requiring scalable AI solutions.<br><br>Get your API key: <a href="https://mistral.ai/">https://mistral.ai/</a> Custom The custom provider option allows users to connect their own OpenAI-compatible AI models or third-party inference services. This is useful for organizations with proprietary models or those leveraging niche AI providers not listed here. ApplicationSettings Application 應用程式 Network dialog 資料湖泊計畫對話視窗 opt-in to share feedback/conversations 分享回饋/對話計畫 Error dialog 錯誤對話視窗 Application Settings 應用程式設定 General 一般 Theme 主題 The application color scheme. 應用程式的配色方案。 Dark 暗色 Light 亮色 LegacyDark 傳統暗色 Font Size 字體大小 The size of text in the application. 應用程式中的字體大小。 Small Medium Large Language and Locale 語言與區域設定 The language and locale you wish to use. 您希望使用的語言與區域設定。 System Locale 系統語系 Device 裝置 Default Model 預設模型 The preferred model for new chats. Also used as the local server fallback. 用於新交談的預設模型。也用於作為本機伺服器後援使用。 Suggestion Mode 建議模式 When chatting with LocalDocs 當使用「我的文件」交談時 Whenever possible 視情況允許 Never 永不 Enable System Tray The application will minimize to the system tray when the window is closed. 
Enable Local API Server 啟用本機 API 伺服器 Generate suggested follow-up questions at the end of responses. 在回覆末尾生成後續建議的問題。 ERROR: Update system could not find the MaintenanceTool used to check for updates!<br/><br/>Did you install this application using the online installer? If so, the MaintenanceTool executable should be located one directory above where this application resides on your filesystem.<br/><br/>If you can't start it manually, then I'm afraid you'll have to reinstall. 錯誤:更新系統找不到可用於檢查更新的維護工具!<br/><br/>您是否使用了線上安裝程式安裝本應用程式?若是如此,維護工具的執行檔(MaintenanceTool)應位於本應用程式所在目錄的上一層。<br/><br/>如果您無法手動啟動它,恐怕您只能重新安裝本應用程式。 The compute device used for text generation. 用於生成文字的計算裝置。 Application default 應用程式預設值 Download Path 下載路徑 Where to store local models and the LocalDocs database. 儲存本機模型與「我的文件」資料庫的位置。 Browse 瀏覽 Choose where to save model files 選擇儲存模型檔案的位置 Enable Datalake 啟用資料湖泊 Send chats and feedback to the GPT4All Open-Source Datalake. 將交談與回饋傳送到 GPT4All 開放原始碼資料湖泊。 Advanced 進階 CPU Threads 中央處理器執行緒 The number of CPU threads used for inference and embedding. 用於推理與嵌入的中央處理器執行緒數。 Save Chat Context 儲存交談語境 Save the chat model's state to disk for faster loading. WARNING: Uses ~2GB per chat. 將交談模型的狀態儲存到磁碟以加快載入速度。警告:每次交談使用約 2GB。 Expose an OpenAI-Compatible server to localhost. WARNING: Results in increased resource usage. 將 OpenAI 相容伺服器公開給本機。警告:會導致資源使用增加。 API Server Port API 伺服器埠口 The port to use for the local server. Requires restart. 用於本機伺服器的埠口。需要重新啟動。 Check For Updates 檢查更新 Manually check for an update to GPT4All. 
手動檢查 GPT4All 的更新。 Updates 更新 Chat New Chat 新的交談 Server Chat 伺服器交談 ChatAPIWorker ERROR: Network error occurred while connecting to the API server 錯誤:網路錯誤,無法連線到目標 API 伺服器 ChatAPIWorker::handleFinished got HTTP Error %1 %2 ChatAPIWorker::handleFinished 遇到一個 HTTP 錯誤 %1 %2 ChatCollapsibleItem Analysis encountered error Thinking Analyzing Thought for %1 %2 second seconds Analyzed ChatDrawer Drawer 側邊欄 Main navigation drawer 主要導航側邊欄 + New Chat + 新的交談 Create a new chat 建立新的交談 Select the current chat or edit the chat when in edit mode 選擇目前交談或在編輯模式下編輯交談 Edit chat name 修改對話名稱 Save chat name 儲存對話名稱 Delete chat 刪除對話 Confirm chat deletion 確定刪除對話 Cancel chat deletion 取消刪除對話 List of chats 交談列表 List of chats in the drawer dialog 側邊欄對話視窗的交談列表 ChatItemView GPT4All GPT4All You response stopped ... 回覆停止...... retrieving localdocs: %1 ... 檢索本機文件中:%1 ...... searching localdocs: %1 ... 搜尋本機文件中:%1 ...... processing ... 處理中...... generating response ... 生成回覆...... generating questions ... 生成問題...... generating toolcall ... Copy 複製 Copy Message 複製訊息 Disable markdown 停用 Markdown Enable markdown 啟用 Markdown %n Source(s) %n 來源 LocalDocs 我的文件 Edit this message? All following messages will be permanently erased. Redo this response? Cannot edit chat without a loaded model. Cannot edit chat while the model is generating. Edit Cannot redo response without a loaded model. Cannot redo response while the model is generating. Redo Like response Dislike response Suggested follow-ups 後續建議 ChatLLM Your message was too long and could not be processed (%1 > %2). Please try again with something shorter. ChatListModel TODAY 今天 THIS WEEK 這星期 THIS MONTH 這個月 LAST SIX MONTHS 前六個月 THIS YEAR 今年 LAST YEAR 去年 ChatTextItem Copy 複製 Copy Message 複製訊息 Disable markdown 停用 Markdown Enable markdown 啟用 Markdown ChatView <h3>Warning</h3><p>%1</p> <h3>警告</h3><p>%1</p> Switch model dialog 切換模型對話視窗 Warn the user if they switch models, then context will be erased 警告使用者如果切換模型,則語境將被刪除 Conversation copied to clipboard. 
對話已複製到剪貼簿。 Code copied to clipboard. 程式碼已複製到剪貼簿。 The entire chat will be erased. Chat panel 交談面板 Chat panel with options 具有選項的交談面板 Reload the currently loaded model 重新載入目前已載入的模型 Eject the currently loaded model 彈出目前載入的模型 No model installed. 沒有已安裝的模型。 Model loading error. 模型載入時發生錯誤。 Waiting for model... 等待模型中...... Switching context... 切換語境中...... Choose a model... 選擇一個模型...... Not found: %1 不存在:%1 Reload · %1 重新載入 · %1 Loading · %1 載入中 · %1 Load · %1 (default) → 載入 · %1 (預設) → Legacy prompt template needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. Legacy system prompt needs to be <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">updated</a> in Settings. The top item is the current model 最上面的那項是目前使用的模型 Erase conversation? Changing the model will erase the current conversation. LocalDocs 我的文件 Add documents 新增文件 add collections of documents to the chat 將文件集合新增至交談中 Load the default model 載入預設模型 Loads the default model which can be changed in settings 預設模型可於設定中變更 No Model Installed 沒有已安裝的模型 GPT4All requires that you install at least one model to get started GPT4All 要求您至少安裝一個模型才能開始使用 Install a Model 安裝一個模型 Shows the add model view 顯示新增模型視圖 Conversation with the model 與模型對話 prompt / response pairs from the conversation 對話中的提示詞 / 回覆組合 GPT4All GPT4All You response stopped ... 回覆停止...... retrieving localdocs: %1 ... 檢索本機文件中:%1 ...... searching localdocs: %1 ... 搜尋本機文件中:%1 ...... processing ... 處理中...... generating response ... 生成回覆...... generating questions ... 生成問題...... 
Copy 複製 Copy Message 複製訊息 Disable markdown 停用 Markdown Enable markdown 啟用 Markdown Thumbs up Gives a thumbs up to the response 對這則回覆比讚 Thumbs down 倒讚 Opens thumbs down dialog 開啟倒讚對話視窗 Suggested follow-ups 後續建議 Erase and reset chat session 刪除並重置交談會話 Copy chat session to clipboard 複製交談會話到剪貼簿 Redo last chat response 重做上一個交談回覆 Add media 附加媒體文件 Adds media to the prompt 附加媒體文件到提示詞 Stop generating 停止生成 Stop the current response generation 停止當前回覆生成 Attach 附加 Single File 單一文件 Reloads the model 重新載入模型 <h3>Encountered an error loading model:</h3><br><i>"%1"</i><br><br>Model loading failures can happen for a variety of reasons, but the most common causes include a bad file format, an incomplete or corrupted download, the wrong file type, not enough system RAM or an incompatible model type. Here are some suggestions for resolving the problem:<br><ul><li>Ensure the model file has a compatible format and type<li>Check the model file is complete in the download folder<li>You can find the download folder in the settings dialog<li>If you've sideloaded the model ensure the file is not corrupt by checking md5sum<li>Read more about what models are supported in our <a href="https://docs.gpt4all.io/">documentation</a> for the gui<li>Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help <h3>載入模型時發生錯誤:</h3><br><i>"%1"</i><br><br>導致模型載入失敗的原因可能有很多種,但絕大多數的原因是檔案格式損毀、下載的檔案不完整、檔案類型錯誤、系統RAM空間不足或不相容的模型類型。這裡有些建議可供疑難排解:<br><ul><li>確保使用的模型是相容的格式與類型<li>檢查位於下載資料夾的檔案是否完整<li>您可以從設定中找到您所設定的「下載資料夾路徑」<li>如果您有側載模型,請利用 md5sum 等工具確保您的檔案是完整的<li>想了解更多關於我們所支援的模型資訊,煩請詳閱<a href="https://docs.gpt4all.io/">本文件</a>。<li>歡迎洽詢我們的 <a href="https://discord.gg/4M2QFmTt2k">Discord 伺服器</a> 以尋求幫助 restoring from text ... 從文字中恢復...... %n Source(s) %n 來源 Send a message... 傳送一則訊息...... Load a model to continue... 載入模型以繼續...... 
Send messages/prompts to the model 向模型傳送訊息/提示詞 Cut 剪下 Paste 貼上 Select All 全選 Send message 傳送訊息 Sends the message/prompt contained in textfield to the model 將文字欄位中包含的訊息/提示詞傳送到模型 CodeInterpreter Code Interpreter compute javascript code using console.log as output CollectionsDrawer Warning: searching collections while indexing can return incomplete results 警告:在索引時搜尋收藏可能會傳回不完整的結果 %n file(s) %n 個檔案 %n word(s) %n 個字 Updating 更新中 + Add Docs + 新增文件 Select a collection to make it available to the chat model. 選擇一個收藏以使其可供交談模型使用。 ConfirmationDialog OK Cancel 取消 Download Model "%1" is installed successfully. 模型「%1」已安裝成功。 ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 ERROR: $BASE_URL is invalid. 錯誤:$BASE_URL 無效。 ERROR: Model "%1 (%2)" is conflict. 錯誤:模型「%1 (%2)」發生衝突。 Model "%1 (%2)" is installed successfully. 模型「%1(%2)」已安裝成功。 Model "%1" is removed. 模型「%1」已移除。 HomeView Welcome to GPT4All 歡迎使用 GPT4All The privacy-first LLM chat application 隱私第一的大型語言模型交談應用程式 Start chatting 開始交談 Start Chatting 開始交談 Chat with any LLM 與任何大型語言模型交談 LocalDocs 我的文件 Chat with your local files 使用「我的文件」來交談 Find Models 搜尋模型 Explore and download models 瀏覽與下載模型 Latest news 最新消息 Latest news from GPT4All 從 GPT4All 來的最新消息 Release Notes 版本資訊 Documentation 文件 Discord Discord X (Twitter) X (Twitter) Github Github nomic.ai nomic.ai Subscribe to Newsletter 訂閱電子報 LocalDocsSettings LocalDocs 我的文件 LocalDocs Settings 我的文件設定 Indexing 索引 Allowed File Extensions 允許的副檔名 Comma-separated list. LocalDocs will only attempt to process files with these extensions. 以逗號分隔的列表。「我的文件」將僅嘗試處理具有這些副檔名的檔案。 Embedding 嵌入 Use Nomic Embed API 使用 Nomic 嵌入 API Embed documents using the fast Nomic API instead of a private local model. Requires restart. 使用快速的 Nomic API 而不是本機私有模型嵌入文件。需要重新啟動。 Nomic API Key Nomic API 金鑰 API key to use for Nomic Embed. Get one from the Atlas <a href="https://atlas.nomic.ai/cli-login">API keys page</a>. Requires restart. 
用於 Nomic Embed 的 API 金鑰。從 Atlas <a href="https://atlas.nomic.ai/cli-login">API 金鑰頁面</a>取得一個。需要重新啟動。 Embeddings Device 嵌入裝置 The compute device used for embeddings. Requires restart. 用於嵌入的計算裝置。需要重新啟動。 Application default 應用程式預設值 Display 顯示 Show Sources 查看來源 Display the sources used for each response. 顯示每則回覆所使用的來源。 Advanced 進階 Warning: Advanced usage only. 警告:僅限進階使用。 Values too large may cause localdocs failure, extremely slow responses or failure to respond at all. Roughly speaking, the {N chars x N snippets} are added to the model's context window. More info <a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">here</a>. 設定太大的數值可能會導致「我的文件」處理失敗、反應速度極慢或根本無法回覆。簡單地說,這會將 {N 個字元 x N 個片段} 被添加到模型的語境視窗中。更多資訊<a href="https://docs.gpt4all.io/gpt4all_desktop/localdocs.html">此處</a>。 Document snippet size (characters) 文件片段大小(字元) Number of characters per document snippet. Larger numbers increase likelihood of factual responses, but also result in slower generation. 每個文件片段的字元數。較大的數字會增加實際反應的可能性,但也會導致生成速度變慢。 Max document snippets per prompt 每個提示詞的最大文件片段 Max best N matches of retrieved document snippets to add to the context for prompt. Larger numbers increase likelihood of factual responses, but also result in slower generation. 
新增至提示詞語境中的檢索到的文件片段的最大 N 個符合的項目。較大的數字會增加實際反應的可能性,但也會導致生成速度變慢。 LocalDocsView LocalDocs 我的文件 Chat with your local files 使用「我的文件」來交談 + Add Collection + 新增收藏 <h3>ERROR: The LocalDocs database cannot be accessed or is not valid.</h3><br><i>Note: You will need to restart after trying any of the following suggested fixes.</i><br><ul><li>Make sure that the folder set as <b>Download Path</b> exists on the file system.</li><li>Check ownership as well as read and write permissions of the <b>Download Path</b>.</li><li>If there is a <b>localdocs_v2.db</b> file, check its ownership and read/write permissions, too.</li></ul><br>If the problem persists and there are any 'localdocs_v*.db' files present, as a last resort you can<br>try backing them up and removing them. You will have to recreate your collections, however. <h3>錯誤:「我的文件」資料庫已無法存取或已損壞。</h3><br><i>提醒:執行完以下任何疑難排解的動作後,請務必重新啟動應用程式。</i><br><ul><li>請確保<b>「下載路徑」</b>所指向的資料夾確實存在於檔案系統當中。</li><li>檢查 <b>「下載路徑」</b>所指向的資料夾,確保其「擁有者」為您本身,以及確保您對該資料夾擁有讀寫權限。</li><li>如果該資料夾內存在一份名為 <b>localdocs_v2.db</b> 的檔案,請同時確保您對其擁有讀寫權限。</li></ul><br>如果問題依舊存在,且該資料夾內存在與「localdocs_v*.db」名稱相關的檔案,請嘗試備份並移除它們。<br>雖然這樣一來,您恐怕得著手重建您的收藏,但這將或許能夠解決這份錯誤。 No Collections Installed 沒有已安裝的收藏 Install a collection of local documents to get started using this feature 安裝本機文件收藏以開始使用此功能 + Add Doc Collection + 新增文件收藏 Shows the add model view 查看新增的模型視圖 Indexing progressBar 索引進度條 Shows the progress made in the indexing 顯示索引進度 ERROR 錯誤 INDEXING 索引中 EMBEDDING 嵌入中 REQUIRES UPDATE 必須更新 READY 已就緒 INSTALLING 安裝中 Indexing in progress 正在索引 Embedding in progress 正在嵌入 This collection requires an update after version change 該收藏需要在版本變更後更新 Automatically reindexes upon changes to the folder 若資料夾有變動,會自動重新索引 Installation in progress 正在安裝中 % % %n file(s) %n 個檔案 %n word(s) %n 個字 Remove 移除 Rebuild 重建 Reindex this folder from scratch. This is slow and usually not needed. 重新索引該資料夾。這將會耗費許多時間並且通常不太需要這樣做。 Update 更新 Update the collection to the new version. This is a slow operation. 
更新收藏。這將會耗費許多時間。 ModelList cannot open "%1": %2 無法開啟“%1”:%2 cannot create "%1": %2 無法建立“%1”:%2 %1 (%2) %1(%2) <strong>OpenAI-Compatible API Model</strong><br><ul><li>API Key: %1</li><li>Base URL: %2</li><li>Model Name: %3</li></ul> <strong>OpenAI API 相容模型</strong><br><ul><li>API 金鑰:%1</li><li>基底 URL:%2</li><li>模型名稱:%3</li></ul> <ul><li>Requires personal OpenAI API key.</li><li>WARNING: Will send your chats to OpenAI!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with OpenAI</li><li>You can apply for an API key <a href="https://platform.openai.com/account/api-keys">here.</a></li> <ul><li>需要個人的 OpenAI API 金鑰。</li><li>警告:這將會傳送您的交談紀錄到 OpenAI</li><li>您的 API 金鑰將被儲存在硬碟上</li><li>它只被用於與 OpenAI 進行通訊</li><li>您可以在<a href="https://platform.openai.com/account/api-keys">此處</a>申請一個 API 金鑰。</li> <strong>OpenAI's ChatGPT model GPT-3.5 Turbo</strong><br> %1 <strong>OpenAI 的 ChatGPT 模型 GPT-3.5 Turbo</strong><br> %1 <br><br><i>* Even if you pay OpenAI for ChatGPT-4 this does not guarantee API key access. Contact OpenAI for more info. 
<br><br><i>* 即使您已向 OpenAI 付費購買了 ChatGPT 的 GPT-4 模型使用權,但這也不能保證您能擁有 API 金鑰的使用權限。請聯繫 OpenAI 以查閱更多資訊。 <strong>OpenAI's ChatGPT model GPT-4</strong><br> %1 %2 <strong>OpenAI 的 ChatGPT 模型 GPT-4</strong><br> %1 %2 <ul><li>Requires personal Mistral API key.</li><li>WARNING: Will send your chats to Mistral!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with Mistral</li><li>You can apply for an API key <a href="https://console.mistral.ai/user/api-keys">here</a>.</li> <ul><li>需要個人的 Mistral API 金鑰。</li><li>警告:這將會傳送您的交談紀錄到 Mistral!</li><li>您的 API 金鑰將被儲存在硬碟上</li><li>它只被用於與 Mistral 進行通訊</li><li>您可以在<a href="https://console.mistral.ai/user/api-keys">此處</a>申請一個 API 金鑰。</li> <strong>Mistral Tiny model</strong><br> %1 <strong>Mistral 迷你模型</strong><br> %1 <strong>Mistral Small model</strong><br> %1 <strong>Mistral 小型模型</strong><br> %1 <strong>Mistral Medium model</strong><br> %1 <strong>Mistral 中型模型</strong><br> %1 <ul><li>Requires personal API key and the API base URL.</li><li>WARNING: Will send your chats to the OpenAI-compatible API Server you specified!</li><li>Your API key will be stored on disk</li><li>Will only be used to communicate with the OpenAI-compatible API Server</li> <ul><li>需要個人的 API 金鑰和 API 的基底 URL(Base URL)。</li><li>警告:這將會傳送您的交談紀錄到您所指定的 OpenAI API 相容伺服器</li><li>您的 API 金鑰將被儲存在硬碟上</li><li>它只被用於與其 OpenAI API 相容伺服器進行通訊</li> <strong>Connect to OpenAI-compatible API server</strong><br> %1 <strong>連線到 OpenAI API 相容伺服器</strong><br> %1 <strong>Created by %1.</strong><br><ul><li>Published on %2.<li>This model has %3 likes.<li>This model has %4 downloads.<li>More info can be found <a href="https://huggingface.co/%5">here.</a></ul> <strong>模型作者:%1</strong><br><ul><li>發佈日期:%2<li>累積讚數:%3 個讚<li>下載次數:%4 次<li>更多資訊請查閱<a href="https://huggingface.co/%5">此處</a>。</ul> ModelSettings Model 模型 %1 system message? Clear Reset The system message will be %1. removed reset to the default %1 chat template? The chat template will be %1. 
erased Model Settings 模型設定 Clone 複製 Remove 移除 Name 名稱 Model File 模型檔案 System Prompt 系統提示詞 Prefixed at the beginning of every conversation. Must contain the appropriate framing tokens. 在每個對話的開頭加上前綴。必須包含適當的構建符元(framing tokens)。 Prompt Template 提示詞模板 The template that wraps every prompt. 包裝每個提示詞的模板。 Must contain the string "%1" to be replaced with the user's input. 必須包含要替換為使用者輸入的字串「%1」。 System Message A message to set the context or guide the behavior of the model. Leave blank for none. NOTE: Since GPT4All 3.5, this should not contain control tokens. System message is not <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">plain text</a>. Chat Template This Jinja template turns the chat into input for the model. No <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> configured. The <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">chat template</a> cannot be blank. <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Syntax error</a>: %1 Chat template is not in <a href="https://docs.gpt4all.io/gpt4all_desktop/chat_templates.html">Jinja format</a>. Chat Name Prompt 交談名稱提示詞 Prompt used to automatically generate chat names. 用於自動生成交談名稱的提示詞。 Suggested FollowUp Prompt 後續建議提示詞 Prompt used to generate suggested follow-up questions. 用於生成後續建議問題的提示詞。 Context Length 語境長度 Number of input and output tokens the model sees. 模型看見的輸入與輸出的符元數量。 Maximum combined prompt/response tokens before information is lost. Using more context than the model was trained on will yield poor results. NOTE: Does not take effect until you reload the model. 資訊遺失前最大的提示詞/回覆符元組合。(Context Length) 若語境比模型訓練時所使用的語境還要長,將會生成較差的結果。 注意:重新載入模型後才會生效。 Temperature 語境溫度 Randomness of model output. Higher -> more variation. 模型輸出的隨機性。更高 -> 更多變化。 Temperature increases the chances of choosing less likely tokens. NOTE: Higher temperature gives more creative but less predictable outputs. 
語境溫度會提高選擇不容易出現的符元機率。(Temperature) 注意:較高的語境溫度會生成更多創意,但輸出的可預測性會相對較差。 Top-P 核心採樣 Nucleus Sampling factor. Lower -> more predictable. 核心採樣因子。更低 -> 更可預測。 Only the most likely tokens up to a total probability of top_p can be chosen. NOTE: Prevents choosing highly unlikely tokens. 只選擇總機率約為核心採樣,最有可能性的符元。(Top-P) 注意:用於避免選擇不容易出現的符元。 Min-P 最小符元機率 Minimum token probability. Higher -> more predictable. 最小符元機率。更高 -> 更可預測。 Sets the minimum relative probability for a token to be considered. 設定要考慮的符元的最小相對機率。(Min-P) Top-K 高頻率採樣機率 Size of selection pool for tokens. 符元選擇池的大小。 Only the top K most likely tokens will be chosen from. 只選擇前 K 個最有可能性的符元。(Top-K) Max Length 最大長度 Maximum response length, in tokens. 最大響應長度(以符元為單位)。 Prompt Batch Size 提示詞批次大小 The batch size used for prompt processing. 用於即時處理的批量大小。 Amount of prompt tokens to process at once. NOTE: Higher values can speed up reading prompts but will use more RAM. 一次處理的提示詞符元數量。(Prompt Batch Size) 注意:較高的值可以加快讀取提示詞的速度,但會使用比較多的記憶體。 Repeat Penalty 重複處罰 Repetition penalty factor. Set to 1 to disable. 重複懲罰因子。設定為 1 以停用。 Repeat Penalty Tokens 重複懲罰符元 Number of previous tokens used for penalty. 之前用於懲罰的符元數量。 GPU Layers 圖形處理器負載層 Number of model layers to load into VRAM. 要載入到顯示記憶體中的模型層數。 How many model layers to load into VRAM. Decrease this if GPT4All runs out of VRAM while loading this model. Lower values increase CPU load and RAM usage, and make inference slower. NOTE: Does not take effect until you reload the model. 
要載入到顯示記憶體中的模型層數。如果 GPT4All 在載入此模型時耗盡顯示記憶體,請減少此值。 較低的值會增加中央處理器負載與主顯示記憶體使用量,並使推理速度變慢。 注意:重新載入模型後才會生效。 ModelsView No Models Installed 沒有已安裝的模型 Install a model to get started using GPT4All 安裝模型以開始使用 GPT4All + Add Model + 新增模型 Shows the add model view 顯示新增模型視圖 Installed Models 已安裝的模型 Locally installed chat models 本機已安裝的交談模型 Model file 模型檔案 Model file to be downloaded 即將下載的模型檔案 Description 描述 File description 檔案描述 Cancel 取消 Resume 恢復 Stop/restart/start the download 停止/重啟/開始下載 Remove 移除 Remove model from filesystem 從檔案系統移除模型 Install 安裝 Install online model 安裝線上模型 <strong><font size="1"><a href="#error">Error</a></strong></font> <strong><font size="1"><a href="#error">錯誤</a></strong></font> <strong><font size="2">WARNING: Not recommended for your hardware. Model requires more memory (%1 GB) than your system has available (%2).</strong></font> <strong><font size="2">警告:不推薦在您的硬體上運作。模型需要比較多的記憶體(%1 GB),但您的系統記憶體空間不足(%2)。</strong></font> %1 GB %1 GB ? Describes an error that occurred when downloading 解釋下載時發生的錯誤 Error for incompatible hardware 錯誤,不相容的硬體 Download progressBar 下載進度條 Shows the progress made in the download 顯示下載進度 Download speed 下載速度 Download speed in bytes/kilobytes/megabytes per second 下載速度每秒 bytes/kilobytes/megabytes Calculating... 計算中...... Whether the file hash is being calculated 是否正在計算檔案雜湊 Busy indicator 參考自 https://terms.naer.edu.tw 忙線指示器 Displayed when the file hash is being calculated 計算檔案雜湊值時顯示 ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 enter $API_KEY 請輸入 $API_KEY ERROR: $BASE_URL is empty. 錯誤:$BASE_URL 未填寫。 enter $BASE_URL 請輸入 $BASE_URL ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 enter $MODEL_NAME 請輸入 $MODEL_NAME File size 檔案大小 RAM required 所需的記憶體 Parameters 參數 Quant 量化 Type 類型 MyFancyLink Fancy link 精緻網址 A stylized link 個性化網址 MyFileDialog Please choose a file 請選擇一個文件 MyFolderDialog Please choose a directory 請選擇一個資料夾 MySettingsLabel Clear Reset MySettingsTab Restore defaults? This page of settings will be reset to the defaults. 
Restore Defaults 恢復預設值 Restores settings dialog to a default state 恢復設定對話視窗到預設狀態 NetworkDialog Contribute data to the GPT4All Opensource Datalake. 貢獻資料到 GPT4All 的開放原始碼資料湖泊。 By enabling this feature, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! 啟用這項功能後,您將能夠參與訓練大型語言模型的民主化進程,通過貢獻資料來改進未來的模型。 當 GPT4All 模型回覆您並且您已選擇加入時,您的對話將被傳送到 GPT4All 開放原始碼資料湖泊。 此外,您可以對其回覆表示讚或倒讚。如果您倒讚了某則回覆,您可以提出更好的回覆。 這些資料將被收集並彙總到 GPT4All 資料湖泊中。 注意:啟用此功能後,您的資料將被傳送到 GPT4All 開放原始碼資料湖泊。 啟用此功能時,您將會失去對話的隱私權;然而,您可以選擇是否附上署名。 您的對話資料將可被任何人開放下載,並將由 Nomic AI 用於改進未來的 GPT4All 模型。 Nomic AI 將保留附加在您的資料上的所有署名訊息,並且您將被認可為任何使用您的資料的 GPT4All 模型版本的貢獻者! 
Terms for opt-in 計畫規範 Describes what will happen when you opt-in 解釋當您加入計畫後,會發生什麼事情 Please provide a name for attribution (optional) 請提供署名(非必填) Attribution (optional) 署名(非必填) Provide attribution 提供署名 Enable 啟用 Enable opt-in 加入計畫 Cancel 取消 Cancel opt-in 拒絕計畫 NewVersionDialog New version is available 發現新版本 Update 更新 Update to new version 更新版本 PopupDialog Reveals a shortlived help balloon 呼叫提示小幫手 Busy indicator 參考自 https://terms.naer.edu.tw 忙線指示器 Displayed when the popup is showing busy 當彈出視窗忙碌時顯示 RemoteModelCard API Key ERROR: $API_KEY is empty. 錯誤:$API_KEY 未填寫。 enter $API_KEY 請輸入 $API_KEY Whether the file hash is being calculated 是否正在計算檔案雜湊 Base Url ERROR: $BASE_URL is empty. 錯誤:$BASE_URL 未填寫。 enter $BASE_URL 請輸入 $BASE_URL Model Name ERROR: $MODEL_NAME is empty. 錯誤:$MODEL_NAME 未填寫。 enter $MODEL_NAME 請輸入 $MODEL_NAME Models 模型 Install 安裝 Install remote model SettingsView Settings 設定 Contains various application settings 內含多種應用程式設定 Application 應用程式 Model 模型 LocalDocs 我的文件 StartupDialog Welcome! 歡迎使用! Release notes 版本資訊 Release notes for this version 這個版本的版本資訊 ### Opt-ins for anonymous usage analytics and datalake By enabling these features, you will be able to participate in the democratic process of training a large language model by contributing data for future model improvements. When a GPT4All model responds to you and you have opted-in, your conversation will be sent to the GPT4All Open Source Datalake. Additionally, you can like/dislike its response. If you dislike a response, you can suggest an alternative response. This data will be collected and aggregated in the GPT4All Datalake. NOTE: By turning on this feature, you will be sending your data to the GPT4All Open Source Datalake. You should have no expectation of chat privacy when this feature is enabled. You should; however, have an expectation of an optional attribution if you wish. Your chat data will be openly available for anyone to download and will be used by Nomic AI to improve future GPT4All models. 
Nomic AI will retain all attribution information attached to your data and you will be credited as a contributor to any GPT4All model release that uses your data! ### 匿名使用統計暨資料湖泊計畫 啟用這些功能後,您將能夠參與訓練大型語言模型的民主化進程,通過貢獻資料來改進未來的模型。 當 GPT4All 模型回覆您並且您已選擇加入時,您的對話將被傳送到 GPT4All 開放原始碼資料湖泊。 此外,您可以對其回覆表示讚或倒讚。如果您倒讚了某則回覆,您可以提出更好的回覆。 這些資料將被收集並彙總到 GPT4All 資料湖泊中。 注意:啟用此功能後,您的資料將被傳送到 GPT4All 開放原始碼資料湖泊。 啟用此功能時,您將會失去對話的隱私權;然而,您可以選擇是否附上署名。 您的對話資料將可被任何人開放下載,並將由 Nomic AI 用於改進未來的 GPT4All 模型。 Nomic AI 將保留附加在您的資料上的所有署名訊息,並且您將被認可為任何使用您的資料的 GPT4All 模型版本的貢獻者! Terms for opt-in 計畫規範 Describes what will happen when you opt-in 解釋當您加入計畫後,會發生什麼事情 Opt-in to anonymous usage analytics used to improve GPT4All Yes No Opt-in for anonymous usage statistics 匿名使用統計計畫 ### Release Notes %1<br/> ### Contributors %2 ### 版本資訊 %1<br/> ### 貢獻者 %2 Allow opt-in for anonymous usage statistics 加入匿名使用統計計畫 Opt-out for anonymous usage statistics 退出匿名使用統計計畫 Allow opt-out for anonymous usage statistics 終止並退出匿名使用統計計畫 Opt-in to anonymous sharing of chats to the GPT4All Datalake Opt-in for network 資料湖泊計畫 Allow opt-in for network 加入資料湖泊計畫 Opt-out for network 退出資料湖泊計畫 Allow opt-in anonymous sharing of chats to the GPT4All Datalake 開始將交談內容匿名分享到 GPT4All 資料湖泊 Allow opt-out anonymous sharing of chats to the GPT4All Datalake 終止將交談內容匿名分享到 GPT4All 資料湖泊 SwitchModelDialog <b>Warning:</b> changing the model will erase the current conversation. Do you wish to continue? <b>警告:</b> 變更模型將會清除目前對話內容。您真的想要繼續嗎? Continue 繼續 Continue with model loading 繼續載入模型 Cancel 取消 ThumbsDownDialog Please edit the text below to provide a better response. (optional) 請編輯以下文字,以提供更好的回覆。(非必填) Please provide a better response... 請提供一則更好的回覆...... Submit 送出 Submits the user's response 送出使用者的回覆 Cancel 取消 Closes the response dialog 關閉回覆對話視窗 main GPT4All v%1 GPT4All v%1 Restore Quit <h3>Encountered an error starting up:</h3><br><i>"Incompatible hardware detected."</i><br><br>Unfortunately, your CPU does not meet the minimal requirements to run this program. 
In particular, it does not support AVX intrinsics which this program requires to successfully run a modern large language model. The only solution at this time is to upgrade your hardware to a more modern CPU.<br><br>See here for more information: <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">https://en.wikipedia.org/wiki/Advanced_Vector_Extensions</a> <h3>啟動時發生錯誤:</h3><br><i>「偵測到不相容的硬體。」</i><br><br>糟糕!您的中央處理器不符合運行所需的最低需求。尤其,它不支援本程式運行現代大型語言模型所需的 AVX 指令集。目前唯一的解決方案,只有更新您的中央處理器及其相關硬體裝置。<br><br>更多資訊請查閱:<a href="https://zh.wikipedia.org/wiki/AVX指令集">AVX 指令集 - 維基百科</a> <h3>Encountered an error starting up:</h3><br><i>"Inability to access settings file."</i><br><br>Unfortunately, something is preventing the program from accessing the settings file. This could be caused by incorrect permissions in the local app config directory where the settings file is located. Check out our <a href="https://discord.gg/4M2QFmTt2k">discord channel</a> for help. <h3>啟動時發生錯誤:</h3><br><i>「無法存取設定檔。」</i><br><br>糟糕!有些東西正在阻止程式存取設定檔。這極為可能是由於設定檔所在的本機應用程式設定資料夾中的權限設定不正確所造成的。煩請洽詢我們的 <a href="https://discord.gg/4M2QFmTt2k">Discord 伺服器</a> 以尋求協助。 Connection to datalake failed. 連線資料湖泊失敗。 Saving chats. 
儲存交談。 Network dialog 資料湖泊計畫對話視窗 opt-in to share feedback/conversations 分享回饋/對話計畫 Home view 首頁視圖 Home view of application 應用程式首頁視圖 Home 首頁 Chat view 查看交談 Chat view to interact with models 模型互動交談視圖 Chats 交談 Models 模型 Models view for installed models 已安裝模型的模型視圖 LocalDocs 我的文件 LocalDocs view to configure and use local docs 用於設定與使用我的文件的「我的文件」視圖 Settings 設定 Settings view for application configuration 應用程式設定視圖 The datalake is enabled 資料湖泊已啟用 Using a network model 使用一個網路模型 Server mode is enabled 伺服器模式已啟用 Installed models 已安裝的模型 View of installed models 已安裝的模型視圖

================================================
FILE: gpt4all-training/GPT-J_MAP.md
================================================

# Inference on Training Data

## Run Inference

```bash
torchrun --master_port=29085 --nproc-per-node 8 inference.py --config=configs/inference/gptj.yaml
```

## Visualizations

```bash
python build_map.py
```

will build maps in `Atlas`: one using the internal clustering algorithm provided by Nomic and one using the embeddings generated by the finetuned model.

================================================
FILE: gpt4all-training/README.md
================================================

## Training GPT4All-J

### Technical Reports

- :green_book: Technical Report 3: GPT4All Snoozy and Groovy
- :green_book: Technical Report 2: GPT4All-J
- :green_book: Technical Report 1: GPT4All

### GPT4All-J Training Data

- We are releasing the curated training data for anyone to replicate GPT4All-J here: [GPT4All-J Training Data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations)
- [Atlas Map of Prompts](https://atlas.nomic.ai/map/gpt4all-j-prompts-curated)
- [Atlas Map of Responses](https://atlas.nomic.ai/map/gpt4all-j-response-curated)

We have released updated versions of our `GPT4All-J` model and training data.

- `v1.0`: The original model trained on the v1.0 dataset
- `v1.1-breezy`: Trained on a filtered dataset where we removed all instances of "AI language model"
- `v1.2-jazzy`: Trained on a filtered dataset where we also removed instances like "I'm sorry, I can't answer..." and "AI language model"

The [models](https://huggingface.co/nomic-ai/gpt4all-j) and [data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations) versions can be specified by passing a `revision` argument. For example, to load the `v1.2-jazzy` model and dataset, run:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM

dataset = load_dataset("nomic-ai/gpt4all-j-prompt-generations", revision="v1.2-jazzy")
model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-j", revision="v1.2-jazzy")
```

### GPT4All-J Training Instructions

```bash
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config_gptj.json train.py --config configs/train/finetune_gptj.yaml
```

================================================
FILE: gpt4all-training/TRAINING_LOG.md
================================================

# Training Trials and Tribulations of gpt4all

This is a training log for both the LoRa and full model training we underwent.

## Inspiration

## Initial Experiment

We train an initial LoRa model on ~700k examples including data from
P3/BLOOM, StackOverflow, and unified_chip2. We trained using 8 x A100 80GB GPUs.

We used the initial parameters:

| Hyperparameter | Value |
|----------------|-------|
| Per Device BS  | 4     |
| Global BS      | 32    |
| Learning rate  | 5e-5  |
| Epochs         | 1     |
| Max length     | 1024  |
| Weight decay   | 0     |
| Warmup Steps   | 100   |

We randomly shuffle and set aside 5% of the data for validation.

We had an initial bug in logging the training loss, but we noticed a decrease in validation loss.

![](figs/first_lora.png)

During generation, we noticed that our model seemed to never generate an `eos` token. We realized that this was because we set the tokenizer `pad` token equal to the `eos` token, so the model most likely never learned to "stop". However, generations seemed to be working well, even with greedy generation.

For example, when prompted with `Generate a python script to reverse a string.`, the model outputs valid markdown and a working script

Here's a Python script to reverse a string:

```python
def reverse_string(string):
    return ''.join(reversed(string))
```

This function takes a string as an argument and returns the reversed string. The `reversed()` function reverses the string and returns a list of characters. The `join()` function joins the list of characters into a string. You can use this function like this:

```python
string = 'hello world'
print(reverse_string(string))
```

This will output `'olleh dna hwl'`. Note that this function does not handle strings with spaces or other special characters. If you need to handle those cases, you can use a regular expression to split the string into words and then reverse each word.

```python
import re

def reverse_string(string):
    words = re.split(' ', string)
    return ''.join(reversed(word) for word in words)
```

This function takes a string as an argument and returns the reversed string. The `re.split()` function splits the string into words using a space as the delimiter.
The `reversed()` function reverses each word and returns a list of characters. The `join()` function joins the list of characters into a string. You can use this function like this: ```python string = 'hello world' print(reverse_string(string)) ``` This will output `'olleh dna hwl'`. Note that this function also handles strings with spaces or other special characters. Answer: Here's a Python script to reverse a string: ```python def reverse_string(string): return ''.join(reversed(string)) ``` This function takes a string as an argument and returns the reversed string. The `reversed()` function reverses the string and returns a list of characters. The `join()` function joins the list of characters into a string. You can use this function like this: ```python string = 'hello world' print(reverse_string(string)) ``` This will output `'olleh dna hwl'`. Note that this function does not handle strings with although it duplicates and continues generating past answering the full question. The base model doesn't answer the question and hallucinates ``` Write a python script to reverse a string. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. 
The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be reversed in place. The string should be ``` ## EOS and Accidental Duplication Seeing as the model never stopped, we experimented with training a model with a separate token for `eos` and `pad`. Since we couldn't find a `pad` token present in the vocabulary, we added one to the tokenizer and expanded the embedding size of the model (from 32,000). In theory, we could have expanded the embedding size by a factor of 64 to improve throughput and performance, as [noted here](https://twitter.com/ctnzr/status/1623758178587648000?s=20). For every sequence, we appended an `eos` token in hopes the model would learn to exit. 
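The two fixes in this section can be sketched in plain Python (helper names are ours): unconditionally appending an `eos` id to every training sequence, and the throughput trick the linked note describes, namely padding the embedding-table size up to a multiple of 64 rather than growing it by exactly one entry.

```python
def append_eos(token_ids, eos_id):
    """Append the eos id so the model can learn to stop generating."""
    return token_ids + [eos_id]

def pad_vocab_size(vocab_size, multiple=64):
    """Round the embedding-table size up to a multiple of 64; sizes divisible
    by 64 map better onto GPU hardware, which is the throughput win the
    linked note refers to."""
    return -(-vocab_size // multiple) * multiple  # ceiling division

# After adding one pad token to the 32,000-entry vocabulary mentioned above:
print(pad_vocab_size(32_001))  # -> 32064
```

In practice the resize would go through the tokenizer/model APIs; the sketch only shows the arithmetic.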
We successfully trained a model using the same parameters as before

![](figs/duplicate_loss.png)

During generation, our model exited early even with greedy generations

You can use the `reversed()` function to reverse a string in Python. Here's an example:

```python
string = 'hello world'
reversed_string = reversed(string)
print(reversed_string)
```

Output:
```
world hello
```

For a harder prompt where we try to trick the model with comments

```python
#this code prints a string reversed
my_string = "hello how are you"
print(len(my_string))

My code above does not work. Can you help me?
```

The model correctly answers

The code you provided does not print a string reversed. It prints the length of the string "hello how are you".

We realized that we had two bugs however:

- We accidentally duplicated data and effectively trained for 2 epochs instead of 1
- We added an eos token to every sequence, even those that we truncated (e.g. long code that exceeds the 1024-token limit).

## Conditional EOS and 1 Epoch

Using the same parameters, we then trained a model for one epoch using a "conditional" eos token, where we only add an `eos` when the inputs are shorter than the maximum sequence length.

Our training and validation loss looks fairly normal

![](figs/single_epoch.png)

However, we found generations to be slightly poorer. For the same prompt of `Generate a python script to reverse a string.`, the model generates

Write a python script to reverse a string.
Here is a python script to reverse a string:
import string
def reverse_string(string):
    return string[::-1]
print(reverse_string('hello world'))
Output:
world hello

The prompt

```python
#this code prints a string reversed
my_string = "hello how are you"
print(len(my_string))

My code above does not work. Can you help me?
```

does not generate any text.
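The "conditional" eos rule described above can be sketched as (function name is ours; a truncated sequence never saw a natural end, so tagging it with `eos` would teach the model to stop mid-thought):

```python
MAX_LENGTH = 1024  # maximum sequence length used in training

def add_conditional_eos(token_ids, eos_id, max_length=MAX_LENGTH):
    """Append eos only when the sequence was not truncated."""
    if len(token_ids) < max_length:
        return token_ids + [eos_id]
    return token_ids

short = add_conditional_eos([1, 2, 3], eos_id=0)       # eos appended
long = add_conditional_eos(list(range(1024)), eos_id=0)  # left unchanged
```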
And the prompt `"Generate a python script to make a get request to an api endpoint."` generates

I'm sorry, I cannot provide a specific answer to this question as it requires more context and details about the API endpoint and the specific task you are trying to accomplish. Can you please provide more information?

## Multi Epoch and Full Model Training

We decided to remove the entire Bigscience/P3 subset from the final training dataset due to data diversity considerations. P3 contains many homogeneous prompts which produce short and homogeneous responses from GPT-3.5-Turbo. The final dataset is ~400k examples.

We train a LoRa model using the parameters

| Hyperparameter | Value |
|----------------|-------|
| Per Device BS  | 4     |
| Global BS      | 32    |
| Learning rate  | 5e-5  |
| Epochs         | 4     |
| Max length     | 1024  |
| Weight decay   | 0     |
| Warmup Steps   | 100   |

We additionally train a full model

| Hyperparameter | Value |
|----------------|-------|
| Per Device BS  | 32    |
| Global BS      | 256   |
| Learning rate  | 5e-5  |
| Epochs         | 2     |
| Max length     | 1024  |
| Weight decay   | 0     |
| Warmup Steps   | 100   |

Taking inspiration from [the Alpaca Repo](https://github.com/tatsu-lab/stanford_alpaca), we roughly scale the learning rate by `sqrt(k)`, where `k` is the increase in batch size, and where Alpaca used a batch size of 128 and a learning rate of 2e-5.

Comparing our LoRa model to the [Alpaca LoRa](https://huggingface.co/tloen/alpaca-lora-7b), our model has lower perplexity. Training for 3 epochs performed best, on perplexity as well as on qualitative examples.

We tried training a full model using the parameters above, but found that during the second epoch the model diverged and samples generated post training were worse than those from the first epoch.

## GPT-J Training

### Model Training Divergence

We trained multiple [GPT-J models](https://huggingface.co/EleutherAI/gpt-j-6b) with varying success. We found that training the full model led to divergence after epoch 1.
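The `sqrt(k)` learning-rate scaling mentioned above can be written out as follows (a rough heuristic rather than an exact rule; the function name is ours):

```python
import math

def scale_lr(base_lr, base_bs, new_bs):
    """Scale the learning rate by sqrt(k), where k is the factor by which
    the global batch size grew relative to the reference run."""
    return base_lr * math.sqrt(new_bs / base_bs)

# Alpaca's reference point: lr 2e-5 at global batch size 128.
# At the full-model global batch size of 256 (k = 2):
scaled = scale_lr(2e-5, 128, 256)  # ~2.8e-5
```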
![](figs/overfit-gpt-j.png)

We release the checkpoint after epoch 1.

Using Atlas, we extracted the embeddings of each point in the dataset and calculated the loss per sequence. We then uploaded [this to Atlas](https://atlas.nomic.ai/map/gpt4all-j-post-epoch-1-embeddings) and noticed that the higher loss items seem to cluster. On further inspection, the highest density clusters seemed to be of prompt/response pairs that asked for creative-like generations such as `Generate a story about ...`

![](figs/clustering_overfit.png)

### GPT4All-J Hyperparameters

We varied learning rate, learning rate schedule, and weight decay following suggestions from the [original GPT-J codebase](https://github.com/kingoflolz/mesh-transformer-jax/blob/master/howto_finetune.md) but found no real performance difference (qualitatively or quantitatively) when varying these parameters.

The final model was trained using the following hyperparameters with a linear warmup followed by constant learning rate:

| Hyperparameter | Value |
|----------------|-------|
| Per Device BS  | 32    |
| Global BS      | 256   |
| Learning rate  | 2e-5  |
| Epochs         | 2     |
| Max length     | 1024  |
| Weight decay   | 0     |
| Warmup Steps   | 500   |

The LoRA model was trained using the following hyperparameters with a linear warmup followed by constant learning rate:

| Hyperparameter | Value |
|----------------|-------|
| Per Device BS  | 4     |
| Global BS      | 32    |
| Learning rate  | 2e-5  |
| Epochs         | 2     |
| Max length     | 1024  |
| Weight decay   | 0     |
| Warmup Steps   | 500   |

================================================
FILE: gpt4all-training/clean.py
================================================
#!/usr/bin/env python3
import glob
import json
import os

import pandas as pd

prompt_generation_dir = "raw_data_sanity_cleaned_without_p3/"

for file in glob.glob(os.path.join(prompt_generation_dir, "*.jsonl")):
    if "clean.jsonl" in file:
        continue
    data = []
    print(file)
    with open(file) as f:
        for line in f:
            try:
                contents = json.loads(line)
                data.append(contents)
            except json.JSONDecodeError:
                # skip malformed lines rather than aborting the whole file
                pass

    processed = []
    for item in data:
        if 'source' not in item:
            item['source'] = 'unspecified'
        if 'model_settings' in item:
            item.pop('model_settings', None)
        # keep only the fields we care about
        for key in list(item.keys()):
            if key not in ['source', 'prompt', 'response']:
                item.pop(key, None)

        # prompts/responses may be nested dicts; pull out the text payload
        if isinstance(item['prompt'], dict):
            if "value" in item["prompt"]:
                item["prompt"] = item["prompt"]["value"]
            elif "description" in item["prompt"]:
                item["prompt"] = item["prompt"]["description"]
            else:
                continue
        elif not isinstance(item['prompt'], str):
            continue

        if isinstance(item['response'], dict):
            if "value" in item["response"]:
                item["response"] = item["response"]["value"]
            elif "description" in item["response"]:
                item["response"] = item["response"]["description"]
            else:
                continue
        elif not isinstance(item['response'], str):
            continue

        if item:
            processed.append(item)

    df = pd.DataFrame(processed)
    prev_len = len(df)

    # drop empty or null prompts/responses
    df = df.dropna(subset=['prompt', 'response'])
    df = df[df['prompt'] != '']
    df = df[df['response'] != '']
    df = df[df["prompt"].str.len() > 1]
    curr_len = len(df)

    print(f"Removed {prev_len - curr_len} rows")

    clean_name = file.split(".jsonl")[0] + "_clean.jsonl"
    print(f"writing {curr_len} rows to {clean_name}")
    df.to_json(clean_name, orient="records", lines=True)

================================================
FILE: gpt4all-training/configs/deepspeed/ds_config.json
================================================
{ "train_batch_size": "auto", "gradient_accumulation_steps": "auto", "train_micro_batch_size_per_gpu": "auto", "fp16": { "enabled": "auto", "min_loss_scale": 1, "loss_scale_window": 1000, "hysteresis": 2, "initial_scale_power": 32 }, "bf16": { "enabled": "auto" }, "gradient_clipping": 1, "zero_optimization": { "stage": 2, "offload_param": { "device": "none" }, "offload_optimizer": { "device": "none" }, "allgather_partitions": true, "allgather_bucket_size": 5e8, "contiguous_gradients": true
}, "optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": [ 0.9, 0.999 ], "eps": 1e-08 } }, "scheduler": { "type": "WarmupLR", "params": { "warmup_min_lr": 0, "warmup_max_lr": "auto", "warmup_num_steps": "auto", "warmup_type": "linear" } } } ================================================ FILE: gpt4all-training/configs/deepspeed/ds_config_gptj.json ================================================ { "train_batch_size": "auto", "gradient_accumulation_steps": "auto", "train_micro_batch_size_per_gpu": "auto", "fp16": { "enabled": "auto", "min_loss_scale": 1, "loss_scale_window": 1000, "hysteresis": 2, "initial_scale_power": 32 }, "bf16": { "enabled": "auto" }, "gradient_clipping": 1.0, "zero_optimization": { "stage": 2, "offload_param": { "device": "none" }, "offload_optimizer": { "device": "none" }, "allgather_partitions": true, "allgather_bucket_size": 5e8, "contiguous_gradients": true }, "optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": [ 0.9, 0.999 ], "eps": 1e-08 } }, "scheduler": { "type": "WarmupLR", "params": { "warmup_min_lr": 0, "warmup_max_lr": "auto", "warmup_num_steps": "auto", "warmup_type": "linear" } } } ================================================ FILE: gpt4all-training/configs/deepspeed/ds_config_gptj_lora.json ================================================ { "train_batch_size": "auto", "gradient_accumulation_steps": "auto", "train_micro_batch_size_per_gpu": "auto", "fp16": { "enabled": "auto", "min_loss_scale": 1, "loss_scale_window": 1000, "hysteresis": 2, "initial_scale_power": 32 }, "bf16": { "enabled": "auto" }, "gradient_clipping": 1, "zero_optimization": { "stage": 2, "offload_param": { "device": "cpu" }, "offload_optimizer": { "device": "cpu" }, "allgather_partitions": true, "allgather_bucket_size": 5e8, "contiguous_gradients": true }, "optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": [ 0.9, 0.999 ], "eps": 1e-08 } }, "scheduler": { "type": "WarmupLR", "params": { "warmup_min_lr": 0, 
"warmup_max_lr": "auto", "warmup_num_steps": "auto", "warmup_type": "linear" } } } ================================================ FILE: gpt4all-training/configs/deepspeed/ds_config_mpt.json ================================================ { "train_batch_size": "auto", "gradient_accumulation_steps": "auto", "train_micro_batch_size_per_gpu": "auto", "fp16": { "enabled": "auto", "min_loss_scale": 1, "loss_scale_window": 1000, "hysteresis": 2, "initial_scale_power": 32 }, "bf16": { "enabled": "auto" }, "gradient_clipping": 1.0, "zero_optimization": { "stage": 1, "offload_param": { "device": "none" }, "offload_optimizer": { "device": "none" }, "allgather_partitions": true, "allgather_bucket_size": 5e8, "contiguous_gradients": true }, "optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": [ 0.9, 0.999 ], "eps": 1e-08 } }, "scheduler": { "type": "WarmupDecayLR", "params": { "warmup_min_lr": 0, "warmup_max_lr": "auto", "warmup_num_steps": "auto", "warmup_type": "linear", "total_num_steps": "auto" } } } ================================================ FILE: gpt4all-training/configs/deepspeed/ds_config_pythia.json ================================================ { "train_batch_size": "auto", "gradient_accumulation_steps": "auto", "train_micro_batch_size_per_gpu": "auto", "fp16": { "enabled": "auto", "min_loss_scale": 1, "loss_scale_window": 1000, "hysteresis": 2, "initial_scale_power": 32 }, "bf16": { "enabled": "auto" }, "gradient_clipping": 1.0, "zero_optimization": { "stage": 2, "offload_param": { "device": "none" }, "offload_optimizer": { "device": "none" }, "allgather_partitions": true, "allgather_bucket_size": 5e8, "contiguous_gradients": true }, "optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": [ 0.9, 0.999 ], "eps": 1e-08 } }, "scheduler": { "type": "WarmupLR", "params": { "warmup_min_lr": 0, "warmup_max_lr": "auto", "warmup_num_steps": "auto", "warmup_type": "linear" } } } ================================================ FILE: 
gpt4all-training/configs/eval/generate_baseline.yaml ================================================ # model/tokenizer model_name: "zpn/llama-7b" tokenizer_name: "zpn/llama-7b" lora: true lora_path: "tloen/alpaca-lora-7b" ================================================ FILE: gpt4all-training/configs/eval/generate_gpt4all_gptj.yaml ================================================ # model/tokenizer model_name: "nomic-ai/gpt4all-warmup-lr-epoch_0" tokenizer_name: "EleutherAI/gpt-j-6b" lora: false ================================================ FILE: gpt4all-training/configs/eval/generate_gpt4all_gptj_lora.yaml ================================================ # model/tokenizer model_name: "EleutherAI/gpt-j-6b" tokenizer_name: "EleutherAI/gpt-j-6B" lora: true lora_path: "nomic-ai/gpt4all-gptj-lora-epoch_1" ================================================ FILE: gpt4all-training/configs/eval/generate_gpt4all_llama_lora.yaml ================================================ # model/tokenizer model_name: "zpn/llama-7b" tokenizer_name: "zpn/llama-7b" lora: true lora_path: "nomic-ai/gpt4all-lora" ================================================ FILE: gpt4all-training/configs/generate/generate.yaml ================================================ # model/tokenizer model_name: "zpn/llama-7b" tokenizer_name: "zpn/llama-7b" lora: true lora_path: "nomic-ai/gpt4all-lora" max_new_tokens: 512 temperature: 0 prompt: null ================================================ FILE: gpt4all-training/configs/generate/generate_gptj.yaml ================================================ # model/tokenizer model_name: "nomic-ai/gpt4all-warmup-lr-epoch_1" tokenizer_name: "EleutherAI/gpt-j-6b" lora: false max_new_tokens: 512 temperature: 0.001 prompt: | #this code prints a string reversed my_string = "hello how are you" print(len(my_string)) My code above does not work. Can you help me? 
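The eval and generate configs above share one flat key/value schema (`model_name`, `tokenizer_name`, `lora`, optional `lora_path`, `max_new_tokens`, `temperature`). The training scripts load them with the repo's `read_config` helper (a full YAML reader); purely as documentation of that schema, a dependency-free sketch of parsing the flat subset might look like this (the parser name is hypothetical, and it deliberately ignores the multi-line `prompt: |` blocks):

```python
# Hypothetical helper: parses only the flat "key: value" subset of YAML
# used by the configs above; the repo itself relies on a real YAML loader.
def parse_flat_config(text):
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip().strip('"')
        if value in ("", "|", "null"):       # missing value, block scalar, or null
            parsed = None
        elif value in ("true", "false"):
            parsed = value == "true"
        else:
            try:
                parsed = int(value)
            except ValueError:
                try:
                    parsed = float(value)
                except ValueError:
                    parsed = value           # plain string
        config[key.strip()] = parsed
    return config


sample = '''
# model/tokenizer
model_name: "nomic-ai/gpt4all-warmup-lr-epoch_1"
tokenizer_name: "EleutherAI/gpt-j-6b"
lora: false
max_new_tokens: 512
temperature: 0.001
'''
cfg = parse_flat_config(sample)
```

After parsing, `cfg["lora"]` is a real boolean and `cfg["max_new_tokens"]` a real int, which is why downstream code like `generate.py` can index the config directly without further casting.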
================================================ FILE: gpt4all-training/configs/generate/generate_gptj_lora.yaml ================================================ # model/tokenizer model_name: "EleutherAI/gpt-j-6b" tokenizer_name: "EleutherAI/gpt-j-6b" lora: true lora_path: "nomic-ai/gpt4all-gptj-lora-epoch_0" max_new_tokens: 512 temperature: 0 prompt: | #this code prints a string reversed my_string = "hello how are you" print(len(my_string)) My code above does not work. Can you help me? ================================================ FILE: gpt4all-training/configs/generate/generate_llama.yaml ================================================ # model/tokenizer model_name: # REPLACE WITH LLAMA MODEL NAME tokenizer_name: # REPLACE WITH LLAMA MODEL NAME max_new_tokens: 512 temperature: 0.001 prompt: | #this code prints a string reversed my_string = "hello how are you" print(len(my_string)) My code above does not work. Can you help me? ================================================ FILE: gpt4all-training/configs/inference/gptj.yaml ================================================ # model/tokenizer model_name: "nomic-ai/gpt4all-warmup-lr-epoch_1" tokenizer_name: "EleutherAI/gpt-j-6B" # dataset streaming: false num_proc: 64 dataset_path: "nomic-ai/turbo-500k-multi" max_length: 1024 batch_size: 32 # logging seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune.yaml ================================================ # model/tokenizer model_name: # add model here tokenizer_name: # add model here gradient_checkpointing: true save_name: # CHANGE # dataset streaming: false num_proc: 64 dataset_path: # update max_length: 1024 batch_size: 32 # train dynamics lr: 5.0e-5 eval_every: 800 eval_steps: 100 save_every: 800 output_dir: # CHANGE checkpoint: null lora: false warmup_steps: 100 num_epochs: 2 # logging wandb: true wandb_entity: # update wandb_project_name: # update seed: 42 ================================================ 
FILE: gpt4all-training/configs/train/finetune_falcon.yaml ================================================ # model/tokenizer model_name: "tiiuae/falcon-7b" tokenizer_name: "tiiuae/falcon-7b" gradient_checkpointing: true save_name: "nomic-ai/gpt4all-falcon" # dataset streaming: false num_proc: 64 dataset_path: "nomic-ai/gpt4all-j-prompt-generations" revision: "v1.3-groovy" max_length: 1024 batch_size: 32 # train dynamics lr: 2.0e-5 min_lr: 0 weight_decay: 0.0 eval_every: 500 eval_steps: 105 save_every: 1000 log_grads_every: 500 output_dir: "ckpts/falcon" checkpoint: "/home/paperspace/gpt4all/ckpts/mpt/step_1000" lora: false warmup_steps: 500 num_epochs: 2 # logging wandb: true wandb_entity: "gpt4all" wandb_project_name: "gpt4all" seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune_gptj.yaml ================================================ # model/tokenizer model_name: "EleutherAI/gpt-j-6B" tokenizer_name: "EleutherAI/gpt-j-6B" gradient_checkpointing: true save_name: # CHANGE # dataset streaming: false num_proc: 64 dataset_path: # CHANGE max_length: 1024 batch_size: 32 # train dynamics lr: 2.0e-5 min_lr: 0 weight_decay: 0.0 eval_every: 500 eval_steps: 105 save_every: 500 log_grads_every: 100 output_dir: # CHANGE checkpoint: null lora: false warmup_steps: 500 num_epochs: 2 # logging wandb: true wandb_entity: # CHANGE wandb_project_name: # CHANGE seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune_gptj_lora.yaml ================================================ # model/tokenizer model_name: "EleutherAI/gpt-j-6b" tokenizer_name: "EleutherAI/gpt-j-6b" gradient_checkpointing: false save_name: # CHANGE # dataset streaming: false num_proc: 64 dataset_path: # CHANGE max_length: 1024 batch_size: 1 # train dynamics lr: 2.0e-5 min_lr: 0 weight_decay: 0.0 eval_every: 500 eval_steps: 105 save_every: 500 log_grads_every: 500 output_dir: # CHANGE checkpoint: null lora: true 
warmup_steps: 500 num_epochs: 2 # logging wandb: true wandb_entity: # CHANGE wandb_project_name: # CHANGE seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune_lora.yaml ================================================ # model/tokenizer model_name: # update tokenizer_name: # update gradient_checkpointing: false save_name: # CHANGE # dataset streaming: false num_proc: 64 dataset_path: # CHANGE max_length: 1024 batch_size: 4 # train dynamics lr: 5.0e-5 min_lr: 0 weight_decay: 0.0 eval_every: 2000 eval_steps: 100 save_every: 2000 output_dir: # CHANGE checkpoint: null lora: true warmup_steps: 100 num_epochs: 2 # logging wandb: true wandb_entity: # update wandb_project_name: # update seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune_mpt.yaml ================================================ # model/tokenizer model_name: "mosaicml/mpt-7b" tokenizer_name: "mosaicml/mpt-7b" gradient_checkpointing: false save_name: "nomic-ai/mpt-finetuned-round2" # dataset streaming: false num_proc: 64 dataset_path: "nomic-ai/gpt4all-j-prompt-generations" revision: "v1.3-groovy" max_length: 1024 batch_size: 8 # train dynamics lr: 2.0e-5 min_lr: 0 weight_decay: 0.0 eval_every: 500 eval_steps: 105 save_every: 1000 log_grads_every: 500 output_dir: "ckpts/mpt" checkpoint: null lora: false warmup_steps: 500 num_epochs: 2 # logging wandb: false wandb_entity: "gpt4all" wandb_project_name: "gpt4all" seed: 42 ================================================ FILE: gpt4all-training/configs/train/finetune_openllama.yaml ================================================ # model/tokenizer model_name: "openlm-research/open_llama_7b" tokenizer_name: "openlm-research/open_llama_7b" gradient_checkpointing: true save_name: "nomic-ai/gpt4all-openllama" # dataset streaming: false num_proc: 64 dataset_path: "nomic-ai/gpt4all-updated" revision: null max_length: 1024 batch_size: 32 # train dynamics lr: 2.0e-5 
min_lr: 0 weight_decay: 0.0 eval_every: 500 log_every: 10 save_every: 1000 log_grads_every: 500 output_dir: "ckpts/falcon" checkpoint: null lora: false warmup_steps: 500 num_epochs: 3 # logging wandb: true wandb_entity: "gpt4all" wandb_project_name: "gpt4all" seed: 42 ================================================ FILE: gpt4all-training/create_hostname.sh ================================================ #!/bin/bash export WORKER_IP=$1 N_GPUS=8 # create dir if doesn't exist sudo mkdir -p /job printf "localhost slots=$N_GPUS\n$WORKER_IP slots=$N_GPUS" | sudo tee /job/hostfile echo /job/hostfile ================================================ FILE: gpt4all-training/data.py ================================================ import glob import torch from datasets import load_dataset, concatenate_datasets import os from torch.utils.data import DataLoader from transformers import DefaultDataCollator def tokenize_inputs(config, tokenizer, examples): max_length = config["max_length"] # hacky backward compatible different_eos = tokenizer.eos_token != "</s>"
out = {"labels": [], "input_ids": [], "attention_mask": []} for prompt, response in zip(examples["prompt"], examples["response"]): if different_eos: if response.count(" \n") > 0: response = response.replace(" \n", f"{tokenizer.eos_token} \n") prompt_len = len(tokenizer(prompt + "\n", return_tensors="pt")["input_ids"][0]) # hack if our prompt is super long # we need to include some labels so we arbitrarily truncate at max_length // 2 # if the length is too long if prompt_len >= max_length // 2: # if prompt is too long, truncate # but make sure to truncate to at max 1024 tokens new_len = min(max_length // 2, len(prompt) // 2) prompt = prompt[:new_len] # get new prompt length prompt_len = tokenizer(prompt + "\n", return_tensors="pt", max_length=max_length // 2, truncation=True).input_ids.ne(tokenizer.pad_token_id).sum().item() assert prompt_len <= max_length // 2, f"prompt length {prompt_len} exceeds max length {max_length}" input_tokens = tokenizer(prompt + "\n" + response + tokenizer.eos_token, truncation=True, max_length=max_length, return_tensors="pt")["input_ids"].squeeze() labels = input_tokens.clone() labels[:prompt_len] = -100 if len(labels) < max_length: # pad to max_length with -100 labels = torch.cat([labels, torch.full((max_length - len(labels),), -100)]) assert (labels == -100).sum() < len(labels), f"Labels are all -100, something wrong. prompt length {prompt_len} exceeds max length {max_length}"
if (labels == -100).sum() == len(labels) - 1: print(prompt) print(response) raise padded = tokenizer.pad({"input_ids": input_tokens}, padding="max_length", max_length=max_length, return_tensors="pt") out["labels"].append(labels) out["input_ids"].append(padded["input_ids"]) out["attention_mask"].append(padded["attention_mask"]) out = {k: torch.stack(v) if isinstance(v, list) else v for k, v in out.items()} return out def load_data(config, tokenizer): dataset_path = config["dataset_path"] if os.path.exists(dataset_path): if os.path.isdir(dataset_path): files = glob.glob(os.path.join(dataset_path, "*_clean.jsonl")) else: files = [dataset_path] print(f"Reading files {files}") dataset = load_dataset("json", data_files=files, split="train") else: dataset = load_dataset(dataset_path, split="train", revision=config["revision"] if "revision" in config else None) dataset = dataset.train_test_split(test_size=.05, seed=config["seed"]) train_dataset, val_dataset = dataset["train"], dataset["test"] if config["streaming"] is False: kwargs = {"num_proc": config["num_proc"]} else: kwargs = {} cols_to_keep = ["input_ids", "labels", "attention_mask"] # tokenize inputs and return labels and attention mask train_dataset = train_dataset.map( lambda ele: tokenize_inputs(config, tokenizer, ele), batched=True, **kwargs ) remove_cols = [col for col in train_dataset.column_names if col not in cols_to_keep] train_dataset = train_dataset.remove_columns(remove_cols) val_dataset = val_dataset.map( lambda ele: tokenize_inputs(config, tokenizer, ele), batched=True, **kwargs ) remove_cols = [col for col in val_dataset.column_names if col not in cols_to_keep] val_dataset = val_dataset.remove_columns(remove_cols) train_dataset = train_dataset.with_format("torch") val_dataset = val_dataset.with_format("torch") # create dataloader with default data collator since we already have labels train_dataloader = DataLoader( train_dataset,
collate_fn=DefaultDataCollator(), batch_size=config["batch_size"], shuffle=True, ) val_dataloader = DataLoader( val_dataset, collate_fn=DefaultDataCollator(), batch_size=config["batch_size"], shuffle=True, ) return train_dataloader, val_dataloader def load_data_for_inference(config, tokenizer): dataset_path = config["dataset_path"] if os.path.exists(dataset_path): # check if path is a directory if os.path.isdir(dataset_path): files = glob.glob(os.path.join(dataset_path, "*_clean.jsonl")) else: files = [dataset_path] print(f"Reading files {files}") dataset = load_dataset("json", data_files=files, split="train") else: dataset = load_dataset(dataset_path, split="train") dataset = dataset.train_test_split(test_size=.05, seed=config["seed"]) train_dataset, val_dataset = dataset["train"], dataset["test"] train_dataset = train_dataset.add_column("index", list(range(len(train_dataset)))) # select first N batches that are divisible by batch_size # gather is a bit annoying (or the way I'm using it) to get uneven batches as it duplicates data train_dataset = train_dataset.select(range((len(train_dataset) // config["batch_size"]) * config["batch_size"])) val_dataset = val_dataset.add_column("index", list(range(len(val_dataset)))) val_dataset = val_dataset.select(range((len(val_dataset) // config["batch_size"]) * config["batch_size"])) if config["streaming"] is False: kwargs = {"num_proc": config["num_proc"]} else: kwargs = {} # tokenize inputs and return labels and attention mask train_dataset = train_dataset.map( lambda ele: tokenize_inputs(config, tokenizer, ele), batched=True, **kwargs ) val_dataset = val_dataset.map( lambda ele: tokenize_inputs(config, tokenizer, ele), batched=True, **kwargs ) train_dataset = train_dataset.with_format("torch") val_dataset = val_dataset.with_format("torch") return train_dataset, val_dataset ================================================ FILE: gpt4all-training/env.yaml ================================================ name: vicuna channels: 
- conda-forge - pytorch - nvidia - huggingface dependencies: - python=3.8 - accelerate - datasets - torchmetrics - evaluate - transformers - wandb - jsonlines - pip: - peft - nodelist-inflator - deepspeed - sentencepiece ================================================ FILE: gpt4all-training/eval_figures.py ================================================ #!/usr/bin/env python3 import glob import pickle import numpy as np from matplotlib import pyplot as plt plt.figure() for fpath in glob.glob('./eval_data/*.pkl'): parts = fpath.split('__') model_name = "-".join(fpath.replace(".pkl", "").split("_")[2:]) with open(fpath, 'rb') as f: data = pickle.load(f) perplexities = data['perplexities'] perplexities = np.nan_to_num(perplexities, nan=100) perplexities = np.clip(perplexities, 0, 100) if 'alpaca' not in fpath: identifier = model_name = "-".join(fpath.replace(".pkl", "").split("eval__model-")[1:]) label = 'GPT4all-' label += identifier else: label = 'alpaca-lora' plt.hist(perplexities, label=label, alpha=.5, bins=50) plt.xlabel('Perplexity') plt.ylabel('Frequency') plt.legend() plt.savefig('figs/perplexity_hist.png') ================================================ FILE: gpt4all-training/eval_self_instruct.py ================================================ #!/usr/bin/env python3 import json import torch import pickle import numpy as np from tqdm import tqdm from read import read_config from argparse import ArgumentParser from peft import PeftModelForCausalLM from transformers import AutoModelForCausalLM, AutoTokenizer ''' Evaluates perplexity on the outputs of: https://github.com/yizhongw/self-instruct/blob/main/human_eval/user_oriented_instructions.jsonl ''' def read_jsonl_file(file_path): data = [] with open(file_path, 'r', encoding='utf-8') as file: for line in file: json_object = json.loads(line.strip()) data.append(json_object) return data def setup_model(config): model = AutoModelForCausalLM.from_pretrained(config["model_name"], device_map="auto",
torch_dtype=torch.float16, output_hidden_states=True) tokenizer = AutoTokenizer.from_pretrained(config["tokenizer_name"]) added_tokens = tokenizer.add_special_tokens({"bos_token": "<s>", "eos_token": "</s>", "pad_token": "<pad>"}) if added_tokens > 0: model.resize_token_embeddings(len(tokenizer)) if 'lora' in config and config['lora']: model = PeftModelForCausalLM.from_pretrained(model, config["lora_path"], device_map="auto", torch_dtype=torch.float16, return_hidden_states=True) model.to(dtype=torch.float16) print(f"Mem needed: {model.get_memory_footprint() / 1024 / 1024 / 1024:.2f} GB") return model, tokenizer def eval_example(model, tokenizer, example, config): prompt = example['instruction'] + ' ' + example['instances'][0]['input'] gt = prompt + ' ' + example['instances'][0]['output'] # decode several continuations and compute their page trajectories input = tokenizer(prompt, return_tensors="pt") input = {k: v.to(model.device) for k, v in input.items()} # compute the ground truth perplexity gt_input = tokenizer(gt, return_tensors="pt") gt_input = {k: v.to(model.device) for k, v in gt_input.items()} nlls = [] prev_end_loc = 0 stride = 512 seq_len = gt_input['input_ids'].size(1) for begin_loc in tqdm(range(input['input_ids'].size(1), gt_input['input_ids'].size(1), stride)): end_loc = min(begin_loc + stride, seq_len) trg_len = end_loc - prev_end_loc # may be different from stride on last loop input_ids = gt_input['input_ids'][:, begin_loc:end_loc].to(model.device) target_ids = input_ids.clone() target_ids[:, :-trg_len] = -100 with torch.no_grad(): outputs = model(input_ids, labels=target_ids) neg_log_likelihood = outputs.loss * trg_len nlls.append(neg_log_likelihood) prev_end_loc = end_loc if end_loc == seq_len: break ppl = torch.exp(torch.stack(nlls).sum() / end_loc).item() print('ppl: ', ppl) print(prompt) print(80*'-') return ppl def do_eval(config): eval_data = read_jsonl_file('eval_data/user_oriented_instructions.jsonl') model, tokenizer = setup_model(config)
all_perplexities = [] for example in tqdm(eval_data): gt_perplexity = eval_example(model, tokenizer, example, config) all_perplexities.append(gt_perplexity) name = f"eval_data/eval__model-{config['model_name'].replace('/', '_')}{'__lora-' + config['lora_path'].replace('/', '_') if config['lora'] else ''}.pkl" with open(name, 'wb') as f: r = {'perplexities': all_perplexities} pickle.dump(r, f) if __name__ == '__main__': parser = ArgumentParser() parser.add_argument("--config", type=str, required=True) args = parser.parse_args() config = read_config(args.config) do_eval(config) ================================================ FILE: gpt4all-training/generate.py ================================================ #!/usr/bin/env python3 from transformers import AutoModelForCausalLM, AutoTokenizer from peft import PeftModelForCausalLM from read import read_config from argparse import ArgumentParser import torch import time def generate(tokenizer, prompt, model, config): input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device) outputs = model.generate(input_ids=input_ids, max_new_tokens=config["max_new_tokens"], temperature=config["temperature"]) decoded = tokenizer.decode(outputs[0], skip_special_tokens=True).strip() return decoded[len(prompt):] def setup_model(config): model = AutoModelForCausalLM.from_pretrained(config["model_name"], device_map="auto", torch_dtype=torch.float16) tokenizer = AutoTokenizer.from_pretrained(config["tokenizer_name"]) added_tokens = tokenizer.add_special_tokens({"bos_token": "<s>", "eos_token": "</s>", "pad_token": "<pad>"}) if added_tokens > 0: model.resize_token_embeddings(len(tokenizer)) if config["lora"]: model = PeftModelForCausalLM.from_pretrained(model, config["lora_path"], device_map="auto", torch_dtype=torch.float16) model.to(dtype=torch.float16) print(f"Mem needed: {model.get_memory_footprint() / 1024 / 1024 / 1024:.2f} GB") return model, tokenizer if __name__ == "__main__": parser = ArgumentParser()
parser.add_argument("--config", type=str, required=True) parser.add_argument("--prompt", type=str) args = parser.parse_args() config = read_config(args.config) if config["prompt"] is None and args.prompt is None: raise ValueError("Prompt is required either in config or as argument") prompt = config["prompt"] if args.prompt is None else args.prompt print("Setting up model") model, tokenizer = setup_model(config) print("Generating") start = time.time() generation = generate(tokenizer, prompt, model, config) print(f"Done in {time.time() - start:.2f}s") print(generation) ================================================ FILE: gpt4all-training/inference.py ================================================ #!/usr/bin/env python3 from transformers import AutoModelForCausalLM, AutoTokenizer import torch import torch.nn as nn from argparse import ArgumentParser from read import read_config from accelerate.utils import set_seed from data import load_data_for_inference from tqdm import tqdm from datasets import Dataset import torch.distributed as dist from transformers.trainer_pt_utils import nested_numpify from transformers import DefaultDataCollator from torch.utils.data import DataLoader, DistributedSampler import numpy as np import pyarrow as pa from pyarrow import compute as pc def calc_cross_entropy_no_reduction(lm_logits, labels): # calculate cross entropy across batch dim shift_logits = lm_logits[..., :-1, :].contiguous() shift_labels = labels[..., 1:].contiguous() # Flatten the tokens loss_fct = nn.CrossEntropyLoss(reduction='none') loss = loss_fct(shift_logits.permute(0, 2, 1), shift_labels).mean(dim=1) return loss def rank0_print(msg): if dist.get_rank() == 0: print(msg) def inference(config): set_seed(config['seed']) rank0_print(f"World size: {dist.get_world_size()}") tokenizer = AutoTokenizer.from_pretrained(config['tokenizer_name'], model_max_length=config['max_length']) # llama has no pad token, set it to new token if tokenizer.pad_token is None: 
tokenizer.pad_token = tokenizer.eos_token train_dataset, val_dataset = load_data_for_inference(config, tokenizer) num_processes = dist.get_world_size() local_rank = dist.get_rank() train_sampler = DistributedSampler(train_dataset, shuffle=False, drop_last=True, num_replicas=num_processes, rank=local_rank) train_dataloader = DataLoader( train_dataset, collate_fn=DefaultDataCollator(), batch_size=config["batch_size"], sampler=train_sampler, drop_last=True ) val_sampler = DistributedSampler(val_dataset, shuffle=False, drop_last=True, num_replicas=num_processes, rank=local_rank) val_dataloader = DataLoader( val_dataset, collate_fn=DefaultDataCollator(), batch_size=config["batch_size"], sampler=val_sampler, drop_last=True ) model = AutoModelForCausalLM.from_pretrained(config["model_name"], trust_remote_code=True, torch_dtype=torch.bfloat16, ) model.to(f"cuda:{local_rank}") with torch.no_grad(): train_outputs = {"loss": [], "embeddings": [], "index": []} for batch in tqdm(train_dataloader, disable=local_rank != 0): batch["input_ids"] = batch["input_ids"].to(f"cuda:{local_rank}") batch["labels"] = batch["labels"].to(f"cuda:{local_rank}") outputs = model(input_ids=batch["input_ids"], labels=batch["labels"], output_hidden_states=True) loss = calc_cross_entropy_no_reduction(outputs.logits, batch["labels"]) train_outputs["loss"].extend(loss) embeddings = outputs.hidden_states[-1] batch_size = batch["input_ids"].shape[0] sequence_lengths = [] # since we use multi-turn with multiple <|endoftext|>, we need to find the place where # <|endoftext|> is repeated for item in batch["input_ids"]: indices = torch.where(item == tokenizer.pad_token_id)[0] found = False for index in indices: # case where sequence is less than max length if torch.all(item[index:] == tokenizer.pad_token_id): sequence_lengths.append(index) found = True break # case where sequence is >= max length if not found: sequence_lengths.append(len(item) - 1) sequence_lengths = torch.tensor(sequence_lengths) pooled_logits
= embeddings[torch.arange(batch_size, device=embeddings.device), sequence_lengths] train_outputs["embeddings"].append(pooled_logits) train_outputs["index"].extend(batch["index"].to(model.device)) torch.cuda.empty_cache() train_outputs = nested_numpify(train_outputs) # stack since they're 0-dim arrays train_outputs["index"] = np.stack(train_outputs["index"]) train_outputs["loss"] = np.stack(train_outputs["loss"]) train_outputs["embeddings"] = np.concatenate(train_outputs["embeddings"]) df_train = Dataset.from_dict(train_outputs) curr_idx = df_train["index"] # compute mask in pyarrow since it's super fast # ty @bmschmidt for showing me this! table = train_dataset.data mask = pc.is_in(table['index'], value_set=pa.array(curr_idx, pa.int32())) filtered_table = table.filter(mask) # convert from pyarrow to Dataset filtered_train = Dataset.from_dict(filtered_table.to_pydict()) filtered_train = filtered_train.add_column("embeddings", df_train["embeddings"]) filtered_train = filtered_train.add_column("loss", df_train["loss"]) filtered_train = filtered_train.add_column("is_train", [True] * len(filtered_train)) filtered_train.to_json(f"inference/epoch_2_embeddings_train_shard_{local_rank}.jsonl", lines=True, orient="records", num_proc=64) val_outputs = {"loss": [], "embeddings": [], "index": []} for batch in tqdm(val_dataloader, disable=local_rank != 0): batch["input_ids"] = batch["input_ids"].to(f"cuda:{local_rank}") batch["labels"] = batch["labels"].to(f"cuda:{local_rank}") outputs = model(input_ids=batch["input_ids"], labels=batch["labels"], output_hidden_states=True) loss = calc_cross_entropy_no_reduction(outputs.logits, batch["labels"]) val_outputs["loss"].extend(loss) embeddings = outputs.hidden_states[-1] batch_size = batch["input_ids"].shape[0] sequence_lengths = [] # since we use multi-turn with multiple <|endoftext|>, we need to find the place where # <|endoftext|> is repeated for item in batch["input_ids"]: indices = torch.where(item == tokenizer.pad_token_id)[0]
found = False for index in indices: # case where sequence is less than max length if torch.all(item[index:] == tokenizer.pad_token_id): sequence_lengths.append(index) found = True break # case where sequence is >= max length if not found: sequence_lengths.append(len(item) - 1) sequence_lengths = torch.tensor(sequence_lengths) pooled_logits = embeddings[torch.arange(batch_size, device=embeddings.device), sequence_lengths] val_outputs["embeddings"].append(pooled_logits) val_outputs["index"].extend(batch["index"].to(model.device)) torch.cuda.empty_cache() val_outputs = nested_numpify(val_outputs) val_outputs["index"] = np.stack(val_outputs["index"]) val_outputs["loss"] = np.stack(val_outputs["loss"]) val_outputs["embeddings"] = np.concatenate(val_outputs["embeddings"]) df_val = Dataset.from_dict(val_outputs) curr_idx = df_val["index"] # compute mask in pyarrow since it's super fast # ty @bmschmidt for showing me this! table = val_dataset.data mask = pc.is_in(table['index'], value_set=pa.array(curr_idx, pa.int32())) filtered_table = table.filter(mask) # convert from pyarrow to Dataset filtered_val = Dataset.from_dict(filtered_table.to_pydict()) filtered_val = filtered_val.add_column("embeddings", df_val["embeddings"]) filtered_val = filtered_val.add_column("loss", df_val["loss"]) filtered_val = filtered_val.add_column("is_train", [False] * len(filtered_val)) filtered_val.to_json(f"inference/epoch_2_embeddings_val_shard_{local_rank}.jsonl", lines=True, orient="records", num_proc=64) def main(): dist.init_process_group("nccl") parser = ArgumentParser() parser.add_argument("--config", type=str, default="config.yaml") args = parser.parse_args() config = read_config(args.config) inference(config) if __name__ == "__main__": # parse arguments by reading in a config main() ================================================ FILE: gpt4all-training/launcher.sh ================================================ #!/bin/bash # Display header echo 
"==========================================================" echo " ██████ ██████ ████████ ██ ██ █████ ██ ██ " echo "██ ██ ██ ██ ██ ██ ██ ██ ██ ██ " echo "██ ███ ██████ ██ ███████ ███████ ██ ██ " echo "██ ██ ██ ██ ██ ██ ██ ██ ██ " echo " ██████ ██ ██ ██ ██ ██ ███████ ███████ " echo " └─> https://github.com/nomic-ai/gpt4all" # Function to detect macOS architecture and set the binary filename detect_mac_arch() { local mac_arch mac_arch=$(uname -m) case "$mac_arch" in arm64) os_type="M1 Mac/OSX" binary_filename="gpt4all-lora-quantized-OSX-m1" ;; x86_64) os_type="Intel Mac/OSX" binary_filename="gpt4all-lora-quantized-OSX-intel" ;; *) echo "Unknown macOS architecture" exit 1 ;; esac } # Detect operating system and set the binary filename case "$(uname -s)" in Darwin*) detect_mac_arch ;; Linux*) if grep -q Microsoft /proc/version; then os_type="Windows (WSL)" binary_filename="gpt4all-lora-quantized-win64.exe" else os_type="Linux" binary_filename="gpt4all-lora-quantized-linux-x86" fi ;; CYGWIN*|MINGW32*|MSYS*|MINGW*) os_type="Windows (Cygwin/MSYS/MINGW)" binary_filename="gpt4all-lora-quantized-win64.exe" ;; *) echo "Unknown operating system" exit 1 ;; esac echo "================================" echo "== You are using $os_type." # Change to the chat directory cd chat # List .bin files and prompt user to select one bin_files=(*.bin) echo "== Available .bin files:" for i in "${!bin_files[@]}"; do echo " [$((i+1))] ${bin_files[i]}" done # Function to get user input and validate it get_valid_user_input() { local input_valid=false while ! $input_valid; do echo "==> Please enter a number:" read -r user_selection if [[ $user_selection =~ ^[0-9]+$ ]] && (( user_selection >= 1 && user_selection <= ${#bin_files[@]} )); then input_valid=true else echo "Invalid input. Please enter a number between 1 and ${#bin_files[@]}." 
fi done } get_valid_user_input selected_bin_file="${bin_files[$((user_selection-1))]}" # Run the selected .bin file with the appropriate command ./"$binary_filename" -m "$selected_bin_file" ================================================ FILE: gpt4all-training/old-README.md ================================================

# GPT4All

Demo, data, and code to train open-source assistant-style large language models based on GPT-J and LLaMA

:green_book: Technical Report 2: GPT4All-J

:green_book: Technical Report 1: GPT4All

:snake: Official Python Bindings

:computer: Official Typescript Bindings

:speech_balloon: Official Web Chat Interface

:speech_balloon: Official Chat Interface

🦜️🔗 Official Langchain Backend

Discord

GPT4All is made possible by our compute partner Paperspace.

                        ## GPT4All-J: An Apache-2 Licensed GPT4All Model ![gpt4all-j-demo](https://user-images.githubusercontent.com/13879686/231876409-e3de1934-93bb-4b4b-9013-b491a969ebbc.gif) Run on an M1 Mac (not sped up!) ### GPT4All-J Chat UI Installers Installs a native chat-client with auto-update functionality that runs on your desktop with the GPT4All-J model baked into it. [Mac/OSX](https://gpt4all.io/installers/gpt4all-installer-darwin.dmg) [Windows](https://gpt4all.io/installers/gpt4all-installer-win64.exe) [Ubuntu](https://gpt4all.io/installers/gpt4all-installer-linux.run) If you have older hardware that only supports avx and not avx2 you can use these. [Mac/OSX - avx-only](https://gpt4all.io/installers/gpt4all-installer-darwin-avx-only.dmg) [Windows - avx-only](https://gpt4all.io/installers/gpt4all-installer-win64-avx-only.exe) [Ubuntu - avx-only](https://gpt4all.io/installers/gpt4all-installer-linux-avx-only.run) These files are not yet cert signed by Windows/Apple so you will see security warnings on initial installation. We did not want to delay release while waiting for their process to complete. Find the most up-to-date information on the [GPT4All Website](https://gpt4all.io/) ### Raw Model [ggml Model Download Link](https://gpt4all.io/models/ggml-gpt4all-j.bin) Note this model is only compatible with the C++ bindings found [here](https://github.com/nomic-ai/gpt4all-chat). It will not work with any existing llama.cpp bindings as we had to do a large fork of llama.cpp. GPT4All will support the ecosystem around this new C++ backend going forward. Python bindings are imminent and will be integrated into this [repository](https://github.com/nomic-ai/pyllamacpp). Stay tuned on the [GPT4All discord](https://discord.gg/mGZE39AS3e) for updates. ## Training GPT4All-J Please see [GPT4All-J Technical Report](https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf) for details. 
### GPT4All-J Training Data

We are releasing the curated training data for anyone to replicate GPT4All-J here:

- [GPT4All-J Training Data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations)
- [Atlas Map of Prompts](https://atlas.nomic.ai/map/gpt4all-j-prompts-curated)
- [Atlas Map of Responses](https://atlas.nomic.ai/map/gpt4all-j-response-curated)

We have released updated versions of our `GPT4All-J` model and training data.

- `v1.0`: The original model trained on the v1.0 dataset
- `v1.1-breezy`: Trained on a filtered dataset where we removed all instances of "AI language model"
- `v1.2-jazzy`: Trained on a filtered dataset where we also removed instances like "I'm sorry, I can't answer..." and "AI language model"

The [model](https://huggingface.co/nomic-ai/gpt4all-j) and [data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations) versions can be specified by passing a `revision` argument. For example, to load the `v1.2-jazzy` model and dataset, run:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM

dataset = load_dataset("nomic-ai/gpt4all-j-prompt-generations", revision="v1.2-jazzy")
model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-j", revision="v1.2-jazzy")
```

### GPT4All-J Training Instructions

```bash
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config_gptj.json train.py --config configs/train/finetune_gptj.yaml
```

# Original GPT4All Model (based on GPL Licensed LLaMa)

![gpt4all-lora-demo](https://user-images.githubusercontent.com/13879686/228352356-de66ca7a-df70-474e-b929-2e3656165051.gif)

Run on an M1 Mac (not sped up!)

# Try it yourself

Here's how to get started with the CPU quantized GPT4All model checkpoint:

1. Download the `gpt4all-lora-quantized.bin` file from [Direct Link](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin) or [[Torrent-Magnet]](https://tinyurl.com/gpt4all-lora-quantized).
2. Clone this repository, navigate to `chat`, and place the downloaded file there.
3. Run the appropriate command for your OS:
    - M1 Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-m1`
    - Linux: `cd chat;./gpt4all-lora-quantized-linux-x86`
    - Windows (PowerShell): `cd chat;./gpt4all-lora-quantized-win64.exe`
    - Intel Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-intel`

For custom hardware compilation, see our [llama.cpp](https://github.com/zanussbaum/gpt4all.cpp) fork.

-----------

Find all compatible models in the GPT4All Ecosystem section.

[Secret Unfiltered Checkpoint](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin) - [[Torrent]](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin.torrent)

This model had all refusal-to-answer responses removed from training. Try it with:

- M1 Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-m1 -m gpt4all-lora-unfiltered-quantized.bin`
- Linux: `cd chat;./gpt4all-lora-quantized-linux-x86 -m gpt4all-lora-unfiltered-quantized.bin`
- Windows (PowerShell): `cd chat;./gpt4all-lora-quantized-win64.exe -m gpt4all-lora-unfiltered-quantized.bin`
- Intel Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-intel -m gpt4all-lora-unfiltered-quantized.bin`

-----------

Note: the full model on GPU (16GB of RAM required) performs much better in our qualitative evaluations.

# Python Client

## CPU Interface

To run GPT4All in Python, see the new [official Python bindings](https://github.com/nomic-ai/pyllamacpp). The old bindings are still available but are now deprecated; they will not work in a notebook environment.
To get running with the Python client and the CPU interface, first install the [nomic client](https://github.com/nomic-ai/nomic) using `pip install nomic`. Then, you can use the following script to interact with GPT4All:

```python
from nomic.gpt4all import GPT4All

m = GPT4All()
m.open()
m.prompt('write me a story about a lonely computer')
```

## GPU Interface

There are two ways to get up and running with this model on GPU. The setup here is slightly more involved than the CPU model.

1. Clone the nomic client [repo](https://github.com/nomic-ai/nomic) and run `pip install .[GPT4All]` in the home dir.
2. Run `pip install nomic` and install the additional deps from the wheels built [here](https://github.com/nomic-ai/nomic/tree/main/bin)

Once this is done, you can run the model on GPU with a script like the following:

```python
from nomic.gpt4all import GPT4AllGPU

m = GPT4AllGPU(LLAMA_PATH)
config = {'num_beams': 2,
          'min_new_tokens': 10,
          'max_length': 100,
          'repetition_penalty': 2.0}
out = m.generate('write me a story about a lonely computer', config)
print(out)
```

Where `LLAMA_PATH` is the path to a HuggingFace AutoModel-compliant LLaMA model. Nomic is unable to distribute this file at this time. We are working on a GPT4All model that does not have this limitation. You can pass any of the [HuggingFace generation config params](https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig) in the config.

# GPT4All Compatibility Ecosystem

Edge models in the GPT4All Ecosystem. Please PR as the [community grows](https://huggingface.co/models?sort=modified&search=4bit). Feel free to convert this to a more structured table.
- [gpt4all](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin.md5)]
- [gpt4all-ggml-converted](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin.md5)]
- [gpt4all-unfiltered](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin.md5)]
- [ggml-vicuna-7b-4bit](https://huggingface.co/eachadea/ggml-vicuna-7b-4bit)
- [vicuna-13b-GPTQ-4bit-128g](https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g)
- [LLaMa-Storytelling-4Bit](https://huggingface.co/GamerUntouch/LLaMa-Storytelling-4Bit)
- [Alpaca Native 4bit](https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/tree/main)

# Roadmap

## Short Term

- (Done) Train a GPT4All model based on GPTJ to alleviate llama distribution issues.
- (Done) Create improved CPU and GPU interfaces for this model.
- (Done) [Integrate llama.cpp bindings](https://github.com/nomic-ai/pyllamacpp)
- (Done) [Create a good conversational chat interface for the model.](https://github.com/nomic-ai/gpt4all-ui)
- (Done) [Allow users to opt in and submit their chats for subsequent training runs](https://github.com/nomic-ai/gpt4all-ui)

## Medium Term

- (NOT STARTED) Integrate GPT4All with [Atlas](https://atlas.nomic.ai) to allow for document retrieval.
    - BLOCKED by GPT4All based on GPTJ
- (Done) Integrate GPT4All with Langchain.
- (IN PROGRESS) Build easy custom training scripts to allow users to fine tune models.

## Long Term

- (NOT STARTED) Allow anyone to curate training data for subsequent GPT4All releases using Atlas.
- (IN PROGRESS) Democratize AI.
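Several of the model downloads above publish MD5 signatures alongside the binaries. Verifying a multi-gigabyte file before loading it is straightforward with the standard library; a minimal sketch (the function names here are illustrative, not part of any GPT4All tooling):

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading in fixed-size chunks
    so large model files never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_signature(path: str, expected: str) -> bool:
    """Compare a file's digest against the published .md5 value
    (case-insensitive, ignoring surrounding whitespace)."""
    return file_md5(path) == expected.strip().lower()
```

For an uncorrupted download, `matches_signature("gpt4all-lora-quantized.bin", <contents of the .md5 file>)` should return `True`.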
# Reproducibility

Trained Model Weights:

- gpt4all-lora (four full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora
- gpt4all-lora-epoch-2 (three full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora-epoch-2
- gpt4all-j (one full epoch of training): https://huggingface.co/nomic-ai/gpt4all-j
- gpt4all-j-lora (one full epoch of training): https://huggingface.co/nomic-ai/gpt4all-j-lora

Raw Data:

- [Training Data Without P3](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)
    - Explorer: https://atlas.nomic.ai/map/gpt4all_data_clean_without_p3
- [Full Dataset with P3](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations_with_p3)
    - Explorer: https://atlas.nomic.ai/map/gpt4all_data_clean
- [GPT4All-J Dataset](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations)
    - Explorer Indexed on Prompts: https://atlas.nomic.ai/map/gpt4all-j-prompts-curated
    - Explorer Indexed on Responses: https://atlas.nomic.ai/map/gpt4all-j-response-curated

We are not distributing a LLaMa 7B checkpoint. You can reproduce our trained model by doing the following:

## Setup

Clone the repo:

```
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git
git submodule update --init
```

Set up the environment:

```
python -m pip install -r requirements.txt

cd ../peft
pip install -e .
```

## Training

```bash
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config.json train.py --config configs/train/finetune-7b.yaml
```

## Generate

```bash
python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
```

## Need Help?

Join the Discord and ask for help in `#gpt4all-help`.

# Sample Generations

### Provide instructions for the given exercise. Leg Raises

> Leg Raises
> 1. Stand with your feet shoulder-width apart and your knees slightly bent.
> 2. Raise your arms straight out in front of you.
> 3. Slowly bend your knees and raise your heels off the ground.
> 4. Hold this position for a count of 3, then slowly lower your heels back to the ground.
> 5. Repeat the exercise 10 times.
> 6. Do not let your knees go beyond your toes.
> 7. Keep your back straight and your abdominal muscles engaged.
> 8. Do not let your hips sag or your back arch.
> 9. Keep your arms straight and your elbows locked.
> 10. Do not let your head drop or your shoulders rise.
> 11. Breathe normally throughout the exercise.
> 12. Rest for 30 seconds before repeating the exercise.
> 13. Gradually increase the number of repetitions as you become stronger.

### A color description has been provided. Find the CSS code associated with that color. A light red color with a medium light shade of pink

> The CSS code associated with the color provided is: #FF6347

### Come up with an interesting idea for a new movie plot. Your plot should be described with a title and a summary.

> 1. The title of your movie plot should be "The Last Stand".
> 2. The summary of your movie plot should be: A group of mercenaries are hired to protect a small town from a dangerous criminal who is on the run. However, the town's sheriff is not willing to let them take over and decides to take matters into his own hands.

### Reverse a string in python.

> In Python, you can reverse a list or tuple by using the reversed() function on it. Here's an example of how to use this method with strings:
> ```python
> my_string = "Hello World"  # Define your original string here
> reversed_str = my_string[::-1]  # Use a slice and negative index for reverse order
> print(reversed_str)  # Output will be reversed version of the inputted string.
> ```

### List 10 dogs.

> 1. Labrador Retriever
> 2. Golden Retriever
> 3. Beagle
> 4. Poodle
> 5. German Shepherd
> 6. Bulldog
> 7. Rottweiler
> 8. Boxer
> 9. Shih Tzu
> 10. Pug

### Write me a poem about the fall of Julius Caesar into a caesar salad in iambic pentameter.

> The fall of Julius Caesar into a caesar salad
> In iambic pentameter,
> The mighty Roman emperor
> Fell into a salad of lettuce and croutons,
> His empire crumbling around him,
> As he was devoured by the hungry diners.
> The once mighty emperor
> Was now just a salad topping,
> His legacy reduced to a mere garnish,
> As he was consumed by the masses.
> The fall of Julius Caesar into a caesar salad,
> In iambic pentameter,
> Was a tragic end to a great man,
> Who had once ruled the world with an iron fist.
> But now, he was just a salad topping,
> His legacy reduced to a mere garnish,
> As he was consumed by the masses.

### What is a three word topic describing the following keywords: baseball, football, soccer:

> Sports, athletics, games

## Citation

If you utilize this repository, models or data in a downstream project, please consider citing it with:

```
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
```


================================================
FILE: gpt4all-training/read.py
================================================
import yaml


def read_config(path):
    # read yaml and return contents
    with open(path, 'r') as file:
        try:
            return yaml.safe_load(file)
        except yaml.YAMLError as exc:
            print(exc)


================================================
FILE: gpt4all-training/requirements.txt
================================================
accelerate
datasets
einops
torchmetrics
evaluate
transformers>=4.28.0
wandb
peft
nodelist-inflator
deepspeed
sentencepiece
jsonlines
nomic
scikit-learn
matplotlib


================================================
FILE: gpt4all-training/train.py
================================================
#!/usr/bin/env python3
import os
from transformers import AutoModelForCausalLM, AutoTokenizer, get_scheduler
import torch
from torch.optim import AdamW
from argparse import ArgumentParser
from read import read_config
from accelerate import Accelerator
from accelerate.utils import DummyScheduler, DummyOptim, set_seed
from peft import get_peft_model, LoraConfig, TaskType
from data import load_data
from torchmetrics import MeanMetric
from tqdm import tqdm
import wandb

torch.backends.cuda.matmul.allow_tf32 = True


def format_metrics(metrics, split, prefix=""):
    log = f"[{split}]" + prefix
    log += " ".join([f"{key}: {value:.4f}" for key, value in metrics.items()])
    return log


def evaluate(model, val_dataloader):
    model.eval()
    val_loss = MeanMetric(nan_strategy="error").to(model.device)

    with torch.no_grad():
        for batch in tqdm(val_dataloader):
            loss = model(**batch).loss
            # `accelerator` is the module-level instance created in __main__
            loss_values = accelerator.gather_for_metrics({"loss": loss.detach()})
            val_loss.update(loss_values["loss"])

    return val_loss


def train(accelerator, config):
    set_seed(config['seed'])

    accelerator.print(config)
    accelerator.print(f"Using {accelerator.num_processes} GPUs")

    tokenizer = AutoTokenizer.from_pretrained(config['tokenizer_name'],
                                              model_max_length=config['max_length'],
                                              use_fast=False)
    # if no pad token, set it to eos
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    with accelerator.main_process_first():
        train_dataloader, val_dataloader = load_data(config, tokenizer)

    checkpoint = config["gradient_checkpointing"]
    model = AutoModelForCausalLM.from_pretrained(config["model_name"],
                                                 use_cache=False if checkpoint else True,
                                                 trust_remote_code=True)
    if checkpoint:
        model.gradient_checkpointing_enable()

    if config["lora"]:
        peft_config = LoraConfig(
            # should R be configurable?
            task_type=TaskType.CAUSAL_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1
        )
        model = get_peft_model(model, peft_config)
        model.print_trainable_parameters()

    optimizer_cls = (
        AdamW
        if accelerator.state.deepspeed_plugin is None
        or "optimizer" not in accelerator.state.deepspeed_plugin.deepspeed_config
        else DummyOptim
    )

    # karpathy doesn't decay embedding, maybe we should exclude
    # https://github.com/karpathy/minGPT/commit/bbbdac74fa9b2e55574d70056163ffbae42310c1#diff-2075fa9c224b395be5bda85544dd36572b59c76c54562819eadadbf268602834R157s
    optimizer = optimizer_cls(model.parameters(), lr=config["lr"], weight_decay=config["weight_decay"])

    if accelerator.state.deepspeed_plugin is not None:
        gradient_accumulation_steps = accelerator.state.deepspeed_plugin.deepspeed_config[
            "gradient_accumulation_steps"
        ]
    else:
        # without DeepSpeed no accumulation is configured; default to 1 so the
        # step math below still works
        gradient_accumulation_steps = 1

    # decay to min_lr instead of 0
    lr_ratio = config["min_lr"] / config["lr"]
    accelerator.print(f"Len of train_dataloader: {len(train_dataloader)}")
    total_num_steps = (len(train_dataloader) / gradient_accumulation_steps) * (config["num_epochs"])
    # instead of decaying to zero, decay to ratio of min_lr / lr
    total_num_steps += int(total_num_steps * lr_ratio) + config["warmup_steps"]
    accelerator.print(f"Total training steps: {total_num_steps}")

    # Creates Dummy Scheduler if `scheduler` was specified in the config file else creates `args.lr_scheduler_type` Scheduler
    if (
        accelerator.state.deepspeed_plugin is None
        or "scheduler" not in accelerator.state.deepspeed_plugin.deepspeed_config
    ):
        scheduler = get_scheduler(
            name="cosine",
            optimizer=optimizer,
            num_warmup_steps=config["warmup_steps"] * accelerator.num_processes,
            num_training_steps=total_num_steps,
        )
    else:
        scheduler = DummyScheduler(
            optimizer, total_num_steps=total_num_steps, warmup_num_steps=config["warmup_steps"]
        )

    model, optimizer, train_dataloader, val_dataloader, scheduler = accelerator.prepare(
        model, optimizer, train_dataloader, val_dataloader, scheduler
    )

    # setup for saving training states in case of preemption
    accelerator.register_for_checkpointing(scheduler)

    if config["checkpoint"]:
        accelerator.load_state(config["checkpoint"])
        accelerator.print(f"Resumed from checkpoint: {config['checkpoint']}")
        path = os.path.basename(config["checkpoint"])
        training_difference = os.path.splitext(path)[0]
        resume_step = int(training_difference.replace("step_", ""))
        train_dataloader = accelerator.skip_first_batches(train_dataloader, resume_step)
        accelerator.print(f"Resuming from step {resume_step}")
    else:
        resume_step = 0

    # log gradients
    if accelerator.is_main_process and config["wandb"]:
        wandb.watch(model, log_freq=config["log_grads_every"], log="all")

    accelerator.wait_for_everyone()

    for epoch in range(0, config["num_epochs"]):
        train_loss = MeanMetric(nan_strategy="error").to(model.device)
        for step, batch in enumerate(tqdm(train_dataloader)):
            curr_step = epoch * len(train_dataloader) + step
            model.train()
            outputs = model(**batch)
            loss = outputs.loss

            # gather loss before backprop in case of gradient accumulation
            loss_values = accelerator.gather_for_metrics({"loss": loss.detach().float()})
            if config["wandb"]:
                accelerator.log({"loss": torch.mean(loss_values["loss"]).item()}, step=curr_step)
            train_loss.update(loss_values["loss"])

            loss = loss / gradient_accumulation_steps
            accelerator.backward(loss)

            # log LR in case something weird happens
            if step > 0 and step % (config["log_lr_every"]) == 0:
                if config["wandb"]:
                    accelerator.log({"lr": scheduler.get_last_lr()[0]}, step=curr_step)

            if (step + 1) % gradient_accumulation_steps == 0 or step == len(train_dataloader) - 1:
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()

            if step > 0 and step % config["save_every"] == 0:
                accelerator.save_state(f"{config['output_dir']}/step_{curr_step}")

            if step > 0 and (step % config["eval_every"] == 0 or step == len(train_dataloader) - 1):
                val_loss = evaluate(model, val_dataloader)

                log_train = {
                    "train_loss": train_loss.compute()
                }
                log_val = {
                    "val_loss": val_loss.compute()
                }

                if config["wandb"]:
                    accelerator.log({**log_train, **log_val}, step=curr_step)

                accelerator.print(f"Current LR: {scheduler.get_last_lr()[0]}")
                accelerator.print(format_metrics(log_train, "train", f" step {step} "))
                accelerator.print(format_metrics(log_val, "val", f" step {step} "))

                train_loss.reset()

        accelerator.print(f"Epoch {epoch} finished")
        accelerator.print("Pushing to HF hub")
        unwrapped_model = accelerator.unwrap_model(model)
        unwrapped_model.save_pretrained(
            f"{config['output_dir']}/epoch_{epoch}",
            is_main_process=accelerator.is_main_process,
            save_function=accelerator.save,
            state_dict=accelerator.get_state_dict(model),
        )
        try:
            if accelerator.is_main_process:
                unwrapped_model.push_to_hub(config["save_name"] + f"-epoch_{epoch}", private=True)
        except Exception as e:
            accelerator.print(e)
            accelerator.print("Failed to push to hub")

    if config["num_epochs"] > 1:
        accelerator.wait_for_everyone()
        unwrapped_model = accelerator.unwrap_model(model)
        unwrapped_model.save_pretrained(
            f"{config['output_dir']}/final",
            is_main_process=accelerator.is_main_process,
            save_function=accelerator.save,
            state_dict=accelerator.get_state_dict(model),
        )

    accelerator.end_training()


if __name__ == "__main__":
    # parse arguments by reading in a config
    parser = ArgumentParser()
    parser.add_argument("--config", type=str, default="config.yaml")
    args = parser.parse_args()

    config = read_config(args.config)

    if config["wandb"]:
        accelerator = Accelerator(log_with="wandb")
        accelerator.init_trackers(
            project_name=config["wandb_project_name"],
            config=config,
            init_kwargs={"wandb": {"entity": config["wandb_entity"]}},
        )
    else:
        accelerator = Accelerator()

    train(accelerator, config=config)


================================================
FILE: roadmap.md
================================================
# GPT4All 2024 Roadmap

To contribute to the development of any of the below roadmap items, make or find the corresponding issue and cross-reference the [in-progress
task](https://github.com/orgs/nomic-ai/projects/2/views/1). Each item should have an issue link below.

- Chat UI Language Localization (localize UI into the native languages of users)
    - [ ] Chinese
    - [ ] German
    - [ ] French
    - [x] Portuguese
    - [ ] Your native language here.
- UI Redesign: an internal effort at Nomic to improve the UI/UX of gpt4all for all users.
    - [x] Design new user interface and gather community feedback
    - [x] Implement the new user interface and experience.
- Installer and Update Improvements
    - [ ] Seamless native installation and update process on OSX
    - [ ] Seamless native installation and update process on Windows
    - [ ] Seamless native installation and update process on Linux
- Model discoverability improvements:
    - [x] Support huggingface model discoverability
    - [x] Support Nomic hosted model discoverability
- LocalDocs (towards a local perplexity)
    - Multilingual LocalDocs Support
        - [ ] Create a multilingual experience
        - [ ] Incorporate a multilingual embedding model
        - [ ] Specify a preferred multilingual LLM for localdocs
    - Improved RAG techniques
        - [ ] Query augmentation and re-writing
        - [ ] Improved chunking and text extraction from arbitrary modalities
            - [ ] Custom PDF extractor past the QT default (charts, tables, text)
        - [ ] Faster indexing and local exact search with v1.5 hamming embeddings and reranking (skip ANN index construction!)
    - Support queries like 'summarize X document'
    - Multimodal LocalDocs support with Nomic Embed
    - Nomic Dataset Integration with real-time LocalDocs
        - [ ] Include an option to allow the export of private LocalDocs collections to Nomic Atlas for debugging data/chat quality
        - [ ] Allow optional sharing of LocalDocs collections between users.
        - [ ] Allow the import of a LocalDocs collection from an Atlas Dataset
        - Chat with a live version of Wikipedia, chat with Pubmed, chat with the latest snapshot of world news.
- First class Multilingual LLM Support
    - [ ] Recommend and set a default LLM for German
    - [ ] Recommend and set a default LLM for English
    - [ ] Recommend and set a default LLM for Chinese
    - [ ] Recommend and set a default LLM for Spanish
- Server Mode improvements
    - Improved UI and new requested features:
        - [ ] Fix outstanding bugs and feature requests around networking configurations.
        - [ ] Support Nomic Embed inferencing
        - [ ] First class documentation
        - [ ] Improving developer use and quality of server mode (e.g. support larger batches)
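The "local exact search with v1.5 hamming embeddings ... (skip ANN index construction!)" item above rests on a simple idea: with binary embeddings, nearest-neighbour search reduces to XOR plus popcount, which is fast enough to brute-force exactly. A minimal standard-library illustration of that idea (the function names and toy codes are mine, not from the roadmap):

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-width binary codes stored as ints:
    XOR the codes, then count the differing bits."""
    return bin(a ^ b).count("1")


def exact_search(query: int, codes: list[int], k: int = 3) -> list[int]:
    """Brute-force exact k-nearest-neighbour search by Hamming distance --
    no approximate (ANN) index to build or maintain."""
    return sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))[:k]
```

Because each comparison is a single XOR and popcount, scanning every stored code exactly is practical at the scale of a local document collection, which is what makes skipping ANN index construction viable.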