Repository: AstraBert/everything-ai Branch: main Commit: d8cc4f2092b6 Files: 36 Total size: 128.7 KB Directory structure: gitextract_dl7fgpnu/ ├── .github/ │ └── FUNDING.yml ├── .gitignore ├── .v0_1_1/ │ ├── README.md │ ├── docker/ │ │ ├── Dockerfile │ │ ├── build_command.sh │ │ ├── chat.py │ │ ├── requirements.txt │ │ └── utils.py │ └── scripts/ │ ├── gemma-for-datasciences.ipynb │ └── gemma_for_datasciences.py ├── LICENSE ├── README.md ├── _config.yml ├── compose.yaml └── docker/ ├── Dockerfile ├── agnostic_text_generation.py ├── audio_classification.py ├── autotrain_interface.py ├── build_your_llm.py ├── chat_your_llm.py ├── fal_img2img.py ├── image_classification.py ├── image_generation.py ├── image_generation_pollinations.py ├── image_to_text.py ├── llama_cpp_int.py ├── protein_folding_with_esm.py ├── requirements.txt ├── retrieval_image_search.py ├── retrieval_text_generation.py ├── select_and_run.py ├── spaces_api_supabase.py ├── speech_recognition.py ├── text_summarization.py ├── utils.py └── video_generation.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .github/FUNDING.yml ================================================ # These are supported funding model platforms github: [AstraBert] ================================================ FILE: .gitignore ================================================ flagged/ docker/__pycache__ docker/flagged qdrant_storage/ ================================================ FILE: .v0_1_1/README.md ================================================ # everything-rag >_How was this README generated? Leveraging the power of AI with **reAIdme**, a HuggingChat assistant based on meta-llama/Llama-2-70b-chat-hf._ _Go and give it a try [here](https://hf.co/chat/assistant/660d9a4f590a7924eed02a32)!_ 🤖

Example chat with everything-rag, mediated by google/flan-t5-base

### Table of Contents 0. [TL;DR](#tldr) 1. [Introduction](#introduction) 2. [Inspiration](#inspiration) 3. [Getting Started](#getting-started) 4. [Using the Chatbot](#using-the-chatbot) 5. [Troubleshooting](#troubleshooting) 6. [Contributing](#contributing) 7. [Upcoming features](#upcoming-features) 8. [References](#reference) ## TL;DR * This documentation is soooooo long, I want to get my hands dirty!!! >You can try out everything-rag in the [dedicated HuggingFace space](https://huggingface.co/spaces/as-cle-bert/everything-rag), based on google/flan-t5-large.
## Introduction Introducing **everything-rag**, your fully customizable and local chatbot assistant! 🤖 With everything-rag, you can: 1. Use virtually any LLM you want: Switch between different LLMs like _gemma-7b_ or _llama-7b_ to suit your needs. 2. Use your own data: everything-rag can work with any data you provide, whether it's a PDF about data sciences or a document about pallas' cats!🐈 3. Enjoy 100% local and 100% free functionality: No need for hosted APIs or pay-as-you-go services. everything-rag is completely free to use and runs on your desktop. Plus, with the chat_history functionality in ConversationalRetrievalChain, you can easily retrieve and review previous conversations with your chatbot, making it even more convenient to use. While everything-rag offers many benefits, there are a couple of limitations to keep in mind: 1. Performance-critical tasks: Loading large models (>1~2 GB) and generating text can be resource-intensive, so it's recommended to have at least 16GB RAM and 4 CPU cores for optimal performance. 2. Small LLMs can still hallucinate: While large LLMs like _gemma-7b_ and _llama-7b_ tend to produce better results, smaller models like _openai-community/gpt2_ can still produce suboptimal responses in certain situations. In summary, everything-rag is a simple, customizable, and local chatbot assistant that offers a wide range of features and capabilities. By leveraging the power of RAG, everything-rag offers a unique and flexible chatbot experience that can be tailored to your specific needs and preferences. Whether you're looking for a simple chatbot to answer basic questions or a more advanced conversational AI to engage with your users, everything-rag has got you covered.😊 ## Inspiration This project is a humble and modest carbon-copy of its main and true inspirations, i.e. [Jan.ai](https://jan.ai/), [Cheshire Cat AI](https://cheshirecat.ai/), [privateGPT](https://privategpt.io/) and many other projects that focus on making LLMs (and AI in general) open-source and easily accessible to everyone. ## Getting Started You can do a few things: - Play with generation on [Kaggle](https://www.kaggle.com/code/astrabertelli/gemma-for-datasciences) - Clone this repository, head over to [the python script](./scripts/gemma_for_datasciences.py) and modify everything to your needs! - Docker installation (🥳**FULLY IMPLEMENTED**): you can install everything-rag as a Docker image and run it with these really simple commands: ```bash docker pull ghcr.io/astrabert/everything-rag:latest docker run -p 7860:7860 ghcr.io/astrabert/everything-rag:latest -m microsoft/phi-2 -t text-generation ``` - **IMPORTANT NOTE**: running the script within `docker run` does not log the port on which the app is running until you press `Ctrl+C`, but at that moment it also interrupts the execution! The app binds to `0.0.0.0:7860` (open `localhost:7860` in your browser), so just head to that address and refresh the page after 30 seconds to a couple of minutes, once the model and the tokenizer have been loaded and the app is ready to work! - As you can see, you just need to specify the LLM model and its task (this is mandatory). Keep in mind that, as of v0.1.1, everything-rag supports only text-generation and text2text-generation. For these two tasks, you can use virtually *any* model from HuggingFace Hub: the sole recommendation is to watch out for your disk space, RAM and CPU power, since LLMs can be quite resource-consuming! (A minimal loading sketch follows this list.)
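If you prefer to adapt the scripts rather than use Docker, the heart of the model loading is small. Below is a minimal sketch of how the two supported tasks map to the corresponding Transformers auto-classes, based on the loading logic in `.v0_1_1/docker/utils.py`; the checkpoint name is just an illustrative example, not a project default.

```python
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

# Map each supported task to the matching Transformers auto-class,
# mirroring the TASK_TO_MODEL dictionary in .v0_1_1/docker/utils.py.
TASK_TO_MODEL = {
    "text-generation": AutoModelForCausalLM,
    "text2text-generation": AutoModelForSeq2SeqLM,
}

def load_model_for_task(model_id: str, task: str):
    """Load the tokenizer and model for one of the two supported tasks."""
    if task not in TASK_TO_MODEL:
        raise ValueError(f"Unsupported task: {task}")
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TASK_TO_MODEL[task].from_pretrained(model_id)
    return tokenizer, model

# Example (hypothetical choice): a small seq2seq checkpoint from the HF Hub
tokenizer, model = load_model_for_task("google/flan-t5-base", "text2text-generation")
```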
## Using the Chatbot ### GUI The chatbot has a brand-new Gradio-based interface that runs on a local server. You can interact by directly uploading your pdf files and/or sending messages, all by running: ```bash python3 scripts/chat.py -m provider/modelname -t task ``` The suggested workflow is, nevertheless, the one that exploits Docker. ### Code breakdown - notebook Everything is explained in [the dedicated notebook](./scripts/gemma-for-datasciences.ipynb), but here's a brief breakdown of the code: 1. The first section imports the necessary libraries, including Hugging Face Transformers, langchain-community, and tkinter. 2. The next section installs the necessary dependencies, loads the gemma-2b model, and defines some useful functions for making the LLM-based data science assistant work. 3. The create_a_persistent_db function creates a persistent database from a PDF file, using PyPDFLoader to load the PDF, a CharacterTextSplitter to split it into smaller chunks, and Hugging Face embeddings to transform the text into numerical vectors. The embeddings cache is stored in a LocalFileStore and the vector database is persisted with Chroma. 4. The just_chatting function implements a chat system using the Hugging Face model and the persistent database. It takes a query, tokenizes it, and passes it to the model to generate a response. The response is then returned as a dictionary of strings. 5. The ChatGUI class defines a simple chat GUI that displays the chat history and allows the user to input queries. The send_message function is called when the user presses the "Send" button, and it sends the user's message to the just_chatting function to get a response. 6. The script then creates a root Tk object and instantiates a ChatGUI object, which starts the main loop. Et voilà, your chatbot is up and running!🦿 ## Troubleshooting ### Common Issues Q&A * Q: The chatbot is not responding😭 > A: Make sure that the PDF document is in the specified path and that the database has been created successfully. * Q: The chatbot is taking soooo long🫠 > A: This is quite common in resource-limited environments that deal with too large or too small models: large models require **at least** 32 GB RAM and a >8-core CPU, whereas small models can easily hallucinate, producing responses that are endless repetitions of the same thing! Check the *repetition_penalty* parameter to avoid this, and **try rephrasing the query, being as specific as possible**. * Q: My model is hallucinating and/or repeating the same sentence over and over again😵‍💫 > A: This is quite common with small or old models: check the *repetition_penalty* and *temperature* parameters to avoid this. * Q: The chatbot is giving incorrect/non-meaningful answers🤥 >A: Check that the PDF document is relevant and up-to-date. Also, **try rephrasing the query and be as specific as possible** * Q: An error occurred while generating the answer💔 >A: This frequently occurs when your (small) LLM has a limited maximum input length (generally 512 or 1024 tokens) and the context that the retrieval-augmented chain produces goes beyond that maximum. You could, potentially, modify the configuration of the model, but this would mean dramatically increasing its resource consumption, and your small laptop is not prepared to take it, trust me!!! A solution, if you have enough RAM and CPU power, is to switch to larger LLMs: they do not have problems in this sense.
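As a concrete reference for the repetition/hallucination tips above, the generation parameters live in the `transformers` pipeline that the retrieval chain wraps. The sketch below shows where you might raise `repetition_penalty` and lower `temperature`; the checkpoint and parameter values are illustrative only, not the project defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Illustrative small checkpoint; any text-generation model from the HF Hub should work
model_id = "openai-community/gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A higher repetition_penalty (and a lower sampling temperature) usually reduces
# the "endless repetition" failure mode described in the troubleshooting Q&A.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    repetition_penalty=1.3,
    do_sample=True,
    temperature=0.4,
)

print(pipe("What is data science?")[0]["generated_text"])
```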
## Upcoming features🚀 - [ ] Multi-lingual support (expected for **version 0.2.0**) - [ ] More text-based tasks: question answering, summarisation (expected for **version 0.3.0**) - [ ] Computer vision: Image-to-text, image generation, image segmentation... (expected for **version 1.0.0**) ## Contributing Contributions are welcome! If you would like to improve the chatbot's functionality or add new features, please fork the repository and submit a pull request. ## Reference * [Hugging Face Transformers](https://github.com/huggingface/transformers) * [Langchain-community](https://github.com/langchain-community/langchain-community) * [Tkinter](https://docs.python.org/3/library/tkinter.html) * [PDF document about data science](https://www.kaggle.com/datasets/astrabertelli/what-is-datascience-docs) * [GradIO](https://www.gradio.app/) ## License This project is licensed under the Apache 2.0 License. If you use this work for your projects, please consider citing the author [Astra Bertelli](http://astrabert.vercel.app). ================================================ FILE: .v0_1_1/docker/Dockerfile ================================================ # Use an official Python runtime as a parent image FROM python:3.10-slim-bookworm # Set the working directory in the container to /app WORKDIR /app # Add the current directory contents into the container at /app ADD . /app # Update and install system dependencies RUN apt-get update && apt-get install -y \ build-essential \ libpq-dev \ libffi-dev \ libssl-dev \ musl-dev \ libxml2-dev \ libxslt1-dev \ zlib1g-dev \ && rm -rf /var/lib/apt/lists/* # Install Python dependencies RUN python3 -m pip cache purge RUN python3 -m pip install --no-cache-dir -r requirements.txt # Expose the port that the application will run on EXPOSE 7860 # Set the entrypoint with a default command and allow the user to override it ENTRYPOINT ["python3", "chat.py"] ================================================ FILE: .v0_1_1/docker/build_command.sh ================================================ docker buildx build \ --label org.opencontainers.image.title=everything-rag \ --label org.opencontainers.image.description='Introducing everything-rag, your fully customizable and local chatbot assistant!' \ --label org.opencontainers.image.url=https://github.com/AstraBert/everything-rag \ --label org.opencontainers.image.source=https://github.com/AstraBert/everything-rag --label org.opencontainers.image.version=0.1.7 \ --label org.opencontainers.image.created=2024-04-07T12:39:11.393Z \ --label org.opencontainers.image.licenses=Apache-2.0 \ --platform linux/amd64 \ --tag ghcr.io/astrabert/everything-rag:latest \ --tag ghcr.io/astrabert/everything-rag:0.1.1 \ --push . ================================================ FILE: .v0_1_1/docker/chat.py ================================================ import gradio as gr import os import time from utils import * vectordb = "" def generate_welcome_message(): return (None, "Hello! Welcome to the chatbot. 
You can enter a message or upload a file.") def print_like_dislike(x: gr.LikeData): print(x.index, x.value, x.liked) def add_message(history, message): if len(message["files"]) > 0: history.append((message["files"], None)) if message["text"] is not None and message["text"] != "": history.append((message["text"], None)) return history, gr.MultimodalTextbox(value=None, interactive=False) def bot(history): global vectordb global tsk if type(history[-1][0]) != tuple: if vectordb == "": pipe = pipeline(tsk, tokenizer=tokenizer, model=model) response = pipe(history[-1][0])[0] response = response["generated_text"] history[-1][1] = "" for character in response: history[-1][1] += character time.sleep(0.05) yield history else: try: response = just_chatting(task=tsk, model=model, tokenizer=tokenizer, query=history[-1][0], vectordb=vectordb, chat_history=[convert_none_to_str(his) for his in history])["answer"] history[-1][1] = "" for character in response: history[-1][1] += character time.sleep(0.05) yield history except Exception as e: response = f"Sorry, the error '{e}' occured while generating the response; check [troubleshooting documentation](https://astrabert.github.io/everything-rag/#troubleshooting) for more" if type(history[-1][0]) == tuple: filelist = [] for i in history[-1][0]: filelist.append(i) if len(filelist) > 1: finalpdf = merge_pdfs(filelist) else: finalpdf = filelist[0] vectordb = create_a_persistent_db(finalpdf, os.path.dirname(finalpdf)+"_localDB", os.path.dirname(finalpdf)+"_embcache") response = "VectorDB was successfully created, now you can ask me anything about the document you uploaded!😊" history[-1][1] = "" for character in response: history[-1][1] += character time.sleep(0.05) yield history with gr.Blocks() as demo: chatbot = gr.Chatbot( [[None, "Hi, I'm **everything-rag**🤖.\nI'm here to assist you and let you chat with _your_ pdfs!\nCheck [my website](https://astrabert.github.io/everything-rag/) for troubleshooting and documentation reference\nHave fun!😊"]], label="everything-rag", elem_id="chatbot", bubble_full_width=False, ) chat_input = gr.MultimodalTextbox(interactive=True, file_types=["pdf"], placeholder="Enter message or upload file...", show_label=False) chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input]) bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response") bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input]) chatbot.like(print_like_dislike, None, None) clear = gr.ClearButton(chatbot) demo.queue() if __name__ == "__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: .v0_1_1/docker/requirements.txt ================================================ langchain-community==0.0.13 langchain==0.1.1 pypdf==3.17.4 sentence_transformers==2.2.2 chromadb==0.4.22 cryptography>=3.1 gradio transformers trl peft ================================================ FILE: .v0_1_1/docker/utils.py ================================================ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, AutoModelForCausalLM, pipeline import time from langchain_community.llms import HuggingFacePipeline from langchain.storage import LocalFileStore from langchain.embeddings import CacheBackedEmbeddings from langchain_community.vectorstores import Chroma from langchain.text_splitter import CharacterTextSplitter from langchain_community.document_loaders import PyPDFLoader from langchain_community.embeddings import HuggingFaceEmbeddings from 
langchain.chains import ConversationalRetrievalChain import os from pypdf import PdfMerger from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) argparse.add_argument( "-t", "--task", help="Task for the model: for now supported task are ['text-generation', 'text2text-generation']", required=True, ) args = argparse.parse_args() mod = args.model tsk = args.task mod = mod.replace("\"", "").replace("'", "") tsk = tsk.replace("\"", "").replace("'", "") TASK_TO_MODEL = {"text-generation": AutoModelForCausalLM, "text2text-generation": AutoModelForSeq2SeqLM} if tsk not in TASK_TO_MODEL: raise Exception("Unsopported task! Supported task are ['text-generation', 'text2text-generation']") def merge_pdfs(pdfs: list): merger = PdfMerger() for pdf in pdfs: merger.append(pdf) merger.write(f"{pdfs[-1].split('.')[0]}_results.pdf") merger.close() return f"{pdfs[-1].split('.')[0]}_results.pdf" def create_a_persistent_db(pdfpath, dbpath, cachepath) -> None: """ Creates a persistent database from a PDF file. Args: pdfpath (str): The path to the PDF file. dbpath (str): The path to the storage folder for the persistent LocalDB. cachepath (str): The path to the storage folder for the embeddings cache. """ print("Started the operation...") a = time.time() loader = PyPDFLoader(pdfpath) documents = loader.load() ### Split the documents into smaller chunks for processing text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) texts = text_splitter.split_documents(documents) ### Use HuggingFace embeddings for transforming text into numerical vectors ### This operation can take a while the first time but, once you created your local database with ### cached embeddings, it should be a matter of seconds to load them! embeddings = HuggingFaceEmbeddings() store = LocalFileStore( os.path.join( cachepath, os.path.basename(pdfpath).split(".")[0] + "_cache" ) ) cached_embeddings = CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=store, namespace=os.path.basename(pdfpath).split(".")[0], ) b = time.time() print( f"Embeddings successfully created and stored at {os.path.join(cachepath, os.path.basename(pdfpath).split('.')[0]+'_cache')} under namespace: {os.path.basename(pdfpath).split('.')[0]}" ) print(f"To load and embed, it took: {b - a}") persist_directory = os.path.join( dbpath, os.path.basename(pdfpath).split(".")[0] + "_localDB" ) vectordb = Chroma.from_documents( documents=texts, embedding=cached_embeddings, persist_directory=persist_directory, ) c = time.time() print( f"Persistent database successfully created and stored at {os.path.join(dbpath, os.path.basename(pdfpath).split('.')[0] + '_localDB')}" ) print(f"To create a persistent database, it took: {c - b}") return vectordb def convert_none_to_str(l: list): newlist = [] for i in range(len(l)): if l[i] is None or type(l[i])==tuple: newlist.append("") else: newlist.append(l[i]) return tuple(newlist) def just_chatting( task, model, tokenizer, query, vectordb, chat_history=[] ): """ Implements a chat system using Hugging Face models and a persistent database. Args: task (str): Task for the pipeline; for now supported task are ['text-generation', 'text2text-generation'] model (AutoModelForCausalLM): Hugging Face model, already loaded and prepared. tokenizer (AutoTokenizer): Hugging Face tokenizer, already loaded and prepared. model_task (str): Task for the Hugging Face model. 
persistent_db_dir (str): Directory for the persistent database. embeddings_cache (str): Path to cache Hugging Face embeddings. pdfpath (str): Path to the PDF file. query (str): Question by the user vectordb (ChromaDB): vectorstorer variable for retrieval. chat_history (list): A list with previous questions and answers, serves as context; by default it is empty (it may make the model allucinate) """ ### Create a text-generation pipeline and connect it to a ConversationalRetrievalChain pipe = pipeline(task, model=model, tokenizer=tokenizer, max_new_tokens = 2048, repetition_penalty = float(1.2), ) local_llm = HuggingFacePipeline(pipeline=pipe) llm_chain = ConversationalRetrievalChain.from_llm( llm=local_llm, chain_type="stuff", retriever=vectordb.as_retriever(search_kwargs={"k": 1}), return_source_documents=False, ) rst = llm_chain({"question": query, "chat_history": chat_history}) return rst try: tokenizer = AutoTokenizer.from_pretrained( mod, ) model = TASK_TO_MODEL[tsk].from_pretrained( mod, ) except Exception as e: import sys print(f"The error {e} occured while handling model and tokenizer loading: please ensure that the model you provided was correct and suitable for the specified task. Be also sure that the HF repository for the loaded model contains all the necessary files.", file=sys.stderr) sys.exit(1) ================================================ FILE: .v0_1_1/scripts/gemma-for-datasciences.ipynb ================================================ {"metadata":{"colab":{"provenance":[]},"kernelspec":{"name":"python3","display_name":"Python 3","language":"python"},"language_info":{"name":"python","version":"3.10.13","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kaggle":{"accelerator":"none","dataSources":[{"sourceId":8018995,"sourceType":"datasetVersion","datasetId":4724972},{"sourceId":11270,"sourceType":"modelInstanceVersion","isSourceIdPinned":true,"modelInstanceId":6216}],"dockerImageVersionId":30673,"isInternetEnabled":true,"language":"python","sourceType":"notebook","isGpuEnabled":false}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# gemma-2b AS A DATA SCIENCE TEACHER\n\n## 100% LOCAL, WITHOUT FINETUNING, WITH _YOUR OWN DATA_\n\n> _Information is the oil of the 21st century, and analytics is the combustion engine – Peter Sondergaard (Senior Vice President and the Global Head of Research at Gartner Inc)_\n\nIn a world where data are becoming more important with each day passing, data science is a fundamental discipline to master in order to understand and solve the upcoming challenges of the Big Data World.\n\nUnfortunately, data science is generally available to University-level students only, making it difficult for other people to access its concepts. This obstacle can be removed with the help of Large Language Models, such as _gemma-2b_.\n\nIn this notebook, we'll make our way through the jungle of data science thanks to _gemma-2b_, a simple pdf file titled **\"What is data science?\"** [[1]](#cite_note-1), ChromaDB vectorstores and Langchain, all elengatly written in python.\n\nThe final goal is to implement a simple, yet powerful, pipeline to generate a 100% local and fully-customizable LLM-based assistant that works with the user's data.\n\nLet's dive in!🛫\n\n\n[[1]](#cite_ref-1) Brodie, Michael. (2019). What Is Data Science?. 
10.1007/978-3-030-11821-1_8.","metadata":{"id":"cY7PiUMxHqWO"}},{"cell_type":"markdown","source":"# Build the environment\n\nFirst of all, we want everything set up the right way to work properly. To do so, we need to:\n\n1. Upload the pdf file in our workspace (we can simply create a dataset in Kaggle containing the pdf and add it as `input` to the notebook): in the following notebook example, we will name it \"/kaggle/input/what-is-datascience-docs/WhatisDataScienceFinalMay162018.pdf\". \n2. Install necessary dependencies\n3. Upload _gemma-2b_ model as Kaggle input\n4. Define useful functions to make our LLM-based data science assistant work","metadata":{"id":"YpQRNdd3NDT0"}},{"cell_type":"code","source":"# INSTALL NECESSARY DEPENDENCIES\n\n## Versions provided for the packages are not strict... Still, you may encounter issues if you use different ones\n\n! python3 -m pip install langchain-community==0.0.13 langchain==0.1.1 torch==2.1.2","metadata":{"id":"mTFqzkbzPQ5o","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# INSTALL NECESSARY DEPENDENCIES (pt2)\n\n! python3 -m pip install trl peft","metadata":{"id":"qb2GLlQZZan3","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# INSTALL NECESSARY DEPENDENCIES (pt3)\n\n! pip install pypdf==3.17.4","metadata":{"id":"GV6gIH8ucztn","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# INSTALL NECESSARY DEPENDENCIES (pt4)\n\n! pip install sentence_transformers==2.2.2","metadata":{"id":"lyEB5Rc_dOec","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# INSTALL NECESSARY DEPENDENCIES (pt6)\n\n! pip install chromadb==0.4.22","metadata":{"id":"B0onWX8heNch","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"from kaggle_secrets import UserSecretsClient\nuser_secrets = UserSecretsClient()\nhf_token = user_secrets.get_secret(\"HF_TOKEN\")","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# IMPORT gemma-2b MODEL FROM KAGGLE\n\n## To import the model, we'll be uploading the model directly from Kaggle input\n\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nmodel_checkpoint = \"/kaggle/input/gemma/transformers/2b/1\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_checkpoint, token=hf_token)\nmodel = AutoModelForCausalLM.from_pretrained(model_checkpoint, token=hf_token)","metadata":{"id":"lcy5ImzmSLcq","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# DEFINE USEFUL FUNCTIONS\n\n## To chat, we'll need to create a vectorized database from our pdf and then build\n## a retrieval Q&A chain\n\nimport time\nfrom langchain_community.llms import HuggingFacePipeline\nfrom langchain.storage import LocalFileStore\nfrom langchain.embeddings import CacheBackedEmbeddings\nfrom langchain_community.vectorstores import Chroma\nfrom langchain.text_splitter import CharacterTextSplitter\nfrom langchain_community.document_loaders import PyPDFLoader\nfrom langchain_community.embeddings import HuggingFaceEmbeddings\nfrom langchain.chains import ConversationalRetrievalChain\nfrom transformers import pipeline, AutoTokenizer, AutoModelForCausalLM\nimport os\n\ndef create_a_persistent_db(pdfpath, dbpath, cachepath) -> None:\n \"\"\"\n Creates a persistent database from a PDF file.\n\n Args:\n pdfpath (str): The path to the PDF file.\n dbpath (str): The path to the storage folder for the persistent LocalDB.\n cachepath (str): The 
path to the storage folder for the embeddings cache.\n \"\"\"\n print(\"Started the operation...\")\n a = time.time()\n loader = PyPDFLoader(pdfpath)\n documents = loader.load()\n\n ### Split the documents into smaller chunks for processing\n text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n texts = text_splitter.split_documents(documents)\n\n ### Use HuggingFace embeddings for transforming text into numerical vectors\n ### This operation can take a while the first time but, once you created your local database with\n ### cached embeddings, it should be a matter of seconds to load them!\n embeddings = HuggingFaceEmbeddings()\n store = LocalFileStore(\n os.path.join(\n cachepath, os.path.basename(pdfpath).split(\".\")[0] + \"_cache\"\n )\n )\n cached_embeddings = CacheBackedEmbeddings.from_bytes_store(\n underlying_embeddings=embeddings,\n document_embedding_cache=store,\n namespace=os.path.basename(pdfpath).split(\".\")[0],\n )\n\n b = time.time()\n print(\n f\"Embeddings successfully created and stored at {os.path.join(cachepath, os.path.basename(pdfpath).split('.')[0]+'_cache')} under namespace: {os.path.basename(pdfpath).split('.')[0]}\"\n )\n print(f\"To load and embed, it took: {b - a}\")\n\n persist_directory = os.path.join(\n dbpath, os.path.basename(pdfpath).split(\".\")[0] + \"_localDB\"\n )\n vectordb = Chroma.from_documents(\n documents=texts,\n embedding=cached_embeddings,\n persist_directory=persist_directory,\n )\n c = time.time()\n print(\n f\"Persistent database successfully created and stored at {os.path.join(dbpath, os.path.basename(pdfpath).split('.')[0] + '_localDB')}\"\n )\n print(f\"To create a persistent database, it took: {c - b}\")\n return vectordb\n\ndef just_chatting(\n model,\n tokenizer,\n query,\n vectordb,\n chat_history=[]\n):\n \"\"\"\n Implements a chat system using Hugging Face models and a persistent database.\n\n Args:\n model (AutoModelForCausalLM): Hugging Face model, already loaded and prepared.\n tokenizer (AutoTokenizer): Hugging Face tokenizer, already loaded and prepared.\n model_task (str): Task for the Hugging Face model.\n persistent_db_dir (str): Directory for the persistent database.\n embeddings_cache (str): Path to cache Hugging Face embeddings.\n pdfpath (str): Path to the PDF file.\n query (str): Question by the user\n vectordb (ChromaDB): vectorstorer variable for retrieval.\n chat_history (list): A list with previous questions and answers, serves as context; by default it is empty (it may make the model allucinate)\n \"\"\"\n ### Create a text-generation pipeline and connect it to a ConversationalRetrievalChain\n pipe = pipeline(\"text-generation\",\n model=model,\n tokenizer=tokenizer,\n max_new_tokens = 2048,\n repetition_penalty = float(10),\n )\n\n local_llm = HuggingFacePipeline(pipeline=pipe)\n llm_chain = ConversationalRetrievalChain.from_llm(\n llm=local_llm,\n chain_type=\"stuff\",\n retriever=vectordb.as_retriever(search_kwargs={\"k\": 1}),\n return_source_documents=False,\n )\n rst = llm_chain({\"question\": query, \"chat_history\": chat_history})\n return rst","metadata":{"id":"_8Tt0dtkgEfv","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Chat with the model\n\nTo chat with the model, we first have to build our local, persistent, database, and also compute embeddings: after that, we'll be able to chat with the model without problems!🚀","metadata":{"id":"pospjXN1a3lW"}},{"cell_type":"code","source":"# CREATE PERSISTENT DB\n\nfilepath = 
\"/kaggle/input/what-is-datascience-docs/WhatisDataScienceFinalMay162018.pdf\"\ndbpath = \"/kaggle/working/\"\ncachepath = \"/kaggle/working/\"\nvectordb = create_a_persistent_db(filepath, dbpath, cachepath)","metadata":{"id":"SSd1jia8bz5s","outputId":"ecc4c317-51f9-4cd3-a589-3138ccc39d23","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# CHAT WITH MODEL\n\nchat_history = []\nquery = \"Define datascience\"\nres = just_chatting(model, tokenizer, query, vectordb, chat_history=chat_history)\nchat_history.append([query, res[\"answer\"].replace(\"\\n\",\" \")])","metadata":{"id":"LxJHt3LneuGD","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"print(\" \".join[res[\"answer\"]])","metadata":{"id":"utFaudIyitNO","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Implement a simple chat GUI (local only)\n\nWant to interact more directly with your model, without going through that pythonic stuff? Let's implement a very simple and rudimental chat GUI, based on builtin package `tkinter`, to achieve this goal!🤯","metadata":{"id":"JlmAbwghlVrO"}},{"cell_type":"code","source":"import tkinter as tk\nfrom tkinter import scrolledtext\n\nclass ChatGUI:\n def __init__(self, master):\n self.master = master\n master.title(\"DataScienceAI\")\n\n self.chat_history = scrolledtext.ScrolledText(master, wrap=tk.WORD, width=40, height=15)\n self.chat_history.pack(padx=10, pady=10)\n\n self.user_input = tk.Entry(master, width=40)\n self.user_input.pack(padx=10, pady=10)\n\n self.send_button = tk.Button(master, text=\"Send\", command=self.send_message)\n self.send_button.pack(pady=10)\n\n # Set up initial conversation\n self.display_message(\"DataScienceAI: Hello! How can I help you today?\")\n\n def send_message(self):\n user_message = self.user_input.get()\n self.display_message(f\"You: {user_message}\")\n # Replace the next line with your chatbot logic to get a response\n chatbot_response = f\"DataScienceAI: {just_chatting(model, tokenizer, user_message, vectordb)[\"answer\"].replace(\"\\n\",\" \")}\"\n self.display_message(chatbot_response)\n self.user_input.delete(0, tk.END) # Clear the input field\n\n def display_message(self, message):\n self.chat_history.insert(tk.END, message + '\\n')\n self.chat_history.see(tk.END) # Scroll to the bottom\n\nif __name__ == \"__main__\":\n root = tk.Tk()\n chat_gui = ChatGUI(root)\n root.mainloop()","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Conclusions\n\nThis is it!\n\nWe built a simple assistant, fully customizable in terms of both the LLM employed (you can switch to _gemma-7b_ or to your favorite LLM) and the data you can make it work with (in this case is data sciences, but you can make it work also on a pdf about pallas' cats, if you want!)🐈.\n\nAnother important thing to note is that all of this is completely local, there is no need for hosted APIs, pay-as-you-go services or other things like that: everything is free to use, on your Desktop!\n\nThere are two main disadvantages in this approach: \n\n1. Performance-critical tasks, such as loading the model and making prediction, are heavily resource-dependent: to load big models (>1~2 GB) and to make them generate text, it is useful to have more than 16GB RAM and more than 4 CPU cores.\n2. Small (and old) models, such as _openai-community/gpt2_, can easily allucinate while generating text. 
This is generally prompt-dependent (meaning that they tend to produce trashy results on certain prompts more frequently than on other ones) and the issue almost totally resolves when employing large LLMs (_gemma-7b_ or _llama-7b_ would not-so-easily allucinate, for instance).\n\n### TLDR😵:\n\n**Pros**:\n- Simple and customizable\n- Use virtually any LLM you want\n- Use your own data\n- 100% local, 100% free, no payments or APIs\n\n**Cons**:\n- Performance might be resource-dependent for large LLMs (if you have >16GB RAM and >4 cores it shouldn't be a great problem)\n- Small LLMs can still allucinate","metadata":{"id":"hRy5ErJ_mkfV"}},{"cell_type":"markdown","source":"# References\n\n- Paul Mooney, Ashley Chow. (2024). Google – AI Assistants for Data Tasks with Gemma. Kaggle. https://kaggle.com/competitions/data-assistants-with-gemma\n- Brodie, Michael. (2019). What Is Data Science?. 10.1007/978-3-030-11821-1_8.\n","metadata":{"id":"wHWCmEp9oFsC"}}]} ================================================ FILE: .v0_1_1/scripts/gemma_for_datasciences.py ================================================ # -*- coding: utf-8 -*- """# gemma-2b AS A DATA SCIENCE TEACHER ## 100% LOCAL, WITHOUT FINETUNING, WITH _YOUR OWN DATA_ > _Information is the oil of the 21st century, and analytics is the combustion engine – Peter Sondergaard (Senior Vice President and the Global Head of Research at Gartner Inc)_ In a world where data are becoming more important with each day passing, data science is a fundamental discipline to master in order to understand and solve the upcoming challenges of the Big Data World. Unfortunately, data science is generally available to University-level students only, making it difficult for other people to access its concepts. This obstacle can be removed with the help of Large Language Models, such as _gemma-2b_. In this notebook, we'll make our way through the jungle of data science thanks to _gemma-2b_, a simple pdf file titled **"What is data science?"**, ChromaDB vectorstores and Langchain, all elengatly written in python. The final goal is to implement a simple, yet powerful, pipeline to generate a 100% local and fully-customizable LLM-based assistant that works with the user's data. Let's dive in!🛫 # Build the environment First of all, we want everything set up the right way to work properly. To do so, we need to: 1. Upload the pdf file in our workspace (we can simply create a dataset in Kaggle containing the pdf and add it as `input` to the notebook): in the following notebook example, we will name it "/kaggle/input/what-is-datascience-docs/WhatisDataScienceFinalMay162018.pdf". 2. Install necessary dependencies 3. Upload _gemma-2b_ model as Kaggle input 4. 
Define useful functions to make our LLM-based data science assistant work """ # IMPORT gemma-2b MODEL FROM KAGGLE ## To import the model, we'll be uploading the model directly from Kaggle input from transformers import AutoTokenizer, AutoModelForCausalLM model_checkpoint = "/kaggle/input/gemma/transformers/2b/1" hf_token = "YOUR_TOKEN" tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, token=hf_token) model = AutoModelForCausalLM.from_pretrained(model_checkpoint, token=hf_token) # DEFINE USEFUL FUNCTIONS ## To chat, we'll need to create a vectorized database from our pdf and then build ## a retrieval Q&A chain import time from langchain_community.llms import HuggingFacePipeline from langchain.storage import LocalFileStore from langchain.embeddings import CacheBackedEmbeddings from langchain_community.vectorstores import Chroma from langchain.text_splitter import CharacterTextSplitter from langchain_community.document_loaders import PyPDFLoader from langchain_community.embeddings import HuggingFaceEmbeddings from langchain.chains import ConversationalRetrievalChain from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM import os def create_a_persistent_db(pdfpath, dbpath, cachepath) -> None: """ Creates a persistent database from a PDF file. Args: pdfpath (str): The path to the PDF file. dbpath (str): The path to the storage folder for the persistent LocalDB. cachepath (str): The path to the storage folder for the embeddings cache. """ print("Started the operation...") a = time.time() loader = PyPDFLoader(pdfpath) documents = loader.load() ### Split the documents into smaller chunks for processing text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) texts = text_splitter.split_documents(documents) ### Use HuggingFace embeddings for transforming text into numerical vectors ### This operation can take a while the first time but, once you created your local database with ### cached embeddings, it should be a matter of seconds to load them! embeddings = HuggingFaceEmbeddings() store = LocalFileStore( os.path.join( cachepath, os.path.basename(pdfpath).split(".")[0] + "_cache" ) ) cached_embeddings = CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=store, namespace=os.path.basename(pdfpath).split(".")[0], ) b = time.time() print( f"Embeddings successfully created and stored at {os.path.join(cachepath, os.path.basename(pdfpath).split('.')[0]+'_cache')} under namespace: {os.path.basename(pdfpath).split('.')[0]}" ) print(f"To load and embed, it took: {b - a}") persist_directory = os.path.join( dbpath, os.path.basename(pdfpath).split(".")[0] + "_localDB" ) vectordb = Chroma.from_documents( documents=texts, embedding=cached_embeddings, persist_directory=persist_directory, ) c = time.time() print( f"Persistent database successfully created and stored at {os.path.join(dbpath, os.path.basename(pdfpath).split('.')[0] + '_localDB')}" ) print(f"To create a persistent database, it took: {c - b}") return vectordb def just_chatting( model, tokenizer, query, vectordb, chat_history=[] ): """ Implements a chat system using Hugging Face models and a persistent database. Args: model (AutoModelForCausalLM): Hugging Face model, already loaded and prepared. tokenizer (AutoTokenizer): Hugging Face tokenizer, already loaded and prepared. model_task (str): Task for the Hugging Face model. persistent_db_dir (str): Directory for the persistent database. embeddings_cache (str): Path to cache Hugging Face embeddings. 
pdfpath (str): Path to the PDF file. query (str): Question by the user vectordb (ChromaDB): vectorstorer variable for retrieval. chat_history (list): A list with previous questions and answers, serves as context; by default it is empty (it may make the model allucinate) """ ### Create a text-generation pipeline and connect it to a ConversationalRetrievalChain pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens = 2048, repetition_penalty = float(10), ) local_llm = HuggingFacePipeline(pipeline=pipe) llm_chain = ConversationalRetrievalChain.from_llm( llm=local_llm, chain_type="stuff", retriever=vectordb.as_retriever(search_kwargs={"k": 1}), return_source_documents=False, ) rst = llm_chain({"question": query, "chat_history": chat_history}) return rst """# Chat with the model To chat with the model, we first have to build our local, persistent, database, and also compute embeddings: after that, we'll be able to chat with the model without problems!🚀 """ # CREATE PERSISTENT DB filepath = "/kaggle/input/what-is-datascience-docs/WhatisDataScienceFinalMay162018.pdf" dbpath = "/kaggle/working/" cachepath = "/kaggle/working/" vectordb = create_a_persistent_db(filepath, dbpath, cachepath) # CHAT WITH MODEL chat_history = [] query = "Define datascience" res = just_chatting(model, tokenizer, query, vectordb, chat_history=chat_history) chat_history.append([query, res["answer"].replace("\n"," ")]) print(" ".join[res["answer"]]) """# Implement a simple chat GUI (local only) Want to interact more directly with your model, without going through that pythonic stuff? Let's implement a very simple and rudimental chat GUI, based on builtin package `tkinter`, to achieve this goal!🤯 """ import tkinter as tk from tkinter import scrolledtext class ChatGUI: def __init__(self, master): self.master = master master.title("DataScienceAI") self.chat_history = scrolledtext.ScrolledText(master, wrap=tk.WORD, width=40, height=15) self.chat_history.pack(padx=10, pady=10) self.user_input = tk.Entry(master, width=40) self.user_input.pack(padx=10, pady=10) self.send_button = tk.Button(master, text="Send", command=self.send_message) self.send_button.pack(pady=10) # Set up initial conversation self.display_message("DataScienceAI: Hello! How can I help you today?") def send_message(self): user_message = self.user_input.get() self.display_message(f"You: {user_message}") # Replace the next line with your chatbot logic to get a response chatbot_response = f"DataScienceAI: {just_chatting(model, tokenizer, user_message, vectordb)["answer"].replace("\n"," ")}" self.display_message(chatbot_response) self.user_input.delete(0, tk.END) # Clear the input field def display_message(self, message): self.chat_history.insert(tk.END, message + '\n') self.chat_history.see(tk.END) # Scroll to the bottom if __name__ == "__main__": root = tk.Tk() chat_gui = ChatGUI(root) root.mainloop() """# Conclusions This is it! We built a simple assistant, fully customizable in terms of both the LLM employed (you can switch to _gemma-7b_ or to your favorite LLM) and the data you can make it work with (in this case is data sciences, but you can make it work also on a pdf about pallas' cats, if you want!)🐈. Another important thing to note is that all of this is completely local, there is no need for hosted APIs, pay-as-you-go services or other things like that: everything is free to use, on your Desktop! There are two main disadvantages in this approach: 1. 
Performance-critical tasks, such as loading the model and making prediction, are heavily resource-dependent: to load big models (>1~2 GB) and to make them generate text, it is useful to have more than 16GB RAM and more than 4 CPU cores. 2. Small (and old) models, such as _openai-community/gpt2_, can easily allucinate while generating text. This is generally prompt-dependent (meaning that they tend to produce trashy results on certain prompts more frequently than on other ones) and the issue almost totally resolves when employing large LLMs (_gemma-7b_ or _llama-7b_ would not-so-easily allucinate, for instance). ### TLDR😵: **Pros**: - Simple and customizable - Use virtually any LLM you want - Use your own data - 100% local, 100% free, no payments or APIs **Cons**: - Performance might be resource-dependent for large LLMs (if you have >16GB RAM and >4 cores it shouldn't be a great problem) - Small LLMs can still allucinate # References - Paul Mooney, Ashley Chow. (2024). Google – AI Assistants for Data Tasks with Gemma. Kaggle. https://kaggle.com/competitions/data-assistants-with-gemma - Brodie, Michael. (2019). What Is Data Science?. 10.1007/978-3-030-11821-1_8. """ ================================================ FILE: LICENSE ================================================ Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. 
"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. 
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ================================================ FILE: README.md ================================================

# everything-ai

Your fully proficient, AI-powered and local chatbot assistant🤖


Flowchart for everything-ai

## Quickstart ### 1. Clone this repository ```bash git clone https://github.com/AstraBert/everything-ai.git cd everything-ai ``` ### 2. Set your `.env` file Modify: - the `VOLUME` variable, so that you can mount your local file system into the Docker container. - the `MODELS_PATH` variable, so that you can tell llama.cpp where you stored the GGUF models you downloaded. - the `MODEL` variable, so that you can tell llama.cpp what model to use (use the actual name of the gguf file, and do not forget the .gguf extension!) - the `MAX_TOKENS` variable, so that you can tell llama.cpp how many new tokens it can generate as output. An example of a `.env` file could be: ```bash VOLUME="c:/Users/User/:/User/" MODELS_PATH="c:/Users/User/.cache/llama.cpp/" MODEL="stories260K.gguf" MAX_TOKENS="512" ``` With this configuration, everything under "c:/Users/User/" on your local machine is available under "/User/" in your Docker container, and llama.cpp knows where to look for models, which model to load, and the maximum number of new tokens to generate as output. ### 3. Pull the necessary images ```bash docker pull astrabert/everything-ai:latest docker pull qdrant/qdrant:latest docker pull ghcr.io/ggerganov/llama.cpp:server ``` ### 4. Run the multi-container app ```bash docker compose up ``` ### 5. Go to `localhost:8760` and choose your assistant You will see something like this:
Task choice interface
Choose the task among: - *retrieval-text-generation*: uses the `qdrant` backend to build a retrieval-friendly knowledge base, which you can query and use to tune your model's responses. You have to pass either a pdf/a bunch of pdfs specified as comma-separated paths or a directory where all the pdfs of interest are stored (**DO NOT** provide both); you can also specify the language in which the PDF is written, using [ISO nomenclature](https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes) - **MULTILINGUAL** - *agnostic-text-generation*: ChatGPT-like text generation (no retrieval architecture), but supports every text-generation model on HF Hub (as long as your hardware supports it!) - **MULTILINGUAL** - *text-summarization*: summarize text and pdfs, supports every text-summarization model on HF Hub - **ENGLISH ONLY** - *image-generation*: stable diffusion, supports every text-to-image model on HF Hub - **MULTILINGUAL** - *image-generation-pollinations*: stable diffusion, uses the Pollinations AI API; if you choose 'image-generation-pollinations', you do not need to specify anything else apart from the task - **MULTILINGUAL** - *image-classification*: classify an image, supports every image-classification model on HF Hub - **ENGLISH ONLY** - *image-to-text*: describe an image, supports every image-to-text model on HF Hub - **ENGLISH ONLY** - *audio-classification*: classify audio files or microphone recordings, supports audio-classification models on HF Hub - *speech-recognition*: transcribe audio files or microphone recordings, supports automatic-speech-recognition models on HF Hub. - *video-generation*: generate a video from a text prompt, supports text-to-video models on HF Hub - **ENGLISH ONLY** - *protein-folding*: get the 3D structure of a protein from its amino-acid sequence, using the ESM-2 backbone model - **GPU ONLY** - *autotrain*: fine-tune a model on a specific downstream task with autotrain-advanced, just by specifying your HF username, your HF write token and the path to a yaml config file for the training - *spaces-api-supabase*: use the HF Spaces API in combination with Supabase PostgreSQL databases in order to unleash more powerful LLMs and larger RAG-oriented vector databases - **MULTILINGUAL** - *llama.cpp-and-qdrant*: same as *retrieval-text-generation*, but uses **llama.cpp** as the inference engine, so you MUST NOT specify a model - **MULTILINGUAL** - *build-your-llm*: build a customizable chat LLM combining a Qdrant database with your PDFs and the power of Anthropic, OpenAI, Cohere or Groq models: you just need an API key! To build the Qdrant database, you have to pass either a pdf/a bunch of pdfs specified as comma-separated paths or a directory where all the pdfs of interest are stored (**DO NOT** provide both); you can also specify the language in which the PDF is written, using [ISO nomenclature](https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes) - **MULTILINGUAL**, **LANGFUSE INTEGRATION** - *simply-chatting*: build a customizable chat LLM with the power of Anthropic, OpenAI, Cohere or Groq models (no RAG pipeline): you just need an API key! - **MULTILINGUAL**, **LANGFUSE INTEGRATION** - *fal-img2img*: use the [fal.ai](https://fal.ai) ComfyUI API to generate images starting from your PNG and JPEG images: you just need an API key! You can also customize the generation by working with prompts and seeds - **ENGLISH ONLY** - *image-retrieval-search*: search an image database by uploading a folder as database input.
The folder should have the following structure:

```
./
├── test/
|   ├── label1/
|   └── label2/
└── train/
    ├── label1/
    └── label2/
```

You can query the database starting from your own pictures.

### 6. Go to `localhost:7860` and start using your assistant

Once everything is ready, you can head over to `localhost:7860` and start using your assistant:
Chat interface
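If you picked *llama.cpp-and-qdrant* and the assistant does not answer, you can sanity-check the llama.cpp server directly: `compose.yaml` exposes it on port `8000`, and it accepts the same `/completion` requests that `docker/llama_cpp_int.py` sends. A minimal sketch (the prompt and token count below are just placeholders):

```bash
# Minimal sanity check for the llama.cpp server started by docker compose
# (assumes the default 8000:8000 port mapping from compose.yaml)
curl http://localhost:8000/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What are penguins?", "n_predict": 128}'
```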
================================================ FILE: _config.yml ================================================ theme: jekyll-theme-minimal ================================================ FILE: compose.yaml ================================================ networks: mynet: driver: bridge services: everything-ai: image: astrabert/everything-ai volumes: - $VOLUME networks: - mynet ports: - "7860:7860" - "8760:8760" qdrant: image: qdrant/qdrant ports: - "6333:6333" volumes: - "./qdrant_storage:/qdrant/storage" networks: - mynet llama_server: image: ghcr.io/ggerganov/llama.cpp:server ports: - "8000:8000" volumes: - "$MODELS_PATH:/models" networks: - mynet command: "-m /models/$MODEL --port 8000 --host 0.0.0.0 -n $MAX_TOKENS" ================================================ FILE: docker/Dockerfile ================================================ # Use an official Python runtime as a parent image FROM astrabert/everything-ai # Set the working directory in the container to /app WORKDIR /app # Add the current directory contents into the container at /app ADD . /app RUN pip install fal_client # Expose the port that the application will run on EXPOSE 8760 ENTRYPOINT [ "python3", "select_and_run.py" ] ================================================ FILE: docker/agnostic_text_generation.py ================================================ import gradio as gr from utils import Translation from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod model = AutoModelForCausalLM.from_pretrained(model_checkpoint) tokenizer = AutoTokenizer.from_pretrained(model_checkpoint) pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=2048, repetition_penalty=1.2, temperature=0.4) def reply(message, history): txt = Translation(message, "en") if txt.original == "en": response = pipe(message) return response[0]["generated_text"] else: translation = txt.translatef() response = pipe(translation) t = Translation(response[0]["generated_text"], txt.original) res = t.translatef() return res demo = gr.ChatInterface(fn=reply, title="Multilingual-Bloom Bot") demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/audio_classification.py ================================================ from transformers import pipeline from argparse import ArgumentParser import torch import gradio as gr import numpy as np argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod # Audio class classifier = pipeline(task="audio-classification", model=mod) def classify_text(audio): global classifier sr, data = audio short_tensor = data.astype(np.float32) res = classifier(short_tensor) return res[0]["label"] input_audio = gr.Audio( sources=["upload","microphone"], waveform_options=gr.WaveformOptions( waveform_color="#01C6FF", waveform_progress_color="#0066B4", skip_length=2, show_controls=False, ), ) demo = gr.Interface( title="everything-ai-audioclass", fn=classify_text, inputs=input_audio, outputs="text" ) if __name__ == "__main__": 
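    # Bind to 0.0.0.0 so the Gradio interface is reachable from outside the Docker container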
demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/autotrain_interface.py ================================================ import subprocess as sp import gradio as gr import subprocess as sp def build_command(hf_usr, hf_token, configpath): sp.run(f"export HF_USERNAME=\"{hf_usr}\"", shell=True) sp.run(f"export HF_TOKEN=\"{hf_token}\"", shell=True) sp.run(f"autotrain --config {configpath}", shell=True) return f"export HF_USERNAME={hf_usr}\nexport HF_TOKEN={hf_token}\nautotrain --config {configpath}" demo = gr.Interface( build_command, [ gr.Textbox( label="HF username", info="Your HF username", lines=3, value=f"your-cute-name", ), gr.Textbox( label="HF write token", info="An HF token that has write permissions on your repository", lines=3, value=f"your-powerful-token", ), gr.File(label="Yaml configuration file path" ) ], title="everything-ai-autotrain", outputs="textbox", theme=gr.themes.Base() ) if __name__ == "__main__": demo.launch(server_name="0.0.0.0", server_port=7860, share=False) ================================================ FILE: docker/build_your_llm.py ================================================ from langchain_anthropic import ChatAnthropic from langchain_cohere import ChatCohere from langchain_groq import ChatGroq from langchain_openai import ChatOpenAI from langchain_core.runnables.history import RunnableWithMessageHistory from langchain_community.chat_message_histories import SQLChatMessageHistory from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder import gradio as gr from argparse import ArgumentParser from qdrant_client import QdrantClient from sentence_transformers import SentenceTransformer from utils import * import os import subprocess as sp import time from langfuse.callback import CallbackHandler argparse = ArgumentParser() argparse.add_argument( "-pf", "--pdf_file", help="Single pdf file or N pdfs reported like this: /path/to/file1.pdf,/path/to/file2.pdf,...,/path/to/fileN.pdf (there is no strict naming, you just need to provide them comma-separated)", required=False, default="No file" ) argparse.add_argument( "-d", "--directory", help="Directory where all your pdfs of interest are stored", required=False, default="No directory" ) argparse.add_argument( "-l", "--language", help="Language of the written content contained in the pdfs", required=False, default="Same as query" ) args = argparse.parse_args() pdff = args.pdf_file dirs = args.directory lan = args.language if pdff.replace("\\","").replace("'","") != "None" and dirs.replace("\\","").replace("'","") == "No directory": pdfs = pdff.replace("\\","/").replace("'","").split(",") else: pdfs = [os.path.join(dirs.replace("\\","/").replace("'",""), f) for f in os.listdir(dirs.replace("\\","/").replace("'","")) if f.endswith(".pdf")] client = QdrantClient(host="host.docker.internal", port="6333") encoder = SentenceTransformer("all-MiniLM-L6-v2") pdfdb = PDFdatabase(pdfs, encoder, client) pdfdb.preprocess() pdfdb.collect_data() pdfdb.qdrant_collection_and_upload() sp.run("rm -rf memory.db", shell=True) def get_session_history(session_id): return SQLChatMessageHistory(session_id, "sqlite:///memory.db") NAME2CHAT = {"Cohere": ChatCohere, "claude-3-opus-20240229": ChatAnthropic, "claude-3-sonnet-20240229": ChatAnthropic, "claude-3-haiku-20240307": ChatAnthropic, "llama3-8b-8192": ChatGroq, "llama3-70b-8192": ChatGroq, "mixtral-8x7b-32768": ChatGroq, "gemma-7b-it": ChatGroq, "gpt-4o": ChatOpenAI, "gpt-3.5-turbo-0125": ChatOpenAI} 
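# NAME2APIKEY (below) maps each chat model to the environment variable its LangChain wrapper reads;
# reply() exports the user-supplied API key under that variable before instantiating the model.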
NAME2APIKEY = {"Cohere": "COHERE_API_KEY", "claude-3-opus-20240229": "ANTHROPIC_API_KEY", "claude-3-sonnet-20240229": "ANTHROPIC_API_KEY", "claude-3-haiku-20240307": "ANTHROPIC_API_KEY", "llama3-8b-8192": "GROQ_API_KEY", "llama3-70b-8192": "GROQ_API_KEY", "mixtral-8x7b-32768": "GROQ_API_KEY", "gemma-7b-it": "GROQ_API_KEY", "gpt-4o": "OPENAI_API_KEY", "gpt-3.5-turbo-0125": "OPENAI_API_KEY"} system_template = "You are an helpful assistant that can rely on this: {context} and on the previous message history as context, and from that you build a context and history-aware reply to this user input:" def build_langfuse_handler(langfuse_host, langfuse_pkey, langfuse_skey): if langfuse_host!="None" and langfuse_pkey!="None" and langfuse_skey!="None": langfuse_handler = CallbackHandler( public_key=langfuse_pkey, secret_key=langfuse_skey, host=langfuse_host ) return langfuse_handler, True else: return "No langfuse", False def reply(message, history, name, api_key, temperature, max_new_tokens,langfuse_host, langfuse_pkey, langfuse_skey, sessionid): global pdfdb os.environ[NAME2APIKEY[name]] = api_key if name == "Cohere": model = NAME2CHAT[name](temperature=temperature, max_tokens=max_new_tokens) else: model = NAME2CHAT[name](model=name,temperature=temperature, max_tokens=max_new_tokens) prompt_template = ChatPromptTemplate.from_messages( [("system", system_template), MessagesPlaceholder(variable_name="history"), ("human", "{input}")] ) lf_handler, truth = build_langfuse_handler(langfuse_host, langfuse_pkey, langfuse_skey) chain = prompt_template | model runnable_with_history = RunnableWithMessageHistory( chain, get_session_history, input_messages_key="input", history_messages_key="history", ) txt = Translation(message, "en") if txt.original == "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) if not truth: response = runnable_with_history.invoke({"context": results[0]["text"], "input": message}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! else: response = runnable_with_history.invoke({"context": results[0]["text"], "input": message}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]})##CONFIGURE! llm='' for char in response.content: llm+=char time.sleep(0.001) yield llm elif txt.original == "en" and lan.replace("\\","").replace("'","") != "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() if not truth: response = runnable_with_history.invoke({"context": res, "input": message}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! else: response = runnable_with_history.invoke({"context": res, "input": message}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]})##CONFIGURE! 
llm = '' for char in response.content: llm+=char time.sleep(0.001) yield llm elif txt.original != "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) transl = Translation(results[0]["text"], "en") translation = transl.translatef() if not truth: response = runnable_with_history.invoke({"context": translation, "input": message}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! else: response = runnable_with_history.invoke({"context": translation, "input": message}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]})##CONFIGURE! t = Translation(response.content, txt.original) res = t.translatef() llm = '' for char in res: llm+=char time.sleep(0.001) yield llm else: txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() if not truth: response = runnable_with_history.invoke({"context": res, "input": message}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! else: response = runnable_with_history.invoke({"context": res, "input": message}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]})##CONFIGURE! tr = Translation(response.content, txt.original) ress = tr.translatef() llm = '' for char in ress: llm+=char time.sleep(0.001) yield llm chat_model = gr.Dropdown( [m for m in list(NAME2APIKEY)], label="Chat Model", info="Choose one of the available chat models" ) user_api_key = gr.Textbox( label="API key", info="Paste your API key here", lines=1, type="password", ) user_temperature = gr.Slider(0, 1, value=0.5, label="Temperature", info="Select model temperature") user_max_new_tokens = gr.Slider(0, 8192, value=1024, label="Max new tokens", info="Select max output tokens (higher number of tokens will result in a longer latency)") user_lf_host = gr.Textbox(label="LangFuse Host",info="Provide LangFuse host URL, or type 'None' if you do not wish to use LangFuse",value="https://cloud.langfuse.com") user_lf_pkey = gr.Textbox(label="LangFuse Public Key",info="Provide LangFuse Public key, or type 'None' if you do not wish to use LangFuse",value="pk-*************************", type="password") user_lf_skey = gr.Textbox(label="LangFuse Secret Key",info="Provide LangFuse Secret key, or type 'None' if you do not wish to use LangFuse",value="sk-*************************", type="password") user_session_id = gr.Textbox(label="Session ID",info="This alphanumeric code will link model reply to a specific message history of which the models will be aware when replying. 
Changing it will result in the loss of memory for your model",value="1") additional_accordion = gr.Accordion(label="Parameters to be set before you start chatting", open=True) demo = gr.ChatInterface(fn=reply, additional_inputs=[chat_model, user_api_key, user_temperature, user_max_new_tokens, user_lf_host, user_lf_pkey, user_lf_skey, user_session_id], additional_inputs_accordion=additional_accordion, title="everything-ai-buildyourllm") if __name__=="__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/chat_your_llm.py ================================================ from langchain_anthropic import ChatAnthropic from langchain_cohere import ChatCohere from langchain_groq import ChatGroq from langchain_openai import ChatOpenAI from langchain_core.output_parsers import StrOutputParser from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder from langchain_core.runnables.history import RunnableWithMessageHistory from langchain_community.chat_message_histories import SQLChatMessageHistory from utils import Translation import time import os from langfuse.callback import CallbackHandler import gradio as gr import subprocess as sp NAME2CHAT = {"Cohere": ChatCohere, "claude-3-opus-20240229": ChatAnthropic, "claude-3-sonnet-20240229": ChatAnthropic, "claude-3-haiku-20240307": ChatAnthropic, "llama3-8b-8192": ChatGroq, "llama3-70b-8192": ChatGroq, "mixtral-8x7b-32768": ChatGroq, "gemma-7b-it": ChatGroq, "gpt-4o": ChatOpenAI, "gpt-3.5-turbo-0125": ChatOpenAI} NAME2APIKEY = {"Cohere": "COHERE_API_KEY", "claude-3-opus-20240229": "ANTHROPIC_API_KEY", "claude-3-sonnet-20240229": "ANTHROPIC_API_KEY", "claude-3-haiku-20240307": "ANTHROPIC_API_KEY", "llama3-8b-8192": "GROQ_API_KEY", "llama3-70b-8192": "GROQ_API_KEY", "mixtral-8x7b-32768": "GROQ_API_KEY", "gemma-7b-it": "GROQ_API_KEY", "gpt-4o": "OPENAI_API_KEY", "gpt-3.5-turbo-0125": "OPENAI_API_KEY"} sp.run("rm -rf memory.db", shell=True) def build_langfuse_handler(langfuse_host, langfuse_pkey, langfuse_skey): if langfuse_host!="None" and langfuse_pkey!="None" and langfuse_skey!="None": langfuse_handler = CallbackHandler( public_key=langfuse_pkey, secret_key=langfuse_skey, host=langfuse_host ) return langfuse_handler, True else: return "No langfuse", False def get_session_history(session_id): return SQLChatMessageHistory(session_id, "sqlite:///chatmemory.db") def reply(message, history, name, api_key, temperature, max_new_tokens,langfuse_host, langfuse_pkey, langfuse_skey, system_template, sessionid): os.environ[NAME2APIKEY[name]] = api_key if name == "Cohere": model = NAME2CHAT[name](temperature=temperature, max_tokens=max_new_tokens) else: model = NAME2CHAT[name](model=name,temperature=temperature, max_tokens=max_new_tokens) prompt_template = ChatPromptTemplate.from_messages( [("system", system_template), MessagesPlaceholder(variable_name="history"), ("human", "{input}")] ) lf_handler, truth = build_langfuse_handler(langfuse_host, langfuse_pkey, langfuse_skey) chain = prompt_template | model runnable_with_history = RunnableWithMessageHistory( chain, get_session_history, input_messages_key="input", history_messages_key="history", ) txt = Translation(message, "en") if txt.original == "en": if not truth: response = runnable_with_history.invoke({"input": message}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! 
else: response = runnable_with_history.invoke({"input": message}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]}) r = '' for c in response.content: r+=c time.sleep(0.001) yield r else: translation = txt.translatef() if not truth: response = runnable_with_history.invoke({"input": translation}, config={"configurable": {"session_id": sessionid}})##CONFIGURE! else: response = runnable_with_history.invoke({"input": translation}, config={"configurable": {"session_id": sessionid}, "callbacks": [lf_handler]}) t = Translation(response.content, txt.original) res = t.translatef() r = '' for c in res: r+=c time.sleep(0.001) yield r chat_model = gr.Dropdown( [m for m in list(NAME2APIKEY)], label="Chat Model", info="Choose one of the available chat models" ) user_api_key = gr.Textbox( label="API key", info="Paste your API key here", lines=1, type="password", ) user_temperature = gr.Slider(0, 1, value=0.5, label="Temperature", info="Select model temperature") user_max_new_tokens = gr.Slider(0, 8192, value=1024, label="Max new tokens", info="Select max output tokens (higher number of tokens will result in a longer latency)") user_lf_host = gr.Textbox(label="LangFuse Host",info="Provide LangFuse host URL, or type 'None' if you do not wish to use LangFuse",value="https://cloud.langfuse.com") user_lf_pkey = gr.Textbox(label="LangFuse Public Key",info="Provide LangFuse Public key, or type 'None' if you do not wish to use LangFuse",value="pk-*************************", type="password") user_lf_skey = gr.Textbox(label="LangFuse Secret Key",info="Provide LangFuse Secret key, or type 'None' if you do not wish to use LangFuse",value="sk-*************************", type="password") user_template = gr.Textbox(label="System Template",info="Customize your assistant with your instructions",value="You are an helpful assistant") user_session_id = gr.Textbox(label="Session ID",info="This alphanumeric code will link model reply to a specific message history of which the models will be aware when replying. 
Changing it will result in the loss of memory for your model",value="1") additional_accordion = gr.Accordion(label="Parameters to be set before you start chatting", open=True) demo = gr.ChatInterface(fn=reply, additional_inputs=[chat_model, user_api_key, user_temperature, user_max_new_tokens, user_lf_host, user_lf_pkey, user_lf_skey, user_template, user_session_id], additional_inputs_accordion=additional_accordion, title="everything-ai-simplychatting") if __name__=="__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/fal_img2img.py ================================================ import asyncio import fal_client import os import gradio as gr from PIL import Image MAP_EXTS = {"jpg": "jpeg", "jpeg": "jpeg", "png": "png"} async def submit(image_path, prompt, seed): ext = image_path.split(".")[1] handler = await fal_client.submit_async( "comfy/astrabert/image2image", arguments={ "ksampler_seed": seed, "cliptextencode_text": prompt, "image_load_image_path": f"data:image/{MAP_EXTS[ext]};base64,{image_path}" }, ) result = await handler.get() return result def get_url(results): url = results['outputs'][list(results['outputs'].keys())[0]]['images'][0]['url'] nm = results['outputs'][list(results['outputs'].keys())[0]]['images'][0]['filename'] return f"![{nm}]({url})" def render_image(api_key, image_path, prompt, seed): os.environ["FAL_KEY"] = api_key results = asyncio.run(submit(image_path, prompt, int(seed))) url = get_url(results) img = Image.open(image_path) return img, url demo = gr.Interface(render_image, inputs=[gr.Textbox(label="API key", type="password", value="fal-******************"), gr.File(label="PNG/JPEG Image"), gr.Textbox(label="Prompt", info="Specify how you would like the image generation to be"), gr.Textbox(label="Seed", info="Pass your seed here (if not interested, leave it as it is)", value="123498235498246")], outputs=[gr.Image(label="Your Base Image"), gr.Markdown(label="Generated Image")], title="everything-ai-img2img") if __name__=="__main__": demo.launch(server_name="0.0.0.0", server_port=7860) ================================================ FILE: docker/image_classification.py ================================================ from transformers import AutoModelForImageClassification, AutoImageProcessor, pipeline from PIL import Image from argparse import ArgumentParser import torch argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model = AutoModelForImageClassification.from_pretrained(model_checkpoint).to(device) processor = AutoImageProcessor.from_pretrained(model_checkpoint) pipe = pipeline("image-classification", model=model, image_processor=processor) def get_results(image, ppln=pipe): img = Image.fromarray(image) result = ppln(img) scores = [] labels = [] for el in result: scores.append(el["score"]) labels.append(el["label"]) return labels[scores.index(max(scores))] import gradio as gr ## Build interface with loaded image + ouput from the model demo = gr.Interface(get_results, gr.Image(), "text") if __name__ == "__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/image_generation.py ================================================ from diffusers 
import DiffusionPipeline import torch from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod pipe = DiffusionPipeline.from_pretrained(model_checkpoint, torch_dtype=torch.float32) import gradio as gr from utils import Translation def reply(message, history): txt = Translation(message, "en") if txt.original == "en": image = pipe(message).images[0] image.save("generated_image.png") return "Here's your image:\n![generated_image](generated_image.png)" else: translation = txt.translatef() image = pipe(translation).images[0] image.save("generated_image.png") t = Translation("Here's your image:", txt.original) res = t.translatef() return f"{res}:\n![generated_image](generated_image.png)" demo = gr.ChatInterface(fn=reply, title="everything-ai-sd-imgs") demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/image_generation_pollinations.py ================================================ import gradio as gr from utils import Translation def reply(message, history): txt = Translation(message, "en") if txt.original == "en": image = f"https://pollinations.ai/p/{message.replace(' ', '_')}" return f"Here's your image:\n![generated_image]({image})" else: translation = txt.translatef() image = f"https://pollinations.ai/p/{translation.replace(' ', '_')}" t = Translation("Here's your image:", txt.original) res = t.translatef() return f"{res}:\n![generated_image]({image})" demo = gr.ChatInterface(fn=reply, title="everything-ai-pollinations-imgs") demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/image_to_text.py ================================================ import torch from transformers import pipeline from PIL import Image from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod device = torch.device("cuda" if torch.cuda.is_available() else "cpu") pipe = pipeline("image-to-text", model=model_checkpoint, device=device) def get_results(image, ppln=pipe): img = Image.fromarray(image) result = ppln(img, prompt="", generate_kwargs={"max_new_tokens": 1024}) return result[0]["generated_text"].capitalize() import gradio as gr ## Build interface with loaded image + ouput from the model demo = gr.Interface(get_results, gr.Image(), "text") if __name__ == "__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/llama_cpp_int.py ================================================ from utils import Translation, PDFdatabase, NeuralSearcher import gradio as gr from qdrant_client import QdrantClient from sentence_transformers import SentenceTransformer from argparse import ArgumentParser import os argparse = ArgumentParser() argparse.add_argument( "-pf", "--pdf_file", help="Single pdf file or N pdfs reported like this: /path/to/file1.pdf,/path/to/file2.pdf,...,/path/to/fileN.pdf (there is no strict naming, you just need to provide them comma-separated)", required=False, default="No file" ) argparse.add_argument( "-d", "--directory", help="Directory where all your 
pdfs of interest are stored", required=False, default="No directory" ) argparse.add_argument( "-l", "--language", help="Language of the written content contained in the pdfs", required=False, default="Same as query" ) args = argparse.parse_args() pdff = args.pdf_file dirs = args.directory lan = args.language if pdff.replace("\\","").replace("'","") != "None" and dirs.replace("\\","").replace("'","") == "No directory": pdfs = pdff.replace("\\","/").replace("'","").split(",") else: pdfs = [os.path.join(dirs.replace("\\","/").replace("'",""), f) for f in os.listdir(dirs.replace("\\","/").replace("'","")) if f.endswith(".pdf")] client = QdrantClient(host="host.docker.internal", port="6333") encoder = SentenceTransformer("all-MiniLM-L6-v2") pdfdb = PDFdatabase(pdfs, encoder, client) pdfdb.preprocess() pdfdb.collect_data() pdfdb.qdrant_collection_and_upload() import requests def llama_cpp_respond(query, max_new_tokens): url = "http://localhost:8000/completion" headers = { "Content-Type": "application/json" } data = { "prompt": query, "n_predict": int(max_new_tokens) } response = requests.post(url, headers=headers, json=data) a = response.json() return a["content"] def reply(max_new_tokens, message): global pdfdb txt = Translation(message, "en") if txt.original == "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) response = llama_cpp_respond(f"Context: {results[0]["text"]}, prompt: {message}", max_new_tokens) return response elif txt.original == "en" and lan.replace("\\","").replace("'","") != "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() response = llama_cpp_respond(f"Context: {res}, prompt: {message}", max_new_tokens) return response elif txt.original != "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) transl = Translation(results[0]["text"], "en") translation = transl.translatef() response = llama_cpp_respond(f"Context: {translation}, prompt: {message}", max_new_tokens) t = Translation(response, txt.original) res = t.translatef() return res else: txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() response = llama_cpp_respond(f"Context: {res}, prompt: {message}", max_new_tokens) tr = Translation(response, txt.original) ress = tr.translatef() return ress demo = gr.Interface( reply, [ gr.Textbox( label="Max new tokens", info="The number reported should not be higher than the one specified within the .env file", lines=3, value=f"512", ), gr.Textbox( label="Input query", info="Write your input query here", lines=3, value=f"What are penguins?", ) ], title="everything-ai-llamacpp", outputs="textbox" ) demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/protein_folding_with_esm.py ================================================ from transformers import AutoTokenizer, EsmForProteinFolding from transformers.models.esm.openfold_utils.protein import 
to_pdb, Protein as OFProtein from transformers.models.esm.openfold_utils.feats import atom14_to_atom37 import gradio as gr from gradio_molecule3d import Molecule3D reps = [ { "model": 0, "chain": "", "resname": "", "style": "stick", "color": "whiteCarbon", "residue_range": "", "around": 0, "byres": False, "visible": False } ] def read_mol(molpath): with open(molpath, "r") as fp: lines = fp.readlines() mol = "" for l in lines: mol += l return mol def molecule(input_pdb): mol = read_mol(input_pdb) x = ( """
""" ) return f"""""" def convert_outputs_to_pdb(outputs): final_atom_positions = atom14_to_atom37(outputs["positions"][-1], outputs) outputs = {k: v.to("cpu").numpy() for k, v in outputs.items()} final_atom_positions = final_atom_positions.cpu().numpy() final_atom_mask = outputs["atom37_atom_exists"] pdbs = [] for i in range(outputs["aatype"].shape[0]): aa = outputs["aatype"][i] pred_pos = final_atom_positions[i] mask = final_atom_mask[i] resid = outputs["residue_index"][i] + 1 pred = OFProtein( aatype=aa, atom_positions=pred_pos, atom_mask=mask, residue_index=resid, b_factors=outputs["plddt"][i], chain_index=outputs["chain_index"][i] if "chain_index" in outputs else None, ) pdbs.append(to_pdb(pred)) return pdbs tokenizer = AutoTokenizer.from_pretrained("facebook/esmfold_v1") model = EsmForProteinFolding.from_pretrained("facebook/esmfold_v1", low_cpu_mem_usage=True) model = model.cuda() model.esm = model.esm.half() import torch torch.backends.cuda.matmul.allow_tf32 = True model.trunk.set_chunk_size(64) def fold_protein(test_protein): tokenized_input = tokenizer([test_protein], return_tensors="pt", add_special_tokens=False)['input_ids'] tokenized_input = tokenized_input.cuda() with torch.no_grad(): output = model(tokenized_input) pdb = convert_outputs_to_pdb(output) with open("output_structure.pdb", "w") as f: f.write("".join(pdb)) html = molecule("output_structure.pdb") return html, "output_structure.pdb" iface = gr.Interface( title="everything-ai-proteinfold", fn=fold_protein, inputs=gr.Textbox( label="Protein Sequence", info="Find sequences examples below, and complete examples with images at: https://github.com/AstraBert/proteinviz/tree/main/examples.md; if you input a sequence, you're gonna get the static image and the 3D model to explore and play with", lines=5, value=f"Paste or write amino-acidic sequence here", ), outputs=[gr.HTML(label="Protein 3D model"), Molecule3D(label="Molecular 3D model", reps=reps)], ) iface.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/requirements.txt ================================================ langchain-community==0.0.13 langchain==0.1.1 pypdf==3.17.4 sentence_transformers==2.2.2 transformers==4.39.3 langdetect==1.0.9 deep-translator==1.11.4 torch==2.1.2 gradio==4.36.0 diffusers==0.27.2 pydantic==2.6.4 qdrant_client==1.9.0 pillow==10.2.0 datasets==2.15.0 accelerate ================================================ FILE: docker/retrieval_image_search.py ================================================ from transformers import AutoImageProcessor, AutoModel from utils import ImageDB from PIL import Image from qdrant_client import QdrantClient import gradio as gr from argparse import ArgumentParser import torch argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) argparse.add_argument( "-id", "--image_dimension", help="Dimension of the image (e.g. 
512, 758, 384...)", required=False, default=512, type=int ) argparse.add_argument( "-d", "--directory", help="Directory where all your pdfs of interest are stored", required=False, default="No directory" ) args = argparse.parse_args() mod = args.model dirs = args.directory imd = args.image_dimension device = torch.device("cuda" if torch.cuda.is_available() else "cpu") processor = AutoImageProcessor.from_pretrained(mod) model = AutoModel.from_pretrained(mod).to(device) client = QdrantClient(host="host.docker.internal", port=6333) imdb = ImageDB(dirs, processor, model, client, imd) print(imdb.collection_name) imdb.create_dataset() imdb.to_collection() def see_images(dataset, results): images = [] for i in range(len(results)): img = dataset[results[0].id]['image'] images.append(img) return images def process_img(image): global imdb results = imdb.searchDB(Image.fromarray(image)) images = see_images(imdb.dataset, results) return images iface = gr.Interface( title="everything-ai-retrievalimg", fn=process_img, inputs=gr.Image(label="Input Image"), outputs=gr.Gallery(label="Matching Images"), ) iface.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/retrieval_text_generation.py ================================================ from utils import Translation, PDFdatabase, NeuralSearcher import gradio as gr from qdrant_client import QdrantClient from sentence_transformers import SentenceTransformer from argparse import ArgumentParser from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline import torch import os argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) argparse.add_argument( "-pf", "--pdf_file", help="Single pdf file or N pdfs reported like this: /path/to/file1.pdf,/path/to/file2.pdf,...,/path/to/fileN.pdf (there is no strict naming, you just need to provide them comma-separated)", required=False, default="No file" ) argparse.add_argument( "-d", "--directory", help="Directory where all your pdfs of interest are stored", required=False, default="No directory" ) argparse.add_argument( "-l", "--language", help="Language of the written content contained in the pdfs", required=False, default="Same as query" ) args = argparse.parse_args() mod = args.model pdff = args.pdf_file dirs = args.directory lan = args.language if pdff.replace("\\","").replace("'","") != "None" and dirs.replace("\\","").replace("'","") == "No directory": pdfs = pdff.replace("\\","/").replace("'","").split(",") else: pdfs = [os.path.join(dirs.replace("\\","/").replace("'",""), f) for f in os.listdir(dirs.replace("\\","/").replace("'","")) if f.endswith(".pdf")] client = QdrantClient(host="host.docker.internal", port="6333") encoder = SentenceTransformer("all-MiniLM-L6-v2") pdfdb = PDFdatabase(pdfs, encoder, client) pdfdb.preprocess() pdfdb.collect_data() pdfdb.qdrant_collection_and_upload() device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model = AutoModelForCausalLM.from_pretrained(mod).to(device) tokenizer = AutoTokenizer.from_pretrained(mod) pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=2048, repetition_penalty=1.2, temperature=0.4) def reply(message, history): global pdfdb txt = Translation(message, "en") if txt.original == "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) 
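        # Prompt the text-generation pipeline with the top retrieved chunk as context plus the user message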
response = pipe(f"Context: {results[0]["text"]}, prompt: {message}") return response[0]["generated_text"] elif txt.original == "en" and lan.replace("\\","").replace("'","") != "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() response = pipe(f"Context: {res}, prompt: {message}") return response[0]["generated_text"] elif txt.original != "en" and lan.replace("\\","").replace("'","") == "None": txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) results = txt2txt.search(message) transl = Translation(results[0]["text"], "en") translation = transl.translatef() response = pipe(f"Context: {translation}, prompt: {message}") t = Translation(response[0]["generated_text"], txt.original) res = t.translatef() return res else: txt2txt = NeuralSearcher(pdfdb.collection_name, pdfdb.client, pdfdb.encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0]["text"], txt.original) res = t.translatef() response = pipe(f"Context: {res}, prompt: {message}") tr = Translation(response[0]["generated_text"], txt.original) ress = tr.translatef() return ress demo = gr.ChatInterface(fn=reply, title="everything-ai-retrievaltext") demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/select_and_run.py ================================================ import subprocess as sp import gradio as gr TASK_TO_SCRIPT = {"retrieval-text-generation": "retrieval_text_generation.py", "agnostic-text-generation": "agnostic_text_generation.py", "text-summarization": "text_summarization.py", "image-generation": "image_generation.py", "image-generation-pollinations": "image_generation_pollinations.py", "image-classification": "image_classification.py", "image-to-text": "image_to_text.py", "retrieval-image-search": "retrieval_image_search.py", "protein-folding": "protein_folding_with_esm.py", "video-generation": "video_generation.py", "speech-recognition": "speech_recognition.py", "spaces-api-supabase": "spaces_api_supabase.py", "audio-classification": "audio_classification.py", "autotrain": "autotrain_interface.py", "llama.cpp-and-qdrant": "llama_cpp_int.py", "build-your-llm": "build_your_llm.py", "simply-chatting": "chat_your_llm.py", "fal-img2img": "fal_img2img.py"} def build_command(tsk, mod="None", pdff="None", dirs="None", lan="None", imdim="512", gradioclient="None", supabaseurl="None", collectname="None", supenc="all-MiniLM-L6-v2", supdim="384"): if tsk != "retrieval-text-generation" and tsk != "image-generation-pollinations" and tsk != "retrieval-image-search" and tsk != "autotrain" and tsk != "protein-folding" and tsk != "spaces-api-supabase" and tsk != "llama.cpp-and-qdrant" and tsk!="build-your-llm" and tsk!="simply-chatting" and tsk!="fal-img2img": sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -m {mod}", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]} -m {mod}" elif tsk == "retrieval-text-generation": sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -m {mod} -pf '{pdff}' -d '{dirs}' -l '{lan}'", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]} -m {mod} -pf '{pdff}' -d '{dirs}' -l '{lan}'" elif tsk == "llama.cpp-and-qdrant" or tsk== "build-your-llm": sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -pf '{pdff}' -d '{dirs}' -l 
'{lan}'", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]} -pf '{pdff}' -d '{dirs}' -l '{lan}'" elif tsk == "image-generation-pollinations" or tsk == "autotrain" or tsk == "protein-folding" or tsk=="simply-chatting" or tsk=="fal-img2img": sp.run(f"python3 {TASK_TO_SCRIPT[tsk]}", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]}" elif tsk == "spaces-api-supabase": if lan == "None": sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -gc {gradioclient} -sdu {supabaseurl} -cn {collectname} -en {supenc} -s {supdim}", shell=True) else: sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -gc {gradioclient} -sdu {supabaseurl} -cn {collectname} -en {supenc} -s {supdim} -l {lan}", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]} -gc {gradioclient} -sdu {supabaseurl} -cn {collectname} -en {supenc} -s {supdim} -l {lan}" else: sp.run(f"python3 {TASK_TO_SCRIPT[tsk]} -d {dirs} -id {imdim} -m {mod}", shell=True) return f"python3 {TASK_TO_SCRIPT[tsk]} -d {dirs} -id {imdim} -m {mod}" demo = gr.Interface( build_command, [ gr.Textbox( label="Task", info="Task you want your assistant to help you with", lines=3, value=f"Choose one of the following: {','.join(list(TASK_TO_SCRIPT.keys()))}; if you choose 'image-generation-pollinations' or 'autotrain' or 'protein-folding' or 'simply-chatting' or 'fal-img2img', you do not need to specify anything else. If you choose 'spaces-api-supabase' you need to specify the Spaces API client, the database URL, the collection name, the Sentence-Transformers encoder used to upload the vectors to the Supabase database and the vectors size (optionally also the language)", ), gr.Textbox( label="Model", info="AI model you want your assistant to run with", lines=3, value="None", ), gr.Textbox( label="PDF file(s)", info="Single pdf file or N pdfs reported like this: /path/to/file1.pdf,/path/to/file2.pdf,...,/path/to/fileN.pdf (there is no strict naming, you just need to provide them comma-separated), please do not use '\\' as path separators: only available with 'retrieval-text-generation'", lines=3, value="No file", ), gr.Textbox( label="Directory", info="Directory where all your pdfs or images (.jpg, .jpeg, .png) of interest are stored (only available with 'retrieval-text-generation' for pdfs and 'retrieval-image-search' for images). 
Please do not use '\\' as path separators", lines=3, value="No directory", ), gr.Textbox( label="Language", info="Language of the written content contained in the pdfs", lines=1, value="None", ), gr.Textbox( label="Image dimension", info="Dimension of the image (this is generally model and/or task-dependent!)", lines=1, value=f"e.g.: 512, 384, 758...", ), gr.Textbox( label="Spaces API client", info="Client for Spaces API", lines=3, value=f"e.g.: eswardivi/Phi-3-mini-4k-instruct", ), gr.Textbox( label="Supabase Database URL", info="URL of the Supabase database (to use with Spaces API)", lines=3, value=f"e.g.: postgresql://postgres.reneogdbgdsbgdbgdsgbdlf:yourcomplexpasswordhere@aws-0-eu-central-1.pooler.supabase.com:5432/postgres", ), gr.Textbox( label="Supabase collection name", info="Name of the Supabase collectio (to use with Spaces API)", lines=2, value=f"e.g.: documents", ), gr.Textbox( label="Supabase Vector Encoder", info="Name of the sentence-transformers encoder you used to upload vectors to your supabase database", lines=2, value=f"e.g.: all-MiniLM-L6-v2", ), gr.Textbox( label="Supabase Vector Size", info="Size of vectors in you supabase database", lines=1, value=f"e.g.: 384", ), ], outputs="textbox", theme=gr.themes.Base(), title="everything-ai" ) if __name__ == "__main__": demo.launch(server_name="0.0.0.0", server_port=8760, share=False) ================================================ FILE: docker/spaces_api_supabase.py ================================================ import gradio as gr from utils import Translation, NeuralSearcheR from gradio_client import Client import os import vecs from sentence_transformers import SentenceTransformer from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-gc", "--gradio_client", help="Spaces API to connect with", required=True, ) argparse.add_argument( "-sdu", "--supabase_database_url", help="URL for Supabase database", required=True ) argparse.add_argument( "-cn", "--collection_name", help="Name of the Supabase collection", required=True ) argparse.add_argument( "-l", "--language", help="Language of the written content contained in the pdfs", required=False, default="en" ) argparse.add_argument( "-en", "--encoder", help="Encoder used in text vectorization", required=False, default="all-MiniLM-L6-v2" ) argparse.add_argument( "-s", "--size", help="Size of the vectors", required=False, default=384, type=int ) args = argparse.parse_args() gradcli = args.gradio_client supdb = args.supabase_database_url collname = args.collection_name lan = args.language encd = args.encoder sz = args.size collection_name = collname encoder = SentenceTransformer(encd) client = supdb api_client = Client(gradcli) lan = "en" vx = vecs.create_client(client) docs = vx.get_or_create_collection(name=collection_name, dimension=sz) def reply(message, history): global docs global encoder global api_client global lan txt = Translation(message, "en") print(txt.original, lan) if txt.original == "en" and lan == "en": txt2txt = NeuralSearcheR(docs, encoder) results = txt2txt.search(message) response = api_client.predict( f"Context: {results[0][2]['Content']}; Prompt: {message}", # str in 'Message' Textbox component 0.4, # float (numeric value between 0 and 1) in 'Temperature' Slider component True, # bool in 'Sampling' Checkbox component 512, # float (numeric value between 128 and 4096) in 'Max new tokens' Slider component api_name="/chat" ) return response elif txt.original == "en" and lan != "en": txt2txt = NeuralSearcheR(docs, encoder) 
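        # Query is in English but the documents are not: translate the query into the documents' language before searching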
transl = Translation(message, lan) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0][2]['Content'], txt.original) res = t.translatef() response = api_client.predict( f"Context: {res}; Prompt: {message}", # str in 'Message' Textbox component 0.4, # float (numeric value between 0 and 1) in 'Temperature' Slider component True, # bool in 'Sampling' Checkbox component 512, # float (numeric value between 128 and 4096) in 'Max new tokens' Slider component api_name="/chat" ) response = Translation(response, txt.original) return response.translatef() elif txt.original != "en" and lan == "en": txt2txt = NeuralSearcheR(docs, encoder) results = txt2txt.search(message) transl = Translation(results[0][2]['Content'], "en") translation = transl.translatef() response = api_client.predict( f"Context: {translation}; Prompt: {message}", # str in 'Message' Textbox component 0.4, # float (numeric value between 0 and 1) in 'Temperature' Slider component True, # bool in 'Sampling' Checkbox component 512, # float (numeric value between 128 and 4096) in 'Max new tokens' Slider component api_name="/chat" ) t = Translation(response, txt.original) res = t.translatef() return res else: txt2txt = NeuralSearcheR(docs, encoder) transl = Translation(message, lan.replace("\\","").replace("'","")) message = transl.translatef() results = txt2txt.search(message) t = Translation(results[0][2]['Content'], txt.original) res = t.translatef() response = api_client.predict( f"Context: {res}; Prompt: {message}", # str in 'Message' Textbox component 0.4, # float (numeric value between 0 and 1) in 'Temperature' Slider component True, # bool in 'Sampling' Checkbox component 512, # float (numeric value between 128 and 4096) in 'Max new tokens' Slider component api_name="/chat" ) tr = Translation(response, txt.original) ress = tr.translatef() return ress demo = gr.ChatInterface(fn=reply, title="everything-ai-supabase2spacesapi") demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/speech_recognition.py ================================================ from transformers import pipeline from argparse import ArgumentParser import torch import gradio as gr import numpy as np argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod # Audio class classifier = pipeline(task="automatic-speech-recognition", model=mod) def classify_text(audio): global classifier sr, data = audio short_tensor = data.astype(np.float32) res = classifier(short_tensor) return res["text"] input_audio = gr.Audio( sources=["upload","microphone"], waveform_options=gr.WaveformOptions( waveform_color="#01C6FF", waveform_progress_color="#0066B4", skip_length=2, show_controls=False, ), ) demo = gr.Interface( title="everything-ai-speechrec", fn=classify_text, inputs=input_audio, outputs="text" ) if __name__ == "__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/text_summarization.py ================================================ from transformers import pipeline from argparse import ArgumentParser from langchain.text_splitter import CharacterTextSplitter from langchain_community.document_loaders import PyPDFLoader from utils import merge_pdfs import gradio as gr import time import torch histr = [[None, "Hi, 
I'm **everything-ai-summarization**🤖.\nI'm here to assist you and let you summarize _your_ texts and _your_ pdfs!\nCheck [my website](https://astrabert.github.io/everything-ai/) for troubleshooting and documentation reference\nHave fun!😊"]] argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod device = torch.device("cuda" if torch.cuda.is_available() else "cpu") summarizer = pipeline("summarization", model=model_checkpoint, device=device) def convert_none_to_str(l: list): newlist = [] for i in range(len(l)): if l[i] is None or type(l[i])==tuple: newlist.append("") else: newlist.append(l[i]) return tuple(newlist) def pdf2string(pdfpath): loader = PyPDFLoader(pdfpath) documents = loader.load() ### Split the documents into smaller chunks for processing text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) texts = text_splitter.split_documents(documents) fulltext = "" for text in texts: fulltext += text.page_content+"\n\n\n" return fulltext def add_message(history, message): global histr if history is not None: if len(message["files"]) > 0: history.append((message["files"], None)) histr.append([message["files"], None]) if message["text"] is not None and message["text"] != "": history.append((message["text"], None)) histr.append([message["text"], None]) else: history = histr add_message(history, message) return history, gr.MultimodalTextbox(value=None, interactive=False) def bot(history): global histr if not history is None: if type(history[-1][0]) != tuple: text = history[-1][0] response = summarizer(text, max_length=int(len(text.split(" "))*0.5), min_length=int(len(text.split(" "))*0.05), do_sample=False)[0] response = response["summary_text"] histr[-1][1] = response history[-1][1] = "" for character in response: history[-1][1] += character time.sleep(0.05) yield history if type(history[-1][0]) == tuple: filelist = [] for i in history[-1][0]: filelist.append(i) finalpdf = merge_pdfs(filelist) text = pdf2string(finalpdf) response = summarizer(text, max_length=int(len(text.split(" "))*0.5), min_length=int(len(text.split(" "))*0.05), do_sample=False)[0] response = response["summary_text"] histr[-1][1] = response history[-1][1] = "" for character in response: history[-1][1] += character time.sleep(0.05) yield history else: history = histr bot(history) with gr.Blocks() as demo: chatbot = gr.Chatbot( [[None, "Hi, I'm **everything-ai-summarization**🤖.\nI'm here to assist you and let you summarize _your_ texts and _your_ pdfs!\nCheck [my website](https://astrabert.github.io/everything-ai/) for troubleshooting and documentation reference\nHave fun!😊"]], label="everything-rag", elem_id="chatbot", bubble_full_width=False, ) chat_input = gr.MultimodalTextbox(interactive=True, file_types=["pdf"], placeholder="Enter message or upload file...", show_label=False) chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input]) bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response") bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input]) demo.queue() if __name__ == "__main__": demo.launch(server_name="0.0.0.0", share=False) ================================================ FILE: docker/utils.py ================================================ # f(x)s that now are useful for all the tasks from langdetect import detect 
from deep_translator import GoogleTranslator from pypdf import PdfMerger from qdrant_client import models from langchain.text_splitter import CharacterTextSplitter from langchain_community.document_loaders import PyPDFLoader import os from datasets import load_dataset, Dataset import torch import numpy as np def remove_items(test_list, item): res = [i for i in test_list if i != item] return res def merge_pdfs(pdfs: list): merger = PdfMerger() for pdf in pdfs: merger.append(pdf) merger.write(f"{pdfs[-1].split('.')[0]}_results.pdf") merger.close() return f"{pdfs[-1].split('.')[0]}_results.pdf" class NeuralSearcher: def __init__(self, collection_name, client, model): self.collection_name = collection_name # Initialize encoder model self.model = model # initialize Qdrant client self.qdrant_client = client def search(self, text: str): # Convert text query into vector vector = self.model.encode(text).tolist() # Use `vector` for search for closest vectors in the collection search_result = self.qdrant_client.search( collection_name=self.collection_name, query_vector=vector, query_filter=None, # If you don't want any filters for now limit=1, # 5 the most closest results is enough ) # `search_result` contains found vector ids with similarity scores along with the stored payload # In this function you are interested in payload only payloads = [hit.payload for hit in search_result] return payloads class PDFdatabase: def __init__(self, pdfs, encoder, client): self.finalpdf = merge_pdfs(pdfs) self.collection_name = os.path.basename(self.finalpdf).split(".")[0].lower() self.encoder = encoder self.client = client def preprocess(self): loader = PyPDFLoader(self.finalpdf) documents = loader.load() ### Split the documents into smaller chunks for processing text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) self.pages = text_splitter.split_documents(documents) def collect_data(self): self.documents = [] for text in self.pages: contents = text.page_content.split("\n") contents = remove_items(contents, "") for content in contents: self.documents.append({"text": content, "source": text.metadata["source"], "page": str(text.metadata["page"])}) def qdrant_collection_and_upload(self): self.client.recreate_collection( collection_name=self.collection_name, vectors_config=models.VectorParams( size=self.encoder.get_sentence_embedding_dimension(), # Vector size is defined by used model distance=models.Distance.COSINE, ), ) self.client.upload_points( collection_name=self.collection_name, points=[ models.PointStruct( id=idx, vector=self.encoder.encode(doc["text"]).tolist(), payload=doc ) for idx, doc in enumerate(self.documents) ], ) class Translation: def __init__(self, text, destination): self.text = text self.destination = destination try: self.original = detect(self.text) except Exception as e: self.original = "auto" def translatef(self): translator = GoogleTranslator(source=self.original, target=self.destination) translation = translator.translate(self.text) return translation class ImageDB: def __init__(self, imagesdir, processor, model, client, dimension): self.imagesdir = imagesdir self.processor = processor self.model = model self.client = client self.dimension = dimension if os.path.basename(self.imagesdir) != "": self.collection_name = os.path.basename(self.imagesdir)+"_ImagesCollection" else: if "\\" in self.imagesdir: self.collection_name = self.imagesdir.split("\\")[-2]+"_ImagesCollection" else: self.collection_name = self.imagesdir.split("/")[-2]+"_ImagesCollection" self.device = 
torch.device("cuda" if torch.cuda.is_available() else "cpu") self.client.recreate_collection( collection_name=self.collection_name, vectors_config=models.VectorParams(size=self.dimension, distance=models.Distance.COSINE) ) def get_embeddings(self, batch): inputs = self.processor(images=batch['image'], return_tensors="pt").to(self.device) with torch.no_grad(): outputs = self.model(**inputs).last_hidden_state.mean(dim=1).cpu().numpy() batch['embeddings'] = outputs return batch def create_dataset(self): self.dataset = load_dataset("imagefolder", data_dir=self.imagesdir, split="train") self.dataset = self.dataset.map(self.get_embeddings, batched=True, batch_size=16) def to_collection(self): np.save(os.path.join(self.imagesdir, "vectors"), np.array(self.dataset['embeddings']), allow_pickle=False) payload = self.dataset.select_columns([ "label" ]).to_pandas().fillna(0).to_dict(orient="records") ids = list(range(self.dataset.num_rows)) embeddings = np.load(os.path.join(self.imagesdir, "vectors.npy")).tolist() batch_size = 1000 for i in range(0, self.dataset.num_rows, batch_size): low_idx = min(i+batch_size, self.dataset.num_rows) batch_of_ids = ids[i: low_idx] batch_of_embs = embeddings[i: low_idx] batch_of_payloads = payload[i: low_idx] self.client.upsert( collection_name = self.collection_name, points=models.Batch( ids=batch_of_ids, vectors=batch_of_embs, payloads=batch_of_payloads ) ) def searchDB(self, image): dtst = {"image": [image], "label": ["None"]} dtst = Dataset.from_dict(dtst) dtst = dtst.map(self.get_embeddings, batched=True, batch_size=1) img = dtst[0] results = self.client.search( collection_name=self.collection_name, query_vector=img['embeddings'], limit=4 ) return results class NeuralSearcheR: def __init__(self, collection, encoder): self.collection = collection self.encoder = encoder def search(self, text): results = self.collection.query( data=self.encoder.encode(text).tolist(), # required limit=1, # number of records to return filters={}, # metadata filters measure="cosine_distance", # distance measure to use include_value=True, # should distance measure values be returned? include_metadata=True, # should record metadata be returned? 
) return results ================================================ FILE: docker/video_generation.py ================================================ import gradio as gr import moviepy.editor as mp from diffusers import DiffusionPipeline from argparse import ArgumentParser argparse = ArgumentParser() argparse.add_argument( "-m", "--model", help="HuggingFace Model identifier, such as 'google/flan-t5-base'", required=True, ) args = argparse.parse_args() mod = args.model mod = mod.replace("\"", "").replace("'", "") model_checkpoint = mod # Load diffusion pipelines image_pipeline = DiffusionPipeline.from_pretrained(model_checkpoint) video_pipeline = DiffusionPipeline.from_pretrained(model_checkpoint) def generate_images(prompt, num_images): """Generates images using the image pipeline.""" images = [] for _ in range(num_images): generated_image = image_pipeline(prompt=prompt).images[0] images.append(generated_image) return images def generate_videos(images): """Generates videos from a list of images using the video pipeline.""" videos = [] for image in images: # Wrap the image in a list as expected by the pipeline generated_video = video_pipeline(images=[image]).images[0] videos.append(generated_video) return videos def combine_videos(video_clips): final_clip = mp.concatenate_videoclips(video_clips) return final_clip def generate(prompt): images = generate_images(prompt, 2) video_clips = generate_videos(images) combined_video = combine_videos(video_clips) return combined_video # Gradio interface with improved formatting and video output interface = gr.Interface( fn=generate, inputs="text", outputs="video", title="everything-ai-text2vid", description="Enter a prompt to generate a video using diffusion models.", css=""" .output-video { width: 100%; /* Adjust width as needed */ height: 400px; /* Adjust height as desired */ } """, ) # Launch the interface interface.launch(server_name="0.0.0.0", share=False)