Repository: ganeshnikhil/J.A.R.V.I.S.2.0 Branch: main Commit: fec1d59e3f7f Files: 35 Total size: 131.2 KB Directory structure: gitextract_s5snpbag/ ├── .gitignore ├── readme.md ├── requirements.txt ├── src/ │ ├── BRAIN/ │ │ ├── RAG.py │ │ ├── chat_with_ai.py │ │ ├── code_gen.py │ │ ├── gem_func_call.py │ │ ├── local_func_call.py │ │ └── text_to_info.py │ ├── CONVERSATION/ │ │ ├── speech_to_text.py │ │ ├── t_s.py │ │ ├── test_speech.py │ │ ├── text_speech.py │ │ ├── text_to_speech.py │ │ └── voice_text.py │ ├── FUNCTION/ │ │ ├── Tools/ │ │ │ ├── Email_send.py │ │ │ ├── app_op.py │ │ │ ├── code_exec.py │ │ │ ├── get_env.py │ │ │ ├── greet_time.py │ │ │ ├── incog.py │ │ │ ├── internet_search.py │ │ │ ├── link_op.py │ │ │ ├── news.py │ │ │ ├── phone_call.py │ │ │ ├── random_respon.py │ │ │ ├── searxsearch.py │ │ │ ├── weather.py │ │ │ └── youtube_downloader.py │ │ └── run_function.py │ ├── KEYBOARD/ │ │ ├── key_lst.py │ │ └── key_prs_lst.py │ └── VISION/ │ ├── gem_eye.py │ └── local_eye.py └── ui.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # Python cache __pycache__/ *.py[cod] *.so # Virtual environments .venv/ env/ .log/ # Secrets & configs .env *.env *.yml *.sh *.bat device_ips.txt adb.txt # Logs & temporary files *.log *.tmp *.swp # System files .DS_Store Thumbs.db ehthumbs.db Icon? # Data & storage DATA/ *.pkl *.db *.sqlite3 *.csv *.json **/VECTORSTORES/ *.faiss # Backups & test outputs Backup/ Test/ # IDEs .vscode/ .idea/ # Media (test/generated files) *.mp3 *.wav *.mp4 *.m4a *.png *.jpg *.jpeg *.gif # Packaging build/ dist/ *.egg-info/ # Jupyter .ipynb_checkpoints/ # Caches .pytest_cache/ .mypy_cache/ .cache/ ================================================ FILE: readme.md ================================================ # 🚀 JARVIS 2.0 --- ## J.A.R.V.I.S. 
2.0 – Judgment Augmented Reasoning for Virtual Intelligent Systems

# 🤖 Jarvis AI Assistant

Welcome to the **Jarvis AI Assistant** project! 🎙️ This AI-powered assistant can perform various tasks such as **providing weather reports 🌦️, summarizing news 📰, sending emails 📧**, **CAG**, and more, all through **voice commands**. Below, you'll find detailed instructions on how to set up, use, and interact with this assistant. 🎧

---

## 🌟 Features

✅ **Voice Activation**: activate listening mode. 🎤\
✅ **Speech Recognition**: Recognizes and processes user commands via speech input. 🗣️\
✅ **AI Responses**: Provides responses using AI-generated **text-to-speech** output. 🎶\
✅ **Task Execution**: Handles multiple tasks, including:

- 📧 **Sending emails**
- 🌦️ **Summarizing weather reports**
- 📊 **Data analysis using CSV**
- 🧑🏻‍💻 **Personalized chat**
- 📰 **Reading news headlines**
- 🖼️ **Image generation**
- 🏦 **Database functions**
- 📱 **Phone call automation using ADB**
- 🤖 **AI-based task execution**
- 📡 **Automate websites & applications**
- 🏞️ **Image processing using Gemini**
  **Image Source:** ***Upload*** ***URL*** ***Camera***
  **Select Action:** ***Basic Detection*** ***Object Detection*** ***Segmentation*** ***Resize***
- 🧠 **Retrieval-Augmented Generation (RAG) for knowledge-based interactions on various topics**
- ✅ **Timeout Handling**: Automatically deactivates listening mode after **5 minutes** of inactivity. ⏳
- ✅ **Automatic Input Processing**: If no "stop" command is detected within **60 seconds**, input is finalized and sent to the AI model for processing. ⚙️
- ✅ **Multiple Function Calls**: Call **multiple functions simultaneously**, even if their inputs and outputs are unrelated. 🔄

---

## 📌 Prerequisites

Before running the project, ensure you have the following installed:

✅ **Python 3.9 or later** 🐍\
✅ Required libraries (listed in `requirements.txt`) 📜

### 🛠️ Configuration

1.
**Create a `.env` file** in the root directory of the project.
2. **Add your API keys and other configuration variables** to the `.env` file:

```dotenv
author_name="ganeshnikhil124@gmail.com"
weather_link="https://rapidapi.com/weatherapi/api/weatherapi-com"
news_link="https://newsapi.org"
name="ganeshnikhil"
Rag_model="granite3.1-dense:2b"
Chat_model="granite3.1-dense:2b"
Function_call_model="gemma3:4b"
Text_to_info_model="gemma3:4b"
Image_to_text="llava:7b"
Embedding_model="nomic-embed-text"
genai_key=""
Sender_email="ganeshnikhil124@gmail.com"
Receiver_email=""
Password_email=""
Weather_api=""
News_api=""
Country="in"
DEVICE_IP=""
CSV_PATH="./DATA/business-employment-data-dec-2024-quarter.csv"
UI="on"
Yt_path="./DATA/youtube_video/"
```

3. **Install system requirements**

```bash
bash ./intialize.sh
```

4. **Set up API keys and passwords**:

- [🌩️ WEATHER API](https://rapidapi.com/weatherapi/api/weatherapi-com) - Get weather data.
- [📰 NEWS API](https://newsapi.org) - Fetch latest news headlines.
- [📧 GMAIL PASSWORD](https://myaccount.google.com/apppasswords) - Generate an app password for sending emails.
- [🧠 OLLAMA](https://ollama.com) - Download models from Ollama (manual setup).

**Install models from Ollama**

```
ollama run gemma3:4b
ollama run granite3.1-dense:2b
ollama pull nomic-embed-text
```

- [portaudio](https://files.portaudio.com/download.html) - Download PortAudio to work with sound.
- [🔮 GEMINI AI](https://aistudio.google.com/apikey) - API access for function execution.

## Model Details

# Gemma: intelligent routing, image handling, and simple question answering

```
Model
  architecture        gemma3
  parameters          4.3B
  context length      8192
  embedding length    2560
  quantization        Q4_K_M

Parameters
  stop           ""
  temperature    0.1

License
  Gemma Terms of Use
  Last modified: February 21, 2024
```

# Granite dense has a large context window, used for RAG and chat
```
Model
  architecture        granite
  parameters          2.5B
  context length      131072
  embedding length    2048
  quantization        Q4_K_M

System
  Knowledge Cutoff Date: April 2024.
  You are Granite, developed by IBM.

License
  Apache License
  Version 2.0, January 2004
```

# Gemini free tier as a fallback mechanism (tool calling only)

```
gemini-2.0-flash         | Audio, images, videos, and text | Text, images (experimental), and audio (coming soon) | Next generation features, speed, thinking, realtime streaming, and multimodal generation
gemini-2.0-flash-lite    | Audio, images, videos, and text | Text | A Gemini 2.0 Flash model optimized for cost efficiency and low latency
gemini-2.0-pro-exp-02-05 | Audio, images, videos, and text | Text | Our most powerful Gemini 2.0 model
gemini-1.5-flash         | Audio, images, videos, and text | Text | Fast and versatile performance across a diverse variety of tasks
```

![JARVIS Screenshot](image.png)

---------------------------------------------------------------------------------------------

![alt text](dig.png)

## 💻 Installation

### 1️⃣ **Clone the Repository**

```bash
git clone https://github.com/ganeshnikhil/J.A.R.V.I.S.2.0.git
cd J.A.R.V.I.S.2.0
```

### 2️⃣ **Install Dependencies**

```bash
pip install -r requirements.txt
```

---

## 🚀 Running the Application

### **Start the Program**

```bash
streamlit run ui.py
```

---

## 🔄 **Function Calling Methods**

### 🔹 **Primary: Gemini AI-Based Function Execution**

🚀 Transitioned to **Gemini AI-powered function calling**, allowing multiple **function calls simultaneously** for better efficiency! ⚙️ If Gemini AI fails to generate function calls, the system automatically falls back to an **Ollama-based model** for reliable execution.
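
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the routing code the project actually ships: `route_function_calls`, `fake_gemini`, `fake_local`, and the `send_email` tool name are hypothetical stand-ins, chosen so the example runs without API keys or Ollama. In the repository the two callers would presumably be `GeminiFunctionCaller.generate_function_calls` and `LocalFunctionCall.create_function_call`.

```python
from typing import Callable, Optional

# Assumed shape of one tool call, matching what
# GeminiFunctionCaller.generate_function_calls returns:
# {"name": <function name>, "parameters": {...}}
Caller = Callable[[str], Optional[list]]


def route_function_calls(query: str, primary: Caller, fallback: Caller) -> list:
    """Return tool calls from `primary`; use `fallback` if it raises or returns nothing."""
    try:
        calls = primary(query)
    except Exception as exc:  # e.g. rate limit, network error, bad API key
        print(f"[primary failed] {exc}")
        calls = None
    if calls:
        return calls
    # The fallback may also return None, so normalise to an empty list.
    return fallback(query) or []


# Hypothetical stand-ins for the Gemini and Ollama callers.
def fake_gemini(query: str) -> list:
    raise RuntimeError("quota exceeded")


def fake_local(query: str) -> list:
    return [{"name": "send_email", "parameters": {}}]


if __name__ == "__main__":
    print(route_function_calls("send an email", fake_gemini, fake_local))
```

Passing the two callers in as plain callables keeps the routing logic independent of either SDK, which also makes it easy to test offline.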
🔹 **AI Model Used**: **Gemini AI** 🧠\
✅ Higher accuracy ✅ Structured data processing ✅ Reliable AI-driven interactions

---

## 📖 **RAG-Based Knowledge System**

💡 **Retrieval-Augmented Generation (RAG)** dynamically loads relevant markdown-based knowledge files based on the queried topic, **reducing hallucinations and improving response accuracy**.

---

## 📱 **ADB Integration for Phone Automation**

🔹 Integrated **Android Debug Bridge (ADB)** to enable **voice-controlled phone automation**! 🎙️

✅ **Make phone calls** ☎️\
✅ **Open apps & toggle settings** 📲\
✅ **Access phone data & remote operations** 🛠️

### **Setting Up ADB**

📌 **Windows**

```powershell
winget install --id=Google.AndroidSDKPlatformTools -e
```

📌 **Linux**

```bash
sudo apt install adb
```

📌 **Mac**

```bash
brew install android-platform-tools
```

---

## 🔮 **Future Enhancements**

✨ **Deeper mobile integration** 📱\
✨ **Advanced AI-driven automation** 🤖\
✨ **Improved NLP-based command execution** 🧠\
✨ **Multi-modal interactions (text + voice + image)** 🖼️

🚀 **Stay tuned for future updates!** 🔥

```markdown
## Gemini Model Comparison

The following table compares the rate limits of various Gemini models:

| Model                                      | RPM |       TPM |   RPD |
|--------------------------------------------|----:|----------:|------:|
| **Gemini 2.0 Flash**                       |  15 | 1,000,000 | 1,500 |
| **Gemini 2.0 Flash-Lite Preview**          |  30 | 1,000,000 | 1,500 |
| **Gemini 2.0 Pro Experimental 02-05**      |   2 | 1,000,000 |    50 |
| **Gemini 2.0 Flash Thinking Experimental** |  10 | 4,000,000 | 1,500 |
| **Gemini 1.5 Flash**                       |  15 | 1,000,000 | 1,500 |
| **Gemini 1.5 Flash-8B**                    |  15 | 1,000,000 | 1,500 |
| **Gemini 1.5 Pro**                         |   2 |    32,000 |    50 |
| **Imagen 3**                               |  -- |        -- |    -- |
```

### Explanation:

- **RPM**: Requests per minute
- **TPM**: Tokens per minute
- **RPD**: Requests per day

```
The project focuses on using small local models and free API models to get accurate agentic behaviour, so that it can also run on low-spec systems.
``` ================================================ FILE: requirements.txt ================================================ aiofiles==24.1.0 aiohappyeyeballs==2.4.6 aiohttp==3.11.12 aiosignal==1.3.2 altair==5.5.0 annotated-types==0.7.0 anyio==4.6.2.post1 asgiref==3.8.1 asttokens==3.0.0 async-timeout==4.0.3 attrs==25.1.0 backoff==2.2.1 bcrypt==4.2.1 beautifulsoup4==4.12.3 blinker==1.9.0 Brotli==1.1.0 build==1.2.2.post1 cachetools==5.5.1 certifi==2024.8.30 cffi==1.17.1 chardet==5.2.0 charset-normalizer==3.4.0 chroma-hnswlib==0.7.6 chromadb==0.6.3 click==8.1.8 coloredlogs==15.0.1 contourpy==1.3.0 cryptography==44.0.1 cycler==0.12.1 dataclasses-json==0.6.7 decorator==5.1.1 deepsearch-glm==1.0.0 Deprecated==1.2.18 dill==0.3.9 distro==1.9.0 docling==2.22.0 docling-core==2.18.1 docling-ibm-models==3.3.2 docling-parse==3.3.1 duckduckgo_search==7.4.2 durationpy==0.9 easyocr==1.7.2 emoji==2.14.1 et_xmlfile==2.0.0 eval_type_backport==0.2.2 exceptiongroup==1.2.2 executing==2.2.0 faiss-cpu==1.10.0 fastapi==0.115.8 filelock==3.17.0 filetype==1.2.0 flatbuffers==25.2.10 fonttools==4.57.0 frozenlist==1.5.0 fsspec==2025.2.0 fuzzywuzzy==0.18.0 #genai==2.1.0 geographiclib==2.0 geopy==2.4.1 gitdb==4.0.12 GitPython==3.1.44 google-ai-generativelanguage==0.6.15 google-api-core==2.24.1 google-api-python-client==2.161.0 google-auth==2.38.0 google-auth-httplib2==0.2.0 google-genai==1.2.0 google-generativeai==0.8.4 googleapis-common-protos==1.67.0 grpcio==1.70.0 grpcio-status==1.70.0 gTTS==2.5.4 h11==0.14.0 h2==4.2.0 hpack==4.1.0 html5lib==1.1 httpcore==1.0.7 httplib2==0.22.0 httptools==0.6.4 httpx==0.28.1 httpx-sse==0.4.0 huggingface-hub==0.28.1 humanfriendly==10.0 hyperframe==6.1.0 idna==3.10 imageio==2.37.0 importlib_metadata==8.5.0 importlib_resources==6.5.2 ipython==8.18.1 jedi==0.19.2 Jinja2==3.1.5 jiter==0.8.0 joblib==1.4.2 jsonlines==3.1.0 jsonpatch==1.33 jsonpointer==3.0.0 jsonref==1.1.0 jsonschema==4.23.0 jsonschema-specifications==2024.10.1 keyboard==0.13.5 kiwisolver==1.4.7 
kubernetes==32.0.0 langchain==0.3.18 langchain-community==0.3.17 langchain-core==0.3.35 langchain-ollama==0.2.3 langchain-text-splitters==0.3.6 langdetect==1.0.9 langsmith==0.3.8 latex2mathml==3.77.0 lazy_loader==0.4 Levenshtein==0.27.1 lxml==5.3.1 Markdown==3.7 markdown-it-py==3.0.0 marko==2.1.2 MarkupSafe==3.0.2 marshmallow==3.26.1 matplotlib==3.9.4 matplotlib-inline==0.1.7 mdurl==0.1.2 mmh3==5.1.0 monotonic==1.6 mpire==2.10.2 mpmath==1.3.0 multidict==6.1.0 multiprocess==0.70.17 mypy-extensions==1.0.0 narwhals==1.34.1 nest-asyncio==1.6.0 networkx==3.2.1 ninja==1.11.1.3 nltk==3.9.1 numpy==1.26.4 oauthlib==3.2.2 olefile==0.47 ollama==0.4.7 onnxruntime==1.19.2 #openai==0.27.10 opencv-python-headless==4.11.0.86 openpyxl==3.1.5 opentelemetry-api==1.30.0 opentelemetry-exporter-otlp-proto-common==1.30.0 opentelemetry-exporter-otlp-proto-grpc==1.30.0 opentelemetry-instrumentation==0.51b0 opentelemetry-instrumentation-asgi==0.51b0 opentelemetry-instrumentation-fastapi==0.51b0 opentelemetry-proto==1.30.0 opentelemetry-sdk==1.30.0 opentelemetry-semantic-conventions==0.51b0 opentelemetry-util-http==0.51b0 orjson==3.10.15 overrides==7.7.0 packaging==24.2 pandas==2.2.3 parso==0.8.4 pdfminer.six==20240706 pexpect==4.9.0 pillow==11.1.0 posthog==3.13.0 prompt_toolkit==3.0.50 propcache==0.2.1 proto-plus==1.26.0 protobuf==5.29.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==19.0.1 pyasn1==0.6.1 pyasn1_modules==0.4.1 PyAudio==0.2.14 pyclipper==1.3.0.post6 pycparser==2.22 pydantic==2.10.6 pydantic-settings==2.7.1 pydantic_core==2.27.2 pydeck==0.9.1 Pygments==2.19.1 pyobjc-core==11.1 ; sys_platform == "darwin" pyobjc-framework-Cocoa==11.1 ; sys_platform == "darwin" pyparsing==3.2.1 pypdf==5.3.0 pypdfium2==4.30.1 PyPika==0.48.9 pyproject_hooks==1.2.0 python-bidi==0.6.3 python-dateutil==2.9.0.post0 python-docx==1.1.2 python-dotenv==1.0.1 python-iso639==2025.2.8 python-Levenshtein==0.27.1 python-magic==0.4.27 python-oxmsg==0.0.2 python-pptx==1.0.2 pyttsx3==2.98 pytube==15.0.0 
pytz==2025.1 PyYAML==6.0.2 RapidFuzz==3.12.1 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-oauthlib==2.0.0 requests-toolbelt==1.0.0 rich==13.9.4 rpds-py==0.22.3 rsa==4.9 Rtree==1.3.0 safetensors==0.5.2 scikit-image==0.24.0 scipy==1.13.1 semchunk==2.2.2 shapely==2.0.7 shellingham==1.5.4 six==1.17.0 smmap==5.0.2 sniffio==1.3.1 socksio==1.0.0 soupsieve==2.6 SpeechRecognition==3.12.0 SQLAlchemy==2.0.38 stack-data==0.6.3 starlette==0.45.3 streamlit==1.44.1 sympy==1.13.1 tabulate==0.9.0 tenacity==9.0.0 tifffile==2024.8.30 tokenizers==0.21.0 toml==0.10.2 tomli==2.2.1 torch==2.6.0 torchvision==0.21.0 tornado==6.4.2 tqdm==4.67.1 traitlets==5.14.3 transformers==4.48.3 typer==0.12.5 typing-inspect==0.9.0 typing_extensions==4.12.2 tzdata==2025.1 unstructured==0.16.20 unstructured-client==0.30.0 uritemplate==4.1.1 urllib3==2.2.3 uvicorn==0.34.0 uvloop==0.21.0 watchfiles==1.0.4 wcwidth==0.2.13 webencodings==0.5.1 websocket-client==1.8.0 websockets==14.2 wrapt==1.17.2 XlsxWriter==3.2.2 yarl==1.18.3 zipp==3.21.0 zstandard==0.23.0 ================================================ FILE: src/BRAIN/RAG.py ================================================ import os import shutil import tempfile from pathlib import Path from docling.datamodel.base_models import InputFormat from docling.datamodel.pipeline_options import PdfPipelineOptions from docling.document_converter import DocumentConverter, PdfFormatOption, WordFormatOption, SimplePipeline from langchain_community.document_loaders import UnstructuredMarkdownLoader from langchain.text_splitter import RecursiveCharacterTextSplitter from langchain_ollama import OllamaEmbeddings, OllamaLLM from langchain_community.vectorstores import FAISS from langchain.chains import ConversationalRetrievalChain from langchain_core.runnables.history import RunnableWithMessageHistory from langchain_community.chat_message_histories.in_memory import ChatMessageHistory from src.FUNCTION.Tools.get_env import EnvManager class RAGPipeline: def 
__init__(self): self.embedding_model = EnvManager.load_variable("Embedding_model") self.rag_model = EnvManager.load_variable("Rag_model") self.MAX_MESSAGES_PER_SESSION = 20 self.qa_chain = None self.memory_store = {} # In-memory store for session memory def get_paths(self, subject: str): subject_clean = subject.lower().strip().replace(" ", "_") doc_path = Path(f"./DATA/RAWKNOWLEDGEBASE/{subject_clean}_data.pdf") md_path = Path(f"./DATA/KNOWLEDGEBASE/{subject_clean}_data_converted.md") vectorstore_path = Path(f"./DATA/VECTORSTORES/{subject_clean}_vectorstore.pkl") return doc_path, md_path, vectorstore_path def get_document_format(self, file_path) -> InputFormat: ext = Path(file_path).suffix.lower() return { '.pdf': InputFormat.PDF, '.docx': InputFormat.DOCX, '.doc': InputFormat.DOCX, '.pptx': InputFormat.PPTX, '.html': InputFormat.HTML, '.htm': InputFormat.HTML }.get(ext, None) def convert_document_to_markdown(self, subject: str) -> bool: try: doc_path, md_path, _ = self.get_paths(subject) if not doc_path.exists(): print(f"No document found: {doc_path}") return False doc_format = self.get_document_format(doc_path) if not doc_format: print(f"Unsupported format: {doc_path.suffix}") return False input_path = os.path.abspath(str(doc_path)) with tempfile.TemporaryDirectory() as temp_dir: temp_input = os.path.join(temp_dir, os.path.basename(input_path)) shutil.copy2(input_path, temp_input) pipeline_options = PdfPipelineOptions(do_ocr=True, do_table_structure=True) converter = DocumentConverter( allowed_formats=[doc_format], format_options={ doc_format: PdfFormatOption(pipeline_options=pipeline_options), InputFormat.DOCX: WordFormatOption(pipeline_cls=SimplePipeline) } ) conv_result = converter.convert(temp_input) if not conv_result or not conv_result.document: print(f"Conversion failed for: {doc_path}") return False markdown = conv_result.document.export_to_markdown() md_path.parent.mkdir(parents=True, exist_ok=True) with open(md_path, "w", encoding="utf-8") as f: 
                    f.write(markdown)
                return True
        except Exception as e:
            print(f"[Conversion Error] {e}")
            return False

    def load_or_create_vectorstore(self, subject: str):
        try:
            _, md_path, vectorstore_path = self.get_paths(subject)
            vectorstore_path.parent.mkdir(parents=True, exist_ok=True)
            embeddings = OllamaEmbeddings(model=self.embedding_model)

            if vectorstore_path.exists():
                print(f"Loading existing vectorstore from: {vectorstore_path}")
                return FAISS.load_local(str(vectorstore_path), embeddings, allow_dangerous_deserialization=True)

            print(f"Creating vectorstore for: {md_path}")
            loader = UnstructuredMarkdownLoader(str(md_path))
            documents = loader.load()
            splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
            chunks = splitter.split_documents(documents)
            vectorstore = FAISS.from_documents(chunks, embeddings)
            vectorstore.save_local(str(vectorstore_path))
            return vectorstore
        except Exception as e:
            print(f"[Vectorstore Error] {e}")
            return None

    def get_memory(self, session_id: str):
        if session_id not in self.memory_store:
            self.memory_store[session_id] = ChatMessageHistory()
        history = self.memory_store[session_id]
        # Auto-reset memory if too long
        # (the limit is an instance attribute, so it must be read through self)
        if len(history.messages) > self.MAX_MESSAGES_PER_SESSION:
            print(f"[INFO] Memory for session '{session_id}' exceeded {self.MAX_MESSAGES_PER_SESSION} messages. 
Resetting.") history.clear() return history def setup_chain(self, subject: str): try: _, md_path, _ = self.get_paths(subject) if not md_path.exists(): if not self.convert_document_to_markdown(subject): return None vectorstore = self.load_or_create_vectorstore(subject) if not vectorstore: return None llm = OllamaLLM(model=self.rag_model, temperature=0) base_chain = ConversationalRetrievalChain.from_llm( llm=llm, retriever=vectorstore.as_retriever(search_kwargs={"k": 2}), return_source_documents=False ) self.qa_chain = RunnableWithMessageHistory( base_chain, get_session_history=self.get_memory, input_messages_key="question", history_messages_key="chat_history" ) return self.qa_chain except Exception as e: print(f"[QA Chain Error] {e}") return None def ask(self, qa_chain, question: str, session_id: str = "default") -> str: if not qa_chain: print("QA chain not initialized.") return "No QA chain available." try: result = qa_chain.invoke({"question": question}, config={"configurable": {"session_id": session_id}}) return result.get("answer", "No answer found.") except Exception as e: return f"Error: {e}" def interactive_chat(self, subject: str): if not self.setup_chain(subject): print("Could not set up RAG chain.") return print("\nChat with the RAG model. 
Type 'exit' to quit.\n") while True: question = input("You: ") if question.lower() in {"exit", "quit", "bye"}: print("Goodbye!") break print("AI:", self.ask(self.qa_chain, question, session_id="default")) if __name__ == "__main__": rag = RAGPipeline() rag.interactive_chat(subject="Disaster") # Replace with your actual subject ================================================ FILE: src/BRAIN/chat_with_ai.py ================================================ import json from pathlib import Path from langchain_ollama import ChatOllama from langchain_ollama import OllamaEmbeddings from langchain_community.vectorstores import FAISS from src.FUNCTION.Tools.get_env import EnvManager import datetime import math from fuzzywuzzy import fuzz class PersonalChatAI: HISTORY_FILE_PATH = "./DATA/chat_history.json" AI_MODEL = EnvManager.load_variable("Chat_model") EMBEDDING_MODEL = EnvManager.load_variable("Embedding_model") SCORE_THRESHOLD = 0.6 # Adjust this threshold as needed MAX_HISTORY_SIZE = 100 # Limit history size def __init__(self): self.llm = ChatOllama(model=self.AI_MODEL, temperature=0) def get_current_timestamp(self): return datetime.datetime.now().isoformat() def load_chat_history(self): if Path(self.HISTORY_FILE_PATH).exists(): with open(self.HISTORY_FILE_PATH, "r", encoding="utf-8") as file: return json.load(file) return [] def save_chat_history(self, history): with open(self.HISTORY_FILE_PATH, "w", encoding="utf-8") as file: json.dump(history, file, indent=4) def ask_ai_importance(self, prompt: str) -> bool: """Ask AI if the chat message is important.""" llm = ChatOllama(model=self.AI_MODEL, temperature=0, max_token=50) system_prompt = """ You are an AI that determines whether a message contains personally significant information or emotional expression. You must respond ONLY with "yes" or "no". Here are some examples: User: "My name is Ravi and I live in Bangalore." AI: yes User: "The weather is nice today." AI: no User: "I'm feeling really anxious lately." 
AI: yes User: "I love the book Homo Sapiens, it's my favorite." AI: yes User: "Can you help me with a math problem?" AI: no """ messages = [ {"role": "system", "content": system_prompt}, {"role": "user", "content": f"Is this conversation important? {prompt}"}, ] response = llm.invoke(messages) return "yes" in response.content.strip().lower() def store_important_chat(self, prompt: str, response: str, threshold=80): """Store chat in history if AI deems it important.""" if self.ask_ai_importance(prompt): cur_date_time = self.get_current_timestamp() history = self.load_chat_history() for entry in history: if "user" in entry: similarity_score = max( fuzz.token_sort_ratio(prompt, entry["user"]), fuzz.token_set_ratio(prompt, entry["user"]) ) if similarity_score >= threshold: entry["assistant"] = response entry["timestamp"] = cur_date_time break else: history.append({"user": prompt, "assistant": response, "timestamp": cur_date_time}) if len(history) > self.MAX_HISTORY_SIZE: history.pop(0) self.save_chat_history(history) def distance_to_similarity_inverted(self, distance, scale=1.0): """Sigmoid-based similarity mapping with inverted distance.""" return 1 / (1 + math.exp(-distance * scale)) def semantic_search(self, query: str): history = self.load_chat_history() if not history: return [] embedding = OllamaEmbeddings(model=self.EMBEDDING_MODEL) combined_map = { item["user"] + " " + item["assistant"]: item for item in history if "user" in item and "assistant" in item } vectorstore = FAISS.from_texts( texts=list(combined_map.keys()), embedding=embedding, ) results_with_scores = vectorstore.similarity_search_with_score(query, k=7) filtered_results = [] for result, score in results_with_scores: similarity_score = self.distance_to_similarity_inverted(score) if similarity_score >= self.SCORE_THRESHOLD: filtered_results.append(combined_map[result.page_content]) return filtered_results def message_management(self, query): system_prompt = """ You are an AI assistant that remembers 
important details from previous conversations. When a user shares personal information (like their name, preferences, or interests), you should recall and use that naturally in responses — like a thoughtful friend would. Avoid being overly formal or generic. Be warm, conversational, and use their name where appropriate.""" messages = [{"role": "system", "content": system_prompt}] relevant_chats = self.semantic_search(query) if relevant_chats: for chat in relevant_chats: messages.append({"role": "user", "content": chat["user"]}) messages.append({"role": "assistant", "content": chat["assistant"]}) messages.append({"role": "user", "content": query}) return messages def personal_chat_ai(self, first_query: str, max_token: int = 2000): """Chat system with persistent history and semantic retrieval.""" try: query = first_query messages = self.message_management(query) llm = ChatOllama(model=self.AI_MODEL, temperature=0, max_token=max_token) while True: store = len(messages) < 100 response_stream = llm.stream(messages) response_content = "" print("AI:", end=" ") for chunk in response_stream: text = chunk.content print(text, end="", flush=True) response_content += text print() if store: self.store_important_chat(query, response_content) query = input("YOU: ") if query.lower() in {"exit", "end"}: break messages = self.message_management(query) except Exception as e: print(f"An error occurred: {e}") return None return True if __name__ == "__main__": print("AI Chat Initialized. 
Type 'exit' to stop.") first_query = input("YOU: ") chat_bot = PersonalChatAI() chat_bot.personal_chat_ai(first_query) print("Chat session ended.") ================================================ FILE: src/BRAIN/code_gen.py ================================================ import logging import os import pandas as pd import re from langchain_ollama import ChatOllama from src.FUNCTION.Tools.get_env import EnvManager from src.FUNCTION.Tools.code_exec import CodeExecutor from google import genai logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(module)s - %(message)s') logger = logging.getLogger(__name__) class CodeRefactorAssistant: def __init__(self): self.genai_key = EnvManager.load_variable("genai_key") self.local_model = EnvManager.load_variable("Text_to_info_model") self.execute_code_with_dependencies = CodeExecutor() def provide_file_details(self, path: str) -> str: logger.info(f"Providing details for file: {path}") try: df = pd.read_csv(path) details = [ f"File Path: {path}", f"File Size: {df.memory_usage(deep=True).sum() / (1024 ** 2):.2f} MB", f"Shape (rows, columns): {df.shape}", f"Column Names: {df.columns.tolist()}", f"Data Types:\n{df.dtypes.to_string()}", f"First 5 rows:\n{df.head().to_string(index=False)}", f"Missing values:\n{df.isnull().sum().to_string()}", f"Summary statistics:\n{df.describe(include='all').to_string()}", f"Unique values per column:\n{df.nunique().to_string()}", f"Sample value types per column:\n{df.iloc[0].to_dict()}" ] return "\n\n".join(details) except FileNotFoundError: logger.error(f"File not found at path: {path}") return "" except Exception as e: logger.error(f"Error while processing file {path}: {e}", exc_info=True) return "" def extract_python_code(self, text): pattern = r"```python\s*(.*?)\s*```" match = re.search(pattern, text, re.DOTALL) if match: return match.group(1).strip() logger.info("No Python code found in the text.") return "" def generate_refactor_prompt(self, code: str, error: str, 
file_description: str): return f""" You are an expert Python code refactor assistant. The user has provided code that generated an error during execution. File Description: {file_description} Code: {code} Error: {error} Instructions: - Analyze and refactor to resolve the issue. - Use pandas, numpy, matplotlib correctly. - Handle file-related errors. - Refactor to be functional and error-free.""" def gem_refactor_code(self, code: str, file_path: str, max_attempts=3): for attempt in range(1, max_attempts + 1): logger.info(f"Gemini refactor attempt {attempt}/{max_attempts}") if not code: return {"error": "No code provided for execution."} exec_info = self.execute_code_with_dependencies.execute_code(code) if not exec_info.get("error"): logger.info("Code executed successfully.") return exec_info # ✅ Successful execution # Prepare for next attempt: describe file and generate a new prompt file_desc = self.provide_file_details(file_path) prompt = self.generate_refactor_prompt(code, exec_info["error"], file_desc) try: client = genai.Client(api_key=self.genai_key) response = client.models.generate_content(model='gemini-2.0-flash', contents=prompt) if hasattr(response, "text"): code = self.extract_python_code(response.text) else: logger.warning("Gemini response has no 'text' attribute.") return {"error": "Gemini response format unexpected."} except Exception as e: logger.error("Gemini API failed", exc_info=True) return {"error": str(e)} return {"error": f"Max attempts reached. Last error: {exec_info['error']}"} def gem_text_to_code(self, user_prompt: str, file_path: str): logger.info("Generating code from Gemini") if not os.path.exists(file_path): raise FileNotFoundError(f"{file_path} not found!") file_desc = self.provide_file_details(file_path) prompt = f""" You are a Python data analysis assistant. The user has provided a CSV file at path: '{file_path}'. Use pandas, numpy, matplotlib. 
File Description:
{file_desc}

User Query:
{user_prompt}"""
        try:
            client = genai.Client(api_key=self.genai_key)
            response = client.models.generate_content(model='gemini-2.0-flash', contents=prompt)
            code = self.extract_python_code(response.text)
            return self.gem_refactor_code(code, file_path)
        except Exception as e:
            logger.error("Gemini API failed", exc_info=True)
            return {"error": str(e)}

    def local_refactor_code(self, code: str, file_path: str, max_attempts=10):
        if not code:
            return {"error": "No code provided for refactoring."}
        last_error = "Unknown error"
        for attempt in range(1, max_attempts + 1):
            logger.info(f"Local refactor attempt {attempt}/{max_attempts}")
            exec_info = self.execute_code_with_dependencies.execute_code(code)
            if not exec_info.get("error"):
                logger.info("Code executed successfully.")
                return exec_info  # ✅ Success
            last_error = exec_info["error"]
            # Describe file and generate prompt
            file_desc = self.provide_file_details(file_path)
            prompt = self.generate_refactor_prompt(code, last_error, file_desc)
            try:
                llm = ChatOllama(model=self.local_model, temperature=0.3)
                messages = [
                    {"role": "system", "content": prompt},
                    {"role": "user", "content": f"Here is the code to refactor:\n{code}"}
                ]
                response = llm.invoke(messages)
                # invoke() returns a message object; the generated text is in .content
                code = self.extract_python_code(response.content)
            except Exception as e:
                logger.error("Local LLM failed", exc_info=True)
                return {"error": str(e)}
        return {"error": f"Max attempts reached. Last error: {last_error}"}

    def local_text_to_code(self, user_prompt: str, file_path: str):
        logger.info("Generating code from local LLM")
        if not os.path.exists(file_path):
            raise FileNotFoundError(f"{file_path} not found!")
        file_desc = self.provide_file_details(file_path)
        prompt = f"""
You are a Python data analysis assistant.
The user has provided a CSV file at path: '{file_path}'.
Use pandas, numpy, matplotlib.
File Description:
{file_desc}

User Query:
{user_prompt}"""
        try:
            llm = ChatOllama(model=self.local_model, temperature=0.3)
            messages = [
                {"role": "system", "content": prompt},
                {"role": "user", "content": user_prompt}
            ]
            response = llm.invoke(messages)
            # invoke() returns a message object; the generated text is in .content
            code = self.extract_python_code(response.content)
            return self.local_refactor_code(code, file_path)
        except Exception as e:
            logger.error("Local LLM failed", exc_info=True)
            return {"error": str(e)}


def data_analysis(user_prompt: str, file_path: str):
    codeassistant = CodeRefactorAssistant()
    try:
        # Try to generate code via API
        response = codeassistant.gem_text_to_code(user_prompt, file_path)
    except Exception as e:
        # If API fails, log the error and fall back to local code generation
        logger.error(f"ERROR: {e} - Falling back to local processing.")
        response = codeassistant.local_text_to_code(user_prompt, file_path)
    return response

================================================
FILE: src/BRAIN/gem_func_call.py
================================================
from google import genai
from google.genai import types
from DATA.tools import ALL_FUNCTIONS, UI_ALL_FUNCTIONS
from src.FUNCTION.Tools.get_env import EnvManager


class GeminiFunctionCaller:
    def __init__(self):
        self.genai_key = EnvManager.load_variable("genai_key")
        self.UI_ON = EnvManager.load_variable("UI")
        self.client = genai.Client(api_key=self.genai_key)
        self.tools_config = self._get_tools_config()

    def _get_tools_config(self) -> types.GenerateContentConfig:
        tools = UI_ALL_FUNCTIONS["tools"] if self.UI_ON == "YES" else ALL_FUNCTIONS["tools"]
        return types.GenerateContentConfig(
            temperature=0,
            tools=[types.Tool(function_declarations=tools)],
            tool_config=types.ToolConfig(
                function_calling_config=types.FunctionCallingConfig(mode="ANY")
            )
        )

    def _call_gemini(self, query: str):
        try:
            response = self.client.models.generate_content(
                model='gemini-2.0-flash',
                contents=query,
                config=self.tools_config
            )
            # function_calls can be None when the model proposes no call;
            # normalise to a list so callers can always iterate over it
            return response.function_calls or []
        except Exception as e:
            print(f"[Gemini Error] {e}")
            return
[] def generate_function_calls(self, user_query: str) -> list[dict]: results = [] function_calls = self._call_gemini(user_query) for fn in function_calls: results.append({ "name": fn.name, "parameters": fn.args }) return results ================================================ FILE: src/BRAIN/local_func_call.py ================================================ from typing import Union import json import re from langchain_ollama import ChatOllama from src.FUNCTION.Tools.get_env import EnvManager from DATA.tools import ALL_FUNCTIONS, UI_ALL_FUNCTIONS, ALL_FUNCTIONS_EXAMPLE, UI_ALL_FUNCTIONS_EXAMPLE # UI_ON = load_variable("UI") # AVAILABLE_FUNCTION_NAMES_STRING = [func.get("name") for func in ALL_FUNCTIONS.get("tools")] # SYSTEM_MESSAGE = f"""You are an AI that determines the best function to call based on user input.\n\n### Available Functions:\n{ALL_FUNCTIONS["tools"] if UI_ON == "NO" else UI_ALL_FUNCTIONS["tools"]}\n\n### Instructions:\n- Choose the function name.\n- Extract necessary arguments.\n- **Respond ONLY in valid JSON format** as follows:\n\n```json\n[\n {{\n "name": "function_name_here",\n "parameters": {{\n "arg1": "value1",\n "arg2": "value2"\n }}\n }}\n]\n```\n\n### Examples:\n{ALL_FUNCTIONS_EXAMPLE if UI_ON == "NO" else UI_ALL_FUNCTIONS_EXAMPLE}\n```\n""" UI_ON = EnvManager.load_variable("UI") AVAILABLE_FUNCTION_NAMES_STRING = [func.get("name") for func in ALL_FUNCTIONS.get("tools")] # - If no function is relevant, return an empty list: [] # - If the user query involves multiple actions, respond with a list of function calls in the correct order. SYSTEM_MESSAGE = f"""You are an AI that determines the best function to call based on user input. ### Available Functions: {ALL_FUNCTIONS["tools"] if UI_ON == "NO" else UI_ALL_FUNCTIONS["tools"]} ### Instructions: - Choose the function name. - Extract necessary arguments. 
- **Respond ONLY in valid JSON format** as follows: ```json [ {{ "name": "", "arguments": {{ "arg1": "", "arg2": "" }} }} ] ``` ### Examples: {ALL_FUNCTIONS_EXAMPLE if UI_ON == "NO" else UI_ALL_FUNCTIONS_EXAMPLE} ``` """ class LocalFunctionCall: def __init__(self): self.model = EnvManager.load_variable("Function_call_model") def _load_tools(self, file_path: str) -> Union[dict, list]: try: with open(file_path, "r") as file: return json.load(file) except FileNotFoundError: print("Error: Tools configuration file not found.") return [] except json.JSONDecodeError: print("Error: Invalid JSON format.") return [] def load_tools_message(self, file_path: str) -> str: tools = self._load_tools(file_path) return json.dumps(tools, indent=2) def _parse_tool_calls(self, response: str) -> Union[list, None]: try: # Remove markdown fences response = response.replace("```json", "").replace("```", "").strip() # Extract JSON-like part match = re.search(r'\[.*?\]', response, re.DOTALL) if not match: return None json_str = match.group(0) # Replace double braces with single braces json_str = json_str.replace("{{", "{").replace("}}", "}") # Parse JSON return json.loads(json_str) except json.JSONDecodeError as e: print(f"Error parsing JSON: {e}\nOriginal string:\n{json_str}") return None def create_function_call(self, user_query: str) -> Union[list, None]: try: llm = ChatOllama(model=self.model, temperature=0) response = llm.invoke([{"role": "system", "content": SYSTEM_MESSAGE}, {"role": "user", "content": user_query}]) functional_response = self._parse_tool_calls(response.content) return [ func for func in functional_response if func.get("name", "").lower() in AVAILABLE_FUNCTION_NAMES_STRING ] if functional_response else None except Exception as e: print(f"Error creating function call: {e}") return None if __name__ == "__main__": query = "please send email send email " function_caller = LocalFunctionCall() response = function_caller.create_function_call(query) print(response) 
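A note on `_parse_tool_calls` in the file above: the non-greedy pattern `\[.*?\]` stops at the first `]`, so a reply whose arguments contain a list is cut short, and collapsing `}}` to `}` can corrupt compact nested JSON. The following is a minimal sketch of one alternative, assuming the model reply contains a JSON array somewhere in free text. `extract_tool_calls` is a hypothetical helper, not part of the repository; it relies only on the standard-library `json.JSONDecoder.raw_decode`.

```python
import json
from typing import Optional


def extract_tool_calls(text: str) -> Optional[list]:
    """Return the first JSON array found in `text`, or None if there is none.

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one complete JSON value starting at `start`
            # and ignores whatever follows it (closing fences, prose, ...).
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        # Not a valid array here; try the next "[" in the reply.
        start = text.find("[", start + 1)
    return None


if __name__ == "__main__":
    reply = 'Sure:\n```json\n[{"name": "send_email", "arguments": {"to": ["a@b.c"]}}]\n```'
    print(extract_tool_calls(reply))
```

Because the decoder consumes a whole JSON value, nested lists and objects inside the arguments survive intact, and no brace rewriting is needed.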
================================================ FILE: src/BRAIN/text_to_info.py ================================================ from langchain_ollama import ChatOllama from src.FUNCTION.Tools.get_env import EnvManager class AIResponder: def __init__(self, model_name=None): self.model = model_name or EnvManager.load_variable("Text_to_info_model") self.temperature = 0.3 def ai_response(self, prompt: str, max_token: int = 2000) -> str: """Handle creative prompts like jokes or stories.""" try: llm = ChatOllama( model=self.model, temperature=self.temperature, num_predict=max_token ) messages = [ {"role": "system", "content": "You are an intelligent AI system. Understand the user Query carefully and provide the most relevant Answer."}, {"role": "user", "content": str(prompt)} ] response = llm.invoke(messages) return response.content except Exception as e: print(f"An error occurred: {e}") return "Error occurred while processing your request." def send_to_ai(prompt): AI = AIResponder() return AI.ai_response(prompt) # Usage Example: # ai_responder = AIResponder() # result = ai_responder.ai_response("Tell me a joke.") # print(result) ================================================ FILE: src/CONVERSATION/speech_to_text.py ================================================ import speech_recognition as sr def recognize_speech(): """Recognize speech from the microphone.""" recognizer = sr.Recognizer() # recognizer.dynamic_energy_threshold = True # recognizer.energy_threshold = 30000 # recognizer.dynamic_energy_adjustment_damping = 0.010 # less more active # recognizer.dynamic_energy_ratio = 1.0 # recognizer.pause_threshold = 0.8 # recognizer.operation_timeout = None # recognizer.non_speaking_duration = 0.5 recognizer.energy_threshold = 3000 recognizer.dynamic_energy_adjustment_damping = 0.07 # less more active recognizer.dynamic_energy_ratio = 1.5 recognizer.pause_threshold = 0.7 recognizer.operation_timeout = None recognizer.non_speaking_duration = 0.6 # 
recognizer.energy_threshold = 10000 # Higher to filter out background noise # recognizer.dynamic_energy_adjustment_damping = 0.04 # Fast adjustments to changing noise # recognizer.dynamic_energy_ratio = 2.0 # Speech must be significantly louder than noise # recognizer.pause_threshold = 0.7 # Shorter pauses allowed # recognizer.operation_timeout = 10 # Stops if no speech for 10s # recognizer.non_speaking_duration = 0.6 # Ends sooner if background noise is stable try: available = sr.Microphone.list_microphone_names() if len(available) <= 1: return None with sr.Microphone() as source: if source: print("[=] Adjusting for ambient noise... Please wait.") recognizer.adjust_for_ambient_noise(source) print("[+] Listening...") audio = recognizer.listen(source) print("Recognizing...") text = recognizer.recognize_google(audio) return text except sr.UnknownValueError: print("Google Speech Recognition could not understand audio") except sr.RequestError as e: print(f"Could not request results from Google Speech Recognition service; {e}") except Exception as e: print(f"An unexpected error occurred: {e}") return None # if __name__ == "__main__": # print("Say 'start listening' to activate, or 'exit' to quit.") # listening_mode = False # while True: # spoken_text = recognize_speech() # if spoken_text: # print(f"You said: {spoken_text}") # if "hey jarvis" in spoken_text.lower(): # print("Listening mode activated.") # listening_mode = True # elif "exit" in spoken_text.lower(): # listening_mode = False # print("Exiting...") # break # if listening_mode: # print(spoken_text) ================================================ FILE: src/CONVERSATION/t_s.py ================================================ import threading import time import speech_recognition as sr recognizer = sr.Recognizer() def recognize_speech(): """Recognize speech from the microphone.""" recognizer.dynamic_energy_threshold = True recognizer.energy_threshold = 3000 recognizer.dynamic_energy_adjustment_damping = 0.010 # 
More sensitive with lower values recognizer.dynamic_energy_ratio = 1.0 recognizer.pause_threshold = 0.8 recognizer.operation_timeout = None # No timeout recognizer.non_speaking_duration = 0.5 try: available = sr.Microphone.list_microphone_names() if len(available) <= 1: print("No microphones available.") return None with sr.Microphone() as source: print("[=] Adjusting for ambient noise... Please wait.") recognizer.adjust_for_ambient_noise(source) # Adjust for noise print("[+] Listening...") audio = recognizer.listen(source) # Start listening print("Recognizing...") text = recognizer.recognize_google(audio) # Recognize speech print(f"[Recognized] {text}") return text except sr.UnknownValueError: print("Google Speech Recognition could not understand audio") except sr.RequestError as e: print(f"Could not request results from Google Speech Recognition service; {e}") except Exception as e: print(f"An unexpected error occurred: {e}") return None def listen_in_background(): """Runs speech recognition in the background.""" while True: spoken_text = recognize_speech() if spoken_text: print(f"Detected speech: {spoken_text}") time.sleep(1) # Adjust as needed # Run speech recognition in a separate thread listener_thread = threading.Thread(target=listen_in_background) listener_thread.daemon = True listener_thread.start() # Main program keeps running while background thread listens while True: time.sleep(5) # Main thread sleeps, letting the listener thread do the work ================================================ FILE: src/CONVERSATION/test_speech.py ================================================ import speech_recognition as sr def recognize_speech(): """Recognize speech from the microphone.""" recognizer = sr.Recognizer() #recognizer.dynamic_energy_threshold = True # recognizer.energy_threshold = 3000 # recognizer.dynamic_energy_adjustment_damping = 0.07 # less more active # recognizer.dynamic_energy_ratio = 1.5 # recognizer.pause_threshold = 0.6 # 
recognizer.operation_timeout = None # recognizer.non_speaking_duration = 0.5 recognizer.energy_threshold = 10000 # Higher to filter out background noise recognizer.dynamic_energy_adjustment_damping = 0.03 # Fast adjustments to changing noise recognizer.dynamic_energy_ratio = 2.0 # Speech must be significantly louder than noise recognizer.pause_threshold = 0.8 # Shorter pauses allowed recognizer.operation_timeout = 10 # Stops if no speech for 10s recognizer.non_speaking_duration = 0.4 # Ends sooner if background noise is stable with sr.Microphone() as source: print("[=] Adjusting for ambient noise... Please wait.") recognizer.adjust_for_ambient_noise(source) print("[+] Listening...") audio = recognizer.listen(source) try: print("Recognizing...") text = recognizer.recognize_google(audio) return text except sr.UnknownValueError: print("Google Speech Recognition could not understand audio") return None except sr.RequestError as e: print(f"Could not request results from Google Speech Recognition service; {e}") return None # if __name__ == "__main__": # print("Say 'start listening' to activate, or 'exit' to quit.") # listening_mode = False # full_text = "" # To store the complete speech input # while True: # spoken_text = recognize_speech() # if spoken_text: # print(f"You said: {spoken_text}") # if "hey jarvis" in spoken_text.lower(): # print("Listening mode activated.") # listening_mode = True # continue # elif "stop" in spoken_text.lower(): # listening_mode = False # print("Listening mode deactivated...") # Will still allow the program to continue running # continue # if listening_mode: # full_text += spoken_text + " " # Append the recognized speech # print(f"Current text: {full_text}") # if spoken_text.lower() == "exit": # print("Exiting...") # break # Option to exit the program when 'exit' is said ================================================ FILE: src/CONVERSATION/text_speech.py ================================================ from random import randint from gtts 
import gTTS import tempfile import os import base64 def text_to_speech_local(text, lang="en"): """Converts text to speech using gTTS and returns the MP3 audio as a base64-encoded string.""" # Generate TTS audio using gTTS tts = gTTS(text=text, lang=lang) # Create a temporary file to store audio as MP3 with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as temp_audio: audio_path = temp_audio.name # Save audio to the temporary file tts.save(audio_path) # Read the audio content as bytes with open(audio_path, "rb") as audio_file: audio_bytes = audio_file.read() # Delete the temporary file after reading os.remove(audio_path) audio_base64 = base64.b64encode(audio_bytes).decode('utf-8') return audio_base64 if __name__ == "__main__": audio_io = text_to_speech_local("hello how are you.") print(audio_io) ================================================ FILE: src/CONVERSATION/text_to_speech.py ================================================ import pyttsx3 from random import randint def speak(text:str) -> None: try: # Initialize the TTS engine engine = pyttsx3.init() # Set the speaking rate try: rate = engine.getProperty('rate') engine.setProperty('rate', 154) # Setting up a new speaking rate except Exception as e: print(f"Error setting rate: {e}") # Set the volume try: volume = engine.getProperty('volume') engine.setProperty('volume', 1.0) # Setting volume level between 0 and 1 except Exception as e: print(f"Error setting volume: {e}") # Set the voice try: voices = engine.getProperty('voices') # print("Available Voices on macOS:") # for idx, v in enumerate(voices): # print(f"{idx}: Name: {v.name}, ID: {v.id}, Languages: {v.languages}") choice = randint(0,1) if len(voices) > 99: if choice == 1: engine.setProperty('voice', voices[99].id) # Set the voice by index else: engine.setProperty('voice', voices[99].id) else: engine.setProperty('voice', voices[0].id) except Exception as e: print(f"Error setting voice: {e}") # Speak the text 
try: engine.say(text) engine.runAndWait() except Exception as e: print(f"Error speaking text: {e}") finally: engine.stop() except Exception as e: print(f"Error initializing TTS engine: {e}") if __name__ == "__main__": #text = "Welcome back, sir. Jar vis is online." text = "Welcome back, sir. Jar-vis is online." speak(text) ================================================ FILE: src/CONVERSATION/voice_text.py ================================================ import speech_recognition as sr def voice_to_text(temp_audio): recognizer = sr.Recognizer() with sr.AudioFile(temp_audio) as source: audio_data = recognizer.record(source) try: transcribed_text = recognizer.recognize_google(audio_data) return transcribed_text except sr.UnknownValueError: print("Could not understand the audio") except sr.RequestError: print("Could not request results, check your internet connection") ================================================ FILE: src/FUNCTION/Tools/Email_send.py ================================================ import smtplib, ssl from email.mime.multipart import MIMEMultipart from email.mime.text import MIMEText from src.FUNCTION.Tools.get_env import EnvManager from DATA.email_schema import email_prompts from src.BRAIN.text_to_info import send_to_ai class EmailSender: def __init__(self): # Load email credentials from environment variables self.smtp_server = "smtp.gmail.com" self.port = 587 self.password = EnvManager.load_variable("Password_email") self.receiver_email = EnvManager.load_variable("Reciever_email") self.sender_email = EnvManager.load_variable("Sender_email") def initate_email(self, subject: str, email_content: str) -> bool: """Send an email with the provided subject and content.""" html_content = f"""

{email_content}

""" try: # Prepare the email msg = MIMEMultipart() msg['From'] = self.sender_email msg['To'] = self.receiver_email msg['Subject'] = subject.strip() msg.attach(MIMEText(html_content, 'html')) # Set up the secure SSL context and send the email context = ssl.create_default_context() with smtplib.SMTP(self.smtp_server, self.port) as server: server.starttls(context=context) server.login(self.sender_email, self.password) server.sendmail(self.sender_email, self.receiver_email, msg.as_string()) print("Email sent successfully...") except Exception as e: print(f"Error: {e}") return False return True def email_content(self) -> dict: """Generate and send an automated email based on selected template.""" select_template = input("Select an email template (job, friend, meeting, doctor, leave, product): ") if select_template not in email_prompts: print("[+] Invalid template selection.") return {} # Fetch the selected template template = email_prompts[select_template] placeholders = {} # Collect placeholder values for the email template for placeholder in template['prompt'].split('{')[1:]: placeholder_key = placeholder.split('}')[0] value = input(f"Enter value for '{placeholder_key}': ").strip() if not value: print(f"Value for '{placeholder_key}' cannot be empty.") return {} placeholders[placeholder_key] = value # Format the prompt with placeholders formatted_prompt = template['prompt'].format(**placeholders) # Display the prompt for review print("----- Start prompt -----") print(formatted_prompt) print("----- End prompt -----") # Generate email content using AI email_prompt = "You are a professional email writer. Write an email based on the provided content in less than 20 words." complete_prompt = f"{email_prompt}\n{formatted_prompt}" response = send_to_ai(complete_prompt).strip() # Generate email subject using AI sub_prompt = f"Give a suitable subject for the given email: {response}. Use 3-4 words max." 
subject = send_to_ai(sub_prompt).strip() # Return the generated subject and content return {'subject': subject, 'content': response} def send_email(): email_sender = EmailSender() email_details = email_sender.email_content() if email_details: subject = email_details['subject'] content = email_details['content'] flag = email_sender.initate_email(subject, content) return flag return False # Example Usage: if __name__ == "__main__": email_sender = EmailSender() # Generate and send the email email_details = email_sender.email_content() if email_details: subject = email_details['subject'] content = email_details['content'] email_sender.initate_email(subject, content) ================================================ FILE: src/FUNCTION/Tools/app_op.py ================================================ from src.FUNCTION.Tools.get_env import AppManager , EnvManager #load_app, check_os from os import system class AppRunner: def __init__(self, name: str): self.name = name self.os_name = EnvManager.check_os() self.path = AppManager().load_app(name) def start_app(self) -> bool: """Start the app based on the operating system.""" if self.os_name == "Linux": system(f'"{self.path}"') elif self.os_name == "Darwin": system(f'open "{self.path}"') elif self.os_name == "Windows": # The first quoted argument to `start` is the window title, so pass an empty one system(f'start "" "{self.path}"') else: print("Unsupported operating system.") return False return True def run(self) -> str: """Runs the application and returns a message indicating success or failure.""" if self.start_app(): return f"{self.name} is running now." return f"Oops, some error occurred in opening {self.name}." 
def app_runner(name:str) -> str: run_app = AppRunner(name) result = run_app.run() return result # Example usage: if __name__ == "__main__": app_name = input("Enter the name of the app to open: ") app_runner = AppRunner(app_name) result = app_runner.run() print(result) ================================================ FILE: src/FUNCTION/Tools/code_exec.py ================================================ import sys import subprocess import importlib import os from io import StringIO class CodeExecutor: def __init__(self, required_libraries=None): """Initialize the class with a list of required libraries.""" self.required_libraries = required_libraries or ['pandas', 'numpy', 'matplotlib'] self.flag_file = './DATA/libraries_installed.txt' def get_pip_command(self): """Returns the appropriate pip command based on the operating system.""" if sys.platform == "win32": return "pip" # Windows uses pip else: return "pip3" # macOS/Linux uses pip3 def check_and_install_libraries(self): """ Checks if required libraries are installed and installs any missing ones. """ # Step 1: Check if the flag file exists if os.path.exists(self.flag_file) and os.path.getsize(self.flag_file) != 0: print("Libraries are already installed.") return # Libraries are already installed, skip installation # If flag file does not exist, install missing libraries for library in self.required_libraries: try: importlib.import_module(library) except ImportError: print(f"Library '{library}' not found. Installing...") pip_command = self.get_pip_command() subprocess.check_call([sys.executable, "-m", "pip", "install", library]) # Step 2: Create the flag file to indicate libraries are installed with open(self.flag_file, 'w') as f: f.write("Libraries installed successfully.") def execute_code(self, code): """ Executes the dynamically generated code and ensures that required libraries are installed. Args: code (str): The Python code to execute. Returns: dict: Dictionary containing the result or error message. 
""" # Step 1: Install missing libraries if required self.check_and_install_libraries() # Initialize result dictionary result = { 'output': None, 'error': None, } # Step 2: Capture the output of the code execution old_stdout = sys.stdout sys.stdout = StringIO() try: # Step 3: Execute the code in a clean context (no interference from the global scope) exec_globals = {} exec_locals = {} exec(code, exec_globals, exec_locals) # Step 4: Capture the output from print statements or results output = sys.stdout.getvalue() # Step 5: Return the result or output if not output: result['output'] = str(exec_locals) # If no output, return the result of execution else: result['output'] = output except SyntaxError as e: result['error'] = f"Syntax Error: {e.msg} on line {e.lineno}" except NameError as e: result['error'] = f"Name Error: {e.args}" except Exception as e: result['error'] = f"Error during execution: {str(e)}" finally: sys.stdout = old_stdout # Restore the original stdout return result # Example usage: if __name__ == "__main__": executor = CodeExecutor() # Initialize with default libraries code_to_run = """ import pandas as pd import numpy as np data = pd.DataFrame({'a': np.random.randn(100), 'b': np.random.randn(100)}) print(data.head()) """ result = executor.execute_code(code_to_run) if result['error']: print(f"Error: {result['error']}") else: print(f"Output:\n{result['output']}") ================================================ FILE: src/FUNCTION/Tools/get_env.py ================================================ from dotenv import load_dotenv from os import environ from typing import Union import json import os import platform from fuzzywuzzy import process from DATA.Domain import websites import shutil # Load environment variables from .env file load_dotenv() APP_JSON_PATH = "./DATA/app.json" class EnvManager: """Handles environment variable loading.""" @staticmethod def load_variable(variable_name: str) -> Union[str, None]: """Load environment variable.""" try: variable 
= environ.get(variable_name.strip()) return variable except Exception as e: print(f"Error: {e}") return None @staticmethod def check_os() -> str: """Check the operating system and return the name.""" os_name = platform.system() if os_name == "Windows": return "Windows" elif os_name == "Darwin": return "Darwin" elif os_name == "Linux": return "Linux" else: return "Unknown" class AppManager: """Handles application management tasks such as checking OS, installed apps, and updating the app list.""" @staticmethod def is_app_installed(path: str) -> bool: """Check if an application is installed by verifying its path.""" # Check if it's in the system PATH (for built-in apps like Notepad, Calculator) if shutil.which(path): return True # Check if the direct path exists (for installed applications) return os.path.exists(path) @staticmethod def get_url(website_name: str) -> str: """Retrieve website URL with exact or fuzzy matching.""" if not website_name: print("❌ Website name cannot be empty.") return "" # Normalize input website_name = website_name.strip().lower() # Exact match if website_name in websites: return websites[website_name] # Fuzzy matching closest_match, score = process.extractOne(website_name, websites.keys()) if score >= 80: return websites[closest_match] print(f"❌ Website '{website_name}' not found.") return "" @staticmethod def get_app_path(app_name, app_data): """Retrieve app path with exact match and fuzzy matching.""" # ✅ Strip and lowercase app_name (just in case) app_name = app_name.strip().lower() # ✅ Check for exact match (since app_data is already normalized) if app_name in app_data and AppManager.is_app_installed(app_data.get(app_name)): return app_data.get(app_name) # ✅ Fuzzy match for closest name in normalized keys closest_match, score = process.extractOne(app_name, app_data.keys()) # ✅ Set a threshold for match confidence (e.g., 80); check the matched app's path, not its name if score >= 80 and AppManager.is_app_installed(app_data[closest_match]): # High confidence match return 
app_data[closest_match] # if no app found, fallback to website search link = AppManager.get_url(app_name) if link: return link # ❌ No match found print(f"❌ Application or web '{app_name}' not found.") return "" @staticmethod def load_app(app_name: str) -> str: """Load the path of the application from app.json.""" if not os.path.exists(APP_JSON_PATH) or os.path.getsize(APP_JSON_PATH) == 0: print("app.json is empty or does not exist. Fetching apps...") AppManager.update_app_list() with open(APP_JSON_PATH, "r", encoding="utf-8") as f: app_data = json.load(f) return AppManager.get_app_path(app_name , app_data) @staticmethod def update_app_list(): """Get the list of installed apps and store them in app.json.""" os_name = EnvManager.check_os() apps = [] if os_name == "Windows": apps = AppManager.get_installed_apps_windows() elif os_name == "Darwin": apps = AppManager.get_installed_apps_mac() elif os_name == "Linux": apps = AppManager.get_installed_apps_linux() # Store in app.json app_dict = {app["name"].lower(): app["path"] for app in apps} with open(APP_JSON_PATH, "w", encoding="utf-8") as f: json.dump(app_dict, f, indent=4) print(f"{len(app_dict)} applications found and stored in app.json.") @staticmethod def get_installed_apps_windows(): """Get installed applications on Windows.""" import winreg apps = [] reg_paths = [ r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall", r"SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall", ] for reg_path in reg_paths: try: reg_key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, reg_path) for i in range(winreg.QueryInfoKey(reg_key)[0]): try: subkey_name = winreg.EnumKey(reg_key, i) subkey = winreg.OpenKey(reg_key, subkey_name) name, _ = winreg.QueryValueEx(subkey, "DisplayName") path, _ = winreg.QueryValueEx(subkey, "InstallLocation") if name and path: apps.append({"name": name, "path": path}) except (FileNotFoundError, OSError, ValueError): continue except FileNotFoundError: continue return apps @staticmethod def 
get_installed_apps_mac(): """Get installed applications on macOS.""" app_paths = [ "/Applications", os.path.expanduser("~/Applications"), "/System/Applications", # System apps ] apps = [] for path in app_paths: if os.path.exists(path): for app in os.listdir(path): if app.endswith(".app"): apps.append({"name": app.replace(".app", ""), "path": os.path.join(path, app)}) return apps @staticmethod def get_installed_apps_linux(): """Get installed applications on Linux.""" import glob app_paths = ["/usr/share/applications", os.path.expanduser("~/.local/share/applications")] apps = [] for path in app_paths: if os.path.exists(path): for file in glob.glob(f"{path}/*.desktop"): with open(file, "r", encoding="utf-8", errors="ignore") as f: lines = f.readlines() name, exec_path = None, None for line in lines: if line.startswith("Name="): name = line.split("=", 1)[1].strip() elif line.startswith("Exec="): exec_path = line.split("=", 1)[1].strip() if name and exec_path: apps.append({"name": name, "path": exec_path.split()[0]}) return apps ================================================ FILE: src/FUNCTION/Tools/greet_time.py ================================================ from datetime import datetime class TimeOfDay: def __init__(self): self.current_hour = datetime.now().hour def time_of_day(self) -> str: if 5 <= self.current_hour < 12: return "Good morning sir!" elif 12 <= self.current_hour < 17: return "Good afternoon sir!" elif 17 <= self.current_hour < 21: return "Good evening sir!" else: return "Good night sir!" 
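`TimeOfDay` in greet_time.py reads the clock once in `__init__`, which makes its hour boundaries awkward to check. Below is a minimal sketch, assuming the same hour ranges as the class above, that separates the mapping from the clock. `greeting_for_hour` is a hypothetical helper, not part of the repository.

```python
from datetime import datetime
from typing import Optional


def greeting_for_hour(hour: Optional[int] = None) -> str:
    """Map an hour (0-23) to a greeting; defaults to the current local hour.

    Hypothetical helper for illustration only.
    """
    if hour is None:
        hour = datetime.now().hour
    if 5 <= hour < 12:
        return "Good morning sir!"
    if 12 <= hour < 17:
        return "Good afternoon sir!"
    if 17 <= hour < 21:
        return "Good evening sir!"
    # Everything from 21:00 up to 04:59 falls through to here.
    return "Good night sir!"


if __name__ == "__main__":
    print(greeting_for_hour())
```

Passing the hour explicitly lets each boundary (5, 12, 17, 21) be exercised without waiting for the wall clock.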
================================================ FILE: src/FUNCTION/Tools/incog.py ================================================ import os from subprocess import run, DEVNULL from src.FUNCTION.Tools.get_env import EnvManager class PrivateModeOpener: def __init__(self, topic: str): self.topic = topic self.search_url = f"https://www.google.com/search?q={self.topic}" self.os_name = EnvManager.check_os() def open_chrome_incognito(self) -> None: """Open Chrome in Incognito mode (Windows).""" possible_paths = [ r"C:\Program Files\Google\Chrome\Application\chrome.exe", r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", ] chrome_path = next((path for path in possible_paths if os.path.exists(path)), None) if chrome_path: run([chrome_path, "--incognito", self.search_url], stdout=DEVNULL, stderr=DEVNULL) else: print("❌ Chrome not found. Please install Chrome or use another browser.") def open_firefox_private(self) -> None: """Open Firefox in Private mode (Windows).""" possible_paths = [ r"C:\Program Files\Mozilla Firefox\firefox.exe", r"C:\Program Files (x86)\Mozilla Firefox\firefox.exe", ] firefox_path = next((path for path in possible_paths if os.path.exists(path)), None) if firefox_path: run([firefox_path, "-private-window", self.search_url], stdout=DEVNULL, stderr=DEVNULL) else: print("❌ Firefox not found. Trying Edge...") self.open_edge_private() def open_edge_private(self) -> None: """Open Microsoft Edge in InPrivate mode (Windows).""" edge_path = r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe" if os.path.exists(edge_path): run([edge_path, "--inprivate", self.search_url], stdout=DEVNULL, stderr=DEVNULL) else: print("❌ Edge not found. 
Please install a supported browser.") def linux_firefox(self) -> None: """Open Firefox in Private mode on Linux.""" run(["firefox", "--private-window", self.search_url], stdout=DEVNULL, stderr=DEVNULL) def incog_mode_mac(self) -> None: """Open Safari in Private mode on macOS using AppleScript.""" applescript_code = f''' tell application "Safari" activate tell application "System Events" keystroke "n" using {{command down, shift down}} -- Open Private Window end tell delay 1 -- Give time to open Private Window tell window 1 set current tab to (make new tab with properties {{URL:"{self.search_url}"}}) end tell end tell ''' run(['osascript', '-e', applescript_code], stdout=DEVNULL, stderr=DEVNULL) def open_in_private_mode(self) -> None: """Open the specified topic in private/incognito mode.""" if self.os_name == "Linux": self.linux_firefox() elif self.os_name == "Darwin": # macOS self.incog_mode_mac() elif self.os_name == "Windows": try: self.open_chrome_incognito() except Exception: try: self.open_firefox_private() except Exception: self.open_edge_private() else: print("❌ Unsupported Operating System.") return "Error occurred in opening in private mode" return "Your browser is ready in private mode." 
# Convenience wrapper def private_mode(topic:str) -> str: private_mode_opener = PrivateModeOpener(topic) result = private_mode_opener.open_in_private_mode() return result ================================================ FILE: src/FUNCTION/Tools/internet_search.py ================================================ from src.BRAIN.text_to_info import send_to_ai from duckduckgo_search import DDGS class DuckGoSearch: def __init__(self, query: str): self.query = query def search_query(self) -> str: """Search the provided query on DuckDuckGo for quick information.""" results = DDGS().text(self.query, max_results=3) results_body = [info.get("body", "").strip() for info in results if info.get("body")] full_result = "\n".join(results_body) return full_result def generate_answer(self, search_results: str) -> str: """Generate an answer based on search results using AI.""" prompt = ( "Analyze the following search results carefully and extract the most relevant information. " "Provide a concise and accurate answer to the given query." 
"\n\n=== Search Results ===\n" f"{search_results}\n" "=====================\n" f"Query: {self.query}\n" "Your Response:" ) answer = send_to_ai(prompt) return answer def execute_search(self) -> str: """Execute the DuckDuckGo search and get the response.""" search_results = self.search_query() answer = self.generate_answer(search_results) return answer def duckgo_search(query:str) -> str: duck_search = DuckGoSearch(query) answer = duck_search.execute_search() return answer ================================================ FILE: src/FUNCTION/Tools/link_op.py ================================================ import webbrowser def search_youtube(topic:str) -> None: """Search YouTube for a specific topic.""" format_topic = "+".join(topic.split()) link = f"https://www.youtube.com/results?search_query={format_topic}" try: webbrowser.open(link) except Exception as e: return f"Error occured in youtube search" return f"Your {topic} search results are ready." ================================================ FILE: src/FUNCTION/Tools/news.py ================================================ import requests from src.FUNCTION.Tools.get_env import EnvManager from typing import Union class NewsHeadlines: def __init__(self, top: int = 10, country: str = "india"): self.top = top self.country = country self.api_key = EnvManager.load_variable("News_api") def fetch_headlines(self) -> Union[list[str], None]: """Fetch top news headlines.""" headlines = [] url = ( f'https://newsapi.org/v2/top-headlines?' 
            # The hard-coded from/to/sortBy parameters were dropped: they belong to the
            # /v2/everything endpoint and pinned every request to a single past date.
            f'q={self.country}&'
            f'apiKey={self.api_key}'
        )
        try:
            response = requests.get(url, timeout=10).json()
            all_articles = response['articles']
            # Slice rather than index up to totalResults: a response holds a single
            # page of articles, which can be shorter than totalResults.
            for article in all_articles[:self.top]:
                headlines.append(article['title'])
            return "\n".join(headlines)
        except Exception as e:
            print(f"Error: {e}")
            return None


def news_headlines(top=5):
    # Usage Example:
    news = NewsHeadlines(top)
    headlines = news.fetch_headlines()
    return headlines

================================================
FILE: src/FUNCTION/Tools/phone_call.py
================================================
import subprocess
from DATA.phone_details import PHONE_DIR
from src.FUNCTION.Tools.get_env import EnvManager


class ADBConnect:
    def __init__(self):
        self.L_PATH_ADB = "./src/FUNCTION/adb_connect.sh"
        self.W_PATH_ADB = "./src/FUNCTION/adb_connect.bat"

    def adb_connect(self):
        """Establish ADB connection over Wi-Fi."""
        os_name = EnvManager.check_os()
        try:
            if os_name == "Windows":
                subprocess.run(self.W_PATH_ADB, shell=True, check=True)
            else:
                subprocess.run(['bash', self.L_PATH_ADB], check=True)
        except subprocess.CalledProcessError:
            return "❌ Failed to run ADB connect script."

        connected_devices = subprocess.run(['adb', 'devices'], capture_output=True, text=True)
        devices = connected_devices.stdout.strip().split('\n')[1:]
        if not any("device" in line for line in devices):
            return "❌ No device connected! Ensure ADB is running and the phone is connected."
        return True


class PhoneCall:
    def __init__(self, adb_connect_instance: ADBConnect):
        self.adb_connect_instance = adb_connect_instance

    def start_a_call(self, name: str) -> str:
        adb_response = self.adb_connect_instance.adb_connect()
        if adb_response is not True:
            return adb_response
        name = name.lower().strip()
        mobileNo = PHONE_DIR.get(name)
        if not mobileNo:
            subprocess.run(['adb', 'disconnect'], check=True)
            return f"❌ Contact '{name}' not found!"
        try:
            subprocess.run(['adb', 'shell', 'am', 'start', '-a', 'android.intent.action.CALL', '-d', f'tel:{mobileNo}'], check=True)
        except subprocess.CalledProcessError:
            return f"❌ Failed to initiate call to {name}."

        # Optionally keep Wi-Fi connection alive by skipping disconnect here
        subprocess.run(['adb', 'disconnect'], check=True)
        return f"📞 Phone call initiated to your friend **{name.capitalize()}**!"


# Usage Example:
def make_a_call(name: str) -> str:
    adb_instance = ADBConnect()
    phone_call_instance = PhoneCall(adb_instance)
    # PhoneCall defines start_a_call(); the class has no make_a_call() method.
    response = phone_call_instance.start_a_call(name)
    return response

================================================
FILE: src/FUNCTION/Tools/random_respon.py
================================================
from random import choice


class RandomChoice:
    @staticmethod
    def random_choice(data: list) -> str:
        return choice(data)


# Usage Example:
# data_list = ["apple", "banana", "cherry", "date"]
# random_item = RandomChoice.random_choice(data_list)  # static method, no instance needed
# print(random_item)

================================================
FILE: src/FUNCTION/Tools/searxsearch.py
================================================
import requests
from bs4 import BeautifulSoup
import os

if __name__ == "__main__":
    from tools import Tools
else:
    from sources.tools.tools import Tools


class searxSearch(Tools):
    def __init__(self, base_url: str = None):
        """
        A tool for searching a SearxNG instance and extracting URLs and titles.
""" super().__init__() self.tag = "web_search" self.base_url = base_url or os.getenv("SEARXNG_BASE_URL") # Requires a SearxNG base URL self.user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36" self.paywall_keywords = [ "Member-only", "access denied", "restricted content", "404", "this page is not working" ] if not self.base_url: raise ValueError("SearxNG base URL must be provided either as an argument or via the SEARXNG_BASE_URL environment variable.") def link_valid(self, link): """check if a link is valid.""" # TODO find a better way if not link.startswith("http"): return "Status: Invalid URL" headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"} try: response = requests.get(link, headers=headers, timeout=5) status = response.status_code if status == 200: content = response.text.lower() if any(keyword in content for keyword in self.paywall_keywords): return "Status: Possible Paywall" return "Status: OK" elif status == 404: return "Status: 404 Not Found" elif status == 403: return "Status: 403 Forbidden" else: return f"Status: {status} {response.reason}" except requests.exceptions.RequestException as e: return f"Error: {str(e)}" def check_all_links(self, links): """Check all links, one by one.""" # TODO Make it asyncromous or smth statuses = [] for i, link in enumerate(links): status = self.link_valid(link) statuses.append(status) return statuses def execute(self, blocks: list, safety: bool = False) -> str: """Executes a search query against a SearxNG instance using POST and extracts URLs and titles.""" if not blocks: return "Error: No search query provided." query = blocks[0].strip() if not query: return "Error: Empty search query provided." 
search_url = f"{self.base_url}/search" headers = { 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7', 'Accept-Language': 'en-US,en;q=0.9', 'Cache-Control': 'no-cache', 'Connection': 'keep-alive', 'Content-Type': 'application/x-www-form-urlencoded', 'Pragma': 'no-cache', 'Upgrade-Insecure-Requests': '1', 'User-Agent': self.user_agent } data = f"q={query}&categories=general&language=auto&time_range=&safesearch=0&theme=simple".encode('utf-8') try: response = requests.post(search_url, headers=headers, data=data, verify=False) response.raise_for_status() html_content = response.text soup = BeautifulSoup(html_content, 'html.parser') results = [] for article in soup.find_all('article', class_='result'): url_header = article.find('a', class_='url_header') if url_header: url = url_header['href'] title = article.find('h3').text.strip() if article.find('h3') else "No Title" description = article.find('p', class_='content').text.strip() if article.find('p', class_='content') else "No Description" results.append(f"Title:{title}\nSnippet:{description}\nLink:{url}") if len(results) == 0: return "No search results, web search failed." return "\n\n".join(results) # Return results as a single string, separated by newlines except requests.exceptions.RequestException as e: raise Exception("\nSearxng search failed. did you run start_services.sh? is docker still running?") from e def execution_failure_check(self, output: str) -> bool: """ Checks if the execution failed based on the output. """ return "Error" in output def interpreter_feedback(self, output: str) -> str: """ Feedback of web search to agent. 
""" if self.execution_failure_check(output): return f"Web search failed: {output}" return f"Web search result:\n{output}" if __name__ == "__main__": search_tool = searxSearch(base_url="http://127.0.0.1:8080") result = search_tool.execute(["are dog better than cat?"]) print(result) ================================================ FILE: src/FUNCTION/Tools/weather.py ================================================ from geopy.geocoders import Nominatim from requests import get from src.FUNCTION.Tools.get_env import EnvManager from src.BRAIN.text_to_info import send_to_ai class WeatherService: def __init__(self, city: str): self.city = city self.api_key = EnvManager.load_variable("Weather_api") self.geolocator = Nominatim(user_agent="your_app_name") self.latitude, self.longitude = self.get_lat_lng(city) def get_lat_lng(self, name: str) -> tuple: """Get the latitude and longitude of a given place.""" location = self.geolocator.geocode(name) if location: latitude = round(location.latitude, 3) longitude = round(location.longitude, 3) return latitude, longitude return 0, 0 def weather_data(self) -> str: """Get the current weather report for the city.""" report = {} url = "https://weatherapi-com.p.rapidapi.com/current.json" querystring = {"q": f"{self.latitude},{self.longitude}"} headers = { "x-rapidapi-key": self.api_key, "x-rapidapi-host": "weatherapi-com.p.rapidapi.com" } response = get(url, headers=headers, params=querystring) all_data = response.json() report["datetime"] = all_data["current"]["last_updated"] report["temp"] = all_data["current"]["temp_c"] report["condition"] = all_data["current"]["condition"]["text"] report["wind"] = all_data["current"]["wind_kph"] report["humidity"] = all_data["current"]["humidity"] report["cloud"] = all_data["current"]["cloud"] report["feels_like"] = all_data["current"]["feelslike_c"] report["uv"] = all_data["current"]["uv"] if report: summarize_text = send_to_ai(f"{report} please summarize given data in less than 20 words without 
using numerical data.")
            return summarize_text
        return "No weather data found."


# Usage example:
def weather_report(city) -> str:
    weather_service = WeatherService(city)
    # WeatherService exposes weather_data(); it has no weather_report() method.
    weather_summary = weather_service.weather_data()
    return weather_summary

================================================
FILE: src/FUNCTION/Tools/youtube_downloader.py
================================================
import re
from pytube import YouTube
from os import makedirs
from src.FUNCTION.Tools.get_env import EnvManager


class YouTubeDownloader:
    def __init__(self, url: str):
        self.url = url
        self.save_path = EnvManager.load_variable("Yt_path")
        if not self.save_path:
            raise ValueError("Error: No save path specified.")

    def extract_id(self) -> str:
        """Extracts the video ID from the YouTube URL."""
        regex = r"(?:youtube\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/" \
                r"|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})"
        match = re.search(regex, self.url)
        if not match:
            raise ValueError("Invalid YouTube URL")
        return match.group(1)

    def download_video(self) -> str:
        """Download a YouTube video."""
        # Extract video ID from URL
        try:
            video_id = self.extract_id()
        except ValueError as e:
            return str(e)

        # Create filename
        file_name = f"{video_id}.mp4"

        # Download video
        try:
            # Create the save path (including parent directories) if it doesn't exist
            makedirs(self.save_path, exist_ok=True)

            # Get the highest resolution video stream
            video = YouTube(self.url)
            highest_resolution_stream = video.streams.order_by('resolution').desc().first()

            # Download the video
            highest_resolution_stream.download(output_path=self.save_path, filename=file_name)
            return f"Downloaded video: {file_name}"
        except Exception as e:
            print(f"Error downloading video: {e}")
            return f"Error occurred while downloading video: {e}"


# Usage example:
def yt_downloader(url: str) -> str:
    downloader = YouTubeDownloader(url)
    result = downloader.download_video()
    return result

================================================
FILE:
src/FUNCTION/run_function.py ================================================ from src.FUNCTION.Tools.link_op import search_youtube from src.FUNCTION.Tools.weather import weather_report from src.FUNCTION.Tools.news import news_headlines from src.FUNCTION.Tools.youtube_downloader import yt_downloader from src.FUNCTION.Tools.app_op import app_runner from src.BRAIN.text_to_info import send_to_ai from src.FUNCTION.Tools.incog import private_mode from src.FUNCTION.Tools.Email_send import send_email from src.FUNCTION.Tools.phone_call import make_a_call from src.FUNCTION.Tools.internet_search import duckgo_search from typing import Union from datetime import datetime class FunctionExecutor: def __init__(self): self.function_map = { 'search_youtube': search_youtube, 'weather_report': weather_report, 'news_headlines': news_headlines, 'yt_download': yt_downloader, 'app_runner': app_runner, 'send_to_ai': send_to_ai, 'private_mode': private_mode, 'send_email': send_email, 'make_a_call': make_a_call, 'duckgo_search': duckgo_search, } def execute(self, function_call: dict) -> Union[None, dict, list]: """ Execute a function based on the function call dictionary :param function_call: Dictionary with 'name' and 'parameters' keys :return: Dictionary with status and result of function execution """ output = None func_name = function_call.get('name') args = function_call.get('parameters') try: if not func_name: return None func = self.function_map.get(func_name) if not func: print("[!] No matching function found.") return None if args: all_parameters = [args[k] for k in args.keys()] output = func(*all_parameters) else: print("[*] No parameters provided.") output = func() except KeyError as e: print(f"[!] Missing key in function call: {e}") except Exception as e: print(f"[!] 
Error executing function: {e}")

        return {
            "status": "success" if output is not None else "failed",
            "function_name": func_name,
            "args": args,
            "output": output,
            "timestamp": datetime.utcnow().isoformat() + "Z"
        }

================================================
FILE: src/KEYBOARD/key_lst.py
================================================
from pynput.keyboard import Key, Listener

# Flag to track recording status
is_recording = False


def on_press(key):
    global is_recording
    # Compare against the Key enum. Character keys arrive as KeyCode objects
    # and simply do not match, so no AttributeError handling is needed.
    if key == Key.up:
        if not is_recording:
            is_recording = True
            print("Recording started...")


def on_release(key):
    global is_recording
    if key == Key.up:
        if is_recording:
            is_recording = False
            print("Recording stopped.")
            return False  # Stop listener after key release


if __name__ == "__main__":
    print("Press and hold the 'up' key to start recording. Release it to stop recording.")
    # Start listening for key events
    with Listener(on_press=on_press, on_release=on_release) as listener:
        listener.join()

================================================
FILE: src/KEYBOARD/key_prs_lst.py
================================================
import pynput.keyboard as keyboard
import speech_recognition as sr
import threading

# Global flags and variables
is_listening = False
recognized_text = ""  # Variable to store recognized text
recognizer = sr.Recognizer()


def recognize_speech():
    """Recognize speech from the microphone continuously."""
    global recognized_text
    with sr.Microphone() as source:
        print("Listening for speech...")
        recognizer.adjust_for_ambient_noise(source)  # To adjust for ambient noise
        while is_listening:  # Continuously listen while 'up' key is pressed
            try:
                audio = recognizer.listen(source, timeout=10)  # Timeout to prevent endless listening
                print("Recognizing...")
                recognized_text = recognizer.recognize_google(audio)  # Store recognized text globally
                print(f"Recognized: {recognized_text}")
            except sr.UnknownValueError:
                print("Google Speech Recognition could
not understand audio") except sr.RequestError as e: print(f"Could not request results from Google Speech Recognition service; {e}") except sr.WaitTimeoutError: print("Listening timed out.") def on_press(key): """Handles key press events.""" global is_listening try: # Only start recording if the 'up' key is pressed and we're not already listening if key == keyboard.Key.up and not is_listening: is_listening = True print("Recording started... Listening for speech.") threading.Thread(target=recognize_speech, daemon=True).start() # Start speech recognition in a new thread except AttributeError: pass def on_release(key): """Handles key release events.""" global is_listening if key == keyboard.Key.up and is_listening: # Release 'up' to stop recording is_listening = False print("Recording stopped.") print(f"Final recorded text: {recognized_text}") # Output the final recognized text when stopped if key == keyboard.Key.esc: # Press 'esc' to exit the program return False if __name__ == "__main__": print("Press and hold the 'up' key to start recording. Release it to stop recording.") try: with keyboard.Listener(on_press=on_press, on_release=on_release) as listener: listener.join() except KeyboardInterrupt: print("\nProgram interrupted. 
Exiting...") ================================================ FILE: src/VISION/gem_eye.py ================================================ import base64 import io import os import json import cv2 import requests import numpy as np from PIL import Image, ImageDraw from google import genai from src.FUNCTION.Tools.get_env import EnvManager import re class ImageProcessor: def __init__(self, image_path="captured_image.png", model_name="gemini-2.5-flash"): self.image_path = image_path self.require_width = 336 self.require_height = 336 self.model = model_name # Set model name here self.client = genai.Client(api_key=EnvManager.load_variable("genai_key")) # ---------- Image capture and resizing ---------- def resize_image(self, require_width=None, require_height=None) -> bool: require_width = require_width or self.require_width require_height = require_height or self.require_height try: with Image.open(self.image_path) as img: width, height = img.size if height <= require_height and width <= require_width: return True img = img.resize((require_width, require_height), Image.ANTIALIAS) img.save(self.image_path) print(f"Image saved to {self.image_path}, size: {require_width}x{require_height}") except Exception as e: print(e) return False return True def capture_image_and_save(self) -> str | None: cap = cv2.VideoCapture(0) if not cap.isOpened(): print("Error: Could not open camera.") return None try: ret, frame = cap.read() if ret: cv2.imwrite(self.image_path, frame) print(f"Image captured and saved as {self.image_path}") return self.image_path else: print("Error: Could not capture image.") return None finally: cap.release() cv2.destroyAllWindows() # ---------- Basic detection ---------- def detect_image(self , query:str) -> str | None: """Detect content using the set model.""" if not query: query = "What is this image?" 
try: image = Image.open(self.image_path) response = self.client.models.generate_content( model=self.model, contents=[query, image] ) return response.text except Exception as e: print(f"Error: {e}") return None # ---------- Object detection ---------- def detect_objects(self, prompt="Detect all prominent items in the image.") -> list | None: try: image = Image.open(self.image_path) config = genai.types.GenerateContentConfig( response_mime_type="application/json" ) response = self.client.models.generate_content( model=self.model, contents=[prompt, image], config=config ) width, height = image.size boxes = json.loads(response.text) converted_boxes = [] for box in boxes: y1, x1, y2, x2 = box["box_2d"] converted_boxes.append([ int(x1/1000*width), int(y1/1000*height), int(x2/1000*width), int(y2/1000*height) ]) return converted_boxes except Exception as e: print(f"Error: {e}") return None def extract_segmentation_masks(self, prompt=None, output_dir="segmentation_results"): try: if not self.image_path and not self.image_obj: print("No image provided.") return None # Load image image = Image.open(self.image_path or self.image_obj) image.thumbnail([1024, 1024], Image.Resampling.LANCZOS) # Default prompt prompt = prompt or "Give segmentation masks for prominent items." # Enforce structured JSON output prompt = f"""{prompt} Output ONLY a JSON list of segmentation masks. 
Each entry should have:
- "box_2d": [y0, x0, y1, x1]
- "mask": base64 PNG string
- "label": descriptive text
Do not include any extra text or explanations."""

            # Prepare model config
            config = genai.types.GenerateContentConfig(
                thinking_config=genai.types.ThinkingConfig()
            )

            # Call the model
            response = self.client.models.generate_content(
                model=self.model,
                contents=[prompt, image],
                config=config
            )

            # _parse_json already returns a parsed list (empty on failure),
            # so its result must not be passed through json.loads again.
            items = self._parse_json(response.text)

            os.makedirs(output_dir, exist_ok=True)
            composite_images = []  # Store overlay images for Streamlit

            for i, item in enumerate(items):
                if not all(k in item for k in ("box_2d", "mask", "label")):
                    print(f"Skipping item {i}: missing required keys")
                    continue

                y0, x0, y1, x1 = item.get("box_2d", [0, 0, image.height, image.width])
                y0 = int(y0 / 1000 * image.size[1])
                x0 = int(x0 / 1000 * image.size[0])
                y1 = int(y1 / 1000 * image.size[1])
                x1 = int(x1 / 1000 * image.size[0])
                if y0 >= y1 or x0 >= x1:
                    continue

                # Decode mask
                png_str = item["mask"]
                if not png_str.startswith("data:image/png;base64,"):
                    continue
                png_str = png_str.removeprefix("data:image/png;base64,")
                mask_data = base64.b64decode(png_str)
                mask = Image.open(io.BytesIO(mask_data)).resize((x1-x0, y1-y0), Image.Resampling.BILINEAR)

                # Create overlay
                overlay = Image.new('RGBA', image.size, (0, 0, 0, 0))
                overlay_draw = ImageDraw.Draw(overlay)
                mask_array = np.array(mask)
                color = (255, 255, 255, 200)
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        if mask_array[y - y0, x - x0] > 128:
                            overlay_draw.point((x, y), fill=color)

                # Save mask and overlay
                mask_filename = f"{item['label']}_{i}_mask.png"
                overlay_filename = f"{item['label']}_{i}_overlay.png"
                mask.save(os.path.join(output_dir, mask_filename))
                composite = Image.alpha_composite(image.convert('RGBA'), overlay)
                composite.save(os.path.join(output_dir, overlay_filename))
                composite_images.append(composite)  # For Streamlit
                print(f"Saved
mask and overlay for {item['label']} to {output_dir}") # Convert first composite image to base64 for Streamlit display if composite_images: buffered = io.BytesIO() composite_images[0].save(buffered, format="PNG") img_base64 = base64.b64encode(buffered.getvalue()).decode() return img_base64 return None except Exception as e: print(f"Error in extract_segmentation_masks: {e}") return None # ---------- Helper ---------- def _parse_json(self, json_output: str): """ Extracts the first JSON array or object from a model output, even if there is extra text around it. """ # Remove markdown code fences if present json_output = re.sub(r"```(json)?", "", json_output, flags=re.IGNORECASE).strip() # Find the first JSON object or array start = json_output.find("[") end = json_output.rfind("]") + 1 if start != -1 and end != -1: try: return json.loads(json_output[start:end]) except json.JSONDecodeError as e: print(f"JSON parsing failed: {e}") return [] else: # Try object instead of list start = json_output.find("{") end = json_output.rfind("}") + 1 if start != -1 and end != -1: try: return [json.loads(json_output[start:end])] except json.JSONDecodeError as e: print(f"JSON parsing failed: {e}") return [] return [] # ---------- File API ---------- def upload_image_file(self) -> genai.types.File: return self.client.files.upload(file=self.image_path) # ---------- Multi-image prompt ---------- def multi_image_prompt(self, images: list, prompt: str): contents = [prompt] for img in images: if isinstance(img, str): contents.append(self.client.files.upload(file=img)) elif isinstance(img, Image.Image): contents.append(img) response = self.client.models.generate_content( model=self.model, contents=contents ) return response.text # ---------- Inline image from URL ---------- def get_image_from_url(self, url: str) -> Image.Image: img_bytes = requests.get(url).content return Image.open(io.BytesIO(img_bytes)) def main(): # Initialize processor processor = 
ImageProcessor(image_path="test_image.png")

    # ---------- 1. Capture image ----------
    captured_path = processor.capture_image_and_save()
    if captured_path:
        print(f"Captured image saved at: {captured_path}")

    # ---------- 2. Resize image ----------
    if processor.resize_image(336, 336):
        print("Image resized successfully.")

    # ---------- 3. Basic detection ----------
    # detect_image() takes the query as a required argument.
    detection = processor.detect_image("What is this image?")
    if detection:
        print("Basic detection output:")
        print(detection)

    # ---------- 4. Object detection ----------
    boxes = processor.detect_objects(prompt="Detect prominent objects and their bounding boxes.")
    if boxes:
        print("Detected bounding boxes:")
        for b in boxes:
            print(b)

    # ---------- 5. Segmentation ----------
    processor.extract_segmentation_masks(
        prompt="Segment all wooden and glass items in the image.",
        output_dir="segmentation_results"
    )

    # ---------- 6. File API upload ----------
    uploaded_file = processor.upload_image_file()
    # The uploaded File object is identified by its `name` field.
    print(f"Uploaded file name: {uploaded_file.name}")

    # ---------- 7. Multi-image prompt ----------
    # For demonstration, using the same image twice
    multi_response = processor.multi_image_prompt(
        images=[processor.image_path, processor.image_path],
        prompt="Compare these two images and describe the differences."
    )
    print("Multi-image prompt response:")
    print(multi_response)

    # ---------- 8. Fetch image from URL ----------
    url = "https://via.placeholder.com/150"  # Example image URL
    image_from_url = processor.get_image_from_url(url)
    image_from_url.show()
    print("Fetched and displayed image from URL.")


if __name__ == "__main__":
    main()

================================================
FILE: src/VISION/local_eye.py
================================================
import base64
import io
import os
import json
import cv2
import requests
import numpy as np
from PIL import Image, ImageDraw
from src.FUNCTION.Tools.get_env import EnvManager
import re

# The ollama library is required for this local processor.
# You can install it with: pip install ollama try: import ollama except ImportError: raise ImportError("The 'ollama' library is required for LocalImageProcessor. Please install it using 'pip install ollama'.") class LocalImageProcessor: """ A class to process images using a local multimodal model via Ollama. Mirrors the functionality of ImageProcessor for a local environment. """ def __init__(self, image_path="captured_image.png", model_name="llava:7b"): """ Initializes the processor. Args: image_path (str): The default path to save/read images. model_name (str): The name of the local model to use with Ollama (e.g., 'llava'). """ self.image_path = image_path self.require_width = 336 self.require_height = 336 if model_name: self.model = model_name else: self.model = EnvManager.load_variable("Image_to_text") # Check if the Ollama server is running and the model is available try: ollama.show(model_name) except Exception as e: print(f"Error connecting to Ollama or finding model '{model_name}'.") print("Please ensure the Ollama server is running and you have pulled the model (e.g., 'ollama run llava').") raise e # ---------- Image capture and resizing (No changes) ---------- def resize_image(self, require_width=None, require_height=None) -> bool: """Resizes the image to the specified dimensions.""" require_width = require_width or self.require_width require_height = require_height or self.require_height try: with Image.open(self.image_path) as img: # The ANTIALIAS attribute is deprecated and will be removed in Pillow 10 (2023-07-01). # Use Resampling.LANCZOS instead. 
resample_filter = Image.Resampling.LANCZOS if hasattr(Image, 'Resampling') else Image.ANTIALIAS img = img.resize((require_width, require_height), resample_filter) img.save(self.image_path) print(f"Image saved to {self.image_path}, size: {require_width}x{require_height}") except Exception as e: print(f"Error during resize: {e}") return False return True def capture_image_and_save(self) -> str | None: """Captures an image from the webcam and saves it.""" cap = cv2.VideoCapture(0) if not cap.isOpened(): print("Error: Could not open camera.") return None try: ret, frame = cap.read() if ret: cv2.imwrite(self.image_path, frame) print(f"Image captured and saved as {self.image_path}") return self.image_path else: print("Error: Could not capture image.") return None finally: cap.release() cv2.destroyAllWindows() # ---------- Basic detection ---------- def detect_image(self, query: str) -> str | None: """Detect content using the set local model.""" if not query: query = "What is this image?" try: with open(self.image_path, 'rb') as f: image_bytes = f.read() response = ollama.chat( model=self.model, messages=[ { 'role': 'user', 'content': query, 'images': [image_bytes] } ] ) return response['message']['content'] except Exception as e: print(f"Error during local detection: {e}") return None # ---------- Object detection ---------- def detect_objects(self, prompt="Detect all prominent items in the image.") -> list | None: """Detects objects using a text prompt to force JSON output.""" # Strong prompt to guide the local model to produce structured JSON json_prompt = f"""{prompt}. Analyze the provided image and identify the prominent objects. Your output must be ONLY a valid JSON list of objects. Do not include any text, explanation, or markdown. Each object in the list should be a dictionary with one key: "box_2d". The "box_2d" value must be a list of four integers: [y1, x1, y2, x2]. These coordinates should be normalized to a 1000x1000 grid, where y1 < y2 and x1 < x2. 
Example format: [{{"box_2d": [250, 150, 750, 850]}}, {{"box_2d": [100, 300, 400, 700]}}] """ try: image = Image.open(self.image_path) width, height = image.size with open(self.image_path, 'rb') as f: image_bytes = f.read() response = ollama.chat( model=self.model, messages=[ { 'role': 'user', 'content': json_prompt, 'images': [image_bytes] } ] ) # Use the helper to parse the model's text response boxes_data = self._parse_json(response['message']['content']) if not boxes_data: print("Could not parse JSON from model response.") return None converted_boxes = [] for box in boxes_data: if "box_2d" in box and len(box["box_2d"]) == 4: y1, x1, y2, x2 = box["box_2d"] # Convert from 1000x1000 grid to actual image dimensions converted_boxes.append([ int(x1 / 1000 * width), int(y1 / 1000 * height), int(x2 / 1000 * width), int(y2 / 1000 * height) ]) return converted_boxes except Exception as e: print(f"Error during local object detection: {e}") return None def extract_segmentation_masks(self, prompt=None, output_dir="segmentation_results"): """This functionality is not supported by standard local VLMs like LLaVA.""" print("Warning: Segmentation mask generation is not supported by this local processor.") raise NotImplementedError("Standard local VLMs do not generate pixel-level segmentation masks.") # ---------- Helper (No changes) ---------- def _parse_json(self, text_output: str) -> list: """ Extracts the first JSON list from a model output string, even if there is extra text or markdown fences around it. 
""" text_output = re.sub(r"```(json)?", "", text_output, flags=re.IGNORECASE).strip() start = text_output.find("[") end = text_output.rfind("]") + 1 if start != -1 and end != -1: json_str = text_output[start:end] try: return json.loads(json_str) except json.JSONDecodeError as e: print(f"JSON parsing failed on extracted text: {e}") return [] return [] # ---------- File API (Not applicable to local models) ---------- def upload_image_file(self): """This functionality is not applicable for local, stateless model servers like Ollama.""" print("Warning: The concept of a File API does not apply to this local processor.") raise NotImplementedError("Local models process images directly per request; there is no file upload API.") # ---------- Multi-image prompt ---------- def multi_image_prompt(self, images: list[str], prompt: str) -> str | None: """Sends a prompt with multiple images to the local model.""" try: image_bytes_list = [] for img_path in images: if not os.path.exists(img_path): print(f"Image path does not exist: {img_path}") continue with open(img_path, 'rb') as f: image_bytes_list.append(f.read()) if not image_bytes_list: print("No valid images to process.") return None response = ollama.chat( model=self.model, messages=[ { 'role': 'user', 'content': prompt, 'images': image_bytes_list } ] ) return response['message']['content'] except Exception as e: print(f"Error during multi-image prompt: {e}") return None # ---------- Inline image from URL (No changes) ---------- def get_image_from_url(self, url: str) -> Image.Image | None: """Downloads an image from a URL and returns a PIL Image object.""" try: response = requests.get(url) response.raise_for_status() # Raise an exception for bad status codes img_bytes = response.content return Image.open(io.BytesIO(img_bytes)) except requests.RequestException as e: print(f"Error fetching image from URL {url}: {e}") return None def main_local(): """Main function to demonstrate the LocalImageProcessor.""" # Ensure a test 
image exists. Let's create a placeholder if not. test_image_path = "test_image_local.png" if not os.path.exists(test_image_path): try: img = Image.new('RGB', (640, 480), color = 'red') d = ImageDraw.Draw(img) d.rectangle([100, 100, 300, 300], fill='blue') d.text((50,50), "Test Image", fill=(255,255,0)) img.save(test_image_path) print(f"Created a dummy test image: {test_image_path}") except Exception as e: print(f"Could not create a test image: {e}") return # Initialize processor with a local model name # Ensure you have run 'ollama pull llava' first try: processor = LocalImageProcessor(image_path=test_image_path, model_name="llava") except Exception: return # Stop if initialization fails # ---------- 1. Basic detection ---------- print("\n--- 1. Basic Detection ---") detection = processor.detect_image("Describe this image in detail.") if detection: print("Basic detection output:") print(detection) # ---------- 2. Object detection ---------- print("\n--- 2. Object Detection ---") boxes = processor.detect_objects(prompt="Detect the blue square.") if boxes: print("Detected bounding boxes:") for b in boxes: print(b) # ---------- 3. Multi-image prompt ---------- print("\n--- 3. Multi-image Prompt ---") # For demonstration, creating a second test image test_image_2_path = "test_image_local_2.png" img2 = Image.new('RGB', (640, 480), color = 'green') img2.save(test_image_2_path) multi_response = processor.multi_image_prompt( images=[processor.image_path, test_image_2_path], prompt="Describe the primary color of each image. First image, then second image." ) if multi_response: print("Multi-image prompt response:") print(multi_response) # ---------- 4. Fetch image from URL ---------- print("\n--- 4. 
Fetch image from URL ---") url = "https://via.placeholder.com/150/0000FF/FFFFFF?Text=URL+Image" image_from_url = processor.get_image_from_url(url) if image_from_url: image_from_url.save("url_image.png") print("Fetched image from URL and saved as url_image.png") # ---------- 5. Unsupported features ---------- print("\n--- 5. Testing Unsupported Features ---") try: processor.extract_segmentation_masks() except NotImplementedError as e: print(f"Correctly caught expected error: {e}") try: processor.upload_image_file() except NotImplementedError as e: print(f"Correctly caught expected error: {e}") if __name__ == "__main__": # You can run the original main() or the new main_local() # main() # For the Google GenAI Processor main_local() # For the Local Ollama Processor ================================================ FILE: ui.py ================================================ import os import json import tempfile import streamlit as st from langchain_ollama import ChatOllama import torch import io import base64 # ----- Custom Modules ----- from src.FUNCTION.run_function import FunctionExecutor from src.BRAIN.text_to_info import send_to_ai from src.BRAIN.local_func_call import LocalFunctionCall from src.CONVERSATION.text_to_speech import speak from src.FUNCTION.Tools.random_respon import RandomChoice from src.FUNCTION.Tools.greet_time import TimeOfDay from DATA.msg import WELCOME_RESPONSES from src.BRAIN.gem_func_call import GeminiFunctionCaller from src.BRAIN.RAG import RAGPipeline from src.BRAIN.chat_with_ai import PersonalChatAI from src.CONVERSATION.text_speech import text_to_speech_local from src.CONVERSATION.voice_text import voice_to_text from src.BRAIN.code_gen import CodeRefactorAssistant from src.VISION.gem_eye import ImageProcessor from src.VISION.local_eye import LocalImageProcessor # # ----- Torch fix ----- if hasattr(torch.classes, '__path__'): torch.classes.__path__ = [] # ----- Initialize
Components ----- local_caller = LocalFunctionCall() func_executor = FunctionExecutor() time_greeter = TimeOfDay() code_assistant = CodeRefactorAssistant() chat_ai = PersonalChatAI() rag = RAGPipeline() # ----- Streamlit config ----- os.environ["STREAMLIT_WATCHER_TYPE"] = "none" UPLOAD_DIR = "." os.makedirs(UPLOAD_DIR, exist_ok=True) AI_MODEL = "granite3.1-dense:2b" # ----- Session Initialization ----- def initialize_session(): defaults = { "chat_mode": "normal", "chat_histories": { "normal": [], "chat_with_ai": [], "chat_with_rag": [], "data_analysis": [], "image_processing": [] }, "rag_subject": "", "voice_output": False, "uploaded_file_path": None, "audio_input_key_counter": 0, "image_path": None, "image_obj": None, "image_action": None } for k, v in defaults.items(): if k not in st.session_state: st.session_state[k] = v def set_greeting(): if "greeted_once" not in st.session_state: st.session_state.greeted_once = False if not st.session_state.greeted_once: greeting_message = f"{time_greeter.time_of_day()}. 
{RandomChoice.random_choice(WELCOME_RESPONSES)}" st.session_state.greeting_message = greeting_message st.session_state.greeted_once = True initialize_session() set_greeting() if "greeting_message" in st.session_state: speak(st.session_state.greeting_message) del st.session_state["greeting_message"] # ----- Utility Functions ----- @st.cache_resource(show_spinner=False) def load_rag_chain(subject): try: return rag.setup_chain(subject.lower().strip().replace(" ", "_")) except Exception: return None @st.cache_data(show_spinner=False) def personal_chat_ai(query, max_token=2000): try: messages = chat_ai.message_management(query) llm = ChatOllama(model=AI_MODEL, temperature=0.3, max_token=max_token) response_content = "".join(chunk.content for chunk in llm.stream(messages)) chat_ai.store_important_chat(query, response_content) return response_content except Exception as e: return f"An error occurred: {e}" def data_analysis(user_prompt: str, file_path: str): try: return code_assistant.gem_text_to_code(user_prompt, file_path) except Exception: return code_assistant.local_text_to_code(user_prompt, file_path) def chat_with_rag_session(subject, query): key = subject.lower().strip().replace(" ", "_") if f"qa_chain_{key}" not in st.session_state: st.session_state[f"qa_chain_{key}"] = load_rag_chain(key) qa_chain = st.session_state.get(f"qa_chain_{key}") return rag.ask(qa_chain, query) if qa_chain else f"Error: Unable to load RAG chain for '{subject}'."
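# Illustrative sketch only, not part of the original ui.py: data_analysis above
# tries the Gemini-backed call first and falls back to the local model when it
# raises. The same pattern could be factored out as below. The helper name
# `call_with_fallback` and the logger are assumptions made for this example.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def call_with_fallback(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Return primary(); if it raises an Exception, log it and return fallback()."""
    try:
        return primary()
    except Exception as exc:
        # `Exception` rather than a bare `except:` so Ctrl-C and SystemExit propagate.
        logger.warning("Primary backend failed (%s); using fallback.", exc)
        return fallback()
```

# Hypothetical usage: the body of data_analysis could then read
# call_with_fallback(lambda: code_assistant.gem_text_to_code(user_prompt, file_path),
#                    lambda: code_assistant.local_text_to_code(user_prompt, file_path))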
def process_command(command): try: gem_caller = GeminiFunctionCaller() response_list_dic = gem_caller.generate_function_calls(command) if not response_list_dic: raise ValueError("Empty Gemini output.") except: response_list_dic = local_caller.create_function_call(command) if not response_list_dic: raise ValueError("Empty Local model output.") results = [] for dic in response_list_dic: func_name = dic.get("name") args = dic.get("arguments", {}) speak(f"Executing function: {func_name}") try: res = func_executor.execute(dic) results.append(res) except Exception as e: results.append({"status": "failed", "function_name": func_name, "args": args, "output": str(e)}) send_to_ai(f"Respond to user's command '{command}' concisely.") return results def add_message(role, content): history = st.session_state.chat_histories[st.session_state.chat_mode] if not any(msg["content"] == content and msg["role"] == role for msg in history): history.append({"role": role, "content": content}) # ----- Sidebar: Mode Selection ----- st.sidebar.markdown("### 🔁 Select Chat Mode") mode_display_map = { "💬 Normal": "normal", "🧍 Personal Chat": "chat_with_ai", "📚 RAG Chat": "chat_with_rag", "📊 Data Analysis": "data_analysis", "🖼️ Image Processing": "image_processing" } selected_display = st.sidebar.selectbox("🧠 Select Chat Mode", list(mode_display_map.keys())) selected_mode = mode_display_map[selected_display] st.session_state.chat_mode = selected_mode # Voice toggle st.session_state.voice_output = st.sidebar.toggle("🎙️ Voice Reply", value=st.session_state.voice_output) # RAG topic select if st.session_state.chat_mode == "chat_with_rag": st.session_state.rag_subject = st.sidebar.selectbox("📘 Select RAG Topic", [ "Disaster", "Finance", "Healthcare", "Artificial Intelligence", "Climate Change", "Cybersecurity", "Education", "Space Technology", "Politics", "History", "Biology" ]) # Data Analysis upload if st.session_state.chat_mode == "data_analysis": st.sidebar.markdown("### 📤 Upload CSV File") 
uploaded_file = st.sidebar.file_uploader("Choose CSV", type=["csv"]) if uploaded_file: file_path = os.path.join(UPLOAD_DIR, uploaded_file.name) with open(file_path, "wb") as f: f.write(uploaded_file.getbuffer()) st.session_state.uploaded_file_path = file_path st.sidebar.success(f"✅ File saved to: `{file_path}`") # Image Processing setup if st.session_state.chat_mode == "image_processing": st.sidebar.markdown("### 🖼️ Image Input") image_source = st.sidebar.radio("Image Source", ["Upload", "URL", "Camera"]) try: processor = ImageProcessor() except Exception as e: processor = LocalImageProcessor() if image_source == "Upload": uploaded_img = st.sidebar.file_uploader("Upload Image", type=["png","jpg","jpeg"]) if uploaded_img: path = os.path.join(UPLOAD_DIR, uploaded_img.name) with open(path, "wb") as f: f.write(uploaded_img.getbuffer()) processor.image_path = path st.session_state.image_path = path st.image(path, caption="Uploaded Image") elif image_source == "URL": url = st.sidebar.text_input("Image URL") if url: img = processor.get_image_from_url(url) st.session_state.image_obj = img processor.image_obj = img st.image(img, caption="Fetched Image") elif image_source == "Camera": if st.sidebar.button("📸 Capture Image"): captured = processor.capture_image_and_save() if captured: st.session_state.image_path = captured processor.image_path = captured st.image(captured, caption="Captured Image") st.session_state.image_action = st.sidebar.selectbox( "Select Action", ["Basic Detection", "Object Detection", "Segmentation"] ) # ----- Current Mode Display ----- st.markdown(f"
🧠 Current Mode: {st.session_state.chat_mode.upper()}
", unsafe_allow_html=True) # ----- Show Chat History ----- for msg in st.session_state.chat_histories[st.session_state.chat_mode]: with st.chat_message(msg["role"]): st.markdown(msg["content"], unsafe_allow_html=True) if "image" in msg and msg["image"]: st.image( io.BytesIO(base64.b64decode(msg["image"])), use_column_width=True ) # ----- Chat Input ----- if user_input := st.chat_input("Ask me anything... "): st.chat_message("user").markdown(user_input) add_message("user", user_input) # --- Mode handling --- if st.session_state.chat_mode == "chat_with_ai": response = personal_chat_ai(user_input) elif st.session_state.chat_mode == "chat_with_rag": subject = st.session_state.rag_subject response = chat_with_rag_session(subject, user_input) if subject else "🚨 Enter a subject." elif st.session_state.chat_mode == "data_analysis": if not st.session_state.uploaded_file_path: response = "🚨 Please upload a CSV first." else: response = data_analysis(user_input, st.session_state.uploaded_file_path) elif st.session_state.chat_mode == "image_processing": if not processor.image_path and not processor.image_obj: response = "🚨 Please provide an image first." else: action = st.session_state.image_action if action == "Basic Detection": response = processor.detect_image(query=user_input) or "No result." elif action == "Object Detection": boxes = processor.detect_objects(prompt=user_input) response = f"Detected boxes: {boxes}" if boxes else "No objects detected." elif action == "Segmentation": # Extract and return composite image for display segmented_image = processor.extract_segmentation_masks( output_dir="segmentation_results", prompt=user_input ) if segmented_image: # Add to chat history with an 'image' key st.session_state.chat_histories[st.session_state.chat_mode].append({ "role": "assistant", "content": "Segmentation complete.", "image": segmented_image }) response = "Segmentation complete. Image displayed above." 
else: st.session_state.chat_histories[st.session_state.chat_mode].append({ "role": "assistant", "content": "Segmentation failed or no objects detected.", "image": None }) response = "Segmentation failed or no objects detected." else: response = process_command(user_input) # --- Display response --- if isinstance(response, list) and isinstance(response[0], dict): for entry in response: with st.chat_message("assistant"): st.markdown(f""" **🔧 Function Executed:** `{entry.get('function_name', 'N/A')}` **📌 Status:** ✅ `{entry.get('status', 'unknown')}` **📂 Arguments:** `{entry.get('args', {})}` **📜 Output:** *{entry.get('output', 'No output.')}* """) add_message("assistant", json.dumps(response, indent=2)) else: st.chat_message("assistant").markdown(response) add_message("assistant", response) # --- Voice output --- if st.session_state.voice_output: full_resp = "\n".join([r.get("output","") for r in response]) if isinstance(response,list) else str(response) audio_io = text_to_speech_local(full_resp.replace("*","")) st.markdown(f""" """, unsafe_allow_html=True) #------------------- Voice Input & Audio Handling ------------------- audio_key = f"audio_input_key_{st.session_state.audio_input_key_counter}" audio_value = st.sidebar.audio_input("🎤 Speak", key=audio_key) if audio_value: with tempfile.TemporaryFile(suffix=".wav") as temp_audio: temp_audio.write(audio_value.getvalue()) temp_audio.seek(0) transcribed_text = voice_to_text(temp_audio) if transcribed_text: st.chat_message("user").markdown(transcribed_text) add_message("user", transcribed_text) if st.session_state.chat_mode == "chat_with_ai": response = personal_chat_ai(transcribed_text) elif st.session_state.chat_mode == "chat_with_rag": subject = st.session_state.rag_subject response = chat_with_rag_session(subject, transcribed_text) if subject else "🚨 Please enter a subject." 
elif st.session_state.chat_mode == "data_analysis": try: if not st.session_state.uploaded_file_path: response = "🚨 Please upload a CSV file first." else: file_path = st.session_state.uploaded_file_path response = data_analysis(transcribed_text,file_path) except Exception as e: response = f"Error reading CSV: {e}" else: response = process_command(transcribed_text) if isinstance(response, list) and isinstance(response[0], dict): for entry in response: with st.chat_message("assistant"): st.markdown(f""" **🔧 Function Executed:** `{entry.get('function_name', 'N/A')}` **📌 Status:** ✅ `{entry.get('status', 'unknown')}` **📂 Arguments:** `{entry.get('args', {})}` **📜 Output:** *{entry.get('output', 'No output.')}* """) add_message("assistant", json.dumps(response, indent=2)) else: st.chat_message("assistant").markdown(response) add_message("assistant", response) # st.chat_message("assistant").markdown(str(response)) # add_message("assistant", str(response)) if st.session_state.voice_output: full_response = "\n".join([sub_response.get("output", "") for sub_response in response]) if isinstance(response, list) else str(response) audio_io = text_to_speech_local(full_response.replace("*", "")) st.markdown(f""" """, unsafe_allow_html=True) del st.session_state[audio_key] st.session_state.audio_input_key_counter += 1 # ----- Download History ----- st.sidebar.download_button( label="📅 Download Chat History", data=json.dumps(st.session_state.chat_histories[st.session_state.chat_mode], indent=2), file_name="chat_history.json", mime="application/json" )
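# Illustrative sketch only, not part of the original ui.py: both the text-input
# and voice-input branches above flatten `response` into a string before calling
# text_to_speech_local. A shared helper along these lines might remove that
# duplication. The name `response_to_speech_text` is hypothetical, and unlike
# the original expression it skips non-dict list items and coerces outputs to str.

```python
from typing import Any


def response_to_speech_text(response: Any) -> str:
    """Flatten a chat response into plain text suitable for text-to-speech."""
    if isinstance(response, list):
        # Function-call results are dicts with an "output" field; skip anything else.
        parts = [str(item.get("output", "")) for item in response if isinstance(item, dict)]
        text = "\n".join(parts)
    else:
        text = str(response)
    # Drop Markdown emphasis markers so they are not read aloud.
    return text.replace("*", "")
```

# Hypothetical usage: audio_io = text_to_speech_local(response_to_speech_text(response))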