Repository: SPThole/CoexistAI
Branch: main
Commit: b9e037a8b3ec
Files: 40
Total size: 10.4 MB
Directory structure:
gitextract_gxmiyyip/
├── .dockerignore
├── Dockerfile
├── Dockerfile.searxng
├── LICENSE
├── README.docker.md
├── README.md
├── README_MCP.md
├── __init__.py
├── app.py
├── coexist_tutorial.ipynb
├── config/
│ └── model_config.json
├── demo_queries.ipynb
├── docker-compose.yml
├── entrypoint.sh
├── model_config.py
├── output/
│ └── map_with_route_and_pois.html
├── quick_setup.sh
├── quick_setup_docker.sh
├── requirements.txt
├── searxng/
│ ├── settings.yml
│ ├── settings.yml.new
│ ├── uwsgi.ini
│ └── uwsgi.ini.new
├── static/
│ └── admin.html
├── system_prompt.py
└── utils/
    ├── __init__.py
    ├── answer_generation.py
    ├── config.py
    ├── crawler_utils.py
    ├── git_utils.py
    ├── knowledge_base.py
    ├── map.py
    ├── process_content.py
    ├── profiler_utils.py
    ├── reddit_utils.py
    ├── retriever_utils.py
    ├── startup_banner.py
    ├── tts_utils.py
    ├── utils.py
    └── websearch_utils.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .dockerignore
================================================
__pycache__
*.pyc
*.pyo
*.pyd
.pytest_cache
.venv
env/
infinity_env/
coexistaienv/
*.log
artifacts/
output/
downloads/
================================================
FILE: Dockerfile
================================================
FROM python:3.13-slim
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Build-time args that will be copied into the image as environment variables.
# Users can pass these via `docker build --build-arg KEY=VALUE` to bake defaults.
ARG LLM_MODEL_NAME=gemini-2.0-flash
ARG LLM_TYPE=google
ARG LLM_TEMPERATURE=0.1
ARG PORT_NUM_APP=8000
ARG PORT_NUM_SEARXNG=8085
ARG HOST_APP=0.0.0.0
ARG HOST_SEARXNG=0.0.0.0
ARG EMBED_MODE=google
ARG EMBEDDING_MODEL_NAME=models/embedding-001
# Export non-secret build args as environment variables so model_config.py can read them at runtime
ENV LLM_MODEL_NAME=${LLM_MODEL_NAME}
ENV LLM_TYPE=${LLM_TYPE}
ENV LLM_TEMPERATURE=${LLM_TEMPERATURE}
ENV PORT_NUM_APP=${PORT_NUM_APP}
ENV PORT_NUM_SEARXNG=${PORT_NUM_SEARXNG}
ENV HOST_APP=${HOST_APP}
ENV HOST_SEARXNG=${HOST_SEARXNG}
ENV EMBED_MODE=${EMBED_MODE}
ENV EMBEDDING_MODEL_NAME=${EMBEDDING_MODEL_NAME}
# Install small set of system deps commonly needed by ML/audio packages
RUN apt-get update && \
apt-get install -y --no-install-recommends \
git \
wget \
ffmpeg \
build-essential \
libsndfile1 \
&& rm -rf /var/lib/apt/lists/*
# Use /app as the workdir so the Dockerfile can be built from the CoexistAI folder
WORKDIR /app
# Copy only requirements first to leverage Docker cache (build context is the CoexistAI folder)
COPY ./requirements.txt ./requirements.txt
RUN python -m pip install --upgrade pip setuptools wheel
# Copy application code (copy the current folder contents into /app)
COPY ./ ./
# Reproduce quick_setup.sh virtualenv installs inside the image (mirrors the script)
# Create a separate infinity_env and install packages there to avoid conflicts as in the script
RUN python3.13 -m venv /opt/infinity_env && \
/opt/infinity_env/bin/pip install --no-cache-dir 'infinity-emb[all]' && \
/opt/infinity_env/bin/pip install --no-cache-dir 'optimum==1.27.0' && \
/opt/infinity_env/bin/pip install --no-cache-dir 'transformers<4.49' && \
/opt/infinity_env/bin/pip install --no-cache-dir --upgrade "typer==0.19.1" "click>=8.1.3" || true
# Create a second venv similar to coexistaienv and install markitdown[all]
RUN python3.13 -m venv /opt/coexistaienv && \
/opt/coexistaienv/bin/pip install --no-cache-dir 'markitdown[all]' || true
# Now install the project requirements into the coexistaienv (matches quick_setup.sh order)
RUN /opt/coexistaienv/bin/pip install --no-cache-dir -r requirements.txt || true
# Entrypoint will be executed via shell; no need to force executable bit when host may mount files
EXPOSE 8000
# Invoke the entrypoint from the copied project path. The entrypoint lives at CoexistAI/entrypoint.sh
CMD ["sh", "/app/entrypoint.sh"]
================================================
FILE: Dockerfile.searxng
================================================
FROM searxng/searxng:latest
# Copy custom settings
COPY ./searxng/settings.yml /etc/searxng/settings.yml
# Optionally copy other config files if needed
# COPY ./searxng/uwsgi.ini /etc/searxng/uwsgi.ini
================================================
FILE: LICENSE
================================================
NON-COMMERCIAL RESEARCH AND EDUCATIONAL USE LICENSE
Copyright (c) 2025 Sidhant Thole and CoexistAI Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to use, copy, modify, and distribute the Software, subject to the following conditions:
1. **Non-Commercial Use Only**
- The Software may be used, copied, modified, and distributed solely for non-commercial research, prototyping, and educational purposes.
- Commercial use, including but not limited to use in a product, service, or offering for which a fee is charged or which is used in the operation of a business, is strictly prohibited without the express prior written consent of the copyright holders.
2. **No Redistribution for Commercial Purposes**
- Redistribution of the Software or any derivative works for commercial purposes is not permitted.
- Integration of the Software into commercial products or services is not permitted without explicit written permission.
3. **Attribution**
- Any use, copy, or distribution of the Software must retain this license notice, copyright notice, and all disclaimers.
4. **No Warranty**
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
5. **Compliance**
- You are responsible for ensuring that your use of the Software complies with all applicable laws, regulations, and terms of service of any data sources or third-party services accessed through the Software.
6. **Contact for Commercial Licensing**
- For commercial licensing or other use not permitted by this license, please contact the maintainers at: [GitHub Issues or project contact email].
By using the Software, you agree to be bound by the terms of this license.
================================================
FILE: README.docker.md
================================================
# CoexistAI — Docker Quickstart
### Short, step-by-step instructions for two ways to start CoexistAI. Pick either Method A (helper script) or Method B (direct Docker Compose).
## Prerequisites
- Docker Engine installed.
## Before you start (one-time)
1. Open a terminal and change into the repository folder:
```bash
cd /path/to/CoexistAI
```
2. Edit the `.env` file to set your API keys and the admin token (the token is used when editing model parameters).
## Method A — Helper script (recommended for beginners)
This script automates the compose start and waits until the app reports ready.
1. Run the helper (from repo root):
```bash
./quick_setup_docker.sh
```
or
```bash
# Pass a timeout in seconds (default: 300s; example: 600s = 10 min)
./quick_setup_docker.sh 600
```
For subsequent starts, run the script the same way (it detects the existing image and skips building/installing).
2. What the script does (so you know what to expect):
- Checks if the Docker image 'coexistai-app' already exists; if yes, runs `docker compose up -d` (no build); if not, runs `docker compose up -d --build` to start containers detached.
- Polls `http://localhost:8000/status` every few seconds and prints a spinner.
- Exits with code 0 when the app reports `{"status":"ready"}`.
- Exits non-zero if the app reports `error` or the timeout is reached.
3. After the script finishes successfully, open:
- http://localhost:8000/admin

- If you use local models, ignore the `api_keys` fields.
- By default, `ADMIN_TOKEN=123456`; you can change it in `.env`.
This opens the Admin UI, where you can edit model configurations, API keys, and reload settings without rebuilding the container.
When to use Method A: you're new to Docker or want a simple way to wait until the app is ready.
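The wait loop described in step 2 can be sketched in Python. This is an illustration only, not the helper script's actual implementation: the function names, the 3-second polling interval, and the injectable `fetch`/`sleep` parameters are assumptions. The endpoint, the 300-second default timeout, and the `ready`/`error` values come from the description above.

```python
import json
import time
import urllib.request


def fetch_status(url="http://localhost:8000/status"):
    """Return the 'status' field from the app, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp).get("status")
    except (OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: treat as "not up yet".
        return None


def wait_until_ready(fetch=fetch_status, timeout=300, interval=3, sleep=time.sleep):
    """Poll until the app reports 'ready' (True); return False on 'error' or timeout."""
    waited = 0
    while waited <= timeout:
        status = fetch()
        if status == "ready":
            return True
        if status == "error":
            return False
        sleep(interval)
        waited += interval
    return False
```

Calling `sys.exit(0 if wait_until_ready(timeout=600) else 1)` would mirror the exit codes listed above.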
## Method B — Direct Docker Compose (fast, manual)
1. Start the stack:
- **First time** (builds the image):
```bash
docker compose up -d --build
```
- **Subsequent times** (uses existing image):
```bash
docker compose up -d
```
To stop: `docker compose down`
To restart: `docker compose restart`
2. Wait for ready signal in terminal where you ran docker compose, then open the admin UI:
- http://localhost:8000/admin
3. Verify status from the host:
```bash
curl http://localhost:8000/status
# expected JSON: {"status":"starting"} or {"status":"ready"}
```
4. Edit configuration:
- Use the Admin UI `/admin` and click "Save & Reload" to apply changes without rebuilding.
- Or from the host (curl):
```bash
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:8000/admin/reload-config
```
When to use Method B: you prefer to run compose directly and watch logs yourself.
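For scripted setups, the reload call from step 4 can also be issued from Python. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the token is available in the `ADMIN_TOKEN` environment variable, as in the curl example.

```python
import os
import urllib.request


def build_reload_request(token, base_url="http://localhost:8000"):
    """Build (but do not send) the POST request for /admin/reload-config."""
    return urllib.request.Request(
        f"{base_url}/admin/reload-config",
        method="POST",
        headers={"X-Admin-Token": token},
    )


def reload_config(token=None):
    """Send the reload request and return the raw response body as text."""
    token = token or os.environ["ADMIN_TOKEN"]
    with urllib.request.urlopen(build_reload_request(token), timeout=30) as resp:
        return resp.read().decode("utf-8")
```

Keeping request construction separate from sending makes it easy to inspect the URL and header before touching a running container.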
Secrets (recommended pattern)
- Do not store API keys in the repo. Use `.env` or file-backed secrets.
- Recommended: create `CoexistAI/config/keys/` on the host, place key files there, and mount that folder into the container. Reference them in `config/model_config.json` with `llm_api_key_file` / `embed_api_key_file`.
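To illustrate the file-backed pattern, here is one way a key could be resolved from a config dictionary. How the application itself reads `llm_api_key_file` is not shown in this guide, so treat the precedence rule (file first, inline value second) and the function name as assumptions.

```python
from pathlib import Path


def resolve_api_key(config, key_field="llm_api_key", file_field="llm_api_key_file"):
    """Return the key stored in the file named by file_field, else the inline key_field value."""
    key_path = config.get(file_field)
    if key_path:
        # Assumption: the file holds only the key, possibly with a trailing newline.
        return Path(key_path).read_text(encoding="utf-8").strip()
    return config.get(key_field)
```

The same helper would work for the embedder by passing `key_field="embed_api_key"` and `file_field="embed_api_key_file"`; the `embed_api_key` name is a guess by analogy with the LLM fields.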
Quick troubleshooting
- App unreachable? Check app logs:
```bash
docker compose logs app --tail=200
```
- App timed out in `quick_setup_docker.sh` or reports `error`? Inspect logs and increase timeout:
```bash
docker compose logs app --tail=400
./quick_setup_docker.sh 600
```
- Long model downloads or HF errors: allow more time on first start or mount `artifacts/` (HF cache) into the container to avoid repeated downloads.
Helpful commands
```bash
# Check status
curl http://localhost:8000/status
# Ask app to reload config (from host)
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:8000/admin/reload-config
# Follow logs interactively
docker compose logs -f app --tail=200
```
================================================
FILE: README.md
================================================
# CoexistAI
CoexistAI is a modular, developer-friendly research assistant framework. It enables you to build, search, summarize, and automate research workflows using LLMs, web search, Reddit, YouTube, git and mapping tools—all with simple API calls or Python functions.
<p align="center">
<img src="artifacts/logo.jpeg" alt="CoexistAI Logo" width="200"/>
</p>
## 🎙️ New Features & Updates
- 🔥 _Docker Installation available (Thanks for all the feedback, hope this makes installations easy)_. For a containerized setup with Docker, follow the instructions in [README.docker.md](README.docker.md).
- **Text → Podcast**: Instantly turn written content into engaging podcast episodes—ideal for on-the-go listening or repurposing articles/notes/blogs. Example: Converted [this article](https://www.theatlantic.com/newsletters/archive/2025/08/ai-high-school-college/684057/) to a podcast. **[Listen here](output/podcasts/podcast_58fc33d6.wav)**
- **Text → Speech**: Convert text to high-quality audio using advanced TTS. Check [Notebook](coexist_tutorial.ipynb) for examples.
- **Flexible Integration**: Generate audio files via FastAPI or MCP—integrate with agents or use standalone.
- **Direct Location Search**: Search for any place, not just routes.
- **Advanced Reddit Search**: Custom phrases with BM25 ranking for sharper discovery.
- **YouTube Power-Up**: Search/summarize videos or URLs with custom prompts.
- **File/Folder Exploration**: Explore local folders/files with vision support for images (.png, .jpg, etc.).
- **Sharper Web Search**: More focused, actionable results.
- **MCP Support Everywhere**: Full integration with LM Studio and other MCP hosts. [See Guide](README_MCP.md)
- **GitHub & Local Repo Explorer**: Ask questions about codebases (GitHub or local).
## 🚀 Features
- **Web Explorer**: Query the web, summarize results, and extract context using LLMs.
- **Reddit Explorer**: Fetch and summarize Reddit posts via a search phrase or subreddit-focused queries.
- **YouTube Transcript Explorer**: Search YouTube with search phrases and summarise or run QA on any video.
- **Map Explorer**: Generate maps and explore routes and locations, with points of interest such as hotels and cafes near given locations.
- **GitHub Explorer**: Explore, summarise, explain, or run QA on any GitHub repository or local git codebase.
- **Pluggable LLMs and Embedders**: Use LLMs such as Google Gemini, OpenAI, or Ollama, and any embedder.
- **Async & Parallel**: Fast, scalable, and robust asynchronous execution.
- **Notebook & API Ready**: Use as a Python library or via a FastAPI server.
- **MCP ready**: Spins up the MCP server on the fly alongside the FastAPI server.
---
## 🛠️ Installation
**Prerequisite:** Make sure Docker is installed and the Docker daemon is running.
### Method 1: Docker (Recommended) New 🔥
For a containerized setup with Docker, follow the instructions in [README.docker.md](README.docker.md). This method uses Method A (helper script) to automate the process and provides an Admin UI for easy configuration.
### Method 2: Local Setup
1. **Clone the repository:**
```sh
git clone https://github.com/SPThole/CoexistAI.git coexistai
cd coexistai
```
2. **Configure your model and embedding settings:**
- [NEW] Edit `config/model_config.json` to set your preferred LLM and embedding model.
- Edit above file to set your preferred SearxNG host and port (if needed)
- Add LLM and Embedder API Key (for google mode both would be same)
- Example (for full local mode):
```json
{
"llm_model_name": "jan-nano",
"llm_type": "local", // based on baseurl dict given below
"embed_mode": "infinity_emb",
"embedding_model_name": "nomic-ai/nomic-embed-text-v1",
"llm_kwargs": {
"temperature": 0.1,
"max_tokens": null,
"timeout": null,
"max_retries": 2
},
"embed_kwargs": {},
"llm_api_key": "dummy",
"HOST_APP": "localhost",
"PORT_NUM_APP": 8000,
"HOST_SEARXNG": "localhost",
"PORT_NUM_SEARXNG": 8080,
"openai_compatible": {
"google": "https://generativelanguage.googleapis.com/v1beta/openai/",
"local": "http://localhost:1234/v1",
"groq": "https://api.groq.com/openai/v1",
"openai": "https://api.openai.com/v1",
"others": "https://openrouter.ai/api/v1"
}
}
```
- See the file for all available options and defaults.
- If you are using the `others` LLM type, set the `others` key in the `openai_compatible` URL dict; you can usually find the right value by searching for "YOUR provider name OpenAI-compatible API base URL".
3. **Run the setup script:**
- For macOS or Linux with zsh:
```sh
zsh quick_setup.sh
```
- For Linux with bash:
```sh
bash quick_setup.sh
```
> The script will:
> - Pull the SearxNG Docker image
> - Create and activate a Python virtual environment
> - **USER ACTION NEEDED** Set your `GOOGLE_API_KEY` (edit the script to use your real key). [Obtain your API key (Currently Gemini, OpenAI and ollama is supported)](https://ai.google.dev/gemini-api/docs/api-key) from your preferred LLM provider. (Only needed when google mode is set, else set in model_config.py)
> - Start the SearxNG Docker container
> - Install Python dependencies
> - Start the FastAPI server
4. **That’s it!**
The FastAPI and MCP server will start automatically and you’re ready to go.
**Note:**
- Make sure Docker, Python 3, and pip are installed on your system.
- Edit quick_setup.sh to set your real `GOOGLE_API_KEY` before running (needed if using google models)
- Windows users can use [WSL](https://docs.microsoft.com/en-us/windows/wsl/) or Git Bash to run the script, or follow manual setup steps.
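The `openai_compatible` mapping from the example config in step 2 can be read as a lookup from `llm_type` to a base URL. The sketch below shows that idea only; the project's own resolution logic may differ, and `resolve_base_url` is a hypothetical helper.

```python
# Default mapping copied from the example config in step 2.
OPENAI_COMPATIBLE = {
    "google": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "local": "http://localhost:1234/v1",
    "groq": "https://api.groq.com/openai/v1",
    "openai": "https://api.openai.com/v1",
    "others": "https://openrouter.ai/api/v1",
}


def resolve_base_url(config):
    """Look up the OpenAI-compatible base URL for config['llm_type']."""
    urls = config.get("openai_compatible", OPENAI_COMPATIBLE)
    llm_type = config["llm_type"]
    if llm_type not in urls:
        raise ValueError(f"Unknown llm_type {llm_type!r}; expected one of {sorted(urls)}")
    return urls[llm_type]
```

For a provider that is not listed, overriding the `others` entry in the config and setting `"llm_type": "others"` is enough for this lookup to return the new URL.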
---
### Get Your API Key (optional if you want to use gemini llm/google embedders)
[Obtain your API key (currently Gemini, OpenAI, and Ollama are supported)](https://ai.google.dev/gemini-api/docs/api-key) from your preferred LLM provider. Once you have the key, update the `app.py` file or your environment variables as follows:
```python
import os
os.environ['GOOGLE_API_KEY'] = "YOUR_API_KEY"
```
Alternatively, you can set the API key in your shell before starting the server:
```bash
export YOUR_LLM_API_KEY=your-api-key-here
```
> **Note:** For optimal quality and speed, use Google models with `embedding-001` embeddings and Gemini Flash models. They provide free API keys.
Update the place (default: India) in `utils/config.py` for personalized results.
## 🔧 How to use FASTAPI/tools
**Remove comments after // before pasting**
Swagger UI: http://127.0.0.1:8000/docs if you haven't changed the host and port
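The request examples in this section carry `//` comments, which standard JSON does not allow. If you would rather strip them programmatically than by hand, a small string-aware filter is enough. This is a sketch: the function names are hypothetical, and it tracks string literals so that the `//` inside URLs such as `https://...` is left alone.

```python
import json


def strip_line_comments(text):
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip the comment but keep the newline that ends it.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_commented_json(text):
    """Parse a JSON document that may contain // line comments."""
    return json.loads(strip_line_comments(text))
```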
### 1. Web Search
**Search the web, summarize, and get actionable answers—automatically.**
**Endpoint:**
POST `/web-search`
**Request Example:**
```json
{
"query": "Top news of today worldwide", // Query you want to ask; if you provide a URL and ask to summarise, it will summarize the full page.
"rerank": true, // Set to true for better result ranking.
"num_results": 2, // Number of top results per subquery to explore (higher values = more tokens, slower/more costly).
"local_mode": false, // Set to true to explore local documents (currently, only PDF supported).
"split": true, // Set to false if you want full pages as input to LLMs; false may cause slower/more costly response.
"document_paths": [] // If local_mode is true, add a list of document paths, e.g., ["documents/1706.03762v7.pdf"]
}
```
or QA/summarise local documents
```json
{
"query": "Summarise this research paper",
"rerank": true,
"num_results": 3,
"local_mode": true,
"split": true,
"document_paths": ["documents/1706.03762v7.pdf"] // Must be a list.
}
```
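The two request shapes above differ only in `local_mode` and `document_paths`, so they can be built by one helper. The following is a sketch using the standard library; `web_search_payload` and `post_json` are hypothetical names, and `post_json` assumes the endpoint answers with a JSON body on the default host and port.

```python
import json
import urllib.request


def web_search_payload(query, document_paths=None, rerank=True, num_results=2, split=True):
    """Build the /web-search body; local_mode is switched on when paths are given."""
    if isinstance(document_paths, str):
        raise TypeError("document_paths must be a list, e.g. ['documents/paper.pdf']")
    paths = list(document_paths or [])
    return {
        "query": query,
        "rerank": rerank,
        "num_results": num_results,
        "local_mode": bool(paths),
        "split": split,
        "document_paths": paths,
    }


def post_json(endpoint, payload, base_url="http://127.0.0.1:8000"):
    """POST a JSON body to the running server and return the decoded response."""
    req = urllib.request.Request(
        base_url + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the server running, `post_json("/web-search", web_search_payload("Top news of today worldwide"))` sends the first example.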
---
### 2. Summarize Any Web Page
**Summarize any article or research paper by URL.**
**Endpoint:**
POST `/web-summarize`
**Request Example:**
```json
{
"query": "Write a short blog on the model", // Instruction or question for the fetched page content.
"url": "https://huggingface.co/unsloth/Qwen3-8B-GGUF", // Webpage to fetch content from.
"local_mode": false // Set to true if summarizing a local document.
}
```
---
### 3. YouTube Search
**Search YouTube (supports prompts and batch).**
**Endpoint:**
POST `/youtube-search`
**Request Example:**
```json
{
"query": "switzerland itinerary", // Query to search on YouTube; if a URL is provided, it fetches content from that URL. url should be in format: https://www.youtube.com/watch?v=videoID
"prompt": "I want to plan my Switzerland trip", // Instruction or question for using the fetched content.
"n": 2 // Number of top search results to summarize (only works if query is not a URL).
}
```
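Because `n` only applies when `query` is a search phrase, it can help to check on the client side whether a query is a video URL in the documented `https://www.youtube.com/watch?v=videoID` format. The sketch below is one way to do that; the server's own URL detection is not shown here and may accept other forms, and `youtube_video_id` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlparse


def youtube_video_id(query):
    """Return the video ID if query is a youtube.com/watch?v=... URL, else None."""
    parsed = urlparse(query.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.netloc not in ("www.youtube.com", "youtube.com"):
        return None
    if parsed.path != "/watch":
        return None
    ids = parse_qs(parsed.query).get("v")
    return ids[0] if ids else None
```

A `None` result means the query will be treated here as a search phrase, so `n` is meaningful.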
---
### 4. Reddit Deep Dive
**Custom Reddit search, sort, filter, and get top comments.**
**Endpoint:**
POST `/reddit-search`
**Request Example:**
```json
{
"subreddit": "", // Subreddit to fetch content from (use if url_type is not 'search').
"url_type": "search", // 'search' for phrase search; "url" for url, otherwise, use 'hot', 'top', 'best', etc.
"n": 3, // Number of posts to fetch.
"k": 1, // Number of top comments per post.
"custom_url": "", // Use if you already have a specific Reddit URL.
"time_filter": "all", // Time range: 'all', 'today', 'week', 'month', 'year'.
"search_query": "gemma 3n reviews", // Search phrase (useful if url_type is 'search').
"sort_type": "relevance" // 'top', 'hot', 'new', 'relevance' — controls how results are sorted.
}
```
---
### 5. Map & Location/Route Search
**Find places, routes, and nearby points of interest.**
**Endpoint:**
POST `/map-search`
**Request Example:**
```json
{
"start_location": "MG Road, Bangalore", // Starting point.
"end_location": "Lalbagh, Bangalore", // Destination.
"pois_radius": 500, // Search radius in meters for amenities.
"amenities": "restaurant|cafe|bar|hotel", // Amenities to search near start or end location.
"limit": 3, // Maximum number of results if address not found exactly.
"task": "route_and_pois" // Use 'location_only' for address/coordinates only, or 'route_and_pois' for routes and POIs.
}
```
Or search for a single location (note that OpenStreetMap has an API rate limit):
```json
{
"start_location": "MG Road, Bangalore",
"end_location": "Lalbagh, Bangalore",
"pois_radius": 500,
"amenities": "restaurant|cafe|bar|hotel",
"limit": 3,
"task": "location_only"
}
```
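The `amenities` field is a single pipe-separated string and `task` takes one of two values, so a small builder can guard against typos before the request is sent. This is a sketch with a hypothetical function name; the defaults mirror the examples above.

```python
VALID_TASKS = ("route_and_pois", "location_only")


def map_search_payload(start_location, end_location, amenities=(),
                       pois_radius=500, limit=3, task="route_and_pois"):
    """Build the /map-search body, joining amenities with '|' as the examples do."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    return {
        "start_location": start_location,
        "end_location": end_location,
        "pois_radius": pois_radius,
        "amenities": "|".join(amenities),
        "limit": limit,
        "task": task,
    }
```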
---
### 6. GitHub & Local Repo Directory Tree
**Get the directory structure of any GitHub or local repo.**
**Endpoint:**
POST `/git-tree-search`
**Request Example:**
```json
{
"repobaseurl": "https://github.com/SPThole/CoexistAI/" // Base URL of the repository to explore.
}
```
or for local repo:
```json
{
"repobaseurl": "/home/user/projects/myrepo"
}
```
---
### 7. Ask Questions or Search Inside GitHub/Local Code
**Fetch, search, and analyze code in any repo.**
**Endpoint:**
POST `/git-search`
**Request Example:**
```json
{
"repobaseurl": "https://github.com/google-deepmind/gemma", // Base URL of the repository.
"parttoresearch": "research/t5gemma/t5gemma.py", // Folder or file path relative to the base URL.
"query": "explain t5gemma", // Instruction or question to answer from the file/folder.
"type": "file" // Either 'file' or 'folder'.
}
```
or:
```json
{
"repobaseurl": "https://github.com/openai",
"parttoresearch": "openai-cookbook/examples/mcp",
"query": "Write a medium blog, for beginners",
"type": "folder"
}
```
---
## 🧑‍💻 Usage in Python (requires the Method 2 install; otherwise use `requests` to call the FastAPI endpoints)
- [see example notebook](coexist_tutorial.ipynb)
- [Example Usage patterns](demo_queries.ipynb)
```python
from utils.websearch_utils import query_web_response
from utils.reddit_utils import reddit_reader_response
# Web Exploration
result = await query_web_response(
query="latest AI research in the last 7 days",
date="2025-07-08",
day="Tuesday",
websearcher=searcher, #Searxng
hf_embeddings=hf_embeddings,#embedder
rerank=True,
cross_encoder=cross_encoder,#reranker
model=llmgoogle, #replace with llm
text_model=llmgoogle,#replace with llm
num_results=1,#topk results for each subquery
document_paths=[],
local_mode=False, # True if you have local files in document_paths
split=True
)
result = await query_web_response(
query="summarise in the form of linkedin post https://modelcontextprotocol.io/introduction",
date="2025-07-08",
day="Tuesday",
websearcher=searcher, #Searxng
hf_embeddings=hf_embeddings,#embedder
rerank=True,
cross_encoder=cross_encoder,#reranker
model=llmgoogle, #replace with llm
text_model=llmgoogle,#replace with llm
num_results=1,#topk results for each subquery
document_paths=[],
local_mode=False, # True if you have local files in document_paths
split=True
)
## Reddit Exploration
summary = reddit_reader_response(
subreddit="",
url_type="search",
n=5,
k=2,
custom_url=None,
time_filter="month",
search_query="Gemma 3N reviews",
sort_type="relevance",
model=llmgoogle # replace with your LLM
)
## Map Exploration
from utils.map import generate_map
# Generate a map with route and POIs
html_path = generate_map("MG Road, Bangalore", "Indiranagar, Bangalore", 500, "hotel", 3)
locations = generate_map("MG Road, Bangalore", "Indiranagar, Bangalore", 500, "", 3,"location_only")
## Youtube Exploration
from utils.websearch_utils import *
learnings = youtube_transcript_response("https://www.youtube.com/watch?v=DB9mjd-65gw",
"Summarise this podcast and share me top learnings as a data scientist",
llmgoogle)
podcast = youtube_transcript_response("History of India top 5 interesting facts",
"Make a podcast of this in Hindi, 5 minutes long",
llmgoogle,
1)
## Git exploration
from utils.git_utils import *
tree = await git_tree_search("https://github.com/SPThole/CoexistAI")
content = await git_specific_content("https://github.com/SPThole/CoexistAI","README.md","file")
```
---
## 🤖 Advanced Patterns & Extensibility
- **Plug in your own LLMs**: Swap out Google Gemini for OpenAI, Ollama, or any LangChain-supported model.
- **Custom Tools**: Add your own tools to the agent system for new capabilities (see `utils/` for examples).
- **Async/Parallel**: All web and document search utilities are asynchronous for high performance.
- **MCP Servers**: Connect local apps such as LM Studio to the CoexistAI MCP server, with everything running locally.
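The usage examples in this README call `await` at the top level, which works in a notebook; in a plain script the coroutines need an event loop. The sketch below shows one way to run several async calls concurrently with a cap on parallelism. `fake_search` is a hypothetical stand-in for an async utility such as `query_web_response(...)`, and `run_all` is not part of the project.

```python
import asyncio


async def run_all(coroutine_factories, limit=3):
    """Run async jobs concurrently, at most `limit` at a time, keeping input order."""
    gate = asyncio.Semaphore(limit)

    async def guarded(factory):
        async with gate:
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in coroutine_factories))


async def fake_search(query):
    """Hypothetical stand-in for an async call such as query_web_response(...)."""
    await asyncio.sleep(0.01)
    return f"answer for {query!r}"


if __name__ == "__main__":
    jobs = [lambda q=q: fake_search(q) for q in ("llm news", "mcp intro")]
    print(asyncio.run(run_all(jobs)))
```

Passing factories (callables that create the coroutine) rather than coroutine objects means nothing starts until a slot under the semaphore is free.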
---
## 🤝 Contributing
Pull requests, issues, and feature suggestions are welcome! Please open an issue or PR on GitHub.
---
## ⚖️ Legality & Responsible Use
**Non-Commercial Use Only:** CoexistAI is intended strictly for research, prototyping, and educational purposes. Commercial or production use of this project or its outputs is **not permitted**.
**Web and Reddit Data:** CoexistAI uses public web scraping and Reddit JSON endpoints. It does not use the official Reddit API. Always respect robots.txt, site terms, and copyright law when using this tool.
**YouTube & Other Sources:** Use responsibly and in accordance with the terms of service of each data provider.
**Compliance:** You are responsible for ensuring your use of this tool complies with all relevant terms, conditions, and laws.
---
## 📄 License
This project is licensed under a custom Non-Commercial Research and Educational Use License. Use of this software is permitted only for non-commercial research, prototyping, and educational purposes. Commercial or production use is strictly prohibited. See the LICENSE file for full terms and conditions.
---
## ⭐ Star & Share
If you find this project useful, please star the repo and share it with your network!
---
## Acknowledgement:
Special thanks to users like @[TotallyTofu](https://github.com/TotallyTofu) for their valuable feedback.
## 📬 Contact
For questions, reach out via GitHub Issues or open a discussion.
================================================
FILE: README_MCP.md
================================================
# CoexistAI v0.0.2
<p align="center">
<img src="artifacts/v002mcplogo.jpeg" alt="CoexistAI MCP Logo" width="200"/>
</p>
## 🚀 What's New in v2: [Example Usage patterns](demo_queries.ipynb)
- **Direct location search:** You can now search for any place, not just find routes!
- **Advanced Reddit search:** Use your own phrases to search across reddit; results ranked better with BM25 for sharper discovery.
- **YouTube power-up:** Search and summarize YouTube using your own search phrases or video URLs and even add a prompt for custom responses.
- **Explore your folders/files**: Explore local folders and files, with support for diverse file types including vision-integrated image formats ('.png', '.jpg', '.jpeg', '.gif', '.bmp', '.webp', '.tiff', '.svg', etc.); more to come.
- **Sharper web search:** More focused and actionable results than ever before.
- **MCP support everywhere:** Now fully connect coexistai to LM Studio and other MCP hosts—seamless integration! [See Guide](README_MCP.md)
- **GitHub & local repo explorer:** Explore and ask questions about codebases; works with both GitHub and local repos!
---
## 🛠 Quick Start
### Method (Less flexible but faster):
**Prerequisite:** Make sure Docker is installed and the Docker daemon is running.
1. **Clone the repository:**
```sh
git clone https://github.com/SPThole/CoexistAI.git coexistai
cd coexistai
```
2. **Configure your model and embedding settings:**
- Edit `model_config.py` to set your preferred LLM and embedding model.
- Edit above file to set your preferred SearxNG host and port (if needed)
- Add LLM and Embedder API Key (for google mode both would be same)
- Example (for full local mode):
```py
model_config = {
# Name of the LLM model to use. For local models, use the model name served by your local server.
"llm_model_name": "google/gemma-3-12b",
# LLM provider type: choose from 'google', 'local', 'groq', or 'openai' or 'others'
# in case of 'others' (base url needs to be updated in openai_compatible given below accordingly).
# Make sure to update the api_key variable above to match the provider.
"llm_type": "local",
# List of tools or plugins to use with the LLM, if any. Set to None if not used.
"llm_tools": None,
# Additional keyword arguments for LLM initialization.
"llm_kwargs": {
"temperature": 0.1, # Sampling temperature for generation.
"max_tokens": None, # Maximum number of tokens to generate (None for default).
"timeout": None, # Timeout for API requests (None for default).
"max_retries": 2, # Maximum number of retries for failed requests.
"api_key": llm_api_key, # API key for authentication.
},
# Name of the embedding model to use.
# For Google, use their embedding model names. For local/HuggingFace, use the model path or name.
"embedding_model_name": "nomic-ai/nomic-embed-text-v1",
"embed_kwargs":{}, #additional kwargs for embedding model initialization
# Embedding backend: 'google' for Google, 'infinity_emb' for local/HuggingFace models.
"embed_mode": "infinity_emb",
# Name of the cross-encoder model for reranking, typically a HuggingFace model.
"cross_encoder_name": "BAAI/bge-reranker-base"
}
```
- See the file for all available options and defaults.
- If you are using the `others` LLM type, set the `others` key in the `openai_compatible` URL dict; you can usually find the right value by searching for "YOUR provider name OpenAI-compatible API base URL".
3. **Run the setup script:**
- For macOS or Linux with zsh:
```sh
zsh quick_setup.sh
```
- For Linux with bash:
```sh
bash quick_setup.sh
```
> The script will:
> - Pull the SearxNG Docker image
> - Create and activate a Python virtual environment
> - **USER ACTION NEEDED** Set your `GOOGLE_API_KEY` (edit the script to use your real key). [Obtain your API key (Currently Gemini, OpenAI and ollama is supported)](https://ai.google.dev/gemini-api/docs/api-key) from your preferred LLM provider. (Only needed when google mode is set, else set in model_config.py)
> - Start the SearxNG Docker container
> - Install Python dependencies
> - Start the FastAPI server
4. **That’s it!**
The FastAPI and MCP server will start automatically and you’re ready to go.
**Note:**
- Make sure Docker, Python 3, and pip are installed on your system.
- Edit quick_setup.sh to set your real `GOOGLE_API_KEY` before running (needed if using google models)
- Windows users can use [WSL](https://docs.microsoft.com/en-us/windows/wsl/) or Git Bash to run the script, or follow manual setup steps.
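Before running the setup script, it can be worth sanity-checking the values in `model_config`. The sketch below checks only what the comments in the step 2 example state: the allowed `llm_type` and `embed_mode` values and the presence of the two model names. The function name is hypothetical and the real application may validate differently or not at all.

```python
LLM_TYPES = {"google", "local", "groq", "openai", "others"}
EMBED_MODES = {"google", "infinity_emb"}


def check_model_config(cfg):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems = []
    if cfg.get("llm_type") not in LLM_TYPES:
        problems.append(f"llm_type must be one of {sorted(LLM_TYPES)}")
    if cfg.get("embed_mode") not in EMBED_MODES:
        problems.append(f"embed_mode must be one of {sorted(EMBED_MODES)}")
    for field in ("llm_model_name", "embedding_model_name"):
        if not cfg.get(field):
            problems.append(f"{field} is missing or empty")
    return problems
```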
## 🔍 What Can You Do? (API Highlights & Examples)
**Remove comments after // before pasting**
Swagger UI: http://127.0.0.1:8000/docs if you haven't changed the host and port
### 1. Web Search
**Search the web, summarize, and get actionable answers—automatically.**
**Endpoint:**
POST `/web-search`
**Request Example:**
```json
{
"query": "Top news of today worldwide", // Query you want to ask; if you provide a URL and ask to summarise, it will summarize the full page.
"rerank": true, // Set to true for better result ranking.
"num_results": 2, // Number of top results per subquery to explore (higher values = more tokens, slower/more costly).
"local_mode": false, // Set to true to explore local documents (currently, only PDF supported).
"split": true, // Set to false if you want full pages as input to LLMs; false may cause slower/more costly response.
"document_paths": [] // If local_mode is true, add a list of document paths, e.g., ["documents/1706.03762v7.pdf"]
}
```
or QA/summarise local documents
```json
{
"query": "Summarise this research paper",
"rerank": true,
"num_results": 3,
"local_mode": true,
"split": true,
"document_paths": ["documents/1706.03762v7.pdf"] // Must be list.
}
```
---
### 2. Summarize Any Web Page
**Summarize any article or research paper by URL.**
**Endpoint:**
POST `/web-summarize`
**Request Example:**
```json
{
"query": "Write a short blog on the model", // Instruction or question for the fetched page content.
"url": "https://huggingface.co/unsloth/Qwen3-8B-GGUF", // Webpage to fetch content from.
"local_mode": false // Set to true if summarizing a local document.
}
```
---
### 3. YouTube Search
**Search YouTube (supports prompts and batch).**
**Endpoint:**
POST `/youtube-search`
**Request Example:**
```json
{
"query": "switzerland itinerary", // Query to search on YouTube; if a URL is provided, it fetches content from that URL. url should be in format: https://www.youtube.com/watch?v=videoID
"prompt": "I want to plan my Switzerland trip", // Instruction or question for using the fetched content.
"n": 2 // Number of top search results to summarize (only works if query is not a URL).
}
```
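Since `n` only applies when `query` is a search phrase, a client may want to check which case it is in before building the payload. The sketch below tests strictly for the documented `https://www.youtube.com/watch?v=videoID` form; the helper names are illustrative, and whether the server also accepts other YouTube URL shapes is not something this sketch assumes.

```python
from urllib.parse import parse_qs, urlparse


def is_watch_url(query: str) -> bool:
    """Return True if query matches https://www.youtube.com/watch?v=<id>."""
    parsed = urlparse(query.strip())
    return (
        parsed.scheme == "https"
        and parsed.netloc == "www.youtube.com"
        and parsed.path == "/watch"
        and bool(parse_qs(parsed.query).get("v"))
    )


def youtube_payload(query: str, prompt: str, n: int = 1) -> dict:
    """Build a /youtube-search body, sending n only for search phrases."""
    payload = {"query": query, "prompt": prompt}
    if not is_watch_url(query):
        payload["n"] = n
    return payload


search_body = youtube_payload(
    "switzerland itinerary", "I want to plan my Switzerland trip", n=2
)
```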
---
### 4. Reddit Deep Dive
**Custom Reddit search, sort, filter, and get top comments.**
**Endpoint:**
POST `/reddit-search`
**Request Example:**
```json
{
"subreddit": "", // Subreddit to fetch content from (use if url_type is not 'search').
"url_type": "search", // 'search' for phrase search; "url" for url, otherwise, use 'hot', 'top', 'best', etc.
"n": 3, // Number of posts to fetch.
"k": 1, // Number of top comments per post.
"custom_url": "", // Use if you already have a specific Reddit URL.
"time_filter": "all", // Time range: 'all', 'today', 'week', 'month', 'year'.
"search_query": "gemma 3n reviews", // Search phrase (useful if url_type is 'search').
"sort_type": "relevance" // 'top', 'hot', 'new', 'relevance' — controls how results are sorted.
}
```
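The fields interact: when `search_query` is non-empty, the `/reddit-search` handler in `app.py` switches `url_type` to `'search'` regardless of what was sent. A small payload builder that makes this explicit on the client side might look like the sketch below; the function name is hypothetical, and the defaults are copied from the `RedditSearchRequest` model.

```python
def reddit_payload(
    search_query: str = "",
    subreddit: str = "",
    url_type: str = "hot",
    n: int = 3,
    k: int = 1,
    time_filter: str = "all",
    sort_type: str = "relevance",
    custom_url: str = "",
) -> dict:
    """Build a /reddit-search body, mirroring the server's url_type override."""
    if search_query:
        # The server forces this anyway; doing it here keeps the payload honest.
        url_type = "search"
    return {
        "subreddit": subreddit,
        "url_type": url_type,
        "n": n,
        "k": k,
        "custom_url": custom_url,
        "time_filter": time_filter,
        "search_query": search_query,
        "sort_type": sort_type,
    }


phrase_search = reddit_payload(search_query="gemma 3n reviews")
# "LocalLLaMA" is only an example subreddit name chosen for illustration.
top_posts = reddit_payload(subreddit="LocalLLaMA", url_type="top", time_filter="week")
```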
---
### 5. Map & Location/Route Search
**Find places, routes, and nearby points of interest.**
**Endpoint:**
POST `/map-search`
**Request Example:**
```json
{
"start_location": "MG Road, Bangalore", // Starting point.
"end_location": "Lalbagh, Bangalore", // Destination.
"pois_radius": 500, // Search radius in meters for amenities.
"amenities": "restaurant|cafe|bar|hotel", // Amenities to search near start or end location.
"limit": 3, // Maximum number of results if address not found exactly.
"task": "route_and_pois" // Use 'location_only' for address/coordinates only, or 'route_and_pois' for routes and POIs.
}
```
Or resolve only the addresses/coordinates of the given locations (note that OpenStreetMap rate-limits its API):
```json
{
"start_location": "MG Road, Bangalore",
"end_location": "Lalbagh, Bangalore",
"pois_radius": 500,
"amenities": "restaurant|cafe|bar|hotel",
"limit": 3,
"task": "location_only"
}
```
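The two request shapes differ only in `task`, and `amenities` is a single `|`-separated string rather than a list. A hedged helper for building either body is sketched below; the function name and the validation are additions for illustration, while field names and defaults come from the examples above.

```python
from typing import Optional, Sequence

VALID_TASKS = ("route_and_pois", "location_only")


def map_payload(
    start_location: Optional[str],
    end_location: Optional[str] = None,
    amenities: Sequence[str] = ("restaurant", "cafe", "bar", "hotel"),
    pois_radius: int = 500,
    limit: int = 3,
    task: str = "route_and_pois",
) -> dict:
    """Build a /map-search body, joining amenities with '|'."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    return {
        "start_location": start_location,
        "end_location": end_location,
        "pois_radius": pois_radius,
        "amenities": "|".join(amenities),
        "limit": limit,
        "task": task,
    }


route_body = map_payload("MG Road, Bangalore", "Lalbagh, Bangalore")
lookup_body = map_payload("MG Road, Bangalore", task="location_only")
```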
---
### 6. GitHub & Local Repo Directory Tree
**Get the directory structure of any GitHub or local repo.**
**Endpoint:**
POST `/git-tree-search`
**Request Example:**
```json
{
"repobaseurl": "https://github.com/SPThole/CoexistAI/" // Base URL of the repository to explore.
}
```
or for local repo:
```json
{
"repobaseurl": "/home/user/projects/myrepo"
}
```
---
## 🧑💻 Integrate coexistai as an MCP Server (LM Studio, Cursor, etc.)
Starting with LM Studio 0.3.17, LM Studio acts as a Model Context Protocol (MCP) Host. This means you can connect MCP servers to the app and make them available to your models.
You can now run coexistai as an MCP server—**plug it into LM Studio** or any other MCP-compatible tool!
### How to Integrate with LM Studio
1. Download the latest LM Studio (version 0.3.17 or later): https://lmstudio.ai/docs/app
2. Find the [MCP guide](https://lmstudio.ai/docs/app/plugins/mcp)
<p align="center">
<img src="artifacts/lmstudio.png" alt="CoexistAI MCP Logo" width="600"/>
</p>
1. **Edit your `mcp.json` in LM Studio:**
- Go to the Program tab → `Install > Edit mcp.json`
- Add coexistai as a server. Example:
```json
{
"mcpServers": {
"coexistai": {
"url": "http://127.0.0.1:8000/mcp",
"timeout": 180000
}
}
}
```
- Replace the `url` value with your actual server address, and add a token if your setup needs one.
2. **Or use an "Add to LM Studio" button** (if one is provided on the coexistai website).
**Security note:** Only use MCP servers you trust, since servers can access files and the network.
3. Use the [system prompt](system_prompt.py) as the system prompt (context) in LM Studio.
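If your `mcp.json` already lists other servers, pasting the example over it would drop them. A minimal sketch for merging the coexistai entry into an existing file instead is shown below. The function name is hypothetical, the location of `mcp.json` depends on your LM Studio installation, and the `url` and `timeout` values are simply copied from the example above.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name: str, url: str, timeout: int = 180000) -> dict:
    """Add or update one entry under mcpServers, keeping all other entries."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"url": url, "timeout": timeout}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demonstration on a throwaway file so no real configuration is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text(
        '{"mcpServers": {"other": {"url": "http://localhost:9000/mcp"}}}',
        encoding="utf-8",
    )
    merged = add_mcp_server(demo_path, "coexistai", "http://127.0.0.1:8000/mcp")
    written = json.loads(demo_path.read_text(encoding="utf-8"))
```

Back up the real file before running anything like this against it.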
---
## 🏆 Best Local Model
For fast, high-quality local LLM results, I have personally liked the following as the LM Studio model:
**unsloth/Qwen3-8B-GGUF** and **google/gemma-3-12b**.
For the model behind the MCP server, prefer one that is fast yet good at structured output generation.
I am working towards making the system work with smaller local models that are less capable.
---
================================================
FILE: __init__.py
================================================
================================================
FILE: app.py
================================================
from utils.websearch_utils import *
from utils.reddit_utils import *
from utils.map import *
from fastapi import FastAPI, Request
from pydantic import BaseModel
from utils.utils import *
from utils.map import *
from utils.git_utils import *
from utils.startup_banner import display_startup_banner, display_shutdown_banner, get_ascii_banner
from utils.knowledge_base import create_knowledge_base
from utils.crawler_utils import crawl_and_create_kb
import html as _html
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, HTMLResponse
from fastapi.staticfiles import StaticFiles
from uuid import uuid4
import subprocess
from utils.tts_utils import *
from fastapi_mcp import FastApiMCP
import json
import os
import atexit
from model_config import *
import time
from typing import List, Optional, Union
# Application state for startup/reload notifications
app_state = {"status": "starting", "message": "Initializing components..."}
def init_components():
"""(Re)initialize model and embedding components from model_config. This is safe to call
at startup or after config reload. It updates module-level globals used by request handlers.
"""
global llm, hf_embeddings, cross_encoder, text_splitter, searcher, date, day, llm_model_name, llm_type, llm_kwargs, embedding_model_name, embed_mode, cross_encoder_name
app_state['status'] = 'starting'
app_state['message'] = 'Loading models and embeddings (this may take a minute)...'
print("=== CoexistAI Startup: Loading models and embeddings ===", flush=True)
try:
# Read config values
print("Reading configuration from model_config...", flush=True)
llm_model_name = model_config.get("llm_model_name", llm_model_name if 'llm_model_name' in globals() else 'google/gemma-3-12b')
llm_type = model_config.get("llm_type", llm_type if 'llm_type' in globals() else 'local')
llm_kwargs = model_config.get("llm_kwargs", llm_kwargs if 'llm_kwargs' in globals() else {'temperature':0.1,'api_key': llm_api_key})
embedding_model_name = model_config.get("embedding_model_name", embedding_model_name if 'embedding_model_name' in globals() else 'models/embedding-001')
embed_mode = model_config.get("embed_mode", embed_mode if 'embed_mode' in globals() else 'google')
cross_encoder_name = model_config.get("cross_encoder_name", cross_encoder_name if 'cross_encoder_name' in globals() else 'BAAI/bge-reranker-base')
print(f"Config loaded: llm_type={llm_type}, llm_model={llm_model_name}, embed_mode={embed_mode}", flush=True)
# instantiate generative LLM
print(f"Initializing LLM: {llm_model_name} ({llm_type})...", flush=True)
llm = get_generative_model(
model_name=llm_model_name,
type=llm_type,
base_url=openai_compatible.get(llm_type, 'https://api.openai.com/v1'),
_tools=None,
kwargs=llm_kwargs
)
print("LLM initialized successfully", flush=True)
# load embeddings and cross-encoder
print(f"Loading embeddings: {embedding_model_name} (mode={embed_mode})...", flush=True)
hf_embeddings, cross_encoder = load_model(embedding_model_name,
_embed_mode=embed_mode,
cross_encoder_name=cross_encoder_name,
kwargs=model_config.get('embed_kwargs', {}))
print("Embeddings and cross-encoder loaded successfully", flush=True)
print("Initializing text splitter...", flush=True)
text_splitter = TokenTextSplitter(chunk_size=128, chunk_overlap=32)
# recreate searxng searcher
print(f"Initializing SearchWeb with {HOST_SEARXNG}:{PORT_NUM_SEARXNG}...", flush=True)
searcher = SearchWeb(PORT_NUM_SEARXNG, HOST_SEARXNG)
print("Getting local date and time...", flush=True)
date, day = get_local_data()
app_state['status'] = 'ready'
app_state['message'] = 'Ready'
print("=== CoexistAI Startup Complete: All components ready ===", flush=True)
except Exception as e:
app_state['status'] = 'error'
app_state['message'] = f'Initialization failed: {e}'
# keep exception visible in logs
print(f"=== CoexistAI Startup FAILED: {e} ===", flush=True)
logger.exception('Failed to initialize components')
raise
# Initialize components once at import/startup
# (This is now done in the lifespan startup handler)
# try:
# init_components()
# except Exception as e:
# # already logged; keep going so admin endpoints can be used to diagnose/fix
# logger.error(f'Failed to initialize at startup: {e}')
# # Update status to show startup failed but app is running for diagnostics
# app_state['status'] = 'error'
# app_state['message'] = f'Startup failed: {e}'
# Use config values for model and embedding paths
llm_model_name = model_config.get("llm_model_name", 'google/gemma-3-12b')
llm_type = model_config.get("llm_type", 'local')
llm_tools = model_config.get("llm_tools",None)
llm_base_url = openai_compatible.get(model_config['llm_type'],
'https://api.openai.com/v1')
llm_kwargs = model_config.get("llm_kwargs", {'temperature': 0.1,
'max_tokens': None,
'timeout': None,
'api_key':llm_api_key,
'max_retries': 2})
embed_kwargs = model_config.get("embed_kwargs", {})
embedding_model_name = model_config.get("embedding_model_name", "models/embedding-001")
embed_mode = model_config.get("embed_mode", "google")
cross_encoder_name = model_config.get("cross_encoder_name", "BAAI/bge-reranker-base")
if not is_searxng_running():
# Running `docker` from inside a container is not supported in most environments
# (docker binary may not exist or there are permission restrictions). Instead,
# log a clear warning and let orchestration (docker-compose / external admin)
# manage the searxng service.
try:
logger.warning(f"SearxNG not reachable at {HOST_SEARXNG}:{PORT_NUM_SEARXNG}. Please start the searxng service (e.g. `docker compose up searxng`) or ensure it's reachable from this container.")
except Exception:
print(f"SearxNG not reachable at {HOST_SEARXNG}:{PORT_NUM_SEARXNG}. Please start searxng service.")
else:
try:
logger.info("SearxNG is reachable.")
except Exception:
print("SearxNG docker container is already running.")
llm = get_generative_model(
model_name=llm_model_name,
type=llm_type,
base_url=llm_base_url,
_tools=None,
kwargs=llm_kwargs
)
hf_embeddings, cross_encoder = load_model(embedding_model_name,
_embed_mode=embed_mode,
cross_encoder_name=cross_encoder_name,
kwargs=embed_kwargs)
text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=128)
searcher = SearchWeb(PORT_NUM_SEARXNG, HOST_SEARXNG)
date, day = get_local_data()
# Lifespan context manager for startup/shutdown
from contextlib import asynccontextmanager
@asynccontextmanager
async def lifespan(app_instance):
# Startup
print("\n" + "="*80, flush=True)
print("FastAPI app starting up...", flush=True)
logger.info("FastAPI app starting up...")
app_state['status'] = 'starting'
app_state['message'] = 'Initializing components...'
try:
init_components()
print("="*80 + "\n", flush=True)
except Exception as e:
print(f"STARTUP ERROR: {e}", flush=True)
print("="*80 + "\n", flush=True)
logger.error(f"Failed to initialize components during startup: {e}", exc_info=True)
app_state['status'] = 'error'
app_state['message'] = f'Startup failed: {e}'
yield
# Shutdown
logger.info("FastAPI app shutting down...")
app_state['status'] = 'shutting_down'
app_state['message'] = 'App shutting down'
app = FastAPI(title='coexistai', lifespan=lifespan)
# Mount static files
app.mount("/artifacts", StaticFiles(directory="artifacts"), name="artifacts")
# --- Admin endpoints for runtime config reload/update ---------------------------------
from fastapi import HTTPException, Depends
def _check_admin_token(token: Optional[str] = None):
# The caller reads the token from the X-Admin-Token request header; it is
# compared here against the ADMIN_TOKEN environment variable.
env_token = os.environ.get('ADMIN_TOKEN')
if env_token is None:
# no admin token configured; disallow by default to avoid accidental exposure
raise HTTPException(status_code=403, detail='Admin actions disabled (no ADMIN_TOKEN set)')
if token != env_token:
raise HTTPException(status_code=401, detail='Invalid admin token')
return True
@app.post('/admin/reload-config')
async def admin_reload_config(request: Request):
"""Reload model config from the configured JSON file. Protected by ADMIN_TOKEN env var.
Send header 'X-Admin-Token: <token>' to authenticate. Returns the reloaded config on success.
"""
token = request.headers.get('X-Admin-Token')
try:
_check_admin_token(token)
except HTTPException as e:
raise e
try:
new_cfg = reload_model_config()
except Exception as e:
raise HTTPException(status_code=500, detail=f'Failed to reload config: {e}')
# apply config immediately
try:
init_components()
except Exception as e:
raise HTTPException(status_code=500, detail=f'Config reloaded but applying failed: {e}')
return {"status": "ok", "model_config": new_cfg, "app_state": app_state}
@app.post('/admin/update-config')
async def admin_update_config(request: Request):
"""Overwrite the config file with the posted JSON body. Protected by ADMIN_TOKEN.
Body must be a JSON object compatible with the config schema. Returns saved config on success.
"""
token = request.headers.get('X-Admin-Token')
try:
_check_admin_token(token)
except HTTPException as e:
raise e
try:
body = await request.json()
except Exception:
raise HTTPException(status_code=400, detail='Invalid JSON body')
cfg_path = os.environ.get('CONFIG_PATH', os.path.join(os.path.dirname(__file__), 'config', 'model_config.json'))
cfg_dir = os.path.dirname(cfg_path)
os.makedirs(cfg_dir, exist_ok=True)
try:
with open(cfg_path, 'w') as f:
json.dump(body, f, indent=2)
except Exception as e:
raise HTTPException(status_code=500, detail=f'Failed to write config: {e}')
try:
new_cfg = reload_model_config(cfg_path)
except Exception as e:
raise HTTPException(status_code=500, detail=f'Config saved but reload failed: {e}')
# apply new config immediately
try:
init_components()
except Exception as e:
raise HTTPException(status_code=500, detail=f'Config saved but applying failed: {e}')
return {"status": "ok", "saved": cfg_path, "model_config": new_cfg, "app_state": app_state}
# --------------------------------------------------------------------------------------
@app.get('/admin', response_class=HTMLResponse)
async def admin_page():
"""Serve the static admin UI and inject the ASCII banner at request time.
The static UI lives at ./static/admin.html so it's easier to edit and keep
app.py small.
"""
try:
static_path = os.path.join(os.path.dirname(__file__), 'static', 'admin.html')
with open(static_path, 'r', encoding='utf-8') as f:
html = f.read()
except Exception as e:
return HTMLResponse(content=f"<html><body>Error loading admin UI: {e}</body></html>", status_code=500)
# inject the ascii banner into the HTML, escaped for safety
try:
banner = get_ascii_banner() or ''
banner_html = _html.escape(banner)
html = html.replace('BANNER_PLACEHOLDER', banner_html)
except Exception:
pass
return HTMLResponse(content=html)
@app.get('/status')
async def status():
"""Return basic app startup/reload status for UI and health checks."""
return app_state
@app.get('/admin/config')
async def admin_get_config():
"""Return the effective model_config plus helper globals for the admin UI."""
# safe copy of model
# include openai_compatible and host/port defaults
def _mask(s):
try:
if not s:
return ''
s = str(s)
if len(s) <= 6:
return '*' * len(s)
return s[:3] + '...' + s[-3:]
except Exception:
return ''
cfg = dict(model_config)
cfg['_meta'] = {
'openai_compatible': openai_compatible,
'HOST_APP': globals().get('HOST_APP'),
'PORT_NUM_APP': globals().get('PORT_NUM_APP'),
'HOST_SEARXNG': globals().get('HOST_SEARXNG'),
'PORT_NUM_SEARXNG': globals().get('PORT_NUM_SEARXNG'),
'llm_api_key': _mask(globals().get('llm_api_key')),
'embed_api_key': _mask(globals().get('embed_api_key')),
}
return cfg
# Register shutdown handler
atexit.register(display_shutdown_banner)
origins = [
"*", # Allow all origins (use specific domains in production)
]
app.add_middleware(
CORSMiddleware,
allow_origins=origins, # e.g. ["http://localhost", "http://localhost:3000"]
allow_credentials=True,
allow_methods=["*"], # Allow all HTTP methods (including OPTIONS)
allow_headers=["*"], # Allow all headers
)
@app.get('/')
async def root():
return {"message": "Welcome to CoexistAI!"}
class WebSearchRequest(BaseModel):
query: str
rerank: bool = True
num_results: int = 2
local_mode: bool = False
split: bool = True
document_paths: list[str] = [] # List of paths for local documents
vectordb: str = "" # Optional vector database name to use instead of search
quick_answer: bool = False # Whether to force quick answer mode (disables summary mode)
class YouTubeSearchRequest(BaseModel):
query: str
prompt: str
n: int = 1 # Number of videos to summarize, default is 1
class RedditSearchRequest(BaseModel):
subreddit: Optional[str] = None
url_type: str = "hot"
n: int = 3
k: int = 1
custom_url: Optional[str] = None
time_filter: str = "all"
search_query: Optional[str] = None
sort_type: str = "relevance"
class MapSearchRequest(BaseModel):
start_location: Optional[str] = None # Start location can be a string or None
end_location: Optional[str] = None # End location can be a string or None
pois_radius: int = 500 # Default radius for POIs in meters
amenities: str = "restaurant|cafe|bar|hotel" # Default amenities to search for
limit: int = 3 # Default number of results to return
task: str = "route_and_pois" # Default task is to find a route
class WebSummarizeRequest(BaseModel):
query: str
url: str
local_mode: bool = False
class GitTreeRequest(BaseModel):
repobaseurl: str
class GitSearchRequest(BaseModel):
repobaseurl: str
parttoresearch: str
query: str
type: str
class LocalFolderTreeRequest(BaseModel):
folder_path:str
level: str = 'broad-first'
prefix: str = ''
class ResearchCheckRequest(BaseModel):
query: str
toolsshorthand: str # Bullet-point facts/information collected from earlier tool calls
class ClickableElementRequest(BaseModel):
url:str
query:str
topk:int=10
class PodcastRequest(BaseModel):
text: Optional[str] = None
prompt: Optional[str] = None # Optional theme for the podcast
class BasicTTSRequest(BaseModel):
text: Optional[str] = None
voice: str = "am_santa"
lang: str = "en-us"
filename: str = ""
class KnowledgeBaseRequest(BaseModel):
document_paths: list[str] # List of paths to create knowledge base from
class CrawlerRequest(BaseModel):
url_or_urls: Union[str, List[str]] # Single URL to crawl or list of URLs to scrape
keywords: Optional[List[str]] = [""] # Optional keywords to filter content
depth: Optional[int] = None # Crawl depth for crawling (None for full website crawl)
crawl: bool = True # Whether to crawl (True) or process URLs directly (False)
min_delay: float = 1.0 # Minimum delay between requests in seconds
max_delay: float = 2.0 # Maximum delay between requests in seconds
max_pages: int = 10000 # Maximum number of pages to collect during crawling
url_keyword: Optional[str] = "" # Optional keyword to filter URLs by presence in the URL string
@app.post('/clickable-elements', operation_id="get_website_structure")
async def get_website_structure(request: ClickableElementRequest):
"""
Retrieves the top-k clickable elements from a given URL based on a query.
This will help you to find out if there are any clickable elements on the page that match the query.
You can use this to find deeper links since connected pieces of information are often linked together.
RECOMMENDATION: Be specific with the query to get the most relevant clickable elements.
Args:
url (str): The URL to search for clickable elements.
query (str): The query to filter the clickable elements.
topk (int): The number of top clickable elements to return.
Returns:
list: A list of dictionaries containing the title, URL, and score of each clickable element.
"""
return await get_topk_bm25_clickable_elements(request.url, request.query, request.topk)
@app.post('/local-folder-tree', operation_id="get_local_folder_tree")
async def get_local_folder_tree(request: LocalFolderTreeRequest):
"""
Async Markdown folder tree.
Args:
folder_path (str): Root directory.
level (str):
- 'full': Show all folders and files, recursively, except hidden/system/cache entries.
- 'broad-first': Only show immediate (top-level) folders and files (no nesting).
- 'broad-second': Show top-level folders/files and their immediate child folders/files (two levels, no deeper).
prefix (str): Indentation (internal)
Returns:
str: Markdown tree string
"""
return await folder_tree(request.folder_path, level=request.level, prefix=request.prefix)
@app.post('/git-tree-search',operation_id="get_git_tree")
async def get_git_tree(request:GitTreeRequest):
"""
Retrieves and returns the directory tree structure of a GitHub repository or a local Git repository.
Args:
repobaseurl (str): The base URL of the GitHub repository (e.g., 'https://github.com/user/repo')
or the path to the local repository on your system.
Returns:
str: The directory tree structure as a string.
"""
return await git_tree_search(request.repobaseurl)
@app.post('/git-search',operation_id="get_git_search")
async def get_git_search(request:GitSearchRequest):
"""
Fetches the content of a specific part (directory or file) of a repository and answers the user's query based on it.
The repository can be either:
- a GitHub repository (via URL), or
- a local Git repository (via local path).
Use get_git_tree first to understand the structure of the repo and decide which part is useful for the query.
Args:
repobaseurl (str): The base URL of the GitHub repository (e.g., 'https://github.com/user/repo'),
or the local path to the root of the repository.
parttoresearch (str): The path inside the repository you wish to access (e.g., 'basefolder/subfolder'). Use get_git_tree to find the specific part if needed.
query (str): The user's query.
type (str): "Folder" or "file"
Returns:
str: Response of the users query based on the content fetched
"""
content = await git_specific_content(request.repobaseurl,request.parttoresearch,request.type)
prompt = f"""You are a professional coder, your task is to answer the users query based on the content fetched from git repo
User Query: {request.query}
Fetched Content: {content}
"""
result = await llm.ainvoke(
prompt
)
return result.content
@app.post('/web-search',operation_id="get_web_search")
async def websearch(request: WebSearchRequest):
"""
Performs a web search and retrieves results, then generates a response based on those results.
It also returns suggested next steps; continue your research until no next steps remain.
RECOMMENDATION: Be specific with the query to get the most relevant results, and set num_results to 2 for better results.
Args:
query (str): The input query.
rerank (bool): Whether to rerank results.
num_results (int, optional): Number of search results to retrieve. Defaults to 2; the handler caps larger values at 2.
document_paths (list of str, optional): List of paths for local documents/folders, for example [path1, path2, path3] when different tasks relate to different documents. Defaults to an empty list.
local_mode (bool, optional): Whether to process local documents. Defaults to False.
split (bool, optional): Whether to split documents into chunks. Defaults to True.
vectordb (str, optional): Name of an existing vector database to query instead of performing a search. Defaults to "".
quick_answer (bool, optional): Whether to force quick answer mode (disables summary mode). Defaults to False.
Returns:
str: Generated response to query based on the retrieved and reranked search results and sources
"""
# You may need to adjust these arguments based on your actual setup
# For demonstration, using None for models and embeddings
try:
result = await query_web_response(
query=request.query,
date=date,
day=day,
websearcher=searcher, # Replace with your actual searcher instance if needed
hf_embeddings=hf_embeddings,
rerank=request.rerank,
cross_encoder=cross_encoder,
model=llm,
text_model=llm,
num_results=min(2,request.num_results),
document_paths=request.document_paths,
local_mode=request.local_mode,
split=request.split,
vectordb=request.vectordb,
quick_answer=request.quick_answer
)
return "result:" + result[0] + '\n\nsources:' + result[1]
except Exception:
return "No Websites found, Try rephrasing query"
@app.post('/create-knowledge-base', operation_id="get_knowledge_base")
async def create_kb(request: KnowledgeBaseRequest):
"""
Creates a knowledge base from the provided document paths.
Processes all files in the paths, embeds them, and saves to a vector database.
Args:
document_paths (list of str): List of paths to folders or files to include in the knowledge base.
Returns:
str: The name of the created vector database collection.
"""
try:
collection_name = await create_knowledge_base(
document_paths=request.document_paths,
hf_embeddings=hf_embeddings
)
return f"Knowledge base created successfully. Collection name: {collection_name}"
except Exception as e:
return f"Error creating knowledge base: {str(e)}"
@app.post('/crawl-and-create-knowledge-base', operation_id="get_crawled_knowledge_base")
async def crawl_kb(request: CrawlerRequest):
"""
Crawls a website or processes a list of URLs and creates a knowledge base from the content.
Args:
url_or_urls: Single URL to crawl or list of URLs to scrape directly
keywords: Optional list of keywords to filter content by
depth: Maximum crawl depth for crawling (None for full website crawl)
crawl: Whether to crawl (True) or process URLs directly (False)
min_delay: Minimum delay between requests in seconds (default: 1.0)
max_delay: Maximum delay between requests in seconds (default: 2.0)
max_pages: Maximum number of pages to collect during crawling (default: 10000)
url_keyword: Optional keyword to filter URLs by presence in the URL string
Returns:
str: Message with the collection name and list of scraped URLs.
"""
try:
collection_name, scraped_urls = await crawl_and_create_kb(
url_or_urls=request.url_or_urls,
keywords=request.keywords,
depth=request.depth,
crawl=request.crawl,
min_delay=request.min_delay,
max_delay=request.max_delay,
max_pages=request.max_pages,
url_keyword=request.url_keyword,
hf_embeddings=hf_embeddings
)
return f"Crawled knowledge base created successfully. Collection name: {collection_name}. Scraped URLs: {scraped_urls}"
except Exception as e:
return f"Error creating crawled knowledge base: {str(e)}"
@app.post('/web-summarize', operation_id="get_web_summarize")
async def websummarize(request: WebSummarizeRequest):
"""Generates a summary of a web page based on the provided query and URL.
Args:
query (str): The input query.
url (str): The URL of the web page to summarize.
local_mode (bool): Whether to summarize a local document instead of a web page.
Returns:
str: The generated summary of the provided URL, written to answer the query"""
try:
result = await summary_of_url(
query=request.query,
url=request.url,
model=llm, # Replace with your actual model if needed
local_mode=request.local_mode
)
return result
except Exception:
return "URL is not reachable, try a different URL"
@app.post('/youtube-search', operation_id="get_youtube_search")
async def youtube_search(request: YouTubeSearchRequest):
"""Performs a YouTube search and return summaries of it.
Args:
query (str): The YouTube video URL if provided else search term
prompt (str): The prompt to generate a response from the transcript.
n (int): Number of videos to summarize if search term is provided instead of URL.
Returns:
str: response from the YouTube transcripts based on the given query"""
# You may need to adjust the model argument as per your setup
result = youtube_transcript_response(
request.query,
request.prompt,
n = request.n, #number of videos to summarise
model=llm # Replace with your actual model if needed
)
return result
@app.post('/reddit-search', operation_id="get_reddit_search")
async def reddit_search(request: RedditSearchRequest):
"""Performs a Reddit search and retrieves posts based on the provided parameters.
Args:
subreddit (str): The subreddit to fetch content from (used when url_type is not 'search').
url_type (str): The type of Reddit URL to fetch (e.g., 'search', 'hot', 'new', 'top', 'best', 'controversial', 'rising').
Set to 'search' if a specific search_query is provided.
n (int): Number of posts to retrieve.
k (int): Number of comments to return per post after processing. Increase this when more perspectives are needed.
custom_url (str): Custom URL for Reddit search.
time_filter (str): Time filter for the search (e.g., 'all', 'day').
search_query (str): Search query for Reddit posts. If you are not searching for a phrase, do not set this value; keep it "".
sort_type (str): Sorting type for the results.
Returns:
str: A response containing the summary of the Reddit search results"""
# You may need to adjust the model argument as per your setup
if request.search_query:
request.url_type = 'search'
result = reddit_reader_response(
subreddit=request.subreddit,
url_type=request.url_type,
n=request.n,
k=request.k,
custom_url=request.custom_url,
time_filter=request.time_filter,
search_query=request.search_query,
sort_type=request.sort_type,
model=llm # Replace with your actual model if needed
)
return result
@app.post('/map-search', operation_id="get_map_search")
async def map_search(request: MapSearchRequest):
"""Performs a map search and retrieves the route and points of interest like (POIs) between two locations.
Args:
start_location (optional str): The starting location for the route. May be None.
end_location (optional str): The destination location for the route. May be None.
pois_radius (int): Radius in meters to search for points of interest around the route.
amenities (str): Types of amenities to search for, separated by '|'. For example, "restaurant|cafe|bar|hotel".
limit (int): Maximum number of POIs to return.
task (str): The task to perform, either "location_only" - if lat long of start and end location is needed,
else by default is "route_and_pois" - if route and POIs are needed.
Returns:
dict: location or route and POIs or both"""
result = generate_map(request.start_location,
request.end_location,
pois_radius=request.pois_radius,
amenities=request.amenities,
limit=request.limit,
task=request.task,
)
return result
@app.post('/check-response', operation_id="get_response_check")
async def check_response(request: ResearchCheckRequest):
"""
Evaluates whether the agent's collected information is complete for writing answer to the user's query.
If any aspect is missing, list them all in bullet format
Args:
query (str): The user's original query.
toolsshorthand (str): Exact Facts/Information collected in bullets from every past tool usage which would be useful to answer
Returns:
str: Suggestions for improvement or confirmation that all aspects are addressed.
"""
system_prompt = f"""You are a professional researcher.
Review the following user query and the agent's short hand of informations collected.
If not explicitly asked for deep research, you should just check if most necessary information and all aspects present in query are covered, NO NEED TO SUGGEST EXTRA, SINCE ITS QUICK QUERY
Determine if the shorthand fully addresses every aspect and intent of the query.
If any part is missing or could be improved, list the specific aspects or suggestions for further research or value addition.(IF DEEP RESEARCH ASKED EXPLICITLY)
If the response is complete, state that all aspects have been addressed.
User Query: {request.query}
Agent Shorthand: {request.toolsshorthand}
"""
result = await llm.ainvoke(
system_prompt
)
return result.content
@app.post('/text-to-podcast', operation_id="get_podcast")
async def podcaster(request: PodcastRequest):
    """
    Converts a list of sentences with specified voices into a podcast audio file.
    Each sentence is spoken in the specified voice, and random pauses are added between sentences for natural flow.
    Args:
        prompt: The theme or topic of the podcast episode. You can also provide instructions on length (shorter/longer duration), tone, etc.
        text: The detailed content on which the podcast is to be based.
    Returns:
        FileResponse: The generated podcast .wav file.
        str: A message with the stored file path if the file response cannot be built.
        dict: An error message if podcast generation fails.
    """
    system_prompt = f"""You are an experienced podcaster who can create engaging episodes on any topic.
    Your style makes complex concepts simple, clear, and enjoyable to listen to.
    When writing scripts:
    Use natural, conversational language.
    Avoid special characters (like *, #, etc.) and TTS markup (such as <prosody> tags).
    Do not include background descriptions or stage directions.
    Always stay on the provided theme (if one is given). If no theme is provided, use the given text to generate engaging, informative content.
    The podcast script should be formatted as follows:
    <podcast>
    [Person1] What Person1 says [Person2] What Person2 says ...
    </podcast>
    Where each [Person] represents a speaker, followed by their dialogue.
    Theme: {request.prompt}
    Text: {request.text}
    """
    result = await llm.ainvoke(system_prompt)
    voice_choices = [
        "af_heart", "am_michael", "am_adam", "am_eric", "am_echo", "am_puck",
        "am_fenrir", "am_santa", "am_liam", "af_river",
    ]
    # Split the generated script into per-speaker segments.
    podcast_segments = await parse_podcast(result.content, voice_choices)
    try:
        # Create the output directory if it does not exist yet.
        os.makedirs("output/podcasts", exist_ok=True)
        file_path = f"output/podcasts/podcast_{str(uuid4())[:8]}.wav"
        _ = await podcasting(podcast_segments, filename=file_path)
        logger.info(f"Current working directory: {os.getcwd()}")
        logger.info(f"Podcast file created at: {file_path}")
        try:
            return FileResponse(
                file_path,
                media_type="audio/wav",
                filename=os.path.basename(file_path)
            )
        except Exception:
            # Fall back to reporting the path if the file response cannot be built.
            return f"Generated podcast and stored at {file_path}"
    except Exception as e:
        return {"error": f"Error occurred while creating podcast: {e}"}
@app.post('/basic-tts', operation_id="get_basic_tts")
async def basic_tts(request: BasicTTSRequest):
    """Converts input text to speech using the specified voice and language, and returns the generated audio file.
    Args:
        request (BasicTTSRequest): The request object containing the following fields:
            - text (str): The text to be converted to speech.
            - voice (str): The voice to use for speech synthesis.
            - lang (str): The language code for speech synthesis.
            - filename (str, optional): The output filename for the generated audio file.
    Returns:
        FileResponse: The generated audio file in WAV format if successful.
        dict: An error message if text is not provided or if an exception occurs during TTS generation.
    """
    text = request.text
    voice = request.voice
    lang = request.lang
    filename = request.filename
    # Validate the input before doing any work.
    if not text:
        return {"error": "Text is required for TTS."}
    if not filename:
        filename = f"output/basic_tts_{str(uuid4())[:8]}.wav"
    try:
        # Make sure the target directory exists before the audio file is written.
        os.makedirs(os.path.dirname(filename) or ".", exist_ok=True)
        await text_to_speech(text, voice, filename, lang)
        return FileResponse(
            filename,
            media_type="audio/wav",
            filename=os.path.basename(filename)
        )
    except Exception as e:
        return {"error": f"Error occurred while creating TTS: {e}"}
mcp = FastApiMCP(
    app,
    include_operations=[
        "get_web_search",
        "get_web_summarize",
        "get_youtube_search",
        "get_reddit_search",
        "get_map_search",
        "get_git_tree",
        "get_git_search",
        "get_local_folder_tree",
        "get_response_check",
        "get_website_structure",
        "get_podcast",
        "get_basic_tts",
    ],
)
mcp.mount()
# Display startup banner when the app starts
display_startup_banner(host=HOST_APP, port=PORT_NUM_APP)
================================================
FILE: coexist_tutorial.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "6dd23e8d",
"metadata": {},
"source": [
"# CoexistAI Tool Tutorial\n",
"\n",
"Welcome to the tutorial for the coexistAI tool! This notebook will guide you through the main functionalities of the tool, including web search, document processing, generative models, answer generation, YouTube summarization, and more. Each section contains explanations and code examples to help you get started quickly."
]
},
{
"cell_type": "markdown",
"id": "d0bbf059",
"metadata": {},
"source": [
"## 1. Setup and Initialization\n",
"\n",
"First, let's import the required libraries, set up environment variables, and initialize the main components. This ensures that all dependencies are loaded and ready for use."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3849e2e0",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"USER_AGENT environment variable not set, consider setting it to identify your requests.\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">SearxNG docker container is already running.\n",
"</pre>\n"
],
"text/plain": [
"SearxNG docker container is already running.\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import os\n",
"import subprocess\n",
"\n",
"from utils.utils import *\n",
"from utils.websearch_utils import *\n",
"set_logging(True)\n",
"from langchain_text_splitters import TokenTextSplitter\n",
"\n",
"port_num_searxng = 8085\n",
"host_searxng = \"localhost\"\n",
"\n",
"# Start the SearxNG docker container only if it is not already running\n",
"if not is_searxng_running():\n",
"    subprocess.run([\n",
"        \"docker\", \"run\", \"--rm\",\n",
"        \"-d\", \"-p\", f\"{port_num_searxng}:8080\",\n",
"        \"-v\", f\"{os.getcwd()}/searxng:/etc/searxng:rw\",\n",
"        \"-e\", f\"BASE_URL=http://{host_searxng}:{port_num_searxng}/\",\n",
"        \"-e\", \"INSTANCE_NAME=my-instance\",\n",
"        \"searxng/searxng\"\n",
"    ])\n",
"else:\n",
"    print(\"SearxNG docker container is already running.\")\n",
"\n",
"\n",
"os.environ['GOOGLE_API_KEY'] = 'YOUR_LLM_KEY'  # Replace with your actual API key if Google models are being used\n",
"\n",
"text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=128)\n",
"searcher = SearchWeb(port_num_searxng, host_searxng)  # Initialize web search against the SearxNG instance"
]
},
{
"cell_type": "markdown",
"id": "ad2fddd7",
"metadata": {},
"source": [
"## 2. Loading Models\n",
"\n",
"Load embedding models and cross-encoders using the `load_model` function. You can choose between different embedding modes such as 'google', 'huggingface', or 'infinity_emb'.\n",
"\n",
"For local embedders, use either 'infinity_emb' or 'huggingface'."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "237dc7be",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[2025-07-27 12:05:45,487] INFO utils.utils: Loading model: models/embedding-001 with embedding mode: google\n",
"[2025-07-27 12:05:45,749] INFO transformers.configuration_utils: loading configuration file config.json from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/config.json\n",
"[2025-07-27 12:05:45,751] INFO transformers.configuration_utils: Model config XLMRobertaConfig {\n",
" \"architectures\": [\n",
" \"XLMRobertaForSequenceClassification\"\n",
" ],\n",
" \"attention_probs_dropout_prob\": 0.1,\n",
" \"bos_token_id\": 0,\n",
" \"classifier_dropout\": null,\n",
" \"eos_token_id\": 2,\n",
" \"hidden_act\": \"gelu\",\n",
" \"hidden_dropout_prob\": 0.1,\n",
" \"hidden_size\": 768,\n",
" \"id2label\": {\n",
" \"0\": \"LABEL_0\"\n",
" },\n",
" \"initializer_range\": 0.02,\n",
" \"intermediate_size\": 3072,\n",
" \"label2id\": {\n",
" \"LABEL_0\": 0\n",
" },\n",
" \"layer_norm_eps\": 1e-05,\n",
" \"max_position_embeddings\": 514,\n",
" \"model_type\": \"xlm-roberta\",\n",
" \"num_attention_heads\": 12,\n",
" \"num_hidden_layers\": 12,\n",
" \"output_past\": true,\n",
" \"pad_token_id\": 1,\n",
" \"position_embedding_type\": \"absolute\",\n",
" \"torch_dtype\": \"float32\",\n",
" \"transformers_version\": \"4.52.4\",\n",
" \"type_vocab_size\": 1,\n",
" \"use_cache\": true,\n",
" \"vocab_size\": 250002\n",
"}\n",
"\n",
"[2025-07-27 12:05:45,755] INFO transformers.modeling_utils: loading weights file model.safetensors from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/model.safetensors\n",
"[2025-07-27 12:05:45,826] INFO transformers.modeling_utils: All model checkpoint weights were used when initializing XLMRobertaForSequenceClassification.\n",
"\n",
"[2025-07-27 12:05:45,827] INFO transformers.modeling_utils: All the weights of XLMRobertaForSequenceClassification were initialized from the model checkpoint at BAAI/bge-reranker-base.\n",
"If your task is similar to the task the model of the checkpoint was trained on, you can already use XLMRobertaForSequenceClassification for predictions without further training.\n",
"[2025-07-27 12:05:46,461] INFO transformers.tokenization_utils_base: loading file sentencepiece.bpe.model from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/sentencepiece.bpe.model\n",
"[2025-07-27 12:05:46,463] INFO transformers.tokenization_utils_base: loading file tokenizer.json from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/tokenizer.json\n",
"[2025-07-27 12:05:46,464] INFO transformers.tokenization_utils_base: loading file added_tokens.json from cache at None\n",
"[2025-07-27 12:05:46,465] INFO transformers.tokenization_utils_base: loading file special_tokens_map.json from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/special_tokens_map.json\n",
"[2025-07-27 12:05:46,466] INFO transformers.tokenization_utils_base: loading file tokenizer_config.json from cache at /Users/sidhantthole/.cache/huggingface/hub/models--BAAI--bge-reranker-base/snapshots/2cfc18c9415c912f9d8155881c133215df768a70/tokenizer_config.json\n",
"[2025-07-27 12:05:46,467] INFO transformers.tokenization_utils_base: loading file chat_template.jinja from cache at None\n",
"[2025-07-27 12:05:47,331] INFO sentence_transformers.cross_encoder.CrossEncoder: Use pytorch device: mps\n"
]
}
],
"source": [
"hf_embeddings, cross_encoder = load_model(\"models/embedding-001\", _embed_mode='google',\n",
" kwargs={'api_key': os.environ['GOOGLE_API_KEY']})"
]
},
{
"cell_type": "markdown",
"id": "eab297bd",
"metadata": {},
"source": [
"## 4. Web Search Integration\n",
"\n",
"Use the `SearchWeb` class to perform web searches and retrieve results. This is useful for augmenting LLMs with up-to-date information from the web."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "0b1c2e46",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[2025-07-27 12:05:51,924] INFO utils.websearch_utils: Search results for query 'latest news in AI': [{'snippet': 'SoftBank chief: Forget AGI, ASI will be here within 10 years · Anthropic deploys AI agents to audit models for safety · Sam Altman: AI will cause job losses and ...', 'title': 'AI News | Latest AI News, Analysis & Events', 'link': 'https://www.artificialintelligence-news.com/', 'engines': ['google'], 'category': 'general'}, {'snippet': 'AI · Meta names Shengjia Zhao as chief scientist of AI superintelligence unit · AI referrals to top websites were up 357% year-over-year in June, reaching 1.13B.', 'title': 'AI News & Artificial Intelligence', 'link': 'https://techcrunch.com/category/artificial-intelligence/', 'engines': ['google'], 'category': 'general'}, {'snippet': \"2 Jul 2025 — Here's a recap of some of our biggest AI updates from June, including more ways to search with AI Mode, a new way to share your NotebookLM notebooks publicly.\", 'title': 'The latest AI news we announced in June', 'link': 'https://blog.google/technology/ai/google-ai-updates-june-2025/', 'engines': ['google'], 'category': 'general'}]\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">[</span>\n",
" <span style=\"font-weight: bold\">{</span>\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'snippet'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'SoftBank chief: Forget AGI, ASI will be here within 10 years · Anthropic deploys AI agents to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">audit models for safety · Sam Altman: AI will cause job losses and ...'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'title'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'AI News | Latest AI News, Analysis & Events'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'link'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'https://www.artificialintelligence-news.com/'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'engines'</span>: <span style=\"font-weight: bold\">[</span><span style=\"color: #008000; text-decoration-color: #008000\">'google'</span><span style=\"font-weight: bold\">]</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'category'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'general'</span>\n",
" <span style=\"font-weight: bold\">}</span>,\n",
" <span style=\"font-weight: bold\">{</span>\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'snippet'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'AI · Meta names Shengjia Zhao as chief scientist of AI superintelligence unit · AI referrals to</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">top websites were up 357% year-over-year in June, reaching 1.13B.'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'title'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'AI News & Artificial Intelligence'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'link'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'https://techcrunch.com/category/artificial-intelligence/'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'engines'</span>: <span style=\"font-weight: bold\">[</span><span style=\"color: #008000; text-decoration-color: #008000\">'google'</span><span style=\"font-weight: bold\">]</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'category'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'general'</span>\n",
" <span style=\"font-weight: bold\">}</span>,\n",
" <span style=\"font-weight: bold\">{</span>\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'snippet'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"2 Jul 2025 — Here's a recap of some of our biggest AI updates from June, including more ways to</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">search with AI Mode, a new way to share your NotebookLM notebooks publicly.\"</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'title'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'The latest AI news we announced in June'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'link'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'https://blog.google/technology/ai/google-ai-updates-june-2025/'</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'engines'</span>: <span style=\"font-weight: bold\">[</span><span style=\"color: #008000; text-decoration-color: #008000\">'google'</span><span style=\"font-weight: bold\">]</span>,\n",
" <span style=\"color: #008000; text-decoration-color: #008000\">'category'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'general'</span>\n",
" <span style=\"font-weight: bold\">}</span>\n",
"<span style=\"font-weight: bold\">]</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m[\u001b[0m\n",
" \u001b[1m{\u001b[0m\n",
" \u001b[32m'snippet'\u001b[0m: \u001b[32m'SoftBank chief: Forget AGI, ASI will be here within 10 years · Anthropic deploys AI agents to \u001b[0m\n",
"\u001b[32maudit models for safety · Sam Altman: AI will cause job losses and ...'\u001b[0m,\n",
" \u001b[32m'title'\u001b[0m: \u001b[32m'AI News | Latest AI News, Analysis & Events'\u001b[0m,\n",
" \u001b[32m'link'\u001b[0m: \u001b[32m'https://www.artificialintelligence-news.com/'\u001b[0m,\n",
" \u001b[32m'engines'\u001b[0m: \u001b[1m[\u001b[0m\u001b[32m'google'\u001b[0m\u001b[1m]\u001b[0m,\n",
" \u001b[32m'category'\u001b[0m: \u001b[32m'general'\u001b[0m\n",
" \u001b[1m}\u001b[0m,\n",
" \u001b[1m{\u001b[0m\n",
" \u001b[32m'snippet'\u001b[0m: \u001b[32m'AI · Meta names Shengjia Zhao as chief scientist of AI superintelligence unit · AI referrals to\u001b[0m\n",
"\u001b[32mtop websites were up 357% year-over-year in June, reaching 1.13B.'\u001b[0m,\n",
" \u001b[32m'title'\u001b[0m: \u001b[32m'AI News & Artificial Intelligence'\u001b[0m,\n",
" \u001b[32m'link'\u001b[0m: \u001b[32m'https://techcrunch.com/category/artificial-intelligence/'\u001b[0m,\n",
" \u001b[32m'engines'\u001b[0m: \u001b[1m[\u001b[0m\u001b[32m'google'\u001b[0m\u001b[1m]\u001b[0m,\n",
" \u001b[32m'category'\u001b[0m: \u001b[32m'general'\u001b[0m\n",
" \u001b[1m}\u001b[0m,\n",
" \u001b[1m{\u001b[0m\n",
" \u001b[32m'snippet'\u001b[0m: \u001b[32m\"2 Jul 2025 — Here's a recap of some of our biggest AI updates from June, including more ways to\u001b[0m\n",
"\u001b[32msearch with AI Mode, a new way to share your NotebookLM notebooks publicly.\"\u001b[0m,\n",
" \u001b[32m'title'\u001b[0m: \u001b[32m'The latest AI news we announced in June'\u001b[0m,\n",
" \u001b[32m'link'\u001b[0m: \u001b[32m'https://blog.google/technology/ai/google-ai-updates-june-2025/'\u001b[0m,\n",
" \u001b[32m'engines'\u001b[0m: \u001b[1m[\u001b[0m\u001b[32m'google'\u001b[0m\u001b[1m]\u001b[0m,\n",
" \u001b[32m'category'\u001b[0m: \u001b[32m'general'\u001b[0m\n",
" \u001b[1m}\u001b[0m\n",
"\u001b[1m]\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Get the top 3 results from querying the web\n",
"results = searcher.query_search(\"latest news in AI\", num_results=3)\n",
"print(results)"
]
},
{
"cell_type": "markdown",
"id": "ed3c2eb8",
"metadata": {},
"source": [
"## 5. Document Conversion from URLs\n",
"\n",
"Convert URLs into document objects using the `urls_to_docs` function. This allows you to process and analyze web content as structured documents."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ca473c3a",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/multiprocessing/resource_tracker.py:136: UserWarning: resource_tracker: process died unexpectedly, relaunching. Some resources might leak.\n",
" warnings.warn('resource_tracker: process died unexpectedly, '\n",
"[2025-07-27 12:05:53,392] INFO utils.websearch_utils: Fetching URL: https://en.wikipedia.org/wiki/India\n",
"[2025-07-27 12:05:53,395] INFO utils.websearch_utils: Fetching URL: https://en.wikipedia.org/wiki/Bangalore\n",
"[2025-07-27 12:05:53,696] INFO utils.websearch_utils: Fetched content from https://en.wikipedia.org/wiki/Bangalore with type text/html; charset=UTF-8\n",
"[2025-07-27 12:05:53,719] INFO utils.websearch_utils: Fetched content from https://en.wikipedia.org/wiki/India with type text/html; charset=UTF-8\n",
"[2025-07-27 12:05:56,040] INFO utils.websearch_utils: Processed markdown for: https://en.wikipedia.org/wiki/Bangalore\n",
"[2025-07-27 12:05:56,073] INFO utils.websearch_utils: Processed markdown for: https://en.wikipedia.org/wiki/India\n",
"[2025-07-27 12:05:56,074] INFO utils.websearch_utils: Successfully processed and added document(s) for URL: https://en.wikipedia.org/wiki/India\n",
"[2025-07-27 12:05:56,074] INFO utils.websearch_utils: Successfully processed and added document(s) for URL: https://en.wikipedia.org/wiki/Bangalore\n",
"[2025-07-27 12:05:57,351] INFO utils.websearch_utils: Total URLs processed: 2\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Loaded <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> documents.\n",
"</pre>\n"
],
"text/plain": [
"Loaded \u001b[1;36m2\u001b[0m documents.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">Document</span><span style=\"font-weight: bold\">(</span>\n",
" <span style=\"color: #808000; text-decoration-color: #808000\">metadata</span>=<span style=\"font-weight: bold\">{</span><span style=\"color: #008000; text-decoration-color: #008000\">'source'</span>: <span style=\"color: #008000; text-decoration-color: #008000\">'https://en.wikipedia.org/wiki/India'</span><span style=\"font-weight: bold\">}</span>,\n",
" <span style=\"color: #808000; text-decoration-color: #808000\">page_content</span>=<span style=\"color: #008000; text-decoration-color: #008000\">'India - Wikipedia\\n\\nJump to content\\n\\nCoordinates: 21°N 78°E\\ufeff / \\ufeff21°N 78°E\\ufeff </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">/ 21; 78\\n\\n![Featured \\n\\n![Extended-protected \\n\\nFrom Wikipedia, the free encyclopedia\\n\\nCountry in South </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Asia\\n\\nThis article is about the country. For other uses, see India (disambiguation).\\n\\n| Republic of India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">*Bhārat Gaṇarājya* | |\\n| --- | --- |\\n| Horizontal tricolour flag bearing, from top to bottom, deep </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">saffron, white, and green horizontal bands. In the centre of the white band is a navy-blue wheel with 24 spokes. </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Flag State emblem | |\\n| **Motto:**Satyameva Jayate\\xa0(Sanskrit) \"Truth Alone Triumphs\"[1] | |\\n| </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">**Anthem:**\\xa0Jana Gana Mana\\xa0(Hindi)[a][2][3] \"Thou Art the Ruler of the Minds of All People\"[4][2] | |\\n| </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">**National song: Vande Mataram\\xa0(Sanskrit)[c]** \"I Bow to Thee, Mother\"[b][1][2] | |\\n| Image of a globe centred </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">on India, with India highlighted. Territory controlled by India Territory claimed but not controlled | |\\n| </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Capital | New Delhi 28°36′50″N 77°12′30″E / 28.61389°N 77.20833°E / 28.61389; </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">77.20833 |\\n| Largest city by city proper population | Mumbai |\\n| Largest city by metropolitan area population | </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Delhi |\\n| Official\\xa0languages | |\\n| Recognised regional\\xa0languages | State level and Eighth Schedule[9] |\\n|</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Native languages | 424 languages[g] |\\n| Religion (2011)[11] | |\\n| Demonym(s) | |\\n| Government | Federal </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">parliamentary republic |\\n| | |\\n| •\\xa0President | Droupadi Murmu |\\n| •\\xa0Prime Minister | Narendra </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Modi |\\n| | |\\n| Legislature | Parliament |\\n| •\\xa0Upper house | Rajya Sabha |\\n| •\\xa0Lower house | </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Lok Sabha |\\n| Independence from the United Kingdom | |\\n| | |\\n| •\\xa0Dominion | 15 August 1947 |\\n| </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">•\\xa0Republic | 26 January 1950 |\\n| | |\\n| Area | |\\n| •\\xa0Total | 3,287,263\\xa0km2 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">(1,269,219\\xa0sq\\xa0mi)[2][h] (7th) |\\n| •\\xa0Water\\xa0(%) | 9.6 |\\n| Population | |\\n| •\\xa02023 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">estimate | Neutral increase 1,428,627,663[13] (1st) |\\n| •\\xa02011\\xa0census | Neutral increase (2nd) |\\n| </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">•\\xa0Density | 430.5/km2 (1,115.0/sq\\xa0mi) (30th) |\\n| GDP\\xa0(PPP) | 2025\\xa0estimate |\\n| •\\xa0Total |</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Increase $17.647 trillion[16] (3rd) |\\n| •\\xa0Per capita | Increase $12,132[16] (119th) |\\n| GDP\\xa0(nominal) </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">| 2025\\xa0estimate |\\n| •\\xa0Total | Increase $4.187 trillion[16] (4th) |\\n| •\\xa0Per capita | Increase </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">$2,878[16] (136th) |\\n| Gini\\xa0(2021) | Positive decrease\\xa025.5[17] low inequality |\\n| HDI\\xa0(2023) | </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Increase\\xa00.685[18] medium\\xa0(130th) |\\n| Currency | Indian rupee (₹) (INR) |\\n| Time zone | UTC+05:30 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">(IST) |\\n| Date format | |\\n| Calling code | +91 |\\n| ISO 3166 code | IN |\\n| Internet TLD | .in (others) </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">|\\n\\n**India**, officially the **Republic of India**,[j][20] is a country in South Asia. It is the seventh-largest </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">country by area; the most populous country since 2023;[21] and, since its independence in 1947, the world\\'s most </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">populous Bounded by the Indian Ocean on the south, the Arabian Sea on the southwest, and the Bay of Bengal on the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">southeast, it shares land borders with Pakistan to the west;[k] China, Nepal, and Bhutan to the north; and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Bangladesh and Myanmar to the east. In the Indian Ocean, India is near Sri Lanka and the Maldives; its Andaman and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Nicobar Islands share a maritime border with Thailand, Myanmar, and Indonesia.\\n\\nModern humans arrived on the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indian subcontinent from Africa no later than 55,000 years ago.[26][27][28] Their long occupation, predominantly in</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">isolation as hunter-gatherers, has made the region highly diverse.[29] Settled life emerged on the subcontinent in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the western margins of the Indus river basin 9,000 years ago, evolving gradually into the Indus Valley Civilisation</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of the third millennium BCE.[30] By 1200\\xa0BCE, an archaic form of Sanskrit, an Indo-European language, had </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">diffused into India from the northwest.[31][32] Its hymns recorded the early dawnings of Hinduism in India.[33] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s pre-existing Dravidian languages were supplanted in the northern regions.[34] By 400\\xa0BCE, caste had </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">emerged within Hinduism,[35] and Buddhism and Jainism had arisen, proclaiming social orders unlinked to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">heredity.[36] Early political consolidations gave rise to the loose-knit Maurya and Gupta Empires.[37] Widespread </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">creativity suffused this era,[38] but the status of women declined,[39] and untouchability became an organized </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">belief.[l][40] In South India, the Middle kingdoms exported Dravidian language scripts and religious cultures to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the kingdoms of Southeast Asia.[41]\\n\\nIn the early medieval era, Christianity, Islam, Judaism, and Zoroastrianism </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">became established on India\\'s southern and western coasts.[42] Muslim armies from Central Asia intermittently </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">overran India\\'s northern plains in the second millennium.[43] The resulting Delhi Sultanate drew northern India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">into the cosmopolitan networks of medieval Islam.[44] In south India, the Vijayanagara Empire created a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">long-lasting composite Hindu culture.[45] In the Punjab, Sikhism emerged, rejecting institutionalised religion.[46]</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">The Mughal Empire ushered in two centuries of economic expansion and relative peace,[47] leaving a rich </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">architectural legacy.[48][49] Gradually expanding rule of the British East India Company turned India into a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">colonial economy but consolidated its sovereignty.[50] British Crown rule began in 1858. The rights promised to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indians were granted slowly,[51][52] but technological changes were introduced, and modern ideas of education and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the public life took root.[53] A nationalist movement emerged in India, the first in the non-European British </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">empire and an influence on other nationalist movements.[54][55] Noted for nonviolent resistance after 1920,[56] it </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">became the primary factor in ending British rule.[57] In 1947, the British Indian Empire was partitioned into two </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">independent a Hindu-majority dominion of India and a Muslim-majority dominion of Pakistan. A large-scale loss of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">life and an unprecedented migration accompanied the partition.[62]\\n\\nIndia has been a federal republic since 1950,</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">governed through a democratic parliamentary system. It is a pluralistic, multilingual and multi-ethnic society. </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s population grew from 361 million in 1951 to over 1.4 billion in 2023.[63] During this time, its nominal </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">per capita income increased from US$64 annually to US$2,601, and its literacy rate from 16.6% to 74%. A </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">comparatively destitute country in 1951,[64] India has become a fast-growing major economy and hub for information </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">technology services; it has an expanding middle class.[65] Indian movies and music increasingly influence global </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">culture.[66] India has reduced its poverty rate, though at the cost of increasing economic inequality.[67] It is a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">nuclear-weapon state that ranks high in military expenditure. It has disputes over Kashmir with its neighbours, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Pakistan and China, unresolved since the mid-20th century.[68] Among the socio-economic challenges India faces are </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">gender inequality, child malnutrition,[69] and rising levels of air pollution.[70] India\\'s land is megadiverse </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">with four biodiversity hotspots.[71] India\\'s wildlife, which has traditionally been viewed with tolerance in its </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">culture,[72] is supported in protected habitats.\\n\\nEtymology\\n---------\\n\\nMain article: Names for </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\n\\nAccording to the *Oxford English Dictionary* (2009), the name \"India\" is derived from the Classical Latin </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">*India*, a reference to South Asia and an uncertain region to its east. In turn \"India\" derived successively from </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Hellenistic Greek *India* (Ἰνδία), Ancient Greek *Indos* (ἸνδÏ\\x8cÏ\\x82), Old Persian *Hindush* (an </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">eastern province of the Achaemenid Empire), and ultimately its cognate, the Sanskrit *Sindhu*, or the Indus River,</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and by extension its well-settled southern basin.[73][74] The Ancient Greeks referred to the Indians as *Indoi*, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\\'the people of the Indus\\'.[75]\\n\\nThe term *Bharat* (*BhÄ\\x81rat*; pronounced [Ë\\x88bʱaË\\x90ɾÉ\\x99t] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">â\\x93\\x98), mentioned in both Indian epic poetry and the Constitution of India,[76][77] is used in its variations </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">by many Indian languages. A modern rendering of the historical name *Bharatavarsha*, which applied originally to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">North India,[78][79] *Bharat* gained increased currency from the mid-19th century as a native name for </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India.[76][80]\\n\\n*Hindustan* ([ɦɪndÊ\\x8aË\\x88staË\\x90n] â\\x93\\x98) is a Middle Persian name for India that </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">became popular by the 13th century,[81] and was used widely since the era of the Mughal Empire. The meaning of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">*Hindustan* has varied, referring to a region encompassing the northern Indian subcontinent (present-day northern </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India and Pakistan) or to India in its near \\n\\nHistory\\n-------\\n\\nMain articles: History of India and History of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the Republic of India\\n\\n### Ancient India\\n\\n\\n\\nManuscript illustration, c.â\\x80\\x891650, of the Sanskrit epic </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Ramayana, composed in story-telling fashion c.â\\x80\\x89400\\xa0BCE\\xa0â\\x80\\x93 c.â\\x80\\x89300\\xa0CE[83]\\n\\nBy </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">55,000 years ago, the first modern humans, or *Homo sapiens*, had arrived on the Indian subcontinent from </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Africa.[26][27][28] The earliest known modern human remains in South Asia date to about 30,000 years ago.[26] After</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">6500\\xa0BCE, evidence for domestication of food crops and animals, construction of permanent structures, and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">storage of agricultural surplus appeared in Mehrgarh and other sites in Balochistan, Pakistan.[84] These gradually </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">developed into the Indus Valley the first urban culture in South Asia,[86] which flourished during </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2500â\\x80\\x931900\\xa0BCE in Pakistan and western India.[87] Centred around cities such as Mohenjo-daro, Harappa, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Dholavira, and Kalibangan, and relying on varied forms of subsistence, the civilisation engaged robustly in crafts </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">production and wide-ranging trade.[86]\\n\\nDuring the period 2000â\\x80\\x93500\\xa0BCE, many regions of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">subcontinent transitioned from the Chalcolithic cultures to the Iron Age ones.[88] The Vedas, the oldest scriptures</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">associated with Hinduism,[89] were composed during this period,[90] and historians have analysed these to posit a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Vedic culture in the Punjab region and the upper Gangetic Plain.[88] Most historians also consider this period to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">have encompassed several waves of Indo-Aryan migration into the subcontinent from the north-west.[89] The caste </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">system, which created a hierarchy of priests, warriors, and free peasants, but which excluded indigenous peoples by</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">labelling their occupations impure, arose during this period.[91] On the Deccan Plateau, archaeological evidence </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">from this period suggests the existence of a chiefdom stage of political organisation.[88] In South India, a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">progression to sedentary life is indicated by the large number of megalithic monuments dating from this period,[92]</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as well as by nearby traces of agriculture, irrigation tanks, and craft traditions.[92]\\n\\n\\n\\nCave 26 of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">rock-cut Ajanta Caves\\n\\nIn the late Vedic period, around the 6th century BCE, the small states and chiefdoms of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the Ganges Plain and the north-western regions had consolidated into 16 major oligarchies and monarchies that were </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">known as the The emerging urbanisation gave rise to non-Vedic religious movements, two of which became independent</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">religions. Jainism came into prominence during the life of its exemplar, Mahavira.[95] Buddhism, based on the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">teachings of Gautama Buddha, attracted followers from all social classes excepting the middle class; chronicling </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the life of the Buddha was central to the beginnings of recorded history in India.[96][97][98] In an age of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">increasing urban wealth, both religions held up renunciation as an ideal,[99] and both established long-lasting </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">monastic traditions. Politically, by the 3rd century BCE, the kingdom of Magadha had annexed or reduced other </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">states to emerge as the Maurya Empire.[100] The empire was once thought to have controlled most of the subcontinent</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">except the far south, but its core regions are now thought to have been separated by large autonomous </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">areas.[101][102] The Mauryan kings are known as much for their empire-building and determined management of public </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">life as for Ashoka\\'s renunciation of militarism and far-flung advocacy of the Buddhist *dhamma*.[103][104]\\n\\nThe </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Sangam literature of the Tamil language reveals that, between 200\\xa0BCE and 200\\xa0CE, the southern peninsula was </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">ruled by the Cheras, the Cholas, and the Pandyas, dynasties that traded extensively with the Roman Empire and with </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">West and Southeast Asia.[105][106] In North India, Hinduism asserted patriarchal control within the family, leading</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">to increased subordination of women.[107][100] By the 4th and 5th centuries, the Gupta Empire had created a complex</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">system of administration and taxation in the greater Ganges Plain; this system became a model for later Indian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">kingdoms.[108][109] Under the Guptas, a renewed Hinduism based on devotion, rather than the management of ritual, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">began to assert itself.[110] This renewal was reflected in a flowering of sculpture and architecture, which found </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">patrons among an urban elite.[109] Classical Sanskrit literature flowered as well, and Indian science, astronomy, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">medicine, and mathematics made significant advances.[109]\\n\\n### Medieval India\\n\\nMain article: Medieval </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\n\\n\\n\\nBrihadeshwara temple, Thanjavur, completed in 1010\\xa0CE\\n\\n\\n\\nThe Qutub Minar, 73\\xa0m (240\\xa0ft) </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">tall, completed by the Sultan of Delhi, Iltutmish\\n\\nThe Indian early medieval age, from 600 to 1200\\xa0CE, is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">defined by regional kingdoms and cultural diversity.[111] When Harsha of Kannauj, who ruled much of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indo-Gangetic Plain from 606 to 647\\xa0CE, attempted to expand southwards, he was defeated by the Chalukya ruler of</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the Deccan.[112] When his successor attempted to expand eastwards, he was defeated by the Pala king of Bengal.[112]</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">When the Chalukyas attempted to expand southwards, they were defeated by the Pallavas from farther south, who in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">turn were opposed by the Pandyas and the Cholas from still farther south.[112] No ruler of this period was able to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">create an empire and consistently control lands much beyond their core region.[111] During this time, pastoral </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">peoples, whose land had been cleared to make way for the growing agricultural economy, were accommodated within </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">caste society, as were new non-traditional ruling classes.[113] The caste system consequently began to show </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">regional differences.[113]\\n\\nIn the 6th and 7th centuries, the first devotional hymns were created in the Tamil </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">language.[114] They were imitated all over India and led to both the resurgence of Hinduism and the development of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">all modern languages of the subcontinent.[114] Indian royalty, big and small, and the temples they patronised drew </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">citizens in great numbers to the capital cities, which became economic hubs as well.[115] Temple towns of various </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">sizes began to appear everywhere as India underwent another urbanisation.[115] By the 8th and 9th centuries, the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">effects were felt in Southeast Asia, as South Indian culture and political systems were exported to lands that </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">became part of modern-day Myanmar, Thailand, Laos, Brunei, Cambodia, Vietnam, Philippines, Malaysia, and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indonesia.[116] Indian merchants, scholars, and sometimes armies were involved in this transmission; Southeast </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Asians took the initiative as well, with many sojourning in Indian seminaries and translating Buddhist and Hindu </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">texts into their languages.[116]\\n\\nAfter the 10th century, Muslim Central Asian nomadic clans, using swift-horse </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">cavalry and raising vast armies united by ethnicity and religion, repeatedly overran South Asia\\'s north-western </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">plains, leading eventually to the establishment of the Islamic Delhi Sultanate in 1206.[117] The sultanate was to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">control much of North India and to make many forays into South India. Although at first disruptive for the Indian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">elites, the sultanate largely left its vast non-Muslim subject population to its own laws and customs.[118][119] By</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">repeatedly repulsing Mongol raiders in the 13th century, the sultanate saved India from the devastation visited on </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">West and Central Asia, setting the scene for centuries of migration of fleeing soldiers, learned men, mystics, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">traders, artists, and artisans from that region into the subcontinent, thereby creating a syncretic Indo-Islamic </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">culture in the north.[120][121] The sultanate\\'s raiding and weakening of the regional kingdoms of South India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">paved the way for the indigenous Vijayanagara Empire.[122] Embracing a strong Shaivite tradition and building upon </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the military technology of the sultanate, the empire came to control much of peninsular India,[123] and was to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">influence South Indian society for long afterwards.[122]\\n\\n### Early modern India\\n\\n\\n\\nA distant view of the Taj</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Mahal from the Agra Fort\\n\\n\\n\\nA two mohur Company gold coin, issued in 1835, the obverse inscribed \"William IIII,</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">King\"\\n\\nIn the early 16th century, northern India, then under mainly Muslim rulers,[124] fell again to the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">superior mobility and firepower of a new generation of Central Asian warriors.[125] The resulting Mughal Empire did</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">not stamp out the local societies it came to rule. Instead, it balanced and pacified them through new </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">administrative practices[126][127] and diverse and inclusive ruling elites,[128] leading to more systematic, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">centralised, and uniform rule.[129] Eschewing tribal bonds and Islamic identity, especially under Akbar, the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Mughals united their far-flung realms through loyalty, expressed through a Persianised culture, to an emperor who </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">had near-divine status.[128] The Mughal state\\'s economic policies, deriving most revenues from agriculture[130] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and mandating that taxes be paid in the well-regulated silver currency,[131] caused peasants and artisans to enter </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">larger markets.[129] The relative peace maintained by the empire during much of the 17th century was a factor in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s economic expansion,[129] resulting in greater patronage of painting, literary forms, textiles, and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">architecture.[132] Newly coherent social groups in northern and western India, such as the Marathas, the Rajputs, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and the Sikhs, gained military and governing ambitions during Mughal rule, which, through collaboration or </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">adversity, gave them both recognition and military experience.[133] Expanding commerce during Mughal rule gave rise</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">to new Indian commercial and political elites along the coasts of southern and eastern India.[133] As the empire </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">disintegrated, many among these elites were able to seek and control their own affairs.[134]\\n\\nBy the early 18th </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">century, with the lines between commercial and political dominance being increasingly blurred, a number of European</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">trading companies, including the English East India Company, had established coastal outposts.[135][136] The East </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India Company\\'s control of the seas, greater resources, and more advanced military training and technology led it </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">to increasingly assert its military strength and caused it to become attractive to a portion of the Indian elite; </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">these factors were crucial in allowing the company to gain control over the Bengal region by 1765 and sideline the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">other European Its further access to the riches of Bengal and the subsequent increased strength and size of its </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">army enabled it to annex or subdue most of India by the 1820s.[140] India was then no longer exporting manufactured</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">goods as it long had, but was instead supplying the British Empire with raw materials. Many historians consider </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">this to be the onset of India\\'s colonial period.[135] By this time, with its economic power severely curtailed by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the British parliament and having effectively been made an arm of British administration, the East India Company </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">began more consciously to enter non-economic arenas, including education, social reform, and culture.[141]\\n\\n### </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Modern India\\n\\nMain article: History of India (1947â\\x80\\x93present)\\n\\n\\n\\n1909 map of the British Indian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Empire\\n\\n\\n\\nJawaharlal Nehru sharing a light moment with Mahatma Gandhi, Mumbai, 6 July 1946\\n\\nHistorians </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">consider India\\'s modern age to have begun sometime between 1848 and 1885. The appointment in 1848 of Lord </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Dalhousie as Governor General of the East India Company set the stage for changes essential to a modern state. </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">These included the consolidation and demarcation of sovereignty, the surveillance of the population, and the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">education of citizens. Technological changesâ\\x80\\x94among them, railways, canals, and the telegraphâ\\x80\\x94were </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">introduced not long after their introduction in However, disaffection with the company also grew during this time </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and set off the Indian Rebellion of 1857. Fed by diverse resentments and perceptions, including invasive </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">British-style social reforms, harsh land taxes, and summary treatment of some rich landowners and princes, the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">rebellion rocked many regions of northern and central India and shook the foundations of Company rule.[146][147] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Although the rebellion was suppressed by 1858, it led to the dissolution of the East India Company and the direct </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">administration of India by the British government. Proclaiming a unitary state and a gradual but limited </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">British-style parliamentary system, the new rulers also protected princes and landed gentry as a feudal safeguard </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">against future unrest.[148][149] In the decades following, public life gradually emerged all over India, leading </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">eventually to the founding of the Indian National Congress in \\n\\nThe rush of technology and the commercialisation </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of agriculture in the second half of the 19th century was marked by economic setbacks, and many small farmers </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">became dependent on the whims of far-away markets.[154] There was an increase in the number of large-scale </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">famines,[155] and, despite the risks of infrastructure development borne by Indian taxpayers, little industrial </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">employment was generated for Indians.[156] There were also salutary effects: commercial cropping, especially in the</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">newly canalled Punjab, led to increased food production for internal consumption.[157] The railway network provided</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">critical famine relief,[158] notably reduced the cost of moving goods,[158] and helped nascent Indian-owned </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">industry.[157]\\n\\nAfter World War I, in which approximately one million Indians served,[159] a new period began. It</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">was marked by British reforms but also repressive legislation, by more strident Indian calls for self-rule, and by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the beginnings of a nonviolent movement of non-co-operation, of which Mahatma Gandhi would become the leader and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">enduring symbol.[160] During the 1930s, slow legislative reform was enacted by the British; the Indian National </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Congress won victories in the resulting elections.[161] The next decade was beset with crises: Indian participation</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">in World War\\xa0II, the Congress\\'s final push for non-co-operation, and an upsurge of Muslim nationalism. All were</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">capped by the advent of independence in 1947, but tempered by the partition of India into two states: India and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Pakistan.[162]\\n\\nVital to India\\'s self-image as an independent nation was its constitution, completed in 1950, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">which put in place a secular and democratic republic.[163] Economic liberalisation, which began in the 1980s and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">with the collaboration with Soviet Union for technical knowledge,[164] has created a large urban middle class, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">transformed India into one of the world\\'s fastest-growing economies,[165] and increased its geopolitical </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">influence. Yet, India is also shaped by persistent poverty, both rural and urban;[166] by religious and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">caste-related violence;[167] by Maoist-inspired Naxalite insurgencies;[168] and by separatism in Jammu and Kashmir </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and in Northeast India.[169] It has unresolved territorial disputes with China and with Pakistan.[170] India\\'s </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">sustained democratic freedoms are unique among the world\\'s newer nations; however, in spite of its recent economic</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">successes, freedom from want for its disadvantaged population remains a goal yet to be achieved.[171] As of 2025, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">poverty in India declined sharply, mainly due to government welfare programs.[172]\\n\\nGeography\\n---------\\n\\nMain </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">article: Geography of India\\n\\n\\n\\nThe Tungabhadra, with rocky outcrops, flows into the peninsular Krishna </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">River[173]\\n\\n\\n\\nFishing boats lashed together in a tidal creek in Anjarle village, Maharashtra\\n\\nIndia accounts </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">for the bulk of the Indian subcontinent, lying atop the Indian tectonic plate, a part of the Indo-Australian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Plate.[174] India\\'s defining geological processes began 75 million years ago when the Indian Plate, then part of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the southern supercontinent Gondwana, began a north-eastward drift caused by seafloor spreading to its south-west, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and later, south and south-east.[174] Simultaneously, the vast Tethyan oceanic crust, to its northeast, began to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">subduct under the Eurasian Plate.[174] These dual processes, driven by convection in the Earth\\'s mantle, both </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">created the Indian Ocean and caused the Indian continental crust eventually to under-thrust Eurasia and to uplift </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the Himalayas.[174] Immediately south of the emerging Himalayas, plate movement created a vast crescent-shaped </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">trough that rapidly filled with river-borne sediment[175] and now constitutes the Indo-Gangetic Plain.[176] The </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">original Indian plate makes its first appearance above the sediment in the ancient Aravalli range, which extends </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">from the Delhi Ridge in a southwesterly direction. To the west lies the Thar Desert, the eastern spread of which is</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">checked by the \\n\\nThe remaining Indian Plate survives as peninsular India, the oldest and geologically most stable</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">part of India. It extends as far north as the Satpura and Vindhya ranges in central India. These parallel chains </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">run from the Arabian Sea coast in Gujarat in the west to the coal-rich Chota Nagpur Plateau in Jharkhand in the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">east.[180] To the south, the remaining peninsular landmass, the Deccan Plateau, is flanked on the west and east by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">coastal ranges known as the Western and Eastern Ghats;[181] the plateau contains the country\\'s oldest rock </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">formations, some over one billion years old. Constituted in such fashion, India lies to the north of the equator </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">between 6° 44â\\x80² and 35° 30â\\x80² north latitude[m] and 68° 7â\\x80² and 97° 25â\\x80² east </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">longitude.[182]\\n\\nIndia\\'s coastline measures 7,517 kilometres (4,700\\xa0mi) in length; of this distance, 5,423 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">kilometres (3,400\\xa0mi) belong to peninsular India and 2,094 kilometres (1,300\\xa0mi) to the Andaman, Nicobar, and</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Lakshadweep island chains.[183] According to the Indian naval hydrographic charts, the mainland coastline consists </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of the following: 43% sandy beaches; 11% rocky shores, including cliffs; and 46% mudflats or marshy shores.[183] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Major Himalayan-origin rivers that substantially flow through India include the Ganges and the Brahmaputra, both of</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">which drain into the Bay of Bengal.[184] Important tributaries of the Ganges include the Yamuna and the Kosi; the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">latter\\'s extremely low gradient, caused by long-term silt deposition, leads to severe floods and course </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">changes.[185][186] Major peninsular rivers, whose steeper gradients prevent their waters from flooding, include the</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Godavari, the Mahanadi, the Kaveri, and the Krishna, which also drain into the Bay of Bengal;[187] and the Narmada </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and the Tapti, which drain into the Arabian Sea.[188] Coastal features include the marshy Rann of Kutch of western </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India and the alluvial Sundarbans delta of eastern India; the latter is shared with Bangladesh.[189] India has two </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">archipelagos: the Lakshadweep, coral atolls off India\\'s south-western coast; and the Andaman and Nicobar Islands, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">a volcanic chain in the Andaman Sea.[190]\\n\\nIndian climate is strongly influenced by the Himalayas and the Thar </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Desert, both of which drive the economically and culturally pivotal summer and winter monsoons.[191] The Himalayas </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">prevent cold Central Asian katabatic winds from blowing in, keeping the bulk of the Indian subcontinent warmer than</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">most locations at similar latitudes.[192][193] The Thar Desert plays a crucial role in attracting the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">moisture-laden south-west summer monsoon winds that, between June and October, provide the majority of India\\'s </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">rainfall.[191] Four major climatic groupings predominate in India: tropical wet, tropical dry, subtropical humid, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and montane.[194] Temperatures in India have risen by 0.7\\xa0°C (1.3\\xa0°F) between 1901 and 2018.[195] Climate </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">change in India is often thought to be the cause. The retreat of Himalayan glaciers has adversely affected the flow</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">rate of the major Himalayan rivers, including the Ganges and the Brahmaputra.[196] According to some current </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">projections, the number and severity of droughts in India will have markedly increased by the end of the present </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">century.[197]\\n\\n### Biodiversity\\n\\nMain articles: Forestry in India and Wildlife of India\\n\\n\\n\\nIndia has the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">majority of the world\\'s wild tigers, approximately 3,170 in 2022.[198]\\n\\n\\n\\nA chital (*Axis axis*) stag in the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Nagarhole National Park in a region covered by a moderately dense[n] forest.\\n\\n\\n\\nThree of the last Asiatic </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">cheetahs in India were shot dead in 1948 in Surguja district, Madhya Pradesh, Central India by Maharajah Ramanuj </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Pratap Singh Deo. The young male cheetahs, all from the same litter, were sitting together when they were shot at </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">night.\\n\\nIndia is a megadiverse country, a term employed for 17 countries that display high biological diversity </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and contain many species exclusively indigenous, or endemic, to them.[199] India is the habitat for 8.6% of all </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">mammals, 13.7% of bird species, 7.9% of reptile species, 6% of amphibian species, 12.2% of fish species, and 6.0% </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of all flowering plant species.[200][201] Fully a third of Indian plant species are endemic.[202] India also </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">contains four of the world\\'s 34 biodiversity hotspots,[71] or regions that display significant habitat loss in the</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">presence of high endemism.[o][203]\\n\\nIndia\\'s most dense forests, such as the tropical moist forest of the Andaman</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Islands, the Western Ghats, and Northeast India, occupy approximately 3% of its land area.[204][205] *Moderately </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">dense forest*, whose canopy density is between 40% and 70%, occupies 9.39% of India\\'s land area.[204][205] It </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">predominates in the temperate coniferous forest of the Himalayas, the moist deciduous *sal* forest of eastern </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India, and the dry deciduous teak forest of central and southern India.[206] India has two natural zones of thorn </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">forest, one in the Deccan Plateau, immediately east of the Western Ghats, and the other in the western part of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indo-Gangetic plain, now turned into rich agricultural land by irrigation, its features no longer visible.[207] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Among the Indian subcontinent\\'s notable indigenous trees are the astringent *Azadirachta indica*, or *neem*, which</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">is widely used in rural Indian herbal medicine,[208] and the luxuriant *Ficus religiosa*, or *peepul*,[209] which </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">is displayed on the ancient seals of Mohenjo-daro,[210] and under which the Buddha is recorded in the Pali canon to</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">have sought enlightenment.[211]\\n\\nMany Indian species have descended from those of Gondwana, the southern </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">supercontinent from which India separated more than 100 million years ago.[212] India\\'s subsequent collision with </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Eurasia set off a mass exchange of species. However, volcanism and climatic changes later caused the extinction of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">many endemic Indian forms.[213] Still later, mammals entered India from Asia through two zoogeographic passes </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">flanking the Himalayas.[214] This had the effect of lowering endemism among India\\'s mammals, which stands at </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">12.6%, contrasting with 45.8% among reptiles and 55.8% among amphibians.[201] Among endemics are the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">vulnerable[215] hooded leaf monkey[216] and the threatened Beddome\\'s toad[217][218] of the Western Ghats.\\n\\nIndia</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">contains 172 IUCN-designated threatened animal species, or 2.9% of endangered forms.[219] These include the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">endangered Bengal tiger and the Ganges river dolphin. Critically endangered species include the gharial, a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">crocodilian; the great Indian bustard; and the Indian white-rumped vulture, which has become nearly extinct by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">having ingested the carrion of diclofenac-treated cattle.[220] Before they were extensively used for agriculture </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and cleared for human settlement, the thorn forests of Punjab were mingled at intervals with open grasslands that </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">were grazed by large herds of blackbuck preyed on by the Asiatic cheetah; the blackbuck, no longer extant in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Punjab, is now severely endangered in India, and the cheetah is extinct.[221] The pervasive and ecologically </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">devastating human encroachment of recent decades has critically endangered Indian wildlife. In response, the system</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of national parks and protected areas, first established in 1935, was expanded substantially. In 1972, India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">enacted the Wildlife Protection Act[222] and Project Tiger to safeguard crucial wilderness; the Forest Conservation</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Act was enacted in 1980 and amendments added in 1988.[223] India hosts more than five hundred wildlife sanctuaries </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and eighteen\\xa0biosphere reserves,[224] four of which are part of the World Network of Biosphere Reserves; its </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">eighty-nine wetlands are registered under the Ramsar Convention.[225]\\n\\nPolitics and government\\n\\n\\n### </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Politics\\n\\nMain article: Politics of India\\n\\nSee also: Democracy in India\\n\\n\\n\\nAs part of Janadesh 2007, 25,000</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">proâ\\x80\\x93land reform landless people in Madhya Pradesh listen to Rajagopal P. V.[226]\\n\\n\\n\\nUS president Barack</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Obama addresses the members of the Parliament of India in New Delhi in November 2010.\\n\\nIndia is a parliamentary </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">republic with a multi-party system.[227] It has six\\xa0recognised national parties, including the Indian National </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Congress (INC) and the Bharatiya Janata Party (BJP), and over 50\\xa0regional parties.[228] Congress is considered </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the ideological centre in Indian political culture,[229] whereas the BJP is right-wing to From 1950 to the late </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">1980s, Congress held a majority in the India\\'s parliament. Afterwards, it increasingly shared power with the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">BJP,[233] as well as with powerful regional parties, which forced multi-party coalition governments at the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">centre.[234]\\n\\nIn the Republic of India\\'s general elections in 1951, 1957, and 1962, Congress, led by Jawaharlal </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Nehru, won easy victories. On Nehru\\'s death in 1964, Lal Bahadur Shastri briefly became prime minister; he was </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">succeeded in 1966, by Nehru\\'s daughter Indira Gandhi, who led the Congress to election victories in 1967 and 1971.</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Following public discontent with the state of emergency Indira Gandhi had declared in 1975, Congress was voted out </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of power in 1977; Janata Party, which had opposed the emergency, was voted in. Its government lasted two years; </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Morarji Desai and Charan Singh served as prime ministers. After Congress was returned to power in 1980, Indira </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Gandhi was assassinated and succeeded by Rajiv Gandhi, who won easily in the elections later that year. In the 1989</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">elections a National Front coalition, led by the Janata Dal in alliance with the Left Front, won, lasting just </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">under two years, and V.P. Singh and Chandra Shekhar serving as prime ministers.[235] In the 1991 Indian general </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">election, Congress, as the largest single party, formed a minority government led by P. V. Narasimha </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Rao.[236]\\n\\nAfter the 1996 Indian general election, the BJP formed a government briefly; it was followed by United</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Front coalitions, which depended on external political support. Two prime ministers served during this period: H.D.</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Deve Gowda and I.K. Gujral. In 1998, the BJP formed a coalitionâ\\x80\\x94the National Democratic Alliance (NDA). Led</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">by Atal Bihari Vajpayee, the NDA became the first non-Congress, coalition government to complete a five-year </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">term.[237] In the 2004 Indian general elections, no party won an absolute majority. Still, the Congress emerged as </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the largest single party, forming another successful coalition: the United Progressive Alliance (UPA). It had the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">support of left-leaning parties and MPs who opposed the BJP. The UPA returned to power in the 2009 general election</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">with increased numbers, and it no longer required external support from India\\'s communist parties.[238] Manmohan </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Singh became the first prime minister since Jawaharlal Nehru in 1957 and 1962 to be re-elected to a consecutive </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">five-year term.[239] In the 2014 general election, the BJP became the first political party since 1984 to win an </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">absolute majority.[240] In the 2019 general election, the BJP regained an absolute majority. In the 2024 general </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">election, a BJP-led NDA coalition formed the government. Narendra Modi, a former chief minister of Gujarat, is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">serving as the prime minister of India in his third term since May 26, 2014.[241]\\n\\n### Government\\n\\nMain </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">article: Government of India\\n\\nSee also: Constitution of India\\n\\n\\n\\nRashtrapati Bhavan, the official residence </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of the President of India, was designed by British architects Edwin Lutyens and Herbert Baker for the Viceroy of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India, and constructed between 1911 and 1931 during the British Raj.[242]\\n\\nIndia is a federation with a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">parliamentary system governed under the Constitution of India. Federalism in India defines the power distribution </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">between the union and the states. India\\'s form of government, traditionally described as \"quasi-federal\" with a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">strong centre and weak states,[243] has grown increasingly federal since the late 1990s as a result of political, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">economic, and social changes.[244][245]\\n\\nThe Government of India comprises three branches: the Executive, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Legislature, and Judiciary.[246] The President of India is the ceremonial head of state,[247] who is elected </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">indirectly for a five-year term by an electoral college comprising members of national and state The Prime </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Minister of India is the head of government and exercises most executive power.[250] Appointed by the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">president,[251] the prime minister is supported by the party or political alliance with a majority of seats in the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">lower house of parliament.[250] The executive of the Indian government consists of the president, the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">vice-president, and the Union Council of Ministersâ\\x80\\x94with the cabinet being its executive </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">committeeâ\\x80\\x94headed by the prime minister. Any minister holding a portfolio must be a member of one of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">houses of parliament.[247] In the Indian parliamentary system, the executive is subordinate to the legislature; the</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">prime minister and their council are directly responsible to the lower house of the parliament. Civil servants act </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as permanent executives and all decisions of the executive are implemented by them.[252]\\n\\nThe legislature of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India is the bicameral parliament. Operating under a Westminster-style parliamentary system, it comprises an upper </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">house called the Rajya Sabha (Council of States) and a lower house called the Lok Sabha (House of the People).[253]</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">The Rajya Sabha is a permanent body of 245\\xa0members who serve staggered six-year terms with elections every 2 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">years.[254] Most are elected indirectly by the state and union territorial legislatures in numbers proportional to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">their state\\'s share of the national population.[251] The Lok Sabha\\'s 543\\xa0members are elected directly by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">popular vote among citizens aged at least 18;[255] they represent single-member constituencies for </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">five-year\\xa0terms.[256] Several seats from each state are reserved for candidates from Scheduled Castes and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Scheduled Tribes in proportion to their population within that state.[255]\\n\\nIndia has a three-tier\\xa0unitary </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">independent judiciary[257] comprising the supreme court, headed by the Chief Justice of India, 25\\xa0high courts, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and a large number of trial courts.[257] The supreme court has original jurisdiction over cases involving </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">fundamental rights and over disputes between states and the centre and has appellate jurisdiction over the high </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">courts.[258] It has the power to both strike down union or state laws which contravene the constitution[259] and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">invalidate any government action it deems \\n\\n### Administrative divisions\\n\\nMain article: Administrative </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">divisions of India\\n\\nSee also: Political integration of India\\n\\n\\n\\nA clickable map of the 28 states and 8 union </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">territories of India\\n\\nIndia is a federal union comprising 28 states and 8 union territories.[12] All states, as </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">well as the union territories of Jammu and Kashmir, Puducherry and the National Capital Territory of Delhi, have </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">elected legislatures and governments following the Westminster system. The remaining five union territories are </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">directly ruled by the central government through appointed administrators. In 1956, under the States Reorganisation</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Act, states were reorganised on a linguistic basis.[261] There are over a quarter of a million local government </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">bodies at city, town, block, district and village levels.[262]\\n\\n#### States\\n\\n#### Union territories\\n\\nForeign,</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">economic, and strategic relations\\n\\n\\nMain article: Foreign relations of India\\n\\nSee also: Indian Armed </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Forces\\n\\n\\n\\nDuring the 1950s and 60s, India played a pivotal role in the Non-Aligned Movement.[263] From left to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">right: Gamal Abdel Nasser of United Arab Republic (now Egypt), Josip Broz Tito of Yugoslavia and Jawaharlal Nehru </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">in Belgrade, September 1961.\\n\\nIndia became a republic in 1950, remaining a member of the Commonwealth of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Nations.[264][265] India strongly supported decolonisation in Africa and Asia in the 1950s; it played a leading </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">role in the Non-Aligned Movement.[266] After cordial relations initially, India went to war with China in 1962. </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India was widely thought to have been humiliated.[267] Another military conflict followed in 1967 in which India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">successfully repelled a Chinese attack.[268] India has had uneasy relations with its western neighbour, Pakistan. </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">The two countries went to war in 1947, 1965, 1971, and 1999. Three of these wars were fought over the disputed </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">territory of Kashmir. In contrast, the 1971 war followed India\\'s support for the independence of Bangladesh.[269] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">After the 1965 war with Pakistan, India began to pursue close military and economic ties with the Soviet Union; by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the late 1960s, the Soviet Union was its largest arms supplier.[270] India has played a key role in the South Asian</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Association for Regional Cooperation and the World Trade Organization. The nation has supplied 100,000 military and</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">police personnel in 35 UN peacekeeping needed*]\\n\\n\\n\\nThe Indian Air Force contingent marching at the 221st </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Bastille Day military parade in Paris, on 14 July 2009. The parade at which India was the foreign guest was led by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s oldest regiment, the Maratha Light Infantry, founded in 1768.[271]\\n\\nChina\\'s nuclear test of 1964 and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">threats to intervene in support of Pakistan in the 1965 war caused India to produce nuclear weapons.[272] India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">conducted its first nuclear weapons test in 1974 and carried out additional underground testing in 1998. India has </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">signed neither the Comprehensive Nuclear-Test-Ban Treaty nor the Nuclear Non-Proliferation Treaty, considering both</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">to be flawed and discriminatory.[273] India maintains a \"no first use\" nuclear policy and is developing a nuclear </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">triad capability as a part of its \"Minimum Credible Deterrence\" doctrine.[274][275]\\n\\nSince the end of the Cold </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">War, India has increased its economic, strategic, and military cooperation with the United States and the European </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Union.[276] In 2008, a civilian nuclear agreement was signed between India and the United States. Although India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">possessed nuclear weapons at the time and was not a party to the Nuclear Non-Proliferation Treaty, it received </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">waivers from the International Atomic Energy Agency and the Nuclear Suppliers Group, ending earlier restrictions on</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s nuclear technology and commerce; India subsequently signed co-operation agreements involving civilian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">nuclear energy with Russia,[277] France,[278] the United Kingdom,[279] and Canada.[280]\\n\\nThe President of India </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">is the supreme commander of the nation\\'s armed forces; with 1.45\\xa0million active troops, they compose the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">world\\'s second-largest military. It comprises the Indian Army, the Indian Navy, the Indian Air Force, and the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indian Coast Guard.[281] The official Indian defence budget for 2011 was US$36.03\\xa0billion, or 1.83% of GDP.[282]</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Defence expenditure was pegged at US$70.12\\xa0billion for fiscal year 2022â\\x80\\x9323 and, increased 9.8% than </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">previous fiscal year.[283][284] India is the world\\'s second-largest arms importer; between 2016 and 2020, it </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">accounted for 9.5% of the total global arms imports.[285] Much of the military expenditure was focused on defence </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">against Pakistan and countering growing Chinese influence in the Indian Ocean.[286]\\n\\nEconomy\\n-------\\n\\nMain </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">article: Economy of India\\n\\n\\n\\nIn 2019, 43% of India\\'s total workforce was employed in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">agriculture.[287]\\n\\n\\n\\nIndia is the world\\'s largest producer of milk, with the largest population of cattle. In </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2018, nearly 80% of India\\'s milk was sourced from small farms with herd size between one and two, the milk </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">harvested by hand milking.[289]\\n\\n\\n\\n55% of India\\'s female workforce was employed in agriculture in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2019.[288]\\n\\nAccording to the International Monetary Fund (IMF), the Indian economy in 2024 was nominally worth </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">$3.94\\xa0trillion; it was the fifth-largest economy by market exchange rates and is, at around $15.0\\xa0trillion, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the third-largest by purchasing power parity (PPP).[16] With its average annual GDP growth rate of 5.8% over the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">past two decades, and reaching 6.1% during 2011â\\x80\\x932012,[290] India is one of the world\\'s fastest-growing </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">economies.[291] However, due to its low GDP per capitaâ\\x80\\x94which ranks 136th in the world in nominal per capita</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">income and 125th in per capita income adjusted for purchasing power parity (PPP)â\\x80\\x94the vast majority of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indians fall into the low-income group.[292][293] Until 1991, all Indian governments followed protectionist </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">policies that were influenced by socialist economics. Widespread state intervention and regulation largely walled </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the economy off from the outside world. An acute balance of payments crisis in 1991 forced the nation to liberalise</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">its economy;[294] since then, it has moved increasingly towards a free-market system[295][296] by emphasising both </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">foreign trade and direct investment inflows.[297] India has been a member of World Trade Organization since 1 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">January 1995.[298]\\n\\nThe 522-million-worker Indian labour force is the world\\'s second largest, as of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2017[update].[281] The service sector makes up 55.6% of GDP, the industrial sector 26.3% and the agricultural </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">sector 18.1%. India\\'s foreign exchange remittances of US$100 billion in 2022,[299] highest in the world, were </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">contributed to its economy by 32 million Indians working in foreign countries.[300] In 2006, the share of external </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">trade in India\\'s GDP stood at 24%, up from 6% in 1985.[295] In 2008, India\\'s share of world trade was 1.7%;[301] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">In 2021, India was the world\\'s ninth-largest importer and the sixteenth-largest exporter.[302] Between 2001 and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2011, the contribution of petrochemical and engineering goods to total exports grew from 14% to 42%.[303] India was</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the world\\'s second-largest textile exporter after China in the 2013 calendar year.[304]\\n\\nAveraging an economic </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">growth rate of 7.5% for several years before 2007,[295] India has more than doubled its hourly wage rates during </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the first decade of the 21st century.[305] Some 431 million Indians have left poverty since 1985; India\\'s middle </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">classes are projected to number around 580\\xa0million by 2030.[306] In 2023, India\\'s consumer market was the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">world\\'s fifth largest.[307] India\\'s nominal GDP per capita increased steadily from US$308 in 1991, when economic </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">liberalisation began, to US$1,380 in 2010, to an estimated US$2,731 in 2024. It is expected to grow to US$3,264 by </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2026.[16]\\n\\n### Industries\\n\\n\\n\\nA tea garden in Sikkim. India, the world\\'s second-largest producer of tea, is a</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">nation of one billion tea drinkers, who consume 70% of India\\'s tea output.\\n\\nThe Indian automotive industry, the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">world\\'s second-fastest growing, increased domestic sales by 26% during 2009â\\x80\\x932010,[308] and exports by 36% </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">during 2008â\\x80\\x932009.[309] In 2022, India became the world\\'s third-largest vehicle market after China and the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">United States, surpassing Japan.[310] At the end of 2011, the Indian IT industry employed 2.8\\xa0million </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">professionals, generated revenues close to US$100\\xa0billion equalling 7.5% of Indian GDP, and contributed 26% of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India\\'s merchandise exports.[311]\\n\\nThe pharmaceutical industry in India includes 3,000 pharmaceutical companies </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and 10,500 manufacturing units; India is the world\\'s third-largest pharmaceutical producer, largest producer of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">generic medicines and supply up to 50â\\x80\\x9360% of global vaccines demand, these all contribute up to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">US$24.44\\xa0billions in exports and India\\'s local pharmaceutical market is estimated up to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">US$42\\xa0billion.[312][313] India is among the top 12 biotech destinations in the world.[314][315] The Indian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">biotech industry grew by 15.1% in 2012â\\x80\\x932013, increasing its revenues from â\\x82¹204.4\\xa0billion (Indian </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">rupees) to â\\x82¹235.24\\xa0billion (US$3.94\\xa0billion at June 2013 exchange rates).[316]\\n\\n### Energy\\n\\nMain </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">article: Energy in India\\n\\nSee also: Energy policy of India\\n\\nIndia\\'s capacity to generate electrical power is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">300 gigawatts, of which 42 gigawatts is renewable.[317] The country\\'s usage of coal is a major cause of India\\'s </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">greenhouse gas emissions, but its renewable energy is competing \\xa0source\\xa0needed*] India emits about 7% of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">global greenhouse gas emissions. This equates to about 2.5 tons of carbon dioxide per person per year, which is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">half the world average.[319][320] Increasing access to electricity and clean cooking with liquefied petroleum gas </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">have been priorities for energy in India.[321]\\n\\n### Socio-economic challenges\\n\\nMain articles: Poverty in India,</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Income inequality in India, and Debt bondage in India\\n\\n\\n\\nHealth workers about to begin another day of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">immunisation against infectious diseases in 2006. Eight years later, and three years after India\\'s last case of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">polio, the World Health Organization declared India to be polio-free.[322]\\n\\nDespite economic growth during recent</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">decades, India continues to face socio-economic challenges. In 2006, India contained the largest number of people </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">living below the World Bank\\'s international poverty line of US$1.25 per day.[323] The proportion decreased from </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">60% in 1981 to 42% in 2005.[324] Under the World Bank\\'s later revised poverty line, it was 21%-22.5 in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2011.[p][326][327] In 2019, the estimates had gone down to 10.2%.[327] In 2014, 30.7% of India\\'s children under </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the age of five were underweight.[328] According to a Food and Agriculture Organization report in 2015, 15% of the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">population was The Midday Meal Scheme attempts to lower these rates.[331]\\n\\nA 2018 Walk Free Foundation report </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">estimated that nearly 8\\xa0million people in India were living in different forms of modern slavery, such as bonded</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">labour, child labour, human trafficking, and forced begging.[332] According to the 2011 census, there were </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">10.1\\xa0million child labourers in the country, a decline of 2.6\\xa0million from 12.6\\xa0million in </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2001.[333]\\n\\nSince 1991, economic inequality between India\\'s states has consistently grown: the per-capita net </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">state domestic product of the richest states in 2007 was 3.2 times that of the poorest.[334] Corruption in India is</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">perceived to have decreased. According to the Corruption Perceptions Index, India ranked 78th out of 180 countries </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">in 2018, an improvement from 85th in 2014.[335][336]\\n\\nAs of 2025, poverty in India declined sharply. According to</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the World Bank report, extreme poverty fall from 16.2% in 2011-12 to 2.3% in 2022-23. In rural areas it fell from </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">18.4% to 2.8%, and in urban areas, from 10.7% to 1.1%. 378 million peopole were lifted from poverty and 171 million</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">from extreme poverty. The main reason, according to the World Bank, is not economic growth but different government</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">welfare programs, like transferring food and money to the people with low income, improving their access to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">services.[172]\\n\\nDemographics, languages, and religion\\n\\n\\nMain articles: Demographics of India, Languages of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">India, and Religion in India\\n\\nSee also: South Asian ethnic groups\\n\\n\\n\\nA Sikh pilgrim at the Harmandir Sahib, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">or Golden Temple, in Amritsar, Punjab\\n\\n\\n\\nThe interior of San Thome Basilica, Chennai, Tamil Nadu. Christianity </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">is believed to have been introduced to India by the late 2nd century by Syriac-speaking Christians.\\n\\nWith an </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">estimated 1,428,627,663 residents in 2023, India is the world\\'s most populous country.[13] 1,210,193,422 residents</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">were reported in the 2011 provisional census report.[337] Its population grew by 17.64% from 2001 to 2011,[338] </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">compared to 21.54% growth in the previous decade (1991â\\x80\\x932001).[338] The human sex ratio, according to the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2011 census, is 940 females per 1,000 males.[337] The median age was 28.7 in 2020.[281] The first post-colonial </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">census, conducted in 1951, counted 361\\xa0million people.[339] Medical advances made in the last 50 years as well </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as increased agricultural productivity brought about by the \"Green Revolution\" have caused India\\'s population to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">grow rapidly.[340]\\n\\nThe life expectancy in India is at 70 yearsâ\\x80\\x9471.5 years for women, 68.7 years for </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">men.[281] There are around 93 physicians per 100,000 people.[341] Migration from rural to urban areas has been an </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">important dynamic in India\\'s recent history. The number of people living in urban areas grew by 31.2% between 1991</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">and 2001.[342] Yet, in 2001, over 70% still lived in rural areas.[343][344] The level of urbanisation increased </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">further from 27.81% in the 2001 Census to 31.16% in the 2011 Census. The slowing down of the overall population </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">growth rate was due to the sharp decline in the growth rate in rural areas since 1991.[345] According to the 2011 </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">census, there are 53 million-plus urban agglomerations in India; among them Mumbai, Delhi, Kolkata, Chennai, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Bengaluru, Hyderabad and Ahmedabad, in decreasing order by population.[346] The literacy rate in 2011 was 74.04%: </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">65.46% among females and 82.14% among males.[347] The rural-urban literacy gap, which was 21.2 percentage points in</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2001, dropped to 16.1 percentage points in 2011. The improvement in the rural literacy rate is twice that of urban </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">areas.[345] Kerala is the most literate state with 93.91% literacy; while Bihar the least with </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">63.82%.[347]\\n\\nAmong speakers of the Indian languages, 74% speak Indo-Aryan languages, the easternmost branch of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">the Indo-European languages; 24% speak Dravidian languages, indigenous to South Asia and spoken widely before the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">spread of Indo-Aryan languages and 2% speak Austroasiatic languages or the Sino-Tibetan languages. India has no </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">national language.[348] Hindi, with the largest number of speakers, is the official language of the English is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">used extensively in business and administration and has the status of a \"subsidiary official language\";[6] it is </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">important in education, especially as a medium of higher education. Each state and union territory has one or more </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">official languages, and the constitution recognises in particular 22 \"scheduled languages\".\\n\\nThe 2011 census </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">reported the religion in India with the largest number of followers was Hinduism (79.80% of the population), </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">followed by Islam (14.23%); the remaining were Christianity (2.30%), Sikhism (1.72%), Buddhism (0.70%), Jainism </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">(0.36%) and others[q] (0.9%).[11] India has the third-largest Muslim populationâ\\x80\\x94the largest for a </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">non-Muslim majority country.[351][352]\\n\\nCulture\\n-------\\n\\nMain article: Culture of India\\n\\n### Visual </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">art\\n\\nMain article: Indian art\\n\\nIndia has a very ancient tradition of art, which has exchanged many influences </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">with the rest of Eurasia, especially in the first millennium, when Buddhist art spread with Indian religions to </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Central, East and Southeast Asia, the last also greatly influenced by Hindu art.[353] Thousands of seals from the </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Indus Valley Civilization of the third millennium BCE have been found, usually carved with animals, but also some </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">with human figures. The Pashupati seal, excavated in Mohenjo-daro, Pakistan, in 1928â\\x80\\x9329, is the best </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">known.[354][355] After this there is a long period with virtually nothing surviving.[355][356] Almost all surviving</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">ancient Indian art thereafter is in various forms of religious sculpture in durable materials, or coins. There was </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">probably originally far more in wood, which is lost. In north India Mauryan art is the first imperial In the first</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">millennium CE, Buddhist art spread with Indian religions to Central, East and Southeast Asia, the last also greatly</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">influenced by Hindu art.[360] Over the following centuries a distinctly Indian style of sculpting the human figure </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">developed, with less interest in articulating precise anatomy than ancient Greek sculpture but showing smoothly </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">flowing forms expressing *prana* (\"breath\" or This is often complicated by the need to give figures multiple arms </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">or heads, or represent different genders on the left and right of figures, as with the Ardhanarishvara form of </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Shiva and Parvati.[363][364]\\n\\nMost of the earliest large sculpture is Buddhist, either excavated from Buddhist </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">stupas such as Sanchi, Sarnath and Amaravati,[365] or is rock cut reliefs at sites such as Ajanta, Karla and </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Ellora. Hindu and Jain sites appear rather later.[366][367] In spite of this complex mixture of religious </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">traditions, generally, the prevailing artistic style at any time and place has been shared by the major religious </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">groups, and sculptors probably usually served all communities.[368] Gupta art, at its peak </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">c.â\\x80\\x89300\\xa0CE\\xa0â\\x80\\x93 c.â\\x8
SYMBOL INDEX (144 symbols across 15 files)
FILE: app.py
function init_components (line 32) | def init_components():
function lifespan (line 163) | async def lifespan(app_instance):
function _check_admin_token (line 195) | def _check_admin_token(token: str = None):
function admin_reload_config (line 208) | async def admin_reload_config(request: Request):
function admin_update_config (line 232) | async def admin_update_config(request: Request):
function admin_page (line 273) | async def admin_page():
function status (line 296) | async def status():
function admin_get_config (line 302) | async def admin_get_config():
function root (line 345) | async def root():
class WebSearchRequest (line 348) | class WebSearchRequest(BaseModel):
class YouTubeSearchRequest (line 358) | class YouTubeSearchRequest(BaseModel):
class RedditSearchRequest (line 363) | class RedditSearchRequest(BaseModel):
class MapSearchRequest (line 373) | class MapSearchRequest(BaseModel):
class WebSummarizeRequest (line 381) | class WebSummarizeRequest(BaseModel):
class GitTreeRequest (line 386) | class GitTreeRequest(BaseModel):
class GitSearchRequest (line 389) | class GitSearchRequest(BaseModel):
class LocalFolderTreeRequest (line 395) | class LocalFolderTreeRequest(BaseModel):
class ResearchCheckRequest (line 400) | class ResearchCheckRequest(BaseModel):
class ClickableElementRequest (line 404) | class ClickableElementRequest(BaseModel):
class PodcastRequest (line 409) | class PodcastRequest(BaseModel):
class BasicTTSRequest (line 413) | class BasicTTSRequest(BaseModel):
class KnowledgeBaseRequest (line 419) | class KnowledgeBaseRequest(BaseModel):
class CrawlerRequest (line 422) | class CrawlerRequest(BaseModel):
function get_website_structure (line 433) | async def get_website_structure(request: ClickableElementRequest):
function get_local_folder_tree (line 449) | async def get_local_folder_tree(request: LocalFolderTreeRequest):
function get_git_tree (line 465) | async def get_git_tree(request:GitTreeRequest):
function get_git_search (line 479) | async def get_git_search(request:GitSearchRequest):
function websearch (line 507) | async def websearch(request: WebSearchRequest):
function create_kb (line 550) | async def create_kb(request: KnowledgeBaseRequest):
function crawl_kb (line 571) | async def crawl_kb(request: CrawlerRequest):
function websummarize (line 605) | async def websummarize(request: WebSummarizeRequest):
function youtube_search (line 626) | async def youtube_search(request: YouTubeSearchRequest):
function reddit_search (line 644) | async def reddit_search(request: RedditSearchRequest):
function map_search (line 675) | async def map_search(request: MapSearchRequest):
function check_response (line 697) | async def check_response(request: ResearchCheckRequest):
function podcaster (line 724) | async def podcaster(request: PodcastRequest):
function basic_tts (line 789) | async def basic_tts(request: BasicTTSRequest):
FILE: model_config.py
function _env_bool (line 46) | def _env_bool(key, default=False):
function _env_json (line 52) | def _env_json(key, default=None):
function _load_config_file (line 83) | def _load_config_file(path):
function reload_model_config (line 160) | def reload_model_config(path=None):
FILE: utils/answer_generation.py
function query_agent (line 15) | async def query_agent(query, llm, date, day):
function response_gen (line 105) | async def response_gen(model, query, context):
function summarizer (line 188) | async def summarizer(query, docs, llm, batch,max_docs=30,max_words_per_d...
FILE: utils/crawler_utils.py
function get_sitemap_urls (line 22) | def get_sitemap_urls(base_url: str, headers: dict) -> List[str]:
function crawl_website (line 41) | def crawl_website(base_url: str, depth: Optional[int] = None, max_pages:...
function filter_docs_by_keywords (line 237) | def filter_docs_by_keywords(docs_map: dict, keywords: List[str]) -> dict:
function crawl_and_create_kb (line 266) | async def crawl_and_create_kb(
FILE: utils/git_utils.py
function git_tree_search (line 5) | async def git_tree_search(url):
function git_specific_content (line 33) | async def git_specific_content(base_url, part,type):
function folder_tree (line 80) | async def folder_tree(
FILE: utils/knowledge_base.py
function create_knowledge_base (line 10) | async def create_knowledge_base(document_paths, hf_embeddings):
FILE: utils/map.py
function get_coordinates (line 15) | def get_coordinates(location: str, limit: int = 3) -> List[Tuple[str, fl...
function get_route (line 42) | def get_route(start_coords: Tuple[float, float], end_coords: Tuple[float...
function get_pois (line 70) | def get_pois(
function create_map (line 105) | def create_map(
function auto_fix_destination (line 144) | def auto_fix_destination(location: str, limit: int = 3) -> Optional[Tupl...
function get_route_directions (line 156) | def get_route_directions(route_data: dict) -> str:
function generate_map (line 173) | def generate_map(
FILE: utils/process_content.py
function clean_html (line 24) | def clean_html(soup):
function process_content (line 38) | def process_content(url, content_type, content):
function process_content_pdf (line 93) | def process_content_pdf(file):
FILE: utils/profiler_utils.py
class WebSearchProfiler (line 9) | class WebSearchProfiler:
method __init__ (line 15) | def __init__(self, query):
method start_step (line 40) | def start_step(self, step_name, details=""):
method end_step (line 58) | def end_step(self, additional_info=""):
method add_metric (line 79) | def add_metric(self, metric_name, value):
method add_url_content (line 92) | def add_url_content(self, url, content):
method start_url_processing (line 102) | def start_url_processing(self, url):
method end_url_processing (line 107) | def end_url_processing(self, url, docs_count=0, context_length=0, stat...
method calculate_time_saved (line 124) | def calculate_time_saved(self):
method get_time_saved_summary (line 136) | def get_time_saved_summary(self):
method get_summary (line 149) | def get_summary(self):
method _get_seven_step_report (line 253) | def _get_seven_step_report(self, total_time):
method _get_url_processing_report (line 325) | def _get_url_processing_report(self):
method print_summary (line 414) | def print_summary(self):
function get_profiler (line 421) | def get_profiler():
function set_profiler (line 425) | def set_profiler(profiler):
FILE: utils/reddit_utils.py
function fetch_reddit_posts (line 23) | def fetch_reddit_posts(subreddit=None, url_type='hot', limit=10, time_fi...
function fetch_post_comments (line 95) | def fetch_post_comments(post_id, limit=5, is_custom_url=False):
function random_delay (line 126) | def random_delay():
function reddit_reader (line 133) | def reddit_reader(subreddit=None, url_type='hot', n=10, k=5, custom_url=...
function reddit_to_context (line 165) | def reddit_to_context(prompt, subreddit=None, url_type='hot', n=10, k=5,...
function reddit_reader_response (line 187) | def reddit_reader_response(
FILE: utils/retriever_utils.py
function get_chroma_client (line 19) | def get_chroma_client():
function create_vectorstore_async (line 38) | async def create_vectorstore_async(docs, collection_name, hf_embeddings,...
function _create_vectorstore_sync (line 73) | def _create_vectorstore_sync(docs, unique_collection_name, hf_embeddings...
function create_vectorstore (line 162) | def create_vectorstore(docs, collection_name, hf_embeddings, top_k, ense...
function cleanup_old_collections_async (line 171) | async def cleanup_old_collections_async(max_collections=20):
function _cleanup_collections_sync (line 183) | def _cleanup_collections_sync(max_collections):
function cleanup_old_collections (line 212) | def cleanup_old_collections(max_collections=20):
FILE: utils/startup_banner.py
function get_ascii_banner (line 11) | def get_ascii_banner():
function get_system_info (line 22) | def get_system_info():
function display_startup_banner (line 38) | def display_startup_banner(host="localhost", port=8000, mcp_port=None):
function display_shutdown_banner (line 108) | def display_shutdown_banner():
FILE: utils/tts_utils.py
function random_pause (line 15) | def random_pause(min_duration=0.5, max_duration=2.0, sample_rate=None):
function parse_podcast (line 23) | async def parse_podcast(text: str, voice_choices:list) -> list[dict]:
function podcasting (line 68) | async def podcasting(sentences, filename):
function text_to_speech (line 100) | async def text_to_speech(text, voice, filename, lang):
function podcasting_from_text (line 114) | async def podcasting_from_text(text,theme,llm):
FILE: utils/utils.py
function set_logging (line 47) | def set_logging(enabled: bool):
function is_searxng_running (line 77) | def is_searxng_running():
function fix_json (line 143) | def fix_json(json_str):
function load_model (line 162) | def load_model(model_name,
function stream_text_1 (line 321) | def stream_text_1(placeholder, output):
function stream_answer (line 335) | def stream_answer(text):
function get_local_data (line 346) | def get_local_data():
function get_generative_model (line 359) | def get_generative_model(model_name='gemini-1.5-flash',
function log_results (line 413) | def log_results(query, context, date, day):
function ordered_set_by_key (line 428) | def ordered_set_by_key(data):
function remove_consecutive_newlines (line 450) | def remove_consecutive_newlines(text):
function remove_main_url (line 466) | def remove_main_url(url):
function extract_markdown_tables (line 478) | def extract_markdown_tables(filename, md_text):
function extract_urls (line 506) | def extract_urls(text):
function extract_subqueries (line 524) | def extract_subqueries(text):
function extract_urls_from_query (line 547) | def extract_urls_from_query(text):
FILE: utils/websearch_utils.py
class SearchWeb (line 64) | class SearchWeb:
method __init__ (line 73) | def __init__(self, port, host="localhost", type='http'):
method query_search (line 99) | def query_search(self, query, engines=['google','brave','duckduckgo','...
method scrape_text (line 123) | def scrape_text(self, url):
method scrape_top_results (line 164) | def scrape_top_results(self, urls):
function process_url (line 185) | async def process_url(
function context_to_docs (line 414) | async def context_to_docs(
function text_to_docs (line 711) | def text_to_docs(texts_with_metadata):
function remove_urls (line 737) | def remove_urls(text):
function extract_domain_from_url (line 762) | def extract_domain_from_url(url):
function check_url_reachability (line 785) | async def check_url_reachability(url, timeout=10):
function check_urls_reachability (line 809) | async def check_urls_reachability(urls):
function add_domains_to_blacklist (line 839) | def add_domains_to_blacklist(urls):
function modify_query_with_blacklist (line 857) | def modify_query_with_blacklist(query):
function query_to_search_results (line 880) | def query_to_search_results(query, search_response, websearcher, num_res...
function query_web_response (line 1006) | async def query_web_response(
function url_to_markdown (line 1263) | async def url_to_markdown(url, executor, local_mode=False):
function urls_to_docs (line 1316) | async def urls_to_docs(urls, local_mode=False, split=True):
function youtube_transcript_response (line 1416) | def youtube_transcript_response(query, task, model,n=3):
function generate_doc_hash (line 1470) | def generate_doc_hash(text):
function summary_of_url (line 1482) | async def summary_of_url(query, url, model, local_mode=False):
function is_file_folder (line 1516) | def is_file_folder(root_path):
function get_all_paths (line 1527) | def get_all_paths(root_path):
function fetch_html (line 1541) | async def fetch_html(url, client):
function extract_clickable_elements (line 1551) | async def extract_clickable_elements(url):
function bm25_search (line 1577) | def bm25_search(elements, query,topk=10):
function get_topk_bm25_clickable_elements (line 1592) | async def get_topk_bm25_clickable_elements(url, query, topk=10):
function deduplicate_context (line 1597) | def deduplicate_context(context):
Condensed preview: 40 files, each showing path, character count, and a content snippet. The full structured content is 11,014K chars.
[
{
"path": ".dockerignore",
"chars": 119,
"preview": "__pycache__\n*.pyc\n*.pyo\n*.pyd\n.pytest_cache\n.venv\nenv/\ninfinity_env/\ncoexistaienv/\n*.log\nartifacts/\noutput/\ndownloads/\n"
},
{
"path": "Dockerfile",
"chars": 2730,
"preview": "FROM python:3.13-slim\n\nENV PYTHONDONTWRITEBYTECODE=1\nENV PYTHONUNBUFFERED=1\n\n# Build-time args that will be copied into "
},
{
"path": "Dockerfile.searxng",
"chars": 203,
"preview": "FROM searxng/searxng:latest\n\n# Copy custom settings\nCOPY ./searxng/settings.yml /etc/searxng/settings.yml\n\n# Optionally "
},
{
"path": "LICENSE",
"chars": 2182,
"preview": "NON-COMMERCIAL RESEARCH AND EDUCATIONAL USE LICENSE\n\nCopyright (c) 2025 Sidhant Thole and CoexistAI Contributors\n\nPermis"
},
{
"path": "README.docker.md",
"chars": 3766,
"preview": "# CoexistAI — Docker Quickstart\n\n### Short, step-by-step instructions for two ways to start CoexistAI. Pick either Metho"
},
{
"path": "README.md",
"chars": 15918,
"preview": "# CoexistAI\n\nCoexistAI is a modular, developer-friendly research assistant framework. It enables you to build, search, s"
},
{
"path": "README_MCP.md",
"chars": 10679,
"preview": "# CoexistAI v0.0.2 \n\n<p align=\"center\">\n <img src=\"artifacts/v002mcplogo.jpeg\" alt=\"CoexistAI MCP Logo\" width=\"200\"/>\n<"
},
{
"path": "__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "app.py",
"chars": 35912,
"preview": "from utils.websearch_utils import *\nfrom utils.reddit_utils import *\nfrom utils.map import * \nfrom fastapi import FastAP"
},
{
"path": "coexist_tutorial.ipynb",
"chars": 10317031,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"id\": \"6dd23e8d\",\n \"metadata\": {},\n \"source\": [\n \"# CoexistAI To"
},
{
"path": "config/model_config.json",
"chars": 719,
"preview": "{\n \"llm_model_name\": \"jan-nano\",\n \"llm_type\": \"local\",\n \"embed_mode\": \"infinity_emb\",\n \"embedding_model_name\": \"nomi"
},
{
"path": "demo_queries.ipynb",
"chars": 39414,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"id\": \"89ceb341\",\n \"metadata\": {},\n \"source\": [\n \"# Few Example "
},
{
"path": "docker-compose.yml",
"chars": 1107,
"preview": "version: '3.8'\nservices:\n app:\n build: .\n restart: unless-stopped\n ports:\n - \"8000:8000\"\n volumes:\n "
},
{
"path": "entrypoint.sh",
"chars": 2448,
"preview": "#!/bin/sh\nset -e\n\n# Load environment variables from /app/.env if present\nif [ -f \"/app/.env\" ]; then\n export $(grep -v "
},
{
"path": "model_config.py",
"chars": 10219,
"preview": "import os\nimport json\n\"\"\"\nThis module defines the configuration for language model (LLM) and embedding models.\nAttribute"
},
{
"path": "output/map_with_route_and_pois.html",
"chars": 114474,
"preview": "<!DOCTYPE html>\n<html>\n<head>\n \n <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\" />\n <script"
},
{
"path": "quick_setup.sh",
"chars": 3379,
"preview": "#!/bin/zsh\n# Quick Shell Setup for CoexistAI (macOS/zsh)\n\necho \"Pulling SearxNG Docker image...\"\ndocker pull searxng/sea"
},
{
"path": "quick_setup_docker.sh",
"chars": 2857,
"preview": "#!/usr/bin/env bash\n# start_and_wait.sh - start docker compose and wait until CoexistAI app reports ready\n# Usage: ./sta"
},
{
"path": "requirements.txt",
"chars": 549,
"preview": "fastapi\nuvicorn\npydantic\naiohttp\nlangchain<1.0.0\nlangchain-google-genai\nlangchain-openai\nlangchain-text-splitters\nlanggr"
},
{
"path": "searxng/settings.yml",
"chars": 63013,
"preview": "general:\n # Debug mode, only for development. Is overwritten by ${SEARXNG_DEBUG}\n debug: false\n # displayed name\n in"
},
{
"path": "searxng/settings.yml.new",
"chars": 68607,
"preview": "general:\n # Debug mode, only for development. Is overwritten by ${SEARXNG_DEBUG}\n debug: false\n # displayed name\n in"
},
{
"path": "searxng/uwsgi.ini",
"chars": 1946,
"preview": "[uwsgi]\n# Who will run the code\nuid = searxng\ngid = searxng\n\n# Performance optimizations\n# Number of workers (adjust bas"
},
{
"path": "searxng/uwsgi.ini.new",
"chars": 1257,
"preview": "[uwsgi]\n# Listening address\n# default value: [::]:8080 (see Dockerfile)\nhttp-socket = $(BIND_ADDRESS)\n\n# Who will run th"
},
{
"path": "static/admin.html",
"chars": 24729,
"preview": "<!doctype html>\n<html>\n<head>\n <meta charset=\" body {\n background: linear-gradient(180deg, #050814 "
},
{
"path": "system_prompt.py",
"chars": 3449,
"preview": "#Example system prompt for coexistAI MCP can be used in Agents/LMStudio/LLM system prompts\nsystem_prompt = \"\"\"# Role & C"
},
{
"path": "utils/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "utils/answer_generation.py",
"chars": 16402,
"preview": "import logging\nimport asyncio\nfrom typing import Union\nfrom pydantic import BaseModel, Field\n\nfrom utils.config import *"
},
{
"path": "utils/config.py",
"chars": 9608,
"preview": "prompts = {\n\n'youtube_summary_prompt':\"\"\"Analyze and respond the task using the given transcript in detail, following th"
},
{
"path": "utils/crawler_utils.py",
"chars": 18268,
"preview": "import asyncio\nimport aiohttp\nimport requests\nfrom bs4 import BeautifulSoup\nfrom urllib.parse import urljoin, urlparse\ni"
},
{
"path": "utils/git_utils.py",
"chars": 5946,
"preview": "from gitingest import ingest_async\nimport os\nimport aiofiles.os\n\nasync def git_tree_search(url):\n \"\"\"\n Retrieves a"
},
{
"path": "utils/knowledge_base.py",
"chars": 3702,
"preview": "import os\nimport hashlib\nfrom utils.websearch_utils import urls_to_docs, get_all_paths\nfrom utils.retriever_utils import"
},
{
"path": "utils/map.py",
"chars": 11015,
"preview": "import logging\nfrom typing import List, Tuple, Optional\nimport requests\nimport folium\nimport os\n\n# Configure logging\nlog"
},
{
"path": "utils/process_content.py",
"chars": 4350,
"preview": "import logging\nimport re\n\nimport fitz\nfrom bs4 import BeautifulSoup\nfrom markdownify import markdownify\nfrom markitdown "
},
{
"path": "utils/profiler_utils.py",
"chars": 17855,
"preview": "# Profiling system for query_web_response\nimport time\nimport datetime\nimport logging\nfrom collections import defaultdict"
},
{
"path": "utils/reddit_utils.py",
"chars": 10999,
"preview": "import requests\nimport time\nimport random\nfrom utils.config import *\nfrom utils.utils import *\nfrom rank_bm25 import BM2"
},
{
"path": "utils/retriever_utils.py",
"chars": 9546,
"preview": "import hashlib\nimport time\nimport asyncio\nimport logging\nfrom concurrent.futures import ThreadPoolExecutor\nfrom langchai"
},
{
"path": "utils/startup_banner.py",
"chars": 4692,
"preview": "\"\"\"\nCoexistAI Startup Banner Module\nDisplays professional ASCII banner and system information on startup\n\"\"\"\n\nimport os\n"
},
{
"path": "utils/tts_utils.py",
"chars": 5253,
"preview": "\nimport logging\nimport random\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\nimport numpy"
},
{
"path": "utils/utils.py",
"chars": 21461,
"preview": "\"\"\"\nutils_langchain.py\nAuthor: Sidhant Thole\nCreated: 25 May 2025\nDescription: Utility functions for LangChain-based app"
},
{
"path": "utils/websearch_utils.py",
"chars": 68824,
"preview": "import asyncio\nimport aiohttp\nimport concurrent.futures\nimport hashlib\nimport logging\nimport os\nimport re\nimport request"
}
]
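The preview above is a JSON array of objects with `path`, `chars`, and `preview` keys. As a minimal sketch of how such a listing could be consumed, the snippet below parses a small sample in the same shape and reports a few aggregate figures. The `SAMPLE` string and the `summarize_preview` helper are hypothetical names introduced only for illustration; they are not part of the CoexistAI repository or of the extraction tool.

```python
import json

# Hypothetical two-entry sample in the same shape as the preview above
# (keys: "path", "chars", "preview"). Not the full 40-file listing.
SAMPLE = '''[
  {"path": ".dockerignore", "chars": 119, "preview": "__pycache__\\n*.pyc\\n"},
  {"path": "utils/__init__.py", "chars": 0, "preview": ""}
]'''


def summarize_preview(raw: str) -> dict:
    """Parse a preview listing and return simple aggregate statistics."""
    entries = json.loads(raw)
    # Sum the reported character counts across all files.
    total = sum(entry["chars"] for entry in entries)
    # Find the entry with the largest reported character count.
    largest = max(entries, key=lambda entry: entry["chars"])
    # Collect paths of files reported as empty (for example, __init__.py).
    empty = [entry["path"] for entry in entries if entry["chars"] == 0]
    return {
        "files": len(entries),
        "total_chars": total,
        "largest": largest["path"],
        "empty": empty,
    }


summary = summarize_preview(SAMPLE)
print(summary)
```

Run against the full listing, the same function would make it easy to spot outliers such as `coexist_tutorial.ipynb`, which accounts for most of the reported size.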
About this extraction
This page contains the full source code of the SPThole/CoexistAI GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 40 files (10.4 MB), approximately 2.7M tokens, and a symbol index with 144 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract, a GitHub repo to text converter. Built by Nikandr Surkov.