Full Code of assafelovic/gpt-researcher for AI

Repository: assafelovic/gpt-researcher
Branch: main
Commit: 7c321744ce33
Files: 472
Total size: 10.4 MB

Directory structure:
gitextract__hhdux9u/

├── .claude/
│   ├── SKILL.md
│   └── references/
│       ├── adding-features.md
│       ├── advanced-patterns.md
│       ├── api-reference.md
│       ├── architecture.md
│       ├── components.md
│       ├── config-reference.md
│       ├── deep-research.md
│       ├── flows.md
│       ├── mcp.md
│       ├── multi-agents.md
│       ├── prompts.md
│       └── retrievers.md
├── .cursorignore
├── .dockerignore
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.md
│   │   └── feature_request.md
│   ├── dependabot.yml
│   └── workflows/
│       ├── build.yml
│       ├── deploy.yml
│       └── docker-build.yml
├── .gitignore
├── .python-version
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Dockerfile
├── Dockerfile.fullstack
├── LICENSE
├── Procfile
├── README-ja_JP.md
├── README-ko_KR.md
├── README-zh_CN.md
├── README.md
├── backend/
│   ├── Dockerfile
│   ├── Procfile
│   ├── __init__.py
│   ├── chat/
│   │   ├── __init__.py
│   │   └── chat.py
│   ├── memory/
│   │   ├── __init__.py
│   │   ├── draft.py
│   │   └── research.py
│   ├── report_type/
│   │   ├── __init__.py
│   │   ├── basic_report/
│   │   │   ├── __init__.py
│   │   │   └── basic_report.py
│   │   ├── deep_research/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── example.py
│   │   │   └── main.py
│   │   └── detailed_report/
│   │       ├── README.md
│   │       ├── __init__.py
│   │       └── detailed_report.py
│   ├── requirements.txt
│   ├── run_server.py
│   ├── runtime.txt
│   ├── server/
│   │   ├── __init__.py
│   │   ├── app.py
│   │   ├── logging_config.py
│   │   ├── multi_agent_runner.py
│   │   ├── report_store.py
│   │   ├── server_utils.py
│   │   └── websocket_manager.py
│   ├── styles/
│   │   └── pdf_styles.css
│   └── utils.py
├── citation.cff
├── cli.py
├── docker-compose.yml
├── docs/
│   ├── CNAME
│   ├── README.md
│   ├── babel.config.js
│   ├── blog/
│   │   ├── 2023-09-22-gpt-researcher/
│   │   │   └── index.md
│   │   ├── 2023-11-12-openai-assistant/
│   │   │   └── index.md
│   │   ├── 2024-05-19-gptr-langgraph/
│   │   │   └── index.md
│   │   ├── 2024-09-7-hybrid-research/
│   │   │   └── index.md
│   │   ├── 2025-02-26-deep-research/
│   │   │   └── index.md
│   │   ├── 2025-03-10-stepping-into-the-story/
│   │   │   └── index.md
│   │   └── authors.yml
│   ├── discord-bot/
│   │   ├── Dockerfile
│   │   ├── Dockerfile.dev
│   │   ├── commands/
│   │   │   └── ask.js
│   │   ├── deploy-commands.js
│   │   ├── gptr-webhook.js
│   │   ├── index.js
│   │   ├── package.json
│   │   └── server.js
│   ├── docs/
│   │   ├── contribute.md
│   │   ├── examples/
│   │   │   ├── custom_prompt.py
│   │   │   ├── detailed_report.md
│   │   │   ├── examples.ipynb
│   │   │   ├── examples.md
│   │   │   ├── hybrid_research.md
│   │   │   ├── pip-run.ipynb
│   │   │   ├── sample_report.py
│   │   │   └── sample_sources_only.py
│   │   ├── faq.md
│   │   ├── gpt-researcher/
│   │   │   ├── context/
│   │   │   │   ├── azure-storage.md
│   │   │   │   ├── data-ingestion.md
│   │   │   │   ├── filtering-by-domain.md
│   │   │   │   ├── local-docs.md
│   │   │   │   ├── tailored-research.md
│   │   │   │   └── vector-stores.md
│   │   │   ├── frontend/
│   │   │   │   ├── discord-bot.md
│   │   │   │   ├── embed-script.md
│   │   │   │   ├── introduction.md
│   │   │   │   ├── nextjs-frontend.md
│   │   │   │   ├── react-package.md
│   │   │   │   ├── vanilla-js-frontend.md
│   │   │   │   └── visualizing-websockets.md
│   │   │   ├── getting-started/
│   │   │   │   ├── cli.md
│   │   │   │   ├── getting-started-with-docker.md
│   │   │   │   ├── getting-started.md
│   │   │   │   ├── how-to-choose.md
│   │   │   │   ├── introduction.md
│   │   │   │   └── linux-deployment.md
│   │   │   ├── gptr/
│   │   │   │   ├── ai-development.md
│   │   │   │   ├── automated-tests.md
│   │   │   │   ├── claude-skill.md
│   │   │   │   ├── config.md
│   │   │   │   ├── deep_research.md
│   │   │   │   ├── example.md
│   │   │   │   ├── image_generation.md
│   │   │   │   ├── npm-package.md
│   │   │   │   ├── pip-package.md
│   │   │   │   ├── querying-the-backend.md
│   │   │   │   ├── scraping.md
│   │   │   │   └── troubleshooting.md
│   │   │   ├── handling-logs/
│   │   │   │   ├── all-about-logs.md
│   │   │   │   ├── langsmith-logs.md
│   │   │   │   └── simple-logs-example.md
│   │   │   ├── llms/
│   │   │   │   ├── llms.md
│   │   │   │   ├── running-with-azure.md
│   │   │   │   ├── running-with-ollama.md
│   │   │   │   ├── supported-llms.md
│   │   │   │   └── testing-your-llm.md
│   │   │   ├── mcp-server/
│   │   │   │   ├── advanced-usage.md
│   │   │   │   ├── claude-integration.md
│   │   │   │   └── getting-started.md
│   │   │   ├── multi_agents/
│   │   │   │   ├── ag2.md
│   │   │   │   └── langgraph.md
│   │   │   ├── retrievers/
│   │   │   │   └── mcp-configs.mdx
│   │   │   └── search-engines/
│   │   │       ├── search-engines.md
│   │   │       └── test-your-retriever.md
│   │   ├── proposals/
│   │   │   ├── adaptive-deep-research.md
│   │   │   ├── high-quality-content-scraping-architecture.md
│   │   │   ├── local-server-deployment-guide.md
│   │   │   └── social-media-data-acquisition.md
│   │   ├── reference/
│   │   │   ├── config/
│   │   │   │   ├── config.md
│   │   │   │   └── singleton.md
│   │   │   ├── processing/
│   │   │   │   ├── html.md
│   │   │   │   └── text.md
│   │   │   └── sidebar.json
│   │   ├── roadmap.md
│   │   └── welcome.md
│   ├── docusaurus.config.js
│   ├── npm/
│   │   ├── Readme.md
│   │   ├── index.js
│   │   └── package.json
│   ├── package.json
│   ├── pydoc-markdown.yml
│   ├── sidebars.js
│   ├── src/
│   │   ├── components/
│   │   │   ├── HomepageFeatures.js
│   │   │   └── HomepageFeatures.module.css
│   │   ├── css/
│   │   │   └── custom.css
│   │   └── pages/
│   │       ├── index.js
│   │       └── index.module.css
│   └── static/
│       ├── .nojekyll
│       └── CNAME
├── evals/
│   ├── README.md
│   ├── __init__.py
│   ├── hallucination_eval/
│   │   ├── evaluate.py
│   │   ├── inputs/
│   │   │   └── search_queries.jsonl
│   │   ├── requirements.txt
│   │   ├── results/
│   │   │   ├── aggregate_results.json
│   │   │   └── evaluation_records.jsonl
│   │   └── run_eval.py
│   └── simple_evals/
│       ├── .gitignore
│       ├── __init__.py
│       ├── logs/
│       │   ├── .gitkeep
│       │   ├── README.md
│       │   └── SimpleQA Eval 100 Problems 2-22-25.txt
│       ├── problems/
│       │   └── Simple QA Test Set.csv
│       ├── requirements.txt
│       ├── run_eval.py
│       └── simpleqa_eval.py
├── frontend/
│   ├── README.md
│   ├── index.html
│   ├── nextjs/
│   │   ├── .babelrc.build.json
│   │   ├── .dockerignore
│   │   ├── .eslintrc.json
│   │   ├── .example.env
│   │   ├── .gitignore
│   │   ├── .prettierrc
│   │   ├── .python-version
│   │   ├── Dockerfile
│   │   ├── Dockerfile.dev
│   │   ├── README.md
│   │   ├── actions/
│   │   │   └── apiActions.ts
│   │   ├── app/
│   │   │   ├── api/
│   │   │   │   ├── chat/
│   │   │   │   │   └── route.ts
│   │   │   │   └── reports/
│   │   │   │       ├── [id]/
│   │   │   │       │   ├── chat/
│   │   │   │       │   │   └── route.ts
│   │   │   │       │   └── route.ts
│   │   │   │       └── route.ts
│   │   │   ├── globals.css
│   │   │   ├── layout.tsx
│   │   │   ├── page.tsx
│   │   │   └── research/
│   │   │       └── [id]/
│   │   │           └── page.tsx
│   │   ├── components/
│   │   │   ├── Footer.tsx
│   │   │   ├── Header.tsx
│   │   │   ├── Hero.tsx
│   │   │   ├── HumanFeedback.tsx
│   │   │   ├── Images/
│   │   │   │   ├── ImageModal.tsx
│   │   │   │   └── ImagesAlbum.tsx
│   │   │   ├── Langgraph/
│   │   │   │   └── Langgraph.js
│   │   │   ├── LoadingDots.tsx
│   │   │   ├── ResearchBlocks/
│   │   │   │   ├── AccessReport.tsx
│   │   │   │   ├── ChatInterface.tsx
│   │   │   │   ├── ChatResponse.tsx
│   │   │   │   ├── ImageSection.tsx
│   │   │   │   ├── LogsSection.tsx
│   │   │   │   ├── Question.tsx
│   │   │   │   ├── Report.tsx
│   │   │   │   ├── Sources.tsx
│   │   │   │   └── elements/
│   │   │   │       ├── ChatInput.tsx
│   │   │   │       ├── InputArea.tsx
│   │   │   │       ├── LogMessage.tsx
│   │   │   │       ├── SourceCard.tsx
│   │   │   │       └── SubQuestions.tsx
│   │   │   ├── ResearchResults.tsx
│   │   │   ├── ResearchSidebar.tsx
│   │   │   ├── Settings/
│   │   │   │   ├── ChatBox.tsx
│   │   │   │   ├── FileUpload.tsx
│   │   │   │   ├── LayoutSelector.tsx
│   │   │   │   ├── MCPSelector.tsx
│   │   │   │   ├── Modal.tsx
│   │   │   │   ├── Settings.css
│   │   │   │   └── ToneSelector.tsx
│   │   │   ├── SimilarTopics.tsx
│   │   │   ├── Task/
│   │   │   │   ├── Accordion.tsx
│   │   │   │   ├── AgentLogs.tsx
│   │   │   │   ├── DomainFilter.tsx
│   │   │   │   ├── Report.tsx
│   │   │   │   └── ResearchForm.tsx
│   │   │   ├── TypeAnimation.tsx
│   │   │   ├── layouts/
│   │   │   │   ├── CopilotLayout.tsx
│   │   │   │   ├── MobileLayout.tsx
│   │   │   │   └── ResearchPageLayout.tsx
│   │   │   ├── mobile/
│   │   │   │   ├── MobileChatPanel.tsx
│   │   │   │   ├── MobileHomeScreen.tsx
│   │   │   │   └── MobileResearchContent.tsx
│   │   │   └── research/
│   │   │       ├── CopilotPanel.tsx
│   │   │       ├── CopilotResearchContent.tsx
│   │   │       ├── NotFoundContent.tsx
│   │   │       ├── ResearchContent.tsx
│   │   │       └── ResearchPanel.tsx
│   │   ├── config/
│   │   │   └── task.ts
│   │   ├── helpers/
│   │   │   ├── findDifferences.ts
│   │   │   ├── getHost.ts
│   │   │   └── markdownHelper.ts
│   │   ├── hooks/
│   │   │   ├── ResearchHistoryContext.tsx
│   │   │   ├── useAnalytics.ts
│   │   │   ├── useResearchHistory.ts
│   │   │   ├── useScrollHandler.ts
│   │   │   └── useWebSocket.ts
│   │   ├── next.config.mjs
│   │   ├── nginx/
│   │   │   └── default.conf
│   │   ├── package.json
│   │   ├── package.lib.json
│   │   ├── postcss.config.mjs
│   │   ├── public/
│   │   │   ├── embed.js
│   │   │   ├── manifest.json
│   │   │   ├── sw.js
│   │   │   └── workbox-f1770938.js
│   │   ├── rollup.config.js
│   │   ├── src/
│   │   │   ├── GPTResearcher.tsx
│   │   │   ├── index.css
│   │   │   ├── index.d.ts
│   │   │   ├── index.ts
│   │   │   └── utils/
│   │   │       └── imageTransformPlugin.js
│   │   ├── styles/
│   │   │   └── markdown.css
│   │   ├── tailwind.config.ts
│   │   ├── tsconfig.json
│   │   ├── tsconfig.lib.json
│   │   ├── types/
│   │   │   ├── data.ts
│   │   │   └── react-ga4.d.ts
│   │   └── utils/
│   │       ├── consolidateBlocks.ts
│   │       ├── dataProcessing.ts
│   │       └── getLayout.tsx
│   ├── pdf_styles.css
│   ├── scripts.js
│   └── styles.css
├── gpt_researcher/
│   ├── __init__.py
│   ├── actions/
│   │   ├── __init__.py
│   │   ├── agent_creator.py
│   │   ├── markdown_processing.py
│   │   ├── query_processing.py
│   │   ├── report_generation.py
│   │   ├── retriever.py
│   │   ├── utils.py
│   │   └── web_scraping.py
│   ├── agent.py
│   ├── config/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   └── variables/
│   │       ├── __init__.py
│   │       ├── base.py
│   │       ├── default.py
│   │       └── test_local.json
│   ├── context/
│   │   ├── __init__.py
│   │   ├── compression.py
│   │   └── retriever.py
│   ├── document/
│   │   ├── __init__.py
│   │   ├── azure_document_loader.py
│   │   ├── document.py
│   │   ├── langchain_document.py
│   │   └── online_document.py
│   ├── llm_provider/
│   │   ├── __init__.py
│   │   ├── generic/
│   │   │   ├── __init__.py
│   │   │   └── base.py
│   │   └── image/
│   │       ├── __init__.py
│   │       └── image_generator.py
│   ├── mcp/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── client.py
│   │   ├── research.py
│   │   ├── streaming.py
│   │   └── tool_selector.py
│   ├── memory/
│   │   ├── __init__.py
│   │   └── embeddings.py
│   ├── prompts.py
│   ├── retrievers/
│   │   ├── __init__.py
│   │   ├── arxiv/
│   │   │   ├── __init__.py
│   │   │   └── arxiv.py
│   │   ├── bing/
│   │   │   ├── __init__.py
│   │   │   └── bing.py
│   │   ├── bocha/
│   │   │   ├── __init__.py
│   │   │   └── bocha.py
│   │   ├── custom/
│   │   │   ├── __init__.py
│   │   │   └── custom.py
│   │   ├── duckduckgo/
│   │   │   ├── __init__.py
│   │   │   └── duckduckgo.py
│   │   ├── exa/
│   │   │   ├── __init__.py
│   │   │   └── exa.py
│   │   ├── google/
│   │   │   ├── __init__.py
│   │   │   └── google.py
│   │   ├── mcp/
│   │   │   ├── __init__.py
│   │   │   └── retriever.py
│   │   ├── pubmed_central/
│   │   │   ├── __init__.py
│   │   │   └── pubmed_central.py
│   │   ├── searchapi/
│   │   │   ├── __init__.py
│   │   │   └── searchapi.py
│   │   ├── searx/
│   │   │   ├── __init__.py
│   │   │   └── searx.py
│   │   ├── semantic_scholar/
│   │   │   ├── __init__.py
│   │   │   └── semantic_scholar.py
│   │   ├── serpapi/
│   │   │   ├── __init__.py
│   │   │   └── serpapi.py
│   │   ├── serper/
│   │   │   ├── __init__.py
│   │   │   └── serper.py
│   │   ├── tavily/
│   │   │   ├── __init__.py
│   │   │   └── tavily_search.py
│   │   └── utils.py
│   ├── scraper/
│   │   ├── __init__.py
│   │   ├── arxiv/
│   │   │   ├── __init__.py
│   │   │   └── arxiv.py
│   │   ├── beautiful_soup/
│   │   │   ├── __init__.py
│   │   │   └── beautiful_soup.py
│   │   ├── browser/
│   │   │   ├── __init__.py
│   │   │   ├── browser.py
│   │   │   ├── js/
│   │   │   │   └── overlay.js
│   │   │   ├── nodriver_scraper.py
│   │   │   └── processing/
│   │   │       ├── __init__.py
│   │   │       ├── html.py
│   │   │       └── scrape_skills.py
│   │   ├── firecrawl/
│   │   │   ├── __init__.py
│   │   │   └── firecrawl.py
│   │   ├── pymupdf/
│   │   │   ├── __init__.py
│   │   │   └── pymupdf.py
│   │   ├── scraper.py
│   │   ├── tavily_extract/
│   │   │   ├── __init__.py
│   │   │   └── tavily_extract.py
│   │   ├── utils.py
│   │   └── web_base_loader/
│   │       ├── __init__.py
│   │       └── web_base_loader.py
│   ├── skills/
│   │   ├── __init__.py
│   │   ├── browser.py
│   │   ├── context_manager.py
│   │   ├── curator.py
│   │   ├── deep_research.py
│   │   ├── image_generator.py
│   │   ├── researcher.py
│   │   └── writer.py
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── costs.py
│   │   ├── enum.py
│   │   ├── llm.py
│   │   ├── logger.py
│   │   ├── logging_config.py
│   │   ├── rate_limiter.py
│   │   ├── tools.py
│   │   ├── validators.py
│   │   └── workers.py
│   └── vector_store/
│       ├── __init__.py
│       └── vector_store.py
├── json_schema_generator.py
├── langgraph.json
├── main.py
├── mcp-server/
│   └── README.md
├── multi_agents/
│   ├── README.md
│   ├── __init__.py
│   ├── agent.py
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── editor.py
│   │   ├── human.py
│   │   ├── orchestrator.py
│   │   ├── publisher.py
│   │   ├── researcher.py
│   │   ├── reviewer.py
│   │   ├── reviser.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── file_formats.py
│   │   │   ├── llms.py
│   │   │   ├── pdf_styles.css
│   │   │   ├── utils.py
│   │   │   └── views.py
│   │   └── writer.py
│   ├── langgraph.json
│   ├── main.py
│   ├── memory/
│   │   ├── __init__.py
│   │   ├── draft.py
│   │   └── research.py
│   ├── package.json
│   ├── requirements.txt
│   └── task.json
├── multi_agents_ag2/
│   ├── README.md
│   ├── __init__.py
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── editor.py
│   │   └── orchestrator.py
│   ├── main.py
│   ├── requirements.txt
│   └── task.json
├── poetry.toml
├── pyproject.toml
├── requirements.txt
├── setup.py
├── terraform/
│   ├── ecr-setup/
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── versions.tf
│   ├── github-actions-setup/
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── versions.tf
│   ├── main.tf
│   ├── outputs.tf
│   ├── variables.tf
│   └── versions.tf
└── tests/
    ├── __init__.py
    ├── documents-report-source.py
    ├── gptr-logs-handler.py
    ├── report-types.py
    ├── research_test.py
    ├── test-loaders.py
    ├── test-openai-llm.py
    ├── test-your-embeddings.py
    ├── test-your-llm.py
    ├── test-your-retriever.py
    ├── test_logging.py
    ├── test_logging_output.py
    ├── test_logs.py
    ├── test_mcp.py
    ├── test_quick_search.py
    ├── test_researcher_logging.py
    ├── test_security_fix.py
    └── vector-store.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .claude/SKILL.md
================================================
---
name: gpt-researcher
description: GPT Researcher is an autonomous deep research agent that conducts web and local research, producing detailed reports with citations. Use this skill when helping developers understand, extend, debug, or integrate with GPT Researcher - including adding features, understanding the architecture, working with the API, customizing research workflows, adding new retrievers, integrating MCP data sources, or troubleshooting research pipelines.
---

# GPT Researcher Development Skill

GPT Researcher is an LLM-based autonomous agent using a planner-executor-publisher pattern with parallelized agent work for speed and reliability.

## Quick Start

### Basic Python Usage

```python
from gpt_researcher import GPTResearcher
import asyncio

async def main():
    researcher = GPTResearcher(
        query="What are the latest AI developments?",
        report_type="research_report",  # or detailed_report, deep, outline_report
        report_source="web",            # or local, hybrid
    )
    await researcher.conduct_research()
    report = await researcher.write_report()
    print(report)

asyncio.run(main())
```

### Run Servers

```bash
# Backend
python -m uvicorn backend.server.app:app --reload --port 8000

# Frontend
cd frontend/nextjs && npm install && npm run dev
```

---

## Key File Locations

| Need | Primary File | Key Classes |
|------|--------------|-------------|
| Main orchestrator | `gpt_researcher/agent.py` | `GPTResearcher` |
| Research logic | `gpt_researcher/skills/researcher.py` | `ResearchConductor` |
| Report writing | `gpt_researcher/skills/writer.py` | `ReportGenerator` |
| All prompts | `gpt_researcher/prompts.py` | `PromptFamily` |
| Configuration | `gpt_researcher/config/config.py` | `Config` |
| Config defaults | `gpt_researcher/config/variables/default.py` | `DEFAULT_CONFIG` |
| API server | `backend/server/app.py` | FastAPI `app` |
| Search engines | `gpt_researcher/retrievers/` | Various retrievers |

---

## Architecture Overview

```
User Query → GPTResearcher.__init__()
                │
                ▼
         choose_agent() → (agent_type, role_prompt)
                │
                ▼
         ResearchConductor.conduct_research()
           ├── plan_research() → sub_queries
           ├── For each sub_query:
           │     └── _process_sub_query() → context
           └── Aggregate contexts
                │
                ▼
         [Optional] ImageGenerator.plan_and_generate_images()
                │
                ▼
         ReportGenerator.write_report() → Markdown report
```

**For detailed architecture diagrams**: See [references/architecture.md](references/architecture.md)
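
The flow above can be sketched as a small asyncio pipeline. This is a toy illustration only: `plan_research`, `process_sub_query`, `write_report`, and `run_pipeline` are stand-in functions written for this example, not the library's real implementations, and running the sub-queries concurrently with `asyncio.gather` is an assumption based on the "parallelized agent work" description.

```python
import asyncio


async def plan_research(query: str) -> list[str]:
    # Stand-in planner: the real planner asks an LLM for sub-queries.
    return [f"{query} overview", f"{query} recent developments"]


async def process_sub_query(sub_query: str) -> str:
    # Stand-in for search + scrape + compress; returns a context snippet.
    await asyncio.sleep(0)  # yield control, as real I/O would
    return f"context for: {sub_query}"


async def conduct_research(query: str) -> list[str]:
    sub_queries = await plan_research(query)
    # Run all sub-queries concurrently; gather preserves input order.
    return list(await asyncio.gather(*(process_sub_query(q) for q in sub_queries)))


def write_report(query: str, contexts: list[str]) -> str:
    # Stand-in writer: the real writer prompts an LLM with the contexts.
    body = "\n".join(f"- {c}" for c in contexts)
    return f"# {query}\n\n{body}"


async def run_pipeline(query: str) -> str:
    contexts = await conduct_research(query)
    return write_report(query, contexts)
```

Calling `asyncio.run(run_pipeline("AI"))` returns a Markdown string with one bullet per sub-query context.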

---

## Core Patterns

### Adding a New Feature (8-Step Pattern)

1. **Config** → Add to `gpt_researcher/config/variables/default.py`
2. **Provider** → Create in `gpt_researcher/llm_provider/my_feature/`
3. **Skill** → Create in `gpt_researcher/skills/my_feature.py`
4. **Agent** → Integrate in `gpt_researcher/agent.py`
5. **Prompts** → Update `gpt_researcher/prompts.py`
6. **WebSocket** → Events via `stream_output()`
7. **Frontend** → Handle events in `useWebSocket.ts`
8. **Docs** → Create `docs/docs/gpt-researcher/gptr/my_feature.md`

**For complete feature addition guide with Image Generation case study**: See [references/adding-features.md](references/adding-features.md)

### Adding a New Retriever

```python
# 1. Create: gpt_researcher/retrievers/my_retriever/my_retriever.py
class MyRetriever:
    def __init__(self, query: str, headers: dict | None = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        # Return: [{"title": str, "href": str, "body": str}]
        return []  # placeholder until the real search is implemented

# 2. Register in gpt_researcher/actions/retriever.py, inside the existing match statement:
#     case "my_retriever":
#         from gpt_researcher.retrievers.my_retriever import MyRetriever
#         return MyRetriever

# 3. Export in gpt_researcher/retrievers/__init__.py
```

**For complete retriever documentation**: See [references/retrievers.md](references/retrievers.md)
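
As a rough, self-contained illustration of the result shape described above, the sketch below searches a tiny in-memory corpus. `StaticRetriever`, its `CORPUS`, and the example URLs are hypothetical and exist only for this example; a real retriever would call a search API instead.

```python
import asyncio
from typing import Optional


class StaticRetriever:
    """Hypothetical retriever that searches a tiny in-memory corpus."""

    CORPUS = [
        {"title": "Intro to agents", "href": "https://example.com/agents", "body": "Agents plan and act."},
        {"title": "Vector stores", "href": "https://example.com/vectors", "body": "Embeddings enable similarity search."},
        {"title": "Agent memory", "href": "https://example.com/memory", "body": "Agents can store context."},
    ]

    def __init__(self, query: str, headers: Optional[dict] = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        # Case-insensitive substring match against title and body.
        needle = self.query.lower()
        hits = [
            doc for doc in self.CORPUS
            if needle in doc["title"].lower() or needle in doc["body"].lower()
        ]
        return hits[:max_results]


results = asyncio.run(StaticRetriever("agent").search(max_results=5))
```

Each entry in `results` carries exactly the `title`, `href`, and `body` keys, which is the shape the rest of the pipeline is described as expecting.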

---

## Configuration

Config keys are **lowercased** when accessed:

```python
# In default.py: "SMART_LLM": "gpt-4o"
# Access as: self.cfg.smart_llm  # lowercase!
```

Priority: Environment Variables → JSON Config File → Default Values
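
The precedence and the lowercasing can be pictured with a minimal sketch. `resolve_config`, `DEFAULTS`, and the `MAX_ITEMS` key are hypothetical names used only here; this is not the library's `Config` class, just one plausible way to get the behavior described above.

```python
import os
from types import SimpleNamespace
from typing import Any, Mapping, Optional

# Hypothetical defaults, standing in for DEFAULT_CONFIG.
DEFAULTS = {"SMART_LLM": "gpt-4o", "MAX_ITEMS": 3}


def resolve_config(
    file_config: Optional[Mapping[str, Any]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> SimpleNamespace:
    """Merge defaults < JSON file values < environment, exposing lowercase attributes."""
    file_config = file_config or {}
    env = os.environ if env is None else env
    resolved = {}
    for key, default in DEFAULTS.items():
        value = file_config.get(key, default)
        if key in env:
            # Environment values arrive as strings; coerce to int when the default is an int.
            value = int(env[key]) if isinstance(default, int) else env[key]
        resolved[key.lower()] = value
    return SimpleNamespace(**resolved)


cfg = resolve_config(
    file_config={"SMART_LLM": "file-model", "MAX_ITEMS": 5},
    env={"SMART_LLM": "env-model"},
)
```

Here `cfg.smart_llm` comes from the environment mapping, `cfg.max_items` comes from the file values, and there is no uppercase `cfg.SMART_LLM` attribute.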

**For complete configuration reference**: See [references/config-reference.md](references/config-reference.md)

---

## Common Integration Points

### WebSocket Streaming

```python
class WebSocketHandler:
    async def send_json(self, data):
        print(f"[{data['type']}] {data.get('output', '')}")

researcher = GPTResearcher(query="...", websocket=WebSocketHandler())
```
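
For tests it can be handy to collect messages rather than print them. The sketch below assumes, as in the snippet above, that a handler only needs an async `send_json` method. `CollectingHandler` and `emit` are hypothetical helpers for this example; `emit` is a stand-in for a streaming helper and is not the library's `stream_output`.

```python
import asyncio
from typing import Any, Optional


class CollectingHandler:
    """Hypothetical handler that records every message instead of printing it."""

    def __init__(self) -> None:
        self.messages: list[dict[str, Any]] = []

    async def send_json(self, data: dict[str, Any]) -> None:
        self.messages.append(data)


async def emit(websocket: Optional[Any], type_: str, content: str, output: str) -> None:
    # Stand-in streaming helper: skip sending when no websocket is attached.
    if websocket is not None:
        await websocket.send_json({"type": type_, "content": content, "output": output})


async def demo() -> list[dict[str, Any]]:
    handler = CollectingHandler()
    await emit(handler, "logs", "starting_research", "Starting research...")
    await emit(None, "logs", "ignored", "No websocket, so nothing is sent")
    return handler.messages


messages = asyncio.run(demo())
```

Only the first call reaches the handler; the second is dropped because no websocket is attached.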

### MCP Data Sources

```python
import os

from gpt_researcher import GPTResearcher

researcher = GPTResearcher(
    query="Open source AI projects",
    mcp_configs=[{
        "name": "github",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}
    }],
    mcp_strategy="deep",  # or "fast", "disabled"
)
```

**For MCP integration details**: See [references/mcp.md](references/mcp.md)

### Deep Research Mode

```python
researcher = GPTResearcher(
    query="Comprehensive analysis of quantum computing",
    report_type="deep",  # Triggers recursive tree-like exploration
)
```

**For deep research configuration**: See [references/deep-research.md](references/deep-research.md)
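
"Recursive tree-like exploration" can be pictured as breadth-limited recursion over follow-up queries. The sketch below is an assumption-laden toy: `expand`, `explore`, and the `breadth` and `depth` parameters are hypothetical names for this example and do not describe the library's actual deep research settings.

```python
import asyncio


async def expand(query: str, breadth: int) -> list[str]:
    # Stand-in for an LLM call that proposes follow-up queries.
    return [f"{query} / angle {i + 1}" for i in range(breadth)]


async def explore(query: str, breadth: int, depth: int) -> list[str]:
    """Return every query visited, in depth-first (pre-order) order."""
    visited = [query]
    if depth == 0:
        return visited
    children = await expand(query, breadth)
    # Explore sibling branches concurrently; gather keeps them in order.
    subtrees = await asyncio.gather(
        *(explore(child, breadth, depth - 1) for child in children)
    )
    for subtree in subtrees:
        visited.extend(subtree)
    return visited


visited = asyncio.run(explore("quantum computing", breadth=2, depth=2))
```

With `breadth=2` and `depth=2` the tree has 1 + 2 + 4 = 7 nodes, so the number of queries grows geometrically with depth.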

---

## Error Handling

Always use graceful degradation in skills:

```python
async def execute(self, ...):
    if not self.is_enabled():
        return []  # Don't crash
    
    try:
        result = await self.provider.execute(...)
        return result
    except Exception as e:
        await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
        return []  # Graceful degradation
```
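
A runnable miniature of the same idea is sketched below. `FlakyProvider` and `SafeSkill` are hypothetical classes for illustration, and the `errors` list stands in for streaming an error log to a websocket.

```python
import asyncio
from typing import Any


class FlakyProvider:
    """Hypothetical provider whose call always fails."""

    async def execute(self, payload: str) -> list[dict[str, Any]]:
        raise RuntimeError("upstream API unavailable")


class SafeSkill:
    """Hypothetical skill that degrades to an empty result on provider errors."""

    def __init__(self, provider: Any, enabled: bool = True):
        self.provider = provider
        self.enabled = enabled
        self.errors: list[str] = []  # stands in for streaming an error log

    async def execute(self, payload: str) -> list[dict[str, Any]]:
        if not self.enabled:
            return []  # a disabled feature is a no-op, not an error
        try:
            return await self.provider.execute(payload)
        except Exception as exc:
            self.errors.append(str(exc))
            return []  # the research run continues without this feature


skill = SafeSkill(FlakyProvider())
result = asyncio.run(skill.execute("some context"))
```

The caller always receives a list, so downstream code can iterate over the result without special-casing failures.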

---

## Critical Gotchas

| ❌ Mistake | ✅ Correct |
|-----------|-----------|
| `config.MY_VAR` | `config.my_var` (lowercased) |
| Editing pip-installed package | `pip install -e .` |
| Forgetting async/await | All research methods are async |
| `websocket.send_json()` on None | Check `if websocket:` first |
| Not registering retriever | Add to `retriever.py` match statement |

---

## Reference Documentation

| Topic | File |
|-------|------|
| System architecture & diagrams | [references/architecture.md](references/architecture.md) |
| Core components & signatures | [references/components.md](references/components.md) |
| Research flow & data flow | [references/flows.md](references/flows.md) |
| Prompt system | [references/prompts.md](references/prompts.md) |
| Retriever system | [references/retrievers.md](references/retrievers.md) |
| MCP integration | [references/mcp.md](references/mcp.md) |
| Deep research mode | [references/deep-research.md](references/deep-research.md) |
| Multi-agent system | [references/multi-agents.md](references/multi-agents.md) |
| Adding features guide | [references/adding-features.md](references/adding-features.md) |
| Advanced patterns | [references/advanced-patterns.md](references/advanced-patterns.md) |
| REST & WebSocket API | [references/api-reference.md](references/api-reference.md) |
| Configuration variables | [references/config-reference.md](references/config-reference.md) |


================================================
FILE: .claude/references/adding-features.md
================================================
# Adding Features Guide

## Table of Contents
- [The 8-Step Pattern](#the-8-step-pattern)
- [Image Generation Case Study](#image-generation-case-study)
- [Testing New Features](#testing-new-features)

---

## The 8-Step Pattern

```
┌──────────┐   ┌────────────┐   ┌────────────┐   ┌──────────┐
│ 1.CONFIG │ → │ 2.PROVIDER │ → │ 3.SKILL    │ → │ 4.AGENT  │
└──────────┘   └────────────┘   └────────────┘   └──────────┘
      ↓              ↓                ↓               ↓
┌──────────┐   ┌────────────┐   ┌────────────┐   ┌──────────┐
│ 5.PROMPTS│ → │ 6.WEBSOCKET│ → │ 7.FRONTEND │ → │ 8.DOCS   │
└──────────┘   └────────────┘   └────────────┘   └──────────┘
```

### Step 1: Add Configuration

**File:** `gpt_researcher/config/variables/default.py`

```python
DEFAULT_CONFIG: BaseConfig = {
    # ... existing ...
    "MY_FEATURE_ENABLED": False,
    "MY_FEATURE_MODEL": "model-name",
    "MY_FEATURE_MAX_ITEMS": 3,
}
```

**File:** `gpt_researcher/config/variables/base.py`

```python
class BaseConfig(TypedDict):
    # ... existing fields ...
    # Class-syntax TypedDict fields are bare names, not quoted strings
    MY_FEATURE_ENABLED: bool
    MY_FEATURE_MODEL: Union[str, None]
    MY_FEATURE_MAX_ITEMS: int
```

### Step 2: Create Provider

**File:** `gpt_researcher/llm_provider/my_feature/my_provider.py`

```python
import os
from typing import Any, Dict, Optional


class MyFeatureProvider:
    def __init__(self, api_key: Optional[str] = None, model: Optional[str] = None):
        self.api_key = api_key or os.getenv("MY_API_KEY")
        self.model = model
    
    def is_enabled(self) -> bool:
        return bool(self.api_key and self.model)
    
    async def execute(self, input_data: str) -> Dict[str, Any]:
        # API implementation
        pass
```

Export in `gpt_researcher/llm_provider/__init__.py`.

### Step 3: Create Skill

**File:** `gpt_researcher/skills/my_feature.py`

```python
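from typing import Dict, List

from ..actions.utils import stream_output
from ..llm_provider.my_feature.my_provider import MyFeatureProvider


class MyFeatureSkill: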
    def __init__(self, researcher):
        self.researcher = researcher
        self.config = researcher.cfg
        self.provider = MyFeatureProvider(...)
    
    def is_enabled(self) -> bool:
        return getattr(self.config, 'my_feature_enabled', False) and self.provider.is_enabled()
    
    async def execute(self, context: str, query: str) -> List[Dict]:
        if not self.is_enabled():
            return []
        
        await stream_output("logs", "my_feature_start", "🚀 Starting...", self.researcher.websocket)
        results = await self.provider.execute(context)
        await stream_output("logs", "my_feature_complete", "✅ Done", self.researcher.websocket)
        
        return results
```

Export in `gpt_researcher/skills/__init__.py`.

### Step 4: Integrate into Agent

**File:** `gpt_researcher/agent.py`

```python
def __init__(self, ...):
    if self.cfg.my_feature_enabled:
        from gpt_researcher.skills import MyFeatureSkill
        self.my_feature = MyFeatureSkill(self)
    else:
        self.my_feature = None
    self.my_feature_results = []

async def conduct_research(self, ...):
    # ... existing ...
    if self.my_feature and self.my_feature.is_enabled():
        self.my_feature_results = await self.my_feature.execute(self.context, self.query)
```

### Step 5: Update Prompts

**File:** `gpt_researcher/prompts.py`

```python
@staticmethod
def generate_my_feature_prompt(context: str, query: str) -> str:
    return f"""..."""
```

### Step 6: WebSocket Events

Already handled by the `stream_output()` calls in the skill (Step 3).

### Step 7: Frontend (if needed)

**File:** `frontend/nextjs/hooks/useWebSocket.ts`

```typescript
if (data.content === 'my_feature_start') {
    setStatus('processing');
}
```

### Step 8: Documentation

Create `docs/docs/gpt-researcher/gptr/my_feature.md`.

---

## Image Generation Case Study

This section shows the **actual implementation** of the Image Generation feature as a reference.

### 1. Configuration Added

**File:** `gpt_researcher/config/variables/default.py`

```python
DEFAULT_CONFIG: BaseConfig = {
    # ... existing ...
    "IMAGE_GENERATION_MODEL": "models/gemini-2.5-flash-image",
    "IMAGE_GENERATION_MAX_IMAGES": 3,
    "IMAGE_GENERATION_ENABLED": False,
    "IMAGE_GENERATION_STYLE": "dark",  # dark, light, auto
}
```

### 2. Provider Created

**File:** `gpt_researcher/llm_provider/image/image_generator.py`

```python
class ImageGeneratorProvider:
    def __init__(self, api_key: str = None, model: str = None):
        self.api_key = api_key or os.getenv("GOOGLE_API_KEY")
        self.model = model or "models/gemini-2.5-flash-image"
        self._client = None
    
    def is_enabled(self) -> bool:
        return bool(self.api_key and self.model)
    
    def _build_enhanced_prompt(self, prompt: str, context: str = "", style: str = "dark") -> str:
        """Add styling instructions to prompt."""
        if style == "dark":
            style_instructions = """
            Style: Dark mode professional infographic
            - Background: Dark (#0d1117)
            - Accents: Teal/cyan (#14b8a6)
            - Clean, modern, minimalist
            """
        # ... handle light, auto
        return f"{style_instructions}\n\nCreate: {prompt}\n\nContext: {context}"
    
    async def generate_image(
        self,
        prompt: str,
        context: str = "",
        research_id: str = "",
        style: str = "dark",
    ) -> List[Dict[str, Any]]:
        """Generate image using Gemini."""
        full_prompt = self._build_enhanced_prompt(prompt, context, style)
        
        # Call Gemini API
        response = await self._generate_with_gemini(full_prompt, output_path, ...)
        
        return [{"url": f"/outputs/images/{research_id}/img_{hash}.png", ...}]
```

### 3. Skill Created

**File:** `gpt_researcher/skills/image_generator.py`

```python
class ImageGenerator:
    def __init__(self, researcher):
        self.researcher = researcher
        self.config = researcher.cfg
        self.image_provider = ImageGeneratorProvider(
            api_key=os.getenv("GOOGLE_API_KEY"),
            model=getattr(self.config, 'image_generation_model', None),
        )
        self.max_images = getattr(self.config, 'image_generation_max_images', 3)
        self.style = getattr(self.config, 'image_generation_style', 'dark')
    
    def is_enabled(self) -> bool:
        enabled = getattr(self.config, 'image_generation_enabled', False)
        return enabled and self.image_provider.is_enabled()
    
    async def plan_and_generate_images(
        self,
        research_context: str,
        research_query: str,
        research_id: str,
        websocket: Any,
    ) -> List[Dict[str, Any]]:
        """
        1. Use LLM to identify visual concepts from context
        2. Generate one image per concept, up to max_images
        3. Return list of image metadata
        """
        # Stream progress
        await stream_output("logs", "image_planning", "🎨 Planning images...", websocket)
        
        # LLM identifies concepts
        concepts = await self._plan_image_concepts(research_context, research_query)
        
        # Generate images for the first max_images concepts, one at a time
        generated_images = []
        selected_concepts = concepts[:self.max_images]
        for i, concept in enumerate(selected_concepts):
            await stream_output("logs", "image_generating", 
                f"🖼️ Generating image {i+1}/{len(selected_concepts)}...", websocket)
            
            images = await self.image_provider.generate_image(
                prompt=concept["prompt"],
                context=concept.get("context", ""),
                research_id=research_id,
                style=self.style,
            )
            generated_images.extend(images)
        
        await stream_output("logs", "images_ready", 
            f"✅ Generated {len(generated_images)} images", websocket)
        
        return generated_images
```
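
The loop above awaits each image in turn. If the provider tolerates concurrent requests, the same work could run in parallel. The following is a minimal sketch of that idea, not the project's implementation: `FakeImageProvider`, `generate_images_concurrently`, and the `concurrency` parameter are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Any, Dict, List


class FakeImageProvider:
    """Hypothetical stand-in for the real provider; makes no API calls."""

    async def generate_image(self, prompt: str, **kwargs: Any) -> List[Dict[str, Any]]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        slug = prompt.replace(" ", "_")
        return [{"title": prompt, "url": f"/outputs/images/{slug}.png"}]


async def generate_images_concurrently(
    provider: Any,
    concepts: List[Dict[str, Any]],
    max_images: int = 3,
    concurrency: int = 2,
) -> List[Dict[str, Any]]:
    """Generate up to max_images images, at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def generate_one(concept: Dict[str, Any]) -> List[Dict[str, Any]]:
        async with semaphore:
            try:
                return await provider.generate_image(
                    prompt=concept["prompt"],
                    context=concept.get("context", ""),
                )
            except Exception:
                # One failed image should not discard the others
                return []

    batches = await asyncio.gather(*(generate_one(c) for c in concepts[:max_images]))
    # gather preserves input order, so images stay aligned with their concepts
    return [image for batch in batches for image in batch]
```

The semaphore bounds the number of in-flight requests, which matters when the image API enforces a rate limit.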

### 4. Agent Integration

**File:** `gpt_researcher/agent.py`

```python
class GPTResearcher:
    def __init__(self, ...):
        # ... existing init ...
        
        # Initialize image generator if enabled
        if self.cfg.image_generation_enabled:
            from gpt_researcher.skills import ImageGenerator
            self.image_generator = ImageGenerator(self)
        else:
            self.image_generator = None
        
        self.available_images: List[Dict[str, Any]] = []
        self.research_id = self._generate_research_id(query)
    
    async def conduct_research(self, on_progress=None):
        # ... existing research ...
        
        self.context = await self.research_conductor.conduct_research()
        
        # Pre-generate images after research, before report writing
        # (is_enabled() already checks the config flag and the provider)
        if self.image_generator and self.image_generator.is_enabled():
            self.available_images = await self.image_generator.plan_and_generate_images(
                research_context=self.context,
                research_query=self.query,
                research_id=self.research_id,
                websocket=self.websocket,
            )
        
        return self.context
    
    async def write_report(self, ...):
        report = await self.report_generator.write_report(
            # ... existing params ...
            available_images=self.available_images,  # Pass to report writer
        )
        return report
```

### 5. Prompt Updated

**File:** `gpt_researcher/prompts.py`

```python
@staticmethod
def generate_report_prompt(..., available_images: List[Dict[str, Any]] = []):
    image_instruction = ""
    if available_images:
        image_list = "\n".join([
            f"- Title: {img.get('title', 'Untitled')}\n  URL: {img['url']}"
            for img in available_images
        ])
        image_instruction = f"""
AVAILABLE IMAGES - Embed where relevant using ![Title](URL):
{image_list}
"""
    
    return f"""...(existing prompt)...
{image_instruction}
"""
```

---

## Testing New Features

```python
# tests/test_my_feature.py
import pytest
from gpt_researcher import GPTResearcher

@pytest.mark.asyncio
async def test_my_feature_disabled():
    """Test that feature is skipped when disabled."""
    researcher = GPTResearcher(query="test")
    # MY_FEATURE_ENABLED defaults to False
    assert researcher.my_feature is None

@pytest.mark.asyncio
async def test_my_feature_enabled(monkeypatch):
    """Test feature execution when enabled."""
    monkeypatch.setenv("MY_FEATURE_ENABLED", "true")
    monkeypatch.setenv("MY_API_KEY", "test-key")
    
    researcher = GPTResearcher(query="test")
    assert researcher.my_feature is not None
    assert researcher.my_feature.is_enabled()
```

### Running Tests

```bash
# All tests
python -m pytest tests/

# Specific test
python -m pytest tests/test_my_feature.py -v

# With coverage
python -m pytest tests/ --cov=gpt_researcher
```


================================================
FILE: .claude/references/advanced-patterns.md
================================================
# Advanced Patterns Reference

## Table of Contents
- [Custom Callbacks](#custom-callbacks)
- [Custom WebSocket Handler](#custom-websocket-handler)
- [LangChain Integration](#langchain-integration)
- [Search Restrictions](#search-restrictions)
- [Error Handling Patterns](#error-handling-patterns)

---

## Custom Callbacks

```python
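from gpt_researcher import GPTResearcher

def cost_callback(cost: float) -> None:
    print(f"API call cost: ${cost:.4f}")

researcher = GPTResearcher(query="...")
# Note: this replaces the built-in add_costs, so get_costs() may stop
# reflecting the running total unless the callback tracks it itself
researcher.add_costs = cost_callback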

---

## Custom WebSocket Handler

```python
class CustomWebSocket:
    def __init__(self):
        self.messages = []
    
    async def send_json(self, data):
        self.messages.append(data)
        # .get() avoids a KeyError on messages that omit a field
        if data.get('type') == 'logs':
            print(f"Progress: {data.get('output', '')}")

researcher = GPTResearcher(query="...", websocket=CustomWebSocket())
```
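
A handler like the one above only needs an async `send_json` method. As a hedged sketch, the variant below separates progress logs from streamed report chunks, assuming the `logs` and `report` message types listed in the API reference. `CollectingWebSocket` is a hypothetical name.

```python
from typing import Any, Dict, List


class CollectingWebSocket:
    """Duck-typed websocket: only send_json is implemented."""

    def __init__(self) -> None:
        self.logs: List[str] = []
        self.report_chunks: List[str] = []

    async def send_json(self, data: Dict[str, Any]) -> None:
        message_type = data.get("type")
        output = data.get("output", "")
        if message_type == "logs":
            self.logs.append(output)
        elif message_type == "report":
            # Report text arrives in chunks; keep them in arrival order
            self.report_chunks.append(output)
        # Other message types are ignored in this sketch

    @property
    def report(self) -> str:
        """Concatenate the chunks received so far."""
        return "".join(self.report_chunks)
```

After a run, `handler.report` holds the streamed text and `handler.logs` the progress lines, which is convenient in tests or batch jobs with no real socket.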

---

## LangChain Integration

### Using with LangChain Documents

```python
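# Document loaders live in langchain_community in current LangChain releases
from langchain_community.document_loaders import DirectoryLoader
from gpt_researcher import GPTResearcher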

loader = DirectoryLoader('./docs', glob="**/*.md")
documents = loader.load()

researcher = GPTResearcher(
    query="Summarize the documentation",
    report_source="langchain_documents",
    documents=documents,
)
```

### Using with Vector Store

```python
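from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# `documents` is a list of LangChain documents, e.g. from a DirectoryLoader.
# Any LangChain embeddings class works; OpenAI is used here as an example.
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings)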

researcher = GPTResearcher(
    query="Find relevant information",
    report_source="langchain_vectorstore",
    vector_store=vectorstore,
    vector_store_filter={"source": "docs"},
)
```

---

## Search Restrictions

### Restricting Search Domains

```python
researcher = GPTResearcher(
    query="Company news",
    query_domains=["reuters.com", "bloomberg.com", "wsj.com"],
)
```
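
How `query_domains` is applied inside each retriever is not shown in this reference. Purely as an illustration of domain restriction, the sketch below post-filters a list of URLs. `matches_domain` and `filter_urls` are hypothetical helpers, not part of the library.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def matches_domain(url: str, domains: Iterable[str]) -> bool:
    """True if the URL's host equals a domain or is a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    for domain in domains:
        domain = domain.lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def filter_urls(urls: Iterable[str], domains: Iterable[str]) -> List[str]:
    """Keep only URLs on the allowed domains; an empty list means no restriction."""
    allowed = list(domains)
    if not allowed:
        return list(urls)
    return [url for url in urls if matches_domain(url, allowed)]
```

Matching on the parsed hostname rather than a substring avoids accepting look-alike hosts such as `notreuters.com`.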

### Using Specific Source URLs

```python
researcher = GPTResearcher(
    query="Analyze these articles",
    source_urls=[
        "https://example.com/article1",
        "https://example.com/article2",
    ],
    complement_source_urls=True,  # Also do web search
)
```

---

## Error Handling Patterns

### Graceful Degradation

```python
# In skills, always check is_enabled()
async def execute(self, ...):
    if not self.is_enabled():
        logger.warning("Feature not enabled, skipping")
        return []  # Return empty, don't crash
    
    try:
        result = await self.provider.execute(...)
        return result
    except Exception as e:
        logger.error(f"Feature error: {e}")
        await stream_output("logs", "feature_error", f"⚠️ Error: {e}", self.websocket)
        return []  # Graceful degradation
```

### API Rate Limiting

```python
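# Providers should handle rate limits
# RateLimitError stands for whatever exception the provider SDK raises
async def execute(self, ...):
    try:
        return await self._call_api(...)
    except RateLimitError:
        logger.warning("Rate limited, waiting 60s before one retry...")
        await asyncio.sleep(60)
        return await self._call_api(...)  # Single retry; a second failure propagates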
```
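
The pattern above sleeps for a fixed interval and retries once. A common alternative is exponential backoff with a capped number of attempts. The sketch below is one way to write it, under stated assumptions: `RateLimitError` here is a local placeholder class, and `call_with_backoff` is a hypothetical helper rather than an existing function in the codebase.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Placeholder; real SDKs raise their own rate-limit exception."""


async def call_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller degrade gracefully
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter keeps concurrent tasks from retrying in lockstep
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError("max_attempts must be at least 1")
```

A caller would wrap the provider call in a zero-argument coroutine function, for example `await call_with_backoff(lambda: provider.fetch(query))`, where `provider.fetch` stands for any awaitable API call.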

### WebSocket None Check

```python
# Always check websocket before sending
if self.researcher.websocket:
    await stream_output("logs", "event", "message", self.researcher.websocket)
```


================================================
FILE: .claude/references/api-reference.md
================================================
# API Reference

## Table of Contents
- [REST API](#rest-api)
- [WebSocket API](#websocket-api)
- [Python Client](#python-client)
- [Output Files](#output-files)
- [Error Codes](#error-codes)

---

## REST API

Base URL: `http://localhost:8000`

### Generate Report

**POST `/report/`**

```json
{
    "task": "What are the latest AI developments?",
    "report_type": "research_report",
    "report_source": "web",
    "tone": "Objective",
    "source_urls": [],
    "query_domains": [],
    "generate_in_background": false
}
```

**Response:**

```json
{
    "report": "# Research Report\n\n...",
    "research_id": "task_1234567890_query",
    "costs": 0.05,
    "pdf_path": "outputs/task_123.pdf",
    "docx_path": "outputs/task_123.docx"
}
```
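
The request and response above can be exercised from Python with only the standard library. This is a minimal sketch assuming the server runs at the base URL given in this reference. `build_report_request` and `fetch_report` are hypothetical helper names, and the timeout value is an arbitrary choice.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # base URL from this reference


def build_report_request(task: str, report_type: str = "research_report") -> urllib.request.Request:
    """Build the POST /report/ request with the documented body fields."""
    payload: Dict[str, Any] = {
        "task": task,
        "report_type": report_type,
        "report_source": "web",
        "tone": "Objective",
        "source_urls": [],
        "query_domains": [],
        "generate_in_background": False,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/report/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_report(task: str, timeout: float = 600.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_report_request(task), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a server.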

### Chat with Report

**POST `/api/chat`**

```json
{
    "report": "The full report text...",
    "messages": [
        {"role": "user", "content": "What are the key findings?"}
    ]
}
```

### Report Management

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/reports` | List all reports |
| GET | `/api/reports/{id}` | Get single report |
| POST | `/api/reports` | Create/update report |
| PUT | `/api/reports/{id}` | Update report |
| DELETE | `/api/reports/{id}` | Delete report |

### File Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/upload/` | Upload document |
| DELETE | `/delete/{filename}` | Delete file |
| GET | `/outputs/{filename}` | Get output file |

### Configuration

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/getConfig` | Get current config |
| POST | `/setConfig` | Update config |

---

## WebSocket API

**Endpoint:** `ws://localhost:8000/ws`

### Send Research Request

```json
{
    "task": "Research query",
    "report_type": "research_report",
    "report_source": "web",
    "tone": "Objective",
    "source_urls": [],
    "mcp_enabled": false,
    "mcp_strategy": "fast",
    "mcp_configs": []
}
```

### Message Types (Server → Client)

| Type | Content | Description |
|------|---------|-------------|
| `logs` | `starting_research` | Research initiated |
| `logs` | `planning_research` | Generating sub-queries |
| `logs` | `running_subquery_research` | Researching sub-query |
| `logs` | `research_step_finalized` | Research complete |
| `logs` | `agent_generated` | Agent role selected |
| `logs` | `scraping_urls` | Scraping web pages |
| `logs` | `mcp_optimization` | MCP processing |
| `logs` | `image_planning` | Planning images |
| `logs` | `images_ready` | Images generated |
| `report` | - | Streaming report chunks |
| `report_complete` | - | Final complete report |
| `path` | `pdf`, `docx`, `md` | Output file paths |
| `error` | - | Error messages |
| `human_feedback` | `request` | Request user input |

### Message Format

```json
{
    "type": "logs",
    "content": "starting_research",
    "output": "🔍 Starting the research task...",
    "metadata": null
}
```
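
On the Python side, a message in this format can be parsed into a small typed record before dispatching on `type`. The sketch below assumes only the four fields shown above. `ServerMessage` and `parse_server_message` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ServerMessage:
    type: str
    content: Optional[str] = None
    output: Any = ""
    metadata: Optional[Any] = None


def parse_server_message(raw: str) -> ServerMessage:
    """Decode one JSON text frame into a ServerMessage."""
    data = json.loads(raw)
    if not isinstance(data, dict) or "type" not in data:
        raise ValueError("expected a JSON object with a 'type' field")
    return ServerMessage(
        type=data["type"],
        content=data.get("content"),
        output=data.get("output", ""),
        metadata=data.get("metadata"),
    )
```

Rejecting frames without a `type` early keeps malformed input out of the dispatch logic.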

### Frontend Handler Example

```typescript
ws.onmessage = (event) => {
    const data = JSON.parse(event.data);
    
    switch (data.type) {
        case 'logs':
            setLogs(prev => [...prev, data]);
            break;
        case 'report':
            setAnswer(prev => prev + data.output);
            break;
        case 'report_complete':
            setAnswer(data.output);
            break;
        case 'path':
            setPaths(prev => ({...prev, [data.content]: data.output}));
            break;
        case 'error':
            setError(data.output);
            break;
    }
};
```

---

## Python Client

### Basic Usage

```python
from gpt_researcher import GPTResearcher
import asyncio

async def main():
    researcher = GPTResearcher(
        query="What are the latest AI developments?",
        report_type="research_report",
    )
    
    await researcher.conduct_research()
    report = await researcher.write_report()
    
    print(f"Report: {report}")
    print(f"Costs: ${researcher.get_costs()}")

asyncio.run(main())
```

### With MCP

```python
import os
from gpt_researcher import GPTResearcher

researcher = GPTResearcher(
    query="Research topic",
    mcp_configs=[{
        "name": "github",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}
    }],
    mcp_strategy="deep",
)
```

### With WebSocket Streaming

```python
class MockWebSocket:
    async def send_json(self, data):
        print(f"[{data['type']}] {data.get('output', '')}")

researcher = GPTResearcher(
    query="Research topic",
    websocket=MockWebSocket(),
)
```

### GPTResearcher Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | str | required | Research question |
| `report_type` | str | `research_report` | Type of report |
| `report_source` | str | `web` | Data source |
| `tone` | Tone | `Objective` | Writing tone |
| `source_urls` | list | `None` | Specific URLs to research |
| `document_urls` | list | `None` | Document URLs |
| `query_domains` | list | `None` | Restrict to domains |
| `config_path` | str | `None` | Path to JSON config |
| `websocket` | WebSocket | `None` | For streaming |
| `mcp_configs` | list | `None` | MCP server configs |
| `mcp_strategy` | str | `fast` | MCP strategy |
| `verbose` | bool | `True` | Verbose output |

---

## Output Files

```
outputs/
├── task_{timestamp}_{query}.md
├── task_{timestamp}_{query}.pdf
├── task_{timestamp}_{query}.docx
└── images/
    └── {research_id}/
        └── img_{hash}_{index}.png
```

---

## Error Codes

| Code | Description |
|------|-------------|
| 400 | Bad Request - Invalid parameters |
| 404 | Not Found - Report not found |
| 429 | Rate Limited - API quota exceeded |
| 500 | Internal Server Error |


================================================
FILE: .claude/references/architecture.md
================================================
# Architecture Reference

## Table of Contents
- [System Layers](#system-layers)
- [Key File Locations](#key-file-locations)

---

## System Layers

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              USER REQUEST                                    │
│              (query, report_type, report_source, tone, mcp_configs)         │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         BACKEND API LAYER                                    │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐          │
│  │  FastAPI Server  │  │ WebSocket Manager│  │  Report Store    │          │
│  │  backend/server/ │  │ Real-time events │  │  JSON persistence│          │
│  │  app.py          │  │ websocket_mgr.py │  │  report_store.py │          │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘          │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                    GPTResearcher (gpt_researcher/agent.py)                   │
│                                                                              │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                         SKILLS LAYER                                   │  │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐         │  │
│  │  │ ResearchConductor│ │ ReportGenerator │ │ ContextManager  │         │  │
│  │  │ Plan & gather   │ │ Write reports   │ │ Similarity search│         │  │
│  │  │ researcher.py   │ │ writer.py       │ │ context_manager │         │  │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘         │  │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐         │  │
│  │  │ BrowserManager  │ │ SourceCurator   │ │ ImageGenerator  │         │  │
│  │  │ Web scraping    │ │ Rank sources    │ │ Gemini images   │         │  │
│  │  │ browser.py      │ │ curator.py      │ │ image_generator │         │  │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘         │  │
│  │  ┌─────────────────┐                                                  │  │
│  │  │ DeepResearchSkill│                                                 │  │
│  │  │ Recursive depth │                                                  │  │
│  │  │ deep_research.py│                                                  │  │
│  │  └─────────────────┘                                                  │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                        ACTIONS LAYER                                   │  │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐         │  │
│  │  │ report_generation│ │ query_processing│ │ web_scraping    │         │  │
│  │  │ LLM report write│ │ Sub-query plan  │ │ URL scraping    │         │  │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘         │  │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐         │  │
│  │  │ retriever.py    │ │ agent_creator   │ │ markdown_process│         │  │
│  │  │ Get retrievers  │ │ Choose agent    │ │ Parse markdown  │         │  │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘         │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                       PROVIDERS LAYER                                  │  │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐         │  │
│  │  │ LLM Provider    │ │ Retrievers      │ │ Scrapers        │         │  │
│  │  │ OpenAI,Anthropic│ │ Tavily,Google   │ │ BS4,Playwright  │         │  │
│  │  │ Google,Groq...  │ │ Bing,MCP...     │ │ PDF,DOCX...     │         │  │
│  │  │ llm_provider/   │ │ retrievers/     │ │ scraper/        │         │  │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘         │  │
│  │  ┌─────────────────┐                                                  │  │
│  │  │ ImageGenerator  │                                                  │  │
│  │  │ Gemini/Imagen   │                                                  │  │
│  │  │ llm_provider/   │                                                  │  │
│  │  │ image/          │                                                  │  │
│  │  └─────────────────┘                                                  │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                        CONFIGURATION LAYER                                   │
│                     gpt_researcher/config/                                   │
│                                                                              │
│     Environment Variables  →  JSON Config File  →  Default Values            │
│           (highest)              (medium)            (lowest)                │
│                                                                              │
│     config.py loads and merges all sources                                   │
│     variables/default.py contains all defaults                               │
│     variables/base.py defines TypedDict for type safety                      │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Key File Locations

| Need | Primary File | Key Classes/Functions |
|------|--------------|----------------------|
| Main orchestrator | `gpt_researcher/agent.py` | `GPTResearcher` |
| Research logic | `gpt_researcher/skills/researcher.py` | `ResearchConductor` |
| Report writing | `gpt_researcher/skills/writer.py` | `ReportGenerator` |
| Context/embeddings | `gpt_researcher/skills/context_manager.py` | `ContextManager` |
| Source ranking | `gpt_researcher/skills/curator.py` | `SourceCurator` |
| Deep research | `gpt_researcher/skills/deep_research.py` | `DeepResearchSkill` |
| Image generation | `gpt_researcher/skills/image_generator.py` | `ImageGenerator` |
| All prompts | `gpt_researcher/prompts.py` | `PromptFamily` |
| Configuration | `gpt_researcher/config/config.py` | `Config` |
| Config defaults | `gpt_researcher/config/variables/default.py` | `DEFAULT_CONFIG` |
| Config types | `gpt_researcher/config/variables/base.py` | `BaseConfig` |
| API server | `backend/server/app.py` | FastAPI `app` |
| WebSocket mgmt | `backend/server/websocket_manager.py` | `WebSocketManager`, `run_agent` |
| Report types | `backend/report_type/` | `BasicReport`, `DetailedReport` |
| Search engines | `gpt_researcher/retrievers/` | `TavilySearch`, `GoogleSearch`, etc. |
| Web scraping | `gpt_researcher/scraper/` | Various scrapers |
| Enums | `gpt_researcher/utils/enum.py` | `ReportType`, `ReportSource`, `Tone` |


================================================
FILE: .claude/references/components.md
================================================
# Core Components & Method Signatures

## Table of Contents
- [GPTResearcher](#gptresearcher)
- [ResearchConductor](#researchconductor)
- [ReportGenerator](#reportgenerator)

---

## GPTResearcher

**File:** `gpt_researcher/agent.py`

The main orchestrator class. Full initialization signature:

```python
class GPTResearcher:
    def __init__(
        self,
        query: str,                              # Research question (required)
        report_type: str = "research_report",    # research_report, detailed_report, deep, outline_report, resource_report
        report_format: str = "markdown",         # Output format
        report_source: str = "web",              # web, local, hybrid, azure, langchain_documents, langchain_vectorstore
        tone: Tone = Tone.Objective,             # Writing tone (see Tone enum)
        source_urls: list[str] | None = None,    # Specific URLs to research
        document_urls: list[str] | None = None,  # Document URLs to include
        complement_source_urls: bool = False,    # Add web search to source_urls
        query_domains: list[str] | None = None,  # Restrict search to domains
        documents=None,                          # LangChain document objects
        vector_store=None,                       # LangChain vector store
        vector_store_filter=None,                # Filter for vector store
        config_path=None,                        # Path to JSON config file
        websocket=None,                          # WebSocket for streaming
        agent=None,                              # Pre-defined agent type
        role=None,                               # Pre-defined agent role
        parent_query: str = "",                  # Parent query for subtopics
        subtopics: list | None = None,           # Subtopics to research
        visited_urls: set | None = None,         # Already visited URLs
        verbose: bool = True,                    # Verbose logging
        context=None,                            # Pre-loaded context
        headers: dict | None = None,             # HTTP headers
        max_subtopics: int = 5,                  # Max subtopics for detailed
        log_handler=None,                        # Custom log handler
        prompt_family: str | None = None,        # Custom prompt family
        mcp_configs: list[dict] | None = None,   # MCP server configurations
        mcp_max_iterations: int | None = None,   # Deprecated, use mcp_strategy
        mcp_strategy: str | None = None,         # fast, deep, disabled
        **kwargs
    ):
```

### Key Methods

```python
async def conduct_research(self, on_progress=None) -> str:
    """
    Main research orchestration.
    
    1. Selects agent role via LLM (choose_agent)
    2. Delegates to ResearchConductor
    3. Optionally generates images if enabled
    
    Returns: Accumulated research context as string
    """

async def write_report(
    self, 
    existing_headers: list = [],           # Headers to avoid duplication
    relevant_written_contents: list = [],  # Previous content for context
    ext_context=None,                      # External context override
    custom_prompt=""                       # Custom prompt override
) -> str:
    """
    Generate final report from context.
    
    Returns: Markdown report string
    """

def get_costs(self) -> float:
    """Returns total accumulated API costs."""

def add_costs(self, cost: float) -> None:
    """Add to running cost total (used as callback)."""
```

---

## ResearchConductor

**File:** `gpt_researcher/skills/researcher.py`

Manages the research process:

```python
class ResearchConductor:
    def __init__(self, researcher: GPTResearcher):
        self.researcher = researcher
        self.logger = logging.getLogger(__name__)

    async def plan_research(self, query: str, query_domains=None) -> list:
        """
        Generate sub-queries from main query using LLM.
        
        1. Gets initial search results
        2. Calls plan_research_outline() to generate sub-queries
        
        Returns: List of sub-query strings
        """

    async def conduct_research(self) -> str:
        """
        Main research execution based on report_source.
        
        Handles: web, local, hybrid, azure, langchain_documents, langchain_vectorstore
        
        For each source type:
        1. Load/search data
        2. Process sub-queries
        3. Combine context
        4. Optionally curate sources
        
        Returns: Combined research context string
        """

    async def _process_sub_query(
        self, 
        sub_query: str, 
        scraped_data: list = [], 
        query_domains: list = []
    ) -> str:
        """
        Process a single sub-query.
        
        1. Get MCP context (if configured, based on strategy)
        2. Scrape URLs from search results
        3. Get similar content via embeddings
        4. Combine MCP + web context
        
        Returns: Combined context for this sub-query
        """

    async def _get_context_by_web_search(
        self, 
        query: str, 
        scraped_data: list = [], 
        query_domains: list = []
    ) -> str:
        """Web-based research with sub-query planning."""

    async def _scrape_data_by_urls(
        self, 
        sub_query: str, 
        query_domains: list = []
    ) -> list:
        """Search and scrape URLs for a sub-query."""
```

---

## ReportGenerator

**File:** `gpt_researcher/skills/writer.py`

```python
class ReportGenerator:
    def __init__(self, researcher: GPTResearcher):
        self.researcher = researcher
        self.research_params = {
            "query": researcher.query,
            "agent_role_prompt": researcher.cfg.agent_role or researcher.role,
            "report_type": researcher.report_type,
            "report_source": researcher.report_source,
            "tone": researcher.tone,
            "websocket": researcher.websocket,
            "cfg": researcher.cfg,
            "headers": researcher.headers,
        }

    async def write_report(
        self,
        existing_headers: list = [],
        relevant_written_contents: list = [],
        ext_context=None,
        custom_prompt="",
        available_images: list = [],  # Pre-generated images to embed
    ) -> str:
        """
        Generate report using LLM.
        
        Calls generate_report() action with context and images.
        
        Returns: Markdown report
        """

    async def write_introduction(self, ...) -> str:
        """Write report introduction section."""

    async def write_conclusion(self, ...) -> str:
        """Write report conclusion with references."""
```


================================================
FILE: .claude/references/config-reference.md
================================================
# Configuration Reference

## Table of Contents
- [Required Variables](#required-variables)
- [LLM Configuration](#llm-configuration)
- [Provider API Keys](#provider-api-keys)
- [Retriever Configuration](#retriever-configuration)
- [Report Configuration](#report-configuration)
- [Feature Toggles](#feature-toggles)
- [Configuration Priority](#configuration-priority)
- [Example .env](#example-env)

---

## Required Variables

```bash
OPENAI_API_KEY=sk-...          # Or another LLM provider key
TAVILY_API_KEY=tvly-...        # Or another retriever key
```

---

## LLM Configuration

```bash
LLM_PROVIDER=openai            # openai, anthropic, google, groq, together, etc.
FAST_LLM=gpt-4o-mini           # Quick tasks (summarization)
SMART_LLM=gpt-4o               # Complex reasoning (report writing)
STRATEGIC_LLM=o3-mini          # Planning (agent selection)
TEMPERATURE=0.4                # 0.0-1.0
MAX_TOKENS=4000
REASONING_EFFORT=medium        # For o-series: low, medium, high
```

---

## Provider API Keys

```bash
# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Google
GOOGLE_API_KEY=AIza...

# Groq
GROQ_API_KEY=gsk_...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
```

---

## Retriever Configuration

```bash
RETRIEVER=tavily               # Single or comma-separated: tavily,google,mcp
MAX_SEARCH_RESULTS_PER_QUERY=5
MAX_URLS_TO_SCRAPE=10
SIMILARITY_THRESHOLD=0.42
```
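
`RETRIEVER` accepts either a single name or a comma-separated list. How the library parses it is not shown here. The sketch below is one plausible reading: split, trim, de-duplicate, and fall back to a default. `parse_retrievers` is a hypothetical helper.

```python
from typing import List


def parse_retrievers(value: str, default: str = "tavily") -> List[str]:
    """Split a RETRIEVER value such as 'tavily, google,mcp' into clean names."""
    names = [name.strip().lower() for name in value.split(",") if name.strip()]
    # dict.fromkeys de-duplicates while preserving first-seen order
    unique = list(dict.fromkeys(names))
    return unique or [default]
```

For example, `parse_retrievers("tavily, google,mcp")` yields three names in the order given, and an empty value falls back to the default retriever.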

### Retriever API Keys

```bash
TAVILY_API_KEY=tvly-...
GOOGLE_API_KEY=AIza...
GOOGLE_CX_KEY=...
BING_API_KEY=...
SERPER_API_KEY=...
SERPAPI_API_KEY=...
EXA_API_KEY=...
```

---

## Report Configuration

```bash
REPORT_FORMAT=apa              # apa, mla, chicago, harvard, ieee
TOTAL_WORDS=1000
LANGUAGE=english
CURATE_SOURCES=true
```

---

## Feature Toggles

### Image Generation

```bash
IMAGE_GENERATION_ENABLED=true
GOOGLE_API_KEY=AIza...
IMAGE_GENERATION_MODEL=models/gemini-2.5-flash-image
IMAGE_GENERATION_MAX_IMAGES=3
IMAGE_GENERATION_STYLE=dark    # dark, light, auto
```

### Deep Research

```bash
DEEP_RESEARCH_BREADTH=4        # Subtopics per level
DEEP_RESEARCH_DEPTH=2          # Recursion levels
DEEP_RESEARCH_CONCURRENCY=2    # Parallel tasks
```

### MCP

```bash
MCP_STRATEGY=fast              # fast, deep, disabled
```

### Local Documents

```bash
DOC_PATH=./my-docs
# Supports: PDF, DOCX, TXT, CSV, XLSX, PPTX, MD
```

### Server

```bash
HOST=0.0.0.0
PORT=8000
VERBOSE=true
```

---

## Configuration Priority

```
Environment Variables (highest)
        ↓
JSON Config File (if provided)
        ↓
Default Values (lowest)
```

**Important:** Config keys are lowercased when accessed:

```python
# In default.py: "SMART_LLM": "gpt-4o"
# Access as: self.cfg.smart_llm  # lowercase!
```
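
The priority order and lowercase access described above can be sketched as a three-layer merge. This is an illustration under assumptions, not the actual `Config` class: `DEFAULTS` is a small made-up subset, `merge_config` is a hypothetical function, and coercing environment strings by the default's type is one possible design.

```python
from types import SimpleNamespace
from typing import Any, Mapping

# Illustrative subset only; the real defaults live in variables/default.py
DEFAULTS = {"SMART_LLM": "gpt-4o", "TOTAL_WORDS": 1000, "CURATE_SOURCES": True}


def _coerce(raw: str, default: Any) -> Any:
    """Convert an environment string to the type of the default value."""
    if isinstance(default, bool):  # check bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes"}
    if isinstance(default, int):
        return int(raw)
    if isinstance(default, float):
        return float(raw)
    return raw


def merge_config(
    defaults: Mapping[str, Any],
    file_values: Mapping[str, Any],
    environ: Mapping[str, str],
) -> SimpleNamespace:
    """Environment beats JSON file values, which beat defaults."""
    merged = {}
    for key, default in defaults.items():
        value = file_values.get(key, default)
        if key in environ:
            value = _coerce(environ[key], default)
        merged[key.lower()] = value  # attributes are exposed in lowercase
    return SimpleNamespace(**merged)
```

In practice the last argument would be `os.environ` and the second the parsed JSON config file. Passing plain mappings keeps the sketch easy to test.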

---

## Example .env

```bash
# Required
OPENAI_API_KEY=sk-your-key
TAVILY_API_KEY=tvly-your-key

# LLM
FAST_LLM=gpt-4o-mini
SMART_LLM=gpt-4o

# Report
TOTAL_WORDS=1000
LANGUAGE=english

# Optional: Images
IMAGE_GENERATION_ENABLED=true
GOOGLE_API_KEY=AIza-your-key
IMAGE_GENERATION_STYLE=dark
```


================================================
FILE: .claude/references/deep-research.md
================================================
# Deep Research Mode Reference

## Table of Contents
- [Overview](#overview)
- [Configuration](#configuration)
- [DeepResearchSkill](#deepresearchskill)
- [Usage](#usage)

---

## Overview

Deep Research uses recursive tree-like exploration with configurable depth and breadth.

---

## Configuration

```bash
DEEP_RESEARCH_BREADTH=4    # Subtopics per level
DEEP_RESEARCH_DEPTH=2      # Recursion levels
DEEP_RESEARCH_CONCURRENCY=2  # Parallel tasks
```

---

## DeepResearchSkill

**File:** `gpt_researcher/skills/deep_research.py`

```python
class DeepResearchSkill:
    def __init__(self, researcher):
        self.researcher = researcher
        self.breadth = getattr(researcher.cfg, 'deep_research_breadth', 4)
        self.depth = getattr(researcher.cfg, 'deep_research_depth', 2)
        self.concurrency_limit = getattr(researcher.cfg, 'deep_research_concurrency', 2)
        self.learnings = []
        self.research_sources = []
        self.context = []

    async def deep_research(self, query: str, on_progress=None) -> str:
        """
        Recursive research with depth and breadth.
        
        1. Research main topic
        2. Generate subtopics (breadth)
        3. For each subtopic, recursively research (depth)
        4. Aggregate all findings
        5. Generate comprehensive report
        """
```
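
The recursion described in the docstring can be sketched independently of the library. The version below is a simplified illustration under assumptions: `research` and `plan_subtopics` are caller-supplied coroutines standing in for the real LLM and search calls, the breadth stays constant at every level, and `deep_research_sketch` is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable, List

Research = Callable[[str], Awaitable[str]]
Planner = Callable[[str, int], Awaitable[List[str]]]


async def deep_research_sketch(
    query: str,
    breadth: int,
    depth: int,
    concurrency: int,
    research: Research,
    plan_subtopics: Planner,
) -> List[str]:
    """Research a query, then recurse into `breadth` subtopics for `depth` levels."""
    semaphore = asyncio.Semaphore(concurrency)

    async def explore(topic: str, remaining_depth: int) -> List[str]:
        # Hold the semaphore only for the research call itself, so parent
        # tasks never block their own children
        async with semaphore:
            learnings = [await research(topic)]
        if remaining_depth == 0:
            return learnings
        subtopics = (await plan_subtopics(topic, breadth))[:breadth]
        branches = await asyncio.gather(
            *(explore(sub, remaining_depth - 1) for sub in subtopics)
        )
        for branch in branches:
            learnings.extend(branch)
        return learnings

    return await explore(query, depth)
```

Releasing the semaphore before recursing is the key design choice: holding it across the child calls could deadlock once every permit is owned by a waiting parent.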

---

## Usage

```python
import asyncio
from gpt_researcher import GPTResearcher

async def main():
    researcher = GPTResearcher(
        query="Comprehensive analysis of quantum computing",
        report_type="deep",  # Triggers deep research
    )
    await researcher.conduct_research()
    return await researcher.write_report()

report = asyncio.run(main())
```

### Research Tree Structure

```
Query: "Quantum Computing"
├── Subtopic 1: Hardware (depth 1)
│   ├── Subtopic 1.1: Superconducting qubits (depth 2)
│   └── Subtopic 1.2: Ion traps (depth 2)
├── Subtopic 2: Algorithms (depth 1)
│   ├── Subtopic 2.1: Shor's algorithm (depth 2)
│   └── Subtopic 2.2: Grover's algorithm (depth 2)
├── Subtopic 3: Applications (depth 1)
│   └── ...
└── Subtopic 4: Challenges (depth 1)
    └── ...
```

With `DEEP_RESEARCH_BREADTH=4` and `DEEP_RESEARCH_DEPTH=2`, this explores 4 subtopics at each level, going 2 levels deep.
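To get a feel for the cost, the hypothetical helper below counts the subtopics in such a tree, assuming every node expands into the same number of subtopics. That uniformity is an assumption: the implementation may prune or narrow deeper levels, so treat the result as an upper bound.

```python
def count_subtopics(breadth: int, depth: int) -> int:
    """Subtopics in a uniform tree: breadth + breadth**2 + ... + breadth**depth."""
    return sum(breadth ** level for level in range(1, depth + 1))


print(count_subtopics(4, 2))  # 4 subtopics at depth 1 + 16 at depth 2 = 20
```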


================================================
FILE: .claude/references/flows.md
================================================
# Research Flow & Data Flow

## Table of Contents
- [End-to-End Research Flow](#end-to-end-research-flow)
- [Data Flow Between Components](#data-flow-between-components)

---

## End-to-End Research Flow

### 1. Request Entry

**File:** `backend/server/app.py`

```python
# REST API endpoint
@app.post("/report/")
async def generate_report(research_request: ResearchRequest, background_tasks: BackgroundTasks):
    research_id = sanitize_filename(f"task_{int(time.time())}_{research_request.task}")
    # Calls write_report() which uses run_agent()

# WebSocket endpoint
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await manager.connect(websocket)
    await handle_websocket_communication(websocket, manager)
```

### 2. Agent Runner

**File:** `backend/server/websocket_manager.py`

```python
async def run_agent(task, report_type, report_source, source_urls, ...):
    """Main entry point for research execution."""
    # Create logs handler
    logs_handler = CustomLogsHandler(websocket, task)
    
    # Configure MCP if enabled
    if mcp_enabled and mcp_configs:
        os.environ["RETRIEVER"] = f"{current_retriever},mcp"
        os.environ["MCP_STRATEGY"] = mcp_strategy
    
    # Route based on report type
    if report_type == "multi_agents":
        report = await run_research_task(query=task, websocket=logs_handler, ...)
    elif report_type == ReportType.DetailedReport.value:
        researcher = DetailedReport(query=task, ...)
        report = await researcher.run()
    else:
        researcher = BasicReport(query=task, ...)
        report = await researcher.run()
    
    return report
```

### 3. Research Phase

**File:** `gpt_researcher/agent.py`

```python
async def conduct_research(self, on_progress=None):
    # Handle deep research separately
    if self.report_type == ReportType.DeepResearch.value and self.deep_researcher:
        return await self._handle_deep_research(on_progress)
    
    # Choose agent role via LLM
    if not (self.agent and self.role):
        self.agent, self.role = await choose_agent(
            query=self.query,
            cfg=self.cfg,
            parent_query=self.parent_query,
            cost_callback=self.add_costs,
            headers=self.headers,
            prompt_family=self.prompt_family,
        )
    
    # Conduct research
    self.context = await self.research_conductor.conduct_research()
    
    # Generate images if enabled (pre-generation for seamless UX)
    if self.cfg.image_generation_enabled and self.image_generator:
        self.available_images = await self.image_generator.plan_and_generate_images(
            research_context=self.context,
            research_query=self.query,
            research_id=self.research_id,
            websocket=self.websocket,
        )
    
    return self.context
```

### 4. Sub-Query Processing

**File:** `gpt_researcher/skills/researcher.py`

```python
async def _process_sub_query(self, sub_query: str, scraped_data: list = [], query_domains: list = []):
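    # Defaults so the final combine step works even when a branch is skipped
    mcp_context, web_context = [], ""

    # MCP Strategy handling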
    mcp_retrievers = [r for r in self.researcher.retrievers if "mcpretriever" in r.__name__.lower()]
    mcp_strategy = self._get_mcp_strategy()
    
    if mcp_retrievers:
        if mcp_strategy == "fast" and self._mcp_results_cache is not None:
            # Reuse cached MCP results
            mcp_context = self._mcp_results_cache.copy()
        elif mcp_strategy == "deep":
            # Run MCP for every sub-query
            mcp_context = await self._execute_mcp_research_for_queries([sub_query], mcp_retrievers)
    
    # Get web search context
    if not scraped_data:
        scraped_data = await self._scrape_data_by_urls(sub_query, query_domains)
    
    # Get similar content via embeddings
    if scraped_data:
        web_context = await self.researcher.context_manager.get_similar_content_by_query(
            sub_query, scraped_data
        )
    
    # Combine MCP + web context
    combined_context = self._combine_mcp_and_web_context(mcp_context, web_context, sub_query)
    return combined_context
```

### 5. Report Generation

**File:** `gpt_researcher/actions/report_generation.py`

```python
async def generate_report(
    query: str,
    context: str,
    agent_role_prompt: str,
    report_type: str,
    report_source: str,
    websocket=None,
    cfg=None,
    tone=None,
    headers=None,
    cost_callback=None,
    prompt_family=None,
    available_images: list = [],
    **kwargs
) -> str:
    """Generate report using LLM."""
    # Get prompt generator
    generate_prompt = prompt_family.get_prompt_by_report_type(report_type)
    
    # Build prompt with context and available images
    content = generate_prompt(
        query, context, report_source,
        report_format=cfg.report_format,
        tone=tone,
        total_words=cfg.total_words,
        language=cfg.language,
        available_images=available_images,
    )
    
    # Call LLM
    report = await create_chat_completion(
        model=cfg.smart_llm,
        messages=[{"role": "user", "content": content}],
        temperature=cfg.temperature,
        llm_provider=cfg.smart_llm_provider,
        max_tokens=cfg.smart_token_limit,
        llm_kwargs=cfg.llm_kwargs,
        cost_callback=cost_callback,
    )
    
    return report
```

---

## Data Flow Between Components

```
User Query
    │
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ GPTResearcher.__init__()                                         │
│   • Loads Config (env → json → defaults)                        │
│   • Initializes skills: ResearchConductor, ReportGenerator, etc │
│   • Initializes retrievers based on RETRIEVER env var           │
│   • Initializes ImageGenerator if IMAGE_GENERATION_ENABLED      │
└─────────────────────────────────────────────────────────────────┘
    │
    │  researcher.conduct_research()
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ choose_agent()                                                   │
│   Input: query, config                                          │
│   Output: (agent_type: str, role_prompt: str)                   │
│   • LLM selects best agent role for the query                   │
└─────────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ ResearchConductor.conduct_research()                             │
│   Input: self.researcher (has query, config, retrievers)        │
│   Output: context: str                                          │
│                                                                  │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │ plan_research()                                          │   │
│   │   Input: query                                           │   │
│   │   Output: sub_queries: list[str]                         │   │
│   │   • Calls LLM to generate 3-5 sub-queries                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                          │                                       │
│                          ▼                                       │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │ For each sub_query:                                      │   │
│   │   _process_sub_query()                                   │   │
│   │     Input: sub_query                                     │   │
│   │     Output: sub_context: str                             │   │
│   │                                                          │   │
│   │     1. MCP retrieval (if configured)                     │   │
│   │        → mcp_context: list[dict]                         │   │
│   │                                                          │   │
│   │     2. Web search via retrievers                         │   │
│   │        → search_results: list[dict]                      │   │
│   │                                                          │   │
│   │     3. Scrape URLs                                       │   │
│   │        → scraped_content: list[dict]                     │   │
│   │                                                          │   │
│   │     4. Similarity search via embeddings                  │   │
│   │        → relevant_context: str                           │   │
│   │                                                          │   │
│   │     5. Combine MCP + web context                         │   │
│   │        → combined_context: str                           │   │
│   └─────────────────────────────────────────────────────────┘   │
│                          │                                       │
│                          ▼                                       │
│   Aggregate all sub_contexts → final context: str               │
└─────────────────────────────────────────────────────────────────┘
    │
    │  If IMAGE_GENERATION_ENABLED:
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ ImageGenerator.plan_and_generate_images()                        │
│   Input: context, query, research_id                            │
│   Output: available_images: list[dict]                          │
│     [{"url": "/outputs/images/.../img.png",                     │
│       "title": "...", "description": "..."}]                    │
│                                                                  │
│   1. LLM analyzes context for visual concepts                   │
│   2. Generates 2-3 images in parallel via Gemini                │
│   3. Saves to outputs/images/{research_id}/                     │
└─────────────────────────────────────────────────────────────────┘
    │
    │  researcher.write_report()
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ ReportGenerator.write_report()                                   │
│   Input: context, available_images                              │
│   Output: report: str (markdown)                                │
│                                                                  │
│   → generate_report() action                                    │
│       • Builds prompt with context + image list                 │
│       • LLM generates report with embedded images               │
└─────────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────────┐
│ Output                                                           │
│   • Streamed via WebSocket (type: "report")                     │
│   • Final via WebSocket (type: "report_complete")               │
│   • Exported to PDF, DOCX, Markdown                             │
│   • Saved to outputs/ directory                                 │
└─────────────────────────────────────────────────────────────────┘
```
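The middle of the diagram (plan, process each sub-query, aggregate, write) can be sketched with plain coroutines. All four callables here are hypothetical stand-ins passed in by the caller, not GPT Researcher functions, and running the sub-queries concurrently is an assumption of this sketch.

```python
import asyncio


async def run_pipeline(query, plan, process_sub_query, write):
    """Plan sub-queries, process them concurrently, aggregate, then write."""
    sub_queries = await plan(query)
    sub_contexts = await asyncio.gather(*(process_sub_query(q) for q in sub_queries))
    context = "\n\n".join(c for c in sub_contexts if c)  # drop empty results
    return await write(query, context)


async def demo():
    async def plan(query):
        return [f"{query} history", f"{query} outlook"]

    async def process_sub_query(sub_query):
        return f"[context for {sub_query}]"

    async def write(query, context):
        return f"# {query}\n\n{context}"

    return await run_pipeline("solar power", plan, process_sub_query, write)
```

`asyncio.gather` returns results in the order the sub-queries were planned, so the aggregated context stays deterministic even though the work runs concurrently.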


================================================
FILE: .claude/references/mcp.md
================================================
# MCP Integration Reference

## Table of Contents
- [Overview](#overview)
- [Configuration](#configuration)
- [Strategy Options](#strategy-options)
- [Processing Logic](#processing-logic)

---

## Overview

MCP (Model Context Protocol) enables research from specialized data sources (GitHub, databases, APIs) alongside web search.

---

## Configuration

```python
researcher = GPTResearcher(
    query="...",
    mcp_configs=[
        {
            "name": "github",                    # Server name
            "command": "npx",                    # Command to start
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": "..."},      # Environment vars
        },
        {
            "name": "filesystem",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/docs"],
        },
        {
            "name": "remote",
            "connection_url": "ws://server:8080",  # WebSocket connection
            "connection_type": "websocket",
            "connection_token": "auth_token",
        }
    ],
    mcp_strategy="fast",  # fast, deep, disabled
)
```

---

## Strategy Options

| Strategy | Behavior | Use Case |
|----------|----------|----------|
| `fast` (default) | Run MCP once with original query, cache results | Performance-focused |
| `deep` | Run MCP for every sub-query | Maximum thoroughness |
| `disabled` | Skip MCP entirely | Web-only research |

---

## Processing Logic

**File:** `gpt_researcher/skills/researcher.py`

```python
# At start of research (for 'fast' strategy)
if mcp_strategy == "fast":
    mcp_context = await self._execute_mcp_research_for_queries([query], mcp_retrievers)
    self._mcp_results_cache = mcp_context  # Cache for reuse

# During sub-query processing
if mcp_strategy == "fast" and self._mcp_results_cache is not None:
    mcp_context = self._mcp_results_cache.copy()  # Reuse cache
elif mcp_strategy == "deep":
    mcp_context = await self._execute_mcp_research_for_queries([sub_query], mcp_retrievers)
```
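The difference between the strategies comes down to how many MCP calls are made. The sketch below is a self-contained illustration with a hypothetical `McpContextProvider` class; unlike the excerpt above, it fills the `fast` cache lazily on first use rather than at the start of research, which is a simplification.

```python
import asyncio


class McpContextProvider:
    """Hypothetical wrapper showing how the three strategies differ."""

    def __init__(self, strategy, run_mcp):
        self.strategy = strategy
        self.run_mcp = run_mcp      # async callable: query -> list of results
        self._cache = None

    async def context_for(self, original_query, sub_query):
        if self.strategy == "disabled":
            return []
        if self.strategy == "fast":
            if self._cache is None:               # first call populates the cache
                self._cache = await self.run_mcp(original_query)
            return list(self._cache)              # copy so callers cannot mutate it
        if self.strategy == "deep":
            return await self.run_mcp(sub_query)  # fresh MCP call per sub-query
        raise ValueError(f"unknown MCP strategy: {self.strategy!r}")


async def count_calls(strategy, sub_queries):
    calls = []

    async def run_mcp(query):
        calls.append(query)
        return [{"query": query}]

    provider = McpContextProvider(strategy, run_mcp)
    for sub_query in sub_queries:
        await provider.context_for("main", sub_query)
    return len(calls)
```

With three sub-queries, `count_calls` reports one MCP call for `fast`, three for `deep`, and none for `disabled`.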

### WebSocket Request Example

```json
{
    "task": "Research query",
    "report_type": "research_report",
    "mcp_enabled": true,
    "mcp_strategy": "fast",
    "mcp_configs": [
        {
            "name": "github",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": "..."}
        }
    ]
}
```
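A client could assemble this payload programmatically. The helper below is a hypothetical sketch: the field names come from the example above, but the validation rules and the choice to derive `mcp_enabled` from the config list are assumptions, and how the server expects the message to be framed on the wire is not covered here.

```python
import json

VALID_STRATEGIES = {"fast", "deep", "disabled"}


def build_mcp_request(task, mcp_configs, mcp_strategy="fast",
                      report_type="research_report"):
    """Hypothetical helper: assemble the request payload shown above."""
    if mcp_strategy not in VALID_STRATEGIES:
        raise ValueError(f"mcp_strategy must be one of {sorted(VALID_STRATEGIES)}")
    for config in mcp_configs:
        if "name" not in config:
            raise ValueError("each MCP config needs a 'name'")
    return json.dumps({
        "task": task,
        "report_type": report_type,
        "mcp_enabled": bool(mcp_configs),  # assumption: enabled iff configs exist
        "mcp_strategy": mcp_strategy,
        "mcp_configs": mcp_configs,
    })
```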


================================================
FILE: .claude/references/multi-agents.md
================================================
# Multi-Agent System Reference

## Table of Contents
- [Overview](#overview)
- [Agent Roles](#agent-roles)
- [Workflow](#workflow)
- [Usage](#usage)

---

## Overview

**Directory:** `multi_agents/`

A LangGraph-based system inspired by the [STORM paper](https://arxiv.org/abs/2402.14207). Multiple agents collaborate to generate 5-6 page reports.

---

## Agent Roles

| Agent | File | Role |
|-------|------|------|
| Human | - | Oversees and provides feedback |
| Chief Editor | `agents/orchestrator.py` | Master coordinator via LangGraph |
| Researcher | Uses GPTResearcher | Deep research on topics |
| Editor | `agents/editor.py` | Plans outline and structure |
| Reviewer | `agents/reviewer.py` | Validates research correctness |
| Revisor | `agents/revisor.py` | Revises based on feedback |
| Writer | `agents/writer.py` | Compiles final report |
| Publisher | `agents/publisher.py` | Exports to PDF, DOCX, Markdown |

---

## Workflow

```
1. Browser (GPTResearcher) → Initial research
2. Editor → Plans report outline
3. For each outline topic (parallel):
   a. Researcher → In-depth subtopic research
   b. Reviewer → Validates draft
   c. Revisor → Revises until satisfactory
4. Writer → Compiles final report
5. Publisher → Exports to multiple formats
```
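Step 3 of the workflow, the per-topic research, review, and revise loop, can be sketched in plain `asyncio` without LangGraph. The function names and the `max_rounds` cap are assumptions for illustration; the real agents are LangGraph nodes and decide for themselves when a draft is satisfactory.

```python
import asyncio


async def refine_topic(topic, research, review, revise, max_rounds=3):
    """Research one topic, then alternate review and revision until accepted."""
    draft = await research(topic)
    for _ in range(max_rounds):
        feedback = await review(draft)
        if feedback is None:          # reviewer has no further notes
            break
        draft = await revise(draft, feedback)
    return draft


async def refine_all(topics, research, review, revise):
    # Topics are processed in parallel; gather keeps the outline order.
    return await asyncio.gather(
        *(refine_topic(t, research, review, revise) for t in topics)
    )


async def demo():
    async def research(topic):
        return f"{topic}: draft"

    async def review(draft):
        return None if draft.endswith("(revised)") else "add sources"

    async def revise(draft, feedback):
        return f"{draft} (revised)"

    return await refine_all(["Hardware", "Algorithms"], research, review, revise)
```

Capping the number of rounds keeps a reviewer that never approves from looping forever.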

---

## Usage

### Via API

```python
report_type = "multi_agents"
```

### Via WebSocket

```json
{
    "task": "Research query",
    "report_type": "multi_agents",
    "tone": "Analytical"
}
```

### Directly in Python

```python
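from gpt_researcher.utils.enum import Tone
from multi_agents import run_research_task

# `handler` is the logs handler (or WebSocket) the caller already holds.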
report = await run_research_task(
    query="Comprehensive analysis of market trends",
    websocket=handler,
    tone=Tone.Analytical,
)
```

### Configuration File

**File:** `multi_agents/task.json`

Configure the multi-agent research task parameters and agent behaviors.


================================================
FILE: .claude/references/prompts.md
================================================
# Prompt System Reference

## Table of Contents
- [PromptFamily Class](#promptfamily-class)
- [Key Prompt Examples](#key-prompt-examples)

---

## PromptFamily Class

**File:** `gpt_researcher/prompts.py`

All prompts are centralized in the `PromptFamily` class. This allows for model-specific prompt variations.

```python
class PromptFamily:
    """
    General purpose class for prompt formatting.
    Can be overwritten with model-specific derived classes.
    """

    def __init__(self, config: Config):
        self.cfg = config

    @staticmethod
    def get_prompt_by_report_type(report_type: str):
        """Returns the appropriate prompt generator for the report type."""
        match report_type:
            case ReportType.ResearchReport.value:
                return PromptFamily.generate_report_prompt
            case ReportType.DetailedReport.value:
                return PromptFamily.generate_report_prompt
            case ReportType.OutlineReport.value:
                return PromptFamily.generate_outline_report_prompt
            # ... etc
```

---

## Key Prompt Examples

### Agent Selection Prompt

```python
@staticmethod
def generate_agent_role_prompt(query: str, parent_query: str = "") -> str:
    return f"""Analyze the research query and select the most appropriate agent role.

Query: "{query}"
{f'Parent Query: "{parent_query}"' if parent_query else ''}

Based on the query, determine:
1. The domain expertise needed
2. The research approach required
3. The appropriate agent persona

Return a JSON object with:
- "agent": The agent type (e.g., "Research Analyst", "Technical Writer")
- "role": A detailed role description for how the agent should approach this research
"""
```

### Research Planning Prompt

```python
@staticmethod
def generate_search_queries_prompt(
    query: str,
    parent_query: str = "",
    report_type: str = "",
    max_iterations: int = 3,
    context: str = "",
) -> str:
    return f"""Generate {max_iterations} focused search queries to research: "{query}"

Context from initial search:
{context}

Requirements:
- Each query should explore a different aspect
- Queries should be specific and searchable
- Consider the report type: {report_type}

Return a JSON array of query strings.
"""
```
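Because the prompt asks for a JSON array, the caller has to parse the model's reply, which may arrive wrapped in extra prose. The helper below is a hypothetical sketch of one tolerant way to do that; it is not the parsing code GPT Researcher uses.

```python
import json
import re


def parse_query_list(response: str, max_queries: int = 3) -> list[str]:
    """Pull the first JSON array of strings out of an LLM response."""
    match = re.search(r"\[.*\]", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in response")
    items = json.loads(match.group(0))
    queries = [item.strip() for item in items if isinstance(item, str) and item.strip()]
    return queries[:max_queries]
```

The regular expression is greedy, so it spans from the first `[` to the last `]`; replies containing several separate arrays would need stricter handling.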

### Report Generation Prompt (with images)

```python
@staticmethod
def generate_report_prompt(
    question: str,
    context: str,
    report_source: str,
    report_format="apa",
    total_words=1000,
    tone=None,
    language="english",
    available_images: list = [],
) -> str:
    # Build image embedding instruction if images available
    image_instruction = ""
    if available_images:
        image_list = "\n".join([
            f"- Title: {img.get('title')}\n  URL: {img['url']}"
            for img in available_images
        ])
        image_instruction = f"""
AVAILABLE IMAGES (embed where relevant):
{image_list}

Use markdown format: ![Title](URL)
"""

    return f"""Information: "{context}"
---
Using the above information, answer: "{question}" in a detailed report.

- Format: {report_format}
- Length: ~{total_words} words
- Tone: {tone.value if tone else "Objective"}
- Language: {language}
- Include citations for all factual claims
{image_instruction}
"""
```

### MCP Tool Selection Prompt

```python
@staticmethod
def generate_mcp_tool_selection_prompt(query: str, tools_info: list, max_tools: int = 3) -> str:
    return f"""Select the most relevant tools for researching: "{query}"

AVAILABLE TOOLS:
{json.dumps(tools_info, indent=2)}

Select exactly {max_tools} tools ranked by relevance.

Return JSON:
{{
  "selected_tools": [
    {{"index": 0, "name": "tool_name", "relevance_score": 9, "reason": "..."}}
  ]
}}
"""
```
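On the receiving side, the `selected_tools` entries have to be mapped back onto the tool list. The sketch below assumes the reply is valid JSON in the shape requested by the prompt; `pick_tools` is a hypothetical name, and discarding out-of-range indices is a design choice of this sketch rather than documented behavior.

```python
import json


def pick_tools(response: str, tools: list[dict], max_tools: int = 3) -> list[dict]:
    """Map the model's `selected_tools` entries back to tool definitions."""
    selected = json.loads(response).get("selected_tools", [])
    valid = [
        entry for entry in selected
        if isinstance(entry.get("index"), int) and 0 <= entry["index"] < len(tools)
    ]
    # Highest relevance first; entries without a score sort last.
    valid.sort(key=lambda entry: entry.get("relevance_score", 0), reverse=True)
    return [tools[entry["index"]] for entry in valid[:max_tools]]
```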


================================================
FILE: .claude/references/retrievers.md
================================================
# Retriever System Reference

## Table of Contents
- [Available Retrievers](#available-retrievers)
- [Retriever Selection](#retriever-selection)
- [Adding a New Retriever](#adding-a-new-retriever)

---

## Available Retrievers

**Directory:** `gpt_researcher/retrievers/`

| Retriever | Class | API Key Env Var |
|-----------|-------|-----------------|
| Tavily | `TavilySearch` | `TAVILY_API_KEY` |
| Google | `GoogleSearch` | `GOOGLE_API_KEY`, `GOOGLE_CX_KEY` |
| DuckDuckGo | `Duckduckgo` | None |
| Bing | `BingSearch` | `BING_API_KEY` |
| Serper | `SerperSearch` | `SERPER_API_KEY` |
| SerpAPI | `SerpApiSearch` | `SERPAPI_API_KEY` |
| SearchAPI | `SearchApiSearch` | `SEARCHAPI_API_KEY` |
| Exa | `ExaSearch` | `EXA_API_KEY` |
| arXiv | `ArxivSearch` | None |
| Semantic Scholar | `SemanticScholarSearch` | None |
| PubMed Central | `PubMedCentralSearch` | None |
| MCP | `MCPRetriever` | Per-server |
| Custom | `CustomRetriever` | User-defined |
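
A quick preflight check can catch missing keys before a run starts. The sketch below covers only a subset of the table. `missing_keys` is a hypothetical helper, and the lowercase names `bing`, `duckduckgo`, and `arxiv` are assumed to match the values accepted by `RETRIEVER`.

```python
import os

# Subset of the table above: retriever name -> required environment variables.
REQUIRED_ENV = {
    "tavily": ["TAVILY_API_KEY"],
    "google": ["GOOGLE_API_KEY", "GOOGLE_CX_KEY"],
    "bing": ["BING_API_KEY"],
    "duckduckgo": [],
    "arxiv": [],
}


def missing_keys(retriever_names: str, env=None) -> dict:
    """Return {retriever: [missing env vars]} for a comma-separated list."""
    env = os.environ if env is None else env
    missing = {}
    for name in (n.strip() for n in retriever_names.split(",")):
        absent = [var for var in REQUIRED_ENV.get(name, []) if not env.get(var)]
        if absent:
            missing[name] = absent
    return missing
```

Names that are not in the mapping are treated as needing no keys, so the check never blocks a custom retriever.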

---

## Retriever Selection

**File:** `gpt_researcher/actions/retriever.py`

```python
def get_retriever(retriever: str):
    """Get a retriever class by name."""
    match retriever:
        case "tavily":
            from gpt_researcher.retrievers import TavilySearch
            return TavilySearch
        case "google":
            from gpt_researcher.retrievers import GoogleSearch
            return GoogleSearch
        case "mcp":
            from gpt_researcher.retrievers import MCPRetriever
            return MCPRetriever
        # ... etc

def get_retrievers(retriever_names: str, headers: dict = None) -> list:
    """
    Get multiple retrievers from comma-separated string.
    
    Usage: RETRIEVER=tavily,google,mcp
    """
    retrievers = []
    for name in retriever_names.split(","):
        retriever_class = get_retriever(name.strip())
        if retriever_class:
            retrievers.append(retriever_class)
    return retrievers
```

---

## Adding a New Retriever

### Step 1: Create Retriever File

**File:** `gpt_researcher/retrievers/my_retriever/my_retriever.py`

```python
class MyRetriever:
    def __init__(self, query: str, headers: dict = None):
        self.query = query
        self.headers = headers
    
    async def search(self, max_results: int = 10) -> list[dict]:
        """
        Returns list of:
        {
            "title": str,
            "href": str,
            "body": str
        }
        """
        # Implementation
        pass
```

### Step 2: Register in retriever.py

**File:** `gpt_researcher/actions/retriever.py`

```python
case "my_retriever":
    from gpt_researcher.retrievers.my_retriever.my_retriever import MyRetriever
    return MyRetriever
```

### Step 3: Export in __init__.py

**File:** `gpt_researcher/retrievers/__init__.py`

```python
from .my_retriever.my_retriever import MyRetriever
__all__ = [..., "MyRetriever"]
```

### Step 4: Usage

```bash
RETRIEVER=tavily,my_retriever
```

```python
researcher = GPTResearcher(
    query="...",
    # Will use both Tavily and your custom retriever
)
```


================================================
FILE: .cursorignore
================================================
.venv
__pycache__
outputs
.github

================================================
FILE: .dockerignore
================================================
.git
output/


================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS: [e.g. iOS]
 - Browser [e.g. chrome, safari]
 - Version [e.g. 22]

**Smartphone (please complete the following information):**
 - Device: [e.g. iPhone6]
 - OS: [e.g. iOS8.1]
 - Browser [e.g. stock browser, safari]
 - Version [e.g. 22]

**Additional context**
Add any other context about the problem here.


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.


================================================
FILE: .github/dependabot.yml
================================================
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates

version: 2
updates:
  - package-ecosystem: "pip" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"


================================================
FILE: .github/workflows/build.yml
================================================
name: Build-Push and Update Image Tag

on:
    push: 
        branches: [ master ]
        paths-ignore:
        - 'terraform/**' 
    
env:
    REPO_FULL_NAME: ${{ github.repository }}
    AWS_REGION: us-east-1

jobs:
    build-and-update:
        runs-on: ubuntu-latest
        outputs:
            image-tag: ${{ steps.image-tag.outputs.tag }}
        permissions:
            contents: write
            id-token: write
            actions: write

        steps:
        - name: Checkout code
          uses: actions/checkout@v5
          with:
            token: ${{ secrets.GITHUB_TOKEN }}
            fetch-depth: 0

        - name: Extract repository name
          id: extract_short_name_repo
          run: |
            REPO_NAME="${REPO_FULL_NAME##*/}"
            echo "Repository Short name: $REPO_NAME"
            echo "REPO_NAME=$REPO_NAME" >> $GITHUB_OUTPUT

        - name: Configure Git
          run: |
            git config --global user.name "github-actions[bot]"
            git config --global user.email "github-actions[bot]@users.noreply.github.com"

        - name: Generate image tag
          id: image-tag
          run: |
            SHORT_SHA=$(echo ${{ github.sha }} | cut -c1-7)
            TIMESTAMP=$(date +%Y%m%d-%H%M%S)
            IMAGE_TAG="${TIMESTAMP}-${SHORT_SHA}"
            echo "tag=${IMAGE_TAG}" >> $GITHUB_OUTPUT
            echo "short_sha=${SHORT_SHA}" >> $GITHUB_OUTPUT
        
        - name: Configure AWS credentials
          uses: aws-actions/configure-aws-credentials@v4
          with:
            role-to-assume: arn:aws:iam::908027381725:role/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}-github-actions-role
            aws-region: ${{ env.AWS_REGION }}
        
        - name: Login to ECR
          id: login-ecr
          uses: aws-actions/amazon-ecr-login@v2
        
        - name: Build Docker image
          working-directory: .
          run: |
            docker build --build-arg VITE_API_URL="${{ vars.VITE_API_URL }}" -t ${{ steps.login-ecr.outputs.registry }}/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}:${{ steps.image-tag.outputs.tag }} . 
            docker push ${{ steps.login-ecr.outputs.registry }}/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}:${{ steps.image-tag.outputs.tag }}
        
        - name: Update image tag
          run: |
            echo "image_tag=${{ steps.image-tag.outputs.tag }}" >> $GITHUB_OUTPUT

        - name: Trigger deployment workflow
          run: |
            echo "Triggering deployment workflow with image tag: ${{ steps.image-tag.outputs.tag }}"
            
            # Try GitHub CLI first
            if gh workflow run .github/workflows/deploy.yml \
            --ref master \
            --field image_tag="${{ steps.image-tag.outputs.tag }}"; then
            echo "✅ Successfully triggered deployment workflows via GitHub CLI"
            else
            echo "⚠️ GitHub CLI failed, trying API directly..."
            # Fallback to direct API call
            curl -X POST \
                -H "Accept: application/vnd.github.v3+json" \
                -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
                https://api.github.com/repos/${{ github.repository }}/actions/workflows/deploy.yml/dispatches \
                -d '{"ref":"master","inputs":{"image_tag":"${{ steps.image-tag.outputs.tag }}"}}'
            echo "✅ Triggered deployment workflow via API"
            fi
          env:
            GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      
        - name: Create deployment summary
          run: |
            echo "## 🚀 Build Summary" >> $GITHUB_STEP_SUMMARY
            echo "| Item | Value |" >> $GITHUB_STEP_SUMMARY
            echo "|------|-------|" >> $GITHUB_STEP_SUMMARY
            echo "| **Image Tag** | \`${{ steps.image-tag.outputs.tag }}\` |" >> $GITHUB_STEP_SUMMARY
            echo "| **ECR Repository** | \`${{ steps.extract_short_name_repo.outputs.REPO_NAME }}\` |" >> $GITHUB_STEP_SUMMARY
            echo "| **Commit SHA** | \`${{ steps.image-tag.outputs.short_sha }}\` |" >> $GITHUB_STEP_SUMMARY
            echo "| **Build Status** | ✅ Complete |" >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
            echo "### Next Steps" >> $GITHUB_STEP_SUMMARY
            echo "- ✅ Deployment workflow triggered with image tag: \`${{ steps.image-tag.outputs.tag }}\`" >> $GITHUB_STEP_SUMMARY
            echo "- Monitor the **Terraform Deploy** workflow for completion" >> $GITHUB_STEP_SUMMARY
            echo "- Service will be available at \`${{ steps.extract_short_name_repo.outputs.REPO_NAME }}.ggai:3535\`" >> $GITHUB_STEP_SUMMARY


================================================
FILE: .github/workflows/deploy.yml
================================================
name: Terraform Deploy
on:
    push:
        branches: [ master ]
        paths:
        - 'terraform/**'
    pull_request:
        branches: [ master ]
        paths:
        - 'terraform/**'
    workflow_dispatch:
        inputs:
            image_tag:
                description: 'Docker image tag to deploy'
                required: false
                default: 'latest'
                type: string
env:
    AWS_REGION: us-east-1
    TF_VAR_image_tag: us-east-1
    REPO_FULL_NAME: ${{ github.repository }}

jobs:
  terraform-plan:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest

    permissions:
      contents: read
      pull-requests: write
      id-token: write

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Extract repository name
      id: extract_short_name_repo
      run: |
        REPO_NAME="${REPO_FULL_NAME##*/}"
        echo "Repository Short name: $REPO_NAME"
        echo "REPO_NAME=$REPO_NAME" >> $GITHUB_OUTPUT

    - name: Set default image tag for PR
      id: pr-image-tag
      run: |
        # Use defaults if inputs are empty or not provided
        IMAGE_TAG="${{ inputs.image_tag }}"
        
        # Set to 'latest' if empty, null, or not provided
        if [ -z "$IMAGE_TAG" ] || [ "$IMAGE_TAG" = "null" ]; then
          IMAGE_TAG="latest"
        fi
        
        echo "image_tag=$IMAGE_TAG" >> $GITHUB_OUTPUT
        echo "Using image tag: $IMAGE_TAG"

    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3
      with:
        terraform_version: ~1.5

    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::908027381725:role/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}-github-actions-role
        aws-region: ${{ env.AWS_REGION }}

    - name: Configure Git for private modules
      run: |
        git config --global url."https://${{ secrets.GH_TOKEN }}@github.com/".insteadOf "https://github.com/"

    - name: Terraform Init
      working-directory: ./terraform
      run: terraform init

    - name: Terraform Validate
      working-directory: ./terraform
      run: terraform validate

    - name: Terraform Plan
      id: plan
      working-directory: ./terraform
      run: |
        terraform plan -input=false -lock=false -no-color -out=tfplan
        terraform show -no-color tfplan > plan_output.txt
      env:
        TF_VAR_image_tag: ${{ steps.pr-image-tag.outputs.image_tag }}

    - name: Update Pull Request
      uses: actions/github-script@v7
      with:
        script: |
          const fs = require('fs');
          const planOutput = fs.readFileSync('./terraform/plan_output.txt', 'utf8');
          const output = `## 🏗️ Terraform Plan for ${{ steps.extract_short_name_repo.outputs.REPO_NAME }}
          <details>
          <summary>Click to expand plan</summary>
          \`\`\`hcl
          ${planOutput}
          \`\`\`
          </details>
          **Plan Status:** ${{ steps.plan.outcome }}
          **Service:** ${{ steps.extract_short_name_repo.outputs.REPO_NAME }}.ggai:8000
          `;
          await github.rest.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: output
          });

  terraform-apply:
    if: github.ref == 'refs/heads/master' && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')
    runs-on: ubuntu-latest

    permissions:
      contents: read
      id-token: write

    steps:
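    - name: Checkout code
      uses: actions/checkout@v4
      with:
        # Fetch two commits so that `git diff HEAD~1 HEAD` in "Determine image tag" can resolve HEAD~1
        fetch-depth: 2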
    
    - name: Extract repository name
      id: extract_short_name_repo
      run: |
        REPO_NAME="${REPO_FULL_NAME##*/}"
        echo "Repository Short name: $REPO_NAME"
        echo "REPO_NAME=$REPO_NAME" >> $GITHUB_OUTPUT

    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::908027381725:role/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}-github-actions-role
        aws-region: ${{ env.AWS_REGION }}

    - name: Determine image tag
      id: get-image-tag
      run: |
        if [ "${{ github.event_name }}" = "workflow_dispatch" ] && [ -n "${{ inputs.image_tag }}" ]; then
          # Priority 1: Manual input from workflow_dispatch (triggered by build workflow or manual)
          IMAGE_TAG="${{ inputs.image_tag }}"
          echo "source=manual" >> $GITHUB_OUTPUT
          echo "image_tag=${IMAGE_TAG}" >> $GITHUB_OUTPUT
          echo "Using provided image tag: ${IMAGE_TAG}"
        else
          # Priority 2: Direct terraform push (no application changes)
          # Check if commit contains application changes (non-terraform files)
          APP_CHANGES=$(git diff --name-only HEAD~1 HEAD | grep -v "^terraform/" | grep -v "^\.github/workflows/deploy\.yml" | wc -l)
          if [ "$APP_CHANGES" -gt 0 ]; then
            # Application changes detected but deploy workflow was triggered directly
            echo "⚠️ Application changes detected but no image tag provided!"
            echo "This deployment may fail because build workflow should have run first."
            echo "Attempting to get current image tag from ECS task definition..."
          fi
          # Get the current image (first container definition) from the ECS task definition
          CURRENT_IMAGE=$(aws ecs describe-task-definition \
            --task-definition ${{ steps.extract_short_name_repo.outputs.REPO_NAME }}-prod-task-def \
            --query 'taskDefinition.containerDefinitions[0].image' \
            --output text 2>/dev/null || echo "")
          if [ -n "$CURRENT_IMAGE" ] && [[ "$CURRENT_IMAGE" != "None" ]]; then
            # Extract tag from image URL (format: 908027381725.dkr.ecr.us-east-1.amazonaws.com/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}:TAG)
            IMAGE_TAG=$(echo "$CURRENT_IMAGE" | cut -d':' -f2)
            echo "source=current_ecs" >> $GITHUB_OUTPUT
            echo "image_tag=${IMAGE_TAG}" >> $GITHUB_OUTPUT
            echo "Using current ECS image tag: ${IMAGE_TAG} (extracted from task definition)"
          else
            # Fallback to latest if we can't get current task definition
            echo "Could not retrieve current task definition, falling back to 'latest'"
            IMAGE_TAG="latest"
            echo "source=fallback" >> $GITHUB_OUTPUT
            echo "image_tag=${IMAGE_TAG}" >> $GITHUB_OUTPUT
            echo "Using fallback image tag: ${IMAGE_TAG}"
          fi
        fi
    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3
      with:
        terraform_version: ~1.5

    - name: Configure Git for private modules
      run: |
        git config --global url."https://${{ secrets.GH_TOKEN }}@github.com/".insteadOf "https://github.com/"
    - name: Terraform Init
      working-directory: ./terraform
      run: terraform init

    - name: Terraform Validate
      working-directory: ./terraform
      run: terraform validate

    - name: Terraform Plan
      working-directory: ./terraform
      run: terraform plan -no-color
      env:
        TF_VAR_image_tag: ${{ steps.get-image-tag.outputs.image_tag }}

    - name: Terraform Apply
      working-directory: ./terraform
      run: terraform apply -auto-approve
      env:
        TF_VAR_image_tag: ${{ steps.get-image-tag.outputs.image_tag }}

    - name: Get deployment outputs
      id: terraform-output
      working-directory: ./terraform
      run: |
        SERVICE_URL=$(terraform output -raw service_discovery_endpoint 2>/dev/null || echo '${{ steps.extract_short_name_repo.outputs.REPO_NAME }}.ggai')
        ECR_REPO=$(terraform output -raw ecr_repository_url 2>/dev/null || echo 'N/A')
        delimiter=$(openssl rand -hex 8)
        echo "service_url<<${delimiter}" >> $GITHUB_OUTPUT
        echo "${SERVICE_URL}" >> $GITHUB_OUTPUT
        echo "${delimiter}" >> $GITHUB_OUTPUT
        echo "ecr_repository<<${delimiter}" >> $GITHUB_OUTPUT
        echo "${ECR_REPO}" >> $GITHUB_OUTPUT
        echo "${delimiter}" >> $GITHUB_OUTPUT

    - name: Create deployment summary
      run: |
        echo "## 🚀 ${{ steps.extract_short_name_repo.outputs.REPO_NAME }} Deployment Summary" >> $GITHUB_STEP_SUMMARY
        echo "| Component | Status |" >> $GITHUB_STEP_SUMMARY
        echo "|-----------|--------|" >> $GITHUB_STEP_SUMMARY
        echo "| **Terraform Init** | ✅ Success |" >> $GITHUB_STEP_SUMMARY
        echo "| **Terraform Validate** | ✅ Success |" >> $GITHUB_STEP_SUMMARY
        echo "| **Terraform Apply** | ✅ Success |" >> $GITHUB_STEP_SUMMARY
        echo "" >> $GITHUB_STEP_SUMMARY
        echo "### Service Information" >> $GITHUB_STEP_SUMMARY
        echo "- **Service URL**: ${{ steps.terraform-output.outputs.service_url }}" >> $GITHUB_STEP_SUMMARY
        echo "- **Container Port**: 3535" >> $GITHUB_STEP_SUMMARY
        echo "- **ECR Repository**: ${{ steps.terraform-output.outputs.ecr_repository }}" >> $GITHUB_STEP_SUMMARY
        echo "- **Image Tag**: \`${{ steps.get-image-tag.outputs.image_tag }}\`" >> $GITHUB_STEP_SUMMARY
        echo "- **Tag Source**: ${{ steps.get-image-tag.outputs.source }}" >> $GITHUB_STEP_SUMMARY
        echo "- **Trigger**: ${{ github.event_name }}" >> $GITHUB_STEP_SUMMARY
        echo "- **Deployment Time**: $(date)" >> $GITHUB_STEP_SUMMARY
        echo "" >> $GITHUB_STEP_SUMMARY
        # Add deployment trigger notice
        if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
          if [ "${{ steps.get-image-tag.outputs.source }}" = "manual" ]; then
            echo "### 🎯 Manual Deployment" >> $GITHUB_STEP_SUMMARY
            echo "This deployment was triggered manually with image tag: \`${{ steps.get-image-tag.outputs.image_tag }}\`" >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
          else
            echo "### 🤖 Build-Triggered Deployment" >> $GITHUB_STEP_SUMMARY
            echo "This deployment was triggered automatically by the build workflow with image tag: \`${{ steps.get-image-tag.outputs.image_tag }}\`" >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
          fi
        fi
        
        echo "### Next Steps" >> $GITHUB_STEP_SUMMARY
        echo "- Service will be available at \`${{ steps.extract_short_name_repo.outputs.REPO_NAME }}.ggai:8000\`" >> $GITHUB_STEP_SUMMARY
        echo "- Check ECS console for service health" >> $GITHUB_STEP_SUMMARY
        echo "- Monitor CloudWatch logs: \`/ecs/${{ steps.extract_short_name_repo.outputs.REPO_NAME }}\`" >> $GITHUB_STEP_SUMMARY
        

================================================
FILE: .github/workflows/docker-build.yml
================================================
name: GPTR tests
run-name: ${{ github.actor }} ran the GPTR tests flow
permissions:
  contents: read
  pull-requests: write
on:
  workflow_dispatch:  # Enables manual triggering from the Actions tab
  # pull_request:
  #   types: [opened, synchronize]

jobs:
  docker:
    runs-on: ubuntu-latest
    environment: tests  # Specify the environment to use for this job
    env:
      # Ensure these environment variables are set for the entire job
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
      LANGCHAIN_API_KEY: ${{ secrets.LANGCHAIN_API_KEY }}
    steps:
      - name: Git checkout
        uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          driver: docker

      # - name: Build Docker images
      #   uses: docker/build-push-action@v4
      #   with:
      #     push: false
      #     tags: gptresearcher/gpt-researcher:latest
      #     file: Dockerfile          

      - name: Set up Docker Compose
        run: |
          sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
          sudo chmod +x /usr/local/bin/docker-compose
      - name: Run tests with Docker Compose
        run: |
          docker-compose --profile test run --rm gpt-researcher-tests

================================================
FILE: .gitignore
================================================
#Ignore env containing secrets
.env
.venv
.envrc

#Ignore Virtual Env
env/
venv/
.venv/

# Other Environments
ENV/
env.bak/
venv.bak/

#Ignore generated outputs
outputs/
*.lock
dist/
gpt_researcher.egg-info/

#Ignore my local docs
my-docs/

#Ignore pycache
**/__pycache__/

#Ignore mypy cache
.mypy_cache/
node_modules
.idea
.DS_Store
.docusaurus
build
docs/build

.vscode/launch.json
.langgraph-data/
.next/
package-lock.json

#Vim swp files
*.swp

# Log files
logs/
*.orig
*.log
server_log.txt

#Cursor Rules
.cursorrules
CURSOR_RULES.md
/.history


================================================
FILE: .python-version
================================================
3.11

================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct

## Our Pledge

We, as members, contributors, and leaders, pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, sexual identity, or
orientation.

We commit to acting and interacting in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

- Demonstrating empathy and kindness toward others
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
- Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

- The use of sexualized language or imagery, and sexual attention or
  advances of any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email address, without their explicit permission
- Other conduct that could reasonably be considered inappropriate in a professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior deemed inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that do not
align with this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies to all community spaces and also applies when
an individual is officially representing the community in public spaces.
Examples include using an official email address, posting via an official
social media account, or acting as an appointed representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
[Assaf.elovic@gmail.com](mailto:Assaf.elovic@gmail.com).
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period. This includes
avoiding interactions in community spaces and external channels like social media.
Violating these terms may lead to a temporary or permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any interaction or public
communication with the community for a specified period. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of groups of individuals.

**Consequence**: A permanent ban from any public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.


================================================
FILE: CONTRIBUTING.md
================================================
# Contributing to GPT Researcher

First off, we'd like to welcome you and thank you for your interest and effort in contributing to our open-source project ❤️. Contributions of all forms are welcome—from new features and bug fixes to documentation and more.

We are on a mission to build the #1 AI agent for comprehensive, unbiased, and factual research online, and we need your support to achieve this grand vision.

Please take a moment to review this document to make the contribution process easy and effective for everyone involved.

## Reporting Issues

If you come across any issue or have an idea for an improvement, don't hesitate to create an issue on GitHub. Describe your problem in sufficient detail, providing as much relevant information as possible. This way, we can reproduce the issue before attempting to fix it or respond appropriately.

## Contributing Code

1. **Fork the repository and create your branch from `master`.**  
   If it’s not an urgent bug fix, do the work for your feature or fix on that branch.

2. **Make your changes.**  
   Implement your changes following best practices for coding in the project's language.

3. **Test your changes.**  
   Ensure that your changes pass all tests if any exist. If the project doesn’t have automated tests, test your changes manually to confirm they behave as expected.

4. **Follow the coding style.**  
   Ensure your code adheres to the coding conventions used throughout the project, including indentation, accurate comments, etc.

5. **Commit your changes.**  
   Make your Git commits informative and concise. This is very helpful for others when they look at the Git log.

6. **Push to your fork and submit a pull request.**  
   When your work is ready and passes tests, push your branch to your fork of the repository and submit a pull request from there.

7. **Pat yourself on the back and wait for review.**  
   Your work is done, congratulations! Now sit tight. The project maintainers will review your submission as soon as possible. They might suggest changes or ask for improvements. Both constructive conversation and patience are key to the collaboration process.
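
The branch-and-commit steps above can be rehearsed locally. The following is a minimal sketch, assuming `git` is installed; it uses a throwaway repository in place of the clone of your fork, and the branch name `fix/example-change`, the file `notes.txt`, and the commit identity are hypothetical placeholders.

```shell
#!/usr/bin/env sh
set -eu

# Scratch repository standing in for the clone of your fork (assumption for illustration).
workdir="$(mktemp -d)"
cd "$workdir"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the initial branch is named master
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Step 1: create a working branch from master (hypothetical branch name).
git checkout -q -b fix/example-change master

# Steps 2-5: make a change and record it with a concise, informative message.
echo "example change" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Add example notes file"

# List the commits that a pull request from this branch would contain.
git log --oneline master..HEAD
```

In a real contribution you would run the same branch and commit commands inside the clone of your fork, then publish the branch with `git push origin fix/example-change` and open the pull request on GitHub.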

## Documentation

If you would like to contribute to the project's documentation, please follow the same steps: fork the repository, make your changes, test them, and submit a pull request.

Documentation is a vital part of any software. It's not just about having good code; ensuring that users and contributors understand what's going on, how to use the software, or how to contribute is crucial.

We're grateful for all our contributors, and we look forward to building the world's leading AI research agent hand-in-hand with you. Let's harness the power of open source and AI to change the world together!


================================================
FILE: Dockerfile
================================================
# Stage 1: Browser and build tools installation
# Python 3.12+ required for LangChain v1
FROM python:3.12-slim-bookworm AS install-browser

# Install Chromium, Chromedriver, Firefox, Geckodriver, and build tools in one layer
RUN apt-get update \
    && apt-get install -y gnupg wget ca-certificates --no-install-recommends \
    && ARCH=$(dpkg --print-architecture) \
    && wget -qO - https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb [arch=${ARCH}] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list \
    && apt-get update \
    && apt-get install -y chromium chromium-driver \
    && chromium --version && chromedriver --version \
    && apt-get install -y --no-install-recommends firefox-esr build-essential \
    && GECKO_ARCH=$(case ${ARCH} in amd64) echo "linux64" ;; arm64) echo "linux-aarch64" ;; *) echo "linux64" ;; esac) \
    && wget https://github.com/mozilla/geckodriver/releases/download/v0.36.0/geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
    && tar -xvzf geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
    && chmod +x geckodriver \
    && mv geckodriver /usr/local/bin/ \
    && rm geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
    && rm -rf /var/lib/apt/lists/*  # Clean up apt lists to reduce image size

# Stage 2: Python dependencies installation
FROM install-browser AS gpt-researcher-install

ENV PIP_ROOT_USER_ACTION=ignore
WORKDIR /usr/src/app

# Copy and install Python dependencies in a single layer to optimize cache usage
COPY ./requirements.txt ./requirements.txt
COPY ./multi_agents/requirements.txt ./multi_agents/requirements.txt

RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt --upgrade --prefer-binary && \
    pip install --no-cache-dir -r multi_agents/requirements.txt --upgrade --prefer-binary

# Stage 3: Final stage with non-root user and app
FROM gpt-researcher-install AS gpt-researcher

# Basic server configuration
ARG HOST=0.0.0.0
ENV HOST=${HOST}
ARG PORT=8000
ENV PORT=${PORT}
EXPOSE ${PORT}

# Uvicorn parameters used in CMD
ARG WORKERS=1
ENV WORKERS=${WORKERS}

# Create a non-root user for security
# NOTE: Don't use this if you are relying on `_check_pkg` to pip install packages dynamically.
RUN useradd -ms /bin/bash gpt-researcher && \
    chown -R gpt-researcher:gpt-researcher /usr/src/app && \
    # Create the outputs directory and set its ownership and permissions
    mkdir -p /usr/src/app/outputs && \
    chown -R gpt-researcher:gpt-researcher /usr/src/app/outputs && \
    chmod 777 /usr/src/app/outputs
USER gpt-researcher
WORKDIR /usr/src/app

# Copy the rest of the application files with proper ownership
COPY --chown=gpt-researcher:gpt-researcher ./ ./
CMD uvicorn main:app --host ${HOST} --port ${PORT} --workers ${WORKERS}


================================================
FILE: Dockerfile.fullstack
================================================
########################################################################
# Stage 1: Frontend build
########################################################################
FROM node:slim AS frontend-builder
WORKDIR /app/frontend/nextjs

# Copy package files and install dependencies
COPY frontend/nextjs/package.json frontend/nextjs/package-lock.json* ./
RUN npm install --legacy-peer-deps

# Copy the rest of the frontend application and build it
COPY frontend/nextjs/ ./
RUN npm run build

########################################################################
# Stage 2: Browser and backend build tools installation
########################################################################
FROM python:3.13.3-slim-bookworm AS install-browser

# Install Chromium, Chromedriver, Firefox, Geckodriver, and build tools in one layer
RUN echo 'Acquire::Retries "3";' > /etc/apt/apt.conf.d/80-retries \
  && echo 'Acquire::http::Timeout "60";' >> /etc/apt/apt.conf.d/80-retries \
  && echo 'Acquire::https::Timeout "60";' >> /etc/apt/apt.conf.d/80-retries \
  && echo 'Acquire::ftp::Timeout "60";' >> /etc/apt/apt.conf.d/80-retries \
  && apt-get update \
  && apt-get install -y gnupg wget ca-certificates --no-install-recommends \
  && ARCH=$(dpkg --print-architecture) \
  && if [ "$ARCH" = "arm64" ]; then \
  apt-get install -y chromium chromium-driver \
  && chromium --version && chromedriver --version; \
  else \
  wget -qO - https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
  && echo "deb [arch=${ARCH}] http://dl.google.com/linux/chrome/deb/ stable main" \
  > /etc/apt/sources.list.d/google-chrome.list \
  && apt-get update \
  && apt-get install -y google-chrome-stable; \
  fi \
  && apt-get install -y --no-install-recommends firefox-esr build-essential \
  && GECKO_ARCH=$(case ${ARCH} in amd64) echo "linux64" ;; arm64) echo "linux-aarch64" ;; *) echo "linux64" ;; esac) \
  && wget https://github.com/mozilla/geckodriver/releases/download/v0.36.0/geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
  && tar -xvzf geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
  && chmod +x geckodriver \
  && mv geckodriver /usr/local/bin/ \
  && rm geckodriver-v0.36.0-${GECKO_ARCH}.tar.gz \
  && rm -rf /var/lib/apt/lists/*

########################################################################
# Stage 3: Python dependencies installation
########################################################################
FROM install-browser AS backend-builder
WORKDIR /usr/src/app

ENV PIP_ROOT_USER_ACTION=ignore

COPY ./requirements.txt ./requirements.txt
COPY ./multi_agents/requirements.txt ./multi_agents/requirements.txt

# Install Python packages with retry logic and timeout configuration
RUN pip config set global.timeout 60 && \
  pip config set global.retries 3 && \
  pip install --upgrade pip && \
  pip install --no-cache-dir -r requirements.txt --upgrade --prefer-binary && \
  pip install --no-cache-dir -r multi_agents/requirements.txt --upgrade --prefer-binary

########################################################################
# Stage 4: Final image with backend, frontend
########################################################################
FROM backend-builder AS final

WORKDIR /usr/src/app

# Install Node.js and supervisord with retry logic
RUN apt-get update && \
  apt-get install -y curl supervisor nginx && \
  curl -fsSL --retry 3 --retry-delay 10 https://deb.nodesource.com/setup_20.x | bash - && \
  apt-get install -y nodejs && \
  rm -rf /var/lib/apt/lists/*

# Set backend server configuration
ARG HOST=0.0.0.0
ENV HOST=${HOST}

ARG PORT=8000
ENV PORT=${PORT}
EXPOSE ${PORT}

ARG NEXT_PORT=3000
ENV NEXT_PORT=${NEXT_PORT}
EXPOSE ${NEXT_PORT}

# Internal Next.js port (not exposed)
ARG NEXT_INTERNAL_PORT=3001
ENV NEXT_INTERNAL_PORT=${NEXT_INTERNAL_PORT}

# Copy application files
COPY ./ ./

# Copy built frontend from the frontend-builder stage
COPY --from=frontend-builder /app/frontend/nextjs/.next ./frontend/nextjs/.next
COPY --from=frontend-builder /app/frontend/nextjs/node_modules ./frontend/nextjs/node_modules
COPY --from=frontend-builder /app/frontend/nextjs/public ./frontend/nextjs/public
COPY --from=frontend-builder /app/frontend/nextjs/package.json ./frontend/nextjs/package.json
# Ensure next.config.mjs and other necessary files are present
COPY --from=frontend-builder /app/frontend/nextjs/next.config.mjs ./frontend/nextjs/next.config.mjs

# Create nginx configuration
RUN echo 'events {' > /etc/nginx/nginx.conf && \
  echo '    worker_connections 1024;' >> /etc/nginx/nginx.conf && \
  echo '}' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo 'http {' >> /etc/nginx/nginx.conf && \
  echo '    include /etc/nginx/mime.types;' >> /etc/nginx/nginx.conf && \
  echo '    default_type application/octet-stream;' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '    # Logging' >> /etc/nginx/nginx.conf && \
  echo '    access_log /var/log/nginx/access.log;' >> /etc/nginx/nginx.conf && \
  echo '    error_log /var/log/nginx/error.log;' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '    # Gzip compression' >> /etc/nginx/nginx.conf && \
  echo '    gzip on;' >> /etc/nginx/nginx.conf && \
  echo '    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '    # WebSocket support' >> /etc/nginx/nginx.conf && \
  echo '    map $http_upgrade $connection_upgrade {' >> /etc/nginx/nginx.conf && \
  echo '        default upgrade;' >> /etc/nginx/nginx.conf && \
  echo '        '"'"''"'"' close;' >> /etc/nginx/nginx.conf && \
  echo '    }' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '    server {' >> /etc/nginx/nginx.conf && \
  echo '        listen 3000;' >> /etc/nginx/nginx.conf && \
  echo '        server_name _;' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '        # Proxy backend routes to FastAPI server' >> /etc/nginx/nginx.conf && \
  echo '        location /outputs {' >> /etc/nginx/nginx.conf && \
  echo '            proxy_pass http://127.0.0.1:8000;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Host $host;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Real-IP $remote_addr;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-Proto $scheme;' >> /etc/nginx/nginx.conf && \
  echo '        }' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '        location /reports {' >> /etc/nginx/nginx.conf && \
  echo '            proxy_pass http://127.0.0.1:8000;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Host $host;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Real-IP $remote_addr;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-Proto $scheme;' >> /etc/nginx/nginx.conf && \
  echo '        }' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '        location /ws {' >> /etc/nginx/nginx.conf && \
  echo '            proxy_pass http://127.0.0.1:8000;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_http_version 1.1;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Upgrade $http_upgrade;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Connection $connection_upgrade;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Host $host;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Real-IP $remote_addr;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-Proto $scheme;' >> /etc/nginx/nginx.conf && \
  echo '        }' >> /etc/nginx/nginx.conf && \
  echo '' >> /etc/nginx/nginx.conf && \
  echo '        # Proxy all other requests to Next.js' >> /etc/nginx/nginx.conf && \
  echo '        location / {' >> /etc/nginx/nginx.conf && \
  echo '            proxy_pass http://127.0.0.1:3001;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header Host $host;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Real-IP $remote_addr;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;' >> /etc/nginx/nginx.conf && \
  echo '            proxy_set_header X-Forwarded-Proto $scheme;' >> /etc/nginx/nginx.conf && \
  echo '        }' >> /etc/nginx/nginx.conf && \
  echo '    }' >> /etc/nginx/nginx.conf && \
  echo '}' >> /etc/nginx/nginx.conf

# Create supervisord configuration
# Setting stdout_logfile_maxbytes and stderr_logfile_maxbytes to 0 disables log rotation,
# which is needed when logging to non-seekable targets such as /dev/stdout and /dev/stderr
RUN echo '[supervisord]' > /etc/supervisor/conf.d/supervisord.conf && \
  echo 'nodaemon=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'user=root' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'logfile=/dev/stdout' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '[program:backend]' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'command=uvicorn main:app --host %(ENV_HOST)s --port %(ENV_PORT)s' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'directory=/usr/src/app' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autostart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autorestart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile=/dev/stdout' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile=/dev/stderr' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '[program:frontend]' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'command=npm run start -- -p %(ENV_NEXT_INTERNAL_PORT)s' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'directory=/usr/src/app/frontend/nextjs' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autostart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autorestart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile=/dev/stdout' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile=/dev/stderr' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo '[program:nginx]' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'command=nginx -g "daemon off;"' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autostart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'autorestart=true' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile=/dev/stdout' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stdout_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile=/dev/stderr' >> /etc/supervisor/conf.d/supervisord.conf && \
  echo 'stderr_logfile_maxbytes=0' >> /etc/supervisor/conf.d/supervisord.conf

# Start supervisord to manage all three processes (backend, frontend, nginx)
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]


================================================
FILE: LICENSE
================================================
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: Procfile
================================================
web: python -m uvicorn backend.server.server:app --host=0.0.0.0 --port=${PORT}

================================================
FILE: README-ja_JP.md
================================================
<div align="center">
<!--<h1 style="display: flex; align-items: center; gap: 10px;">
  <img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/a45bac7c-092c-42e5-8eb6-69acbf20dde5" alt="Logo" width="25">
  GPT Researcher
</h1>-->
<img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/20af8286-b386-44a5-9a83-3be1365139c3" alt="Logo" width="80">


####

[![公式サイト](https://img.shields.io/badge/公式サイト-gptr.dev-blue?style=for-the-badge&logo=world&logoColor=white)](https://gptr.dev)
[![Documentation](https://img.shields.io/badge/Documentation-DOCS-f472b6?logo=googledocs&logoColor=white&style=for-the-badge)](https://docs.gptr.dev)
[![Discord Follow](https://img.shields.io/discord/1127851779011391548?style=for-the-badge&logo=discord&label=Chat%20on%20Discord)](https://discord.gg/QgZXvJAccX)

[![PyPI version](https://img.shields.io/pypi/v/gpt-researcher?logo=pypi&logoColor=white&style=flat)](https://badge.fury.io/py/gpt-researcher)
![GitHub Release](https://img.shields.io/github/v/release/assafelovic/gpt-researcher?style=flat&logo=github)
[![Open In Colab](https://img.shields.io/static/v1?message=Open%20in%20Colab&logo=googlecolab&labelColor=grey&color=yellow&label=%20&style=flat&logoSize=40)](https://colab.research.google.com/github/assafelovic/gpt-researcher/blob/master/docs/docs/examples/pip-run.ipynb)
[![Docker Image Version](https://img.shields.io/docker/v/elestio/gpt-researcher/latest?arch=amd64&style=flat&logo=docker&logoColor=white&color=1D63ED)](https://hub.docker.com/r/gptresearcher/gpt-researcher)
[![Twitter Follow](https://img.shields.io/twitter/follow/assaf_elovic?style=social)](https://twitter.com/assaf_elovic)

[English](README.md) |
[中文](README-zh_CN.md) |
[日本語](README-ja_JP.md) |
[한국어](README-ko_KR.md)
</div>

# 🔎 GPT Researcher

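**GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks.**

The agent can produce detailed, factual, and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism, and reliability, offering more stable performance and increased speed through parallelized agent work rather than synchronous operations.

**Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.**

## Why GPT Researcher?

- Forming objective conclusions for manual research tasks can take time, and finding the right resources and information sometimes takes weeks.
- Current LLMs are trained on past information and carry a high risk of hallucination, which makes them of little use for research tasks.
- Current LLMs are limited to short token outputs, which are insufficient for long, detailed research reports (2,000+ words).
- Services that enable web search (such as ChatGPT + Web plugin) consider only limited resources and content, which in some cases results in superficial and biased answers.
- Using only a selection of web sources can create bias when determining the right conclusions for research tasks.

## Architecture
The main idea is to run "planner" and "execution" agents: the planner generates questions to research, and the execution agents seek the most relevant information for each generated research question. Finally, the planner filters and aggregates all related information and creates a research report.<br /> <br /> 
The agents leverage both gpt-4o-mini and gpt-4o (128K context) to complete a research task, optimizing costs by using each only when necessary. **The average research task takes around 3 minutes to complete and costs about $0.1**.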

<div align="center">
<img align="center" height="500" src="https://cowriter-images.s3.amazonaws.com/architecture.png">
</div>


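More specifically:
* Create a domain-specific agent based on the research query or task.
* Generate a set of research questions that together form an objective opinion on the given task.
* For each research question, trigger a crawler agent that gathers information relevant to the given task from online resources.
* For each gathered resource, summarize the relevant information and keep track of its source.
* Finally, filter and aggregate all summarized sources and generate the final research report.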
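
The steps above can be sketched as a small planner/executor loop. This is a minimal illustration, not GPT Researcher's actual implementation: `plan_questions`, `gather`, and `write_report` are hypothetical stand-ins for the LLM and crawler calls, and the point is only to show how the per-question work can run concurrently with `asyncio.gather`.

```python
import asyncio


async def plan_questions(task: str) -> list[str]:
    # Hypothetical planner: a real one would ask an LLM for research questions.
    return [f"{task}: background", f"{task}: recent developments"]


async def gather(question: str) -> dict[str, str]:
    # Hypothetical executor: a real one would crawl and summarize web sources.
    await asyncio.sleep(0)  # stands in for network I/O
    return {
        "source": f"https://example.com/?q={len(question)}",
        "summary": f"Notes on {question}",
    }


def write_report(task: str, findings: list[dict[str, str]]) -> str:
    # Aggregate the summaries while keeping track of where each one came from.
    lines = [f"# Report: {task}"]
    lines += [f"- {item['summary']} ({item['source']})" for item in findings]
    return "\n".join(lines)


async def research(task: str) -> str:
    questions = await plan_questions(task)
    # Run one executor per question concurrently rather than one after another.
    findings = await asyncio.gather(*(gather(q) for q in questions))
    return write_report(task, list(findings))


if __name__ == "__main__":
    print(asyncio.run(research("solid-state batteries")))
```

Because each executor spends most of its time waiting on I/O, running them concurrently is where the speed-up over a strictly sequential loop would come from.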

## Demo
https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda

## Tutorials
 - [How it Works](https://docs.gptr.dev/blog/building-gpt-researcher)
 - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea)
 - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8)

## Features
- 📝 Generate research, outline, resource, and lesson reports
- 🌐 Aggregate over 20 web sources per research to form objective and factual conclusions
- 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
- 🔍 Scrape web sources with JavaScript support
- 📂 Keep track of the context of visited and used web sources
- 📄 Export research reports to PDF, Word, and more

## 📖 Documentation

See [here](https://docs.gptr.dev/docs/gpt-researcher/getting-started/introduction) for the full documentation:

- Getting started (installation, environment setup, simple examples)
- How-to examples (demos, integrations, docker support)
- Reference (full API docs)
- Tavily API integration (high-level explanation of core concepts)

## Quickstart
> **Step 0** - Install Python 3.11 or later. See [here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide.

<br />

> **Step 1** - Download the project

```bash
$ git clone https://github.com/assafelovic/gpt-researcher.git
$ cd gpt-researcher
```

<br />

> **Step 2** - Install dependencies
```bash
$ pip install -r requirements.txt
```
<br />

> **Step 3** - Create a .env file with your OpenAI key and Tavily API key, or export them directly

```bash
$ export OPENAI_API_KEY={Your OpenAI API Key here}
```
```bash
$ export TAVILY_API_KEY={Your Tavily API Key here}
```

(Optional) For enhanced tracing and observability, you can also set the following:

```bash
# $ export LANGCHAIN_TRACING_V2=true
# $ export LANGCHAIN_API_KEY={Your LangChain API Key here}
```

- **For the LLM, we recommend [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**, but you can use any other LLM (including open source) supported by the [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai); simply change the llm model and provider in config/config.py. Follow [this guide](https://python.langchain.com/docs/integrations/llms/) to learn how to integrate LLMs with Langchain.
- **For the search engine, we recommend the [Tavily Search API](https://app.tavily.com) (optimized for LLMs)**, but you can also choose another search engine by changing the search provider in config/config.py to "duckduckgo", "googleAPI", "googleSerp", "searchapi", or "searx". Then add the corresponding env API key to the config.py file.
- **For best performance, we highly recommend using [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) models and the [Tavily Search API](https://app.tavily.com).**
<br />
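
Before starting the server, it can help to confirm that both keys from Step 3 are actually visible to the process. The following is a minimal sketch that assumes only the two variable names used above; `missing_keys` is a hypothetical helper and not part of GPT Researcher.

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All required API keys are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.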

> **Step 4** - Run the agent with FastAPI

```bash
$ uvicorn main:app --reload
```
<br />

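> **Step 5** - Go to http://localhost:8000 in any browser and enjoy researching!

To learn how to use Docker or to learn more about the features and services, see the [documentation](https://docs.gptr.dev) page.

## 🔍 Observability

GPT Researcher supports **LangSmith**, which improves tracing and observability for complex multi-agent workflows and makes debugging and optimization easier.

To enable tracing:
1. Set the following environment variables: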
   ```bash
   export LANGCHAIN_TRACING_V2=true
   export LANGCHAIN_API_KEY=your_api_key
   export LANGCHAIN_PROJECT="gpt-researcher"
   ```
2. Run your research tasks as usual. Interactions between LangGraph-based agents are traced automatically and visualized in the LangSmith dashboard.
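
The same variables can also be set from Python before the research code runs, which is convenient in notebooks. This is a sketch under the assumption that the variables are read from the process environment at run time; `enable_tracing` is a hypothetical helper, and only the three variable names come from the steps above.

```python
import os


def enable_tracing(api_key: str, project: str = "gpt-researcher") -> None:
    """Set the LangSmith variables listed above for the current process."""
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = api_key
    os.environ["LANGCHAIN_PROJECT"] = project
```

Calling `enable_tracing("your_api_key")` before creating a researcher only affects the current process and its children, so nothing persists after the interpreter exits.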

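## 🚀 Contributing
We highly welcome contributions! If you are interested, please see [Contributing](CONTRIBUTING.md).

Please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and reach out to us via our [Discord community](https://discord.gg/QgZXvJAccX) if you are interested in joining our mission.

## ✉️ Support / Contact us
- [Community discussions](https://discord.gg/spBgZmm3Xe)
- Our email: support@tavily.com

## 🛡 Disclaimer

This project, GPT Researcher, is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing the code for academic purposes under the Apache 2.0 license. Nothing herein is academic advice, and it is not recommended for use in academic or research papers.

Our view on objective research claims:
1. The main goal of our scraping system is to reduce incorrect facts. How do we address this? The more sites we scrape, the lower the chance of incorrect data. We gather 20 pieces of information per research, and the chance that they are all wrong is extremely low.
2. Our goal is not to eliminate bias but to reduce it as much as possible. **We are here as a community to explore the most effective human-machine interactions.**
3. In the research process, people also tend to be biased, since they already hold opinions on the topics they research. This tool gathers many opinions and evenly explains diverse views that a biased person would never have read.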
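
The first point can be made concrete with a little arithmetic. Under the strong simplifying assumption that each source is wrong independently with the same probability `p` (real web sources often copy one another, so this is optimistic), the chance that all `n` sources are wrong is `p ** n`. The value `p = 0.3` below is purely illustrative and does not come from any measurement.

```python
def prob_all_wrong(p: float, n: int) -> float:
    """Probability that n independent sources are all wrong, each with error rate p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return p ** n


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(f"n={n:2d}: {prob_all_wrong(0.3, n):.3e}")
```

Even with a 30% per-source error rate, the result for 20 sources is on the order of 10^-11, although correlated sources would make the true figure considerably higher.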

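**Please note that using the GPT-4 language model can be expensive due to token usage.** By using this project, you acknowledge that you are responsible for monitoring and managing your own token usage and the associated costs. We highly recommend checking your OpenAI API usage regularly and setting up any necessary limits or alerts to prevent unexpected charges.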

---

<p align="center">
<a href="https://star-history.com/#assafelovic/gpt-researcher">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
  </picture>
</a>
</p>


================================================
FILE: README-ko_KR.md
================================================
<div align="center">
<!--<h1 style="display: flex; align-items: center; gap: 10px;">
  <img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/a45bac7c-092c-42e5-8eb6-69acbf20dde5" alt="Logo" width="25">
  GPT Researcher
</h1>-->
<img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/20af8286-b386-44a5-9a83-3be1365139c3" alt="Logo" width="80">


####

[![Website](https://img.shields.io/badge/Official%20Website-gptr.dev-teal?style=for-the-badge&logo=world&logoColor=white&color=0891b2)](https://gptr.dev)
[![Documentation](https://img.shields.io/badge/Documentation-DOCS-f472b6?logo=googledocs&logoColor=white&style=for-the-badge)](https://docs.gptr.dev)
[![Discord Follow](https://img.shields.io/discord/1127851779011391548?style=for-the-badge&logo=discord&label=Chat%20on%20Discord)](https://discord.gg/QgZXvJAccX)

[![PyPI version](https://img.shields.io/pypi/v/gpt-researcher?logo=pypi&logoColor=white&style=flat)](https://badge.fury.io/py/gpt-researcher)
![GitHub Release](https://img.shields.io/github/v/release/assafelovic/gpt-researcher?style=flat&logo=github)
[![Open In Colab](https://img.shields.io/static/v1?message=Open%20in%20Colab&logo=googlecolab&labelColor=grey&color=yellow&label=%20&style=flat&logoSize=40)](https://colab.research.google.com/github/assafelovic/gpt-researcher/blob/master/docs/docs/examples/pip-run.ipynb)
[![Docker Image Version](https://img.shields.io/docker/v/elestio/gpt-researcher/latest?arch=amd64&style=flat&logo=docker&logoColor=white&color=1D63ED)](https://hub.docker.com/r/gptresearcher/gpt-researcher)
[![Twitter Follow](https://img.shields.io/twitter/follow/assaf_elovic?style=social)](https://twitter.com/assaf_elovic)

[English](README.md) |
[中文](README-zh_CN.md) |
[日本語](README-ja_JP.md) |
[한국어](README-ko_KR.md)
</div>

# 🔎 GPT Researcher

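**GPT Researcher is an autonomous agent designed to perform comprehensive online research on a variety of tasks.**

The agent can produce detailed, factual, and unbiased research reports, with customization options for focusing on relevant resources and outlines. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses misinformation, speed, determinism, and reliability issues, offering more stable and faster performance through parallelized agent work rather than synchronous operations.

**Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.**

## Why GPT Researcher?

- Forming objective conclusions through manual research takes a long time, and finding the right resources and information can take weeks.
- Current large language models (LLMs) are trained on past information and carry a high risk of hallucination, which makes them unsuitable for research tasks.
- Current LLMs are limited to short token outputs, which are not sufficient for writing long, detailed research reports of 2,000 words or more.
- Services that support web search (such as ChatGPT or Perplexity) consider only limited resources and content, which in some cases leads to superficial and biased answers.
- Using only web sources can introduce bias when drawing the right conclusions for research tasks.

## Demo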
https://github.com/user-attachments/assets/092e9e71-7e27-475d-8c4f-9dddd28934a3

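## Architecture
The main idea is to run "planner" and "execution" agents: the planner generates questions to research, and the execution agents seek the most relevant information for each generated research question. Finally, the planner filters and aggregates all related information and creates a research report.
<br /> <br /> 
The agents leverage both `gpt-4o-mini` and `gpt-4o` (128K context) to complete a research task, optimizing costs by using each only when necessary. **The average research task takes around 2 minutes and costs about $0.005.**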

<div align="center">
<img align="center" height="600" src="https://github.com/assafelovic/gpt-researcher/assets/13554167/4ac896fd-63ab-4b77-9688-ff62aafcc527">
</div>

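More specifically:
* Create a domain-specific agent based on the research query or task.
* Generate a set of research questions that together form an objective opinion on the given task.
* For each research question, run a crawler agent that gathers information relevant to the task from online resources.
* For each gathered resource, summarize the relevant information and record its source.
* Finally, filter and aggregate all summarized information and generate the final research report.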

## Tutorials
 - [How it Works](https://docs.gptr.dev/blog/building-gpt-researcher)
 - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea)
 - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8)


## Features
- 📝 Generate research, outline, resource, and lesson reports using local documents and web sources
- 📜 Can generate long, detailed research reports of over 2,000 words
- 🌐 Aggregate over 20 web sources per research to form objective and factual conclusions
- 🖥️ Includes both a lightweight HTML/CSS/JS and a production-ready (NextJS + Tailwind) UX/UI
- 🔍 Scrape web sources with JavaScript support
- 📂 Track and maintain context and memory throughout the research process
- 📄 Export research reports to PDF, Word, and more

## 📖 Documentation

See [here](https://docs.gptr.dev/docs/gpt-researcher/getting-started) for the full documentation:

- Getting started (installation, environment setup, simple examples)
- Customization and configuration
- How-to examples (demos, integrations, docker support)
- Reference (full API docs)

## ⚙️ Getting Started
### Installation
> **Step 1** - Install Python 3.11 or later. See [here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide.

> **Step 2** - Download the project and navigate to its directory.

```bash
git clone https://github.com/assafelovic/gpt-researcher.git
cd gpt-researcher
```

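> **Step 3** - Set up your API keys in one of two ways: export them directly or store them in a `.env` file.

For a temporary setup in a bash-compatible shell (Linux, macOS, or Windows via WSL or Git Bash), use the export method:

```bash
export OPENAI_API_KEY={Your OpenAI API Key here}
export TAVILY_API_KEY={Your Tavily API Key here}
```

(Optional) For enhanced tracing and observability, you can also set the following:

```bash
# export LANGCHAIN_TRACING_V2=true
# export LANGCHAIN_API_KEY={Your LangChain API Key here}
```

For a more permanent setup, create a `.env` file in the current `gpt-researcher` directory and enter the environment variables there (without `export`).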
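
A `.env` file is just a list of `KEY=value` lines. The sketch below shows one way such a file could be read with the standard library alone; it is a simplified illustration (no multi-line values or variable interpolation), and `parse_env` is a hypothetical helper rather than the loader this project uses. In practice a package such as `python-dotenv` is a common choice for this.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=sk-example\n# comment\nTAVILY_API_KEY='tvly-example'\n"
    print(parse_env(sample))
```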

- The default LLM is [GPT](https://platform.openai.com/docs/guides/gpt), but you can use other LLMs such as `claude`, `ollama3`, `gemini`, `mistral`, and more. To learn how to change the LLM provider, see the [LLMs documentation](https://docs.gptr.dev/docs/gpt-researcher/llms). Note that this project is optimized for OpenAI GPT models.
- The default retriever is [Tavily](https://app.tavily.com), but you can use other retrievers such as `duckduckgo`, `google`, `bing`, `searchapi`, `serper`, `searx`, `arxiv`, `exa`, and more. To learn how to change the search provider, see the [retrievers documentation](https://docs.gptr.dev/docs/gpt-researcher/retrievers).

### Quickstart

> **Step 1** - Install dependencies

```bash
pip install -r requirements.txt
```

> **Step 2** - Run the agent with FastAPI

```bash
python -m uvicorn main:app --reload
```

> **Step 3** - Go to http://localhost:8000 in any browser and enjoy researching!

<br />

**To learn about [Poetry](https://docs.gptr.dev/docs/gpt-researcher/getting-started#poetry) or a [virtual environment](https://docs.gptr.dev/docs/gpt-researcher/getting-started/getting-started#virtual-environment), see the [documentation](https://docs.gptr.dev/docs/gpt-researcher/getting-started/getting-started).**

### Run as a PIP package
```bash
pip install gpt-researcher
```

```python
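import asyncio

from gpt_researcher import GPTResearcher


async def main() -> str:
    query = "Why is Nvidia stock going up?"
    researcher = GPTResearcher(query=query, report_type="research_report")
    # Conduct research on the given query
    research_result = await researcher.conduct_research()
    # Write the report
    return await researcher.write_report()


if __name__ == "__main__":
    # `await` is only valid inside a coroutine, so drive it with asyncio.run
    print(asyncio.run(main()))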
```

**For more examples and configurations, see the [PIP documentation](https://docs.gptr.dev/docs/gpt-researcher/gptr/pip-package).**

## Run with Docker

> **Step 1** - [Install Docker](https://docs.gptr.dev/docs/gpt-researcher/getting-started/getting-started-with-docker)

> **Step 2** - Copy the `.env.example` file, add your API keys, and save the file as `.env`.

> **Step 3** - In the docker-compose file, comment out the services you don't want to run.

```bash
$ docker-compose up --build
```

> **Step 4** - If you haven't commented out anything in the docker-compose file, two processes start by default:
 - the Python server running on localhost:8000<br>
 - the React app running on localhost:3000<br>

Go to localhost:3000 in any browser and enjoy researching!
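
To confirm that both services came up, one option is to probe the two ports from the host. This is a minimal sketch using only the standard library; `port_open` is a hypothetical helper, and the ports are the defaults named above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in (("Python server", 8000), ("React app", 3000)):
        state = "up" if port_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the application behind it is healthy.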

## 🔍 Observability

GPT Researcher supports **LangSmith**, which improves tracing and observability for complex multi-agent workflows and makes debugging and optimization easier.

To enable tracing:
1. Set the following environment variables:
   ```bash
   export LANGCHAIN_TRACING_V2=true
   export LANGCHAIN_API_KEY=your_api_key
   export LANGCHAIN_PROJECT="gpt-researcher"
   ```
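2. Run your research tasks as usual. All LangGraph-based agent interactions are traced automatically and visualized in the LangSmith dashboard.

## 📄 Research on local documents

You can use GPT Researcher to perform research tasks based on your local documents. Currently supported file formats are PDF, plain text, CSV, Excel, Markdown, PowerPoint, and Word documents.

Step 1: Set the `DOC_PATH` environment variable to point to the folder where your documents are located.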

```bash
export DOC_PATH="./my-docs"
```

Step 2:
 - If you are running the frontend app on localhost:8000, select "My Documents" from the "Report Source" dropdown options.
 - If you are running GPT Researcher as a [PIP package](https://docs.tavily.com/guides/gpt-researcher/gpt-researcher#pip-package), instantiate the `GPTResearcher` class with the `report_source` argument set to "local". See the [code example](https://docs.gptr.dev/docs/gpt-researcher/context/tailored-research).
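
For the PIP-package route, the call might look like the sketch below. It assumes the `gpt-researcher` package is installed and that `DOC_PATH` is read from the environment as described in Step 1; `local_research_kwargs` and `research_local_docs` are hypothetical helper names, while `report_source="local"` is the argument named above.

```python
import asyncio
import os


def local_research_kwargs(query: str) -> dict[str, str]:
    """Constructor arguments for a research run over local documents."""
    return {
        "query": query,
        "report_type": "research_report",
        "report_source": "local",
    }


async def research_local_docs(query: str) -> str:
    # Imported lazily so this module still loads where the package is absent.
    from gpt_researcher import GPTResearcher

    researcher = GPTResearcher(**local_research_kwargs(query))
    await researcher.conduct_research()
    return await researcher.write_report()


if __name__ == "__main__" and os.environ.get("DOC_PATH"):
    # Only runs when DOC_PATH points at a document folder.
    print(asyncio.run(research_local_docs("Summarize the key findings in my documents")))
```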

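## 👪 Multi-Agent Assistant

As AI evolves from prompt engineering and RAG to multi-agent systems, we are introducing our new multi-agent assistant built with [LangGraph](https://python.langchain.com/v0.1/docs/langgraph/).

By using LangGraph, the research process can be significantly improved in depth and quality by leveraging multiple agents with specialized skills. Inspired by the recent [STORM](https://arxiv.org/abs/2402.14207) paper, this project showcases how a team of AI agents can work together to conduct research on a given topic, from planning to publication.

An average run generates a 5-6 page research report in PDF, Docx, and Markdown formats.

Check it out [here](https://github.com/assafelovic/gpt-researcher/tree/master/multi_agents) or head over to our [documentation](https://docs.gptr.dev/docs/gpt-researcher/multi_agents/langgraph) for more information.

## 🖥️ Frontend Applications

GPT-Researcher offers an enhanced frontend to improve the user experience and streamline the research process. The frontend provides:

- An intuitive interface for entering research queries
- Real-time progress tracking of research tasks
- Interactive display of research findings
- Customizable settings for a tailored research experience

Two deployment options are available:
1. A lightweight static frontend served by FastAPI
2. A feature-rich NextJS application for advanced functionality

For detailed setup instructions and more information about the frontend's features, see our [documentation page](https://docs.gptr.dev/docs/gpt-researcher/frontend/introduction).

## 🚀 Contributing
We highly welcome contributions! If you are interested, please check out the [contributing guide](https://github.com/assafelovic/gpt-researcher/blob/master/CONTRIBUTING.md).

Please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and join our [Discord community](https://discord.gg/QgZXvJAccX) if you are interested in joining our mission.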
<a href="https://github.com/assafelovic/gpt-researcher/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=assafelovic/gpt-researcher" />
</a>

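## ✉️ Support / Contact us
- [Community Discord](https://discord.gg/spBgZmm3Xe)
- Author email: assaf.elovic@gmail.com

## 🛡️ Disclaimer

This project, GPT Researcher, is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing the code for academic purposes under the Apache 2 license. Nothing herein is academic advice, and it is not recommended for use in academic or research papers.

Our view on unbiased research claims:
1. The main goal of GPT Researcher is to reduce incorrect and biased facts. How? We assume that the more sites we scrape, the lower the chance of incorrect data. By scraping information from multiple sites and choosing the most frequent information, the chance that it is all wrong is extremely low.
2. We do not aim to eliminate bias entirely; we aim to reduce it as much as possible. **We are here as a community to figure out the most effective human/LLM interactions.**
3. In research, people also tend toward bias, since most already hold opinions on the topics they research. This tool scrapes many opinions and evenly explains diverse views that a biased person would never have read.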

**GPT-4 모델을 사용할 경우, 토큰 사용량 때문에 비용이 많이 들 수 있습니다.** 이 프로젝트를 사용하는 경우, 자신의 토큰 사용량 및 관련 비용을 모니터링하고 관리하는 것은 본인의 책임입니다. OpenAI API 사용량을 정기적으로 확인하고, 예상치 못한 비용을 방지하기 위해 필요한 한도를 설정하거나 알림을 설정하는 것이 좋습니다.


---

<p align="center">
<a href="https://star-history.com/#assafelovic/gpt-researcher">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
  </picture>
</a>
</p>


================================================
FILE: README-zh_CN.md
================================================
<div align="center">
<!--<h1 style="display: flex; align-items: center; gap: 10px;">
  <img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/a45bac7c-092c-42e5-8eb6-69acbf20dde5" alt="Logo" width="25">
  GPT Researcher
</h1>-->
<img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/20af8286-b386-44a5-9a83-3be1365139c3" alt="Logo" width="80">


####

[![Website](https://img.shields.io/badge/Official%20Website-gptr.dev-teal?style=for-the-badge&logo=world&logoColor=white&color=0891b2)](https://gptr.dev)
[![Documentation](https://img.shields.io/badge/Documentation-DOCS-f472b6?logo=googledocs&logoColor=white&style=for-the-badge)](https://docs.gptr.dev)
[![Discord Follow](https://img.shields.io/discord/1127851779011391548?style=for-the-badge&logo=discord&label=Chat%20on%20Discord)](https://discord.gg/QgZXvJAccX)

[![PyPI version](https://img.shields.io/pypi/v/gpt-researcher?logo=pypi&logoColor=white&style=flat)](https://badge.fury.io/py/gpt-researcher)
![GitHub Release](https://img.shields.io/github/v/release/assafelovic/gpt-researcher?style=flat&logo=github)
[![Open In Colab](https://img.shields.io/static/v1?message=Open%20in%20Colab&logo=googlecolab&labelColor=grey&color=yellow&label=%20&style=flat&logoSize=40)](https://colab.research.google.com/github/assafelovic/gpt-researcher/blob/master/docs/docs/examples/pip-run.ipynb)
[![Docker Image Version](https://img.shields.io/docker/v/elestio/gpt-researcher/latest?arch=amd64&style=flat&logo=docker&logoColor=white&color=1D63ED)](https://hub.docker.com/r/gptresearcher/gpt-researcher)
[![Twitter Follow](https://img.shields.io/twitter/follow/assaf_elovic?style=social)](https://twitter.com/assaf_elovic)

[English](README.md) |
[中文](README-zh_CN.md) |
[日本語](README-ja_JP.md) |
[한국어](README-ko_KR.md)
</div>

# 🔎 GPT Researcher

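**GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks.**

The agent can produce detailed, formal, and objective research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism, and reliability, offering more stable performance and increased speed through parallelized agent work rather than synchronous operations.

**Our mission is to empower individuals and organizations with accurate, objective, and factual information by leveraging the power of AI.**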

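## Why GPT Researcher?

- Forming objective conclusions through manual research takes time and effort, sometimes weeks, to find the right resources and information.
- Current LLMs are trained on past and outdated information and carry a serious risk of hallucination, which makes them almost unsuited to research tasks.
- Solutions that enable web search (such as ChatGPT + Web Plugin) only consider limited resources and content, which in some cases results in superficial conclusions or biased answers.
- Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks.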

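## Architecture
The main idea is to run "**planner**" and "**execution**" agents: the **planner** generates questions to research, and the "**execution**" agents seek the most relevant information for each generated research question. Finally, the "**planner**" filters and aggregates all related information and creates a research report.<br /> <br /> 
The agents leverage both gpt-4o-mini and gpt-4o (128K context) to complete a research task. We optimize for costs by using each only when necessary. **The average research task takes around 3 minutes to complete and costs about $0.1.**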

<div align="center">
<img align="center" height="500" src="https://cowriter-images.s3.amazonaws.com/architecture.png">
</div>


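More specifically:
* Create a domain-specific agent based on the research query or task.
* Generate a set of research questions that together form an objective opinion on any given task.
* For each research question, trigger a crawler agent that searches online resources for information relevant to the given task.
* For each scraped resource, summarize the relevant information and keep track of its source.
* Finally, filter and aggregate all summarized sources and generate a final research report.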

## Demo
https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda

## Tutorials
 - [How it Works](https://docs.gptr.dev/blog/building-gpt-researcher)
 - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea)
 - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8)

## Features
- 📝 Generate research questions, outlines, resources, and research reports
- 🌐 Aggregates over 20 web sources per research to form objective and factual conclusions
- 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
- 🔍 Scrapes web sources with JavaScript support
- 📂 Keeps track of visited and used web sources
- 📄 Export research reports to PDF and more...

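## 📖 Documentation

Please see [here](https://docs.gptr.dev/docs/gpt-researcher/getting-started/introduction) for full documentation:

- Getting started (installation, setting up the environment, simple examples)
- How-To examples (demos, integrations, docker support)
- Reference (full API docs)
- Tavily API integration (high-level explanation of core concepts)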

## Quickstart
> **Step 0** - Install Python 3.11 or later. [See here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide.

<br />

> **Step 1** - Download the project

```bash
$ git clone https://github.com/assafelovic/gpt-researcher.git
$ cd gpt-researcher
```

<br />

> **Step 2** - Install dependencies
```bash
$ pip install -r requirements.txt
```
<br />

> **Step 3** - Create a .env file with your OpenAI key and Tavily API key, or export them directly

```bash
$ export OPENAI_API_KEY={Your OpenAI API Key here}
```
```bash
$ export TAVILY_API_KEY={Your Tavily API Key here}
```

(Optional) To enable full tracing and observability, you can also set:

```bash
# $ export LANGCHAIN_TRACING_V2=true
# $ export LANGCHAIN_API_KEY={Your LangChain API Key here}
```

- **For the LLM, we recommend [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**, but you can use any other LLM (including open source) supported by the [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai); simply change the llm model and provider in config/config.py. Follow [this guide](https://python.langchain.com/docs/integrations/llms/) to learn how to integrate LLMs with Langchain.
- **For the search engine, we recommend the [Tavily Search API](https://app.tavily.com) (optimized for LLMs)**, but you can choose another search engine by changing the search provider in config/config.py to "duckduckgo", "googleAPI", "searchapi", "googleSerp", or "searx". Then add the corresponding env API key in the config.py file.
- **We highly recommend using [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) models and the [Tavily Search API](https://app.tavily.com) for optimal performance.**
<br />

> **Step 4** - Run the agent with FastAPI

```bash
$ uvicorn main:app --reload
```
<br />

> **Step 5** - Go to http://localhost:8000 in any browser and enjoy researching!

To learn how to get started with Docker, or to learn more about the features and services, see the [documentation](https://docs.gptr.dev) page.

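## 🔍 Observability

GPT Researcher supports **LangSmith** for enhanced tracing and observability, which is especially useful for debugging and optimizing complex multi-agent workflows.

To enable tracing:
1. Set the following environment variables:
   ```bash
   export LANGCHAIN_TRACING_V2=true
   export LANGCHAIN_API_KEY=your_api_key
   export LANGCHAIN_PROJECT="gpt-researcher"
   ```
2. Run your research tasks as usual. All LangGraph-based agent interactions will be traced automatically and can be visualized in your LangSmith dashboard.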

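## 🚀 Contributing
We highly welcome contributions! Please check out [contributing](CONTRIBUTING.md) if you're interested.

If you're interested in joining our mission, please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and reach out to us via our [Discord community](https://discord.gg/QgZXvJAccX).

## ✉️ Support / Contact us
- [Community Discord](https://discord.gg/spBgZmm3Xe)
- Our email: support@tavily.com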

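## 🛡 Disclaimer

This project, "GPT Researcher", is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing this code for academic purposes under the Apache 2 license. Nothing herein is academic advice, and it is not a recommendation to use in academic or research papers.

Our view on objective research claims:
1. The whole point of our scraping system is to reduce incorrect facts. How? The more sites we scrape, the lower the chance of incorrect data. We gather 20 pieces of information per research, and the chance that they are all wrong is extremely low.
2. We do not aim to eliminate bias; we aim to reduce it as much as possible. **We are here as a community to figure out the most effective human/LLM interactions.**
3. In research, people also tend toward bias, as most already have opinions on the topics they research. This tool gathers many opinions and evenly explains diverse views that a biased person would never have read.

**Please note that using the GPT-4 language model can be expensive due to its token usage.** By using this project, you acknowledge that you are responsible for monitoring and managing your own token usage and the associated costs. We highly recommend checking your OpenAI API usage regularly and setting up any necessary limits or alerts to prevent unexpected charges.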

---

<p align="center">
<a href="https://star-history.com/#assafelovic/gpt-researcher">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
  </picture>
</a>
</p>


================================================
FILE: README.md
================================================
<div align="center" id="top">

<img src="https://github.com/assafelovic/gpt-researcher/assets/13554167/20af8286-b386-44a5-9a83-3be1365139c3" alt="Logo" width="80">

####

[![Website](https://img.shields.io/badge/Official%20Website-gptr.dev-teal?style=for-the-badge&logo=world&logoColor=white&color=0891b2)](https://gptr.dev)
[![Documentation](https://img.shields.io/badge/Documentation-DOCS-f472b6?logo=googledocs&logoColor=white&style=for-the-badge)](https://docs.gptr.dev)
[![Discord](https://img.shields.io/discord/1127851779011391548?logo=discord&logoColor=white&label=Discord&color=34b76a&style=for-the-badge)](https://discord.gg/QgZXvJAccX)


[![PyPI version](https://img.shields.io/pypi/v/gpt-researcher?logo=pypi&logoColor=white&style=flat)](https://badge.fury.io/py/gpt-researcher)
![GitHub Release](https://img.shields.io/github/v/release/assafelovic/gpt-researcher?style=flat&logo=github)
[![Open In Colab](https://img.shields.io/static/v1?message=Open%20in%20Colab&logo=googlecolab&labelColor=grey&color=yellow&label=%20&style=flat&logoSize=40)](https://colab.research.google.com/github/assafelovic/gpt-researcher/blob/master/docs/docs/examples/pip-run.ipynb)
[![Docker Image Version](https://img.shields.io/docker/v/elestio/gpt-researcher/latest?arch=amd64&style=flat&logo=docker&logoColor=white&color=1D63ED)](https://hub.docker.com/r/gptresearcher/gpt-researcher)
[![Skill](https://img.shields.io/badge/Claude%20Skill-skills.sh-blueviolet?style=flat&logo=anthropic&logoColor=white)](https://skills.sh/assafelovic/gpt-researcher/gpt-researcher)
[![Twitter Follow](https://img.shields.io/twitter/follow/assaf_elovic?style=social)](https://twitter.com/assaf_elovic)

[English](README.md) | [中文](README-zh_CN.md) | [日本語](README-ja_JP.md) | [한국어](README-ko_KR.md)

</div>

# 🔎 GPT Researcher

**GPT Researcher is an open deep research agent designed for both web and local research on any given task.** 

The agent produces detailed, factual, and unbiased research reports with citations. GPT Researcher provides a full suite of customization options to create tailor-made, domain-specific research agents. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses misinformation, speed, determinism, and reliability by offering stable performance and increased speed through parallelized agent work.

**Our mission is to empower individuals and organizations with accurate, unbiased, and factual information through AI.**

## Why GPT Researcher?

- Objective conclusions for manual research can take weeks, requiring vast resources and time.
- LLMs trained on outdated information can hallucinate, becoming irrelevant for current research tasks.
- Current LLMs have token limitations, insufficient for generating long research reports.
- Limited web sources in existing services lead to misinformation and shallow results.
- Selective web sources can introduce bias into research tasks.

## Demo
<a href="https://www.youtube.com/watch?v=f60rlc_QCxE" target="_blank" rel="noopener">
  <img src="https://github.com/user-attachments/assets/ac2ec55f-b487-4b3f-ae6f-b8743ad296e4" alt="Demo video" width="800" target="_blank" />
</a>

## Install as Claude Skill

Extend Claude's deep research capabilities by installing GPT Researcher as a [Claude Skill](https://skills.sh/assafelovic/gpt-researcher/gpt-researcher):

```bash
npx skills add assafelovic/gpt-researcher
```

Once installed, Claude can leverage GPT Researcher's deep research capabilities directly within your conversations.

## Architecture

The core idea is to utilize 'planner' and 'execution' agents. The planner generates research questions, while the execution agents gather relevant information. The publisher then aggregates all findings into a comprehensive report.

<div align="center">
<img align="center" height="600" src="https://github.com/assafelovic/gpt-researcher/assets/13554167/4ac896fd-63ab-4b77-9688-ff62aafcc527">
</div>

Steps:
* Create a task-specific agent based on a research query.
* Generate questions that collectively form an objective opinion on the task.
* Use a crawler agent for gathering information for each question.
* Summarize and source-track each resource.
* Filter and aggregate summaries into a final research report.
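
The steps above can be sketched in a few lines of `asyncio`. This is a minimal illustration of the planner / execution / publisher split, not the project's actual implementation: `plan_questions`, `execute_question`, and `publish_report` are hypothetical names, and the stubbed bodies stand in for the LLM and scraping calls a real agent would make.

```python
import asyncio


async def plan_questions(query: str) -> list[str]:
    # Hypothetical planner: a real one would ask an LLM for sub-questions.
    return [f"{query}: background", f"{query}: recent developments"]


async def execute_question(question: str) -> dict:
    # Hypothetical execution agent: a real one would search, scrape, and summarize.
    await asyncio.sleep(0)  # stand-in for network I/O
    return {
        "question": question,
        "summary": f"findings for '{question}'",
        "source": "https://example.com",
    }


async def publish_report(query: str) -> str:
    questions = await plan_questions(query)
    # Execution agents run concurrently, one per research question.
    findings = await asyncio.gather(*(execute_question(q) for q in questions))
    lines = [f"# Report: {query}"]
    for item in findings:
        # Each summary keeps its source so the report stays traceable.
        lines.append(f"- {item['summary']} ({item['source']})")
    return "\n".join(lines)


report = asyncio.run(publish_report("solar power"))
```

Running the questions through `asyncio.gather` is what gives the parallel speed-up described above; the aggregation step only starts once every execution agent has returned.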

## Tutorials
 - [How it Works](https://docs.gptr.dev/blog/building-gpt-researcher)
 - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea)
 - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8)

## Features

- 📝 Generate detailed research reports using web and local documents.
- 🖼️ Smart image scraping and filtering for reports.
- 🍌 **AI-generated inline images** using Google Gemini (Nano Banana) for visual illustrations.
- 📜 Generate detailed reports exceeding 2,000 words.
- 🌐 Aggregate over 20 sources for objective conclusions.
- 🖥️ Frontend available in lightweight (HTML/CSS/JS) and production-ready (NextJS + Tailwind) versions.
- 🔍 JavaScript-enabled web scraping.
- 📂 Maintains memory and context throughout research.
- 📄 Export reports to PDF, Word, and other formats.

## 📖 Documentation

See the [Documentation](https://docs.gptr.dev/docs/gpt-researcher/getting-started) for:
- Installation and setup guides
- Configuration and customization options
- How-To examples
- Full API references

## ⚙️ Getting Started

### Installation

1. Install Python 3.11 or later. [Guide](https://www.tutorialsteacher.com/python/install-python).
2. Clone the project and navigate to the directory:

    ```bash
    git clone https://github.com/assafelovic/gpt-researcher.git
    cd gpt-researcher
    ```

3. Set up API keys by exporting them or storing them in a `.env` file.

    ```bash
    export OPENAI_API_KEY={Your OpenAI API Key here}
    export TAVILY_API_KEY={Your Tavily API Key here}
    ```

    (Optional) For enhanced tracing and observability, you can also set:
    
    ```bash
    # export LANGCHAIN_TRACING_V2=true
    # export LANGCHAIN_API_KEY={Your LangChain API Key here}
    ```

    For custom OpenAI-compatible APIs (e.g., local models, other providers), you can also set:
    
    ```bash
    export OPENAI_BASE_URL={Your custom API base URL here}
    ```

4. Install dependencies and start the server:

    ```bash
    pip install -r requirements.txt
    python -m uvicorn main:app --reload
    ```

Visit [http://localhost:8000](http://localhost:8000) to start.

For other setups (e.g., Poetry or virtual environments), check the [Getting Started page](https://docs.gptr.dev/docs/gpt-researcher/getting-started).

## Run as PIP package
```bash
pip install gpt-researcher

```
### Example Usage:
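```python
import asyncio

from gpt_researcher import GPTResearcher


async def main() -> str:
    query = "why is Nvidia stock going up?"
    researcher = GPTResearcher(query=query)
    # Conduct research on the given query
    research_result = await researcher.conduct_research()
    # Write the report
    report = await researcher.write_report()
    return report


if __name__ == "__main__":
    # `await` is only valid inside a coroutine, so drive it with asyncio.run
    print(asyncio.run(main()))
```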

**For more examples and configurations, please refer to the [PIP documentation](https://docs.gptr.dev/docs/gpt-researcher/gptr/pip-package) page.**

### 🔧 MCP Client
GPT Researcher supports MCP integration to connect with specialized data sources like GitHub repositories, databases, and custom APIs. This enables research from data sources alongside web search.

```bash
export RETRIEVER=tavily,mcp  # Enable hybrid web + MCP research
```

```python
from gpt_researcher import GPTResearcher
import asyncio
import os

async def mcp_research_example():
    # Enable MCP with web search
    os.environ["RETRIEVER"] = "tavily,mcp"
    
    researcher = GPTResearcher(
        query="What are the top open source web research agents?",
        mcp_configs=[
            {
                "name": "github",
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-github"],
                "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}
            }
        ]
    )
    
    research_result = await researcher.conduct_research()
    report = await researcher.write_report()
    return report


if __name__ == "__main__":
    print(asyncio.run(mcp_research_example()))
```

> For comprehensive MCP documentation and advanced examples, visit the [MCP Integration Guide](https://docs.gptr.dev/docs/gpt-researcher/retrievers/mcp-configs).

## 🍌 Inline Image Generation

GPT Researcher can automatically generate and embed AI-created illustrations in your research reports using Google's Gemini models (Nano Banana).

```bash
# Enable in your .env file
IMAGE_GENERATION_ENABLED=true
GOOGLE_API_KEY=your_google_api_key
IMAGE_GENERATION_MODEL=models/gemini-2.5-flash-image
```

When enabled, the system will:
1. Analyze your research context to identify visualization opportunities
2. Pre-generate 2-3 relevant images during the research phase
3. Embed them inline as the report is written

Images are generated with dark-mode styling that matches the GPT Researcher UI, featuring professional infographic aesthetics with teal accents.

[Learn more about Image Generation](https://docs.gptr.dev/docs/gpt-researcher/gptr/image_generation) in our documentation.

## ✨ Deep Research

GPT Researcher now includes Deep Research - an advanced recursive research workflow that explores topics with agentic depth and breadth. This feature employs a tree-like exploration pattern, diving deeper into subtopics while maintaining a comprehensive view of the research subject.

- 🌳 Tree-like exploration with configurable depth and breadth
- ⚡️ Concurrent processing for faster results
- 🤝 Smart context management across research branches
- ⏱️ Takes ~5 minutes per deep research
- 💰 Costs ~$0.4 per research (using `o3-mini` on "high" reasoning effort)

[Learn more about Deep Research](https://docs.gptr.dev/docs/gpt-researcher/gptr/deep_research) in our documentation.
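
To make the tree-like exploration concrete, here is a small sketch of a recursive walk with configurable depth and breadth and a cap on concurrent work. It is an illustration under stated assumptions, not the Deep Research implementation: `expand` and `deep_research` are hypothetical names, and `expand` fabricates placeholder subtopics where a real system would call an LLM.

```python
import asyncio


async def expand(topic: str, breadth: int) -> list[str]:
    # Hypothetical stand-in for the LLM call that proposes subtopics.
    return [f"{topic}/{i}" for i in range(breadth)]


async def deep_research(
    topic: str, depth: int, breadth: int, limit: asyncio.Semaphore
) -> list[str]:
    # The semaphore caps how many expansions run at the same time.
    async with limit:
        subtopics = await expand(topic, breadth) if depth > 0 else []
    visited = [topic]
    # Sibling branches are explored concurrently, one level deeper each.
    branches = await asyncio.gather(
        *(deep_research(sub, depth - 1, breadth, limit) for sub in subtopics)
    )
    for branch in branches:
        visited.extend(branch)
    return visited


async def main() -> list[str]:
    return await deep_research("root", depth=2, breadth=2, limit=asyncio.Semaphore(4))


visited = asyncio.run(main())
```

With `depth=2` and `breadth=2` the walk visits 1 + 2 + 4 = 7 topics, which shows how quickly cost grows with either setting.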

## Run with Docker

> **Step 1** - [Install Docker](https://docs.gptr.dev/docs/gpt-researcher/getting-started/getting-started-with-docker)

> **Step 2** - Copy the '.env.example' file, add your API keys to the copy, and save it as '.env'

> **Step 3** - Within the docker-compose file comment out services that you don't want to run with Docker.

```bash
docker-compose up --build
```

If that doesn't work, try running it without the dash:
```bash
docker compose up --build
```

> **Step 4** - By default, if you haven't uncommented anything in your docker-compose file, this flow will start 2 processes:
 - the Python server running on localhost:8000<br>
 - the React app running on localhost:3000<br>

Visit localhost:3000 on any browser and enjoy researching!


## 📄 Research on Local Documents

You can instruct GPT Researcher to run research tasks based on your local documents. Currently supported file formats are: PDF, plain text, CSV, Excel, Markdown, PowerPoint, and Word documents.

Step 1: Add the env variable `DOC_PATH` pointing to the folder where your documents are located.

```bash
export DOC_PATH="./my-docs"
```

Step 2: 
 - If you're running the frontend app on localhost:8000, simply select "My Documents" from the "Report Source" Dropdown Options.
 - If you're running GPT Researcher with the [PIP package](https://docs.tavily.com/guides/gpt-researcher/gpt-researcher#pip-package), pass the `report_source` argument as "local" when you instantiate the `GPTResearcher` class (see the [code sample](https://docs.gptr.dev/docs/gpt-researcher/context/tailored-research)).
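
For the PIP package route, the two steps can be combined roughly as follows. This is a minimal sketch: `local_research_kwargs` and `run_local_research` are hypothetical helper names, while `DOC_PATH`, `report_source="local"`, and the `GPTResearcher` methods are the ones described in this README. The package import is deferred so the helper can be exercised without the package installed.

```python
import os


def local_research_kwargs(query: str, doc_path: str) -> dict:
    # Step 1: point DOC_PATH at the folder that holds your documents.
    os.environ["DOC_PATH"] = doc_path
    # Step 2: ask for local documents as the report source.
    return {"query": query, "report_source": "local"}


async def run_local_research(query: str, doc_path: str) -> str:
    # Imported lazily; requires `pip install gpt-researcher` and API keys.
    from gpt_researcher import GPTResearcher

    researcher = GPTResearcher(**local_research_kwargs(query, doc_path))
    await researcher.conduct_research()
    return await researcher.write_report()
```

To run it, call `asyncio.run(run_local_research("Summarize my notes", "./my-docs"))` from a script.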


## 🤖 MCP Server

We've moved our MCP server to a dedicated repository: [gptr-mcp](https://github.com/assafelovic/gptr-mcp).

The GPT Researcher MCP Server enables AI applications like Claude to conduct deep research. While LLM apps can access web search tools with MCP, GPT Researcher MCP delivers deeper, more reliable research results.

Features:
- Deep research capabilities for AI assistants
- Higher quality information with optimized context usage
- Comprehensive results with better reasoning for LLMs
- Claude Desktop integration

For detailed installation and usage instructions, please visit the [official repository](https://github.com/assafelovic/gptr-mcp).


## 👪 Multi-Agent Assistant
As AI evolves from prompt engineering and RAG to multi-agent systems, we're excited to introduce multi-agent assistants built with [LangGraph](https://python.langchain.com/v0.1/docs/langgraph/) and [AG2](https://github.com/ag2ai/ag2).

By using multi-agent frameworks, the research process can be significantly improved in depth and quality by leveraging multiple agents with specialized skills. Inspired by the recent [STORM](https://arxiv.org/abs/2402.14207) paper, this project showcases how a team of AI agents can work together to conduct research on a given topic, from planning to publication.

An average run generates a 5-6 page research report in multiple formats such as PDF, Docx and Markdown.

Check it out [here](https://github.com/assafelovic/gpt-researcher/tree/master/multi_agents) or head over to our documentation for [LangGraph](https://docs.gptr.dev/docs/gpt-researcher/multi_agents/langgraph) and [AG2](https://docs.gptr.dev/docs/gpt-researcher/multi_agents/ag2) for more information.

## 🔍 Observability

GPT Researcher supports **LangSmith** for enhanced tracing and observability, making it easier to debug and optimize complex multi-agent workflows.

To enable tracing:
1. Set the following environment variables:
   ```bash
   export LANGCHAIN_TRACING_V2=true
   export LANGCHAIN_API_KEY=your_api_key
   export LANGCHAIN_PROJECT="gpt-researcher"
   ```
2. Run your research tasks as usual. All LangGraph-based agent interactions will be automatically traced and visualized in your LangSmith dashboard.
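
If you prefer to configure tracing from Python rather than the shell, the same variables can be set in-process before the research run starts. This is a small sketch; `enable_langsmith_tracing` is a hypothetical helper, and only the three variable names come from the steps above.

```python
import os


def enable_langsmith_tracing(api_key: str, project: str = "gpt-researcher") -> dict:
    # Mirror the three `export` lines shown above.
    settings = {
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_API_KEY": api_key,
        "LANGCHAIN_PROJECT": project,
    }
    os.environ.update(settings)
    return settings
```

Call it early in your script, before any research task is started, so the variables are already in place when the agents run.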

## 🖥️ Frontend Applications

GPT-Researcher now features an enhanced frontend to improve the user experience and streamline the research process. The frontend offers:

- An intuitive interface for inputting research queries
- Real-time progress tracking of research tasks
- Interactive display of research findings
- Customizable settings for tailored research experiences

Two deployment options are available:
1. A lightweight static frontend served by FastAPI
2. A feature-rich NextJS application for advanced functionality

For detailed setup instructions and more information about the frontend features, please visit our [documentation page](https://docs.gptr.dev/docs/gpt-researcher/frontend/introduction).

## 🚀 Contributing
We highly welcome contributions! Please check out [contributing](https://github.com/assafelovic/gpt-researcher/blob/master/CONTRIBUTING.md) if you're interested.

Please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and reach out to us via our [Discord community](https://discord.gg/QgZXvJAccX) if you're interested in joining our mission.
<a href="https://github.com/assafelovic/gpt-researcher/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=assafelovic/gpt-researcher&max=1000" />
</a>

## ✉️ Support / Contact us
- [Community Discord](https://discord.gg/spBgZmm3Xe)
- Author Email: assaf.elovic@gmail.com

## 🛡 Disclaimer

This project, GPT Researcher, is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing this code for academic purposes under the Apache 2 license. Nothing herein is academic advice, and it is NOT a recommendation to use in academic or research papers.

Our view on unbiased research claims:
1. The main goal of GPT Researcher is to reduce incorrect and biased facts. How? We assume that the more sites we scrape, the lower the chance of incorrect data. By scraping multiple sites per research and choosing the most frequent information, the chance that they are all wrong is extremely low.
2. We do not aim to eliminate biases; we aim to reduce them as much as possible. **We are here as a community to figure out the most effective human/LLM interactions.**
3. In research, people also tend towards biases, as most already have opinions on the topics they research. This tool scrapes many opinions and will evenly explain diverse views that a biased person would never have read.
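
The reasoning in point 1 can be made concrete with a little arithmetic. The sketch below rests on an assumption the text does not state: that sources err independently, each with the same probability. Real web sources often copy one another, so treat the number as an optimistic bound. Both function names are hypothetical.

```python
from collections import Counter


def all_wrong_probability(p_wrong: float, n_sources: int) -> float:
    # Assumes independent sources that are each wrong with probability p_wrong.
    return p_wrong ** n_sources


def most_frequent(claims: list[str]) -> str:
    # Pick the claim reported by the largest number of sources.
    return Counter(claims).most_common(1)[0][0]


# Even if every source were wrong half the time, 20 independent
# sources would all be wrong with probability 0.5 ** 20.
chance = all_wrong_probability(0.5, 20)
winner = most_frequent(["up 3%", "up 3%", "down 1%"])
```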

---

<p align="center">
<a href="https://star-history.com/#assafelovic/gpt-researcher">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=assafelovic/gpt-researcher&type=Date" />
  </picture>
</a>
</p>


<p align="right">
  <a href="#top">⬆️ Back to Top</a>
</p>


================================================
FILE: backend/Dockerfile
================================================
FROM python:3.11-slim

WORKDIR /app

# Copy requirements first to leverage Docker cache
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application
COPY . .

# Expose the port the app will run on
EXPOSE 8000

# Start the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"] 

================================================
FILE: backend/Procfile
================================================
web: uvicorn server.app:app --host 0.0.0.0 --port $PORT --workers 1 

================================================
FILE: backend/__init__.py
================================================


================================================
FILE: backend/chat/__init__.py
================================================


# Chat package initialization

================================================
FILE: backend/chat/chat.py
================================================
import logging
import os
import uuid
import json
from fastapi import WebSocket
from typing import List, Dict, Any

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import InMemoryVectorStore
from gpt_researcher.memory import Memory
from gpt_researcher.config.config import Config
from gpt_researcher.utils.llm import create_chat_completion
from gpt_researcher.utils.tools import create_chat_completion_with_tools, create_search_tool
from tavily import TavilyClient
from datetime import datetime

# Get logger instance
logger = logging.getLogger(__name__)

# Setup logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[
        logging.StreamHandler()  # Only log to console
    ]
)

# Note: LLM client is now handled through GPT Researcher's unified LLM system
# This supports all configured providers (OpenAI, Google Gemini, Anthropic, etc.)

def get_tools():
    """Define tools for LLM function calling (primarily for OpenAI-compatible providers)"""
    tools = [
        {
            "type": "function",
            "function": {
                "name": "quick_search",
                "description": "Search for current events or online information when you need new knowledge that doesn't exist in the current context",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query"
                        }
                    },
                    "required": ["query"]
                }
            }
        }
    ]
    return tools

class ChatAgentWithMemory:
    def __init__(
        self,
        report: str,
        config_path="default",
        headers=None,
        vector_store=None
    ):
        self.report = report
        self.headers = headers
        self.config = Config(config_path)
        self.vector_store = vector_store
        self.retriever = None
        self.search_metadata = None
        
        # Initialize Tavily client (optional - only if API key is available)
        tavily_api_key = os.environ.get("TAVILY_API_KEY")
        if tavily_api_key:
            self.tavily_client = TavilyClient(api_key=tavily_api_key)
        else:
            self.tavily_client = None
            logger.warning("TAVILY_API_KEY not set - web search in chat will be disabled")
        
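        # Process document and create vector store if not provided.
        # NOTE: the trailing `and False` makes this condition always false, so
        # _setup_vector_store() is never called from here as written.
        if not self.vector_store and False:
            self._setup_vector_store()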
    
    def _setup_vector_store(self):
        """Setup vector store for document retrieval"""
        # Process document into chunks
        documents = self._process_document(self.report)
        
        # Create unique thread ID
        self.thread_id = str(uuid.uuid4())
        
        # Setup embeddings and vector store
        cfg = Config()
        self.embedding = Memory(
            cfg.embedding_provider,
            cfg.embedding_model,
            **cfg.embedding_kwargs
        ).get_embeddings()
        
        # Create vector store and retriever
        self.vector_store = InMemoryVectorStore(self.embedding)
        self.vector_store.add_texts(documents)
        self.retriever = self.vector_store.as_retriever(search_kwargs={"k": 4})
        
    def _process_document(self, report):
        """Split Report into Chunks"""
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1024,
            chunk_overlap=20,
            length_function=len,
            is_separator_regex=False,
        )
        documents = text_splitter.split_text(report)
        return documents

    def quick_search(self, query):
        """Perform a web search for current information using Tavily"""
        try:
            # Check if Tavily client is available
            if self.tavily_client is None:
                logger.warning(f"Tavily client not available, skipping web search for: {query}")
                self.search_metadata = {
                    "query": query,
                    "sources": [],
                    "error": "Web search is disabled - TAVILY_API_KEY not configured"
                }
                return {
                    "error": "Web search is disabled - TAVILY_API_KEY not configured",
                    "results": []
                }
            
            logger.info(f"Performing web search for: {query}")
            results = self.tavily_client.search(query=query, max_results=5)
            
            # Store search metadata for frontend
            self.search_metadata = {
                "query": query,
                "sources": [
                    {"title": result.get("title", ""), 
                     "url": result.get("url", ""),
                     "content": result.get("content", "")[:200] + "..." if len(result.get("content", "")) > 200 else result.get("content", "")}
                    for result in results.get("results", [])
                ]
            }
            
            return results
        except Exception as e:
            logger.error(f"Error performing web search: {str(e)}", exc_info=True)
            return {
                "error": str(e),
                "results": []
            }


    async def process_chat_completion(self, messages: List[Dict[str, str]]):
        """Process chat completion using configured LLM provider with tool calling support"""
        # Create a search tool using the utility function
        search_tool = create_search_tool(self.quick_search)
        
        # Use the tool-enabled chat completion utility
        response, tool_calls_metadata = await create_chat_completion_with_tools(
            messages=messages,
            tools=[search_tool],
            model=self.config.smart_llm_model,
            llm_provider=self.config.smart_llm_provider,
            llm_kwargs=self.config.llm_kwargs,
        )
        
        # Process metadata to match the expected format for the chat system
        processed_metadata = []
        for metadata in tool_calls_metadata:
            if metadata.get("tool") == "search_tool":
                # Extract query from args
                query = metadata.get("args", {}).get("query", "")
                
                # Trigger search again to get metadata (the search was already executed by LangChain)
                if query:
                    self.quick_search(query)  # This populates self.search_metadata
                    
                processed_metadata.append({
                    "tool": "quick_search",
                    "query": query,
                    "search_metadata": self.search_metadata
                })
        
        return response, processed_metadata


    async def chat(self, messages, websocket=None):
        """Chat with configured LLM provider (supports OpenAI, Google Gemini, Anthropic, etc.)
        
        Args:
            messages: List of chat messages with role and content
            websocket: Optional websocket for streaming responses
        
        Returns:
            tuple: (str: The AI response message, list: metadata about each tool call)
        """
        try:
            
            # Format system prompt with the report context
            system_prompt = f"""
            You are GPT Researcher, an autonomous research agent created by an open source community at https://github.com/assafelovic/gpt-researcher, homepage: https://gptr.dev. 
            To learn more about GPT Researcher you can suggest to check out: https://docs.gptr.dev.
            
            This is a chat about a research report that you created. Answer based on the given context and report.
            You must include citations to your answer based on the report.
            
            You may use the quick_search tool when the user asks about information that might require current data 
            not found in the report, such as recent events, updated statistics, or news. If there's no report available,
            you can use the quick_search tool to find information online.
            
            You must respond in markdown format. You must make it readable with paragraphs, tables, etc when possible. 
            Remember that you're answering in a chat not a report.
            
            Assume the current time is: {datetime.now()}.
            
            Report: {self.report}
            
            """
            
            # Format message history for OpenAI input
            formatted_messages = []
            
            # Add system message first
            formatted_messages.append({
                "role": "system", 
                "content": system_prompt
            })
            
            # Add user/assistant message history - filter out non-essential fields
            for msg in messages:
                if 'role' in msg and 'content' in msg:
                    formatted_messages.append({
                        "role": msg["role"],
                        "content": msg["content"]
                    })
                else:
                    logger.warning(f"Skipping message with missing role or content: {msg}")
            
            # Process the chat using configured LLM provider
            ai_message, tool_calls_metadata = await self.process_chat_completion(formatted_messages)
            
            # Provide fallback response if message is empty
            if not ai_message:
                logger.warning("No AI message content found in response, using fallback message")
                ai_message = "I apologize, but I couldn't generate a proper response. Please try asking your question again."
            
            logger.info(f"Generated response: {ai_message[:100]}..." if len(ai_message) > 100 else f"Generated response: {ai_message}")
            
            # Return both the message and any metadata about tools used
            return ai_message, tool_calls_metadata
            
        except Exception as e:
            logger.error(f"Error in chat: {str(e)}", exc_info=True)
            raise

    def get_context(self):
        """return the current context of the chat"""
        return self.report


================================================
FILE: backend/memory/__init__.py
================================================


================================================
FILE: backend/memory/draft.py
================================================
from typing import TypedDict, List, Annotated
import operator


class DraftState(TypedDict):
    task: dict
    topic: str
    draft: dict
    review: str
    revision_notes: str

================================================
FILE: backend/memory/research.py
================================================
from typing import TypedDict, List, Annotated
import operator


class ResearchState(TypedDict):
    task: dict
    initial_research: str
    sections: List[str]
    research_data: List[dict]
    # Report layout
    title: str
    headers: dict
    date: str
    table_of_contents: str
    introduction: str
    conclusion: str
    sources: List[str]
    report: str




================================================
FILE: backend/report_type/__init__.py
================================================
from .basic_report.basic_report import BasicReport
from .detailed_report.detailed_report import DetailedReport

__all__ = [
    "BasicReport",
    "DetailedReport"
]

================================================
FILE: backend/report_type/basic_report/__init__.py
================================================


================================================
FILE: backend/report_type/basic_report/basic_report.py
================================================
import hashlib
import time
from fastapi import WebSocket
from typing import Any

from gpt_researcher import GPTResearcher


class BasicReport:
    def __init__(
        self,
        query: str,
        query_domains: list,
        report_type: str,
        report_source: str,
        source_urls,
        document_urls,
        tone: Any,
        config_path: str,
        websocket: WebSocket,
        headers=None,
        mcp_configs=None,
        mcp_strategy=None,
        max_search_results=None,
    ):
        self.query = query
        self.query_domains = query_domains
        self.report_type = report_type
        self.report_source = report_source
        self.source_urls = source_urls
        self.document_urls = document_urls
        self.tone = tone
        self.config_path = config_path
        self.websocket = websocket
        self.headers = headers or {}
        
        # Generate a unique research ID for this report
        self.research_id = self._generate_research_id(query)

        # Initialize researcher with optional MCP parameters
        gpt_researcher_params = {
            "query": self.query,
            "query_domains": self.query_domains,
            "report_type": self.report_type,
            "report_source": self.report_source,
            "source_urls": self.source_urls,
            "document_urls": self.document_urls,
            "tone": self.tone,
            "config_path": self.config_path,
            "websocket": self.websocket,
            "headers": self.headers,
        }

        # Add MCP parameters if provided
        if mcp_configs is not None:
            gpt_researcher_params["mcp_configs"] = mcp_configs
        if mcp_strategy is not None:
            gpt_researcher_params["mcp_strategy"] = mcp_strategy

        self.gpt_researcher = GPTResearcher(**gpt_researcher_params)

        # Override max_search_results_per_query if provided by user
        if max_search_results is not None:
            self.gpt_researcher.cfg.max_search_results_per_query = int(max_search_results)

    def _generate_research_id(self, query: str) -> str:
        """Generate a unique research ID from query and timestamp."""
        timestamp = str(int(time.time()))
        query_hash = hashlib.md5(query.encode()).hexdigest()[:8]
        return f"research_{timestamp}_{query_hash}"

    async def run(self):
        await self.gpt_researcher.conduct_research()
        report = await self.gpt_researcher.write_report()
        return report


================================================
FILE: backend/report_type/deep_research/README.md
================================================
# Deep Research ✨ NEW ✨

With the latest "Deep Research" trend in the AI community, we're excited to implement our own open-source deep research capability! Introducing GPT Researcher's Deep Research - an advanced recursive research system that explores a topic across several parallel search paths (breadth) and several levels of follow-up questions (depth).

## How It Works

Deep Research employs a fascinating tree-like exploration pattern:

1. **Breadth**: At each level, it generates multiple search queries to explore different aspects of your topic
2. **Depth**: For each branch, it recursively dives deeper, following leads and uncovering connections
3. **Concurrent Processing**: Utilizes async/await patterns to run multiple research paths simultaneously
4. **Smart Context Management**: Automatically aggregates and synthesizes findings across all branches
5. **Progress Tracking**: Real-time updates on research progress across both breadth and depth dimensions

Think of it as deploying a team of AI researchers, each following their own research path while collaborating to build a comprehensive understanding of your topic.
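
The tree-like pattern described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `fake_search` and `explore` are hypothetical names, the search is a stub that fabricates sub-queries, and the rule that halves breadth at each level (with a floor of 2) is taken from `example.py` in this directory. Unlike `example.py`, which recurses into branches one after another, this sketch runs sibling branches concurrently.

```python
import asyncio
from typing import List


async def fake_search(query: str, breadth: int) -> List[str]:
    """Hypothetical stand-in for query generation plus web research."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [f"{query}/{i}" for i in range(breadth)]


async def explore(
    query: str, breadth: int, depth: int, semaphore: asyncio.Semaphore
) -> List[str]:
    """Return every sub-query visited below `query`."""
    # Hold the semaphore only while "searching" so recursion cannot deadlock.
    async with semaphore:
        sub_queries = await fake_search(query, breadth)

    visited = list(sub_queries)
    if depth > 1:
        next_breadth = max(2, breadth // 2)  # narrower at each deeper level
        branches = await asyncio.gather(
            *(explore(q, next_breadth, depth - 1, semaphore) for q in sub_queries)
        )
        for branch in branches:
            visited.extend(branch)
    return visited


async def main() -> List[str]:
    semaphore = asyncio.Semaphore(2)  # at most two searches in flight
    return await explore("root", breadth=4, depth=2, semaphore=semaphore)


if __name__ == "__main__":
    print(len(asyncio.run(main())), "sub-queries visited")
```

The semaphore bounds how many searches run at once, while `asyncio.gather` lets sibling branches make progress independently.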

## Process Flow
![deep research](https://github.com/user-attachments/assets/eba2d94b-bef3-4f8d-bbc0-f15bd0a40968)


## Quick Start

```python
from gpt_researcher import GPTResearcher
from gpt_researcher.utils.enum import ReportType, Tone
import asyncio

async def main():
    # Initialize researcher with deep research type
    researcher = GPTResearcher(
        query="What are the latest developments in quantum computing?",
        report_type="deep",  # This triggers deep research mode
    )
    
    # Run research
    research_data = await researcher.conduct_research()
    
    # Generate report
    report = await researcher.write_report()
    print(report)

if __name__ == "__main__":
    asyncio.run(main())
```

## Configuration

Deep Research behavior can be customized through several parameters:

- `deep_research_breadth`: Number of parallel research paths at each level (default: 4)
- `deep_research_depth`: How many levels deep to explore (default: 2)
- `deep_research_concurrency`: Maximum number of concurrent research operations (default: 2)

You can configure these in your config file, set them as environment variables, or pass them directly:

```python
researcher = GPTResearcher(
    query="your query",
    report_type="deep",
    config_path="path/to/config.yaml"  # Configure deep research parameters here
)
```
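
For the environment-variable route, a minimal sketch of how the three settings could be read is shown below. The upper-case variable names (`DEEP_RESEARCH_BREADTH`, `DEEP_RESEARCH_DEPTH`, `DEEP_RESEARCH_CONCURRENCY`) are an assumption based on the parameter names listed above, and `read_deep_research_settings` is a hypothetical helper, not part of the library. Check your configuration reference for the exact names.

```python
import os
from typing import Dict, Mapping

# Defaults as documented above; the upper-case names are an assumption.
DEFAULTS = {
    "DEEP_RESEARCH_BREADTH": 4,
    "DEEP_RESEARCH_DEPTH": 2,
    "DEEP_RESEARCH_CONCURRENCY": 2,
}


def read_deep_research_settings(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Hypothetical helper: read the settings from `env`, falling back to defaults."""
    return {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    os.environ["DEEP_RESEARCH_DEPTH"] = "3"  # e.g. exported in the shell instead
    print(read_deep_research_settings())
```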

## Progress Tracking

The `on_progress` callback provides real-time insights into the research process:

```python
class ResearchProgress:
    current_depth: int       # Current depth level
    total_depth: int         # Maximum depth to explore
    current_breadth: int     # Current number of parallel paths
    total_breadth: int       # Maximum breadth at each level
    current_query: str       # Currently processing query
    completed_queries: int   # Number of completed queries
    total_queries: int       # Total queries to process
```
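
A callback that consumes these fields might look like the sketch below. `ProgressSnapshot` is a hypothetical stand-in that mirrors the fields listed above so the example runs on its own, and `format_progress` is an illustrative helper. In `example.py` in this directory, a callback of this shape is passed as `on_progress` to `DeepResearch.run`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressSnapshot:
    """Hypothetical mirror of the ResearchProgress fields, for illustration only."""
    current_depth: int
    total_depth: int
    current_breadth: int
    total_breadth: int
    current_query: Optional[str] = None
    completed_queries: int = 0
    total_queries: int = 0


def format_progress(progress) -> str:
    """Render one progress update as a single log line."""
    total = progress.total_queries
    percent = 100 * progress.completed_queries // total if total else 0
    query = progress.current_query or "(starting)"
    return (
        f"[depth {progress.current_depth}/{progress.total_depth}] "
        f"{progress.completed_queries}/{total} queries ({percent}%) - {query}"
    )


def on_progress(progress) -> None:
    """Callback with the single-argument shape the progress hook expects."""
    print(format_progress(progress))


if __name__ == "__main__":
    on_progress(ProgressSnapshot(1, 2, 4, 4, "quantum error correction", 1, 4))
```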

## Advanced Usage

### Custom Research Flow

```python
researcher = GPTResearcher(
    query="your query",
    report_type="deep",
    tone=Tone.Objective,
    headers={"User-Agent": "your-agent"},  # Custom headers for web requests
    verbose=True  # Enable detailed logging
)

# Get raw research context
context = await researcher.conduct_research()

# Access research sources
sources = researcher.get_research_sources()

# Get visited URLs
urls = researcher.get_source_urls()

# Generate formatted report
report = await researcher.write_report()
```

### Error Handling

The deep research system is designed to be resilient:

- Failed queries are automatically skipped
- Research continues even if some branches fail
- Progress tracking helps identify any issues
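
The skip-and-continue behaviour can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `run_branch` is a hypothetical branch that fails for any query containing "bad", and `safe_branch` shows the general technique of catching the error per branch and filtering out `None` results afterwards, as `example.py` in this directory does.

```python
import asyncio
import logging
from typing import List, Optional

logger = logging.getLogger(__name__)


async def run_branch(query: str) -> str:
    """Hypothetical research branch that fails for some queries."""
    await asyncio.sleep(0)
    if "bad" in query:
        raise RuntimeError(f"search failed for {query!r}")
    return f"learning from {query}"


async def safe_branch(query: str) -> Optional[str]:
    """Run one branch; log and return None instead of raising."""
    try:
        return await run_branch(query)
    except Exception as exc:
        logger.error("Skipping %r: %s", query, exc)
        return None


async def research_all(queries: List[str]) -> List[str]:
    """Run all branches concurrently and keep only the successful results."""
    results = await asyncio.gather(*(safe_branch(q) for q in queries))
    return [r for r in results if r is not None]


if __name__ == "__main__":
    print(asyncio.run(research_all(["a", "bad query", "b"])))
```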

## Best Practices

1. **Start Broad**: Begin with a general query and let the system explore specifics
2. **Monitor Progress**: Use the progress callback to understand the research flow
3. **Adjust Parameters**: Tune breadth and depth based on your needs:
   - More breadth = wider coverage
   - More depth = deeper insights
4. **Resource Management**: Consider concurrency limits based on your system capabilities
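
To get a feel for how breadth and depth drive cost, the sketch below estimates an upper bound on the number of search queries. It assumes the recursion rule used in `example.py` in this directory (each result spawns a deeper pass with breadth halved, never below 2) and that every query succeeds. `estimate_queries` is a hypothetical helper, and real runs may issue fewer queries.

```python
def estimate_queries(breadth: int, depth: int) -> int:
    """Upper bound on search queries for the given breadth and depth."""
    if breadth < 1 or depth < 1:
        return 0
    total = breadth  # queries issued at this level
    if depth > 1:
        next_breadth = max(2, breadth // 2)
        # Each query at this level triggers one deeper pass.
        total += breadth * estimate_queries(next_breadth, depth - 1)
    return total


if __name__ == "__main__":
    for breadth, depth in [(4, 2), (4, 3), (6, 2)]:
        print(breadth, depth, estimate_queries(breadth, depth))
```

With the defaults (breadth 4, depth 2) this gives 4 + 4 × 2 = 12 queries. Raising depth to 3 gives 28, which is why depth tends to be the more expensive knob.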

## Limitations

- Relies on reasoning LLMs such as `o3-mini`, so your API account needs access to reasoning models and the overall run will be significantly slower.
- Deep research may take longer than standard research
- Higher API usage and costs due to multiple concurrent queries
- May require more system resources for parallel processing

Happy researching! 🎉 


================================================
FILE: backend/report_type/deep_research/__init__.py
================================================


================================================
FILE: backend/report_type/deep_research/example.py
================================================
from typing import List, Dict, Any, Optional, Set
from fastapi import WebSocket
import asyncio
import logging
from gpt_researcher import GPTResearcher
from gpt_researcher.llm_provider.generic.base import ReasoningEfforts
from gpt_researcher.utils.llm import create_chat_completion
from gpt_researcher.utils.enum import ReportType, ReportSource, Tone

logger = logging.getLogger(__name__)

# Constants for models
GPT4_MODEL = "gpt-4o"  # For standard tasks
O3_MINI_MODEL = "o3-mini"  # For reasoning tasks
LLM_PROVIDER = "openai"

class ResearchProgress:
    def __init__(self, total_depth: int, total_breadth: int):
        self.current_depth = total_depth
        self.total_depth = total_depth
        self.current_breadth = total_breadth
        self.total_breadth = total_breadth
        self.current_query: Optional[str] = None
        self.total_queries = 0
        self.completed_queries = 0

class DeepResearch:
    def __init__(
        self,
        query: str,
        breadth: int = 4,
        depth: int = 2,
        websocket: Optional[WebSocket] = None,
        tone: Tone = Tone.Objective,
        config_path: Optional[str] = None,
        headers: Optional[Dict] = None,
        concurrency_limit: int = 2  # Match TypeScript version
    ):
        self.query = query
        self.breadth = breadth
        self.depth = depth
        self.websocket = websocket
        self.tone = tone
        self.config_path = config_path
        self.headers = headers or {}
        self.visited_urls: Set[str] = set()
        self.learnings: List[str] = []
        self.concurrency_limit = concurrency_limit

    async def generate_feedback(self, query: str, num_questions: int = 3) -> List[str]:
        """Generate follow-up questions to clarify research direction"""
        messages = [
            {"role": "system", "content": "You are an expert researcher helping to clarify research directions."},
            {"role": "user", "content": f"Given the following query from the user, ask some follow up questions to clarify the research direction. Return a maximum of {num_questions} questions, but feel free to return less if the original query is clear. Format each question on a new line starting with 'Question: ': {query}"}
        ]

        response = await create_chat_completion(
            messages=messages,
            llm_provider=LLM_PROVIDER,
            model=O3_MINI_MODEL,  # Using reasoning model for better question generation
            temperature=0.7,
            max_tokens=500,
            reasoning_effort=ReasoningEfforts.High.value
        )

        # Parse questions from response
        questions = [q.replace('Question:', '').strip()
                    for q in response.split('\n')
                    if q.strip().startswith('Question:')]
        return questions[:num_questions]

    async def generate_serp_queries(self, query: str, num_queries: int = 3) -> List[Dict[str, str]]:
        """Generate SERP queries for research"""
        messages = [
            {"role": "system", "content": "You are an expert researcher generating search queries."},
            {"role": "user", "content": f"Given the following prompt, generate {num_queries} unique search queries to research the topic thoroughly. For each query, provide a research goal. Format as 'Query: <query>' followed by 'Goal: <goal>' for each pair: {query}"}
        ]

        response = await create_chat_completion(
            messages=messages,
            llm_provider=LLM_PROVIDER,
            model=GPT4_MODEL,  # Using GPT-4 for general task
            temperature=0.7,
            max_tokens=1000
        )

        # Parse queries and goals from response
        lines = response.split('\n')
        queries = []
        current_query = {}

        for line in lines:
            line = line.strip()
            if line.startswith('Query:'):
                if current_query:
                    queries.append(current_query)
                current_query = {'query': line.replace('Query:', '').strip()}
            elif line.startswith('Goal:') and current_query:
                current_query['researchGoal'] = line.replace('Goal:', '').strip()

        if current_query:
            queries.append(current_query)

        return queries[:num_queries]

    async def process_serp_result(self, query: str, context: str, num_learnings: int = 3) -> Dict[str, List[str]]:
        """Process research results to extract learnings and follow-up questions"""
        messages = [
            {"role": "system", "content": "You are an expert researcher analyzing search results."},
            {"role": "user", "content": f"Given the following research results for the query '{query}', extract key learnings and suggest follow-up questions. For each learning, include a citation to the source URL if available. Format each learning as 'Learning [source_url]: <insight>' and each question as 'Question: <question>':\n\n{context}"}
        ]

        response = await create_chat_completion(
            messages=messages,
            llm_provider=LLM_PROVIDER,
            model=O3_MINI_MODEL,  # Using reasoning model for analysis
            temperature=0.7,
            max_tokens=1000,
            reasoning_effort=ReasoningEfforts.High.value
        )

        # Parse learnings and questions with citations
        import re  # local import: this module's top-level imports do not include re

        lines = response.split('\n')
        learnings = []
        questions = []
        citations = {}

        for line in lines:
            line = line.strip()
            if line.startswith('Learning'):
                # Expected shape: "Learning [source_url]: <insight>".
                # Match the whole prefix so that the colon inside "https://"
                # is not mistaken for the separator before the insight.
                cited = re.match(r'Learning\s*\[(.*?)\]:\s*(.*)', line)
                if cited:
                    url = cited.group(1)
                    learning = cited.group(2).strip()
                    learnings.append(learning)
                    citations[learning] = url
                else:
                    learnings.append(line.replace('Learning:', '').strip())
            elif line.startswith('Question:'):
                questions.append(line.replace('Question:', '').strip())

        return {
            'learnings': learnings[:num_learnings],
            'followUpQuestions': questions[:num_learnings],
            'citations': citations
        }

    async def deep_research(
        self,
        query: str,
        breadth: int,
        depth: int,
        learnings: List[str] = None,
        citations: Dict[str, str] = None,
        visited_urls: Set[str] = None,
        on_progress = None
    ) -> Dict[str, Any]:
        """Conduct deep iterative research"""
        if learnings is None:
            learnings = []
        if citations is None:
            citations = {}
        if visited_urls is None:
            visited_urls = set()

        progress = ResearchProgress(depth, breadth)

        if on_progress:
            on_progress(progress)

        # Generate search queries
        serp_queries = await self.generate_serp_queries(query, num_queries=breadth)
        progress.total_queries = len(serp_queries)

        all_learnings = learnings.copy()
        all_citations = citations.copy()
        all_visited_urls = visited_urls.copy()

        # Process queries with concurrency limit
        semaphore = asyncio.Semaphore(self.concurrency_limit)

        async def process_query(serp_query: Dict[str, str]) -> Optional[Dict[str, Any]]:
            async with semaphore:
                try:
                    progress.current_query = serp_query['query']
                    if on_progress:
                        on_progress(progress)

                    # Initialize researcher for this query
                    researcher = GPTResearcher(
                        query=serp_query['query'],
                        report_type=ReportType.ResearchReport.value,
                        report_source=ReportSource.Web.value,
                        tone=self.tone,
                        websocket=self.websocket,
                        config_path=self.config_path,
                        headers=self.headers
                    )

                    # Conduct research
                    await researcher.conduct_research()

                    # Get results
                    context = researcher.context
                    visited = set(researcher.visited_urls)

                    # Process results
                    results = await self.process_serp_result(
                        query=serp_query['query'],
                        context=context
                    )

                    # Update progress
                    progress.completed_queries += 1
                    if on_progress:
                        on_progress(progress)

                    return {
                        'learnings': results['learnings'],
                        'visited_urls': visited,
                        'followUpQuestions': results['followUpQuestions'],
                        'researchGoal': serp_query.get('researchGoal', ''),  # the model may omit the Goal line
                        'citations': results['citations']
                    }

                except Exception as e:
                    logger.error(f"Error processing query '{serp_query['query']}': {str(e)}")
                    return None

        # Process queries concurrently with limit
        tasks = [process_query(query) for query in serp_queries]
        results = await asyncio.gather(*tasks)
        results = [r for r in results if r is not None]  # Filter out failed queries

        # Collect all results
        for result in results:
            all_learnings.extend(result['learnings'])
            all_visited_urls.update(set(result['visited_urls']))
            all_citations.update(result['citations'])

            # Continue deeper if needed
            if depth > 1:
                new_breadth = max(2, breadth // 2)
                new_depth = depth - 1

                # Create next query from research goal and follow-up questions
                next_query = f"""
                Previous research goal: {result['researchGoal']}
                Follow-up questions: {' '.join(result['followUpQuestions'])}
                """

                # Recursive research
                deeper_results = await self.deep_research(
                    query=next_query,
                    breadth=new_breadth,
                    depth=new_depth,
                    learnings=all_learnings,
                    citations=all_citations,
                    visited_urls=all_visited_urls,
                    on_progress=on_progress
                )

                all_learnings = deeper_results['learnings']
                all_visited_urls = set(deeper_results['visited_urls'])
                all_citations.update(deeper_results['citations'])

        return {
            'learnings': list(set(all_learnings)),
            'visited_urls': list(all_visited_urls),
            'citations': all_citations
        }

    async def run(self, on_progress=None) -> str:
        """Run the deep research process and generate final report"""
        # Get initial feedback
        follow_up_questions = await self.generate_feedback(self.query)

        # Collect answers (this would normally come from user interaction)
        answers = ["Automatically proceeding with research"] * len(follow_up_questions)

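        # Combine query and Q&A. The pairs are joined outside the f-string because
        # a backslash inside an f-string expression is a SyntaxError before Python 3.12.
        qa_pairs = ' '.join(f'Q: {q}\nA: {a}' for q, a in zip(follow_up_questions, answers))
        combined_query = f"""
        Initial Query: {self.query}
        Follow-up Questions and Answers:
        {qa_pairs}
        """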

        # Run deep research
      
│   │   │       ├── CopilotResearchContent.tsx
│   │   │       ├── NotFoundContent.tsx
│   │   │       ├── ResearchContent.tsx
│   │   │       └── ResearchPanel.tsx
│   │   ├── config/
│   │   │   └── task.ts
│   │   ├── helpers/
│   │   │   ├── findDifferences.ts
│   │   │   ├── getHost.ts
│   │   │   └── markdownHelper.ts
│   │   ├── hooks/
│   │   │   ├── ResearchHistoryContext.tsx
│   │   │   ├── useAnalytics.ts
│   │   │   ├── useResearchHistory.ts
│   │   │   ├── useScrollHandler.ts
│   │   │   └── useWebSocket.ts
│   │   ├── next.config.mjs
│   │   ├── nginx/
│   │   │   └── default.conf
│   │   ├── package.json
│   │   ├── package.lib.json
│   │   ├── postcss.config.mjs
│   │   ├── public/
│   │   │   ├── embed.js
│   │   │   ├── manifest.json
│   │   │   ├── sw.js
│   │   │   └── workbox-f1770938.js
│   │   ├── rollup.config.js
│   │   ├── src/
│   │   │   ├── GPTResearcher.tsx
│   │   │   ├── index.css
│   │   │   ├── index.d.ts
│   │   │   ├── index.ts
│   │   │   └── utils/
│   │   │       └── imageTransformPlugin.js
│   │   ├── styles/
│   │   │   └── markdown.css
│   │   ├── tailwind.config.ts
│   │   ├── tsconfig.json
│   │   ├── tsconfig.lib.json
│   │   ├── types/
│   │   │   ├── data.ts
│   │   │   └── react-ga4.d.ts
│   │   └── utils/
│   │       ├── consolidateBlocks.ts
│   │       ├── dataProcessing.ts
│   │       └── getLayout.tsx
│   ├── pdf_styles.css
│   ├── scripts.js
│   └── styles.css
├── gpt_researcher/
│   ├── __init__.py
│   ├── actions/
│   │   ├── __init__.py
│   │   ├── agent_creator.py
│   │   ├── markdown_processing.py
│   │   ├── query_processing.py
│   │   ├── report_generation.py
│   │   ├── retriever.py
│   │   ├── utils.py
│   │   └── web_scraping.py
│   ├── agent.py
│   ├── config/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   └── variables/
│   │       ├── __init__.py
│   │       ├── base.py
│   │       ├── default.py
│   │       └── test_local.json
│   ├── context/
│   │   ├── __init__.py
│   │   ├── compression.py
│   │   └── retriever.py
│   ├── document/
│   │   ├── __init__.py
│   │   ├── azure_document_loader.py
│   │   ├── document.py
│   │   ├── langchain_document.py
│   │   └── online_document.py
│   ├── llm_provider/
│   │   ├── __init__.py
│   │   ├── generic/
│   │   │   ├── __init__.py
│   │   │   └── base.py
│   │   └── image/
│   │       ├── __init__.py
│   │       └── image_generator.py
│   ├── mcp/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── client.py
│   │   ├── research.py
│   │   ├── streaming.py
│   │   └── tool_selector.py
│   ├── memory/
│   │   ├── __init__.py
│   │   └── embeddings.py
│   ├── prompts.py
│   ├── retrievers/
│   │   ├── __init__.py
│   │   ├── arxiv/
│   │   │   ├── __init__.py
│   │   │   └── arxiv.py
│   │   ├── bing/
│   │   │   ├── __init__.py
│   │   │   └── bing.py
│   │   ├── bocha/
│   │   │   ├── __init__.py
│   │   │   └── bocha.py
│   │   ├── custom/
│   │   │   ├── __init__.py
│   │   │   └── custom.py
│   │   ├── duckduckgo/
│   │   │   ├── __init__.py
│   │   │   └── duckduckgo.py
│   │   ├── exa/
│   │   │   ├── __init__.py
│   │   │   └── exa.py
│   │   ├── google/
│   │   │   ├── __init__.py
│   │   │   └── google.py
│   │   ├── mcp/
│   │   │   ├── __init__.py
│   │   │   └── retriever.py
│   │   ├── pubmed_central/
│   │   │   ├── __init__.py
│   │   │   └── pubmed_central.py
│   │   ├── searchapi/
│   │   │   ├── __init__.py
│   │   │   └── searchapi.py
│   │   ├── searx/
│   │   │   ├── __init__.py
│   │   │   └── searx.py
│   │   ├── semantic_scholar/
│   │   │   ├── __init__.py
│   │   │   └── semantic_scholar.py
│   │   ├── serpapi/
│   │   │   ├── __init__.py
│   │   │   └── serpapi.py
│   │   ├── serper/
│   │   │   ├── __init__.py
│   │   │   └── serper.py
│   │   ├── tavily/
│   │   │   ├── __init__.py
│   │   │   └── tavily_search.py
│   │   └── utils.py
│   ├── scraper/
│   │   ├── __init__.py
│   │   ├── arxiv/
│   │   │   ├── __init__.py
│   │   │   └── arxiv.py
│   │   ├── beautiful_soup/
│   │   │   ├── __init__.py
│   │   │   └── beautiful_soup.py
│   │   ├── browser/
│   │   │   ├── __init__.py
│   │   │   ├── browser.py
│   │   │   ├── js/
│   │   │   │   └── overlay.js
│   │   │   ├── nodriver_scraper.py
│   │   │   └── processing/
│   │   │       ├── __init__.py
│   │   │       ├── html.py
│   │   │       └── scrape_skills.py
│   │   ├── firecrawl/
│   │   │   ├── __init__.py
│   │   │   └── firecrawl.py
│   │   ├── pymupdf/
│   │   │   ├── __init__.py
│   │   │   └── pymupdf.py
│   │   ├── scraper.py
│   │   ├── tavily_extract/
│   │   │   ├── __init__.py
│   │   │   └── tavily_extract.py
│   │   ├── utils.py
│   │   └── web_base_loader/
│   │       ├── __init__.py
│   │       └── web_base_loader.py
│   ├── skills/
│   │   ├── __init__.py
│   │   ├── browser.py
│   │   ├── context_manager.py
│   │   ├── curator.py
│   │   ├── deep_research.py
│   │   ├── image_generator.py
│   │   ├── researcher.py
│   │   └── writer.py
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── costs.py
│   │   ├── enum.py
│   │   ├── llm.py
│   │   ├── logger.py
│   │   ├── logging_config.py
│   │   ├── rate_limiter.py
│   │   ├── tools.py
│   │   ├── validators.py
│   │   └── workers.py
│   └── vector_store/
│       ├── __init__.py
│       └── vector_store.py
├── json_schema_generator.py
├── langgraph.json
├── main.py
├── mcp-server/
│   └── README.md
├── multi_agents/
│   ├── README.md
│   ├── __init__.py
│   ├── agent.py
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── editor.py
│   │   ├── human.py
│   │   ├── orchestrator.py
│   │   ├── publisher.py
│   │   ├── researcher.py
│   │   ├── reviewer.py
│   │   ├── reviser.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── file_formats.py
│   │   │   ├── llms.py
│   │   │   ├── pdf_styles.css
│   │   │   ├── utils.py
│   │   │   └── views.py
│   │   └── writer.py
│   ├── langgraph.json
│   ├── main.py
│   ├── memory/
│   │   ├── __init__.py
│   │   ├── draft.py
│   │   └── research.py
│   ├── package.json
│   ├── requirements.txt
│   └── task.json
├── multi_agents_ag2/
│   ├── README.md
│   ├── __init__.py
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── editor.py
│   │   └── orchestrator.py
│   ├── main.py
│   ├── requirements.txt
│   └── task.json
├── poetry.toml
├── pyproject.toml
├── requirements.txt
├── setup.py
├── terraform/
│   ├── ecr-setup/
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── versions.tf
│   ├── github-actions-setup/
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── variables.tf
│   │   └── versions.tf
│   ├── main.tf
│   ├── outputs.tf
│   ├── variables.tf
│   └── versions.tf
└── tests/
    ├── __init__.py
    ├── documents-report-source.py
    ├── gptr-logs-handler.py
    ├── report-types.py
    ├── research_test.py
    ├── test-loaders.py
    ├── test-openai-llm.py
    ├── test-your-embeddings.py
    ├── test-your-llm.py
    ├── test-your-retriever.py
    ├── test_logging.py
    ├── test_logging_output.py
    ├── test_logs.py
    ├── test_mcp.py
    ├── test_quick_search.py
    ├── test_researcher_logging.py
    ├── test_security_fix.py
    └── vector-store.py
SYMBOL INDEX (966 symbols across 198 files)

FILE: backend/chat/chat.py
  function get_tools (line 32) | def get_tools():
  class ChatAgentWithMemory (line 55) | class ChatAgentWithMemory:
    method __init__ (line 56) | def __init__(
    method _setup_vector_store (line 82) | def _setup_vector_store(self):
    method _process_document (line 103) | def _process_document(self, report):
    method quick_search (line 114) | def quick_search(self, query):
    method process_chat_completion (line 153) | async def process_chat_completion(self, messages: List[Dict[str, str]]):
    method chat (line 187) | async def chat(self, messages, websocket=None):
    method get_context (line 256) | def get_context(self):

FILE: backend/memory/draft.py
  class DraftState (line 5) | class DraftState(TypedDict):

FILE: backend/memory/research.py
  class ResearchState (line 5) | class ResearchState(TypedDict):

FILE: backend/report_type/basic_report/basic_report.py
  class BasicReport (line 9) | class BasicReport:
    method __init__ (line 10) | def __init__(
    method _generate_research_id (line 66) | def _generate_research_id(self, query: str) -> str:
    method run (line 72) | async def run(self):

FILE: backend/report_type/deep_research/example.py
  class ResearchProgress (line 17) | class ResearchProgress:
    method __init__ (line 18) | def __init__(self, total_depth: int, total_breadth: int):
  class DeepResearch (line 27) | class DeepResearch:
    method __init__ (line 28) | def __init__(
    method generate_feedback (line 50) | async def generate_feedback(self, query: str, num_questions: int = 3) ...
    method generate_serp_queries (line 72) | async def generate_serp_queries(self, query: str, num_queries: int = 3...
    method process_serp_result (line 106) | async def process_serp_result(self, query: str, context: str, num_lear...
    method deep_research (line 150) | async def deep_research(
    method run (line 275) | async def run(self, on_progress=None) -> str:

FILE: backend/report_type/deep_research/main.py
  function main (line 6) | async def main(task: str):

FILE: backend/report_type/detailed_report/detailed_report.py
  class DetailedReport (line 10) | class DetailedReport:
    method __init__ (line 11) | def __init__(
    method _generate_research_id (line 78) | def _generate_research_id(self, query: str) -> str:
    method run (line 84) | async def run(self) -> str:
    method _initial_research (line 93) | async def _initial_research(self) -> None:
    method _get_all_subtopics (line 98) | async def _get_all_subtopics(self) -> List[Dict]:
    method _generate_subtopic_reports (line 110) | async def _generate_subtopic_reports(self, subtopics: List[Dict]) -> t...
    method _get_subtopic_report (line 122) | async def _get_subtopic_report(self, subtopic: Dict) -> Dict[str, str]:
    method _construct_detailed_report (line 181) | async def _construct_detailed_report(self, introduction: str, report_b...

FILE: backend/server/app.py
  class ResearchRequest (line 52) | class ResearchRequest(BaseModel):
  class ChatRequest (line 63) | class ChatRequest(BaseModel):
  function lifespan (line 71) | async def lifespan(app: FastAPI):
  function serve_frontend (line 147) | async def serve_frontend():
  function read_report (line 161) | async def read_report(request: Request, research_id: str):
  function get_all_reports (line 170) | async def get_all_reports(report_ids: str = None):
  function get_report_by_id (line 177) | async def get_report_by_id(research_id: str):
  function create_or_update_report (line 185) | async def create_or_update_report(request: Request):
  function update_report (line 214) | async def update_report(research_id: str, request: Request):
  function delete_report (line 234) | async def delete_report(research_id: str):
  function get_report_chat (line 242) | async def get_report_chat(research_id: str):
  function add_report_chat_message (line 250) | async def add_report_chat_message(research_id: str, request: Request):
  function write_report (line 273) | async def write_report(research_request: ResearchRequest, research_id: s...
  function generate_report (line 312) | async def generate_report(research_request: ResearchRequest, background_...
  function list_files (line 325) | async def list_files():
  function run_multi_agents (line 334) | async def run_multi_agents():
  function upload_file (line 339) | async def upload_file(file: UploadFile = File(...)):
  function delete_file (line 344) | async def delete_file(filename: str):
  function websocket_endpoint (line 349) | async def websocket_endpoint(websocket: WebSocket):
  function chat (line 363) | async def chat(chat_request: ChatRequest):
  function research_report_chat (line 407) | async def research_report_chat(research_id: str, request: Request):
  function update_report (line 444) | async def update_report(research_id: str, request: Request):
  function delete_report (line 450) | async def delete_report(research_id: str):

FILE: backend/server/logging_config.py
  class JSONResearchHandler (line 7) | class JSONResearchHandler:
    method __init__ (line 8) | def __init__(self, json_file):
    method log_event (line 22) | def log_event(self, event_type: str, data: dict):
    method update_content (line 30) | def update_content(self, key: str, value):
    method _save_json (line 34) | def _save_json(self):
  function setup_research_logging (line 38) | def setup_research_logging():
  function get_research_logger (line 79) | def get_research_logger():
  function get_json_handler (line 82) | def get_json_handler():

FILE: backend/server/multi_agent_runner.py
  function _ensure_repo_root_on_path (line 8) | def _ensure_repo_root_on_path() -> None:
  function _resolve_run_research_task (line 15) | def _resolve_run_research_task() -> RunResearchTask:
  function run_multi_agent_task (line 31) | async def run_multi_agent_task(*args, **kwargs) -> Any:

FILE: backend/server/report_store.py
  class ReportStore (line 7) | class ReportStore:
    method __init__ (line 8) | def __init__(self, path: Path):
    method _ensure_parent_dir (line 12) | async def _ensure_parent_dir(self) -> None:
    method _read_all_unlocked (line 15) | async def _read_all_unlocked(self) -> Dict[str, Dict[str, Any]]:
    method _write_all_unlocked (line 26) | async def _write_all_unlocked(self, data: Dict[str, Dict[str, Any]]) -...
    method list_reports (line 32) | async def list_reports(self, report_ids: List[str] | None = None) -> L...
    method get_report (line 39) | async def get_report(self, report_id: str) -> Dict[str, Any] | None:
    method upsert_report (line 44) | async def upsert_report(self, report_id: str, report: Dict[str, Any]) ...
    method delete_report (line 50) | async def delete_report(self, report_id: str) -> bool:

FILE: backend/server/server_utils.py
  class CustomLogsHandler (line 33) | class CustomLogsHandler:
    method __init__ (line 35) | def __init__(self, websocket, task: str):
    method send_json (line 56) | async def send_json(self, data: Dict[str, Any]) -> None:
  class Researcher (line 82) | class Researcher:
    method __init__ (line 83) | def __init__(self, query: str, report_type: str = "research_report"):
    method research (line 96) | async def research(self) -> dict:
  function sanitize_filename (line 115) | def sanitize_filename(filename: str) -> str:
  function handle_start_command (line 126) | async def handle_start_command(websocket, data: str, manager):
  function handle_human_feedback (line 181) | async def handle_human_feedback(data: str):
  function handle_chat_command (line 187) | async def handle_chat_command(websocket, data: str):
  function generate_report_files (line 256) | async def generate_report_files(report: str, filename: str) -> Dict[str,...
  function send_file_paths (line 263) | async def send_file_paths(websocket, file_paths: Dict[str, str]):
  function get_config_dict (line 267) | def get_config_dict(
  function update_environment_variables (line 290) | def update_environment_variables(config: Dict[str, str]):
  function handle_file_upload (line 295) | async def handle_file_upload(file, DOC_PATH: str) -> Dict[str, str]:
  function handle_file_deletion (line 307) | async def handle_file_deletion(filename: str, DOC_PATH: str) -> JSONResp...
  function execute_multi_agents (line 318) | async def execute_multi_agents(manager) -> Any:
  function handle_websocket_communication (line 327) | async def handle_websocket_communication(websocket, manager):
  function extract_command_data (line 398) | def extract_command_data(json_data: Dict) -> tuple:

FILE: backend/server/websocket_manager.py
  class WebSocketManager (line 19) | class WebSocketManager:
    method __init__ (line 22) | def __init__(self):
    method start_sender (line 28) | async def start_sender(self, websocket: WebSocket):
    method connect (line 51) | async def connect(self, websocket: WebSocket):
    method disconnect (line 64) | async def disconnect(self, websocket: WebSocket):
    method start_streaming (line 99) | async def start_streaming(self, task, report_type, report_source, sour...
  function run_agent (line 114) | async def run_agent(task, report_type, report_source, source_urls, docum...

FILE: backend/utils.py
  function write_to_file (line 6) | async def write_to_file(filename: str, text: str) -> None:
  function write_text_to_md (line 23) | async def write_text_to_md(text: str, filename: str = "") -> str:
  function _preprocess_images_for_pdf (line 36) | def _preprocess_images_for_pdf(text: str) -> str:
  function write_md_to_pdf (line 62) | async def write_md_to_pdf(text: str, filename: str = "") -> str:
  function write_md_to_word (line 99) | async def write_md_to_word(text: str, filename: str = "") -> str:

FILE: cli.py
  function main (line 138) | async def main(args):

FILE: docs/discord-bot/commands/ask.js
  method execute (line 7) | async execute(interaction) {

FILE: docs/discord-bot/gptr-webhook.js
  function initializeWebSocket (line 7) | async function initializeWebSocket() {
  function sendWebhookMessage (line 50) | async function sendWebhookMessage({query, moreContext}) {

FILE: docs/discord-bot/index.js
  function splitMessage (line 17) | function splitMessage(message, chunkSize = 1500) {
  function runDevTeam (line 108) | async function runDevTeam({ interaction, query, moreContext, thread }) {

FILE: docs/discord-bot/server.js
  function keepAlive (line 9) | function keepAlive() {

FILE: docs/docs/examples/custom_prompt.py
  function custom_report_example (line 17) | async def custom_report_example():

FILE: docs/docs/examples/sample_report.py
  function get_report (line 9) | async def get_report(query: str, report_type: str, custom_prompt: str = ...

FILE: docs/docs/examples/sample_sources_only.py
  function get_report (line 5) | async def get_report(query: str, report_source: str, sources: list) -> str:

FILE: docs/npm/index.js
  class GPTResearcher (line 4) | class GPTResearcher {
    method constructor (line 5) | constructor(options = {}) {
    method initializeWebSocket (line 13) | async initializeWebSocket() {
    method sendMessage (line 50) | async sendMessage({
    method sendHttpRequest (line 102) | async sendHttpRequest(data) {
    method getReport (line 112) | async getReport(reportId) {

FILE: docs/src/components/HomepageFeatures.js
  function Feature (line 49) | function Feature({Svg, title, description, docLink}) {
  function HomepageFeatures (line 66) | function HomepageFeatures() {

FILE: docs/src/pages/index.js
  function HomepageHeader (line 9) | function HomepageHeader() {
  function Home (line 28) | function Home() {

FILE: evals/hallucination_eval/evaluate.py
  class HallucinationEvaluator (line 18) | class HallucinationEvaluator:
    method __init__ (line 21) | def __init__(self, model: str = "openai/gpt-4o"):
    method evaluate_response (line 26) | def evaluate_response(self, model_output: str, source_text: str) -> Dict:
  function main (line 55) | def main():

FILE: evals/hallucination_eval/run_eval.py
  class ResearchEvaluator (line 34) | class ResearchEvaluator:
    method __init__ (line 37) | def __init__(self, queries_file: str = DEFAULT_QUERIES_FILE):
    method load_queries (line 48) | def load_queries(self, num_queries: Optional[int] = None) -> List[str]:
    method run_research (line 68) | async def run_research(self, query: str) -> Dict:
    method evaluate_research (line 97) | def evaluate_research(
  function main (line 146) | async def main(num_queries: int = 5, output_dir: str = DEFAULT_OUTPUT_DIR):

FILE: evals/simple_evals/run_eval.py
  function map_with_progress (line 17) | def map_with_progress(fn: Callable[[T], R], items: List[T]) -> List[R]:
  function evaluate_single_query (line 30) | async def evaluate_single_query(query: str, evaluator: SimpleQAEval) -> ...
  function main (line 74) | async def main(num_examples: int):

FILE: evals/simple_evals/simpleqa_eval.py
  class SimpleQAEval (line 101) | class SimpleQAEval:
    method __init__ (line 102) | def __init__(self, grader_model, num_examples=1):
    method evaluate_example (line 119) | def evaluate_example(self, example: dict) -> dict:
    method grade_response (line 144) | def grade_response(self, question: str, correct_answer: str, model_ans...

FILE: frontend/nextjs/actions/apiActions.ts
  function handleSourcesAndAnswer (line 3) | async function handleSourcesAndAnswer(question: string) {
  function handleSimilarQuestions (line 58) | async function handleSimilarQuestions(question: string) {
  function handleLanggraphAnswer (line 67) | async function handleLanggraphAnswer(question: string) {

FILE: frontend/nextjs/app/api/chat/route.ts
  function POST (line 3) | async function POST(request: Request) {

FILE: frontend/nextjs/app/api/reports/[id]/chat/route.ts
  function GET (line 3) | async function GET(
  function POST (line 33) | async function POST(

FILE: frontend/nextjs/app/api/reports/[id]/route.ts
  function GET (line 3) | async function GET(
  function DELETE (line 35) | async function DELETE(
  function PUT (line 68) | async function PUT(

FILE: frontend/nextjs/app/api/reports/route.ts
  function GET (line 3) | async function GET(request: Request) {
  function POST (line 66) | async function POST(request: Request) {

FILE: frontend/nextjs/app/layout.tsx
  function RootLayout (line 56) | function RootLayout({

FILE: frontend/nextjs/app/page.tsx
  function Home (line 28) | function Home() {

FILE: frontend/nextjs/app/research/[id]/page.tsx
  function ResearchPage (line 22) | function ResearchPage({ params }: { params: { id: string } }) {

FILE: frontend/nextjs/components/Footer.tsx
  type FooterProps (line 7) | interface FooterProps {

FILE: frontend/nextjs/components/Header.tsx
  type HeaderProps (line 4) | interface HeaderProps {

FILE: frontend/nextjs/components/Hero.tsx
  type THeroProps (line 6) | type THeroProps = {
  type suggestionType (line 260) | type suggestionType = {

FILE: frontend/nextjs/components/HumanFeedback.tsx
  type HumanFeedbackProps (line 5) | interface HumanFeedbackProps {

FILE: frontend/nextjs/components/Images/ImageModal.tsx
  type ImageModalProps (line 4) | interface ImageModalProps {
  function ImageModal (line 13) | function ImageModal({ imageSrc, isOpen, onClose, onNext, onPrev }: Image...

FILE: frontend/nextjs/components/Images/ImagesAlbum.tsx
  type ImageType (line 4) | type ImageType = any;
  type ImagesAlbumProps (line 6) | interface ImagesAlbumProps {
  function ImagesAlbum (line 10) | function ImagesAlbum({ images }: ImagesAlbumProps) {

FILE: frontend/nextjs/components/Langgraph/Langgraph.js
  function startLanggraphResearch (line 4) | async function startLanggraphResearch(newQuestion, report_source, langgr...

FILE: frontend/nextjs/components/ResearchBlocks/AccessReport.tsx
  type AccessReportProps (line 4) | interface AccessReportProps {

FILE: frontend/nextjs/components/ResearchBlocks/ChatInterface.tsx
  type ChatInterfaceProps (line 7) | interface ChatInterfaceProps {

FILE: frontend/nextjs/components/ResearchBlocks/ChatResponse.tsx
  type ChatResponseProps (line 7) | interface ChatResponseProps {
  function ChatResponse (line 25) | function ChatResponse({ answer, metadata }: ChatResponseProps) {

FILE: frontend/nextjs/components/ResearchBlocks/ImageSection.tsx
  type ImageSectionProps (line 5) | interface ImageSectionProps {

FILE: frontend/nextjs/components/ResearchBlocks/LogsSection.tsx
  type Log (line 5) | interface Log {
  type OrderedLogsProps (line 12) | interface OrderedLogsProps {

FILE: frontend/nextjs/components/ResearchBlocks/Question.tsx
  type QuestionProps (line 4) | interface QuestionProps {

FILE: frontend/nextjs/components/ResearchBlocks/Report.tsx
  function Report (line 9) | function Report({ answer, researchId }: { answer: string, researchId?: s...

FILE: frontend/nextjs/components/ResearchBlocks/Sources.tsx
  function Sources (line 5) | function Sources({

FILE: frontend/nextjs/components/ResearchBlocks/elements/ChatInput.tsx
  type TChatInputProps (line 5) | type TChatInputProps = {
  function debounce (line 13) | function debounce(func: Function, wait: number) {

FILE: frontend/nextjs/components/ResearchBlocks/elements/InputArea.tsx
  type TInputAreaProps (line 5) | type TInputAreaProps = {
  function debounce (line 16) | function debounce(func: Function, wait: number) {

FILE: frontend/nextjs/components/ResearchBlocks/elements/LogMessage.tsx
  type ProcessedData (line 8) | type ProcessedData = {
  type Log (line 14) | type Log = {
  type LogMessageProps (line 21) | interface LogMessageProps {

FILE: frontend/nextjs/components/ResearchBlocks/elements/SubQuestions.tsx
  type SubQuestionsProps (line 3) | interface SubQuestionsProps {

FILE: frontend/nextjs/components/ResearchResults.tsx
  type ResearchResultsProps (line 12) | interface ResearchResultsProps {

FILE: frontend/nextjs/components/ResearchSidebar.tsx
  type ResearchSidebarProps (line 7) | interface ResearchSidebarProps {

FILE: frontend/nextjs/components/Settings/ChatBox.tsx
  type ChatBoxProps (line 9) | interface ChatBoxProps {
  type OutputData (line 14) | interface OutputData {
  type WebSocketMessage (line 20) | interface WebSocketMessage {
  function ChatBox (line 25) | function ChatBox({ chatBoxSettings, setChatBoxSettings }: ChatBoxProps) {

FILE: frontend/nextjs/components/Settings/LayoutSelector.tsx
  type LayoutSelectorProps (line 3) | interface LayoutSelectorProps {
  function LayoutSelector (line 8) | function LayoutSelector({ layoutType, onLayoutChange }: LayoutSelectorPr...

FILE: frontend/nextjs/components/Settings/MCPSelector.tsx
  type MCPConfig (line 3) | interface MCPConfig {
  type MCPSelectorProps (line 10) | interface MCPSelectorProps {

FILE: frontend/nextjs/components/Settings/Modal.tsx
  type ChatBoxProps (line 8) | interface ChatBoxProps {
  type Domain (line 13) | interface Domain {

FILE: frontend/nextjs/components/Settings/ToneSelector.tsx
  type ToneSelectorProps (line 3) | interface ToneSelectorProps {
  function ToneSelector (line 7) | function ToneSelector({ tone, onToneChange }: ToneSelectorProps) {

FILE: frontend/nextjs/components/Task/Accordion.tsx
  type ProcessedData (line 6) | type ProcessedData = {
  type Log (line 12) | type Log = {
  type AccordionProps (line 18) | interface AccordionProps {

FILE: frontend/nextjs/components/Task/AgentLogs.tsx
  function AgentLogs (line 1) | function AgentLogs({agentLogs}:any){

FILE: frontend/nextjs/components/Task/DomainFilter.tsx
  type DomainFilterProps (line 4) | interface DomainFilterProps {
  function DomainFilter (line 12) | function DomainFilter({

FILE: frontend/nextjs/components/Task/Report.tsx
  function Report (line 5) | function Report({report}:any) {

FILE: frontend/nextjs/components/Task/ResearchForm.tsx
  type ResearchFormProps (line 10) | interface ResearchFormProps {
  function ResearchForm (line 21) | function ResearchForm({

FILE: frontend/nextjs/components/layouts/CopilotLayout.tsx
  type CopilotLayoutProps (line 8) | interface CopilotLayoutProps {
  function CopilotLayout (line 22) | function CopilotLayout({

FILE: frontend/nextjs/components/layouts/MobileLayout.tsx
  type MobileLayoutProps (line 8) | interface MobileLayoutProps {
  function MobileLayout (line 22) | function MobileLayout({

FILE: frontend/nextjs/components/layouts/ResearchPageLayout.tsx
  type ResearchPageLayoutProps (line 7) | interface ResearchPageLayoutProps {
  function ResearchPageLayout (line 22) | function ResearchPageLayout({

FILE: frontend/nextjs/components/mobile/MobileChatPanel.tsx
  type MobileChatPanelProps (line 13) | interface MobileChatPanelProps {
  function processMarkdown (line 180) | function processMarkdown(content: string): string {

FILE: frontend/nextjs/components/mobile/MobileHomeScreen.tsx
  type MobileHomeScreenProps (line 7) | interface MobileHomeScreenProps {
  function MobileHomeScreen (line 16) | function MobileHomeScreen({

FILE: frontend/nextjs/components/mobile/MobileResearchContent.tsx
  type MobileResearchContentProps (line 7) | interface MobileResearchContentProps {
  function MobileResearchContent (line 21) | function MobileResearchContent({

FILE: frontend/nextjs/components/research/CopilotPanel.tsx
  type CopilotPanelProps (line 9) | interface CopilotPanelProps {

FILE: frontend/nextjs/components/research/CopilotResearchContent.tsx
  type CopilotResearchContentProps (line 6) | interface CopilotResearchContentProps {
  function CopilotResearchContent (line 28) | function CopilotResearchContent({

FILE: frontend/nextjs/components/research/NotFoundContent.tsx
  type NotFoundContentProps (line 3) | interface NotFoundContentProps {
  function NotFoundContent (line 7) | function NotFoundContent({ onNewResearch }: NotFoundContentProps) {

FILE: frontend/nextjs/components/research/ResearchContent.tsx
  type ResearchContentProps (line 8) | interface ResearchContentProps {
  function ResearchContent (line 31) | function ResearchContent({

FILE: frontend/nextjs/components/research/ResearchPanel.tsx
  type ResearchPanelProps (line 7) | interface ResearchPanelProps {

FILE: frontend/nextjs/helpers/findDifferences.ts
  type Value (line 1) | type Value = string | number | boolean | null | undefined | object | Val...
  type Changes (line 2) | type Changes = { [key: string]: { before: Value; after: Value } | Change...
  function findDifferences (line 4) | function findDifferences<T extends Record<string, any>>(obj1: T, obj2: T...

FILE: frontend/nextjs/helpers/getHost.ts
  type GetHostParams (line 1) | interface GetHostParams {

FILE: frontend/nextjs/hooks/ResearchHistoryContext.tsx
  type ResearchHistoryContextType (line 8) | interface ResearchHistoryContextType {

FILE: frontend/nextjs/hooks/useAnalytics.ts
  type ResearchData (line 3) | interface ResearchData {
  type TrackResearchData (line 9) | interface TrackResearchData {

FILE: frontend/nextjs/hooks/useScrollHandler.ts
  function useScrollHandler (line 3) | function useScrollHandler(

FILE: frontend/nextjs/next.config.mjs
  method rewrites (line 19) | async rewrites() {

FILE: frontend/nextjs/public/workbox-f1770938.js
  class s (line 1) | class s extends Error{constructor(t,s){super(e(t,s)),this.name=t,this.de...
    method constructor (line 1) | constructor(t,s){super(e(t,s)),this.name=t,this.details=s}
  class r (line 1) | class r{constructor(t,e,s="GET"){this.handler=n(e),this.match=t,this.met...
    method constructor (line 1) | constructor(t,e,s="GET"){this.handler=n(e),this.match=t,this.method=s}
    method setCatchHandler (line 1) | setCatchHandler(t){this.catchHandler=n(t)}
  class i (line 1) | class i extends r{constructor(t,e,s){super(({url:e})=>{const s=t.exec(e....
    method constructor (line 1) | constructor(t,e,s){super(({url:e})=>{const s=t.exec(e.href);if(s&&(e.o...
  class a (line 1) | class a{constructor(){this.t=new Map,this.i=new Map}get routes(){return ...
    method constructor (line 1) | constructor(){this.t=new Map,this.i=new Map}
    method routes (line 1) | get routes(){return this.t}
    method addFetchListener (line 1) | addFetchListener(){self.addEventListener("fetch",t=>{const{request:e}=...
    method addCacheListener (line 1) | addCacheListener(){self.addEventListener("message",t=>{if(t.data&&"CAC...
    method handleRequest (line 1) | handleRequest({request:t,event:e}){const s=new URL(t.url,location.href...
    method findMatchingRoute (line 1) | findMatchingRoute({url:t,sameOrigin:e,request:s,event:n}){const r=this...
    method setDefaultHandler (line 1) | setDefaultHandler(t,e="GET"){this.i.set(e,n(t))}
    method setCatchHandler (line 1) | setCatchHandler(t){this.o=n(t)}
    method registerRoute (line 1) | registerRoute(t){this.t.has(t.method)||this.t.set(t.method,[]),this.t....
    method unregisterRoute (line 1) | unregisterRoute(t){if(!this.t.has(t.method))throw new s("unregister-ro...
  function h (line 1) | function h(t,e,n){let a;if("string"==typeof t){const s=new URL(t,locatio...
  function p (line 1) | function p(t,e){const s=new URL(t);for(const t of e)s.searchParams.delet...
  class y (line 1) | class y{constructor(){this.promise=new Promise((t,e)=>{this.resolve=t,th...
    method constructor (line 1) | constructor(){this.promise=new Promise((t,e)=>{this.resolve=t,this.rej...
  function m (line 1) | function m(t){return"string"==typeof t?new Request(t):t}
  class v (line 1) | class v{constructor(t,e){this.h={},Object.assign(this,e),this.event=e.ev...
    method constructor (line 1) | constructor(t,e){this.h={},Object.assign(this,e),this.event=e.event,th...
    method fetch (line 1) | async fetch(t){const{event:e}=this;let n=m(t);if("navigate"===n.mode&&...
    method fetchAndCachePut (line 1) | async fetchAndCachePut(t){const e=await this.fetch(t),s=e.clone();retu...
    method cacheMatch (line 1) | async cacheMatch(t){const e=m(t);let s;const{cacheName:n,matchOptions:...
    method cachePut (line 1) | async cachePut(t,e){const n=m(t);var r;await(r=0,new Promise(t=>setTim...
    method getCacheKey (line 1) | async getCacheKey(t,e){const s=`${t.url} | ${e}`;if(!this.h[s]){let n=...
    method hasCallback (line 1) | hasCallback(t){for(const e of this.u.plugins)if(t in e)return!0;return!1}
    method runCallbacks (line 1) | async runCallbacks(t,e){for(const s of this.iterateCallbacks(t))await ...
    method iterateCallbacks (line 1) | *iterateCallbacks(t){for(const e of this.u.plugins)if("function"==type...
    method waitUntil (line 1) | waitUntil(t){return this.p.push(t),t}
    method doneWaiting (line 1) | async doneWaiting(){let t;for(;t=this.p.shift();)await t}
    method destroy (line 1) | destroy(){this.l.resolve(null)}
    method R (line 1) | async R(t){let e=t,s=!1;for(const t of this.iterateCallbacks("cacheWil...
  class R (line 1) | class R{constructor(t={}){this.cacheName=d(t.cacheName),this.plugins=t.p...
    method constructor (line 1) | constructor(t={}){this.cacheName=d(t.cacheName),this.plugins=t.plugins...
    method handle (line 1) | handle(t){const[e]=this.handleAll(t);return e}
    method handleAll (line 1) | handleAll(t){t instanceof FetchEvent&&(t={event:t,request:t.request});...
    method q (line 1) | async q(t,e,n){let r;await t.runCallbacks("handlerWillStart",{event:n,...
    method D (line 1) | async D(t,e,s,n){let r,i;try{r=await t}catch(i){}try{await e.runCallba...
  function b (line 1) | function b(t){t.then(()=>{})}
  function q (line 1) | function q(){return q=Object.assign?Object.assign.bind():function(t){for...
  method get (line 1) | get(t,e,s){if(t instanceof IDBTransaction){if("done"===e)return L.get(t)...
  function O (line 1) | function O(t){return t!==IDBDatabase.prototype.transaction||"objectStore...
  function T (line 1) | function T(t){return"function"==typeof t?O(t):(t instanceof IDBTransacti...
  function k (line 1) | function k(t){if(t instanceof IDBRequest)return function(t){const e=new ...
  function j (line 1) | function j(t,e){if(!(t instanceof IDBDatabase)||e in t||"string"!=typeof...
  class A (line 1) | class A{constructor(t){this._=null,this.L=t}I(t){const e=t.createObjectS...
    method constructor (line 1) | constructor(t){this._=null,this.L=t}
    method I (line 1) | I(t){const e=t.createObjectStore(S,{keyPath:"id"});e.createIndex("cach...
    method C (line 1) | C(t){this.I(t),this.L&&function(t,{blocked:e}={}){const s=indexedDB.de...
    method setTimestamp (line 1) | async setTimestamp(t,e){const s={url:t=K(t),timestamp:e,cacheName:this...
    method getTimestamp (line 1) | async getTimestamp(t){const e=await this.getDb(),s=await e.get(S,this....
    method expireEntries (line 1) | async expireEntries(t,e){const s=await this.getDb();let n=await s.tran...
    method N (line 1) | N(t){return this.L+"|"+K(t)}
    method getDb (line 1) | async getDb(){return this._||(this._=await function(t,e,{blocked:s,upg...
  class F (line 1) | class F{constructor(t,e={}){this.O=!1,this.T=!1,this.k=e.maxEntries,this...
    method constructor (line 1) | constructor(t,e={}){this.O=!1,this.T=!1,this.k=e.maxEntries,this.B=e.m...
    method expireEntries (line 1) | async expireEntries(){if(this.O)return void(this.T=!0);this.O=!0;const...
    method updateTimestamp (line 1) | async updateTimestamp(t){await this.M.setTimestamp(t,Date.now())}
    method isURLExpired (line 1) | async isURLExpired(t){if(this.B){const e=await this.M.getTimestamp(t),...
    method delete (line 1) | async delete(){this.T=!1,await this.M.expireEntries(1/0)}
  function H (line 1) | async function H(t,e){try{if(206===e.status)return e;const n=t.headers.g...
  function $ (line 1) | function $(t,e){const s=e();return t.waitUntil(s),s}
  function z (line 1) | function z(t){if(!t)throw new s("add-to-cache-list-unexpected-type",{ent...
  class G (line 1) | class G{constructor(){this.updatedURLs=[],this.notUpdatedURLs=[],this.ha...
    method constructor (line 1) | constructor(){this.updatedURLs=[],this.notUpdatedURLs=[],this.handlerW...
  class V (line 1) | class V{constructor({precacheController:t}){this.cacheKeyWillBeUsed=asyn...
    method constructor (line 1) | constructor({precacheController:t}){this.cacheKeyWillBeUsed=async({req...
  function X (line 1) | async function X(t,e){let n=null;if(t.url){n=new URL(t.url).origin}if(n!...
  class Y (line 1) | class Y extends R{constructor(t={}){t.cacheName=w(t.cacheName),super(t),...
    method constructor (line 1) | constructor(t={}){t.cacheName=w(t.cacheName),super(t),this.j=!1!==t.fa...
    method U (line 1) | async U(t,e){const s=await e.cacheMatch(t);return s||(e.event&&"instal...
    method K (line 1) | async K(t,e){let n;const r=e.params||{};if(!this.j)throw new s("missin...
    method S (line 1) | async S(t,e){this.A();const n=await e.fetch(t);if(!await e.cachePut(t,...
    method A (line 1) | A(){let t=null,e=0;for(const[s,n]of this.plugins.entries())n!==Y.copyR...
  class Z (line 1) | class Z{constructor({cacheName:t,plugins:e=[],fallbackToNetwork:s=!0}={}...
    method constructor (line 1) | constructor({cacheName:t,plugins:e=[],fallbackToNetwork:s=!0}={}){this...
    method strategy (line 1) | get strategy(){return this.u}
    method precache (line 1) | precache(t){this.addToCacheList(t),this.G||(self.addEventListener("ins...
    method addToCacheList (line 1) | addToCacheList(t){const e=[];for(const n of t){"string"==typeof n?e.pu...
    method install (line 1) | install(t){return $(t,async()=>{const e=new G;this.strategy.plugins.pu...
    method activate (line 1) | activate(t){return $(t,async()=>{const t=await self.caches.open(this.s...
    method getURLsToCacheKeys (line 1) | getURLsToCacheKeys(){return this.F}
    method getCachedURLs (line 1) | getCachedURLs(){return[...this.F.keys()]}
    method getCacheKeyForURL (line 1) | getCacheKeyForURL(t){const e=new URL(t,location.href);return this.F.ge...
    method getIntegrityForCacheKey (line 1) | getIntegrityForCacheKey(t){return this.$.get(t)}
    method matchPrecache (line 1) | async matchPrecache(t){const e=t instanceof Request?t.url:t,s=this.get...
    method createHandlerBoundToURL (line 1) | createHandlerBoundToURL(t){const e=this.getCacheKeyForURL(t);if(!e)thr...
  class et (line 1) | class et extends r{constructor(t,e){super(({request:s})=>{const n=t.getU...
    method constructor (line 1) | constructor(t,e){super(({request:s})=>{const n=t.getURLsToCacheKeys();...
  method U (line 1) | async U(t,e){let n,r=await e.cacheMatch(t);if(!r)try{r=await e.fetchAndC...
  method constructor (line 1) | constructor(t={}){this.cachedResponseWillBeUsed=async({event:t,request:e...
  method J (line 1) | J(t){if(t===d())throw new s("expire-custom-caches-only");let e=this.Y.ge...
  method V (line 1) | V(t){if(!this.B)return!0;const e=this.Z(t);if(null===e)return!0;return e...
  method Z (line 1) | Z(t){if(!t.headers.has("date"))return null;const e=t.headers.get("date")...
  method deleteCacheAndMetadata (line 1) | async deleteCacheAndMetadata(){for(const[t,e]of this.Y)await self.caches...
  method constructor (line 1) | constructor(t={}){super(t),this.plugins.some(t=>"cacheWillUpdate"in t)||...
  method U (line 1) | async U(t,e){const n=[],r=[];let i;if(this.tt){const{id:s,promise:a}=thi...
  method et (line 1) | et({request:t,logs:e,handler:s}){let n;return{promise:new Promise(e=>{n=...
  method st (line 1) | async st({timeoutId:t,request:e,logs:s,handler:n}){let r,i;try{i=await n...
  method constructor (line 1) | constructor(){this.cachedResponseWillBeUsed=async({request:t,cachedRespo...
  method constructor (line 1) | constructor(t={}){super(t),this.plugins.some(t=>"cacheWillUpdate"in t)||...
  method U (line 1) | async U(t,e){const n=e.fetchAndCachePut(t).catch(()=>{});e.waitUntil(n);...

FILE: frontend/nextjs/rollup.config.js
  method transform (line 14) | transform(code) {

FILE: frontend/nextjs/src/GPTResearcher.tsx
  type GPTResearcherProps (line 18) | interface GPTResearcherProps {

FILE: frontend/nextjs/src/index.d.ts
  type GPTResearcherProps (line 4) | interface GPTResearcherProps {

FILE: frontend/nextjs/src/utils/imageTransformPlugin.js
  function imageTransformPlugin (line 2) | function imageTransformPlugin() {

FILE: frontend/nextjs/types/data.ts
  type BaseData (line 1) | interface BaseData {
  type BasicData (line 5) | interface BasicData extends BaseData {
  type LanggraphButtonData (line 10) | interface LanggraphButtonData extends BaseData {
  type DifferencesData (line 15) | interface DifferencesData extends BaseData {
  type QuestionData (line 21) | interface QuestionData extends BaseData {
  type ChatData (line 26) | interface ChatData extends BaseData {
  type Data (line 32) | type Data = BasicData | LanggraphButtonData | DifferencesData | Question...
  type MCPConfig (line 34) | interface MCPConfig {
  type ChatBoxSettings (line 41) | interface ChatBoxSettings {
  type Domain (line 53) | interface Domain {
  type ChatMessage (line 57) | interface ChatMessage {
  type ResearchHistoryItem (line 64) | interface ResearchHistoryItem {

FILE: frontend/nextjs/types/react-ga4.d.ts
  type InitOptions (line 2) | interface InitOptions {

FILE: frontend/nextjs/utils/getLayout.tsx
  type LayoutProps (line 7) | interface LayoutProps {

FILE: frontend/scripts.js
  function setCookie (line 1415) | function setCookie(name, value, days) {
  function getCookie (line 1475) | function getCookie(name) {
  function deleteCookie (line 1509) | function deleteCookie(name) {

FILE: gpt_researcher/actions/agent_creator.py
  function choose_agent (line 18) | async def choose_agent(
  function handle_json_error (line 65) | async def handle_json_error(response: str | None):
  function extract_json_with_regex (line 110) | def extract_json_with_regex(response: str | None) -> str | None:

FILE: gpt_researcher/actions/markdown_processing.py
  function extract_headers (line 5) | def extract_headers(markdown_text: str) -> List[Dict]:
  function extract_sections (line 41) | def extract_sections(markdown_text: str) -> List[Dict[str, str]]:
  function table_of_contents (line 68) | def table_of_contents(markdown_text: str) -> str:
  function add_references (line 94) | def add_references(report_markdown: str, visited_urls: set) -> str:

FILE: gpt_researcher/actions/query_processing.py
  function get_search_results (line 12) | async def get_search_results(query: str, retriever: Any, query_domains: ...
  function generate_sub_queries (line 37) | async def generate_sub_queries(
  function plan_research_outline (line 112) | async def plan_research_outline(

FILE: gpt_researcher/actions/report_generation.py
  function write_report_introduction (line 12) | async def write_report_introduction(
  function write_conclusion (line 63) | async def write_conclusion(
  function summarize_url (line 115) | async def summarize_url(
  function generate_draft_section_titles (line 160) | async def generate_draft_section_titles(
  function generate_report (line 209) | async def generate_report(

FILE: gpt_researcher/actions/retriever.py
  function get_retriever (line 8) | def get_retriever(retriever: str):
  function get_retrievers (line 99) | def get_retrievers(headers: dict[str, str], cfg):
  function get_default_retriever (line 139) | def get_default_retriever():

FILE: gpt_researcher/actions/utils.py
  function stream_output (line 7) | async def stream_output(
  function safe_send_json (line 35) | async def safe_send_json(websocket: Any, data: Dict[str, Any]) -> None:
  function calculate_cost (line 62) | def calculate_cost(
  function format_token_count (line 100) | def format_token_count(count: int) -> str:
  function update_cost (line 113) | async def update_cost(
  function create_cost_callback (line 145) | def create_cost_callback(websocket: Any) -> Callable:

FILE: gpt_researcher/actions/web_scraping.py
  function scrape_urls (line 12) | async def scrape_urls(
  function filter_urls (line 45) | async def filter_urls(urls: list[str], config: Config) -> list[str]:
  function extract_main_content (line 64) | async def extract_main_content(html_content: str) -> str:
  function process_scraped_data (line 79) | async def process_scraped_data(scraped_data: list[dict[str, Any]], confi...

FILE: gpt_researcher/agent.py
  class GPTResearcher (line 36) | class GPTResearcher:
    method __init__ (line 52) | def __init__(
    method _generate_research_id (line 202) | def _generate_research_id(self) -> str:
    method _resolve_mcp_strategy (line 216) | def _resolve_mcp_strategy(self, mcp_strategy: str | None, mcp_max_iter...
    method _process_mcp_configs (line 282) | def _process_mcp_configs(self, mcp_configs: list[dict]) -> None:
    method _log_event (line 311) | async def _log_event(self, event_type: str, **kwargs):
    method conduct_research (line 331) | async def conduct_research(self, on_progress=None):
    method _handle_deep_research (line 403) | async def _handle_deep_research(self, on_progress=None):
    method write_report (line 451) | async def write_report(
    method write_report_conclusion (line 494) | async def write_report_conclusion(self, report_body: str) -> str:
    method write_introduction (line 508) | async def write_introduction(self) -> str:
    method quick_search (line 519) | async def quick_search(self, query: str, query_domains: list[str] = No...
    method get_subtopics (line 553) | async def get_subtopics(self):
    method get_draft_section_titles (line 561) | async def get_draft_section_titles(self, current_subtopic: str) -> lis...
    method get_similar_written_contents_by_draft_section_titles (line 572) | async def get_similar_written_contents_by_draft_section_titles(
    method get_research_images (line 598) | def get_research_images(self, top_k: int = 10) -> list[dict[str, Any]]:
    method add_research_images (line 609) | def add_research_images(self, images: list[dict[str, Any]]) -> None:
    method get_research_sources (line 617) | def get_research_sources(self) -> list[dict[str, Any]]:
    method add_research_sources (line 625) | def add_research_sources(self, sources: list[dict[str, Any]]) -> None:
    method add_references (line 633) | def add_references(self, report_markdown: str, visited_urls: set) -> str:
    method extract_headers (line 645) | def extract_headers(self, markdown_text: str) -> list[dict]:
    method extract_sections (line 656) | def extract_sections(self, markdown_text: str) -> list[dict]:
    method table_of_contents (line 667) | def table_of_contents(self, markdown_text: str) -> str:
    method get_source_urls (line 678) | def get_source_urls(self) -> list:
    method get_research_context (line 686) | def get_research_context(self) -> list:
    method get_costs (line 694) | def get_costs(self) -> float:
    method get_step_costs (line 702) | def get_step_costs(self) -> dict[str, float]:
    method set_verbose (line 710) | def set_verbose(self, verbose: bool) -> None:
    method add_costs (line 718) | def add_costs(self, cost: float) -> None:

FILE: gpt_researcher/config/config.py
  class Config (line 19) | class Config:
    method __init__ (line 34) | def __init__(self, config_path: str | None = None):
    method _set_attributes (line 62) | def _set_attributes(self, config: Dict[str, Any]) -> None:
    method _set_embedding_attributes (line 85) | def _set_embedding_attributes(self) -> None:
    method _set_llm_attributes (line 91) | def _set_llm_attributes(self) -> None:
    method _handle_deprecated_attributes (line 98) | def _handle_deprecated_attributes(self) -> None:
    method _set_doc_path (line 147) | def _set_doc_path(self, config: Dict[str, Any]) -> None:
    method load_config (line 157) | def load_config(cls, config_path: str | None) -> Dict[str, Any]:
    method list_available_configs (line 180) | def list_available_configs(cls) -> List[str]:
    method parse_retrievers (line 188) | def parse_retrievers(self, retriever_str: str) -> List[str]:
    method parse_llm (line 204) | def parse_llm(llm_str: str | None) -> tuple[str | None, str | None]:
    method parse_reasoning_effort (line 224) | def parse_reasoning_effort(reasoning_effort_str: str | None) -> str | ...
    method parse_embedding (line 233) | def parse_embedding(embedding_str: str | None) -> tuple[str | None, st...
    method validate_doc_path (line 252) | def validate_doc_path(self):
    method convert_env_value (line 257) | def convert_env_value(key: str, env_value: str, type_hint: Type) -> Any:
    method set_verbose (line 291) | def set_verbose(self, verbose: bool) -> None:
    method get_mcp_server_config (line 295) | def get_mcp_server_config(self, name: str) -> dict:

FILE: gpt_researcher/config/variables/base.py
  class BaseConfig (line 5) | class BaseConfig(TypedDict):

FILE: gpt_researcher/context/compression.py
  class VectorstoreCompressor (line 36) | class VectorstoreCompressor:
    method __init__ (line 48) | def __init__(
    method async_get_context (line 71) | async def async_get_context(self, query: str, max_results: int = 5) ->...
  class ContextCompressor (line 85) | class ContextCompressor:
    method __init__ (line 98) | def __init__(
    method __get_contextual_retriever (line 122) | def __get_contextual_retriever(self):
    method async_get_context (line 143) | async def async_get_context(self, query: str, max_results: int = 5, co...
  class WrittenContentCompressor (line 181) | class WrittenContentCompressor:
    method __init__ (line 194) | def __init__(self, documents, embeddings, similarity_threshold: float,...
    method __get_contextual_retriever (line 208) | def __get_contextual_retriever(self):
    method __pretty_docs_list (line 228) | def __pretty_docs_list(self, docs, top_n: int) -> list[str]:
    method async_get_context (line 240) | async def async_get_context(self, query: str, max_results: int = 5, co...

FILE: gpt_researcher/context/retriever.py
  class SearchAPIRetriever (line 10) | class SearchAPIRetriever(BaseRetriever):
    method _get_relevant_documents (line 14) | def _get_relevant_documents(
  class SectionRetriever (line 31) | class SectionRetriever(BaseRetriever):
    method _get_relevant_documents (line 48) | def _get_relevant_documents(

FILE: gpt_researcher/document/azure_document_loader.py
  class AzureDocumentLoader (line 5) | class AzureDocumentLoader:
    method __init__ (line 6) | def __init__(self, container_name, connection_string):
    method load (line 10) | async def load(self):

FILE: gpt_researcher/document/document.py
  class DocumentLoader (line 16) | class DocumentLoader:
    method __init__ (line 18) | def __init__(self, path: Union[str, List[str]]):
    method load (line 21) | async def load(self) -> list:
    method _load_document (line 63) | async def _load_document(self, file_path: str, file_extension: str) ->...

FILE: gpt_researcher/document/langchain_document.py
  class LangChainDocumentLoader (line 10) | class LangChainDocumentLoader:
    method __init__ (line 12) | def __init__(self, documents: List[Document]):
    method load (line 15) | async def load(self, metadata_source_index="title") -> List[Dict[str, ...

FILE: gpt_researcher/document/online_document.py
  class OnlineDocumentLoader (line 15) | class OnlineDocumentLoader:
    method __init__ (line 17) | def __init__(self, urls):
    method load (line 20) | async def load(self) -> list:
    method _download_and_process (line 36) | async def _download_and_process(self, url: str) -> list:
    method _load_document (line 62) | async def _load_document(self, file_path: str, file_extension: str) ->...
    method _get_extension (line 90) | def _get_extension(url: str) -> str:

FILE: gpt_researcher/llm_provider/generic/base.py
  class ReasoningEfforts (line 67) | class ReasoningEfforts(Enum):
  class ChatLogger (line 73) | class ChatLogger:
    method __init__ (line 78) | def __init__(self, fname: str):
    method log_request (line 82) | async def log_request(self, messages, response):
  class GenericLLMProvider (line 91) | class GenericLLMProvider:
    method __init__ (line 93) | def __init__(self, llm, chat_log: str | None = None,  verbose: bool = ...
    method from_provider (line 98) | def from_provider(cls, provider: str, chat_log: str | None = None, ver...
    method get_chat_response (line 275) | async def get_chat_response(self, messages, stream, websocket=None, **...
    method stream_response (line 290) | async def stream_response(self, messages, websocket=None, **kwargs):
    method _send_output (line 309) | async def _send_output(self, content, websocket=None):
  function _check_pkg (line 316) | def _check_pkg(pkg: str) -> None:

FILE: gpt_researcher/llm_provider/image/image_generator.py
  class ImageGeneratorProvider (line 22) | class ImageGeneratorProvider:
    method __init__ (line 48) | def __init__(
    method _ensure_client (line 75) | def _ensure_client(self):
    method _ensure_output_dir (line 91) | def _ensure_output_dir(self, research_id: str = "") -> Path:
    method _generate_image_filename (line 101) | def _generate_image_filename(self, prompt: str, index: int = 0) -> str:
    method _crop_to_landscape (line 106) | def _crop_to_landscape(self, image_bytes: bytes, target_ratio: float =...
    method _build_enhanced_prompt (line 156) | def _build_enhanced_prompt(self, prompt: str, context: str = "", style...
    method generate_image (line 225) | async def generate_image(
    method _generate_with_gemini (line 268) | async def _generate_with_gemini(
    method _generate_with_imagen (line 351) | async def _generate_with_imagen(
    method _generate_alt_text (line 411) | def _generate_alt_text(self, prompt: str) -> str:
    method is_available (line 420) | def is_available(self) -> bool:
    method from_config (line 432) | def from_config(cls, config) -> Optional["ImageGeneratorProvider"]:

FILE: gpt_researcher/mcp/client.py
  class MCPClientManager (line 19) | class MCPClientManager:
    method __init__ (line 29) | def __init__(self, mcp_configs: List[Dict[str, Any]]):
    method convert_configs_to_langchain_format (line 40) | def convert_configs_to_langchain_format(self) -> Dict[str, Dict[str, A...
    method get_or_create_client (line 105) | async def get_or_create_client(self) -> Optional[object]:
    method close_client (line 138) | async def close_client(self):
    method get_all_tools (line 155) | async def get_all_tools(self) -> List:

FILE: gpt_researcher/mcp/research.py
  class MCPResearchSkill (line 13) | class MCPResearchSkill:
    method __init__ (line 23) | def __init__(self, cfg, researcher=None):
    method conduct_research_with_tools (line 34) | async def conduct_research_with_tools(self, query: str, selected_tools...
    method _process_tool_result (line 158) | def _process_tool_result(self, tool_name: str, result: Any) -> List[Di...

FILE: gpt_researcher/mcp/streaming.py
  class MCPStreamer (line 13) | class MCPStreamer:
    method __init__ (line 23) | def __init__(self, websocket=None):
    method stream_log (line 32) | async def stream_log(self, message: str, data: Any = None):
    method stream_log_sync (line 49) | def stream_log_sync(self, message: str, data: Any = None):
    method stream_stage_start (line 66) | async def stream_stage_start(self, stage: str, description: str):
    method stream_stage_complete (line 70) | async def stream_stage_complete(self, stage: str, result_count: int = ...
    method stream_tool_selection (line 77) | async def stream_tool_selection(self, selected_count: int, total_count...
    method stream_tool_execution (line 81) | async def stream_tool_execution(self, tool_name: str, step: int, total...
    method stream_research_results (line 85) | async def stream_research_results(self, result_count: int, total_chars...
    method stream_error (line 92) | async def stream_error(self, error_msg: str):
    method stream_warning (line 96) | async def stream_warning(self, warning_msg: str):
    method stream_info (line 100) | async def stream_info(self, info_msg: str):

FILE: gpt_researcher/mcp/tool_selector.py
  class MCPToolSelector (line 14) | class MCPToolSelector:
    method __init__ (line 24) | def __init__(self, cfg, researcher=None):
    method select_relevant_tools (line 35) | async def select_relevant_tools(self, query: str, all_tools: List, max...
    method _call_llm_for_tool_selection (line 129) | async def _call_llm_for_tool_selection(self, prompt: str) -> str:
    method _fallback_tool_selection (line 163) | def _fallback_tool_selection(self, all_tools: List, max_tools: int) ->...

FILE: gpt_researcher/memory/embeddings.py
  class Memory (line 55) | class Memory:
    method __init__ (line 72) | def __init__(self, embedding_provider: str, model: str, **embedding_kw...
    method get_embeddings (line 209) | def get_embeddings(self):

FILE: gpt_researcher/prompts.py
  class PromptFamily (line 14) | class PromptFamily:
    method __init__ (line 31) | def __init__(self, config: Config):
    method generate_mcp_tool_selection_prompt (line 40) | def generate_mcp_tool_selection_prompt(query: str, tools_info: List[Di...
    method generate_mcp_research_prompt (line 86) | def generate_mcp_research_prompt(query: str, selected_tools: List) -> ...
    method generate_image_analysis_prompt (line 122) | def generate_image_analysis_prompt(
    method generate_image_prompt_enhancement (line 178) | def generate_image_prompt_enhancement(
    method generate_search_queries_prompt (line 213) | def generate_search_queries_prompt(
    method generate_report_prompt (line 258) | def generate_report_prompt(
    method curate_sources (line 315) | def curate_sources(query, sources, max_results=10):
    method generate_resource_report_prompt (line 349) | def generate_resource_report_prompt(
    method generate_custom_report_prompt (line 389) | def generate_custom_report_prompt(
    method generate_outline_report_prompt (line 395) | def generate_outline_report_prompt(
    method generate_deep_research_prompt (line 414) | def generate_deep_research_prompt(
    method auto_agent_instructions (line 486) | def auto_agent_instructions():
    method generate_summary_prompt (line 514) | def generate_summary_prompt(query, data):
    method generate_quick_summary_prompt (line 528) | def generate_quick_summary_prompt(query: str, context: str) -> str:
    method pretty_print_docs (line 551) | def pretty_print_docs(docs: list[Document], top_n: int | None = None) ...
    method join_local_web_documents (line 560) | def join_local_web_documents(docs_context: str, web_context: str) -> str:
    method generate_subtopics_prompt (line 569) | def generate_subtopics_prompt() -> str:
    method generate_subtopic_report_prompt (line 592) | def generate_subtopic_report_prompt(
    method generate_draft_titles_prompt (line 669) | def generate_draft_titles_prompt(
    method generate_report_introduction (line 703) | def generate_report_introduction(question: str, research_summary: str ...
    method generate_report_conclusion (line 716) | def generate_report_conclusion(query: str, report_content: str, langua...
  class GranitePromptFamily (line 752) | class GranitePromptFamily(PromptFamily):
    method _get_granite_class (line 756) | def _get_granite_class(self) -> type[PromptFamily]:
    method pretty_print_docs (line 765) | def pretty_print_docs(self, *args, **kwargs) -> str:
    method join_local_web_documents (line 768) | def join_local_web_documents(self, *args, **kwargs) -> str:
  class Granite3PromptFamily (line 772) | class Granite3PromptFamily(PromptFamily):
    method pretty_print_docs (line 779) | def pretty_print_docs(cls, docs: list[Document], top_n: int | None = N...
    method join_local_web_documents (line 792) | def join_local_web_documents(cls, docs_context: str | list, web_contex...
  class Granite33PromptFamily (line 802) | class Granite33PromptFamily(PromptFamily):
    method _get_content (line 810) | def _get_content(doc: Document) -> str:
    method pretty_print_docs (line 817) | def pretty_print_docs(cls, docs: list[Document], top_n: int | None = N...
    method join_local_web_documents (line 828) | def join_local_web_documents(cls, docs_context: str | list, web_contex...
  function get_prompt_by_report_type (line 858) | def get_prompt_by_report_type(
  function get_prompt_family (line 885) | def get_prompt_family(

FILE: gpt_researcher/retrievers/arxiv/arxiv.py
  class ArxivSearch (line 4) | class ArxivSearch:
    method __init__ (line 8) | def __init__(self, query, sort='Relevance', query_domains=None):
    method search (line 15) | def search(self, max_results=5):

FILE: gpt_researcher/retrievers/bing/bing.py
  class BingSearch (line 10) | class BingSearch():
    method __init__ (line 15) | def __init__(self, query, query_domains=None):
    method get_api_key (line 26) | def get_api_key(self):
    method search (line 39) | def search(self, max_results=7) -> list[dict[str]]:

FILE: gpt_researcher/retrievers/bocha/bocha.py
  class BoChaSearch (line 10) | class BoChaSearch():
    method __init__ (line 15) | def __init__(self, query, query_domains=None):
    method search (line 25) | def search(self, max_results=7) -> list[dict[str]]:

FILE: gpt_researcher/retrievers/custom/custom.py
  class CustomRetriever (line 6) | class CustomRetriever:
    method __init__ (line 11) | def __init__(self, query: str, query_domains=None):
    method _populate_params (line 19) | def _populate_params(self) -> Dict[str, Any]:
    method search (line 29) | def search(self, max_results: int = 5) -> Optional[List[Dict[str, Any]]]:

FILE: gpt_researcher/retrievers/duckduckgo/duckduckgo.py
  class Duckduckgo (line 5) | class Duckduckgo:
    method __init__ (line 9) | def __init__(self, query, query_domains=None):
    method search (line 16) | def search(self, max_results=5):

FILE: gpt_researcher/retrievers/exa/exa.py
  class ExaSearch (line 5) | class ExaSearch:
    method __init__ (line 10) | def __init__(self, query, query_domains=None):
    method _retrieve_api_key (line 24) | def _retrieve_api_key(self):
    method search (line 41) | def search(
    method find_similar (line 68) | def find_similar(self, url, exclude_source_domain=False, **filters):
    method get_contents (line 87) | def get_contents(self, ids, **options):

FILE: gpt_researcher/retrievers/google/google.py
  class GoogleSearch (line 9) | class GoogleSearch:
    method __init__ (line 13) | def __init__(self, query, headers=None, query_domains=None):
    method get_api_key (line 25) | def get_api_key(self):
    method get_cx_key (line 39) | def get_cx_key(self):
    method search (line 53) | def search(self, max_results=7):

FILE: gpt_researcher/retrievers/mcp/retriever.py
  class MCPRetriever (line 27) | class MCPRetriever:
    method __init__ (line 44) | def __init__(
    method _get_mcp_configs (line 91) | def _get_mcp_configs(self) -> List[Dict[str, Any]]:
    method _get_config (line 102) | def _get_config(self):
    method search_async (line 116) | async def search_async(self, max_results: int = 10) -> List[Dict[str, ...
    method search (line 201) | def search(self, max_results: int = 10) -> List[Dict[str, str]]:
    method _get_all_tools (line 300) | async def _get_all_tools(self) -> List:

FILE: gpt_researcher/retrievers/pubmed_central/pubmed_central.py
  class PubMedCentralSearch (line 7) | class PubMedCentralSearch:
    method __init__ (line 12) | def __init__(self, query: str, query_domains=None):
    method _populate_params (line 27) | def _populate_params(self) -> Dict[str, Any]:
    method _search_articles (line 42) | def _search_articles(self, max_results: int) -> Optional[List[str]]:
    method _fetch_full_text (line 73) | def _fetch_full_text(self, article_id: str) -> Optional[Dict[str, str]]:
    method search (line 126) | def search(self, max_results: int = 5) -> Optional[List[Dict[str, Any]]]:

FILE: gpt_researcher/retrievers/searchapi/searchapi.py
  class SearchApiSearch (line 9) | class SearchApiSearch():
    method __init__ (line 13) | def __init__(self, query, query_domains=None):
    method get_api_key (line 22) | def get_api_key(self):
    method search (line 35) | def search(self, max_results=7):

FILE: gpt_researcher/retrievers/searx/searx.py
  class SearxSearch (line 8) | class SearxSearch():
    method __init__ (line 12) | def __init__(self, query: str, query_domains=None):
    method get_searxng_url (line 22) | def get_searxng_url(self) -> str:
    method search (line 39) | def search(self, max_results: int = 10) -> List[Dict[str, str]]:

FILE: gpt_researcher/retrievers/semantic_scholar/semantic_scholar.py
  class SemanticScholarSearch (line 6) | class SemanticScholarSearch:
    method __init__ (line 14) | def __init__(self, query: str, sort: str = "relevance", query_domains=...
    method search (line 25) | def search(self, max_results: int = 20) -> List[Dict[str, str]]:

FILE: gpt_researcher/retrievers/serpapi/serpapi.py
  class SerpApiSearch (line 9) | class SerpApiSearch():
    method __init__ (line 13) | def __init__(self, query, query_domains=None):
    method get_api_key (line 23) | def get_api_key(self):
    method search (line 36) | def search(self, max_results=7):

FILE: gpt_researcher/retrievers/serper/serper.py
  class SerperSearch (line 9) | class SerperSearch():
    method __init__ (line 13) | def __init__(self, query, query_domains=None, country=None, language=N...
    method _get_exclude_sites_from_env (line 32) | def _get_exclude_sites_from_env(self):
    method get_api_key (line 44) | def get_api_key(self):
    method search (line 57) | def search(self, max_results=7):

FILE: gpt_researcher/retrievers/tavily/tavily_search.py
  class TavilySearch (line 14) | class TavilySearch:
    method __init__ (line 19) | def __init__(self, query, headers=None, topic="general", query_domains...
    method get_api_key (line 39) | def get_api_key(self):
    method _search (line 57) | def _search(
    method search (line 100) | def search(self, max_results=10):

FILE: gpt_researcher/retrievers/utils.py
  function stream_output (line 14) | async def stream_output(log_type, step, content, websocket=None, with_da...
  function check_pkg (line 44) | def check_pkg(pkg: str) -> None:
  function get_all_retriever_names (line 80) | def get_all_retriever_names():

FILE: gpt_researcher/scraper/arxiv/arxiv.py
  class ArxivScraper (line 4) | class ArxivScraper:
    method __init__ (line 6) | def __init__(self, link, session=None):
    method scrape (line 10) | def scrape(self):

FILE: gpt_researcher/scraper/beautiful_soup/beautiful_soup.py
  class BeautifulSoupScraper (line 6) | class BeautifulSoupScraper:
    method __init__ (line 8) | def __init__(self, link, session=None):
    method scrape (line 12) | def scrape(self):

FILE: gpt_researcher/scraper/browser/browser.py
  class BrowserScraper (line 24) | class BrowserScraper:
    method __init__ (line 25) | def __init__(self, url: str, session=None):
    method scrape (line 38) | def scrape(self) -> tuple:
    method _import_selenium (line 61) | def _import_selenium(self):
    method setup_driver (line 83) | def setup_driver(self) -> None:
    method _load_saved_cookies (line 121) | def _load_saved_cookies(self):
    method _load_browser_cookies (line 131) | def _load_browser_cookies(self):
    method _cleanup_cookie_file (line 152) | def _cleanup_cookie_file(self):
    method _generate_random_string (line 163) | def _generate_random_string(self, length):
    method _get_domain (line 167) | def _get_domain(self):
    method _visit_google_and_save_cookies (line 175) | def _visit_google_and_save_cookies(self):
    method scrape_text_with_selenium (line 191) | def scrape_text_with_selenium(self) -> tuple:
    method _scroll_to_bottom (line 226) | def _scroll_to_bottom(self):
    method _scroll_to_percentage (line 237) | def _scroll_to_percentage(self, ratio: float) -> None:
    method _add_header (line 243) | def _add_header(self) -> None:

FILE: gpt_researcher/scraper/browser/nodriver_scraper.py
  class NoDriverScraper (line 16) | class NoDriverScraper:
    method get_domain (line 24) | def get_domain(url: str) -> str:
    class Browser (line 31) | class Browser:
      method __init__ (line 32) | def __init__(
      method get (line 45) | async def get(self, url: str) -> "zendriver.Tab":
      method scroll_page_to_bottom (line 59) | async def scroll_page_to_bottom(self, page: "zendriver.Tab"):
      method wait_or_timeout (line 82) | async def wait_or_timeout(
      method close_page (line 99) | async def close_page(self, page: "zendriver.Tab"):
      method rate_limit_for_domain (line 108) | async def rate_limit_for_domain(self, url: str):
      method stop (line 130) | async def stop(self):
    method get_browser (line 137) | async def get_browser(cls, headless: bool = False) -> "NoDriverScraper...
    method release_browser (line 175) | async def release_browser(cls, browser: Browser):
    method __init__ (line 185) | def __init__(self, url: str, session: requests.Session | None = None):
    method scrape_async (line 190) | async def scrape_async(self) -> Tuple[str, list[dict], str]:

FILE: gpt_researcher/scraper/browser/processing/html.py
  function extract_hyperlinks (line 8) | def extract_hyperlinks(soup: BeautifulSoup, base_url: str) -> list[tuple...
  function format_hyperlinks (line 24) | def format_hyperlinks(hyperlinks: list[tuple[str, str]]) -> list[str]:

FILE: gpt_researcher/scraper/browser/processing/scrape_skills.py
  function scrape_pdf_with_pymupdf (line 5) | def scrape_pdf_with_pymupdf(url) -> str:
  function scrape_pdf_with_arxiv (line 19) | def scrape_pdf_with_arxiv(query) -> str:

FILE: gpt_researcher/scraper/firecrawl/firecrawl.py
  class FireCrawl (line 5) | class FireCrawl:
    method __init__ (line 7) | def __init__(self, link, session=None):
    method get_api_key (line 13) | def get_api_key(self) -> str:
    method get_server_url (line 26) | def get_server_url(self) -> str:
    method scrape (line 39) | def scrape(self) -> tuple:

FILE: gpt_researcher/scraper/pymupdf/pymupdf.py
  class PyMuPDFScraper (line 8) | class PyMuPDFScraper:
    method __init__ (line 10) | def __init__(self, link, session=None):
    method is_url (line 21) | def is_url(self) -> bool:
    method scrape (line 34) | def scrape(self) -> tuple[str, list[str], str]:

FILE: gpt_researcher/scraper/scraper.py
  class Scraper (line 30) | class Scraper:
    method __init__ (line 35) | def __init__(self, urls, user_agent, scraper, worker_pool: WorkerPool):
    method run (line 63) | async def run(self):
    method _check_pkg (line 74) | def _check_pkg(self, scrapper_name: str) -> None:
    method extract_data_from_url (line 108) | async def extract_data_from_url(self, link, session):
    method get_scraper (line 171) | def get_scraper(self, link):

FILE: gpt_researcher/scraper/tavily_extract/tavily_extract.py
  class TavilyExtract (line 5) | class TavilyExtract:
    method __init__ (line 7) | def __init__(self, link, session=None):
    method get_api_key (line 13) | def get_api_key(self) -> str:
    method scrape (line 26) | def scrape(self) -> tuple:

FILE: gpt_researcher/scraper/utils.py
  function get_relevant_images (line 16) | def get_relevant_images(soup: BeautifulSoup, url: str) -> list:
  function parse_dimension (line 58) | def parse_dimension(value: str) -> int:
  function extract_title (line 68) | def extract_title(soup: BeautifulSoup) -> str:
  function get_image_hash (line 72) | def get_image_hash(image_url: str) -> str:
  function clean_soup (line 94) | def clean_soup(soup: BeautifulSoup) -> BeautifulSoup:
  function get_text_from_soup (line 127) | def get_text_from_soup(soup: BeautifulSoup) -> str:

FILE: gpt_researcher/scraper/web_base_loader/web_base_loader.py
  class WebBaseLoaderScraper (line 6) | class WebBaseLoaderScraper:
    method __init__ (line 8) | def __init__(self, link, session=None):
    method scrape (line 12) | def scrape(self) -> tuple:

FILE: gpt_researcher/skills/browser.py
  class BrowserManager (line 14) | class BrowserManager:
    method __init__ (line 25) | def __init__(self, researcher):
    method browse_urls (line 37) | async def browse_urls(self, urls: list[str]) -> list[dict]:
    method select_top_images (line 86) | def select_top_images(self, images: list[dict], k: int = 2) -> list[str]:

FILE: gpt_researcher/skills/context_manager.py
  class ContextManager (line 18) | class ContextManager:
    method __init__ (line 29) | def __init__(self, researcher):
    method get_similar_content_by_query (line 37) | async def get_similar_content_by_query(self, query: str, pages: list) ...
    method get_similar_content_by_query_with_vectorstore (line 65) | async def get_similar_content_by_query_with_vectorstore(self, query: s...
    method get_similar_written_contents_by_draft_section_titles (line 88) | async def get_similar_written_contents_by_draft_section_titles(
    method __get_similar_written_contents_by_query (line 120) | async def __get_similar_written_contents_by_query(

FILE: gpt_researcher/skills/curator.py
  class SourceCurator (line 15) | class SourceCurator:
    method __init__ (line 25) | def __init__(self, researcher):
    method curate_sources (line 33) | async def curate_sources(

FILE: gpt_researcher/skills/deep_research.py
  function count_words (line 17) | def count_words(text) -> int:
  function trim_context_to_word_limit (line 23) | def trim_context_to_word_limit(context_list: List[str], max_words: int =...
  class ResearchProgress (line 39) | class ResearchProgress:
    method __init__ (line 40) | def __init__(self, total_depth: int, total_breadth: int):
  class DeepResearchSkill (line 50) | class DeepResearchSkill:
    method __init__ (line 51) | def __init__(self, researcher):
    method generate_search_queries (line 65) | async def generate_search_queries(self, query: str, num_queries: int =...
    method generate_research_plan (line 99) | async def generate_research_plan(self, query: str, num_questions: int ...
    method process_research_results (line 147) | async def process_research_results(self, query: str, context: str, num...
    method deep_research (line 199) | async def deep_research(
    method run (line 363) | async def run(self, on_progress=None) -> str:

FILE: gpt_researcher/skills/image_generator.py
  class ImageGenerator (line 20) | class ImageGenerator:
    method __init__ (line 33) | def __init__(self, researcher):
    method _init_provider (line 48) | def _init_provider(self):
    method is_enabled (line 65) | def is_enabled(self) -> bool:
    method plan_and_generate_images (line 73) | async def plan_and_generate_images(
    method _plan_image_concepts (line 178) | async def _plan_image_concepts(
    method analyze_report_for_images (line 267) | async def analyze_report_for_images(
    method _extract_sections (line 320) | def _extract_sections(self, report: str) -> List[Dict[str, Any]]:
    method _build_analysis_prompt (line 367) | def _build_analysis_prompt(
    method _parse_analysis_response (line 419) | def _parse_analysis_response(
    method generate_images_for_report (line 463) | async def generate_images_for_report(
    method _embed_images_in_report (line 583) | def _embed_images_in_report(
    method get_generated_images (line 627) | def get_generated_images(self) -> List[Dict[str, Any]]:
    method process_image_placeholders (line 635) | async def process_image_placeholders(

FILE: gpt_researcher/skills/researcher.py
  class ResearchConductor (line 21) | class ResearchConductor:
    method __init__ (line 34) | def __init__(self, researcher):
    method plan_research (line 48) | async def plan_research(self, query, query_domains=None):
    method conduct_research (line 89) | async def conduct_research(self):
    method _get_context_by_urls (line 213) | async def _get_context_by_urls(self, urls):
    method _get_context_by_vectorstore (line 233) | async def _get_context_by_vectorstore(self, query, filter: dict | None...
    method _get_context_by_web_search (line 266) | async def _get_context_by_web_search(self, query, scraped_data: list |...
    method _get_mcp_strategy (line 367) | def _get_mcp_strategy(self) -> str:
    method _execute_mcp_research_for_queries (line 393) | async def _execute_mcp_research_for_queries(self, queries: list, mcp_r...
    method _process_sub_query (line 449) | async def _process_sub_query(self, sub_query: str, scraped_data: list ...
    method _execute_mcp_research (line 580) | async def _execute_mcp_research(self, retriever, query):
    method _combine_mcp_and_web_context (line 654) | def _combine_mcp_and_web_context(self, mcp_context: list, web_context:...
    method _process_sub_query_with_vectorstore (line 707) | async def _process_sub_query_with_vectorstore(self, sub_query: str, fi...
    method _get_new_urls (line 728) | async def _get_new_urls(self, url_set_input):
    method _search_relevant_source_urls (line 751) | async def _search_relevant_source_urls(self, query, query_domains: lis...
    method _scrape_data_by_urls (line 784) | async def _scrape_data_by_urls(self, sub_query, query_domains: list | ...
    method _search (line 816) | async def _search(self, retriever, query):
    method _extract_content (line 906) | async def _extract_content(self, results):
    method _summarize_content (line 943) | async def _summarize_content(self, query, content):
    method _update_search_progress (line 967) | async def _update_search_progress(self, current, total):

FILE: gpt_researcher/skills/writer.py
  class ReportGenerator (line 20) | class ReportGenerator:
    method __init__ (line 31) | def __init__(self, researcher):
    method write_report (line 49) | async def write_report(self, existing_headers: list = [], relevant_wri...
    method write_report_conclusion (line 125) | async def write_report_conclusion(self, report_content: str) -> str:
    method write_introduction (line 164) | async def write_introduction(self):
    method get_subtopics (line 195) | async def get_subtopics(self):
    method get_draft_section_titles (line 224) | async def get_draft_section_titles(self, current_subtopic: str):

FILE: gpt_researcher/utils/costs.py
  function estimate_llm_cost (line 18) | def estimate_llm_cost(input_content: str, output_content: str) -> float:
  function estimate_embedding_cost (line 38) | def estimate_embedding_cost(model: str, docs: list) -> float:

FILE: gpt_researcher/utils/enum.py
  class ReportType (line 6) | class ReportType(Enum):
  class ReportSource (line 30) | class ReportSource(Enum):
  class Tone (line 54) | class Tone(Enum):
  class PromptFamily (line 94) | class PromptFamily(Enum):

FILE: gpt_researcher/utils/llm.py
  function get_llm (line 27) | def get_llm(llm_provider: str, **kwargs):
  function create_chat_completion (line 41) | async def create_chat_completion(
  function construct_subtopics (line 138) | async def construct_subtopics(

FILE: gpt_researcher/utils/logger.py
  function get_formatted_logger (line 11) | def get_formatted_logger():
  class ColourizedFormatter (line 40) | class ColourizedFormatter(logging.Formatter):
    method __init__ (line 58) | def __init__(
    method color_level_name (line 71) | def color_level_name(self, level_name: str, level_no: int) -> str:
    method should_use_colors (line 78) | def should_use_colors(self) -> bool:
    method formatMessage (line 81) | def formatMessage(self, record: logging.LogRecord) -> str:
  class DefaultFormatter (line 94) | class DefaultFormatter(ColourizedFormatter):
    method should_use_colors (line 95) | def should_use_colors(self) -> bool:

FILE: gpt_researcher/utils/logging_config.py
  class JSONResearchHandler (line 7) | class JSONResearchHandler:
    method __init__ (line 8) | def __init__(self, json_file):
    method log_event (line 22) | def log_event(self, event_type: str, data: dict):
    method update_content (line 30) | def update_content(self, key: str, value):
    method _save_json (line 34) | def _save_json(self):
  function setup_research_logging (line 38) | def setup_research_logging():
  function get_research_logger (line 78) | def get_research_logger():
  function get_json_handler (line 81) | def get_json_handler():

FILE: gpt_researcher/utils/rate_limiter.py
  class GlobalRateLimiter (line 13) | class GlobalRateLimiter:
    method __new__ (line 24) | def __new__(cls):
    method __init__ (line 30) | def __init__(self):
    method get_lock (line 45) | def get_lock(cls):
    method configure (line 51) | def configure(self, rate_limit_delay: float):
    method wait_if_needed (line 60) | async def wait_if_needed(self):
    method reset (line 81) | def reset(self):
  function get_global_rate_limiter (line 90) | def get_global_rate_limiter() -> GlobalRateLimiter:

FILE: gpt_researcher/utils/tools.py
  function create_chat_completion_with_tools (line 20) | async def create_chat_completion_with_tools(
  function create_search_tool (line 198) | def create_search_tool(search_function: Callable[[str], Dict]) -> Callable:
  function create_custom_tool (line 242) | def create_custom_tool(
  function get_available_providers_with_tools (line 288) | def get_available_providers_with_tools() -> List[str]:
  function supports_tools (line 307) | def supports_tools(provider: str) -> bool:

FILE: gpt_researcher/utils/validators.py
  class Subtopic (line 8) | class Subtopic(BaseModel):
  class Subtopics (line 17) | class Subtopics(BaseModel):

FILE: gpt_researcher/utils/workers.py
  class WorkerPool (line 8) | class WorkerPool:
    method __init__ (line 9) | def __init__(self, max_workers: int, rate_limit_delay: float = 0.0):
    method throttle (line 36) | async def throttle(self):

FILE: gpt_researcher/vector_store/vector_store.py
  class VectorStoreWrapper (line 10) | class VectorStoreWrapper:
    method __init__ (line 14) | def __init__(self, vector_store : VectorStore):
    method load (line 17) | def load(self, documents):
    method _create_langchain_documents (line 26) | def _create_langchain_documents(self, data: List[Dict[str, str]]) -> L...
    method _split_documents (line 30) | def _split_documents(self, documents: List[Document], chunk_size: int ...
    method asimilarity_search (line 40) | async def asimilarity_search(self, query, k, filter):

FILE: json_schema_generator.py
  class UserSchema (line 5) | class UserSchema(BaseModel):
  function generate_structured_json (line 12) | def generate_structured_json(schema: BaseModel, data: Dict[str, Any]) ->...

FILE: multi_agents/agents/editor.py
  class EditorAgent (line 13) | class EditorAgent:
    method __init__ (line 16) | def __init__(self, websocket=None, stream_output=None, tone=None, head...
    method plan_research (line 22) | async def plan_research(self, research_state: Dict[str, any]) -> Dict[...
    method run_parallel_research (line 52) | async def run_parallel_research(self, research_state: Dict[str, any]) ...
    method _create_planning_prompt (line 79) | def _create_planning_prompt(self, initial_research: str, include_human...
    method _format_planning_instructions (line 96) | def _format_planning_instructions(self, initial_research: str, include...
    method _initialize_agents (line 118) | def _initialize_agents(self) -> Dict[str, any]:
    method _create_workflow (line 126) | def _create_workflow(self) -> StateGraph:
    method _log_parallel_research (line 146) | def _log_parallel_research(self, queries: List[str]) -> None:
    method _create_task_input (line 161) | def _create_task_input(self, research_state: Dict[str, any], query: st...

FILE: multi_agents/agents/human.py
  class HumanAgent (line 4) | class HumanAgent:
    method __init__ (line 5) | def __init__(self, websocket=None, stream_output=None, headers=None):
    method review_plan (line 10) | async def review_plan(self, research_state: dict):

FILE: multi_agents/agents/orchestrator.py
  class ChiefEditorAgent (line 19) | class ChiefEditorAgent:
    method __init__ (line 22) | def __init__(self, task: dict, websocket=None, stream_output=None, ton...
    method _generate_task_id (line 31) | def _generate_task_id(self):
    method _create_output_directory (line 35) | def _create_output_directory(self):
    method _initialize_agents (line 43) | def _initialize_agents(self):
    method _create_workflow (line 52) | def _create_workflow(self, agents):
    method _add_workflow_edges (line 68) | def _add_workflow_edges(self, workflow):
    method init_research_team (line 83) | def init_research_team(self):
    method _log_research_start (line 88) | async def _log_research_start(self):
    method run_research_task (line 95) | async def run_research_task(self, task_id=None):

FILE: multi_agents/agents/publisher.py
  class PublisherAgent (line 9) | class PublisherAgent:
    method __init__ (line 10) | def __init__(self, output_dir: str, websocket=None, stream_output=None...
    method publish_research_report (line 16) | async def publish_research_report(self, research_state: dict, publish_...
    method generate_layout (line 22) | def generate_layout(self, research_state: dict):
    method write_report_by_formats (line 55) | async def write_report_by_formats(self, layout:str, publish_formats: d...
    method run (line 63) | async def run(self, research_state: dict):

FILE: multi_agents/agents/researcher.py
  class ResearchAgent (line 6) | class ResearchAgent:
    method __init__ (line 7) | def __init__(self, websocket=None, stream_output=None, tone=None, head...
    method research (line 13) | async def research(self, query: str, research_report: str = "research_...
    method run_subtopic_research (line 25) | async def run_subtopic_research(self, parent_query: str, subtopic: str...
    method run_initial_research (line 34) | async def run_initial_research(self, research_state: dict):
    method run_depth_research (line 46) | async def run_depth_research(self, draft_state: dict):

FILE: multi_agents/agents/reviewer.py
  class ReviewerAgent (line 9) | class ReviewerAgent:
    method __init__ (line 10) | def __init__(self, websocket=None, stream_output=None, headers=None):
    method review_draft (line 15) | async def review_draft(self, draft_state: dict):
    method run (line 63) | async def run(self, draft_state: dict):

FILE: multi_agents/agents/reviser.py
  class ReviserAgent (line 15) | class ReviserAgent:
    method __init__ (line 16) | def __init__(self, websocket=None, stream_output=None, headers=None):
    method revise_draft (line 21) | async def revise_draft(self, draft_state: dict):
    method run (line 54) | async def run(self, draft_state: dict):

FILE: multi_agents/agents/utils/file_formats.py
  function write_to_file (line 7) | async def write_to_file(filename: str, text: str) -> None:
  function write_text_to_md (line 24) | async def write_text_to_md(text: str, path: str) -> str:
  function write_md_to_pdf (line 40) | async def write_md_to_pdf(text: str, path: str) -> str:
  function write_md_to_word (line 72) | async def write_md_to_word(text: str, path: str) -> str:

FILE: multi_agents/agents/utils/llms.py
  function call_model (line 10) | async def call_model(

FILE: multi_agents/agents/utils/utils.py
  function sanitize_filename (line 3) | def sanitize_filename(filename: str) -> str:

FILE: multi_agents/agents/utils/views.py
  class AgentColor (line 5) | class AgentColor(Enum):
  function print_agent_output (line 15) | def print_agent_output(output:str, agent: str="RESEARCHER"):

FILE: multi_agents/agents/writer.py
  class WriterAgent (line 16) | class WriterAgent:
    method __init__ (line 17) | def __init__(self, websocket=None, stream_output=None, headers=None):
    method get_headers (line 22) | def get_headers(self, research_state: dict):
    method write_sections (line 32) | async def write_sections(self, research_state: dict):
    method revise_headers (line 69) | async def revise_headers(self, task: dict, headers: dict):
    method run (line 94) | async def run(self, research_state: dict):

FILE: multi_agents/main.py
  function open_task (line 17) | def open_task():
  function run_research_task (line 40) | async def run_research_task(query, websocket=None, stream_output=None, t...
  function main (line 52) | async def main():

FILE: multi_agents/memory/draft.py
  class DraftState (line 5) | class DraftState(TypedDict):

FILE: multi_agents/memory/research.py
  class ResearchState (line 5) | class ResearchState(TypedDict):

FILE: multi_agents_ag2/agents/editor.py
  class EditorAgent (line 8) | class EditorAgent:
    method __init__ (line 11) | def __init__(self, websocket=None, stream_output=None, tone=None, head...
    method plan_research (line 17) | async def plan_research(self, research_state: Dict[str, any]) -> Dict[...
    method _create_planning_prompt (line 43) | def _create_planning_prompt(
    method _format_planning_instructions (line 65) | def _format_planning_instructions(

FILE: multi_agents_ag2/agents/orchestrator.py
  class ChiefEditorAgent (line 20) | class ChiefEditorAgent:
    method __init__ (line 23) | def __init__(self, task: dict, websocket=None, stream_output=None, ton...
    method _generate_task_id (line 33) | def _generate_task_id(self) -> int:
    method _create_output_directory (line 36) | def _create_output_directory(self) -> str:
    method _llm_config (line 43) | def _llm_config(self) -> Dict[str, Any]:
    method _initialize_ag2_team (line 55) | def _initialize_ag2_team(self):
    method _chat (line 113) | def _chat(self, agent_key: str, message: str) -> None:
    method _log (line 119) | async def _log(self, agent_key: str, message: str, stream_tag: str = "...
    method _initialize_agents (line 126) | def _initialize_agents(self) -> Dict[str, Any]:
    method _run_section (line 137) | async def _run_section(self, agents: Dict[str, Any], topic: str, title...
    method _run_parallel_research (line 165) | async def _run_parallel_research(
    method run_research_task (line 171) | async def run_research_task(self, task_id: Optional[str] = None) -> Di...

FILE: multi_agents_ag2/main.py
  function open_task (line 15) | def open_task() -> dict:
  function run_research_task (line 37) | async def run_research_task(query, websocket=None, stream_output=None, t...
  function main (line 50) | async def main():

FILE: tests/documents-report-source.py
  function test_gpt_researcher (line 28) | async def test_gpt_researcher(report_type):

FILE: tests/gptr-logs-handler.py
  function run (line 7) | async def run() -> None:

FILE: tests/report-types.py
  function test_gpt_researcher (line 18) | async def test_gpt_researcher(report_type):

FILE: tests/research_test.py
  function get_report (line 23) | async def get_report(query: str, report_type: str, sources: list) -> str:

FILE: tests/test-openai-llm.py
  function main (line 7) | async def main():
  function test_llm (line 21) | async def test_llm(llm):

FILE: tests/test-your-embeddings.py
  function main (line 8) | async def main():

FILE: tests/test-your-llm.py
  function main (line 7) | async def main():

FILE: tests/test-your-retriever.py
  function test_scrape_data_by_query (line 10) | async def test_scrape_data_by_query():

FILE: tests/test_logging.py
  function test_custom_logs_handler (line 9) | async def test_custom_logs_handler():
  function test_content_update (line 38) | async def test_content_update():

FILE: tests/test_logging_output.py
  class TestWebSocket (line 12) | class TestWebSocket(WebSocket):
    method __init__ (line 13) | def __init__(self):
    method __bool__ (line 17) | def __bool__(self):
    method accept (line 20) | async def accept(self):
    method send_json (line 24) | async def send_json(self, event):
  function test_log_output_file (line 29) | async def test_log_output_file():

FILE: tests/test_logs.py
  function test_logs_creation (line 11) | def test_logs_creation():

FILE: tests/test_mcp.py
  function get_mcp_config (line 40) | def get_mcp_config():
  function get_github_mcp_config (line 53) | def get_github_mcp_config():
  function setup_environment (line 66) | def setup_environment():
  function test_web_search_mcp (line 93) | async def test_web_search_mcp():
  function test_github_mcp (line 155) | async def test_github_mcp():
  function main (line 217) | async def main():

FILE: tests/test_quick_search.py
  class TestQuickSearch (line 7) | class TestQuickSearch(unittest.TestCase):
    method test_quick_search_no_summary (line 12) | def test_quick_search_no_summary(self, mock_embeddings, mock_create_ch...
    method test_quick_search_with_summary (line 30) | def test_quick_search_with_summary(self, mock_embeddings, mock_create_...

FILE: tests/test_researcher_logging.py
  function test_researcher_logging (line 16) | async def test_researcher_logging():  # Renamed function to be more spec...

FILE: tests/test_security_fix.py
  class TestSecureFilename (line 28) | class TestSecureFilename:
    method test_basic_filename (line 31) | def test_basic_filename(self):
    method test_path_traversal_attacks (line 36) | def test_path_traversal_attacks(self):
    method test_null_byte_injection (line 47) | def test_null_byte_injection(self):
    method test_control_characters (line 54) | def test_control_characters(self):
    method test_unicode_normalization (line 60) | def test_unicode_normalization(self):
    method test_drive_letters_windows (line 68) | def test_drive_letters_windows(self):
    method test_reserved_names_windows (line 73) | def test_reserved_names_windows(self):
    method test_empty_filename (line 84) | def test_empty_filename(self):
    method test_filename_length_limit (line 95) | def test_filename_length_limit(self):
    method test_leading_dots_spaces (line 102) | def test_leading_dots_spaces(self):
  class TestValidateFilePath (line 109) | class TestValidateFilePath:
    method test_valid_path (line 112) | def test_valid_path(self):
    method test_path_traversal_blocked (line 119) | def test_path_traversal_blocked(self):
    method test_symlink_traversal_blocked (line 128) | def test_symlink_traversal_blocked(self):
  class TestHandleFileUpload (line 146) | class TestHandleFileUpload:
    method mock_file (line 150) | def mock_file(self):
    method temp_doc_path (line 158) | def temp_doc_path(self):
    method test_normal_file_upload (line 165) | async def test_normal_file_upload(self, mock_file, temp_doc_path):
    method test_malicious_filename_upload (line 193) | async def test_malicious_filename_upload(self, temp_doc_path):
    method test_empty_filename_upload (line 206) | async def test_empty_filename_upload(self, temp_doc_path):
    method test_file_conflict_handling (line 218) | async def test_file_conflict_handling(self, mock_file, temp_doc_path):
  class TestHandleFileDeletion (line 248) | class TestHandleFileDeletion:
    method temp_doc_path (line 252) | def temp_doc_path(self):
    method test_normal_file_deletion (line 259) | async def test_normal_file_deletion(self, temp_doc_path):
    method test_malicious_filename_deletion (line 273) | async def test_malicious_filename_deletion(self, temp_doc_path):
    method test_nonexistent_file_deletion (line 281) | async def test_nonexistent_file_deletion(self, temp_doc_path):
    method test_directory_deletion_blocked (line 289) | async def test_directory_deletion_blocked(self, temp_doc_path):
  class TestSecurityIntegration (line 302) | class TestSecurityIntegration:
    method test_attack_vectors_blocked (line 305) | def test_attack_vectors_blocked(self):
    method test_legitimate_files_allowed (line 331) | def test_legitimate_files_allowed(self):

FILE: tests/vector-store.py
  function load_document (line 102) | def load_document():
  function create_vectorstore (line 108) | def create_vectorstore(documents: List[Document]):
  function test_gpt_researcher_with_vector_store (line 113) | async def test_gpt_researcher_with_vector_store():
  function test_store_in_vector_store_web (line 142) | async def test_store_in_vector_store_web():
  function test_store_in_vector_store_urls (line 162) | async def test_store_in_vector_store_urls():
  function test_store_in_vector_store_langchain_docs (line 181) | async def test_store_in_vector_store_langchain_docs():
  function test_store_in_vector_store_locals (line 201) | async def test_store_in_vector_store_locals():
  function test_store_in_vector_store_hybrids (line 220) | async def test_store_in_vector_store_hybrids():
Condensed preview — 472 files, each showing path, character count, and a content snippet. Download the .json file for the full structured content (11,243K chars).
[
  {
    "path": ".claude/SKILL.md",
    "chars": 7247,
    "preview": "---\nname: gpt-researcher\ndescription: GPT Researcher is an autonomous deep research agent that conducts web and local re"
  },
  {
    "path": ".claude/references/adding-features.md",
    "chars": 10554,
    "preview": "# Adding Features Guide\n\n## Table of Contents\n- [The 8-Step Pattern](#the-8-step-pattern)\n- [Image Generation Case Study"
  },
  {
    "path": ".claude/references/advanced-patterns.md",
    "chars": 3094,
    "preview": "# Advanced Patterns Reference\n\n## Table of Contents\n- [Custom Callbacks](#custom-callbacks)\n- [Custom WebSocket Handler]"
  },
  {
    "path": ".claude/references/api-reference.md",
    "chars": 5775,
    "preview": "# API Reference\n\n## Table of Contents\n- [REST API](#rest-api)\n- [WebSocket API](#websocket-api)\n- [Python Client](#pytho"
  },
  {
    "path": ".claude/references/architecture.md",
    "chars": 7645,
    "preview": "# Architecture Reference\n\n## Table of Contents\n- [System Layers](#system-layers)\n- [Key File Locations](#key-file-locati"
  },
  {
    "path": ".claude/references/components.md",
    "chars": 6699,
    "preview": "# Core Components & Method Signatures\n\n## Table of Contents\n- [GPTResearcher](#gptresearcher)\n- [ResearchConductor](#res"
  },
  {
    "path": ".claude/references/config-reference.md",
    "chars": 3123,
    "preview": "# Configuration Reference\n\n## Table of Contents\n- [Required Variables](#required-variables)\n- [LLM Configuration](#llm-c"
  },
  {
    "path": ".claude/references/deep-research.md",
    "chars": 2134,
    "preview": "# Deep Research Mode Reference\n\n## Table of Contents\n- [Overview](#overview)\n- [Configuration](#configuration)\n- [DeepRe"
  },
  {
    "path": ".claude/references/flows.md",
    "chars": 10862,
    "preview": "# Research Flow & Data Flow\n\n## Table of Contents\n- [End-to-End Research Flow](#end-to-end-research-flow)\n- [Data Flow B"
  },
  {
    "path": ".claude/references/mcp.md",
    "chars": 2438,
    "preview": "# MCP Integration Reference\n\n## Table of Contents\n- [Overview](#overview)\n- [Configuration](#configuration)\n- [Strategy "
  },
  {
    "path": ".claude/references/multi-agents.md",
    "chars": 1814,
    "preview": "# Multi-Agent System Reference\n\n## Table of Contents\n- [Overview](#overview)\n- [Agent Roles](#agent-roles)\n- [Workflow]("
  },
  {
    "path": ".claude/references/prompts.md",
    "chars": 3719,
    "preview": "# Prompt System Reference\n\n## Table of Contents\n- [PromptFamily Class](#promptfamily-class)\n- [Key Prompt Examples](#key"
  },
  {
    "path": ".claude/references/retrievers.md",
    "chars": 3015,
    "preview": "# Retriever System Reference\n\n## Table of Contents\n- [Available Retrievers](#available-retrievers)\n- [Retriever Selectio"
  },
  {
    "path": ".cursorignore",
    "chars": 33,
    "preview": ".venv\n__pycache__\noutputs\n.github"
  },
  {
    "path": ".dockerignore",
    "chars": 13,
    "preview": ".git\noutput/\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 834,
    "preview": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Describe the b"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 595,
    "preview": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Is your fea"
  },
  {
    "path": ".github/dependabot.yml",
    "chars": 592,
    "preview": "# To get started with Dependabot version updates, you'll need to specify which\n# package ecosystems to update and where "
  },
  {
    "path": ".github/workflows/build.yml",
    "chars": 4676,
    "preview": "name: Build-Push and Update Image Tag\n\non:\n    push: \n        branches: [ master ]\n        paths-ignore:\n        - 'terr"
  },
  {
    "path": ".github/workflows/deploy.yml",
    "chars": 10652,
    "preview": "name: Terraform Deploy\non:\n    push:\n        branches: [ master ]\n        paths:\n        - 'terraform/**'\n    pull_reque"
  },
  {
    "path": ".github/workflows/docker-build.yml",
    "chars": 1466,
    "preview": "name: GPTR tests\nrun-name: ${{ github.actor }} ran the GPTR tests flow\npermissions:\n  contents: read\n  pull-requests: wr"
  },
  {
    "path": ".gitignore",
    "chars": 550,
    "preview": "#Ignore env containing secrets\n.env\n.venv\n.envrc\n\n#Ignore Virtual Env\nenv/\nvenv/\n.venv/\n\n# Other Environments\nENV/\nenv.b"
  },
  {
    "path": ".python-version",
    "chars": 4,
    "preview": "3.11"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 5163,
    "preview": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe, as members, contributors, and leaders, pledge to make partici"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 2775,
    "preview": "# Contributing to GPT Researcher\n\nFirst off, we'd like to welcome you and thank you for your interest and effort in cont"
  },
  {
    "path": "Dockerfile",
    "chars": 2808,
    "preview": "# Stage 1: Browser and build tools installation\n# Python 3.12+ required for LangChain v1\nFROM python:3.12-slim-bookworm "
  },
  {
    "path": "Dockerfile.fullstack",
    "chars": 11918,
    "preview": "########################################################################\n# Stage 1: Frontend build\n#####################"
  },
  {
    "path": "LICENSE",
    "chars": 11357,
    "preview": "                                 Apache License\n                           Version 2.0, January 2004\n                   "
  },
  {
    "path": "Procfile",
    "chars": 78,
    "preview": "web: python -m uvicorn backend.server.server:app --host=0.0.0.0 --port=${PORT}"
  },
  {
    "path": "README-ja_JP.md",
    "chars": 7716,
    "preview": "<div align=\"center\">\n<!--<h1 style=\"display: flex; align-items: center; gap: 10px;\">\n  <img src=\"https://github.com/assa"
  },
  {
    "path": "README-ko_KR.md",
    "chars": 10714,
    "preview": "<div align=\"center\">\n<!--<h1 style=\"display: flex; align-items: center; gap: 10px;\">\n  <img src=\"https://github.com/assa"
  },
  {
    "path": "README-zh_CN.md",
    "chars": 6771,
    "preview": "<div align=\"center\">\n<!--<h1 style=\"display: flex; align-items: center; gap: 10px;\">\n  <img src=\"https://github.com/assa"
  },
  {
    "path": "README.md",
    "chars": 16757,
    "preview": "<div align=\"center\" id=\"top\">\n\n<img src=\"https://github.com/assafelovic/gpt-researcher/assets/13554167/20af8286-b386-44a"
  },
  {
    "path": "backend/Dockerfile",
    "chars": 355,
    "preview": "FROM python:3.11-slim\n\nWORKDIR /app\n\n# Copy requirements first to leverage Docker cache\nCOPY requirements.txt .\nRUN pip "
  },
  {
    "path": "backend/Procfile",
    "chars": 68,
    "preview": "web: uvicorn server.app:app --host 0.0.0.0 --port $PORT --workers 1 "
  },
  {
    "path": "backend/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/chat/__init__.py",
    "chars": 31,
    "preview": "\n\n# Chat package initialization"
  },
  {
    "path": "backend/chat/chat.py",
    "chars": 10344,
    "preview": "import logging\nimport os\nimport uuid\nimport json\nfrom fastapi import WebSocket\nfrom typing import List, Dict, Any\n\nfrom "
  },
  {
    "path": "backend/memory/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/memory/draft.py",
    "chars": 178,
    "preview": "from typing import TypedDict, List, Annotated\nimport operator\n\n\nclass DraftState(TypedDict):\n    task: dict\n    topic: s"
  },
  {
    "path": "backend/memory/research.py",
    "chars": 368,
    "preview": "from typing import TypedDict, List, Annotated\nimport operator\n\n\nclass ResearchState(TypedDict):\n    task: dict\n    initi"
  },
  {
    "path": "backend/report_type/__init__.py",
    "chars": 165,
    "preview": "from .basic_report.basic_report import BasicReport\nfrom .detailed_report.detailed_report import DetailedReport\n\n__all__ "
  },
  {
    "path": "backend/report_type/basic_report/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/report_type/basic_report/basic_report.py",
    "chars": 2499,
    "preview": "import hashlib\nimport time\nfrom fastapi import WebSocket\nfrom typing import Any\n\nfrom gpt_researcher import GPTResearche"
  },
  {
    "path": "backend/report_type/deep_research/README.md",
    "chars": 4525,
    "preview": "# Deep Research ✨ NEW ✨\n\nWith the latest \"Deep Research\" trend in the AI community, we're excited to implement our own O"
  },
  {
    "path": "backend/report_type/deep_research/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/report_type/deep_research/example.py",
    "chars": 12816,
    "preview": "from typing import List, Dict, Any, Optional, Set\nfrom fastapi import WebSocket\nimport asyncio\nimport logging\nfrom gpt_r"
  },
  {
    "path": "backend/report_type/deep_research/main.py",
    "chars": 1221,
    "preview": "from gpt_researcher import GPTResearcher\nfrom backend.utils import write_md_to_pdf\nimport asyncio\n\n\nasync def main(task:"
  },
  {
    "path": "backend/report_type/detailed_report/README.md",
    "chars": 746,
    "preview": "## Detailed Reports\n\nIntroducing long and detailed reports, with a completely new architecture inspired by the latest [S"
  },
  {
    "path": "backend/report_type/detailed_report/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/report_type/detailed_report/detailed_report.py",
    "chars": 7884,
    "preview": "import asyncio\nimport hashlib\nimport time\nfrom typing import List, Dict, Set, Optional, Any\nfrom fastapi import WebSocke"
  },
  {
    "path": "backend/requirements.txt",
    "chars": 645,
    "preview": "# Backend-specific requirements\n# For production backend deployment\n\n# Core Framework\nfastapi>=0.104.1\nuvicorn>=0.24.0\np"
  },
  {
    "path": "backend/run_server.py",
    "chars": 549,
    "preview": "#!/usr/bin/env python3\n\"\"\"\nGPT-Researcher Backend Server Startup Script\n\nRun this to start the research API server.\n\"\"\"\n"
  },
  {
    "path": "backend/runtime.txt",
    "chars": 11,
    "preview": "python-3.11"
  },
  {
    "path": "backend/server/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "backend/server/app.py",
    "chars": 16025,
    "preview": "import json\nimport os\nfrom typing import Dict, List, Any\nimport time\nimport logging\nimport sys\nimport warnings\nfrom path"
  },
  {
    "path": "backend/server/logging_config.py",
    "chars": 2680,
    "preview": "import logging\nimport json\nimport os\nfrom datetime import datetime\nfrom pathlib import Path\n\nclass JSONResearchHandler:\n"
  },
  {
    "path": "backend/server/multi_agent_runner.py",
    "chars": 1065,
    "preview": "import os\nimport sys\nfrom typing import Any, Awaitable, Callable\n\nRunResearchTask = Callable[..., Awaitable[Any]]\n\n\ndef "
  },
  {
    "path": "backend/server/report_store.py",
    "chars": 2115,
    "preview": "import asyncio\nimport json\nfrom pathlib import Path\nfrom typing import Any, Dict, List\n\n\nclass ReportStore:\n    def __in"
  },
  {
    "path": "backend/server/server_utils.py",
    "chars": 15582,
    "preview": "import asyncio\nimport json\nimport os\nimport re\nimport time\nimport shutil\nimport traceback\nfrom typing import Awaitable, "
  },
  {
    "path": "backend/server/websocket_manager.py",
    "chars": 7506,
    "preview": "import asyncio\nimport datetime\nimport json\nimport logging\nimport traceback\nfrom typing import Dict, List\n\nfrom fastapi i"
  },
  {
    "path": "backend/styles/pdf_styles.css",
    "chars": 1093,
    "preview": "body {\n    font-family: 'Libre Baskerville', serif;\n    font-size: 12pt; /* standard size for academic papers */\n    lin"
  },
  {
    "path": "backend/utils.py",
    "chars": 4179,
    "preview": "import aiofiles\nimport urllib\nimport mistune\nimport os\n\nasync def write_to_file(filename: str, text: str) -> None:\n    \""
  },
  {
    "path": "citation.cff",
    "chars": 285,
    "preview": "cff-version: 1.0.0\nmessage: \"If you use this software, please cite it as below.\"\nauthors:\n  - family-names: Elovic\n    g"
  },
  {
    "path": "cli.py",
    "chars": 6461,
    "preview": "\"\"\"\nProvides a command line interface for the GPTResearcher class.\n\nUsage:\n\n```shell\npython cli.py \"<query>\" --report_ty"
  },
  {
    "path": "docker-compose.yml",
    "chars": 2320,
    "preview": "services:\n  gpt-researcher:\n    pull_policy: build\n    image: gptresearcher/gpt-researcher\n    build: ./\n    environment"
  },
  {
    "path": "docs/CNAME",
    "chars": 13,
    "preview": "docs.gptr.dev"
  },
  {
    "path": "docs/README.md",
    "chars": 791,
    "preview": "# Website\n\nThis website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.\n\n## Pr"
  },
  {
    "path": "docs/babel.config.js",
    "chars": 89,
    "preview": "module.exports = {\n  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],\n};\n"
  },
  {
    "path": "docs/blog/2023-09-22-gpt-researcher/index.md",
    "chars": 8268,
    "preview": "---\nslug: building-gpt-researcher\ntitle: How we built GPT Researcher\nauthors: [assafe]\ntags: [gpt-researcher, autonomous"
  },
  {
    "path": "docs/blog/2023-11-12-openai-assistant/index.md",
    "chars": 10823,
    "preview": "---\nslug: building-openai-assistant\ntitle: How to build an OpenAI Assistant with Internet access\nauthors: [assafe]\ntags:"
  },
  {
    "path": "docs/blog/2024-05-19-gptr-langgraph/index.md",
    "chars": 14108,
    "preview": "---\nslug: gptr-langgraph\ntitle: How to Build the Ultimate Research Multi-Agent Assistant\nauthors: [assafe]\ntags: [multi-"
  },
  {
    "path": "docs/blog/2024-09-7-hybrid-research/index.md",
    "chars": 11851,
    "preview": "---\nslug: gptr-hybrid\ntitle: The Future of Research is Hybrid\nauthors: [assafe]\ntags: [hybrid-research, gpt-researcher, "
  },
  {
    "path": "docs/blog/2025-02-26-deep-research/index.md",
    "chars": 12898,
    "preview": "# Introducing Deep Research: The Open Source Alternative\n\n## The Dawn of Deep Research in AI\n\nThe AI research landscape "
  },
  {
    "path": "docs/blog/2025-03-10-stepping-into-the-story/index.md",
    "chars": 6500,
    "preview": "---\nslug: stepping-into-the-story\ntitle: Stepping Into the Story of GPT Researcher\nauthors: [elishakay]\ntags: [ai, gpt-r"
  },
  {
    "path": "docs/blog/authors.yml",
    "chars": 393,
    "preview": "assafe:\n  name: Assaf Elovic\n  title: Creator @ GPT Researcher and Tavily\n  url: https://github.com/assafelovic\n  image_"
  },
  {
    "path": "docs/discord-bot/Dockerfile",
    "chars": 129,
    "preview": "FROM node:18.17.0-alpine\nWORKDIR /app\nCOPY ./package.json ./\nRUN npm install --legacy-peer-deps\nCOPY . .\nCMD [\"node\", \"i"
  },
  {
    "path": "docs/discord-bot/Dockerfile.dev",
    "chars": 159,
    "preview": "FROM node:18.17.0-alpine\nWORKDIR /app\nCOPY ./package.json ./\nRUN npm install --legacy-peer-deps\nRUN npm install -g nodem"
  },
  {
    "path": "docs/discord-bot/commands/ask.js",
    "chars": 297,
    "preview": "const { SlashCommandBuilder } = require('discord.js');\n\nmodule.exports = {\n    data: new SlashCommandBuilder()\n        ."
  },
  {
    "path": "docs/discord-bot/deploy-commands.js",
    "chars": 828,
    "preview": "const { Client, GatewayIntentBits, REST, Routes } = require('discord.js');\nrequire('dotenv').config();\n\n// Create a new "
  },
  {
    "path": "docs/discord-bot/gptr-webhook.js",
    "chars": 2586,
    "preview": "// gptr-webhook.js\nconst WebSocket = require('ws');\n\nlet socket = null;\nconst responseCallbacks = new Map(); // Using Ma"
  },
  {
    "path": "docs/discord-bot/index.js",
    "chars": 6267,
    "preview": "require('dotenv').config();\nconst { Client, GatewayIntentBits, ActionRowBuilder, Events, ModalBuilder, TextInputBuilder,"
  },
  {
    "path": "docs/discord-bot/package.json",
    "chars": 453,
    "preview": "{\n  \"name\": \"Discord-Bot-JS\",\n  \"version\": \"1.0.0\",\n  \"description\": \"\",\n  \"main\": \"index.js\",\n  \"dependencies\": {\n    \""
  },
  {
    "path": "docs/discord-bot/server.js",
    "chars": 774,
    "preview": "const express = require(\"express\")\n\nconst server = express()\n\nserver.all(\"/\", (req, res) => {\n  res.send(\"Bot is running"
  },
  {
    "path": "docs/docs/contribute.md",
    "chars": 390,
    "preview": "# Contribute\n\nWe highly welcome contributions! Please check out [contributing](https://github.com/assafelovic/gpt-resear"
  },
  {
    "path": "docs/docs/examples/custom_prompt.py",
    "chars": 3088,
    "preview": "\"\"\"\nCustom Prompt Example for GPT Researcher\n\nThis example demonstrates how to use the custom_prompt parameter to custom"
  },
  {
    "path": "docs/docs/examples/detailed_report.md",
    "chars": 3763,
    "preview": "# Detailed Report\n\n## Overview\n\nThe `DetailedReport` class inspired by the recent STORM paper, is a powerful component o"
  },
  {
    "path": "docs/docs/examples/examples.ipynb",
    "chars": 13858,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"6ab73899\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Tavily Sampl"
  },
  {
    "path": "docs/docs/examples/examples.md",
    "chars": 2156,
    "preview": "# Simple Run\n\n### Run PIP Package\n```python\nfrom gpt_researcher import GPTResearcher\nimport asyncio\n\n### Using Quick Run"
  },
  {
    "path": "docs/docs/examples/hybrid_research.md",
    "chars": 5318,
    "preview": "# Hybrid Research\n\n## Introduction\n\nGPT Researcher can combine web search capabilities with local document analysis to p"
  },
  {
    "path": "docs/docs/examples/pip-run.ipynb",
    "chars": 2552,
    "preview": "{\n  \"nbformat\": 4,\n  \"nbformat_minor\": 0,\n  \"metadata\": {\n    \"colab\": {\n      \"provenance\": []\n    },\n    \"kernelspec\":"
  },
  {
    "path": "docs/docs/examples/sample_report.py",
    "chars": 1577,
    "preview": "import nest_asyncio  # required for notebooks\n\nnest_asyncio.apply()\n\nfrom gpt_researcher import GPTResearcher\nimport asy"
  },
  {
    "path": "docs/docs/examples/sample_sources_only.py",
    "chars": 786,
    "preview": "from gpt_researcher import GPTResearcher\nimport asyncio\n\n\nasync def get_report(query: str, report_source: str, sources: "
  },
  {
    "path": "docs/docs/faq.md",
    "chars": 2562,
    "preview": "# FAQ\n\n### How do I get started?\nIt really depends on what you're aiming for. \n\nIf you're looking to connect your AI app"
  },
  {
    "path": "docs/docs/gpt-researcher/context/azure-storage.md",
    "chars": 644,
    "preview": "# Azure Storage\n\nIf you want to use Azure Blob Storage as the source for your GPT Researcher report context, follow thes"
  },
  {
    "path": "docs/docs/gpt-researcher/context/data-ingestion.md",
    "chars": 6161,
    "preview": "# Data Ingestion\n\nWhen you're dealing with a large amount of context data, you may want to start meditating upon a stand"
  },
  {
    "path": "docs/docs/gpt-researcher/context/filtering-by-domain.md",
    "chars": 2299,
    "preview": "# Filtering by Domain\n\nYou can filter web search results by specific domains when using either the Tavily or Google Sear"
  },
  {
    "path": "docs/docs/gpt-researcher/context/local-docs.md",
    "chars": 1066,
    "preview": "# Local Documents\n\n## Just Local Docs\n\nYou can instruct the GPT Researcher to run research tasks based on your local doc"
  },
  {
    "path": "docs/docs/gpt-researcher/context/tailored-research.md",
    "chars": 6183,
    "preview": "# Tailored Research\n\nThe GPT Researcher package allows you to tailor the research to your needs such as researching on s"
  },
  {
    "path": "docs/docs/gpt-researcher/context/vector-stores.md",
    "chars": 6466,
    "preview": "# Vector Stores\n\nThe GPT Researcher package allows you to integrate with existing langchain vector stores that have been"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/discord-bot.md",
    "chars": 1809,
    "preview": "# Discord Bot\n\n## Intro\n\nYou can either leverage the official GPTR Discord bot or create your own custom bot.\n\nTo add th"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/embed-script.md",
    "chars": 1095,
    "preview": "# Embed Script\n\nThe embed script enables you to embed the latest GPTR NextJS app into your web app.\n\nTo achieve this, si"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/introduction.md",
    "chars": 841,
    "preview": "# Intro to the Frontends\n\nThe frontends enhance GPT-Researcher by providing:\n\n1. Intuitive Research Interface: Streamlin"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/nextjs-frontend.md",
    "chars": 3030,
    "preview": "# NextJS Frontend\n\nThis frontend project aims to enhance the user experience of GPT Researcher, providing an intuitive a"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/react-package.md",
    "chars": 975,
    "preview": "# React Package\n\nThe GPTR React package is an abstraction on top of the NextJS app meant to empower users to easily impo"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/vanilla-js-frontend.md",
    "chars": 587,
    "preview": "# Vanilla JS Frontend\n\nThe VanillaJS frontend is a lightweight solution leveraging FastAPI to serve static files.\n\n### D"
  },
  {
    "path": "docs/docs/gpt-researcher/frontend/visualizing-websockets.md",
    "chars": 941,
    "preview": "# Visualizing Websockets\n\nThe GPTR Frontend is powered by Websockets streaming back from the Backend. This allows for re"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/cli.md",
    "chars": 2886,
    "preview": "# Run with CLI\n\nThis command-line interface (CLI) tool allows you to generate research reports using the GPTResearcher c"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/getting-started-with-docker.md",
    "chars": 1387,
    "preview": "# Docker: Quickstart\n\n> **Step 1** - Install & Open Docker Desktop\n\nFollow instructions at https://www.docker.com/produc"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/getting-started.md",
    "chars": 4251,
    "preview": "# Getting Started\n\n> **Step 0** - Install Python 3.11 or later. [See here](https://www.tutorialsteacher.com/python/insta"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/how-to-choose.md",
    "chars": 5467,
    "preview": "# How to Choose\n\nGPT Researcher is a powerful autonomous research agent designed to enhance and streamline your research"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/introduction.md",
    "chars": 4735,
    "preview": "# Introduction\n\n[![Official Website](https://img.shields.io/badge/Official%20Website-gptr.dev-teal?style=for-the-badge&l"
  },
  {
    "path": "docs/docs/gpt-researcher/getting-started/linux-deployment.md",
    "chars": 4539,
    "preview": "# Running on Linux\n\nThis guide will walk you through the process of deploying GPT Researcher on a Linux server.\n\n## Serv"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/ai-development.md",
    "chars": 5789,
    "preview": "---\nsidebar_label: AI-Assisted Development\nsidebar_position: 6\n---\n\n# 🤖 AI-Assisted Development with Claude\n\nGPT Researc"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/automated-tests.md",
    "chars": 1410,
    "preview": "# Automated Tests\n\n## Automated Testing with Github Actions\n\nThis repository contains the code for the automated testing"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/claude-skill.md",
    "chars": 2388,
    "preview": "# Claude Skill\n\nGPT Researcher is available as a [Claude Skill](https://skills.sh/assafelovic/gpt-researcher/gpt-researc"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/config.md",
    "chars": 8950,
    "preview": "# Configuration\n\nThe config.py enables you to customize GPT Researcher to your specific needs and preferences.\n\nThanks t"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/deep_research.md",
    "chars": 4331,
    "preview": "# Deep Research ✨ NEW ✨\n\nWith the latest \"Deep Research\" trend in the AI community, we're excited to implement our own O"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/example.md",
    "chars": 1346,
    "preview": "# Agent Example\n\nIf you're interested in using GPT Researcher as a standalone agent, you can easily import it into any e"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/image_generation.md",
    "chars": 6326,
    "preview": "---\nsidebar_label: Image Generation\nsidebar_position: 5\n---\n\n# 🍌 Inline Image Generation\n\nGPT Researcher supports **inli"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/npm-package.md",
    "chars": 572,
    "preview": "# npm package\n\nThe [gpt-researcher npm package](https://www.npmjs.com/package/gpt-researcher) is a WebSocket client for "
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/pip-package.md",
    "chars": 9314,
    "preview": "# PIP Package\n[![PyPI version](https://badge.fury.io/py/gpt-researcher.svg)](https://badge.fury.io/py/gpt-researcher)\n[!"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/querying-the-backend.md",
    "chars": 2975,
    "preview": "# Querying the Backend\n\n## Introduction\n\nIn this section, we will discuss how to query the GPTR backend server. The GPTR"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/scraping.md",
    "chars": 8441,
    "preview": "# Scraping Options\n\nGPT Researcher now offers various methods for web scraping: static scraping with BeautifulSoup, dyna"
  },
  {
    "path": "docs/docs/gpt-researcher/gptr/troubleshooting.md",
    "chars": 2767,
    "preview": "# Troubleshooting\n\nWe're constantly working to provide a more stable version. If you're running into any issues, please "
  },
  {
    "path": "docs/docs/gpt-researcher/handling-logs/all-about-logs.md",
    "chars": 10824,
    "preview": "# All About Logs\n\nThis document explains how to interpret the log files generated for each report. These logs provide a "
  },
  {
    "path": "docs/docs/gpt-researcher/handling-logs/langsmith-logs.md",
    "chars": 1006,
    "preview": "# Langsmith Logs\n\nWith the help of Langsmith, you can easily visualize logs on cost and errors within your Langsmith Das"
  },
  {
    "path": "docs/docs/gpt-researcher/handling-logs/simple-logs-example.md",
    "chars": 2356,
    "preview": "# Simple Logs Example\n\nHere is a snippet of code to help you handle the streaming logs of your Research tasks.\n\n```pytho"
  },
  {
    "path": "docs/docs/gpt-researcher/llms/llms.md",
    "chars": 15501,
    "preview": "# Configure LLM\n\nAs described in the [introduction](/docs/gpt-researcher/gptr/config), the default LLM and embedding is "
  },
  {
    "path": "docs/docs/gpt-researcher/llms/running-with-azure.md",
    "chars": 1185,
    "preview": "# Running with Azure\n\n## Example: Azure OpenAI Configuration\n\nIf you are not using OpenAI's models, but other model prov"
  },
  {
    "path": "docs/docs/gpt-researcher/llms/running-with-ollama.md",
    "chars": 3588,
    "preview": "# Running with Ollama\n\nOllama is a platform that allows you to deploy and manage custom language models. This guide will"
  },
  {
    "path": "docs/docs/gpt-researcher/llms/supported-llms.md",
    "chars": 1077,
    "preview": "# Supported LLMs\n\nThe following LLMs are supported by GPTR (though you'll need to install the relevant langchain package"
  },
  {
    "path": "docs/docs/gpt-researcher/llms/testing-your-llm.md",
    "chars": 824,
    "preview": "# Testing your LLM\n\nHere is a snippet of code to help you verify that your LLM-related environment variables are set up "
  },
  {
    "path": "docs/docs/gpt-researcher/mcp-server/advanced-usage.md",
    "chars": 5236,
    "preview": "---\nsidebar_position: 2\n---\n\n# Advanced Usage\n\nThis guide covers advanced usage scenarios and configurations for the GPT"
  },
  {
    "path": "docs/docs/gpt-researcher/mcp-server/claude-integration.md",
    "chars": 4467,
    "preview": "---\nsidebar_position: 3\n---\n\n# Claude Desktop Integration\n\nThis guide specifically focuses on how to integrate your loca"
  },
  {
    "path": "docs/docs/gpt-researcher/mcp-server/getting-started.md",
    "chars": 5770,
    "preview": "---\nsidebar_position: 1\n---\n\n# Getting Started\n\nThe GPT Researcher MCP Server provides Model Context Protocol (MCP) inte"
  },
  {
    "path": "docs/docs/gpt-researcher/multi_agents/ag2.md",
    "chars": 3368,
    "preview": "# AG2\n\n[AG2](https://github.com/ag2ai/ag2) is a framework for building multi-agent applications with LLMs.\nThis example "
  },
  {
    "path": "docs/docs/gpt-researcher/multi_agents/langgraph.md",
    "chars": 6364,
    "preview": "# LangGraph\n\n[LangGraph](https://python.langchain.com/docs/langgraph) is a library for building stateful, multi-actor ap"
  },
  {
    "path": "docs/docs/gpt-researcher/retrievers/mcp-configs.mdx",
    "chars": 17953,
    "preview": "# MCP Integration\n\nThe Model Context Protocol (MCP) enables GPT Researcher to connect with diverse data sources and tool"
  },
  {
    "path": "docs/docs/gpt-researcher/search-engines/search-engines.md",
    "chars": 3538,
    "preview": "# Search Engines\n\nSearch Engines are used to find the most relevant web sources and content for a given research task.\nY"
  },
  {
    "path": "docs/docs/gpt-researcher/search-engines/test-your-retriever.md",
    "chars": 2513,
    "preview": "# Testing your Retriever\n\nTo test your retriever, you can use the following code snippet. The script will search for a s"
  },
  {
    "path": "docs/docs/proposals/adaptive-deep-research.md",
    "chars": 19161,
    "preview": "# RFC: 自适应深度研究 - 质量驱动的递归搜索\n\n> **状态**: 提案\n> **作者**: 社区贡献者\n> **创建日期**: 2026-01-30\n> **目标版本**: v4.x\n\n## 概述\n\n本提案引入**自适应深度研究*"
  },
  {
    "path": "docs/docs/proposals/high-quality-content-scraping-architecture.md",
    "chars": 29604,
    "preview": "# RFC: 高质量内容与图片抓取架构\n\n> **状态**: 提案\n> **作者**: 社区贡献者\n> **创建日期**: 2026-01-31\n> **目标版本**: v4.x\n\n## 概述\n\n本提案分析 GPT-Researcher 当"
  },
  {
    "path": "docs/docs/proposals/local-server-deployment-guide.md",
    "chars": 16231,
    "preview": "# GPT-Researcher 本地服务器部署规划指南\n\n## 目录\n\n1. [架构概述](#架构概述)\n2. [硬件需求规划](#硬件需求规划)\n3. [部署方案选择](#部署方案选择)\n4. [环境配置详解](#环境配置详解)\n5. "
  },
  {
    "path": "docs/docs/proposals/social-media-data-acquisition.md",
    "chars": 30787,
    "preview": "# RFC: 社交媒体平台数据获取方案\n\n> **状态**: 提案\n> **创建日期**: 2026-02-01\n> **作者**: GPT-Researcher 社区\n> **目标版本**: v4.x\n\n---\n\n## 目录\n\n1. [背"
  },
  {
    "path": "docs/docs/reference/config/config.md",
    "chars": 2190,
    "preview": "---\nsidebar_label: config\ntitle: config.config\n---\n\nConfiguration class to store the state of bools for different script"
  },
  {
    "path": "docs/docs/reference/config/singleton.md",
    "chars": 541,
    "preview": "---\nsidebar_label: singleton\ntitle: config.singleton\n---\n\nThe singleton metaclass for ensuring only one instance of a cl"
  },
  {
    "path": "docs/docs/reference/processing/html.md",
    "chars": 745,
    "preview": "---\nsidebar_label: html\ntitle: processing.html\n---\n\nHTML processing functions\n\n#### extract\\_hyperlinks\n\n```python\ndef e"
  },
  {
    "path": "docs/docs/reference/processing/text.md",
    "chars": 1875,
    "preview": "---\nsidebar_label: text\ntitle: processing.text\n---\n\nText processing functions\n\n#### split\\_text\n\n```python\ndef split_tex"
  },
  {
    "path": "docs/docs/reference/sidebar.json",
    "chars": 63,
    "preview": "{\n  \"items\": [],\n  \"label\": \"Reference\",\n  \"type\": \"category\"\n}"
  },
  {
    "path": "docs/docs/roadmap.md",
    "chars": 728,
    "preview": "# Roadmap\n\nWe're constantly working on additional features and improvements to our products and services. We're also wor"
  },
  {
    "path": "docs/docs/welcome.md",
    "chars": 1010,
    "preview": "# Welcome\n\nHey there! 👋\n\nWe're a team of AI researchers and developers who are passionate about building the next genera"
  },
  {
    "path": "docs/docusaurus.config.js",
    "chars": 3448,
    "preview": "/** @type {import('@docusaurus/types').DocusaurusConfig} */\nconst math = require('remark-math');\nconst katex = require('"
  },
  {
    "path": "docs/npm/Readme.md",
    "chars": 4132,
    "preview": "# GPT Researcher\n\nThe gpt-researcher npm package is a WebSocket client for interacting with GPT Researcher.\n\n<div align="
  },
  {
    "path": "docs/npm/index.js",
    "chars": 3296,
    "preview": "// index.js\nconst WebSocket = require('ws');\n\nclass GPTResearcher {\n  constructor(options = {}) {\n    this.host = option"
  },
  {
    "path": "docs/npm/package.json",
    "chars": 656,
    "preview": "{\n  \"name\": \"gpt-researcher\",\n  \"version\": \"1.0.27\",\n  \"description\": \"WebSocket client for GPT Researcher\",\n  \"main\": \""
  },
  {
    "path": "docs/package.json",
    "chars": 1414,
    "preview": "{\n  \"name\": \"website\",\n  \"version\": \"0.0.0\",\n  \"private\": true,\n  \"resolutions\": {\n    \"nth-check\": \"2.0.1\",\n    \"trim\":"
  },
  {
    "path": "docs/pydoc-markdown.yml",
    "chars": 357,
    "preview": "loaders:\n   - type: python\n     search_path: [../docs]\nprocessors:\n  - type: filter\n    skip_empty_modules: true\n  - typ"
  },
  {
    "path": "docs/sidebars.js",
    "chars": 4150,
    "preview": "/**\n * Creating a sidebar enables you to:\n - create an ordered group of docs\n - render a sidebar for each doc of that gr"
  },
  {
    "path": "docs/src/components/HomepageFeatures.js",
    "chars": 2307,
    "preview": "import React from 'react';\nimport clsx from 'clsx';\nimport { Link } from 'react-router-dom';\nimport styles from './Homep"
  },
  {
    "path": "docs/src/components/HomepageFeatures.module.css",
    "chars": 191,
    "preview": "/* stylelint-disable docusaurus/copyright-header */\n\n.features {\n  display: flex;\n  align-items: center;\n  padding: 2rem"
  },
  {
    "path": "docs/src/css/custom.css",
    "chars": 3540,
    "preview": ":root {\n  --ifm-font-size-base: 16px;\n  --ifm-code-font-size: 90%;\n\n  --ifm-color-primary: #0c4da2;\n  --ifm-color-primar"
  },
  {
    "path": "docs/src/pages/index.js",
    "chars": 1343,
    "preview": "import React from 'react';\nimport clsx from 'clsx';\nimport Layout from '@theme/Layout';\nimport Link from '@docusaurus/Li"
  },
  {
    "path": "docs/src/pages/index.module.css",
    "chars": 418,
    "preview": "/* stylelint-disable docusaurus/copyright-header */\n\n/**\n * CSS files with the .module.css suffix will be treated as CSS"
  },
  {
    "path": "docs/static/.nojekyll",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "docs/static/CNAME",
    "chars": 13,
    "preview": "docs.gptr.dev"
  },
  {
    "path": "evals/README.md",
    "chars": 7824,
    "preview": "# GPT-Researcher Evaluations\n\nThis directory contains evaluation tools and frameworks for assessing the performance of G"
  },
  {
    "path": "evals/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "evals/hallucination_eval/evaluate.py",
    "chars": 2483,
    "preview": "\"\"\"\nEvaluate model outputs for hallucination using the judges library.\n\"\"\"\nimport logging\nfrom pathlib import Path\nfrom "
  },
  {
    "path": "evals/hallucination_eval/inputs/search_queries.jsonl",
    "chars": 5299,
    "preview": "{\"question\": \"What are the top emerging startups in AI hardware in 2025?\"}\n{\"question\": \"Compare pricing and features of"
  },
  {
    "path": "evals/hallucination_eval/requirements.txt",
    "chars": 28,
    "preview": "judges>=0.1.0\nopenai>=1.0.0 "
  },
  {
    "path": "evals/hallucination_eval/results/aggregate_results.json",
    "chars": 122502,
    "preview": "{\n  \"total_queries\": 2,\n  \"successful_queries\": 2,\n  \"total_responses\": 2,\n  \"total_evaluated\": 2,\n  \"total_hallucinated"
  },
  {
    "path": "evals/hallucination_eval/results/evaluation_records.jsonl",
    "chars": 749684,
    "preview": "{\"query\": \"What trends are emerging in real-time AI evaluation tools?\", \"report\": \"# Emerging Trends in Real-Time AI Eva"
  },
  {
    "path": "evals/hallucination_eval/run_eval.py",
    "chars": 7753,
    "preview": "\"\"\"\nScript to run GPT-Researcher queries and evaluate them for hallucination.\n\"\"\"\nimport json\nimport logging\nimport rand"
  },
  {
    "path": "evals/simple_evals/.gitignore",
    "chars": 83,
    "preview": "# Override global gitignore to track our evaluation logs\n!logs/\n!logs/*\n!logs/**/* "
  },
  {
    "path": "evals/simple_evals/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "evals/simple_evals/logs/.gitkeep",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "evals/simple_evals/logs/README.md",
    "chars": 1525,
    "preview": "# Evaluation Results\n\nThis directory contains historical evaluation results for GPT-Researcher using the SimpleQA method"
  },
  {
    "path": "evals/simple_evals/logs/SimpleQA Eval 100 Problems 2-22-25.txt",
    "chars": 6041507,
    "preview": "Last login: Sat Feb 22 09:30:52 on ttys005\nkellyabbott@mac ~ % cd /Users/kellyabbott/Documents/GitHub/gpt-researcher-fre"
  },
  {
    "path": "evals/simple_evals/problems/Simple QA Test Set.csv",
    "chars": 2012119,
    "preview": "metadata,problem,answer\n\"{'topic': 'Science and technology', 'answer_type': 'Person', 'urls': ['https://en.wikipedia.org"
  },
  {
    "path": "evals/simple_evals/requirements.txt",
    "chars": 27,
    "preview": "pandas>=1.5.0\ntqdm>=4.65.0 "
  },
  {
    "path": "evals/simple_evals/run_eval.py",
    "chars": 7812,
    "preview": "import asyncio\nimport os\nimport argparse\nfrom typing import Callable, List, TypeVar\nfrom tqdm import tqdm\nfrom dotenv im"
  },
  {
    "path": "evals/simple_evals/simpleqa_eval.py",
    "chars": 9470,
    "preview": "\"\"\"\nSimpleQA: Measuring short-form factuality in large language models\nAdapted for GPT-Researcher from OpenAI's simple-e"
  },
  {
    "path": "frontend/README.md",
    "chars": 2133,
    "preview": "# Frontend Application\n\nThis frontend project aims to enhance the user experience of GPT-Researcher, providing an intuit"
  },
  {
    "path": "frontend/index.html",
    "chars": 23367,
    "preview": "<!DOCTYPE html>\n<html lang=\"en\">\n\n<head>\n    <title>GPT Researcher</title>\n    <meta name=\"description\" content=\"A resea"
  },
  {
    "path": "frontend/nextjs/.babelrc.build.json",
    "chars": 332,
    "preview": "{\n  \"env\": {\n    \"production\": {\n      \"presets\": [\n        \"@babel/preset-env\",\n        \"@babel/preset-react\",\n        "
  },
  {
    "path": "frontend/nextjs/.dockerignore",
    "chars": 632,
    "preview": ".git\n\n# Ignore env containing secrets\n.env\n.venv\n.envrc\n\n# Ignore Virtual Env\nenv/\nvenv/\n.venv/\n\n# Other Environments\nEN"
  },
  {
    "path": "frontend/nextjs/.eslintrc.json",
    "chars": 402,
    "preview": "{\n  \"extends\": \"next/core-web-vitals\",\n  \"rules\": {\n    \"no-unused-vars\": \"off\",\n    \"no-undef\": \"off\",\n    \"no-console\""
  },
  {
    "path": "frontend/nextjs/.example.env",
    "chars": 50,
    "preview": "TOGETHER_API_KEY=\nBING_API_KEY=\nHELICONE_API_KEY=\n"
  },
  {
    "path": "frontend/nextjs/.gitignore",
    "chars": 414,
    "preview": "# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.\n.env\npackage-lock.json\n\n# dependen"
  },
  {
    "path": "frontend/nextjs/.prettierrc",
    "chars": 47,
    "preview": "{ \"plugins\": [\"prettier-plugin-tailwindcss\"] }\n"
  },
  {
    "path": "frontend/nextjs/.python-version",
    "chars": 8,
    "preview": "3.11.13\n"
  },
  {
    "path": "frontend/nextjs/Dockerfile",
    "chars": 1331,
    "preview": "###############################################\n# 1) Dependencies layer\n###############################################\n"
  },
  {
    "path": "frontend/nextjs/Dockerfile.dev",
    "chars": 130,
    "preview": "FROM node:18.17.0-alpine\nWORKDIR /app\nCOPY ./package.json ./\nRUN npm install --legacy-peer-deps\nCOPY . .\nCMD [\"npm\", \"ru"
  },
  {
    "path": "frontend/nextjs/README.md",
    "chars": 3853,
    "preview": "# GPT Researcher UI\n\nA React component library for integrating the GPT Researcher interface into your React applications"
  },
  {
    "path": "frontend/nextjs/actions/apiActions.ts",
    "chars": 2698,
    "preview": "import { createParser, ParsedEvent, ReconnectInterval } from \"eventsource-parser\";\n\nexport async function handleSourcesA"
  },
  {
    "path": "frontend/nextjs/app/api/chat/route.ts",
    "chars": 1091,
    "preview": "import { NextResponse } from 'next/server';\n\nexport async function POST(request: Request) {\n  const backendUrl = process"
  },
  {
    "path": "frontend/nextjs/app/api/reports/[id]/chat/route.ts",
    "chars": 2201,
    "preview": "import { NextResponse } from 'next/server';\n\nexport async function GET(\n  request: Request,\n  { params }: { params: { id"
  },
  {
    "path": "frontend/nextjs/app/api/reports/[id]/route.ts",
    "chars": 3505,
    "preview": "import { NextResponse } from 'next/server';\n\nexport async function GET(\n  request: Request,\n  { params }: { params: { id"
  },
  {
    "path": "frontend/nextjs/app/api/reports/route.ts",
    "chars": 3851,
    "preview": "import { NextResponse } from 'next/server';\n\nexport async function GET(request: Request) {\n  const backendUrl = process."
  }
]

// ... and 272 more files (download for full content)

About this extraction

This page contains the full source code of the assafelovic/gpt-researcher GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 472 files (10.4 MB), approximately 2.8M tokens, and a symbol index with 966 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
