Joseph Pollack committed · 816af43
Parent(s): 3f9bc77
WIP: Local changes before applying stash
- README.md +10 -3
- dev/__init__.py +1 -0
- docs/index.md +15 -5
- docs/overview/architecture.md +1 -1
- docs/overview/features.md +6 -3
- mkdocs.yml +3 -1
- pyproject.toml +3 -1
- requirements.txt +2 -0
- src/agents/input_parser.py +18 -8
- src/app.py +19 -9
- src/orchestrator/graph_orchestrator.py +108 -13
- src/orchestrator_magentic.py +13 -10
- src/prompts/hypothesis.py +20 -20
- src/prompts/judge.py +24 -17
- src/services/tts_modal.py +35 -9
- src/tools/crawl_adapter.py +3 -13
- src/tools/vendored/__init__.py +5 -7
- src/tools/vendored/crawl_website.py +127 -0
- uv.lock +4 -0
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: The DETERMINATOR
 emoji: 🐉
 colorFrom: red
 colorTo: yellow
@@ -45,9 +45,16 @@ tags:
 
 ## About
 
-The DETERMINATOR is a deep research agent system
+The DETERMINATOR is a powerful generalist deep research agent system that stops at nothing until finding precise answers to complex questions. It uses iterative search-and-judge loops to comprehensively investigate any research question from any domain.
 
-**
+**Key Features**:
+- **Generalist**: Handles queries from any domain (medical, technical, business, scientific, etc.)
+- **Automatic Medical Detection**: Automatically determines if medical knowledge sources (PubMed, ClinicalTrials.gov) are needed
+- **Multi-Source Search**: Web search, PubMed, ClinicalTrials.gov, Europe PMC, RAG
+- **Stops at Nothing**: Only stops at configured limits (budget, time, iterations), otherwise continues until finding precise answers
+- **Evidence Synthesis**: Comprehensive reports with proper citations
+
+**Important**: The DETERMINATOR is a research tool that synthesizes evidence. It cannot provide medical advice or answer medical questions directly.
 
 For this hackathon we're proposing a simple yet powerful Deep Research Agent that iteratively looks for the answer until it finds it using general purpose websearch and special purpose retrievers for technical retrievers.
dev/__init__.py
ADDED
@@ -0,0 +1 @@
+"""Development utilities and plugins."""
docs/index.md
CHANGED
@@ -1,14 +1,24 @@
 # The DETERMINATOR
 
-**Deep Research Agent
+**Generalist Deep Research Agent - Stops at Nothing Until Finding Precise Answers**
 
-The DETERMINATOR is a deep research agent system that uses iterative search-and-judge loops to comprehensively investigate research
+The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, only stopping at configured limits (budget, time, iterations).
 
-**
+**Key Features**:
+- **Generalist**: Handles queries from any domain (medical, technical, business, scientific, etc.)
+- **Automatic Source Selection**: Automatically determines if medical knowledge sources (PubMed, ClinicalTrials.gov) are needed
+- **Multi-Source Search**: Web search, PubMed, ClinicalTrials.gov, Europe PMC, RAG
+- **Iterative Refinement**: Continues searching and refining until precise answers are found
+- **Evidence Synthesis**: Comprehensive reports with proper citations
+
+**Important**: The DETERMINATOR is a research tool that synthesizes evidence. It cannot provide medical advice or answer medical questions directly.
 
 ## Features
 
-- **
+- **Generalist Research**: Handles any research question from any domain
+- **Automatic Medical Detection**: Automatically determines if medical knowledge sources are needed
+- **Multi-Source Search**: Web search, PubMed, ClinicalTrials.gov, Europe PMC (includes bioRxiv/medRxiv), RAG
+- **Iterative Until Precise**: Stops at nothing until finding precise answers (only stops at configured limits)
 - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
 - **HuggingFace OAuth**: Sign in with your HuggingFace account to automatically use your API token
 - **Modal Sandbox**: Secure execution of AI-generated statistical code
@@ -38,7 +48,7 @@ For detailed installation and setup instructions, see the [Getting Started Guide
 
 The DETERMINATOR uses a Vertical Slice Architecture:
 
-1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov,
+1. **Search Slice**: Retrieving evidence from multiple sources (web, PubMed, ClinicalTrials.gov, Europe PMC, RAG) based on query analysis
 2. **Judge Slice**: Evaluating evidence quality using LLMs
 3. **Orchestrator Slice**: Managing the research loop and UI
docs/overview/architecture.md
CHANGED
@@ -1,6 +1,6 @@
 # Architecture Overview
 
-The DETERMINATOR is a deep research agent system that uses iterative search-and-judge loops to comprehensively
+The DETERMINATOR is a powerful generalist deep research agent system that uses iterative search-and-judge loops to comprehensively investigate any research question. It stops at nothing until finding precise answers, only stopping at configured limits (budget, time, iterations). The system automatically determines if medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.
 
 ## Core Architecture
docs/overview/features.md
CHANGED
@@ -6,10 +6,12 @@ The DETERMINATOR provides a comprehensive set of features for AI-assisted resear
 
 ### Multi-Source Search
 
-- **
-- **
+- **General Web Search**: Search general knowledge sources for any domain
+- **PubMed**: Search peer-reviewed biomedical literature via NCBI E-utilities (automatically used when medical knowledge needed)
+- **ClinicalTrials.gov**: Search interventional clinical trials (automatically used when medical knowledge needed)
 - **Europe PMC**: Search preprints and peer-reviewed articles (includes bioRxiv/medRxiv)
 - **RAG**: Semantic search within collected evidence using LlamaIndex
+- **Automatic Source Selection**: Automatically determines which sources are needed based on query analysis
 
 ### MCP Integration
 
@@ -40,9 +42,10 @@ The DETERMINATOR provides a comprehensive set of features for AI-assisted resear
 
 - **Graph-Based Execution**: Flexible graph orchestration with conditional routing
 - **Parallel Research Loops**: Run multiple research tasks concurrently
-- **Iterative Research**: Single-loop research with search-judge-synthesize cycles
+- **Iterative Research**: Single-loop research with search-judge-synthesize cycles that continues until precise answers are found
 - **Deep Research**: Multi-section parallel research with planning and synthesis
 - **Magentic Orchestration**: Multi-agent coordination using Microsoft Agent Framework
+- **Stops at Nothing**: Only stops at configured limits (budget, time, iterations), otherwise continues until finding precise answers
 
 ### Real-Time Streaming
mkdocs.yml
CHANGED
@@ -1,5 +1,5 @@
 site_name: The DETERMINATOR
-site_description: Deep Research Agent
+site_description: Generalist Deep Research Agent that Stops at Nothing
 site_author: The DETERMINATOR Team
 site_url: https://deepcritical.github.io/GradioDemo/
 
@@ -49,6 +49,8 @@ plugins:
       minify_css: true
 
 markdown_extensions:
+  - dev.docs_plugins:
+      base_path: "."
   - pymdownx.highlight:
       anchor_linenums: true
   - pymdownx.inlinehilite
pyproject.toml
CHANGED
@@ -1,7 +1,7 @@
 [project]
 name = "determinator"
 version = "0.1.0"
-description = "The DETERMINATOR - Deep Research Agent
+description = "The DETERMINATOR - the Deep Research Agent that Stops at Nothing"
 readme = "README.md"
 requires-python = ">=3.11"
 dependencies = [
@@ -42,6 +42,8 @@ dependencies = [
     "llama-index-llms-openai>=0.6.9",
     "llama-index-embeddings-openai>=0.5.1",
     "ddgs>=9.9.2",
+    "aiohttp>=3.13.2",
+    "lxml>=6.0.2",
 ]
 
 [project.optional-dependencies]
requirements.txt
CHANGED
@@ -15,7 +15,9 @@ anthropic>=0.18.0
 
 # HTTP & Parsing
 httpx>=0.27
+aiohttp>=3.13.2  # Required for website crawling
 beautifulsoup4>=4.12
+lxml>=6.0.2  # Required for BeautifulSoup lxml parser (faster than html.parser)
 xmltodict>=0.13
 
 # HuggingFace Hub
src/agents/input_parser.py
CHANGED
@@ -20,25 +20,33 @@ logger = structlog.get_logger()
 
 # System prompt for the input parser agent
 SYSTEM_PROMPT = """
-You are an expert research query analyzer. Your job is to analyze user queries and determine:
+You are an expert research query analyzer for a generalist deep research agent. Your job is to analyze user queries and determine:
 1. Whether the query requires iterative research (single focused question) or deep research (multiple sections/topics)
-2.
-3.
-4. Extract
+2. Whether the query requires medical/biomedical knowledge sources (PubMed, ClinicalTrials.gov) or general knowledge sources (web search)
+3. Improve and refine the query for better research results
+4. Extract key entities (drugs, diseases, companies, technologies, concepts, etc.)
+5. Extract specific research questions
 
 Guidelines for determining research mode:
 - **Iterative mode**: Single focused question, straightforward research goal, can be answered with a focused search loop
-  Examples: "What is the mechanism of metformin?", "
+  Examples: "What is the mechanism of metformin?", "How does quantum computing work?", "What are the latest AI models?"
 
 - **Deep mode**: Complex query requiring multiple sections, comprehensive report, multiple related topics
-  Examples: "Write a comprehensive report on diabetes treatment", "Analyze the market for quantum computing"
+  Examples: "Write a comprehensive report on diabetes treatment", "Analyze the market for quantum computing", "Review the state of AI in healthcare"
   Indicators: words like "comprehensive", "report", "sections", "analyze", "market analysis", "overview"
 
+Guidelines for determining if medical knowledge is needed:
+- **Medical knowledge needed**: Queries about diseases, treatments, drugs, clinical trials, medical conditions, biomedical mechanisms, health outcomes, etc.
+  Examples: "Alzheimer's treatment", "metformin mechanism", "cancer clinical trials", "diabetes research"
+
+- **General knowledge sufficient**: Queries about technology, business, science (non-medical), history, current events, etc.
+  Examples: "quantum computing", "AI models", "market analysis", "historical events"
+
 Your output must be valid JSON matching the ParsedQuery schema. Always provide:
 - original_query: The exact input query
 - improved_query: A refined, clearer version of the query
 - research_mode: Either "iterative" or "deep"
-- key_entities: List of important entities (drugs, diseases, companies, etc.)
+- key_entities: List of important entities (drugs, diseases, companies, technologies, etc.)
 - research_questions: List of specific questions to answer
 
 Only output JSON. Do not output anything else.
@@ -152,7 +160,9 @@ class InputParserAgent:
     )
 
 
-def create_input_parser_agent(
+def create_input_parser_agent(
+    model: Any | None = None, oauth_token: str | None = None
+) -> InputParserAgent:
     """
     Factory function to create an input parser agent.
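The prompt above asks the model for JSON matching a ParsedQuery schema. A hypothetical reconstruction of that schema from the field names listed in the prompt text (the repo's actual `ParsedQuery` model may differ) can be sketched as a Pydantic model:

```python
from typing import Literal

from pydantic import BaseModel, Field


class ParsedQuery(BaseModel):
    """Hypothetical sketch of the schema the system prompt targets;
    field names are taken from the prompt, not from the repo source."""

    original_query: str
    improved_query: str
    research_mode: Literal["iterative", "deep"]
    key_entities: list[str] = Field(default_factory=list)
    research_questions: list[str] = Field(default_factory=list)


parsed = ParsedQuery(
    original_query="metformin mechanism",
    improved_query="What is the molecular mechanism of action of metformin?",
    research_mode="iterative",
    key_entities=["metformin"],
    research_questions=["How does metformin lower blood glucose?"],
)
print(parsed.research_mode)  # -> iterative
```

Validating the LLM's JSON against such a model is what makes "Only output JSON" enforceable: malformed or mistyped output raises a `ValidationError` instead of propagating silently.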
src/app.py
CHANGED
@@ -752,12 +752,16 @@ def create_demo() -> gr.Blocks:
     gr.Markdown("---")
     gr.Markdown("### ℹ️ About")  # noqa: RUF001
     gr.Markdown(
-        "**The DETERMINATOR** - Deep Research Agent
-        "
-        "
-        "-
-        "-
-        "
+        "**The DETERMINATOR** - Generalist Deep Research Agent\n\n"
+        "A powerful research agent that stops at nothing until finding precise answers to complex questions.\n\n"
+        "**Available Sources**:\n"
+        "- Web Search (general knowledge)\n"
+        "- PubMed (biomedical literature)\n"
+        "- ClinicalTrials.gov (clinical trials)\n"
+        "- Europe PMC (preprints & papers)\n"
+        "- RAG (semantic search)\n\n"
+        "**Automatic Detection**: Automatically determines if medical knowledge sources are needed for your query.\n\n"
+        "⚠️ **Research tool only** - Synthesizes evidence but cannot provide medical advice."
     )
     gr.Markdown("---")
@@ -891,10 +895,16 @@ def create_demo() -> gr.Blocks:
         multimodal=True,  # Enable multimodal input (text + images + audio)
         title="🔬 The DETERMINATOR",
         description=(
-            "*Deep Research Agent
-            "ClinicalTrials.gov & Europe PMC*\n\n"
+            "*Generalist Deep Research Agent — stops at nothing until finding precise answers to complex questions*\n\n"
             "---\n"
-            "
+            "**The DETERMINATOR** uses iterative search-and-judge loops to comprehensively investigate any research question. "
+            "It automatically determines if medical knowledge sources (PubMed, ClinicalTrials.gov) are needed and adapts its search strategy accordingly.\n\n"
+            "**Key Features**:\n"
+            "- 🔍 Multi-source search (Web, PubMed, ClinicalTrials.gov, Europe PMC, RAG)\n"
+            "- 🧠 Automatic medical knowledge detection\n"
+            "- 🔄 Iterative refinement until precise answers are found\n"
+            "- ⏹️ Stops only at configured limits (budget, time, iterations)\n"
+            "- 📊 Evidence synthesis with citations\n\n"
             "**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`\n\n"
             "**🎤 Multimodal Support**: Upload images (OCR), record audio (STT), or type text.\n\n"
             "**⚠️ Authentication Required**: Please **sign in with HuggingFace** above before using this application."
src/orchestrator/graph_orchestrator.py
CHANGED
@@ -506,7 +506,8 @@ class GraphOrchestrator:
         current_node_id = self._graph.entry_node
         iteration = 0
 
-
+        # Execute nodes until we reach an exit node
+        while current_node_id:
             # Check budget
             if not context.budget_tracker.can_continue("graph_execution"):
                 self.logger.warning("Budget exceeded, exiting graph execution")
@@ -537,26 +538,27 @@ class GraphOrchestrator:
                 )
                 break
 
+            # Check if current node is an exit node - if so, we're done
+            if current_node_id in self._graph.exit_nodes:
+                break
+
             # Get next node(s)
             next_nodes = self._get_next_node(current_node_id, context)
 
             if not next_nodes:
-                # No more nodes,
-                if current_node_id in self._graph.exit_nodes:
-                    break
-                # Otherwise, we've reached a dead end
+                # No more nodes, we've reached a dead end
                 self.logger.warning("Reached dead end in graph", node_id=current_node_id)
                 break
 
             current_node_id = next_nodes[0]  # For now, take first next node (handle parallel later)
 
-        # Final event
+        # Final event - get result from the last executed node (which should be an exit node)
         final_result = context.get_node_result(current_node_id) if current_node_id else None
 
         # Check if final result contains file information
         event_data: dict[str, Any] = {"mode": self.mode, "iterations": iteration}
         message: str = "Research completed"
 
         if isinstance(final_result, str):
             message = final_result
         elif isinstance(final_result, dict):
@@ -574,7 +576,7 @@ class GraphOrchestrator:
             elif isinstance(files, str):
                 event_data["files"] = [files]
             message = final_result.get("message", "Report generated. Download available.")
 
         yield AgentEvent(
             type="complete",
             message=message,
@@ -628,7 +630,7 @@ class GraphOrchestrator:
         Returns:
             Agent execution result
         """
-        # Special handling for synthesizer node
+        # Special handling for synthesizer node (deep research)
         if node.node_id == "synthesizer":
             # Call LongWriterAgent.write_report() directly instead of using agent.run()
             from src.agent_factory.agents import create_long_writer_agent
@@ -691,6 +693,62 @@ class GraphOrchestrator:
                 }
             return final_report
 
+        # Special handling for writer node (iterative research)
+        if node.node_id == "writer":
+            # Call WriterAgent.write_report() directly instead of using agent.run()
+            # Collect all findings from workflow state
+            from src.agent_factory.agents import create_writer_agent
+
+            # Get all evidence from workflow state and convert to findings string
+            evidence = context.state.evidence
+            if evidence:
+                # Convert evidence to findings format (similar to conversation.get_all_findings())
+                findings_parts: list[str] = []
+                for ev in evidence:
+                    finding = f"**{ev.title}**\n{ev.content}"
+                    if ev.url:
+                        finding += f"\nSource: {ev.url}"
+                    findings_parts.append(finding)
+                all_findings = "\n\n".join(findings_parts)
+            else:
+                all_findings = "No findings available yet."
+
+            # Get WriterAgent instance and call write_report directly
+            writer_agent = create_writer_agent(oauth_token=self.oauth_token)
+            final_report = await writer_agent.write_report(
+                query=query,
+                findings=all_findings,
+                output_length="",
+                output_instructions="",
+            )
+
+            # Estimate tokens (rough estimate)
+            estimated_tokens = len(final_report) // 4  # Rough token estimate
+            context.budget_tracker.add_tokens("graph_execution", estimated_tokens)
+
+            # Save report to file if enabled
+            file_path: str | None = None
+            try:
+                file_service = self._get_file_service()
+                if file_service:
+                    file_path = file_service.save_report(
+                        report_content=final_report,
+                        query=query,
+                    )
+                    self.logger.info("Report saved to file", file_path=file_path)
+            except Exception as e:
+                # Don't fail the entire operation if file saving fails
+                self.logger.warning("Failed to save report to file", error=str(e))
+                file_path = None
+
+            # Return dict with file path if available, otherwise return string (backward compatible)
+            if file_path:
+                return {
+                    "message": final_report,
+                    "file": file_path,
+                }
+            return final_report
+
         # Standard agent execution
         # Prepare input based on node type
         if node.node_id == "planner":
@@ -718,14 +776,14 @@ class GraphOrchestrator:
             )
             # Return a minimal fallback ReportPlan
             from src.utils.models import ReportPlan, ReportPlanSection
 
             # Extract query from input_data if possible
             fallback_query = query
             if isinstance(input_data, str):
                 # Try to extract query from input string
                 if "QUERY:" in input_data:
                     fallback_query = input_data.split("QUERY:")[-1].strip()
 
             return ReportPlan(
                 background_context="",
                 report_outline=[
@@ -740,7 +798,44 @@ class GraphOrchestrator:
             raise
 
         # Transform output if needed
-        output
+        # Defensively extract output - handle various result formats
+        output = result.output if hasattr(result, "output") else result
+
+        # Handle case where output might be a tuple (from pydantic-ai validation errors)
+        if isinstance(output, tuple):
+            # If tuple contains a dict-like structure, try to reconstruct the object
+            if len(output) == 2 and isinstance(output[0], str) and output[0] == "research_complete":
+                # This is likely a validation error format: ('research_complete', False)
+                # Try to get the actual output from result
+                self.logger.warning(
+                    "Agent result output is a tuple, attempting to extract actual output",
+                    node_id=node.node_id,
+                    tuple_value=output,
+                )
+                # Try to get output from result attributes
+                if hasattr(result, "data"):
+                    output = result.data
+                elif hasattr(result, "response"):
+                    output = result.response
+                else:
+                    # Last resort: try to reconstruct from tuple
+                    # This shouldn't happen, but handle gracefully
+                    from src.utils.models import KnowledgeGapOutput
+
+                    if node.node_id == "knowledge_gap":
+                        output = KnowledgeGapOutput(
+                            research_complete=output[1] if len(output) > 1 else False,
+                            outstanding_gaps=[],
+                        )
+                    else:
+                        # For other nodes, log error and use fallback
+                        self.logger.error(
+                            "Cannot reconstruct output from tuple",
+                            node_id=node.node_id,
+                            tuple_value=output,
+                        )
+                        raise ValueError(f"Cannot extract output from tuple: {output}")
+
         if node.output_transformer:
             output = node.output_transformer(output)
src/orchestrator_magentic.py
CHANGED
@@ -122,21 +122,24 @@ class MagenticOrchestrator:

        workflow = self._build_workflow()

        task = f"""Research query: {query}

Workflow:
1. SearchAgent: Find evidence from available sources (automatically selects: web search, PubMed, ClinicalTrials.gov, Europe PMC, or RAG based on query)
2. HypothesisAgent: Generate research hypotheses and questions based on evidence
3. JudgeAgent: Evaluate if evidence is sufficient to answer the query precisely
4. If insufficient -> SearchAgent refines search based on identified gaps
5. If sufficient -> ReportAgent synthesizes final comprehensive report

Focus on:
- Finding precise answers to the research question
- Identifying all relevant evidence from appropriate sources
- Understanding mechanisms, relationships, and key findings
- Synthesizing comprehensive findings with proper citations

The DETERMINATOR stops at nothing until finding precise answers, only stopping at configured limits (budget, time, iterations).

The final output should be a structured research report with comprehensive evidence synthesis."""

        iteration = 0
        try:
src/prompts/hypothesis.py
CHANGED
@@ -8,27 +8,27 @@ if TYPE_CHECKING:

from src.services.embeddings import EmbeddingService
from src.utils.models import Evidence

SYSTEM_PROMPT = """You are an expert research scientist functioning as a generalist research assistant.

Your role is to generate research hypotheses, questions, and investigation paths based on evidence from any domain.

IMPORTANT: You are a research assistant. You cannot provide medical advice or answer medical questions directly. Your hypotheses are for research investigation purposes only.

A good hypothesis:
1. Proposes a MECHANISM or RELATIONSHIP: Explains how things work or relate
   - For medical: Drug -> Target -> Pathway -> Effect
   - For technical: Technology -> Mechanism -> Outcome
   - For business: Strategy -> Market -> Result
2. Is TESTABLE: Can be supported or refuted by further research
3. Is SPECIFIC: Names actual entities, processes, or mechanisms
4. Generates SEARCH QUERIES: Helps find more evidence

Example hypothesis formats:
- Medical: "Metformin -> AMPK activation -> mTOR inhibition -> autophagy -> amyloid clearance"
- Technical: "Transformer architecture -> attention mechanism -> improved NLP performance"
- Business: "Subscription model -> recurring revenue -> higher valuation"

Be specific. Use actual names, technical terms, and precise language when possible."""


async def format_hypothesis_prompt(

@@ -56,15 +56,15 @@ async def format_hypothesis_prompt(

        ]
    )

    return f"""Based on the following evidence about "{query}", generate research hypotheses and investigation paths.

## Evidence ({len(selected)} sources selected for diversity)
{evidence_text}

## Task
1. Identify key mechanisms, relationships, or processes mentioned in the evidence
2. Propose testable hypotheses explaining how things work or relate
3. Rate confidence based on evidence strength
4. Suggest specific search queries to test each hypothesis

Generate 2-4 hypotheses, prioritized by confidence. Adapt the hypothesis format to the domain of the query (medical, technical, business, etc.)."""
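The prompt assembly above can be approximated standalone. This is a simplified sketch: the real function is async and takes `Evidence` model instances plus an embedding service for diversity selection, whereas here plain dicts with assumed `title`/`snippet` keys stand in for the evidence:

```python
def build_hypothesis_prompt(query: str, evidence: list[dict]) -> str:
    """Sketch of the user-prompt assembly (evidence field names assumed)."""
    evidence_text = "\n\n".join(
        f"[{i + 1}] {e['title']}\n{e['snippet']}" for i, e in enumerate(evidence)
    )
    return (
        f'Based on the following evidence about "{query}", '
        "generate research hypotheses and investigation paths.\n\n"
        f"## Evidence ({len(evidence)} sources selected for diversity)\n"
        f"{evidence_text}\n\n"
        "## Task\n"
        "1. Identify key mechanisms, relationships, or processes\n"
        "2. Propose testable hypotheses\n"
        "3. Rate confidence based on evidence strength\n"
        "4. Suggest specific search queries to test each hypothesis"
    )
```

Numbering the evidence blocks (`[1]`, `[2]`, ...) lets the model cite specific sources in its hypotheses.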
src/prompts/judge.py
CHANGED
@@ -2,35 +2,42 @@

from src.utils.models import Evidence

SYSTEM_PROMPT = """You are an expert research evidence evaluator for a generalist deep research agent.

Your task is to evaluate evidence from any domain (medical, scientific, technical, business, etc.) and determine if sufficient evidence has been gathered to provide a precise answer to the research question.

IMPORTANT: You are a research assistant. You cannot provide medical advice or answer medical questions directly. Your role is to assess whether enough high-quality evidence has been collected to synthesize comprehensive findings.

## Evaluation Criteria

1. **Mechanism/Explanation Score (0-10)**: How well does the evidence explain the underlying mechanism, process, or concept?
   - For medical queries: biological mechanisms, pathways, drug actions
   - For technical queries: how systems work, algorithms, processes
   - For business queries: market dynamics, business models, strategies
   - 0-3: No clear explanation, speculative
   - 4-6: Some insight, but gaps exist
   - 7-10: Clear, well-supported explanation

2. **Evidence Quality Score (0-10)**: Strength and reliability of the evidence?
   - For medical: clinical trials, peer-reviewed studies, meta-analyses
   - For technical: peer-reviewed papers, authoritative sources, verified implementations
   - For business: market reports, financial data, expert analysis
   - 0-3: Weak or theoretical evidence only
   - 4-6: Moderate quality evidence
   - 7-10: Strong, authoritative evidence

3. **Sufficiency**: Evidence is sufficient when:
   - Combined scores >= 12 AND
   - Key questions from the research query are addressed AND
   - Evidence is comprehensive enough to provide a precise answer

## Output Rules

- Always output valid JSON matching the schema
- Be conservative: only recommend "synthesize" when truly confident the answer is precise
- If continuing, suggest specific, actionable search queries to fill gaps
- Never hallucinate findings, names, or facts not in the evidence
- Adapt evaluation criteria to the domain of the query (medical vs technical vs business)
"""
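The sufficiency rule in the prompt above is mechanical enough to express directly. A minimal sketch, assuming the judge's JSON output is parsed into the two scores plus two boolean checks (the function name and signature are illustrative, not from the codebase):

```python
def is_sufficient(
    mechanism_score: int,
    quality_score: int,
    questions_addressed: bool,
    comprehensive: bool,
) -> bool:
    """Sufficiency per the judge prompt: combined scores >= 12 AND
    key questions addressed AND evidence comprehensive enough."""
    return (
        mechanism_score + quality_score >= 12
        and questions_addressed
        and comprehensive
    )
```

Note the rule is conjunctive: high scores alone (e.g. 8 + 8) do not trigger synthesis if the query's key questions remain unaddressed, which matches the "be conservative" output rule.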
src/services/tts_modal.py
CHANGED
@@ -33,7 +33,32 @@ def _get_modal_app() -> Any:

    try:
        import modal

        # Validate Modal credentials before attempting lookup
        if not settings.modal_available:
            raise ConfigurationError(
                "Modal credentials not configured. Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET environment variables."
            )

        # Validate token ID format (Modal token IDs are typically UUIDs or specific formats)
        token_id = settings.modal_token_id
        if token_id:
            # Basic validation: token ID should not be empty and should be a reasonable length
            if len(token_id.strip()) < 10:
                raise ConfigurationError(
                    f"Modal token ID appears malformed (too short: {len(token_id)} chars). "
                    "Token ID should be a valid Modal token identifier."
                )

        try:
            _modal_app = modal.App.lookup("deepcritical-tts", create_if_missing=True)
        except Exception as e:
            error_msg = str(e).lower()
            if "token" in error_msg or "malformed" in error_msg or "invalid" in error_msg:
                raise ConfigurationError(
                    f"Modal token validation failed: {e}. "
                    "Please check that MODAL_TOKEN_ID and MODAL_TOKEN_SECRET are correctly set."
                ) from e
            raise
    except ImportError as e:
        raise ConfigurationError(
            "Modal SDK not installed. Run: uv sync or pip install modal>=0.63.0"

@@ -68,8 +93,6 @@ def _setup_modal_function() -> None:

        return  # Already set up

    try:
        app = _get_modal_app()
        tts_image = _get_tts_image()

@@ -100,8 +123,8 @@ def _setup_modal_function() -> None:

        # Import Kokoro inside function (lazy load)
        try:
            import torch
            from kokoro import KModel, KPipeline

            # Initialize model (cached on GPU)
            model = KModel().to("cuda").eval()

@@ -126,11 +149,13 @@ def _setup_modal_function() -> None:

        # Store function reference for remote calls
        _tts_function = kokoro_tts_function

        # Verify function is properly attached to app
        if not hasattr(app, kokoro_tts_function.__name__):
            logger.warning(
                "modal_function_not_attached", function_name=kokoro_tts_function.__name__
            )

        logger.info(
            "modal_tts_function_setup_complete",
            gpu=gpu_type,

@@ -196,7 +221,9 @@ class ModalTTSExecutor:

        # Call the GPU function remotely
        result = _tts_function.remote(text, voice, speed)

        logger.info(
            "tts_synthesis_complete", sample_rate=result[0], audio_shape=result[1].shape
        )

        return result

@@ -257,4 +284,3 @@ def get_tts_service() -> TTSService:

        ConfigurationError: If Modal credentials not configured
    """
    return TTSService()
src/tools/crawl_adapter.py
CHANGED
@@ -1,6 +1,6 @@

"""Website crawl tool adapter for Pydantic AI agents.

Uses the vendored crawl_website implementation from src/tools/vendored/crawl_website.py.
"""

import structlog

@@ -22,8 +22,8 @@ async def crawl_website(starting_url: str) -> str:

        Formatted string with crawled content including titles, descriptions, and URLs
    """
    try:
        # Import vendored crawl tool
        from src.tools.vendored.crawl_website import crawl_website as crawl_tool

        # Call the tool function
        # The tool returns List[ScrapeResult] or str

@@ -56,13 +56,3 @@

    except Exception as e:
        logger.error("Crawl failed", error=str(e), url=starting_url)
        return f"Error crawling website: {e!s}"
src/tools/vendored/__init__.py
CHANGED
@@ -1,16 +1,17 @@

"""Vendored web search components from folder/tools/web_search.py."""

from src.tools.vendored.crawl_website import crawl_website
from src.tools.vendored.searchxng_client import SearchXNGClient
from src.tools.vendored.serper_client import SerperClient
from src.tools.vendored.web_search_core import (
    CONTENT_LENGTH_LIMIT,
    ScrapeResult,
    WebpageSnippet,
    fetch_and_process_url,
    html_to_text,
    is_valid_url,
    scrape_urls,
)

__all__ = [
    "CONTENT_LENGTH_LIMIT",

@@ -22,8 +23,5 @@ __all__ = [

    "fetch_and_process_url",
    "html_to_text",
    "is_valid_url",
    "crawl_website",
]
src/tools/vendored/crawl_website.py
ADDED
@@ -0,0 +1,127 @@

"""Website crawl tool vendored from folder/tools/crawl_website.py.

This module provides website crawling functionality that starts from a given URL
and crawls linked pages in a breadth-first manner, prioritizing navigation links.
"""

from urllib.parse import urljoin, urlparse

import aiohttp
import structlog
from bs4 import BeautifulSoup

from src.tools.vendored.web_search_core import (
    ScrapeResult,
    WebpageSnippet,
    scrape_urls,
    ssl_context,
)

logger = structlog.get_logger()


async def crawl_website(starting_url: str) -> list[ScrapeResult] | str:
    """Crawl the pages of a website starting with the starting_url and then descending into the pages linked from there.

    Prioritizes links found in headers/navigation, then body links, then subsequent pages.

    Args:
        starting_url: Starting URL to scrape

    Returns:
        List of ScrapeResult objects which have the following fields:
        - url: The URL of the web page
        - title: The title of the web page
        - description: The description of the web page
        - text: The text content of the web page
    """
    if not starting_url:
        return "Empty URL provided"

    # Ensure URL has a protocol
    if not starting_url.startswith(("http://", "https://")):
        starting_url = "http://" + starting_url

    max_pages = 10
    base_domain = urlparse(starting_url).netloc

    async def extract_links(html: str, current_url: str) -> tuple[list[str], list[str]]:
        """Extract prioritized links from HTML content"""
        soup = BeautifulSoup(html, "html.parser")
        nav_links = set()
        body_links = set()

        # Find navigation/header links
        for nav_element in soup.find_all(["nav", "header"]):
            for a in nav_element.find_all("a", href=True):
                link = urljoin(current_url, a["href"])
                if urlparse(link).netloc == base_domain:
                    nav_links.add(link)

        # Find remaining body links
        for a in soup.find_all("a", href=True):
            link = urljoin(current_url, a["href"])
            if urlparse(link).netloc == base_domain and link not in nav_links:
                body_links.add(link)

        return list(nav_links), list(body_links)

    async def fetch_page(url: str) -> str:
        """Fetch HTML content from a URL"""
        connector = aiohttp.TCPConnector(ssl=ssl_context)
        async with aiohttp.ClientSession(connector=connector) as session:
            try:
                timeout = aiohttp.ClientTimeout(total=30)
                async with session.get(url, timeout=timeout) as response:
                    if response.status == 200:
                        return await response.text()
                    return ""
            except Exception as e:
                logger.warning("Error fetching URL", url=url, error=str(e))
                return ""

    # Initialize with starting URL
    queue: list[str] = [starting_url]
    next_level_queue: list[str] = []
    all_pages_to_scrape: set[str] = set([starting_url])

    # Breadth-first crawl
    while queue and len(all_pages_to_scrape) < max_pages:
        current_url = queue.pop(0)

        # Fetch and process the page
        html_content = await fetch_page(current_url)
        if html_content:
            nav_links, body_links = await extract_links(html_content, current_url)

            # Add unvisited nav links to current queue (higher priority)
            remaining_slots = max_pages - len(all_pages_to_scrape)
            for link in nav_links:
                link = link.rstrip("/")
                if link not in all_pages_to_scrape and remaining_slots > 0:
                    queue.append(link)
                    all_pages_to_scrape.add(link)
                    remaining_slots -= 1

            # Add unvisited body links to next level queue (lower priority)
            for link in body_links:
                link = link.rstrip("/")
                if link not in all_pages_to_scrape and remaining_slots > 0:
                    next_level_queue.append(link)
                    all_pages_to_scrape.add(link)
                    remaining_slots -= 1

            # If current queue is empty, add next level links
            if not queue:
                queue = next_level_queue
                next_level_queue = []

    # Convert set to list for final processing
    pages_to_scrape = list(all_pages_to_scrape)[:max_pages]
    pages_to_scrape_snippets: list[WebpageSnippet] = [
        WebpageSnippet(url=page, title="", description="") for page in pages_to_scrape
    ]

    # Use scrape_urls to get the content for all discovered pages
    result = await scrape_urls(pages_to_scrape_snippets)
    return result
uv.lock
CHANGED
@@ -1166,6 +1166,7 @@ version = "0.1.0"

source = { editable = "." }
dependencies = [
    { name = "agent-framework-core" },
    { name = "aiohttp" },
    { name = "anthropic" },
    { name = "beautifulsoup4" },
    { name = "chromadb" },

@@ -1182,6 +1183,7 @@ dependencies = [

    { name = "llama-index-llms-huggingface-api" },
    { name = "llama-index-llms-openai" },
    { name = "llama-index-vector-stores-chroma" },
    { name = "lxml" },
    { name = "modal" },
    { name = "numpy" },
    { name = "openai" },

@@ -1249,6 +1251,7 @@ dev = [

requires-dist = [
    { name = "agent-framework-core", specifier = ">=1.0.0b251120,<2.0.0" },
    { name = "agent-framework-core", marker = "extra == 'magentic'", specifier = ">=1.0.0b251120,<2.0.0" },
    { name = "aiohttp", specifier = ">=3.13.2" },
    { name = "anthropic", specifier = ">=0.18.0" },
    { name = "beautifulsoup4", specifier = ">=4.12" },
    { name = "chromadb", specifier = ">=0.4.0" },

@@ -1271,6 +1274,7 @@ requires-dist = [

    { name = "llama-index-llms-openai", marker = "extra == 'modal'", specifier = ">=0.6.9" },
    { name = "llama-index-vector-stores-chroma", specifier = ">=0.5.3" },
    { name = "llama-index-vector-stores-chroma", marker = "extra == 'modal'" },
    { name = "lxml", specifier = ">=6.0.2" },
    { name = "mkdocs", marker = "extra == 'dev'", specifier = ">=1.6.0" },
    { name = "mkdocs-codeinclude-plugin", marker = "extra == 'dev'", specifier = ">=0.2.0" },
    { name = "mkdocs-material", marker = "extra == 'dev'", specifier = ">=9.0.0" },