Athena is an open-source CLI search agent that combines web search, intelligent scraping, semantic retrieval, and local LLM reasoning to answer user questions with up-to-date information from the web. Unlike traditional search engines that return lists of links, Athena reads and understands web content to provide direct, well-sourced answers to your questions.
- Python 3.8+
- Ollama installed and running
- At least one LLM model pulled (e.g., `gemma3:1b` or `llama3:1b`)
- Clone the repository:

  ```
  git clone <repository-url>
  cd athena
  ```
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```
- Pull an LLM model using Ollama:

  ```
  ollama pull gemma3:1b   # or: ollama pull llama3:1b
  ```

- Ensure Ollama is running:

  ```
  ollama serve
  ```
Run the application:

```
python run.py
```

You'll see:

```
Initializing sentencepiece...
Ask Athena:
```
Enter your question when prompted. Athena will:
- Search DuckDuckGo for relevant results
- Scrape and extract content from the top pages
- Find the most relevant information using semantic search
- Generate an answer using the local LLM
- Display the response with source attribution
- Save the query to history
To exit, press Ctrl+C. Your conversation history will be saved automatically.
- Uses DuckDuckGo (the `ddgs` library) to search for the user query
- Returns top results with URLs, titles, and snippets
- For each URL:
  - First attempts static scraping using `requests` with proper headers
  - If static scraping fails (non-200 status, Cloudflare protection, etc.), falls back to dynamic scraping using Selenium in headless mode
- Uses semaphores to limit concurrent requests (10 global, 2 dynamic)
- Implements timeouts to prevent hanging requests
- HTML content is converted to clean text using Trafilatura
- Text is split into overlapping chunks (300 words each)
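The chunking step amounts to sliding a fixed-size word window over the extracted text. A minimal sketch (the 50-word overlap here is an assumption; the source only states 300-word chunks):

```python
# Split text into overlapping word windows for embedding.
def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break
    return chunks

chunks = chunk_text("word " * 700)  # 700 words of input
```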
- Sentence-transformers (`all-MiniLM-L6-v2`) creates embeddings for all chunks
- A FAISS index is built for efficient similarity search
- The user query is embedded and used to search the FAISS index
- Top 3 most relevant text chunks are retrieved as context
- Context + question + system prompt are formatted for the LLM
- Ollama generates a response using the specified model
- Response is streamed back to the user in real-time
- Sources are deduplicated and displayed (top 5 unique URLs)
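The generation step boils down to assembling a prompt from the retrieved chunks and streaming the model's reply. The prompt wording below is an assumption, not the project's exact system prompt:

```python
# Sketch of prompt assembly for the local LLM.
def build_prompt(chunks: list[str], question: str) -> str:
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(["Paris is the capital of France."],
                      "What is France's capital?")

# Streaming call (requires a running Ollama server):
# import ollama
# for part in ollama.generate(model="gemma3:1b", prompt=prompt, stream=True):
#     print(part["response"], end="", flush=True)
```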
- Each query and its timestamp is stored in `History/history.json`
- History is loaded on startup and saved on exit
- Currently, history is not used in the LLM prompt but is available for future enhancement
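The history mechanism is plain JSON persistence, which could look like this sketch (a temporary directory is used here to keep the demo hermetic; the real path is `History/history.json`):

```python
# Sketch of saving/loading query history as JSON.
import json
import tempfile
import time
from pathlib import Path

def save_history(entries: list[dict], path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2))

def load_history(path: Path) -> list[dict]:
    return json.loads(path.read_text()) if path.exists() else []

with tempfile.TemporaryDirectory() as tmp:
    hist_path = Path(tmp) / "History" / "history.json"
    save_history([{"query": "what is faiss?", "ts": time.time()}], hist_path)
    restored = load_history(hist_path)
```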
To change the LLM model, modify these files:
- In `agent.py`: change the `model_name` parameter in the `agent()` function call
- In `run.py`: change the model name in the `agent()` call (currently `"gemma3:1b"`)
Adjust these values in the code:
- `max_results` in `seeker/search.py` (default: 10)
- `k` in the `retrieve` function in `run.py` (default: 3 chunks)
- `chunk_size` in `run.py` (default: 300 words)
- Static scrape timeout: 8 seconds
- Dynamic scrape timeout: 15 seconds
- Global concurrency limit: 10 URLs
- Dynamic concurrency limit: 2 URLs (to reduce strain on Selenium)
Key dependencies include:
- `ddgs`: DuckDuckGo search
- `requests` & `selenium`: web scraping
- `trafilatura`: HTML-to-text conversion
- `sentence-transformers` & `torch`: text embeddings
- `faiss-cpu`: vector similarity search
- `ollama`: LLM interface
- `rich`: beautiful terminal output
- `numpy`: numerical operations
See requirements.txt for the complete list.
Modify `seeker/search.py` to use different search APIs (Google, Bing, etc.) while maintaining the same return format.
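Whatever backend you swap in, the key is normalizing its raw output into the same result shape the rest of the pipeline expects. A sketch, where the `title`/`href`/`body` fields mirror `ddgs` output and the raw keys (`name`, `url`, `snippet`) are purely illustrative of a Bing-style response:

```python
# Adapter sketch: map another search API's results to the ddgs-style shape.
def normalize_results(raw: list[dict]) -> list[dict]:
    return [
        {
            "title": r.get("name", ""),    # hypothetical source field names
            "href": r.get("url", ""),
            "body": r.get("snippet", ""),
        }
        for r in raw
    ]

raw = [{"name": "FAISS docs", "url": "https://faiss.ai", "snippet": "A library..."}]
normalized = normalize_results(raw)
```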
Adjust `scrapion/scrape.py` to:
- Add more headers or cookies
- Implement different waiting strategies for dynamic content
- Add proxy support
Change the model in `run.py`:

```
embed_model = SentenceTransformer("your-model-name")
```

Modify `utils/model.py` to work with different LLM APIs (OpenAI, Anthropic, etc.) while keeping the same interface.
Athena is designed for privacy:
- All processing happens locally on your machine
- No data is sent to external services beyond the web search and the page fetches themselves
- LLMs run locally via Ollama
- History is stored only on your local machine
- Scraped content is processed in memory and not persisted
- "No results found"
  - Check your internet connection
  - Try a different query
  - Verify DuckDuckGo is accessible
- "No usable content extracted"
  - The search results may be from sites that block scraping
  - Try a query likely to return text-heavy results (news, Wikipedia, etc.)
- Model loading errors
  - Ensure Ollama is running: `ollama serve`
  - Verify the model is pulled: `ollama list`
  - Check that you have enough RAM/VRAM for the model
- Selenium issues
  - Ensure Chrome/Chromium is installed
  - Try updating `selenium` and `webdriver-manager`
- CUDA/GPU issues with sentence-transformers
  - The package uses the CPU by default
  - For GPU support, install a PyTorch build with CUDA
This project is open source and available under the MIT License.
- Built with Ollama for local LLM inference
- Uses sentence-transformers for embeddings
- Powered by FAISS for vector search
- Scraping powered by requests and Selenium
- Content extraction via trafilatura
- Search via DuckDuckGo (through the `ddgs` library)
- Terminal UI enhanced by Rich
Start exploring the web with Athena - your private, intelligent search agent! 🚀