--- Initial skeleton set up in collaboration with Claude. ---
AI safety forecasting with trajectory analysis, using LLMs to synthesize expert opinions and track AI capabilities progress.
2025.08.17: Initial draft - WIP!
The motivation for this project is threefold:
- Explore forecasting.
- Explore the current landscape of AI safety forecasting.
- Test the performance of different LLMs in this domain.
Primary future goals include automatic and timely database updates, comprehensive coverage of key data sources, and prompt optimization, alongside developing benchmarks that aren't represented in other forecasting datasets and dashboards.
- PostgreSQL: persistent storage
- Redis: cache query data
- Streamlit: low-effort frontend/viz, no-frills deploy
  - if this grows beyond being a toy app, move to a more performant frontend
- Docker: self-contained deploy
- Airflow: data scheduler
- MCP/CrewAI: orchestrate LLM agents (separate from target data source API polling)
  - analyze websites and newsletters for relevant new/updated datasets
  - analyze expert chatter and evaluate the system (LLM as judge of the forecasting system)
- dbt: data transformation management/versioning (possibly overkill)
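
To make the "Redis: cache query data" role concrete, here is a minimal cache-aside sketch. It is not taken from this repo: `InMemoryCache` is a hypothetical stand-in for a Redis client so the example runs without a server, and `cached_query`, `slow_query`, and the `scores:latest` key are illustrative names only.

```python
import json
import time


class InMemoryCache:
    """Hypothetical stand-in for a Redis client: get/set with a TTL, nothing more."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired, so treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=300):
    """Cache-aside: return the cached result if present, else run the query and store it."""
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = run_query()
    cache.set(key, json.dumps(result), ttl_seconds)
    return result


calls = []


def slow_query():
    # Placeholder for a PostgreSQL query; the row below is made-up example data.
    calls.append(1)
    return [{"benchmark": "example", "score": 0.5}]


cache = InMemoryCache()
first = cached_query(cache, "scores:latest", slow_query)
second = cached_query(cache, "scores:latest", slow_query)  # served from the cache
print(first == second, len(calls))
```

With a real Redis client the same pattern would store the JSON string under the key with an expiry, so stale results age out without manual invalidation.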
git clone git@github.com:msyvr/forecast-aisafety
cd forecast-aisafety
# Setup environment
cp .env.example .env
# Edit .env with your configuration
# Initialize database and load seed data
python run_app.py
# Explore data
jupyter notebook notebooks/data_exploration.ipynb
- fireducks: drop-in replacement for pandas with an identical API (literally just import fireducks.pandas as pd, no other changes)
- polars: better known but, coming from pandas, the API takes getting used to; multithreaded on a single node (for distributed processing, use Apache Spark), whereas pandas is single-threaded
- ibis: notes from a fan
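
Because fireducks mirrors the pandas API, swapping backends can be as small as changing one import. A minimal sketch, assuming fireducks may or may not be installed: `load_dataframe_backend` is a hypothetical helper (not part of this repo) that tries candidate modules in preference order and falls back to plain pandas.

```python
import importlib

# Candidate modules in preference order; both expose a pandas-style API.
CANDIDATES = ("fireducks.pandas", "pandas")


def load_dataframe_backend(candidates=CANDIDATES):
    """Return the first importable module from candidates, or None if none import."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


pd = load_dataframe_backend()
if pd is None:
    print("no pandas-compatible backend installed")
else:
    print(f"using {pd.__name__}")
```

The rest of the code keeps referring to `pd`, so notebooks written against pandas should not need further changes under this assumption.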