- Seattle, WA
- willepperson.com
- @w_epperson
Stars
A benchmark to evaluate AI Agents in social domains.
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
An agent benchmark with tasks in a simulated software company.
Magentic-Marketplace: Simulate Agentic Markets and See How They Evolve
Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
Python tool for converting files and office documents to Markdown.
LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-level concepts to analyze unstructured text.
An Improved LangChain RAG Tutorial (v2) with local LLMs, database updates, and testing.
A Bulletproof Way to Generate Structured JSON from Language Models
An extensible framework for linking databases and interactive views.
Angler: Machine Translation Visualization (CHI 2023)
prompt2model - Generate Deployable Models from Natural Language Instructions
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collec…
JupyterLab desktop application, based on Electron.
Jupyter extensions that help you write code faster: Context-aware AI Chat, Autocomplete, and Spreadsheet
A Python library for anomaly detection across tabular, time series, graph, text, image, and audio data. 60+ detectors, benchmark-backed ADEngine orchestration, and an agentic workflow for AI agents.