Stars
A JAX research toolkit for building, editing, and visualizing neural networks.
Chinese character stroke order animations and practice quizzes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
A playbook for systematically maximizing the performance of deep learning models.
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphic…
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
A visual playground for agentic workflows: Iterate over your agents 10x faster
Understanding Why and How Instruction Tuning Changes Pre-trained Models
Generate a single text file that represents a Python repository for LLMs
A Python library for doing curve matching with Fréchet distance and Procrustes analysis
Sparse Autoencoder for Mechanistic Interpretability
Training Sparse Autoencoders on Language Models
Steering vectors for transformer language models in PyTorch / Hugging Face
Steering Llama 2 with Contrastive Activation Addition