-
extract-openreview-comments Public
A Python command-line tool to extract OpenReview comments for a paper into markdown format for easy copy/pasting when writing rebuttals.
-
feature-hedging-paper Public
Code for the paper "Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders"
-
TransformerLens Public
Forked from TransformerLensOrg/TransformerLens
A library for mechanistic interpretability of GPT-style language models
Python MIT License Updated Nov 10, 2025
-
circuit-tracer Public
Forked from safety-research/circuit-tracer
Python MIT License Updated Nov 10, 2025
-
hanzi-writer Public
Chinese character stroke order animations and practice quizzes
-
SAELens Public
Forked from decoderesearch/SAELens
Training Sparse Autoencoders on Language Models
Jupyter Notebook MIT License Updated Sep 2, 2025
-
SAE-Probes Public
Forked from JoshEngels/SAE-Probes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
Jupyter Notebook Updated Aug 16, 2025
-
automated-interpretability Public
Forked from hijohnnylin/automated-interpretability
-
chainscope Public
Forked from jettjaniak/chainscope
Repository for the "Chain-of-Thought Reasoning In The Wild Is Not Always Faithful" paper
HTML MIT License Updated Jun 18, 2025
-
criminle Public
Wordle-inspired country guessing game, made in 30 minutes of vibe-coding with Cursor/Claude
TypeScript Updated May 16, 2025
-
dictionary_learning Public
Forked from saprmarks/dictionary_learning
Python MIT License Updated Feb 12, 2025
-
matryoshka-saes Public
Forked from noanabeshima/matryoshka-saes
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Nov 5, 2024
-
amr-logic-converter Public
Convert Abstract Meaning Representation (AMR) into first-order logic
-
linear-relational Public
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
-
LLM_Categorical_Hierarchical_Representations Public
Forked from KihoPark/LLM_Categorical_Hierarchical_Representations
Jupyter Notebook Updated Jun 4, 2024
-
feature-circuits Public
Forked from saprmarks/feature-circuits
Python MIT License Updated Apr 21, 2024
-
GENIES Public
Forked from Joshuaclymer/GENIES
Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
Python Updated Feb 21, 2024
-
hanzi-writer-miniprogram Public archive
WeChat Mini Program plugin for Hanzi Writer (微信小程序组件)
-
penman-js Public
Abstract Meaning Representation (AMR) parser and generator for JavaScript
-
penman Public
Forked from goodmami/penman
PENMAN notation (e.g. AMR) in Python
Python MIT License Updated Jan 2, 2024