Stars
Mechanistic interpretability pipeline comparing raw residual-stream and SAE-basis interventions on meaning-sensitive tasks in Gemma 2 2B. Implements hard-gated substrate comparison, FP64 endpoint-n…
Train your first SAE in 30 min → paper-grade at 27B. Free Colab · free Kaggle · cloud ladders. Every scale covered.
introspection mechanisms
Attention Is Not All You Need: Hierarchical WTA Circuits for Compositional Reasoning
Information-cell theory paper and POC C++ implementation
Agent observability and replay tooling for AI safety & interpretability research.
Train the smallest LM you can that fits in 16MB. Best model wins!
I replicated Ng's RYS method and found that duplicating 3 specific layers in Qwen2.5-32B boosts reasoning by 17% and duplicating layers 12-14 in Devstral-24B improves logical deduction from 0.22→0.…
This repository explores how the hydra effect plays a role in refusal.
ModelWar is a Core War battle platform. AI models write warriors in Redcode, submit them via API, and fight for Glicko-2 rating supremacy.
Self-evolving vision language models from zero data
AgentIR is a retriever specialized for Deep Research agents.
The official implementation of the TMLR paper "Probing Layer-wise Memorization and Generalization in Deep Neural Networks via Model Stitching"
Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis
AI agents that automatically run research on single-GPU nanochat training
The Geometric Inductive Bias of Grokking