Couple bugs + a thought on the architecture

I read through the code after your tweet and Karpathy's reply. I think I see two things that might be messing with your results, and I had a thought on the overall picture.

**The bugs:**
There's a rank mismatch between `chat.py` and `sleep.py`. `chat.py` sets up LoRA with `rank=8`, but `sleep.py` trains at `rank=16`. When `chat.py` does `model.load_weights(adapter_file, strict=False)`, I'm pretty sure the shape mismatch means MLX silently skips loading those weights (I haven't confirmed this, so it's worth checking). If so, after a sleep cycle the model is running with freshly initialized, untrained LoRA layers - it's not actually using anything it learned. I think that would be enough to explain "not working so well."
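One way to confirm the mismatch is to compare parameter shapes by name before loading. The sketch below is a plain-Python illustration, not code from the repo: `find_shape_mismatches` is a hypothetical helper, and the parameter names and the hidden size of 64 are made up. With MLX, I'd expect the two dictionaries to come from something like `tree_flatten(model.parameters())` and `mx.load(adapter_file)`, but I haven't run that against this code.

```python
# Hypothetical helper (not in the repo): compare parameter shapes by name.
def find_shape_mismatches(model_shapes, adapter_shapes):
    """Return {name: (model_shape, adapter_shape)} for shared names whose shapes differ."""
    return {
        name: (model_shapes[name], shape)
        for name, shape in adapter_shapes.items()
        if name in model_shapes and model_shapes[name] != shape
    }


# Illustrative shapes only: a rank-8 model vs. a rank-16 adapter.
# The parameter names and hidden size (64) are assumptions for the example.
model_shapes = {
    "layers.0.q_proj.lora_a": (64, 8),
    "layers.0.q_proj.lora_b": (8, 64),
}
adapter_shapes = {
    "layers.0.q_proj.lora_a": (64, 16),
    "layers.0.q_proj.lora_b": (16, 64),
}

mismatches = find_shape_mismatches(model_shapes, adapter_shapes)
for name, (expected, found) in sorted(mismatches.items()):
    print(f"{name}: model expects {expected}, adapter has {found}")
```

If a check like this reports any mismatches, failing loudly is probably safer than loading with `strict=False`.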

The other thing is that `sleep.py` loads a fresh base model and fresh LoRA layers every time; it doesn't load the existing adapter before training. So each sleep throws away the previous adapter's weights and retrains from scratch on the full `qa.jsonl`. With iters fixed at 100 and `batch_size=1`, the model sees 100 samples per sleep no matter how many Q&A pairs there are. That's fine at first, but the adapter gets more and more undertrained as the dataset grows.
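To put rough numbers on the undertraining, here is a small arithmetic sketch. The function names and the idea of a target epoch count are my own assumptions, not anything in `sleep.py`; the 100 iterations and `batch_size=1` are the values described above.

```python
import math


def epochs_per_sleep(num_pairs, iters=100, batch_size=1):
    """Average number of passes over the dataset in one sleep cycle."""
    return iters * batch_size / num_pairs


def iters_for_epochs(num_pairs, target_epochs, batch_size=1):
    """Iterations needed to reach a target number of passes (hypothetical policy)."""
    return math.ceil(target_epochs * num_pairs / batch_size)


# With the fixed settings, coverage shrinks as qa.jsonl grows.
for n in (50, 200, 1000):
    print(f"{n} pairs -> {epochs_per_sleep(n):.2f} epochs per sleep")

# Scaling iters with dataset size keeps coverage constant instead.
print(iters_for_epochs(1000, target_epochs=2))
```

At 1000 pairs the fixed setting covers only a tenth of the data per sleep, so most facts are never seen in a given cycle.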

**The bigger question:**
I think the brain damage problem might be baked into the approach of using LoRA for factual recall specifically. The Q&A extraction prompt in `sleep.py` pushes toward precise fact pairs ("What is my name?" / "Your name is Awni"), and you're essentially asking a low-rank adapter to act as a key-value store. LoRA isn't amazing at this and would rather shift how the model behaves broadly instead of memorizing specific input-output mappings.

This connects to what Karpathy was saying about memory ops as tools. What if the factual stuff (names, places, things I told you) lived in token space as a structured memory file that gets pulled into the system prompt, and LoRA sleep was just for the behavioral side? Personality, tone, communication style, the kind of stuff that doesn't have a single right answer but that you want baked into how the model responds.
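A minimal sketch of the token-space half of that split, assuming facts are stored as a flat JSON object. The file format, `build_system_prompt`, and the base prompt text are all hypothetical and not part of the repo.

```python
import json


def build_system_prompt(facts, base_prompt="You are a helpful assistant."):
    """Render stored facts into the system prompt (hypothetical format)."""
    if not facts:
        return base_prompt
    lines = [f"- {key}: {value}" for key, value in sorted(facts.items())]
    return base_prompt + "\n\nKnown facts about the user:\n" + "\n".join(lines)


# In practice this would be read from a memory file on disk;
# a literal JSON string keeps the example self-contained.
memory = json.loads('{"name": "Awni"}')
prompt = build_system_prompt(memory)
print(prompt)
```

With something like this in place, the sleep step would only need to write new facts into the memory file, and the LoRA training data could be limited to examples of tone and style.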

That split might make the weight update problem a lot easier because you're no longer asking LoRA to do precise recall. You're just nudging the model's distribution, which is what it does well and where small errors don't show up as obviously wrong answers.

No idea if this is useful, but I figured I'd share what I was seeing in the code. I realize this split might end up too far from true learning, since the hard facts would no longer be written into the weights.
