Tinker Tutorials

A guided introduction to Tinker, from your first API call to building custom RL training pipelines.

These tutorials are marimo notebooks — reactive Python notebooks stored as .py files.

Prerequisites

Setup

uv pip install tinker tinker-cookbook marimo
export TINKER_API_KEY="your-api-key-here"
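
Before opening a notebook, it can help to confirm that the two setup steps above took effect. The following is a minimal sketch, not part of the tutorials: the `preflight` helper is hypothetical, and the import names `tinker`, `tinker_cookbook`, and `marimo` are assumed to match the packages installed above.

```python
import importlib.util
import os


def preflight(env=os.environ, packages=("tinker", "tinker_cookbook", "marimo")):
    """Return a list of setup problems; an empty list means nothing was flagged.

    `preflight` is a hypothetical helper. The default import names are
    assumptions based on the package names in the install command.
    """
    problems = []
    # The export step should have placed the key in the environment.
    if not env.get("TINKER_API_KEY"):
        problems.append("TINKER_API_KEY is not set")
    # find_spec returns None when a top-level module cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append(f"package {name!r} is not importable")
    return problems
```

Calling `preflight()` in the same shell session you will launch marimo from returns a list of messages to act on. It only checks that the key is present, not that it is valid.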

Running a tutorial

git clone https://github.com/thinking-machines-lab/tinker-cookbook.git
cd tinker-cookbook
marimo edit tutorials/101_hello_tinker.py

This opens the notebook in your browser with an interactive editor. Rendered versions are also available on the Tinker docs site.

Alternatively, you can try notebooks online in molab, using the links below.

Tutorials

Basics (1xx)

| # | Notebook | What you'll learn | Try on molab |
|---|---|---|---|
| 101 | Hello Tinker | Architecture overview, client hierarchy, sampling from a model | Open in molab |
| 102 | Your First SFT | Renderers, datum construction, training loop | Open in molab |
| 103 | Async Patterns | Concurrent futures, num_samples, batch evaluation throughput | Open in molab |
| 104 | First RL | GRPO on GSM8K: reward functions, group-relative advantages | Open in molab |
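
As a rough preview of the "group-relative advantages" idea named for notebook 104, the sketch below scores each sampled completion against the other completions for the same prompt. It is a common textbook formulation, not necessarily what the notebook implements; the function name and the `scale_by_std` flag are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, scale_by_std=True, eps=1e-6):
    """Center rewards within one group of samples for the same prompt.

    Hypothetical helper: some implementations only subtract the group
    mean, others also divide by the group standard deviation.
    """
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not scale_by_std:
        return centered
    # eps keeps the division finite when every reward in the group is equal.
    sigma = pstdev(rewards)
    return [c / (sigma + eps) for c in centered]
```

When every sample in a group earns the same reward, all advantages are zero, so that group contributes no learning signal.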

Core Concepts (2xx)

| # | Notebook | What you'll learn | Try on molab |
|---|---|---|---|
| 201 | Rendering | Renderers, tokenization, vision inputs, TrainOnWhat | Open in molab |
| 202 | Loss Functions | cross_entropy, IS, PPO, CISPO, custom loss | Open in molab |
| 203 | Completers | TokenCompleter vs MessageCompleter, LLM-as-judge | Open in molab |
| 204 | Weights | Checkpoint lifecycle, save/load/download/TTL | Open in molab |
| 205 | Evaluations | Custom evaluators, NLL, Inspect AI | Open in molab |
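
To give a feel for how two of the losses listed for notebook 202 differ, here is a per-token sketch of an importance-sampling (IS) objective and the PPO clipped surrogate in their textbook forms. Tinker's built-in losses may use different thresholds, reductions, or argument names; the functions and the `clip_eps` default below are assumptions for illustration only.

```python
import math


def is_loss(logp_new, logp_old, advantage):
    """Importance-sampling policy-gradient loss for one token (textbook form)."""
    ratio = math.exp(logp_new - logp_old)
    return -ratio * advantage


def ppo_loss(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO clipped surrogate loss for one token (textbook form).

    clip_eps=0.2 is an assumed default, not a value taken from the tutorials.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Taking the minimum of the two objectives removes the incentive to
    # push the ratio beyond the clip range in the advantage's direction.
    return -min(ratio * advantage, clipped * advantage)
```

The two losses agree when the new and old log-probabilities match; they diverge once the ratio leaves the clip range in the direction the advantage favors.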

Cookbook Abstractions (3xx)

| # | Notebook | What you'll learn | Try on molab |
|---|---|---|---|
| 301 | Cookbook Abstractions | Env, EnvGroupBuilder, RLDataset, ProblemEnv | Open in molab |
| 302 | Custom Environment | Build your own ProblemEnv subclass and RLDataset | Open in molab |
| 303 | SFT with Config | train.Config, ChatDatasetBuilder, train.main() | Open in molab |
| 304 | RL with Config | RLDatasetBuilder, RL training pipeline | Open in molab |

Advanced (4xx)

| # | Notebook | What you'll learn | Try on molab |
|---|---|---|---|
| 401 | SL Hyperparameters | LR scaling, rank selection, sweeps | Open in molab |
| 402 | RL Hyperparameters | KL penalty, group size, advantages | Open in molab |
| 403 | DPO & Preferences | Comparison, DPO loss, PreferenceModel | Open in molab |
| 404 | Sequence Extension | Multi-turn RL, conversation masks | Open in molab |
| 405 | Multi-Agent RL | MessageEnv, self-play, group rewards | Open in molab |
| 406 | Prompt Distillation | Teacher/student, context distillation | Open in molab |
| 407 | RLHF Pipeline | 3-stage SFT, preference model, RL | Open in molab |
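
For readers who have not met the DPO loss mentioned for notebook 403, the sketch below shows the standard published formulation for a single preference pair, using sequence log-probabilities under the policy and a frozen reference model. It is a standalone illustration under those assumptions, not the cookbook's implementation; the function name, argument names, and `beta=0.1` default are hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Computes -log(sigmoid(beta * margin)), where margin measures how much
    more the policy prefers the chosen response than the reference does.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

With a zero margin the loss equals log 2; it shrinks as the policy favors the chosen response more strongly than the reference model does.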

Deployment (5xx)

| # | Notebook | What you'll learn | Try on molab |
|---|---|---|---|
| 501 | Export to HF | Merge LoRA into full model | Open in molab |
| 502 | Build LoRA Adapter | PEFT format for vLLM/SGLang | Open in molab |
| 503 | Publish to Hub | Upload to HuggingFace with model card | Open in molab |

Work through them in order — each builds on concepts from the previous one.
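
The numbering scheme makes the recommended order easy to recover from file names. The sketch below assumes every notebook follows the `NNN_name.py` pattern seen in `tutorials/101_hello_tinker.py`; the `ordered_tutorials` helper and any other file names used with it are hypothetical.

```python
import re
from pathlib import Path

# Series names taken from the section headings above.
SERIES = {
    1: "Basics",
    2: "Core Concepts",
    3: "Cookbook Abstractions",
    4: "Advanced",
    5: "Deployment",
}


def ordered_tutorials(names):
    """Return (number, series, name) tuples sorted by tutorial number.

    Hypothetical helper: entries without a three-digit prefix are skipped.
    """
    numbered = []
    for name in names:
        match = re.match(r"(\d{3})_", Path(name).name)
        if match:
            number = int(match.group(1))
            series = SERIES.get(number // 100, "Unknown")
            numbered.append((number, series, name))
    return sorted(numbered)
```

From the repository root, `ordered_tutorials(str(p) for p in Path("tutorials").glob("*.py"))` would list the notebooks in the order to work through them.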

After the tutorials