Hi, I'm Kam Woh 👋

Research Scientist @ Meta AI · London, UK
Video diffusion · generative neural rendering · talking avatars · world models


About me

I'm a Research Scientist at Meta AI London, working on video diffusion models, generative neural rendering, talking avatars, and world models — teaching machines to observe, imagine, and simulate visual reality. The longer-term hope is to make worlds as easy to build as they are to imagine, so anyone can freely express the ones in their head — not only engineers.

I received my Ph.D. in Computer Science from the University of Surrey (supervised by Prof. Tao Xiang and Prof. Yi-Zhe Song, working closely with Dr. Xiatian Zhu), and my Bachelor's in Computer Science (AI) from the University of Malaya.


🌙 Currently building: Yume (夢)

A programmable, explicit world model on Godot — built by Claude, for Claude.

Yume is my take on a question I keep circling back to: what if you could describe a world in plain language and have it materialize into something runnable — without writing any per-world code? (夢 means "dream" in Japanese.)

A world's entities and rules are written as pure JSON; a small fixed interpreter advances that world tick by tick; Godot projects the resulting state to pixels, audio, HUD, or text. The engine ships seven primitives — Entity / Tag / Rule / Trigger / Effect / Query / Relation — and no game-specific code. You describe a world; you never edit the engine.
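To make the "world as data, fixed interpreter" idea concrete, here is a minimal sketch in Python. The JSON schema, the `tick` function, and the `flip` / `set` effect names are all invented for illustration; they are assumptions, not Yume's actual format or API, and only a few of the seven primitives (Entity, Tag, Rule, Trigger, Effect) are hinted at.

```python
import copy
import json

# Hypothetical world description. This schema is invented for illustration
# and is NOT Yume's real format.
WORLD_JSON = """
{
  "entities": {
    "lamp": {"tags": ["light"], "on": false},
    "door": {"tags": ["portal"], "open": false}
  },
  "rules": [
    {
      "trigger": {"action": "toggle", "target_tag": "light"},
      "effect": {"flip": "on"}
    },
    {
      "trigger": {"action": "push", "target_tag": "portal"},
      "effect": {"set": {"open": true}}
    }
  ]
}
"""


def tick(world, action, target):
    """Advance the world by one tick.

    Returns a new world dict and leaves the input untouched, so the
    interpreter behaves like a pure function of (world, action, target).
    """
    nxt = copy.deepcopy(world)
    entity = nxt["entities"][target]
    for rule in world["rules"]:
        trig = rule["trigger"]
        # A rule fires when the action matches and the target carries the tag.
        if trig["action"] == action and trig["target_tag"] in entity["tags"]:
            effect = rule["effect"]
            if "flip" in effect:
                key = effect["flip"]
                entity[key] = not entity[key]
            for key, value in effect.get("set", {}).items():
                entity[key] = value
    return nxt


world = json.loads(WORLD_JSON)
after = tick(world, "toggle", "lamp")
print(after["entities"]["lamp"]["on"])  # True
print(world["entities"]["lamp"]["on"])  # False (original state is unchanged)
```

The point of the sketch is that `tick` contains no knowledge of lamps or doors: changing the world means editing the JSON, not the interpreter.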

A world model is just a transition function f(state, action) → next_state. Yume lets you write f as JSON and run it — which makes it useful well beyond games:

| Use | How |
| --- | --- |
| 🎮 Games | the Godot projection: a playable build |
| 🤖 RL / agent-eval testbeds | deterministic, seedable, gym-like stepping |
| 🏞️ Scene / world generation | prose → 3D scene pipelines |
| 🧠 Training data for neural world models | roll a JSON world out, record (state, action, next_state) trajectories, train a Dreamer/Genie-style implicit model that approximates the same f at scale |

That last row is the thesis: a clean, authorable explicit substrate that bridges to implicit (neural) world models. You can interpret it directly, and you can use it as a faucet of reproducible training data.
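As a rough illustration of the training-data use, the sketch below rolls a toy world forward under a seeded random policy and records (state, action, next_state) triples. The `step` and `rollout` functions and the 1-D walker world are hypothetical stand-ins chosen for brevity; they are assumptions and not part of Yume.

```python
import random


def step(state, action):
    """Hypothetical transition function f(state, action) -> next_state.

    A 1-D walker clamped to [0, 4]; it stands in for a JSON-defined world.
    """
    delta = {"left": -1, "right": 1, "stay": 0}[action]
    return {"x": min(4, max(0, state["x"] + delta))}


def rollout(seed, steps):
    """Record (state, action, next_state) triples under a seeded random policy."""
    rng = random.Random(seed)  # private RNG, so the same seed replays the same run
    state = {"x": 2}
    trajectory = []
    for _ in range(steps):
        action = rng.choice(["left", "right", "stay"])
        nxt = step(state, action)
        trajectory.append((state, action, nxt))
        state = nxt
    return trajectory


a = rollout(seed=0, steps=5)
b = rollout(seed=0, steps=5)
print(a == b)  # True: same seed, same trajectory
```

Because `step` is deterministic and the only randomness comes from the seeded policy, every recorded trajectory can be regenerated exactly, which is what makes the data reproducible.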

It's the most fun I've had with a side project in years — the entire repo is written and operated by Claude Code. It's pre-1.0 and experimental, and I'd love for you to take a look. ⭐


📄 Selected publications

| Year | Work | Venue |
| --- | --- | --- |
| 2026 | Kaleido: unified neural rendering via spatial generative models | ICLR 2026 |
| 2026 | Rays as Pixels: joint video generation & camera trajectory estimation | ICML 2026 |
| 2026 | VecGlypher: LLM-based vector glyph generation from text/image | CVPR 2026 |
| 2024 | PartCraft: part-based compositional image generation | ECCV 2024 |
| 2024 | ConceptHash: interpretable hashing via part-based concepts | CVPRW 2024 (Best Paper) |
| 2021 | OrthoHash: one-loss deep hashing with orthogonal centres | NeurIPS 2021 |
| 2019 | DeepIPR: DNN ownership protection via passport layers | NeurIPS 2019 |

📚 Full list on my website and Google Scholar.


📫 kamwoh [at] gmail.com
