- UC Berkeley and Voio, Inc
- Berkeley
- https://zwcolin.github.io/
- @zwcolin
- in/zwcolin
Highlights
- Pro
Lists (1)
Stars
AutoGaze automatically removes redundant patches in a video, reducing #tokens in ViT/MLLM by 4x-100x.
OpenCUA: Open Foundations for Computer-Use Agents
Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
The implementation for ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.
Curate, Annotate, and Manage Your Data in LightlyStudio.
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
rpinsler / gym-maze
Forked from zuoxingdong/mazelab
A customizable gym environment for maze/gridworld
Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1
Our library for RL environments + evals
Witness the aha moment of VLM with less than $3.
Random maze environments with different size and complexity for reinforcement learning research.
A customizable framework to create maze and gridworld environments
A framework for few-shot evaluation of language models.
A fork to add multimodal model training to open-r1
[ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning
A collection of materials for CS application
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.