Stars
A repository of reinforcement learning implementations for Unitree robots, based on MuJoCo.
Evaluating long-term memory of reinforcement learning algorithms
Simple language-driven navigation tasks for studying compositional learning
An interface library for RL post-training with environments.
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
A collection of robotics environments geared toward benchmarking multi-task and meta-reinforcement learning
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
Experiments in abstraction learning
Train your agent model with our easy and efficient framework
Open-source framework for the research and development of foundation models.
Hierarchical Reasoning Model Official Release
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
SkyReels-V2: Infinite-Length Film Generative Model
Interactive visualizations of the geometric intuition behind diffusion models.
User-friendly implementation of the Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head with expert-choice routing, providing a content-based sparse attention mechanism (see the sketch after this list).
Making large AI models cheaper, faster and more accessible
Let's make video diffusion practical!
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
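The MoSA entry above describes a concrete mechanism worth unpacking: each attention head has a router that scores all tokens, and the head attends only over its own top-k selection (expert choice: the head picks tokens, rather than tokens picking heads). Below is a minimal single-head PyTorch sketch of that idea. The function names, the sigmoid gating on router scores, and the zero-fill scatter are illustrative assumptions, not the repository's actual implementation, which also handles batching, causal masking, and multiple heads.

```python
# A minimal sketch of expert-choice token routing for one sparse attention
# head, in the spirit of the MoSA description above. Names (`router`,
# `top_k`) and the gating/scatter details are assumptions for illustration.
import torch
import torch.nn as nn


def expert_choice_head(x, q_proj, k_proj, v_proj, o_proj, router, top_k):
    """x: (seq, dim). The router scores every token; the head keeps its
    top_k tokens, runs dense attention among just those tokens, and
    scatters the result back to the full sequence (zeros elsewhere)."""
    seq, dim = x.shape
    scores = router(x).squeeze(-1)            # (seq,) content-based token scores
    gate, idx = scores.topk(top_k)            # head selects its own top_k tokens
    sel = x[idx]                              # (top_k, dim) selected tokens
    q, k, v = q_proj(sel), k_proj(sel), v_proj(sel)
    attn = torch.softmax(q @ k.T / dim ** 0.5, dim=-1)  # (top_k, top_k)
    out_sel = o_proj(attn @ v) * torch.sigmoid(gate).unsqueeze(-1)
    out = x.new_zeros(seq, dim)
    out[idx] = out_sel                        # scatter back into the sequence
    return out


if __name__ == "__main__":
    dim, seq, top_k = 64, 128, 16
    x = torch.randn(seq, dim)
    make = lambda: nn.Linear(dim, dim, bias=False)
    y = expert_choice_head(x, make(), make(), make(), make(),
                           nn.Linear(dim, 1), top_k)
    print(y.shape)  # torch.Size([128, 64])
```

Because each head attends over a fixed top_k rather than the full sequence, per-head attention cost drops from O(seq²) to O(top_k²), and the selection depends on token content via the router scores, which is the content-based sparsity the description refers to.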