Kunhao ZHENG DyeKuu

😻

Now: Meta FAIR CodeGen | Ex: Hugging Face, Sea AI Lab, OpenAI, Stockly | Alumni: Ecole Polytechnique X18, SJTU 16

85 followers · 41 following

Meta FAIR
France
UTC +01:00
dyekuu.github.io
@kunhaoZ
in/kunhao-zheng-x18

Achievements

x3 x3

Highlights

Developer Program Member
Pro

Starred repositories

shangshang-wang / Tina

Tina: Tiny Reasoning Models via LoRA

Python 310 39 Updated Sep 23, 2025

facebookresearch / flow_matching

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 3,892 280 Updated Sep 25, 2025

axon-rl / gem

A Gym for Agentic LLMs

Python 409 27 Updated Dec 23, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,464 1,998 Updated Nov 1, 2025

sail-sg / jrystal

A JAX-based Differentiable Density Functional Theory Framework for Materials

Python 42 2 Updated Dec 5, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,739 2,880 Updated Dec 23, 2025

facebookresearch / BigOBench

BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated code.

Python 38 5 Updated Apr 15, 2025

ultimate-pa / ultimate

The Ultimate program analysis framework.

Java 236 47 Updated Dec 18, 2025

huggingface / smollm

Everything about the SmolLM and SmolVLM family of models

Python 3,493 245 Updated Nov 20, 2025

facebookresearch / LeanUniverse

LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management

Python 75 4 Updated Jan 15, 2025

antonpk1 / stackfish

Stackfish is an open-source LLM-powered pipeline designed to automatically solve competitive programming problems.

C++ 50 4 Updated Dec 14, 2024

sixty-north / cosmic-ray

Mutation testing for Python

Python 618 68 Updated Nov 9, 2025

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,221 984 Updated Jul 1, 2024

esbatmop / MNBVC

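MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even text written in "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.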

4,063 285 Updated Nov 26, 2025

Ablustrund / APPS_Plus

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

74 3 Updated Aug 31, 2024

sail-sg / autofd

Automatic Functional Differentiation in JAX

Python 80 1 Updated Sep 18, 2025

evalplus / evalplus

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

Python 1,660 187 Updated Oct 2, 2025

srush / LLM-Training-Puzzles

What would you do with 1000 H100s...

Jupyter Notebook 1,134 69 Updated Jan 10, 2024

deepspeedai / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,067 4,670 Updated Dec 22, 2025

Liuhong99 / Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Python 980 57 Updated Jan 30, 2024

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Python 805 70 Updated Jun 8, 2025

BBuf / how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda 2,708 244 Updated Dec 23, 2025

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,174 31,523 Updated Dec 23, 2025