Starred repositories
Minimal reproduction of DeepSeek R1-Zero
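A hedged sketch of the group-relative advantage normalization (GRPO) that R1-Zero-style reproductions typically build on; this is generic illustration code, not this repository's implementation, and the reward values are placeholders.

```python
# Generic GRPO-style advantage sketch (not this repo's code): rewards for several
# sampled completions of the same prompt are normalized within their group.
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar reward per sampled completion."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# e.g. rule-based correctness rewards for 4 completions of each of 2 prompts
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
print(group_relative_advantages(rewards))
```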
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Train transformer language models with reinforcement learning.
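A minimal usage sketch, assuming this entry is Hugging Face TRL; the model id, dataset, and training arguments below are placeholders taken from the style of TRL's quickstart, not a prescribed configuration.

```python
# Minimal supervised fine-tuning sketch with TRL's SFTTrainer.
# Assumption: the starred repo is Hugging Face TRL; names below are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # any prompt/completion dataset

trainer = SFTTrainer(
    model="Qwen/Qwen2-0.5B",  # model id resolved via transformers
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out", max_steps=100),
)
trainer.train()
```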
Democratizing Reinforcement Learning for LLMs
Repository for research works and resources related to model reprogramming <https://arxiv.org/abs/2202.10629>
AgentFlow: In-the-Flow Agentic System Optimization
Code and dataset for paper: DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping
[NeurIPS 2025 Spotlight] "Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection"
Official implementation of Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation
[ICLR2021 Oral] Free Lunch for Few-Shot Learning: Distribution Calibration
Official code for "Vision Transformers with Self-Distilled Registers" (NeurIPS 2025 Spotlight)
Source code for the NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
Learning Deep Representations of Data Distributions
Qwen Code is a coding agent that lives in the digital world.
From unknown novice to large language model (LLM) hero! Stay tuned for what comes next!
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
🦜🔗 The platform for reliable agents.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
scikit-learn cross validators for iterative stratification of multilabel data
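A short sketch of the scikit-learn-style API these cross validators expose, assuming this is the `iterative-stratification` package (module `iterstrat`); the toy data is illustrative only.

```python
# Multilabel-stratified K-fold splits with iterative-stratification (iterstrat).
import numpy as np
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold

X = np.random.rand(8, 4)                       # 8 samples, 4 features
y = np.array([[0, 1], [1, 0], [1, 1], [0, 0],  # 2 binary labels per sample
              [0, 1], [1, 0], [1, 1], [0, 0]])

mskf = MultilabelStratifiedKFold(n_splits=2, shuffle=True, random_state=0)
for train_idx, test_idx in mskf.split(X, y):
    print("TRAIN:", train_idx, "TEST:", test_idx)
```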
Pytorch implementation of Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels
Empirical tricks for training robust models (ICLR 2021)
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
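A minimal sketch of how this library (timm) is typically used: create a pretrained backbone by name and run a forward pass; the architecture name and class count are arbitrary examples.

```python
# Create a pretrained image backbone with timm and run a dummy forward pass.
import timm
import torch

model = timm.create_model("resnet50", pretrained=True, num_classes=10)
model.eval()

x = torch.randn(1, 3, 224, 224)          # dummy image batch
with torch.no_grad():
    logits = model(x)
print(logits.shape)                       # torch.Size([1, 10])

# Discover available architectures by wildcard:
print(timm.list_models("efficientnet*")[:5])
```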
This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and watch the denoising process.
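For context, a generic sketch (not this repository's code) of the DDPM forward noising process q(x_t | x_0) that such CIFAR-10 training relies on; the schedule values are the standard linear defaults and the tensors are stand-ins.

```python
# Generic DDPM forward (noising) process: x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product of (1 - beta_t)

def q_sample(x0, t, noise):
    """Sample x_t ~ q(x_t | x_0) for a batch of timesteps t."""
    a = alphas_bar[t].view(-1, 1, 1, 1)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise

x0 = torch.randn(4, 3, 32, 32)                   # stand-in for CIFAR-10 images
t = torch.randint(0, T, (4,))
noise = torch.randn_like(x0)
xt = q_sample(x0, t, noise)                      # noisy images the UNet learns to denoise
print(xt.shape)
```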
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement learning.
[arXiv:2508.00410] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"
tmlr-group / Co-rewarding
Forked from resistzzz/Co-rewarding. [arXiv:2508.00410] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"