
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 4,300 360 Updated Dec 25, 2025

The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."

Python 24 1 Updated Dec 22, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,956 3,862 Updated Dec 25, 2025

DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue

Python 53 7 Updated Oct 15, 2025
94 3 Updated Dec 5, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 13,952 1,309 Updated Oct 28, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 74,118 8,875 Updated Dec 24, 2025

Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.

Python 388 26 Updated Oct 21, 2025

nnScaler: Compiling DNN models for Parallel Training

Python 121 22 Updated Sep 23, 2025

Maps between a 1-D space-filling Hilbert curve and N-D coordinates

Python 269 38 Updated Apr 28, 2024

MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowserComp and xBench.

Python 1,619 175 Updated Nov 30, 2025

This is the official implementation of the paper "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward"

Python 10 Updated Oct 17, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,118 338 Updated Dec 24, 2025

Official Repository of "Learning what reinforcement learning can't"

Python 71 1 Updated Nov 16, 2025

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 736 44 Updated Jun 6, 2025

Lightweight coding agent that runs in your terminal

Rust 54,648 6,950 Updated Dec 25, 2025
Python 3 Updated Sep 14, 2025

The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first large-scale scaling law for diffusion language models.

Python 45 1 Updated Nov 6, 2025

Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.

HTML 41 3 Updated Dec 20, 2025

[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat

Python 3,825 453 Updated Oct 16, 2025

Repository for the paper 'Is In-Context Learning Learning?'

Jupyter Notebook 3 Updated Sep 16, 2025

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,858 75 Updated Jun 5, 2025

🧬 Augmenting zero-shot mutant prediction by retrieval-based logits fusion. (ISMB/ECCB 2025)

Python 114 12 Updated Aug 20, 2025

Awesome In-Context RL: A curated list of In-Context Reinforcement Learning

261 14 Updated Sep 8, 2025

Ethereal Style for Zotero

JavaScript 4,684 147 Updated Nov 24, 2025

Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"

Python 748 155 Updated Jul 16, 2025

Giving Kubernetes Superpowers to everyone

Go 7,255 912 Updated Dec 23, 2025

One-shot Entropy Minimization

Python 187 11 Updated Jun 13, 2025