XuGW-Kevin


KempnerPulse - real-time GPU monitoring dashboard for DCGM metrics.

Python 19 Updated Apr 15, 2026

Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"

Python 166 26 Updated Mar 15, 2026

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 4,842 3,194 Updated Apr 9, 2026

Official Code Implementation of Translating Flow to Policy via Hindsight Online Imitation

Python 122 Updated Mar 12, 2026

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 18,995 2,050 Updated Apr 13, 2026

Official Implementation of iMF https://arxiv.org/abs/2512.02012

Python 266 9 Updated Feb 27, 2026

🤗 smolagents: a barebones library for agents that think in code.

Python 26,654 2,485 Updated Apr 16, 2026

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 114,850 19,173 Updated Apr 16, 2026

[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.

Python 122 5 Updated Mar 10, 2026

dLLM: Simple Diffusion Language Modeling

Python 2,386 240 Updated Apr 15, 2026

Frequently updated list of dLLM (Diffusion Large Language Models) papers, models, and other resources

Python 26 Updated Mar 31, 2026

A framework for few-shot evaluation of language models.

Python 12,208 3,191 Updated Apr 8, 2026

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 335 29 Updated Jan 26, 2026

Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Python 29 2 Updated Mar 1, 2025

metaTextGrad: Automatically optimizing language model optimizers. Published in NeurIPS 2025.

Python 11 2 Updated Nov 5, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 2,379 452 Updated Apr 16, 2026

Minimal reproduction of DeepSeek R1-Zero

Python 13,053 1,582 Updated Feb 27, 2026

MENTOR is a highly efficient visual RL algorithm that excels in both simulation and real-world complex robotic learning tasks.

Python 27 1 Updated Jul 9, 2025

Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"

Python 434 52 Updated Jan 26, 2026

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,727 258 Updated Nov 12, 2025

[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.

Python 556 23 Updated Jan 4, 2026

Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep

Python 179 13 Updated Apr 23, 2025

Testing baseline LLM performance across various models

Python 347 65 Updated Mar 20, 2026

SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

Python 4,711 826 Updated Apr 1, 2026

An LLM trained only on data from certain time periods to reduce modern bias

Python 1,895 72 Updated Apr 8, 2026

This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".

Python 90 4 Updated Jul 10, 2025

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

465 14 Updated Apr 18, 2024

[NeurIPS 2023] Official code release for the paper: "Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?"

Python 6 Updated Sep 29, 2024