Stars
KempnerPulse - real-time GPU monitoring dashboard for DCGM metrics.
Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
Train the smallest LM you can that fits in 16MB. Best model wins!
Official Code Implementation of Translating Flow to Policy via Hindsight Online Imitation
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
Official Implementation of iMF (https://arxiv.org/abs/2512.02012)
🤗 smolagents: a barebones library for agents that think in code.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows.
[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.
Frequently updated list of dLLM (Diffusion Large Language Models) papers, models, and other resources
A framework for few-shot evaluation of language models.
[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
metaTextGrad: Automatically optimizing language model optimizers. Published in NeurIPS 2025.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Minimal reproduction of DeepSeek R1-Zero
MENTOR is a highly efficient visual RL algorithm that excels in both simulation and real-world complex robotic learning tasks.
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
Official PyTorch implementation for "Large Language Diffusion Models"
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
Testing baseline LLM performance across various models
SWE-bench: Can Language Models Resolve Real-world Github Issues?
An LLM trained only on data from certain time periods to reduce modern bias
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
[NeurIPS 2023] Official code release for the paper: "Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?"