Simplifying reinforcement learning for complex game environments

C 4,109 299 Updated Nov 6, 2025
Python 10 1 Updated Oct 17, 2025

Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/

Python 1,622 80 Updated Oct 3, 2025

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,766 336 Updated Jul 15, 2024

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 286 27 Updated Nov 6, 2025

Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Python 738 165 Updated Nov 7, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,765 99 Updated Mar 18, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 16,973 1,294 Updated Nov 3, 2025

My learning notes/codes for ML SYS.

Python 4,077 248 Updated Nov 6, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,160 161 Updated Nov 7, 2025

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 302 69 Updated Oct 29, 2025

Optimize prompts, code, and more with AI-powered Reflective Text Evolution

Jupyter Notebook 1,495 107 Updated Nov 6, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,391 244 Updated Nov 7, 2025

Async RL Training at Scale

Python 748 125 Updated Nov 7, 2025

Real-time terminal monitor for InfiniBand networks - htop for high-speed interconnects

Rust 45 1 Updated Sep 3, 2025

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,321 124 Updated Oct 22, 2025
Python 910 96 Updated Nov 7, 2025

Training-Ready RL Environments + Evals

Python 164 181 Updated Nov 7, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition

Python 617 68 Updated Oct 16, 2025

A PyTorch native platform for training generative AI models

Python 4,656 597 Updated Nov 7, 2025

Automatic evals for LLMs

HTML 552 67 Updated Jun 27, 2025

Environments for LLM Reinforcement Learning

Python 3,462 426 Updated Nov 7, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,184 2,438 Updated Nov 6, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 17 Updated Oct 2, 2025

Qwen Code is a coding agent that lives in the digital world.

TypeScript 15,113 1,245 Updated Nov 7, 2025

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python 30,760 4,613 Updated Nov 7, 2025

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 21,022 1,623 Updated Nov 6, 2025