Simplifying reinforcement learning for complex game environments

C 4,198 305 Updated Nov 14, 2025
Python 10 1 Updated Oct 17, 2025

Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/

Python 1,638 80 Updated Oct 3, 2025

Solve puzzles. Improve your PyTorch.

Jupyter Notebook 3,773 339 Updated Jul 15, 2024

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 299 28 Updated Nov 14, 2025

Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Python 744 168 Updated Nov 14, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,766 99 Updated Mar 18, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,154 1,302 Updated Nov 10, 2025

My learning notes/codes for ML SYS.

Python 4,156 253 Updated Nov 10, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,196 167 Updated Nov 13, 2025

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 310 75 Updated Oct 29, 2025

Optimize prompts, code, and more with AI-powered Reflective Text Evolution

Jupyter Notebook 1,551 113 Updated Nov 12, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,472 253 Updated Nov 14, 2025

Async RL Training at Scale

Python 768 133 Updated Nov 14, 2025

Real-time terminal monitor for InfiniBand networks - htop for high-speed interconnects

Rust 45 1 Updated Sep 3, 2025

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,339 128 Updated Oct 22, 2025
Python 915 97 Updated Nov 13, 2025

Training-Ready RL Environments + Evals

Python 174 187 Updated Nov 14, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition

Python 617 68 Updated Oct 16, 2025

A PyTorch native platform for training generative AI models

Python 4,706 601 Updated Nov 14, 2025

Automatic evals for LLMs

HTML 557 68 Updated Jun 27, 2025

Environments for LLM Reinforcement Learning

Python 3,485 431 Updated Nov 14, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,627 2,522 Updated Nov 14, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 17 Updated Nov 13, 2025

Qwen Code is a coding agent that lives in the digital world.

TypeScript 15,334 1,271 Updated Nov 14, 2025

Call any LLM API with cost tracking, guardrails, logging and load balancing. 1.8k+ models, 80+ providers, 50+ endpoints (unified + native format). Available as a Python SDK or Proxy Server (AI Gate…

Python 31,054 4,700 Updated Nov 14, 2025

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 21,628 1,679 Updated Nov 13, 2025