Python 5 3 Updated May 26, 2026

Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals

Python 2,253 194 Updated Apr 19, 2026

Search, understand, reproduce, and improve an idea with ease

Python 1,202 123 Updated Jun 12, 2026

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 3,774 526 Updated Jun 13, 2026

Official Repo for Open-Reasoner-Zero

Python 2,097 120 Updated Jun 2, 2025

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,293 519 Updated Jun 13, 2026

Repo of paper "Free Process Rewards without Process Labels"

Python 171 11 Updated Mar 14, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,862 112 Updated Mar 18, 2025

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 335 22 Updated Apr 24, 2025

GUI for a Vocal Remover that uses Deep Neural Networks.

Python 25,035 1,869 Updated Mar 13, 2025

Generative Agents: Interactive Simulacra of Human Behavior

21,529 3,020 Updated Aug 5, 2024

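Provides a practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…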

Python 70,872 8,365 Updated Jan 25, 2026

Web application where humans can play Overcooked with AI agents.

JavaScript 60 28 Updated Dec 6, 2022

Code for "On the Utility of Learning about Humans for Human-AI Coordination"

Python 112 46 Updated Apr 17, 2023

A benchmark environment for fully cooperative human-AI performance.

Jupyter Notebook 976 220 Updated Mar 22, 2025

Diversity is All You Need: Learning Skills without a Reward Function in PyTorch.

Python 89 27 Updated Jan 12, 2026
Python 59 8 Updated Jul 11, 2022

A minimalist environment for decision-making in autonomous driving

Python 3,273 878 Updated May 29, 2026
Python 1 Updated Jan 6, 2021
Python 1 Updated Nov 19, 2020

Reproductions of multi-agent reinforcement learning (MARL) algorithms, including QMIX, VDN, QTRAN, MAVEN, and others

Python 217 24 Updated Jun 6, 2022

Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II

Python 1,745 301 Updated Sep 8, 2022

Game data for Punishing: Gray Raven (战双:帕弥什)

Lua 268 223 Updated Mar 14, 2020