Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A framework for few-shot evaluation of language models.
Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
Demonstrations of Loss of Plasticity and Implementation of Continual Backpropagation
Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II
Python Multi-Agent Reinforcement Learning framework
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
PyTorch implementation of FQF, IQN and QR-DQN.
Microsoft Office installer for macOS
Rainbow: Combining Improvements in Deep Reinforcement Learning
Mastering Diverse Domains through World Models
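🚀 [Large models] Train a 26M-parameter multimodal vision-language model (VLM) from scratch in just 1 hour!
Chinese translation of llm-course
My AI study notes, including notes and code for the PyTorch course by Bilibili creator deep_thoughts, plus slides and code for the BUPT course "Deep Learning and Digital Video".
🚀🚀 [Large models] Train a small 26M-parameter GPT completely from scratch in just 2 hours!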
[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
Recipes to train the self-rewarding reasoning LLMs.
An open source flight dynamics & control software library
An environment based on JSBSim aimed at one-to-one close air combat.
An educational resource to help anyone learn deep reinforcement learning.
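PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: eight lessons that clarify the algorithm theory, untangle the code logic, and put decision AI into practice)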
API to run VirtualHome, a Multi-Agent Household Simulator
A library for advanced large language model reasoning