holarissun
🎯
Focusing

Highlights

  • Pro

Python 74 7 Updated Apr 27, 2024

Projects related to Annual Computer Poker Competition

C 15 11 Updated Sep 19, 2016

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

C++ 5,178 1,125 Updated Apr 26, 2026

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 394 90 Updated Apr 15, 2026

🤖 An Open Source Texas Hold'em AI

Python 346 71 Updated Oct 22, 2023

Poker-Hand-Evaluator: An efficient poker hand evaluation algorithm and its implementation, supporting 7-card poker and Omaha poker evaluation

C 494 107 Updated Nov 25, 2025

BibTool is a tool for manipulating BibTeX databases. BibTeX provides a means to integrate citations into LaTeX documents. BibTool allows the manipulation of BibTeX files which goes beyond the possi…

C 239 33 Updated Jan 14, 2026

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,105 485 Updated Apr 27, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,889 366 Updated Apr 6, 2026

Partially Observable Process Gym

Python 214 20 Updated Jun 12, 2025
Python 359 20 Updated Jul 29, 2025
Python 64 3 Updated Mar 8, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 20,969 3,760 Updated Apr 27, 2026
Python 11 2 Updated Apr 27, 2026
178 61 Updated Aug 26, 2020

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025)

Python 17 Updated Aug 22, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,436 774 Updated Apr 21, 2026

Active reward modeling with last layer Fisher Information (ICML'25)

Python 7 Updated Jul 9, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 27,170 1,981 Updated Jan 9, 2026

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 8,093 1,453 Updated Nov 28, 2025

Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs

Python 22 2 Updated Apr 24, 2025

Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024

Python 144 23 Updated Feb 24, 2025

Reusable BatchBALD implementation

Jupyter Notebook 77 15 Updated Feb 28, 2024

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,623 431 Updated Dec 7, 2025

AlphaFold 3 inference pipeline.

Python 7,912 1,194 Updated Apr 23, 2026

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 14,491 1,004 Updated Apr 27, 2026

Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives

Python 73 5 Updated Apr 2, 2025