View nissymori's full-sized avatar


One unified CLI for headless coding agent execution 🤖

TypeScript 16 3 Updated May 15, 2026

[ICML 2026] CapBencher toolkit: Give your LLM benchmark a built-in alarm for leakage and gaming

Python 8 1 Updated Feb 24, 2026

[English/Japanese] A curated list of awesome online-prediction papers, libraries, and resources. Created and hosted by MIRU2025 Young Researchers Program group 5.

8 Updated Aug 19, 2025

[ICML 2026] Official JAX code for Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

Python 5 Updated May 9, 2026

A Python tool that automatically cleans, completes, and standardizes BibTeX entries using LLMs and web search.

Python 181 7 Updated Mar 18, 2026

Transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper.

Python 157 10 Updated Apr 29, 2026

MCP server that uses arxiv-to-prompt to fetch and process arXiv LaTeX sources for precise interpretation of mathematical expressions in scientific papers.

Python 132 13 Updated May 11, 2026

Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞

Python 12,379 1,445 Updated May 20, 2026

AI agents running research on single-GPU nanochat training automatically

Python 82,303 11,948 Updated Mar 26, 2026

A Simple and Universal Swarm Intelligence Engine, Predicting Anything.

Python 61,379 9,623 Updated Apr 2, 2026

Implementation for our paper "Gradient Regularization prevents Reward Hacking in RLHF and RLVR". Implemented in TRL and for Hugging Face Transformers.

Python 11 Updated Feb 24, 2026

A fast and soft pattern search for trillion-scale corpora.

Python 225 10 Updated Feb 28, 2026
C++ 10 1 Updated Feb 18, 2026

https://mahjongfont.pages.dev - Japanese Mahjong (Riichi Mahjong) tile font with OpenType features

TypeScript 22 2 Updated May 14, 2026

High-Performance Research Environment for Riichi Mahjong

Rust 52 13 Updated May 9, 2026

Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation

Python 707 100 Updated Jun 3, 2024

A GPU-Accelerated Mahjong Simulator for RL in JAX

Python 25 3 Updated May 20, 2026

Minimal JAX implementation unifying Diffusion and Flow Matching algorithms as alternative strategies for transporting data distributions.

Python 66 3 Updated Dec 19, 2025

Instant Skinned Gaussian Avatars for Web, Mobile and VR Applications

JavaScript 409 26 Updated May 9, 2026

Official JAX implementation of MD4 Masked Diffusion Models

Python 160 17 Updated Feb 27, 2025

Clean single-file implementation of offline RL algorithms in JAX

Python 177 4 Updated Nov 24, 2025
Jupyter Notebook 11 1 Updated Aug 8, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 20,111 2,078 Updated Mar 27, 2026

[RLC 2025] Official code repository for "Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps"

Python 3 1 Updated Oct 20, 2025

Official implementation for "How Should We Meta-Learn Reinforcement Learning Algorithms?"

Python 23 1 Updated Sep 7, 2025

[TMLR 2025] Importance Weighting for Aligning Language Models under Deployment Distribution Shift

Python 5 Updated Jul 22, 2025

Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"

Python 8 Updated Jul 23, 2025