- The University of Tokyo
- Tokyo, Japan
- https://nissymori.github.io/
- @nissymori1
Stars
One unified CLI for headless coding agent execution 🤖
[ICML 2026] CapBencher toolkit: Give your LLM benchmark a built-in alarm for leakage and gaming
[English/Japanese] A curated list of awesome online-prediction papers, libraries, and resources. Created and hosted by MIRU2025 Young Researchers Program group 5.
[ICML 2026] Official JAX code for Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying
A Python tool that automatically cleans, completes, and standardizes BibTeX entries using LLMs and web search.
Transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper.
MCP server that uses arxiv-to-prompt to fetch and process arXiv LaTeX sources for precise interpretation of mathematical expressions in scientific papers.
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
AI agents running research on single-GPU nanochat training automatically
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
Implementation for our paper "Gradient Regularization prevents Reward Hacking in RLHF and RLVR". Implemented in TRL, for Hugging Face Transformers.
A fast and soft pattern search for trillion-scale corpora.
https://mahjongfont.pages.dev - Japanese Mahjong (Riichi Mahjong) tile font with OpenType features
High-Performance Research Environment for Riichi Mahjong
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
A GPU-Accelerated Mahjong Simulator for RL in JAX
Minimal JAX implementation unifying Diffusion and Flow Matching algorithms as alternative strategies for transporting data distributions.
Instant Skinned Gaussian Avatars for Web, Mobile and VR Applications
Official JAX Implementation of MD4 Masked Diffusion Models
Clean single-file implementation of offline RL algorithms in JAX
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
[RLC 2025] Official code repository for "Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps"
Official implementation for "How Should We Meta-Learn Reinforcement Learning Algorithms?"
[TMLR 2025] Importance Weighting for Aligning Language Models under Deployment Distribution Shift
Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"