nzw0301

Kento Nozawa nzw0301

193 followers · 132 following

Preferred Networks, Inc.
Japan
UTC +09:00
nzw0301.github.io

Achievements

x4 x3 x3 x2

Organizations

Lists (4)

Stars

jingxuanf0214 / rm-scaling

Python 3 Updated Feb 10, 2026

databricks / flashoptim

Python 230 9 Updated Mar 9, 2026

ZhaolinGao / A-PO

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Python 40 1 Updated May 30, 2025

pfnet-research / jfbench

Python 12 1 Updated Mar 12, 2026

google-research / kauldron

Modular, scalable library to train ML models

Python 229 26 Updated Mar 30, 2026

Dao-AILab / sonic-moe

Accelerating MoE with IO and Tile-aware Optimizations

Python 614 67 Updated Mar 27, 2026

meta-pytorch / MSLK

MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI training and inference, such as FP8 row-wise quantization and …

Python 88 35 Updated Mar 30, 2026

Zhiyuan-Zeng / RLVE

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 193 20 Updated Jan 12, 2026

PrimeIntellect-ai / prime-rl

Async RL Training at Scale

Python 1,229 244 Updated Mar 30, 2026

huggingface / kernels-community

Kernel sources for https://huggingface.co/kernels-community

C++ 84 27 Updated Mar 30, 2026

ServiceNow / PipelineRL

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 385 41 Updated Mar 25, 2026

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,330 3,536 Updated Mar 30, 2026

QwenLM / AutoIF

Python 328 32 Updated Jul 25, 2024

digital-go-jp / lawqa_jp

265 6 Updated Feb 13, 2026

facebookresearch / RAM

A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).

Python 349 40 Updated Mar 30, 2026

deepseek-ai / DeepSeek-V3.2-Exp

Python 1,538 150 Updated Nov 18, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 17,848 2,595 Updated Mar 30, 2026

reka-ai / research-eval

A benchmark to evaluate search-augmented LLMs

Python 17 2 Updated Aug 28, 2025

NVIDIA-NeMo / Curator

Scalable data preprocessing and curation toolkit for LLMs

Python 1,495 247 Updated Mar 30, 2026

ke1337 / IFBench

Forked from allenai/IFBench

Python 1 Updated Oct 15, 2025

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 9,964 791 Updated Mar 30, 2026

meta-llama / synthetic-data-kit

Tool for generating high-quality synthetic datasets

Python 1,546 217 Updated Oct 28, 2025

google-research / metricx

Python 135 29 Updated Jan 22, 2026

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,948 2,061 Updated Mar 27, 2026

character-ai / pipelining-sft

Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 trainings

Python 117 18 Updated Jul 27, 2025

ByteDance-Seed / Seed-X-7B

Python 167 7 Updated Aug 18, 2025

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 41,401 2,650 Updated Mar 30, 2026

allenai / IFBench

Python 117 19 Updated Jan 4, 2026

JohannesAck / OffPolicyCorrectedRewardModeling

Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"

Python 8 Updated Jul 23, 2025

actions / labeler

An action for automatically labelling pull requests

TypeScript 2,416 480 Updated Mar 27, 2026

Lists (4)

datasets

resources

self-sup

tools