nzw0301

Kento Nozawa nzw0301

193 followers · 132 following

Preferred Networks, Inc.
Japan
nzw0301.github.io

Lists (4)

Stars

jingxuanf0214 / rm-scaling

Python 2 Updated Feb 10, 2026

databricks / flashoptim

Python 222 9 Updated Mar 9, 2026

ZhaolinGao / A-PO

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Python 40 1 Updated May 30, 2025

pfnet-research / jfbench

Python 12 1 Updated Mar 12, 2026

google-research / kauldron

Modular, scalable library to train ML models

Python 227 24 Updated Mar 20, 2026

Dao-AILab / sonic-moe

Accelerating MoE with IO and Tile-aware Optimizations

Python 612 65 Updated Mar 17, 2026

meta-pytorch / MSLK

MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI training and inference, such as FP8 row-wise quantization and …

Python 87 33 Updated Mar 21, 2026

Zhiyuan-Zeng / RLVE

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 191 20 Updated Jan 12, 2026

PrimeIntellect-ai / prime-rl

Async RL Training at Scale

Python 1,166 234 Updated Mar 22, 2026

huggingface / kernels-community

Kernel sources for https://huggingface.co/kernels-community

C++ 81 26 Updated Mar 20, 2026

ServiceNow / PipelineRL

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 382 39 Updated Mar 20, 2026

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,097 3,476 Updated Mar 21, 2026

QwenLM / AutoIF

Python 327 32 Updated Jul 25, 2024

digital-go-jp / lawqa_jp

265 6 Updated Feb 13, 2026

facebookresearch / RAM

A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).

Python 344 40 Updated Mar 21, 2026

deepseek-ai / DeepSeek-V3.2-Exp

Python 1,515 150 Updated Nov 18, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 17,742 2,576 Updated Mar 21, 2026

reka-ai / research-eval

A benchmark to evaluate search-augmented LLMs

Python 17 2 Updated Aug 28, 2025

NVIDIA-NeMo / Curator

Scalable data pre-processing and curation toolkit for LLMs

Python 1,464 239 Updated Mar 22, 2026

ke1337 / IFBench

Forked from allenai/IFBench

Python 1 Updated Oct 15, 2025

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 9,891 779 Updated Mar 11, 2026

meta-llama / synthetic-data-kit

Tool for generating high quality Synthetic datasets

Python 1,540 215 Updated Oct 28, 2025

google-research / metricx

Python 134 27 Updated Jan 22, 2026

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,930 2,064 Updated Jan 13, 2026

character-ai / pipelining-sft

Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 trainings

Python 117 18 Updated Jul 27, 2025

ByteDance-Seed / Seed-X-7B

Python 167 7 Updated Aug 18, 2025

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 41,185 2,612 Updated Mar 22, 2026

allenai / IFBench

Python 115 19 Updated Jan 4, 2026

JohannesAck / OffPolicyCorrectedRewardModeling

Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"

Python 8 Updated Jul 23, 2025

actions / labeler

An action for automatically labelling pull requests

TypeScript 2,407 478 Updated Mar 20, 2026

Lists (4)

datasets

resources

self-sup

tools
