Scripts and doc for https://www.dolthub.com/repositories/chenditc/investment_data

Python 1,198 166 Updated May 17, 2026

An efficient prompt optimization method that uses a zeroth-order method to optimize prompts for black-box LLMs.

Python 9 Updated Oct 21, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 16,208 1,527 Updated May 9, 2026

🦉 Data Versioning and ML Experiments

Python 15,607 1,296 Updated Apr 28, 2026

Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.

Jupyter Notebook 689 63 Updated Mar 22, 2025

My learning notes for ML SYS.

Python 6,319 417 Updated Apr 23, 2026
Python 219 9 Updated Feb 20, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,650 1,034 Updated Apr 30, 2026

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,673 132 Updated Nov 21, 2025

Fully open data curation for reasoning models

Python 2,259 187 Updated Dec 2, 2025

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,521 944 Updated May 15, 2026

Code for BLT research paper

Python 2,041 192 Updated Nov 3, 2025

Token Omission Via Attention

Python 128 6 Updated Oct 13, 2024

A financial agent for investment research

TypeScript 1,967 396 Updated Aug 19, 2025

Pretraining and inference code for a large-scale depth-recurrent language model

Python 885 79 Updated Dec 29, 2025
Python 337 18 Updated May 31, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 250 9 Updated Apr 15, 2025

s1: Simple test-time scaling

Python 6,653 761 Updated Jun 25, 2025

Fully open reproduction of DeepSeek-R1

Python 26,017 2,418 Updated Apr 2, 2026

Minimal reproduction of DeepSeek R1-Zero

Python 13,099 1,587 Updated Feb 27, 2026

Reproduce R1 Zero on Logic Puzzle

Python 2,450 165 Updated Mar 20, 2025

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 6,416 371 Updated May 18, 2026
9 Updated Jul 8, 2024

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,480 750 Updated Jun 7, 2025

Code implementation of synthetic continued pretraining

Jupyter Notebook 159 19 Updated Jan 6, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 27,934 5,958 Updated May 18, 2026

Free, simple, fast interactive diagrams for any GitHub repository

TypeScript 15,609 1,197 Updated May 14, 2026

O1 Replication Journey

2,000 61 Updated Jan 14, 2025