gszfwsb
😈
Making alchemy

Starred repositories

Shaping capabilities with token-level pretraining data filtering

Python 65 3 Updated Jan 28, 2026

Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".

Python 114 2 Updated Aug 18, 2025
Python 12 Updated Sep 30, 2025
Python 220 10 Updated Oct 27, 2025

Code implementation of synthetic continued pretraining

Jupyter Notebook 146 16 Updated Jan 6, 2025

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,307 410 Updated Jan 19, 2026

Repository for the paper https://arxiv.org/abs/2504.13837

Python 325 19 Updated Dec 17, 2025

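An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT". It supports uploading any template image, uploading arbitrary materials with intelligent parsing, automatic PPT generation from a single sentence, an outline, or page descriptions, spoken edits to a specified region, and one-click export to an editable PPT.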

Python 11,567 1,336 Updated Feb 2, 2026
Python 55 4 Updated Jun 23, 2025

Code for VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction

Python 3 Updated Dec 10, 2025

Crowdfunded open-source project: use OpenReview's high-quality review data to fine-tune a professional LLM for writing reviews and review responses.

Python 209 12 Updated Apr 26, 2023

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 13,192 1,241 Updated Feb 3, 2026

Tools for merging pretrained large language models.

Python 6,761 662 Updated Jan 26, 2026
Python 44 5 Updated Jan 20, 2026

The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Jupyter Notebook 830 51 Updated Dec 20, 2025

Enjoy the magic of Diffusion models!

Python 11,705 1,125 Updated Feb 4, 2026

The official implementation of dLLM-Var

Python 31 Updated Nov 6, 2025

Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality

HTML 317 18 Updated Jan 5, 2026

Search Self-Play: Pushing the Frontier of Agent Capability without Supervision

Python 89 8 Updated Jan 6, 2026

Scaling Preference Data Curation via Human-AI Synergy

141 3 Updated Jul 3, 2025

Code repository for Group-MATES: Group-Level Data Selection for Efficient Pretraining

Python 10 2 Updated Jun 14, 2025
Python 1,772 78 Updated Dec 16, 2025

Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning

Python 3 Updated Oct 27, 2025

This is the official implementation for Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1.

HTML 155 13 Updated Oct 27, 2025

ERGO (Efficient Reasoning & Guided Observation) is a large vision–language model trained with reinforcement learning on efficiency objectives.

Python 12 1 Updated Feb 3, 2026