CuSO4-Chen
🎯
Focusing
  • SYSU
  • Shenzhen, Guangdong

Starred repositories

The official code of ARPO & AEPO

Python 831 38 Updated Dec 20, 2025

The official implementation of "TreeRPO: Tree Relative Policy Optimization"

Jupyter Notebook 5 1 Updated Nov 3, 2025

[AAAI'26, Oral] Code for "Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning"

Python 42 Updated Jul 16, 2025

[ACL'25] Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering"

Python 20 Updated Jul 23, 2025

[EMNLP'25, SAC Highlights Award] Code for "GATEAU: Selecting Influential Samples for Long Context Alignment"

Python 40 Updated Jun 4, 2025

[EMNLP 2025] Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation

3 Updated Nov 30, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,058 75 Updated Nov 25, 2025

Democratizing Reinforcement Learning for LLMs

Python 4,897 468 Updated Dec 21, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,309 117 Updated Dec 11, 2025

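🧑‍🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).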

7,028 682 Updated Dec 18, 2025

Train your Agent model via our easy and efficient framework

Python 1,668 156 Updated Dec 5, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,731 2,877 Updated Dec 23, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,647 839 Updated Dec 18, 2025

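EasyOffer (a collection of LLM interview experiences) is a guide to landing LLM summer internship offers, written for newcomers to the field. It mainly records hand-written coding questions commonly asked by major tech companies, interview experiences, and common open-ended interview questions, for both summer internship and autumn recruitment preparation. I am a beginner and still learning, so corrections are welcome at any time. I hope everyone lands the offer they want!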

Jupyter Notebook 602 46 Updated Mar 25, 2025

[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Python 68 6 Updated Nov 25, 2024

The most complete repository of AI algorithm interview questions: 1,000 questions across 25 categories

1,313 114 Updated Aug 13, 2023

[ACL 2024] Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training

Python 40 3 Updated Oct 28, 2024

Dive into LLMs (动手学大模型): a series of hands-on programming tutorials

Jupyter Notebook 11,241 1,251 Updated Oct 10, 2025

Mastering Transformers, published by Packt

Jupyter Notebook 358 150 Updated Dec 15, 2025

[EMNLP 2024] To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models

Python 47 1 Updated Jan 23, 2025

Python 27 3 Updated Oct 28, 2024

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Python 532 66 Updated Jan 17, 2025

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Python 413 61 Updated Apr 13, 2025

Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"

1,068 54 Updated Sep 27, 2025

A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…

HTML 1,723 87 Updated Dec 19, 2025

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"

Python 144 7 Updated Mar 26, 2024

Official implementation of our paper "Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation". A model merge method for deficiency unlearning, compi…

Python 11 2 Updated Sep 20, 2024

Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"

Python 69 10 Updated Feb 27, 2024