Giantshaco

Septend Giantshaco

0 followers · 2 following

Lists (1)

私藏 (Private Collection)

1 repository

Stars

16 results for source starred repositories written in Python

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,196 7,527 Updated Nov 10, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,295 2,467 Updated Nov 10, 2025

datajuicer / data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,493 288 Updated Nov 10, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 2,435 246 Updated Nov 7, 2025

showlab / Show-o

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,768 76 Updated Oct 22, 2025

segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 1,187 75 Updated Oct 8, 2025

qiufengqijun / mini_qwen

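A project for training a large language model from scratch, covering pre-training, fine-tuning, and direct preference optimization. The model has 1B parameters and supports Chinese and English.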

Python 669 90 Updated Feb 18, 2025

allenai / reward-bench

RewardBench: the first evaluation tool for reward models.

Python 650 89 Updated Jun 12, 2025

tensorgi / TPA

[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)

Python 425 36 Updated Oct 23, 2025

pykt-team / pykt-toolkit

pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models

Python 317 100 Updated Sep 18, 2025

sii-research / siiRL

siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems

Python 224 20 Updated Nov 10, 2025

Hank0626 / TimeBridge

Official implementation of "TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting" (ICML 2025)

Python 179 9 Updated May 16, 2025

hhyqhh / LAEA

Python 14 4 Updated Jun 18, 2024

UncertaintyForKnowledgeTracing / UKT

Python 13 3 Updated Feb 24, 2025

DoniMoon / LLMKT

EDM 2025, Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information

Python 6 2 Updated Oct 1, 2024

Greg-Tarr / tpa_pytorch

Simple (slightly optimized) implementation of Tensor Product Attention from the T6 paper with a KV cache

Python 4 Updated Jan 23, 2025
