guijinSON

🤗 Benchmark Large Language Models Reliably On Your Data

HTML 402 36 Updated Oct 2, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,261 900 Updated Oct 10, 2025

Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"

12 Updated Mar 25, 2025

nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)

Python 121 6 Updated May 8, 2025

The most modern LLM evaluation toolkit

Python 70 9 Updated Sep 25, 2025

A hackable, simple, and research-friendly GRPO training framework with high-speed weight synchronization in a multinode environment.

Python 31 4 Updated Aug 27, 2025

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 527 40 Updated Oct 2, 2025

Official implementation for "MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models"

Jupyter Notebook 16 3 Updated Oct 26, 2024

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ

Python 438 133 Updated May 2, 2025

Performs benchmarking on two Korean datasets with minimal time and effort.

Jupyter Notebook 43 7 Updated Aug 14, 2025

🤏🏻 `investpy` but made tiny

Python 400 44 Updated Jan 8, 2023

An Open Source Toolkit For LLM Distillation

Python 734 95 Updated Jul 8, 2025

Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.

Jupyter Notebook 31 3 Updated Apr 22, 2025

DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)

Jupyter Notebook 76 12 Updated Oct 3, 2024

Evaluate your LLM's response with Prometheus and GPT4 💯

Python 1,001 64 Updated Apr 25, 2025

Codebase for Merging Language Models (ICML 2024)

Python 850 52 Updated May 5, 2024
Python 7 2 Updated Aug 16, 2024
Jupyter Notebook 2 2 Updated Mar 25, 2024

Corpus of Annual Reports in Japan

Python 94 7 Updated Dec 19, 2020

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Python 758 58 Updated Feb 1, 2024
Jupyter Notebook 3 Updated Jan 31, 2024

Tools for merging pretrained large language models.

Python 6,352 620 Updated Sep 17, 2025

Korean Port for teknium1/LLM-Logbook

HTML 6 Updated Oct 31, 2023

Japanese LLaMa experiment

C 54 2 Updated Dec 7, 2024

self-instruct unseen data eval in Korean

Python 6 1 Updated May 3, 2023

Robust recipes to align language models with human and AI preferences

Python 5,390 459 Updated Sep 8, 2025

Go ahead and axolotl questions

Python 10,589 1,166 Updated Oct 10, 2025

Korean Multi-task Instruction Tuning

Jupyter Notebook 158 21 Updated Dec 20, 2023

Structured Outputs

Python 12,675 640 Updated Oct 8, 2025