Repositories starred by suhmily

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

Python 411 62 Updated Apr 12, 2024

Dermatology ddx dataset, Jax implementations of Monte Carlo conformal prediction, plausibility regions and statistical annotation aggregation from our recent work on uncertain ground truth (TMLR'23…

Python 677 48 Updated Mar 28, 2024

Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)

Python 221 24 Updated Feb 21, 2023

Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)

Python 123 12 Updated Sep 13, 2024

This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

365 34 Updated Nov 25, 2023

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Python 8,766 688 Updated Aug 13, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 7,071 616 Updated Jul 4, 2025

SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python 4,596 816 Updated Apr 1, 2026

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python 18,962 1,864 Updated Jul 15, 2025

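Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation on training, inference, evaluation, and deployment.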

Python 4,162 336 Updated Feb 21, 2026

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 839 67 Updated Mar 17, 2025

Finetune Llama-3-8b on the MathInstruct dataset

Python 116 26 Updated Oct 17, 2024

A family of compressed models obtained via pruning and knowledge distillation

375 19 Updated Nov 6, 2025

Jupyter Notebook 487 36 Updated Jul 22, 2024

[ACL 2024] The project of Symbol-LLM

Python 59 4 Updated Jul 10, 2024

PaL: Program-Aided Language Models (ICML 2023)

Python 518 66 Updated Jun 30, 2023

The Mix of Minimal Optimal Sets (MMOS) dataset offers two advantages for math reasoning: higher performance and lower construction costs.

Python 74 3 Updated Jul 27, 2024

ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

Python 1,114 80 Updated Feb 22, 2024

Train transformer language models with reinforcement learning.

Python 17,876 2,601 Updated Apr 1, 2026

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

3,280 223 Updated Mar 5, 2026

Data processing for pretraining, fine-tuning, and DPO of code LLMs: a state-of-the-art industry processing pipeline.

Python 53 12 Updated Jul 25, 2024

Lightweight and portable LLM sandbox runtime (code interpreter) Python library.

Python 995 94 Updated Mar 2, 2026

Parse LaTeX math expressions

Python 144 32 Updated Aug 5, 2024

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,557 188 Updated Mar 28, 2026

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 285 30 Updated May 26, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,394 775 Updated Mar 30, 2026

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 392 16 Updated Jan 19, 2025

Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This framework enables Claud…

Python 11,167 1,144 Updated Dec 12, 2024