Repositories starred by suhmily

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

Python 399 55 Updated Apr 12, 2024

Dermatology ddx dataset, Jax implementations of Monte Carlo conformal prediction, plausibility regions and statistical annotation aggregation from our recent work on uncertain ground truth (TMLR'23…

Python 679 49 Updated Mar 28, 2024

Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)

Python 217 23 Updated Feb 21, 2023

Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)

Python 116 11 Updated Sep 13, 2024

This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

366 34 Updated Nov 25, 2023

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Python 8,717 681 Updated Aug 13, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,971 602 Updated Jul 4, 2025

SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python 3,995 719 Updated Dec 18, 2025

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Python 18,969 1,872 Updated Jul 15, 2025

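Repository of Chinese post-trained versions of Llama3 and Llama3.1: fine-tuned and modified weights of interest, plus tutorial videos and documentation for training, inference, evaluation, and deployment.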

Python 4,159 337 Updated May 7, 2025

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 807 69 Updated Mar 17, 2025

Finetune Llama-3-8b on the MathInstruct dataset

Python 115 26 Updated Oct 17, 2024

A family of compressed models obtained via pruning and knowledge distillation

361 18 Updated Nov 6, 2025

Jupyter Notebook 477 34 Updated Jul 22, 2024

[ACL 2024] The project of Symbol-LLM

Python 61 4 Updated Jul 10, 2024

PaL: Program-Aided Language Models (ICML 2023)

Python 518 64 Updated Jun 30, 2023

The Mix of Minimal Optimal Sets (MMOS) dataset offers two advantages for math reasoning: higher performance and lower construction cost.

Python 74 3 Updated Jul 27, 2024

ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

Python 1,108 77 Updated Feb 22, 2024

Train transformer language models with reinforcement learning.

Python 16,737 2,372 Updated Dec 22, 2025

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

3,144 213 Updated Dec 9, 2025

Data processing for code LLM pretraining, fine-tuning, and DPO: a state-of-the-art industry pipeline.

Python 48 11 Updated Jul 25, 2024

Lightweight and portable LLM sandbox runtime (code interpreter) Python library.

Python 723 63 Updated Dec 11, 2025

Parse LaTeX math expressions

Python 141 31 Updated Aug 5, 2024

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,487 182 Updated Dec 19, 2025

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 282 30 Updated May 26, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,203 747 Updated Dec 12, 2025

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 388 16 Updated Jan 19, 2025

Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This framework enables Claud…

Python 11,141 1,160 Updated Dec 12, 2024