Stars
An Efficient "Factory" to Build Multiple LoRA Adapters
[SIGIR'24] The official implementation code of MOELoRA.
Costrict - strict AI coder for enterprises, quality first, including AI Agent, AI CodeReview, AI Completion.
Implementation of some unbalanced losses, such as focal_loss, dice_loss, DSC Loss, GHM Loss, etc.
Minimal reproduction of DeepSeek R1-Zero
Fully open reproduction of DeepSeek-R1
Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models(NeurIPS 2024 Spotlight)
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Phase 2 of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
A framework for few-shot evaluation of language models.
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
An Open-sourced Knowledgeable Large Language Model Framework.
BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue LLM)
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
User-friendly LLaMA: Train or Run the model using PyTorch. Nothing else.
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction
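Chinese and English sensitive words, language detection, Chinese and foreign mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional/simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name collection, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …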
A simple Chinese event extraction model that jointly labels trigger words and entities while also determining entity roles.
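An experimental, demo-level tool for text information extraction (event-triple extraction), which can be a route to event chains and topic graphs. It extracts event triples based on dependency parsing and semantic role labeling, and can be used for text-understanding applications such as document topic chains and event timelines.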