
MTEB: Massive Text Embedding Benchmark

Python 3,042 526 Updated Dec 25, 2025

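This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and deploying LLM applications in production).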

HTML 22,489 2,631 Updated Dec 24, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 81,666 12,217 Updated Dec 21, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,201 1,290 Updated May 23, 2024

No fortress, purely open ground. OpenManus is Coming.

Python 51,451 8,978 Updated Nov 17, 2025

code for piccolo embedding model from SenseTime

Python 144 6 Updated May 21, 2024

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Python 1,792 182 Updated Jun 24, 2025

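YAYI information extraction LLM: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the algorithm team at Wenge Technology (中科闻歌). (Repo for YAYI Unified Information Extraction Model)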

314 14 Updated Aug 8, 2024

CoreNet: A library for training deep neural networks

Jupyter Notebook 7,023 546 Updated Oct 9, 2025

[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset

Python 105 9 Updated Jan 24, 2024

Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguation)" (NAACL 2022).

Jupyter Notebook 45 6 Updated Jan 30, 2024

A new dataset HarveyNER with fine-grained locations annotated in tweets with strong baseline models using Curriculum Learning.

Python 6 1 Updated Nov 8, 2022

The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016)

Jupyter Notebook 68 6 Updated May 12, 2022

Guideline following Large Language Model for Information Extraction

Python 421 27 Updated Oct 27, 2024

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

1,035 61 Updated Nov 18, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 3,108 284 Updated Jun 4, 2024
Python 91 17 Updated Aug 3, 2021

Large language model training in three stages, plus deployment

Python 49 12 Updated Aug 14, 2023

This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that auton…

Python 344 43 Updated Jul 17, 2025

[ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In".

Python 60 5 Updated Jul 12, 2024

Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)

Python 656 35 Updated Aug 17, 2024
Python 147 3 Updated Jul 1, 2024

Official inference library for Mistral models

Jupyter Notebook 10,606 1,002 Updated Nov 21, 2025

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python 579 33 Updated Dec 9, 2024

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Python 979 102 Updated May 28, 2024

Implementation of Chinese ChatGPT

Python 289 35 Updated Nov 20, 2023

ChatGLM3 series: Open Bilingual Chat LLMs

Python 13,746 1,607 Updated Jan 13, 2025

Full-parameter fine-tuning of ChatGLM2-6B, with efficient fine-tuning that supports multi-turn dialogue.

Python 401 41 Updated Aug 17, 2023

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python 8,285 963 Updated Feb 25, 2022