https://scholar.google.com/citations?user=j4EmuqkAAAAJ
Stars
TensorFlow code and pre-trained models for BERT
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional conversion, natural language processing
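Of the tasks listed above, word segmentation has the most classic textbook baseline: forward maximum matching. The sketch below is a minimal illustration of that baseline (the tiny dictionary is an invented assumption, not this toolkit's lexicon), and deliberately shows the greedy error it makes on 研究生命起源.

```python
# A minimal sketch of forward maximum matching (FMM) for Chinese word
# segmentation; the toy vocabulary below is assumed for illustration.
def fmm_segment(text, vocab, max_len=4):
    """Greedily take the longest dictionary word at each position;
    fall back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in vocab or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

vocab = {"研究", "研究生", "生命", "命", "起源"}
print(fmm_segment("研究生命起源", vocab))  # → ['研究生', '命', '起源']
```

The greedy match picks 研究生 ("graduate student") instead of the intended 研究 / 生命 split, which is exactly why modern toolkits use statistical or neural segmenters rather than pure dictionary matching.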
State-of-the-Art Text Embeddings
A collection of libraries to optimise AI model performance
Free English to Chinese Dictionary Database
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
🎉 Repo for LaWGPT, Chinese LLaMA tuned with Chinese legal knowledge: a large language model built on Chinese legal knowledge
Implementation / replication of DALL-E, OpenAI's text-to-image Transformer, in PyTorch
Keras implementation of Transformers for humans
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
4 bits quantization of LLaMA using GPTQ
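GPTQ quantizes a model layer by layer while compensating rounding error using second-order information; the sketch below shows only the much simpler round-to-nearest (RTN) 4-bit baseline that GPTQ improves on. All names here are illustrative and are not the repo's API.

```python
# Round-to-nearest 4-bit quantization sketch (a baseline, NOT GPTQ itself):
# map floats onto integers in [0, 15] with a per-tensor scale and zero point.
def quantize_rtn_4bit(weights):
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # guard against a constant tensor
    return [round((w - lo) / scale) for w in weights], scale, lo

def dequantize(q, scale, lo):
    return [v * scale + lo for v in q]

q, scale, lo = quantize_rtn_4bit([-1.0, -0.5, 0.0, 0.5, 1.0])
print(q)  # five integers, each in [0, 15]
```

RTN's weakness is that each weight is rounded independently; GPTQ instead chooses roundings that minimize the layer's output reconstruction error, which is what makes 4-bit LLaMA usable.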
An implementation of the EDA paper for Chinese corpora. EDA data augmentation tool for Chinese text; NLP data augmentation; paper reading notes.
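EDA (Easy Data Augmentation) consists of four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below shows the two operations that need no synonym dictionary, on a pre-tokenized sentence; function names and the example sentence are assumptions for illustration.

```python
import random

# A minimal sketch of two of the four EDA operations; synonym replacement
# and random insertion would additionally require a synonym dictionary.
def random_swap(tokens, n_swaps, rng):
    tokens = list(tokens)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(tokens)), rng.randrange(len(tokens))
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]  # never return an empty sentence

rng = random.Random(0)
print(random_swap(["我", "喜欢", "自然", "语言", "处理"], 1, rng))
```

Both operations preserve the label in most classification settings, which is the core assumption behind EDA-style augmentation.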
ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of million-scale semantic dictionaries for Chinese processing, including 340K abstract-sense entries, 340K antonym entries, and 430K synonym entries; supports sentence expansion, paraphrasing, event abstraction and generalization, and other applications.
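A synonym dictionary like this drives sentence expansion by substituting each token with every listed synonym and taking the cross product. The sketch below is a toy illustration; the two-entry dictionary is invented for the example, not taken from ChineseSemanticKB.

```python
from itertools import product

# Toy dictionary-driven sentence expansion: each token maps to its list of
# synonyms (including itself); variants are the cross product of all choices.
def expand(tokens, synonyms):
    choices = [synonyms.get(t, [t]) for t in tokens]
    return [list(combo) for combo in product(*choices)]

syn = {"高兴": ["高兴", "开心", "快乐"], "非常": ["非常", "十分"]}
variants = expand(["我", "非常", "高兴"], syn)
print(len(variants))  # 1 * 2 * 3 = 6 variants
```

The cross product grows multiplicatively, so real systems typically cap the number of substituted positions or sample from the variants instead of enumerating all of them.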
GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.
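The distinctive input layer of DSSM/CLSM is letter-trigram "word hashing": each term is broken into character trigrams with boundary markers before being fed to the network. The sketch below shows only that featurization plus a raw cosine between trigram counts (in the actual model the trigram vector passes through a DNN first); the function names are assumptions, not the repo's API.

```python
from collections import Counter
from math import sqrt

# Letter-trigram "word hashing" as used at the DSSM/CLSM input layer:
# "best" becomes #be, bes, est, st# (with '#' marking word boundaries).
def letter_trigrams(text):
    counts = Counter()
    for word in text.lower().split():
        marked = f"#{word}#"
        counts.update(marked[i:i + 3] for i in range(len(marked) - 2))
    return counts

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

q = letter_trigrams("best car")
d = letter_trigrams("best cars")
print(round(cosine(q, d), 3))
```

Word hashing keeps the vocabulary small (tens of thousands of trigrams instead of millions of words) and makes morphological variants like "car"/"cars" share most of their features, which is why the raw cosine above is already high.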
FaRL for Facial Representation Learning [Official, CVPR 2022]
[CVPR 2021] Multi-Modal-CelebA-HQ: A Large-Scale Text-Driven Face Generation and Understanding Dataset
Synthetic Faces High Quality (SFHQ) Dataset. 425,000 curated 1024x1024 synthetic face images
A high-performance PyTorch implementation of face detection models, including RetinaFace and DSFD
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks