- Facebook; Université de Montréal, Harbin Institute of Technology
- Menlo Park, California, US
- hantek.github.io
Stars
Retrieval and Retrieval-augmented LLMs
[TMLR'24] A commonsense reasoning dataset on the physical affordances of objects.
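LLM API management and distribution system supporting mainstream models including OpenAI, Azure, Anthropic Claude, Google Gemini, DeepSeek, ByteDance Doubao, ChatGLM, ERNIE Bot (Wenxin Yiyan), iFLYTEK Spark, Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. Provides a unified API adapter and can be used for key management and redistribution. Ships as a single executable with a Docker image for one-click deployment, ready to use out of the box.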
Supercharge Your LLM Application Evaluations 🚀
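A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework covering subtitle region detection and subtitle content extraction.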
A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network
Code and datasets for paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" in WSDM-2024
A library for efficient similarity search and clustering of dense vectors.
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
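A visual no-code web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler application that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.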
Code for the ProteinMPNN paper
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
A tool for extracting plain text from Wikipedia dumps
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
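MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, targeting the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, including even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.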
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Making large AI models cheaper, faster and more accessible
An update of the 2008 edition of the "Shanghai Jiao Tong University Survival Manual"; the GitBook is published at https://survivesjtu.gitbook.io/survivesjtumanual/
Collections of resources from Joint Laboratory of HIT and iFLYTEK Research (HFL)
Collection of works from VIPL-AVSU
Code for prefix beam search tutorial by @labodk
A comprehensive mapping database of English to Chinese technical vocabulary in the artificial intelligence domain