Stars
The Dataset and Official Implementation for <The ELCo Dataset: Bridging Emoji and Lexical Composition> @ LREC-COLING 2024
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
This project collects awesome resources (e.g., papers, open-source models) for large language models (LLMs)
Policies of scientific publishers and conferences towards large language models (LLMs), such as ChatGPT
A curated list of modern Generative Artificial Intelligence projects and services
This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs
Task management for the Obsidian knowledge base.
A curated collection of resources on scholarly data analysis ranging from datasets, papers, and code about bibliometrics, citation analysis, and other scholarly commons resources.
Mining and Analyzing Questions from Research Paper Titles
An Open-Source Framework for Prompt-Learning.
Neuralized version of the Reference String Parser component of the ParsCit package.
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
🚀 State-of-the-art parsers for natural language.
😎 A curated list of Question Answering (QA) resources
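DoTAT is a web-based, domain-oriented, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization. It also supports iterative annotation, nested entity annotation, and nested event annotation. Annotation schemas are customizable and, within the same task type, can be "created once and reused many times". Hierarchical entity sets expand the range of entity types, and a newly designed, efficient annotation method improves user experience and annotation efficiency. In addition, the tool adds a review step, …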
Data augmentation for NLP, presented at EMNLP 2019
📖 A collection of pure bash alternatives to external processes.
An open-source NLP research library, built on PyTorch.
HIT-SCIR / ELMoForManyLangs
Forked from bozheng-hit/ELMo
Pre-trained ELMo Representations for Many Languages
A novel method for first story detection on Twitter data. Includes a sample dataset.
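The most comprehensive database of Chinese poetry 🧶 Nearly 14,000 poets from the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 1,564 ci poets of the Northern and Southern Song periods and 21,050 ci.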