Stars
🔬 A curated collection of 23,000+ agent skills for empirical research across 8 social science disciplines. CoPaper.AI completes a reproducible, standards-compliant empirical paper in 20 minutes and supports user-uploaded Skills. -- …
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
A framework to learn cross-lingual word embedding mappings
A modular graph-based Retrieval-Augmented Generation (RAG) system
"RAG-Anything: All-in-One RAG Framework"
LAION research paper dataset visual explorer 🔬 🧑🔬 👩🔬
(WSDM 2024) Official implementation of the paper "ONCE: Boosting Content-based Recommendation with Both Open- and Closed-source Large Language Models"
Single Cell Multi-Omics Quality Control Toolkit
Implementation of the first paper on word2vec
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
Accurate, efficient Earth Mover's Distance for Python (and MATLAB).
Label Studio is a multi-type data labeling and annotation tool with standardized output format
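DoTAT is a web-based, domain-oriented, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization, as well as iterative annotation, nested entity annotation, and nested event annotation. Annotation schemas are customizable and, for tasks of the same type, can be "created once and reused many times". Hierarchical entity sets expand the range of entity types, and a new, efficient annotation method improves user experience and annotation efficiency. In addition, the tool adds a review step,…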
https://sites.google.com/site/multidimensionaltagger
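A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.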
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Hash-based image similarity and text similarity.
DuckDB is an analytical in-process SQL database management system
Dynamic Word Embeddings for Evolving Semantic Discovery code.
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting has…
PyTorch implementations of a large number of classical backbone CNNs, data augmentation, torch losses, attention, visualization, and some common algorithms.
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Repository containing notebooks of my posts on Medium
Transformer-based model for time series prediction
Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008