Stars
A curated list of product management advice for technical people.
Kepler.gl is a powerful open source geospatial analysis tool for large-scale data sets.
Documentation for the General Bikeshare Feed Specification, a standardized data feed for shared mobility system availability. Maintained by MobilityData.
A data specification to enable right-of-way regulation, digital policy, geofencing, and two-way communication between mobility companies and public agencies worldwide for any regulated, shared vehi…
A sample online store built with Rails. Video of progress: https://goo.gl/NYGrTq
Source code for the Kafka Streams in Action Book
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Transform the DOM by selecting elements and joining to data.
Natural Language Processing Best Practices & Examples
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
🎓 Path to a free self-taught education in Computer Science!
Unsupervised text tokenizer for Neural Network-based text generation.
Data ingestion library for Amundsen to build graph and search index
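Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing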
Synthetic Patient Population Simulator
System design interview for IT companies
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Numeric fused-head identification and resolution
BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences
Definition and DDLs for the OMOP Common Data Model (CDM)
Super easy library for BERT-based NLP models