  • San Francisco


A curated list of product management advice for technical people.

4,253 833 Updated Jul 1, 2024

Kepler.gl is a powerful open source geospatial analysis tool for large-scale data sets.

TypeScript 11,688 1,924 Updated Mar 26, 2026

Documentation for the General Bikeshare Feed Specification, a standardized data feed for shared mobility system availability. Maintained by MobilityData

890 303 Updated Mar 17, 2026

A data specification to enable right-of-way regulation, digital policy, geofencing, and two-way communication between mobility companies and public agencies worldwide for any regulated, shared vehi…

727 248 Updated Mar 19, 2026

A sample online store built with Rails. Progress video: https://goo.gl/NYGrTq

Ruby 2 Updated Aug 19, 2016

Source code for the Kafka Streams in Action Book

Java 268 179 Updated Jul 11, 2021

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 158,450 32,623 Updated Mar 26, 2026

Human-readable reference marks for scales.

JavaScript 208 107 Updated Oct 6, 2023

Transform the DOM by selecting elements and joining to data.

JavaScript 569 287 Updated Jan 3, 2025

Natural Language Processing Best Practices & Examples

Python 6,444 912 Updated Aug 30, 2022

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Jupyter Notebook 5,305 1,425 Updated Jun 12, 2024

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,497 792 Updated Jan 14, 2026

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Python 7,754 942 Updated Mar 26, 2026

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++ 26,752 4,101 Updated Jun 19, 2025

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

Go 9,808 232 Updated Mar 23, 2026

🎓 Path to a free self-taught education in Computer Science!

HTML 202,748 25,172 Updated Mar 26, 2026

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,719 1,332 Updated Mar 26, 2026

Data ingestion library for Amundsen to build graph and search index

Python 204 207 Updated Mar 13, 2024

Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification

Python 36,225 10,907 Updated Nov 15, 2025

SparkOnHBase

Scala 278 174 Updated Mar 30, 2021

Synthetic Patient Population Simulator

Java 3,055 845 Updated Mar 19, 2026

NLP, before and after spaCy

Python 2,237 249 Updated Sep 22, 2023

System design interview for IT companies

23,037 5,205 Updated Apr 3, 2023

Apache Flink

Java 25,898 13,904 Updated Mar 26, 2026

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 340,313 55,048 Updated Mar 20, 2026

numeric fused-head identification and resolution

Jupyter Notebook 33 3 Updated Oct 16, 2019

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

Jupyter Notebook 611 99 Updated Aug 15, 2023

A BERT model for scientific text.

Python 1,677 233 Updated Feb 22, 2022

Definition and DDLs for the OMOP Common Data Model (CDM)

HTML 1,029 491 Updated Nov 5, 2025

Super easy library for BERT based NLP models

Python 1,920 341 Updated Aug 19, 2024