CodingCat

Nan Zhu CodingCat

411 followers · 27 following

Pinterest
Seattle
http://codingcat.me/

Achievements

x3 x4

Highlights

Developer Program Member
Pro

Stars

CodingCat / lerobot-explorer

Jupyter Notebook 1 Updated Jan 16, 2026

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 10,026 1,061 Updated May 7, 2026

oap-project / gazelle_plugin

Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

Scala 255 73 Updated Feb 21, 2023

rust-unofficial / awesome-rust

A curated list of Rust code and resources.

Rust 58,130 3,456 Updated Jul 2, 2026

treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data

Go 5,425 458 Updated Jun 29, 2026

OpenLineage / OpenLineage

An Open Standard for lineage metadata collection

Java 2,523 483 Updated Jul 2, 2026

uber / h3

Hexagonal hierarchical geospatial indexing system

C 6,360 590 Updated Jul 1, 2026

uber / RemoteShuffleService

Remote shuffle service for Apache Spark to store shuffle data on remote servers.

Java 335 100 Updated Sep 29, 2023

microsoft / hyperspace

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

Scala 430 117 Updated Jan 14, 2022

databricks / koalas

Koalas: pandas API on Apache Spark

Python 3,374 369 Updated Mar 20, 2024

Netflix / genie

Distributed Big Data Orchestration Service

Java 1,763 373 Updated Jun 11, 2026

treeverse / dvc

🦉 Data Versioning and ML Experiments

Python 15,716 1,306 Updated Jun 29, 2026

apache / iceberg

Apache Iceberg

Java 9,003 3,369 Updated Jul 3, 2026

opencypher / morpheus

Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.

Scala 349 65 Updated Jan 14, 2026

intel / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…

Python 8,857 1,429 Updated Jan 28, 2026

intel / BigDL

BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray

Jupyter Notebook 2,699 730 Updated Jun 12, 2026

linkedin / Avro2TF

Avro2TF is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks.

Scala 129 21 Updated May 9, 2020

VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive lea…

C++ 8,684 1,925 Updated May 8, 2026

xgboost-ai / xgboost-ai.github.io

xgboost website

Ruby 21 8 Updated Aug 6, 2025

chipsalliance / chisel

Chisel: A Modern Hardware Design Language

Scala 4,704 649 Updated Jul 2, 2026

aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

Jupyter Notebook 10,965 6,971 Updated Jul 2, 2026

rapidsai / cudf

cuDF - GPU DataFrame Library

C++ 9,694 1,074 Updated Jul 3, 2026

hibayesian / awesome-automl-papers

A curated list of automated machine learning papers, articles, tutorials, slides and projects

4,148 681 Updated Jun 11, 2024

twosigma / flint

A Time Series Library for Apache Spark

Scala 1,165 199 Updated Jul 3, 2020

apache / pinot

Apache Pinot - A realtime distributed OLAP datastore

Java 6,104 1,484 Updated Jul 2, 2026

alteryx / featuretools

An open source python library for automated feature engineering

Python 7,659 914 Updated Jun 17, 2026

shap / shap

A game theoretic approach to explain the output of any machine learning model.

Jupyter Notebook 25,580 3,730 Updated Jul 1, 2026

mlflow / mlflow

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while control…

Python 26,830 5,939 Updated Jul 3, 2026

Netflix / iceberg

Iceberg is a table format for large, slow-moving tabular data

Java 494 63 Updated Apr 10, 2023

dremio / gandiva

Vectorized processing for Apache Arrow

483 60 Updated Feb 14, 2022
