Mrhs121

huangsheng Mrhs121

Core contributor to Apache Kylin. Focused on OLAP engines and big data.

86 followers · 522 following

Ruijie Networks
Shanghai
http://hslovelal.top:8080

Achievements

x2 x3

Starred repositories

XiaomiMiMo / MiMo

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,788 74 Updated Jun 5, 2025

nurion-ai / nurion

Python 2 Updated Dec 17, 2025

risingwavelabs / risingwave

Streaming data platform. Real-time stream processing, low-latency serving, and Iceberg table management.

Rust 8,615 716 Updated Dec 19, 2025

ginobefun / agentic-design-patterns-cn

Chinese translation of "Agentic Design Patterns"

Python 5,568 653 Updated Dec 4, 2025

NVIDIA / cutile-python

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,625 83 Updated Dec 19, 2025

lance-format / lance

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…

Rust 5,837 501 Updated Dec 19, 2025

pathwaycom / pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

Python 50,694 1,459 Updated Dec 19, 2025

jlfsdtc / Kylin-MCP

kylin-mcp

Python 1 Updated Oct 22, 2025

FalkorDB / FalkorDB

A super fast graph database that uses GraphBLAS under the hood for its sparse adjacency matrix graph representation. Our goal is to provide the best Knowledge Graph for LLMs (GraphRAG).

C 2,580 196 Updated Dec 19, 2025

FalkorDB / GraphRAG-SDK

Build fast and accurate GenAI apps with GraphRAG SDK at scale.

Python 529 66 Updated Dec 14, 2025

FireFramework / fire

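Fire is a big data framework developed in-house and open-sourced by ZTO Big Data, built specifically for developing Spark and Flink jobs; it can cut code volume by more than 70%. It pioneered annotation-based development of Spark and Flink jobs and offers many platform features, including real-time lineage, root-cause diagnosis, dynamic tuning, and hot parameter adjustment. Inside ZTO, the Fire framework handles a daily data volume of up to several hundred billion, and it is already used by dozens of external companies.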

Java 44 17 Updated Jul 11, 2024

happyfish100 / fastdfs

FastDFS is a high performance distributed file system (DFS). Its major functions include file storing, file syncing, and file accessing, and it is designed for high capacity and load balancing. Wechat/Weixi…

C 9,222 2,003 Updated Dec 16, 2025

kubeflow / mcp-apache-spark-history-server

MCP Server for Apache Spark History Server. The bridge between Agentic AI and Apache Spark.

Python 115 37 Updated Dec 10, 2025

Snowflake-Labs / pg_lake

pg_lake: Postgres with Iceberg and data lake access

C 1,327 62 Updated Dec 19, 2025

zyclove / sql-parser-lineage

SQL lineage parsing (Hive SQL, Spark SQL, StarRocks SQL, Doris SQL)

Java 26 8 Updated Feb 20, 2023

sqlparser / java_data_lineage

Analyze SQL and stored procedure data lineage using Java

Java 23 11 Updated Nov 2, 2024

melin / sqlflow

Parses column-level data lineage from SQL

Java 94 41 Updated Apr 17, 2025

starlake-ai / jsqltranspiler

Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data types and format characters) using Java.

Java 64 7 Updated Dec 16, 2025

Mrhs121 / RaidOverYunPan

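An open-source distributed cloud-drive file system that uses RAID-like techniques to spread files across multiple consumer-grade cloud drives for maximum download acceleration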

Go 4 Updated Dec 4, 2025

apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

Scala 2,283 970 Updated Dec 18, 2025

JupiterMouse / data-lineage-parent

Parses SQL to extract column-level and table-level lineage, converts it into a lineage model, and presents it in the Neo4j graph database.

Java 183 84 Updated Nov 17, 2020

melin / superior-sql-parser

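ANTLR4-based SQL parsers for multiple databases that extract metadata from SQL and can be used in many data platform scenarios: extracting metadata from DDL statements, SQL permission checks, table-level lineage, SQL syntax validation, and more. Supports Spark, Flink, Gauss, StarRocks, Oracle, MySQL, PostgreSQL, SQL Server, Db2, and others.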

ANTLR 393 140 Updated Dec 19, 2025

vesoft-inc / nebula-algorithm

Nebula-Algorithm is a Spark Application based on GraphX, which enables state of art Graph Algorithms to run on top of NebulaGraph and write back results to NebulaGraph.

Scala 76 41 Updated Aug 19, 2024

wey-gu / nebula-up

One-liner NebulaGraph playground with allllllllll-in-one toolchain integrated on single Linux Server

Shell 69 17 Updated May 31, 2025

wey-gu / nebula-shareholding-example

A dataset generator/graph modeling demo of Shareholding Breakthrough with Distributed open-source Graph Database: Nebula Graph. Graph database application example, dataset, and graph modeling: tracing shareholding relationships

Python 35 9 Updated Apr 20, 2024

wey-gu / covid-track-graph-datagen

China Fake Dataset Generator for Covid Track

Python 10 Updated Dec 8, 2022

wey-gu / fraud-detection-datagen

Fraud detection data generation with configurable degree distribution and community structure, ready for NebulaGraph.

Python 28 9 Updated Apr 19, 2024

Mrhs121 / spark-sql-for-cluster

Spark SQL on a YARN cluster or Kubeflow Spark Operator

Java 1 Updated Sep 26, 2025

open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…

TypeScript 8,248 1,559 Updated Dec 19, 2025

tomlay0520 / SuepOS

Just for fun, no reasons. Just do it.√

C 8 5 Updated Oct 26, 2025

Lists (5)

Data

🔮 Future ideas

✨ Inspiration

🚀 My stack

nas
