A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning

20,298 2,533 Updated Mar 26, 2026

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

18,342 2,758 Updated Feb 7, 2026

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

C++ 16,624 4,065 Updated Mar 27, 2026

Oxford Deep NLP 2017 course

15,866 3,567 Updated Jul 2, 2023

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 8,646 2,030 Updated Mar 27, 2026

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 7,174 2,948 Updated Apr 29, 2025

Machine Learning and Agentic AI Resources, Practice and Research

Python 4,676 1,688 Updated Nov 2, 2025

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala 3,598 581 Updated Mar 26, 2026

Apache TinkerPop - a graph computing framework

Java 2,115 847 Updated Mar 28, 2026

A curated list of awesome Deep Learning (DL) for Natural Language Processing (NLP) resources

1,303 255 Updated Jan 24, 2026

Mirror of Apache Griffin

Scala 1,171 586 Updated Aug 3, 2025

Mirror of Apache Toree (Incubating)

Scala 749 223 Updated Mar 27, 2026

Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.

Python 653 38 Updated Mar 1, 2026

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow

Python 227 65 Updated Mar 25, 2026

VSCode extension to work with Databricks

TypeScript 134 27 Updated Mar 25, 2026