- https://github.com/zinggAI/zingg
- India
- @sonalgoyal
Stars
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more
A scalable, distributed Time Series Database.
Collect, aggregate, and visualize a data ecosystem's metadata
JVector: the most advanced embedded vector search engine
A platform for visualization and real-time monitoring of data workflows
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
A fully asynchronous, non-blocking, thread-safe, high-performance HBase client.
Trident-ML: A realtime online machine learning library
Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.
GoldenOrb is an open-source implementation of Pregel, Google's graph processing framework
Schema modelling framework for decentralised domain-driven ownership of data.
Hadoop Data Integration with various databases, FTP servers, and Salesforce. Incremental update, dedup, append, merge your data on Hadoop.
An example of SparkConnect extension.
jyates / culvert
Forked from booz-allen-hamilton/culvert
Secondary indexing for structured and unstructured data in Big Table style databases.