- United Kingdom
Starred repositories
An Open Source Machine Learning Framework for Everyone
Protocol Buffers - Google's data interchange format
DuckDB is an analytical in-process SQL database management system
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
FUSE-based file system backed by Amazon S3
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…
Lightweight, standalone C++ inference engine for Google's Gemma models.
mlpack: a fast, header-only C++ machine learning library
Header-only C++/Python library for fast approximate nearest neighbors
Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
Multiple object tracker based on the Hungarian algorithm and a Kalman filter.
DuckLake is an integrated data lake and catalog format
A Python package for manipulating 2-dimensional tabular data structures
Combining tree-boosting with Gaussian process and mixed effects models
A pathtracer for R. Build and render complex scenes and 3D data visualizations directly from R