Stars
A modular graph-based Retrieval-Augmented Generation (RAG) system
Example Shiny for Python app which talks to the OpenAI API
High-performance runtime for data analytics applications
Annotated Microsoft Azure documentation links used in day-to-day technical conversations.
♾️ CML - Continuous Machine Learning | CI/CD for ML
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
A multilingual glossary for computing and data science terms.
Peregrine is a workload optimization platform for cloud query engines. The goal of Peregrine is three-fold: 1. make it easier to ingest and analyze query workload telemetry into a common engine-agn…
The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to…
Gaussian Process Optimization using GPy
Development of bioacoustic tools for analyzing Orcasound data -- either post-processing of archived raw FLAC files or real-time analysis of the lossy stream and/or FLAC files.
Automatically exported from code.google.com/p/smhasher
scikit-learn: machine learning in Python
Dropout As A Bayesian Approximation: Code
ML.NET is an open source and cross-platform machine learning framework for .NET.
Hummingbird compiles trained ML models into tensor computation for faster inference.
Collection of analyses, packages, and visualisations of COVID-19 data in R
The repository contains an ongoing collection of tweet IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.
R & stats illustrations by @allison_horst
Whisper is a minimal documentation theme for Hugo.
Links to slides for rstudio::conf 2020
Code and Resources for "Applied Machine Learning"