Organizations

@Microsoft-CISL @microsoft

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 31,826 3,353 Updated Mar 27, 2026

Example Shiny for Python app which talks to the OpenAI API

Python 75 17 Updated Sep 27, 2024

High-performance runtime for data analytics applications

Rust 3,002 252 Updated Jun 22, 2022

Annotated Microsoft Azure documentation links used throughout day to day technical conversations.

10 2 Updated Oct 1, 2021

Data Analysis Baseline Library

Jupyter Notebook 729 103 Updated Dec 16, 2024

Resources for Machine Learning Operations with R

44 15 Updated Jan 5, 2021

♾️ CML - Continuous Machine Learning | CI/CD for ML

JavaScript 4,171 345 Updated Jun 2, 2025

Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark

Java 1,369 845 Updated Aug 22, 2023

Jupyter Notebook 11 5 Updated Dec 17, 2020

A multilingual glossary for computing and data science terms.

SCSS 128 239 Updated Oct 24, 2025

Microsoft Azure Traces

Jupyter Notebook 1,098 177 Updated Dec 6, 2025

Peregrine is a workload optimization platform for cloud query engines. The goal of Peregrine is three-fold: 1. make it easier to ingest and analyze query workload telemetry into a common engine-agn…

Python 22 1 Updated Aug 31, 2020

A curated list of references for MLOps

13,833 2,034 Updated Nov 21, 2024

The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to…

C# 1,812 550 Updated Jan 22, 2025

Gaussian Process Optimization using GPy

Jupyter Notebook 949 260 Updated Jan 17, 2023

Development of bioacoustic tools for analyzing Orcasound data -- either post-processing of archived raw FLAC files or real-time analysis of the lossy stream and/or FLAC files.

Jupyter Notebook 57 23 Updated Mar 26, 2026

Automatically exported from code.google.com/p/smhasher

C++ 2,849 486 Updated Nov 14, 2024

scikit-learn: machine learning in Python

Python 65,531 26,854 Updated Mar 27, 2026

Dropout As A Bayesian Approximation: Code

Shell 205 42 Updated Jul 3, 2015

ML.NET is an open source and cross-platform machine learning framework for .NET.

C# 9,334 1,941 Updated Mar 23, 2026

Hummingbird compiles trained ML models into tensor computation for faster inference.

Python 3,532 290 Updated Jul 17, 2025

Collection of analyses, packages, visualisations of COVID-19 data in R

151 37 Updated Sep 30, 2022

The repository contains an ongoing collection of tweet IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.

Python 724 302 Updated Feb 22, 2023

R & stats illustrations by @allison_horst

1,828 200 Updated Oct 16, 2022

Whisper is a minimal documentation theme for Hugo.

SCSS 273 128 Updated Mar 14, 2024

Links to slides for rstudio::conf 2020

176 33 Updated Feb 11, 2020

Code and Resources for "Applied Machine Learning"

HTML 162 91 Updated Jun 18, 2020

Box2D is a 2D physics engine for games

C 9,578 1,747 Updated Mar 28, 2026