jyuu

Organizations

@Microsoft-CISL @microsoft


A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 29,874 3,155 Updated Dec 20, 2025

Example Shiny for Python app which talks to the OpenAI API

Python 75 18 Updated Sep 27, 2024

High-performance runtime for data analytics applications

Rust 3,004 255 Updated Jun 22, 2022

Annotated Microsoft Azure documentation links used throughout day to day technical conversations.

10 2 Updated Oct 1, 2021

Data Analysis Baseline Library

Jupyter Notebook 727 103 Updated Dec 16, 2024

Resources for Machine Learning Operations with R

43 15 Updated Jan 5, 2021

♾️ CML - Continuous Machine Learning | CI/CD for ML

JavaScript 4,161 346 Updated Jun 2, 2025

Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark

Java 1,367 848 Updated Aug 22, 2023

Jupyter Notebook 11 5 Updated Dec 17, 2020

A multilingual glossary for computing and data science terms.

SCSS 126 241 Updated Oct 24, 2025

Microsoft Azure Traces

Jupyter Notebook 1,049 172 Updated Dec 6, 2025

Peregrine is a workload optimization platform for cloud query engines. The goal of Peregrine is three-fold: 1. make it easier to ingest and analyze query workload telemetry into a common engine-agn…

Python 22 1 Updated Aug 31, 2020

A curated list of references for MLOps

13,484 1,996 Updated Nov 21, 2024

The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to…

C# 1,783 551 Updated Jan 22, 2025

Gaussian Process Optimization using GPy

Jupyter Notebook 950 261 Updated Jan 17, 2023

Development of bioacoustic tools for analyzing Orcasound data -- either post-processing of archived raw FLAC files or real-time analysis of the lossy stream and/or FLAC files.

Jupyter Notebook 57 23 Updated Nov 5, 2025

Automatically exported from code.google.com/p/smhasher

C++ 2,831 485 Updated Nov 14, 2024

scikit-learn: machine learning in Python

Python 64,345 26,513 Updated Dec 21, 2025

Dropout As A Bayesian Approximation: Code

Shell 204 41 Updated Jul 3, 2015

ML.NET is an open source and cross-platform machine learning framework for .NET.

C# 9,300 1,938 Updated Dec 15, 2025

Hummingbird compiles trained ML models into tensor computation for faster inference.

Python 3,511 287 Updated Jul 17, 2025

Collection of analyses, packages, visualisations of COVID-19 data in R

151 36 Updated Sep 30, 2022

The repository contains an ongoing collection of tweet IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.

Python 723 302 Updated Feb 22, 2023

R & stats illustrations by @allison_horst

1,827 200 Updated Oct 16, 2022

Whisper is a minimal documentation theme for Hugo.

SCSS 272 131 Updated Mar 14, 2024

Links to slides for rstudio::conf 2020

177 33 Updated Feb 11, 2020

Code and Resources for "Applied Machine Learning"

HTML 163 91 Updated Jun 18, 2020

Box2D is a 2D physics engine for games

C 9,388 1,709 Updated Dec 15, 2025