nuwangunasekara/ICDE2025

Machine Learning on the Fly: A Hands-On Tutorial for Streaming Data

(Tutorial at IEEE ICDE 2025)

This hands-on tutorial provides a comprehensive introduction to key topics in data stream learning, combining theoretical foundations with practical demonstrations and code examples. Participants will explore essential concepts, including supervised learning for data streams, building efficient pipelines for online preprocessing and model training, detecting and visualizing concept drift, and applying anomaly detection algorithms to streaming data. We will also examine the challenges and opportunities of AutoML for data streams and address practical concerns related to partially labeled and delayed-label data streams. The tutorial features CapyMOA, an open-source library that offers efficient algorithm implementations through a high-level Python API. Participants will gain hands-on experience using this tool, with all source code available at https://github.com/adaptive-machine-learning/CapyMOA and supporting tutorials and installation guides accessible at https://capymoa.org/. By the end of the session, attendees will be equipped with practical skills and tools to address real-world challenges in data stream learning.
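To make the idea of supervised learning on a stream concrete, here is a minimal sketch of a test-then-train (prequential) loop, in which every instance is first used for prediction and only then for training. It is plain Python and does not use the CapyMOA API; `MajorityClass`, `prequential_accuracy`, and `synthetic_stream` are hypothetical names introduced for illustration only.

```python
import random


class MajorityClass:
    """Toy incremental learner (illustrative, not a CapyMOA class).

    Predicts the most frequent label seen so far.
    """

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # No prediction is possible before the first label arrives.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def train(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(stream, learner):
    """Test-then-train: predict on each instance, then learn from it."""
    correct = 0
    seen = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.train(x, y)
        seen += 1
    return correct / seen if seen else 0.0


def synthetic_stream(n, seed=0):
    """Hypothetical stream: label is 1 when the single feature exceeds 0.3."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield x, (1 if x > 0.3 else 0)


accuracy = prequential_accuracy(synthetic_stream(1000), MajorityClass())
print(f"prequential accuracy: {accuracy:.3f}")
```

Because each instance is seen exactly once and never stored, the loop runs in constant memory, which is the main assumption of the stream setting.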

Goals and Objectives

This hands-on tutorial aims to familiarize participants with applying machine-learning tasks to streaming data. After an introductory overview of the typical supervised learning cycle (classification and regression) and the assumptions of this setting, we will focus on the following topics:

  • Introduction to data stream learning and supervised tasks for stream learning;
  • Pipelines for online preprocessing and supervised learning tasks;
  • Concept drift detection, visualization and evaluation;
  • Anomaly detection algorithms on streaming data;
  • The limitations and opportunities of AutoML for data streams; and
  • Practical concerns when dealing with partially labeled and delayed-label data streams.
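As one illustration of the concept drift detection topic listed above, the following sketch applies the classic Page-Hinkley test to a stream of 0/1 prediction errors. It is a simplified, self-contained version written for this overview, not the implementation shipped with CapyMOA; the class name, the `first_drift` helper, the parameter defaults, and the synthetic error stream are all assumptions.

```python
class PageHinkley:
    """Simplified Page-Hinkley test for an increase in the mean of a signal."""

    def __init__(self, delta=0.005, threshold=20.0, min_instances=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (lambda)
        self.min_instances = min_instances
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0                  # cumulative deviation from the running mean
        self.min_cum = 0.0              # smallest cumulative deviation so far

    def update(self, value):
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cum += value - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        if self.n < self.min_instances:
            return False
        return self.cum - self.min_cum > self.threshold


def first_drift(errors, detector):
    """Return the index of the first alarm, or None if no drift is signalled."""
    for i, e in enumerate(errors):
        if detector.update(e):
            return i
    return None


# Hypothetical error stream: 10% error rate, then 60% after instance 500.
errors = [1.0 if i % 10 == 0 else 0.0 for i in range(500)]
errors += [1.0 if i % 10 < 6 else 0.0 for i in range(500)]

drift_at = first_drift(errors, PageHinkley())
print("drift signalled at instance", drift_at)
```

In a full pipeline the detector would be fed the learner's errors from a test-then-train loop, and an alarm would typically trigger a model reset or replacement. A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay.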

Resources

CapyMOA

Notebooks
