
1 Introduction

ISOM3360 Data Mining for Business Analytics is a Spring 2025 course taught by Associate Professor Yi Yang, focusing on machine learning and data mining techniques applied to business contexts. The course includes lectures, lab sessions, assignments, a team project, and exams, with materials available on the Canvas course website. Students will learn to analyze structured and unstructured data, make predictions, and apply machine learning models to derive actionable insights for decision-making.


ISOM3360 Data Mining for Business Analytics

Introduction

Instructor: Yi Yang

Department of ISOM

Spring 2025
Welcome

❑ Course information

About me

❑ Instructor: Yi Yang, Associate Professor, ISOM

❑ Research interests: machine learning

❑ Ph.D. from Northwestern University

❑ Taught at UIUC for two years

❑ Worked at IBM Research and Amazon

❑ Consulting for a hedge fund on machine learning

❑ Teaching ISOM 3360, 3370, 5270 and ExecEd


Course information

❑ 18~19 lectures

❑ 10 lab sessions

❑ Hands-on problem solving using Python

❑ 3 assignments

❑ 1 team project (3~4 people per group)

❑ 2 exams: Midterm exam (tentatively Mar 20, 7:00pm-8:30pm); Final exam (TBD)

Course material

❑ All the materials (e.g., lecture slides, readings) will be posted on the Canvas course website.

❑ Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, by Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, Kenneth C. Lichtendahl

❑ Data Science for Business: What you need to know about data mining and data-analytic thinking, by Foster Provost, Tom Fawcett

❑ Learning Data Mining with Python, by Robert Layton

Grading components

❑ Lab 5%

❑ Class Attendance/Participation 10%

❑ Homework Assignments 10%

❑ Group Project 15%

❑ Midterm Exam 28%

❑ Final Exam 32%

❑ Instructor: Yi Yang
❑ Email: imyiyang@ust.hk (begin the subject line with [ISOM3360])

❑ Office Hours: by appointment

❑ Teaching Assistant:
❑ Sophie Gu, imsophie@ust.hk

Academic integrity

Questions

You may have heard of these

Big Data
Artificial Intelligence
Data Mining
Data Science
Machine Learning
Python

Course: small data.
Python is independent of machine learning/data mining: people usually like using Python, but other programming languages can be used as well.

Data mining

Data mining

Hong Kong Smart City: everything is interconnected, e.g. traffic lights.

Data

Aspect | Structured Data | Unstructured Data
Definition | Data that has a predefined and organized format or schema, often stored in a database or spreadsheet | Data that does not have a predefined or organized format or schema
Examples | Tables in a relational database, spreadsheets, log files, financial data | Text documents, social media posts, images, videos, audio recordings, email messages, web pages, GPS data, etc.
Techniques | Machine learning (simple) | Natural language processing, computer vision, speech recognition (hard)
Data Volume | Usually smaller in volume | Usually much larger in volume

Focus of the course: structured data.
Unstructured data is the most exciting data for companies. It still relies on machine learning but needs extra techniques, e.g. NLP.

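To make the structured/unstructured contrast concrete, here is a minimal Python sketch using only the standard library. The two customer records reuse values from the exercise table later in the deck. The complaint sentence and the `bag_of_words` helper are hypothetical, and word counting is only one simple way to turn text into features, not necessarily the technique used in the course.

```python
import re
from collections import Counter

# Structured data: every record follows the same predefined schema,
# so a field can be read directly by name.
structured_rows = [
    {"customer": "Alice", "age": 25, "amount": 120},
    {"customer": "Bob", "age": 40, "amount": 30},
]
average_amount = sum(row["amount"] for row in structured_rows) / len(structured_rows)

# Unstructured data: free text has no schema, so it must first be
# turned into features (here, simple word counts) before modeling.
complaint_text = "The package arrived late and the package was damaged."  # hypothetical


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)


word_counts = bag_of_words(complaint_text)

print(average_amount)           # 75.0
print(word_counts["package"])   # 2
```

The extra feature-extraction step is why unstructured data is described as the harder case.
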
Unstructured data

❑ In addition to traditional numerical data, a wealth of potentially valuable business information may originate in unstructured forms.

Unstructured data: Text
The data is in text format, e.g. a financial annual report (operations, earnings, and numbers all appear in the text).
--> valuable for the company, e.g. for trading
--> but how to process it? Use machine learning.

Understanding the information and then giving insights is what matters.

Unstructured data: Image

Street view: recognize cars/trucks using machine learning.
Humans can do this too, but it is costly and time-consuming.

Data mining

IMPORTANT: Definition

❑ Finding patterns in large amounts of data, using machine learning methods, for actionable insights

Patterns: similarity/commonality.
We can still build the model with small data; a large amount of data is not required, but of course more is better.
Finding actionable insights is the "purpose".

Prediction is the key

❑ Prediction is the key for decision making under uncertainty.

❑ Better prediction creates competitive advantages.

What are actionable insights?
Insights refer to predictions. Prediction is all we care about, because life is full of uncertainty.
E.g. do we need to bring an umbrella? A prediction lets us make a better decision.

Reducing uncertainty --> the company can earn money, e.g. by estimating customer demand for next month.

Machine Learning
Machine learning can help us make predictions.

❑ Machine learning algorithms enable computer programs to automatically analyze data, recognize patterns, and make predictions for new unseen data. (This is the key.)

❑ Machine learning models make predictions.

[Figure: Face recognition from image. Machine learning induces patterns from data and outputs "It's a FACE".]

Learn patterns from human faces (e.g. eyes), expressed in mathematical form.
Use patterns learned from old data to predict unseen/new data --> better accuracy than a random guess.

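The idea of learning a pattern from old data and then predicting unseen data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-feature data points are hypothetical, and the nearest-centroid rule is a stand-in chosen for brevity, not the face-recognition method shown on the slide.

```python
import math


def fit_centroids(samples, labels):
    """Learn one average point (centroid) per class from old, labeled data."""
    sums, counts = {}, {}
    for point, label in zip(samples, labels):
        total = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(centroids, point):
    """Predict the class whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


# Hypothetical old data: (feature_1, feature_2) -> class
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

# Hypothetical unseen data that was not used to learn the pattern
test_x = [(0.9, 1.1), (3.2, 3.1), (1.2, 1.0), (2.9, 2.7)]
test_y = ["A", "B", "A", "B"]

centroids = fit_centroids(train_x, train_y)
predictions = [predict(centroids, p) for p in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

On this toy data every unseen point is classified correctly, whereas a random guess between two equally likely classes would be right about half the time.
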


❑ Q: Is ChatGPT a machine learning model?
Is it a prediction process?
Input: our question, which is new, unseen data
Output: ChatGPT's prediction

YES

An Example: customer complaint management
Pain point: Firms receive customer complaint filings on different aspects. How can they handle the complaints in a timely manner?

Some complaints have high priority and some have low priority.
--> they can be classified
--> machine learning: train a model to predict the priority or the complaint type, e.g. shipping problem or product problem
--> the problem can then be sent to the right team very quickly, without manual handling

Traditional vs. machine learning solution
❑ Traditional: hire a team of customer service staff to read the complaints and forward them to different teams for handling.

❑ Machine learning: automatically classify customer complaints into different categories (e.g. shipping related, product defect related); automatically rank the priority of the complaints; and forward the complaints to different teams for handling.

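The classification step of the machine learning solution might look like the following sketch. It assumes a tiny hand-labeled set of complaints (all hypothetical) and uses a Naive Bayes word-count model written from scratch; the slides do not name a specific algorithm, so this is one possible choice. Priority ranking is not covered here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per category from labeled complaints."""
    word_counts = defaultdict(Counter)
    category_counts = Counter()
    for text, category in examples:
        category_counts[category] += 1
        word_counts[category].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, category_counts, vocabulary


def classify(model, text):
    """Return the category with the highest Laplace-smoothed log-probability."""
    word_counts, category_counts, vocabulary = model
    total_examples = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, n in category_counts.items():
        score = math.log(n / total_examples)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            smoothed = (word_counts[category][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical labeled complaints (old data)
labeled_complaints = [
    ("My package arrived two weeks late", "shipping"),
    ("The courier lost my package", "shipping"),
    ("Delivery was delayed again", "shipping"),
    ("The screen was cracked when I opened the box", "product defect"),
    ("The charger stopped working after one day", "product defect"),
    ("The zipper broke on first use", "product defect"),
]
model = train(labeled_complaints)
print(classify(model, "My package is late"))  # shipping
```

Once each new complaint has a predicted category, forwarding it to the matching team is a simple lookup, which is what removes the manual reading step.
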
Business Intelligence pyramid

[Figure: pyramid with the following layers, top to bottom]
Decision making and strategic planning
Data Mining, Machine Learning
Data retrieval and aggregation
Data management and storage

Different layers require different techniques.

Predictive vs. Descriptive analytics
Aspect | Descriptive Analytics | Predictive Analytics
Focus | Understanding past events | Predicting future outcomes
Goal | Summarize and present data | Build predictive models
Data Type | Historical data | Historical data
Analysis | Identify patterns and trends | Build models to make predictions
Use Case | Understand past performance | Forecast future outcomes
Example | Sales data for the past year | Sales forecast for next quarter
Output | Summary statistics and visualizations | Predictive models and forecasts
Techniques | Data aggregation, visualization, and basic statistics | Machine learning
Decision Making | Reactive | Proactive

Descriptive analytics is about understanding the existing data and patterns; prediction is about new data.
A summary cannot answer the question "What are the sales for next month?"; that requires analysis and prediction.

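The difference between the two columns can be illustrated with a short Python sketch. The monthly sales figures are hypothetical, and a straight-line least-squares fit is used only as the simplest possible predictive model; it is an assumption for illustration, not a method prescribed by the slides.

```python
from statistics import mean

# Hypothetical monthly sales for the past six months
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

# Descriptive analytics: summarize what already happened.
average_sales = mean(sales)
best_month = months[sales.index(max(sales))]


# Predictive analytics: fit a model on past data, then apply it to new data.
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

print(average_sales, best_month)   # 125.0 6
print(forecast_month_7)            # 160.0
```

The average and the best month describe the past, while the forecast for month 7 is a statement about data that does not exist yet.
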
Exercise
You, as a company marketing director, want to know the answers to the following questions. Which ones require a data mining solution?

Who are the high-value customers?
Descriptive analysis.

Is there an age difference between the high-value customers and the low-value customers?
Descriptive analysis: find the low-value and high-value customers, calculate the average age of each group, then find the difference --> t-test (method name).

Will some particular new customer be a high-value customer?
Predictive analysis.

How much sales amount should I expect a new customer to generate?
Predictive analysis: regression, because we need to predict a real number.

Customer | Gender | Age | Membership | Monthly Purchase | Amount
Alice | F | 25 | Y | 5 | $120
Bob | M | 40 | Y | 3 | $30
Charlie | M | 35 | Y | 6 | $210
Doug | M | 18 | N | 4 | $95
… | … | … | … | … | …

Let's define customers whose amount > $100 as high-value customers. The rest are low-value customers.

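The two descriptive questions can be worked through with the four customers listed in the table. The sketch below applies the slide's rule (amount > $100) and then compares average ages. The notes only say "t-test", so the Welch variant used here is an assumption, and with four customers the statistic is purely illustrative.

```python
from math import sqrt
from statistics import mean, variance

customers = [
    {"name": "Alice", "age": 25, "amount": 120},
    {"name": "Bob", "age": 40, "amount": 30},
    {"name": "Charlie", "age": 35, "amount": 210},
    {"name": "Doug", "age": 18, "amount": 95},
]

# Question 1: label customers with the slide's rule (amount > $100).
high_value = [c["name"] for c in customers if c["amount"] > 100]
high_ages = [c["age"] for c in customers if c["amount"] > 100]
low_ages = [c["age"] for c in customers if c["amount"] <= 100]

# Question 2: compare the average age of the two groups.
mean_high, mean_low = mean(high_ages), mean(low_ages)


def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t_statistic = welch_t(high_ages, low_ages)

print(high_value)                      # ['Alice', 'Charlie']
print(mean_high, mean_low)             # 30 29
print(round(t_statistic, 4))           # 0.0828
```

Both steps only summarize customers that are already in the data, which is why they count as descriptive. The last two questions in the exercise ask about a new customer and therefore need a predictive model.
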
Descriptive analytics

Exercise

❑ Say you work in a digital media company that provides an online streaming video service. You have lots of data about lots of users watching lots of movies/TV shows. What are the use cases of predictive analytics?

Personalized recommendation of videos --> prediction --> based on the customer's past behavior, predict the customer's next move.
Good example: TikTok keeps suggesting good short videos to you, so you keep watching.
Loop: watch more --> more data --> TikTok knows more about the user --> better prediction accuracy.

Budget planning: know which movies are worth investing in.
--> predict which movie topics people are interested in
--> so the company invests more in the cast

Misuse: sharing a Netflix account is good for the individual but bad for Netflix.
--> prediction problem: use IP addresses to check whether people are sharing an account
--> Netflix fights account sharing

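The recommendation use case can be sketched as follows. The watch histories and the `recommend` function are hypothetical, and scoring titles by how many watched titles two users share is only one simple heuristic; real streaming services use far richer models.

```python
from collections import Counter

# Hypothetical watch histories: user -> set of titles watched
watch_history = {
    "u1": {"Drama A", "Drama B", "Comedy C"},
    "u2": {"Drama A", "Drama B", "Thriller D"},
    "u3": {"Comedy C", "Cartoon E"},
    "u4": {"Drama A", "Thriller D"},
}


def recommend(user, history, top_n=1):
    """Score unseen titles by how much their viewers overlap with this user."""
    seen = history[user]
    scores = Counter()
    for other, titles in history.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # shared titles act as a similarity weight
        for title in titles - seen:
            scores[title] += overlap
    return [title for title, _ in scores.most_common(top_n)]


print(recommend("u1", watch_history))  # ['Thriller D']
```

Each new viewing adds to `watch_history`, so the overlap scores become more informative over time, which mirrors the loop described in the notes.
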
Course objective

❑ You will learn
  ❑ Various machine learning models
  ❑ Hands-on experience through lab practice
  ❑ Analytical thinking through various business examples

With large data, using little shovels to dig for gold is slow.
With machine learning, you know how to use bigger tools --> more effective and efficient.
The course is not about how to design the digger, but about learning how to use it.

❑ You will not learn
  ❑ Data warehousing, databases, big data techniques
  ❑ Business/managerial planning