
Building a Predictive Model

The document summarizes a public lecture about building a predictive super model. It discusses predictive analytics and common predictive algorithms used in business like logistic regression and random forests. It then introduces the idea of a "super algorithm" that incorporates techniques like variable selection, feature engineering, and ensemble learning to produce highly accurate predictive models. The talk concludes by discussing the super learning approach, which combines predictions from multiple base models using a meta-learner, as a potential way to develop a predictive "super model".

FMIPA Public Lecture

Building a Super Predictive Model:
Is It Possible?
Bagus Sartono
Department of Statistics, FMIPA
Collaborators: Dr. Eng. Annisa; Gerry Alfa Dito, SSi
21 Nov 2019, Auditorium FMIPA, IPB University
Bagus Sartono

• Lecturer in the Department of Statistics, FMIPA IPB University

• Coordinator of the Data Mining Working Group, FMIPA IPB University

• Vice Chair of FORSTAT (Forum Penyelenggara Pendidikan Tinggi Statistika, the forum of higher-education statistics programs)
What comes to mind when you
think of a super model?
definitely not these ones!
Predictive Analytics
Predictive analytics is the branch of advanced analytics which is used to make predictions about unknown future events. (PAT Research)

Predictive analytics is the use of data, statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data. (SAS)

Predictive analytics is a category of data analytics aimed at making predictions about future outcomes based on historical data and analytics techniques such as statistical modeling and machine learning. (John Edward, cio.com)
Predictive Analytics in Business
Credit Scoring

• Scoring model to predict the risk level of debtors
• Classification model involving predictors: socio-demographic variables, payment history, other transaction records
• Scores
  • Good/Excellent Risk
  • Bad/Poor Risk
• Common algorithms:
  • Logistic Regression
  • Classification Tree
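
As a rough illustration of the credit scoring idea on this slide, the sketch below fits a logistic regression to synthetic debtor data and turns the predicted default probability into a two-level risk label. The data, the variable names (`age`, `income`, `late_payments`), the data-generating rule, and the 0.5 cut-off are all assumptions made for this example, not taken from the lecture; scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic debtor data (hypothetical): age, income, and number of late payments.
rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(21, 65, n)
income = rng.uniform(2, 20, n)
late_payments = rng.poisson(1.0, n)

# Assumed data-generating rule: more late payments and lower income raise default risk.
logit = -1.0 + 0.9 * late_payments - 0.15 * income - 0.01 * age
default = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([age, income, late_payments])
X_train, X_test, y_train, y_test = train_test_split(
    X, default, test_size=0.3, random_state=0, stratify=default
)

# Fit the scoring model on the training debtors.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score the held-out debtors: probability of default, then a risk label.
p_default = model.predict_proba(X_test)[:, 1]
risk_label = np.where(p_default < 0.5, "Good Risk", "Bad Risk")  # 0.5 is an assumed cut-off
```

In practice the cut-off would be chosen from the lender's risk appetite rather than fixed at 0.5.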

Predictive Analytics in Business
Propensity Modeling

• Propensity model to predict the likelihood-to-buy of individuals
• Up-Sell / Cross-Sell campaign
• Selective campaign
  • High propensity → give the offer
  • Low propensity → no offer
• Common algorithms: Random Forest, Boosted Tree
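
The selective-campaign rule above can be sketched as follows: a random forest estimates each customer's propensity to buy, and only customers above a threshold receive the offer. The customer features, the purchase rule used to simulate the data, and the 0.5 threshold are hypothetical choices for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic customers (hypothetical): four numeric behavioural features.
rng = np.random.default_rng(1)
n = 600
X = rng.normal(size=(n, 4))

# Assumed purchase rule: the first two features drive the decision to buy.
buy = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0.5).astype(int)

# Train on past customers, score the remaining ones.
train, score = slice(0, 400), slice(400, n)
forest = RandomForestClassifier(n_estimators=200, random_state=1)
forest.fit(X[train], buy[train])

# Propensity = predicted probability of buying.
propensity = forest.predict_proba(X[score])[:, 1]

# Selective campaign: high propensity gets the offer, low propensity does not.
threshold = 0.5  # assumed cut-off
send_offer = propensity >= threshold
```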

Predictive Analytics in Business
Debit/Credit Card Activation

• Identifying the probability of dormant cards becoming active
• Recall campaign targeting prospective active card holders
• Common algorithm:
  • k-Nearest Neighbor
Other Examples

• Predicting students' academic success

• Predicting disease risk

• Weather forecasting
Common Classification Model Algorithms

• Logistic Regression
• Classification Tree
• Support Vector Machine
• Random Forest
• Neural Network
• Bayesian Classifier
• k-Nearest Neighbor
• Boosting


The Ideal Predictive Model

• High predictive accuracy
• Simple
General Strategies

• VARIABLE SELECTION
  • Reduces the number of predictors and model parameters, avoiding overly complex models

• FEATURE ENGINEERING
  • Creates new predictors that are more predictive

• ENSEMBLE LEARNING
  • Combines predictions from several different models/algorithms → improves predictive accuracy
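
The three strategies above can be chained in a single workflow. The sketch below is one possible arrangement using scikit-learn, on synthetic data: `SelectKBest` stands in for variable selection, `PolynomialFeatures` for feature engineering, and a soft-voting ensemble of two models for ensemble learning. These particular components and their settings are assumptions for illustration, not the choices made in the lecture.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: 20 predictors, only 5 of them informative.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

pipeline = make_pipeline(
    # Variable selection: keep the 8 predictors with the strongest F-statistic.
    SelectKBest(f_classif, k=8),
    # Feature engineering: add squares and pairwise interactions.
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    # Ensemble learning: average the predicted probabilities of two models.
    VotingClassifier(
        estimators=[
            ("logit", LogisticRegression(max_iter=2000)),
            ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
        ],
        voting="soft",
    ),
)

# Cross-validated accuracy; selection is refit inside each fold, so no leakage.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
```

Placing every step inside the pipeline means the selection and engineered features are learned from the training folds only, which keeps the accuracy estimate honest.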
Super Algorithm
Offers a range of features for producing a good model: variable selection, feature engineering, ensemble learning

Works well even on ill-conditioned data

Does not overfit; predicts well on other data
The “weapons” of several predictive modeling algorithms

Modeling Algorithm        Variable Selection   Feature Engineering   Ensemble
Logistic Regression       -                    -                     -
k-Nearest Neighbor        -                    -                     -
Classification Tree       Good                 Fair                  -
Support Vector Machine    -                    Good                  -
Random Forest             Fair                 Good                  Good
Boosted Tree              Good                 Fair                  Good
Neural Network            -                    Good                  Good
The Basic Idea of the “Super Learner”

• van der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007) Super Learner. Statistical Applications in Genetics and Molecular Biology, 6, article 25.

• Polley, E. C. and van der Laan, M. J. (2010) Super Learner in Prediction. U.C. Berkeley Division of Biostatistics Working Paper Series. Paper 226.

• STACKING
  • uses the predictions of multiple base models as predictors for a meta-learner model
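
A minimal sketch of stacking, assuming scikit-learn's `StackingClassifier` as a stand-in (the lecture itself points to the R SuperLearner package, not to this class). Three base models produce cross-validated predicted probabilities, and a logistic regression meta-learner is trained on those predictions. The dataset is synthetic and the choice of base models is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base models whose predictions become the meta-learner's predictors.
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=7)),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# The meta-learner is fit on 5-fold out-of-fold probabilities from the base models.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
```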
Super Learner Algorithm
[Diagram: DATASET → BASE LEARNERS → PREDICTIONS → META LEARNER → FINAL PREDICTION, annotated with CROSS VALIDATION, FEATURE ENGINEERING, VARIABLE SELECTION, and ENSEMBLE]

https://cran.r-project.org/web/packages/SuperLearner/vignettes/Guide-to-SuperLearner.html
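
The flow in the diagram can be written out by hand to show where cross-validation enters. The sketch below is an assumption-laden Python approximation, not the R SuperLearner package's API: base learners produce out-of-fold predictions, a non-negative least squares fit (one common choice of meta-learner) turns them into weights, and the final prediction is the weighted average. The dataset, the base learners, and the normalization of the weights are all illustrative choices.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# DATASET: synthetic binary classification data.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# BASE LEARNERS (an arbitrary illustrative library).
base_learners = {
    "logit": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=7),
    "forest": RandomForestClassifier(n_estimators=100, random_state=1),
}

# CROSS VALIDATION + PREDICTIONS: out-of-fold probabilities, one column per learner.
Z_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_learners.values()
])

# META LEARNER: non-negative weights, rescaled to sum to one.
raw_weights, _ = nnls(Z_train, y_train.astype(float))
weights = raw_weights / raw_weights.sum()

# FINAL PREDICTION: refit each learner on all training data, then combine.
Z_test = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for m in base_learners.values()
])
p_super = Z_test @ weights
test_accuracy = ((p_super >= 0.5).astype(int) == y_test).mean()
```

Because the weights are learned from out-of-fold predictions, a base learner that merely memorizes the training data gains no advantage in the combination.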
Empirical Success Story
Average predictive-accuracy rank of various algorithms, obtained through cross-validation on eight different datasets

Algorithm             Average Rank
Super Learner         1.9
Conditional Forest    4.1
Glm Boost             4.4
Random Forest         5.0
Logistic Regression   5.6
Extra Trees           5.6
Ada Boost             5.8
Naïve Bayes           8.5
Gaussian Process      9.5
Xgboost               10.5
SVM                   11.1
CART                  11.8
Conditional Tree      11.8
C50                   13.9
J48                   15.1
Evolutionary Tree     15.9
IBk                   16.3
Neural Network        16.3
OneR                  17.1
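
For readers unfamiliar with the average-rank summary used in the table above, the sketch below shows one way such a figure could be computed, assuming algorithms are ranked within each dataset (rank 1 = most accurate) and the ranks are then averaged across datasets. The slides do not spell out the exact procedure, and the accuracy values and algorithm names here are made up purely for illustration.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical algorithms and made-up cross-validated accuracies.
# Rows are datasets, columns are algorithms.
algorithms = ["A", "B", "C"]
accuracy = np.array([
    [0.90, 0.85, 0.80],
    [0.88, 0.91, 0.79],
    [0.93, 0.90, 0.86],
    [0.75, 0.75, 0.70],
])

# Rank within each dataset: negate so the highest accuracy gets rank 1.
# Tied algorithms share the average of the ranks they occupy.
ranks = rankdata(-accuracy, axis=1, method="average")

# Average rank per algorithm across datasets; lower is better.
average_rank = ranks.mean(axis=0)
```

Ranking before averaging puts datasets of different difficulty on a common scale, at the cost of discarding how large the accuracy gaps are.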
Closing

• The need for prediction is everywhere

• Analysts need model-building algorithms capable of producing super models

• The super learner approach can be a good choice because it comes equipped with a range of weapons

• Good luck trying it out!
Thank you
bagusco@apps.ipb.ac.id