Welcome to BT4222
Lecturer: Yiliang ZHAO
● PhD in Computer Science
● Director, Head of Data Science, Openspace Ventures (Current)
● Adjunct Faculty, MITB (Artificial Intelligence), SMU (Current)
● J/APAC Machine Learning Practice Lead, Google
● Senior Data Scientist, Shopee
Teaching Assistant: Ta YU
● Ph.D. student (2020 Aug intake) in Information Systems (current)
● Master in MIS, National Chengchi University, Taiwan
○ Decision and Quantitative Analysis Lab
○ Machine Learning - Recommendation system
● Office: IS Research Lab 2 [COM2-01-03]
● E0546019@u.nus.edu
● https://www.linkedin.com/in/yutanccu/
Teaching Assistant: Jingqiao TAO
● Ph.D. student (2020 Aug intake) in Information Systems & Analytics
● Bachelor in MIS, Zhejiang University, China
● Office: IS Research Lab 2 [COM2-01-03]
● tao_jingqiao@u.nus.edu
● https://www.linkedin.com/in/jingqiao-tao-b62812223
Teaching Assistant: Zhang Xinyi
● Ph.D. student (2020 Aug intake) in Information Systems & Analytics
● Bachelor in Financial Management, SCUT
● Master in Business Analytics, HKU
● Office: IS Research Lab 1 [COM2-01-02]
● xinyizhang@u.nus.edu
● https://www.linkedin.com/in/xinyi-zhang-8324b4176/
Ice Breaker
● Tell us about your background
● Tell us what you would like to get out of the course
Some Expectations
● Knowledge sharing instead of teaching
○ Interactive
○ Initiative
○ Innovative
● Tuned towards more industry-focused learning
○ Try to be less theoretical
○ Focus on project/report/presentation
● Ask questions verbally instead of using chat
Agenda
● Introduction to Natural Language Processing
● Introduction to Deep Learning
● Deep learning and NLP
● K-Nearest Neighbour Classifier
● Hands-On
Terms
● Artificial Intelligence: Intelligence exhibited by machines to mimic a human mind
● Machine Learning: Computers being able to learn without hand-coding each step
● Deep Learning: Multi-layered algorithms for learning from data
● Data Science: Methods, processes, and systems to extract insights from data
● Data Analytics: Discovery of meaningful patterns in data
What is what
Goodfellow, Ian, et al. Deep learning. Vol. 1. Cambridge: MIT press, 2016.
Natural Language Processing
What is Natural Language Processing (NLP)?
● Natural Language Processing: a field at the intersection of three disciplines:
○ Computer Science
○ Artificial Intelligence
○ Linguistics
● NLP enables computers to understand and process human languages.
● One definition of AI-complete is perfect language understanding.
Easy NLP Tasks
● Spell Checking
● Keyword Search
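As a flavour of how simple these "easy" tasks can be, here is a minimal spell-checking sketch based on edit distance; the vocabulary and the misspelled word are illustrative, not from the lecture.

```python
# A tiny spell checker: suggest the vocabulary word with the smallest
# Levenshtein (edit) distance to the input.

def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

def suggest(word, vocabulary):
    """Suggest the closest vocabulary word to a (possibly misspelled) input."""
    return min(vocabulary, key=lambda w: edit_distance(word, w))

vocab = ["language", "learning", "machine", "linguistics"]
print(suggest("machnie", vocab))  # -> machine
```

Real spell checkers add word frequencies and keyboard-aware error models, but the core idea is this distance comparison.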
Medium-Level NLP Tasks
● Named Entity Recognition
● Convert unstructured text into a well-structured document
Medium-Level NLP Tasks
● Topic Classification
● Assign a topic to each document/piece of text
Hard NLP Tasks
● Sentiment Analysis
● Aspect-based sentiment Analysis
● Analyze opinions/sentiment behind text
Hard NLP Tasks
● Machine Translation
● Question Answering
● Visual Question Answering
NLP is very challenging
● AI-complete
● Ambiguity of Language
○ Lexical/Semantic Ambiguity: The fisherman went to the bank.
○ Syntactic/Structural Ambiguity: He watched her paint with enthusiasm.
● Data Variation
○ Images have ImageNet, but text has no labelled dataset of comparable scale
● Complexity in representing, learning, and using linguistic/situational/word/visual knowledge
Some Machine Translation Examples
Cloud Natural Language
● Extract entities
● Detect sentiment
● Analyze syntax
● Classify content
https://cloud.google.com/natural-language/
Machine Learning
Machine Learning
Machine Learning can be decomposed into three components:
● Representation (Model and Data Level)
● Evaluation (Loss Function/ Target Function)
● Optimization: How to search representation to obtain better evaluation
Representation Learning
● Given a task: how to classify these following shapes:
● Our system should work as:
○ Input: Image
○ Representation: Number of corners.
○ Model: Fed with representation and based on mathematical models or rules to make prediction
● Designing features is a complex process, which requires deep domain expertise.
● Deep learning tries to learn the features within the model itself.
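The hand-designed pipeline above can be sketched in a few lines; the shape names and the rule table are illustrative, and the corner count is given directly rather than detected from an image.

```python
# Representation: the number of corners.
# Model: a simple rule table fed with that representation.

def extract_representation(shape_corners):
    # In a real system this step would detect corners in an image;
    # here the corner count is supplied directly for illustration.
    return shape_corners

def classify(num_corners):
    rules = {3: "triangle", 4: "rectangle", 5: "pentagon"}
    return rules.get(num_corners, "unknown")

print(classify(extract_representation(3)))  # -> triangle
print(classify(extract_representation(4)))  # -> rectangle
```

The fragile part is `extract_representation`: designing it by hand is exactly the feature-engineering burden that deep learning tries to remove.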
Deep Learning
Deep Learning
● Deep learning is a subfield of machine learning
● Most machine learning methods work well
because of high-quality feature engineering.
○ SIFT or HOG features for images
○ MFCC or LPC features for speech
○ Features about words parts (suffix, capitalized)
● Optimization in conventional machine learning focuses only on the model level to improve evaluation.
Deep Learning
DL focuses on representation learning instead of feature engineering:
○ Representation learning attempts to automatically learn good features or representation
○ It will learn multiple levels of representation
○ From “raw” inputs x
Deep Learning for Speech
One of the first real-world tasks addressed by deep learning was speech recognition.
Deep Learning for Computer Vision
● Computer vision may be the most well-known breakthrough of DL.
● ImageNet Classification with Deep Convolutional Neural Networks.
ImageNet Scoreboard
Deep Learning For Arts
Style transfer based on Deep Learning: use one image to stylize another.
Deep Learning For Data Generation
Glow, a reversible generative model using invertible 1×1 convolutions, learns a
latent space where certain directions capture attributes such as age, hair color,
and so on (Kingma & Dhariwal, 2018).
Why is Deep Learning Powerful Now?
● Feature engineering requires expert knowledge, and hand-designed features are easily over-specified and incomplete.
● Large amounts of training data
● Modern multi-core CPUs/GPUs/TPUs
● Better deep learning ‘tricks’ such as regularization, optimization, transfer etc.
● Better context modeling due to fewer independence assumptions
● Effective method for end-to-end system optimization.
Deep Learning meets NLP
Deep Learning Meets NLP
● Deep learning methods are used to solve NLP problems with a focus on
representation learning, i.e. better vectors.
● Based on different levels of natural language, DL has achieved several big
improvements:
○ Linguistic Levels: word, syntax
○ Intermediate tasks/tools: entities, parsing, parts-of-speech
○ Full applications: sentiment analysis, machine translation, question answering
Word Vector
Each word is represented as a dense and real-valued vector in a low dimensional
space.
[Figure from He et al., 2014]
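Once words are dense vectors, similarity becomes geometry. A minimal sketch, assuming tiny made-up 3-D vectors (real word vectors, e.g. from word2vec, have hundreds of dimensions):

```python
import math

# Illustrative word vectors; in practice these are learned from data.
vectors = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Related words point in similar directions in the vector space.
print(cosine(vectors["king"], vectors["queen"]) >
      cosine(vectors["king"], vectors["apple"]))  # -> True
```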
Semantic Vector
● Semantic behind sentences/documents
can be encoded as vectors.
● Deep learning is able to do the composition:
○ Every word is a vector
○ A neural network (CNN or RNN) does the composition
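The simplest possible composition, averaging word vectors into a sentence vector, can be sketched as follows; the 2-D vectors are made up for illustration, and real models replace the average with a CNN or RNN.

```python
# Illustrative word vectors in 2 dimensions.
word_vecs = {
    "the":   [0.1, 0.0],
    "movie": [0.4, 0.3],
    "was":   [0.0, 0.1],
    "great": [0.9, 0.8],
}

def sentence_vector(tokens):
    """Compose a sentence vector as the element-wise mean of word vectors."""
    vecs = [word_vecs[t] for t in tokens]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

sv = sentence_vector(["the", "movie", "was", "great"])
print([round(v, 2) for v in sv])  # -> [0.35, 0.3]
```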
Sentiment Analysis
● Traditional approaches:
○ Bag-of-words features are fed into classifiers.
○ Sentiment word lists are used, containing positive and negative words.
● Deep learning models
○ Same semantic vector models
○ Word vectors or even char vectors
as input
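The traditional word-list approach can be sketched in a few lines; the word lists below are tiny illustrative samples, not a real sentiment lexicon.

```python
# Score a text by counting positive vs. negative words from a lexicon.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "terrible", "boring", "hate"}

def sentiment(text):
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("the movie was great and i love it"))  # -> positive
print(sentiment("a boring and terrible plot"))         # -> negative
```

Word counting ignores negation and context ("not great" scores positive), which is one motivation for the semantic-vector models above.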
Question Answering
● Traditional approaches
○ Hand-crafted rules are designed to capture word-level and other knowledge.
○ Regular expressions are used heavily.
● Deep learning approaches:
○ Same semantic vector models
○ Questions and answers are projected
into the same vector space.
[Figure from Tan et al., 2016]
Chatbot
● Traditional approaches:
○ Hand-crafted knowledge bases are used.
○ Cannot address out-of-domain questions.
● Deep learning approaches:
○ Neural language models which can generate language.
Machine Translation
● Traditional approaches:
○ Statistical model (Moses)
○ Very large complex system
● Deep learning approaches:
○ The source sentence is mapped to a vector, then the output sentence is generated from it.
KNN Classifier
Different Learning Methods
● Eager Learning
○ Explicit description of target function on the whole training set
● Instance-based Learning
○ Learning = storing all training instances
○ Classification = assigning a target value to a new instance
○ Referred to as "lazy" learning
K Nearest Neighbour Classifier
● All instances correspond to points in an n-dimensional Euclidean space
● Classification is delayed until a new instance arrives
● Classification is done by comparing the new instance's feature vector with the stored points
● Target function may be discrete or real-valued
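The whole classifier fits in a few lines; the 2-D points and labels below are illustrative, and `k=3` is an arbitrary choice.

```python
import math
from collections import Counter

# "Learning" is just storing the labelled points (lazy learning).
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

def knn_predict(x, k=3):
    # All the work happens at query time: sort stored points by
    # Euclidean distance to x and take a majority vote among the k nearest.
    nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((1.1, 0.9)))  # -> A
print(knn_predict((4.9, 5.2)))  # -> B
```

For a real-valued target function the vote becomes an average of the k neighbours' values.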
Summary
● NLP enables interaction between computers and human languages;
● ML = Representation + Loss/Target + Optimization;
● Deep Learning is promising these days given large datasets and faster
computation resources
● Deep Learning has lots of applications in NLP
● KNN is a simple instance-based learning approach