The University of Jordan
Natural Language Processing
Course ID: 1905380
14/07/2024   Dr. Loai Alnemer
Welcome to this course (by ChatGPT)
• In this course, you will learn the fundamental concepts and
  techniques of Natural Language Processing (NLP), a field that
  enables computers to understand, interpret, and generate
  human language.
• Prepare to embark on an exciting journey as we explore the
  cutting-edge applications of NLP in areas such as text analysis,
  sentiment analysis, language generation, and more.
• "Natural language is the most important part of artificial
  intelligence." John Searle
• "Natural language processing is a cornerstone of artificial
  intelligence, allowing computers to read and understand
  human language, as well as to produce and recognize
  speech." Ginni Rometty
• "Natural language processing is one of the most
  important fields in artificial intelligence and also one of
  the most difficult." Dan Jurafsky
Lecture 1: Introduction and word vectors
• The course (10 mins)
• Human language and word meaning (15 mins)
• Word2vec introduction (15 mins)
• Word2vec objective function gradients (25 mins)
• Optimization basics (5 mins)
• Looking at word vectors (10 mins or less)
Course Description
• This course provides an introduction to the field of natural language processing
  (NLP), with a theoretical foundation and hands-on (lab-style) practice in
  computational approaches for processing natural language text. We will discuss
  problems involving different language system components (such as meaning in
  context and linguistic structures).
• Students will collaborate in teams on modeling and implementing natural
  language processing and digital text solutions using Python and a variety of
  relevant tools.
• We will begin by discussing machine learning methods for NLP as well as core
  NLP tasks such as language modeling, part-of-speech tagging, and parsing.
• We will also discuss applications such as information extraction, machine
  translation, text generation, and automatic summarization.
Course Objectives
By the end of this course, students will be able to:
• Understand the key concepts and principles of natural language processing.
• Implement common NLP techniques such as tokenization, stemming, lemmatization, and
  part-of-speech tagging.
• Develop language models using n-grams and neural networks.
• Perform text classification, sentiment analysis, and named entity recognition.
• Build machine translation systems using sequence-to-sequence models.
• Explore advanced NLP topics such as dialogue systems, question answering, and text
  generation.
• Apply NLP techniques to solve practical problems in various domains, such as customer
  service, content analysis, and information retrieval.
Course work and grading policy
• Project 15%
• Midterm 30%
• Quizzes 5%
• Final Exam 50%
Collaboration policy:
- Don’t take code off the web
- Acknowledge working with other students
- Write your own assignment solutions
- Students must independently submit their solutions
AI tools policy: Large language models are great (!), but
• we don’t want ChatGPT’s solutions to our assignments
• Collaborative coding with AI tools is allowed; asking it to answer
  questions is strictly prohibited
• Employing AI tools to substantially complete assignments will be
  considered a violation of the Honor Code
What is NLP?
• NLP is a subfield of computer science and artificial intelligence (AI) that uses machine
  learning to enable computers to understand and communicate in human language.
• NLP combines computational linguistics (rule-based modeling of human language) with
  statistical modeling, machine learning, and deep learning to enable computers and
  digital devices to recognize, understand, and generate text and speech.
• NLP research has enabled the era of generative AI, from large language models to image
  generation models that can understand and respond to natural language requests.
• NLP is already widely applied in everyday technologies like search engines, chatbots,
  voice assistants, and GPS systems.
• NLP also plays a growing role in enterprise solutions that help streamline and automate
  business operations, increase employee productivity, and simplify mission-critical
  business processes.
The term NLP
• Natural language: refers to the language that humans use to
  communicate with each other, such as English, Spanish, or Arabic
• Processing: as distinguished from data processing
• Question: How do data processing and natural language processing
  differ?
Turing test
• NLP's core technologies and methodologies trace back to the famous
  Turing Test, proposed in 1950 by Alan Turing (1912–1954), often
  called the father of AI.
NLP Challenges
• Word sense / meaning ambiguity
  Ambiguous headlines:
  - Include your children when baking cookies
  - Safety Experts Say School Bus Passengers Should Be Belted
  Credit: http://stuffsirisaid.com
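To make the ambiguity in the headlines above concrete, here is a minimal Python sketch. The sense inventory `SENSES` and the helper `readings` are hypothetical and written by hand for illustration; they do not come from any real lexicon or library. The sketch only shows that each ambiguous word multiplies the number of possible readings a system has to choose between.

```python
from itertools import product

# Hypothetical, hand-written sense inventory (assumption, not a real lexicon).
SENSES = {
    "belted": ["wearing a seat belt", "struck with a belt"],
    "include": ["let take part", "add as an ingredient"],
}

def readings(tokens):
    """Return every combination of senses for the given tokens.

    Tokens that are missing from SENSES are treated as unambiguous.
    """
    options = [SENSES.get(tok.lower(), [tok]) for tok in tokens]
    return list(product(*options))

headline = "Safety Experts Say School Bus Passengers Should Be Belted".split()
for reading in readings(headline):
    # The last position holds the chosen sense of "Belted".
    print(reading[-1])
```

With one ambiguous word the headline has two readings; every further ambiguous word multiplies that count, which is why disambiguation needs context rather than a lookup alone.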
NLP Challenges
• Language is not static:
      • Language grows
      • Cyber language: BRB (be right back), G2G (got to go), …
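As a rough illustration of why changing language is a challenge, the sketch below expands abbreviations from a fixed lookup table. The table `ABBREVIATIONS` and the function `expand_abbreviations` are hypothetical names chosen for this example. Anything coined after the table was written simply passes through unchanged.

```python
# Hypothetical lookup table (assumption); a real system would need
# continual updates as new abbreviations appear.
ABBREVIATIONS = {
    "brb": "be right back",
    "g2g": "got to go",
}

def expand_abbreviations(text, table=ABBREVIATIONS):
    """Replace whitespace-separated tokens found in the table; keep the rest."""
    return " ".join(table.get(word.lower(), word) for word in text.split())

print(expand_abbreviations("BRB in five"))   # be right back in five
print(expand_abbreviations("TTYL friends"))  # TTYL friends (not in the table)
```

The second call shows the limitation: a static resource cannot keep up with a growing vocabulary, which is one motivation for methods that learn from data.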
NLP Challenges
• Language is compositional: the meaning of a sentence is built from the
  meanings of its words and the way they are combined
NLP Challenges
• Scale
      •   Huge amounts of data
      •   Penn Treebank: ~1M words from the Wall Street Journal
      •   Newswire collection: 500M+ words
      •   Wikipedia: 2.9 billion words (English)
      •   Web: several billion words
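The figures above are easier to compare with a little arithmetic. The sketch below uses the word counts quoted on the slide. The reading speed of 250 words per minute is an assumption made purely for illustration, and `reading_years` is a hypothetical helper.

```python
# Approximate word counts as quoted on the slide.
CORPUS_WORDS = {
    "Penn Treebank (WSJ)": 1_000_000,
    "Newswire collection": 500_000_000,
    "Wikipedia (English)": 2_900_000_000,
}

# Assumption for illustration only: a reader handling 250 words per minute.
WORDS_PER_MINUTE = 250

def reading_years(n_words, wpm=WORDS_PER_MINUTE):
    """Years of non-stop reading needed to get through n_words."""
    minutes = n_words / wpm
    return minutes / (60 * 24 * 365)

baseline = CORPUS_WORDS["Penn Treebank (WSJ)"]
for name, n_words in CORPUS_WORDS.items():
    print(f"{name}: {n_words / baseline:,.0f}x Penn Treebank, "
          f"{reading_years(n_words):.2f} years of reading")
```

Under that assumed reading speed, English Wikipedia alone would take about 22 years of uninterrupted reading, which gives a sense of why NLP at this scale has to be automated.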