Natural Language Processing
By
    Dr. Parminder Kaur
             What is NLP?
• Natural Language Processing (NLP)
  – Computers use (analyze, understand,
    generate) natural language
  – A somewhat applied field
• Computational Linguistics (CL)
  – Computational aspects of the human
    language faculty
  – More theoretical
           Goals of NLP
• Scientific Goal
  – Identify the computational machinery
    needed for an agent to exhibit various
    forms of linguistic behavior
• Engineering Goal
  – Design, implement, and test systems
    that process natural languages for
    practical applications
                Applications
• speech processing: get flight information or book
  a hotel over the phone
• information extraction: discover names of people
  and events they participate in, from a document
• machine translation: translate a document from
  one human language into another
• question answering: find answers to natural
  language questions in a text collection or
  database
• summarization: generate a short biography of
  Noam Chomsky from one or more news articles
            General Themes
•   Ambiguity of Language
•   Language as a formal system
•   Computation with human language
•   Rule-based vs. Statistical Methods
•   The need for efficiency
              Topic Ideas
1. Text to Speech – artificial voices
2. Speech Recognition – understanding
3. Textual Analysis – readability
4. Plagiarism Detection – candidate selection
5. Intelligent Agents – machine interaction
Text to Speech – artificial voice
• Text Input
• Break text into phonemes
  – Match phonemes to voice elements
  – Concatenate voice elements
  – Manipulate pitch and spacing
• Output results
• Research question: How can a human voice be
  used to produce an artificial voice?
• Model Talker - opportunities for active, hands-on
  research
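A minimal Python sketch of the concatenation steps above. The lexicon (WORD_TO_PHONEMES), the phoneme waveform bank (PHONEME_WAVES), and the pitch handling are invented placeholders; this is not how ModelTalker or any production synthesizer works.

# Sketch of concatenative text-to-speech with invented phoneme data.
import numpy as np

SAMPLE_RATE = 16000

# Hypothetical data: each phoneme maps to a short waveform snippet.
PHONEME_WAVES = {
    "HH": np.random.randn(800) * 0.01,   # placeholder noise burst
    "AH": np.sin(2 * np.pi * 220 * np.arange(2400) / SAMPLE_RATE),
    "L":  np.sin(2 * np.pi * 180 * np.arange(1600) / SAMPLE_RATE),
    "OW": np.sin(2 * np.pi * 240 * np.arange(3200) / SAMPLE_RATE),
}
WORD_TO_PHONEMES = {"hello": ["HH", "AH", "L", "OW"]}   # toy lexicon

def synthesize(text, pitch_scale=1.0, gap_ms=30):
    """Break text into phonemes, concatenate voice elements, add spacing."""
    gap = np.zeros(int(SAMPLE_RATE * gap_ms / 1000))
    pieces = []
    for word in text.lower().split():
        for ph in WORD_TO_PHONEMES.get(word, []):
            wave = PHONEME_WAVES[ph]
            # Crude pitch/duration manipulation by resampling (sketch only).
            idx = np.arange(0, len(wave), pitch_scale)
            pieces.append(np.interp(idx, np.arange(len(wave)), wave))
        pieces.append(gap)   # spacing between words
    return np.concatenate(pieces) if pieces else np.zeros(0)

audio = synthesize("hello")
print(audio.shape)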
         Speech Recognition
• Spoken Input
• Identify words and phonemes in speech
  – Generate text for recognized word parts
  – Concatenate text elements
  – Perform spelling, grammar and context checking
• Output results
• Research question: How can speech recognition
  assist a deaf student taking notes in class?
• VUST – Villanova University Speech Transcriber
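A minimal sketch of the spoken-input-to-text pipeline, using the third-party SpeechRecognition package (assumed installed) and a hypothetical audio file name. It is illustrative only and is not the actual VUST implementation.

# Sketch: spoken input -> recognized text -> output, via SpeechRecognition.
import speech_recognition as sr

def transcribe(wav_path):
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:        # spoken input
        audio = recognizer.record(source)         # read the whole file
    try:
        # Identify the words and generate text (Google Web Speech API).
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""                                 # speech was unintelligible

if __name__ == "__main__":
    notes = transcribe("lecture_segment.wav")     # hypothetical file name
    print(notes)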
  Textual Analysis - Readability
• Text Input
• Analyze text & estimate “readability”
  – Grade level of writing
  – Consistency of writing
  – Appropriateness for a certain educational level
• Output results
• Research question: How can a computer
  analyze text and measure readability?
• Opportunities for hands-on research
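One concrete way to estimate grade level is the Flesch-Kincaid formula. The sketch below uses a naive vowel-group syllable counter; real readability tools estimate syllables more carefully and combine several measures.

# Sketch: Flesch-Kincaid grade level with a crude syllable counter.
import re

def count_syllables(word):
    # Approximate syllables as runs of vowels (crude but serviceable here).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    if not sentences or not words:
        return 0.0
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

print(round(flesch_kincaid_grade("The cat sat on the mat. It was happy."), 2))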
         Plagiarism Detection
• Text Input
• Analyze text & locate “candidates”
  – Find one or more passages that might be plagiarized
  – Algorithm tries to do what a teacher does
  – Search on Internet for candidate matches
• Output results
• Research question: What algorithms work like
  humans when finding plagiarism?
• Experimental CS research
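A minimal sketch of candidate selection: passages are compared by word n-gram overlap (Jaccard similarity). A real detector would then search the Internet for the high-scoring candidates; the source texts and threshold here are invented for illustration.

# Sketch: n-gram shingling and Jaccard overlap for candidate selection.
def shingles(text, n=5):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def find_candidates(submission, sources, threshold=0.2):
    """Return (source_id, score) pairs whose overlap meets the threshold."""
    sub = shingles(submission)
    candidates = []
    for source_id, text in sources.items():
        score = jaccard(sub, shingles(text))
        if score >= threshold:
            candidates.append((source_id, score))
    return candidates

sources = {"essay_A": "the quick brown fox jumps over the lazy dog near the river bank"}
print(find_candidates("a quick brown fox jumps over the lazy dog by the river", sources))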
           Intelligent Agents
• Example: ELIZA
• AIML: Artificial Intelligence Markup Language
• Human types something
• Computer parses, “understands”, and generates
  response
• Response is viewed by human
• Research question: How can computers
  “understand” and “generate” human writing?
• Also good area for experimentation
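A minimal ELIZA-style sketch: the computer "parses" the input with regular expressions and "generates" a response by reflecting it back. The patterns below are invented for illustration and are not taken from the original ELIZA script or any AIML file.

# Sketch: pattern-matching chatbot in the spirit of ELIZA.
import re

RULES = [
    (r"i need (.*)", "Why do you need {0}?"),
    (r"i am (.*)", "How long have you been {0}?"),
    (r"(.*)\?", "Why do you ask that?"),
]

def respond(user_input):
    for pattern, template in RULES:
        match = re.match(pattern, user_input.lower().strip())
        if match:
            return template.format(*match.groups())
    return "Please tell me more."          # fallback when nothing matches

print(respond("I am feeling anxious"))     # -> How long have you been feeling anxious?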
     Digital Image Processing
• The images in previous slides are digital
  (now), but they are NOT the result of DIP
• Digital Image Processing is
  – Processing digital images by a digital
    computer
• DIP requires a digital computer and other
  supporting technologies (e.g., data
  storage, display and transmission)
Photography
Motion Pictures
Law Enforcement and Biometrics
   Remote Sensing
Hurricane Andrew, imaged by NOAA GOES; America at night (Nov. 27, 2000)
         Thermal Images
         Operate in infrared frequency
The human body disperses heat (red pixels); different colors indicate varying temperatures
 Medical Diagnostics
    Operate in X-ray frequency
Chest and head X-ray images
         PET and Astronomy
         Operate in gamma-ray frequency
Positron Emission Tomography; the Cygnus Loop in the constellation of Cygnus
Cartoon Pictures (Non-photorealistic)
Synthetic Images in Gaming
  Age of Empires III by Ensemble Studios
Virtual Reality (Photorealistic)
   Speech Recognition in AI
• AI-based speech recognition has made it
  possible for computers to understand and
  recognize human speech, enabling
  frictionless interaction between humans
  and machines.
          How does Speech
        Recognition in AI Work?
•   Recording: The first stage uses the voice recorder built into the device. The user's
    voice is captured and stored as an audio signal.
•   Sampling: Computers and other electronic devices work with data in discrete form,
    whereas a sound wave is continuous. For the system to understand and process it, the
    signal is therefore converted from continuous to discrete values, sampled at a
    particular frequency.
•   Transforming to Frequency Domain: In this stage the audio signal is converted from the
    time domain to the frequency domain. This step is important because much of the useful
    audio information can be examined in the frequency domain. The time domain describes a
    signal (or any mathematical function or time series) as it varies over time, while the
    frequency domain describes the same signal in terms of its component frequencies.
•   Information Extraction from Audio: This stage is the foundation of every speech
    recognition system. The audio is transformed into a usable vector of features, using
    extraction methods such as PLP or MFCC.
•   Recognition of Extracted Information: This step applies the idea of pattern matching:
    the extracted features are compared against pre-defined reference data, and the best
    match is taken as the recognized speech. One of the most widely used services for this
    is the Google Speech API. (A minimal code sketch of this pipeline follows the list.)
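A minimal sketch of the pipeline above using the librosa library (assumed installed): discrete sampling, a frequency-domain transform, MFCC feature extraction, and a toy template-matching step. The file name and templates are hypothetical; a real recognizer would use HMMs or neural networks rather than nearest-template matching.

# Sketch: sampling -> frequency domain -> MFCC features -> pattern matching.
import numpy as np
import librosa

# Sampling: the continuous waveform is read as discrete samples at 16 kHz.
signal, sr = librosa.load("command.wav", sr=16000)      # hypothetical file

# Transforming to the frequency domain via the short-time Fourier transform.
spectrogram = np.abs(librosa.stft(signal))

# Information extraction from audio: MFCC feature vectors.
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
feature_vector = mfcc.mean(axis=1)                       # one summary vector per clip

# Recognition of extracted information by matching against stored templates.
templates = {"yes": np.zeros(13), "no": np.ones(13)}     # placeholder reference data
best = min(templates, key=lambda w: np.linalg.norm(feature_vector - templates[w]))
print("Recognized:", best)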
        Techniques Used in AI for
          Speech Recognition
•   Hidden Markov Models (HMMs): HMMs are statistical models that
    are widely used in speech recognition AI. HMMs work by modelling
    the probability distribution of speech sounds, and then using these
    models to match input speech to the most likely sequence of sounds.
•   Deep Neural Networks (DNNs): DNNs are a type of machine
    learning model that is used extensively in speech recognition AI.
    DNNs work by using a hierarchy of layers to model complex
    relationships between the input speech and the corresponding text
    output.
•   Convolutional Neural Networks (CNNs): CNNs are a type of machine
    learning model commonly used in image recognition that has also been
    applied to speech recognition AI. CNNs work by applying filters to
    input speech signals to identify relevant features.
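To make the HMM idea concrete, the sketch below runs the Viterbi algorithm over toy transition and emission probabilities to recover the most likely hidden "sound" sequence for a series of observed acoustic symbols. All of the states and numbers are invented for illustration.

# Sketch: Viterbi decoding over a toy hidden Markov model.
import numpy as np

states = ["s", "ih", "t"]                       # hypothetical phone states
trans = np.array([[0.6, 0.3, 0.1],              # P(next state | state)
                  [0.1, 0.6, 0.3],
                  [0.1, 0.2, 0.7]])
emit = np.array([[0.7, 0.2, 0.1],               # P(observation | state)
                 [0.1, 0.7, 0.2],
                 [0.2, 0.1, 0.7]])
start = np.array([0.6, 0.2, 0.2])
obs = [0, 1, 1, 2]                              # observed acoustic symbols

def viterbi(obs, start, trans, emit):
    scores = np.log(start) + np.log(emit[:, obs[0]])
    back = []
    for o in obs[1:]:
        step = scores[:, None] + np.log(trans) + np.log(emit[:, o])[None, :]
        back.append(step.argmax(axis=0))        # best predecessor per state
        scores = step.max(axis=0)
    path = [int(scores.argmax())]
    for ptr in reversed(back):                  # trace the best path backwards
        path.append(int(ptr[path[-1]]))
    return list(reversed(path))

print([states[i] for i in viterbi(obs, start, trans, emit)])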
    Some recent advancements in
    speech recognition AI include:
•   Transformer-based models: Transformer-based models, such as BERT
    and GPT, have been highly successful in natural language processing
    tasks, and are now being applied to speech recognition AI.
•   End-to-end models: End-to-end models are designed to directly map
    speech signals to text, without the need for intermediate steps. These
    models have shown promise in improving the accuracy and efficiency of
    speech recognition AI.
•   Multimodal models: Multimodal models combine speech recognition AI
    with other modalities, such as vision or touch, to enable more natural and
    intuitive interactions between humans and machines.
•   Data augmentation: Data augmentation techniques, such as adding
    background noise or changing the speaking rate, can be used to generate
    more training data for speech recognition AI models, improving their
    accuracy and robustness.
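A minimal sketch of two of the augmentation ideas mentioned above, applied to a NumPy waveform: mixing in background noise at a chosen signal-to-noise ratio and changing the speaking rate by crude resampling. Production pipelines use dedicated audio-augmentation tools.

# Sketch: simple waveform augmentations for speech training data.
import numpy as np

def add_noise(signal, snr_db=20.0):
    """Mix in white noise so the result has roughly the requested SNR."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.randn(len(signal)) * np.sqrt(noise_power)
    return signal + noise

def change_rate(signal, rate=1.1):
    """Speed up (>1) or slow down (<1) by linear resampling (pitch shifts too)."""
    idx = np.arange(0, len(signal), rate)
    return np.interp(idx, np.arange(len(signal)), signal)

clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)   # 1-second toy signal
augmented = [add_noise(clean, snr_db=15), change_rate(clean, rate=0.9)]
print([len(a) for a in augmented])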