MODULE 5 NLP
1. Define Machine Translation. List and explain its types and applications.
Machine Translation (MT) is a subfield of Natural Language Processing (NLP) that deals with the
automatic translation of text or speech from one natural language to another using computer
systems, without human intervention.
Types of Machine Translation:
    1. Rule-Based Machine Translation (RBMT):
            o   Works using grammar rules and dictionaries.
            o   Example: English–Hindi dictionary-based translation (see the sketch after this list).
            o   Advantage: Grammatically accurate for simple sentences.
            o   Limitation: Needs a large number of rules, difficult to maintain.
    2. Statistical Machine Translation (SMT):
            o   Works using probability and statistics from bilingual text data.
            o   Example: Google Translate (earlier versions).
            o   Advantage: Works well with large data.
            o   Limitation: Struggles with rare words and context.
    3. Example-Based Machine Translation (EBMT):
            o   Works using previously translated examples.
            o   Translates by matching new sentences with similar past examples.
            o   Advantage: Good for repetitive content.
            o   Limitation: Limited if example database is small.
    4. Neural Machine Translation (NMT):
            o   Works using deep learning and neural networks.
            o   Example: Modern Google Translate, Microsoft Translator.
            o   Advantage: Produces fluent, natural translations.
            o   Limitation: Requires large computational resources.
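Before moving to applications, here is a minimal sketch of the dictionary-lookup idea behind RBMT (type 1 above). The tiny English–Hindi word list and the single reordering rule are illustrative assumptions, not a real rule base:

    # Minimal sketch of rule-based MT: dictionary lookup plus one reordering rule.
    # The lexicon and the hard-coded SVO -> SOV rule are illustrative only.
    LEXICON = {"i": "main", "eat": "khata hoon", "apples": "seb"}

    def rbmt_translate(sentence: str) -> str:
        words = sentence.lower().split()
        # Rule: Hindi is verb-final (SOV), so move the verb after the object.
        if words == ["i", "eat", "apples"]:
            words = ["i", "apples", "eat"]
        return " ".join(LEXICON.get(w, w) for w in words)

    print(rbmt_translate("I eat apples"))  # -> "main seb khata hoon"

The sketch also shows RBMT's limitation: every sentence pattern needs its own rule, which is why rule bases become hard to maintain.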
Applications of Machine Translation:
    1. Business Communication – Helps companies communicate with international clients by
       translating emails, contracts, and documents.
    2. Education – Assists students and researchers in understanding study material, books, and
       papers written in foreign languages.
    3. E-commerce – Translates product descriptions and customer reviews for global online
       shopping platforms.
    4. Healthcare – Helps doctors and patients from different language backgrounds to
       communicate effectively.
    5. Tourism & Travel – Real-time translation apps guide travelers to understand local signs,
       menus, and conversations.
2. What are language divergences? Give examples in translation.
Language divergences occur when two languages express the same idea in different grammatical,
lexical, or structural ways. In Machine Translation, divergences create difficulties because a direct
word-by-word translation often leads to incorrect or unnatural sentences.
Types of Language Divergences with Examples:
1. Word Order Typology
       Languages differ in the order of Subject (S), Verb (V), and Object (O) in a sentence.
       Examples:
            o   English, German, French, Mandarin → SVO (e.g., I eat apples).
            o   Hindi, Japanese → SOV (Main seb khata hoon).
            o   Irish, Arabic → VSO (Eats the boy an apple).
       VO languages → usually have prepositions (English: to a friend).
       OV languages → usually have postpositions (Hindi: table par).
       Translation issue: Systems must reorder sentence structure.
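A small sketch of this reordering step, assuming the Subject/Verb/Object roles have already been identified by a parser (here they are hard-coded for illustration):

    # Reorder a role-labelled English (SVO) sentence into SOV order,
    # as needed when translating into Hindi or Japanese.
    # The (word, role) pairs would normally come from a syntactic parser.
    def svo_to_sov(tagged):
        order = {"S": 0, "O": 1, "V": 2}  # target order: Subject, Object, Verb
        return [word for word, role in sorted(tagged, key=lambda t: order[t[1]])]

    sentence = [("I", "S"), ("eat", "V"), ("apples", "O")]
    print(" ".join(svo_to_sov(sentence)))  # -> "I apples eat" (SOV word order)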
2. Lexical Divergences
       Occur when words have different meanings, usages, or gaps across languages.
       Examples:
            o   English word bass → can mean a fish (lubina in Spanish) or a musical
                instrument (bajo in Spanish).
            o   English word wall → in German, Wand (inside wall) vs. Mauer (outside wall).
            o   English word brother → in Chinese there are two separate words: gege (elder
                brother) and didi (younger brother).
       Translation issue: Context decides the correct word.
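A toy sketch of context-based lexical choice for the bass example above (the cue-word lists are illustrative assumptions, not a real lexicon):

    # Toy word-sense selection for English "bass" -> Spanish, choosing between
    # "lubina" (the fish) and "bajo" (the instrument) from surrounding words.
    SENSE_CUES = {
        "lubina": {"fish", "river", "caught", "cooked"},
        "bajo": {"music", "guitar", "band", "played"},
    }

    def translate_bass(context_words):
        overlap = {sp: len(cues & set(context_words)) for sp, cues in SENSE_CUES.items()}
        return max(overlap, key=overlap.get)

    print(translate_bass(["he", "played", "bass", "in", "a", "band"]))  # -> "bajo"

Real systems make the same decision with word sense disambiguation models rather than hand-written cue lists.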
3. Morphological Typology
       Languages differ in how many morphemes (smallest meaning units) they pack into words.
    1. Isolating Languages (Vietnamese): one morpheme per word.
    2. Polysynthetic Languages (Siberian Yupik): many morphemes, one word = full sentence.
    3. Agglutinative Languages (Turkish): clear separable morphemes, each with one meaning.
    4. Fusional Languages (Russian): morphemes blend; one morpheme may show case + number
       + gender together.
       Translation issue: Morphologically rich languages need subword handling in MT.
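A minimal sketch of subword segmentation in the spirit of byte-pair encoding (BPE); real systems learn the merge list from a corpus, whereas this one is hand-picked for the Turkish word evlerinizden ("from your houses"):

    # Greedy subword segmentation in the spirit of BPE.
    # The merge list is hand-picked for illustration, not learned from data.
    MERGES = ["ev", "le", "ler", "evler", "in", "ini", "iniz", "de", "den"]

    def segment(word):
        pieces = list(word)   # start from single characters
        for merge in MERGES:  # apply each merge wherever adjacent pieces form it
            i = 0
            while i < len(pieces) - 1:
                if pieces[i] + pieces[i + 1] == merge:
                    pieces[i:i + 2] = [merge]
                else:
                    i += 1
        return pieces

    print(segment("evlerinizden"))  # -> ['evler', 'iniz', 'den']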
4. Referential Density
       Measures how much a language uses explicit pronouns.
       Hot languages (high referential density): English → frequent pronouns (e.g., He is eating).
       Cold languages (low referential density): Chinese, Japanese → omit pronouns (Eating
        instead of He is eating).
       Translation issue: From cold → hot, MT must insert missing pronouns.
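A toy illustration of the cold → hot direction (the subject-detection heuristic and the tracked antecedent are crude assumptions; real systems resolve the referent from discourse context):

    # Toy pronoun insertion when translating a pro-drop ("cold") language
    # into English ("hot"). The antecedent would normally be inferred from
    # the preceding discourse; here it is passed in explicitly.
    def insert_subject(words, antecedent="he"):
        if words and words[0].endswith("ing"):  # crude test: clause lacks a subject
            return [antecedent, "is"] + words
        return words

    print(" ".join(insert_subject(["eating"])))  # -> "he is eating"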
3. Explain the Encoder–Decoder architecture used in Neural Machine Translation.
Ans:
The Encoder–Decoder architecture is the most widely used framework in Neural Machine Translation
(NMT). It is based on deep learning, where one network (encoder) reads the input sentence in the
source language and another network (decoder) generates the translation in the target language.
Working of Encoder–Decoder:
    1. Encoder:
            o   Takes the input sentence in the source language (e.g., English).
            o   Converts each word into a vector (word embedding).
            o   Uses RNN, LSTM, or GRU to process the sequence of words.
            o   Produces a context vector (fixed-length representation) summarizing the whole
                sentence.
    2. Decoder:
            o   Takes the context vector from the encoder.
            o   Generates the target sentence (e.g., Hindi) word by word.
            o   At each step, it predicts the next word using probabilities.
            o   Example: Input: “I am happy” → Output: “Main khush hoon.”
   3. Training:
           o    The system is trained on large parallel corpora (source–target pairs).
           o    The goal is to minimize the difference between predicted output and correct
                translation.
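A minimal sketch of this RNN-based encoder–decoder, assuming PyTorch; the vocabulary sizes, dimensions, and greedy decoding loop are toy assumptions, not a production system:

    # Minimal GRU encoder-decoder (no attention): the encoder compresses the
    # source sentence into one fixed-length context vector, and the decoder
    # generates target words from it one step at a time.
    import torch
    import torch.nn as nn

    SRC_VOCAB, TGT_VOCAB, EMB, HID = 100, 120, 32, 64

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(SRC_VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)

        def forward(self, src):            # src: (batch, src_len) of word ids
            _, h = self.rnn(self.emb(src))
            return h                       # fixed-length context vector

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(TGT_VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, TGT_VOCAB)

        def forward(self, prev_word, h):   # one decoding step
            o, h = self.rnn(self.emb(prev_word), h)
            return self.out(o), h          # scores over the target vocabulary

    enc, dec = Encoder(), Decoder()
    src = torch.randint(0, SRC_VOCAB, (1, 5))   # a 5-word source sentence
    h = enc(src)                                # encode -> context vector
    word = torch.zeros(1, 1, dtype=torch.long)  # start-of-sentence id (assumed 0)
    for _ in range(4):                          # greedy word-by-word decoding
        scores, h = dec(word, h)
        word = scores.argmax(-1)                # pick the most probable next word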
Modern NMT systems replace the RNN encoder and decoder with stacked Transformer blocks, which work as follows:
Encoder Block
   1. Self-Attention → looks at all words in the input.
   2. Add & Norm → stabilize.
   3. Feed-Forward → small NN for each word.
   4. Add & Norm → stabilize again.
Decoder Block
   1. Masked Self-Attention → looks at past words only.
   2. Add & Norm.
   3. Cross-Attention → looks at encoder output (input sentence).
   4. Add & Norm.
   5. Feed-Forward.
   6. Add & Norm.
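A sketch of one such encoder block in PyTorch, matching steps 1–4 above (the model dimension and head count are toy assumptions):

    # One Transformer encoder block: self-attention -> add & norm ->
    # feed-forward -> add & norm. Dimensions are illustrative only.
    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        def __init__(self, d=64, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
            self.norm1 = nn.LayerNorm(d)
            self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
            self.norm2 = nn.LayerNorm(d)

        def forward(self, x):               # x: (batch, seq_len, d)
            a, _ = self.attn(x, x, x)       # 1. self-attention over all words
            x = self.norm1(x + a)           # 2. add & norm (residual connection)
            x = self.norm2(x + self.ff(x))  # 3-4. feed-forward, then add & norm
            return x

    x = torch.randn(1, 5, 64)               # five word vectors
    print(EncoderBlock()(x).shape)          # torch.Size([1, 5, 64])

A decoder block has the same shape but uses masked self-attention and adds a cross-attention step over the encoder output.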
Limitations of Basic Encoder–Decoder:
       The context vector is fixed-length, so long sentences lose information.
       Translation may be inaccurate for complex sentences.
       Attention (used in the Transformer blocks above) mitigates this by letting the decoder look at all encoder states instead of a single fixed vector.
4. How is machine translation evaluated? Discuss BLEU and METEOR scores.
MT Evaluation
Machine Translation (MT) systems are evaluated mainly along two dimensions:
   1. Adequacy – how well the translation captures the exact meaning of the source sentence
      (faithfulness or fidelity).
   2. Fluency – how natural, clear, and grammatically correct the translation is in the target
      language.
       Human judgment is the most reliable way to evaluate MT.
       Raters score translations on scales (e.g., 1–5 or 1–100) based on fluency and
       adequacy.
       Human evaluation is slow and costly, so automatic metrics are often used. Common
       automatic metrics include BLEU, METEOR, ROUGE, TER, etc.
1. BLEU (Bilingual Evaluation Understudy)
Definition:
BLEU is one of the first and most popular automatic metrics. It measures how many n-grams (word sequences) in the machine output match reference (human) translations.
How it works:
    1. n-gram Precision: It calculates the overlap of 1-grams, 2-grams, 3-grams, and 4-grams between the system output and the reference.
    2. Brevity Penalty: If the machine output is too short compared to the reference, BLEU reduces the score.
    3. The final score is between 0 and 1 (or 0–100).
Example:
   o Reference: “the cat is on the mat”
   o MT Output: “the cat sat on the mat”
   o Common n-grams: “the cat”, “on the mat”.
   o BLEU score will be high because of overlaps, even though “is” vs “sat” is different.
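Formally, BLEU = BP · exp(Σₙ wₙ log pₙ), where pₙ is the n-gram precision and BP the brevity penalty. The example can be checked with NLTK's BLEU implementation (a sketch assuming the nltk package is installed; smoothing is used because this short sentence pair has no matching 4-grams, which would otherwise force the score to 0):

    # BLEU for the example above, using NLTK (pip install nltk).
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = ["the", "cat", "is", "on", "the", "mat"]
    candidate = ["the", "cat", "sat", "on", "the", "mat"]
    score = sentence_bleu([reference], candidate,
                          smoothing_function=SmoothingFunction().method1)
    print(round(score, 3))  # fairly high despite the "is" vs "sat" mismatch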
2. METEOR (Metric for Evaluation of Translation with Explicit ORdering)
Definition:
METEOR was developed to address BLEU’s weaknesses. It tries to match words semantically
and not just by surface form.
How it works:
    1. Checks exact word matches.
    2. Includes stem matches (run vs running).
    3. Includes synonym matches (big vs large).
    4. Adds word order penalties (if words are jumbled, score decreases).
    5. Final score ranges from 0 to 1.
Example:
    o Reference: “the boy is playing football”
    o MT Output: “the kid plays soccer”
    o BLEU score will be low (few exact matches).
    o METEOR score will be higher because it matches “boy–kid” (synonym),
      “football–soccer” (synonym), and “play–plays” (stem).
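A matching sketch for METEOR with NLTK (assumes nltk and its WordNet data are available; whether pairs like boy–kid or football–soccer actually count as synonyms depends on WordNet's synsets, so the exact score may vary):

    # METEOR for the example above, using NLTK.
    # WordNet is needed for the stem and synonym matching stages.
    import nltk
    from nltk.translate.meteor_score import meteor_score

    nltk.download("wordnet", quiet=True)
    reference = ["the", "boy", "is", "playing", "football"]
    candidate = ["the", "kid", "plays", "soccer"]
    print(round(meteor_score([reference], candidate), 3))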