Lecture 13

Lexical Semantics

COMP-550
Oct 17, 2017
Outline
Semantics
Lexical semantics
Lexical semantic relations
WordNet
Word Sense Disambiguation
• Lesk algorithm
• Yarowsky’s algorithm

2
Semantics
What is “Semantics”?
The study of meaning in language
“When I use a word”, Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean – neither more nor less.”
Lewis Carroll, Through the Looking-Glass

What does meaning mean?


• Relationship of linguistic expression to the real world
• Relationship of linguistic expressions to each other

3
This Lecture
We’ll start by focusing on the meaning of words—
lexical semantics.
Later on:
• meaning of phrases and sentences
• how to construct that from meanings of words

4
From Language to the World
What does telephone mean?
• Picks out all of the objects in the world that are
telephones (its referents)
Its extensional definition
[Figure: objects in the world, divided into telephones and not telephones]

5
Relationship of Linguistic Expressions

How would you define telephone, e.g., to a three-year-old or to a
friendly Martian?

6
Dictionary Definition
http://dictionary.reference.com/browse/telephone

Its intensional definition


• The necessary and sufficient conditions to be a telephone
This presupposes you know what “apparatus”, “sound”,
“speech”, etc. mean.
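
The contrast between the extensional and the intensional definition can be sketched in a few lines of Python. This is only an illustration: the toy world, its attribute names, and the two conditions used as "necessary and sufficient" are all made up for the example and are far cruder than a real dictionary definition.

```python
# Toy "world": every object is a dict of made-up attributes (all hypothetical).
WORLD = [
    {"name": "desk phone", "transmits_speech": True, "works_at_distance": True},
    {"name": "smartphone", "transmits_speech": True, "works_at_distance": True},
    {"name": "megaphone", "transmits_speech": True, "works_at_distance": False},
    {"name": "toaster", "transmits_speech": False, "works_at_distance": False},
]

# Extensional definition: simply enumerate the referents.
TELEPHONES_EXTENSION = {"desk phone", "smartphone"}


def is_telephone(obj):
    """Intensional definition: a (simplified) necessary-and-sufficient test."""
    return obj["transmits_speech"] and obj["works_at_distance"]


# Applying the intension to the world recovers the extension.
derived_extension = {obj["name"] for obj in WORLD if is_telephone(obj)}
print(derived_extension == TELEPHONES_EXTENSION)
```

Note that the predicate only works if the attributes themselves are already understood, which mirrors the point above about presupposing the meaning of “apparatus”, “sound”, and “speech”.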

7
Lexical Semantics Jargon
Lexeme: Pairing of a particular form (orthographic or
phonological) with its meaning.
For example, the lexeme BANK (noun) consists of bank and
banks, but not banker. BANKER is a lexeme of its own!
Lexicon: Finite list of lexemes
Lemma: The grammatical form that is used to represent
a lexeme.
The lemma for sing, sang, sung is sing. A specific form (e.g.,
sang) is called a wordform.
Lemmatization: The process of mapping a wordform to
a lemma.
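
As a rough illustration of lemmatization, the mapping from wordforms to lemmas can be written as a lookup table. The table below is hand-written for this example (an assumption, not a real lexicon); practical lemmatizers combine such exception lists with morphological rules.

```python
# Hypothetical, hand-written lookup table: wordform -> lemma.
LEMMA_TABLE = {
    "sing": "sing", "sang": "sing", "sung": "sing", "sings": "sing",
    "bank": "bank", "banks": "bank",
    "banker": "banker", "bankers": "banker",  # BANKER is a lexeme of its own
}


def lemmatize(wordform):
    """Map a wordform to its lemma; unknown wordforms are returned unchanged."""
    return LEMMA_TABLE.get(wordform.lower(), wordform)


print([lemmatize(w) for w in ["sang", "Banks", "banker", "telephone"]])
# ['sing', 'bank', 'banker', 'telephone']
```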

8
Sense and Reference (Frege, 1892)
Frege was one of the first to distinguish between the
sense of a term, and its reference.

Same referent, different senses:

Venus

the morning star

the evening star

9
Word Senses
The meaning of a lemma can vary enormously given
the context:
• A bank can hold investments in a custodial account in the
client’s name.
• As agriculture burgeons on the east bank, the river will shrink
even more.
A word sense (or simply sense) is a discrete
representation of one aspect of the meaning of a word.
Next: Relations between different senses (and, more generally,
between words)
Later: How to disambiguate between varying senses?

10
Lexical Semantic Relations
How specifically do terms relate to each other? Here
are some ways:
Hypernymy/hyponymy
Synonymy
Antonymy
Homonymy
Polysemy
Metonymy
Synecdoche
Holonymy/meronymy

11
Hypernymy/Hyponymy
ISA relationship

Hyponym     Hypernym
monkey      mammal
Montreal    city
red wine    beverage
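
Because ISA links chain together, a hyponym inherits every hypernym further up the hierarchy. The sketch below stores direct links in a dictionary and walks upward. The intermediate entries (wine, animal) are assumptions added for the example, and each term is given a single direct hypernym, which is a simplification.

```python
# Hypothetical ISA links: hyponym -> direct hypernym.
DIRECT_HYPERNYM = {
    "monkey": "mammal",
    "mammal": "animal",
    "Montreal": "city",
    "red wine": "wine",
    "wine": "beverage",
}


def hypernym_chain(term):
    """Follow ISA links upward and return all inherited hypernyms in order."""
    chain = []
    while term in DIRECT_HYPERNYM:
        term = DIRECT_HYPERNYM[term]
        chain.append(term)
    return chain


def is_a(hyponym, hypernym):
    """True if `hypernym` is reachable from `hyponym` via ISA links."""
    return hypernym in hypernym_chain(hyponym)


print(hypernym_chain("red wine"))   # ['wine', 'beverage']
print(is_a("monkey", "animal"))     # True
```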

12
Synonymy and Antonymy
Synonymy
(Roughly) same meaning
offspring descendant spawn
happy joyful merry

Antonymy
(Roughly) opposite meaning
Word          Antonym
happy         sad
descendant    ancestor

13
Homonymy
Same form, different (and unrelated) meaning
Homophone – same sound
• e.g., son vs. sun
Homograph – same written form
• e.g., lead (noun) vs. lead (verb)

14
Polysemy
Multiple related meanings
S: (n) newspaper, paper (a daily or weekly publication on
folded sheets; contains news and articles and
advertisements) "he read his newspaper at breakfast"
S: (n) newspaper, paper, newspaper publisher (a business
firm that publishes newspapers) "Murdoch owns many
newspapers"
S: (n) newspaper, paper (the physical object that is the
product of a newspaper publisher) "when it began to rain he
covered his head with a newspaper"
S: (n) newspaper, newsprint (cheap paper made from wood
pulp and used for printing newspapers) "they used bales of
newspaper every day"
15
Homonymy vs Polysemy
Homonymy: unrelated meanings
Polysemy: related meanings
S: (n) position, place (the particular portion of space occupied by
something) "he put the lamp back in its place"
S: (n) military position, position (a point occupied by troops for
tactical reasons)
S: (n) position, view, perspective (a way of regarding situations or
topics etc.) "consider what follows from the positivist view"
S: (n) position, posture, attitude (the arrangement of the body and
its limbs) "he assumed an attitude of surrender"
S: (n) status, position (the relative position or standing of things or
especially persons in a society) "he had the status of a minor"; "the
novel attained the status of a classic"; "atheists do not enjoy a
favorable position in American life"
S: (n) position, post, berth, office, spot, billet, place, situation (a
job in an organization) "he occupied a post in the treasury"

16
Metonymy
Substitution of one entity for another related one
We ordered many delicious dishes at the restaurant.
I worked for the local paper for five years.
Quebec City is cutting our budget again.
The loonie is at an 11-year low.

Synecdoche – a specific kind of metonymy involving
whole-part relations
All hands on deck!
Don’t be a <censored body part>

17
Holonymy/meronymy
Some kind of whole/part relationship
Subtype               Holonym   Meronym
groups and members    class     student
whole and part        car       windshield
whole and substance   chair     wood

18
Quiz
Classify the following examples in terms of what lexical
semantic relation they exhibit
cold              freezing
they’re           their
hair              head
enemy             friend
cut (hair)        cut (bread)
George Clooney    actor

19
WordNet (Miller et al., 1990)
WordNet is a lexical resource organized by synsets
• Nodes: synsets
• Edges: lexical semantic relation between two synsets
Separate hierarchies for different parts of speech
• Nouns, verbs, adjectives, adverbs

20
A Synset Entry
S: (n) hand, manus, mitt, paw (the (prehensile) extremity of the superior
limb) "he had the hands of a surgeon"; "he extended his mitt"
direct hyponym / full hyponym
S: (n) fist, clenched fist (a hand with the fingers clenched in the palm (as for hitting))
S: (n) hooks, meat hooks, maulers (large strong hand (as of a fighter)) "wait till I get my
hooks on him"
S: (n) right, right hand (the hand that is on the right side of the body) "he writes with his
right hand but pitches with his left"; "hit him with quick rights to the body"
S: (n) left, left hand (the hand that is on the left side of the body) "jab with your left"
part meronym
direct hypernym / inherited hypernym / sister term
part holonym
S: (n) arm (a human limb; technically the part of the superior limb between the shoulder
and the elbow but commonly used to refer to the whole superior limb)
S: (n) homo, man, human being, human (any living or extinct member of the family
Hominidae characterized by superior intelligence, articulate speech, and erect carriage)
derivationally related form

http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=hand&i=8&h=1100000000000000000000000#c

21
WordNet Has an NLTK Interface
>>> from nltk.corpus import wordnet

Some useful functions:


>>> wordnet.synsets(<query_term>)
>>> wordnet.synset(<synset_name>)

Remember you can use dir and help to get a list of
functions in Python.
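
A small sketch of how these calls might be wrapped, assuming NLTK is installed and the WordNet data has been fetched once with nltk.download("wordnet"). The helper names list_senses and direct_hypernyms are made up for this example; synsets, synset, name, definition, and hypernyms are the NLTK calls being illustrated.

```python
def list_senses(word):
    """Return (synset name, definition) pairs for every WordNet sense of `word`."""
    # Imported lazily so the helper can be defined even where NLTK is missing.
    from nltk.corpus import wordnet
    return [(syn.name(), syn.definition()) for syn in wordnet.synsets(word)]


def direct_hypernyms(synset_name):
    """Return the names of the direct hypernyms of one synset, e.g. 'hand.n.01'."""
    from nltk.corpus import wordnet
    return [h.name() for h in wordnet.synset(synset_name).hypernyms()]


# Example usage (requires the WordNet corpus to be available):
#     for name, gloss in list_senses("hand"):
#         print(name, "-", gloss)
#     print(direct_hypernyms("hand.n.01"))
```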

22
Word Sense Disambiguation
Figuring out which word sense is expressed in context
His hands were tired from hours of typing.
→ hand.n.01

Due to her superior education, her hand was flowing and
graceful.
→ hand.n.03

General idea: use words in the context to disambiguate.
Which words above would help with this?

23
Possible Computational Approaches
A heuristic algorithm
• Lesk’s algorithm
Supervised machine learning
• Possible, but requires a lot of work to annotate word
sense information, which we want to avoid
Unsupervised, or minimally supervised machine
learning
• Yarowsky’s algorithm

24
Lesk’s Algorithm (1986)
More like a family of algorithms which, in essence,
choose the sense whose dictionary definition shares
the most words with the target word’s neighborhood.

Steps to disambiguate word 𝑤:
1. Construct a bag-of-words representation of the context, 𝐵
2. For each candidate sense 𝑠ᵢ of word 𝑤:
• Calculate a signature of the sense by taking all of the words
in the dictionary definition of 𝑠ᵢ
• Compute Overlap(𝐵, signature(𝑠ᵢ))
3. Select the sense with the highest overlap score
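
The three steps above can be sketched in plain Python. This is a minimal version of the simplified Lesk idea, not a reference implementation: the two glosses are invented for the example (they are not WordNet text), tokenization is a naive whitespace split, and Overlap is taken to be the number of shared word types.

```python
# Hypothetical mini-dictionary: sense label -> gloss (not real WordNet text).
SENSES = {
    "bank (financial)": "an institution that accepts deposits and lends money",
    "bank (river)": "sloping land beside a river or other body of water",
}


def bag_of_words(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {tok.strip(".,;:!?\"'()").lower() for tok in text.split()} - {""}


def overlap(context_bag, signature):
    """Number of word types shared by the context and a sense signature."""
    return len(context_bag & signature)


def simplified_lesk(context, senses):
    """Return (best sense, all scores) for the target word in `context`."""
    context_bag = bag_of_words(context)                    # step 1
    scores = {
        sense: overlap(context_bag, bag_of_words(gloss))   # step 2
        for sense, gloss in senses.items()
    }
    return max(scores, key=scores.get), scores             # step 3


best, scores = simplified_lesk(
    "The river rose and water covered the land beside the bank.", SENSES
)
print(best, scores)
```

In this run the financial sense scores a point only through the uninformative word "and", which hints at why stopword filtering matters. Ties are broken in favour of whichever sense is listed first.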

25
Financial Bank or Riverbank?

Construct from definitions of all senses of context words

26
Model Variations
Which dictionary to use? NLTK?
Use only dictionary definitions? Or include example
sentences?
Ignore uninformative stopwords (e.g., the, a, of)?
Lemmatize when considering matches (tomatoes
matches tomato)?
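
Each of these choices changes what ends up in a sense's signature. The sketch below exposes them as switches. The stopword list is a tiny illustrative one, and naive_lemma is a deliberately crude stand-in for a real lemmatizer (it only strips plural-like endings and will mangle many words), so treat both as assumptions.

```python
STOPWORDS = {"the", "a", "an", "of", "and", "or", "that", "at", "in"}  # tiny illustrative list


def naive_lemma(token):
    """Very crude stand-in for a lemmatizer: strip a plural-like ending."""
    if token.endswith("oes"):
        return token[:-2]          # tomatoes -> tomato
    if token.endswith("s") and not token.endswith("ss"):
        return token[:-1]          # deposits -> deposit
    return token


def signature(gloss, examples=(), use_examples=False,
              drop_stopwords=False, lemmatize=False):
    """Build a sense signature under the chosen model variations."""
    text = (gloss + " " + " ".join(examples)) if use_examples else gloss
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if drop_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if lemmatize:
        tokens = [naive_lemma(t) for t in tokens]
    return set(tokens)


print(signature("an institution that accepts deposits",
                drop_stopwords=True, lemmatize=True))
```

With both switches on, a context containing "deposit" would now match a gloss containing "deposits", and words such as "that" no longer contribute to the overlap.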

27
Exercise
Run the Lesk algorithm using NLTK/WordNet. Ignore
stop words, include examples, count lemma overlap.
Consider only the top two senses of bank.
1. I’ll deposit the cheque at the bank.
2. The bank overflowed and water flooded the town.

28
Yarowsky’s Algorithm (1995)
A method based on bootstrapping
Goal: Learn a classifier for a target word
Steps:
1. Gather a data set with target word to be disambiguated
2. Automatically label a small seed set of examples
3. Repeat the following for a while:
• Train a supervised learning algorithm from the seed set
• Apply the supervised model to the entire data set
• Keep the highly confident classification outputs to be the
new seed set
4. Use the last model as the final model
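
The bootstrapping loop can be sketched as follows. This is a toy under several assumptions: the classifier is a simple word-vote counter rather than the classifier used in the original paper, each context is a set of words with the target word already removed, the original seeds are always kept, and the confidence threshold and round limit are arbitrary.

```python
from collections import Counter, defaultdict


def train(contexts, labels):
    """Count how often each context word co-occurs with each sense label."""
    counts = defaultdict(Counter)
    for idx, label in labels.items():
        for word in contexts[idx]:
            counts[word][label] += 1
    return counts


def classify(counts, context):
    """Return (label, confidence), or (None, 0.0) if no word has been seen."""
    votes = Counter()
    for word in context:
        votes.update(counts.get(word, {}))
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    label, top = votes.most_common(1)[0]
    return label, top / total


def bootstrap(contexts, seeds, threshold=0.8, max_rounds=10):
    """Grow the labelled set from `seeds` until it stops changing."""
    labels = dict(seeds)
    for _ in range(max_rounds):
        counts = train(contexts, labels)              # train on the current seed set
        new_labels = dict(seeds)                      # original seeds are always kept
        for idx, context in enumerate(contexts):      # apply to the entire data set
            if idx in seeds:
                continue
            label, conf = classify(counts, context)
            if label is not None and conf >= threshold:
                new_labels[idx] = label               # keep confident outputs only
        if new_labels == labels:                      # nothing changed: stop early
            break
        labels = new_labels
    return labels, train(contexts, labels)            # final labels and final model


# Made-up contexts of "plant"; items 0 and 1 are the hand-picked seeds.
CONTEXTS = [
    {"life", "species"}, {"manufacturing", "equipment"},
    {"life", "animal", "species"}, {"animal", "growth"},
    {"manufacturing", "workers"}, {"workers", "union"}, {"garden"},
]
labels, model = bootstrap(CONTEXTS, {0: "living", 1: "factory"})
print(labels)
```

On this toy data, items 3 and 5 share no word with the seeds and are only labelled in the second round, via "animal" and "workers" learned in the first. Item 6 never connects to anything and stays unlabelled.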

29
Yarowsky’s Example
Step 1: Disambiguating plant

30
Step 2: Initial Seed Set
Sense A:
• plant as in a lifeform

Other data

Sense B:
• plant as in a factory

31
Step 3: Train a Classifier
He went with a decision-list classifier (we didn’t cover
this one in class)

Note how new collocations are found for each sense
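
For readers who have not seen a decision list, here is a rough sketch of the idea: rank each context word by how strongly it favours one sense over the other (a smoothed log-likelihood ratio), then classify a new context using only the strongest rule that applies. The training examples and the smoothing constant alpha are made up for illustration, and features are plain bag-of-words items rather than positional collocations.

```python
import math
from collections import Counter


def build_decision_list(examples, sense_a, sense_b, alpha=0.1):
    """Rank context words by |log(P(sense_a | word) / P(sense_b | word))|.

    `examples` is a list of (set_of_context_words, sense) pairs; `alpha` is an
    add-alpha smoothing constant (an assumption, chosen arbitrarily here).
    """
    count_a, count_b = Counter(), Counter()
    for words, sense in examples:
        (count_a if sense == sense_a else count_b).update(words)
    rules = []
    for word in set(count_a) | set(count_b):
        llr = math.log((count_a[word] + alpha) / (count_b[word] + alpha))
        if llr != 0:                                  # skip uninformative words
            rules.append((abs(llr), word, sense_a if llr > 0 else sense_b))
    rules.sort(reverse=True)                          # strongest evidence first
    return rules


def classify(rules, context, default=None):
    """Apply only the single strongest rule whose word occurs in the context."""
    for strength, word, sense in rules:
        if word in context:
            return sense, strength
    return default, 0.0


EXAMPLES = [
    ({"life", "species"}, "living"),
    ({"life", "animal"}, "living"),
    ({"manufacturing", "equipment"}, "factory"),
    ({"manufacturing", "life"}, "factory"),
]
rules = build_decision_list(EXAMPLES, "living", "factory")
print(classify(rules, {"manufacturing", "life"})[0])
```

Here "manufacturing" outranks "life", so a context containing both is assigned the factory sense. The rule strength can double as the confidence score used to decide which outputs join the next seed set.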

32
Step 3: Change Seed Set
Use only the cases where classifier is highly confident

33
Results
96% accuracy on binary word sense distinctions
Same result as with supervised methods, but with
minimal amounts of annotation effort!

34
Notes on Yarowsky’s Algorithm
The key to any bootstrapping approach lies in its ability
to create a larger training set from a small set of seeds:
• Need an accurate initial set of seeds
• Need a good confidence metric for picking good new
examples to add to the training set

35
