Information Theory and Coding
Quantitative measure of information
Cédric RICHARD
Université Côte d’Azur
Self-information
Information content
Let A be an event with non-zero probability P (A).
The greater the uncertainty of A, the larger the information h(A) provided by the
realization of A. This can be expressed as follows:
h(A) = f(1/P(A)).
Function f (·) must satisfy the following properties:
f(·) is an increasing function over ℝ+
the information provided by a sure event is zero: lim_{p→1} f(p) = 0
the information provided by two independent events adds up: f(p_1 · p_2) = f(p_1) + f(p_2)
This leads us to use the logarithmic function for f(·).
1
Self-information
Information content
Lemma 1. The function f(p) = − log_b p is the only one that is positive, continuous over (0, 1], and satisfies f(p_1 · p_2) = f(p_1) + f(p_2).
Proof. The proof consists of the following steps:
1. f(p^n) = n f(p) for any positive integer n, by repeated application of the additivity property
2. f(p^{1/n}) = (1/n) f(p), obtained by replacing p with p^{1/n} in the previous equality
3. f(p^{m/n}) = (m/n) f(p), by combining the two previous equalities
4. f(p^q) = q f(p), where q is any positive rational number
5. f(p^r) = lim_{n→+∞} f(p^{q_n}) = lim_{n→+∞} q_n f(p) = r f(p) for any real r > 0, where (q_n) is a sequence of rationals converging to r; this holds because the rationals are dense in the reals and f is continuous
Let p and q be in (0, 1). One can write p = q^{log_q p}, which yields:
f(p) = f(q^{log_q p}) = f(q) log_q p.
Since f(q) > 0 and log_q p = ln p / ln q with ln q < 0, this is a positive multiple of − ln p. We finally arrive at f(p) = − log_b p for some base b.
2
Self-information
Information content
Definition 1. Let (Ω, 𝒜, P) be a probability space, and A an event of 𝒜 with
non-zero probability P(A). The information content of A is defined as:
h(A) = − log P (A).
Unit. The unit of h(A) depends on the base chosen for the logarithm.
log2 : Shannon, bit (binary unit)
loge : logon, nat (natural unit)
log10 : Hartley, decit (decimal unit)
Vocabulary. h(·) represents the uncertainty of A, or its information content.
3
Self-information
Information content
Information content or uncertainty: h(A) = − log_b P(A)
[Figure: plot of h(A) versus P(A); h(A) decreases from +∞ as P(A) → 0, equals 1 at P(A) = 1/b, and vanishes at P(A) = 1.]
4
Self-information
Information content
Example 1. Consider a binary source S ∈ {0, 1} with P(0) = P(1) = 0.5. The
information content conveyed by each binary symbol is h(1/2) = log 2, namely
1 bit, or 1 shannon.
Example 2. Consider a source S that randomly selects symbols s_i among 16
equally likely symbols {s_0, ..., s_15}. The information content conveyed by each
symbol is log 16 = 4 Sh.
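These values can be checked numerically. A minimal Python sketch (the function name self_information is ours, not part of the course):

    import math

    def self_information(p, base=2):
        # h(A) = -log_b P(A); base 2 gives shannons (bits), e gives nats, 10 gives decits
        return -math.log(p, base)

    print(self_information(0.5))          # Example 1: 1.0 Sh per binary symbol
    print(self_information(1/16))         # Example 2: 4.0 Sh per symbol
    print(self_information(0.5, math.e))  # the same event in nats: ln 2 ≈ 0.693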
Remark. The bit in Computer Science (binary digit ) and the bit in Information
Theory (binary unit ) do not refer to the same concept.
5
Self-information
Conditional information content
Self-information extends to a pair of events A and B. Note that P(A, B) = P(A) P(B|A).
We get:
h(A, B) = − log P (A, B) = − log P (A) − log P (B|A)
Note that − log P (B|A) is the information content of B that is not provided by A.
Definition 2. Conditional information content of B given A is defined as:
h(B|A) = − log P (B|A),
that is: h(B|A) = h(A, B) − h(A).
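A small numerical check, with hypothetical values P(A) = 0.5 and P(B|A) = 0.25 (sketch):

    from math import log2

    P_A = 0.5           # assumed value for illustration
    P_B_given_A = 0.25  # assumed value for illustration
    P_AB = P_A * P_B_given_A

    h_A = -log2(P_A)                  # 1 Sh
    h_AB = -log2(P_AB)                # 3 Sh
    h_B_given_A = -log2(P_B_given_A)  # 2 Sh

    # h(B|A) = h(A, B) - h(A)
    print(h_B_given_A, h_AB - h_A)    # both equal 2.0 Sh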
Exercise. Analyze and interpret the following cases: A ⊂ B, A = B, A ∩ B = ∅.
6
Self-information
Mutual information content
The definition of conditional information leads directly to another definition, that
of mutual information, which measures information shared by two events.
Definition 3. We call mutual information of A and B the following quantity:
i(A, B) = h(A) − h(A|B) = h(B) − h(B|A).
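A quick numerical check of this symmetry on hypothetical probabilities (sketch):

    from math import log2

    def h(p):
        # self-information of an event of probability p, in Sh
        return -log2(p)

    # assumed probabilities for illustration; note P(A ∩ B) = 0.3
    P_A, P_B, P_AB = 0.5, 0.4, 0.3

    i_1 = h(P_A) - h(P_AB / P_B)   # h(A) - h(A|B)
    i_2 = h(P_B) - h(P_AB / P_A)   # h(B) - h(B|A)
    print(i_1, i_2)                # both ≈ 0.585 Sh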
Exercise. Analyze and interpret the following cases: A ⊂ B, A = B, A ∩ B = ∅.
7
Entropy of a random variable
Definition
Consider a memoryless stochastic source S with alphabet {s1 , . . . , sn }. Let pi be
the probability P (S = si ).
The entropy of S is the average amount of information produced by S:
H(S) = E{h(S)} = − Σ_{i=1}^{n} p_i log p_i.
Definition 4. Let X be a random variable that takes its values in {x1 , . . . , xn }.
Entropy of X is defined as follows:
H(X) = − Σ_{i=1}^{n} P(X = x_i) log P(X = x_i).
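A minimal Python transcription of this definition (the function name entropy is ours):

    from math import log2

    def entropy(probs):
        # H(X) = -sum_i p_i log2 p_i, with the convention 0 log 0 = 0
        return -sum(p * log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))        # 1.0 Sh
    print(entropy([1/16] * 16))       # 4.0 Sh
    print(entropy([0.7, 0.2, 0.1]))   # ≈ 1.157 Sh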
8
Entropy of a random variable
Example of a binary random variable
The entropy of a binary random variable X, with p = P(X = 1), is given by:
H(X) = −p log p − (1 − p) log(1 − p) ≜ H_2(p).
H_2(p) is called the binary entropy function.
[Figure: binary entropy H_2(p) (Sh/symbol) versus probability p; H_2 vanishes at p = 0 and p = 1 and reaches its maximum of 1 Sh at p = 0.5.]
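A short sketch evaluating H_2(p) and checking that it peaks at p = 0.5:

    from math import log2

    def H2(p):
        # binary entropy function, in Sh/symbol
        if p in (0.0, 1.0):
            return 0.0
        return -p * log2(p) - (1 - p) * log2(1 - p)

    for p in (0.1, 0.5, 0.9):
        print(p, round(H2(p), 4))   # 0.469, 1.0, 0.469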
9
Entropy of a random variable
Notation and preliminary properties
Lemma 2 (Gibbs' inequality). Consider two discrete probability distributions with
mass functions (p_1, ..., p_n) and (q_1, ..., q_n). We have:
Σ_{i=1}^{n} p_i log(q_i / p_i) ≤ 0.
Equality is achieved when p_i = q_i for all i.
Proof. The proof is carried out in the case of the natural logarithm. Observe
that ln x ≤ x − 1, with equality for x = 1. Let x = q_i / p_i. We have:
Σ_{i=1}^{n} p_i ln(q_i / p_i) ≤ Σ_{i=1}^{n} p_i (q_i / p_i − 1) = 1 − 1 = 0.
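A numerical check of Gibbs' inequality on two arbitrary, hypothetical distributions (sketch):

    from math import log

    # two hypothetical probability mass functions on 4 symbols
    p = [0.4, 0.3, 0.2, 0.1]
    q = [0.25, 0.25, 0.25, 0.25]

    gibbs = sum(pi * log(qi / pi) for pi, qi in zip(p, q))
    print(gibbs)                                # ≈ -0.107, negative as predicted
    print(sum(pi * log(pi / pi) for pi in p))   # 0.0 when q = p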
10
Entropy of a random variable
Notation and preliminary properties
Graphical checking of inequality ln x ≤ x − 1
[Figure: graphs of y = x − 1 and y = ln x; the line lies above the logarithm everywhere, with equality at x = 1.]
11
Entropy of a random variable
Properties
Property 1. The entropy satisfies the following inequality:
H_n(p_1, ..., p_n) ≤ log n.
Equality is achieved by the uniform distribution, that is, p_i = 1/n for all i.
Proof. Based on Gibbs' inequality, we set q_i = 1/n.
Uncertainty about the outcome of an experiment is maximum when all possible
outcomes are equiprobable.
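A quick check of Property 1 on a hypothetical 8-symbol source (sketch):

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    n = 8
    skewed = [0.5, 0.2, 0.1, 0.1, 0.05, 0.02, 0.02, 0.01]
    uniform = [1 / n] * n

    print(entropy(skewed), "<=", log2(n))   # ≈ 2.14 <= 3.0
    print(entropy(uniform))                 # exactly 3.0 Sh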
12
Entropy of a random variable
Properties
Property 2. The entropy increases as the number of possible outcomes increases.
Proof. Let X be a discrete random variable with values in {x_1, ..., x_n} and
probabilities (p_1, ..., p_n), respectively. Consider that state x_k is split into two
substates x_{k1} and x_{k2}, with non-zero probabilities p_{k1} and p_{k2} such that p_k = p_{k1} + p_{k2}.
The entropy of the resulting random variable X′ is given by:
H(X′) = H(X) + p_k log p_k − p_{k1} log p_{k1} − p_{k2} log p_{k2}
= H(X) + p_{k1} (log p_k − log p_{k1}) + p_{k2} (log p_k − log p_{k2}).
The logarithmic function being strictly increasing, we have log p_k > log p_{ki}. This
implies H(X′) > H(X).
Interpretation. Second law of thermodynamics
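A numerical check that splitting one state increases the entropy (hypothetical distributions):

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    X  = [0.5, 0.3, 0.2]         # original distribution
    Xp = [0.5, 0.3, 0.15, 0.05]  # state of probability 0.2 split into 0.15 + 0.05

    print(entropy(X), "<", entropy(Xp))   # ≈ 1.485 < ≈ 1.648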
13
Entropy of a random variable
Properties
Property 3. The entropy Hn is a concave function of p1 , . . . , pn .
Proof. Consider 2 discrete probability distributions (p1 , . . . , pn ) and (q1 , . . . , qn ).
We need to prove that, for every λ in [0, 1], we have:
Hn (λp1 + (1 − λ)q1 , . . . , λpn + (1 − λ)qn ) ≥ λHn (p1 , . . . , pn ) + (1 − λ)Hn (q1 , . . . , qn ).
By setting f (x) = −x log x, we can write:
H_n(λp_1 + (1 − λ)q_1, ..., λp_n + (1 − λ)q_n) = Σ_{i=1}^{n} f(λp_i + (1 − λ)q_i).
The result is a direct consequence of the concavity of f (·) and Jensen’s inequality.
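A spot check of the concavity inequality for λ = 0.5 and two hypothetical distributions (sketch):

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    p = [0.9, 0.05, 0.05]
    q = [0.1, 0.45, 0.45]
    lam = 0.5
    mix = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]

    lhs = entropy(mix)                               # entropy of the mixture
    rhs = lam * entropy(p) + (1 - lam) * entropy(q)  # mixture of the entropies
    print(lhs, ">=", rhs)                            # 1.5 >= ≈ 0.97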
14
Entropy of a random variable
Properties
Graphical checking of the concavity of f (x) = −x log x
[Figure: graph of f(x) = −x log x, showing its concavity.]
15
Entropy of a random variable
Properties
Concavity of Hn can be generalized to any number m of distributions.
Property 4. Given a finite set of discrete probability distributions {(q_{1j}, ..., q_{nj})}_{j=1}^{m}, the following inequality is satisfied:
H_n( Σ_{j=1}^{m} λ_j q_{1j}, ..., Σ_{j=1}^{m} λ_j q_{nj} ) ≥ Σ_{j=1}^{m} λ_j H_n(q_{1j}, ..., q_{nj}),
where {λ_j}_{j=1}^{m} is any set of constants in [0, 1] such that Σ_{j=1}^{m} λ_j = 1.
Proof. As in the previous case, the proof of this inequality is based on
the concavity of f(x) = −x log x and Jensen's inequality.
16
Pair of random variables
Joint entropy
Definition 5. Let X and Y be two random variables with values in {x1 , . . . , xn }
and {y1 , . . . , ym }, respectively. The joint entropy of X and Y is defined as:
H(X, Y) ≜ − Σ_{i=1}^{n} Σ_{j=1}^{m} P(X = x_i, Y = y_j) log P(X = x_i, Y = y_j).
The joint entropy is symmetric: H(X, Y ) = H(Y, X)
Example. Case of two independent random variables
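A sketch of this definition on hypothetical joint tables, including the independent case where H(X, Y) = H(X) + H(Y):

    from math import log2

    def joint_entropy(P):
        # P is a 2-D table of joint probabilities P(X = x_i, Y = y_j)
        return -sum(p * log2(p) for row in P for p in row if p > 0)

    # independent case: P(x, y) = P(x) P(y), with P(X) = (0.5, 0.5), P(Y) = (0.25, 0.75)
    P_indep = [[0.5 * 0.25, 0.5 * 0.75],
               [0.5 * 0.25, 0.5 * 0.75]]
    print(joint_entropy(P_indep))   # ≈ 1.811 = H(X) + H(Y) = 1 + 0.811

    # dependent case
    P_dep = [[0.4, 0.1],
             [0.1, 0.4]]
    print(joint_entropy(P_dep))     # ≈ 1.722 < 2 = H(X) + H(Y)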
17
Pair of random variables
Conditional entropy
Definition 6. Let X and Y be two random variables with values in {x1 , . . . , xn }
and {y1 , . . . , ym }, respectively. The conditional entropy of X given Y = yj is:
H(X|Y = y_j) ≜ − Σ_{i=1}^{n} P(X = x_i | Y = y_j) log P(X = x_i | Y = y_j).
H(X|Y = yj ) is the amount of information needed to describe the outcome of X
given that we know that Y = yj .
Definition 7. The conditional entropy of X given Y is defined as:
H(X|Y) ≜ Σ_{j=1}^{m} P(Y = y_j) H(X|Y = y_j).
Example. Case of two independent random variables
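A sketch of Definitions 6 and 7 on a hypothetical joint table:

    from math import log2

    # joint probabilities P(X = x_i, Y = y_j); rows index x, columns index y
    P = [[0.4, 0.1],
         [0.1, 0.4]]
    P_Y = [sum(P[i][j] for i in range(2)) for j in range(2)]   # marginal of Y

    def H_X_given_yj(j):
        # H(X | Y = y_j) = -sum_i P(x_i | y_j) log2 P(x_i | y_j)
        cond = [P[i][j] / P_Y[j] for i in range(2)]
        return -sum(p * log2(p) for p in cond if p > 0)

    H_X_given_Y = sum(P_Y[j] * H_X_given_yj(j) for j in range(2))
    print(H_X_given_yj(0), H_X_given_Y)   # ≈ 0.722 for this y_j, ≈ 0.722 overall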
18
Pair of random variables
Relations between entropies
H(X, Y ) = H(X) + H(Y |X) = H(Y ) + H(X|Y ).
These equalities can be obtained by first writing:
log P (X = x, Y = y) = log P (X = x|Y = y) + log P (Y = y),
and then taking the expectation of each member.
Property 5 (chain rule). The joint entropy of n random variables can be evaluated
using the following chain rule:
H(X_1, ..., X_n) = Σ_{i=1}^{n} H(X_i | X_1, ..., X_{i−1}).
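Reusing the joint table from the previous sketch, we can check H(X, Y) = H(Y) + H(X|Y), the two-variable case of the chain rule:

    from math import log2

    P = [[0.4, 0.1],   # hypothetical joint probabilities P(X = x_i, Y = y_j)
         [0.1, 0.4]]
    P_Y = [sum(P[i][j] for i in range(2)) for j in range(2)]

    H_XY = -sum(p * log2(p) for row in P for p in row if p > 0)
    H_Y = -sum(p * log2(p) for p in P_Y if p > 0)
    # H(X|Y) computed directly from the conditional distributions
    H_X_given_Y = -sum(P[i][j] * log2(P[i][j] / P_Y[j])
                       for i in range(2) for j in range(2) if P[i][j] > 0)

    print(round(H_XY, 4), round(H_Y + H_X_given_Y, 4))   # both ≈ 1.7219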
19
Pair of random variables
Relations between entropies
Each term of H(X, Y ) = H(X) + H(Y |X) = H(Y ) + H(X|Y ) is non-negative. We can
conclude that:
H(X) ≤ H(X, Y )
H(Y ) ≤ H(X, Y )
20
Pair of random variables
Relations between entropies
From the generalized concavity of the entropy, setting qij = P (X = xi |Y = yj ) and
λj = P (Y = yj ), we get the following inequality:
H(X|Y ) ≤ H(X)
Conditioning a random variable reduces its entropy. Without proof, this can be
generalized as follows:
Property 6 (entropy decrease with conditioning). The entropy of a random
variable decreases with successive conditionings, namely,
H(X1 |X2 , . . . , Xn ) ≤ . . . ≤ H(X1 |X2 , X3 ) ≤ H(X1 |X2 ) ≤ H(X1 ),
where X1 , . . . , Xn denote n discrete random variables.
21
Pair of random variables
Relations between entropies
Let X and Y be two random variables with values in {x_1, ..., x_n}
and {y_1, ..., y_m}, respectively. We have:
0 ≤ H(X|Y ) ≤ H(X) ≤ H(X, Y ) ≤ H(X) + H(Y ) ≤ 2H(X, Y ).
22
Pair of random variables
Mutual information
Definition 8. The mutual information of two random variables X and Y is
defined as follows:
I(X, Y) ≜ H(X) − H(X|Y)
or, equivalently,
I(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(X = x_i, Y = y_j) log [ P(X = x_i, Y = y_j) / (P(X = x_i) P(Y = y_j)) ].
The mutual information quantifies the amount of information obtained about one
random variable through observing the other random variable.
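A sketch checking that both expressions above coincide on a hypothetical joint table:

    from math import log2

    P = [[0.4, 0.1],   # hypothetical joint probabilities P(X = x_i, Y = y_j)
         [0.1, 0.4]]
    P_X = [sum(row) for row in P]                                # marginal of X
    P_Y = [sum(P[i][j] for i in range(2)) for j in range(2)]     # marginal of Y

    def H(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    H_XY = -sum(p * log2(p) for row in P for p in row if p > 0)
    I_1 = H(P_X) + H(P_Y) - H_XY     # equals H(X) - H(X|Y), since H(X|Y) = H(X,Y) - H(Y)
    I_2 = sum(P[i][j] * log2(P[i][j] / (P_X[i] * P_Y[j]))
              for i in range(2) for j in range(2) if P[i][j] > 0)

    print(round(I_1, 4), round(I_2, 4))   # both ≈ 0.2781 Sh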
Exercise. Case of two independent random variables
23
Pair of random variables
Mutual information
In order to give a different interpretation of mutual information, the following
definition is recalled beforehand.
Definition 9. We call the Kullback-Leibler distance between two distributions P_1
and P_2, here assumed to be discrete, the following quantity:
d(P_1, P_2) = Σ_{x∈X(Ω)} P_1(X = x) log [ P_1(X = x) / P_2(X = x) ].
The mutual information corresponds to the Kullback-Leibler distance between the
joint distribution of X and Y and the product of their marginal distributions: I(X, Y) = d(P_{X,Y}, P_X P_Y).
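A check that I(X, Y) coincides with d(P_{X,Y}, P_X P_Y) on the joint table used previously (sketch):

    from math import log2

    def kl(P1, P2):
        # d(P1, P2) = sum_x P1(x) log2( P1(x) / P2(x) )
        return sum(p1 * log2(p1 / p2) for p1, p2 in zip(P1, P2) if p1 > 0)

    # hypothetical joint distribution of (X, Y), flattened over the 4 pairs
    P_joint = [0.4, 0.1, 0.1, 0.4]
    # product of the marginals P(X) = (0.5, 0.5) and P(Y) = (0.5, 0.5)
    P_prod = [0.25, 0.25, 0.25, 0.25]

    print(kl(P_joint, P_prod))   # ≈ 0.278 Sh, equal to the I(X, Y) found previously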
24
Pair of random variables
Venn diagram
A Venn diagram can be used to illustrate relationships among measures of
information: entropy, joint entropy, conditional entropy and mutual information.
[Venn diagram: two overlapping sets representing H(X) and H(Y). The part of H(X) outside the overlap is H(X|Y), the part of H(Y) outside the overlap is H(Y|X), the overlap is I(X, Y), and the union of the two sets is H(X, Y).]
25