UNIT - I  INFORMATION THEORY

[Syllabus] Discrete Memoryless Source, Information, Entropy, Mutual Information - Discrete Memoryless Channels - Binary Symmetric Channel, Channel Capacity - Hartley-Shannon Law - Source Coding Theorem - Shannon-Fano and Huffman Codes.

Information theory is also known as the mathematical theory of communication. It deals only with the mathematical modeling and analysis of a communication system, rather than with physical sources and physical channels. Information theory provides answers to two fundamental questions:

* the minimum number of bits per symbol required to fully represent the source;
* the maximum rate at which reliable communication can take place over the channel.

1.1. UNCERTAINTY

Uncertainty refers to situations involving imperfect or unknown information. It applies to predictions of future events, to physical measurements already made, or to the unknown. It is the lack of certainty: a state of limited knowledge in which it is impossible to exactly describe the existing state, a future outcome, or more than one possible outcome.

Example: The following statements illustrate uncertainty (or surprise).

1. "The sun rises in the east." Here the uncertainty is zero, because there is no surprise in the statement; the probability of the sun rising in the east is always 1.
2. "The sun does not rise in the east." Here the uncertainty is high because there is maximum surprise, and hence maximum information, as the event is not possible.

Consider a source output modeled as a discrete random variable S, which takes on symbols from a fixed finite alphabet

    S = {s_0, s_1, ..., s_{K-1}}

with probabilities

    P(S = s_k) = p_k,   k = 0, 1, ..., K-1

The set of probabilities must satisfy the condition

    sum_{k=0}^{K-1} p_k = 1

Consider the event S = s_k, describing the emission of symbol s_k by the source with probability p_k. If the probability p_k = 1 and p_j = 0 for all j != k, then there is no "surprise" and therefore no "information" when symbol s_k is emitted. If, on the other hand, the source symbols occur with different probabilities and the probability p_k is low, then there is more "surprise" and therefore more "information" when symbol s_k is emitted than when a symbol s_i, i != k, with higher probability is emitted.

UNCERTAINTY - SURPRISE - INFORMATION

Uncertainty, surprise and information are all related.

* Before the event S = s_k occurs, there is an amount of uncertainty.
* When the event S = s_k occurs, there is an amount of surprise.
* After the occurrence of the event S = s_k, there is a gain in the amount of information.

1.2. INFORMATION

The amount of information is related to the inverse of the probability of occurrence.

Definition: The amount of information gained after observing the event S = s_k, which occurs with probability p_k, is

    I(s_k) = log(1/p_k)

By using the logarithm to base 2, the resulting unit of information is called the bit:

    I(s_k) = log2(1/p_k),   k = 0, 1, ..., K-1

1.2.1. PROPERTIES OF INFORMATION

Property 1: If we are absolutely certain of the outcome of an event, even before it occurs, there is no information gained:

    I(s_k) = 0   for p_k = 1

Also, if the receiver already knows the message being transmitted, the amount of information carried is zero.

Property 2: The occurrence of an event S = s_k either provides some information or no information, but never brings about a loss of information:

    I(s_k) >= 0   for 0 <= p_k <= 1

Property 3: The less probable an event is, the more information we gain when it occurs:

    I(s_k) > I(s_i)   for p_k < p_i

That is, if there is more uncertainty about the message, the information carried is also more.
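As a quick illustration of the definition and its properties (a minimal sketch, not part of the original text; the function name information_bits is ours):

```python
import math

def information_bits(p: float) -> float:
    """Amount of information I(s_k) = log2(1/p_k) for an event of probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return math.log2(1 / p)

# A certain event carries no information; rarer events carry more.
for p in (1.0, 0.5, 0.25, 0.01):
    print(f"p = {p:<5}  I = {information_bits(p):.3f} bits")
```

The printout reproduces Properties 1-3: I = 0 for p = 1, and I grows as p decreases.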
1.3. ENTROPY

The entropy of a discrete memoryless source is the average amount of information per source symbol:

    H(S) = sum_{k=0}^{K-1} p_k log2(1/p_k)

The entropy is bounded as 0 <= H(S) <= log2 K, where K is the number of symbols in the alphabet S. The lower bound H(S) = 0 is attained when p_k = 1 for some k and the remaining probabilities are zero (no uncertainty). To establish the upper bound, consider any two probability distributions {p_k} and {q_k} on the alphabet S. Using the inequality ln x <= x - 1, where e is the base of the natural logarithm, we get

    sum_{k=0}^{K-1} p_k log2(q_k / p_k) <= (log2 e) sum_{k=0}^{K-1} p_k (q_k / p_k - 1)
                                         = (log2 e) ( sum_k q_k - sum_k p_k ) = 0

i.e.

    sum_{k=0}^{K-1} p_k log2(q_k / p_k) <= 0

where the equality holds only if q_k = p_k for all k. Suppose

    q_k = 1/K,   k = 0, 1, ..., K-1

which corresponds to an alphabet S with equiprobable symbols. The entropy of a discrete memoryless source with such a characterization equals

    sum_{k=0}^{K-1} (1/K) log2 K = log2 K

Substituting q_k = 1/K into the inequality above, we get

    sum_{k=0}^{K-1} p_k log2(1/p_k) <= log2 K

Equivalently, the entropy of a discrete memoryless source with an arbitrary probability distribution for the symbols of its alphabet S is bounded as

    H(S) <= log2 K

The equality holds only if the symbols of the alphabet S are equiprobable.

Example 1.7: Entropy of a binary memoryless source. Consider a binary source for which symbol 0 occurs with probability p_0 and symbol 1 with probability p_1 = 1 - p_0.

Solution: We assume that the source is memoryless, so that successive symbols emitted by the source are statistically independent. The entropy of such a source equals

    H(S) = -p_0 log2 p_0 - p_1 log2 p_1
         = -p_0 log2 p_0 - (1 - p_0) log2(1 - p_0)   bits

When p_0 = 0, the entropy H(S) = 0; this follows from the fact that x log x -> 0 as x -> 0. When p_0 = 1, the entropy H(S) = 0 as well. The entropy H(S) attains its maximum value, H_max = 1 bit, when p_0 = p_1 = 1/2, i.e. when symbols 1 and 0 are equally probable.

Extension of a Discrete Memoryless Source

Consider blocks rather than individual symbols, with each block consisting of n successive source symbols. Such a block may be viewed as being produced by an extended source with a source alphabet S^n that has K^n distinct blocks, where K is the number of distinct symbols in the source alphabet S of the original source. In the case of a discrete memoryless source, the source symbols are statistically independent; hence the probability of a source symbol in S^n is equal to the product of the probabilities of the source symbols in S constituting that particular symbol of S^n. The entropy of the extended source is therefore equal to n times the entropy of the original source:

    H(S^n) = n H(S)

Example 1.8: Consider a discrete memoryless source with source alphabet S = {s_0, s_1, s_2} and symbol probabilities p_0 = 1/4, p_1 = 1/4, p_2 = 1/2. Find the entropy of the source.

Solution:

    H(S) = (1/4) log2 4 + (1/4) log2 4 + (1/2) log2 2 = 3/2 bits

Example 1.9: Prove that if there are M equally likely messages, then the entropy of the source is log2 M.

Solution: The probability of each message is p_k = 1/M. The entropy is given by

    H(S) = sum_{k=0}^{M-1} p_k log2(1/p_k)
         = (1/M) log2 M + (1/M) log2 M + ... + (1/M) log2 M   (M terms)

Adding all the terms, we get H(S) = log2 M.
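The entropy formula and its upper bound are easy to verify numerically; the short sketch below (our own helper, not from the text) reproduces Example 1.8 and the equiprobable case:

```python
import math

def entropy(probs):
    """H(S) = sum_k p_k log2(1/p_k), in bits per source symbol."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.5]))   # 1.5 bits, as in Example 1.8
print(entropy([0.5, 0.5]))          # 1.0 bit: binary source with p0 = 1/2
print(entropy([0.25] * 4))          # 2.0 bits = log2(4): the equiprobable upper bound
```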
1.4. INFORMATION RATE

The information rate R is the average number of bits of information produced per second:

    R = r H(S)

where r is the rate at which messages are generated and H(S) is the entropy of the source:

    R (bits/second) = r (messages/second) x H(S) (bits/message)

Example 1.10: An analog signal is band-limited to B Hz and sampled at the Nyquist rate. The samples are quantized into 4 levels, and each level represents one message; thus there are 4 messages. The probabilities of occurrence of these 4 levels are p_1 = p_2 = 3/8 and p_3 = p_4 = 1/8. Find the information rate of the source. (Nov/Dec 2011)

Solution: Given p_1 = 3/8, p_2 = 3/8, p_3 = 1/8, p_4 = 1/8, and the information rate R = r H(S), where r is the message rate. The signal is sampled at the Nyquist rate; for a signal band-limited to B Hz,

    Nyquist rate = 2B samples/sec

Every sample generates one message, so r = 2B messages per second.

Entropy:

    H(S) = p_1 log2(1/p_1) + p_2 log2(1/p_2) + p_3 log2(1/p_3) + p_4 log2(1/p_4)
         = (3/8) log2(8/3) + (3/8) log2(8/3) + (1/8) log2 8 + (1/8) log2 8
         = 1.8 bits/message

Information rate:

    R = r H(S) = 2B x 1.8 = 3.6 B bits/sec

1.5. SOURCE CODING THEOREM (Shannon's First Theorem)

The process of efficiently representing the data generated by a discrete source is called source encoding. For the source encoder to be efficient, we require knowledge of the statistics of the source. If some source symbols are known to be more probable than others, we may exploit this feature in the generation of a source code by assigning short code words to frequent source symbols and long code words to rare source symbols. Such a code is referred to as a variable-length code.

Morse Code: The Morse code is an example of a variable-length code. In Morse code, the letters of the alphabet and the numerals are encoded into streams of marks and spaces, denoted as dots '.' and dashes '-' respectively.

To develop an efficient source code, the encoder must satisfy two functional requirements:

* the code words produced by the encoder are in binary form;
* the source code is uniquely decodable, so that the original source sequence can be reconstructed perfectly from the encoded binary sequence.

Fig. Source encoding: the output s_k of a discrete memoryless source is converted by the source encoder into a binary sequence b_k.

Consider a discrete memoryless source whose output s_k is converted by the source encoder into a stream of 0s and 1s denoted by b_k. The source has an alphabet with K different symbols, and the k-th symbol s_k occurs with probability p_k, k = 0, 1, ..., K-1. Let the binary code word assigned to symbol s_k by the encoder have length l_k, measured in bits. The average code word length of the source encoder is

    L_bar = sum_{k=0}^{K-1} p_k l_k

The parameter L_bar represents the average number of bits per source symbol used in the source encoding process. Let L_min denote the minimum possible value of L_bar. The coding efficiency of the source encoder is then

    eta = L_min / L_bar

with L_bar >= L_min and eta <= 1. The source encoder is said to be efficient when eta approaches unity.

Statement: Given a discrete memoryless source of entropy H(S), the average code word length L_bar for any source encoding is bounded as

    L_bar >= H(S)

The entropy H(S) represents a fundamental limit on the average number of bits per source symbol necessary to represent a discrete memoryless source, in that L_bar can be made as small as, but no smaller than, the entropy H(S). Thus, with L_min = H(S), we may rewrite the efficiency of a source encoder in terms of the entropy H(S) as

    eta = H(S) / L_bar

Code Redundancy

Code redundancy is the measure of redundancy of bits in the encoded message sequence:

    Redundancy gamma = 1 - code efficiency = 1 - eta

Code Variance

Variance is the measure of variability in code word length. The variance of the code is given by

    sigma^2 = sum_{k=0}^{N-1} p_k (n_k - L_bar)^2

where sigma^2 is the variance of the code, p_k is the probability of the k-th symbol, n_k is the number of bits assigned to the k-th symbol, L_bar is the average code word length, and N is the number of symbols.
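These figures of merit are straightforward to compute for any given code; the sketch below is ours (the probabilities and lengths are illustrative only, chosen so that the code is a valid prefix code):

```python
import math

def code_statistics(probs, lengths):
    """Entropy, average length, efficiency, redundancy and variance of a source code."""
    H = sum(p * math.log2(1 / p) for p in probs if p > 0)        # entropy H(S)
    L = sum(p * n for p, n in zip(probs, lengths))               # average codeword length
    eta = H / L                                                  # code efficiency
    gamma = 1 - eta                                              # code redundancy
    var = sum(p * (n - L) ** 2 for p, n in zip(probs, lengths))  # code variance
    return H, L, eta, gamma, var

H, L, eta, gamma, var = code_statistics([0.5, 0.25, 0.125, 0.125], [1, 2, 3, 3])
print(f"H = {H}, L = {L}, efficiency = {eta:.2%}, redundancy = {gamma:.2%}, variance = {var}")
```

For this dyadic example the lengths equal log2(1/p_k), so the efficiency is 100% and the redundancy is zero.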
1.6. PREFIX CODING

Consider a discrete memoryless source with source alphabet {s_0, s_1, ..., s_{K-1}} and source statistics {p_0, p_1, ..., p_{K-1}}. The output of the code has to be uniquely decodable.

1.6.1. PREFIX CONDITION

Let the code word assigned to source symbol s_k be denoted by (m_{k1}, m_{k2}, ..., m_{kn}), where the individual elements m_{k1}, ..., m_{kn} are 0s and 1s and n is the code word length. The initial part of the code word is represented by the elements m_{k1}, ..., m_{ki} for some i <= n, and is called a prefix of the code word. A prefix code is a code in which no code word is the prefix of any other code word.

Example: Consider the three codes below for a source with four symbols.

| Symbol | Probability | Code I | Code II | Code III |
| s_0 | 0.5   | 0  | 0   | 0    |
| s_1 | 0.25  | 1  | 10  | 01   |
| s_2 | 0.125 | 00 | 110 | 011  |
| s_3 | 0.125 | 11 | 111 | 0111 |

Solution: From the table,

Code I: Code I is not a prefix code, since the bit 0, the code word for s_0, is a prefix of 00, the code word for s_2.

Note: In a prefix code, no code word is the prefix of any other code word.

Code II: Code II is a prefix code, because it satisfies the prefix condition.

Code III: Like Code I, Code III is not a prefix code, since the code word 0 for s_0 is a prefix of 01, the code word for s_1 (and of every other code word in the list).

Decision Tree

The decision tree is a graphical drawing of the code words of a particular source code; it is equivalent to a source decoder. The source decoder simply starts at the beginning of the received sequence and decodes one code word at a time.

Fig. 1.3. Decision tree for Code II.

* The tree has an initial state and four terminal states corresponding to the source symbols s_0, s_1, s_2 and s_3. The decoder always starts in the initial state.
* The first received bit moves the decoder to the terminal state s_0 if it is 0, or else to the second decision point if it is 1.
* The second bit moves the decoder one step further down the tree, to terminal state s_1 if it is 0, or else to a third decision point if it is 1, and so on.
* Once each terminal state emits its symbol, the decoder is reset to its initial state.

For example, the encoded sequence 1011111000... is readily decoded as the source sequence s_1 s_3 s_2 s_0 s_0 ..., i.e. 10 | 111 | 110 | 0 | 0 ...

1.6.2. KRAFT-McMILLAN INEQUALITY

If a prefix code has been constructed for a discrete memoryless source with source alphabet {s_0, s_1, ..., s_{K-1}} and source statistics {p_0, p_1, ..., p_{K-1}}, and the code word for symbol s_k has length l_k, k = 0, 1, ..., K-1, then the code word lengths of the code satisfy a certain inequality known as the Kraft-McMillan inequality:

    sum_{k=0}^{K-1} 2^{-l_k} <= 1

where the factor 2 refers to the radix (number of symbols) of the binary alphabet. Conversely, if the code word lengths of a code for a discrete memoryless source satisfy the Kraft-McMillan inequality, then a prefix code with these code word lengths can be constructed.

Proof: Consider the quantity

    ( sum_{k=0}^{K-1} 2^{-l_k} )^n

where n is a positive integer. Expanding the n-th power produces a sum of terms of the form 2^{-i}. Let the smallest of the code word lengths l_k be 1 and the largest be l; then the exponent i takes values from n to nl. Let N_i denote the number of terms of the form 2^{-i}, so that

    ( sum_{k=0}^{K-1} 2^{-l_k} )^n = sum_{i=n}^{nl} N_i 2^{-i}

N_i is also the number of code sequences of length i. Hence, if the code is uniquely decodable, N_i cannot be greater than 2^i, and therefore

    ( sum_{k=0}^{K-1} 2^{-l_k} )^n <= sum_{i=n}^{nl} 1 = nl - n + 1 <= nl   for all n

Taking the n-th root of both sides,

    sum_{k=0}^{K-1} 2^{-l_k} <= (nl)^{1/n}   for all n

Letting n -> infinity, the right-hand side tends to 1, which gives the stated inequality.

Bound on the Average Code Word Length

The average code word length of a prefix code is bounded as follows:

    H(S) <= L_bar < H(S) + 1
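Both the prefix condition and the Kraft-McMillan inequality can be checked mechanically. The sketch below (our own helpers) tests Code I and Code II from the table above:

```python
def kraft_sum(lengths, radix=2):
    """Left-hand side of the Kraft-McMillan inequality: sum over k of radix^(-l_k)."""
    return sum(radix ** -l for l in lengths)

def is_prefix_free(codewords):
    """True when no codeword is the prefix of any other codeword."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

code_I  = ["0", "1", "00", "11"]          # not a prefix code
code_II = ["0", "10", "110", "111"]       # a prefix code
print(kraft_sum([len(c) for c in code_I]), is_prefix_free(code_I))     # 1.5 > 1, False
print(kraft_sum([len(c) for c in code_II]), is_prefix_free(code_II))   # 1.0 <= 1, True
```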
1.7. SHANNON-FANO CODING

The Shannon-Fano procedure is as follows: list the source symbols in order of decreasing probability; partition the list into two sets that are as close to equiprobable as possible, assigning 0 to the upper set and 1 to the lower set; and continue partitioning each set in the same way until no further partitioning is possible.

Example: A discrete memoryless source has six symbols A, B, C, D, E, F with probabilities 0.23, 0.20, 0.11, 0.09, 0.15 and 0.22 respectively. Construct the Shannon-Fano code and find the average code word length.

Solution: The entropy of the source is

    H(S) = sum_k p_k log2(1/p_k)
         = 0.23 log2(1/0.23) + 0.22 log2(1/0.22) + 0.20 log2(1/0.20)
           + 0.15 log2(1/0.15) + 0.11 log2(1/0.11) + 0.09 log2(1/0.09)
         ≈ 2.51 bits/symbol

Listing the symbols in order of decreasing probability and partitioning stage by stage gives the Shannon-Fano code:

| Symbol | Probability | Code word | Length |
| A | 0.23 | 00   | 2 |
| F | 0.22 | 01   | 2 |
| B | 0.20 | 10   | 2 |
| E | 0.15 | 110  | 3 |
| C | 0.11 | 1110 | 4 |
| D | 0.09 | 1111 | 4 |

Average code word length:

    L_bar = sum_k p_k l_k
          = 0.23(2) + 0.22(2) + 0.20(2) + 0.15(3) + 0.11(4) + 0.09(4)
          = 2.55 bits

1.8. HUFFMAN CODING

The Huffman code is a source code whose average word length approaches the fundamental limit set by the entropy of a discrete memoryless source. The idea behind the algorithm is to replace the prescribed set of source statistics of a discrete memoryless source with a simpler one. This reduction process is continued in a step-by-step manner until we are left with a final set of only two source statistics, for which (0, 1) is an optimal code.

Algorithm

1. The source symbols are listed in order of decreasing probability.
2. The two source symbols of lowest probability are assigned a 0 and a 1.
3. These two source symbols are combined into a new source symbol with probability equal to the sum of the two original probabilities.
4. The list of source symbols, and therefore of source statistics, is thereby reduced in size by one. The probability of the new symbol is placed in the list in accordance with its value.
5. The procedure is repeated until we are left with a final list of only two source statistics (symbols), to which a 0 and a 1 are assigned.
6. The code for each source symbol is found by working backward and tracing the sequence of 0s and 1s assigned to that symbol as well as to its successors.

Example 1.16: The five source symbols of the alphabet of a discrete memoryless source have probabilities s_0 = 0.4, s_1 = 0.2, s_2 = 0.2, s_3 = 0.1, s_4 = 0.1. Find the average code word length and the entropy using the Huffman algorithm.

Solution: The source symbols are listed in order of decreasing probability, and at each stage the two lowest-probability symbols are combined and the list re-ordered, until only two symbols remain. Tracing the assigned digits back from the final stage to the first gives the encoding table:

| Symbol | Probability | Code word | Number of digits |
| s_0 | 0.4 | 1    | 1 |
| s_1 | 0.2 | 01   | 2 |
| s_2 | 0.2 | 000  | 3 |
| s_3 | 0.1 | 0010 | 4 |
| s_4 | 0.1 | 0011 | 4 |

Average code word length per symbol:

    L_bar = sum (probability of symbol) x (number of digits in code word)
          = 0.4(1) + 0.2(2) + 0.2(3) + 0.1(4) + 0.1(4)
          = 2.2 binary digits/symbol

Entropy:

    H(S) = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1)
         = 0.52877 + 0.46439 + 0.46439 + 0.33219 + 0.33219
         = 2.12193 bits/symbol
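The reduce-and-trace procedure can be sketched with a priority queue. The helper below is our own illustration, not the book's algorithm verbatim; because Huffman codes are not unique, the individual code words it produces may differ from the table above, but the average length is the same optimal value:

```python
import heapq
from itertools import count

def huffman_code(probs):
    """One valid binary Huffman code for {symbol: probability}; returns {symbol: codeword}."""
    tiebreak = count()                            # keeps the heap from comparing the dicts
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)           # two lowest-probability groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

probs = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}   # Example 1.16
code = huffman_code(probs)
print(code)
print(sum(probs[s] * len(w) for s, w in code.items()))   # 2.2 binary digits/symbol
```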
Problem: A DMS has five symbols s_0, s_1, s_2, s_3, s_4 characterised by the probability distribution 0.4, 0.2, 0.1, 0.2 and 0.1 respectively. Evaluate two distinct variable-length Huffman codes for the source to illustrate the non-uniqueness of the Huffman technique. Calculate the code efficiency and the variance of the ensemble, defined by

    sigma^2 = sum_{k=0}^{N-1} p_k (n_k - L_bar)^2

where p_k and n_k are the probability and the code word length for symbol s_k and L_bar is the average length. Conclude on the result. (Nov/Dec 2009, Apr/May 2011)

Solution: Two distinct variable-length Huffman codes are obtained by (1) placing the combined probability as high as possible and (2) placing the combined probability as low as possible in the re-ordered list.

Entropy (the same for both codes):

    H(S) = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1)
         = 2.122 bits/symbol

1. Placing the combined symbol as high as possible

| Symbol | Probability | Code word | Number of bits |
| s_0 | 0.4 | 00  | 2 |
| s_1 | 0.2 | 10  | 2 |
| s_2 | 0.1 | 010 | 3 |
| s_3 | 0.2 | 11  | 2 |
| s_4 | 0.1 | 011 | 3 |

Average code word length:

    L_bar = 0.4(2) + 0.2(2) + 0.1(3) + 0.2(2) + 0.1(3) = 2.2

Code variance:

    sigma^2 = 0.4(2 - 2.2)^2 + 0.2(2 - 2.2)^2 + 0.1(3 - 2.2)^2 + 0.2(2 - 2.2)^2 + 0.1(3 - 2.2)^2 = 0.16

Code efficiency:

    eta = H(S) / L_bar = 2.122 / 2.2 = 96.45 %

2. Placing the combined symbol as low as possible

| Symbol | Probability | Code word | Number of bits |
| s_0 | 0.4 | 1    | 1 |
| s_1 | 0.2 | 01   | 2 |
| s_3 | 0.2 | 000  | 3 |
| s_2 | 0.1 | 0010 | 4 |
| s_4 | 0.1 | 0011 | 4 |

Average code word length:

    L_bar = 0.4(1) + 0.2(2) + 0.2(3) + 0.1(4) + 0.1(4) = 2.2

Code variance:

    sigma^2 = 0.4(1 - 2.2)^2 + 0.2(2 - 2.2)^2 + 0.2(3 - 2.2)^2 + 0.1(4 - 2.2)^2 + 0.1(4 - 2.2)^2 = 1.36

Conclusion: Both codes have the same average code word length and hence the same efficiency (96.45 %), but the code obtained by placing the combined symbol as high as possible has a much smaller variance (0.16 against 1.36) and is therefore preferred.

Problem: A DMS emits seven symbols with probabilities of occurrence 0.25, 0.25, 0.125, 0.125, 0.125, 0.0625 and 0.0625. Generate the Huffman code with minimum code variance, determine the code variance and the code efficiency, and comment on the code efficiency. (Apr/May 2004)

Solution: The source symbols are listed in order of decreasing probability and combined stage by stage, placing each combined symbol as high as possible.

| Symbol | Probability | Code word | Number of bits |
| s_1 | 0.25   | 10   | 2 |
| s_2 | 0.25   | 11   | 2 |
| s_3 | 0.125  | 001  | 3 |
| s_4 | 0.125  | 010  | 3 |
| s_5 | 0.125  | 011  | 3 |
| s_6 | 0.0625 | 0000 | 4 |
| s_7 | 0.0625 | 0001 | 4 |

Average code word length:

    L_bar = 2[0.25(2)] + 3[0.125(3)] + 2[0.0625(4)] = 2.625 bits/symbol

Entropy:

    H(S) = 2[0.25 log2 4] + 3[0.125 log2 8] + 2[0.0625 log2 16] = 2.625 bits/symbol

Code efficiency:

    eta = H(S) / L_bar = 2.625 / 2.625 = 1, i.e. 100 %

Since every probability is a negative integer power of 2, the Huffman code word lengths equal log2(1/p_k) exactly and the code achieves 100 % efficiency.

Code variance:

    sigma^2 = 2[0.25(2 - 2.625)^2] + 3[0.125(3 - 2.625)^2] + 2[0.0625(4 - 2.625)^2] ≈ 0.4844

Problem: A discrete memoryless source has an alphabet of five symbols with probabilities for its output as given here:

    X = [X_1, X_2, X_3, X_4, X_5],   P(X) = [0.45, 0.15, 0.15, 0.10, 0.15]

Compute two different Huffman codes for this source. For these two codes find (i) the average code word length and (ii) the variance of the code word length over the ensemble of source symbols. (Apr/May 2005)

Solution: Placing the combined symbol as high as possible and as low as possible both produce the same result for this source.

| Symbol | Probability | Code word | Number of bits |
| X_1 | 0.45 | 1   | 1 |
| X_2 | 0.15 | 000 | 3 |
| X_3 | 0.15 | 001 | 3 |
| X_4 | 0.10 | 010 | 3 |
| X_5 | 0.15 | 011 | 3 |

Average code word length:

    L_bar = 0.45(1) + 0.15(3) + 0.15(3) + 0.1(3) + 0.15(3) = 2.1

Code variance:

    sigma^2 = 0.45(1 - 2.1)^2 + 0.15(3 - 2.1)^2 + 0.15(3 - 2.1)^2 + 0.1(3 - 2.1)^2 + 0.15(3 - 2.1)^2 = 0.99
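The variance comparison for the Nov/Dec 2009 problem can be reproduced with a few lines of Python (a sketch using our own helper; only the code word lengths matter here):

```python
def avg_and_variance(probs, lengths):
    """Average codeword length and variance sum_k p_k (n_k - L)^2."""
    L = sum(p * n for p, n in zip(probs, lengths))
    var = sum(p * (n - L) ** 2 for p, n in zip(probs, lengths))
    return L, var

probs = [0.4, 0.2, 0.1, 0.2, 0.1]                 # s0 ... s4
print(avg_and_variance(probs, [2, 2, 3, 2, 3]))   # combined symbol placed high: (2.2, 0.16)
print(avg_and_variance(probs, [1, 2, 4, 3, 4]))   # combined symbol placed low:  (2.2, 1.36)
```

Both codes are optimal (same average length); only their variances differ, which is why the high-placement variant is preferred.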
Problem: Draw a Huffman code tree for the following input sequence: "AAA BB CCC D A". (Apr/May 2014)

Solution: The given sequence is AAA BB CCC D A. There are 10 symbols in total. A occurs 4 times out of 10, so p(A) = 4/10 = 0.4. Similarly, B occurs 2 times, so p(B) = 0.2; C occurs 3 times, so p(C) = 0.3; and D occurs once, so p(D) = 0.1.

| Symbol      | A   | B   | C   | D   |
| Probability | 0.4 | 0.2 | 0.3 | 0.1 |

Listing the symbols in order of decreasing probability (A, C, B, D) and applying the Huffman reduction, the two lowest-probability symbols D and B are combined first (0.3), then combined with C (0.6), and finally with A. Using this table we can draw the Huffman code tree; reading the branch labels from the root gives the codes

    A = 1,   C = 01,   B = 000,   D = 001

Drawbacks of Huffman Coding

* It requires knowledge of the probabilistic model of the source, but knowing the source statistics in advance is not always possible.
* When modeling text, the storage requirement prevents the Huffman code from capturing the higher-order relationships between words.
* The coding algorithm does not use fixed-length codes, and is therefore not suitable for synchronous transmission.
* When it is applied to ordinary English text, it achieves a compaction of only about 43 %.

1.9. DISCRETE MEMORYLESS CHANNELS

A communication channel may be defined as the path or medium through which the symbols flow to the receiver end. A discrete memoryless channel (DMC) is a statistical model with an input X and an output Y, as shown in Fig. 1.7. Both X and Y are random variables.

Fig. 1.7. Discrete memoryless channel: input X, transition probabilities p(y_k | x_j), output Y.

During each unit of time (the signaling interval), the channel accepts an input symbol from X and, in response, emits an output symbol from Y. The channel is said to be "discrete" when both alphabets X and Y have finite sizes. It is said to be "memoryless" when the current output symbol depends only on the current input symbol and not on any of the previous ones.

The channel is described in terms of an input alphabet

    X = {x_0, x_1, ..., x_{J-1}}

an output alphabet

    Y = {y_0, y_1, ..., y_{K-1}}

and a set of transition probabilities

    p(y_k | x_j) = P(Y = y_k | X = x_j)   for all j and k

with 0 <= p(y_k | x_j) <= 1. The joint probability of the input symbol x_j and the output symbol y_k is p(x_j, y_k) = p(y_k | x_j) p(x_j), and the probability of the output symbol y_k is p(y_k) = sum_j p(y_k | x_j) p(x_j).

1.10. MUTUAL INFORMATION

Given that the channel output Y = y_k has been observed, the remaining uncertainty about the input is measured by the conditional entropy

    H(X | Y = y_k) = sum_{j=0}^{J-1} p(x_j | y_k) log2[ 1 / p(x_j | y_k) ]

This quantity is itself a random variable that takes on the values H(X | Y = y_0), ..., H(X | Y = y_{K-1}) with probabilities p(y_0), ..., p(y_{K-1}) respectively. The mean value of H(X | Y = y_k) over the output alphabet Y is

    H(X | Y) = sum_{k=0}^{K-1} H(X | Y = y_k) p(y_k)
             = sum_{k=0}^{K-1} sum_{j=0}^{J-1} p(x_j | y_k) p(y_k) log2[ 1 / p(x_j | y_k) ]
             = sum_{k=0}^{K-1} sum_{j=0}^{J-1} p(x_j, y_k) log2[ 1 / p(x_j | y_k) ]

The quantity H(X | Y) is called a conditional entropy. It represents the amount of uncertainty remaining about the channel input after the channel output has been observed.

Now, considering both uncertainty conditions (before and after observing the output), the difference H(X) - H(X | Y) represents the uncertainty about the channel input that is resolved by observing the channel output. This difference is called the mutual information of the channel. Denoting the mutual information by I(X ; Y), we can write

    I(X ; Y) = H(X) - H(X | Y)

This is the equational representation of mutual information.
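As an illustration of these definitions (a sketch that is not part of the original text), the helper below computes I(X;Y) = H(X) - H(X|Y) directly from a joint probability matrix; the matrix used here is hypothetical:

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) = H(X) - H(X|Y) for a joint probability matrix p(x_j, y_k), all entries > 0."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    H_x = -np.sum(p_x * np.log2(p_x))
    H_x_given_y = np.sum(p_xy * np.log2(p_y[np.newaxis, :] / p_xy))   # sum p(x,y) log2[1/p(x|y)]
    return H_x - H_x_given_y

joint = [[0.4, 0.1],     # hypothetical joint distribution of a binary input and output
         [0.2, 0.3]]
print(mutual_information(joint))   # about 0.12 bits
```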
Properties of Mutual Information

The mutual information I(X ; Y) has the following important properties.

Property 1: The mutual information of a channel is symmetric:

    I(X ; Y) = I(Y ; X)

where I(X ; Y) is a measure of the uncertainty about the channel input that is resolved by observing the channel output, and I(Y ; X) is a measure of the uncertainty about the channel output that is resolved by sending the channel input.

Proof: Using sum_k p(y_k | x_j) = 1, the entropy of the channel input may be written as

    H(X) = sum_{j=0}^{J-1} p(x_j) log2[ 1 / p(x_j) ]
         = sum_{j=0}^{J-1} sum_{k=0}^{K-1} p(x_j, y_k) log2[ 1 / p(x_j) ]

We also know that

    H(X | Y) = sum_{j=0}^{J-1} sum_{k=0}^{K-1} p(x_j, y_k) log2[ 1 / p(x_j | y_k) ]

Substituting these expressions into I(X ; Y) = H(X) - H(X | Y) and combining terms, we get

    I(X ; Y) = sum_j sum_k p(x_j, y_k) log2[ p(x_j | y_k) / p(x_j) ]

From Bayes' rule for conditional probabilities,

    p(x_j | y_k) / p(x_j) = p(y_k | x_j) / p(y_k)

Hence, substituting this ratio and interchanging the order of summation, we may write

    I(X ; Y) = sum_k sum_j p(x_j, y_k) log2[ p(y_k | x_j) / p(y_k) ] = I(Y ; X)

Hence the proof.

Property 2: The mutual information is always non-negative:

    I(X ; Y) >= 0

Proof: Using p(x_j | y_k) = p(x_j, y_k) / p(y_k), the mutual information of the channel may be expressed as

    I(X ; Y) = sum_j sum_k p(x_j, y_k) log2[ p(x_j, y_k) / ( p(x_j) p(y_k) ) ]

A direct application of the fundamental inequality then yields the desired result, I(X ; Y) >= 0, with equality if and only if

    p(x_j, y_k) = p(x_j) p(y_k)   for all j and k

This property states that we cannot lose information, on the average, by observing the output of a channel. Moreover, the average mutual information is zero if, and only if, the input and output symbols of the channel are statistically independent.

Property 3: The mutual information of a channel may be expressed in terms of the entropy of the channel output as

    I(X ; Y) = H(Y) - H(Y | X)

where H(Y | X) is a conditional entropy.

Proof: This property follows directly from the defining equation of the mutual information of a channel and Property 1.

Property 4: The mutual information of a channel is related to the joint entropy of the channel input and channel output by

    I(X ; Y) = H(X) + H(Y) - H(X, Y)

where the joint entropy H(X, Y) is defined by

    H(X, Y) = sum_j sum_k p(x_j, y_k) log2[ 1 / p(x_j, y_k) ]

Proof: We first rewrite the definition of the joint entropy as

    H(X, Y) = sum_j sum_k p(x_j, y_k) log2[ p(x_j) p(y_k) / p(x_j, y_k) ]
            + sum_j sum_k p(x_j, y_k) log2[ 1 / ( p(x_j) p(y_k) ) ]

The first double summation on the right-hand side is recognized as the negative of the mutual information of the channel, -I(X ; Y). As for the second summation term,

    sum_j sum_k p(x_j, y_k) log2[ 1 / ( p(x_j) p(y_k) ) ]
        = sum_j p(x_j) log2[ 1 / p(x_j) ] + sum_k p(y_k) log2[ 1 / p(y_k) ]
        = H(X) + H(Y)

Accordingly, H(X, Y) = -I(X ; Y) + H(X) + H(Y). Rearranging this equation, we get I(X ; Y) = H(X) + H(Y) - H(X, Y). Hence the proof.

The relations among the various channel parameters can be summarized in a diagram (Fig. 1.8): the entropy of the channel input X is represented by the circle on the left, the entropy of the channel output Y by the circle on the right, and the mutual information by the overlap between the two circles.

1.11. CHANNEL CAPACITY

We have so far discussed mutual information. The maximum average mutual information per signaling interval of a discrete memoryless channel, taken over all possible input distributions, gives the maximum rate of reliable data transmission; this quantity is the channel capacity. It is denoted by C and is measured in bits per channel use.

Consider a discrete memoryless channel with input alphabet X, output alphabet Y and transition probabilities p(y_k | x_j). The mutual information of the channel is

    I(X ; Y) = sum_{j=0}^{J-1} sum_{k=0}^{K-1} p(x_j, y_k) log2[ p(y_k | x_j) / p(y_k) ]

where

    p(x_j, y_k) = p(y_k | x_j) p(x_j)   and   p(y_k) = sum_{j=0}^{J-1} p(y_k | x_j) p(x_j)

From these equations we see that the mutual information of a channel depends not only on the channel, but also on the way in which the channel is used. The input probability distribution {p(x_j)} is independent of the channel; we can therefore maximize the average mutual information I(X ; Y) of the channel with respect to {p(x_j)}.

Hence, we define the channel capacity of a discrete memoryless channel as the maximum average mutual information I(X ; Y) in any single use of the channel, where the maximization is over all possible input probability distributions {p(x_j)} on X. The channel capacity is commonly denoted by C; thus

    C = max_{ {p(x_j)} } I(X ; Y)
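A brute-force sketch of this maximization (ours, with an assumed crossover probability p = 0.2) sweeps the input distribution of a binary symmetric channel and confirms that the maximum mutual information occurs for equiprobable inputs, in agreement with the closed-form capacity derived in Section 1.12:

```python
import numpy as np

def bsc_mutual_information(q: float, p: float) -> float:
    """I(X;Y) of a binary symmetric channel with crossover p and input P(X = 0) = q."""
    joint = np.array([[q * (1 - p), q * p],
                      [(1 - q) * p, (1 - q) * (1 - p)]])
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    return float(np.sum(joint * np.log2(joint / np.outer(px, py))))

p = 0.2                                            # assumed crossover probability
qs = np.linspace(0.01, 0.99, 99)
best_q = max(qs, key=lambda q: bsc_mutual_information(q, p))
print(best_q, bsc_mutual_information(best_q, p))   # q = 0.5, I ≈ 0.278 bits per channel use
```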
Consider a discrete memoryless channel with input alphabet X, output alphet Y and transition probabilities p(s |%))- ‘The mutual information of the channel is defined as Jer Ke POs) WX;¥) = Z pa y0 | DO) yao ke0 PV Pty Ii) = PO lpPGp Jel pO) = PU lx) Pe) j=0 | information of a chante e three equations we see that, the mutual which the channel is uset ly on the channel, but also on the way in )} is independent of the channel. We c= From thest depends not onl ‘The input probal then maximize the average ™ to {p(x}. Hence, we define the channel capacity of a diserete memoryless channel «s erage mutual information I(X ; Y) in any single use of the chant: ability distributions {pCi bility distribution {p) autual information I(X ; Y) of the channel with respe maximum av where the maximization is over all possible input prol on X. ‘The channel capacity is commonly denoted by C, thus C= max I(X;Y) [se eat eee 5 Information Theory rr 1,12, CHANNEL CODING THEOREM OR SHANNON’ SECOND THEOREM The noist ent i A a na se present in a channel creates unwanted errors between the input and output sequet e ot jlity sl put sequences of a digital communication system. The eror probability should be very low, nearly ‘eve such 2 me z low, nearly < 10-6 for a reliable communication. In order (0 achieve such @ high levi . i gh level of performance, we may have to resort to the sue of channel coding: The channel coding in a communication system, introduces redundancy with @ control, so as to improve the reliability of the system. The source coding reduces redundancy to improve the efficiency of the system. Channel coding consists of two parts of action. 1. Mapping incoming data sequence into a channel input sequences. it sequence into an output data 2. Inverse Mapping the channel output sequence. The final target is that the overall effect of the channel noise should be minimized. ‘The mapping is done by the transmission, with the help of an encoder, whereas the inverse mapping is done by the decoder in the receiver. Discrete Channel pee Channel |. |_| i memoryless encoder memoryless decoder Destination | seus channel i Transmitter Receiver | Noise i in which the message sequence is divided into sequential h k-bit block is mapped into an »-bit block where * Consider block codes, blocks of & bits long, and eacl n2k. The number of redundant bits added by encoder to each transmitted bloc! n—k bits. The ratio kin is called the code rate. Fig, 19. Block diagram of digital communication system Digital Communicge i k Code rate, r= “>, always <1 _ No. of message bits in a block ~ No. of bits in code word The accurate reconstruction of the original source sequence at 4 ity of symbol error should be les, destination requires average probabil Consider a discrete memoryless source with an alphabet $ and H(S) bits per source symbol and source emits symbols once ¢ entropy very T, seconds. H(S) , Average information rate of the source ae bits/sec The decoder delivers decoded symbols to the destination from the sours, alphabet S and the same source rate of symbol every Ts. { The discrete memoryless channel has a channel capacity equal to C bis| per use of the channel and used once every T, seconds, | . Cc | Channel capacity per units = =| bits /see 2 | This represents the maximum rate of information transfer over the channel. 
The Channel coding theorem for a discrete memoryless channel is stated in two parts a follows, - No po: ility of transmi » with small probability of error, - i ation ion and reconstruction of inform ¢ of the Henee, the maximum rate oo. e critical rau of the transmission is equal to the eri channel capacity, a ake place over fe i ich ¢: ake for reliable error-free messages, which can take Pp) 7 is called as “Channel Coding Theor lowing points, serete memoryle Inn important to note the foll iG It doesnot show us how to construct a good code. It do a »mbol error after ot have a precise result for the probability of symbol decoding the channel output. Binary Symmetric Channel ; ications channel model --A binary symmetric channel is a common communications channt chest . ‘ e] Q itte! shes: used in coding theory and information theory. In this model, a transmitter wi send a bit (a zero or a one) and the receiver receives a bit. It is assumed that the bit is usually transmitted correctly, but that it will ee ipped” with a small probability. This channel is used frequently in information theory because it is one of the simplest channel to analyse. 41.10. Binary symmetric channel A binary symmetric channel with cross over Probability p denoted by is a channel with binary input and binary output and Probability of error P; that is, if . 3 » if x Digital Commune a. is the transmitted random variable and Y is the received random Variable, th > then channel is characterized by the conditional probabilities, p.(¥=0[X=0) = p.(Y=0| X=1) = p p.(Y=1|X=0) = p p(Y=1|X=1) = 1-p | “It is assumed that 0

1/2 then the receiver can swap i output (interpret when it sees 0 and vice versa) and obtain an equival le channel with crossover probability 1 —p < 1/2. | This channel is used by theorists because it is one of the simplest nois, channels to analyse. Many problems is communicative theory can te! | reduced to a binary symmetric channel. Conversely, being able to transmit effectively over the Binary symmetric channel can give rise to solutions for more complicated channels. Channels Capacity for Binary Symmetric Channel By symmetry, the capacity for the binary symmetric channel is achieved with : a 1 channel input probability-p(x,) = p(x,) = 3» Where x and x, are each 0 or 1. Hence © = 105 ¥) 96) = ple,) = of From the binary symmetric channel diagram we have POI) = pO |x) =p “8 and B04) = pO [x)= 1 —p 8) Therefore, substituting capacity equation with J = P(*,) accordance with X(Dp) = 'g these channel transition probabilities into chant! K = 2, and then setting the input probability via)” ~Po log py~(1~p,) log, (1 - py) ald) ‘mation Theory We find that the eap; acity of the binary symmetric channel (5) C= 1+plog, p+(1-p) log, (Ip) Based on equation (4) we may define the entropy function re) 1 LL) 7 Xp) = vlogs () + (py oes (725) Hence, we may rewrite equation (5) as (7) C= 1-X(p) a en The channel capacity © varies with the probability of error F following figure. as shown in Channel Capacity, © i 2 05 Transition Probability, P ig. LL Variation of channel capacity ofa binary symmetric channel with transition ae probability p 1. When the channel is noise-free permitting to set p = 0, the channel capacity C attains its maximum value of one bit per channel use, which is exactly the information in each channel input. At this value of Ps entropy function X(p) attains its minimum value of zero, the 2. When the channel is noisy, producing a conditional probability of ertor p =}, the channel capacity C attains its minimum value of zero, whereas the entropy function X(p) attains its maximum value of unity, of the Channel Coding Theorem to Binary Symmetric Channels nels ‘Application iscrete memoryless source that emits equally likely bi ally likely binary 5 ‘yMib Consider a di vith the source entropy equall to one bit ae per source g Yb seconds, W ate of the source is ( once every Ts the information r I/¢) bits per second. | she couee sequence is app 10 9 Dinar channel ener wih coder | - she encoder produees a symbol once every conds, Hence, the chang an sety per unit time is (C/T.) units per second, where C is determined by th the capa prescribed channel transition probability p in accordance with following entropy’ function. xia) = plosa(2)# =p) toea(725) c = 1-X(p) “a Accordingly, the channel coding theorem implies that if (9) arbitrarily low by the use of a suitable ‘The probability of error can be made Is the code rate of the encoder. encoding scheme, But the ratio T, /Ts equa T, ra (10 ic Ts T, T, 7 oandr Hence, from the channel coding theorem, we may state that for any = vind fh < 0.9192, there exists a code of large enough length 7 and © oe w the, given hat when the code is appropriate decoding algorithm, such th channel, the average probability of decoding error Repetition codes Goad capacity a value =10° ‘Average probabily of error 0.01 01 10 | ks Fig, 1.12, Mlustrating significance of the channel coding theorem Example 2: Consider a simple coding scheme that involves the use of a repetition code, in which each bit of the message is repeated several times. 
Let each bit (0 or 1) be repeated » times, = 3, we transmit 0 and 1 as 000 and 111, respectively. where 7 = 2m +1 is an odd integer. For example, for * Comma IFin a block of received bits the number of Os exceeds the nun 1 the decoder decides in favor of a 0. Otherwise it decides in favor ofa, ° Tony Hence, an error occurs when m + 1 or more bits out of m= 2 m + Lj tence, . * 1 bit received incorrectly. Because of the assumed symmetric nature of the chany , Ba aa - Mel average probability of error, P, is independent of a prior; probabilities of y at any We find P, as “fn a= 3 ("Jeane FF i=m+l t where P is the transition probability of the channel. Table 1.1. Average Probability of Repetition Code 1 [A ability of Code Rate r=", verage Probability of error, p, 1 102 3x 10-4 10-6 4x 107 108 5x 10-10 1 3 L 5 a 7 1 9 1 I SARS RUS ee 1.13. CHANNEL CAPACITY THEOREM x Cais & zero-mean stationary process X(/) that is band-limited to B hertz. Lt x» K= 1, 2, .. m denote the continuous random variables obtained by unifort sampling of the process X(/) at the rate of 2B samples per second. The: i in me asl ate transmitted in T seconds over a noisy channel, also ba (0 B Hertz, Hence, the number of samples, n is given by RE ea aaa —- (1.57 Information Theory ee We reler 0X, aso Te ehiannel output Je Xi, a8 a sample of the transmitted signal. ‘The ea aA Perturbed by additive white Gaussian noise of zero mean and power °PS density Ny/ 2. The noise is band-limited to B hertz. Let the continuous R. . samples of andom variables Yj, k= 1, Que # denote samPl hown by U (2) Yy = X +N, k=1,2000 ‘The noise sample Ny. is Gaussian with zero mean and variance given bY oe = NB we assume that the samples Y,, k = 1,2, . 3) nare statistically independent . ibed is called a A channel for which the noise and the received signal are as deser X——_—>| Ye Fig. 1.13. Model of discrete-time, memory-less Gaussian Channel The transmitter is power limited and assign a cost to each channel input to make meaningful statements about the channel. So cost can be defined as ED] =P k=1,2..,n (4) where p is the average transmitted power. We define the channel capacity as C= max (1K, 5 ¥) EK?) =p} 83) LEED where I(X;, 5 Yq) is the average mutual information between a sample of the transmitted signal, X,, and the corresponding sample of the received signal Y,,. The . F ‘a maximization in above equation can be performed with respect to f Xx), the probability density function of X,. Ke), The average mutual information (X, + Y4) can be expressed j one op | quivatent forms, Among thatene is Ms UX EY = AOD FOI) sft Slave Ny and Ny are independent random variables and their sum equate y als such as an REN, kein We find that the conditional differential entropy of Yq, given Xy, is equal to conditional entropy of Ny. YEN) = AND Substituting equation (7) into equation (6) we get 1X2 Yi) = AY ANY (8) | Since A(Nq) mizing I (X, ; Yj) independent of the distribution of X,, may i} st requires maximizing 4(Y,) and the differential entropy of sample Yj of the received) signal, Pr For A(Y,) to be ma imum, Y, has to be a Gaussian random variable (i.e) the| | t1 samples of the received signal represent a noise like process. 7 Since N, is Gaussian by assumption, the sample X, of the transmitted signal must be Gaussian too. The maximization specified in equation (5) is attained by choosing the samples of the transmitted signal from a noise like process of average power p. So we can reformulate equation (5) as CaO EE X, Gaussian, E[X? 
] =p a For the evaluation of the —_ channel capacity C we proceed in three stages, he variance of sample iver si ° ee ple Yi, of the receiver signal equals p + o?. Hence rential entropy of Y, as AY) = 1 2 bg, [2x e (+62) 0) os od ~~ T159 Frerential 2, fence, the dif 2. The v, - 1als 6? atianee of the no} : nee of the noise sample Ny equ entropy of Ny as ‘ A) KN: N 2 (2m 0) recognizing 3. By substitu . YY substituting equations (10) and (11) in equation (9) and we get the the definiti A ¢ definition of channel capacity given in equation (9), desired result as (12) c=lt P : 2 logs +5) bits/channel use withthe channel used times forthe transmission of samples ofthe process /T) times. X(t) in T seconds, we find that the channel capacity per unit time is (1 The number n equals 2 BT and the channel capacity per unit time as > c= Bie (1435 ) bits/s (13) Statement Shannon’s channel capacity theorem is also referred as the Shannon- information that can be Hartley Law. Hartley showed that the amount of transmitted over a given channel is proportional to the product of the channel bandwidth and the time of operation. ve white B hertz, disturbed by addi Capacity of a channel of bandwidth iy /2 and limited in bandwidth to B, Gaussian noise of power spectral density I is given by P c= Btog,(1 +h ) bits/s rage transmitted power. where P is the ave! theorem is one of the most remarkable results of ‘The channel capacity on theory ina single formula. informati Channel capacity theorem highlights most vividly the interplay between ey system parameters. three k Digital Communje —~ @ Chann (average transmitted power and Power spectral density at the channel output, el bandwidth (iii) nois The theorem implies that for given average transmitted power P and chap, bandwidth B. we can transmit information at the rate C bits per second Ny arbitrarily small probability of error by employing sufficiently complex ened, Systems, z It is not possible to transmit at a rate higher than C bits per second by ay, encoding system without a definite Probability of error. Hence, the channe! capacity theorem defines the fundamental limit on the rate power-limited, band-limited Gaussian channel, 7 must have statistical propeniy error-free transmission for a Spproach this limit, the transmitted signal approximating those of white Gaussian noise. Ideal System We define an ideal system as one that transmits data at a bit rate R 6 equal to th: channel capacity C. We may then express the average transmitted power as P=E,C alt where E, is the transmitted energy per bit, Accordingly, the ideal system i defined by the equation 5 > ba (1+$88 } B (ENB J Y per-bit to noise power spectral density ratio E, / Nj. we may define the energy terms of the bandwidth efficiency, C/B for the ideal system as E, 2c/B _ | 7 No CB wll This equation is displayed as the curve labeled “eapacity boundary’ in belo figure. A plot of bandwidth effcieney Ry / B vérsus E, / Ny in figure called t bandwidth efficiency diagram, Based on this diagram, we can have the followit! observations. Pm see density ratio Be! 9| jo-no' 1 niting value lim ) ae soe This value is called the Shannon limit. Expressed 19 decibels. it equals — 1.6 dB. The corresponding limiting value of the channel eapaci®) fe cs) WN VG 104 f capacity boundary for which Rp = © KX Region for which Rp C). The latter region v is shown shaded in - Disctal Comm No. Rye Seam highlights potential trade-ofl among ProPability of symbol érror P.. 
We may view movement of the oper YN for a Fixed ; Point along a horizontal line as trading P, versus On the other hand, we may view movement of the operating 8 Vertical line ag trading P, versus R,/B, for a fixed E,/Ny. point The Shannon limit may also be defined in terms of the I2y/Ny required by ideal system for error-free transmission to be possible. We may write for the system, [0 (EYN) =, U Gano Jog, { 2 1 = 3 bom (5% log, ¢ and = che *o 202 The desired form for fy (x) is therefore described by _ —pn) Ix @) = Ee en (=S*) (B which is recognized as the probability density of a Gaussian random variabj of mean pt and variance 62, lex Already we have the entropy of a discrete random variable as 100 = f Fy 6) tog, [w]e ral =(e=pp I 109 = fy o7(=85"*) og ot Jax A(X) = ¥ tog, (2m e 62) We may summarize the results of this, example, as follows. 1. For a given variance o?, the Gaussian Random variable has the larges! differential entropy attainable by any random variable. (i.e) if X i84) Gaussian Random variable and Y is any other random variable with th same mean and variance, then for all Y. A(X) = ACY) where the equality holds if , and énly if, y = xX 1.67) The entro , stermined by he entre PY Of at Gaussian random variable X is uniquely deter the variance Mranee of N (Le, iL is independent of the mean of X) TWO MARKS QUESTIONS WITH ANSWERS What is information Information can be defined as the inverse of probability of occurrence . aE i) = log Ps Give to properties of information, * Information must be Non-negative. ty is more * If probability then information is more and if probabili then information is | 3. What is entropy? Entropy can be defined as the average amount of information per source symbol, KeI 1 HS) = Dry toes( 55) a0 Pe H(S) is called the entropy of a discrete memoryless source with source alphabet. 4. Give two properties of entropy. “If H(S) = 0, if and only if the probability p, = 1 for some k, and the remaining probabilities in the set are all zero. This lower bound on entropy corresponds to no uncertainty. . a le IP H(S) = log, &. ifand only ifpy = Z for all & (i.e all the symbols in the alphabet S are equiprobables. This upper bound on entropy corresponds to maximum uncertainty. ° Pi 5. What is discrete source? If a source emits symbols $ = {sq s), 5 ao Se} from a fix i a xed finit alphabet then the source is said to be discrete source. . fon Rate, . per ot bits OF intomy, | Information site“ is presented in average num : Ros) os are generated. is nate at which mes NUS) ~ is the entropy of the source. hat isa presi code? In Pretix code no codeword is the prefix of any other codeword, jy Variable th code, The binary digits are assigned to the mess: their Probabilities of eccurrence. Define Mutual Information, Mutual intormation 1(X.Y) ofa channel is defined by KX Y) = A(X) —HEXIY) bits/symbol HX) ~ enutopy of the source HQXIY) - Conditional entropy of Y 9. State any. Sour properties of mutual information, * The mutual information ofa channel j KX:¥) = (Y:x) * The mutual information is always non negg ative. KX;¥) 20 symmetric * The mutual information of a channel may be expressed in terms of entropy of the channel output as IXSY) = HOY) -Hy] * The mutual information of a chanr nel is re] fated to the joint entropy of th channel input and channel output by IXY) = HEQeHy— HX. Y) ee heony il {109} | Cf the entropy UNV) of a communication system to, Define © the yi SMficy where 104 and Y is he receiver, ents uncertainty of X on onal entropy. I repres is Kno 8 Known. 
tn other words HUX]Y) is an average meastre of aller V is received. umeeraingy jay HAI) te KNIYY represents the inform i ‘ition lost in the noisy cl + AN event has a six po la sible outcomes with probabilities 1/2, 1/4, 1/8, WAG US2. Vi * Mind the entropy of the system, I Poe 5.pyet iy ot 1 7 Ae P2 ys Pa 76+ Pa + Ps 39 WS Ep, tows dp) 2 +(F) tow ons (i) toes 06) ($4) toes 624 (53) toe 6) 1.5625 12. When is the average information delivered by a source of alphabet size i maximum? re equally likely Average information is maximum when the two n ise, py = py = V2. Then the maximum average information is q, i log, @+4 log,(2) = 1 bit /message 13. Name the source coding techniques. 1. A Prefix coding 2, Shannon-Fano coding Huffman coding we 14 Write the expression for code redunudaney in terms af entropy. Code redundancy is the measure of redundancy of bits in the encoded ‘ode - message sequence. Redundancy = 1 code efficiency | f = ive 2 If's0, why? 15. ds he information y Fa continuous system non negati “ 7 Lal Yes, the information of stem is non neg i tha IXY) > 0 sone orgs property. 16, State Shannon's continuous 3) frst theorem, Shannon's First The HS), the utee OF entry, ven a discrete memory . ‘orem: " ce encoding is bounded a '1GC code word length L for any source encodin [> IK) numb e average number The entropy H(S) represens a fuidamental limit onthe average nur a bits per source symbol i, JT. What is data compaction? | Data compaction ; ; ation so that he 'S used to remove redundant information s th | decoder Teconstructs { he original data with no less of information | 18. Whavis decision ree? Where itis used? The decision tree is a tree that has an in Corresponding to source symbols 55, s), Spo Sy_, Once each terminal Site mits its symbol, the decoder is reset to its inital state, | Decision tee is used for decoding operation 19. What is ‘instantaneous code? ial state and terminal states of prefix codes, Ifa codeword is not a prefi instantaneous code; eg. 0 10 110 1110 20. What is uniquely decipherable code? ix of any other codeword, then it is said to be Ifa codeword is not a Combination of, ANY other code itis sa, ‘Word th 0 be uniquely decipherable code, Smits said 101 1001 , / — . i sing 21. How ant encoding operation is taken place in, Lempel-Ziv. coding " » binary sequence? ties , . fa stream Encoding operation can be taken place by Parsing the source ae nto d previously. ements that are the shoftest subsequences not encounters’ 22. What is discrete channel? The channel is said to be diserete when both the alphabets have finite sizes. 23. What is memoryless channel? The channel is said to be memoryless when the current output symbol of the previous choices. depends only on the current input symbol and not any 24. Define a discrete memoryless channel, For the discrete memoryless channels, input and output both are discrete random variables. The current output depends only upon current input for such. channel. What is the important property while using the conditional probability Ol¥y? The sum of all the elements along the column side should be equal to 1. to on What is the important property while using the conditional probability (1X)? The sum of all the elements along the row side should be equal to 1. 26. 2% Define channel capacity? We define the channel capacity of a discrete memoryless channel as the maximum average mutual information IX ; Y) in any single use of the Pigital Comuriac ‘ig if cl hannel, i 28. State the channel coding theorem fora source with ana) s. 
Let discre' ‘ete memory Letad produce symbols at every capacity C and used once every le 1 nitled source output can be trans cheme for which s pe ability of error. a coding ith the small proba There exists constructed W over the channel and can be 1 ng) | & ~ T, aling at the cr! ical rate. Then, the s id to be 29, Explain Shannon-s -Fano-Coding ? tained by the following simple procedure An efficient code can be ol known an Shannon Fano algorithm. Algorithm: Step 1 : List the sour Partition the sets into two sets that are as close to equiprobabk symbols in order to decreasing probability. Step 2 + as possible and assign 0 to the upper set and assign 1 [0 the lower set. Step 3 : Continue this process each time partitioning the sets with * nearly probabilities as possible unti . a mntil further partitioning is possible. partitioning 1S 30. Define Huffman coding. The Huffman code is a si a source code wh d ose average Wi engl ‘orld lene! approaches the fundamental limit set b: set by the ent ofa di q source. tropy of a discrete memory! [ taformation Theory prescribed set he The Co s used to synthesize the code is to replace t ne. ss source with @ simpler one REVIEW QUESTIONS il al information 1. Define mutual information, Find the relation between the MUNA” and the joint entropy of the channel input and channel outPut: ca important properties of mutual! information. using Huffman Encode the source symbols with following set of probabilities y coding . m = { 0.4, 0.2, 0.12, 0.08, 0.08, 0.04} 3. Explain in detail Huffman coding algorithm and compare this with the other types of coding. 4, Explain the properties of entropy and with suitable example. Explain the entropy of binary memory less source. 5, Discuss source coding theorem, give the advantages and disadvs channel coding in detail and discuss the data conipaction. antages of Derive the channel capacity theorem. Explain the information capacity theorem. Explain the Shannon-Fano coding. Explain the discrete memoryless channels. 1 ule Ow ye aN } Gessivne wilervetle Which QoQ
