Arithmetic evaluation
Average codeword length in bits in Huffman and
Arithmetic:
Huffman
Arithmetic
Huffman cost (%)
5.
English
4.1854
4.1603
0.6
Finnish
4.0448
4.0134
0.8
French
4.0003
4.0376
0.9
German
4.1475
4.1129
0.8
Hebrew
4.2851
4.2490
0.8
Italian
4.0000
3.9725
0.7
Portuguese
4.0100
3.9864
0.6
Spanish
4.0469
4.0230
0.6
Russian
4.4704
4.4425
0.6
English-2
7.4446
7.4158
0.4
Hebrew-2
8.0370
8.0085
0.4
 Klein S. T. and Wiseman Y.
Small Alphabet
What will happen if the items set for
the compression is small?
Example of 77.49MB of bitmaps:
n
Huffman
(MB)
Arithmetic
(MB)
increase (%)
each bit
77.49
9.64
703
k-blocks
256
13.70
7.70
78
5.
 Klein S. T. and Wiseman Y.
Time considerations
Encoding time in seconds:
200
180
Arithmetic
160
140
Static
Huffman
120
100
Adaptive
Huffman
80
60
40
20
0
Gadsby
5.
Tora
gnuplot.exe
chess.bmp
 Klein S. T. and Wiseman Y.
Time considerations
Decoding time in seconds:
800
Arithmetic
700
600
Static
Huffman
500
400
Adaptive
Huffman
300
200
100
0
Gadsby
5.
Tora
gnuplot.exe
chess.bmp
 Klein S. T. and Wiseman Y.
Quasi Arithmetic Coding
A combination of Huffman coding and
Arithmetic coding.
Algorithm:
Compress data by Huffman (Or other prefix
codes).
Compress the bits by Arithmetic coding.
5.
 Klein S. T. and Wiseman Y.
An example
The items are:
a-10%
i-20%
r-30%
y-40%
A Huffman tree for these item:
y-0
r - 10
a - 110
i  111
The string which should be encoded is: "ii".
5.
 Klein S. T. and Wiseman Y.
An example (Cont.)
ones - 0.3*1+0.1*2+0.2*3=1.1
zeros - 0.4*1+0.3*1+0.1*1=0.8
Ratio is 8:11
Arithmetic code for bits:
- [0, 819 )
- [ 819 ,1)
5.
 Klein S. T. and Wiseman Y.
An example (Cont.)
The bit string for "ii" is: 111111.
[0,1)  [ 819 ,1)  [ 240 361 ,1) 
 [5528 6859 ,1)  [115680 130321 ,1) 
 [ 2315048 2476099 ,1)
 [ 45274320 47045881 ,1)
5.
 Klein S. T. and Wiseman Y.
An example (Cont.)
The shortest binary fraction in this interval is
0.11111 which is just 5 bits.
Huffman represents "i" by 3 bits which is too
much since 3=-log20.125 while the probability
of "i" is 0.2.
0.2*0.2=0.04 so the content of data in this
string is -log20.04 which is circa 4.643856 so
5 bits are the optimal compression of this
string.
5.
 Klein S. T. and Wiseman Y.
Advantages of Quasi Arithmetic
Just two intervals (for 1 and 0) so can
be implemented by tables instead of
floating point divisions.
These table can be implemented even
in hardware.
 Faster than Arithmetic coding but has
the same efficiency.
5.
 Klein S. T. and Wiseman Y.
Theorem
The second step of the Quasi Arithmetic
Coding will give the worst improvement when
the 1s and the 0s have exactly the same
probability - 0.5
In other words: The entropy of a binary
string will be maximal, if the 1s and the 0s
have exactly the same probability - 0.5
When the ratio between the 1s and the 0s
grows, the entropy will be diminished.
5.
 Klein S. T. and Wiseman Y.
Theorem - Proof
The derivative of the entropy is:
1
H' = [   Pi log 2 Pi ]' = [ P log 2 P  (1  P ) log 2 (1  P)]' =
i =0
(1 P) ln (1 P)
P
ln
P
= [ ln 2 
]'
ln 2
 P +  ln P
P ln 2
ln 2
(1 P)( 1)
 ln 2 (1 P)
= ln12  log 2 P + ln12 + log 2 (1  P) = log 2 (1PP )
5.
 Klein S. T. and Wiseman Y.
 ln (1 P)
 ln 2
Theorem - Proof (Cont.)
log 2 (1PP ) = 0
1 P
P
=1
1 P = P
P = 12
So there is an extremum point when
P=1/2.
5.
 Klein S. T. and Wiseman Y.
Theorem - Proof (Cont.)
The second derivative is:
ln(1 P)
[log 2 (1  P )  log 2 P ]' = [ ln 2
P ]' = 
1
1
 ln
ln 2
(1 P ) ln 2 P ln 2
When P=1/2:
 ln22  ln22 < 0
So 1/2 is a maximum point.
5.
 Klein S. T. and Wiseman Y.
Theorem - conclusion
When P=1 or P=0:
lim x 0 H = lim x 0  x  log 2 x  1  log 2 1 =
lim x 0  log 2 ( x ) =  log 2 1 = 0
x
When P=1/2:
H =  12 log 2 12  12  log 2 12 = 1
5.
 Klein S. T. and Wiseman Y.
Conclusion in a graph
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
P
5.
 Klein S. T. and Wiseman Y.
0.2
0.1
H as a function of P
Wrong probabilities
Excess of Huffman over Arithmetic in
percents when assuming the rows to
compress the columns
English
Gadsby
German
Finnish
French
English
0.6
1.4
-2.1
-5.1
0.4
Gadsby
-2.4
1.0
-2.9
-0.8
-3.3
German
-1.8
-3.6
0.9
-5.2
-0.1
Finnish
2.4
1.2
2.7
0.8
2.9
French
0.8
2.2
-4.6
-11.3
0.9
5.
 Klein S. T. and Wiseman Y.
Error in Arithmetic coding
One bit in the compressed file was changed
from "1" to "0":
In the beginning God created the heaven and the ea eithen e
tnos hheaheedh rd ehtesthhfo nh dtohrnh we
telw .e as io
oht
o dwf esttfde evsieoeGehee mn n to ito en,nnte
denntoe asipa c., meeha odftdetredd gnt nGp og rmoa oei
ndf r pnfwh eias peedd kaeoweeihd.gomhi G ro m sarc t
eeioorAhiytgawGtLmG r
ddis.i tswot suea df.i. at
i vsha nieI,yawedd D cn eAotc ndeenogidihhese
n etdoleated oiAhettS tDeot Aee retd gctdehneds hheda tdai
sdwL toheAc ohia nrv,iaicna:dde iettweah dea Gmg k.d,ai
kafe rew:ehtd .truit h.ihnd enif e.dwgt hrihrm atn
hSetbhhdd hdtongehie
5.
 Klein S. T. and Wiseman Y.