CoolPression: A Hybrid Significance Compression Technique for Reducing Energy in Caches
Mrinmoy Ghosh, Dr. Hsien-Hsin Lee, Weidong Shi
School of Electrical and Computer Engineering
Georgia Institute of Technology
Mrinmoy Ghosh, September 15, 2004
Hot Caches
[Figure: per-component percentage breakdowns for three processors]
Alpha 21264: Mem. Controller 19%, Bus Interface Unit 12%, Data Cache 14%, Instruction Cache 7%, Data Path 32%, Integer Units 16%
ARM 920T: ARM 9 25%, I Cache 25%, D Cache 19%, BIU 8%, D MMU 5%, I MMU 4%, Other 4%, Clocks 4%, SysCtl 3%, CP15 2%, PATag RAM 1%
Intel Pentium 4 (Willamette): [figure only; no values recovered]
Motivation
[Figure: log-scale plot, y-axis from 1 to 1,000,000,000, x-axis from 1 to 63, with Max, Min, and Avg series]
Salient Features
Reuse the most significant byte
Counting granularity is in bits rather than bytes
The compression scheme is a hybrid of two schemes, with the better scheme chosen dynamically
Each scheme saves around 30-45% power when applied independently; the hybrid scheme saves around 3-5% more than either scheme alone
CoolPression Cache
[Figure: CoolPression cache read path. Labels in the figure: CE Bit, ZIBs, count bits, SRAM Cell Array, Sense Amps, Bitline Enable Lines, CoolCount Circuit, 37 bits of data from the cache, 32-bit Data Out]
Step 1: Read in the first 7 bits and the ZIBs
Step 2a: Read only the 32 - count bits and append leading zeroes or ones
Step 2b: Read the bytes that are not zero
Counting Leading Zeroes And Ones
[Figure: bit positions 7 down to 0 feed a Priority Encoder; the encoder output is the number of leading zeroes or ones minus 1]
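The counting step can be sketched in software. This is a minimal Python sketch, not the circuit from the slides: it assumes a 32-bit word, uses integer arithmetic in place of the priority encoder, and the names `leading_run_length` and `encoder_output` are hypothetical.

```python
WORD_BITS = 32  # assumed word width


def leading_run_length(word: int, width: int = WORD_BITS) -> int:
    """Count how many most-significant bits equal the top bit (a run of 0s or 1s)."""
    mask = (1 << width) - 1
    word &= mask
    if word >> (width - 1):
        # Top bit is 1: invert so the run of ones becomes a run of zeroes.
        word ^= mask
    # After the optional inversion, the run is the number of leading zeroes.
    return width - word.bit_length()


def encoder_output(word: int, width: int = WORD_BITS) -> int:
    """Value the slides label 'No of Leading Zeroes or Ones - 1'."""
    return leading_run_length(word, width) - 1
```

Both `leading_run_length(0x000000FF)` and `leading_run_length(0xFFFFFF00)` give 24. Because the run is always between 1 and 32, subtracting one keeps the output in the range 0 to 31, which fits in five bits.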
Bitline Precharge Enabling Circuit
Read Data From Cache
Read in the Count Enable (CE) bit and the first 6 bits of data
Is CE == 1?
Yes: Enable the least significant ~count bit lines, then read data from the least significant ~count bit lines and append count leading zeroes or ones
No: Read data for the bytes where the ZIB is not enabled and make the other bytes zero
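The read flow can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the hardware itself: the stored entry is modeled as a dictionary with hypothetical field names, `~count` is taken to mean 32 - count, `fill` stands for the repeated leading bit, and a ZIB of 1 is taken to mark a byte that is zero.

```python
WORD_BITS = 32  # assumed word width


def decode_entry(entry: dict) -> int:
    """Rebuild a 32-bit word from a stored entry (hypothetical layout)."""
    if entry["ce"] == 1:
        # CE set: only the low (32 - count) bits were stored.
        count = entry["count"]
        low_bits = WORD_BITS - count
        payload = entry["payload"] & ((1 << low_bits) - 1)
        if entry["fill"]:
            # Append `count` leading ones above the payload.
            return (((1 << count) - 1) << low_bits) | payload
        # Leading zeroes need no extra work.
        return payload
    # CE clear: read only bytes whose ZIB is 0; the others become zero.
    word = 0
    for i in range(4):  # index 0 is the least significant byte
        if not entry["zibs"][i]:
            word |= (entry["bytes"][i] & 0xFF) << (8 * i)
    return word
```

In the second branch, whatever sits in a byte cell whose ZIB is set is ignored, which mirrors the idea that those bit lines are never read.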
Write Data To Cache
Count the number of leading zeroes or ones, and check for bytes that are zero
Is count > 8?
Yes: Set the CE bit to one and enable the most significant 6 bit lines and the least significant ~count bit lines
No: Set the CE bit to 0 and write the data to the cache, setting ZIBs where necessary
Write the encoded data to the cache
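The write flow can be sketched in the same spirit. This is a minimal Python sketch, assuming a 32-bit word and taking `~count` to mean 32 - count; the dictionary layout and the names `encode_word` and `data_bits_written` are hypothetical, and the bit count is an illustrative tally of enabled data bit lines, not an energy model.

```python
WORD_BITS = 32  # assumed word width
MASK = (1 << WORD_BITS) - 1


def encode_word(word: int) -> dict:
    """Choose bit-level (CE = 1) or byte-level (CE = 0) encoding for one word."""
    word &= MASK
    fill = word >> (WORD_BITS - 1)          # the repeated leading bit
    probe = word ^ MASK if fill else word   # turn a run of ones into zeroes
    count = WORD_BITS - probe.bit_length()  # leading zeroes or ones
    if count > 8:
        low_bits = WORD_BITS - count
        return {"ce": 1, "fill": fill, "count": count,
                "payload": word & ((1 << low_bits) - 1)}
    # Otherwise fall back to zero-indicator bits, one per byte.
    data = [(word >> (8 * i)) & 0xFF for i in range(4)]  # index 0 = low byte
    return {"ce": 0, "zibs": [int(b == 0) for b in data], "bytes": data}


def data_bits_written(entry: dict) -> int:
    """Illustrative count of data bit lines enabled for this entry."""
    if entry["ce"] == 1:
        # Most significant 6 bit lines plus the low (32 - count) bit lines.
        return 6 + (WORD_BITS - entry["count"])
    # Only the non-zero bytes are written.
    return 8 * entry["zibs"].count(0)
```

For instance, `encode_word(0xFFFFFF5C)` takes the CE = 1 path with a count of 24 and touches 14 data bit lines under this tally, while `encode_word(0x12000034)` has only 3 leading zeroes, takes the CE = 0 path, and writes its two non-zero bytes.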
Effect on Performance
[Figure: IPC (axis 0 to 2.5) of the Normal Cache versus the CoolCount Cache for Crafty, Gcc, Gzip, Mcf, Parser, Twolf, Vortex, VPR, and Avg]
Results
16K Data Cache
[Figure: bar chart, y-axis 0 to 1.2, comparing Dcache Base, Dcache DZC, Dcache CoolCount, and Dcache CoolPression for Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, and Avg]
16K Instruction Cache
[Figure: bar chart, y-axis 0.75 to 1.05, comparing Icache Base, Icache DZC, Icache CoolCount, and Icache CoolPression for Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, and Avg]
Conclusions
System-transparent hybrid zero compression scheme
Both bit-level and byte-level compressibility are used to save power
Energy savings of over 35% relative to the baseline cache
Potential use wherever data transfer takes place, such as between the L2 cache and memory
Thank You