Arithmetic Logic Unit (ALU)
D IIgor Ivkovic
Dr. I k i
iivkovic@uwaterloo.ca
@
[with material from “Computer Organization and Design” by Patterson and Hennessy, and “Digital Design and
Computer Architecture” by Harris and Harris, both published by Morgan Kaufmann]
Objectives
Binary Representations of Integers
Basic Arithmetic Circuitry
Composing Arithmetic Logic Unit (ALU)
Multipliers
p and Dividers
2
Unsigned Binary Integers
Given an n-bit number:
x = xn−12n−1 + xn−2 2n−2 + + x121 + x 0 20
No sign bit present
Range: 0 to +2n – 1
Example:
0000 0000 0000 0000 0000 0000 0000 10112
= 0 + … + 1×23 + 0×22 +1×21 +1×20
= 0 + … + 8 + 0 + 2 + 1 = 1110
Using 32 bits:
Represent 0 to +4,294,967,295
H
How d
do we representt negative
ti numbers?
b ?
3
One Idea: Signed/Magnitude Binary Integers
Given an n-bit number:
x = xn−12n−1 + xn−2 2n−2 + + x121 + x 0 20
Left-most bit represents the sign (1 for negative, 0 for non-
negative)
Range: –2n – 1 + 1 to +2n – 1 – 1 (cannot represent –2n – 1 and 2n – 1)
Example:
1000 0000 0000 0000 0000 0000 0000 10112
= –(0 + … + 1×23 + 0×22 +1×21 +1×20)
= –(0 + … + 8 + 0 + 2 + 1) = –1110
Using 32 bits with signed/magnitude representation:
Represent –2,147,483,647 to +2,147,483,647
0 is represented twice as 1000…00002 and 0000…00002
Not trivial to perform arithmetic operations such as addition and
subtraction
4
Better Idea: Two’s Complement Integers /1
Given an n-bit number:
x = xn−12n−1 + xn−2 2n−2 + + x121 + x 0 20
Left-most bit represents the sign (1 for negative, 0 for non-
negative)
R
Range: –2
2n – 1 to
t +2n – 1 – 1 (cannot
( t representt 2n – 1)
Example:
1111 1111 1111 1111 1111 1111 1111 11002
Multiply = –1×2
1×231 + 1×2
1 230 + … + 1×2
1 22 +0×2
+0 21 +0×2
+0 20
by – 1 = –2,147,483,648 + 2,147,483,644 = –410
Using 32 bits
Representt –2,147,483,648
R 2 147 483 648 tto +2,147,483,647
2 147 483 647
0 is represented only once as 0000…00002
Easier to perform arithmetic operations such as addition and
subtraction (see the discussion that follows)
5
Better Idea: Two’s Complement Integers /2
Bit 31 or the left-most bit:
The most significant bit (MSB) or the sign bit
MSB 1111 1111 1111 1111 1111 1111 1111 11002
MSB = 1 for negative numbers
MSB = 0 for non-negative numbers
Non-negative numbers have the same unsigned and
p
two’s complement representation
p
Compute the negative value when MSB = 1:
Start with –2n – 1 and add +2k for each k-th bit from n – 2 to 0
that is 1; ignore the bits that are not 1
Example:
Start with 11002 = –23 + 22 = –8 + 4 = –410
6
Better Idea: Two’s Complement Integers /3
Some specific numbers:
Most positive: 0111 1111 … 1111
2
2: 0000 0000 … 0010
1: 0000 0000 … 0001
0: 0000 0000 … 0000
–1: 1111 1111 … 1111 Notice the pattern for the
–2: 1111 1111 … 1110 negative numbers: counting
the binary numbers in reverse
–3:
3: 1111 1111 … 1101
–4: 1111 1111 … 1100
–5: 1111 1111 … 1011
–6:
6: 1111 1111 … 1010
–7: 1111 1111 … 1001
–8: 1111 1111 … 1000
Most negative: 1000 0000 … 0000
7
Signed Negation
How to negate a signed integer?
Easy: Compute the complement and add 1
Compute the complement means convert 1 → 0 and 0 → 1
x + x = 1111...1112 = −1
x + 1 = −x
Example: Negate +2
+2 = 0000 0000 … 00102
–2 = 1111 1111 … 11012 + 1
= 1111 1111 … 11102
Interesting: What if you have to negate 0?
8
Sign Extension
Representing a number using more bits
The first goal is to preserve the numeric value
How to extend the bit representation?
Fill the extra bit slots with the MSB (the sign bit)
For the unsigned values, fill the extra slots with 0
Example: 4-bit to 8-bit
+5:
5 0101 => 0000 0101
–5: 1011 => 1111 1011
Example: 8
8-bit
bit to 16-bit
16 bit
+2: 0000 0010 => 0000 0000 0000 0010
–2: 1111 1110 => 1111 1111 1111 1110
9
Logical Operations
Instructions for bitwise manipulation:
Operation C Java MIPS
Shift left << << sll
Shift right >> >>> srl
Bitwise AND & & and, andi
Bi i OR
Bitwise | | or, ori
i
Bitwise NOT ~ ~ nor
Useful for extracting and inserting groups of bits in a word
MIPS instructions discussed later in the course
10
Shift Operations
Shift left logical (sll)
Shift left and fill with 0 bits
sll by i bits multiplies by 2i
Example: 11001 << 2 = 00100
Shift right
i ht logical
l i l (srl)
( l)
Shift right and fill with 0 bits
srl by i bits divides by 2i (unsigned only)
Example: 11001 >> 2 = 00110
We will use shifters in creation of multipliers
p
More on shifters in the context of MIPS later in the course
11
AND Operation
AND Operation
Useful to mask bits in a word
Select some bits, clear others to 0
MIPS: and $t0, $t1, $t2
$t2 0000 0000 0000 0000 0000 1101 1100 0000
$t1 0000 0000 0000 0000 0011 1100 0000 0000
$t0 0000 0000 0000 0000 0000 1100 0000 0000
12
OR Operation
OR Operation
Useful to include bits in a word
Set some bits to 1, leave others unchanged
MIPS: or $t0, $t1, $t2
$t2 0000 0000 0000 0000 0000 1101 1100 0000
$t1 0000 0000 0000 0000 0011 1100 0000 0000
$t0 0000 0000 0000 0000 0011 1101 1100 0000
13
NOT Operation
NOT Operation
Useful to invert bits in a word
Change 0 to 1, and 1 to 0
MIPS has a NOR 3-operand instruction
a NOR b == NOT ( a OR b ) Register
R i t $0 always
l
reads zero
MIPS : nor $t0, $t1, $0
$t1 0000 0000 0000 0000 0011 1100 0000 0000
$t0 1111 1111 1111 1111 1100 0011 1111 1111
14
Arithmetic Circuitry
Digital building blocks:
So far, we have covered gates, multiplexers, decoders, and
registers
i t
We will use these to build arithmetic circuits, counters, memory
arrays, and logic arrays
Digital building blocks demonstrate hierarchy,
modularity, and regularity:
Hierarchy of simpler components
Well-defined interfaces and functions
Regular structure easily extends to different sizes
We will use all of these building blocks to create a
microprocessor
p
15
Adders /1
Adders:
Add two input bits, A
and
dBB, andd output
t t the
th A B A B
result, S
Cout Cout Cin
If there is carry-over, it + +
is outputted as Cout S S
Carry-over is added to A B Cout S Cin A B Cout S
the next adder in the 0 0 0 0 0 0 0 0 0
chain, or it indicates 0 1 0 1 0 0 1 0 1
1 0 0 1 0 1 0 0 1
overflow if it is the last 1 1 1 0 0 1 1 1 0
adder in the chain 1 0 0 0 1
S = 1 0 1 1 0
Cout = 1 1 0 1 0
1 1 1 1 1
Carry-over when A
and B are both 1
S =
Cout =
16
Adders /2
Adders:
Half adder has no
carry-over input
i t ffrom A B A B
the previous adder
Cout Cout Cin
+ +
Full adder represents
a chained adder and S S
input from the A B Cout S Cout S
Cin A B
previous adder is 0 0 0 0 0 0 0 0 0
given as Cin 0 1 0 1 0 0 1 0 1
1 0 0 1 0 1 0 0 1
How to create an 1 1 1 0 0
1
1
0
1
0
1
0
0
1
adder from gates? S =A B 1 0 1 1 0
Cout = AB 1 1 0 1 0
Use the logical 1 1 1 1 1
formulas shown below
each truth table S = A B Cin
Cout = AB + ACin + BCin
17
Adders /3
Multibit Adders (CPAs)
When composing multibit adders, different strategies for
propagating
ti carry-over bits
bit apply
l
Types of carry propagate adders (CPAs):
Ripple carry
Ripple-carry (slow)
Carry-lookahead (fast)
Prefix (faster)
Carry-lookahead and prefix adders can perform faster for large
adders but require more hardware
Symbol:
18
Ripple-Carry Adder
Ripple-Carry Adder
Simply chain 1-bit adders together
Carry ripples through entire chain
Advantage: simpler operational semantics
Disadvantage: slower performance
Ripple-Carry Adder Delay
i l = N x tFA , where tFA is the delay of a full adder
tripple
19
Carry-Lookahead Adder /1
Carry-Lookahead Adder
Instead of waiting for each adder to compute the carry-out bit,
di id the
divide th addition
dditi iinto
t kk-bit
bit bl
blocks
k and
d precompute
t carry-outs
t
Compute carry-out (Cout) for k-bit blocks using generate, Gi,
and propagate, Pi, signals
Operational Semantics:
Column i adder produces a carry-out by either generating a
carry-outt or propagating
ti a carry-in
i to
t the
th carry-outt
Generate (Gi) and propagate (Pi) signals for each column:
Column i will generate a carry-out if Ai AND Bi are both 1
Gi = Ai Bi
Column i will propagate a carry-in to the carry-out if Ai OR Bi is 1
Pi = Ai + Bi
The carry-out
carry out of column i (Ci) is:
Ci = Ai Bi + (Ai + Bi)Ci-1 = Gi + Pi Ci-1
20
Carry-Lookahead Adder /2
G i = Ai Bi
Pi = Ai + Bi
Ci = Ai Bi + (Ai + Bi)Ci-1
= Gi + Pi Ci-1
Example (in class):
A = 0001
B = 1111
Cout = 1
21
Carry-Lookahead Adder /3
Operational Semantics Continued:
Step 1. Compute Gi and Pi for all columns
Step 2. Compute G and P for k-bit blocks
Step 3. Cin propagates through each k-bit P/G block
E
Example:
l 4-bit
4 bit blocks
bl k (G3:0 and
d P3:0)
G3:0 = G3 + P3 (G2 + P2 (G1 + P1G0 )
P3:0 = P3P2P1P0
Generally, as shown on the previous slide:
Gi:j = Gi + Pi ((Gi-1 + Pi-1 ((Gi-2 + Pi-2Gj )
Pi:j = PiPi-1Pi-2Pj
Ci = Gi:j + Pi:j Ci-1
22
Carry-Lookahead Adder /4
Carry-Lookahead Adder Delay
For N-bit CLA with k-bit blocks:
tCLA = tpg + tpg_block + (N/k – 1) x tAND_OR + k x tFA
where tpg is the delayy needed to ggenerate all Pi, Gi
tpg_block is the delay needed to generate all Pi:j, Gi:j
tAND_OR is the delay from Cin to Cout of final AND/OR gate in k-bit
CLA block
An N-bit CLA is generally much faster than a ripple-carry
adder for N > 16
23
Prefix Adder /1
Prefix Adder
Continues on the idea of carry-lookahead adder
Computes carry-in
carry in (Ci-1)
(Ci 1) for each column then computes the sum
Si = (Ai ⊕ Bi) ⊕ Ci
Computes G and P for 1-, 2-, 4-, 8-, etc bit blocks until all Gi
((carry-in)
y ) are known ((computed
p in log
g2N stages)
g )
Operational Semantics Overview:
Carry-in either generated in a column or propagated from a
previous column
Column -1 holds Cin, so G-1 = Cin and P-1 = 0
Carry-in to column i equals carry-out of column i-1: Ci-1 = Gi-1:-1
Gi-1:-1 is
i th
the generate
t signal
i l spanning
i columns
l ii-1
1 tto -1
1
Sum equation: Si = (Ai ⊕ Bi) ⊕ Gi-1:-1
Goal: Quickly compute G0:-1, G1:-1, G2:-1, G3:-1, G4:-1, G5:-1, …
Th
These are called
ll d prefixes
fi
24
Prefix tree
composed of
these nodes
25
Prefix Adder /3
Operational Semantics Continued:
Generate and propagate signals for a block spanning bits i:j:
Gi:j = Gi:k + Pi:k Gk-1:j
Pi:j = Pi:kPk-1:j
G
Generate
t Stage:
St bl
block
k i:j
i j will
ill generate
t a carry if
Upper part (i:k) generates a carry, or
Upper part propagates a carry generated in lower part (k-1:j)
(k 1:j)
Propagate Stage: block i:j will propagate a carry if
Both the upper
pp and lower p
parts p
propagate
p g the carry
y
See Section 5.2.1 of the Harris textbook for details
26
Prefix Adder /4
Prefix Adder Delay
For N-bit prefix adder (PA):
tPA = tpg + log2N x (tpg_prefix) + tXOR
tpg is the delayy to pproduce Pi Gi ((OR and AND g gate))
tpg_prefix is the delay of black prefix cell (AND-OR gate)
The delay
y grows
g logarithmically
g y instead of linearly
y
An N-bit PA is generally faster than a CLA for N ≥ 32
27
Adder Delay Comparisons
Let us compare the delay of 32-bit ripple-carry,
carry-lookahead, and prefix adders
CLA has 4-bit blocks
2-input gate delay = 100 ps; full adder delay = 300 ps
tripple
i l = N x tFA = 32(300 ps) = 9.6 ns
tCLA = tpg + tpg_block + (N/k – 1) x tAND_OR + k x tFA
= [100 + 600 + (7)200 + 4(300)] ps = 3.3 ns
tPA = tpg + log2N x (tpg_prefix) + tXOR
= [100 + log232(200) + 100] ps = 1.2 ns
28
Subtractor
Subtraction:
Subtract A from B by first changing the sign of B using an
i
inverter
t (i(i.e., complement
l t B’
B’s bit
bits and
d th
then add
dd 1)
Then add A and the inverted B
Cin = 1
29
Equality Comparator
Comparator:
Determines if two N-bit binary numbers, A and B, are equal, or
if one number
b is i greater
t than
th or less
l than
th the
th other
th numberb
Equality Comparator:
Produces a single output that indicates if A is equal to B
30
Magnitude Comparator
Magnitude Comparator:
Produces one or more outputs
i di ti th
indicating the relative
l ti values
l off A
and B
First compute A – B and then look
at the MSB
S off the result
If the MSB equals 1 (i.e., the result
is negative), A is less than B;
otherwise, A is greater than or
equal to B
31
Introducing Arithmetic Logic Unit (ALU) /1
Arithmetic/Logical Unit (ALU):
Combines mathematical and logical operations
A typical ALU performs AND, OR, addition, subtraction, and
magnitude comparison operations
The ALU is at the core of most computer
p systems
y
Certain ALUs produce extra outputs, called flags, that indicate
information about the ALU output
For example,
example overflow flag indicates that the result of the adder
overflowed; zero flag indicates that the result is zero
The symbol
y for ALU:
where F is the selector bit that selects the
corresponding operation
32
Introducing Arithmetic Logic Unit (ALU) /2
F2:0 Function
000 A&B
001 A|B
010 A+B
011 not used
100 A & ~B
101 A | ~B
110 A-B
111 SLT
33
Introducing Arithmetic Logic Unit (ALU) /3
F2:0 Function See the traces of
operations
ti on
000 A&B the board
001 A|B
0
010 A+B
F2 bit
011 not used
100 A & ~B
101 A | ~B
Extend
110 A-B
Zero
F1:0 bits
111 SLT
0
3
34
Introducing Arithmetic Logic Unit (ALU) /4
Configure 32-bit ALU for A B
N N
SLT operation:
A = 25 and B = 32
A < B, so Y should be 32-bit N
1
representation of 1
0
F2
(0x00000001) N
F2 bit
F2:0 = 111
F2 = 1 ((adder acts as
subtracter), so Cout +
25 - 32 = -7 [N-1] S
-7 has 1 in the most significant
Exttend
bit (S31 = 1)
Z
Zero
F1:0 = 11 multiplexer selects Y N N N N F1:0 bits
= S31 (zero extender) =
0
3
2
0x00000001 2 F1:0
N
Y
35
Shifters and Rotators /1
Shifters and Rotators:
Move bits and multiply or divide by powers of 2.
Shifter: Circuit that shifts a binary number left or right by a
specified number of bit positions
Rotator: Rotates a binary y number in a circle pattern,
p , so that
empty spots are filled with bits shifted from the other end
Logical Shifter: Shifts a binary number to the left (sll) or right
(srl), and fills the empty bits with 0s
Arithmetic Shifter: Performs the same as the logical shifter on
left (sla and sll are the same); on right, it fill the bits with a
copy of the MSB (sra is different than srl)
36
Shifters and Rotators /2
Shifters and Rotators Examples:
Logical shifter:
11001 >> 2 = 00110
11001 << 2 = 00100
Arithmetic shifter:
11001 >>> 2 = 11110
11001 <<< 2 = 00100
Rotator:
11001 ROR 2 = 01110
11001 ROL 2 = 00111
37
Shifters and Rotators /3
A3 A2 A1 A0 shamt1:0
shamt1:0: 2
Represents the shift magnitude
00
S1:0
01
The input, A, is shifted by 10
Y3
shamt1:0 bits, and routed to Y 11
shamt1:0 as the
If shamt1:0 = 00,, Y = A 00 selector bits
S1:0
01
10
Y2
11
00
S1:0
01
10
Y1
11
00
S1:0
01
10
Y0
Ground 11
symbol
38
Shifters and Rotators /4
Shifters and Rotators Applied:
A << N = A × 2N
00001 << 2 = 00100 (1 × 22 = 4)
11101 << 2 = 10100 (-3 × 22 = -12)
A >>> N = A / 2N
01000 >>> 2 = 00010 (8 / 22 = 2)
10000 >>> 2 = 11100 (-16 / 22 = -4)
Trace these through the matching shifters on the next slide
39
Multipliers /1
Multipliers:
N-by-N multipliers multiply two N-bit numbers, and produce
2N bit results
2N-bit lt
The partial products in binary multiplication are either the
multiplicand or all 0s
Multiplication of 1-bit binary numbers is equivalent to the AND
operation, so AND gates are used to form the partial products
A B
230 multiplicand 0101 Multiplying a single digit 4 4
x 42 multiplier x 0111 of the multiplier with
460
+ 920
partial
ti l
products
0101
0101
multiplicand in stages
x
9660 0101
+ 0000 Shifted partial products
8
are then summed to
result
lt 0100011 form the result P
230 x 42 = 9660 5 x 7 = 35 41
A3 A2 A1 A0
x B3 B2 B1 B0
A3B0 A2B0 A1B0 A0B0
A3B1 A2B1 A1B1 A0B1
A3B2 A2B2 A1B2 A0B2
+ A3B3 A2B3 A1B3 A0B3 Trace execution of
0101 x 0101
P7 P6 P5 P4 P3 P2 P1 P0
A3 A2 A1 A0
B0
B1
0
Adders add components of
the partial products 0
B2
0 Cout
B3
Shifting performed via 0
the output bits
P7 P6 P5 P4 P3 P2 P1 P0
42
Divider Overview
Divider Example: 0 B3 0 B2 0 B1 A3 B0
The divider computes A/B
and
d produces
d a quotient,
ti t 1
Q, and a remainder, R Q3
N indicates if R – B is A2
negative, and it is
1
obtained from the Cout bit
of the left-most block Q2
g A1
R B
1
R B
Cout Cin Cout Cin
i + i Q1
D
D A0
N R'
1
N 1
Q0
R'
R3 R2 R1 R0 43
Food for Thought
Download and Read Assignment #2 Specifications
Read:
ead
Chapter 3 of the course textbook
Review the material discussed in the lecture notes in more detail
(Optional) Chapter 5 of the Harris and Harris textbook
44