0% found this document useful (0 votes)
25 views68 pages

RCC 15

The document discusses online placement and routing in reconfigurable computing systems, focusing on the KAMER algorithm for managing empty rectangles during task placement. It also covers the challenges of routing in Network on Chip (NoC) architectures, emphasizing the importance of efficient routers and various routing algorithms. The content highlights the need for effective communication between processing elements and the management of data flow within integrated circuits.

Uploaded by

adithyanmusic
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views68 pages

RCC 15

The document discusses online placement and routing in reconfigurable computing systems, focusing on the KAMER algorithm for managing empty rectangles during task placement. It also covers the challenges of routing in Network on Chip (NoC) architectures, emphasizing the importance of efficient routers and various routing algorithms. The content highlights the need for effective communication between processing elements and the management of data flow within integrated circuits.

Uploaded by

adithyanmusic
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 68

Reconfigurable Computing

AELZG 554 / ES ZG554 / MELZG554


Session 15
Pawan Sharma
BITSPilani ps@pilani.bits-pilani.ac.in
Pilani Campus 15/05/2025
Last Lecture

• Scheduling
• ASAP
• ALAP
• List scheduling
• Force Directed Scheduling

2
BITS Pilani, Pilani Campus
Today’s Lecture

• Online Placement
• Routing

3
BITS Pilani, Pilani BITS
Campus
Online Placement

• Given a reconfigurable processing unit H at time t


with a given configuration Ct.
• The task of online placement is to find an optimal
position for a new incoming task v such that v
does not overlap with any running component in
the configuration Ct.

4
BITS Pilani, Pilani Campus
Partial Reconfigurable System
System Model: Operating system incl.
• Scheduler: Assigns resources to incoming tasks.
• Module Database: Contains the physical implementation
of the tasks.
• Placer: Assigns space on the reconfigurable device to run
tasks sent by the scheduler.

Interaction between scheduler and placer is open.


Possible scenario:
• The scheduler requests the placement of a new task.
• The placer tries to find a place on the device for the new
task.
• Upon success, it acknowledges the scheduler and
downloads the corresponding module to the device. Fine!
• Upon failure, it acknowledges the scheduler. The scheduler
can now decide to reject the request or to try it later again.
• In all cases, we have to decide on-line, if a new component
can be placed on a device where other components are
already running

BITS Pilani, Pilani Campus


• The online placement methods are 2D extensions of the FF, BF, and BL algorithms.
• The generic algorithm consists of two parts: an empty space partitioning manager for
insertion and deletion and a search engine and bin-packing rule.
• The partitioning part divides the empty region on the chip into not necessarily disjoint
rectangles called “empty rectangles.”
• The second part is responsible for selecting an empty rectangle to accommodate a
module whose insertion is requested.
• All empty rectangles that can accommodate the module are candidates for the location
of the module on the chip. The bin-packing rule is used to favor one over the others.
• Finally, the module will be placed at the bottom-left corner of the selected empty
rectangle. For example, the criteria could be to choose the empty rectangle with
minimum area (Best Fit) or to pick the one with the lowest bottom side.

BITS Pilani, Pilani Campus


• On the basis of empty rectangle concept, a strategy to solve the
online temporal placement problem was based on a method which
has the name keeping all maximum empty rectangles (KAMER),
permanently keeps track of all MERs.
• Whenever a request for placing a component v arrives, the list of
MERs is searched for a rectangle that can accommodate v.
• Because it is possible to have many MERs in which v can fit, strategies
such as the first-fit or best-fit are used to select one rectangle from
the set of possible free rectangles.
• Once a rectangle is chosen, the candidate points that can be chosen
as reference point for placing the new component are those that do
not allow an overlap with the external part of the rectangle.
• Heuristically, we use the bottom left point as the reference one to
place the component.

7
BITS Pilani, Pilani Campus
Kamer Algorithm

• The decision in KAMER algorithm is taken by choosing the


one with the leftmost left edge based on bottom-left
Heuristic.
• Now an RFUOP r, can be either accepted or rejected based
on the availability of RFU in the real estate, If an RFUOP is
rejected, the same function should be performed by the
host CPU; hence, a running time penalty is incurred.
• In this model, we assume there is no communication
between RFUOPs,
• The data to be processed by an RFUOP are loaded on the
RFU before the RFUOP starts execution.
• After it is done, the result is read into CPU registers .

BITS Pilani, Pilani Campus


KAMER Model of an RCS
On-line temporal placement

o RFUOPs = {r1, …, rn| ri = (wi, hi, si, ei)} ; si < ei


• ei - si : time span the operation is resident in the system
o Placement of RFUOPs:
• Modeled as 3D template placement
o Empty Rectangle (ER):
• A rectangle that does not overlap a placed module on the chip
o Maximum Empty Rectangle (MER):
• An ER not included in any other rectangle than the device bounding box

BITS Pilani, Pilani Campus


Example
 Non-ER:
o (E,F) -- because it overlaps module v2.
 ER:
o (E,D): MER
o (A,D): MER
o (E,C): not MER
 KAMER method:
oKeeping All Maximum Empty Rectangles:
• Permanently keeps track of all MERs
• Whenever a request for placing a component v arrives, the list
of MERs is searched for a rectangle that can accommodate v.
oPossible to have many MERs in which v can fit
Strategies: first-fit or best-fit
o Bottom left [Bazargan]
o The great advantage of the KAMER approach is that all the free
space on the device is captured, thus providing a good quality. This
happens at the cost of efficiency in the computation time

BITS Pilani, Pilani Campus


KAMER
Once a rectangle is chosen:
• Candidate points are those that do not allow an overlap with the external part of the rectangle.
 Problem:
• o Number of empty rectangles does not grow linear with the number of components included. When
implementing a placement method of this type, we are obviously not looking for the fastest, but rather the
best quality placement algorithm. ((D,I),
(C,H),
((D,I),
(N,P),
(C,H),
(A,U),
(A,P),
(N,O),
(A,O),
(A,S),
(F,O),
(F,S),
(E,N),
(E,R),
(Q,G),
(E,K),
(Q,H),
(L,H),
(E,K),
(L,J),
(M,I),
(M,S),
(L,J))
The configuration contains 11 modules The number of modules grows to 14 (V,I),
before the insertion of a new task after the insertion of a new task v4 (T,I)),

BITS Pilani, Pilani Campus


KAMER
o Run-time of free rectangle placement: O(n2)

In order to avoid the quadratic space requirement and increased running time, we can keep only linear number of
empty rectangles in terms of the number of modules. The strategy consist of keeping only the non-overlapping
empty rectangles, thus reducing the number of empty rectangles to be managed. This allows a linear run-time at
the cost of the quality. The problem with this approach is that several non-overlapping rectangles representation
of a configuration may exists leading to multiple possible choices in the representation .

BITS Pilani, Pilani Campus


Keeping Non-Overlapping ERs
 Problem 2:
o Non-overlapping empty rectangles are not necessarily maximal
• A module may exists that could fit onto the device, but cannot be
placed

BITS Pilani, Pilani Campus


Motivation for keeping maximal empty rectangles

• whenever a new component v1 is placed in a nonoverlapping rectangle, two possibilities exist to split
that rectangle. A horizontal split that is done using segment Sa and a vertical split using the segment
Sb. Choosing to select one of the two split directions may have a negative impact on the placement of
the next components. Assuming for example that the algorithm keeps selecting the horizontal split,
after the placement of module v1, the free rectangles left are (A,D) and (C,E). Assume now that a new
module, whose width is the same as (C,E) but with a height slightly bigger than that of (C,E) is chosen.
The algorithm will not be able to place the component, although enough free space is available on the
chip to accommodate the new module.

BITS Pilani, Pilani Campus


Finding MERs

Modelling FPGA Area Matrix

15
BITS Pilani, Pilani Campus
Keeping Non-Overlapping ERs
o KAMER always finds a rectangle to place a new component, if one exists.
o But position to place the component within the rectangle must be selected from a set of points
• whose number is the area of the rectangle in worst case
 In [Bazargan]:
o Bottom-left position
• Note: No communication is assumed between the rectangle to be placed and those
already placed.
 If communication exists:
o If a position in the middle of the selected rectangle is better adapted for optimizing the goal
seeked, then the component should be placed there, no mater if the number of empty rectangle
increases.
o An optimal algorithm should consider any single point where the placement is possible and
develop a strategy to choose the best position.
o Instead of managing the free space, the strategy consists of managing the occupied space on the
device and use the set of running components for computing the best position to place the
incoming component.

BITS Pilani, Pilani Campus


FPGA Routing Basics (NoC)

17
BITS Pilani, Pilani Campus
Routing Challenges

How to handle
communication between
increasing number of
components on chip?

BITS Pilani, Pilani Campus


Routing Challenges

BITS Pilani, Pilani Campus


Buffers or Routers

Wiring delay grows quadratically


Buffer Insertion can make it linear,
but at the heavy cost of buffers
Multi cycle hops

Replace the buffers with Routers?

BITS Pilani, Pilani Campus


Network on Chip

• NoC is a network based communication subsystem on an integrated


circuit most typically between modules in a system on chip
• NoCs contains switches and links which interconnect different PEs or
processing elements in general and manage the data movement between
them
• NoCs improve the scalability of systems on chip and the power efficiency
of complex SoCs compared to other communication subsystem designs

BITS Pilani, Pilani Campus


Routing Techniques

• inspired by data-communication networks


• to communicate across the chip in a way similar to that of
messages transmitted over the.
• can effectively overcome the long-wire disadvantages from bus
architectures
• the architecture is decoupled into different layers, such as
transaction and physical layers.

BITS Pilani, Pilani Campus


Campus
Network on Chip Terminologies
Processing Element (PE):
• Hardware modules that do the actual processing. Can be homogeneous or heterogeneous.
• Consist of a set of network clients (DSP, memory, peripheral controller, memory controllers,
custom logic) that communicate on a packet base (instead of using direct connection).

Switches: switches or routers are used for routing data coming from one PE to another PE based
on the destination information in the data

Links: Used to interconnect switches and PEs. Follow some particular protocol when they
transmit data between these switches.

BITS Pilani, Pilani Campus


Campus
Topologies

• Topology is the arrangement of the elements (links nodes, etc) of the


network topology is chosen based on the cost and performance
(flow control, routing and bisection bandwidth

BITS Pilani, Pilani Campus


Bisection Bandwidth

• One of the most important parameters in deciding the throughput of a NOC


• if the network is bisected into 2 partitions, the bisection bandwidth of a network
topology is the bandwidth available between the two partitions

BITS Pilani, Pilani Campus


Communication in NoC

• NoCs follow packet-based communication.


• The data packets will have a header portion which will have the
information regarding the destination PE
• The header may have additional information like the source PE,
size of the packet etc.
• The portion of a packet transmitted on one clock cycle is called
a flit (flow control digit)

flit packet

BITS Pilani, Pilani Campus


Circuit Switching vs Packet Switching

• In circuit switching based communication the communication link


between the transmitter and the receiver is first established before the
actual data transmission happens
• all our previous processor based systems are following the circuit
switching
• In packet switching data is transmitted in the form of packets -- here
the exact route taken by the packet from the source to the destination is
decided at runtime and in most of the cases it cannot be determined in
advance
• NOCs and most of other computer networks including the internet
follow packet switching

BITS Pilani, Pilani Campus


Router

• Routers consist of a set of components like buffers to temporarily store


packets, a controller that determines how to forward the packet, usually
according to the destination address.

• The performance and the efficiency of the NoC depend on the


underlying communication infrastructure, which in turn depends on the
performance (latency and throughput) of the on-chip routers.

• Design of efficient, high performance routers represents a critical issue for


the success of the NoC approach.

35
BITS Pilani, Pilani Campus
Campus
• Packets delivered by routers are partitioned in a flit-by-flit basis.
• Each flit of a packet arrives at a router and stored in a memory buffer
until it can traverse to the next hop of the route.
• The first flit in the buffer memory will be processed by the control
logic to determine whether it is allowed to be forwarded and which
output direction it should proceed to.
• The decision made by the control unit is based on the computation
result of routing, arbitration, and the downstream buffer space.
• After the control setup is done, the flit passes through the crossbar
switch to its desired output direction.
• Since all the decisions are settled by the control logic at the input side,
flits will never be stalled at the output ports.

BITS Pilani, Pilani Campus


Campus
Router

BITS Pilani, Pilani Campus


Campus
Contention

When 2 packets trying to use the same link at the same time is called as
contention

• Buffer One (widely used)


• Drop One
• Misroute (deflect)

BITS Pilani, Pilani Campus


Flow Control

BITS Pilani, Pilani Campus


Routing Algorithms

• To decide what path a message will take through


the network to reach its destination.
• To distribute traffic evenly among the paths
supplied by the network topology,
• Have a low energy overhead but the specific
route chosen affects hop count directly, and thus
substantially affects energy consumption.

33
BITS Pilani, Pilani Campus
Routing Algorithms

• Dimension Ordered Routing


• Oblivious Routing
• Adaptive Routing

34
BITS Pilani, Pilani Campus
Dimension Ordered Routing

• Always generate the same single routing path for given pair of source
and destination addresses, typically a shortest one.
• When source routing is used, the source node implements pure
routing function returning a unique path without considering any
information about the traffic.
• For distributed routing, routers make unique decisions in every
intermediate node.
• Deterministic distributed routing is used in most commercially
available parallel machines.
• It is simple, fast, and performs well under uniform traffic assumption.
• Used in mesh-based machines,
• It is known to be deadlock-free in meshes and is easy to implement in
HW.
• Nonuniform traffic requires some degree of adaptivity.
35
BITS Pilani, Pilani Campus
Oblivious Routing

• Messages traverse different paths from A to B,


without regard to network congestion.
• Do not take into consideration any other information
except the addresses, equally to deterministic
routing.
• The routing decision are oblivious to the status of
network traffic.
• Oblivious non-deterministic algorithms can distribute
uniformly the communication load in situations
where adaptive solutions are too expensive or slow.
36
BITS Pilani, Pilani Campus
Adaptive Algorithm
• A more sophisticated routing algorithm can be adaptive, in which the path
a message takes from A to B depends on network traffic situation.
• Use information about network traffic and/or channel status to avoid
congested or faulty regions of the network.
• Source-node adaptive routing is useful only when the traffic status does
not change too fast
• On the other hand, adaptive routing can easily be combined with
distributed routing.
• Local or global information can be leveraged to make adaptive routing
decisions

37
BITS Pilani, Pilani Campus
38
BITS Pilani, Pilani Campus
Minimal and Non-minimal Routing

• Minimal routing algorithms select only paths that require the


smallest number of hops between the source and the
destination.
– greedy algorithms: (also called minimal, direct, shortest-path, or
profitable.) Every routing decision brings the packet closer to the
destination. Deterministic and oblivious routing algorithms are
usually greedy.

• Non-minimal routing algorithms allow paths to be selected that


may increase the number of hops between the source and
destination.
– nongreedy algorithms: (also called nonminimal, indirect,
nonprofitable, or misrouting.) They can supply channels which send
packets away from its destination.

39
BITS Pilani, Pilani Campus
Deadlock Avoidance

• Most applications require the network to guarantee deadlock freedom.


• A deadlock occurs when a cycle exists among the paths of multiple messages..
• Deadlock freedom can be ensured in the routing algorithm by preventing cycles among the
routes generated by the algorithm, or in the flow control protocol by preventing router buffers
from being acquired and held in a cyclic manner.
• A network that uses a deadlock-prone routing algorithm requires a flow control protocol that
ensures deadlock freedom.

40
BITS Pilani, Pilani Campus
Flow Control

• Flow control governs the allocation of network


buffers and links.
• A good flow control protocol lowers the latency
experienced by messages at low loads
• In determining the rate at which packets access
buffers (or skip buffer access altogether) and
traverse links, flow control is instrumental in
determining network energy and power
consumption.
41
BITS Pilani, Pilani Campus
Messages, Packets, Flits and Phits
• When a message is injected into the network, it is
first segmented into packets, which are then divided
into fixed-length flits,.
• The packet will consist of a head flit that contains the
destination address, body flits and a tail flit that
indicates the end of a head, body and packet.
• Flits can be further broken down into phits, which
are physical units and correspond to the tail flits
physical channel width
42
BITS Pilani, Pilani Campus
43
BITS Pilani, Pilani Campus
Flow Control Techniques

– Circuit Switching
– Store-and-Forward
– Virtual Cut-Through
– Wormhole Routing

BITS Pilani, Pilani Campus


Campus
CircuitSwitching

• It is message based flow control technique.


• A communication path is created from the source to the destination before
transmitting any data.
• A routing probe containing the source and destination address traverses
network and reserving links to transmit the data
• Once the routing probe reaches the destination address, an
acknowledgment is sent back to the source address,
• The data are transferred at the full bandwidth of the hardware.
• The circuit remains operational until the end of data to be transmitted.
• The lock on the links may be released once all the data have reached the
destination by sending back another acknowledgment through the same
route to the source.

BITS Pilani, Pilani Campus


Campus
46
BITS Pilani, Pilani Campus
Store-and-Forward(SAF)

It is packed based flow control.


At each Node:

1. The packet is copied entirely into the network router before moving to the next node.
2. the routing information is examined to determine which output channel to direct the packet.
3. the packet is sent to the neighbor.
4. Flow control unit is the entire packet.

Latency:
Nr * tr
– Nr: number of routers through which the packet must travel
– tr: time to transfer the packet between the routers
A B
– Leads to high per-packet latency
– Requires buffering for entire packet in each node.

BITS Pilani, Pilani Campus


Campus
48
BITS Pilani, Pilani Campus
Virtual Cut-Through(VCT)
– As the routing information is carried in the header of a packet, the packet should not be
stored in the current node’s memory if an output buffer is available.
– The packet simply cuts through the router of the node to an available output channel. Start
forwarding as soon as the header is received and resources like buffers or channel etc are
allocated.
– This alleviates the need for an excessive amount of memory along the path of a message,
but enough memory has to be allocated if an output channel is not available

BITS Pilani, Pilani Campus


Campus
Worm hole routing

• The flit can depart the current node as soon as there is sufficient buffering for this flit.
• While wormhole flowcontrol uses buffers effectively, it makes inefficient use of link bandwidth.
• Throughput suffers because other packets queued behind the blocked packet are unable to use the idle physical
links.
• Wormhole flow control reduces packet latency by allowing a flit to leave the router as soon as a downstream
buffer is available (in the absence of contention, the latency is the same as virtual cut through).
• Additionally, wormhole flow control requires much smaller buffers than packet-based techniques.
• Due to the tight area and power constraints of on-chip networks, wormhole flow control is the predominant
technique adopted thus far.

BITS Pilani, Pilani Campus


Campus
Review Questions

51
BITS Pilani, Pilani Campus
Q1—scheduling algorithm
• For the edge weighted sequence
graph shown in fig, weights on
edges indicate execution delay of
nodes in terms of clock cycles.
Schedule operations as per
– ASAP
– ALAP (latency constraint of 6 cycles)
– Find mobility
– List schedule under resource
constraint of 2=a*, 2=a+/-, 1=a< with
ALAP schedule times as priority for
nodes.
– List schedule under latency
constraint of 6

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
ASAP AND ALAP

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
mobili
ty

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
List schedule-resource constraint

t = 0,
– U0,* = {a,b,c,d}, U0,+/- = {e}, U0,< = ∅
– T0,* = ∅, T0,+/- = ∅, T0,< = ∅
– For +/-, easy to select Y = {e}
– For *, we have a choice. ALAP times for a,b,c,d
t=1
are 0,0,1,3, respectively So most urgent are Y = {a,b}
– U1,* = {c,d}, U1,+/- = ∅, U1,< = {i}
– For <, there is nothing to schedule Y = ∅
– T1,* = {a,b}, T1,+/- = ∅, T1,< = ∅ – S(a) = 0, S(b) = 0, S(e) = 0
– For +/-, Y = ∅
– For *, Y = ∅ (all resources busy) t=3 t=4
– For <, Y = {i} -- U3, * = {d}, U3,+/- = ∅, U3,<= ∅ -- U4, * = {d,g}, U4,+/- = (j}, U4,<= ∅
– S(i) = 1 -- T3, * ={c,f}, T3,+/-= ∅, T3,<= ∅ -- T4, * ={∅}, T4,+/-= ∅, T3,<= ∅
t=2 For +/-, Y = ∅ For +/-, Y = {j}
– U2,* = {c,d,f}, U2,+/- = ∅, U2,< = ∅ For *, Y= ∅ (all resources busy) For *, Y= {d,g}
– T2,* = ∅, T2,+/- = ∅, T2,< = ∅ For <. Y= ∅ For <. Y= ∅
– For +/-, Y = ∅ S(0) = 3 S(d) = 4. S(g) =4, S(j)=4
– For *, ALAP times for c,d,f are 1,3,2
Similarly,
respectively.Y = {c,f}
T=5, S(0)
– For <, Y = ∅
T=6, S(k)
– S(c) = 2, S(f) = 2
T=7, sink node
BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Latency = 7
we take once cycle longer than ASAP (but can
use half the number of multipliers)

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
List schedule-latency constraint of 6 cycles
B
I
T
S
P
i
l
a Slack values from ALAP schedule
n
i
Assume one resource of each type to begin
, with.
P
i t=0
t=1
l – U0,* = {a,b,c,d}, U0,+/- = {e}, U0,< = ∅
a – U1,* = {c,d}, U1,+/- = ∅, U1,< = {i}
n – T0,* = ∅, T0,+/- = ∅, T0,< = ∅
– T1,* = {a,b}, T1,+/- = ∅, T1,< = ∅
i – sa = 0, sb = 0, sc = 1, sd = 3, se = 4
C – sc = 0, sd = 2, si = 3
a – For *, k1 = {a,b};
m – For *, k1 = {c};
– a* = 2;
p – a* = 3;
u – For +/-, k2 = {e};
s – For +/-, k2 = ∅; for <, k3 = {i}
For <, k3= ∅
– S(c) = 1, S(i) = 1
– S(a) = 0, S(b) = 0, S(e) = 0
t=2
Contd…………
– U2,* = {f,d}, U2,+/- = ∅, U2,< = ∅
..
– T2,* = {c}, T2,+/- = ∅, T2,< = ∅
– sf = 0, sd = 1
– For *, k1 = {f} schedule has the same
– all resource constraints unchanged latency as ASAP, but
requires 3 rather than 4
– For +/-, k2 = ∅; for <, k3 = ∅ multipliers and 2 ALUs and
– S(f) = 2, one comparator
Q2

• Build a BDD for f = abc + ab’c + a’bc’ + a’b’c’


– Use the variable ordering a ≤ b ≤ c
• Find ROBDD for the function
– f(a,b,c,d) = (a+c+d).(ab+bc) .(bc+ac)
With variable orders as O1=[a,b,c,d], O2=[b,d,c,a]

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
sol
Compute cofactors of f with respect to a
(first variable in ordering)
f = abc + ab’c + a’bc’ + a’b’c’
fa’ = bc’ + b’c’
fa = bc + b’c

Compute cofactors of fa and fa’ with


respect to b (second variable in
ordering)
fa = bc + b’c
(fa)b = fab = c
(fa)b’ = fab’ = c
fa’ = bc’ + b’c’
(fa’)b = fa’b = c’
(fa’)b’ = fa’b’ = c’

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
f(a,b,c,d) = (a+c+d).(ab+bc) .(bc+ac)

For O1, the ROBDD can be gendered as follows

For O2, the ROBDD can be gendered as follows

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Variable ordering matters
f(x1,x2,x3,x4,x5,x6) = (x1+x2).(x3+x4).(x5+x6)

Order: [x1,x2, x3, x4, x5, x6] Order: [x1,x3,x5,x2,x4,x6]

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Q3

• Your task is to implement a 32K x 32 memory.


Assume you are given 32Kbit configurable
memory blocks such as a block RAM in a FPGA.
Each block can be configured as a 32K x 1bit
memory or a 1K x 32bit memory. Which option
would you choose as a building block for your 32K
x 32bit memory and why?

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Sol 3

Using the 32Kx1


memories, the larger
memory can be
constructed by simply
using the one bit from
each block as one of the
32 bits of data.
On writes, the address
goes to all blocks, write
enable is asserted for all,
and one of the 32 bits
goes to each block.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
• Using the 1Kx32 memories, each
data word is stored in one of the 32
blocks. 32:1 decoder 32x1 MUX
• For reads, part of the read address
lines (0 to 9) chooses the one word
out of 1k words inside the memory
and the rest of the read address
lines (10 to 14) chooses between
the data from the blocks.
• For writes the data-In goes to a
block whose write-enable is
asserted, determined by a decoder.
• So, the 1Kx32 implementation
requires an extra 32:1 decoder and
a 32-bit wide 32:1 mux; this is a lot
of extra hardware and performance
is also worse because of the extra
delay through the decoder on
writes and mux on reads.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Q5

• Implement the two combinational functions,


• F = X’Y + XYZ’ and
• G = W’X’Y + W’XYZ’.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Sol
5

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Chortle -crf

Apply Chortle-crf LUT-mapping algorithm for the expression below with K=4. Use two-level
decomposition followed by multi-level decomposition using bin packing algorithm to map the
design on the available LUTs.

Y = abc + db + ace + dfgh

BITS Pilani, Pilani Campus


Sol

BITS Pilani, Pilani Campus

You might also like