0% found this document useful (0 votes)

25 views68 pages

RCC 15

The document discusses online placement and routing in reconfigurable computing systems, focusing on the KAMER algorithm for managing empty rectangles during task placement. It also covers the challenges of routing in Network on Chip (NoC) architectures, emphasizing the importance of efficient routers and various routing algorithms. The content highlights the need for effective communication between processing elements and the management of data flow within integrated circuits.

Uploaded by

adithyanmusic

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

25 views68 pages

RCC 15

Uploaded by

adithyanmusic

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 68

Reconfigurable Computing

AELZG 554 / ES ZG554 / MELZG554

Session 15
Pawan Sharma
BITSPilani ps@pilani.bits-pilani.ac.in
Pilani Campus 15/05/2025
Last Lecture

• Scheduling
• ASAP
• ALAP
• List scheduling
• Force Directed Scheduling

2
BITS Pilani, Pilani Campus
Today’s Lecture

• Online Placement
• Routing

3
BITS Pilani, Pilani BITS
Campus
Online Placement

• Given a reconfigurable processing unit H at time t

with a given configuration Ct.
• The task of online placement is to find an optimal
position for a new incoming task v such that v
does not overlap with any running component in
the configuration Ct.

4
BITS Pilani, Pilani Campus
Partial Reconfigurable System
System Model: Operating system incl.
• Scheduler: Assigns resources to incoming tasks.
• Module Database: Contains the physical implementation
of the tasks.
• Placer: Assigns space on the reconfigurable device to run
tasks sent by the scheduler.

Interaction between scheduler and placer is open.

Possible scenario:
• The scheduler requests the placement of a new task.
• The placer tries to find a place on the device for the new
task.
• Upon success, it acknowledges the scheduler and
downloads the corresponding module to the device. Fine!
• Upon failure, it acknowledges the scheduler. The scheduler
can now decide to reject the request or to try it later again.
• In all cases, we have to decide on-line, if a new component
can be placed on a device where other components are
already running

BITS Pilani, Pilani Campus

• The online placement methods are 2D extensions of the FF, BF, and BL algorithms.
• The generic algorithm consists of two parts: an empty space partitioning manager for
insertion and deletion and a search engine and bin-packing rule.
• The partitioning part divides the empty region on the chip into not necessarily disjoint
rectangles called “empty rectangles.”
• The second part is responsible for selecting an empty rectangle to accommodate a
module whose insertion is requested.
• All empty rectangles that can accommodate the module are candidates for the location
of the module on the chip. The bin-packing rule is used to favor one over the others.
• Finally, the module will be placed at the bottom-left corner of the selected empty
rectangle. For example, the criteria could be to choose the empty rectangle with
minimum area (Best Fit) or to pick the one with the lowest bottom side.

BITS Pilani, Pilani Campus

• On the basis of empty rectangle concept, a strategy to solve the
online temporal placement problem was based on a method which
has the name keeping all maximum empty rectangles (KAMER),
permanently keeps track of all MERs.
• Whenever a request for placing a component v arrives, the list of
MERs is searched for a rectangle that can accommodate v.
• Because it is possible to have many MERs in which v can fit, strategies
such as the first-fit or best-fit are used to select one rectangle from
the set of possible free rectangles.
• Once a rectangle is chosen, the candidate points that can be chosen
as reference point for placing the new component are those that do
not allow an overlap with the external part of the rectangle.
• Heuristically, we use the bottom left point as the reference one to
place the component.

7
BITS Pilani, Pilani Campus
Kamer Algorithm

• The decision in KAMER algorithm is taken by choosing the

one with the leftmost left edge based on bottom-left
Heuristic.
• Now an RFUOP r, can be either accepted or rejected based
on the availability of RFU in the real estate, If an RFUOP is
rejected, the same function should be performed by the
host CPU; hence, a running time penalty is incurred.
• In this model, we assume there is no communication
between RFUOPs,
• The data to be processed by an RFUOP are loaded on the
RFU before the RFUOP starts execution.
• After it is done, the result is read into CPU registers .

BITS Pilani, Pilani Campus

KAMER Model of an RCS
On-line temporal placement

o RFUOPs = {r1, …, rn| ri = (wi, hi, si, ei)} ; si < ei

• ei - si : time span the operation is resident in the system
o Placement of RFUOPs:
• Modeled as 3D template placement
o Empty Rectangle (ER):
• A rectangle that does not overlap a placed module on the chip
o Maximum Empty Rectangle (MER):
• An ER not included in any other rectangle than the device bounding box

BITS Pilani, Pilani Campus

Example
 Non-ER:
o (E,F) -- because it overlaps module v2.
 ER:
o (E,D): MER
o (A,D): MER
o (E,C): not MER
 KAMER method:
oKeeping All Maximum Empty Rectangles:
• Permanently keeps track of all MERs
• Whenever a request for placing a component v arrives, the list
of MERs is searched for a rectangle that can accommodate v.
oPossible to have many MERs in which v can fit
Strategies: first-fit or best-fit
o Bottom left [Bazargan]
o The great advantage of the KAMER approach is that all the free
space on the device is captured, thus providing a good quality. This
happens at the cost of efficiency in the computation time

BITS Pilani, Pilani Campus

KAMER
Once a rectangle is chosen:
• Candidate points are those that do not allow an overlap with the external part of the rectangle.
 Problem:
• o Number of empty rectangles does not grow linear with the number of components included. When
implementing a placement method of this type, we are obviously not looking for the fastest, but rather the
best quality placement algorithm. ((D,I),
(C,H),
((D,I),
(N,P),
(C,H),
(A,U),
(A,P),
(N,O),
(A,O),
(A,S),
(F,O),
(F,S),
(E,N),
(E,R),
(Q,G),
(E,K),
(Q,H),
(L,H),
(E,K),
(L,J),
(M,I),
(M,S),
(L,J))
The configuration contains 11 modules The number of modules grows to 14 (V,I),
before the insertion of a new task after the insertion of a new task v4 (T,I)),

BITS Pilani, Pilani Campus

KAMER
o Run-time of free rectangle placement: O(n2)

In order to avoid the quadratic space requirement and increased running time, we can keep only linear number of
empty rectangles in terms of the number of modules. The strategy consist of keeping only the non-overlapping
empty rectangles, thus reducing the number of empty rectangles to be managed. This allows a linear run-time at
the cost of the quality. The problem with this approach is that several non-overlapping rectangles representation
of a configuration may exists leading to multiple possible choices in the representation .

BITS Pilani, Pilani Campus

Keeping Non-Overlapping ERs
 Problem 2:
o Non-overlapping empty rectangles are not necessarily maximal
• A module may exists that could fit onto the device, but cannot be
placed

BITS Pilani, Pilani Campus

Motivation for keeping maximal empty rectangles

• whenever a new component v1 is placed in a nonoverlapping rectangle, two possibilities exist to split
that rectangle. A horizontal split that is done using segment Sa and a vertical split using the segment
Sb. Choosing to select one of the two split directions may have a negative impact on the placement of
the next components. Assuming for example that the algorithm keeps selecting the horizontal split,
after the placement of module v1, the free rectangles left are (A,D) and (C,E). Assume now that a new
module, whose width is the same as (C,E) but with a height slightly bigger than that of (C,E) is chosen.
The algorithm will not be able to place the component, although enough free space is available on the
chip to accommodate the new module.

BITS Pilani, Pilani Campus

Finding MERs

Modelling FPGA Area Matrix

15
BITS Pilani, Pilani Campus
Keeping Non-Overlapping ERs
o KAMER always finds a rectangle to place a new component, if one exists.
o But position to place the component within the rectangle must be selected from a set of points
• whose number is the area of the rectangle in worst case
 In [Bazargan]:
o Bottom-left position
• Note: No communication is assumed between the rectangle to be placed and those
already placed.
 If communication exists:
o If a position in the middle of the selected rectangle is better adapted for optimizing the goal
seeked, then the component should be placed there, no mater if the number of empty rectangle
increases.
o An optimal algorithm should consider any single point where the placement is possible and
develop a strategy to choose the best position.
o Instead of managing the free space, the strategy consists of managing the occupied space on the
device and use the set of running components for computing the best position to place the
incoming component.

BITS Pilani, Pilani Campus

FPGA Routing Basics (NoC)

17
BITS Pilani, Pilani Campus
Routing Challenges

How to handle
communication between
increasing number of
components on chip?

BITS Pilani, Pilani Campus

Routing Challenges

BITS Pilani, Pilani Campus

Buffers or Routers

Wiring delay grows quadratically

Buffer Insertion can make it linear,
but at the heavy cost of buffers
Multi cycle hops

Replace the buffers with Routers?

BITS Pilani, Pilani Campus

Network on Chip

• NoC is a network based communication subsystem on an integrated

circuit most typically between modules in a system on chip
• NoCs contains switches and links which interconnect different PEs or
processing elements in general and manage the data movement between
them
• NoCs improve the scalability of systems on chip and the power efficiency
of complex SoCs compared to other communication subsystem designs

BITS Pilani, Pilani Campus

Routing Techniques

• inspired by data-communication networks

• to communicate across the chip in a way similar to that of
messages transmitted over the.
• can effectively overcome the long-wire disadvantages from bus
architectures
• the architecture is decoupled into different layers, such as
transaction and physical layers.

BITS Pilani, Pilani Campus

Campus
Network on Chip Terminologies
Processing Element (PE):
• Hardware modules that do the actual processing. Can be homogeneous or heterogeneous.
• Consist of a set of network clients (DSP, memory, peripheral controller, memory controllers,
custom logic) that communicate on a packet base (instead of using direct connection).

Switches: switches or routers are used for routing data coming from one PE to another PE based
on the destination information in the data

Links: Used to interconnect switches and PEs. Follow some particular protocol when they
transmit data between these switches.

BITS Pilani, Pilani Campus

Campus
Topologies

• Topology is the arrangement of the elements (links nodes, etc) of the

network topology is chosen based on the cost and performance
(flow control, routing and bisection bandwidth

BITS Pilani, Pilani Campus

Bisection Bandwidth

• One of the most important parameters in deciding the throughput of a NOC

• if the network is bisected into 2 partitions, the bisection bandwidth of a network
topology is the bandwidth available between the two partitions

BITS Pilani, Pilani Campus

Communication in NoC

• NoCs follow packet-based communication.

• The data packets will have a header portion which will have the
information regarding the destination PE
• The header may have additional information like the source PE,
size of the packet etc.
• The portion of a packet transmitted on one clock cycle is called
a flit (flow control digit)

flit packet

BITS Pilani, Pilani Campus

Circuit Switching vs Packet Switching

• In circuit switching based communication the communication link

between the transmitter and the receiver is first established before the
actual data transmission happens
• all our previous processor based systems are following the circuit
switching
• In packet switching data is transmitted in the form of packets -- here
the exact route taken by the packet from the source to the destination is
decided at runtime and in most of the cases it cannot be determined in
advance
• NOCs and most of other computer networks including the internet
follow packet switching

BITS Pilani, Pilani Campus

Router

• Routers consist of a set of components like buffers to temporarily store

packets, a controller that determines how to forward the packet, usually
according to the destination address.

• The performance and the efficiency of the NoC depend on the

underlying communication infrastructure, which in turn depends on the
performance (latency and throughput) of the on-chip routers.

• Design of efficient, high performance routers represents a critical issue for

the success of the NoC approach.

35
BITS Pilani, Pilani Campus
Campus
• Packets delivered by routers are partitioned in a flit-by-flit basis.
• Each flit of a packet arrives at a router and stored in a memory buffer
until it can traverse to the next hop of the route.
• The first flit in the buffer memory will be processed by the control
logic to determine whether it is allowed to be forwarded and which
output direction it should proceed to.
• The decision made by the control unit is based on the computation
result of routing, arbitration, and the downstream buffer space.
• After the control setup is done, the flit passes through the crossbar
switch to its desired output direction.
• Since all the decisions are settled by the control logic at the input side,
flits will never be stalled at the output ports.

BITS Pilani, Pilani Campus

Campus
Router

BITS Pilani, Pilani Campus

Campus
Contention

When 2 packets trying to use the same link at the same time is called as
contention

• Buffer One (widely used)

• Drop One
• Misroute (deflect)

BITS Pilani, Pilani Campus

Flow Control

BITS Pilani, Pilani Campus

Routing Algorithms

• To decide what path a message will take through

the network to reach its destination.
• To distribute traffic evenly among the paths
supplied by the network topology,
• Have a low energy overhead but the specific
route chosen affects hop count directly, and thus
substantially affects energy consumption.

33
BITS Pilani, Pilani Campus
Routing Algorithms

• Dimension Ordered Routing

• Oblivious Routing
• Adaptive Routing

34
BITS Pilani, Pilani Campus
Dimension Ordered Routing

• Always generate the same single routing path for given pair of source
and destination addresses, typically a shortest one.
• When source routing is used, the source node implements pure
routing function returning a unique path without considering any
information about the traffic.
• For distributed routing, routers make unique decisions in every
intermediate node.
• Deterministic distributed routing is used in most commercially
available parallel machines.
• It is simple, fast, and performs well under uniform traffic assumption.
• Used in mesh-based machines,
• It is known to be deadlock-free in meshes and is easy to implement in
HW.
• Nonuniform traffic requires some degree of adaptivity.
35
BITS Pilani, Pilani Campus
Oblivious Routing

• Messages traverse different paths from A to B,

without regard to network congestion.
• Do not take into consideration any other information
except the addresses, equally to deterministic
routing.
• The routing decision are oblivious to the status of
network traffic.
• Oblivious non-deterministic algorithms can distribute
uniformly the communication load in situations
where adaptive solutions are too expensive or slow.
36
BITS Pilani, Pilani Campus
Adaptive Algorithm
• A more sophisticated routing algorithm can be adaptive, in which the path
a message takes from A to B depends on network traffic situation.
• Use information about network traffic and/or channel status to avoid
congested or faulty regions of the network.
• Source-node adaptive routing is useful only when the traffic status does
not change too fast
• On the other hand, adaptive routing can easily be combined with
distributed routing.
• Local or global information can be leveraged to make adaptive routing
decisions

37
BITS Pilani, Pilani Campus
38
BITS Pilani, Pilani Campus
Minimal and Non-minimal Routing

• Minimal routing algorithms select only paths that require the

smallest number of hops between the source and the
destination.
– greedy algorithms: (also called minimal, direct, shortest-path, or
profitable.) Every routing decision brings the packet closer to the
destination. Deterministic and oblivious routing algorithms are
usually greedy.

• Non-minimal routing algorithms allow paths to be selected that

may increase the number of hops between the source and
destination.
– nongreedy algorithms: (also called nonminimal, indirect,
nonprofitable, or misrouting.) They can supply channels which send
packets away from its destination.

39
BITS Pilani, Pilani Campus
Deadlock Avoidance

• Most applications require the network to guarantee deadlock freedom.

• A deadlock occurs when a cycle exists among the paths of multiple messages..
• Deadlock freedom can be ensured in the routing algorithm by preventing cycles among the
routes generated by the algorithm, or in the flow control protocol by preventing router buffers
from being acquired and held in a cyclic manner.
• A network that uses a deadlock-prone routing algorithm requires a flow control protocol that
ensures deadlock freedom.

40
BITS Pilani, Pilani Campus
Flow Control

• Flow control governs the allocation of network

buffers and links.
• A good flow control protocol lowers the latency
experienced by messages at low loads
• In determining the rate at which packets access
buffers (or skip buffer access altogether) and
traverse links, flow control is instrumental in
determining network energy and power
consumption.
41
BITS Pilani, Pilani Campus
Messages, Packets, Flits and Phits
• When a message is injected into the network, it is
first segmented into packets, which are then divided
into fixed-length flits,.
• The packet will consist of a head flit that contains the
destination address, body flits and a tail flit that
indicates the end of a head, body and packet.
• Flits can be further broken down into phits, which
are physical units and correspond to the tail flits
physical channel width
42
BITS Pilani, Pilani Campus
43
BITS Pilani, Pilani Campus
Flow Control Techniques

– Circuit Switching
– Store-and-Forward
– Virtual Cut-Through
– Wormhole Routing

BITS Pilani, Pilani Campus

Campus
CircuitSwitching

• It is message based flow control technique.

• A communication path is created from the source to the destination before
transmitting any data.
• A routing probe containing the source and destination address traverses
network and reserving links to transmit the data
• Once the routing probe reaches the destination address, an
acknowledgment is sent back to the source address,
• The data are transferred at the full bandwidth of the hardware.
• The circuit remains operational until the end of data to be transmitted.
• The lock on the links may be released once all the data have reached the
destination by sending back another acknowledgment through the same
route to the source.

BITS Pilani, Pilani Campus

Campus
46
BITS Pilani, Pilani Campus
Store-and-Forward(SAF)

It is packed based flow control.

At each Node:

1. The packet is copied entirely into the network router before moving to the next node.
2. the routing information is examined to determine which output channel to direct the packet.
3. the packet is sent to the neighbor.
4. Flow control unit is the entire packet.

Latency:
Nr * tr
– Nr: number of routers through which the packet must travel
– tr: time to transfer the packet between the routers
A B
– Leads to high per-packet latency
– Requires buffering for entire packet in each node.

BITS Pilani, Pilani Campus

Campus
48
BITS Pilani, Pilani Campus
Virtual Cut-Through(VCT)
– As the routing information is carried in the header of a packet, the packet should not be
stored in the current node’s memory if an output buffer is available.
– The packet simply cuts through the router of the node to an available output channel. Start
forwarding as soon as the header is received and resources like buffers or channel etc are
allocated.
– This alleviates the need for an excessive amount of memory along the path of a message,
but enough memory has to be allocated if an output channel is not available

BITS Pilani, Pilani Campus

Campus
Worm hole routing

• The flit can depart the current node as soon as there is sufficient buffering for this flit.
• While wormhole flowcontrol uses buffers effectively, it makes inefficient use of link bandwidth.
• Throughput suffers because other packets queued behind the blocked packet are unable to use the idle physical
links.
• Wormhole flow control reduces packet latency by allowing a flit to leave the router as soon as a downstream
buffer is available (in the absence of contention, the latency is the same as virtual cut through).
• Additionally, wormhole flow control requires much smaller buffers than packet-based techniques.
• Due to the tight area and power constraints of on-chip networks, wormhole flow control is the predominant
technique adopted thus far.

BITS Pilani, Pilani Campus

Campus
Review Questions

51
BITS Pilani, Pilani Campus
Q1—scheduling algorithm
• For the edge weighted sequence
graph shown in fig, weights on
edges indicate execution delay of
nodes in terms of clock cycles.
Schedule operations as per
– ASAP
– ALAP (latency constraint of 6 cycles)
– Find mobility
– List schedule under resource
constraint of 2=a*, 2=a+/-, 1=a< with
ALAP schedule times as priority for
nodes.
– List schedule under latency
constraint of 6

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
ASAP AND ALAP

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
mobili
ty

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
List schedule-resource constraint

t = 0,
– U0,* = {a,b,c,d}, U0,+/- = {e}, U0,< = ∅
– T0,* = ∅, T0,+/- = ∅, T0,< = ∅
– For +/-, easy to select Y = {e}
– For *, we have a choice. ALAP times for a,b,c,d
t=1
are 0,0,1,3, respectively So most urgent are Y = {a,b}
– U1,* = {c,d}, U1,+/- = ∅, U1,< = {i}
– For <, there is nothing to schedule Y = ∅
– T1,* = {a,b}, T1,+/- = ∅, T1,< = ∅ – S(a) = 0, S(b) = 0, S(e) = 0
– For +/-, Y = ∅
– For *, Y = ∅ (all resources busy) t=3 t=4
– For <, Y = {i} -- U3, * = {d}, U3,+/- = ∅, U3,<= ∅ -- U4, * = {d,g}, U4,+/- = (j}, U4,<= ∅
– S(i) = 1 -- T3, * ={c,f}, T3,+/-= ∅, T3,<= ∅ -- T4, * ={∅}, T4,+/-= ∅, T3,<= ∅
t=2 For +/-, Y = ∅ For +/-, Y = {j}
– U2,* = {c,d,f}, U2,+/- = ∅, U2,< = ∅ For *, Y= ∅ (all resources busy) For *, Y= {d,g}
– T2,* = ∅, T2,+/- = ∅, T2,< = ∅ For <. Y= ∅ For <. Y= ∅
– For +/-, Y = ∅ S(0) = 3 S(d) = 4. S(g) =4, S(j)=4
– For *, ALAP times for c,d,f are 1,3,2
Similarly,
respectively.Y = {c,f}
T=5, S(0)
– For <, Y = ∅
T=6, S(k)
– S(c) = 2, S(f) = 2
T=7, sink node
BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Latency = 7
we take once cycle longer than ASAP (but can
use half the number of multipliers)

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
List schedule-latency constraint of 6 cycles
B
I
T
S
P
i
l
a Slack values from ALAP schedule
n
i
Assume one resource of each type to begin
, with.
P
i t=0
t=1
l – U0,* = {a,b,c,d}, U0,+/- = {e}, U0,< = ∅
a – U1,* = {c,d}, U1,+/- = ∅, U1,< = {i}
n – T0,* = ∅, T0,+/- = ∅, T0,< = ∅
– T1,* = {a,b}, T1,+/- = ∅, T1,< = ∅
i – sa = 0, sb = 0, sc = 1, sd = 3, se = 4
C – sc = 0, sd = 2, si = 3
a – For *, k1 = {a,b};
m – For *, k1 = {c};
– a* = 2;
p – a* = 3;
u – For +/-, k2 = {e};
s – For +/-, k2 = ∅; for <, k3 = {i}
For <, k3= ∅
– S(c) = 1, S(i) = 1
– S(a) = 0, S(b) = 0, S(e) = 0
t=2
Contd…………
– U2,* = {f,d}, U2,+/- = ∅, U2,< = ∅
..
– T2,* = {c}, T2,+/- = ∅, T2,< = ∅
– sf = 0, sd = 1
– For *, k1 = {f} schedule has the same
– all resource constraints unchanged latency as ASAP, but
requires 3 rather than 4
– For +/-, k2 = ∅; for <, k3 = ∅ multipliers and 2 ALUs and
– S(f) = 2, one comparator
Q2

• Build a BDD for f = abc + ab’c + a’bc’ + a’b’c’

– Use the variable ordering a ≤ b ≤ c
• Find ROBDD for the function
– f(a,b,c,d) = (a+c+d).(ab+bc) .(bc+ac)
With variable orders as O1=[a,b,c,d], O2=[b,d,c,a]

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
sol
Compute cofactors of f with respect to a
(first variable in ordering)
f = abc + ab’c + a’bc’ + a’b’c’
fa’ = bc’ + b’c’
fa = bc + b’c

Compute cofactors of fa and fa’ with

respect to b (second variable in
ordering)
fa = bc + b’c
(fa)b = fab = c
(fa)b’ = fab’ = c
fa’ = bc’ + b’c’
(fa’)b = fa’b = c’
(fa’)b’ = fa’b’ = c’

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
f(a,b,c,d) = (a+c+d).(ab+bc) .(bc+ac)

For O1, the ROBDD can be gendered as follows

For O2, the ROBDD can be gendered as follows

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Variable ordering matters
f(x1,x2,x3,x4,x5,x6) = (x1+x2).(x3+x4).(x5+x6)

Order: [x1,x2, x3, x4, x5, x6] Order: [x1,x3,x5,x2,x4,x6]

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Q3

• Your task is to implement a 32K x 32 memory.

Assume you are given 32Kbit configurable
memory blocks such as a block RAM in a FPGA.
Each block can be configured as a 32K x 1bit
memory or a 1K x 32bit memory. Which option
would you choose as a building block for your 32K
x 32bit memory and why?

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Sol 3

Using the 32Kx1

memories, the larger
memory can be
constructed by simply
using the one bit from
each block as one of the
32 bits of data.
On writes, the address
goes to all blocks, write
enable is asserted for all,
and one of the 32 bits
goes to each block.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
• Using the 1Kx32 memories, each
data word is stored in one of the 32
blocks. 32:1 decoder 32x1 MUX
• For reads, part of the read address
lines (0 to 9) chooses the one word
out of 1k words inside the memory
and the rest of the read address
lines (10 to 14) chooses between
the data from the blocks.
• For writes the data-In goes to a
block whose write-enable is
asserted, determined by a decoder.
• So, the 1Kx32 implementation
requires an extra 32:1 decoder and
a 32-bit wide 32:1 mux; this is a lot
of extra hardware and performance
is also worse because of the extra
delay through the decoder on
writes and mux on reads.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Q5

• Implement the two combinational functions,

• F = X’Y + XYZ’ and
• G = W’X’Y + W’XYZ’.

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Sol
5

BITS Pilani,
Pilani, Pilani
Pilani Campus
Campus
Chortle -crf

Apply Chortle-crf LUT-mapping algorithm for the expression below with K=4. Use two-level
decomposition followed by multi-level decomposition using bin packing algorithm to map the
design on the available LUTs.

Y = abc + db + ace + dfgh

BITS Pilani, Pilani Campus

Sol

BITS Pilani, Pilani Campus

FPGA Temporal Placement Guide
No ratings yet
FPGA Temporal Placement Guide
35 pages
Module 5 - Placement
No ratings yet
Module 5 - Placement
36 pages
Birla Institute of Technology and Science - Pilani Pilani - K.K. Birla Goa - Hyderabad
No ratings yet
Birla Institute of Technology and Science - Pilani Pilani - K.K. Birla Goa - Hyderabad
4 pages
Placement and Partitioning
No ratings yet
Placement and Partitioning
12 pages
Lec 06
No ratings yet
Lec 06
78 pages
Reconfigurable Computing Exam Key
No ratings yet
Reconfigurable Computing Exam Key
7 pages
PD Slides03 Partition
No ratings yet
PD Slides03 Partition
88 pages
VLSI Min-Cut Placement Techniques
No ratings yet
VLSI Min-Cut Placement Techniques
49 pages
Partitioning Techniques Explained
No ratings yet
Partitioning Techniques Explained
143 pages
Design Automation: ENGR 3430 - Digital VLSI Mark L. Chang Spring '07
No ratings yet
Design Automation: ENGR 3430 - Digital VLSI Mark L. Chang Spring '07
52 pages
Vlsi 1
No ratings yet
Vlsi 1
60 pages
Nesting
No ratings yet
Nesting
20 pages
Sorting On A Mesh-Connected Parallel Computer
No ratings yet
Sorting On A Mesh-Connected Parallel Computer
30 pages
VLSI Design Automation Guide
No ratings yet
VLSI Design Automation Guide
40 pages
Lec - 29 Final
No ratings yet
Lec - 29 Final
31 pages
Ia Scheme t2
No ratings yet
Ia Scheme t2
11 pages
Unit 5
No ratings yet
Unit 5
50 pages
4 Placement
No ratings yet
4 Placement
39 pages
Placement and Partitioning
No ratings yet
Placement and Partitioning
24 pages
VLSI Physical Design: From Graph Partitioning To Timing Closure
No ratings yet
VLSI Physical Design: From Graph Partitioning To Timing Closure
88 pages
An Efficient Temporal Partitioning Algorithm To Minimize Communication Cost For Reconfigurable Computing Systems
No ratings yet
An Efficient Temporal Partitioning Algorithm To Minimize Communication Cost For Reconfigurable Computing Systems
11 pages
Reconfigurable Computing
No ratings yet
Reconfigurable Computing
9 pages
Congestion Driven Placement
No ratings yet
Congestion Driven Placement
25 pages
VLSI Module Placement Based On Rectangle-Packing by The Sequence-Pair
No ratings yet
VLSI Module Placement Based On Rectangle-Packing by The Sequence-Pair
7 pages
Chapter 7
No ratings yet
Chapter 7
37 pages
Chap 3
No ratings yet
Chap 3
88 pages
COL215 Software Assignment 1
No ratings yet
COL215 Software Assignment 1
6 pages
Column-Wise FPGA Design Impact Study
No ratings yet
Column-Wise FPGA Design Impact Study
4 pages
Physical Design Arnab
No ratings yet
Physical Design Arnab
49 pages
Ece425 L28
No ratings yet
Ece425 L28
20 pages
PnR-I (Floorplanning)
No ratings yet
PnR-I (Floorplanning)
44 pages
Topological Placement
No ratings yet
Topological Placement
4 pages
VLSI Physical Design - Model Questions
No ratings yet
VLSI Physical Design - Model Questions
5 pages
5 Dphysicaldesign
No ratings yet
5 Dphysicaldesign
13 pages
Vlsi Physical Design
100% (3)
Vlsi Physical Design
26 pages
VLSI Placement Techniques
No ratings yet
VLSI Placement Techniques
80 pages
System Partioning
No ratings yet
System Partioning
104 pages
Course:-Cad For Vlsi Partitioning Algorithm (K-L Algorithm and F-M Algorithm)
No ratings yet
Course:-Cad For Vlsi Partitioning Algorithm (K-L Algorithm and F-M Algorithm)
9 pages
CS 7 - Variations of Astar
No ratings yet
CS 7 - Variations of Astar
24 pages
Algorithm Types: Random Algorithms, and Backtracking
No ratings yet
Algorithm Types: Random Algorithms, and Backtracking
62 pages
Unification of Partitioning Placement and Floorplanning
No ratings yet
Unification of Partitioning Placement and Floorplanning
8 pages
L14 Placement and Routing
No ratings yet
L14 Placement and Routing
40 pages
Eda Ieee Eds26sep2013
No ratings yet
Eda Ieee Eds26sep2013
90 pages
Daa PPT 28th Jan
No ratings yet
Daa PPT 28th Jan
35 pages
FPGA Design & Reconfigurable Computing
No ratings yet
FPGA Design & Reconfigurable Computing
2 pages
Lecture3 Partition
No ratings yet
Lecture3 Partition
23 pages
Avda Co2
No ratings yet
Avda Co2
84 pages
Unified Method for VLSI Floorplan Constraints
No ratings yet
Unified Method for VLSI Floorplan Constraints
11 pages
Lec7 - Partial Reconfiguration
No ratings yet
Lec7 - Partial Reconfiguration
37 pages
Lecture9 Placement1
100% (1)
Lecture9 Placement1
68 pages
ECE260B - CSE241A Winter 2017 Floorplanning and Partitioning
No ratings yet
ECE260B - CSE241A Winter 2017 Floorplanning and Partitioning
53 pages
Syllabus VLMR3
No ratings yet
Syllabus VLMR3
2 pages
Lec 5
No ratings yet
Lec 5
14 pages
Outline: - What Is Physical Design - Design Methods - Design Styles - Analysis and Verification
No ratings yet
Outline: - What Is Physical Design - Design Methods - Design Styles - Analysis and Verification
43 pages
Metode Rank Order
No ratings yet
Metode Rank Order
3 pages
Lauren Valentine Resume
No ratings yet
Lauren Valentine Resume
2 pages
Tcs Aspire
No ratings yet
Tcs Aspire
35 pages
Ethical Research Guidelines
67% (3)
Ethical Research Guidelines
2 pages
LLM Scholarship Application Letter
No ratings yet
LLM Scholarship Application Letter
1 page
Artikel CSR MSDM
No ratings yet
Artikel CSR MSDM
18 pages
Sesion 8 CL (1) Ingles
No ratings yet
Sesion 8 CL (1) Ingles
5 pages
Math Problem Set with Solutions
No ratings yet
Math Problem Set with Solutions
16 pages
Noname 365
No ratings yet
Noname 365
4 pages
Pps Unit 2 Notes
No ratings yet
Pps Unit 2 Notes
10 pages
The Inner Alchemist
0% (1)
The Inner Alchemist
11 pages
Quantitative Aptitude Plain
No ratings yet
Quantitative Aptitude Plain
532 pages
Table of Specifications (Tos) Midterm Examination: Earth and Life Science in GRADE 11
No ratings yet
Table of Specifications (Tos) Midterm Examination: Earth and Life Science in GRADE 11
4 pages
E-Commerce Product Rating Based On Customer Review Mining-IJAERDV05I0128071 PDF
No ratings yet
E-Commerce Product Rating Based On Customer Review Mining-IJAERDV05I0128071 PDF
5 pages
ICSE Class 10 Computer Applications Sample Paper
No ratings yet
ICSE Class 10 Computer Applications Sample Paper
2 pages
Review Science Unit 7 Unit 8 and Unit 9
No ratings yet
Review Science Unit 7 Unit 8 and Unit 9
16 pages
Submitted By: Crystal Angela A. Naval Submitted To: Prof. Jerson Jalani
No ratings yet
Submitted By: Crystal Angela A. Naval Submitted To: Prof. Jerson Jalani
15 pages
Dr. Jekyll and Mr. Hyde
0% (1)
Dr. Jekyll and Mr. Hyde
39 pages
Annex 1 - Task 2 Compose and Conquer - Submission Template
No ratings yet
Annex 1 - Task 2 Compose and Conquer - Submission Template
16 pages
Memory's Daughter
No ratings yet
Memory's Daughter
9 pages
Case in Point - 1 Introduction
No ratings yet
Case in Point - 1 Introduction
3 pages
Kenstowicz
No ratings yet
Kenstowicz
28 pages
Document 1658228538
No ratings yet
Document 1658228538
24 pages
In Northern Hemisphere A Star Is On The Prime Vertical 4 HR 10m After Passing The Observer
No ratings yet
In Northern Hemisphere A Star Is On The Prime Vertical 4 HR 10m After Passing The Observer
1 page
MS Thesis Report
No ratings yet
MS Thesis Report
11 pages
LE - Math 1 - Q3 - Week 4 - v.2
No ratings yet
LE - Math 1 - Q3 - Week 4 - v.2
12 pages
ESL With Alice Assessment Package
No ratings yet
ESL With Alice Assessment Package
10 pages
Who Kidnapped Excellence
No ratings yet
Who Kidnapped Excellence
8 pages
Squares and Square Roots Chapter Class Viii
100% (1)
Squares and Square Roots Chapter Class Viii
24 pages
Correlation Notes
No ratings yet
Correlation Notes
3 pages

RCC 15

Uploaded by

RCC 15

Uploaded by

Reconfigurable Computing

AELZG 554 / ES ZG554 / MELZG554

• Given a reconfigurable processing unit H at time t

Interaction between scheduler and placer is open.

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• The decision in KAMER algorithm is taken by choosing the

BITS Pilani, Pilani Campus

o RFUOPs = {r1, …, rn| ri = (wi, hi, si, ei)} ; si < ei

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

Modelling FPGA Area Matrix

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

Wiring delay grows quadratically

Replace the buffers with Routers?

BITS Pilani, Pilani Campus

• NoC is a network based communication subsystem on an integrated

BITS Pilani, Pilani Campus

• inspired by data-communication networks

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• Topology is the arrangement of the elements (links nodes, etc) of the

BITS Pilani, Pilani Campus

• One of the most important parameters in deciding the throughput of a NOC

BITS Pilani, Pilani Campus

• NoCs follow packet-based communication.

BITS Pilani, Pilani Campus

• In circuit switching based communication the communication link

BITS Pilani, Pilani Campus

• Routers consist of a set of components like buffers to temporarily store

• The performance and the efficiency of the NoC depend on the

• Design of efficient, high performance routers represents a critical issue for

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• Buffer One (widely used)

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• To decide what path a message will take through

• Dimension Ordered Routing

• Messages traverse different paths from A to B,

• Minimal routing algorithms select only paths that require the

• Non-minimal routing algorithms allow paths to be selected that

• Most applications require the network to guarantee deadlock freedom.

• Flow control governs the allocation of network

BITS Pilani, Pilani Campus

• It is message based flow control technique.

BITS Pilani, Pilani Campus

It is packed based flow control.

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• Build a BDD for f = abc + ab’c + a’bc’ + a’b’c’

Compute cofactors of fa and fa’ with

For O1, the ROBDD can be gendered as follows

For O2, the ROBDD can be gendered as follows

Order: [x1,x2, x3, x4, x5, x6] Order: [x1,x3,x5,x2,x4,x6]

• Your task is to implement a 32K x 32 memory.

Using the 32Kx1

• Implement the two combinational functions,

Y = abc + db + ace + dfgh

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

You might also like