
Lecture HWA

Hardware accelerators are specialized processors designed to perform specific computations more efficiently than general-purpose processors, with examples including GPUs, FPGAs, and ASICs. They offload tasks from CPUs to optimize performance and reduce power consumption, particularly excelling in parallel processing. The choice between FPGAs and ASICs depends on application needs, with FPGAs offering flexibility and faster time-to-market, while ASICs provide higher performance and lower power consumption.

Uploaded by

Athmajan Vu

Hardware Accelerators

Zaheer Khan
• A hardware accelerator is a specialized processor designed to accelerate a specific type of computation.
• Accelerators are optimized for a particular task or set of tasks, which allows them to perform those tasks faster and more efficiently than general-purpose processors.
• Examples of hardware accelerators include:
  • graphics processing units (GPUs)
  • field-programmable gate arrays (FPGAs)
  • application-specific integrated circuits (ASICs)
How do Hardware Accelerators Work?
• Hardware accelerators work by offloading specific types of computations from the general-purpose processor
to a specialized processor.
• This allows the specialized processor to perform the computation more efficiently and with less power
consumption than a general-purpose processor.
• Example: GPUs are optimized for the parallel processing of large amounts of data, which makes them well-suited for graphics rendering and machine learning.

Flexibility vs Efficiency Trade-off


• The purpose-built architecture of GPUs allows certain calculations to be offloaded from the CPU.
• These calculations follow the “single instruction, multiple data” (SIMD) pattern: the same operation is applied to many data elements at once.
• GPUs are therefore well suited to simple operations on large inputs, whereas CPUs excel at complex operations on small input streams.
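As a rough illustration of the SIMD pattern, the Python sketch below applies one simple operation to every element of an input. The names `simd_apply` and `brighten` and the pixel values are assumptions made for this example. A real GPU executes the lanes in hardware, whereas this code only models the programming pattern.

```python
from typing import Callable, List


def simd_apply(op: Callable[[int], int], data: List[int]) -> List[int]:
    """Model of SIMD: the same instruction `op` is applied to every element.

    No element depends on another, so hardware could process them all
    side by side; here they are simply mapped one after the other.
    """
    return [op(x) for x in data]


def brighten(pixel: int) -> int:
    """One simple instruction: add 40 to an 8-bit pixel, saturating at 255."""
    return min(pixel + 40, 255)


# Hypothetical input: four 8-bit pixel values.
pixels = [0, 100, 200, 250]
brightened = simd_apply(brighten, pixels)
print(brightened)
```

The operation itself is trivial. The benefit on a GPU comes from applying it to a very large input in parallel.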

• Parallelism
  • Task level
  • Instruction level
  • Data level

• Differences with Vector Processors


❑ CPUs and GPUs have fundamentally different design philosophies
➢ CPUs: low latency, low throughput; high clock frequency, large caches, sophisticated control, powerful ALUs
➢ GPUs: high latency, high throughput; moderate clock frequency, small caches, simple control, many energy-efficient ALUs
➢ GPUs require a massive number of threads to tolerate latencies
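To see why many threads are needed to tolerate latency, here is a minimal sketch under a simplified model that the slides do not spell out. Each thread alternates a burst of ALU work with one memory access, and the ALU switches to another thread while a thread waits. The function names and cycle counts are hypothetical.

```python
import math


def threads_to_hide_latency(compute_cycles: int, memory_latency: int) -> int:
    """Threads needed so the ALU always has work while other threads wait.

    One thread is busy for `compute_cycles` out of every
    `compute_cycles + memory_latency` cycles, so that many periods
    must overlap to fill the ALU.
    """
    return math.ceil((compute_cycles + memory_latency) / compute_cycles)


def alu_utilization(threads: int, compute_cycles: int, memory_latency: int) -> float:
    """Fraction of cycles the ALU is busy under the same model (capped at 1.0)."""
    period = compute_cycles + memory_latency
    return min(1.0, threads * compute_cycles / period)


# Hypothetical numbers: 10 cycles of work per 400-cycle memory access.
needed = threads_to_hide_latency(10, 400)
print(needed, alu_utilization(1, 10, 400), alu_utilization(needed, 10, 400))
```

With a single thread the ALU sits idle for most of each period. Only with enough threads in flight does the long memory latency stop limiting throughput.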
• On an FPGA, one starts out with a large array of logic blocks, clock buffers, PLLs, on-chip RAMs, I/O buffers, (de)serializers, power distribution networks, and more.
• ASIC development starts further down in the weeds: these components must either be purchased, come as part of a library, or be individually developed for use within the ASIC design.

Attribute                   | FPGA                                     | ASIC
Design process              | Simple design process                    | Long and complex design process
Expenses                    | No non-recurring engineering (NRE) costs | Expensive: involves the cost of circuit design and mask design
Time to market              | Faster                                   | Longer
Speed                       | Slower than ASIC                         | Fast
Reusability and flexibility | Reusable and flexible                    | Not reusable and not flexible
Hardware wastage            | Unavoidable                              | None
Best suited                 | When the required volume is small        | When the required volume is large
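The "best suited" row follows from simple cost arithmetic: an ASIC carries a large one-time cost but a lower cost per unit. The sketch below estimates the break-even volume under a linear cost model. The model, the function names, and all dollar figures are assumptions chosen for illustration, not data from the lecture.

```python
import math


def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Linear cost model: one-time NRE plus a fixed cost per unit."""
    return nre + unit_cost * volume


def break_even_volume(nre_fpga: float, unit_fpga: float,
                      nre_asic: float, unit_asic: float) -> int:
    """Smallest volume at which the ASIC total is no higher than the FPGA total."""
    if unit_fpga <= unit_asic:
        raise ValueError("the ASIC only catches up if its unit cost is lower")
    return max(0, math.ceil((nre_asic - nre_fpga) / (unit_fpga - unit_asic)))


# Hypothetical figures: the FPGA has no NRE but costs 50 per unit,
# the ASIC has 1,000,000 of NRE but costs 5 per unit.
volume = break_even_volume(nre_fpga=0, unit_fpga=50, nre_asic=1_000_000, unit_asic=5)
print(volume)
```

Below the break-even volume the FPGA is cheaper overall, and above it the ASIC's lower unit cost pays back the NRE.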

• When choosing between an ASIC or FPGA, it is best to ask what the end use application
will be.
• If your application requires constant bug fixes, feature and design changes, and software
flexibility, then FPGAs may be the right solution.
• If your end application requires high performance, smaller device footprint, and
significantly lower power consumption, then ASICs are your best bet.
Accelerator Design
• What to accelerate?
  • Decide the operational specifications of the hardware accelerator
  • Profile software applications
  • Determine the critical path/bottleneck and the frequently used kernels or functions
• How to accelerate?
  • Architecture of the accelerator
  • Memory hierarchy and I/O interfaces
  • CPU-accelerator interfaces
  • Programming interfaces
• Acceleration goals/requirements/constraints?
  • Maximum latency
  • Minimum throughput
  • Maximum power consumption
  • Cost, time to market, etc.
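The "what to accelerate" step above can be sketched with Python's built-in profiler. The toy application and its two functions are hypothetical. The final estimate uses Amdahl's law, which the slides do not name; it is brought in here as an assumption, to bound the overall gain from speeding up only the hottest kernel.

```python
import cProfile
import pstats


def checksum(values):
    """Cheap step: a single pass over the data."""
    return sum(values) % 251


def pairwise_distance_sum(values):
    """Deliberately expensive kernel: O(n^2) nested loop."""
    total = 0
    for x in values:
        for y in values:
            total += (x - y) * (x - y)
    return total


def application(values):
    """Hypothetical application with one cheap and one expensive step."""
    return checksum(values), pairwise_distance_sum(values)


def hottest_function(profiler):
    """Return (name, share of profiled time) for the largest own-time entry."""
    # stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    stats = pstats.Stats(profiler).stats
    total = sum(entry[2] for entry in stats.values())
    (_, _, name), entry = max(stats.items(), key=lambda item: item[1][2])
    return name, entry[2] / total


def amdahl_speedup(fraction, kernel_speedup):
    """Overall speedup when only `fraction` of the run time is accelerated."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


profiler = cProfile.Profile()
profiler.runcall(application, list(range(300)))
name, fraction = hottest_function(profiler)
# Assume, for illustration, an accelerator that runs the kernel 10x faster.
print(name, round(fraction, 3), round(amdahl_speedup(fraction, 10.0), 2))
```

The measured fraction varies from run to run, so the printed numbers are indicative only. The aim is to identify the kernel that dominates run time before committing hardware to it.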
Accelerator Design
• A few examples of choices in hardware accelerator design
• Types of parallelism exploited
  • Fine-grained vs coarse-grained
  • Data parallel vs task parallel
• Optimized for high throughput vs low latency
  • E.g., optimizing the number of tasks completed per unit of time, or the execution time of a single task
• Memory organization
• External interfaces
• On-chip memory usage, data buffering schemes
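The throughput versus latency choice can be made concrete with a small cycle-count model. The sketch compares a hypothetical 8-stage pipelined unit, which accepts a new task every cycle, with a hypothetical non-pipelined unit that finishes one task in 4 cycles before starting the next. Both designs and their cycle counts are assumptions for illustration.

```python
def pipelined_cycles(tasks: int, stages: int) -> int:
    """Cycles for a pipeline that accepts one new task per cycle.

    The first task needs `stages` cycles to emerge; each later task
    completes one cycle after the previous one.
    """
    if tasks == 0:
        return 0
    return stages + tasks - 1


def unpipelined_cycles(tasks: int, cycles_per_task: int) -> int:
    """Cycles for a unit that must finish each task before starting the next."""
    return tasks * cycles_per_task


# Compare the two designs for a single task and for a batch of 100 tasks.
for tasks in (1, 100):
    print(tasks, pipelined_cycles(tasks, stages=8), unpipelined_cycles(tasks, cycles_per_task=4))
```

For a single task the non-pipelined unit finishes first (lower latency), while for a large batch the pipelined unit finishes far sooner (higher throughput).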
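Parallelism
Why are accelerators faster? They exploit the parallelism in kernels/applications.

Consider vector addition of two N-element arrays: c[i] = a[i] + b[i] for i = 0, ..., N-1.

• There are no data dependences between loop iterations.
• The data parallelism in this example is explicit.
• We could instantiate K parallel adders: the N additions then take about N/K cycles instead of N, an ideal speedup of K.
Can we really achieve a K-fold speedup?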
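A minimal sketch of the vector addition with K parallel adders, counting one cycle per group of up to K additions. The function name and the cycle model are assumptions. The model ignores the time needed to move data in and out, which is one reason the ideal figure may not be reached in practice.

```python
from typing import List, Tuple


def vector_add_lanes(a: List[int], b: List[int], k: int) -> Tuple[List[int], int]:
    """Add two vectors with k parallel adders; return (result, cycles used)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if k < 1:
        raise ValueError("need at least one adder")
    result: List[int] = []
    cycles = 0
    for start in range(0, len(a), k):
        # One cycle: up to k independent additions happen side by side.
        result.extend(x + y for x, y in zip(a[start:start + k], b[start:start + k]))
        cycles += 1
    return result, cycles


a = list(range(10))
b = [10 * x for x in range(10)]
_, serial_cycles = vector_add_lanes(a, b, k=1)
summed, parallel_cycles = vector_add_lanes(a, b, k=4)
print(summed, serial_cycles, parallel_cycles, serial_cycles / parallel_cycles)
```

With N = 10 and K = 4 the last group is only partly filled, so the ratio of cycle counts comes out slightly below 4 even before any memory limits are considered.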
Interface Choices
How do data move in and out of the accelerator?
What are the bandwidths needed for the interfaces?
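For the vector-addition accelerator, a back-of-the-envelope answer to the bandwidth question could look like the sketch below. It assumes each adder reads two operands and writes one result every cycle. The function names, the word size, the clock frequency, and the link bandwidth are all hypothetical.

```python
def required_bandwidth(lanes: int, word_bytes: int, clock_hz: int) -> int:
    """Bytes per second needed to keep `lanes` adders busy every cycle.

    Each adder moves three words per cycle: two inputs and one output.
    """
    return lanes * 3 * word_bytes * clock_hz


def max_lanes(bandwidth_bytes_per_s: int, word_bytes: int, clock_hz: int) -> int:
    """Largest number of adders the interface can feed without stalling."""
    return bandwidth_bytes_per_s // (3 * word_bytes * clock_hz)


# Hypothetical design point: 16 adders, 4-byte words, 200 MHz clock.
needed = required_bandwidth(lanes=16, word_bytes=4, clock_hz=200_000_000)
# Hypothetical interface that sustains 12.8 GB/s.
fed = max_lanes(12_800_000_000, word_bytes=4, clock_hz=200_000_000)
print(needed, fed)
```

If the interface can feed fewer adders than were instantiated, the extra adders sit idle, so the interface bandwidth, not the number of adders, sets the achievable speedup.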
