
Machine Learning Framework for Early Power, Performance, and Area Estimation of RTL

AUTHOR:
Vijay Kumar Sutrakar
Aeronautical Development Establishment,
Defence Research and Development Organisation
Bangalore, India
vks.ade@gov.in

Presented by:

Anindita Chattopadhyay
Dept. of Electronics and Communication Engineering
BMS College of Engineering
Bangalore, India
anindita.lvs21@bmsce.ac.in

Md. Rahat Ahmed Khan
Student ID: 210929
Electronics & Communication Engineering Discipline
PRESENTATION OUTLINE

• Abstract
• Methodology
• Simple Operator Graph
• Time Modelling
• Power Modelling
• Area Modelling
• Dataset
• Result & Discussion
• Conclusion
• References
ABSTRACT

Focus
• Predict Power, Performance, and Area (PPA) at the RTL stage using HDLs like Verilog, before full synthesis.

Problem in Traditional VLSI Design
• Evaluating PPA requires full synthesis and layout of the RTL design.
• This takes hours to days and uses expensive EDA tools (such as Synopsys and Cadence).
• Late-stage issues are hard or costly to fix.
• Not ideal for rapid design iteration or early feedback.

Proposed Solution
• Pre-synthesis ML framework using RTL + library files.
• Key idea: Simple Operator Graph (SOG), a bit-level graph that mimics the post-synthesis design.
ABSTRACT

Results (147 RTL Designs)
• 98% – Worst Negative Slack (WNS)
• 98% – Total Negative Slack (TNS)
• 90% – Power
• Outperforms prior models.
METHODOLOGY

HDL → RTL Representation (H → R):
• The HDL is converted into a Simple Operator Graph (SOG).
• The SOG breaks the RTL down into small building blocks such as AND, OR, and NOT at the bit level.

Apply ML models:
Three machine learning models are used:
• Random Forest and XGBoost for time (performance),
• Graph Convolutional Networks (GCN) for power,
• A tree-based model for area.
All models learn from the SOG version of the design.
• This gives early feedback before synthesis. A toy sketch of the three feature views extracted from an SOG follows below.
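A minimal, self-contained sketch of that idea (the toy graph, operator names, and toggle numbers are invented for illustration, not taken from the paper). It builds a tiny SOG-like graph in Python and derives the three feature views the models would consume:

    import networkx as nx
    from collections import Counter

    # Toy SOG: nodes are bit-level operators; this stands in for the graph Yosys would emit.
    sog = nx.DiGraph()
    sog.add_edges_from([("a", "and1"), ("b", "and1"), ("and1", "not1"),
                        ("not1", "mux1"), ("c", "mux1"), ("sel", "mux1"), ("mux1", "out")])
    op_type = {"and1": "AND", "not1": "NOT", "mux1": "MUX"}

    # Time view: operator counts along each primary-input -> output path.
    paths = list(nx.all_simple_paths(sog, "a", "out"))
    time_features = [Counter(op_type[n] for n in p if n in op_type) for p in paths]

    # Power view: annotate each operator node with a (made-up) switching activity.
    nx.set_node_attributes(sog, {"and1": 0.12, "not1": 0.12, "mux1": 0.30}, "switching_activity")

    # Area view: global operator-type histogram fed to the tree-based area model.
    area_features = Counter(op_type.values())
    print(time_features, area_features)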
SIMPLE OPERATOR GRAPH (SOG)

What is SOG?
• SOG (Simple Operator Graph) is a bit-level representation of RTL code in which all operations are broken down into five fundamental logic operations: AND, OR, XOR, NOT, and 2-to-1 MUX. These operations form a graph, the Simple Operator Graph (SOG) [12-15].

How is SOG Created?
• RTL code → Yosys → Bit-level graph (SOG) [16]
• No full synthesis needed → faster and simpler (a sketch of such a flow follows below)

Why SOG Works Well
• Closer to post-synthesis → better PPA estimation
• Uniform input → works on many types of designs
• No optimization steps needed → saves time
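A minimal sketch of one way to produce such a bit-level graph with Yosys and load it in Python. This is an assumed flow, not the authors' exact script; the module name top, the file names, and the choice of Yosys passes are placeholders:

    import json
    import subprocess
    import networkx as nx

    # Elaborate the RTL and map it to simple gate-level cells; no full synthesis run.
    yosys_script = ("read_verilog top.v; hierarchy -top top; proc; flatten; "
                    "techmap; opt_clean; write_json sog.json")
    subprocess.run(["yosys", "-p", yosys_script], check=True)

    # Build a directed graph: one node per cell, edges follow shared net bits.
    with open("sog.json") as f:
        mod = json.load(f)["modules"]["top"]

    g = nx.DiGraph()
    drivers = {}                                   # net bit -> name of driving cell
    for name, cell in mod["cells"].items():
        g.add_node(name, op=cell["type"])          # e.g. $_AND_, $_OR_, $_XOR_, $_NOT_, $_MUX_
        for port, bits in cell["connections"].items():
            if cell["port_directions"].get(port) == "output":
                for b in bits:
                    drivers[b] = name
    for name, cell in mod["cells"].items():
        for port, bits in cell["connections"].items():
            if cell["port_directions"].get(port) == "input":
                for b in bits:
                    if b in drivers:
                        g.add_edge(drivers[b], name)
    print(g.number_of_nodes(), "operator nodes,", g.number_of_edges(), "edges")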
TIME MODELLING

What is Time Modelling?
• Time modelling in VLSI refers to the process of estimating how long signals take to propagate through a digital circuit.

Two Key Metrics:
❖ WNS (Worst Negative Slack): the worst delay beyond the clock deadline.
If a circuit has a path with:
• Time Constraint = 1 ns
• Actual Delay = 1.5 ns
Then:
Slack = 1 − 1.5 = −0.5 ns ← this is a negative slack
If this is the worst slack in the design, it is called WNS = −0.5 ns.
❖ TNS (Total Negative Slack): the total of all accumulated negative slack (delay violations). A small numeric sketch of both metrics follows below.
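A tiny illustration of slack, WNS, and TNS for a few hypothetical path delays (all numbers are made up):

    clock_period = 1.0                               # ns, the timing constraint
    path_delays = [0.8, 1.5, 1.2]                    # ns, actual delays of three paths

    slacks = [clock_period - d for d in path_delays] # [0.2, -0.5, -0.2]
    wns = min(slacks)                                # worst slack: -0.5 ns
    tns = sum(s for s in slacks if s < 0)            # sum of violations: -0.7 ns
    print(f"WNS = {wns:.1f} ns, TNS = {tns:.1f} ns")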
TIME MODELLING

Time Modelling Flow (SOG + ML):
1. Input: RTL → SOG (bit-level logic graph).
2. Analytical delay: assign a delay to each logic node.
3. Critical path extraction: trace the highest-delay paths using source/sink matching.
4. Delay propagation: compute the cumulative delay per path.
5. Feature extraction: count operations and total delay per path.
6. Random Forest: predict path-level delay.
7. XGBoost: predict global WNS and TNS.
8. Output: accurate timing insight at the RTL stage (WNS and TNS). A sketch of steps 5-7 follows below.
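A self-contained sketch of steps 5-7 on synthetic data. The per-operator delays, the feature layout, and the model hyperparameters are assumptions for illustration, not the paper's values:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor

    rng = np.random.default_rng(0)
    OP_DELAY = {"AND": 0.10, "OR": 0.10, "XOR": 0.15, "NOT": 0.05, "MUX": 0.20}  # ns, assumed
    OPS = list(OP_DELAY)

    def random_path_features(n_paths):
        """Each path is summarized by its operator counts plus its analytical total delay."""
        feats, delays = [], []
        for _ in range(n_paths):
            counts = rng.integers(0, 8, size=len(OPS))
            analytical = float(np.dot(counts, [OP_DELAY[o] for o in OPS]))
            feats.append(np.append(counts, analytical))
            delays.append(analytical * rng.uniform(0.9, 1.3))   # "true" delay with noise
        return np.array(feats), np.array(delays)

    # Step 6: Random Forest predicts path-level delay from per-path features.
    X_path, y_path = random_path_features(500)
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_path, y_path)
    print("predicted delays of three paths:", rf.predict(X_path[:3]))

    # Step 7: XGBoost predicts the global WNS from design-level features
    # (here: mean and max of each synthetic design's per-path feature vectors).
    clock = 1.0
    designs, wns_targets = [], []
    for _ in range(200):
        Xd, yd = random_path_features(30)
        designs.append(np.concatenate([Xd.mean(axis=0), Xd.max(axis=0)]))
        wns_targets.append(min(clock - yd.max(), 0.0))
    xgb = XGBRegressor(n_estimators=200, max_depth=4).fit(np.array(designs), np.array(wns_targets))
    print("predicted WNS of first design:", xgb.predict(np.array(designs[:1]))[0])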
POWER MODELLING

What is RTL Power Modelling?
• At the RTL (Register Transfer Level), power modelling means estimating how much power the digital design will consume before it goes through full synthesis or physical implementation.

Work Flow:
• Bit-level RTL representation (SOG)
• Switching activity of each node
• Graph Convolutional Networks (GCN) for learning power consumption directly from the structure
POWER MODELLING

Switching Activity = Dynamic Power Source:
• Power is consumed when signals toggle (0 ↔ 1).
• More toggles = more power used.

Bit-Level Power Annotation in SOG:
• Each node (AND, OR, XOR, etc.) in the SOG is annotated with its switching frequency.
• This enables precise bit-level power tracking (a small numeric sketch follows below).
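A toy illustration of why toggles drive power, using the standard dynamic-power estimate P_dyn ≈ alpha · C · V² · f summed over annotated SOG nodes (the node names and all numbers are invented):

    V = 1.1          # supply voltage (V), assumed
    f = 1e9          # clock frequency (Hz), assumed

    # node -> (switching activity alpha, effective load capacitance in farads)
    sog_nodes = {
        "and1": (0.12, 2e-15),
        "xor3": (0.30, 3e-15),
        "mux7": (0.25, 4e-15),
    }

    p_dyn = sum(alpha * c * V**2 * f for alpha, c in sog_nodes.values())
    print(f"estimated dynamic power: {p_dyn * 1e6:.2f} uW")   # ~2.59 uW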
POWER MODELLING

Power Prediction via GCN (Graph Convolutional Network):
• The SOG is fed into a Graph Convolutional Network.
• The GCN learns how gate types and their connections affect power.
• This accurately models power at the RTL stage. [12,14,15] (A minimal model sketch follows below.)
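A minimal sketch of such a model in PyTorch Geometric. The layer sizes, node features, and toy graph are assumptions for illustration; the paper's exact architecture may differ:

    import torch
    from torch import nn
    from torch_geometric.data import Data
    from torch_geometric.nn import GCNConv, global_mean_pool

    class PowerGCN(nn.Module):
        def __init__(self, in_dim=6, hidden=32):
            super().__init__()
            self.conv1 = GCNConv(in_dim, hidden)
            self.conv2 = GCNConv(hidden, hidden)
            self.head = nn.Linear(hidden, 1)          # graph-level power estimate

        def forward(self, data):
            x = torch.relu(self.conv1(data.x, data.edge_index))
            x = torch.relu(self.conv2(x, data.edge_index))
            x = global_mean_pool(x, data.batch)       # pool node embeddings per design
            return self.head(x).squeeze(-1)

    # Toy graph: 3 nodes (AND, XOR, MUX); features = one-hot of 5 op types + toggle rate.
    x = torch.tensor([[1, 0, 0, 0, 0, 0.12],
                      [0, 0, 1, 0, 0, 0.30],
                      [0, 0, 0, 0, 1, 0.25]], dtype=torch.float)
    edge_index = torch.tensor([[0, 1], [2, 2]])       # edges 0->2 and 1->2
    batch = torch.zeros(3, dtype=torch.long)          # all nodes belong to design 0

    model = PowerGCN()
    pred = model(Data(x=x, edge_index=edge_index, batch=batch))
    print("predicted power (untrained, arbitrary units):", pred.item())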
AREA MODELLING

What is Area?
• The physical silicon space a circuit occupies after fabrication.

Types of Area:
• Sequential logic area (e.g., flip-flops)
• Combinational logic area (e.g., AND, OR, MUX, etc.)

Sequential Area Estimation (Simple + Direct):
• Count the number of D flip-flops in the SOG.
• Get the area per flip-flop from the standard cell library (liberty file).
Formula:
• Sequential Area = Number of Flip-Flops × Area per Flip-Flop
• No machine learning required: just count and multiply (see the short sketch below).
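A short sketch of the sequential-area formula. The flip-flop count and per-cell area are made-up values in the style of a liberty file, not taken from the paper:

    dff_count = 1240            # number of D flip-flops counted in the SOG (assumed)
    dff_area_um2 = 4.522        # area of one DFF cell from the liberty file (assumed)

    sequential_area = dff_count * dff_area_um2
    print(f"sequential area = {sequential_area:.1f} um^2")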
AREA MODELLING

Combinational Area Estimation (ML-based):
• For each logic gate type in the SOG (AND, OR, MUX, ...):
  • Count its occurrences.
  • Multiply by its individual cell area (from the liberty file).
• Extract features such as gate counts, gate types, and SOG structure.
• Use a tree-based ML model (e.g., Random Forest) to predict the total combinational area (a sketch follows below).
• Total Area = Sequential Logic Area + Combinational Logic Area
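A self-contained sketch of such a tree-based area model on synthetic data. The cell areas, feature layout, and the optimization-shrink factor are assumptions, not the paper's values:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    CELL_AREA = {"AND": 1.33, "OR": 1.33, "XOR": 2.13, "NOT": 0.80, "MUX": 2.66}  # um^2, assumed

    # Features per design: gate counts plus the naive count*area estimate per gate type.
    def make_design():
        counts = rng.integers(100, 5000, size=len(CELL_AREA))
        naive = counts * np.array(list(CELL_AREA.values()))
        # "True" area: the naive sum shrunk by a design-dependent optimization factor.
        true_area = naive.sum() * rng.uniform(0.7, 0.95)
        return np.concatenate([counts, naive]), true_area

    X, y = zip(*(make_design() for _ in range(300)))
    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(np.array(X), np.array(y))
    print("predicted combinational area of first design:", model.predict(np.array(X[:1]))[0])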
RESULT AND DISCUSSION

147 Optimized Circuits:
Different benchmark circuits from ISCAS’89 [18], ITC’99 [19], OpenCores [20], VexRiscv [21], RISC-V [22], NVDLA [23], and Chipyard [24] are used to predict their PPA.

Metric   R (Correlation)   MAPE (Error)
WNS      0.98              12%
TNS      0.98              24%
Power    0.92              <48%
Area     0.99              12%
CONCLUSION

• The framework performs moderately well on unoptimized designs.
• On optimized designs, it achieves high accuracy for WNS, TNS, power, and area.
• It is effective for early-stage RTL evaluation.
• Future work: improve the framework further for more complex designs and better scalability.
REFERENCES

[12] W. Fang et al., "MasterRTL: A Pre-Synthesis PPA Estimation Framework for Any RTL Design," in 2023 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Francisco, CA, USA, 2023, pp. 1–9.
[13] NanGate 45nm Open Cell Library. [Online]. Available: https://si2.org/open-cell-library/.
[14] N. Wu, H. Yang, Y. Xie, P. Li, and C. Hao, "High-level synthesis performance prediction using GNNs: Benchmarking, modeling, and advancing," in Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC), 2022, pp. 49–54.
[15] E. Ustun, C. Deng, D. Pal, Z. Li, and Z. Zhang, "Accurate operation delay prediction for FPGA HLS using graph neural networks," in Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), 2020, pp. 1–9.
[18] F. Brglez, D. Bryan, and K. Kozminski, "Combinational profiles of sequential benchmark circuits," in IEEE International Symposium on Circuits and Systems (ISCAS), 1989, pp. 1929–1934.
[19] F. Corno, M. S. Reorda, and G. Squillero, "RT-level ITC'99 benchmarks and first ATPG results," IEEE Design & Test of Computers, 2000.

REFERENCES

[20] E. Ustun, C. Deng, D. Pal, Z. Li, and Z. Zhang, "Accurate operation delay prediction for FPGA HLS using graph neural networks," in Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), 2020, pp. 1–9.
[21] VexRiscv, "VexRiscv: A FPGA friendly 32 bit RISC-V CPU implementation," 2022. [Online]. Available: https://github.com/SpinalHDL/VexRiscv.
[22] "A 32-bit RISC-V processor for the mriscv project," 2017. [Online]. Available: https://github.com/onchipuis/mriscvcore.
[23] Nvidia, "NVIDIA Deep Learning Accelerator," 2018. [Online]. Available: http://nvdla.org/primer.html.
[24] A. Amid, D. Biancolin, A. Gonzalez, D. Grubb, S. Karandikar, H. Liew, A. Magyar, H. Mao, A. Ou, N. Pemberton et al., "Chipyard: Integrated design, simulation, and implementation framework for custom SoCs," IEEE Micro, vol. 40, no. 4, pp. 10–21, 2020.

THANK YOU
