Machine Learning Framework for Early Power, Performance, and Area Estimation of RTL

AUTHORS:
Vijay Kumar Sutrakar
Aeronautical Development Establishment, Defence Research and Development Organisation
Bangalore, India
vks.ade@gov.in

Anindita Chattopadhyay
Dept. of Electronics and Communication, BMS College of Engineering
Bangalore, India
anindita.lvs21@bmsce.ac.in

Presented by:
Md. Rahat Ahmed Khan
Student ID: 210929
Electronics & Communication Engineering Discipline
2 PRESENTATION OUTLINE
Abstract
Methodology
Simple Operator Graph
Time Modelling
Power Modelling
Area Modelling
Dataset
Result & Discussion
Conclusion
References
3 ABSTRACT
Focus
Predict Power, Performance, Area (PPA) at RTL stage using HDLs like Verilog—before full
synthesis.
Problem in Traditional VLSI Design
Evaluating PPA requires full synthesis and layout of the RTL design.
This takes hours to days, using expensive EDA tools (e.g., Synopsys, Cadence).
Late-stage issues are hard or costly to fix.
Not ideal for rapid design iteration or early feedback.
Proposed Solution
Pre-synthesis ML framework using RTL + library files.
Key: Simple Operator Graph (SOG) – bit-level graph mimicking post-synthesis design.
4 ABSTRACT
Results (147 RTL Designs)
98% – Worst Negative Slack (WNS)
98% – Total Negative Slack (TNS)
90% – Power
Outperforms prior models.
5 METHODOLOGY
HDL → RTL Representation (H → R):
The HDL is converted into a Simple Operator Graph (SOG).
SOG breaks down RTL into small building blocks like AND, OR, NOT, etc., at the bit level.
Apply ML models:
Three machine learning models are used:
Random Forest and XGBoost for timing (performance),
Graph Convolutional Networks (GCN) for power,
a tree-based model for area.
All models learn from the SOG version of the design.
This gives early feedback, before synthesis.
6 SIMPLE OPERATOR GRAPH (SOG)
What is SOG?
SOG (Simple Operator Graph) is a bit-level representation of RTL code where all operations
are broken down into five fundamental logic operations:
AND, OR, XOR, NOT, and 2-to-1 MUX.
These operations form a graph → the Simple Operator Graph (SOG) [12-15].
How is SOG Created?
RTL code → Yosys → Bit-level graph (SOG)[16]
No full synthesis needed → Faster & simpler
Why SOG Works Well
Closer to post-synthesis → better PPA estimation
Uniform input → works on many types of designs
No optimization steps needed → saves time
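To make the decomposition concrete, here is a toy sketch that encodes a 1-bit full adder as a graph of the five primitives. The dict-based encoding and node names are hypothetical illustrations, not the paper's actual SOG data format:

```python
# Hypothetical sketch: a 1-bit full adder expressed as a Simple Operator
# Graph (SOG) built only from the five primitive operations.
PRIMITIVES = {"AND", "OR", "XOR", "NOT", "MUX"}

# Each node: (operation, list of fan-in node names). Primary inputs have op None.
sog = {
    "a":    (None, []),
    "b":    (None, []),
    "cin":  (None, []),
    "x1":   ("XOR", ["a", "b"]),      # a ^ b
    "sum":  ("XOR", ["x1", "cin"]),   # sum  = a ^ b ^ cin
    "a1":   ("AND", ["a", "b"]),      # a & b
    "a2":   ("AND", ["x1", "cin"]),   # (a ^ b) & cin
    "cout": ("OR",  ["a1", "a2"]),    # cout = a&b | (a^b)&cin
}

def op_counts(graph):
    """Count how many nodes of each primitive type the SOG contains."""
    counts = {}
    for op, _ in graph.values():
        if op in PRIMITIVES:
            counts[op] = counts.get(op, 0) + 1
    return counts

print(op_counts(sog))  # {'XOR': 2, 'AND': 2, 'OR': 1}
```

Such per-operation counts are exactly the kind of uniform, design-independent feature the SOG representation makes available before synthesis.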
7 TIME MODELLING
What is Time Modelling?
Time Modeling in VLSI refers to the process of estimating how long signals take to
propagate through a digital circuit
Two Key Metrics:
❖ WNS (Worst Negative Slack): Worst delay beyond clock deadline
If a circuit has a path with:
• Time Constraint = 1 ns
• Actual Delay = 1.5 ns
Then,
Slack = 1 − 1.5 = −0.5 ns ← This is a negative slack
If this is the worst slack in the design, it is called the WNS: WNS = −0.5 ns
❖ TNS (Total Negative Slack): Total accumulated delay violations
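The slack arithmetic on this slide can be checked with a few lines of Python; the path delays below are made-up numbers that include the slide's 1.5 ns example:

```python
# Toy illustration of the two timing metrics, using invented path delays.
clock_period = 1.0             # timing constraint, ns
path_delays = [0.8, 1.5, 1.2]  # actual delay of each timing path, ns

slacks = [clock_period - d for d in path_delays]  # positive = meets timing
wns = min(slacks)                    # worst (most negative) slack
tns = sum(s for s in slacks if s < 0)  # accumulated negative slack only

print(wns)            # -0.5
print(round(tns, 1))  # -0.7
```

Note that TNS sums only the violating paths, so paths with positive slack do not cancel out violations.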
8 TIME MODELLING
Time Modelling Flow (SOG + ML)
1. Input: RTL → SOG (bit-level logic graph).
2. Analytical Delay: Assign delay to each logic node.
3. Critical Path Extraction: Trace paths with highest delay using source/sink
matching.
4. Delay Propagation: Compute cumulative delay per path.
5. Feature Extraction: Count operations and total delay per path.
6. Random Forest: Predict path-level delay.
7. XGBoost: Predict global WNS and TNS.
8. Output: Accurate timing insights at the RTL stage (WNS and TNS).
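Steps 2-4 of the flow can be sketched as a cumulative-delay propagation over the SOG. The per-gate delay values and the dict-based graph encoding are invented placeholders, not library-calibrated numbers from the paper:

```python
# Minimal sketch of analytical delay assignment and propagation over an SOG.
# Delays are illustrative assumptions, not real standard-cell values.
NODE_DELAY = {"AND": 0.10, "OR": 0.10, "XOR": 0.15, "NOT": 0.05, "MUX": 0.20}

# node -> (op, fan-in node names); primary inputs have op None.
# The dict is listed in topological order (inputs before consumers).
sog = {
    "a":    (None, []), "b": (None, []), "cin": (None, []),
    "x1":   ("XOR", ["a", "b"]),
    "sum":  ("XOR", ["x1", "cin"]),
    "a1":   ("AND", ["a", "b"]),
    "a2":   ("AND", ["x1", "cin"]),
    "cout": ("OR",  ["a1", "a2"]),
}

def arrival_times(graph):
    """Cumulative delay at each node; assumes topological ordering."""
    at = {}
    for node, (op, fanin) in graph.items():
        base = max((at[f] for f in fanin), default=0.0)
        at[node] = base + (NODE_DELAY[op] if op else 0.0)
    return at

at = arrival_times(sog)
critical = max(at, key=at.get)   # endpoint of the critical path
print(critical, round(at[critical], 2))  # cout 0.35
```

Per-path operation counts and cumulative delays like these are then the features handed to the Random Forest and XGBoost models.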
9 POWER MODELLING
What is RTL Power Modelling?
At the RTL (Register Transfer Level), power modeling means estimating how
much power the digital design will consume before going through full synthesis
or physical implementation.
Work Flow:
Bit-level RTL representation (SOG)
Switching activity of each node
Graph Convolutional Networks (GCN) for learning power consumption directly
from the structure
10 POWER MODELLING
Switching Activity = Dynamic Power Source:
Power is consumed when signals toggle (0 ↔ 1).
More toggles = more power used.
Bit-Level Power Annotation in SOG
Each node (AND, OR, XOR, etc.) in the SOG is:
• Annotated with its switching frequency.
Enables precise bit-level power tracking.
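A minimal sketch of this annotation, using the standard dynamic-power relation P = α · C · V² · f. The capacitance, voltage, frequency, and toggle-rate values are illustrative assumptions, not measured data:

```python
# Per-node dynamic power from switching activity. All constants are
# invented placeholders for illustration.
VDD = 1.0    # supply voltage (V)
FREQ = 1e9   # clock frequency (Hz)
CAP = {"AND": 1e-15, "OR": 1e-15, "XOR": 2e-15}  # effective capacitance (F)

# node -> (op, switching activity alpha = average toggles per cycle)
annotated = {"x1": ("XOR", 0.5), "a1": ("AND", 0.2), "c1": ("OR", 0.3)}

def dynamic_power(nodes):
    """Sum per-node dynamic power alpha * C * Vdd^2 * f, in watts."""
    return sum(alpha * CAP[op] * VDD**2 * FREQ for op, alpha in nodes.values())

print(dynamic_power(annotated))  # roughly 1.5e-06 W for this 3-node toy graph
```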
11 POWER MODELLING
Power Prediction via GCN (Graph Convolutional Network):
SOG is fed into a Graph Convolutional Network.
The GCN learns how gate types and their connections affect power.
Accurately models power at RTL. [12,14,15]
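The core idea of a graph convolution is that each node's features are aggregated with its neighbours' and then linearly transformed. A dependency-free toy version of one such layer (real implementations use PyTorch/PyG; the mean aggregation, weights, and features here are illustrative assumptions):

```python
# One simplified graph-convolution step: aggregate neighbour features
# (with a self-loop), apply a linear transform, then ReLU.
def gcn_layer(adj, feats, weight):
    """adj: {node: [neighbours]}, feats: {node: [float]},
    weight: input_dim x output_dim matrix as a list of rows."""
    out = {}
    for node, neigh in adj.items():
        group = [node] + neigh                    # self-loop + neighbours
        dim = len(feats[node])
        agg = [sum(feats[g][i] for g in group) / len(group) for i in range(dim)]
        h = [sum(agg[i] * weight[i][j] for i in range(dim))
             for j in range(len(weight[0]))]      # linear transform
        out[node] = [max(0.0, v) for v in h]      # ReLU
    return out

# Tiny 3-node chain a - b - c with 1-dimensional features.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
feats = {"a": [1.0], "b": [2.0], "c": [4.0]}
weight = [[1.0]]
out = gcn_layer(adj, feats, weight)
print(out["a"], out["c"])  # [1.5] [3.0]
```

Stacking such layers lets information about gate types propagate across the SOG, which is how structural context ends up influencing the power prediction.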
12 AREA MODELLING
What is Area?
The physical silicon space a circuit occupies after fabrication.
Types of Area:
Sequential Logic Area (e.g., flip-flops)
Combinational Logic Area (e.g., AND, OR, MUX, etc.)
Sequential Area Estimation (Simple + Direct):
Count number of D flip-flops in the SOG.
Get area per flip-flop from standard cell library (liberty file).
Formula:
Sequential Area = Number of Flip-Flops × Area per Flip-Flop
No machine learning required: just count and multiply.
13 AREA MODELLING
Combinational Area Estimation (ML-based):
For each logic gate in the SOG (AND, OR, MUX...):
• Count occurrences
• Multiply with their individual area (from liberty file)
Extract features like:
• Gate counts, types, SOG structure
Use a tree-based ML model (e.g., Random Forest) to predict total combinational area.
Total Area: Sequential Logic Area + Combinational Logic Area
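The two-part recipe above can be sketched numerically. For simplicity this sketch uses direct count-times-cell-area multiplication for the combinational part as a stand-in for the paper's tree-based ML model; all cell areas and gate counts are invented placeholders, not real liberty-file values:

```python
# Illustrative total-area estimate: sequential area by direct counting,
# combinational area by count * cell-area (a stand-in for the ML model).
CELL_AREA = {"DFF": 4.5, "AND": 1.1, "OR": 1.1, "XOR": 1.8, "NOT": 0.6, "MUX": 2.3}

# Hypothetical gate counts extracted from an SOG.
gate_counts = {"DFF": 32, "AND": 120, "OR": 95, "XOR": 40, "NOT": 60, "MUX": 12}

sequential_area = gate_counts["DFF"] * CELL_AREA["DFF"]   # count x area per FF
combinational_area = sum(
    n * CELL_AREA[g] for g, n in gate_counts.items() if g != "DFF"
)
total_area = sequential_area + combinational_area

print(sequential_area)       # 144.0
print(round(total_area, 1))  # 516.1
```

In the paper's flow, the per-gate counts and areas would instead become features for the tree-based regressor, which can also capture area effects that simple multiplication misses.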
14 RESULT AND DISCUSSION
147 Optimized Circuits:
Benchmark circuits from ISCAS’89 [18], ITC’99 [19], OpenCores [20], VexRiscv [21], RISC-V [22], NVDLA [23], and Chipyard [24] are used for PPA prediction.
Metric   R (Correlation)   MAPE (Error)
WNS      0.98              12%
TNS      0.98              24%
Power    0.92              <48%
Area     0.99              12%
16 CONCLUSION
The framework performs moderately well on unoptimized designs.
On optimized designs, it achieves high accuracy for WNS, TNS, power, and
area.
It's effective for early-stage RTL evaluation.
Future work: improve further for more complex designs and scalability.
17 REFERENCES
[12] W. Fang et al., "MasterRTL: A Pre-Synthesis PPA Estimation Framework for Any RTL
Design," 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD),
San Francisco, CA, USA, 2023, pp. 1-9
[13] NanGate 45nm Open Cell Library, https://si2.org/open-cell-library/.
[14] N. Wu, H. Yang, Y. Xie, P. Li, and C. Hao, “High-level synthesis performance prediction
using GNNs: Benchmarking, modeling, and advancing,” in Proceedings of the 59th
ACM/IEEE Design Automation Conference (DAC), 2022, pp. 49–54.
[15] E. Ustun, C. Deng, D. Pal, Z. Li, and Z. Zhang, “Accurate operation delay prediction for
FPGA HLS using graph neural networks,” in Proceedings of the 39th International
Conference on Computer-Aided Design (ICCAD), 2020, pp. 1–9.
[18] F. Brglez, D. Bryan, and K. Kozminski, “Combinational profiles of sequential benchmark
circuits,” in IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 1989, pp.
1929–1934.
[19] F. Corno, M. S. Reorda, and G. Squillero, “RT-level ITC’99 benchmarks and first ATPG
results,” IEEE Design & Test of Computers, 2000.
18 REFERENCES
[20] E. Ustun, C. Deng, D. Pal, Z. Li, and Z. Zhang, “Accurate operation delay
prediction for FPGA HLS using graph neural networks,” in Proceedings of the 39th
International Conference on Computer-Aided Design (ICCAD), 2020, pp. 1–9.
[21] VexRiscv, “VexRiscv: A FPGA friendly 32 bit RISCV CPU implementation,”
2022. [Online]. Available: https://github.com/SpinalHDL/VexRiscv.
[22] “A 32-bit RISC-V processor for mriscv project,” 2017. [Online]. Available:
https://github.com/onchipuis/mriscvcore.
[23] Nvidia, “Nvidia deep learning accelerator,” 2018. [Online]. Available:
http://nvdla.org/primer.html
[24] A. Amid, D. Biancolin, A. Gonzalez, D. Grubb, S. Karandikar, H. Liew, A.
Magyar, H. Mao, A. Ou, N. Pemberton et al., “Chipyard: Integrated design,
simulation, and implementation framework for custom socs,” IEEE Micro, vol. 40,
no. 4, pp. 10–21, 2020.
19
THANK YOU