Machine Learning Applications in
Physical Design: Recent Results
         and Directions
           Andrew B. Kahng
       CSE and ECE Departments
            UC San Diego
         http://vlsicad.ucsd.edu
                                   A. B. Kahng, 180327 ISPD--2018
Agenda
• Crises…
IC Industry Crises: Cost, Quality of Design
• Can’t afford to design chips (tools, people, time, risk)
• Return on investment for new technology is poor
  • $$M to move to new node (28nm → 14nm → 10nm → 7nm → …)
  • Benefit from new node: ~20% power, speed, area (less, today)
• Design Capability Gap
  • Available density grows at 2x/node
  • Realizable density grows at 1.6x/node
  • UCSD / 2013 ITRS
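The two growth rates above compound from node to node. A minimal sketch of that arithmetic, assuming (hypothetically) that available and realizable density start out equal at some reference node:

```python
# Per-node growth factors quoted on the slide.
AVAILABLE_PER_NODE = 2.0
REALIZABLE_PER_NODE = 1.6

def capability_gap(nodes: int) -> float:
    """Ratio of available to realizable density after `nodes` node transitions.

    Assumes both densities are equal at the reference node (an illustration
    choice, not a number from the slide).
    """
    return (AVAILABLE_PER_NODE / REALIZABLE_PER_NODE) ** nodes

for k in range(1, 5):
    print(f"after {k} node(s): available / realizable = {capability_gap(k):.2f}x")
```

Under this assumption the gap widens by a factor of 1.25 per node, so it roughly doubles after three nodes.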
IC Design Crises: Unpredictability, Schedule
• Many steps in long “design flow” → can we predict outcome?
• Many chicken-egg loops → convergence point? how to initialize?
• Nearly all problems are NP-hard
  • Min-cut hypergraph bisection, Quadratic assignment,
    Multicommodity flow, Max-weight independent set,
    Multi-vehicle TSP, k-colorability, …
• Huge “n” → metaheuristics piled on metaheuristics
• Suboptimality is expensive
  • 10% of {power, speed, area} is half of benefit from new node
• Iteration is expensive
  • Moore’s Law: 1 week = 1 percent
• Conservatism (“margin”) is expensive
  • But: “oops” (didn’t fit, didn’t route, too slow) is unacceptable
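The “1 week = 1 percent” rule of thumb above can be sanity-checked with a short calculation. This is a sketch under an assumed reading of Moore's Law (a metric doubling every 18 to 24 months); the doubling periods are assumptions, not figures from the slide.

```python
def weekly_gain(doubling_months: float) -> float:
    """Fractional improvement per week implied by doubling every `doubling_months`."""
    weeks = doubling_months * 52.0 / 12.0
    return 2.0 ** (1.0 / weeks) - 1.0

# Assumed doubling periods (not from the slide).
for months in (18, 24):
    print(f"doubling every {months} months -> {100 * weekly_gain(months):.2f}% per week")

# The "half of benefit" bullet, using the ~20% per-node benefit quoted earlier.
NODE_BENEFIT = 0.20
print(f"10% suboptimality = {0.10 / NODE_BENEFIT:.0%} of a node's benefit")
```

Both assumed periods give a bit under 1% per week, which is consistent with the rule of thumb as an order-of-magnitude statement.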
Unpredictability of Design
• Intractable optimizations → heuristics piled on heuristics
• “Noise” or “Chaos” when EDA tools “try hard”
• Unpredictability → added margin and schedule
  14nm PULPino: Δarea = 6% from Δfreq = 10MHz!
          Challenges: Schedule, Quality, Cost
“The Last Semiconductor Scaling Levers”
• Quality
  • Improved design tools and methods
  • Reduced margins
• Schedule
  • 1 week = 1%
• Cost
  • IC design is expensive (engineers, tools, spins, …)
Agenda
• Crises…
• … and a Vision
Today’s SOC Design
[Figure: diagram relating # Partitions, Design Flexibility, Predictability, Margins, # Iterations, Turnaround Time, and Achieved Design Quality]
Vision for Future SOC Design
[Figure: diagram with labels # Partitions, Design Flexibility, Predictability, Margins, # Iterations, Single-pass, Turnaround Time, Achieved Design Quality; several labels are marked “!”]
Mindsets
• Tools should not return unexpected results
• Achieve predictability from the user’s POV
• Use cloud/parallel to recover solution quality
• Focus on reducing design time, design effort
[Side labels: Quality, Schedule, Cost]
   Machine Learning will be a key piece of this …
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
Machine Learning in Physical Design
Problem types solved with Machine Learning
•   Classification
•   Regression
•   Dimensionality reduction
•   Structured prediction
•   Anomaly detection
Past ML applications in EDA literature
•   Yield modeling (anomaly detection, classification)
•   Lithography hotspot detection (classification)
•   Identification of datapath-regularity (classification)
•   Noise and process-variation modeling (regression)
•   Performance modeling for analog circuits (regression)
•   Design- and implementation-space exploration (regression)
      ML in PD: modeling, prediction, correlation, …
Near-Term Opportunities
• Modeling and Prediction
  • Predict tool outcome = F(design, constraints, tool config)
    • How to run tool “optimally” for given design and design goals?
    • Avoid “failed runs” → reduce iterations in design flow
    • Dream: one-pass design flow
• Analysis Correlation
  • Model analysis errors (crude vs. golden analyses)
    • Reduced guardbands and pessimism → better design quality
• Optimization (ML models = objective functions!)
  • ML models = objective functions for higher-level optimization
  • Better use of resources (tools, schedule, engineers) + better tools
  • Project-level prediction, adaptive scheduling
• Later: “Taxonomy and Roadmap”
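The idea “tool outcome = F(design, constraints, tool config)”, and then using that model as an objective, can be sketched as follows. Everything here is illustrative: the run records are made-up numbers, the three features (utilization, clock period, effort level) are hypothetical, and a plain k-nearest-neighbour regressor stands in for whatever model a real flow would use.

```python
import math

# Made-up run records for illustration only:
# (utilization, clock period in ns, effort level) -> post-route DRV count.
RUNS = [
    ((0.60, 1.0, 1), 12),
    ((0.72, 1.0, 1), 40),
    ((0.85, 1.0, 1), 310),
    ((0.60, 0.8, 2), 30),
    ((0.72, 0.8, 2), 95),
    ((0.85, 0.8, 2), 520),
]

def predict_drvs(features, k=2):
    """k-nearest-neighbour estimate of the DRV count for an unseen run.

    Features are used unscaled, which is crude; a real model would normalize.
    """
    nearest = sorted(RUNS, key=lambda run: math.dist(run[0], features))[:k]
    return sum(outcome for _, outcome in nearest) / k

def pick_utilization(candidates, period, effort, drv_budget):
    """Use the model as an objective: the highest utilization whose
    predicted DRV count stays within the budget, or None."""
    feasible = [u for u in candidates
                if predict_drvs((u, period, effort)) <= drv_budget]
    return max(feasible) if feasible else None

print(pick_utilization([0.6, 0.7, 0.8], period=1.0, effort=1, drv_budget=100))
```

The second function is the point of the “ML models = objective functions” bullet: once outcomes can be predicted, a higher-level search can choose settings without launching the failed runs.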
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
• Modeling and Prediction
Example 1: Interface Between Global-Detailed Route
 • 7nm P&R: global route (GR) congestion map does not
   correlate well with post-route (actual) DRC violations (DRVs)
 • Many false-positive overflows in GR congestion map
 • False positives do not correspond to actual DRVs
[Figure: GR overflow map (left) vs. actual DRV map (right)]
GR-based prediction can mislead routability optimizations!!!
Too Many Expensive Iterations
Conventional closure
• Iteratively fix design before signoff
• Go back to placement or synthesis or FP if QOR is hopeless
• Costly iterations and TAT (7-day P&R runs…)
ISPD17: ML-based DRV predictor
• Iteration with space padding, NDR modifications, density screens ... (marked X in the figure)
[Figure: design flow: Technology, Design Rules, Constraints, RTL Design → Synthesis → Placement → G/D Routing → Analyze QOR (area, wirelength, timing, #DRCs, yield); iteration loops marked X]
Insight From Layout Studies
• Initial prediction from GR overflows and cell/pin density map
• Red DRV-hotspot likely a False Negative due to low cell-pin density
• Larger windows, buried nets (also NDRs, FFs, etc.) added to model inputs
[Figure: layout windows with sparse pins/cells vs. dense pins/cells; legend: standard cells, actual DRV, false negative, layout windows, non-buried net]
Improved Learning-Based Predictor
• Captures all true-positive clusters
• Maintains low false-positive rate
[Figure, panels (a) to (c): learning-based prediction vs. actual DRVs]
ISPD17: Model-Guided Routability Opt
• New: True-Positive rate = 74%, False-Positive rate = 0.2%
• Previous: True-Positive rate = 24%, False-Positive rate = 0.5%
Example 2: Local CTS Optimization Moves
• Iterative local moves to minimize skew variation
  across corners
 1. Displacement {N, S, E, W, NE, NW, SE, SW} by 10μm x
    one-step sizing
 2. Displacement by 10μm x one-step sizing on child buffer
 3. Reassign to a new driver (i) at the same level, (ii) within
    bounding box of 50μm x 50μm
[Figure: sketches of move types (1), (2), (3), with 10μm displacements]
• Each move is expensive (legalization, ECO routing, RC extraction, STA)
• Each buffer has many candidate moves
• DAC-15: learning-based model
DAC15: CTS Outcome Prediction
• Predict driver-to-fanout latency change due to local moves
[Figure: local move → analytical models (routing: FLUTE, STST; cell delay: Liberty LUTs; wire delay: Elmore, D2M) → delta delays → learning-based model → delta delays]
[Plot: % buffers identified to have the best move (0% to 100%) vs. #attempts (0 to 12); series: Flute+ED, Flute+D2M, STST+ED, STST+D2M, Model]
• Each attempt is a local move
• 114 buffers
• 45 candidate moves for each buffer
• Learning-based model identifies best moves for more buffers with fewer attempts
    Example 3: Prediction of Doomed Runs?
•   Some P&R runs end up with too many post-route DRVs
•   Approach: track and project metrics as time series
•   Markov decision process (MDP): terminate “doomed runs” early
•   Shown: 4 example progressions of #DRVs (commercial router)
    • Stopping red, yellow runs early would save resources and schedule!
 Markov Decision Process = “Strategy Card”
• State space from Fibonacci binning
• Actions – GO or STOP
• Rewards at each state – e.g., small negative reward for non-stop state, large
  positive reward for stop with low #DRVs, etc.
• Automatically trained MDP “strategy card”: Yellow = GO, Purple = STOP
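A minimal sketch of how such a GO/STOP “strategy card” could be derived from run logs. The details are assumptions made for illustration: DRV counts are binned at Fibonacci boundaries, states are (iteration, bin), STOP is worth 0, each GO costs a small step penalty, and finishing with few DRVs earns a large reward. The toy trajectories are made up, and the slide's actual state space and rewards may differ.

```python
from bisect import bisect_right
from collections import defaultdict

FIB_EDGES = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]

def fib_bin(drvs: int) -> int:
    """Map a DRV count to a bin whose boundaries are Fibonacci numbers."""
    return bisect_right(FIB_EDGES, drvs)

def train_strategy_card(runs, success_bin, step_cost=1.0, win=20.0, loss=-20.0):
    """Return {(iteration, bin): 'GO' | 'STOP'} by backward induction.

    runs: equal-length lists of DRV counts, one entry per router iteration.
    """
    horizon = len(runs[0])
    # Empirical transition counts between bins at consecutive iterations.
    counts = defaultdict(lambda: defaultdict(int))
    for run in runs:
        bins = [fib_bin(d) for d in run]
        for t in range(horizon - 1):
            counts[(t, bins[t])][bins[t + 1]] += 1

    # Terminal values: reward for finishing clean, penalty otherwise.
    value = defaultdict(float)
    for run in runs:
        b = fib_bin(run[-1])
        value[(horizon - 1, b)] = win if b <= success_bin else loss

    card = {}
    for t in range(horizon - 2, -1, -1):
        for (tt, b), nxt in list(counts.items()):
            if tt != t:
                continue
            total = sum(nxt.values())
            go = -step_cost + sum(n / total * value[(t + 1, nb)]
                                  for nb, n in nxt.items())
            card[(t, b)] = "GO" if go > 0.0 else "STOP"  # STOP is worth 0
            value[(t, b)] = max(go, 0.0)
    return card

# Made-up trajectories: two runs that converge, two that never do.
TOY_RUNS = [
    [500, 200, 60, 10, 2],
    [400, 150, 40, 8, 1],
    [500, 480, 450, 470, 460],
    [600, 590, 610, 600, 580],
]
card = train_strategy_card(TOY_RUNS, success_bin=fib_bin(5))
for state in sorted(card):
    print(state, card[state])
```

On the toy data the card says GO at the start, then STOP for runs whose DRV count is still in the top bins after one iteration, which is the behaviour the slide describes for doomed runs.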
Strategy Card “Completion”
    Promising Initial Studies
•   TYPE 1 Prediction Error: MDP STOPs a run that will eventually succeed
•   TYPE 2 Prediction Error: MDP predicts GO at each iteration, but run fails
•   Training data: 1200 logfiles from PROBE experiments
•   Testing data: 3442 logfiles from ARM Cortex M0 floorplan experiments
•   Substantial #iterations saved for doomed runs (398 / 3442 cases)
    Latest P&R tools have increased #iterations → larger benefit in future?
Errors (N = 200)       Training (Total = 1200)               Testing (Total = 3442)
Stop rule              Total error   #TYPE 1   #TYPE 2       Total error   #TYPE 1   #TYPE 2
1 STOP                 29.66%        251       99            35.2%         1317      3
2 consecutive STOPs    10.5%         27        99            8.3%          307       3
3 consecutive STOPs    8.5%          3         99            4.2%          154       3
TYPE 1 = wrong STOP prediction; TYPE 2 = no STOP
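The “k consecutive STOPs” rows trade wrong STOPs against later termination. A small sketch of that rule, assuming the strategy card emits one GO/STOP recommendation per router iteration (the function name and the example sequence are hypothetical):

```python
def stop_iteration(recommendations, k):
    """Index of the iteration at which the run is terminated, i.e. the first
    time `k` STOP recommendations occur in a row; None if that never happens."""
    streak = 0
    for i, rec in enumerate(recommendations):
        streak = streak + 1 if rec == "STOP" else 0
        if streak >= k:
            return i
    return None

recs = ["GO", "STOP", "GO", "STOP", "STOP", "STOP"]
for k in (1, 2, 3):
    print(k, stop_iteration(recs, k))
```

Requiring a longer streak ignores isolated STOPs (fewer TYPE 1 errors in the table) at the cost of running doomed runs for a few more iterations.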
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
• Modeling and Prediction
• Analysis Correlation
ML Shifts the Accuracy-Cost Tradeoff Curve!
Example 4: ML-based Timer Correlation
DATE-2014 (+ SLIP-2015)
[Figure: flow: train on artificial circuits, validate on real designs → ONE-TIME models (path slack, setup time, stage, cell, wire delays) → test on new designs; INCREMENTAL update with outliers (data points) if error > threshold]
[Plots: T2 path slack (ns) vs. T1 path slack (ns), BEFORE and AFTER ML modeling: 123 ps before, 31 ps after (~4× reduction)]
“SI for Free” with Machine Learning
• Machine learning of incremental transition time, delay due to SI
• Accurate SI-aware path delays, slacks
[Figure: flow: timing reports in SI mode and non-SI mode → create training, validation and testing sets → ANN (2 hidden layers, 5-fold cross-validation) and SVM (RBF kernel, 5-fold cross-validation) → HSM (weighted predictions from ANN and SVM) → save model and exit]
[Plots: BEFORE: non-SI path slack (ns) ($) vs. SI path slack (ns) ($$$), 81ps; AFTER ML modeling: predicted vs. actual path delay (ps), 8.2ps; worst absolute error = 8.2ps, average absolute error = 1.7ps]
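The “weighted predictions from ANN and SVM” step can be illustrated with a two-model blend. This is only a sketch: the slide does not say how the weights are chosen, so the least-squares weight on a validation set used here is an assumption, and the numbers are toy values.

```python
def blend_weight(pred_a, pred_b, actual):
    """Weight w in [0, 1] minimizing the squared error of
    w * pred_a + (1 - w) * pred_b against `actual` (closed form, then clipped)."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    w = num / den if den else 0.5
    return min(1.0, max(0.0, w))

def blend(pred_a, pred_b, w):
    """Weighted combination of two models' predictions."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]

# Toy validation data (made up): model A underestimates, model B overestimates.
model_a = [10.0, 20.0, 30.0]
model_b = [14.0, 24.0, 34.0]
golden = [11.0, 21.0, 31.0]

w = blend_weight(model_a, model_b, golden)
print(w, blend(model_a, model_b, w))
```

The weight would be fit on validation data and then frozen, matching the “save model and exit” step in the flow.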
 Example 5: Predicting PBA from GBA?
• PBA (Path-Based Analysis) is less pessimistic than GBA
  (Graph-Based Analysis)
• But: more expensive runtime!
• Question: Can we predict PBA timing from GBA timing?
  •  Better optimization in P&R&Opt, less expensive STA
[Plot “PBA - GBA Slack Gain”: PBA slack – GBA slack (ps), 0 to 50, vs. endpoint index, 0 to 30000; panels: GBA mode, PBA mode]
Costs of GBA vs. PBA Pessimism
GBA Actual Slack   PBA Actual Slack   Impact
POSITIVE           POSITIVE           Power recovery can’t exploit usable slack
NEGATIVE           POSITIVE           Schedule, Area, Power wasted fixing false timing violations
NEGATIVE           NEGATIVE           Schedule, Area, Power waste from over-fixing

PBA Actual Slack   PBA Predicted Slack (Model)   Impact
HIGH               LOW                           Power recovery can’t exploit all of usable slack
LOW                HIGH                          Masking of real violations
 Promising Initial Studies
• Early model with MARS (multivariate adaptive regression
  splines): 90% of predicted PBA slacks within 5ps
• Also: random forest classifier for 2-stage “bi-grams”
• Testcase: netcard, 28nm FDSOI
[Histogram: # endpoints (testing), 0 to 9000, vs. error (ps) = actual − predicted PBA slack, −40 to 40]
Bi-gram = 2-stage unit in timing path
Example 6: Reduce Corners in STA, Opt!
• Want benefits of STA at N corners, using just M << N corners
  • “Missing Corner Prediction” (“matrix completion”) saves runtime, licenses
  • Avoids optimistic timing that is caught at detailed signoff, causing iteration
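“Missing corner prediction” as matrix completion can be sketched with the simplest possible model. The assumption here (not from the slide) is that a path's delay at a corner is roughly a per-path factor times a per-corner factor, i.e. the path-by-corner matrix is close to rank 1; the numbers are toy values and unanalyzed entries are marked None.

```python
def complete_rank1(matrix, iters=100):
    """Fill None entries of a paths-by-corners matrix with a rank-1 fit
    u[i] * v[j], found by alternating least squares on observed entries.

    Every row and every column must contain at least one observed value.
    """
    rows, cols = len(matrix), len(matrix[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        for j in range(cols):
            obs = [(u[i], matrix[i][j]) for i in range(rows)
                   if matrix[i][j] is not None]
            v[j] = sum(a * x for a, x in obs) / sum(a * a for a, _ in obs)
        for i in range(rows):
            obs = [(v[j], matrix[i][j]) for j in range(cols)
                   if matrix[i][j] is not None]
            u[i] = sum(b * x for b, x in obs) / sum(b * b for b, _ in obs)
    return [[matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
             for j in range(cols)] for i in range(rows)]

# Toy path delays (ps) at three corners; None = corner not analyzed for that path.
delays = [
    [100.0, 120.0, 80.0],
    [200.0, 240.0, None],
    [300.0, None, 240.0],
]
filled = complete_rank1(delays)
print(filled[1][2], filled[2][1])
```

Real corner data is not exactly rank 1, so a practical model would use a higher rank or extra features; the sketch only shows the mechanics of predicting N corners from M analyzed ones.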
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
• Modeling and Prediction
• Analysis Correlation
• Optimization
Example 7: Design Cost Optimization
• Predictive models == Optimization objectives
• Enables schedule, resource optimizations up to enterprise level
[Figure: usage (across three projects) vs. work weeks (20 to 42); stacked activities A1 to A5 for projects (1) to (3); lines mark datacenter capacity and current servers]
• TODAES 2017: Schedule Cost Minimization, Resource Cost
  Minimization ILPs
  • “How do I pack 12 tapeouts into my design center during Q4?”
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
• Modeling and Prediction
• Analysis Correlation
• Optimization
• A Roadmap
Four Stages of ML Insertion in IC Design
1. Mechanization and Automation
2. Orchestration of Search and Optimization
3. Pruning via Predictors and Models
4. Reinforcement Learning and Intelligence
[Figure: huge space of tool, command, option trajectories through design flow]
 1. Mechanization and Automation
• Create “robot IC design engineers”
  • Observe and learn from humans
  • Search for command sequences in design tools
• Multi-Armed Bandit Problem: Given slot machine
  with N arms, maximize reward obtained using T pulls
  • Well-studied in context of Reinforcement Learning
• IC Design: “arm” = target frequency; “pull” = run flow
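The bandit formulation above (“arm” = target frequency, “pull” = run flow) can be sketched with the standard UCB1 rule. The flow itself is replaced by a hypothetical deterministic stand-in, `toy_flow`, whose reward is the normalized frequency when timing is met and 0 otherwise; a real flow is noisy and the sampler on the slide may use a different strategy.

```python
import math

def ucb1(arms, run_flow, budget):
    """Spend `budget` flow runs over candidate target frequencies with UCB1.

    Returns the arm with the best average reward and the pull counts.
    Requires budget >= len(arms) so that every arm is tried once.
    """
    pulls = {a: 0 for a in arms}
    total = {a: 0.0 for a in arms}
    for t in range(1, budget + 1):
        untried = [a for a in arms if pulls[a] == 0]
        if untried:
            arm = untried[0]
        else:
            # Average reward plus an exploration bonus for rarely tried arms.
            arm = max(arms, key=lambda a: total[a] / pulls[a]
                      + math.sqrt(2.0 * math.log(t) / pulls[a]))
        total[arm] += run_flow(arm)
        pulls[arm] += 1
    best = max(arms, key=lambda a: total[a] / pulls[a])
    return best, pulls

def toy_flow(freq_ghz, fmax_ghz=1.0, top_ghz=1.2):
    """Hypothetical stand-in for a P&R run: normalized frequency if the
    target is achievable, else 0 (timing not met)."""
    return freq_ghz / top_ghz if freq_ghz <= fmax_ghz else 0.0

best, pulls = ucb1([0.8, 0.9, 1.0, 1.1, 1.2], toy_flow, budget=60)
print(best, pulls)
```

With a real tool each pull is an expensive run, so the pulls chosen in one round would be launched in parallel rather than one at a time.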
[Figure: sampler loop: constraints, max frequency → SAMPLER → arms to sample, samples per arm → parallel tool runs → tool outcomes (area, power, WNS/TNS) → SAMPLER]
DAC-18 session: “The Road to No-Human-in-the-Loop IC Design” (UCSD, Qualcomm, Synopsys)
 2. Orchestration of Search and Optimization
• How to optimally orchestrate N robot
  engineers?
  • Concurrent search of N flow trajectories
  • Explore, identify good flow options efficiently
  • Constraint: compute and license resources
• Goal: best QOR within resource, risk limits
• Example strategy: “Go with the winners”
  • Launch multiple optimization threads
  • Periodically identify promising thread
  • Clone promising thread and terminate others
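The “go with the winners” loop described above can be sketched in a few lines. The cost function and perturbation are toy stand-ins for a real optimization thread (both hypothetical), and the threads run one after another here rather than concurrently.

```python
import random

def go_with_the_winners(cost, perturb, start, threads=4, rounds=10, steps=20, seed=0):
    """Run several greedy local-search threads; at each checkpoint clone the
    best thread's state into all threads and drop the rest."""
    rng = random.Random(seed)
    population = [start] * threads
    for _ in range(rounds):
        # Each thread runs a short stretch of local search on its own.
        for i in range(threads):
            x = population[i]
            for _ in range(steps):
                y = perturb(x, rng)
                if cost(y) < cost(x):
                    x = y
            population[i] = x
        # Checkpoint: clone the most promising thread, terminate the others.
        winner = min(population, key=cost)
        population = [winner] * threads
    return population[0]

def toy_cost(x):
    """Toy objective with its minimum at x = 3."""
    return (x - 3.0) ** 2

def toy_perturb(x, rng):
    """Toy move: random step of at most 1 in either direction."""
    return x + rng.uniform(-1.0, 1.0)

result = go_with_the_winners(toy_cost, toy_perturb, start=0.0)
print(result, toy_cost(result))
```

In a design flow the “state” would be a checkpointed design database and the clone step a copy of that checkpoint, subject to the compute and license limits listed above.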
Another Example: “Adaptive Multi-Start”
• Optimization cost landscapes often have “big valley”
  structures
  • Best local minima are central to all other local minima
• Adaptive Multi-Start (AMS)
  • Identify promising configurations in current iteration
  • Adaptively choose better start points for next optimization iteration
3. Pruning via Predictors and Models
• Prediction of tool- and design-specific outcomes over
  longer and longer subflows
  • Wiggling of longer and longer ropes
Example 8: Prediction of SRAM Timing Failure
 • Multiphysics effects (IR drop, thermal, etc.) affect
   timing closure
 • Floorplanning with SRAMs is complicated
   • P&R blockages
   • Unpredictable post-P&R timing
 • Goal: Early prediction of post-P&R slack (“doomed
   floorplans”) to save schedule
 • But estimating post-P&R timing at floorplan stage is
   challenging:
   • Wire delay estimate has no spatial embedding information
   • Gate delay estimate has no buffering information
Multiphysics Analysis is Difficult to Predict
• IR drop, thermal, reliability, crosstalk, etc.
• ASP-DAC 2016 (UCSD, Samsung): Can we predict
  “risk map” for embedded memories at floorplan stage?
[Plot: SRAM slack (ps) vs. implementation index]
Floorplan Pathfinding with Machine Learning
• Filter bad floorplans (e.g., embedded memory
  placements, power plans) comprehending
  downstream PD flow
• Model f estimates combined effects of netlist,
  constraints, placement, CTS, routing, optimization,
  STA
[Figure: PD flow: gate netlist, constraints → floorplan, powerplan → placement → clock network synthesis → routing → extraction, timing, verification → signoff, with costly iteration through extraction and timing; the modeling scope spans the flow and outputs slack (w/, w/o IR)]
Modeling Techniques and Flow
[Figure: flow: inputs: parameters from netlist sequential graph; parameters from floorplan context, constraints; ground truth: slack reports from P&R, multiphysics STA → models: LASSO with L1 regularization; SVM with RBF kernel; ANN with 1 input, 2 hidden, 1 output layer; Boosting with SVM as weak learner → combine using weights → save model and exit]
Floorplan Pathfinding Model
• False negatives = 3%
  • Pessimistic predictions → floorplan change that is
    actually not required
• False positives = 4%
  • Model incorrectly deems a floorplan to be good
                  Actual Pass             Actual Fail
Predicted Pass    584                     42 (false positives)
Predicted Fail    31 (false negatives)    384
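The quoted 3% and 4% follow from the confusion matrix if each error count is taken as a share of all floorplans. A short check of that arithmetic (variable names are for illustration only):

```python
# Counts from the confusion matrix on the slide.
pred_pass_actual_pass = 584
pred_pass_actual_fail = 42   # false positives: bad floorplan deemed good
pred_fail_actual_pass = 31   # false negatives: good floorplan deemed bad
pred_fail_actual_fail = 384

total = (pred_pass_actual_pass + pred_pass_actual_fail
         + pred_fail_actual_pass + pred_fail_actual_fail)

false_negative_share = pred_fail_actual_pass / total
false_positive_share = pred_pass_actual_fail / total
accuracy = (pred_pass_actual_pass + pred_fail_actual_fail) / total

print(f"floorplans: {total}")
print(f"false negatives: {false_negative_share:.1%}")
print(f"false positives: {false_positive_share:.1%}")
print(f"accuracy: {accuracy:.1%}")
```

Note that these are shares of all 1041 floorplans, not per-class rates; the per-class false-positive rate (42 out of 426 actual fails) would be higher.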
3. Pruning via Predictors and Models
• Prediction of tool- and design-specific outcomes over
  longer and longer subflows
  • Wiggling of longer and longer ropes
• Prune, terminate → avoid wasted design resources
  • Better outcome within given resource budget
• Implicit: improved predictability and modelability
  of heuristics and tools
4. Reinforcement Learning and “Intelligence”
Many challenges on the road ahead…
• Latency and unpredictability of IC design tools/flows
  • Can’t “play the IC design game” 100M times in 3 days
• “Small data” challenge with a big-data problem
  • Data points are expensive
  • Huge implementation space
  • Tool versions, design versions, technology all changing
    (pictures of cats and trees don’t change)
• Model parameters come from domain experts today
• Open: bridging real (top-secret!) and artificial (fake!)
  • My group: many years of “eye chart” papers
Todo List: “Last Mile” Robots
• Automation of manual DRC violation fixing
  • P&R tools cannot handle latest rule decks, unavoidable
    lack of routing resource in high-utilization block, etc.
• Automation of manual timing closure
  • After routing and optimization, several thousand violations
    of maxtrans, setup, hold constraints exist
  • Engineer fixes 200-300 DRVs by hand, per day
• Placement of memory instances in a P&R block
• Package layout automation
  • How to assess post-routed quality (e.g., bump inductances)
    of SOC floorplan and die-package pin map?
  • Required for: pin map, power delivery optimization
  • Requires: automation/estimation of manual package routing
Todo List: Improving Analysis Correlation
• Prediction of the worst PBA path
• Prediction of the worst PBA slack per endpoint,
  from GBA analysis
• Prediction of timing at “missing corners”
  • Predict other impacts (e.g., transition times, …) of an ECO as
    well
• Closing of multi-physics analysis loops
  • Early priorities: vectorless dynamic IR drop, power-
    temperature loops
• Continued improvement of timing correlation and
  estimation !
  • Faster and better always helpful !
Todo List: Predictive Models of Tools, Designs
• Predict convergence point for P&R, non-uniform PDN
• Estimate PPA response of block to floorplan context
• Estimate useful skew impact on post-route WNS,TNS
• “Auto-magic” determination of netlist constraints for
  given performance and power targets
  • Key opportunity: exactly ONE netlist is passed into place-
    and-route – how to generate this best netlist?
• Predict best “target sequence” of constraints through
  layout optimization phases
• Predict “most-optimizable” cells during design closure
• Predict divergence (detouring, timing/slew violations)
  between trial/global route and final detailed route
• Predict “doomed runs” at all steps of design flow
Todo List: And More…
• Infrastructure for machine learning in IC design
  • Standards for model encapsulation, model application, and
    IP preservation when models are shared
• Standard ML platform for EDA modeling
  • Enablement of design metrics collection, tool/flow model
    generation, design-adaptive tool/flow configuration,
    prediction of tool/flow outcomes
  • This recalls “METRICS” http://vlsicad.ucsd.edu/GSRC/metrics
• Modelable algorithms and tools
  • Smoother, less chaotic outcomes than present methods
• Datasets to support ML
  • Artificial circuits and “eyecharts”
  • Shared training data – e.g., timer correlation, post-route
    DRV prediction, optimal sizing
Agenda
• Crises…
• … and a Vision
• Machine Learning in PD
• Modeling and Prediction
• Analysis Correlation
• Optimization
• A Roadmap
• Conclusion
Conclusion
• Many high-value opportunities for ML in physical
  design
  • Analysis correlation → less margin, improved design QOR,
    faster convergence
  • Predictive modeling of tools/flows and designs → fewer
    loops, less wasted effort, less pessimism, better design
    optimization, better resource management
• Roadmap
  • Robots
  • Orchestration of robots
  • Pruning via predictors and models
  • Intelligence
  + many specific “todos”
• Other facets: enablement, standards, openness,…
• I hope that many of you will join this quest !!!
                THANK YOU !
Support from NSF, Qualcomm, Samsung, NXP, Mentor
Graphics and the C-DEN center is gratefully acknowledged.
(This is “METRICS” !) [ISQED01]
• METRICS (1999; ISQED01): “Measure to Improve”
  • Goal #1: Predict outcome
  • Goal #2: Find sweet spot (field of use) of tool, flow
  • Goal #3: Dial in design-specific tool, flow knobs
            http://vlsicad.ucsd.edu/GSRC/metrics
Patterning and Margins for Wires (“BEOL”)
• Self-aligned multiple patterning + Cutmask
• Make a “sea of wires”
• Make “cuts”
• Cut shapes and locations determine dummy wires and
  end-of-line extensions of wire segments
• Final layout ≠ Target layout
  → Timing and power not the same as originally designed !
  → Need more margin !
[Figure: target layout, 1D wires, cut masks, and final layout, showing cuts, end-of-line extensions, and dummy fill]
 Patterning and Margins for Gates (“FEOL”)
 • Neighbor diffusion effect (NDE)
            • Diffusion step = neighboring diffusion area height
              change
            • Transistor drive strength and leakage prop. to
              horizontal fin spacing
 • 2nd Diffusion Break (DB)
            • Vt shift as a function of spacing to the 2nd diffusion
              break
 • Gate Cut (GC)
            • Idsat shifts as a function of gate-cut distance to DUT
 • Worst corner has to consider NDE + 2nd DB + GC
   → More margin added besides PVT (!)
 [Figure: layout around the DUT showing diffusion, diffusion height, 1st and 2nd diffusion breaks, fins, PC, and gate cut; panel title: Gate Cut (GC) Effect]
Closing Multiphysics Analysis Loops [ASPDAC16]
[Figure: multiphysics analysis flow. Blocks: tech files, signoff criteria, corners; P&R + optimization; AVS; slack; timing/noise; timing/glitches; IR drop map; power analysis; activity factor (static) and sim results (dynamic); functional sim; sim vectors; benchmark RTL; power trace; thermal analysis; temp map; task mapping/migration (DVFS); reliability report; MTTF & aging]
Closing Multiphysics Analysis Loops [ASPDAC16]
[Figure: the same multiphysics analysis flow with four loops highlighted: STA-IR loop, STA-Thermal loop, STA-Reliability loop, Workload-Thermal loop]
BACKUP
Many Operating Conditions (“Corners”)
• Chip must work at many (500+) operating conditions (corners)
• Each corner = another run of the timing tool
• GOAL: Run as few timing corners as possible; predict the rest
                                 Predict the hidden slack values!
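A toy sketch of the "run few corners, predict the rest" idea, using ordinary least squares on synthetic data. Everything here is an assumption for illustration: the slack values are generated, not measured; the two-factor correlation structure, the corner indices, and the choice of linear regression are not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT real timing data): slack of 200 paths at 6 corners.
# Each path's slacks come from two hidden factors plus small noise, so corners
# are correlated (an assumption made purely for this illustration).
n_paths = 200
factors = rng.normal(size=(n_paths, 2))
mixing = np.array([[1.0, 0.5, -0.3, 0.8, -0.6, 0.2],
                   [0.2, -0.7, 0.9, 0.4, 0.5, -1.0]])
slack = factors @ mixing + 0.01 * rng.normal(size=(n_paths, 6))

observed = [0, 1, 2]   # corners actually run in the timer
hidden = [3, 4, 5]     # "missing corners" to predict

train, test = slice(0, 150), slice(150, 200)

def with_bias(x):
    """Append a constant column so the fit includes an intercept."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

# Least-squares fit: hidden-corner slack as a linear function of observed-corner slack.
coef, *_ = np.linalg.lstsq(with_bias(slack[train][:, observed]),
                           slack[train][:, hidden], rcond=None)

pred = with_bias(slack[test][:, observed]) @ coef
max_abs_err = np.abs(pred - slack[test][:, hidden]).max()
print(f"max abs prediction error on held-out paths: {max_abs_err:.3f}")
```

On real designs the relationship between corners is unlikely to be this clean, so the model family and the choice of observed corners would need to be validated per design and technology.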
And a Dream … [predicting dynamic voltage drop]
[Figure: inexpensive static analysis + current map, shown alongside expensive dynamic analyses]
    Some References
Highlighted in the talk from ABKGroup
• [RISKMAP] W.-T. J. Chan, K. Y. Chung, A. B. Kahng, N. D. MacDonald and S. Nath, "Learning-Based Prediction of
  Embedded Memory Timing Failures During Initial Floorplan Design", Proc. ASPDAC, 2016.
• [GT1GT2] S. S. Han, A. B. Kahng, S. Nath and A. Vydyanathan, "A Deep Learning Methodology to Proliferate
  Golden Signoff Timing", Proc. DATE, 2014.
• [GT1GT2] A. B. Kahng, M. Luo and S. Nath, "SI for Free: Machine Learning of Interconnect Coupling Delay and
  Transition Effects", Proc. SLIP, 2015.
• [#ML/ROPT] W.-T. J. Chan, Y. Du, A. B. Kahng, S. Nath and K. Samadi, "BEOL Stack-Aware Routability Prediction
  from Placement Using Data Mining Techniques", Proc. ICCD, 2016.
• [#ML/ROPT] W.-T. J. Chan, P.-H. Ho, A. B. Kahng and P. Saxena, "Routability Optimization for Industrial Designs at
  Sub-14nm Process Nodes Using Machine Learning", Proc. ISPD, 2017.
• [CTS] K. Han, A. B. Kahng, J. Lee, J. Li and S. Nath, "A Global-Local Optimization Framework for Simultaneous
  Multi-Mode Multi-Corner Skew Variation Reduction", Proc. DAC, 2015.
Some other machine learning / data mining papers from ABKGroup
• [3DPE] W.-T. J. Chan, Y. Du, A. B. Kahng, S. Nath and K. Samadi, "3D-IC Benefit Estimation and Implementation
  Guidance from 2D-IC Implementation", Proc. DAC, 2015.
• [HS] A. B. Kahng, C.-H. Park and X. Xu, "Fast Dual-Graph Based Hotspot Detection", Proc. BACUS, 2006.
• [INT] A. B. Kahng, S. Kang, H. Lee, S. Nath and J. Wadhwani, "Learning-Based Approximation of Interconnect Delay
  and Slew in Signoff Timing Tools", Proc. SLIP, 2013.
• [METRICS] S. Fenstermaker, D. George, A. B. Kahng, S. Mantik and B. Thielges, "METRICS: A System Architecture
  for Design Process Optimization", Proc. DAC, 2000.
• [METRICS] A. B. Kahng and S. Mantik, "A System for Automatic Recording and Prediction of Design Quality
  Metrics", Proc. ISQED, 2001.
• [HSM] A. B. Kahng, B. Lin and S. Nath, "Enhanced Metamodeling Techniques for High-Dimensional IC Design
  Estimation Problems", Proc. DATE, 2013, pp. 1861-1866.
• [HHSM] A. B. Kahng, B. Lin and S. Nath, "High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis
  Outcomes", Proc. SLIP, 2013.
• [METRICS] GSRC/METRICS: http://vlsicad.ucsd.edu/GSRC/metrics/
See also: Center for Design-Enabled Nanofabrication, http://cden.ucsd.edu
Cycles of Margin Implications [ISQED08]
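[Figure: margin scale from Param_best (-100%) through 0% to Param_worst (100%): 50% decrease of margin? Or 100% increase?]
[Figure: cycle of implications: Delays, Optimization Challenge, Driver Sizes, Area (A), Wirelengths, Defects, Cost]
• Yield: $Y_r = e^{-Ad}$ (d: defect density)
• Dies per wafer: $N_{dies} = \frac{\pi r^2}{A} - \frac{2\pi r}{\sqrt{2A}}$ (r: wafer radius)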
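A small sketch of how die area feeds through to good dies per wafer, reading the slide's two expressions as the Poisson yield model Y_r = exp(-A·d) and the common gross-die estimate N_dies = πr²/A − 2πr/√(2A). The wafer radius, defect density, and die areas below are hypothetical numbers chosen only for illustration.

```python
import math

def poisson_yield(area_cm2, defect_density_per_cm2):
    """Y_r = exp(-A * d): fraction of dies with no killer defect."""
    return math.exp(-area_cm2 * defect_density_per_cm2)

def dies_per_wafer(area_cm2, wafer_radius_cm):
    """N_dies = pi*r^2/A - 2*pi*r/sqrt(2*A): gross dies, with an edge-loss term."""
    r, a = wafer_radius_cm, area_cm2
    return math.pi * r**2 / a - 2 * math.pi * r / math.sqrt(2 * a)

# Hypothetical inputs: 300 mm wafer (r = 15 cm), d = 0.1 defects/cm^2.
r = 15.0
d = 0.1
for area in (1.0, 0.87):   # 0.87 cm^2 = a die that is 13% smaller
    good = poisson_yield(area, d) * dies_per_wafer(area, r)
    print(f"A = {area:.2f} cm^2 -> good dies per wafer ~ {good:.0f}")
```

A smaller die helps twice in this model: more dies fit on the wafer, and each die is less likely to contain a defect.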
Benefits from Margin Reduction at 45nm
[Figure: experiments with industry chip implementation flow. Inputs: technology (90nm, 65nm, 45nm), cell library margin reduction, RC margin reduction, RTL design (AES, JPEG, SOC1). Flow: synthesis, placement, clock tree synthesis, routing, analyze outcomes (area, wirelength, runtime, #violations, yield)]
• 40% margin reduction
  • Area: 13% reduction
  • Dynamic power: 13% reduction
  • Leakage power: 19% reduction
  • Wirelength: 12% reduction
  • Tool runtime (S,P&R): 28% reduction
  • #Timing viols.: 100% reduction
    → saves iterations and schedule
  • #Good dies per wafer (w/o process
    enhancement): 4% increase
• More margin = more cost
• Less margin = less cost
• Cost reduction → must cure
  unpredictability of design tools
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects
• What’s Left for the Future?
“More Than Moore”: 2.5D/3D Integration
• Conventional path
  • 2.5D interposer-based: interconnect, micro bump, TSV, C4 bump
  • 2.5D MOCHI (Marvell): SoC vs. “virtual” SoC
  • 3D TSV-based: Tier1, Tier2, Tier3 connected by TSVs
    (source: Three Dimensional System Integration, Springer, 2011)
  • 3D bonding-based: D2W / D2D
• Futures
  • 3D monolithic integration: sequential build-up (source: LETI)
  • 3D transfer printing: stamp grabs objects off of donor substrate and
    prints objects onto receiver substrate
    (source: Nature Materials 5, 33 - 38 (2006))
New (“Rebooting Computing”) Paradigms
• Approximate Computing
  • E.g., cut carry chain in adder to trade off throughput, accuracy
• Stochastic Computing
  • Represent numbers by pseudo-random bitstreams
  • More tolerant of delay-induced errors than parallel number
    representation
  • Example: Z = X1 × X2, e.g., 3/8 = 4/8 × 6/8
• Neuromorphic Computing …
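The stochastic-computing product above (3/8 = 4/8 × 6/8) can be reproduced with a bitwise AND of two bitstreams. A minimal sketch with hand-picked 8-bit streams; the specific bit patterns are an assumption for illustration, and the product is exact only because these two streams happen to be uncorrelated.

```python
# Each number in [0, 1] is encoded as the fraction of 1s in a bitstream.
x1 = [1, 0, 1, 0, 1, 0, 1, 0]   # four 1s out of 8 -> 4/8
x2 = [1, 1, 1, 0, 1, 1, 0, 1]   # six 1s out of 8  -> 6/8

# Multiplication is a single AND gate applied bit by bit.
z = [a & b for a, b in zip(x1, x2)]

def value(bits):
    """Decode a bitstream to the number it represents."""
    return sum(bits) / len(bits)

print(value(x1), value(x2), value(z))   # 0.5 0.75 0.375
```

A single flipped bit changes the decoded value by only 1/8 here, which is the error-tolerance property the slide refers to.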
BUT: Even If We Had Infinite Dimensions...
• Idea: Infinite dimension gives us a bound on 3DIC benefits
• Infinite dimension: netlist optimization with zero wire
  parasitics
• Gap between infinite dimension and 2D → maximum power
  benefit from 3DIC = 36% for CORTEX M0, 20% for AES
[Figure: power (mW) vs. clock period (ns) for CORTEX M0 (0.75 to 1.15 ns) and AES (0.55 to 0.95 ns); curves: Pseudo1D, 2D, 3D (2 tier), 3D (3 tier), 3D (4 tier), infD; annotated gaps: 36% (CORTEX M0), 20% (AES)]
BUT: Even If Frequency Didn’t Matter At All…
• Up to ~65% area difference (usually ~30%) between
  minimum clock period constraint (2.08GHz) and
  relaxed clock period constraint (28FDSOI, AES)
[Figure: Area vs. Target Frequency, AES cipher in 28FDSOI. x-axis: target frequency (GHz), 0 to 2.5; y-axis: post-route area (um2), 0 to 18000; annotations: “65%”, “Timing Fail”]
BUT: Even If Wires Were Perfect (No R, C) ...
[Figure: Path Delays (JPEG Encoder). x-axis: path index, 1 to 5001; y-axis: delay (ns), 0 to 3; curves: path delay (with wires), path delay (without wires); annotations: “Min. cycle time = 2.8”, “Min. cycle time = 2.25”]
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects
• What’s Left for the Future?
• The Last Semiconductor Scaling Levers
Takeaways
• Quality, Schedule, Cost are “the last levers for
  semiconductor scaling”
  • Accessibility of hardware / semiconductor design
  • Continue semiconductor value trajectory (for a while longer)
• Foundation #1: machine learning in, around EDA
  • Pervasive ML → drive down iterations, margins
  • Cloud-targeted, large-scale optimizations → drive down TAT
• Foundation #2: open-source EDA
  • Will a “Linux of EDA” be possible this time around?
• Foundation #3: partitioning and cloud EDA
  • Also part of schedule reduction
• Design Capability Gap is a crisis for the industry
  • Need all hands on deck!
  Quality, Schedule, and Cost:
Design Technology and the Last
 Semiconductor Scaling Levers
           Andrew B. Kahng
       CSE and ECE Departments
            UC San Diego
        http://vlsicad.ucsd.edu
Agenda
• Scaling, Moore’s Law, and Crises
What is “Scaling”?
• ITRS = International Technology Roadmap for
  Semiconductors (http://www.itrs2.net/)
• Key metric of (density) progress: half-pitch (F)
• Contacted Poly pitch scales by 0.7
• Mx pitch scales by 0.7
  → 0.7 x 0.7 = 0.49 → density doubles
    at each “technology node”
“Moore’s Law” = Scaling of Cost and Value
• Moore, 1965: “The complexity for minimum component costs
  has increased at a rate of roughly a factor of two per year”
  → Min cost per transistor
• Moore’s Law is a law of cost reduction (1% = 1 week)
• Proxy for cost reduction: “scaling of value”
• Proxies for value: “bits”, “hertz”, “density” (= utility, integration)
Today: Bigger Stacks of Margin (“Corners”)
• Design margin = stacks of layers of conservatism
[Figure: margin stacks for Process, Voltage, Temperature, and Reliability. Labels: performance PDF, nominal Vdd, static IR drop, power grid IR gradient, dynamic IR, HCI/NBTI, signoff. Source: Wu 08]
Corner Explosion Worsens
Corners = Process x RCX x Temperature x Voltage x ...
  Process:      FF, FFG, FS, SF, TT, SSG, SS, …
  RCX:          C-worst, Cc-worst, C-best, Cc-best, RC-worst, RC-best, …
  Temperature:  -40°C, 0°C, 80°C, 125°C, …
  Voltage:      0.7V, 0.8V, 0.9V, 1.0V, 1.1V, …
• Each corner is a new “objective function” and a new set of
  constraints!
• Lose design turnaround time (TAT) == schedule
  • Non-convergence, “ping-ponging” in timing closure
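The corner count is the size of a Cartesian product, which is why it grows so quickly. A short sketch that enumerates only the values named on this slide; the slide's lists end in "…", so the real count is larger than what is computed here.

```python
from itertools import product

# Only the values listed on the slide; each list is truncated there with "…".
process = ["FF", "FFG", "FS", "SF", "TT", "SSG", "SS"]
rcx = ["C-worst", "Cc-worst", "C-best", "Cc-best", "RC-worst", "RC-best"]
temperature_c = [-40, 0, 80, 125]
voltage_v = [0.7, 0.8, 0.9, 1.0, 1.1]

# Every combination is a separate corner, i.e., a separate timing run.
corners = list(product(process, rcx, temperature_c, voltage_v))

print(len(corners))   # 7 * 6 * 4 * 5 = 840
print(corners[0])     # ('FF', 'C-worst', -40, 0.7)
```

Adding one more value to any axis multiplies the total, rather than adding to it.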
Consequences
•   Diminishing ROI from next node
•   Typical: Moore’s Law-ish scaling
•   Worst-case: Scales, but worse return on investment
•   Signoff with excessive margin: gain is wiped out
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects…
 • Difficult and costly, with limits ahead !
Scaling Will Continue (!)
• Lateral scaling in semiconductor manufacturing and
  device architecture is still predicted to occur
  • Extremely challenging after 5nm/3nm node (i.e., N5/N3)
  • Monolithic 3D will drive scaling afterwards
• Beyond this roadmap, new scaling levers are needed
                         Source: IRDS
Lateral (Area) Scaling: MOL and Tracks (1)
• Old technology node layer stack
  • OD / Poly – V0 – M1 – V1 – M2
[Figure: inverter (old) layouts with VDD/VSS rails and pins A, Z on M1; layer stack cross-section: BEOL (M2, V1, M1, V0), MOL (Mint, Vint, M0G, M0A), Poly, Fin]
• Advanced node layer stack
  • OD – M0A – VINT – MINT – V0 – M1 – V1 – M2
  • Poly – M0G – VINT – MINT – V0 – M1 – V1 – M2
Lateral (Area) Scaling: MOL and Tracks (2)
• N10/N7/N5 technology nodes
  Cells:                 12T | 9T | 7.5T | 6T | 5T/4T/3T
  Pins:                  M1 | M1 | MINT/M1
  M1:                    Bidirectional | Unidirectional
  MOL:                   N/A | Yes: MINT/M0 below M1
  VDD/VSS:               M1 | M2 | M1/MINT | Buried/backside P/G
  # M2 routing tracks:   ~9 | ~6 | 5 | 6 | 5/4/3
  (Rows with fewer entries span several cell-height columns in the original table.)
[Figure: inverter layouts for 7.5T, 6T, and 5T cells; layer legend: M0G, M0A, MINT, M1, M2, Buried]
Area Scaling Teardown (CPP x MP)
• 0.5x target area scaling to
  continue Moore’s Law
• Combines Contacted Poly Pitch
  (CPP) scaling and Metal Pitch
  (MP) scaling
• → Need new design technology
  and device technologies
[Figure: Gate-Contact Congestion; 0.5x area scaling = CPP scaling x metal pitch scaling]
   [source] M. Badaroglu, “More Moore scaling: opportunities and inflection points”
Scaling is Doable, but ...
... it’s getting tough 
Machine Learning Gives Us Scaling !
• High-value opportunities in and around EDA
• Modeling and Prediction
  • Predict tool outcome = F(design, constraints, tool config)
    • How to run tool “optimally” for given design and design goals?
    • Avoid “failed runs” → reduce iterations in design flow
    • Dream: one-pass design flow
  • Model analysis errors (crude vs. golden analyses)
    • Reduced guardbands and pessimism → better design quality
• Optimization (ML models = objective functions!)
  • Better use of resources (tools, schedule, engineers) + better tools
  • Project-level prediction, adaptive scheduling
• Today: the major focus for IC industry
  • U.S. DARPA IDEA program: automation, schedule
  • 24-hour TAT, “no-human-in-the-loop”
 What About … “No Human In The Loop”?
• Multi-Armed Bandit Problem: given a slot machine
  with N arms, maximize the total reward obtained using T
  pulls (iterations)
  • Well-studied in context of Reinforcement Learning
• IC Design: “arm” = target frequency; “pull” = run of flow
  • UCSD scripts available upon request
[Figure: sampling loop. Inputs: constraints, max frequency. The SAMPLER selects arms to sample and samples per arm, dispatches parallel tool runs, and receives tool outcomes (area, power, WNS/TNS)]
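The "arm = target frequency, pull = run of flow" mapping can be sketched as a toy Thompson sampling bandit. The slide does not say which bandit algorithm the UCSD scripts use, so Thompson sampling is an assumption here. `run_flow`, its success model, and the arm values are hypothetical stand-ins for real tool runs, and pulls are made one at a time rather than in parallel.

```python
import random

def run_flow(target_ghz, rng):
    """Hypothetical stand-in for one P&R run: returns True if timing is met.
    The success probability is an invented toy model, not measured tool behavior."""
    p_success = min(1.0, max(0.0, (2.2 - target_ghz) / 0.8))
    return rng.random() < p_success

def thompson_sampling(arms_ghz, pulls, seed=0):
    rng = random.Random(seed)
    wins = [0] * len(arms_ghz)     # runs that met timing, per arm
    losses = [0] * len(arms_ghz)   # runs that failed timing, per arm
    for _ in range(pulls):
        # Sample a plausible success rate for each arm from its Beta posterior,
        # then pull the arm with the best sampled expected frequency.
        scores = [f * rng.betavariate(w + 1, l + 1)
                  for f, w, l in zip(arms_ghz, wins, losses)]
        i = scores.index(max(scores))
        if run_flow(arms_ghz[i], rng):
            wins[i] += 1
        else:
            losses[i] += 1
    return wins, losses

arms = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2]   # candidate target frequencies (GHz)
wins, losses = thompson_sampling(arms, pulls=200)
for f, w, l in zip(arms, wins, losses):
    print(f"{f:.1f} GHz: {w + l:3d} pulls, {w:3d} met timing")
```

The reward here is the target frequency when timing is met and zero otherwise, so the sampler is drawn toward the highest frequency that still closes reliably instead of spending the run budget evenly across all arms.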
Same Quality in Less Time = Scaling
[Figure: IC quality (%) vs. design time (%). Points: “Current” at (100, 100) and (25, 100); annotations #1, #2, #3]
#1. tool/flow models; design-adaptive, learning-based, one-pass flows
#2. analysis correlation, prediction; reduced margins/corners; correct by construction
#3. cloud-based design to recover global optimization; SP&R improvements
  Machine Learning (Data + Intelligence) is essential for this
                                            A. B. Kahng DARPA IDEA workshop 170413
                                                                [ISQED01]
(This is “METRICS” !)
• METRICS (1999; ISQED01): “Measure to Improve”
  • Goal #1: Predict outcome
  • Goal #2: Find sweet spot (field of use) of tool, flow
  • Goal #3: Dial in design-specific tool, flow knobs
            http://vlsicad.ucsd.edu/GSRC/metrics
A Future Ecosystem
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects
• What’s Left for the Future?
• The Last Semiconductor Scaling Levers
• Going Forward: Foundation #1 = ML in/around EDA
• Going Forward: Foundation #2
Attacking the Design Capability Gap
• Not enough R&D attention on EDA challenges
  • ~10,000 worldwide EDA, internal CAD, academic research
    headcount
• Long latency of technology transfer
  • Latest CAD research technologies unavailable to chip
    designers
  • 5-7 years from ASP-DAC proceedings to production IC
    design flow
• Opportunity for another form of “scaling”
Is It Time for “Linux of EDA”?
• Free open-source software (FOSS) has sparked
  rapid innovation in many fields
  • Common standards, platforms avoid wasted energy
  • Recent U.S. DARPA “IDEA” program solicitation: IC
    design that is “no human in the loop” and “24-hour TAT”
• Older efforts
  • MARCO GSRC Bookshelf
  • Berkeley tools (SPICE, MIS/SIS/ABC, …)
  • UCLA/UCSD/UM tools (Capo, MLPart, …)
  • OpenAccess and OAGears
• Many recent efforts worldwide
  • OpenTimer, Yosys, RSyn, Ophidian, Open Design Flow,
    CloudV.io, …
  • Will “critical mass” be possible this time around?
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects
• What’s Left for the Future?
• The Last Semiconductor Scaling Levers
• Going Forward: Foundation #1 = ML in/around EDA
• Going Forward: Foundation #2 = “Linux of EDA”
• Going Forward: Foundation #3 = partitioning, cloud
• Takeaways
Multiphysics Analysis is Difficult to Predict
• IR drop, thermal, reliability, crosstalk, etc.
• Example: Can we predict a “risk map” for embedded
  memories at the floorplan stage?
[Figure: SRAM slack (ps) on the floorplan; labeled values 25ps and 29ps; labeled instances SRAM #1 and SRAM #5]
Key Challenge: Global-Detailed Route Correlation
 • 7nm P&R: global route (GR) congestion map does not
   correlate well with post-route (actual) DRC violations
 • Many false-positive overflows in GR congestion map
 • False positives do not correspond to actual DRC violations
[Figure: panels “GR Overflows” and “Actual DRC”]
GR-based prediction can mislead routability optimizations!!!
If We Know DRC Hotspots before Routing…
• Conventional way to close designs
  • Iteratively fix design before signoff
  • Go back to placement if QOR is hopeless
  • Turnaround time is VERY challenging (7-day P&R runs…)
• Can we do better with accurate prediction?
[Figure: design flow. Inputs: technology, design rules, constraints, RTL design. Steps: synthesis, placement, G/D routing, analyze QOR (area, wirelength, timing, #DRCs, yield). Iteration with space padding, NDR modifications, density screens ...]
Layout Study
• Initially predict with GR overflows and cell/pin density map
• Red DRC hotspot is likely to be rejected due to low cell/pin density
• Larger windows and buried-net metrics to guide prediction
[Figure legend: standard cells; route-DRC; false-negative; extraction windows; non-buried net. Panels: sparse pins/cells; dense pins/cells]
   DRV Prediction with Machine Learning
    • Predictor is used to guide routability optimization
    • SVM with weighting to compensate for biased training data
    [Figure: learning flow. Random 20% of gcells for training; remaining 80% of gcells for testing. Inputs: cell density, pin density, GR resources, pin proximity, cell connectivity, net spreading, …, plus route-DRVs for training. Blocks: parameters, learning model. Output: prediction of route-DRVs.]
    True positive rate = tp / t
    False positive rate = fp / n

    Initial linear model
                          Actual W/o DRC    Actual With DRC
    Predicted W/o DRC     98260             350
    Predicted With DRC    481               111
    True positive rate: 24%
    False positive rate: 0.5%

    Non-linear SVM model
                          Actual W/o DRC    Actual With DRC
    Predicted W/o DRC     98571             117
    Predicted With DRC    170               344
    True positive rate: 74%
    False positive rate: 0.2%
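The reported rates can be checked directly from the four counts of each table. The sketch below assumes that rows are predictions and columns are actual outcomes; that reading is not stated on the slide, but it reproduces the reported percentages. The function name `rates` and the keyword names are illustrative only.

```python
def rates(tn, fn, fp, tp):
    """Return (true positive rate, false positive rate) from confusion counts."""
    tpr = tp / (tp + fn)  # tp / t, with t = all gcells that actually have DRVs
    fpr = fp / (fp + tn)  # fp / n, with n = all gcells that actually have none
    return tpr, fpr


# Counts from the two tables, reading rows as predicted and columns as actual
# (an assumption that reproduces the reported rates).
linear_tpr, linear_fpr = rates(tn=98260, fn=350, fp=481, tp=111)
svm_tpr, svm_fpr = rates(tn=98571, fn=117, fp=170, tp=344)

print(f"linear: TPR={linear_tpr:.1%}, FPR={linear_fpr:.2%}")
print(f"SVM:    TPR={svm_tpr:.1%}, FPR={svm_fpr:.2%}")
```

Both models see the same 461 gcells with DRVs among 99202 test gcells, so the gain of the non-linear SVM comes from recovering far more of those 461 (344 vs. 111) while also raising fewer false alarms (170 vs. 481).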
Improved Learning-Based Predictor
• Captures all true-positive clusters
• Maintains low false-positive rate
[Figure: panels (a), (b), (c); panel titles “Learning-based Prediction” and “Actual DRC”]
Machine Learning Gives Us Scaling !
• High-value opportunities in and around EDA
• Modeling and Prediction
  • Predict tool outcome = F(design, constraints, tool config)
   • How to run tool “optimally” for given design and design goals?
   • Avoid “failed runs” → reduce iterations in design flow
   • Dream: one-pass design flow
  • Model analysis errors (crude vs. golden analyses)
   • Reduced guardbands and pessimism → better design quality
• Optimization (ML models = objective functions!)
  • Better use of resources (tools, schedule, engineers) + better tools
  • Project-level prediction, adaptive scheduling (=separate talk)
• Today: the major focus for IC industry
  • U.S. DARPA IDEA program: automation, schedule
Agenda
• Crises…
• … and a Vision
• Machine Learning
PREDICTION
Agenda
• Scaling, Moore’s Law and Crises
• Scaling Prospects
• What’s Left for the Future?
• The Last Semiconductor Scaling Levers
• Going Forward: Foundation #1
Savings due to MDP
Testing (Total = 3442 logs)

Stopping rule          Number of runs that    Number of runs stopped    Average number of
(N = 200)              need to be stopped     correctly out of these    iterations saved
1 STOP                 398                    394                       18.9644
2 consecutive STOPs    398                    391                       17.9309
3 consecutive STOPs    398                    380                       16.9736

Test data = M0 runs
For one run, #iterations saved = 20 - (iteration number where MDP says STOP)
Average #iterations saved = Sum(#iterations saved) / 398
In almost every one of these 398 cases, the run starts with a huge number of
violations, and the MDP stops it almost immediately. Hence the large average
#iterations saved.
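The savings arithmetic above can be sketched as follows. This is a minimal illustration, assuming each run is represented as a list of per-iteration "STOP"/"GO" predictions (a hypothetical encoding) and that a run is halted at the iteration where the k-th consecutive STOP is seen. The function names and the two example runs are made up.

```python
MAX_ITERS = 20  # iteration budget per run, from the formula on the slide


def stop_iteration(decisions, k):
    """Return the 1-based iteration at which k consecutive STOPs are first
    seen, or None if that never happens."""
    streak = 0
    for it, decision in enumerate(decisions, start=1):
        streak = streak + 1 if decision == "STOP" else 0
        if streak >= k:
            return it
    return None


def iterations_saved(decisions, k, max_iters=MAX_ITERS):
    """#iterations saved = max_iters - (iteration where the run is stopped)."""
    it = stop_iteration(decisions, k)
    return 0 if it is None else max_iters - it


def average_saved(runs, k):
    """Average #iterations saved over the runs that need to be stopped."""
    return sum(iterations_saved(r, k) for r in runs) / len(runs)


# Two made-up doomed runs: one flagged from iteration 1, one from iteration 2.
runs = [["STOP"] * 20, ["GO", "STOP", "STOP", "STOP"] + ["GO"] * 16]
avg_k1 = average_saved(runs, k=1)
avg_k3 = average_saved(runs, k=3)
```

Requiring more consecutive STOPs delays the stop by roughly one iteration per extra STOP, which is consistent with the reported averages dropping by about one per row.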
 Doomed Runs – Updated Error Criteria
• Prediction is wrong if:
   • DR ends with fewer than N violations and we predict STOP at 3 consecutive
     iterations (less stringent), where N is the number of violations that a human
     designer finds hard to resolve (usually N ~ 100-200)
   • DR ends with more than N violations and we predict GO at each iteration (already
     relaxed, but the predictor does not have information about N)
• Training data: 1200 logfiles from PROBE experiments
• Testing data: 3745 logfiles from ARM Cortex M0 floorplan experiments
Errors (N = 200)       Training (Total = 1200)              Testing (Total = 3442)
                       Total error   TYPE 1   TYPE 2        Total error   TYPE 1   TYPE 2
1 STOP                 29.66%        251      99            35.2%         1317     3
2 consecutive STOPs    10.5%         27       99            8.3%          307      3
3 consecutive STOPs    8.5%          3        99            4.2%          154      3
TYPE 1 = #errors wrongly predicted to STOP; TYPE 2 = #errors with no STOP
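The two error types defined above can be written as a small classifier. This is a sketch under stated assumptions: each run is a list of per-iteration "STOP"/"GO" predictions plus its final violation count, and all names (`classify`, `has_consecutive_stops`, `error_rate`) are hypothetical.

```python
def has_consecutive_stops(decisions, k):
    """True if the predictor says STOP at k consecutive iterations."""
    streak = 0
    for decision in decisions:
        streak = streak + 1 if decision == "STOP" else 0
        if streak >= k:
            return True
    return False


def classify(final_violations, decisions, n_threshold=200, k=3):
    """Return "TYPE 1", "TYPE 2" or "OK" for one run."""
    if final_violations < n_threshold and has_consecutive_stops(decisions, k):
        return "TYPE 1"  # run would have ended below N, but we said STOP
    if final_violations > n_threshold and "STOP" not in decisions:
        return "TYPE 2"  # run ended above N, but we said GO at every iteration
    # Everything else counts as correct here, including the case
    # final_violations == n_threshold, which the slide leaves unspecified.
    return "OK"


def error_rate(type1, type2, total):
    """Total error = (TYPE 1 + TYPE 2) / number of logfiles."""
    return (type1 + type2) / total
```

With the training counts for the 2-consecutive-STOPs row (27 TYPE 1, 99 TYPE 2, 1200 logfiles), `error_rate` gives 10.5%, matching the table.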
Example Early Result
• Early model with MARS (multivariate adaptive regression
  splines): 90% of predicted PBA slacks within 5ps
• Testcase: netcard, 28nm FDSOI
[Figure: histogram of # endpoints (testing), 0 to 9000, vs. error (ps) = actual - predicted PBA slack, -30 to 30]
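The “within 5ps” figure of merit can be computed as shown below. This is only a sketch of the metric, not of the MARS model itself; `fraction_within` is a hypothetical helper and the slack values are made up for illustration.

```python
def fraction_within(actual_ps, predicted_ps, tol_ps=5.0):
    """Fraction of endpoints whose error (actual - predicted slack)
    has magnitude at most tol_ps picoseconds."""
    if len(actual_ps) != len(predicted_ps) or not actual_ps:
        raise ValueError("need two equal-length, non-empty slack lists")
    errors = [a - p for a, p in zip(actual_ps, predicted_ps)]
    return sum(abs(e) <= tol_ps for e in errors) / len(errors)


# Made-up PBA slacks (ps) for five endpoints; not data from the testcase.
actual = [12.0, -3.0, 40.0, 7.5, 0.0]
predicted = [10.0, -9.0, 38.5, 7.0, 4.0]
share = fraction_within(actual, predicted)
```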
Takeaways
• Quality, Schedule, Cost are “the last levers for
  semiconductor scaling”
  • Accessibility of hardware / semiconductor design
  • Continue semiconductor value trajectory (for a while longer)
• Foundation #1: machine learning in, around EDA
  • Pervasive ML → drive down iterations, margins
  • Cloud-targeted, large-scale optimizations → drive down TAT
• Foundation #2: open-source EDA
  • Will a “Linux of EDA” be possible this time around?
• Foundation #3: partitioning and cloud EDA
  • Also part of schedule reduction
• Design Capability Gap is a crisis for the industry
  • Need all hands on deck!
Conclusions and Futures (2)
• ML+EDA: challenges of technology
  • “Small data” problem alongside “big data” problem
  • Huge implementation space, difficult parameter identification
  • Complicated by tool versions, design versions, technology
    changes (pictures of cats and trees don’t change every
    year)
  • Possibly helpful: EDA folks know what’s in their tools!
• ML in EDA: industry challenges
  • EDA {doesn’t like to, doesn’t know how to} model itself
  • Dependence on customers and customer data to understand
    what is needed
  • Open: Will customers or EDA vendors (or foundries)
    drive ML into design enablements and production flows?
• METRICS … revisited? (measure, record, model, predict, improve)