ADCX V18A SG - Vol.1

Juniper Networks
Education Services
Engineering Simplicity

Student Guide
Volume 1 of 2
Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.

YEAR 2000 NOTICE
Juniper Networks hardware and software products do not suffer from Year 2000 problems and hence are Year 2000 compliant. The Junos operating system has no known time-related limitations through the year 2038. However, the NTP application is known to have some difficulty in the year 2036.

SOFTWARE LICENSE
The terms and conditions for using Juniper Networks software are described in the software license provided with the software, or to the extent applicable, in an agreement executed between you and Juniper Networks, or Juniper Networks agent. By using Juniper Networks software, you indicate that you understand and agree to be bound by its license terms and conditions. Generally speaking, the software license restricts the manner in which you are permitted to use the Juniper Networks software, may contain prohibitions against certain uses, and may state conditions under which the license is automatically terminated. You should consult the software license for further details.
Course Overview
This five-day course is designed to provide in-depth instruction on IP fabric and Ethernet VPN Controlled Virtual Extensible LAN (EVPN-VXLAN) data center design and configuration. Additionally, the course will cover other data center concepts, including basic and advanced data center design options, Data Center Interconnect (DCI), EVPN multicast enhancements, and an introduction to data center automation concepts. The course ends with a multi-site data center design lab. This content is based on Junos OS Release 17.4R1 and 18.2R1-S3.
Course Level
Data Center Fabric with EVPN and VXLAN (ADCX) is an advanced-level course.

Intended Audience
The primary audiences for this course are the following:

Prerequisites
The following are the prerequisites for this course:
• Advanced routing knowledge: the Advanced Junos Enterprise Routing (AJER) course or equivalent knowledge;
• Intermediate switching knowledge: the Junos Enterprise Switching Using Enhanced Layer 2 Software (JEX) course or equivalent knowledge; and
Objectives
After successfully completing this course, you should be able to:

Day 1
Chapter 1: Course Introduction
Chapter 3: IP Fabric
Lab 1: IP Fabric

Day 2
Chapter 5: EVPN Controlled VXLAN
Lab 2: EVPN-VXLAN

Day 3
Chapter 7: Basic Data Center Architectures

Day 4
Lab 4: Data Center Interconnect

Day 5
Chapter 12: Comprehensive Lab
Franklin Gothic: Normal text. Example: Most of what you read in the Lab Guide and Student Guide.
CLI Input: Text that you must enter. Example: lab@SanJose> show route
GUI Input: Text that you must enter. Example: Select File > Save, and type config.ini in the Filename field.
CLI Undefined: Text where the variable's value is at the user's discretion, or where the variable's value as shown in the lab guide might differ from the value the user must input according to the lab topology. Examples: Type set policy policy-name. / ping 10.0.x.y
GUI Undefined: Same as CLI Undefined. Example: Select File > Save, and type filename in the Filename field.
Technical Publications
You can print technical manuals and release notes directly from the Internet in a variety of formats:
Chapter 1: Course Introduction
Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            Objectives and course content information;
Introductions
Introductions
The slide asks several questions for you to answer during class introductions.
Prerequisites
Prerequisites
The slide lists the prerequisites for this course.
Course Contents (1 of 2)
■ Contents:
  • Chapter 1: Course Introduction
  • Chapter 2: Data Center Fundamentals Overview
  • Chapter 3: IP Fabrics
  • Chapter 4: VXLAN Fundamentals
  • Chapter 5: EVPN Controlled VXLAN
  • Chapter 6: Configuring EVPN Controlled VXLAN
  • Chapter 7: Basic Data Center Architectures
  • Chapter 8: Data Center Interconnect
  • Chapter 9: Advanced Data Center Architectures
  • Chapter 10: EVPN Multicast
  • Chapter 11: Introduction to Multicloud Data Center
Course Contents (2 of 2)
■ Contents (contd.):
  • Chapter 12: Comprehensive Lab
  • Appendix A: Virtual Chassis Fabric
  • Appendix B: Virtual Chassis Fabric Management
  • Appendix C: Junos Fusion Data Center
  • Appendix D: Multi-Chassis LAG
  • Appendix E: Troubleshooting MC-LAG
  • Appendix F: Zero Touch Provisioning
  • Appendix G: In-Service Software Upgrade
  • Appendix H: Troubleshooting Basics
  • Appendix I: Data Center Devices
Course Administration
       ■    The basics:
               • Sign-in sheet
               • Schedule
                           • Class times
                           • Breaks
                           • Lunch
               • Break and restroom facilities
               • Fire and safety procedures
               • Communications
                           • Telephones and wireless devices
                           • Internet access
      Education Materials
      • Available materials for classroom-based
        and instructor-led online classes:
             • Lecture material
             • Lab guide
             • Lab equipment
      • Self-paced online courses also available
             • http://www.juniper.net/courses
Additional Resources
Additional Resources
The slide provides links to additional resources available to assist you in the installation, configuration, and operation of Juniper Networks products.
Satisfaction Feedback
Satisfaction Feedback
Juniper Networks uses an electronic survey system to collect and analyze your comments and feedback. Depending on the class you are taking, please complete the survey at the end of the class, or be sure to look for an e-mail about two weeks from class completion that directs you to complete an online survey form. (Be sure to provide us with your current e-mail address.)
Submitting your feedback entitles you to a certificate of class completion. We thank you in advance for taking the time to help us improve our educational offerings.
Course
(Diagram: certification tracks for Service Provider Routing & Switching, Enterprise Routing & Switching, Data Center, and Junos Security, showing course progression from Networking Fundamentals and Junos Intermediate Routing (JIR) through AJSPR/JL3V, AJEX/AJER, ADCX, and AJSEC toward the JNCIP and JNCIE certifications, including the JNCIE bootcamps and self-study bundles)
Notes: Information current as of March 2019. Course and exam information (length, availability, content, etc.) is subject to change; refer to www.juniper.net/training for the most current information.
Courses
Juniper Networks courses are available in the following formats:
• Learning bytes: Short, topic-specific, video-based lessons covering Juniper products and technologies.
Find the latest Education Services offerings covering a wide range of platforms at www.juniper.net/training.
Education Services Certification Program
(Diagram: JNCP certification tracks: Service Provider Routing and Switching, Enterprise Routing and Switching, Data Center, Junos Security, Cloud, Automation and DevOps, and Data Center/Service Provider/Security Design, across the Associate, Specialist, Professional, and Expert levels, including JNCIA-Cloud, JNCIS-Cloud, and the planned JNCIP-ENT-Cloud and JNCIP-SP-Cloud certifications)
Information as of March 2019. *In planning. Refer to www.juniper.net/certification for the most current information.
Each JNCP track has one to four certification levels: Associate-level, Specialist-level, Professional-level, and Expert-level. The Associate-level, Specialist-level, and Professional-level exams are computer-based exams composed of multiple choice questions administered at Pearson VUE testing centers worldwide.
Expert-level exams are composed of hands-on lab exercises administered at select Juniper Networks testing centers. Please visit the JNCP website at http://www.juniper.net/certification for detailed exam information, exam pricing, and exam registration.
Certification Preparation
• Training and study resources:
  • Juniper Networks Certification Program website: www.juniper.net/certification
  • Education Services training classes: www.juniper.net/training
  • Juniper Networks documentation and white papers: www.juniper.net/documentation
• Community:
  • J-Net: http://forums.juniper.net/t5/Training-Certification-and/bd-p/Training_and_Certification
  • Twitter: @JuniperCertify
Junos Genius
• Purchase on-demand training courses
• View assets offline (app only)
(Screenshot: the Junos Genius application)
Junos Genius
The Junos Genius mobile learning platform (www.junosgenius.net) helps you learn Juniper technologies and prepare for Juniper certification exams on your schedule. An app for iOS and Android devices, along with laptops and desktops, Junos Genius provides certification preparation resources, practice exams, and e-learning courses developed by experts in Juniper technology. Courses cover automation, routing, switching, security, and more.
Find Us Online
• J-Net: http://www.juniper.net/jnet
• Facebook: http://www.juniper.net/facebook
• YouTube: http://www.juniper.net/youtube
• Twitter: http://www.juniper.net/twitter
Find Us Online
The slide lists some online resources to learn and share information about Juniper Networks.
Questions
Any Questions?
If you have any questions or concerns about the class you are attending, we suggest that you voice them now so that your instructor can best address your needs during class.
Chapter 2: Data Center Fundamentals Overview
Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            The benefits and challenges of the traditional multitier architecture;
• The networking requirements that are driving a change in data center design;
Hierarchical Design
      ■    Legacy data center networks are often hierarchical and can consist of
           access, aggregation, and core layers
             • Benefits of a hierarchical network design include:
                          • Modularity - facilitates change
                          • Function-to-layer mapping - isolates faults
(Diagram: hierarchical network showing the distribution and access layers)
Multiple Tiers
Legacy data centers are often hierarchical and consist of multiple layers. The diagram in the example illustrates the typical layers, which include access, distribution (sometimes referred to as aggregation), and core. Each of these layers performs unique responsibilities.
Hierarchical networks are designed in a modular fashion. This inherent modularity facilitates change and makes this design option quite scalable. When working with a hierarchical network, the individual elements can be replicated as the network grows. The cost and complexity of network changes are generally confined to a specific portion (or layer) of the network rather than to the entire network.
Because functions are mapped to individual layers, faults relating to a specific function can be isolated to that function's corresponding layer. The ability to isolate faults to a specific layer can greatly simplify troubleshooting efforts.
Functions of Tiers
Functions of Layers
The individual layers usually represent specific functions found within a network. It is often mistakenly thought that the access, distribution (or aggregation), and core layers must exist in clear and distinct physical devices, but this is not a requirement, nor does it make sense in some cases.
The example highlights the access, aggregation, and core layers and provides a brief description of the functions commonly implemented in those layers. If CoS is used in a network, it should be incorporated consistently in all three layers.
• Since using a hierarchical implementation does not require the use of proprietary features or protocols, a multitier topology can be constructed using equipment from multiple vendors.
• A multitier implementation allows flexible placement of a variety of switching platforms. The simplicity of the protocols used does not require specific Junos versions or platform positioning.
• The legacy multitier switching architecture cannot provide today's applications and users with predictable latency and uniform bandwidth. This problem is made worse when virtualization is introduced, where the performance of virtual machines (VMs) depends on the physical location of the servers hosting those VMs.
• The management of an ever-growing data center is becoming more and more taxing, administratively speaking. While the north-to-south boundaries have been fixed for years, the east-to-west boundaries have not stopped growing. This growth of the compute, storage, and infrastructure requires a new management approach.
• The power consumed by networking gear represents a significant proportion of the overall power consumed in the data center. This challenge is particularly important today, when escalating energy costs are putting additional pressure on budgets.
• The increasing performance and densities of modern CPUs have led to an increase in network traffic. The network is often not equipped to deal with the large bandwidth demands and increased number of media access control (MAC) addresses and IP addresses on each network port.
• Separate networks for Ethernet data and storage traffic must be maintained, adding to the training and management budget. Siloed Layer 2 domains increase the overall costs of the data center environment. In addition, outages related to the legacy behavior of the Spanning Tree Protocol (STP), which is used to support these legacy environments, often result in lost revenue and unhappy customers.
Given these challenges, along with others, data center operators are seeking solutions.
(Diagram: multitier topology with Layer 2 access; roughly half of the links are unused)
Resource Utilization
In the multitier topology displayed in the example, you can see that almost half the links are not utilized. In this example you would also need to be running some type of spanning tree protocol (STP) to avoid loops, which would introduce a delay in network convergence as well as significant STP control traffic taking up valuable bandwidth.
This topology is relatively simple but allows us to visualize the lack of resource utilization. Imagine a data center with a hundred racks of servers and a hundred top-of-rack access switches. The access switches all aggregate up to the core/distribution switches, including redundant connections. In this much larger and more complicated network, you would have thousands of physical cable connections that are not being utilized. Now imagine these connections are fiber. In addition to the unused cables, you would also have two transceivers per connection that are not being used. Because of the inefficient use of physical components, there is a significant amount of usable bandwidth sitting idle, and a significant investment in device components that sit idle until a failure occurs.
(Diagram: cloud service models: IaaS, PaaS, SaaS, BMaaS)
• Application Flows: More east-west traffic communication is happening in data centers. With today's applications, many requests can generate a lot of traffic between devices in a single data center. Basically, a single user request triggers a barrage of additional requests to other devices. The "go here and get this, then go here and get that" behavior of many applications is being done on such a large scale today that it is driving data centers to become flatter and to provide higher performance with consistency.
• Network Virtualization: This means overlay networks; for example, NSX and Contrail. Virtualization is being implemented in today's data centers and will continue to gain popularity in the future. Some customers might not be currently using virtualization in their data center, but it could definitely play a role in the design for those customers that are forward looking and eventually want to incorporate some level of virtualization.
• Everything as a service: To be cost effective, a data center that offers hosting services must be easy to scale out and scale back as demands change. The data center should be very agile; it should be easy to deploy new services quickly.
A drawback to using MC-LAG in a data center environment is the sometimes proprietary implementation between vendors, which can introduce incompatibilities in a multi-vendor environment. Additionally, in some data center environments, MC-LAG configuration can become complex and difficult to troubleshoot.
Junos Fusion
• Junos Fusion
  • Based on the IEEE 802.1BR standard
  • Single point of management
  • Spine and leaf architecture
    • Also referred to as aggregation and satellite devices
Junos Fusion
Junos Fusion is a Juniper Networks Ethernet fabric architecture designed to provide a bridge from legacy networks to software-defined cloud networks. With Junos Fusion, service providers and enterprises can reduce network complexity and operational costs by collapsing underlying network elements into a single, logical point of management. The Junos Fusion architecture consists of two major components: aggregation devices and satellite devices. With this structure, it can also be classified as a spine and leaf architecture. These components work together as a single switching system, flattening the network to a single tier without compromising resiliency. Data center operators can build individual Junos Fusion pods comprised of a pair of aggregation devices and a set of satellite devices. Each pod is a collection of aggregation and satellite devices that are managed as a single device. Pods can be small (for example, a pair of aggregation devices and a handful of satellites) or large, with up to 64 satellite devices, based on the needs of the data center operator.
IP Fabric
• IP Fabric
  • Flexible deployment scenarios
  • Open choice of technologies and protocols
  • Multi-vendor interoperability
  • Highly scalable
  • Strictly Layer 3
IP Fabric
An IP fabric is one of the most flexible and scalable data center solutions available. Because an IP fabric operates strictly using Layer 3, there are no proprietary features or protocols being used, so this solution works very well with data centers that must accommodate multiple vendors. One of the most complicated tasks in building an IP fabric is assigning all of the details like IP addresses, BGP AS numbers, routing policy, loopback address assignments, and many other implementation details.
• Mode 1 applications
  • Require Layer 2 connectivity (e.g., a legacy database)
• Mode 2 (native IP) applications
  • Do not require Layer 2 connectivity
• Overlay networking (tunneling of Layer 2 frames)
  • Supports a hybrid of Mode 1 and Mode 2 application types
  • VXLAN (supported by many vendors, including VMware)
  • MPLS
1. IP-only Data: Many data centers simply need IP connectivity between racks of equipment. There is less and less need for the stretching of Ethernet networks over the fabric. For example, one popular compute and storage methodology is Apache's Hadoop. Hadoop allows a large set of data (for example, a single terabit file) to be stored in chunks across many servers in a data center. Hadoop also allows the stored chunks of data to be processed in parallel by the same servers they are stored upon. The connectivity between the possibly hundreds of servers needs only to be IP-based.
2. Overlay Networking: Overlay networking allows for Layer 2 connectivity between racks; however, instead of Layer 2 frames being transferred natively over the fabric, they are tunneled using a different outer encapsulation. Virtual Extensible Local Area Network (VXLAN), Multiprotocol Label Switching (MPLS), and Generic Routing Encapsulation (GRE) are some of the common tunneling protocols used to transport Layer 2 frames across the fabric of a data center. One of the benefits of overlay networking is that when there is a change to Layer 2 connectivity between VMs/servers (the overlay network), the underlying fabric (underlay network) can remain relatively untouched and unaware of the changes occurring in the overlay network.
Overlay Networking (1 of 2)
• Layer 2 transport network
  • Seen as a point of weakness in scale, high availability, flooding, load balancing, and prevention of loops
  • Adding a tenant requires touching the transport network
Assumes an Ethernet fabric (loop prevention with STP would add even more complexity). If you add a VM, the transport network must be configured with appropriate IEEE 802.1Q tagging.
(Diagram: VMs on a vSwitch and a bare-metal server (BMS) connected across a Layer 2 underlay network)
Overlay Networking (2 of 2)
• Overlay networking can allow for an IP underlay
  • Compute and storage have already been virtualized in the DC; the next step is to virtualize the network
    • VXLAN (one of a few overlay networking protocols) allows the decoupling of the network from the physical hardware (provides scale and agility)
A VTEP encapsulates Ethernet frames into IP packets, so adding a VM/VLAN requires no changes to the IP transport network. Loops are prevented by the routing protocols.
(Diagram: VMs on a vSwitch and a BMS connected across an IP fabric)
Overlay Networking
Overlay networking can help solve many of the requirements and problems discussed in the previous slides. This slide shows the addition of an overlay network that includes the use of VXLAN. The overlay network consists of the virtual switches and the VXLAN tunnel endpoints (VTEPs). A VTEP will encapsulate the Ethernet frames that it receives from the virtual switch into IP and forward the resulting IP packet to the remote VTEP. The underlay network simply needs to forward IP packets between VTEPs. The receiving VTEP will de-encapsulate the VXLAN IP packets and then forward the resulting Ethernet frame to the appropriate VM. Adding and removing VMs from the data center has no effect on the underlay network. The underlay network simply needs to provide IP connectivity between the VTEPs.
When implementing the underlay network in this scenario, you have a few choices. You can use an Ethernet fabric like Virtual Chassis (VC), Virtual Chassis Fabric (VCF), or Junos Fusion. All of these are valid solutions. Because all of the traffic crossing the underlay network is IP, the option for an IP fabric becomes available. The choice of underlay network comes down to scale and future growth. An IP fabric is considered to be the most scalable underlay solution.
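To make the VTEP concept more concrete, the following is a minimal sketch of how a VLAN might be mapped to a VXLAN VNI on a QFX leaf acting as a hardware VTEP. The interface names, addresses, VLAN ID, VNI, and static remote VTEP are hypothetical, and an EVPN control plane (covered later in this course) would normally replace the static remote-vtep-list; lines beginning with # are annotations rather than CLI input.

  # Loopback address used as the local VTEP source (hypothetical addressing)
  set interfaces lo0 unit 0 family inet address 192.168.100.11/32
  set switch-options vtep-source-interface lo0.0

  # Map VLAN 100 to VXLAN VNI 5100 and place a server-facing access port in it
  set vlans v100 vlan-id 100
  set vlans v100 vxlan vni 5100
  set interfaces xe-0/0/10 unit 0 family ethernet-switching interface-mode access
  set interfaces xe-0/0/10 unit 0 family ethernet-switching vlan members v100

  # Without a control plane, remote VTEPs can be listed statically
  set switch-options remote-vtep-list 192.168.100.12

With something like this in place, the leaf encapsulates frames received on xe-0/0/10 into VXLAN packets sourced from lo0, and the underlay only ever sees IP traffic between the VTEP loopback addresses.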
Spine-Leaf Architecture (1 of 2)
• High-capacity physical underlay
  • High-density 10G, 40G, 100G core switches as spine devices
  • High-density 10G, 25G, 40G, 50G, 100G leaf switches for server access
(Diagram: spine and leaf switches in a fabric topology)
As server capabilities increase, the need for higher speed uplinks and access links grows as well. Edge technologies such as 25 Gbps and 50 Gbps server access links place a higher demand on uplink bandwidth to the switching core, which can dramatically increase oversubscription ratios in the switching fabric.
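As a simple, hypothetical illustration of the math: a leaf with 48 x 25GbE server ports offers 1,200 Gbps of access bandwidth; with 6 x 100GbE uplinks (600 Gbps toward the spine layer), that leaf runs at a 2:1 oversubscription ratio, whereas the same 48 ports at 10GbE (480 Gbps) against the same uplinks would not have been oversubscribed at all.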
Spine-Leaf Architecture (2 of 2)
• Spine - QFX10000 series
  • 10G, 40G, 100G
  • Can be used as routing core and DC interconnect
• Leaf - QFX series
  • QFX10002 - 10G/40G access
  • QFX51xx - 10G/40G/100G access
  • QFX52xx - 10G, 25G, 40G, 50G, 100G access
(Diagram: QFX10000 spine devices with QFX series leaf devices)
• The QFX100xx Series provides high-density 10G, 40G, and 100G switching capacity for the core switching layer. It provides routing and switching functions, data center interconnect, as well as deep buffers to help manage bursty traffic patterns between leaf nodes.
• The QFX51xx Series provides high-density 10G, 40G, and 100G switching capacity for access nodes.
• The QFX52xx Series provides high-density 10G, 25G, 40G, 50G, and 100G access and uplink interfaces.
Although all of these models are part of the QFX Series family of switches, features and functionality can vary between different product lines. Please refer to technical documentation for specific features and configuration options for your specific QFX Series platform.
Summary
We Discussed:
• The benefits and challenges of the traditional multitier architecture;
Review Questions
Review Questions
2. Some of the applications that are driving a change in data centers include greater east-west traffic and more reliance on predictable latency between devices within the data center. Increased need for on-demand scaling is also creating new challenges.
3. Layer 2 networks can be stretched over IP networks, or IP fabrics, using Layer 2 tunneling technology, also called overlay networks.
Chapter 3: IP Fabric
                               Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            Routing in an IP fabric;
• Scaling of an IP fabric;
Agenda: IP Fabric
       ➔ IP  Fabric Overview
       ■ IP Fabric Routing
       ■ IP Fabric Scaling
       ■ Configure an IP Fabric
IP Fabric Overview
The slide lists the topics we will discuss. We will discuss the highlighted topic first.
IP Fabric Infrastructure
■ IP fabric
  • All IP infrastructure
    • No Layer 2 switching or xSTP protocols
    • Uses standards-based Layer 3 routing protocols, allowing for vendor interoperability (can be a mix of devices)
  • Multiple equal-cost paths should exist between any two servers/VMs in the DC
    • Paths are computed dynamically by the routing protocol
(Diagram: data center IP fabric)
IP Fabric
An IP fabric is one of the most flexible and scalable data center solutions available. Because an IP fabric operates strictly using Layer 3, there are no proprietary features or protocols being used, so this solution works very well with data centers that must accommodate multiple vendors. Some of the most complicated tasks in building an IP fabric are assigning all of the details like IP addresses, BGP AS numbers, routing policy, loopback address assignments, and many other implementation details. Throughout this chapter we refer to the devices as nodes (spine nodes and leaf nodes). Keep in mind that all devices in an IP fabric are basically just Layer 3 routers that rely on routing information to make forwarding decisions.
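As a small illustration of those building blocks, the underlay-facing portion of a single leaf node might be laid out as in the following sketch; the interface names, /31 fabric addressing, loopback address, and AS number are all hypothetical, and # lines are annotations rather than CLI input.

  # Point-to-point fabric links toward the two spine nodes (/31 addressing)
  set interfaces xe-0/0/48 unit 0 family inet address 172.16.0.1/31
  set interfaces xe-0/0/49 unit 0 family inet address 172.16.0.3/31

  # Loopback and router ID; the loopback is reused later for overlay peering and VTEPs
  set interfaces lo0 unit 0 family inet address 192.168.100.11/32
  set routing-options router-id 192.168.100.11

  # Many designs assign each leaf node its own BGP AS number
  set routing-options autonomous-system 65103

Repeating this pattern consistently across dozens of leaf nodes is exactly the kind of task that the automation concepts introduced later in the course are intended to simplify.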
(Diagram: spine nodes and leaf nodes arranged in a fabric, with each leaf connected to every spine)
You should notice that the goal of the design is to provide connectivity from one ingress crossbar switch to an egress crossbar switch. There is no need for connectivity between crossbar switches that belong to the same stage.
Spine Nodes
In a spine-leaf architecture, the goal is to share traffic loads over multiple paths through the fabric. A 3-stage fabric design ensures that the access-facing port of any leaf node is exactly two hops from any other access-facing port on another leaf node. It is called a 3-stage fabric because the forwarding path from any connected host is leaf-spine-leaf, or three stages, regardless of where the destination host is connected to the fabric.
Many applications function best when packets are received in the order in which they are sent. In an IP fabric design, per-flow based traffic sharing should be implemented so that packets in a unique flow follow the same fabric path. This helps prevent out-of-order packets arriving at the destination host due to potential congestion or latency between different paths through the fabric.
(Diagram: Option 1, a three-stage spine-leaf topology with two spine nodes and four leaf nodes; Option 2, a five-stage topology)
• Option 1: Three-stage topology; small to medium deployment; generally one BGP design
• Option 2: Five-stage topology; medium to large deployment; lots of BGP design options
Option 1 is a basic three-stage architecture. Each leaf connects to every spine. The number of spines is determined by the number of leaf nodes and the throughput capacity required for leaf-to-leaf connectivity. In the diagram, a spine-leaf topology with two spine devices and four leaf nodes is shown. The throughput capacity from one leaf to any other leaf is limited to twice the capacity of a single uplink, since there is a single uplink to each spine node. In the event of a spine device failure, the forwarding capacity from leaf to leaf is cut in half, which could lead to traffic congestion. This can be increased by adding additional uplinks to the spine nodes using technologies such as LAG. However, this type of design places a large amount of traffic on few paths through the fabric. To increase scale, and to reduce the impact of a single spine device failure in the fabric domain, additional spine devices can be added. If, for instance, two more spine nodes were added to Option 1, the traffic from leaf 1 to leaf 4 would have four equal-cost paths for traffic sharing. In the event of a spine device failure, or maintenance that requires a spine device to be removed from the fabric, the forwarding capacity of the fabric would only be reduced by one fourth instead of one half.
For scalability and modularity, a Layer 3 fabric can be broken up into groups of spine-leaf nodes. Each group of spine-leaf nodes is configured in a 3-stage fabric design. Each group is then interconnected through another fabric layer. Sometimes the groups of 3-stage devices are called pods, and the top tier of the fabric is called a super spine. Traffic within a pod does not leave the pod. Only inter-pod traffic must transit the five-stage fabric, or super spine. It is called a five-stage fabric because the forwarding path from one host to another follows a leaf-spine-superspine-spine-leaf path, or five stages. From the perspective of the super spine devices in the diagram, each spine level node is viewed as a leaf node of the super spine.
■ Recommended spine nodes
(Table: platform comparison of QFX51xx, QFX52xx, and EX4300 scaling figures)
Note: The numbers shown can vary based on model, cards installed, and services enabled.
Agenda: IP Fabric
➔ IP Fabric Routing
IP Fabric Routing
The slide highlights the topic we discuss next.
When a host in a Layer 2 data center is required to communicate with another host within the data center, and within the same Layer 2 broadcast domain, the source host sends a data frame to the MAC address associated with the destination host. If the MAC address of the destination host is unknown, the source host uses an ARP request to query the remote MAC address associated with the destination host's IP address. The response to the ARP is stored in the source host's ARP table, where it can be used for future transmissions.
Switches that relay the frames between hosts populate a MAC address table, or switching table, based on the MAC addresses on frames that enter switch ports. In this manner a switch can assign an outbound port to each MAC address in the Layer 2 domain and can avoid broadcasting Layer 2 frames on ports that do not connect to the destination hosts.
Because of the redundant links present in a switching domain fabric, the potential for broadcast loops requires the implementation of loop prevention protocols. One of the most common loop prevention protocols in a switching domain is Spanning Tree Protocol (STP), which manages the forwarding state of Layer 2 ports within the network and blocks forwarding on ports that could potentially create a forwarding loop. In the event of an active forwarding link failure, blocked ports can be allowed to pass traffic. Because of this blocking nature of STP, the implementation of STP within a switching domain has the potential to greatly reduce the forwarding capabilities of a fabric by blocking ports.
When traffic is required to transit between Layer 3 domains in a Layer 2 fabric, a default gateway is often used as the next hop to which hosts forward traffic. The default gateway acts as a gateway between different Layer 2 domains, and functions as a Layer 3 router.
Ideally, leaf nodes in a Layer 3 fabric should have multiple forwarding paths to remote destinations. Loop prevention in a Layer 3 fabric is provided by dynamic routing protocols, such as OSPF, IS-IS, and BGP. Many vendors implement routing solutions that enable equal-cost multipath (ECMP) load sharing. The proper configuration of ECMP within the Layer 3 fabric permits the use of all links for forwarding, which provides a substantial advantage over STP loop prevention in a Layer 2 fabric.
Because the Layer 3 fabric is a routed domain, the broadcast domains are limited in scope to each individual link. With a Layer 3 fabric, hosts attached to the edge, or leaf nodes, do not have direct Layer 2 connectivity, but instead communicate through Layer 3.
(Diagram: a Layer 3 fabric in which leaf C learns two equal-cost next hops toward remote prefixes. C's routing table: 10.1.1/24 > next hop X, > next hop Y; 10.2.2/24 > next hop X, > next hop Y)
Remember that your IP fabric will be forwarding IP data only. Each node is basically an IP router. To forward IP packets between routers, they need to exchange IP routes. So, you have to make a choice between routing protocols. You want to ensure that your choice of routing protocol is scalable and future proof. As you can see by the chart, BGP is the natural choice for a routing protocol, although the capabilities of OSPF and IS-IS may be sufficient for many environments and deployment types, depending on the end role of the IP fabric. For example, an IP fabric that will connect directly to all host devices, and maintain routing information to those hosts, would benefit from the scale capabilities of BGP. An IP fabric that will be deployed as an underlay technology for an underlay/overlay deployment may only be required to maintain routing information related to the local links and loopback addresses of the fabric, and will not maintain routing information pertaining to end hosts or tenants. In the latter scenario, an IGP may be sufficient to maintain the necessary routing information.
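Where an IGP underlay is judged sufficient, for example a fabric that only needs to distribute its own link and loopback routes beneath an overlay, a minimal OSPF underlay on a leaf might look like the following sketch; interface names and the area are hypothetical, and # lines are annotations rather than CLI input.

  # Run OSPF on the fabric links as point-to-point interfaces
  set protocols ospf area 0.0.0.0 interface xe-0/0/48.0 interface-type p2p
  set protocols ospf area 0.0.0.0 interface xe-0/0/49.0 interface-type p2p

  # Advertise the loopback without forming adjacencies on it
  set protocols ospf area 0.0.0.0 interface lo0.0 passive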
(Diagram: leaf nodes A, B, and C in the fabric, with network 10.1.1/24 attached)

Requirement           | OSPF    | IS-IS   | EBGP
Advertise Prefixes    | Yes     | Yes     | Yes
Scale                 | Limited | Limited | Extensive
Policy Control        | Limited | Limited | Extensive
Traffic Tagging       | Limited | Limited | Extensive
Multivendor Stability | Yes     | Yes     | Yes
EBGP Fabric (1 of 4)
EBGP Fabric (2 of 4)
■ EBGP peering
  • Physical interface peering (no need for an IGP)
  • Every leaf node has one peering session to every spine node
    • No leaf-to-leaf and no spine-to-spine peering (they will receive all routes based on normal EBGP advertising rules)
  • All routers configured for multipath multiple-as with a forwarding-table load-balancing policy
    • Enables ECMP at the routing table and forwarding table levels
(Diagram: EBGP sessions between each leaf node and each spine node)
By default, if a route to the same destination is received from two different BGP peers that belong to different ASs, only
one of the routes will be selected as active in the routing table. To take advantage of the redundant paths within a
BGP-based IP fabric, the multipath multiple-as parameter can be configured. This parameter enables the device to
install all paths to a remote destination in the routing table. To export all potential paths from the routing table to the
forwarding table, a load-balancing policy must also be configured and applied to the forwarding table.
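A minimal configuration sketch of these two pieces follows (the group and policy names match the examples later in this chapter and are shown here only for illustration):

set protocols bgp group spine multipath multiple-as
set policy-options policy-statement load-balance term 1 then load-balance per-packet
set routing-options forwarding-table export load-balance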
EBGP Fabric (3 of 4)
(Figure: network 10.1/16 is advertised toward the AS65001 and AS65002 devices with two possible next hops.)
To allow both advertised next hops to be installed in the AS65001 and AS65002 devices, the multipath multiple-as
parameters are configured on the BGP peering sessions. This allows all potential next hops for network 10.1/16 to be
installed in the routing table.
EBGP Fabric (4 of 4)
(Figure: the 10.1/16 route arrives at a leaf with AS Path 65000 65001. Because the receiving leaf is also configured with AS 65001, it sees its local AS in the advertised route and detects a loop.)
One of the built-in functions of the BGP protocol is loop prevention. BGP prevents loops by tracking the AS numbers
through which a route has been advertised, and drops a received route if that route has the local AS number in its AS Path
attribute.

In the example, all leaf nodes are configured with AS 65001. The leaf nodes connected to network 10.1/16 advertise the
10.1/16 network to the spine nodes. When the route is advertised to AS65000, the leaf nodes prepend, or add, their local AS
number to the front of the AS Path field of the advertised route. When the spine nodes in AS65000 receive the route, the AS
Path attribute is analyzed and no loop is detected, because AS65000 is not in the AS Path list. Both spines advertise the
10.1/16 network to connected BGP peers and prepend the locally configured spine AS to the AS Path. When the route
arrives on another leaf configured with AS65001, the AS Path is examined, and the receiving router determines that the
route has been advertised in a loop, since its locally configured AS number is already present in the AS Path attribute.
To change this behavior, BGP can be configured with the as-override parameter or with the loops parameter.
The as-override parameter overwrites values in the AS Path field before advertising the route to a BGP peer. Normally,
the AS of the device that is overwriting the AS Path is used in place of the specified value. In the example shown,
the AS65000 router would replace all instances of AS65001 with AS65000 before advertising the route to its peers, and
therefore the leaf device in AS65001 would not see its locally configured AS number in the AS Path of the received route.

Alternatively, the receiving device, in this case the leaf devices, can be configured to allow the presence of the locally
configured AS number a specified number of times. In the example, the parameter loops 2 may be used to allow the route
to contain the locally configured AS number up to two times before the route is considered a loop.
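A minimal sketch of the two alternatives, using the AS numbers from the example (the group name is illustrative):

On the spine in AS65000, rewriting the leaf AS before re-advertising routes to the leaf peers:

set protocols bgp group leafs as-override

On a leaf in AS65001, tolerating the local AS up to two times in received routes:

set routing-options autonomous-system 65001 loops 2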
       ■     Best practices:
              • All spine nodes should be the same type of router
              • Every leaf node should have an uplink to every spine node
        • Use all 100GbE, all 40GbE, or all 10GbE uplinks
                           • IGPs can take interface bandwidth into consideration during the SPF calculation,
                             which may cause lower speed links to go unused
                           • EBGP load balancing does not take interface bandwidth into consideration
Best Practices
When enabling an IP fabric, you should follow some best practices. Remember, two of the main goals of an IP fabric design
are to provide a non-blocking architecture and to provide predictable load-balancing behavior.
        •    All spine nodes should be the same type of router. They should be the same model and should also have
             the same line cards installed. This helps the fabric have predictable load-balancing behavior.
        •    All leaf nodes should be the same type of router. Leaf nodes do not have to be the same router as the spine
             nodes, but each leaf node should be the same model and should also have the same line cards installed. This
             helps the fabric have predictable load-balancing behavior.
        •    Every leaf node should have an uplink to every spine node. This also helps the fabric have predictable
             load-balancing behavior.
All uplinks from leaf node to spine node should be of the same speed. This helps the fabric have predictable load-balancing
behavior and also helps preserve the non-blocking nature of the fabric. For example, assume that a leaf has one
40 Gigabit Ethernet (40GbE) uplink and one 10GbE uplink to the spine. When using an IGP such as OSPF (for loopback
interface advertisement), the bandwidth of the links is taken into consideration when calculating the shortest path to a
destination. OSPF will most likely always choose the 40GbE interface during its shortest path first (SPF) calculation and
use that interface for forwarding toward remote next hops, which essentially blocks the 10GbE interface from ever being
used. In the EBGP scenario, the bandwidth is not taken into consideration, so traffic will be equally load-shared over the
two different-speed interfaces. Imagine trying to equally load-share 60 Gbps of data over the two links. How will the 10GbE
interface handle 30 Gbps of traffic? The answer is: it will not.
Agenda: IP Fabric
➔ IP Fabric Scaling
IP Fabric Scaling
The slide highlights the topic we discuss next.
       IP Fabric Scaling
       ■   Scaling up the number of ports in a fabric network is accomplished by
           adjusting the width of the spine and the oversubscription ratio
              • What oversubscription ratio are you willing to accept?
                • 1 to 1 (1:1): approximately line-rate forwarding over the fabric
                • 3 to 1 (3:1): the spine (as a whole) can only handle one-third of the bandwidth of the
                  server-facing interfaces (on the leaf nodes)
                          • Number of spines is determined by the number of uplinks on leaf devices
(Figure: a row of leaf nodes, shown as qfx5120 switches, connecting to the spine.)
Scaling
To increase the overall throughput of an IP fabric, you simply need to increase the number of spine devices (and the
appropriate uplinks from the leaf nodes to those spine nodes). If you add one more spine node to the fabric, you will also
have to add one more uplink from each leaf node. Assuming that each uplink is 40GbE, each leaf node can then forward an
extra 40 Gbps over the fabric.

Adding and removing both server-facing ports (downlinks from the leaf nodes) and spine nodes will affect the
oversubscription (OS) ratio of a fabric. When designing the IP fabric, you must understand the OS requirements of your data
center. For example, does your data center need line-rate forwarding over the fabric? Line-rate forwarding equates to
1-to-1 (1:1) OS, meaning the aggregate server-facing bandwidth is equal to the aggregate uplink bandwidth. Or maybe
your data center would work perfectly well with a 3:1 OS, where the aggregate server-facing bandwidth is three
times the aggregate uplink bandwidth. Most data centers will probably not require a 1:1 OS design. Instead,
you should decide on an OS ratio that makes the most sense based on the data center's normal bandwidth usage.
The next few slides discuss how to calculate OS ratios of various IP fabric designs.
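As a rule of thumb (assuming every leaf uses identical port counts and speeds), the OS ratio can be computed per leaf node:

    OS ratio = (server-facing ports x port speed) : (uplinks in use x uplink speed)

For example, a leaf with (48) 10GbE server-facing ports and (4) 40GbE uplinks in use gives 480 Gbps : 160 Gbps, which reduces to 3:1.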
(Figure: four spine nodes, S1 through S4, connect to leaf nodes L1 through L32. Any number of leaf nodes up to 32 can be used; regardless of the number of leaf nodes, the OS ratio remains 3:1.)
3:1 Topology
The slide shows a basic 3:1 OS IP fabric. All spine nodes, four in total, are routers (Layer 3 switches) that each have (32)
40GbE interfaces. All leaf nodes, 32 in total, are routers (Layer 3 switches) that have (6) 40GbE uplink interfaces and (48)
10GbE server-facing interfaces. Each of the (48) 10GbE ports on all 32 leaf nodes will be fully utilized (that is, attached to
downstream servers), so the total server-facing bandwidth is 48 x 32 x 10 Gbps, which equals 15360 Gbps. Each of the 32
leaf nodes uses (4) 40GbE spine-facing interfaces, so the total uplink bandwidth is 4 x 32 x 40 Gbps, which equals
5120 Gbps. The OS ratio for this fabric is 15360:5120, or 3:1.

An interesting thing to note is that if you remove any number of leaf nodes, the OS ratio does not change. For example, what
would happen to the OS ratio if only 31 leaf nodes exist? The server-facing bandwidth would be 48 x 31 x 10 Gbps, which equals
14880 Gbps. The total uplink bandwidth would be 4 x 31 x 40 Gbps, which equals 4960 Gbps. The OS ratio for this fabric is
14880:4960, or 3:1. This fact actually makes your design calculations very simple. Once you decide on an OS ratio and
determine the number of spine nodes that will allow that ratio, you can simply add and remove leaf nodes from the topology
without affecting the original OS ratio of the fabric.
(Figure: the 2:1 topology adds two more spine nodes. Any number of leaf nodes up to 32 can be used; regardless of the number of leaf nodes, the OS ratio remains 2:1.)
2:1 Topology
The slide shows a basic 2:1 OS IP fabric in which two spine nodes have been added to the topology from the previous slide.
All spine nodes, six in total, are routers (Layer 3 switches) that each have (32) 40GbE interfaces. All leaf nodes, 32 in total,
are routers (Layer 3 switches) that have (6) 40GbE uplink interfaces and (48) 10GbE server-facing interfaces. Each of the
(48) 10GbE ports on all 32 leaf nodes will be fully utilized (that is, attached to downstream servers). That means that the total
server-facing bandwidth is still 48 x 32 x 10 Gbps, which equals 15360 Gbps. Each of the 32 leaf nodes now uses (6) 40GbE
spine-facing interfaces, so the total uplink bandwidth is 6 x 32 x 40 Gbps, which equals 7680 Gbps. The OS
ratio for this fabric is 15360:7680, or 2:1.
(Figure: the 1:1 topology keeps six spine nodes and up to 32 leaf nodes; regardless of the number of leaf nodes, the OS ratio remains 1:1.)
1:1 Topology
The slide shows a basic 1:1 OS IP fabric. All spine nodes, six in total, are qfx5100-24q routers that each have (32) 40GbE
interfaces. All leaf nodes, 32 in total, are qfx5100-48s routers that have (6) 40GbE uplink interfaces and (48) 10GbE
server-facing interfaces. There are many ways that a 1:1 OS ratio can be attained. In this case, although the leaf nodes
each have (48) 10GbE server-facing interfaces, we are only going to allow 24 servers to be attached at any given moment.
That means the total server-facing bandwidth is 24 x 32 x 10 Gbps, which equals 7680 Gbps. Each of the 32 leaf nodes
uses (6) 40GbE spine-facing interfaces, so the total uplink bandwidth is 6 x 32 x 40 Gbps, which equals 7680 Gbps.
The OS ratio for this fabric is 7680:7680, or 1:1.
Agenda: IP Fabric
➔ Configure an IP Fabric
Configure an IP Fabric
The slide highlights the topic we discuss next.
Loopback Addresses:
    spine1: 192.168.100.1
    spine2: 192.168.100.2
    leaf1: 192.168.100.3
    leaf2: 192.168.100.4
    leaf3: 192.168.100.5
(Figure: OSPF example topology. Host A, on 10.1.1.0/24, attaches to leaf1 through xe-0/0/0, and Host B, on 10.1.2.0/24, attaches to leaf2.)
       ■    Spine configuration
                • Each spine production interface is included in OSPF
                • Each spine loopback interface is included in OSPF
       • Leaf2 configuration
              • Leaf 2 connects to each spine device
    {master:0}[edit]
    lab@leaf2# show protocols ospf
    area 0.0.0.0 {
        interface xe-0/0/1.0;
        interface xe-0/0/2.0;    <- physical interfaces to the spine nodes
        interface lo0.0;
    }
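The spine side of this OSPF configuration is not reproduced on the slide. A minimal sketch for spine1 (the interface names are illustrative and would match the leaf-facing links) might look like the following:

    {master:0}[edit]
    lab@spine1# show protocols ospf
    area 0.0.0.0 {
        interface xe-0/0/1.0;
        interface xe-0/0/2.0;
        interface xe-0/0/3.0;
        interface lo0.0;
    }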
      ■    Leaf1 results
              • Two hops to destination host in routing table
                           • By default, only one hop is selected and installed in forwarding table
    {master:0}[edit]
    lab@leaf1# run show route 10.1.2.1

    inet.0: 21 destinations, 21 routes (21 active, 0 holddown, 0 hidden)
    + = Active Route, - = Last Active, * = Both

(The output shows the OSPF route to 10.1.2.0/24 with two possible next hops; only the selected next hop, via xe-0/0/2.0, is marked as active.)
Leaf1 Results
The slide shows that the OSPF protocol has advertised the directly connected host networks within the OSPF domain. Note that
on leaf1, OSPF has advertised two next hops to reach the remote network 10.1.2.0/24. A single active next hop has been
selected. At this point in the configuration, the next hop of 172.16.1.17, via interface xe-0/0/2.0, is the only next hop that will
be used to forward traffic to 10.1.2.0/24. The alternate next hop, through interface xe-0/0/1.0, is a backup next hop.
    {master:0}[edit]
    lab@leaf1# show routing-options
    router-id 192.168.100.11;
    autonomous-system 65100;
    forwarding-table {
        export load-balance;    <- load-balance policy must be applied to the forwarding table to override the default behavior
    }
The example shows a routing policy used to implement load balancing among all available equal-cost next hops in the
routing table. The policy load-balance has a single term, named term 1, which has an action of load-balance
per-packet. Note that there is no from statement in the term. The absence of a from statement indicates that the policy
will match any route, without any conditions. This policy will match any route in the routing table, and if the route has multiple
next hops, all next hops will be accepted by the policy. If more granular load balancing is required, a from statement can
specify additional match parameters to select the routes to which the policy will be applied.

The policy is applied to the forwarding table to override the default behavior of exporting a single next hop from the routing
table. To apply a policy to the forwarding table, apply the policy as an export policy at the [edit routing-options
forwarding-table] hierarchy.

Note that although the policy action is load-balance per-packet, the action actually performs per-flow load
balancing.
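The load-balance policy itself is not shown on this slide. A sketch that is consistent with the description above, and with the full leaf1 configuration later in this chapter, follows:

    {master:0}[edit]
    lab@leaf1# show policy-options policy-statement load-balance
    term 1 {
        then {
            load-balance per-packet;
        }
    }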
Example Topology
The slide shows the example topology that will be used in the subsequent slides. Notice that each router is the single
member of a unique autonomous system. Each router will peer using EBGP with its directly attached neighbors using the
physical interface addresses. Host A is single-homed to the router in AS 65003. Host B is single-homed to the router in AS
65005.
       ■    Spine configuration
                • Each spine peers with each leaf using physical interface addressing
                • No spine-to-spine peering
    {master:0}[edit protocols bgp]
    lab@spine1# show
    group leafs {
        type external;
        local-as 65001;
        neighbor 172.16.1.6 {
            peer-as 65003;
        }
        neighbor 172.16.1.10 {
            peer-as 65004;
        }
        neighbor 172.16.1.14 {
            peer-as 65005;
        }
    }
      ■    Leaf configuration
               • Each leaf peers with each spine using physical interface addressing
        • No leaf-to-leaf peering
    {master:0}[edit protocols bgp]
    lab@leaf1# show
    group spine {
        type external;
        local-as 65003;
        neighbor 172.16.1.5 {
            peer-as 65001;
        }
        neighbor 172.16.1.17 {
            peer-as 65002;
        }
    }
Four numbers separated by slashes indicate a neighbor relationship is established.

    {master:0}
    lab@leaf1> show bgp summary
    Threading mode: BGP I/O
    Groups: 1 Peers: 2 Down peers: 0
    Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
    inet.0
                           0          0          0          0          0          0
    Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
    172.16.1.5            65001          5          5       0       0        1:33 0/0/0/0              0/0/0/0
    172.16.1.17           65002          5          4       0       0        1:29 0/0/0/0              0/0/0/0
Verifying Neighbors
Once you configure BGP neighbors, you can check the status of the relationships using either the show bgp summary or
show bgp neighbor command.
Route Redistribution (1 of 3)
Routing Policy
Once BGP neighbors are established in the IP fabric, each router must be configured to advertise routes to its neighbors and
into the fabric. For example, as you attach a server to a top-of-rack (TOR) switch/router (which is usually a leaf node of the
fabric), you must configure the TOR to advertise the server's IP subnet to the rest of the network. The first step in advertising
a route is to write a policy that will match on the route and then accept it. The slide shows the policy that must be
configured on the router in AS 65003.

An important thing to note with this configuration is that it applies to a Layer 3 fabric that advertises all access-connected
server IP addresses throughout the fabric. This implementation is used when servers attached to the Layer 3 fabric will
communicate using Layer 3, or IP-based, methods. With this implementation, all fabric nodes will have a route to every
server in the data center present in the routing table.

If an underlay/overlay design is configured, the IP addresses of the hosts connected to the access layer, or leaf nodes, are
not maintained in the fabric node routing tables. With an underlay/overlay design, only the leaf nodes maintain the host, or
tenant, routing information. The underlay fabric is used to interconnect the fabric nodes and to propagate routing
information specific to the underlay fabric. To provide the reachability information necessary for an overlay to be
configured, only the loopback addresses of the underlay devices need to be advertised to other fabric nodes, and the route
distribution policies should be adjusted to perform that task.
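For reference, a sketch of the direct policy on the AS 65003 router, consistent with the full leaf1 configuration shown at the end of this chapter:

    {master:0}[edit]
    lab@leaf1# show policy-options policy-statement direct
    term 1 {
        from {
            protocol direct;
            route-filter 10.1.1.0/24 exact;
        }
        then accept;
    }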
Route Redistribution (2 of 3)
Applying Policy
After configuring a policy, the policy must be applied to the router's EBGP peers. The slide shows the direct policy being
applied as an export policy to AS 65003's EBGP neighbors.
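In set format, applying the policy on leaf1 (AS 65003) amounts to a single statement (the group name follows the earlier leaf configuration):

set protocols bgp group spine export direct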
Route Redistribution (3 of 3)
    {master:0}
    lab@leaf1> show route advertising-protocol bgp 172.16.1.17
Multiple Paths (1 of 3)
Notice that only one next hop will be used for routing even though two next hops are possible.

    10.1.2.0/24    *[BGP/170] 00:01:54, localpref 100
                      AS path: 65002 65005 I, validation-state: unverified
                    > to 172.16.1.17 via xe-0/0/2.0
                    [BGP/170] 00:01:54, localpref 100
                      AS path: 65001 65005 I, validation-state: unverified
                    > to 172.16.1.5 via xe-0/0/1.0

(The second entry is the possible, but unused, next hop.)
Default Behavior
Assuming the router in AS 65005 is advertising Host B's subnet, the slide shows the default routing behavior on a remote
leaf node. Notice that the leaf node has received two advertisements for the same subnet. However, because of the default
behavior of BGP, the leaf node chooses a single route to select as the active route in the routing table (you can tell which is
the active route by the asterisk). Based on what is shown in the slide, the leaf node will send all traffic destined for
10.1.2/24 over the xe-0/0/2.0 link. The leaf node will not load-share over the two possible next hops by default. This same
behavior will take place on the spine nodes as well.
      Multiple Paths (2 of 3)
      • Enable multipath multiple-as on ALL nodes so multiple paths
        can be used for routing
    {master:0}[edit]
    lab@spine1# show protocols bgp
    group leafs {
        type external;
        local-as 65001;
        multipath {
            multiple-as;
        }
        neighbor 172.16.1.6 {
            peer-as 65003;
        }
        neighbor 172.16.1.10 {
            peer-as 65004;
        }
        neighbor 172.16.1.14 {
            peer-as 65005;
        }
    }
Multiple Paths (3 of 3)
■ Verify that multiple received routes can be used for routing traffic
    {master:0}
    lab@leaf1> show route 10.1.2.0/24
Verify Multipath
View the routing table to see the results of the multipath statement. As you can see, the active BGP route now has two next
hops that can be used for forwarding. Do you think the router is using both next hops for forwarding?
Load Balancing (1 of 3)
■ View forwarding table to view the next hops used for forwarding
    FT
    10.1.2.0/24    nexthop xe-0/0/2.0
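Before the load-balancing policy is applied, the forwarding table holds only the single selected next hop. A sketch of what the output might look like at this point (the index and reference values are illustrative):

    {master:0}
    lab@leaf1> show route forwarding-table destination 10.1.2.0/24
    Routing table: default.inet
    Internet:
    Destination        Type RtRef Next hop           Type Index    NhRef Netif
    10.1.2.0/24        user     0 172.16.1.17        ucst     1754     4 xe-0/0/2.0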
Load Balancing (2 of 3)
    {master:0}[edit]
    lab@leaf1# show policy-options

    {master:0}[edit]
    lab@leaf1# show routing-options
    router-id 192.168.100.11;
    autonomous-system 65100;
    forwarding-table {
        export load-balance;
    }
Load Balancing (3 of 3)
      ■    View forwarding table to view the next hops used for forwarding
    {master:0}
    lab@leaf1> show route forwarding-table destination 10.1.2.0/24
    Routing table: default.inet
    Internet:
    Enabled protocols: Bridging,
    Destination        Type RtRef Next hop           Type Index    NhRef Netif
    10.1.2.0/24        user     0                    ulst   131070     2
                                 172.16.1.5          ucst     1750     4 xe-0/0/1.0
                                 172.16.1.17         ucst     1754     4 xe-0/0/2.0

With the export load-balancing policy applied to the routing table, both next hops are pushed down to the FT:

    FT
    10.1.2/24    nexthop xe-0/0/1.0
                 nexthop xe-0/0/2.0
Full Configuration (1 of 3)
    routing-options {
        router-id 192.168.100.11;
        autonomous-system 65100;
        forwarding-table {
            export load-balance;
        }
    }
    protocols {
        bgp {
            group spine {
                type external;
                export direct;
                local-as 65003;
                multipath {
                    multiple-as;
                }
                neighbor 172.16.1.5 {
                    peer-as 65001;
                }
                neighbor 172.16.1.17 {
                    peer-as 65002;
                }
            }
        }
    }
    policy-options {
        policy-statement direct {
            term 1 {
                from {
                    protocol direct;
                    route-filter 10.1.1.0/24 exact;
                }
                then accept;
            }
        }
        policy-statement load-balance {
            term 1 {
                then {
                    load-balance per-packet;
                }
            }
        }
    }
Leaf1 Configuration
The routing-options, protocols BGP, and policy configuration of leaf1, in AS65003, are displayed.
Full Configuration (2 of 3)
Summary
We Discussed:
        •    Routing in an IP fabric;
        •    Scaling of an IP fabric;
Review Questions
      1. What are some of the Juniper Networks products that can be used
         in the spine position of an IP fabric?
      2. What is the general routing strategy of an IP fabric?
      3. In an EBGP-based IP fabric, what must be enabled on a BGP router
         so that it can install more than one next hop in the routing table
         when the same route is received from two or more neighbors?
Review Questions
        1.
2.
3.
Lab: IP Fabric
Lab: IP Fabric
The slide provides the objectives for this lab.
          2.
The general routing strategy of an IP fabric is to provide multiple paths to all destinations within a fabric, and to provide load
sharing, predictability, and scale by leveraging the characteristics of routing protocols.
          3.
In an EBGP-based IP fabric, the default BGP route selection process must be modified to allow multiple paths to the remote
destination. The command to do so is to configure the multipath multiple-as parameter in BGP. In addition, the default
behavior of the forwarding table must be modified through policy to permit multiple next hops to be copied to the forwarding
table.
Chapter 4: VXLAN
                              Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            Functions and operations of VXLAN
        •    Control and data plane of VXLAN in a controller-less overlay
        •    VXLAN gateway functions
Agenda: VXLAN
■ VXLAN Gateways
      Traditional Applications
      ■    Many traditional applications in a data center require Layer 2 connectivity
           between devices
(Figure: Host A, 10.1.1.1/24, and Host B, 10.1.1.2/24, connect to Layer 2 switches in VLAN 100 across a switched network.)
    ■    What happens when you have traditional applications in a data center that is
         built around an IP fabric?
(Figure: the same hosts separated by Layer 3 devices, with a routed 172.16.0/24 network between them.)
Layer 2 Applications
Data centers host different types of applications, and there are a few ways for applications to function. A Mode One application
requires direct Layer 2 connectivity between the different devices that host the application. A Mode Two application doesn't
require Layer 2 connectivity, but can run on a Layer 3 network by using IP reachability instead of Layer 2 reachability between
application nodes.

As data centers have grown, and as application mobility and the need to move applications from device to device within the
data center have become increasingly common, the limits of Layer 2-only data centers have become more apparent. Building
a Layer 2 data center, which is based on VLANs and the manual mapping of broadcast domains to accommodate the
needs of legacy applications, can be cumbersome from a management perspective, and is often not agile
enough, or scalable enough, to meet the demands of modern business needs.
Layer 3 Fabrics
To address the scalability and agility needs of modern businesses and applications, new methods of designing and building
data centers have been established. Instead of using a Layer 2 switched network between host devices, the concept of an
IP-based fabric has been introduced. Using an IP fabric within the data center allows the use of routing protocols to
interconnect edge devices, and allows the use of the scalability and load-sharing capabilities that are inherently built into the
routing protocols.

Using a Layer 3 network between edge devices introduces a problem with legacy applications. Many legacy applications, or
Mode One applications, still require direct Layer 2 connectivity between hosts or application instances. Placing a Layer 3
fabric in between Layer 2 broadcast domains disrupts that Layer 2 connectivity. A new technology has been developed to
address the need to stretch a Layer 2 domain across a Layer 3 domain.
(Figure: hosts at .1 and .2 share the 10.1.1.0/24 broadcast domain, which is stretched across a routed 172.16.0/24 network.)
Layer 2 VPNs
The foundation of this Layer 2 stretch concept is based on the functionality of a standard Layer 2 VPN. With a Layer 2 VPN, a
data frame is encapsulated within an IP packet at the boundary of a Layer 2 broadcast domain and an IP routed domain. The
encapsulated packet is routed across the IP domain, and is then decapsulated once it reaches the Layer 2 broadcast
domain at the remote side of the Layer 3 routed domain.

From the perspective of the hosts at each end of the network, they are still connected with a direct Layer 2 connection. The
encapsulation and decapsulation used to cross the IP fabric is transparent. From the perspective of Host B in the example
shown, Host A is directly connected to the same broadcast domain as Host B, and they can communicate directly across that
Layer 2 domain.
(Figure: Host A and Host B communicate across the IP fabric, 172.16.0/24, through VPN gateways (GW) at each edge.)
Data Plane
There are generally two components of a VPN: the data plane (as described on this diagram) and the control plane (as
described on the next diagram).

The data plane of a VPN describes the method in which a gateway encapsulates and decapsulates the original data. Also, in
regard to an Ethernet Layer 2 VPN, it might be necessary for the gateway to learn the MAC addresses of both local and
remote servers, much like a normal Ethernet switch learns MAC addresses. In almost all forms of Ethernet VPNs, the
gateways learn the MAC addresses of locally attached servers in the data plane (that is, from received Ethernet frames). Remote
MAC addresses can be learned either in the data plane (after decapsulating data received from remote gateways) or through
the control plane.
Control Plane
The control plane describes the process of learning performed by VPN gateways. This includes learning the remote IP
addresses of other VPN gateways, establishing the tunnels that interconnect the gateways, and sometimes learning the MAC
addresses of remote hosts. The control plane information can be statically configured or dynamically discovered through
some type of dynamic VPN signaling protocol.

Static configuration works fine, but it does not scale well. For example, imagine that you have 20 TOR routers participating in
a statically configured Layer 2 VPN. If you add another TOR router to the VPN, you would have to manually configure each of
the 20 routers to recognize the newly added gateway on the VPN. In addition, imagine that the workloads, or applications
running in the network, are constantly being moved from physical host to physical host. The ability to make network
adjustments to the constantly moving workloads is difficult, if not impossible, to achieve.

Normally a VPN has some form of dynamic signaling protocol for the control plane. The signaling protocol can allow for
dynamic additions and deletions of gateways from the VPN. Some signaling protocols also allow a gateway to advertise its locally
learned MAC addresses to remote gateways. Usually a gateway has to receive an Ethernet frame from a remote host before
it can learn the host's MAC address. Learning remote MAC addresses in the control plane allows the MAC tables of all
gateways to be more in sync. This has the positive side effect of making the forwarding behavior of the VPN more
efficient (less flooding of data over the fabric).
      Layer 2 VPNs
      ■    Layer 2 VPN options
Two options that have been developed for the data center environment, and which work together to provide flexibility and
scalability, are EVPN and VXLAN. These are two separate technologies that can work together to manage both the forwarding
and control planes in a data center.
(Figure: virtualized hosts Host1, Host2, and Host3, each running several VMs, and bare-metal servers running APP1 and APP2 connect through router VTEPs across the IP fabric, 172.16.0/24. VNI 5100 and VNI 5200 identify the stretched broadcast domains.)
The Virtual Network Identifier (VNI) identifies a broadcast domain by using a numeric value. Unlike a VLAN tag, which can be
in the range from 0 to 4095 (a 12-bit value), the VNI value can range from 0 to 16,777,215 (a 24-bit value). A packet with a VNI
tag in the header can be associated with any one of over 16 million broadcast domains. This alleviates a significant
scaling limit associated with VLANs. Juniper Networks recommends using a VNI that starts with a value of 5000 or higher to
avoid confusing a VNI value with a VLAN tag value.

In a data center, a VLAN is mapped to a VNI value. However, when the packet enters the Layer 3 domain and is encapsulated
in a VXLAN packet, the VLAN ID is discarded and is not transmitted with the original packet. Once the packet arrives at the
remote gateway, the VLAN ID associated with the VNI at the remote gateway may be placed on the packet before it reenters
the Layer 2 domain.
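As an illustration of this mapping, a VLAN-to-VNI binding on a Junos leaf device might look like the following sketch (the VLAN name, VLAN ID, and VNI value are illustrative and follow the recommendation to use VNIs of 5000 or higher):

set vlans v100 vlan-id 100
set vlans v100 vxlan vni 5100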
The VXLAN control plane has gone through some changes and evolution. Originally, VXLAN gateways learned about remote
destinations through multicast. Although this control plane function worked, it was very resource intensive, was slow to
converge and to update when changes were made, and required running a multicast protocol in the underlay network.
It also required manual configuration on every gateway to associate locally connected broadcast domains with multicast
group addresses.

To improve the performance of the VXLAN control plane, EVPN was developed. EVPN signaling is an extension to the BGP
routing protocol. EVPN signaling uses BGP routing updates to advertise local broadcast domains and locally learned MAC
addresses to remote tunnel endpoints. The BGP routing protocol was designed for high scalability, fast convergence, and
flexibility.
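Because EVPN signaling is carried in BGP, enabling it is a matter of adding the EVPN address family to a BGP peering session. A minimal sketch follows (the group name is illustrative; complete EVPN-VXLAN configuration is covered in a later chapter):

set protocols bgp group overlay family evpn signaling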
       VXLAN Benefits
       ■    VXLAN is being embraced by major vendors and supporters of virtualization
              • Standardized protocol
              • Support in many virtual switch software implementations
                          • The focus of this course is on VTEP support in the physical network environment, not on VTEP support in the
                            virtual switch environment
VXLAN Benefits
The slide lists the benefits of VXLAN .
(Figure: virtual machine workloads can move between Host1, Host2, and Host3. R1, R2, and R3 act as VTEPs, connecting the hosts and the bare-metal servers running APP1 and APP2 across the IP fabric, 172.16.0/24.)
Virtualization
With the introduction of virtualization within a data center, the manner in which applications and physical hardware interact has changed drastically. With this change came new networking requirements. Virtual machines, or software-based computers that run an operating system and applications, can be run on an underlying physical host. A single physical host machine can be host to numerous virtual machines. The ability to move a virtual machine to any physical host within the data center allows a more efficient use of resources. If the resources of a physical host become taxed, due to increased memory or CPU requirements of the virtual machine or applications, a portion or all of the processes of that virtual machine can be migrated to a different physical host that has sufficient resources to support the virtual machine requirements. Physical resources within the data center can be monitored and analyzed in order to track historical resource usage, as well as predict future resource requirements within the environment. Because the workloads are software based, automated systems can be used to migrate workloads with little to no user intervention.
The ability to move workloads to the physical resources that can most efficiently handle those workloads introduces a new set of issues with regard to the network that interconnects the workloads. As a workload migrates to a new physical or logical server, other applications or hosts within the data center environment must be able to continue to communicate with the migrated resource, regardless of where the resource exists. This can involve a change of physical server, a change of rack, and at times even a change of data center location. In addition, the original MAC address of the virtual or physical machine on which the application is running changes, and the mechanisms used to interconnect the devices at a Layer 2 level must be able to adapt to an ever-changing environment.
Agenda: VXLAN
VXLAN Fundamentals
The slide highlights the topic we discuss next.
      VXLAN Fundamentals
      ■    VXLAN is a Layer 2 VPN
              • Defined in RFC 7348
                          • Encapsulates Ethernet Frames within IP packets
              • Data plane component
                          • Encapsulation: Includes adding an outer Ethernet header, outer IP header, outer UDP header, and VXLAN
                            header to the original Ethernet Frame (original VLAN tag is usually removed)
                          • Decapsulation: Includes removing all of the above outer headers and forwarding the original Ethernet frame to
                            its destination (adding the appropriate VLAN tag if necessary)
              • Signaling component (learning of remote VXLAN gateways)
                          • RFC7348 discusses static configuration and multicast using PIM
                          • Other methods include using EVPN signaling or OVSDB
VXLAN Fundamentals
VXLAN is defined in RFC 7348, which describes a method of tunneling Ethernet frames over an IP network. RFC 7348 describes the data plane and a signaling plane for VXLAN. Although RFC 7348 discusses PIM and multicast in the signaling plane, other signaling methods for VXLAN exist, including Multiprotocol BGP (MP-BGP) Ethernet VPN (EVPN) as well as Open vSwitch Database (OVSDB).
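The data plane mapping between a server-facing VLAN and a VNI is configured statically on the VTEP. The following is a minimal, illustrative sketch of that mapping on a Junos-based VTEP (a QFX leaf, for example); the interface names, VLAN ID, VNI, and addresses are assumptions for illustration and are not taken from the course lab topology, and the # lines are annotations rather than configuration statements:

    # Loopback address used as the local VTEP (tunnel source) address
    set interfaces lo0 unit 0 family inet address 172.16.1.1/32
    set switch-options vtep-source-interface lo0.0
    # Server-facing VLAN and its static VLAN-to-VNI mapping
    set vlans v100 vlan-id 100
    set vlans v100 vxlan vni 5001

With this mapping in place, frames arriving on VLAN 100 are encapsulated with VNI 5001 (and UDP destination port 4789) before being forwarded into the IP fabric.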
[Figure: VXLAN packet format — outer MAC header, outer IP header (source: my VTEP IP, destination: destination VTEP IP), outer UDP header (destination port 4789), 8-byte VXLAN header carrying flags, reserved fields, and a 24-bit VNI, followed by the original Layer 2 frame. The VNI is mapped statically through configuration to a host/server-facing VLAN, allowing for roughly 16 million broadcast domains in a data center.]
      VTEP (1 of 3)
      • A VTEP is the endpoint of a VXLAN tunnel
             • It takes Layer 2 frames from VMs and encapsulates them using VXLAN
               encapsulation
• Based on preconfigured mapping of VLAN to VNI
[Figure: Host Machine 1 runs VM1 and VM2 (10.1.1.3) behind Virtual Switch 1; R1 (VTEP, 172.16.1.1/24) and R2 (VTEP, 172.16.2.1/24) connect across the IP fabric (172.16.0/24) to a BMS. A static mapping on the VTEP associates the VM-facing VLAN with the outgoing VNI.]
VTEP, Part 1
The diagram shows how a VTEP handles a frame on a source VTEP that must be VXLAN-encapsulated and sent to a remote VTEP. Here is the step-by-step process taken by the network and R1:
1. VM2 sends an Ethernet frame to the MAC address of the remote BMS.
2. R1 (VTEP) receives the incoming Ethernet frame and associates the VLAN on which the frame arrived with a VNI.
One thing you should notice about the VLAN tagging between the VMs and the virtual switches is that, because the VLAN tags are stripped before sending over the IP fabric, the VLAN tags do not have to match between remote VMs. This allows for more flexibility in VLAN assignments from server to server and rack to rack.
       VTEP (2 of 3)
       • A VTEP is the endpoint of a VXLAN tunnel (contd.)
               • Forwards VXLAN packets to a remote VTEP over the Layer 3 network
                           • Based on MAC-to-remote-VTEP mapping
[Figure: Host Machine 1 runs VM1 and VM2 (10.1.1.3) behind a virtual switch; R1 forwards the VXLAN packet across the IP fabric (172.16.0/24) toward the remote VTEP that connects the BMS (10.1.1.2)]
VTEP, Part 2
1. R1 (VTEP) analyzes the local VXLAN bridging table to determine which remote VTEP has advertised the MAC address associated with the destination MAC address in the received frame.
2. R1 identifies which VTEP interface (tunnel) next hop should be used to forward the frame.
3. R1 encapsulates the original Ethernet frame in a VXLAN/IP packet with a destination IP address of the remote VTEP address, and forwards the IP packet to the next physical device in the IP fabric.
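On a Junos-based VTEP, operational commands along the following lines can be used to verify the MAC-to-remote-VTEP state that drives this forwarding decision; this is offered as a verification sketch rather than a required lab step, and no sample output is shown:

    show ethernet-switching table
    show ethernet-switching vxlan-tunnel-end-point source
    show ethernet-switching vxlan-tunnel-end-point remote

The first command displays the MAC table (including MACs reachable through VXLAN tunnels), while the second and third display the local VTEP address and VNI membership, and the remote VTEPs with the VNIs they participate in.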
     VTEP (3 of 3)
      ■   A VTEP is the endpoint of a VXLAN tunnel (contd.)
            • Takes Layer 3 packets received from the remote VTEP and strips the outer MAC, outer IP
              header, and VXLAN header
• Forwards resulting Layer 2 frames to the destination based on VNI-to-interface mapping
[Figure: R2 (VTEP, 172.16.2.1/24) receives the VXLAN packet (outer IP-DA 172.16.2.1) from R1 (VTEP, 172.16.1.1/24) across the IP fabric (172.16.0/24), removes the VXLAN encapsulation, and forwards the original Layer 2 frame (IP-DA 10.1.1.2, plus VLAN tag if necessary) to the BMS (10.1.1.2)]
VTEP, Part 3
1. The remote VTEP (R2) receives an IP packet destined to the IP address associated with the local VTEP tunnel endpoint.
2. The IP packet is decapsulated, which exposes the VXLAN header, which contains the VNI value associated with the Ethernet segment.
3. The remote VTEP removes the VXLAN encapsulation and forwards the original Ethernet frame out the interface associated with the destination MAC address in the local bridging table. If the local interface has a VLAN tag associated with it, the VLAN tag associated with the VNI is included in the transmitted frame.
• Locally attached server/VM/host MACs are learned from locally received packets.
• Remote MACs can be learned through a control plane signaling protocol, such as EVPN, that advertises locally learned MACs to remote VTEPs (the Juniper recommended solution).
BUM Traffic
The multicast learning model is one method of handling BUM traffic in a VXLAN-enabled network. In this model, you should note that the underlay network must support a multicast routing protocol, preferably some form of Protocol Independent Multicast - Sparse Mode (PIM-SM). Also, the VTEPs must support the Internet Group Management Protocol (IGMP) so that each VTEP can register for the multicast group that is associated with a configured VNI.
For every VNI used in the data center, there must also be a multicast group assigned. Remember that there are 2^24 (~16M) possible VNIs, so your customer could require up to 2^24 group addresses. Luckily, 239/8 is a reserved range of organizationally scoped multicast group addresses (2^24 group addresses in total) that can be used freely within your customer's data center.
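As a rough illustration of this model, the VNI-to-group mapping and a PIM-SM underlay might be configured along the following lines on a Junos-based VTEP and fabric device; the VLAN name, VNI, group address, and RP address are assumptions used only for illustration, and the # lines are annotations:

    # On the VTEP: map the VNI to its BUM multicast group
    set vlans v100 vxlan vni 5001
    set vlans v100 vxlan multicast-group 239.1.1.1
    # On the fabric devices: enable PIM-SM with a statically defined RP
    set protocols pim rp static address 172.16.100.1
    set protocols pim interface all mode sparse

Each VNI that needs BUM delivery is given a group from the organizationally scoped 239/8 range, and all VTEPs configured with that VNI join the same group.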
[Figure: Virtual Switch 1 (VTEP A, the receiver, behind R1) and Virtual Switch 1 (VTEP B, the source, behind the DR) attach to the multicast-enabled fabric]
Multicast Trees
With multicast MAC learning, a VTEP sends broadcast frames (such as ARP packets) received from a locally configured VNI to a multicast group that is manually mapped to the VNI. Remote VTEPs that will receive traffic from a remote VNI send an IGMP join within the core network, which indicates the desire to receive the traffic.
When a VNI-to-multicast group address mapping is configured, the VTEP (VTEP A in the example) registers interest in receiving traffic destined to the group (239.1.1.1 in the example) by sending an IGMP join message within the fabric network. A multicast tree is built to the Rendezvous Point within the multicast domain. With this system, all VTEPs that are configured to participate in a VNI are registered on the RP as interested receivers for that multicast group, and VTEP A will receive broadcast frames that are sent to the RP through multicast traffic distribution.
[Figure: Broadcast traffic from VTEP B is register-encapsulated by the DR and sent to the RP as unicast; R1 (VTEP A) receives the traffic down the shared tree]
Multicast Forwarding
When VTEP B receives a broadcast packet from a local VM or host, VTEP B encapsulates the Ethernet frame into the appropriate VXLAN/UDP/IP headers. However, it sets the destination IP address of the outer IP header to the VNI's group address (239.1.1.1 on the slide). Upon receiving the multicast packet, VTEP B's DR (the PIM router closest to VTEP B) encapsulates the multicast packet into unicast PIM register messages that are destined to the IP address of the RP. Upon receiving the register messages, the RP de-encapsulates the register messages and forwards the resulting multicast packets down the (*,G) tree. Upon receiving the multicast VXLAN packet, VTEP A does the following:
1. Forwards the broadcast packet toward the VMs using the virtual switch; and
2. Learns the remote MAC address of the sending VM and maps it to VTEP B's IP address.
For all of this to work, you must ensure that the appropriate devices support PIM-SM, IGMP, and the PIM DR and RP functions.
Although it is not shown in the diagram, once R1 receives the first native multicast packet from the RP (source address is VTEP B's address), R1 will build a shortest path tree (SPT) to the DR closest to VTEP B, which will establish (S,G) state on all routers along that path. Once the SPT is formed, multicast traffic will flow along the shortest path between VTEP A and VTEP B. However, direct communication between VM2 and VM3 is now possible by transiting the VXLAN tunnel between VTEP A and VTEP B, since after the initial MAC learning process, the MAC address of each VM is registered in the MAC-to-VTEP tables on R1 and the DR.
Agenda: VXLAN
VXLAN Gateways
The slide highlights the topic we discuss next.
• Encapsulates Layer 2 frames in VXLAN packets and performs a Layer 3 forwarding operation across the Layer 3 IP fabric;
• Decapsulates a VXLAN-encapsulated packet and performs a Layer 2 forwarding operation (bridging) to a host or virtual host device.
• Functions as a bridge between VNIs, and allows traffic from one VNI to be forwarded on a different VNI;
• Modifies the source MAC address of the underlying Ethernet frame, and places the local IRB interface virtual MAC address (gateway MAC) in its place prior to forwarding the Ethernet frame in the new VNI.
[Figure: VM2 (10.1.1.3) on Host Machine 1 sends a frame through Virtual Switch 1; the router (VTEP) encapsulates it (outer IP-DA 172.16.0.2, inner IP-DA 10.1.1.2) and carries it over a VXLAN tunnel across the IP fabric (172.16.0/24) to the bare-metal server]
When the IP packet arrives at the destination gateway, which is the remote VTEP, the VTEP identifies the destination address as a local address and removes the outer Layer 3 header to process the packet. Once the outer header is removed, the inner VXLAN header is identified and processed according to the information within that header.
Once the VXLAN header information is processed, the locally connected broadcast domain is identified. The outbound interface is identified based on the VNI tag in the VXLAN header and the local switching table, and the original packet is forwarded toward the destination device based on the switching table information. During this process, the IP addresses of the original packet are not examined.
• Contrail vRouter;
• QFX 5100 Series (including the QFX 5100, 5110, and 5120);
[Figure: VM2 (10.1.1.3) sends a frame destined to 1.1.1.1; the source VTEP (172.16.0.1/24) carries it over a VXLAN tunnel across the IP fabric (172.16.0/24) to Router B/VTEP (VXLAN L3 Gateway, 172.16.0.2/24), which performs a lookup and builds a new Ethernet frame around the original IP packet]
A Layer 2 gateway device forwards a decapsulated frame to a locally attached device based on the VNI in the VXLAN packet without examining the underlying Ethernet frame. A Layer 3 gateway device must have the capability to forward the original packet to a device that may reside within a different broadcast domain than the original frame, which requires a different process.
When a Layer 2 frame arrives at a VTEP from a VM or a host, the entire frame is encapsulated in a VXLAN packet. The original IP and MAC addresses (source and destination) are preserved within the encapsulated packet. The encapsulated packet is forwarded across the IP fabric, where the outer MAC address is modified with each hop (which is the standard IP forwarding process). During this process, the inner frame is unmodified.
The key difference between a Layer 2 Gateway and a Layer 3 Gateway is how the packet is processed when it arrives at the remote VTEP that is also acting as a Layer 3 Gateway. Unlike with a Layer 2 Gateway, a Layer 3 Gateway must remove the original source MAC address and replace it with the virtual MAC address associated with the L3 Gateway within the destination VNI, and then bridge the frame to a different VNI than the source VNI. To perform this task, the IRB of the L3 Gateway must be set as the "default gateway" for all hosts within the broadcast domain, and the IRB interface must be configured within the VNI as a host address that is reachable by all other devices within that broadcast domain, just as a default gateway in a Layer 3 network must reside within the same subnet as the hosts that wish to use the gateway. When a host, VM2 in the example, is required to transmit data to an IP address that does not belong to the locally configured subnet, VM2 sends the Ethernet frame to the configured default gateway MAC address, which is the virtual MAC address of the IRB interface on the L3 Gateway.
When the Ethernet frame arrives at the L3 Gateway, the destination MAC address is an IRB virtual MAC address. The inner frame is processed, the destination IP address is examined, and the VNI associated with the destination IP address is determined. The L3 Gateway then replaces the source MAC address in the original Ethernet frame header with the MAC address of the virtual gateway address (IRB virtual MAC). The frame is re-encapsulated in a VXLAN packet on the new VNI and is forwarded across the VXLAN tunnel to the remote VTEP that is connected to the destination MAC address of the inner frame. The remote VTEP decapsulates the VXLAN packet, and forwards the Ethernet frame to the end host, which sees the frame come from the MAC address of the gateway, and the IP address of the original source.
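As a hedged sketch of how such an IRB-based Layer 3 gateway might be configured on a Junos device, the statements below associate an IRB interface and an anycast virtual gateway address with the VNI's VLAN; the VLAN name, VNI, and addresses are assumptions used only for illustration, and the # lines are annotations:

    # IRB interface that acts as the default gateway for the 10.1.1.0/24 broadcast domain
    set interfaces irb unit 100 family inet address 10.1.1.1/24
    set interfaces irb unit 100 virtual-gateway-address 10.1.1.254
    # Bind the IRB to the VLAN that is mapped to the VNI
    set vlans v100 vlan-id 100
    set vlans v100 vxlan vni 5001
    set vlans v100 l3-interface irb.100

Hosts in the broadcast domain would use 10.1.1.254 (and its virtual gateway MAC) as their default gateway, while 10.1.1.1 remains the unique address of this particular gateway.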
[Figure: A host (physical or virtual, 10.0.0.1) sends a frame with source MAC AA and destination MAC GW-MAC toward a remote host at 192.168.1.100 through its VTEP and the Layer 3 Gateway]
A Layer 3 Gateway is also a Layer 2 Gateway, or VTEP. When a device in the network forwards a packet to an IP address that is part of a different subnet, the device forms an Ethernet frame with the destination MAC address of the locally configured default gateway, or the next hop in a routing table. The source VTEP receives the frame and performs a MAC address lookup in the forwarding table. The Layer 3 Gateway must be configured with an IRB interface within the source VNI so that the virtual MAC of the IRB is reachable by both the original host and the source VTEP.
The source VTEP encapsulates the original frame with the IP/UDP/VXLAN header information and forwards the frame through the VTEP tunnel that terminates on the Layer 3 GW. Once the IP packet arrives at the gateway VTEP endpoint, the frame is processed as normal, with the IRB interface as the local "host" to which the original frame is destined. The Layer 3 Gateway processes the Layer 2 frame normally, and identifies the destination IP address in the inner packet. The L3 Gateway then performs another bridge table lookup to determine the physical next hop (VXLAN tunnel) that is associated with the destination IP address. Once the next hop is determined, the Ethernet frame, sourced from the IRB virtual MAC address, is forwarded over the next hop to the destination IP address. If the next hop toward the destination IP address is a VXLAN tunnel, the appropriate IP/UDP/VXLAN header information is placed on the packet prior to transmission.
[Figure: The host reaches 192.168.1.100 through R1 (VTEP) and R2, which acts as both a VTEP and an L3 Gateway]
• QFX 5110/5120;
• MX Series; and
• vMX Series.
[Figure: Three gateway placement options — L3 gateways on the spine devices (CRB), L3 gateways within the IP fabric (Fabric L3 Gateways), and L3 gateways on the leaf devices (ERB)]
Centrally Routed Bridging places the VNI transition points, or L3 Gateways, on spine devices or within the IP fabric. With CRB deployments, Layer 3 GW functions are centralized, and the spine devices depicted in the first of the examples shown are required to support VXLAN L2 Gateway capabilities, and are VTEPs within the VXLAN domain.
The second option, Fabric Layer 3 Gateways, is also a CRB deployment type. With Fabric Layer 3 Gateways, the spine devices of each data center pod or fabric branch are not required to have any VXLAN capabilities, as they only forward IP packets between the leaf devices and the L3 Gateways within the fabric.
The third option shown is an Edge Routed Bridging, or ERB, deployment method. With an ERB deployment, bridging between broadcast domains, or VNIs, occurs on the leaf nodes at the edge of the network.
Each of these designs has benefits and drawbacks. Some benefits of CRB designs are the consolidation of L3 Gateway functions within sub-domains. This allows the deployment of non-L3 GW capable devices at the leaf nodes, and provides modularity and scalability.
One drawback of a CRB design is a process called hair-pinning. Hair-pinning refers to the forwarding path of all traffic that must be forwarded between broadcast domains. Multiple broadcast domains can be connected to the same leaf device, and can even be present on the same host. Traffic that must be forwarded between those broadcast domains must be forwarded to the L3 Gateway device, wherever it is configured, and then return to the destination host.
The ERB design alleviates the hair-pinning of traffic in the network by transitioning between broadcast domains on the leaf nodes themselves, which can then forward the transitioned frames to the remote VTEPs directly. The drawback to configuring ERB deployments is that the leaf nodes must support L3 Gateway functions, which can, at times, require a more advanced or expensive device at the leaf level. It also requires the configuration of gateway functions and addresses on each leaf within the deployment, which increases configuration and management overhead.
Summary
We Discussed:
• VXLAN functions and operations;
• The control and data plane of VXLAN in a controller-less overlay; and
• VXLAN Gateways.
Review Questions
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
• EVPN functionality; and
• EVPN control in a VXLAN deployment.
Agenda: EVPN
➔ VXLAN Management
■ VXLAN with EVPN Control
■ EVPN Routing and Bridging
VXLAN Management
The slide lists the topics we will discuss. We will discuss the highlighted topic first.
      VXLAN Review
      ■    Many traditional applications in a data center require Layer 2 connectivity
           between devices
[Figure: Host A and Host B connected through L2 switches across a switched network]
      ■    VXLAN allows Layer 2 tunneling between devices on the same broadcast
           segment across a Layer 3 transit network or fabric
[Figure: Host A (10.1.1.1/24) and Host B (10.1.1.2/24) connected through L3 switches or routers by a VXLAN tunnel across an IP fabric (172.16.0/24)]
VXLAN Review
VXLAN is a Layer 2 VPN technology that extends a Layer 2 broadcast domain across a Layer 3 IP domain. VXLAN is commonly used in data center deployments that use a Layer 3 fabric design.
Layer 2 Applications
The needs of the applications that run on the servers in a data center usually drive the designs of those data centers. There are many server-to-server applications that have strict Layer 2 connectivity requirements between servers. A switched infrastructure that is built around xSTP or a Layer 2 fabric (like Juniper Networks' Virtual Chassis Fabric or Junos Fusion) is perfectly suited for this type of connectivity. This type of infrastructure allows broadcast domains to be stretched across the data center using some form of VLAN tagging. A Layer 2 fabric has several limitations, however, such as scalability and manageability.
IP Fabric
Many of today's next generation data centers are being built around IP fabrics which, as their name implies, provide IP connectivity between the racks and devices of a data center. How can a next generation data center based on IP-only connectivity support the Layer 2 requirements of the traditional server-to-server applications? We will discuss the possible solutions to the Layer 2 connectivity problem.
[Figure: VNI 1 is mapped to group 239.1.1.1 on both VTEP-A (VM A) and VTEP-B (VM B); each VTEP sends a join for 239.1.1.1 into the multicast-enabled IP fabric (172.16.0/24)]
In a multicast-signaled VXLAN deployment, a VTEP must be configured to join a multicast group that is assigned to a virtual network. This is a manual, static process. When a new host (physical or virtual) is connected to a VTEP, the multicast group that represents the virtual network in which the new host will participate must be configured on the VTEP.
[Figure: VM A migrates from the host behind VTEP-A to the host behind VTEP-C. With multicast signaling, VTEP-C must be configured to listen to group 239.1.1.1 after the migration, while VTEP-A still listens to group 239.1.1.1. With EVPN signaling, VTEP-C instead advertises itself as a VNI 1 participant.]
Agenda: EVPN
■ VXLAN Management
➔ VXLAN with EVPN Control
■ EVPN Routing and Bridging
       EVPN Overview
       ■    VXLAN is a Layer 2 VPN
• Defined in RFC 7348/RFC 8365
                           • Encapsulates Ethernet Frames into IP packets
       ■    EVPN is a control plane
              • Based on BGP
                           • Highly scalable
                           • Ability to apply policy
              • All active forwarding
                           • Multipath forwarding over the underlay
                           • Multipath forwarding to active/active dual-homed server
              • Control plane MAC learning
                           • Reduced unknown unicast flooding
                           • Reduced ARP flooding
              • Distributed Layer 3 gateway capabilities
EVPN leverages the scale and flexibility of MP-BGP to manage and control a VXLAN environment. An EVPN-controlled VXLAN can utilize multiple equal-cost paths for load sharing and redundancy, which provides multipath forwarding over the underlay network. Additionally, it provides the capability of multihoming servers with active/active forwarding.
A key difference between EVPN-based VXLAN and traditional multicast-based VXLAN signaling is the manner in which MAC address information is propagated between VTEPs. With EVPN, locally learned MAC addresses are advertised to remote VTEPs by using BGP updates, rather than the multicast flooding mechanisms that are used in multicast-signaled VXLANs. This process reduces unknown unicast flooding and ARP flooding. The use of EVPN as a control plane also provides the ability to implement distributed Layer 3 gateways, which can share traffic and provide redundancy to the network.
[Figure: Host A connects to Leaf1 and Host B connects to Leaf2; both leaves act as VXLAN L2 gateways joined by a VXLAN tunnel across the IP fabric]
MP-BGP
EVPN is based on Multiprotocol BGP (MP-BGP). It uses the Address Family Identifier (AFI) of 25, which is the Layer 2 VPN address family. It uses the Subsequent Address Family Identifier (SAFI) of 70, which is the EVPN address family.
BGP is a proven protocol in both service provider and enterprise networks. It has the ability to scale to millions of route advertisements. BGP also has the added benefit of being policy oriented. Using policy, you have complete control over route advertisements, allowing you to control which devices learn which routes.
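A minimal sketch of the EVPN overlay BGP session, assuming an IBGP design with loopback peering between leaf and spine; the AS number and addresses are illustrative only, and the # line is an annotation:

    set routing-options autonomous-system 65000
    # Overlay IBGP session carrying the EVPN address family (AFI 25, SAFI 70)
    set protocols bgp group overlay type internal
    set protocols bgp group overlay local-address 172.16.1.1
    set protocols bgp group overlay family evpn signaling
    set protocols bgp group overlay neighbor 172.16.0.11

In larger fabrics, the spine devices would typically act as route reflectors for this overlay group so that the leaves do not need a full mesh of IBGP sessions.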
       Active/Active Forwarding
       ■    All active forwarding allows for:
              • Multipath in the fabric
[Figure: Host A connects to Leaf1; Host B is multihomed to the fabric through multiple leaf devices (Leaf3 shown). The legend distinguishes VXLAN tunnels from BUM traffic paths.]
Active/Active Forwarding
When using PIM in the control plane for VXLAN, it is not really possible to have a server attach to two different top-of-rack switches with the ability to forward data over both links (that is, with both links active). When using EVPN signaling in the control plane, active/active forwarding is entirely possible. EVPN allows VXLAN gateways (Leaf1 at the top of the slide) to use multiple paths and multiple remote VXLAN gateways to forward data to multihomed hosts. Also, EVPN has mechanisms (such as split horizon) to ensure that broadcast, unknown unicast, and multicast (BUM) traffic does not loop back toward a multihomed host.
Unknown Unicast
[Figure: HostB sends a frame (SA: HostB MAC, DA: HostC MAC) toward Leaf2, which forwards it over the VXLAN tunnel toward HostA's side of the fabric]
1. Leaf2 receives an Ethernet frame with a source MAC address of HostB and a destination MAC address of HostC.
2. Based on a MAC table lookup, Leaf2 forwards the Ethernet frame to its destination over the VXLAN tunnel. Leaf2 also populates its MAC table with HostB's MAC address and associates it with the outgoing interface.
3. Since Leaf2 just learned a new MAC address, it advertises the MAC address to the remote VXLAN gateway, Leaf1. Leaf1 installs the newly learned MAC address in its MAC table and associates it with an outgoing interface, the VXLAN tunnel to Leaf2.
Now, when Leaf1 needs to send an Ethernet frame to HostB, it can send it directly to Leaf2 because it is a known MAC address. Without the sequence above, Leaf1 would have no MAC entry in its table for HostB (making the frame destined to HostB an unknown unicast Ethernet frame), so it would have to send a copy of the frame to all remote VXLAN gateways.
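On a Junos-based leaf, the control plane learning described above can be inspected with commands along these lines (shown as a verification sketch, without sample output):

    show evpn database
    show route table bgp.evpn.0
    show ethernet-switching table

The EVPN database lists MAC (and MAC+IP) entries together with the advertising VTEP, while the bgp.evpn.0 table holds the received EVPN route advertisements themselves.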
Proxy ARP
[Figure: HostA (10.1.1.3) sends an ARP request for 10.1.1.2; Leaf1 replies locally with "10.1.1.2 is HostB's MAC," based on the binding it learned from Leaf2 through a BGP update across the fabric]
Proxy ARP
The EVPN RFC mentions that an EVPN Provider Edge (PE) router, Leaf1 in the example, can perform proxy ARP. If Leaf2 knows the IP-to-MAC binding for HostB (because it was snooping some form of IP traffic from HostB), it can send a MAC advertisement for HostB that also contains HostB's IP address. Then, when HostA sends an ARP request for HostB's IP address (a broadcast Ethernet frame), Leaf1 can send an ARP reply back to HostA without ever having to send the broadcast frame over the fabric.
Although an automatic virtual MAC address for the Layer 3 Gateway is generated, Juniper Networks recommends manually configuring a virtual MAC address for the IRB interfaces.
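A hedged sketch of that recommendation on a Junos L3 gateway follows; the addresses and MAC value are illustrative only, the availability of the virtual-gateway MAC statement depends on platform and release, and the # line is an annotation:

    set interfaces irb unit 100 family inet address 10.1.1.1/24
    set interfaces irb unit 100 virtual-gateway-address 10.1.1.254
    # Explicitly set the shared virtual gateway MAC instead of relying on the auto-generated value
    set interfaces irb unit 100 virtual-gateway-v4-mac 00:00:5e:00:01:01

Configuring the same virtual gateway address and MAC on every gateway for the subnet keeps host ARP entries stable regardless of which gateway answers.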
Agenda: EVPN
• VXLAN Management
• VXLAN with EVPN Control
➔ EVPN Routing and Bridging
EVPN Terminology
• VXLAN is a Layer 2 VPN
  • Terminology often overlaps with other VPN terminology
    • PE router (Provider Edge) - VTEP Gateway or Leaf Node (VXLAN Gateway)
    • CE device (Customer Edge) - Host (or virtual host within a physical host)
    • Site - set of hosts or virtual hosts connected to a VTEP Gateway
    • NVE - Network Virtualization Edge
    • NVO - Network Virtualization Overlay
  • IBGP is used regardless of underlay routing protocol
[Figure: Host A, Host B, and Host C, each attached to the EVPN-VXLAN fabric]
EVPN Terminology
A VXLAN environment is a Layer 2 VPN. Terminology in an EVPN VXLAN environment resembles that of a traditional VPN environment, including the concepts of:
• Provider Edge (PE) devices: A VPN edge device that performs encapsulation and decapsulation (VXLAN Gateway, or VTEP). These are often referred to as Leaf nodes;
• Customer Edge (CE) devices: A device that is associated with a customer site. In a data center, this is often a host (physical or virtual) that is connected to a Leaf node; and
• Sites: A set of hosts or virtual hosts connected to a VTEP gateway. It is common to have multiple hosts or virtual hosts that participate in the same virtual network connected to the same VTEP or Leaf node.
Regardless of the underlay routing protocol, MP-IBGP is used for the EVPN control plane signaling.
       EVPN Instances
       ■  EVPN Instance (EVI)
            • EVI is a virtual switch
            • EVI is a MAC VRF
       ■  Bridge domain
            • BD is equivalent to a VLAN
(Slide figure: an IBGP session carrying EVPN signaling runs across the spine devices.)
EVPN Instances
An EVPN Instance (EVI) is a virtual switch within a VTEP. It is essentially a MAC VRF that is unique to a broadcast domain. Because multiple broadcast domains may be connected to a single VTEP, the VTEP maintains a separate MAC VRF for each instance in order to maintain traffic isolation between broadcast domains.
Bridge Domain
A bridge domain is equivalent to a VLAN. It is a Layer 2 broadcast domain within a VXLAN environment. There can be many bridge domains for a given EVI.
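To make the relationship concrete, on a Junos leaf a bridge domain (VLAN) is typically mapped to a VXLAN VNI. This is only an illustrative sketch; the VLAN name, VLAN ID, and VNI are examples rather than values from the course topology:

    set vlans v100 vlan-id 100
    set vlans v100 vxlan vni 5100

Each VLAN-to-VNI mapping of this kind becomes a bridge domain within the EVI.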
      Ethernet Segment
      ■  An Ethernet segment is a set of Ethernet links that connects a customer site to one or more leaf devices
            • Single-homed sites use the reserved Ethernet Segment ID (ESI) of 0 (0x00:00:00:00:00:00:00:00:00:00)
            • Multihomed sites must be assigned a unique, 10-octet ESI
                  • Interfaces A and B belong to the same Ethernet segment and should be assigned a network-wide, unique, non-reserved ESI
                  • Site 2's ESI, interface C, would be assigned an ESI of 0
                  • Used to prevent traffic replication and loops on multihomed sites
(Slide figure: host1 at Site 1 is multihomed to Leaf1 and Leaf2 over interfaces that share ESI 0x0:1:1:1:1:1:1:1:1:1, while host2 at Site 2 is single-homed with ESI 0x0. The leaves send route advertisements carrying the ESI. Reserved ESIs: single-homed site = 0x00:00:00:00:00:00:00:00:00:00; MAX-ESI = 0xFF:FF:FF:FF:FF:FF:FF:FF:FF:FF.)
Ethernet Segments
When a host is multihomed to multiple leaf nodes, the Link Aggregation Group (LAG) represents a single logical connection to the VXLAN domain. To enable the EVPN control plane to properly manage the multiple links in the LAG, the LAG is assigned a unique Ethernet Segment ID (ESI). By default, a single-homed link is assigned the ESI value of 0 (0x00:00:00:00:00:00:00:00:00:00). A single-homed site does not require loop prevention or load sharing across multiple links.
With a multihomed site, a unique 10-octet ESI must be assigned to the link bundle. This value is assigned by the administrator. In the example, the LAG that connects CE1 to Leaf1 and Leaf2 is assigned a value of 0x0:1:1:1:1:1:1:1:1:1. This enables the EVPN control plane to identify that the device connected to Leaf1 and Leaf2, over link A and link B, is the same site. Because the EVPN control plane advertises the ESI associated with the LAG group to all VTEPs, remote VTEPs can install multiple next hops associated with MAC addresses assigned to CE1, and two VXLAN tunnels are available for forwarding. In addition, Leaf1 and Leaf2 are able to manage the LAG bundle without implementing MC-LAG, and traffic loops and traffic replication between the two LAG interfaces can be prevented.
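As an illustration of how an ESI might be assigned to a multihomed LAG on each attached leaf, consider the following sketch. The interface name, ESI value, and LACP system ID are examples only and are not taken from the slide:

    set interfaces ae0 esi 00:01:01:01:01:01:01:01:01:01
    set interfaces ae0 esi all-active
    set interfaces ae0 aggregated-ether-options lacp active
    set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:01:01:01

The same ESI and LACP system ID would be configured on both leaves attached to the site so that the host sees a single LAG.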
            • Type 1:
                  • RFC 7432 recommended
                  • ADM field is 4 bytes: should contain an IP address assigned by IANA
                  • AN field is 2 bytes: a number assigned by the service provider
                  • Example: 1.1.1.1:33
When routes are advertised within the data center (service provider) environment, they may be stored in local RIB-IN databases on each PE device (leaf node). To avoid any possible overlap of the site addresses between different broadcast domains, a route distinguisher that is unique to each Ethernet segment is prepended to each route before it is advertised to remote PEs. In the example, two clients, or broadcast domains, are using the same IP prefix of 1.1.1.0/24. The route distinguisher allows the service provider to distinguish between the 1.1.1.0/24 network of each customer by making each network address unique while it transits the provider domain.
        •   Type 0: This format uses a 2-byte administration field that codes the provider's autonomous system number, followed by a 4-byte assigned number field. The assigned number field is administered by the provider and should be unique across the autonomous system.
        •   Type 1: This format uses a 4-byte administration field that is normally coded with the router ID (RID) of the advertising PE router, followed by a 2-byte assigned number field that carries a unique value for each VRF table supported by the PE router.
The examples on the slide show both the Type 0 and Type 1 route distinguisher formats. The first example shows the 2-byte administration field with the 4-byte assigned number field (Type 0).
RFC 7432 recommends using the Type 1 route distinguisher for EVPN signaling.
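On a Junos leaf, a Type 1 route distinguisher is commonly derived from the device loopback address. The following is an illustrative sketch only; the loopback address and assigned number are examples:

    set switch-options route-distinguisher 192.168.100.11:1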
      ■  VRF-Import Policy
            • Policy that examines received MP-BGP routes and identifies route target communities associated with each route
            • Used to import routes with a matching route target community into an associated VRF routing table
      ■  VRF-Export Policy
            • Policy that applies a configured route target community value to advertised MP-BGP routes
            • Used to announce connectivity to local VRF instances
      ■  Route Isolation between MAC-VRF tables
            • Careful policy administration allows separation of MAC-VRF entries on a node
            • Example route target community: target:64520:1
When a PE router receives route advertisements from remote PE routers, it determines whether the associated route target matches one of its local VRF tables. A matching route target causes the PE router to install the route into the VRF table whose configuration matches the route target.
Because the application of policy determines a VPN's connectivity, you must take extra care when writing and applying VPN policy to ensure that the tenant's connectivity requirements are faithfully met.
            1.  Locally learned MACs are copied into the EVI-specific VRF table as EVPN Type 2 routes.
            2.  Locally generated EVPN routes are advertised to remote nodes with the target community attached.
The vrf-target statement always sets the target community (using hidden VRF import and export policies) of Type 1 routes. By default, the vrf-target statement also sets the target community of Type 2 and Type 3 routes as well.
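As an illustration, the vrf-target statement might be configured as follows. The community value is an example only; Junos can also derive route targets automatically with vrf-target auto, which relies on the autonomous system number discussed with the common routing-options configuration later in this chapter:

    set switch-options vrf-target target:64520:1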
            1.  A received route is compared to the vrf-import policy. If the route's target community matches the import policy, the route is copied into the EVPN RIB-IN and also into the EVI's VRF table.
            2.  The newly learned MAC (from the update) is copied into the EVI's MAC table, and a recursive lookup is used to determine the outgoing VXLAN tunnel.
(Slide figure: Leaf1's MAC table for the green EVPN contains 01:01:01:01:01:01 > nh ge-0/0/0. The MAC/IP advertisement, carrying 01:01:01:01:01:01 > nh indirect with target:1:1, is relayed through the spines, and the receiving leaf copies the MAC into its green EVPN MAC table as 01:01:01:01:01:01 > nh VXLAN tunnel to Leaf1.)
BUM Traffic
When EVPN signaling is used with VXLAN encapsulation, Juniper Networks devices only support ingress replication of BUM traffic. That is, when BUM traffic arrives on a VTEP, the VTEP will unicast copies of the BUM packets to each of the individual VTEPs that belong to the same EVPN. This is the default behavior and setting on a Junos OS device.
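Where ingress replication is set explicitly, the configuration is typically along the following lines. This is a sketch only; on many platforms and releases this is the default for EVPN-VXLAN and does not need to be configured:

    set protocols evpn multicast-mode ingress-replication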
EVPN Routes
EVPN routes are classified according to route types. Each route type performs a specific task in an EVPN environment. We will discuss the following route types in more detail:
Route Validation
A BGP route advertisement consists of a destination prefix. The prefix that is advertised includes a set of properties associated with that prefix. These properties are used for a variety of tasks. One of the properties is called the protocol next hop. The protocol next-hop property lists the remote gateway to use to reach the advertised prefix. The remote gateway, or protocol next hop, is not always set to the device that advertises the route, since an advertising router may simply be relaying the route from another peer.
If a route prefix is received by a BGP router, the BGP router must validate whether or not the remote gateway to the prefix is reachable in the local routing table. After all, a device can't forward an IP packet toward a remote gateway if the device does not have a route toward that gateway.
The process of validating the protocol next hop of a received route is called a recursive lookup. A recursive lookup describes the process of looking within a local routing table to find a physical next hop that points toward a received remote next hop, or gateway address. In the case of an IP prefix, the local router examines the entries in the local inet.0 routing table to determine the next physical hop toward the advertised protocol next hop. The physical next hop to the remote gateway is then associated with the advertised prefix and installed in the local routing table as the physical next hop for the advertised prefix.
In the case of an MPLS VPN route, an MPLS destination must be reachable through an MPLS tunnel. MPLS tunnels are installed in the inet.3 routing table, and therefore all MPLS VPN advertised routes must have a route to the protocol next hop present in the inet.3 routing table.
In the case of an EVPN advertised destination, the EVPN destination must be reachable through a VXLAN tunnel before it can be placed in the local bgp.evpn.0 routing table. The VXLAN tunnel routes are installed in the :vxlan.inet.0 routing table. Therefore, when an EVPN route is received by a router through MP-BGP, the route to the protocol next hop of that prefix, which is the loopback address of the remote VTEP device, must exist in the :vxlan.inet.0 route table. When a local subnet is configured as a VNI, the local interface is added to the :vxlan.inet.0 table. When a remote VTEP is discovered, a route to the remote VTEP loopback is placed in the :vxlan.inet.0 table as an automatically generated static route. The VTEP logical tunnel interfaces are created when a remote VTEP advertises connectivity to one of the VNIs that is configured locally. If a remote BGP peer advertises an EVPN route for a VNI that does not correspond to a locally configured VNI, then the local VTEP will not have a VTEP tunnel to the remote gateway, and therefore will not have a route to the protocol next hop. This causes the local VTEP to drop the advertised EVPN prefix, because the prefix is for a VNI that is not associated with any locally configured VNIs. This saves resources by not storing routes to remote VNIs that are not configured locally in the bgp.evpn.0 routing table.
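This behavior can be observed with standard operational commands that display the relevant tables (output not shown here):

    show route table bgp.evpn.0
    show route table :vxlan.inet.0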
(Slide figure: Leaf1 and Leaf2 advertise Type 1 routes carrying the ESI 0x0:1:1:1:1:1:1:1:1:1 assigned to the multihomed Site 1/host1, while single-homed Site 2/host2 uses the reserved ESI 0x0. Reserved ESIs: single-homed site = 0x00:00:00:00:00:00:00:00:00:00; MAX-ESI = 0xFF:FF:FF:FF:FF:FF:FF:FF:FF:FF. Dotted arrows represent BGP route advertisements.)
Once you have configured a non-reserved ESI value on a site-facing interface, the PE will advertise an Ethernet Autodiscovery route to all remote VTEPs. The route carries the ESI value as well as the ESI Label extended community. The community contains the Single-Active flag. This flag lets the remote VTEPs know whether or not they can load share traffic over the multiple links attached to the site. If the Single-Active flag is set to 1, only one link associated with the Ethernet segment can be used for forwarding. If the Single-Active flag is set to 0, all links associated with the Ethernet segment can be used for forwarding data (we call this active/active forwarding). Juniper Networks devices only support active/active forwarding (we always set the flag to 0).
(Slide figure: host1's MAC might have only been advertised from Leaf1, but Leaf3 assumes the MAC is reachable by any leaf attached to the Ethernet segment. Leaf3's MAC-VRF table therefore lists MAC host1 > vtep.32678 and > vtep.32679, so VXLAN tunnels to both attached leaves can be used.)
Network Convergence
Another benefit of the Ethernet Autodiscovery route is that it helps to enable faster convergence times when a link fails. Normally, when a site-facing link fails, a VTEP will withdraw each of its individual MAC advertisements. Think about the case where there are thousands of MACs associated with that link. The VTEP would have to send thousands of withdrawals. When the Ethernet Autodiscovery route is being advertised (because the esi statement is configured on the interface), a VTEP (like Leaf1 on the slide) can send a single withdrawal of its Ethernet Autodiscovery route, and Leaf3 can immediately update the MAC table for all of the thousands of MACs it had learned from Leaf1. This greatly improves convergence times.
(Slide figure: Site 1/host1 at 10.10.10.11/24 attaches to Leaf1, loopback 2.2.2.2/32; Site 2/host2 at 10.10.10.22/24 with MAC 05:05:05:05:05:01 attaches to Leaf2, loopback 4.4.4.4/32. VXLAN tunnels run between the leaves across the spines.)
       EVPN Type 3:
       Inclusive Multicast Ethernet Tag Route
       NLRI
            Route Type: Inclusive Multicast Ethernet Tag Route (Type 3)
            Route Distinguisher (RD): RD of EVI on Leaf1
            Ethernet Tag ID: VXLAN VNID
            Originator IP Address: 2.2.2.2
       Provider Multicast Service Interface (PMSI) Tunnel
            Flags: 0 (no leaf info required)
            Tunnel Type: Ingress Replication, PIM
            MPLS Label: N/A
            Tunnel ID: Multicast group or sender IP (2.2.2.2)
       Extended Community: Route target for EVI on Leaf1
       Other attributes (Origin, AS-Path, Local-Pref, etc.): ...
(Slide figure: Leaf1, loopback 2.2.2.2/32, advertises the Type 3 route across the spines; Site 1/host1 is at 10.10.10.11/24 and Site 2/host2, MAC 05:05:05:05:05:01, is at 10.10.10.22/24 behind Leaf2, loopback 4.4.4.4/32.)
(Slide figures: dotted lines represent BUM traffic and solid lines represent VXLAN tunnels. On the left, Leaf1 is the source leaf and host2 receives duplicate copies of the BUM traffic; on the right, Leaf3 is the source leaf and Leaf2 forwards the replicated BUM traffic back toward the source host1.)
In the diagram on the left, Leaf1 makes copies of the BUM packets and unicasts them to each remote PE that belongs to the same EVPN. This will cause host2 to receive multiple copies of the same packets.
In the diagram on the right, Leaf3 receives BUM traffic from the attached host. It makes copies and unicasts them to the remote VTEPs, which includes Leaf2. Because of the default split horizon rules, Leaf2 forwards BUM traffic back to the source, which creates a loop.
      Designated Forwarder
      ■  A designated forwarder is a device that is elected to forward traffic toward a multihomed site
            • Once a designated forwarder is elected, BUM traffic will not be sent toward the non-designated forwarder
            • Ethernet segment route (Type 4) is used to elect the designated forwarder
(Slide figures: the same two scenarios, now with Leaf2 elected as the designated forwarder (DF) for the Ethernet segment, so BUM traffic is forwarded toward the multihomed site only by the DF.)
Designated Forwarder
To fix the problems described in the previous example, all of the VTEPs attached to the same Ethernet segment (two or more VTEPs advertising the same ESI) will elect a designated forwarder for the Ethernet segment. A designated forwarder is a device that is elected to forward traffic to an Ethernet segment. A designated forwarder will be elected for each broadcast domain. Remember that an EVI can contain one or more broadcast domains, or VLANs. The Ethernet Segment route (Type 4) is used to help with the election of the designated forwarder.
(Slide figure: Leaf2, loopback 4.4.4.4/32, and Leaf3 advertise Type 4 routes over the VXLAN overlay for the Ethernet segment they share.)
In the example, Leaf2 and Leaf3 will advertise a Type 4 route to every VTEP that belongs to an EVPN. However, the route is not tagged with a target community. Instead, it is tagged with an ES-import target community. The ES-import target community is automatically generated by the advertising VTEP and is based on the ESI value. Since Leaf1 does not have an import policy that matches on the ES-import target, it will drop the Type 4 routes. However, since Leaf2 and Leaf3 are configured with the same ESI, the routes are accepted by a hidden policy that matches on the ES-import target community that is only known by the VTEPs attached to the same Ethernet segment. Leaf2 and Leaf3 use the Originator IP address in the Type 4 route to build a table that associates an Originator IP address (that is, the elected designated forwarder) with a VLAN in a round-robin fashion. After the election, if a non-designated forwarder for a VLAN receives BUM traffic from a remote VTEP, it will drop those packets.
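The result of the election can typically be verified with the standard operational command below, which displays the ESIs known to the device and the designated forwarder elected for each (output not shown here):

    show evpn instance extensive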
(Slide figure: Spine1 and Spine2 act as VXLAN Layer 3 gateways. On each spine, the default switch instance (the VXLAN Layer 2 gateway) has an irb.0 interface with virtual IP 10.10.10.254/24 and virtual MAC 00:00:5e:00:01:01; the IRB route in inet.0 enables the device to perform the VXLAN Layer 3 gateway function, and VXLAN tunnels connect the spines to the remote leaves. host1 is at 10.10.10.11/24 behind Leaf1 and host2 is at 10.10.10.22/24 behind Leaf2.)
If both Spine1 and Spine2 are configured in this manner, and use the same virtual gateway address, both devices will share the same virtual IP address and the same virtual MAC address of 00:00:5e:00:01:01. The spine nodes will each advertise that MAC address to the other VTEPs. The remote VTEPs will be able to load share traffic over the multiple paths to the same virtual MAC address. In the event that one of the gateway devices fails, the remaining gateway continues to service traffic that is forwarded to the virtual IP and MAC addresses.
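A minimal sketch of how such a shared virtual gateway might be configured on each spine follows. The virtual address and MAC match the slide; the unique per-spine IRB address (10.10.10.1/24 here) is an illustrative assumption:

    set interfaces irb unit 0 family inet address 10.10.10.1/24 virtual-gateway-address 10.10.10.254
    set interfaces irb unit 0 virtual-gateway-v4-mac 00:00:5e:00:01:01

The second spine would use its own unique IRB address but the same virtual-gateway-address and virtual MAC.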
Summary
We Discussed:
        •   EVPN functionality.
Review Questions
    1.
    2.
    3.
Answers
    2. An Ethernet Segment route is tagged with an ES-Import Route Target Community.
    3. A designated forwarder is used in a multihomed site environment.
                               Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            Configuring EVPN controlled VXLAN.
(Slide figure: example topology. host1 and host2 attach to the IP fabric through interface ens4; EBGP sessions interconnect the fabric devices.)
Example Topology
This slide shows the example topology that will be used for the EVPN-VXLAN example.
In the example topology for this configuration section, traffic from host1 and host2 will be tunneled across an IP fabric network. The IP fabric network consists of five Layer 3 capable switches, which act as routers.
Each switch in the IP fabric is assigned a unique, private autonomous system ID (AS ID). EBGP sessions will be established between each switch device. An alternative to EBGP peering between each switch would be to configure an IGP, such as OSPF or IS-IS, between the switch devices. This BGP peering configuration provides the connectivity of the underlay network, which provides reachability information among all underlay devices.
The leaf nodes are VTEP devices, or Layer 2 gateways, and are QFX Series devices. The spine nodes are configured as VXLAN Layer 3 gateways.
The goal is to ensure that host1 and host2 can communicate with each other.
       Loopback Addresses:
            spine1: 192.168.100.1
            spine2: 192.168.100.2
            leaf1: 192.168.100.11
            leaf2: 192.168.100.12
            leaf3: 192.168.100.13
(Slide figure: logical view of the overlay. The two spines act as route reflectors (RR) in overlay AS 65100, with IBGP sessions to the leaves; host1 and host2 attach to the fabric through interface ens4.)
Logical View
To help you understand the behavior of the example, the diagram shows a logical view of the overlay network. Using the help of VXLAN, it will appear that host1, host2, and the IRBs of the routers in AS 65001 and AS 65002 are in the same broadcast domain and IP subnet. Also, the IRBs of the routers in AS 65001 and AS 65002 will share the same virtual gateway address and virtual gateway MAC address to represent a distributed gateway.
            • VXLAN tunnels established automatically to devices that advertise EVPN route destinations
       Loopback Addresses:
            spine1: 192.168.100.1
            spine2: 192.168.100.2
            leaf1: 192.168.100.11
            leaf2: 192.168.100.12
            leaf3: 192.168.100.13
(Slide figure: spine1 (underlay AS 65001) and spine2 (underlay AS 65002) serve as route reflectors for overlay AS 65100; VXLAN tunnels interconnect the VTEPs, with host1 and host2 attached to the fabric.)
VXLAN Tunnels
You must ensure that all VTEP addresses are reachable by all of the routers in the IP fabric. Generally, the loopback interface will be used on Juniper Networks routers as the VTEP interface. Therefore, you must make sure that the loopback addresses of the routers are reachable. The loopback interface for each router in the IP fabric was configured in the 192.168.100.0/24 range.
The diagram shows the tunnel overlay between the devices. These tunnels will be automatically generated by the EVPN control plane. Each leaf device will be a VTEP tunnel endpoint when it advertises reachability to a local EVPN-VXLAN VNI. In the diagram, you can see that leaf2 is not a VTEP in this example. This is because there are no locally connected hosts, and therefore leaf2 does not advertise connectivity to any VNIs. However, leaf2 will have a similar configuration as leaf1 and leaf3. If a host is connected to, or migrated to, an interface connected to leaf2, the newly activated VNI will be advertised to remote BGP peers, and VTEP tunnels to all other VTEPs will be automatically generated.
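Once a VNI is advertised and the tunnels are created, they can be verified with standard operational commands such as the following (output not shown here):

    show ethernet-switching vxlan-tunnel-end-point remote
    show interfaces vtep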
       BGP Configuration (1 of 3)
       ■  Common configuration
       All routers:

       {master:0}[edit]
       lab@spine1# show routing-options
       router-id 192.168.100.1;
       autonomous-system 65000;
       forwarding-table {
           export load-balance;
           chained-composite-next-hop {
               ingress {
                   evpn;
               }
           }
       }

            • autonomous-system: should be set to the overlay topology AS (useful for automatic route target generation, discussed later)
            • export load-balance: applies the load-balance policy to the forwarding table
            • chained-composite-next-hop ingress evpn: allows groups of routes to share a common next hop, rather than an individual next hop for each route

       {master:0}[edit]
       lab@leaf3# show policy-options
       policy-statement export-directs {
           term loopback {
               from {
                   protocol direct;
                   interface lo0.0;
               }
               then accept;
           }
       }
       policy-statement load-balance {
           term load-balance {
               then {
                   load-balance per-packet;
                   accept;
               }
           }
       }

            • export-directs: advertises the local loopback address so that the overlay will have reachability to loopback addresses
            • load-balance: load-balancing policy that installs multiple next hops in the forwarding table; applied to the forwarding table
The chained-composite-next-hop statement allows the device to create an indirect next hop in the routing table, and to associate many routes that share the same next hop with a single indirect next hop locally. This can provide significant benefits when a remote leaf fails or withdraws reachability to a large number of remotely connected destinations, or when a fabric link or node fails and the physical next hop to remote destinations must be changed for a large number of destinations. On the local device, only the indirect next hop must be changed, or re-mapped to the new physical next hop, to adjust the forwarding table for all prefixes that are mapped to that next hop; each individual next hop does not need to be updated one at a time.
       BGP Configuration (2 of 3)
       ■  Route Reflector/Spine Nodes

       spine1 Configuration:
       {master:0}[edit protocols bgp]
       lab@spine1# show
       group fabric {
           type external;
           export export-directs;
           local-as 65001;
           multipath {
               multiple-as;
           }
           neighbor 172.16.1.6 {
               peer-as 65003;
           }
           neighbor 172.16.1.10 {
               peer-as 65004;
           }
           neighbor 172.16.1.14 {
               peer-as 65005;
           }
       }
       group overlay {
           type internal;
           local-address 192.168.100.1;
           family evpn {
               signaling;
           }
           cluster 1.1.1.1;
           local-as 65000;
           multipath;
           neighbor 192.168.100.2;
           neighbor 192.168.100.11;
           neighbor 192.168.100.12;
           neighbor 192.168.100.13;
       }

       spine2 Configuration:
       {master:0}[edit protocols bgp]
       lab@spine2# show
       group fabric {
           type external;
           export export-directs;
           local-as 65002;
           multipath {
               multiple-as;
           }
           neighbor 172.16.1.18 {
               peer-as 65003;
           }
           neighbor 172.16.1.22 {
               peer-as 65004;
           }
           neighbor 172.16.1.26 {
               peer-as 65005;
           }
       }
       group overlay {
           type internal;
           local-address 192.168.100.2;
           family evpn {
               signaling;
           }
           cluster 2.2.2.2;
           local-as 65000;
           multipath;
           neighbor 192.168.100.1;
           neighbor 192.168.100.11;
           neighbor 192.168.100.12;
           neighbor 192.168.100.13;
       }

            • local-as in group fabric: the underlay AS for this device
            • The fabric neighbors exchange IPv4 BGP routes (underlay reachability)
            • local-address: the source address of the BGP peering session and the protocol next hop of routes that originate on this device
            • local-as 65000 in group overlay: overrides the AS configured at [edit routing-options autonomous-system] if desired
            • The overlay neighbors exchange MP-BGP EVPN routes (overlay reachability)
There are two peer groups configured. One peer group is for the fabric underlay. The other peer group is for the overlay network, which is used to advertise the EVPN routes.
In the fabric group, the local-as is configured as the unique AS number that is used for the EBGP peering sessions. Neighbors are configured using the connected interface IP addresses, and the peer-as of each neighbor is configured. The export policy export-directs is applied to advertise the loopback address of the local node to all fabric peers. This allows reachability for the overlay network peering sessions. The multipath multiple-as parameter is configured to allow the BGP route selection process to include all equal-cost forwarding paths in the forwarding table.
The overlay network is configured as an IBGP peering group, or type internal. An internal peering session requires that the peers belong to the same autonomous system. This can be accomplished in one of two ways: the local-as can be defined within the group or neighbor statements, or the autonomous system ID defined under [edit routing-options autonomous-system] can be used. If the local-as parameter isn't defined in the peering group, the global AS number is inherited by the BGP protocol.
Because all IBGP peers belong to the same autonomous system, the multipath statement is sufficient to allow ECMP among the IBGP peers, and the multiple-as parameter is not needed.
The cluster parameter identifies a group of peers to which routes will be distributed, or reflected. This configuration statement is what identifies the device as a route reflector and changes the default BGP route advertising behavior for internally learned routes. An in-depth explanation of route reflectors is not covered in this course. Just keep in mind that the route reflector will "reflect", or advertise, a route received from any member of the cluster to all other members of the cluster.
The family evpn signaling statement is configured on the overlay. This indicates that the BGP session will only be used to advertise EVPN route types. No standard unicast routes will be advertised between overlay peers.
       BGP Configuration (3 of 3)
       ■    Leaf Nodes

       Leaf1 Configuration:
       {master:0}[edit protocols bgp]
       lab@leaf1# show
       group overlay {                              Overlay AS inherited from the [edit routing-options autonomous-system 65000]
           type internal;                           parameter when the group is configured as type internal
           local-address 192.168.100.11;
           family evpn {
               signaling;
           }
           neighbor 192.168.100.1;                  Leaf nodes only need to peer with the route reflectors
           neighbor 192.168.100.2;                  (overlay reachability)
       }
       group fabric {
           type external;
           export export-directs;
           local-as 65003;                          Underlay AS
           multipath {
               multiple-as;
           }
           neighbor 172.16.1.5 {
               peer-as 65001;
           }                                        The neighbors exchange IPv4 BGP routes
           neighbor 172.16.1.17 {                   (underlay reachability)
               peer-as 65002;
           }
       }

       Note: Leaf2 and leaf3 will have similar configurations
       {master:0}
       lab@leaf1> show bgp summary
       Threading mode: BGP I/O
       Groups: 2 Peers: 4 Down peers: 0
       Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
       bgp.evpn.0
                              6          3          0          0          0          0
       inet.0
                              6          6          0          0          0          0
       Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
       172.16.1.5            65001         72         69       0       0       30:02 Establ
         inet.0: 3/3/3/0
       172.16.1.17           65002         86         86       0       0       36:37 Establ
         inet.0: 3/3/3/0
       192.168.100.1         65000         47         43       0       0       16:11 Establ
         __default_evpn__.evpn.0: 0/0/0/0
         bgp.evpn.0: 3/3/3/0
         default-switch.evpn.0: 3/3/3/0
       192.168.100.2         65000         45         43       0       0       16:07 Establ
         __default_evpn__.evpn.0: 0/0/0/0
         bgp.evpn.0: 0/3/3/0
         default-switch.evpn.0: 0/3/3/0

       EVPN routes have been received from the route reflectors; the MP-BGP sessions to the route reflectors are Established
       {master:0}[edit]
       lab@leaf1# show interfaces xe-0/0/0
       unit 0 {
           family ethernet-switching {
               vlan {
                   members v10;                     Server (edge device) VLAN tag information
               }
           }
       }
       {master:0}[edit]
       lab@leaf1# show protocols evpn
       vni-options {
           vni 5010 {                               List each VNI you wish to have participate in EVPN signaling
               vrf-target target:65000:5010;
           }
       }
       encapsulation vxlan;                         Default encapsulation is MPLS
       extended-vni-list 5010;
The [edit switch-options] hierarchy identifies the source address of the VTEP tunnel, the route-distinguisher that must be added to advertised route prefixes, the target community that will be added to routes leaving the local switching table, and the target community that identifies which routes received from remote BGP peers are to be imported into the local VXLAN switching table.
The vrf-export statement adds the associated vrf-target community value to the BGP updates that will be sent. The vrf-import policy evaluates received BGP routes, and if the vrf-target community in the vrf-import policy matches a community value in the route, the information from the route is imported into the local EVPN route table, and subsequently into the VXLAN switching table.
The vrf-target statement under the [edit switch-options] hierarchy automatically creates hidden vrf-import and vrf-export policies for the specified community value. If a vrf-target statement is included for a VNI in the vni-options hierarchy, the vni-options vrf-target statement overrides the switch-options vrf-target value for Type 2 and Type 3 EVPN routes. The switch-options vrf-target statement is still applied automatically to Type 1 EVPN routes.
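For reference, a minimal [edit switch-options] sketch for leaf1 might look like the following; the route-distinguisher and vrf-target values are illustrative and should follow your own addressing and community plan.

    [edit switch-options]
    vtep-source-interface lo0.0;                 Source address of the VTEP tunnel (the loopback)
    route-distinguisher 192.168.100.11:1;        Added to advertised EVPN route prefixes
    vrf-target target:65000:1;                   Creates hidden vrf-import and vrf-export policies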
[Figure: CE devices (CE2, CE3, CE4) attach to leaf devices (Leaf1, Leaf3) on VLAN 10 and VLAN 20; VLAN 10 maps to VNI 5010 and VLAN 20 maps to VNI 5020. MAC advertisement routes for VNI 5010 and VNI 5020 are exchanged between the VTEPs.]
Note: EVPN route targets and policies can be configured for multiple tenants, and can be applied on a per-tenant basis
Automatic VRF-Target
Manually mapping every VNI to a vrf-target community in a large network can become cumbersome. The [edit switch-options vrf-target auto] parameter enables the device to automatically generate a vrf-target community for each VNI that becomes active on the device. When this function is used, an automatic vrf-target community is derived for each active VNI on the VTEP. The community value is created using the global AS (overlay AS) and the VLAN ID associated with the VNI. In this manner, the auto-generated vrf-target community is synchronized across all VTEPs that belong to the same BGP overlay network and share the same VLAN/VNI mappings across the domain.
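A brief sketch of this option is shown below, assuming the overlay AS of 65000 used in this chapter; the explicit target value shown alongside auto is illustrative.

    [edit switch-options]
    vrf-target {
        target:65000:1;                          Base target for routes not covered by an auto-derived value
        auto;                                    Auto-derive a per-VNI vrf-target from the overlay AS
    }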
EVPN RIB-IN
The EVPN RIB-IN table is the BGP routing table that stores all received BGP routes of the family EVPN type. If an EVPN route is received that does not have a vrf-target community that is accepted by a local vrf-import policy, the route is discarded prior to being placed in the bgp.evpn.0 table.
                 default-switch.evpn.0: 6 destinations, 9 routes (6 active, 0 holddown, 0 hidden)
                 + = Active Route, - = Last Active, * = Both
BGP Troubleshooting
This course does not cover extensive BGP configuration and troubleshooting. However, many of the common BGP issues can be resolved by analyzing the output from a few troubleshooting commands, which are shown on the slide.
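For reference, a few commands that are commonly used for this type of analysis are listed below; the neighbor address and table name are examples taken from this chapter's topology.

    user@leaf1> show bgp summary                                                  Verify session state and route counts
    user@leaf1> show bgp neighbor 192.168.100.1                                   Inspect session details and negotiated families
    user@leaf1> show route receive-protocol bgp 192.168.100.1 table bgp.evpn.0    View EVPN routes received from a peer
    user@leaf1> show route advertising-protocol bgp 192.168.100.1                 View routes advertised to a peer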
       VTEP Interface
           ■    Verify status of VTEP interfaces
       user@leaf1> show interfaces vtep
       Physical interface: vtep, Enabled, Physical link is Up
         Interface index: 641, SNMP ifIndex: 518
         Type: Software-Pseudo, Link-level type: VxLAN-Tunnel-Endpoint, MTU: Unlimited, Speed: Unlimited
         Device flags   : Present Running
         Link type      : Full-Duplex
         Link flags     : None
         Last flapped   : Never
           Input packets : 0
           Output packets: 0

         Logical interface vtep.32768 (Index 578) (SNMP ifIndex 557)
           Flags: Up SNMP-Traps Encapsulation: ENET2
           Ethernet segment value: 00:00:00:00:00:00:00:00:00:00, Mode: single-homed, Multi-homed status: Forwarding
           VXLAN Endpoint Type: Source, VXLAN Endpoint Address: 192.168.100.11, L2 Routing Instance: default-switch, L3 Routing Instance: default
           Input packets : 0
           Output packets: 0
                                                    One VTEP interface is automatically instantiated for the
                                                    locally attached network

         Logical interface vtep.32771 (Index 559) (SNMP ifIndex 560)
           Flags: Up SNMP-Traps Encapsulation: ENET2
           VXLAN Endpoint Type: Remote, VXLAN Endpoint Address: 192.168.100.13, L2 Routing Instance: default-switch, L3 Routing Instance: default
           Input packets : 30
           Output packets: 104
           Protocol eth-switch, MTU: Unlimited
             Flags: Trunk-Mode
                                                    One VTEP interface is automatically instantiated for each
                                                    remote VTEP that is discovered through received EVPN routes;
                                                    the interface is used to tunnel data to and from the remote VTEP
       ...
VTEP Interfaces
The status of VTEP interfaces can be verified through the CLI. The show interfaces vtep command shows the local VTEP interfaces. One VTEP logical interface is created for each locally attached LAN segment in the VXLAN. One VTEP logical interface is created for each remote VTEP, which is discovered through the EVPN control plane. Local VTEP endpoint addresses are always the local loopback interface address, and remote VTEP endpoints should always be the remote loopback address.
       MAC Table
       ■    View MAC addresses that have been learned by the VTEP
           user@leaf1> show ethernet-switching table                                         Shows locally and remotely learned MACs
           MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static
                      SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC)

           user@leaf1> show ethernet-switching vxlan-tunnel-end-point remote mac-table       Shows only remotely learned MACs
           MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, C - Control MAC
                      SE - Statistics enabled, NM - Non configured MAC, R - Remote PE MAC, P - Pinned MAC)

           user@spine1> show l2-learning vxlan-tunnel-end-point source
           user@spine1> show l2-learning vxlan-tunnel-end-point remote                       Equivalent MX commands
MAC Table
The show ethernet-switching table command displays the MAC addresses that have been learned by the VTEP. To view the MAC addresses that have been learned from a remote VTEP, the show ethernet-switching vxlan-tunnel-end-point remote mac-table command can be used.
           family inet {
               address 10.1.2.100/24 {
                   primary;
                   virtual-gateway-address 10.1.2.254;
               }
           }

           vxlan {
               vni 5020;
           }

           Configure the VLANs that participate in the routing domain, and assign the IRB interface
           as the Layer 3 interface for the broadcast domain.
       {master:0}
       lab@spine1> show ethernet-switching vxlan-tunnel-end-point source
       Logical System Name       Id  SVTEP-IP          IFL     L3-Idx    SVTEP Mode
       <default>                 0   192.168.100.1     lo0.0      0                      Spine is the source VTEP of
           L2-RTT                    Bridge Domain              VNID      MC-Group-IP    VNID 5010 and 5020
           default-switch            v10+10                     5010      0.0.0.0
           default-switch            v20+20                     5020      0.0.0.0
       {master:0}
       lab@spine1> show ethernet-switching vxlan-tunnel-end-point remote                  Spine has VXLAN tunnels to remote
                                                                                          leaf devices (VTEP endpoints)
       Logical System Name       Id  SVTEP-IP          IFL     L3-Idx    SVTEP Mode
       <default>                 0   192.168.100.1     lo0.0      0
        RVTEP-IP            IFL-Idx     NH-Id      RVTEP Mode                             To spine2
        192.168.100.2       558         1740       RNVE
            VNID            MC-Group-IP
            5020            0.0.0.0
            5010            0.0.0.0
        RVTEP-IP            IFL-Idx     NH-Id
        192.168.100.11      557         1726
            VNID            MC-Group-IP
            5010            0.0.0.0
        RVTEP-IP            IFL-Idx     NH-Id      RVTEP Mode
        192.168.100.13      556         1723       RNVE
            VNID            MC-Group-IP
            5020            0.0.0.0

       Note: Notice that no VXLAN tunnel has been created to leaf2. This is because leaf2 has not
       advertised any locally connected VNI.
Note that the default MAC address shown is the default virtual MAC address that is associated with the virtual gateway address. This is the same virtual MAC address that is used for a virtual gateway when using VRRP as well. Because the virtual MAC address is a fixed value, the virtual MAC address for all configured subnets will be the same if the default value is used. Juniper Networks recommends manually assigning a virtual MAC address for the virtual gateway instead of using the default address.
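On platforms and releases that support the virtual-gateway-v4-mac statement, one way to assign the virtual MAC manually is shown in the following sketch; the MAC and IP values are purely illustrative.

    [edit interfaces irb unit 20]
    family inet {
        address 10.1.2.100/24 {
            virtual-gateway-address 10.1.2.254;
        }
    }
    virtual-gateway-v4-mac 00:00:5e:01:02:01;    Per-subnet virtual MAC instead of the fixed default value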
[Figure: L3 Gateway - inet.0 / Gateway / VXLAN (irb)]
If the VTEP device is not capable of performing both the VTEP and Layer 3 gateway functions, a router can be connected to a VTEP as if it were a host. This requires one more forwarding step in the process: the VTEP decapsulates the original frame and forwards it to the default gateway, which performs its normal Layer 3 routing functions and then forwards the frame back to the VTEP in a new VLAN, which corresponds to the next routing next hop.
[Figure: host1 is attached with ESI 00:01:01:01:01:01:01:01:01:01; host2 is attached through a remote VTEP (VTEP C).]
Whenever the same physical device is connected to the network with multiple links, those links must be assigned an Ethernet Segment ID, or ESI. By default, a single-homed device receives the reserved ESI value of 00:00:00:00:00:00:00:00:00:00. When a site is multihomed, a non-default ESI value must be used. This process is used to reduce BUM traffic flooding and loops.
In the example, BUM traffic from host1 leaves interface A and arrives at VTEPA. VTEPA forwards the BUM traffic to all other VTEPs that participate in the same VNI. Unless VTEPA and VTEPB are aware that they are connected to the same physical host, VTEPB would forward BUM traffic received from VTEPA back to host1. To avoid this behavior, both VTEPA and VTEPB are configured with the same ESI for the LAN segments that connect to host1. In this manner, VTEPB can identify that the BUM traffic received from VTEPA originated on the same ESI, and VTEPB will not forward the traffic back to host1.
       {master:0}[edit]
       user@leaf2# show interfaces xe-0/0/0
       esi {
           00:01:01:01:01:01:01:01:01:01;
           all-active;
       }
       unit 0 {
           family ethernet-switching {
               interface-mode trunk;
               vlan {
                   members v10;
               }
           }
       }

       [Figure: host1 connects to leaf1 and leaf2, which are interconnected by a VXLAN tunnel; the
       interfaces toward host1 share ESI 00:01:01:01:01:01:01:01:01:01.]
LAG
Link aggregation groups are normally configured on a single device. The Link Aggregation Control Protocol (LACP) manages the links in the bundle. As you can see in the example, the LAG terminates on two different devices within the network; one link terminates on leaf1, and the other on leaf2.
An EVPN Type 4 route, or Ethernet segment route, is advertised by leaf1 and leaf2 to indicate that they are connected to the same Ethernet segment; these routes are used in the election of a designated forwarder. In addition, an Ethernet auto-discovery route, or EVPN Type 1 route, is generated by each connected VTEP and advertised to remote VTEPs. The designated forwarder is the device connected to the ESI that is responsible for forwarding BUM traffic onto the segment. A non-designated forwarder may forward unicast traffic to the Ethernet segment, but it blocks BUM traffic to avoid packet duplication. The auto-discovery route also indicates which forwarding mode is configured on the device, which in the case of Juniper Networks devices will always be active-active.
From the perspective of host1, the LAG must terminate on the same remote device, or must appear to terminate on the same remote device. To permit this functionality, the LACP system ID on leaf1 and leaf2 is configured to the same value. LACP control packets sourced from leaf1 and leaf2 arrive on host1 and appear to originate from the same remote device.
In order to enable leaf1 and leaf2 to manage the shared link properly, once again MP-BGP is used. Leaf1 and leaf2 each configure LACP with a single physical link toward host1, and the same ESI is assigned to the aggregated Ethernet interface on both devices. As long as the devices are connected to the VXLAN network and the underlay fabric, they maintain the LACP status to host1 as active. In the event that a leaf device loses BGP connectivity to the fabric, and therefore is disconnected from the fabric, the isolated VTEP sets the LACP status to standby, thereby blocking traffic between the host and that VTEP.
In order to limit traffic loss and delay in the fabric network, when a remote VTEP that advertises connectivity to an ESI is removed from the network due to a failure or some other cause, a mass withdrawal of the routes associated with that VTEP is performed. Forwarding next hops toward the withdrawn device are re-mapped to the remaining VTEPs that retain connectivity to the ESI.
EVPN-LAG Configuration
The example shows the configuration of an EVPN-LAG on two leaf devices, each of which has a single physical link to the Ethernet segment. Note that both the ESI and LACP system ID are synchronized on the two devices, as are the interface parameters. The EVPN-LAG functions as a single logical link toward the connected host.
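A minimal sketch of what the aggregated Ethernet configuration might look like on each leaf is shown below; the member interface and LACP system ID are illustrative, and the same esi and system-id values would be repeated on the other leaf device. The aggregated-devices count under [edit chassis] must also be large enough for ae0 to be created.

    [edit interfaces]
    xe-0/0/0 {
        ether-options {
            802.3ad ae0;                         Member link assigned to the EVPN-LAG bundle
        }
    }
    ae0 {
        esi {
            00:01:01:01:01:01:01:01:01:01;       Same ESI configured on both leaf devices
            all-active;
        }
        aggregated-ether-options {
            lacp {
                active;
                system-id 00:00:00:01:01:01;     Same LACP system ID on both leaf devices
            }
        }
        unit 0 {
            family ethernet-switching {
                interface-mode trunk;
                vlan {
                    members v10;
                }
            }
        }
    }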
         {master:0}
         user@leaf1> show interfaces ae0 detail
         Physical interface: ae0, Enabled, Physical link is Up
           Interface index: 670, SNMP ifIndex: 562, Generation: 161
           Link-level type: Ethernet, MTU: 1514, Speed: 10Gbps, BPDU Error: None, Ethernet-Switching Error: None, MAC-REWRITE
         Error: None,
           Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth
         needed: 1bps
Verify LACP
The show lacp statistics interfaces command is used to verify the LACP status of the links in an EVPN-LAG. Note that on the device leaf1, the EVPN multihomed status is set to Blocking BUM Traffic to ESI. This indicates that this device was not elected to be the designated forwarder for this Ethernet segment.
        {master:0}
        user@leaf2> show interfaces ae0 detail
        Physical interface: ae0, Enabled, Physical link is Up
          Interface index: 670, SNMP ifIndex: 563, Generation: 161
          Link-level type: Ethernet, MTU: 1514, Speed: 10Gbps, BPDU Error: None, Ethernet-Switching Error: None, MAC-REWRITE
        Error: None,
          Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth
        needed: 1bps
Summary
We Discussed:
         •            Configuring EVPN controlled VXLAN.
Review Questions
Review Questions
        1.
2.
3.
Lab: VXLAN
• Configure EVPN-VXLAN.
Lab: VXLAN
The slide provides the objective for this lab.
      2.
An administrator-defined Ethernet Segment ID (ESI) is required when configuring a multihomed site.
      3.
The vrf-target community is used by MP-BGP to tag EVPN routes before they are advertised to remote VTEPs. It is also used to identify which EVPN routes received from remote VTEPs should be accepted and imported into the local routing tables.
                               Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •           A basic data center deployment scenario.
       ➔ Requirements                                    Overview
       ■    Base Design
       ■    Design Options and Modifications
Requirements Overview
The slide lists the topics we will discuss. We will discuss the highlighted topic first.
Organization Requirements
        •     VLANs - How many VLANs will be required within the domain? How will traffic flow within the same VLAN? How
              will traffic flow between hosts in different VLANs?
        •     Reachability - Do applications require Layer 2 communication? Do applications require Layer 3
              communication? What external networks (Internet, corporate WAN, etc.) will applications be required to
              communicate with?
        •     Security - What traffic will be required to pass through a security domain? How will that security domain be
              implemented? Is an edge firewall sufficient and scalable? Will a security domain that contains several security
              devices be required?
        •     Scalability - How will the initial design be impacted when the data center scales?
Proposed Solution
       ■     Solution outline
              • Spine-Leaf topology
              •    VXLAN with EVPN control plane for Layer 2 domains
              •    Layer 2 gateways at the leaf nodes
              •    Layer 3 gateways at the spine nodes
              •    Security domain for external destinations
Proposed Solution
In our example design, the following parameters will be used:
• Spine-leaf topology;
• Layer 3 gateways implemented at the spine nodes (Centrally Routed design); and
• A security domain for external destinations.
      ■ Requirements Overview
      ➔ Base Design
      ■ Design Options and Modifications
Base Design
The slide highlights the topic we discuss next.
       ■    Physical Topology
              • Spine-Leaf Topology
              • Dual-homed Servers

       [Figure: Spine-leaf topology with an Internet gateway above the spine layer; each server is
       dual-homed to a pair of leaf devices.]
Physical Topology
The topology for this example is a simple spine-leaf topology. The number of spine and leaf devices can be increased as needed without impacting the design.
Servers will be dual-homed to leaf devices for redundancy, which requires two leaf devices per rack of servers.
A single Internet gateway is implemented. Alternatively, dual Internet gateways may be deployed. If a dual Internet gateway is deployed, a chassis cluster is recommended, with a LAG to the spine devices. EVPN-LAG can be implemented on the spine devices, with the security chassis cluster assigned to the same ESI.
       ■    IGP Underlay
              • OSPF
              • Advertise loopbacks

       [Figure: Spine-leaf topology with an Internet gateway; OSPF runs on the fabric links between the
       spine and leaf devices to advertise loopback addresses.]
IGP Underlay
Multiple options are available for the underlay network. A simple IGP underlay is usually sufficient for a basic IP fabric data center design. The goal of the IGP is to advertise loopback addresses to all other fabric devices, which enables the use of loopback addresses for the overlay BGP peering sessions.
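A minimal OSPF underlay sketch for a fabric device might look like the following; the fabric-facing interface names are illustrative.

    [edit protocols ospf]
    area 0.0.0.0 {
        interface lo0.0 {
            passive;                             Advertise the loopback without forming an adjacency on it
        }
        interface xe-0/0/48.0;                   Fabric link toward spine1
        interface xe-0/0/49.0;                   Fabric link toward spine2
    }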
       ■    BGP Overlay
              • IBGP sessions to loopback addresses
              • Full mesh or route reflectors

       [Figure: IBGP overlay sessions between the loopback addresses of the spine and leaf devices,
       either as a full mesh or through route reflectors.]
EVPN Overlay
The overlay for the data center is based on MP-BGP. A full mesh of IBGP peers or route reflectors may be used to distribute
EVPN route information. The IBGP sessions will support the family evpn signaling option. All fabric devices will be
configured with the same autonomous system ID.
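As a small reference sketch, the shared overlay AS and a router ID might be configured as follows on each fabric device; the values follow the addressing used earlier in this guide and are otherwise illustrative.

    [edit routing-options]
    router-id 192.168.100.11;                    Unique per device (typically the loopback address)
    autonomous-system 65000;                     Shared AS inherited by the IBGP overlay group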
                            gateway address
              • VXLAN tunnels connect leaf devices
              • VXLAN tunnels connect leaf-to-spine devices for L3 gateway access
              • Creates a full mesh of VXLAN tunnels through BGP signaling

       [Figure: Leaf devices act as L2 gateways; VXLAN tunnels interconnect the leaf devices and extend
       to the spine devices for Layer 3 gateway access.]
The Layer 3 gateways will be configured as distributed gateways on the spine nodes. The spine nodes will be configured as Layer 2 gateways (VTEPs) and as distributed Layer 3 gateways (IRB interfaces within the Layer 2 domains).
The resulting topology creates a full mesh of VXLAN tunnels within the data center.
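A minimal sketch of a spine-side distributed Layer 3 gateway for one VLAN is shown below; the IRB address is illustrative and unique per spine, while the virtual gateway address, VLAN, and VNI follow the VLAN 10/VNI 5010 mapping used in this guide.

    [edit interfaces irb]
    unit 10 {
        family inet {
            address 10.1.1.1/24 {                      Unique address on each spine
                virtual-gateway-address 10.1.1.254;    Shared gateway address for hosts in VLAN 10
            }
        }
    }

    [edit vlans v10]
    vlan-id 10;
    vxlan {
        vni 5010;
    }
    l3-interface irb.10;                         Attach the IRB as the Layer 3 interface for the broadcast domain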
       [Figure: VXLAN domain traffic flows between the leaf L2 gateways; native Layer 2 traffic from the
       servers is carried across VXLAN tunnels. Note: Not all possible traffic flows are shown.]

              • External gateway device performs NAT and other services prior to passing traffic

       [Figure: Externally destined traffic is tunneled across the VXLAN to the spine L3/L2 gateway and
       then passed as native Layer 2 traffic to the External Security Gateway. Legend: VXLAN tunnel,
       native Layer 2 traffic, traffic flow.]
External Destinations
Traffic destined to external destinations will be forwarded to the spine devices. The IRB interface on the spine device will be configured as the default gateway for all VNIs within the VXLAN. Once the traffic arrives on the spine device, it is routed to the destination VNI, which is the VLAN associated with the link that connects to the external security device. The traffic is then forwarded to the external security device, which performs NAT and other services prior to passing the traffic to the external destination.
External traffic arrives on the External Security Gateway, is processed, and then is forwarded to the VXLAN Layer 3 gateway. The VXLAN Layer 3 gateway forwards the traffic to the VXLAN VNI that corresponds to the destination host.
Scalability Considerations
The Layer 2 gateway functions are performed on each leaf device. The VXLAN environment requires IBGP peering to all other leaf devices, or to a centralized set of route reflectors. The centralized route reflector topology is more scalable because, as new leaf nodes are added, only peering sessions from the new leaf to the route reflectors are required. The alternative full mesh topology requires that every leaf node in the network create an IBGP session to the newly added leaf.
Layer 3 gateways are on the spine devices, and therefore the Layer 3 routing functions do not change as more leaf nodes are added. The only exception is when new spine devices are added in the future, if necessary.
One downside to this topology is that it requires all inter-VLAN traffic within the data center to pass through the spine nodes.
As new leaf nodes are added, the IBGP signaling automatically advertises the new leaf and new VTEP to the devices within the VXLAN, and new VXLAN tunnels are automatically created.
       Hair-Pinning
       ■    Hosts in different VLANs connected to the same VTEP
              • Traffic must be forwarded to the L3 gateway before returning to the original VTEP

       [Figure: Traffic between two hosts in different VLANs on the same leaf is tunneled to the spine
       L3 gateway and back. Legend: VXLAN tunnel, native Layer 2 traffic, traffic flow.]
Hair-Pinning
With this design, all inter-VLAN traffic must pass through the Layer 3 gateway at the spine nodes. This causes what is known as hair-pinning in the network. A downside to hair-pinning is that it consumes uplink and downlink bandwidth and adds unnecessary hops to traffic that is forwarded between devices that are connected to the same VTEP.
       Failure of a Gateway
              • L3 gateway functions on the remaining gateway double

       [Figure: When one spine L3 gateway fails, all Layer 2 and Layer 3 gateway traffic in the VXLAN
       domain is handled by the remaining spine. Legend: VXLAN tunnel, native Layer 2 traffic, traffic flow.]
Gateway Failure
Another downside to this topology is that the failure of a gateway device may have a more severe impact on traffic within the data center. Not only does all Layer 2 traffic exiting the domain have to pass through a single spine device, but all Layer 3 traffic processing for inter-VLAN routing must also be handled by the remaining spine device and its fabric links.
       ■ Requirements Overview
       ■ Base Design
       ➔ Design Options and Modifications
              • Distributes L3 gateway functions
              • Externally destined traffic is forwarded through a VXLAN tunnel to the spine device
                     • Spine device decapsulates and forwards the original IP packet toward the external network
              • "Internet" or external network routing device is considered a "site" or host

       [Figure: Each leaf device acts as both an L2 gateway and an L3 gateway; externally destined traffic
       is forwarded through the spine devices toward the external network. Servers attach to the leaf layer.]
    C> 2019 Juniper Networks, Inc All Rights Reserved
Inter-VLAN traffic between devices that are connected to the same leaf node never leaves the leaf node. Inter-VLAN traffic
between devices in different VLANs that are connected to remote leaf nodes is routed on the source leaf, and forwarded
across the VXLAN Layer 2 tunnel to the remote leaf.
With this design, traffic moves between Layer 3 domains at the leaf device. For traffic destined to external destinations, the
leaf device bridges the original traffic to the VNI that connects to the security device connected to the spine node, or to the
VNI that connects to the security domain.
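As a rough illustration of this edge-routed gateway placement, consider a minimal sketch of a single leaf. All names, VNIs, and addresses here (VLAN v100, VNI 5100, subnet 10.1.100.0/24, tenant VRF T1-VRF) are hypothetical and are not taken from the lab topology:

       set vlans v100 vlan-id 100
       set vlans v100 vxlan vni 5100                       <-- VLAN 100 is carried as VNI 5100
       set vlans v100 l3-interface irb.100                 <-- L3 gateway for the VLAN lives on the leaf
       set interfaces irb unit 100 family inet address 10.1.100.2/24 virtual-gateway-address 10.1.100.1
       set routing-instances T1-VRF instance-type vrf
       set routing-instances T1-VRF interface irb.100
       set routing-instances T1-VRF route-distinguisher 192.168.100.11:10
       set routing-instances T1-VRF vrf-target target:65000:10

Repeating the same virtual-gateway-address on every leaf (each leaf keeps a unique IRB unit address) gives hosts an anycast default gateway on their local leaf, which is what keeps inter-VLAN routing from having to transit the spine.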
       ■ EVPN VXLAN
              • L2 gateways on leaf devices
No VXLAN to Spine
Another option is to eliminate the VXLAN tunnels to the spine nodes. All Layer 2 forwarding still transits the spine devices
over the IP fabric, but the spine nodes are not required to support any VXLAN capabilities. The spine nodes in this
environment forward Layer 3 traffic between the leaf nodes.
The spine nodes still participate in the underlay routing protocols. They can be configured to relay the overlay routing
information as well, and serve as BGP peers or route reflectors, without running the EVPN or VXLAN components.
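A minimal sketch of what that "lean spine" BGP configuration could look like is shown below; the loopback addresses (192.168.100.1 for the spine, 192.168.100.11 and 192.168.100.12 for the leaf devices) are hypothetical. The spine carries the EVPN address family as a route reflector but has no local EVPN or VXLAN configuration of its own:

       set protocols bgp group overlay type internal
       set protocols bgp group overlay local-address 192.168.100.1
       set protocols bgp group overlay family evpn signaling      <-- relay overlay EVPN routes only
       set protocols bgp group overlay cluster 192.168.100.1      <-- reflect routes between leaf clients
       set protocols bgp group overlay neighbor 192.168.100.11
       set protocols bgp group overlay neighbor 192.168.100.12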
       (Figure: leaf devices act as Layer 2 and Layer 3 VXLAN gateways for the attached servers; the spine devices forward traffic between the leaf nodes and toward the Internet without any VXLAN configuration)
BGP Underlay
An EBGP underlay can also be configured in place of the IGP underlay. With the EBGP underlay, all fabric devices within the
network peer to all physically connected fabric devices. Each device exports its loopback address into the IP fabric domain to
provide reachability for the overlay BGP peering sessions. In larger environments, using a BGP underlay can provide
additional scalability and policy management functions that may not be available with an IGP. In this environment, each
device in the fabric can be configured as an individual autonomous system, or devices can be grouped into autonomous
system groups, such as configuring the spine devices with one AS number and the leaf devices with another AS number. If
this method is chosen, additional parameters may be needed to change the BGP protocol's default route selection and loop
detection process.
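The following sketch shows one common way to express this on a fabric device; the AS numbers, policy name, and neighbor addresses are hypothetical examples only:

       set policy-options policy-statement export-lo0 term 1 from protocol direct
       set policy-options policy-statement export-lo0 term 1 from interface lo0.0
       set policy-options policy-statement export-lo0 term 1 then accept
       set protocols bgp group underlay type external
       set protocols bgp group underlay export export-lo0             <-- advertise the loopback into the fabric
       set protocols bgp group underlay local-as 65001                <-- per-device (or per-tier) AS number
       set protocols bgp group underlay multipath multiple-as         <-- ECMP across EBGP paths from different ASs
       set protocols bgp group underlay neighbor 172.16.1.1 peer-as 65011
       set protocols bgp group underlay neighbor 172.16.1.3 peer-as 65012

The multipath multiple-as statement is one example of the additional parameters mentioned above; depending on the AS numbering scheme, further loop-detection or route-selection adjustments may also be required.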
       (Figure: VXLAN tunnels between L2 gateway leaf devices carrying VLAN 100 traffic to the attached servers; legend: VXLAN encapsulated traffic, native Layer 2 traffic, traffic flow)
              • Security domain performs routing between VLANs/domains
              • Dedicated security domain can be physical
       (Figure: an external security gateway attached to the fabric routes traffic between VLAN 100 and the other domains; legend: VXLAN encapsulated traffic, internal native Layer 2 traffic, public-facing traffic flow)
Summary
We Discussed:
        •            A basic data center deployment scenario.
Review Questions
       2.
Layer 3 gateways are normally placed in the leaf devices, or in hosts connected to leaf devices, in an edge-routed
EVPN-VXLAN environment.
       3.
The primary role of an IGP in an EVPN-VXLAN environment is to advertise paths to fabric devices, and to advertise loopback
interfaces in the underlay network for use in an overlay network configuration. It also provides ECMP across all forwarding
paths in the underlay.
       4.
The primary role of EBGP in an EVPN-VXLAN underlay environment is the same as that of an IGP: to advertise loopback
interfaces for use in the overlay network. It also load-balances transit traffic across all available forwarding paths
between devices.
                              Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •            The term Data Center Interconnect;
       ➔ DCI Overview
       ■    DCI Options for a VXLAN Overlay
       ■    EVPN Type-5 Routes
       ■    DCI Example
DCI Overview
This slide lists the topics we will cover. We will discuss the highlighted topic first.
Whenever an organization deploys two or more data centers that must communicate with each other, a method
of connecting them must exist. A connection between two or more data centers is called a data center interconnect (DCI).
A DCI can function at Layer 2 or Layer 3. A Layer 2 DCI bridges Layer 2 traffic across the transport network. A Layer 3 DCI
connects data centers with Layer 3, or IP, routing. Many different transport options are available to interconnect sites.
       ■ Dark Fiber
              • Enterprise is responsible for DCI routing and end-to-end optical function
              • Dedicated optical fiber; can be lit with multiple wavelengths. Highest quality and bandwidth solution
              • Most expensive option
       (Figure: two data centers connected by dedicated dark fiber between ROADMs)
Dark Fiber
Dark fiber is a fiber-optic network link that connects two or more sites. With a dark fiber interconnect, the enterprise is
responsible for DCI routing and end-to-end optical function. The fiber is dedicated to the enterprise and can be lit with
multiple wavelengths. A dark fiber connection is the highest quality and highest bandwidth solution. It is also the most
expensive option.
       ■ Wavelength Service
              • Enterprise is responsible for DCI routing and end-to-end optical function
              • Fiber shared with multiple enterprises
              • Less expensive option, or used when dark fiber is not available
       (Figure: provider-owned ROADMs multiplex wavelengths on a shared fiber; one wavelength is leased to another customer)
Wavelength Service
Wavelength services are similar to dark f iber, except the f iber is not dedicated to a single enterprise. A service provider owns
the f iber and uses wave mult iplexing to separate data streams between locations. Each wavelength can be dedicated to an
individual customer, so an enterprise leases a wavelength or set of wavelengths on a shared fiber. A wavelength service is a
less expensive option to dark fiber, or a good option when dark fibers are not available.
       (Figure: two L3 fabric data centers connected through DCI routers and provider-owned ROADMs; one wavelength on the shared fiber is leased to another customer)
       L3 DCI only
              • DCI Option 1 - EVPN Type-5/VXLAN (DC fabric and DCI collapsed on QFX10K)
              • DCI Option 4 - IPVPN/MPLS (DC fabric and DCI collapsed on QFX10K)
       L2 + L3 DCI
              • DCI Option 2 - EVPN Type-2+5/VXLAN - OTT with fully meshed VTEPs (DC fabric and DCI collapsed on QFX10K)
              • DCI Option 6 - VLAN handoff from DC fabric to MX; VPLS + IPVPN/MPLS or EVPN/MPLS on MX
For Layer 3 DCI only, the data center fabric can be EVPN/VXLAN, and the DCI Option 1 example shown in the slide runs EVPN
with Type-5 routes across the DCI. With this option, no Layer 2 connectivity is established across the DCI link. Only Layer 3
routing is performed across the DCI link.
If Layer 2 and Layer 3 DCI are required, a hybrid of EVPN with Type-2 routes and EVPN with Type-5 routes can be used. EVPN
Type-2 routes are used for a Layer 2 stretch across the DCI. EVPN Type-5 routes are used when Layer 3 DCI is required.
When using an MPLS backbone, and a data center is running EVPN/VXLAN, an IP VPN/MPLS DCI can be configured.
If Layer 2 and Layer 3 connectivity is desired across an MPLS backbone, a VLAN handoff from the data center fabric to an
edge device running VPLS, IP VPN/MPLS, or EVPN/MPLS can be used. A VLAN or VXLAN can be stitched to an
MPLS label-switched path at each end of the DCI.
       (Figure: four over-the-top DCI options between data centers - Option 1: L3VPN-MPLS, Option 2: EVPN-MPLS with EVPN stitching, Option 3: EVPN-VXLAN over an IP WAN, Option 4: EVPN-VXLAN over a direct connection)
L3VPN-MPLS Option 1
L3VPN-MPLS
The L3VPN-MPLS option is easy to implement. The VXLAN tunnel runs over the top of a Layer 3 VPN. The VTEP endpoints
terminate on the devices within the data center, and the data center sites appear to be directly connected across the VXLAN
tunnel.
EVPN-MPLS Option 2
       (Figure: QFX edge devices in DC 1 and DC 2 - the VXLAN tunnel terminates on the edge device and is stitched to an MPLS LSP across the provider network)
EVPN-MPLS
EVPN stitching requires some planning, because it requires coordination between the enterprise and the service provider.
The enterprise VXLAN VNIs must be mapped to MPLS label-switched paths in the provider network. This mapping must
match on both ends of the DCI.
EVPN-VXLAN Option 3
       (Figure: EVPN-VXLAN over the top between QFX edge devices in DC 1 and DC 2)
EVPN-VXLAN Option 3
EVPN/VXLAN over an existing WAN connection is another over-the-top DCI option. This can be run over the Internet. With this
option, the WAN is an IP routed network and VXLAN tunnels are configured from the DCI edge devices. One thing to note
about this scenario is that the VXLAN tunnels within a site do not cross the edge device. The local VXLAN tunnels terminate
on the edge device, and a separate VXLAN tunnel is used across the interconnect. Traffic is routed or bridged on the edge
device between the two VXLANs.
EVPN-VXLAN Option 4
       (Figure: QFX edge devices in DC 1 and DC 2 directly connected over leased lines or dark fiber)
EVPN-VXLAN Option 4
With a Direct Connect DCI, the DCI connection is treated as a Layer 2 link. The DCI devices at each site appear to be directly
connected to each other, and VXLAN tunnels run directly between them.
       (Figure: two PODs or data centers connected over an inter-POD or inter-DCI IP fabric; EVPN peering runs between the border gateways only, which act as VXLAN Layer 3 gateways for the Tenant 1 VLANs at each site)
Because no Layer 2 bridging or Layer 2 stretch is needed, it's not necessary to advertise MAC addresses across the DCI.
Instead, route prefixes can be advertised between the sites. A route prefix can represent a group of hosts within a site, or all
host addresses within a site. To use this option, EVPN Type-5 routes must be supported.
EVPN Type-5 routes differ from EVPN Type-2 routes in that EVPN Type-5 routes do not need a VXLAN tunnel to the protocol
next hop received in the route advertisement. The EVPN peering between the border gateways is used to advertise prefixes that
exist within the data center domain, not MAC addresses.
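On platforms that support it, pure Type-5 routing is typically enabled per tenant VRF with the ip-prefix-routes stanza. A minimal sketch, using hypothetical names and values (VRF T1-VRF, route target 65000:100, VNI 9100), might look like this:

       set routing-instances T1-VRF instance-type vrf
       set routing-instances T1-VRF interface irb.100
       set routing-instances T1-VRF route-distinguisher 192.168.100.1:100
       set routing-instances T1-VRF vrf-target target:65000:100
       set routing-instances T1-VRF protocols evpn ip-prefix-routes advertise direct-nexthop
       set routing-instances T1-VRF protocols evpn ip-prefix-routes encapsulation vxlan
       set routing-instances T1-VRF protocols evpn ip-prefix-routes vni 9100       <-- L3 VNI used for routed traffic

With this configuration, the border gateway advertises the IP prefixes reachable in the VRF as EVPN Type-5 routes rather than advertising individual MAC addresses.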
       (Figure: EVPN peering between the border gateways only, with route reflection on the spine; a full mesh of VTEP peerings connects the L2 and L3 VXLAN gateways and the vRouter hosting the Tenant 1 VMs)
MPLS Backbone
With an MPLS backbone in the provider network, the border gateway peers with the service provider to exchange route
information. Routes from one data center are advertised to the service provider, transported across the service provider
network, and re-advertised to the remote data center.
With this scenario, the IP prefixes from one data center are advertised to the remote data center and traffic is routed
between them.
       (Figure: data centers connected across an inter-DCI MPLS fabric; IP VPN peering runs between the border gateways only, and IP VPN routes are exchanged across the MPLS backbone)
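A simplified sketch of the handoff toward the provider is shown below; the provider AS, neighbor address, prefix, and policy name are hypothetical and would be replaced by the values agreed with the service provider:

       set policy-options policy-statement dc-prefixes term 1 from route-filter 10.1.1.0/24 orlonger
       set policy-options policy-statement dc-prefixes term 1 then accept
       set protocols bgp group provider-handoff type external
       set protocols bgp group provider-handoff peer-as 65100
       set protocols bgp group provider-handoff export dc-prefixes        <-- advertise local data center prefixes
       set protocols bgp group provider-handoff neighbor 172.16.255.1

The provider carries these prefixes in its IP VPN and re-advertises them to the border gateway at the remote data center.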
Public Cloud
A public cloud can be used to interconnect data centers. The public cloud option functions in a similar manner to an MPLS
backbone DCI. IP VPN routes are advertised across the public cloud from the edge routers in each data center.
       (Figure: Layer 2 stretch example - host2's MAC/IP advertisement, carrying route target 1:1 and a protocol next hop of leaf2's lo0, is propagated across the MP-IBGP EVPN sessions; host2's MAC is reachable through interface xe-0/0/0 on L2 and through VXLAN tunnels toward L2 on S1, S2, and L1; both data centers use subnet 10.1.1.0/24)
Layer 2 Stretch
When a Layer 2 stretch is required between data centers, EVPN Type-2 routes must be advertised between the data centers.
Since each EVPN Type-2 route represents a single MAC address within a data center, the number of routes advertised across
the DCI link could be in the thousands. In addition, the BGP protocol next hop of Type-2 routes must be validated
against the VXLAN tunnel next hop. This means that VXLAN tunnels must exist from end to end across the data centers and the
DCI transport network.
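Because the Layer 2 domain is stretched, the VLAN-to-VNI mapping and the EVPN route target must be consistent in both data centers. A minimal per-leaf sketch with hypothetical values (VLAN 100, VNI 5100, route target 65000:1) could look like this:

       set vlans v100 vlan-id 100
       set vlans v100 vxlan vni 5100                           <-- same VNI must be used in both data centers
       set protocols evpn encapsulation vxlan
       set protocols evpn extended-vni-list 5100
       set switch-options vtep-source-interface lo0.0          <-- VTEP anchored on the loopback
       set switch-options route-distinguisher 192.168.100.11:1
       set switch-options vrf-target target:65000:1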
       (Figure: Layer 3 DCI example - host1 on subnet 10.1.1.0/24 and host2 on subnet 10.1.2.0/24 are connected through DCI routers acting as L3 EVPN gateways, with MP-IBGP EVPN sessions within each data center and across the IP transport network)
Layer 3 DCI
If the IP address range in each data center is unique to that data center, it's not necessary to advertise MAC addresses
between data centers. In this scenario, IP prefixes can be advertised by using EVPN Type-5 routes. The EVPN Type-5 routes
do not contain the MAC addresses of the hosts at each data center.
With a Layer 3 DCI, the destinations in the remote data center are on different subnets than the hosts in the original data
center. This means that IP routing must take place. In the example, host1 is on network 10.1.1.0/24. To communicate with
hosts in the 10.1.2.0/24 network, host1 must first send its data packets to its default gateway. In the example, the default
gateway is the IRB interface on the DCI edge device, and in this example the DCI edge device is the spine device.
The spine device does not operate as a traditional VXLAN Layer 3 gateway. With the traditional VXLAN Layer 3 gateway, the
VNIs that would be bridged must all be configured on the gateway device. When using Type-5 routes, standard IP routing is
used to move between Layer 3 domains. In the example, the data center on the right contains subnet 10.1.2.0/24. The MAC address
of host2 is advertised to the edge device spine2. There is a VXLAN tunnel between spine2 and leaf2. Spine2 installs host2's
MAC address in its switching table with a next hop pointing to the tunnel that terminates on leaf2. With Type-5 routing
enabled, the prefix associated with the IRB interface within that broadcast domain is installed in the route table. The spine2
device advertises the prefix 10.1.2.0/24 to spine1 in an EVPN Type-5 route.
Spine1 receives the EVPN Type-5 route, which contains the IP prefix 10.1.2.0/24, and even though it is an EVPN route, it can
validate the protocol next hop of that route using the inet.0 routing table. The MAC address of the IRB interface on spine1 is
advertised to leaf1, and that IRB interface is also used as the default gateway.
When host1 sends traffic to host2, host1 forwards the data to the MAC address of the default gateway on spine1's IRB
interface. Spine1 decapsulates the VXLAN packet, performs a route lookup on the inner IP header, and looks up the next
hop to host2's IP address. The next hop to the 10.1.2.0/24 network is the VXLAN tunnel between spine1 and spine2. The
data packet is re-encapsulated in a new VXLAN header and forwarded to spine2. Spine2 decapsulates the VXLAN packet
that is destined for its loopback address, analyzes the inner IP header, and identifies host2's MAC/IP information in the local
Layer 2 switching table. The local Layer 2 switching table indicates that the MAC address of host2 is reachable through the
tunnel from spine2 to leaf2. Spine2 encapsulates the packet and forwards it through the spine2-to-leaf2 VXLAN tunnel, and
the packet arrives on leaf2. Leaf2 decapsulates the VXLAN header and forwards the original IP packet to host2.
       (Figure: VTEP tunnels and traffic flow between POD1 and POD2 across the DCI transport network toward host2; legend: VTEP tunnel, traffic flow)
VMTO (1 of 3)
The MAC addresses of hosts are learned and advertised using BGP route advertisements. When a BGP peer receives an
advertisement for a MAC address, it stores the MAC address in the EVPN switching table. In a data center that uses virtual
machines, virtual machines can frequently move from one VLAN to another in a migration. When a host is migrated to a new
VLAN, the host may not be aware that it has been moved. It's common for the host to not flush its ARP table, and to retain
the ARP table that existed prior to the migration. Because the host has stored the MAC address information for the default
gateway it was using prior to the migration, subsequent traffic that must be forwarded to another VLAN can be sent to the old
gateway MAC address. Virtual Machine Traffic Optimization is used to assist in this scenario.
       (Figure: a VM migration from POD1 to POD2 across the DCI transport network; with VMTO enabled, traffic toward host2 follows the VXLAN tunnel to its current location; legend: VXLAN gateway, VXLAN tunnel, traffic flow)
Enable VMTO
       ■ Enable VMTO in the routing instance
              • [edit routing-instances instance-name protocols evpn remote-ip-host-routes]
Enabling VMTO
VMTO is enabled within a customer routing instance at the [edit routing-instances instance-name protocols
evpn remote-ip-host-routes] hierarchy. Routing policy can be implemented to modify or regulate which routes are
imported into the forwarding table on the edge device.
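Expressed as a set command against the customer1 instance shown later in this chapter, enabling VMTO is a single statement:

       set routing-instances customer1 protocols evpn remote-ip-host-routes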
➔ DCI Example
DCI Example
The slide highlights the topic we discuss next.
       • Underlay Topology
              • Underlay is two IP fabrics based on EBGP routing
                     • EBGP routing to provider to advertise loopbacks between data centers
              • DCI is an MPLS Layer 3 VPN
              • Goal: Ensure that all loopbacks are reachable between sites
       (Figure: DC1 (AS65001 and AS65003) and DC2 (AS65002 and AS65005) connected across an MPLS provider WAN in AS65100)
Underlay
Data centers DC1 and DC2 are connected across an MPLS provider WAN. The MPLS provider WAN appears to the spine1
and spine2 devices as an IP routing handoff. Spine1 and spine2 peer with the provider using EBGP.
Each data center is configured as a spine-leaf EVPN-VXLAN network. Host1 and host2 are in the same subnet, so a Layer 2
stretch is required across the DCI connection.
The underlay in each data center site is configured using EBGP. The goal of the underlay is to ensure that all loopback
addresses are reachable between sites.
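A sketch of how spine1 might advertise its loopback toward the provider is shown below; the provider-facing neighbor address and policy name are hypothetical, while AS 65100 is the provider AS from the diagram:

       set policy-options policy-statement export-lo0 term 1 from interface lo0.0
       set policy-options policy-statement export-lo0 term 1 then accept
       set protocols bgp group to-provider type external
       set protocols bgp group to-provider peer-as 65100
       set protocols bgp group to-provider export export-lo0              <-- make the loopback reachable from DC2
       set protocols bgp group to-provider neighbor 172.16.100.1

The provider's Layer 3 VPN then carries the loopback prefixes between sites, which is what allows the overlay peering described next to be established between loopback addresses.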
      • Overlay Topology
             • All leaf switches act as VXLAN Layer 2 gateways
              • EVPN Signaling is based on MP-IBGP routing
       (Figure: EBGP unicast underlay within each data center (AS65001 in DC1, AS65002 in DC2); IBGP EVPN overlay peering in AS65000 within each data center and between the spine devices across the WAN)
Overlay
The overlay topology consists of MP-IBGP routing. The leaf device at each data center peers with the spine device in that
data center. The spine devices peer with each other across the WAN connection. Because both sites belong to the same
autonomous system, the spine devices are configured as route reflectors so that routes received on spine1 from spine2 are
readvertised to leaf1, and routes received on spine2 from spine1 are readvertised to leaf3.
The signaling is EVPN signaling, which means the IBGP peering sessions will be sending Type-2 MAC routes. When leaf1
receives the MAC advertisements that originate from leaf3, leaf1 will install a VXLAN tunnel to leaf3, and vice versa. With
this example, leaf1 and leaf3 will be directly connected with a VXLAN tunnel and will be able to forward Layer 2 traffic
end to end.
         {master:0}[edit]
         lab@spine1# show routing-instances
         customer1 {
             instance-type vrf;
             interface irb.10;                            <- IRB interface in VLAN 10
             interface lo0.10;                            <- loopback interface for the customer VRF
             route-distinguisher 192.168.100.1:5001;
             vrf-target target:65000:1;                   <- vrf-target community associated with customer 1
             routing-options {
                 auto-export {                            <- ensure interface routes are in the VRF table for
                     family inet {                           forwarding next hops
                         unicast;
                     }
                 }
             }
         }
Spine1 Configuration
The configuration on the spine will be performed in a virtual routing and forwarding instance, or VRF. With this configuration
type, each customer within the data center can be configured with an independent VRF, and routes from different customers
can be maintained separately.
An IRB interface and a loopback interface are placed in the virtual routing and forwarding instance. The route
distinguisher is unique to this device and this customer. A VRF target community is also defined that is unique to this
customer. The routing-options auto-export parameter ensures that the routes of the physical interfaces connected to the
spine1 device are included in the routing instance. If this configuration parameter were not present, only the IRB and loopback
interfaces defined in the routing instance would be present, and there would be no physical interfaces to forward traffic.
A similar configuration exists on router spine2, with a matching VRF target community.
Note: Spine2 will have a similar configuration.
The BGP peering group EVPN is for the overlay network. The EVPN peering group peers to the loopback address of the leaf1
device and to the loopback address of the spine2 device. The default BGP route advertising rules do not forward routes
learned from internal BGP peers to other internal BGP peers. The cluster statement in the group EVPN allows spine1 to
readvertise routes received from an internal BGP peer to members of the cluster, which are the neighbors configured within
that group. If this statement were not present, routes received from spine2 would not be forwarded to leaf1, and routes received
from leaf1 would not be forwarded to spine2.
The peering group called provider is an external peering group, which peers with the service provider. This peering group also
advertises the local loopback address to the external peer, and relays the loopback address received from leaf1 to the
service provider.
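The spine1 BGP configuration itself is not reproduced on the slide. Based on the description above it might look roughly like the following sketch; the group names EVPN and provider come from the text, while the cluster ID, the provider neighbor address, and the export policy name are assumptions inferred from the outputs shown elsewhere in this example:

         {master:0}[edit protocols bgp]
         lab@spine1# show
         group EVPN {                                 <- overlay IBGP peering with route reflection
             type internal;
             local-address 192.168.100.1;
             family evpn {
                 signaling;
             }
             cluster 192.168.100.1;                   <- readvertise routes between leaf1 and spine2 (assumed cluster ID)
             neighbor 192.168.100.11;                 <- leaf1 loopback
             neighbor 192.168.100.2;                  <- spine2 loopback
         }
         group provider {                             <- underlay EBGP peering toward the MPLS provider
             type external;
             export export-directs;                   <- assumed policy advertising loopback routes
             local-as 65001;
             neighbor 172.16.1.30 {                   <- assumed provider-facing neighbor address
                 peer-as 65100;
             }
         }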
The protocols evpn section sets the encapsulation to VXLAN, and is configured to support all VNIs.
The switch-options hierarchy defines the VTEP source interface to be the loopback interface, defines the route distinguisher
for this device, and defines the global VRF target community that will be used for all Type-1 EVPN routes. For all Type-2 and
Type-3 EVPN routes, the device is configured to automatically generate VRF target values. The auto-generated VRF target
communities are based on the base VRF target value and the VNI associated with the VLAN ID. In this
way the auto-generated VRF target communities are synchronized across multiple devices as long as the VNI/VLAN
mappings and the base VRF target are the same.
      Leaf1 Configuration
         {master:0}[edit]
         lab@leaf1# show protocols
         bgp {
             group overlay {                          <- overlay BGP peering
                 type internal;
                 local-address 192.168.100.11;
                 family evpn {
                     signaling;
                 }
                 neighbor 192.168.100.1;
             }
             group fabric {                           <- underlay BGP peering
                 type external;
                 export export-directs;
                 local-as 65003;
                 multipath {
                     multiple-as;
                 }
                 neighbor 172.16.1.5 {
                     peer-as 65001;
                 }
             }
         }
         evpn {
             encapsulation vxlan;
             extended-vni-list all;
         }

         {master:0}[edit]
         lab@leaf1# show vlans
         default {
             vlan-id 1;
         }
         v10 {
             vlan-id 10;
             vxlan {
                 vni 5010;
             }
         }

         {master:0}[edit]
         lab@leaf1# show switch-options
         vtep-source-interface lo0.0;
         route-distinguisher 192.168.100.11:1;
         vrf-target {
             target:65000:1;
             auto;
         }
Leaf1 Configuration
The leaf1 BGP configuration has two groups. The fabric group connects leaf1 to spine1 and advertises the leaf1 loopback
address to spine1. The overlay group peers with spine1 only. Routes from other devices within the data centers are relayed to
leaf1 through the spine1 peering session.
The EVPN configuration sets the encapsulation to VXLAN, and includes all VNIs.
A single VLAN is configured on leaf1. VLAN v10 has VLAN ID 10, and is assigned to VNI 5010. Although not shown in the
output, the single interface that connects to host1 is an ethernet-switching interface assigned to VLAN v10.
The switch-options configuration hierarchy defines the VTEP source interface as lo0.0, defines the
route distinguisher for leaf1, and defines the route target.
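Although the host-facing interface is not shown on the slide, it might be configured roughly as follows; the interface name xe-0/0/2 is an assumption and is not taken from the lab topology:

         {master:0}[edit interfaces]
         lab@leaf1# show xe-0/0/2
         unit 0 {
             family ethernet-switching {
                 interface-mode access;
                 vlan {
                     members v10;
                 }
             }
         }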
         [snip]
                                  > to 172.16.1.6 via xe-0/0/1.0
         192.168.100.13/32   *[BGP/170] 03:19:00, localpref 100
                                AS path: 65100 65002 65005 I, validation-state: unverified     <- note the AS path of the route
                              > to 172.16.1.30 via xe-0/0/0.0
         [snip]
                 Input packets : 0
                 Output packets: 0
EVPN routes will have VTEP tunnels to validate the protocol next hop to remote devices.
         [snip]
         2:192.168.100.13:1::5010::52:54:00:2c:4b:a2/304 MAC/IP     <- host2 MAC address present in the bgp.evpn.0 table
                            *[BGP/170] 00:14:39, localpref 100, from 192.168.100.1
                               AS path: I, validation-state: unverified
                             > to 172.16.1.5 via xe-0/0/1.0
         [snip]
Verify Reachability
From the host2 device, the ping command can be used to verify that host1 is reachable across the data center
interconnect and across both data centers.
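As a sketch, assuming host1 uses the address 10.1.1.1 (the host2 address 10.1.1.2 appears in the EVPN route output below, but the host1 address is not shown on the slide), the verification from host2 might look like:

         host2$ ping -c 3 10.1.1.1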
[snip]
         2:192.168.100.13:1::5010::52:54:00:2c:4b:a2/304 MAC/IP     <- host2 MAC address present in the default EVPN switch table
                            *[BGP/170] 00:22:28, localpref 100, from 192.168.100.1
                               AS path: I, validation-state: unverified
                             > to 172.16.1.5 via xe-0/0/1.0
         [snip]
         2:192.168.100.13:1::5010::52:54:00:2c:4b:a2::10.1.1.2/304 MAC/IP     <- host2 MAC/IP address present in the default EVPN switch table
                            *[BGP/170] 00:01:51, localpref 100, from 192.168.100.1
                               AS path: I, validation-state: unverified
                             > to 172.16.1.5 via xe-0/0/1.0
         [snip]
      • Underlay Topology
             • Underlay is two IP Fabrics based on EBGP routing
                    • EBGP routing to the provider to advertise loopbacks between data centers
                    • MP-IBGP routing between leafs
             • DCI is an MPLS Layer 3 VPN
[Figure: DC1 (host1, leaf1 AS65003, spine1 AS65001) and DC2 (spine2 AS65002, leaf3 AS65005, host2) connected across the MPLS provider WAN (AS65100).]
The underlay topology in this example uses the same underlay as in the previous example.
      • Overlay Topology
             • Only Type-5 routes advertised between data centers (no Type-2 MAC routes)
             • EVPN signaling is based on MP-IBGP routing
[Figure: Overlay topology with Type-5 routes. Spine1 (RR) in DC1 advertises the Type-5 route 10.1.1.0/24 and spine2 (RR) in DC2 advertises the Type-5 route 10.1.2.0/24 across the MPLS provider WAN (AS65100). EBGP unicast underlay in AS65001 and AS65002; IBGP EVPN overlay in AS65000 within each data center; VXLAN tunnels run leaf1-to-spine1, spine1-to-spine2, and spine2-to-leaf3.]
A VXLAN tunnel is created between leaf1 and spine1. Since host2 is in a different broadcast domain than host1, host1 will
be required to send traffic destined to host2 to the default gateway, which is spine1. The VTEP tunnel between leaf1 and
spine1 will be used to forward that traffic. The default gateway address is configured on the IRB interface on spine1.
Once traffic reaches the IRB interface on spine1, a route lookup takes place and determines that the destination network
10.1.2.0/24 is reachable through the new VTEP tunnel that crosses the DCI. Router spine2 is listed as the next hop for the
Type-5 route prefix that advertises reachability to the destination.
The traffic is forwarded across the VXLAN tunnel and arrives at spine2, at which point the VXLAN header used to cross the
DCI link is removed, and a new VXLAN header is placed on the packet in order to forward it to leaf3.
Since the devices in DC1 are not interested in the VNI associated with the host network in DC2, the Type-2 MAC routes associated
with the DC2 VNIs will not be accepted by the DC1 routers, and will not be present on the DC1 devices. Instead, a route prefix
will be present on spine1 to route traffic toward DC2.
         {master:0}[edit]
         lab@spine1# show routing-instances
         customer1 {
             instance-type vrf;
             interface irb.10;                            <- IRB interface in VLAN 10
             interface lo0.10;                            <- loopback interface for the customer VRF
             route-distinguisher 192.168.100.1:5001;
             vrf-target target:65000:5100;                <- vrf-target community associated with customer 1
             routing-options {
                 auto-export {                            <- ensure interface routes are in the VRF table for
                     family inet {                           forwarding next hops
                         unicast;
                     }
                 }
             }
             protocols {
                 evpn {
                     ip-prefix-routes {                   <- configure Type-5 IP prefix route advertisement
                         advertise direct-nexthop;        <- advertise routes with the advertise direct-nexthop parameter
                         encapsulation vxlan;
                         vni 10010;                       <- independent VNI for the spine-to-spine connection; acts as an
                     }                                       Ethernet segment between the spine devices for routing and forwarding
                 }
             }
         }
Spine1 Configuration
The only changes that will affect the behavior of the DCI are made on the spine devices. The leaf devices have no
changes.
The example shows the configuration of the customer1 VRF on spine1. Note the addition of the [protocols evpn]
hierarchy within the routing instance. The configuration shown enables the Type-5 IP prefix route advertisement functions.
The VNI listed in the EVPN section of the configuration is the VNI associated with the VXLAN tunnel that will cross the DCI.
[snip]
                                   Note: Layer 3 gateway devices have IRB interfaces configured in the VLANs that will be bridged,
                                   and route traffic between the IRB interfaces. With Type-5 routes, remote VLANs, and the IRBs
                                   associated with remote VLANs, do not have to be configured on the border router.
         customer1.inet.0: 5 destinations, 5 routes (5 active, 0 holddown, 0 hidden)
         + = Active Route, - = Last Active, * = Both

         10.1.1.0/24        *[Direct/0] 00:24:59          <- DC1 prefix on leaf1 (the local IRB interface is configured as part of
                             > via irb.10                    the 10.1.1.0/24 network and is the default gateway for VLAN 10)
         10.1.1.100/32      *[Local/0] 00:24:59
                               Local via irb.10
         10.1.1.254/32      *[Local/0] 00:24:59
                               Local via irb.10
         10.1.2.0/24        *[EVPN/170] 00:24:15          <- DC2 prefix on leaf3 learned through an EVPN route
                             > to 172.16.1.30 via xe-0/0/0.0
Remember, traditional EVPN route next hops must be validated by a tunnel to a remote VTEP. Type-5 routes can be validated
using standard IPv4 routes.
         [snip]
         5:192.168.100.1:5001::0::10.1.1.0::24/248        <- Type-5 routes present for both subnets (one local, one remote)
                            *[EVPN/170] 00:27:31
                               Indirect
         5:192.168.100.2:5001::0::10.1.2.0::24/248
                            *[BGP/170] 00:26:47, localpref 100, from 192.168.100.2
                               AS path: I, validation-state: unverified
                             > to 172.16.1.30 via xe-0/0/0.0
Summary
We Discussed:
        •            The term Data Center Interconnect;
Review Questions
Review Questions
         1.
2.
3.
4.
      2.
When the transport network is a public IP network, VXLAN tunnels can be configured across the DCI for a Layer 2 stretch, or
Type-5 routes can be used to advertise prefixes across the DCI connection.
      3.
Type-2 EVPN routes must be validated using the :vxlan.inet.0 table, which contains the links associated with VXLAN
tunnels.
      4.
Type-5 EVPN routes can be validated using standard IPv4 routes, whether those routes are present in the default inet.0 route
table or in a customer VRF route table.
                              Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •           An advanced data center deployment scenario.
       ➔ Requirements Overview
       ■    Base Design
Requirements Overview
This slide lists the topics that we will cover. We will discuss the highlighted topic first.
Organization Requirements
        •            How many data center sites will be deployed? How will the data centers be connected?
        •            VLANs - How many VLANs will be required within the domain? How will traffic flow within the same VLAN? How
                     will traffic flow between hosts in different VLANs?
        •            Security - What traffic will be required to pass through a security domain? How will that security domain be
                     implemented? Is an edge firewall sufficient and scalable? Will a security domain which contains several
                     security devices be required?
        •            Scalability - How will the initial design be impacted when the data center scales?
      Proposed Solution
      ■    Solution outline
             • Two data center sites: DC1 and DC2
             • Spine-leaf topology
             • Multihomed servers (EVPN-LAG)
             • VXLAN with EVPN control plane for the Layer 2 domains
             • VRFs on super spine nodes for traffic flow control
             • Layer 2 gateways at the leaf nodes
             • Layer 3 gateways at the spine nodes
             • Dedicated service block within DC1
                    • Traffic to external destinations and to the remote DC must pass through the service block
             • Service device within DC2
             • EVPN controlled VXLAN architecture with Type-5 routes
             • Route reflectors for BGP overlay route distribution
Solution Outline
The design for this example consists of two data center sites: DC1 and DC2. The physical topology will be a spine-leaf
topology, and all servers will be multihomed using EVPN-LAG. A VXLAN with EVPN control plane will be used for the Layer 2
domains. Layer 2 gateways will be configured on the leaf nodes, and Layer 3 gateways will be configured on the spine nodes.
To maintain traffic separation and to better control traffic flows, the Layer 3 gateways and super-spine nodes will
implement VRFs. VRFs allow traffic to pass through the device in one VRF, and then be forwarded back to the device to be
forwarded along a different path at a later time, such as after traffic has been processed through a service block, or when traffic
passing through the same super-spine device must be directed over a DCI link or an externally facing link. The forwarding
path that the traffic takes depends on the VRF in which the traffic arrives.
A dedicated service block within data center DC1 will service all traffic that is destined to external destinations, and all
inter-VLAN traffic. A single service device will be deployed within data center DC2, through which all Layer 3 routed traffic
that arrives at or leaves DC2 will have to pass.
Data Center Interconnect (DCI) traffic will be routed using Type-5 EVPN routes. Within the data center, route reflectors will be
used for BGP overlay route distribution. Layer 2 DCI traffic, or Layer 2 stretch traffic, does not pass through the service
devices, and is bridged to remote hosts within the same broadcast domain.
      ■    Requirements Overview
      ➔ Base Design
Base Design
The slide highlights the topic we will discuss next.
             • Dual-homed servers
             • Dual Internet gateway
             • Service block
[Figure: DC1 physical layout. The DC1 super spine layer connects to the core fabric and WAN and to the DC1 spine and leaf layers, which are grouped into pods (sample pod shown). EVPN LAG connects each server to the leaf layer. A service device cluster forms the DC1 service block.]
Physical Layout
The physical topology in DC1 is based on a five-tier fabric architecture. The spine and leaf nodes are grouped together in
pods. Each pod connects to a super-spine layer. The servers within each pod are dual-homed to leaf devices.
The super-spine consists of dual connected Internet gateway devices. The DCI is used to connect the super-spine devices in
DC1 with the spine devices in DC2. A service block in DC1 services all traffic destined to external destinations, all inter-VLAN
traffic, and all routed traffic to data center DC2.
             • EBGP within a pod
             • IBGP between spine devices and RR (overlay route distribution between pods)
[Figure: DC1 underlay. EBGP runs on the point-to-point links within each pod (DC1 leaf layer with ESIs toward the servers), IBGP runs between the spine devices and the route reflectors in the super-spine layer, and BGP runs toward the service device cluster in the DC1 service block. The super-spine layer connects to the core fabric and WAN.]
Underlay Network
The underlay network in both data centers will be configured with EBGP sessions on all point-to-point links within the data
center. IBGP peering sessions between the spine devices and the super-spine devices, which act as route reflectors, provide
route redistribution between pods. The spine devices in each pod are configured as route reflectors for route redistribution
within each pod.
BGP runs between the service block routers and the service block devices to distribute routing information between the
VXLAN networks and the service device clusters.
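As a rough sketch of the route-reflector relationship described above, a super-spine node might carry an IBGP group similar to the following; the device name, group name, cluster ID, and all addresses here are assumptions for illustration only:

         {master:0}[edit protocols bgp]
         lab@superspine1# show group overlay-rr
         type internal;
         local-address 10.255.0.1;                   <- assumed super-spine loopback
         family evpn {
             signaling;
         }
         cluster 10.255.0.1;                         <- reflect overlay routes between pods
         neighbor 10.255.1.1;                        <- assumed pod 1 spine loopback
         neighbor 10.255.2.1;                        <- assumed pod 2 spine loopback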
             • Full mesh between all leafs and spines within a pod (gateway devices)
             • Full mesh between all leafs within a DC
             • Full mesh between all leafs across all DCs
             • VXLAN tunnels dynamically built through EVPN signaling
[Figure: DC1 logical overlay showing the DC1 super spine layer and representative VXLAN tunnels.]
Logical Overlay
VXLAN tunnels will be built between all VTEPs across the entire domain. A full mesh of Layer 2 VXLAN tunnels exists between
all leaf nodes and all spine nodes within a pod. A full mesh of VXLAN tunnels exists between all leaf nodes within a DC. A full
mesh of VXLAN tunnels exists between all leaf nodes across both data centers. All of these VXLAN tunnels will be
dynamically created through EVPN signaling. These VXLAN tunnels create a Layer 2 stretch from every leaf device to every
other leaf device.
Note that the diagram has been simplified by not showing all VXLAN tunnels. It shows a representation of the types of
tunnels that will be created.
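One way to confirm that the tunnels were built dynamically is to list the remote VTEPs that EVPN signaling has discovered on a leaf node. A sketch of the operational commands on a QFX-series leaf is shown below (the device name is assumed and the output is omitted):

         lab@leaf1> show ethernet-switching vxlan-tunnel-end-point remote
         lab@leaf1> show interfaces vtep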
             • Inter-VRF traffic between data centers through the service block
             • EVPN Type-5 routes
[Figure: Inter-VRF traffic flow through the core fabric and the DC1 service block.]
Distributed Gateway
The Layer 3 VXLAN gateway exists on the service block routers. All inter-VRF traffic within the same pod must pass through
the service block, all inter-VRF traffic between pods must pass through the service block, and all inter-VRF traffic between data
centers must also pass through the service block.
EVPN Type-5 routes are used for Layer 3 destinations, including Layer 3 destinations within
each data center and not just external destinations. The next hop for all Type-5 routes is the interface on the service block.
Traffic in DC2 does not use a service block in DC2. Instead, DC2 traffic that requires servicing must traverse the DCI link and
pass through the service block in DC1.
Layer 2 Stretch
In the same manner, intra-VRF traffic between data centers does not pass through a service block, as it is considered trusted.
[Figure: Internet traffic flow. Traffic from the DC1 leaf layer is forwarded through the firewall in the DC1 service block, then through the DC1 super spine layer toward the WAN.]
Internet Traffic
Traffic destined to the Internet or to outside destinations is forwarded to the Layer 3 gateway in the service block. The
Layer 3 gateway forwards the traffic to the service device, which processes the traffic and forwards it back to the Layer 3
gateway in a different VRF or VLAN. From the service block router, the traffic is forwarded to an Internet-specific VRF on the
super-spine, which has a connection to external destinations, such as the Internet. Return traffic follows the reverse path:
it comes in through the Internet VRF on the super-spine, and is forwarded to the service block to be processed before it
reenters the data center domain.
Summary
We Discussed:
         •            An advanced data cente r deployment scenario.
Review Questions
Review Questions
        1.
2.
3.
      2.    Deploying a service block instead of a dedicated service appliance allows for the scalability of the service block. A
            service block can be scaled by adding new devices and services beyond the gateway device, which is
            transparent to the rest of the network.
      3.    A five-stage topology allows the data center to expand horizontally without impacting each individual pod. New
            pods can be added as needed without affecting the other pods, and with minimal impact on the super-spine
            uplinks and downlinks.
                              Engineering Simplicity
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •           The multicast extensions to EVPN; and
      Traditional Multicast
      ■    Traditional Multicast Process
             • IGMP at edge
                    • Used to register multicast sources
                    • Used by receivers to request specific or general multicast feeds (S,G or *,G)
                    • Queries used by the multicast Designated Router (DR) for each broadcast segment (edge)
                       - Used to signal when a receiver is no longer interested in a multicast feed (Group Leave)
                       - Used to verify interested receivers on a broadcast segment (Group Query)
             • PIM used to select the active forwarding path through the core (routed) network
                    • Shared tree through a Rendezvous Point (RP)
                    • Shortest-path tree between receiver and source
                    • Join and Prune messages to initiate or terminate multicast feeds
Traditional Multicast
Traditional multicast networks involve several components: source devices that send multicast traffic into the network,
receiver devices that are interested in receiving the multicast traffic, and the network devices between the
source and receiver.
At the edge of the multicast domain, the Internet Group Management Protocol (IGMP) is used to register multicast sources
and to register multicast group requests by hosts on the network.
One device connected to a LAN segment is elected as the designated router for the broadcast segment. Its role is to signal to
the multicast network when a receiver is no longer interested in a multicast feed, and to verify or register interested receivers
on the broadcast domain for which it is responsible.
The Protocol Independent Multicast protocol, or PIM, is used to select the active forwarding path through the core, or routed,
network. It can do this in one of several ways.
One of the most common methods is to forward all source traffic toward a central point, building a shared tree. With a shared tree,
a centralized device is selected to receive all source traffic, and all join messages for multicast feeds where a
specific source address for the feed is unknown are initially sent to it. This device is called a Rendezvous Point (RP). The RP is
the central point through which all multicast sources can be reached and at which any receiver can join a multicast tree.
A shortest-path tree between a receiver and a source refers to a direct routed path from a source to a receiver. In order to
create a shortest-path tree, a receiver must know the specific source IP address, and request traffic from that specific
source in an IGMP join. When a DR receives an IGMP join for a specific source and group (S,G) combination, a multicast
forwarding tree is established along the shortest path between the source DR and the receiver DR.
To manage multicast flows throughout the network, join and prune messages are used to initiate or terminate multicast
feeds.
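To make these components concrete, a minimal sketch of how an edge router running Junos might enable IGMP on a receiver-facing interface and point PIM at a static RP is shown below; the interface name and RP address are assumptions, not part of the course topology:

         [edit]
         protocols {
             igmp {
                 interface ge-0/0/0.0;               <- assumed receiver-facing interface
             }
             pim {
                 rp {
                     static {
                         address 10.255.255.1;       <- assumed Rendezvous Point address
                     }
                 }
                 interface all {
                     mode sparse;
                 }
             }
         }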
      EVPN Multicast
      ■    EVPN with VXLAN tunnels traffic
             • IGMP could be tunneled across the IP overlay
                    • Inefficient (places the DR at locations remote from the source or receiver)
                    • Increases BUM traffic in the core
             • Inefficient multicast stream replication
             • Multicast functions are separate from the EVPN control plane
      ■    EVPN route types for multicast
             • Type-6: Selective Multicast route to signal IGMP joins to remote VTEPs
             • Type-7: IGMP join sync for multihomed sites (Designated Forwarders)
             • Type-8: IGMP leave sync for multihomed sites (Designated Forwarders)
EVPN Multicast
Multicast traffic is broadcast within a broadcast domain. Normally, a broadcast domain is contained within a single switching
domain, and multicast traffic is only forwarded to remote receivers across a routed network. With this design, the multicast
packets are contained within the broadcast domain that terminates at the designated router.
With EVPN-VXLAN, a broadcast domain is not limited to a single location. The EVPN-VXLAN domain can extend across
multiple leaf devices, multiple spine devices, and multiple data centers. The members of the VNIs in the remote locations
are part of a single broadcast domain. Because of this, when a multicast source sends traffic, it is forwarded throughout the
entire broadcast domain.
There are a couple of ways to manage the multicast traffic in an EVPN-VXLAN. The first would be to tunnel IGMP across the IP
overlay. This is inefficient because it places a designated router at remote locations that are not directly connected to the
source or receiver. This creates inefficient multicast stream replication, and the multicast functions are separate from the EVPN
control plane.
To help address some of these inefficiencies, three new EVPN route types for multicast were developed. The Type-6 route, or
selective multicast Ethernet tag route, is used to signal IGMP joins to remote VTEPs.
The Type-7 route, or IGMP join sync route, is used in multihomed sites where a source or receiver is connected to a broadcast
domain that has multiple routers as exit points, and the join requests must be synchronized across all potential edge devices.
The Type-8 route, or IGMP leave sync route, is used in multihomed sites where a source or receiver is connected to a broadcast
domain that has multiple routers as exit points, and the leave requests must be synchronized across all potential edge
devices.
[Figure: Traditional multicast behavior in the VXLAN. A source (S,G) in DC1 POD1 and a receiver (S,G) in POD2 share a VRF/bridge domain; traffic from the source is flooded to all leaf devices that service the VNI, including DC2 POD1.]
With traditional multicast, traffic is sent to all leaf devices that service the VNI/VLAN because they all reside in the same
broadcast domain. Traffic from the source in POD1 arrives at the receiver in POD2, but the traffic is also forwarded to
all leaf devices that service the VNI throughout the entire VXLAN.
[Figure: Type-6 route behavior. Only the leaf with an interested receiver (S,G) in POD2 receives the multicast traffic from the source (S,G) in DC1 POD1; uninterested leaf devices in the VRF/bridge domain do not.]
Type-6 Routes
The EVPN Type-6 route, or selective multicast Ethernet tag route, allows a VTEP device to advertise whether locally
connected receivers are interested in receiving multicast traffic. The route can indicate a specific source for the multicast
group, or just a multicast group address. This allows each VTEP to register which remote VTEPs are interested in
multicast feeds within the same broadcast domain.
Not all leaf devices will support the Type-6 route. If a leaf device does not support Type-6 routes, it cannot signal to remote
VTEPs that it is not interested in receiving certain multicast feeds. Therefore, it receives all multicast feeds for the locally
connected broadcast domains.
[Figure: Hosts at 10.10.10.11/24 and 10.10.10.22/24 (Site 2) in the same subnet, each sending IGMP messages into the fabric.]
This process requires the capability to perform IGMP snooping on the leaf devices. The leaf devices listen to the IGMP
messages that enter the customer-facing interfaces and translate them into EVPN Type-6 routes.
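As a minimal sketch of enabling this on a leaf device, IGMP snooping can be turned on for the VLAN that carries the receivers; the VLAN name v10 is reused from the earlier examples and is an assumption here:

         [edit]
         protocols {
             igmp-snooping {
                 vlan v10;                <- snoop IGMP on the customer-facing VLAN
             }
         }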
[Figure: IGMP reports from hosts 10.10.10.11/24 and 10.10.10.22/24 (Site 2) are snooped at the leaf devices and signaled as EVPN Type-6 routes through spine1 and spine2.]
[Figure: A multihomed site. Site 2 (SRC) is attached to multiple leaf devices (leaf1, leaf2, leaf3), one of which is the DR, illustrating the IGMP join synchronization problem described below.]
Multihomed Sites
When a device is multihomed to a VXLAN, another problem presents itself when dealing with multicast. Within a broadcast
domain, only one device at the edge of the domain can be elected as a designated router (DR). The designated router's role
is to manage the multicast process within the connected Ethernet segment. In an active/active environment, the devices
connected to the multiple leaf devices can forward their IGMP join messages to the non-designated router. If the
non-designated router receives an IGMP join message, it does not process the join. The result is
that the designated router never receives the join message and cannot initiate a multicast flow from the source.
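As an illustration only, a CE that is dual-homed to two leaf devices could be attached with an all-active Ethernet segment configuration such as the following on each multihoming leaf; the interface name, ESI value, and LACP system ID shown here are assumptions chosen to mirror the diagrams, not values from the lab topology.

      {master:0}[edit]
      lab@leaf1# show interfaces ae1
      /* Same ESI and LACP system-id must be configured on both multihoming leaves */
      esi {
          00:01:01:01:01:01:01:01:01:01;
          all-active;
      }
      aggregated-ether-options {
          lacp {
              active;
              system-id 00:00:00:01:01:01;
          }
      }
      unit 0 {
          family ethernet-switching {
              interface-mode access;
              vlan {
                  members v10;
              }
          }
      }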
[Figure: CE1 in Site 1 is multihomed to Leaf1 and Leaf2; the multicast source (SRC) is attached in Site 2 behind the DR]
EVPN Type-7
To remedy the join message issue, the EVPN Type-7 route was created. The EVPN Type-7 route acts as an IGMP
synchronization mechanism between leaf devices that are connected to the same Ethernet segment. With the EVPN Type-7
route, when an IGMP join message arrives on a leaf device, that leaf device automatically advertises a Type-7 route containing
the join message information to all other leaf nodes that are connected to the same Ethernet segment. In the example, the
IGMP join is sent to a leaf device that is not the designated router. Leaf1 creates an EVPN Type-7 route containing the
information in the join message and sends it to Leaf2. Leaf2, acting as the designated router, becomes aware that a device
in Site 1 is interested in multicast traffic, and begins the multicast tree-building process.
[Figure: Type-7 route advertisement. CE1 in Site 1 is multihomed to Leaf1 and Leaf2 on ESI 0x0:1:1:1:1:1:1:1:1:1; an IGMP join received by Leaf1 is advertised as a Type-7 route to Leaf2; CE2 is in Site 2]
[Figure: Type-8 route advertisement. An IGMP leave received from CE1 in Site 1 on ESI 0x0:1:1:1:1:1:1:1:1:1 is advertised by Leaf1 as a Type-8 route to Leaf2; CE2 is in Site 2]
[Figure: multicast hair-pinning; traffic is replicated on Leaf3, the centralized DR, before returning across the fabric]
       Multicast Hair-Pinning
                • Centralized DR can cause hair-pinning across VXLAN network
Multicast Hair-Pinning
Within a multicast environment in a VXLAN, the designated router does not have to be configured on the device that directly
connects to the source or receiver. The designated router can be configured on an IRB interface on a remote device. When a
multicast feed is instantiated, the feed must always extend from the receiver or source to the designated router for that LAN
segment.
In the example, the designated router for subnet 2 is configured on Leaf3. Any device in subnet 2 must receive multicast
traffic from its designated router, which is on the other side of the EVPN. This process, where traffic is sent across the
network to an IRB, switches VLANs, and then returns to a receiver on the remote side of the network, is called hair-pinning. As
you can see in the diagram, this traffic pattern is inefficient.
EVPN Distributed DR
With distributed DRs, multicast traffic is forwarded out of the EVPN IRB interfaces that are connected to interested receivers,
regardless of whether the leaf device is the DR for the subnet. This allows a local device, such as Leaf1 in the example,
to perform the designated router function for its locally attached receivers even though it is not the elected DR. It forwards
traffic destined to Receiver1, which is in a different subnet, without having to forward the traffic to the remote Leaf3.
To prevent traffic duplication, multicast traffic is only sent out of the IRB interface to local access interfaces. The local
distributed DR cannot forward routed multicast traffic to remote VTEPs. In other words, the IRB for subnet 2, which is connected to Receiver1,
cannot forward the multicast traffic to remote VTEPs. Only the original IRB that receives traffic from the source can forward
to remote VTEPs.
EVPN Multicast
The slide highlights the topic we will discuss next.
      • EVPN Routing
             • Synchronize joins from dual-homed Site 1 (CE1)
              • Advertise source from Site 2 (CE2)
[Figure: example topology with Leaf1, Leaf2, and Leaf3; CE1 in Site 1 is dual-homed on ESI 0x0:1:1:1:1:1:1:1:1:1 and the source is in Site 2]
The example shows the configuration of the IRB interface, the VLAN configuration that places the IRB interface in the broadcast
domain, and the IGMP configuration that enables the IGMP protocol on the IRB interface.
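The original example is not reproduced here; the following is a minimal sketch of what such a configuration could look like on leaf1, with the IRB address, VLAN ID, and VNI chosen to match the verification output shown later in this chapter (treat the exact values as assumptions).

      {master:0}[edit]
      lab@leaf1# show interfaces irb
      unit 10 {
          family inet {
              /* Assumed address; matches the IGMP querier address seen in the verification output */
              address 10.1.1.112/24;
          }
      }

      {master:0}[edit]
      lab@leaf1# show vlans v10
      vlan-id 10;
      /* Places irb.10 in the v10 broadcast domain */
      l3-interface irb.10;
      vxlan {
          vni 5010;
      }

      {master:0}[edit]
      lab@leaf1# show protocols igmp
      interface irb.10;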
      • IGMP Snooping
      • Configured under [edit protocols igmp-snooping]
                                                        {master:0}[edit]
                                                        lab@leaf1# show protocols igmp-snooping
                                                        vlan v10 {
                                                            l2-querier {
                                                                source-address 10.1.1.113;
                                                            }
                                                            proxy;
                                                        }
IGMP Snooping
In order to register the IGMP messages sent from clients, the leaf device must be able to see the IGMP messages that enter
the host-facing interfaces. IGMP snooping is used to perform this task. The example shows the configuration of the IGMP
snooping process for VLAN v10. It also configures the source address to be used by the Layer 2 IGMP querier function on that VLAN.
Optional - Distributed DR
       ■    Distributed DR
              • Only required if configuring distributed DR within the VXLAN
               • Configure RP address and enable distributed-dr
                           • In this example, the RP is a remote device (not local)
                                                         {master:0}[edit]
                                                         lab@leaf1# show protocols pim
                                                         rp {
                                                             static {
                                                                 address 192.168.100.1;
                                                             }
                                                         }
                                                         interface irb.10 {
                                                             distributed-dr;
                                                         }
Distributed DR Configuration
The configuration example shown is only needed if a distributed DR is going to be configured within a VXLAN. The address of
the Rendezvous Point is configured to identify which device is acting as the official Rendezvous Point. The distributed-dr
parameter is configured under the local IRB interface to identify which local interfaces will be used to perform the distributed
DR tasks.
                                                        {master:0}
                                                        lab@leaf1> show igmp interface
                                                        Interface: irb.10
                                                            Querier: 10.1.1.112
                                                            State:         Up  Timeout:   209
                                                            Version:        2  Groups:      0
                                                            Immediate leave: Off
                                                            Promiscuous mode: Off
                                                            Passive: Off
                                                        Configured Parameters:
                                                        IGMP Query Interval: 125.0
                                                        IGMP Query Response Interval: 10.0
                                                        IGMP Last Member Query Interval: 1.0
                                                        IGMP Robustness Count: 2
                                                        Derived Parameters:
                                                        IGMP Membership Timeout: 260.0
                                                        IGMP Other Querier Present Timeout: 255.0
       {master:0}
       lab@leaf1> show igmp snooping evpn proxy vlan v10
       Instance: default-switch
         Bridge-Domain: v10, VN Identifier: 5010

       {master:0}
       lab@leaf1> show igmp snooping statistics
       Vlan: v10
       IGMP Message type      Received       Sent    Rx errors
       Membership Query              0          10           0
       V1 Membership Report          0           0           0
       DVMRP                         0           0           0
       PIM V1                        0           0           0
       Cisco Trace                   0           0           0
       V2 Membership Report          0           0           0
       Group Leave                   0           0           0
       Mtrace Response               0           0           0
       Mtrace Request                0           0           0
       Domain Wide Report            0           0           0
       V3 Membership Report          2           0           0
       Other Unknown types                                   0
Summary
We Discussed:
        •            The multicast extensions to EVPN; and
Review Questions
      2.
The EVPN Type-7 route is used to synchronize IGMP join messages between devices connected to a shared Ethernet segment
in a multihomed environment.
      3.
The EVPN Type-8 route is used to synchronize IGMP leave messages between devices connected to a shared Ethernet segment
in a multihomed environment.
      4.
An EVPN-VXLAN distributed designated router allows a locally connected device to forward multicast traffic to a different
subnet on the same device, without forwarding the multicast traffic to a remote designated router IRB.
Data Center Fabric with EVPN and VXLAN
Objectives
We Will Discuss:
        •           The benefits of CEM; and
CEM Overview
The slide lists the topics we will discuss. We will discuss the highlighted topic first.
[Figure: five steps from a legacy data center to an automated multicloud (Legacy Data Center, Simplified Data Center, Automated Data Center, Hybrid Cloud, Multidomain), with representative outcomes such as reduced admin and IT staff effort, lower infrastructure cost per application, simplified security policy management, and faster time to market, resource scaling, and application life cycles]
Five Steps
The slide shows the five steps from a legacy data center to an automated multicloud.
       • Legacy Data Center: DC 3-tier; perimeter security; basic data center architecture; manually configured and provisioned, or use of scripting and templates
       • Simplified Data Center: fabric; telemetry; threat detection; simplified designs reduce the need for admin resources; improved telemetry to see how the network is performing; improved threat detection
       • Automated Data Center: workflow automation; SDN overlay; automated remediation; AWS, Azure, GCP; increased need for speed to market and for remediation requires automation and flexibility
       • Hybrid Cloud: microsegmentation; leashed policy; root cause insight; more complex microsegmentation and policy profiles; more data to track; more parameters to synchronize across multiple environments for consistency
       • Multidomain: intent-driven networking; unified cloud policy; multi-cloud and multi-tenant deployments; need for unified policies across cloud and non-cloud infrastructures; real-time or near real-time adjustments to network and application performance
       • Customer asks are simple
              • "I need a two-tier application execution environment with these characteristics"
              • "Can I have my DB cluster up and running by next week and connected to the Web front end?"
       • Fulfillment is complex
              • Thousands of lines to set up a device
              • Hundreds of lines to create a simple service
              • Different capabilities for different vendors and OS versions
              • Many DC architecture models
              • DC interconnects span DC operations teams
              • Best practices and tools are different across teams and across DCs and clouds
              • Skill sets are different
              • Tools and best practices are distinct across physical and virtualized workloads
              • Human errors = revenue loss and long lead times
Challenges in the DC
With each step in data center evolution, the complexity of implementing a data center that fulfills business requirements
increases. To fulfill service level agreements (SLAs), not only do telemetry and network monitoring capabilities have
to be more robust and accurate, but the ability to act quickly on the information gathered from those systems must also be improved.
When implementing more complex data center designs, and as those designs incorporate remote data centers and even
cloud-based data centers, it becomes even more difficult to create and apply traffic management, security,
and performance policies consistently across the different environments.
[Figure: Contrail Command orchestrates PaaS/IaaS and container platforms (OpenStack, Kubernetes, OpenShift, Mesos) along with VMware integration (Beta)]
Contrail Command
You should also consider the fact that thousands of lines of configuration statements can be prone to human error, adding even
more hours of troubleshooting and reconfiguration. However, from the Contrail Command user interface, you simply click a
few buttons and "Voila!", you have networked your workloads. Besides the networking aspects of CEM, you can optionally
use the security features of Contrail Networking (sometimes called Contrail Security) as well as AppFormix to bring security
and analytics to your deployment.
[Screenshot: Contrail Command cluster overview showing server, control, analytics, config, and database nodes]
Contrail Command
Contrail Command is the CEM Web-based user interface. As the solution evolves, you will be able to do more and more from
Contrail Command.
[Figure: CEM provides one-click networking services with visibility across the cloud, including underlay automation, DCI automation, discovery and import, automated IP fabrics, reimaging and upgrades, telemetry automation, and topology discovery for BMS, OS VMs, Docker containers, and Azure (Beta) workloads]
[Figure: underlay automation; an IP fabric with qfx10k spine nodes (AS 65000 and AS 65001) and qfx5100 leaf nodes (AS 65101 and AS 65102); manual configuration of the underlay is prone to human error]
       Overlay Autoconfiguration
       ■    EVPN/VXLAN Overlay (Brownfield or Greenfield) autoconfiguration
              • Full-mesh IBGP peering
              • Brownfield underlay can be any routing protocol
[Figure: overlay autoconfiguration; full-mesh MP-IBGP EVPN sessions within AS 64512 between the qfx10k spine nodes, the qfx5100 leaf nodes, the Contrail control node, and the CSN; manual configuration of the overlay is prone to human error]
Overlay Autoconfiguration
Once there is an established underlay network, CEM can automatically discover the IP fabric nodes, take user input to assign
a physical and routing/bridging role to each node (physical spine node, physical leaf node, L2 VXLAN gateway, L3 VXLAN
gateway), and then automatically configure a baseline EVPN/VXLAN overlay between the IP fabric nodes as well as the
Contrail Control node. For a large IP fabric, this could amount to thousands of lines of configuration enabled on the IP fabric
nodes (and Contrail) automatically by CEM.
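To give a sense of what that autogenerated overlay looks like, the following is a minimal, hand-written sketch of the kind of
baseline EVPN-VXLAN overlay statements CEM pushes to a leaf node. The loopback addresses, group name, route
distinguisher, and route target shown here are illustrative values only, not output captured from CEM.

        set protocols bgp group overlay type internal
        set protocols bgp group overlay local-address 10.255.0.11
        set protocols bgp group overlay family evpn signaling
        set protocols bgp group overlay neighbor 10.255.0.1
        set protocols evpn encapsulation vxlan
        set protocols evpn extended-vni-list all
        set switch-options vtep-source-interface lo0.0
        set switch-options route-distinguisher 10.255.0.11:1
        set switch-options vrf-target target:64512:1

In this sketch, the iBGP overlay neighbor (10.255.0.1) would be the Contrail Control node, and the route target reuses the
AS 64512 seen throughout the example fabric. Multiplied across every leaf and spine, this is the configuration volume that
CEM generates automatically.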
      BMS to VM Bridging
      ■    By autoconfiguring the leaf nodes as Layer 2 VXLAN gateways,
           VM to BMS communication is achieved
[Figure: Virtual Network A (192.168.1.0/24) containing VM-1 and VM-2, and a BMS attached to a leaf access port (ge-0/0/4); spine nodes qfx10k (AS 64512) and leaf nodes qfx5100-1 and qfx5100-2 (AS 64512) acting as Leaf Nodes (L2 VXLAN GWs); the key identifies the vRouter; configuration changes are pushed to Contrail, the orchestrator, and the fabric]
BMS to VM Bridging
After an IP fabric has been onboarded (previous two slides), BMSs can be added to the infrastructure, as can Contrail
virtual networks. In this example, CEM will automatically configure the leaf nodes as Layer 2 VXLAN gateways. Normally, this
would be a tedious, manual task, but it is completely automated by CEM.
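For context, a leaf acting as an L2 VXLAN gateway for the BMS port would end up with configuration along these lines. This
is a hand-written sketch under assumed values: the VLAN name, VLAN ID 100, VNI 10100, and route target are illustrative,
while ge-0/0/4 is the BMS-facing port shown in the figure.

        set vlans vn-a vlan-id 100
        set vlans vn-a vxlan vni 10100
        set interfaces ge-0/0/4 unit 0 family ethernet-switching interface-mode access
        set interfaces ge-0/0/4 unit 0 family ethernet-switching vlan members vn-a
        set protocols evpn vni-options vni 10100 vrf-target target:64512:10100

With this in place, MAC addresses learned on ge-0/0/4 are advertised as EVPN Type 2 routes, so VM-1 and VM-2 in Virtual
Network A can reach the BMS across the VXLAN tunnel.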
      SRIOV Support
      [Figure: a BMS running an SR-IOV-capable vRouter connects through the DC leaf switches to Network BLUE and Network RED over VLANs 100, 110, and 210]
      Problem Statement:
      ■    Different compute performance
      ■    Disjoint operations for virtual and SRIOV accelerated VNFs
      Benefits:
      ■    Unified solution to manage heterogeneous compute environments
      ■    Visibility and control of accelerated performance forwarding on all workloads
      ■    One-click action to automate ANY workload with the same operations
      ■    Single view across VM, containers, SRIOV workloads and physical
      ■    Reduce lead time and cross team dependencies
SRIOV Support
The slide shows that a vRouter that supports SR-IOV interfaces is now available for use with CEM.
Summary
We Discussed:
         •              The benefits of CEM; and
Review Questions
      1. What are some of the complexities that have been introduced into
         data centers in a multicloud environment?
      2. What is the user interface for CEM?
      3. How can CEM automate the configuration of an IP fabric?
Review Questions
       1.
       2.
The user interface for CEM is a Web-based user interface that is used to interact with multiple underlying systems and
components.
       3.
CEM can configure spine and leaf devices based on their role in the fabric and can autogenerate addressing and AS
numbering.
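As a concrete, but invented, illustration of that role-based automation, the underlay configuration generated for one leaf in
an eBGP fabric design might look roughly like the following. The router ID, peer addresses, and AS numbers are placeholder
values chosen for the example, not values CEM is guaranteed to assign.

        set routing-options router-id 10.255.0.11
        set protocols bgp group underlay type external
        set protocols bgp group underlay export export-loopback
        set protocols bgp group underlay local-as 65011
        set protocols bgp group underlay neighbor 172.16.0.0 peer-as 65001
        set protocols bgp group underlay neighbor 172.16.0.2 peer-as 65002
        set policy-options policy-statement export-loopback term lo0 from interface lo0.0
        set policy-options policy-statement export-loopback term lo0 then accept

Each spine and leaf receives a unique router ID, unique point-to-point addressing toward its peers, and, in an eBGP
underlay, its own AS number; this per-device bookkeeping is exactly what CEM automates.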
AD ................................................................ aggregation device
RP ................................................................ rendezvous point
SD ................................................................ satellite device
STP ............................................................... Spanning Tree Protocol
VC ................................................................ Virtual Chassis
VCF ............................................................... Virtual Chassis Fabric
VM ................................................................ virtual machine
                                         Engineering
                                         Simplicity
EDU-JUN-ADCX, Revision V18A