VI3 Networking Scenarios and Troubleshooting
Krishna Raj Raja
VMware
Why This Talk ?
A vast majority of networking problems are configuration issues with the
physical switch
Physical switches are managed by network administrators. Virtual
switches are under the control of ESX administrators
Enabling/disabling various networking features can have subtle or
drastic implications on your network connectivity
Knowing how the virtual switch works helps in troubleshooting problems
Outline
ESX Networking details
Scenarios
Virtual Switch boundaries
VLAN
Layer 2 Security
Load Balancing
Failover
Diagnostics
This talk assumes familiarity with ESX networking features
ESX Networking: Logical Layout
Multiple layers, multiple ways to interconnect
Interesting possibilities!
[Diagram labels: Virtual Machine, Virtual NIC, Virtual Switch, Physical NIC, ESX, Physical Switch, Physical Machine]
ESX 3.x Networking Details
Traffic types shown in the diagram:
1: VI traffic to COS
2: VM traffic
3: VI traffic for Remote Console
4: IP storage
5: VMotion traffic
Notes:
VMKernel uses its own TCP/IP stack
Clients inherit the properties of the portgroup
Portgroup policies define what features are used
New functionalities could be plugged in
All network traffic has to go through some virtual switch
VMKernel uses modified Linux drivers
All the physical NICs are managed by VMkernel
[Diagram: Service Console (user, kernel, TCP/IP stack, netcos driver, vmnixmod) and VM (user, kernel, TCP/IP stack, vlance / vmxnet / e1000, VMX with net device emulation and SCSI emulation) sit above the VMKernel (MKS, VMotion, IP Storage, BSD TCP/IP stack). Virtual switch clients (COS, vmxnet, vlance, TCP/IP) attach through a portgroup to the virtual switch (Ethernet switching, VLAN tagging, L2 security, load balancing, failover, ...), which sits on the vmklinux compatibility layer and the Linux device drivers.]
Virtual Switch
Operates at Layer 2; no Layer 3 functionality
Can have zero or more uplinks (physical NICs)
Cannot share uplinks (physical NICs) with other virtual switches
To use a virtual switch, at least one portgroup must be defined
Portgroups
Portgroups do not segment
broadcast domain
VLANs segment broadcast
domains
Clients inherit the properties of their portgroup (in ESX 2.x, properties
were specified on the virtual NIC)
Portgroup policies override virtual switch policies
Can use a subset of the NICs available to the virtual switch
Can share NICs with other portgroups on the same virtual switch
Implication: the same set of physical NICs can be used with different
policy settings, for example VLAN, NIC teaming, etc.
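The override rule can be pictured with a small model. This is only a sketch: the `Policy` class, its field names, and the NIC names are hypothetical and do not correspond to actual ESX settings or APIs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical policy model; field names are illustrative, not ESX settings.
@dataclass
class Policy:
    vlan_id: Optional[int] = None              # None means "not set at this level"
    promiscuous: Optional[bool] = None
    active_nics: Optional[Tuple[str, ...]] = None

def effective_policy(vswitch: Policy, portgroup: Policy) -> Policy:
    """Portgroup values win; unset portgroup values fall back to the virtual switch."""
    def pick(pg_value, vs_value):
        return pg_value if pg_value is not None else vs_value
    return Policy(
        vlan_id=pick(portgroup.vlan_id, vswitch.vlan_id),
        promiscuous=pick(portgroup.promiscuous, vswitch.promiscuous),
        active_nics=pick(portgroup.active_nics, vswitch.active_nics),
    )

vswitch = Policy(vlan_id=0, promiscuous=False, active_nics=("vmnic0", "vmnic1"))
pg_a = Policy(vlan_id=5)                            # overrides the VLAN only
pg_b = Policy(vlan_id=10, active_nics=("vmnic1",))  # uses a subset of the NICs
```

Here `pg_a` and `pg_b` share the same virtual switch and physical NICs but end up with different VLAN and teaming settings.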
Virtual Switch: External View
Virtual switch behaves like a dumb switch
Does not speak STP: it does not have to, since no loops are possible
Does not speak DTP, VTP, ISL, etc.
Does not speak LACP
Physical switch ports have to be aggregated in manual mode
Optional CDP support is planned for a future version
[Diagram labels: Physical Switch, Virtual Switch. Callout: multiple client MAC addresses appear on this port]
Virtual Switch: Internal View
MAC address learning
Unlike physical switches, the virtual switch does not learn MAC addresses
from the traffic flow
Virtual NICs notify the virtual switch of their MAC address when they register
Every other unicast MAC address is treated as belonging to the uplink port
Link negotiation
Virtual NIC does not negotiate speed/duplex with the virtual switch
Virtual NICs do not reflect the speed/duplex state of the uplink
(physical NIC)
Guest reports link down status when the virtual Ethernet device is
disconnected in the UI
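The registration-based lookup described above could be sketched as follows. The class and port names are hypothetical; this is a toy model of the behavior, not the VMkernel implementation.

```python
UPLINK = "uplink"

class VirtualSwitchTable:
    """Toy model: MACs are registered by virtual NICs, never learned from traffic."""

    def __init__(self):
        self.ports = {}  # MAC address -> virtual port name

    def register(self, mac: str, port: str) -> None:
        self.ports[mac.lower()] = port

    def unregister(self, mac: str) -> None:
        self.ports.pop(mac.lower(), None)

    def lookup(self, dst_mac: str) -> str:
        # Any unicast MAC that was not registered is assumed to live beyond the uplink.
        return self.ports.get(dst_mac.lower(), UPLINK)

table = VirtualSwitchTable()
table.register("00:50:56:AA:BB:01", "port-1")
```

Because the table is filled only at registration time, a frame for an unknown destination is never flooded to other virtual ports in this model; it simply goes to the uplink.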
Virtual Switch Boundaries
Virtual switches are isolated, i.e., trunking is not possible between virtual
switches. Only uplinks connect virtual switches.
Communication from VM A to VM B can happen only through the external
network
[Diagram labels: A, B, Virtual, Physical]
Virtual Switch Boundaries
Virtual machines can interconnect virtual switches
Virtual NICs need to be placed in different subnets to use both virtual switches
Layer 2 loops are possible if the VM acts as a bridge
[Diagram labels: 192.168.1.x/24, 192.168.2.x/24, Virtual Switch, Physical Switch]
Virtual Switch Boundaries
VMKernel TCP/IP stack routing table determines packet flow
Put IP storage and VMotion on separate subnets for isolation
Traffic will go through the same virtual switch if both are in the same subnet
[Diagram labels: VMKernel TCP/IP Stack, IP storage, VMotion, Virtual Switch, Physical Switch]
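A quick way to check whether two VMKernel interfaces would land in the same subnet is to compare their networks. The sketch below uses Python's standard `ipaddress` module; the addresses are made-up examples, not values from the talk.

```python
import ipaddress

def same_subnet(iface_a: str, iface_b: str) -> bool:
    """True when two interfaces (address/prefix notation) sit in the same IP network."""
    net_a = ipaddress.ip_interface(iface_a).network
    net_b = ipaddress.ip_interface(iface_b).network
    return net_a == net_b

# Hypothetical addresses for the IP storage and VMotion interfaces.
shared = same_subnet("192.168.1.10/24", "192.168.1.20/24")    # same subnet: no isolation
isolated = same_subnet("192.168.1.10/24", "192.168.2.10/24")  # separate subnets
```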
VLAN: Why Trunk ?
Uplink in a virtual switch is a trunk link to the physical switch
Configure the physical switch port as a trunk port to allow traffic with tagged frames
[Diagram labels: A, B, Untagged frames, Virtual Switch, VLAN 5, VLAN 1, Tagged frames, Trunk Port, P1, P2, Physical Switch, VLAN 5, VLAN 1, Untagged frames]
Native VLAN
Native VLAN: VLAN 1
Physical switch does not tag frames on the native VLAN
Virtual switch does not have the notion of a native VLAN
Communication A to B fails: the virtual switch forwards only tagged frames to B
Communication B to A may or may not fail: the physical switch may or may not accept tagged frames on the native VLAN
Workaround: put VM B on a portgroup with no VLAN tagging, or enforce tagging on switch port P2
[Diagram labels: C, B, Virtual Switch, VLAN 5, VLAN 1, VLAN 5 tagged, VLAN 1 untagged, P1, P2, Physical Switch, A]
Virtual Switch VLAN Behavior Example
Loopback cable interconnects VLAN 5 and VLAN 10 into the same broadcast domain
VM A and VM B can talk to each other
In ESX 2.x the response packets from VM B will not reach VM A. Path optimization prevents this communication
ESX 3.x avoids this problem
[Diagram labels: A (192.168.1.x/24), B (192.168.1.y/24), Virtual Switch, VLAN 5, VLAN 10, Tagged frames, Physical Switch, Untagged frames, Loopback]
VGT: Security Implications
VLAN ID 4095 enables VGT mode in ESX 3.x
In VGT mode the guest can send/receive any VLAN tagged frame (0-4094)
Virtual switch does not filter VLANs
Filtering could be done on the physical switch port
However, VM B could still talk to VM A
[Diagram labels: VGT, A, B, VLAN 5, 7, 10, Virtual Switch, VLAN 5, VLAN 4095, P1, P2, P3, VLAN 7, VLAN 10, VLAN 5, Physical Switch, Filter: Allow VLAN: 7, 10]
Layer 2 Security
ESX Layer 2 security options give a level of control beyond what is usually
possible in physical environments
Promiscuous Mode: Deny
Virtual NIC will appear to go into promiscuous mode, but it won’t receive any
additional frames
Forged transmits: Deny
Drop any frames which the guest sends with a source MAC different from the
one currently registered
MAC address changes: Deny
If the guest attempts to change the MAC address to something other than
what’s configured for the virtual HW, stop giving it frames
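The two MAC-related checks can be sketched as simple predicates. This is an illustrative model under stated assumptions: the `VirtualNic` class and both function names are hypothetical, and the real virtual switch logic may differ in detail.

```python
from dataclasses import dataclass

@dataclass
class VirtualNic:
    configured_mac: str  # MAC configured for the virtual hardware
    effective_mac: str   # MAC the guest currently claims to use

def allow_transmit(nic: VirtualNic, frame_src_mac: str, forged_transmits_deny: bool) -> bool:
    """Forged transmits = Deny: drop frames whose source MAC is not the registered one."""
    if not forged_transmits_deny:
        return True
    return frame_src_mac.lower() == nic.effective_mac.lower()

def allow_receive(nic: VirtualNic, mac_changes_deny: bool) -> bool:
    """MAC address changes = Deny: stop delivering frames once the guest changes its MAC."""
    if not mac_changes_deny:
        return True
    return nic.effective_mac.lower() == nic.configured_mac.lower()

nic = VirtualNic(configured_mac="00:50:56:aa:bb:01", effective_mac="00:50:56:aa:bb:01")
spoofed = allow_transmit(nic, "00:11:22:33:44:55", forged_transmits_deny=True)
```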
Layer 2 Security
Why “Deny MAC Address Changes” ?
Guest can change its MAC address to send spoofed frames
Guest can change its MAC address to listen to other traffic when
promiscuous mode is denied.
To restrict the VM to use only its own MAC address, enforce “Deny MAC
Address Changes” and “Deny Forged Transmits”
Deny all three options for complete layer 2 security
Layer 2 Security: Interactions
Microsoft Network Load Balancing
Deny Forged transmits will break Microsoft Network Load Balancing
operating in Unicast mode
In Unicast mode, cluster nodes use a fake MAC address for outgoing
traffic to prevent switches from learning the true MAC address. This
technique allows the incoming traffic for the cluster IP to be sent to all
the ports of the physical switch.
Layer 2 Security: Interactions
Windows IP address conflicts
Deny Forged transmits will cause machines on the network to point to
the offending machine instead of the defending machine in the case of an IP
address conflict
Windows sends a gratuitous ARP (ARP request for its own IP) to detect a
duplicate IP address. If a host responds, the IP address is a duplicate
In that event, Windows sends a forged ARP request containing the MAC
address of the defending machine. This updates the ARP tables of the
machines on the network with the MAC address of the defending machine.
Switch Notification
Client MAC address is notified to the physical switch using a RARP frame
When?
Whenever a client registers itself with the virtual switch
VM power on, VMotion, changing MAC, teaming status change, etc.
Why?
Allows the physical switch to learn the MAC immediately
Why RARP?
L2 broadcast reaches every switch
Doesn’t disrupt ARP caches
L3 information is not needed to send RARP
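To make the notification concrete, the sketch below builds a broadcast RARP frame for a client MAC using the generic RARP layout from RFC 903. The exact field contents ESX places in its notification frames are not given in this talk, so the zeroed protocol addresses and the function name `build_rarp_notification` are assumptions.

```python
import struct

ETHERTYPE_RARP = 0x8035
BROADCAST = b"\xff" * 6

def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def build_rarp_notification(client_mac: str) -> bytes:
    """Build an unpadded Ethernet frame carrying a RARP 'request reverse' (opcode 3)."""
    src = mac_bytes(client_mac)
    eth_header = BROADCAST + src + struct.pack("!H", ETHERTYPE_RARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, opcode=3
    payload = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 3)
    payload += src + b"\x00" * 4  # sender hardware address, sender IP (assumed zero)
    payload += src + b"\x00" * 4  # target hardware address, target IP (assumed zero)
    return eth_header + payload

frame = build_rarp_notification("00:50:56:aa:bb:01")
```

The source MAC in the Ethernet header is what lets every physical switch along the broadcast path update its MAC table, and no IP address is required to build the frame. The frame here is not padded to the Ethernet minimum size.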
Switch Notification: VMotion
VMotion moves the VM from one switch port to another
Virtual switches on source and destination should have identical L2 security policy (VC checks this)
Source and destination port should be in the same broadcast domain (implies same VLAN)
Virtual NIC is unplugged on the source and plugged back in at the destination host, which triggers switch notification (RARP)
Load Balancing: Source MAC/Originating Port ID
Outbound NIC is chosen based on source MAC or originating port ID
Client traffic is consistently sent to the same physical NIC until there is a failover
Replies are received on the same NIC, as the physical switch learns the MAC/switch port association
Better scaling if the number of vNICs > number of pNICs
VM cannot use more than one physical NIC unless it has two or more virtual NICs
[Diagram labels: A, B, C, Virtual Switch, Physical Switch]
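A minimal sketch of port-based placement follows. The talk does not state the actual mapping function, so the modulo rule, the `pick_uplink` name, and the NIC names are assumptions chosen only to show the consequences: each virtual port sticks to one NIC, and the spread improves as the number of vNICs exceeds the number of pNICs.

```python
from typing import List

def pick_uplink(port_id: int, active_uplinks: List[str]) -> str:
    """Map a virtual port to one uplink; the modulo rule is an assumed placeholder."""
    if not active_uplinks:
        raise ValueError("no active uplinks")
    return active_uplinks[port_id % len(active_uplinks)]

uplinks = ["vmnic0", "vmnic1"]
placement = {port: pick_uplink(port, uplinks) for port in range(4)}

# After a failover only the surviving NIC remains in the active list.
after_failover = {port: pick_uplink(port, ["vmnic1"]) for port in range(4)}
```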
Load Balancing: IP Hash (out-IP)
Outbound NIC is chosen based on the source-destination L3 address pair
Scalability is dependent on the number of TCP/IP sessions to unique destinations. No benefit for bulk transfer between hosts
Physical switch will see the client MAC on multiple ports
Can disrupt MAC address learning on the physical switch
Inbound traffic is unpredictable
[Diagram labels: A, B, C, Virtual Switch, Physical Switch]
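The sketch below illustrates why a source-destination hash gives no benefit for a bulk transfer between two hosts. It assumes a simple XOR-then-modulo hash over the IPv4 addresses; the talk does not specify the hash, so treat `ip_hash_uplink` as an illustration rather than the ESX algorithm.

```python
import ipaddress

def ip_hash_uplink(src_ip: str, dst_ip: str, n_uplinks: int) -> int:
    """Index of the uplink for a source/destination IPv4 pair (assumed XOR-mod hash)."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_uplinks

# The same pair always maps to the same uplink, however much data is sent.
bulk = {ip_hash_uplink("10.0.0.1", "10.0.0.2", 2) for _ in range(1000)}

# Different destinations can spread one VM's traffic over several uplinks.
spread = {ip_hash_uplink("10.0.0.1", dst, 2) for dst in ("10.0.0.2", "10.0.0.3")}
```

Since one client MAC can now leave through more than one uplink, the physical switch sees that MAC on multiple ports, which is the MAC learning disruption noted above.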
NIC Teaming: Packet Reflections
Broadcast / multicast packets return to the VM through other NICs in the team
Most guest OSes ignore duplicate packets
Avoid NIC teaming if the VM relies on frequent broadcast / multicast packets (for example, Microsoft Network Load Balancing)
ESX 3.x filters packet reflections
Frames received on the wrong link are
• Discarded in source MAC/originating port ID mode
• Allowed in out-ip mode
[Diagram labels: A, Virtual Switch, Physical Switch, L2 Broadcast]
Link Aggregation
Allows load balancing of incoming traffic
Packet reflections are prevented: aggregated ports do not re-send broadcast / multicast traffic
Works well with out-ip since aggregated ports share a single entry in the MAC lookup table
Throughput aggregation benefits are less relevant with the advent of gigabit and 10G links
Traffic flow is unpredictable
Source MAC/source port ID load balancing mode is incompatible with link aggregation in ESX 3.x
[Diagram labels: Virtual Switch, Physical Switch]
NIC Teaming: Multi Switch Configuration
Physical NICs can be connected to different switches as long as they remain in the same broadcast domain
Physical switches should be trunked or ISL’ed
Expect problems if the port on each physical switch is configured with different VLAN/trunking options
IP-hash (out-ip) mode is not recommended
Client MAC address can appear on all the physical switches
Client MAC address can appear on trunk ports
[Diagram labels: Virtual Switch, Physical Switches, Broadcast domain]
NIC Teaming: Multi Switch With Link Aggregation
Same scenario as before, but uses link aggregation on each switch
Currently, ports from different physical switches cannot be aggregated into a single link
[Diagram labels: Virtual Switch, Physical Switches, Broadcast domain]
NIC Teaming: Failover Scenarios
Failover detection
Ethernet link failure
Switch failure (beaconing)
Fail-back
Rolling failover: No means fail-back is on
Failover order
Order of standby adapters
Unused adapters: NICs excluded from teaming
Changing the order of active adapters switches the traffic flow through the NICs
NIC Teaming: Failover Implications
Fail-back is on by default. If the link is flaky, the physical switch will
frequently see the client MAC address on multiple ports
Virtual switch uses the link as soon as it is up. The physical switch port may
not accept traffic immediately when the link comes online
To minimize delays, disable:
STP (use portfast mode instead): 30 secs
EtherChannel negotiation, like PAgP (use manual mode): 15 secs
Trunking negotiation: 4 secs
Link autonegotiation (speed/duplex settings): 2 secs
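As a rough worked example, the delays in the list above can be added up to estimate how long a physical switch port might stay unusable after the link comes up. The sketch assumes the delays occur one after another, which the talk does not state; the dictionary keys are made-up labels.

```python
# Approximate delays from the list above, in seconds.
PORT_DELAYS = {
    "stp": 30,
    "etherchannel_negotiation": 15,
    "trunking_negotiation": 4,
    "link_autonegotiation": 2,
}

def worst_case_delay(enabled: set) -> int:
    """Sum the delays of the features left enabled (assumes they add up serially)."""
    return sum(delay for name, delay in PORT_DELAYS.items() if name in enabled)

everything_on = worst_case_delay(set(PORT_DELAYS))
only_autoneg = worst_case_delay({"link_autonegotiation"})
```

Under that assumption, leaving every feature enabled costs about 51 seconds, while disabling all but link autonegotiation brings it down to about 2 seconds.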
Diagnostics: Link state
Diagnostics: Portgroup settings
Diagnostics: VMKernel TCP/IP Stats
cat /proc/vmware/net/tcpip/ifconfig
Diagnostics: vmkping
ping command uses the service console TCP/IP stack
vmkping uses the VMKernel TCP/IP stack
Diagnostics: Collecting Network Traces
Run tcpdump/ethereal/netmon inside the guest or in the service console
Traffic visibility depends on the portgroup policy settings
Portgroup with VLAN id 0 (No VLAN)
• Sees all the traffic on the virtual switch without VLAN tags
Portgroup with VLAN id ‘X’ (1-4094)
• Sees all the traffic on the virtual switch with VLAN id ‘X’
Portgroup with VLAN id 4095
• Sees all traffic on the virtual switch
• Traffic is captured with VLAN tags
Promiscuous mode
• Accept: All visible traffic
• Reject: Only traffic matching the client MAC address
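The visibility rules above can be summarized as a pair of predicates. This is one reading of the slide, not a statement of ESX internals: VLAN ID 0 is taken to mean "untagged traffic only", broadcast and multicast handling is ignored, and the function names are hypothetical.

```python
from typing import Optional

VGT_VLAN_ID = 4095

def visible(portgroup_vlan: int, frame_vlan: Optional[int]) -> bool:
    """frame_vlan is None for untagged traffic, else the frame's VLAN ID."""
    if portgroup_vlan == VGT_VLAN_ID:
        return True                      # sees all traffic, VLAN tags preserved
    if portgroup_vlan == 0:
        return frame_vlan is None        # only traffic without VLAN tags
    return frame_vlan == portgroup_vlan  # only traffic on VLAN 'X'

def captured(portgroup_vlan: int, promiscuous_accept: bool, client_mac: str,
             frame_vlan: Optional[int], frame_dst_mac: str) -> bool:
    """Would a capture tool on this client see the (unicast) frame?"""
    if not visible(portgroup_vlan, frame_vlan):
        return False
    return promiscuous_accept or frame_dst_mac.lower() == client_mac.lower()

me = "00:50:56:aa:bb:01"
other = "00:50:56:aa:bb:02"
sniffed = captured(5, True, me, 5, other)   # promiscuous Accept on VLAN 5
missed = captured(5, False, me, 5, other)   # promiscuous Reject: not our MAC
```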
Questions ?
Presentation Download
The presentation for this session can be downloaded at
http://www.vmware.com/vmtn/vmworld/sessions/
Enter the following to download (case-sensitive):
Username: cbv_rep
Password: cbvfor9v9r
Some or all of the features in this document may be representative of
feature areas under development. Feature commitments must not be
included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.