Developing SIP
Jonathan Cumming
Director, VoIP Product Management
Data Connection (DCL)
Meet us at booth 211
Agenda
Designing the right product
Target markets and device characteristics
Getting to market efficiently
Development choices
Diagnosing problems
Designing for Scale
SIP load balancing mechanisms
Designing for High-Availability
Remote failures
Local failures
Q&A
Data Connection Ltd. (DCL)
Networking Protocols Division: protocol software for OEMs
Internet Applications Division: communication application software for SPs
Enterprise Connectivity Division: SNA software for OEMs and Enterprises
Class 4/5 softswitch solutions for IOCs/CLECs: 300+ deployments, 3m subscriber capacity
Market Segments
Which applications?
Voice, Video, Instant Messaging, Presence, Gaming
Which SIP variant?
What type of customer and device?
Carrier, Enterprise, Consumer
Device Characteristics
What feature set?
Endpoint vs. Proxy vs. B2BUA
TCP, TLS, SCTP, SigComp
What scale?
Initial footprint vs. scalability
How reliable?
Occasional reboot/crash acceptable
5x9s => Fault-tolerant architecture
How secure?
What types of attack are likely?
What is the risk and impact of DoS attack?
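To make the "5x9s" target above concrete, the short calculation below converts an availability percentage into a yearly downtime budget. It is only a sketch: the function name and the 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime budget, in minutes per year, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

Five nines leaves roughly five minutes of downtime a year, which is why a device that is allowed an occasional reboot and a fault-tolerant device are different design targets.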
Development Choices
Platform choices
O/S: POSIX
Hardware: ATCA, CompactPCI, Network Processors, Multi-core
HA Middleware
Increased off-the-shelf integration
Ensure that all components meet your requirements
Make vs. Buy : Open source vs. Commercially licensed
Advanced features
Scalability and High Availability
Extensibility to support new features
Comprehensive diagnostics
Guaranteed support to minimize total cost of ownership
Timely enhancements to support new functionality
Problem diagnosis and fixes for interoperability issues and bugs
Help with application design
Diagnosing Problems
Wide range of issues
Interoperability, e.g. NAT
Crashes, i.e. service outage
Performance: QoS, DoS
Requires comprehensive diagnostics
Effective runtime filtering
Intra-component: trace of execution, FSM history
Inter-component: inter-component tracing
Event Logs: Developer, Information, Warnings and Problems
Device-level: external line trace, e.g. WireShark / Ethereal
[Diagram axis: Development Environment to Field Use]
Inter-Component Tracing
Components
Chronology
Separate Source & Destination Time Stamps
Details
Time Stamps
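The trace fields listed above (components, chronology, separate source and destination time stamps, details) could be modelled roughly as follows. This is a hedged sketch, not DCL's trace format: the record layout, the `filter_trace` helper and the assumption that both components share a comparable clock are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TraceRecord:
    source: str         # component that sent the message
    destination: str    # component that received it
    sent_at: float      # time stamp taken by the source (seconds)
    received_at: float  # time stamp taken by the destination (seconds)
    detail: str         # free-form description of the message

    @property
    def transit_time(self) -> float:
        """Time the message spent between the two components."""
        return self.received_at - self.sent_at


def filter_trace(records: Iterable[TraceRecord],
                 component: Optional[str] = None,
                 min_transit: float = 0.0) -> List[TraceRecord]:
    """Runtime filter: keep records touching `component` that took at least
    `min_transit` seconds, returned in chronological order of sending."""
    selected = [
        r for r in records
        if (component is None or component in (r.source, r.destination))
        and r.transit_time >= min_transit
    ]
    return sorted(selected, key=lambda r: r.sent_at)
```

Keeping both time stamps lets a reader separate queueing delay between components from processing time inside one of them, which a single time stamp cannot show.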
Designing for Scale
[Chart: CPU Utilization. % utilization (0 to 35) against calls per second (0 to 240) for B2BUA, Stateless Proxy, Trx Stateful Proxy and Call Stateful Proxy]
Faster processor
Distribution of software components
May be limited by software and hardware bottlenecks
Requires modular software architecture
Suitable for multi-card and SMP systems
Distribution to multiple devices
Two distinct scenarios
Out of dialog requests
Dialogs
One of SIP's key strengths: applies equally to proxies and endpoints
Load-balancing is a specific form of distribution
Distribution Principles
Configuration
Out of band mechanism
Entered by user
DHCP
Redirection
Initial response indicates nominated server
DNS
SIP 3xx response
Proxy
Initial request forwarded
Creates path for future requests
Direct
Via proxy
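The redirection option above (the initial response indicates the nominated server) can be illustrated with a SIP 3xx response. The sketch below builds a simplified 302 by hand. The `build_redirect` helper, the default tag value and the example.com addresses are hypothetical, and a real stack would also handle multiple Via headers and other header fields.

```python
from typing import Dict


def build_redirect(request_headers: Dict[str, str], nominated_uri: str,
                   to_tag: str = "redir1") -> str:
    """Build a 302 response that points the sender at the nominated server."""
    to_value = request_headers["To"]
    if ";tag=" not in to_value:
        # The responding server adds a tag to the To header of its response.
        to_value += f";tag={to_tag}"
    lines = [
        "SIP/2.0 302 Moved Temporarily",
        f"Via: {request_headers['Via']}",
        f"From: {request_headers['From']}",
        f"To: {to_value}",
        f"Call-ID: {request_headers['Call-ID']}",
        f"CSeq: {request_headers['CSeq']}",
        f"Contact: <{nominated_uri}>",  # where the sender should retry
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    request = {
        "Via": "SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "From": "<sip:alice@example.com>;tag=1928301774",
        "To": "<sip:bob@example.com>",
        "Call-ID": "a84b4c76e66710@client.example.com",
        "CSeq": "314159 INVITE",
    }
    print(build_redirect(request, "sip:bob@server2.example.com"))
```

With redirection the distributing device drops out after one response, whereas the proxy option keeps it involved at least until the path for future requests has been created.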
Registrations
Registration is a heavy load
Soft-state: regular re-registrations for all devices
Also used to maintain NAT/Firewall pinholes
Distributing initial REGISTER request
Static configuration to use different registrars
DNS
Multicast
Distributing subsequent out-of-dialog requests
First-hop security
Initial registration establishes secure tunnel to nominated server
Subsequent messages use this tunnel, overriding other routing
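As an illustration of using DNS to spread initial REGISTER requests, the sketch below picks a registrar from a set of SRV records by priority and weight. It is a simplification of the SRV selection rules rather than a full implementation, and the record values and example.com host names are invented for the example.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share of traffic within one priority
    port: int
    target: str


def pick_server(records: List[SrvRecord],
                rng: Optional[random.Random] = None) -> SrvRecord:
    """Choose one record: lowest priority first, then weighted at random.

    Simplification: a zero-weight record is never chosen while a
    non-zero-weight record of the same priority exists.
    """
    if rng is None:
        rng = random.Random()
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical answer to an SRV query for _sip._udp.example.com
    answer = [
        SrvRecord(10, 3, 5060, "registrar1.example.com"),
        SrvRecord(10, 1, 5060, "registrar2.example.com"),
        SrvRecord(20, 1, 5060, "backup.example.com"),
    ]
    print(pick_server(answer).target)
```

Because every device makes its own choice, the load is spread without any central distributor, but only for the initial request: once first-hop security has set up a tunnel, later messages follow the tunnel rather than DNS.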
Dialogs
Distributing dialog-establishing requests
Static configuration to use different proxies or servers
Service-Route header
Returned on REGISTER response
Causes all requests from a given endpoint to be routed via nominated proxies.
Redirection and Proxy
DNS and 3xx responses
Distributing in-dialog requests
Contact and Record-Route headers
Returned on response to dialog-creating request
Directs all requests within the dialog to be routed via nominated proxies to the nominated server
DNS use is limited to stateless devices
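The way Contact and Record-Route steer in-dialog requests can be sketched from the caller's side: the Record-Route values from the response, reversed, become the route set, and the Contact becomes the request target. The helper name and host names below are hypothetical, and the sketch assumes every proxy uses loose routing (the `;lr` parameter).

```python
from typing import Dict, List, Union


def build_dialog_route(record_route: List[str],
                       contact: str) -> Dict[str, Union[str, List[str]]]:
    """Derive in-dialog routing from the response to a dialog-creating request.

    `record_route` is the list of Record-Route values in the order they
    appear in the response; the caller uses them in reverse order.
    """
    route_set = list(reversed(record_route))
    return {
        "request_uri": contact,  # the nominated server (remote target)
        "route": route_set,      # the nominated proxies, nearest first
    }


if __name__ == "__main__":
    routing = build_dialog_route(
        ["<sip:p2.example.com;lr>", "<sip:p1.example.com;lr>"],
        "sip:bob@server7.example.com",
    )
    print("Request-URI:", routing["request_uri"])
    for hop in routing["route"]:
        print("Route:", hop)
```

A server that returns its own address in Contact therefore pins the rest of the dialog to itself, which is what lets a stateful device receive every request for the calls it holds.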
Example: External load balancer
Advantages
Single external IP address
Can also provide security services at network border (SBC)
Can hide internal topology
Potential Pitfalls
Bottleneck
Single point of failure
IP load balancer
Simple => cheap, fast
Limited value, as breaks non-trivial flows, e.g. call transfer
SIP load balancer
Pure SIP proxy provides limited security
SBC function can break more complex flows
Example: Multi-card chassis
Distributor(s)
Path of initial message
IP Router or NAT
Path of subsequent messages
Advantages
Single external IP address (optional)
Supports additional cards without changes to external configuration
Removes bottleneck
Distributor intelligently routes initial requests
Distributor not on path of subsequent requests
IP router or NAT distributes subsequent messages
Distributor does not need to be full SIP proxy
If SIP software has modular, distributable architecture
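The slide does not say how the distributor chooses a card for an initial request. One simple possibility, shown here purely as an assumption, is to hash the Call-ID so that retransmissions of the same initial request always reach the same card. The card names are hypothetical.

```python
import hashlib
from typing import List


def choose_card(call_id: str, cards: List[str]) -> str:
    """Map an initial request to a card by hashing its Call-ID.

    The same Call-ID always maps to the same card for a given card list,
    so a retransmitted initial request is not split across cards.
    """
    if not cards:
        raise ValueError("at least one card is required")
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(cards)
    return cards[index]


if __name__ == "__main__":
    cards = ["card-1", "card-2", "card-3"]
    for call_id in ("a84b4c76e66710@host-a", "f81d4fae7dec11d0@host-b"):
        print(call_id, "->", choose_card(call_id, cards))
```

Adding a card changes the mapping only for new initial requests. Established dialogs are unaffected because, as noted above, subsequent messages reach their card through the IP router or NAT rather than through the distributor.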
Designing for High Availability
Remote failures
Detecting service availability
Handling remote failures
Local failures
Scope of effect
Designing appropriate availability
Detecting Service Availability
May be implemented in several layers
Application: SIP service request
SIP: SIP transport response (100 Trying)
TCP/IP: TCP keep alive
Network: PING / ICMP (IP reachability)
[Diagram: Endpoint, Proxy and Router]
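Checking availability in several layers can be sketched as a chain of probes run from the lowest layer upwards, reporting the first layer that fails. This is an illustrative sketch only: the probe list is supplied by the caller, the function names are invented, and a plain TCP connect attempt stands in for a TCP keep-alive.

```python
import socket
from typing import Callable, List, Optional, Tuple

Probe = Callable[[], bool]


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_failed_layer(probes: List[Tuple[str, Probe]]) -> Optional[str]:
    """Run probes in order (lowest layer first).

    Returns the name of the first layer whose probe fails, or None if
    every layer answers.
    """
    for layer, probe in probes:
        if not probe():
            return layer
    return None
```

A caller might pass `[("TCP/IP", lambda: tcp_reachable("proxy.example.com", 5060)), ("SIP", send_options)]`, where `send_options` is a hypothetical function that sends a SIP request and waits for any response. A failure at a low layer makes the higher-layer probes pointless, while success at a low layer says nothing about whether the service above it is working.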
Handling Remote Failures
Cannot determine failure scope from error responses
Intelligence is distributed => cannot relate errors to topology
No mechanism to reduce load
SIP specification causes cascaded failure on overload
Work-in-progress: draft-ietf-sipping-overload-reqs-00
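One contributing factor to cascaded failure, offered here as an illustration rather than as the speaker's stated reasoning, is retransmission: a client sending an INVITE over UDP to a server too busy to answer resends it at doubling intervals until the transaction times out. The sketch below computes those send times from the base SIP timer T1 (500 ms by default).

```python
from typing import List


def invite_transmission_times(t1: float = 0.5) -> List[float]:
    """Times (in seconds) at which an unanswered INVITE is sent over UDP.

    The retransmission interval starts at T1 and doubles after each
    retransmission; the transaction gives up after 64 * T1.
    """
    times = [0.0]            # the original transmission
    interval = t1            # first retransmission interval
    deadline = 64 * t1       # transaction timeout
    next_time = interval
    while next_time < deadline:
        times.append(next_time)
        interval *= 2
        next_time += interval
    return times


if __name__ == "__main__":
    times = invite_transmission_times()
    print(f"{len(times)} transmissions at {times}")
```

With the default T1 an overloaded server that stays silent receives each INVITE seven times within 32 seconds, so the offered load rises just as its capacity falls.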
Handling Local Failures
(✓ = calls continue, ✗ = calls lost)
                   Existing calls   New calls
Service outage     ✗                ✗
Alternate server   ✗                ✓
Handling Local Failures
(✓ = calls continue, ✗ = calls lost)
                   Existing calls   New calls
Service outage     ✗                ✗
Alternate server   ✗                ✓
Hot standby        ✓                ✓
Hot standby relies on state replication
Hot standby
State replication enables failover without loss of stable calls
New calls may use alternative server during failover
In-service upgrade/downgrade for continuous operation during maintenance
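The idea that state replication preserves stable calls across a failover can be sketched with a toy primary/backup pair. Everything here is an assumption made for illustration (the class names, the "setup" and "stable" state labels, the in-memory replication); it is not a description of DCL's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CallServer:
    name: str
    calls: Dict[str, str] = field(default_factory=dict)  # Call-ID -> state


class HotStandbyPair:
    def __init__(self, primary: CallServer, backup: CallServer) -> None:
        self.active = primary
        self.standby = backup

    def update_call(self, call_id: str, state: str) -> None:
        """Record a call state on the active server.

        Only stable calls are replicated to the standby in this sketch.
        """
        self.active.calls[call_id] = state
        if state == "stable":
            self.standby.calls[call_id] = state

    def end_call(self, call_id: str) -> None:
        self.active.calls.pop(call_id, None)
        self.standby.calls.pop(call_id, None)

    def fail_over(self) -> CallServer:
        """Promote the standby and create a fresh standby with copied state."""
        failed = self.active
        self.active = self.standby
        self.standby = CallServer(failed.name + "-restarted",
                                  dict(self.active.calls))
        return self.active
```

In this model a call that was still being set up when the primary failed is lost and has to be retried, possibly on an alternative server, while every call that had reached the stable state continues on the promoted standby.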
Implementing Hot Standby
System Manager
Creates backup process if required
Initiates replication procedures
Handles failovers
[Diagram: a Management Component in the Management Plane sits above a Primary Line Card and a Backup Line Card in the Data/Protocol Plane. Each line card shows the Customer's HW Manager, the DCL System Manager, DCL Component A and DCL Component B. Legend: active connections; inactive connections; keep alive et al; state and/or configuration replication.]
Real-World Example
Call signalling: geographic redundancy, local redundancy
Media processing: geographic distribution, local distribution
Signaling processing centralized
Economies of scale by reducing number of centers
Media processing close to customer
Shorter media path reduces latency
Distribution reduces impact of single failure
Conclusions
SIP provides a very powerful architecture
Many different uses and variants
Different solutions for different applications
Different markets and device characteristics
Good support and diagnostics are key to success
High availability and scalability are a challenge
Inherently complex area
Cost vs. benefit trade-off
Come and talk to us about your requirements
Our software is designed for scalability and High Availability
Deployed and field-hardened around the world
Questions?
jonathan.cumming@dataconnection.com