A Comprehensive Survey of Voice Over IP Security Research: Angelos D. Keromytis, Senior Member, IEEE
Abstract—We present a comprehensive survey of Voice over IP security academic research, using a set of 245 publications forming a closed cross-citation set. We classify these papers according to an extended version of the VoIP Security Alliance (VoIPSA) Threat Taxonomy. Our goal is to provide a roadmap for researchers seeking to understand existing capabilities and to identify gaps in addressing the numerous threats and vulnerabilities present in VoIP systems. We discuss the implications of our findings with respect to vulnerabilities reported in a variety of VoIP products.
We identify two specific problem areas (denial of service, and service abuse) as requiring significantly more attention from the research community. We also find that the overwhelming majority of the surveyed work takes a black-box view of VoIP systems that avoids examining their internal structure and implementation. Such an approach may miss the mark in terms of addressing the main sources of vulnerabilities, i.e., implementation bugs and misconfigurations. Finally, we argue for further work on understanding cross-protocol and cross-mechanism vulnerabilities (emergent properties), which are the byproduct of a highly complex system-of-systems and an indication of the issues in future large-scale systems.
Index Terms—VoIP, SIP, security
I. INTRODUCTION
VoIP refers to a class of products that enable advanced communication services over data networks. While voice is a key aspect in such products, video and other capabilities (e.g., collaborative editing and whiteboard sharing, file sharing, calendaring) are supported. The key advantages of VoIP are flexibility and low cost. The former derives from the (generally) open architectures and software-based implementation, while the latter is due to new business models, equipment and network-link consolidation, and ubiquitous consumer-grade broadband connectivity.
Due to these benefits, VoIP has seen rapid uptake in both the enterprise and consumer markets. An increasing number of enterprises are replacing their internal phone switches with VoIP-based implementations, both to introduce new features and to eliminate redundant equipment. Consumers have embraced a slew of technologies with different features and costs, including P2P calling (Skype), Internet-to-PSTN network bridging, and wireless VoIP. These new technologies and business models are being promoted by a new generation of startup companies that are challenging the traditional status quo in telephony and personal telecommunications. As a result, a number of PSTN providers have already completed or are in the process of transitioning from circuit-switched networks to VoIP-friendly packet-switched backbones. Finally, as the commercial and consumer sectors go, so do governments and militaries due to cost reduction concerns and the general dependence on Commercial Off The Shelf (COTS) equipment for the majority of their computing needs.
Because of the need to seamlessly interoperate with the existing telephony infrastructure, the new features, and the speed of development and deployment, VoIP protocols and products have been repeatedly found to contain numerous vulnerabilities [1], [2], [3] that have been exploited [4], [5], [6]. As a result, a fair amount of research has been directed towards addressing some of these issues. However, the effort is unbalanced, with little effort spent on some highly deserving problem areas.
This comprehensive survey covers 245 VoIP security research papers and books, complementing our previous work that analyzed known vulnerabilities [1], [2], [3]. Our primary goal is to create a roadmap of existing work in securing VoIP, towards reducing the start-up effort required by other researchers to initiate research in this space. A secondary goal is to identify gaps in existing research, and to help inform the security community of challenges and opportunities for further work. Finally, in the context of the VAMPIRE project¹ we seek to provide guidance as to what further work is needed to better understand and analyze the activities of attackers.
We classify these papers according to the class of threat they seek to address, using an extended version of the VoIP Security Alliance (VoIPSA) [7] threat taxonomy. We discuss our findings, and contrast them with our previous survey on VoIP vulnerabilities.
Paper Organization: Section II provides a brief overview of SIP, perhaps the most popular VoIP technology currently in use. Section III summarizes the threat taxonomy as defined by the VoIP Security Alliance. Our survey of the research literature is given in Section IV. We then discuss our findings in Section V.
II. SIP OVERVIEW
We focus our attention on Session Initiation Protocol (SIP) [8], a popular and widely deployed technology. Most research has focused on SIP, primarily because of its wide use and the
Author's address: Angelos D. Keromytis, Department of Computer Science, Mail Code 0401, Columbia University in the City of New York, New York, NY 10027, USA. The bulk of this work was conducted while the author was on sabbatical leave with Symantec Research Labs, France. This work was supported in part by the French National Research Agency (ANR) under Contract ANR-08-VERS-017 and by the US National Science Foundation under Grant CNS-09-14845.
¹ http://vampire.gforge.inria.fr/
2 IEEE COMMUNICATIONS SURVEYS & TUTORIALS
Fig. 1. Session Initiation Protocol (SIP) entity interactions. User Alice registers with her domain's Registrar (1), which stores the information in the Location Server (2). When placing a call, Alice contacts her local Proxy Server (3), which may consult the Location Server (4). A call may be forwarded to another Proxy Server (5), which will consult its domain Location Server (6) before forwarding the call to the final recipient. After the SIP negotiation terminates, RTP is used directly between Alice and Bob to transfer media content. For simplicity, this diagram does not show the possible interaction between Alice and a Redirection Server (which would, in turn, interact with the Location Server).
Fig. 2. Message exchanges during a SIP-based two-party call setup. (Surviving diagram labels: ACK Bob, route Bob@10.0.0.1; BYE Bob@10.0.0.1; OK.)
Fig. 3. SIP Digest Authentication. Message sequence among Alice, the Registrar/Proxy of Domain D1, and Bob:
  Alice -> Proxy: INVITE sip:Bob@D2
  Proxy -> Alice: 407 Authentication Required; Proxy-Authenticate: Digest algorithm=MD5, realm="D1", nonce="12cc9a63"
  Alice -> Proxy: ACK sip:Bob@D2
  Alice -> Proxy: INVITE sip:Bob@D2; Proxy authorization: Digest username="Alice", realm="D1", uri="sip:Bob@D2", response="12acb23970af", nonce="12cc9a63", algorithm=MD5
  Proxy -> Bob: INVITE sip:Bob@D2
  Bob -> Proxy -> Alice: RINGING
  Bob -> Proxy -> Alice: OK
  Alice -> Proxy -> Bob: ACK sip:Bob@D2
  Alice <-> Bob: Media Transfer (RTP)
and SIP, media gateways may disrupt the end-to-end nature of the media transfer. These entities translate content (e.g., audio) between the formats that are supported by the different networks.
Because signaling and media transfer operate independently, the endpoints are responsible for indicating to the proxies that the call has been terminated, using a BYE message which is relayed through the proxies along the same path as the call setup messages.
There are many other protocol interactions supported by SIP that cover many common (and uncommon) scenarios, including call forwarding (manual or automatic), conference calling, voicemail, etc. Typically, this is done by semantically overloading SIP messages such that they can play various roles in different parts of the call. We shall see in Section III examples of how this flexibility and protocol modularity can be used to attack the system.
SIP traffic is typically transmitted over port 5060 (UDP or TCP), although the port can vary based on configuration parameters. The ports used for the media traffic, however, are dynamic and negotiated via SDP during call setup. This poses some problems when Network Address Translation (NAT) or firewalls are traversed. Typically, these have to be stateful and understand the SIP exchanges so that they can open the appropriate RTP ports for the media transfer. In the case of NAT traversal, endpoints may use protocols like STUN to enable communication. Alternatively, the Universal Plug-and-Play (uPnP) protocol² may be used in some environments, such as residential broadband networks consisting of a single subnet behind a NAT gateway.
For authenticating endpoints, the registrar and the proxy typically use HTTP Digest Authentication, as shown in Figure 3. This is a simple challenge-response protocol that uses a shared secret key along with a username, domain name, a nonce, and specific fields from the SIP message to compute a cryptographic hash. Passwords are not transmitted in plaintext form over the network. It is worth noting that authentication may be requested at almost any point during a call setup. We shall later see an example where this can be abused by a malicious party to conduct toll fraud in some environments.
For more complex authentication scenarios, SIP can use S/MIME encapsulation [18] to carry complex payloads, including public keys and certificates. When TCP is used as the transport protocol for SIP, TLS can be used to protect the SIP messages. TLS is required for communication among proxies, registrars and redirect servers, but only recommended between endpoints and proxies or registrars. Alternatively, IPsec [19] may be used to protect all communications, regardless of the transport protocol. However, because few implementations integrate SIP, RTP and IPsec, it is left to system administrators to set up and manage such configurations.
III. VOIP THREAT CLASSIFICATION
To classify the surveyed work, we use the taxonomy provided by the Voice over IP Security Alliance (VoIPSA)³. VoIPSA is a vendor-neutral, not-for-profit organization composed of VoIP and security vendors, organizations and individuals with an interest in securing VoIP protocols, products and installations. The VoIPSA security threat taxonomy [7] aims to define the security threats against VoIP deployments, services, and end users. The key elements of this taxonomy are:
1) Social threats are aimed directly against humans. For example, misconfigurations, bugs or bad protocol interactions in VoIP systems may enable or facilitate attacks that misrepresent the identity of malicious parties to users. Such attacks may then act as stepping stones to further attacks such as phishing, theft of service, or unwanted contact (spam).
2) Eavesdropping, interception, and modification threats cover situations where an adversary can unlawfully and without authorization from the parties concerned listen in on the signaling (call setup) or the content of a VoIP session, and possibly modify aspects of that session while avoiding detection. Examples of such attacks include call re-routing and interception of unencrypted RTP sessions.
3) Denial of service threats have the potential to deny users access to VoIP services. This may be particularly problematic in the case of emergencies, or when a DoS attack affects all of a user's or organization's communication capabilities (i.e., when all VoIP and data communications are multiplexed over the same network which can be targeted through a DoS attack). Such attacks may be VoIP-specific (exploiting flaws in the call setup or the implementation of services), or VoIP-agnostic (e.g., generic traffic flooding attacks). They may also involve attacks with physical components (e.g., physically disconnecting or severing a cable) or through computing or other infrastructures (e.g., disabling the DNS server, or shutting down power).
² http://www.upnp.org/
³ http://www.voipsa.org/
4) Service abuse threats cover the improper use of VoIP [...] setting. Examples of such threats include toll fraud and [...]
[...] to the physical layer of the network (following the ISO [...]
[...] problems that may nonetheless cause VoIP services to become unusable or inaccessible. Examples of such threats include loss of power due to inclement weather, resource exhaustion due to over-subscription, and per- [...]
Figure 4 graphically depicts our overall classification scheme, annotated with the number of items in each category.
Fig. 4. (Diagram not recoverable; surviving labels: Social Threats (43); Reputation, behavior, and identity (25); Anti-SPIT architectures (7); Denial of Service (31); DoS (1); P2P SIP (1); Additional Categories (134 items); Field Studies and System/Protocol Analysis (12); Performance Analysis (14); Authentication Protocols (15); Middleboxes (11).)
[...]
In addition, a few more papers were suggested by anonymous reviewers, as part of the review process for this paper.
ANGELOS D. KEROMYTIS: A COMPREHENSIVE SURVEY OF VOIP SECURITY RESEARCH 5
The same algorithm was used, and expanded the 3 initial suggestions to 12 total additional papers. In the process, we discovered a case of plagiarism for papers that were published 6 years apart; we notified the authors and journal editors involved.
In order to avoid a lopsided distribution of papers (and an infinite expansion), we did not include in the collection papers that were deemed of only peripheral relevance to VoIP or VoIP security. The result of this process (modulo any papers inadvertently missed) was 245 publications.
In the following two sections, we discuss the related work using the extended VoIPSA taxonomy, as described in Section III. For each classification area, we give the paper count as a crude indication of the level of activity.
B. VoIPSA-based Classification (111 items)
We now discuss the work that fits naturally within the first four categories of the VoIPSA taxonomy, which constitutes 45% of the surveyed papers. All other work is further classified and discussed in Section IV-C.
1) Social threats (43 items): The majority of work in this area focuses on SPam over Internet Telephony (SPIT) detection and prevention, although there are other items included in this category as well (e.g., secure principal binding). We have broken down the work based on the general technical approach taken, and discuss the work in rough chronological order within each thrust; we use the same approach in the remainder of the text. As we can see, the majority of work has focused on reputation and behavior-based approaches.
a) Reputation, behavior, and identity (25 items): Srivastava and Schulzrinne [20] describe DAPES, a system for blocking SPIT calls and instant messages based on several factors, including the origin domain of the initiator (caller), the confidence level in the authentication performed (if any), whether the call is coming through a known open proxy, and a reputation system for otherwise unknown callers. They give an overview of other reputation-based systems and compare them with DAPES.
Dantu and Kolan [21] show that the velocity and acceleration (first- and second-order derivatives of the number) of incoming calls from a user, host or domain can be used as a detection mechanism for high-volume SPIT. Once either of these values exceeds a threshold, related calls can be dropped. The same method can also mitigate certain VoIP-based denial of service attacks.
MacIntosh and Vinokurov [22] propose a statistical detection algorithm for SPIT that can be implemented at the receiver's server. For each external entity that communicates with local users, their system keeps track of the number of call setups and terminations in both directions (incoming and outgoing). Simultaneous deviation of two or more of these counters from their assumed long-term averages supposedly indicates spam activity, with confidence increasing as the deviation widens. The approach assumes that attackers cannot rapidly change their identity.
Croft and Olivier [23] propose extending the call setup process by adding a “call me back” scheme using a Verifying Authority (VA) and a Mediator. The Mediator acts as a call bridge, allowing the call to connect only once the VA has approved it (possibly based on policy and on such information as caller/callee identities, location and time of the call, etc.). The user receives the “call me back” request from the VA, and decides whether to proceed with picking up the call based on local policy and other information (e.g., CallerID).
Dantu and Kolan [24], [25] describe the Voice Spam Detector (VSD), a multi-stage SPIT filter based on trust, reputation, and feedback among the various filter stages. The primary filter stages are call pattern and volume analysis, black and white lists of callers, per-caller behavior profile based on Bayesian classification and prior history, and reputation information from the callee's contacts and social network. They provide a formal model for trust and reputation in a voice network, based on intuitive human behavior. They evaluate their system in a laboratory experiment using a small number of real users and injected SPIT calls.
Rebahi and Sisalem [26] develop the concept of the “SIP social network” as a means for managing reputation toward countering SPIT. However, no experimental evaluation or validation of any of these schemes is performed. Rebahi et al. [27] extend the previous work by proposing two schemes for protecting against SPIT and SPIM (spam over instant messaging). The first uses reputation, with users indicating how much “trust” they have in the persons in their contact lists. These lists (and the trust values) are posted in a directory, where others can access them upon receiving a call from a previously unknown (to them) entity. This scheme requires that every user's contact information be published, and that attackers cannot mask or change their identities. The second scheme is built around the notion of “payment at risk”, wherein a caller may be required to deposit a small amount to a SIP server prior to placing a call, depending on the callee's or the SIP proxy's policy. If the user indicates that the call was SPIT, the payment is then forfeit.
Hansen et al. [28] present SPIT-AL, an anomaly detection system seeking to identify SPIT calls. Their system takes into consideration information about the caller (such as CallerID, IP address, whitelists/blacklists, etc.) and the call (e.g., time), and allows for different responses (grey-listing, audio CAPTCHA, etc.). A key element of their architecture is that users manage their own rules and responses, in order to comply with the various German telecommunication laws.
Baumann et al. [29] give an overview of SPIT threats and various defense mechanisms. They then propose to prevent Sybil attacks in SIP by binding user identities to biometric information (specifically voice fingerprint) that is stored in global servers. Users wishing to place calls must first prove their identity, thereafter receiving credentials that can be used to place calls.
Madhosingh [30] integrates white and black lists with CAPTCHAs for those callers that are not a priori known (and included in a whitelist or a blacklist). If the test is passed, the call is allowed through. However, such callers are not allowed to leave voicemail messages on the callee's system; instead, such messages are stored on the caller's local SIP server, and the callee is sent an indication about the availability of a voicemail and instructions on how to retrieve it.
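As a rough illustration of the velocity-and-acceleration idea attributed to Dantu and Kolan [21] earlier in this subsection, the following Python sketch approximates the first- and second-order derivatives of the incoming call count by finite differences over fixed-length intervals, and flags a source once either value exceeds a threshold. The interval length, the function names, and the threshold values are assumptions made for this example; they are not taken from [21].

```python
from typing import List, Sequence


def call_velocity(counts: Sequence[int]) -> List[int]:
    """First-order finite difference of per-interval call counts.

    counts[i] is the number of calls seen from one source (user, host,
    or domain) during the i-th fixed-length interval (assumed layout).
    """
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def call_acceleration(counts: Sequence[int]) -> List[int]:
    """Second-order finite difference (difference of the velocities)."""
    return call_velocity(call_velocity(counts))


def should_drop(counts: Sequence[int],
                max_velocity: int,
                max_acceleration: int) -> bool:
    """Flag the source once either derivative exceeds its threshold.

    Both thresholds are hypothetical tuning parameters.
    """
    too_fast = any(v > max_velocity for v in call_velocity(counts))
    ramping = any(a > max_acceleration for a in call_acceleration(counts))
    return too_fast or ramping
```

A deployment along these lines would presumably keep one short sliding window of counts per source and call `should_drop` on each new interval; how the thresholds are chosen is left open here.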
Bertrand et al. [31] propose an anomaly detection technique for identifying and blocking SPIT that creates caller profiles based on their IP address. The criteria used in their analysis include the number of received error messages, the use of a directory service, whether multiple calls are placed by the same caller, the duration of calls and the variance of call duration across multiple calls, and the number of simultaneous incoming calls (from multiple different users) to the same user. In response to an identified SPIT call, their system rate limits call delivery, temporarily blacklists the most aggressive callers, or redirects the call to a voicemail or other automated system that notifies the caller of the problem. They propose implementing this functionality in the network, where it will work together with routers. Because of this choice, their system must do real-time layer 7 reconstruction and analysis of traffic, which in turn requires hardware support to keep up with the volume. Such a system should be able to handle up to 10^6 simultaneous sessions, for 10^5 subscribers, with 10^4 incoming calls per second. They present a prototype Java-based implementation running on Linux, using the netfilter and iptables components to divert and block traffic. Their performance evaluation shows that this prototype can handle 80 incoming calls per second, adding approximately 5 ms to the average 5.8-second call establishment time.
Yan et al. [32] argue for the use of active fingerprinting in SPIT prevention systems. Protocol implementations interpret the standards in slightly different ways, especially with respect to indicating errors. Thus, it is possible to identify the implementation of a peer SIP device by observing its responses to a set of specially crafted messages. These may be either standards-compliant or non-compliant. By creating a number of different tests, it is possible to actively fingerprint a remote SIP device that is trying to initiate a call. Their conjecture is that malicious SIP user agents will not be able to mimic legitimate stacks because of the diversity in possible responses, and because often such tools implement only a subset of SIP. In their analysis, they were able to create unique fingerprints for 20 different SIP devices. The system evaluation was limited to a performance (throughput) oriented experiment using PlanetLab.
Balasubramaniyan et al. [33] propose to use call duration and social network graphs to establish a measure of reputation for callers. Their intuition is that users whose call graph has a relatively small fan-out and whose call durations are relatively long are less likely to be spammers. Conversely, users who place a lot of very short calls are likely to be engaging in SPIT. Furthermore, spammers will receive few (if any) calls. Their system works both when the parties in a call have a social network link between them and, by assigning global reputation scores, when no such link exists. Users that are mistakenly categorized as spammers are redirected to a Turing test, allowing them to complete the call if they answer correctly. In a simulation-based evaluation, the authors determine that their system can achieve a false negative rate of 10% and a false positive rate of 3%, even in the presence of large numbers of spammers.
Ono and Schulzrinne [34] propose the use of weak social ties as a means to label calls with unknown or incomplete caller ID information, in conjunction with a blacklist/whitelist filtering scheme. As one specific mechanism, they describe the use of weakly secret information that a user makes available to potential callers, who must then use that information in future calls. Another similar technique involves callers providing contact/identifying information to potential future callees. Both of these schemes exploit cross-media interactions, leveraging the fact that most calls are associated with some other interaction between the caller and callee entities. For example, an e-commerce web site may accept such a weak secret from a customer, or provide one (depending on the scheme); this secret would then be used when calling the customer in the future.
Guang-Yu et al. [35] describe a multi-layer SPIT detection and prevention architecture that takes into consideration the behavioral characteristics of specific types of SPIT campaigns, starting from the reconnaissance phase.
Patankar et al. [36] compare two SPIT detectors derived from the email spam domain. One of these techniques is based on user reputation through a referral social network model, while the other assigns a trust value to incoming SIP messages based on their direct prior interactions with the caller. Their simulations indicate that the referral-based model is more effective, correctly identifying SPIT in over 98% of cases. In an environment with little-to-moderate amounts of SPIT, this would likely be sufficient by itself. If the level of SPIT approaches the current (circa 2010) levels of email spam, then additional filtering/blocking mechanisms would have to be employed.
Wu et al. [37] apply semi-supervised clustering to call parameters (with optional user feedback) in order to distinguish SPIT from non-SPIT calls. The evaluation, which was done using manually created call traces, shows that the approach is scalable (in the number of calls) and offers reasonable detection performance. Hyung-Jong et al. [38] describe a behavior-based system that seeks to identify likely SPIT callers.
Sorge and Seedorf [39] apply reputation techniques to the SPIT problem, by evaluating the quality of information (tags) attached to outgoing calls by the callers' SIP-based service provider (SSP). Their scheme allows receiving SIP providers to evaluate the likelihood of a call being SPIT using caller-SSP information, providing incentives to honest SSPs to correctly tag their outbound calls. They demonstrate, through analytical means, that the precision of their SPIT detection improves by almost 50% even in a limited trust case, with greater improvements as longer trust chains of SSPs are taken into consideration.
Phithakkitnukoon and Dantu propose the use of user feedback in closed email systems (such as Gmail) to identify spammers [40]. The challenge in their scheme, which envisions a binary “spammer/non-spammer” classification, is to choose an appropriate threshold for determining when this transition occurs so as not to misclassify benign users who were accidentally or maliciously tagged as spammers.
b) Content-based detection (1 item): Pörschmann and Knospe [41] propose a SPIT detection mechanism based on applying spectral analysis to the audio data of VoIP calls to create acoustic fingerprints. SPIT calls are identified by detecting a large number of fingerprints across a large number
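The call-graph intuition attributed to Balasubramaniyan et al. [33] in the discussion of reputation-based approaches (small fan-out and long calls suggest a legitimate caller; many very short calls to many distinct callees suggest SPIT) can be illustrated with a small Python sketch. This is only a toy heuristic under stated assumptions, not the reputation system of [33]: the record layout `(caller, callee, duration_seconds)`, the function names, and both cut-off values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

CallRecord = Tuple[str, str, float]  # (caller, callee, duration in seconds)


def caller_features(calls: Iterable[CallRecord]) -> Dict[str, Tuple[int, float]]:
    """Return {caller: (fan_out, average_call_duration)}.

    fan_out is the number of distinct callees the caller has contacted.
    """
    callees = defaultdict(set)
    durations = defaultdict(list)
    for caller, callee, duration in calls:
        callees[caller].add(callee)
        durations[caller].append(duration)
    return {c: (len(callees[c]), mean(durations[c])) for c in callees}


def looks_like_spit(fan_out: int,
                    avg_duration: float,
                    max_fan_out: int = 50,
                    min_avg_duration: float = 10.0) -> bool:
    """Flag callers with a large fan-out AND very short average calls.

    Both cut-offs are illustrative defaults, not values from the literature.
    """
    return fan_out > max_fan_out and avg_duration < min_avg_duration
```

A caller flagged this way would, in the scheme summarized in the text, be redirected to a Turing test rather than blocked outright, which limits the cost of false positives.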
of a user such that calls to that user are redirected to a malicious party (impersonation) or are dropped (denial of service). In their approach, users create temporary public keys that are bound to their location and identity through the SIP registration process, possibly leveraging the existing SIP authentication mechanism used (or using some out-of-band mechanism for securing the binding). Users then digitally sign their registration information, which the local registrar verifies before sending to the location server. To allow entities in other domains to verify the location information, the user public key can be conveyed through a secure channel at the domain level, e.g., by leveraging registrar public key certificates, or a pairwise shared secret key between two domains. This approach assumes benign and reliable registrar servers. To mitigate this weakness in the assumptions and to improve overall service reliability, the authors also propose the use of Byzantine Fault Tolerance techniques, adapting their protocols (public key binding & querying, and user registration) to a quorum environment. They conduct an experimental evaluation of their non-replicated scheme, showing that it can achieve the same performance as unsecured SIP and is 3–50 times faster than TLS-protected SIP.
2) Eavesdropping, Interception, and Modification (30 items): Considerable work has been dedicated to protecting and attacking VoIP signaling and data traffic. We divide the work in two sub-categories, attacks and defenses.
a) Attacks (12 items): Wang et al. [63], [64] describe a de-anonymization attack against VoIP streams that use low-latency anonymity proxies. Their intuition is to insert a watermark in the encrypted stream, tracking its propagation across the network. The watermark used is a perturbation of the inter-packet delay for selected packets in the stream. With appropriate use of redundancy, they demonstrate a tracking attack against 2-minute Skype calls across the Internet using 3 ms delays. Depending on the watermark parameters chosen, they can achieve 99% true positive and 0% false positive rate or 100% true positive and 0.1% false positive rate. Srivatsa et al. [65] demonstrate flow-analysis attacks that expose the privacy of peer-to-peer VoIP participants.
Shah et al. [66] examine the use of injected jitter into VoIP as a covert channel to exfiltrate keyboard activity of interest (e.g., passwords). This attack would be effective even when the VoIP stream is encrypted.
Takahashi and Lee [67] examine the problem of covert channels in VoIP protocols, identifying and quantifying several ways in which data can be surreptitiously leaked out of a user's system or an enterprise network. As an example, they demonstrate the steganographic insertion of a second voice channel in a SIP-based VoIP conversation. This has the potential of leaking an otherwise secure (encrypted) conversation through a secondary channel, or can be used to hide the true communication content from an eavesdropper. They determine such parameters as channel capacity and perceptual quality of the encoded signal through experimental evaluation. They conclude with a discussion of several possible countermeasures and detection methods.
Weiser et al. [68] provide an overview of the security considerations in RTP, the media transfer protocol used in both SIP and H.323. They analyze six different implementations, discovering confidentiality (eavesdropping a call), integrity (injecting voice into an ongoing call) and availability (performing DoS) compromises. This work assumes that no security mechanism (such as SRTP) is used.
Wright et al. [69] apply machine learning techniques to determine the language spoken in a VoIP conversation from the length of the encrypted voice frames, when a variable bit rate (VBR) voice codec is used. As a countermeasure, they propose the use of block ciphers for encrypting the voice. In follow-on work [70] they use profile Hidden Markov Models to identify specific phrases in the encrypted voice stream with a 50% average accuracy, rising to 90% for certain phrases.
Wang et al. [71] evaluate the resilience of three commercial VoIP services (AT&T, Vonage and Gizmo) against man-in-the-middle adversaries. They show that it is possible for an attacker to divert and redirect calls in the first two services by modifying the RTP endpoint information included in the SDP exchange (which is not protected by the SIP Digest Authentication), and to manipulate a user's call forwarding settings in the latter two systems. These vulnerabilities permit large-scale voice pharming, where unsuspecting users are directed to fake interactive voice response systems or human representatives. The authors argue for the need for TLS or IPsec protection of the signaling.
Verscheure et al. [72] exploit the nature of human conversation (i.e., alternating periods of talking and silence for each participant) to reveal communication pairs over a period of time. The technique does not work as well against systems that do not use silence suppression, as these effectively introduce a form of constant (voice) traffic padding in both directions.
Petraschek et al. [73] examine the usability and security of ZRTP, a key agreement protocol based on the Diffie-Hellman key exchange, designed for use in VoIP environments that lack pre-established secret keys among users or a public key infrastructure (PKI). ZRTP is intended to be used with SRTP, which performs the actual content encryption and transfer. Because of the lack of a solid basis for authentication, which makes active man-in-the-middle attacks easy to launch, ZRTP uses Short Authentication Strings (SAS) to allow two users to verbally confirm that they have established the same secret key. The verbal communication serves as a weak form of authentication at the human level. The authors identify a relay attack in ZRTP, wherein a man-in-the-middle adversary can influence the SAS read by two legitimate users with whom he has established independent calls and ZRTP exchanges. The attacker can use one of the legitimate users as an oracle to pronounce the desired SAS string through a number of means, including social engineering. The authors point out that SAS does not offer any security in some communication scenarios with high security requirements, e.g., a user calling (or being called by) their bank. The authors implement their attack and demonstrate it in a lab environment.
Zhang et al. [74] show that, by exploiting DNS and VoIP implementation vulnerabilities, it is possible for attackers to perform man-in-the-middle attacks even when they are not on the direct communication path of the parties involved. They demonstrate their attack against Vonage, requiring that the
attacker only knows the phone number and the IP address of the target phone. Such attacks can be used to eavesdrop and hijack the victims’ VoIP calls. The authors recommend that users and operators use signaling and media protection, conduct fuzzing and testing of VoIP implementations, and develop a lightweight VoIP intrusion detection system to be deployed on the VoIP phone.
b) Defenses (18 items): Guo et al. [75] propose a new scheme for protecting voice content that provides strong confidentiality guarantees while allowing for graceful voice degradation in the presence of packet loss. They evaluate their scheme via simulation and micro-benchmarks. However, Li et al. [76] show that the scheme is insecure.
Bellovin et al. [77] argue against the enactment of legislation (in the US) mandating the integration of lawful-intercept capabilities into VoIP implementations. Their key concern is that, based on a history of system compromises and implementation weaknesses, mandating such capabilities would enable or ease attacks against personal communications by adversaries that would otherwise be unable to conduct such attacks. They suggest that lawful interception needs be met either at the application provider or the network link level.
Seedorf [78] proposes the use of cryptographically generated SIP URIs to protect the integrity of content in P2P SIP. Specifically, he uses self-certifying SIP URIs that encode a public key (or, more compactly, the hash of a public key). The owner of the corresponding private key can then post signed location binding information on the peer-to-peer network (e.g., Chord) that is used by call initiators to determine call routing.
Fessi et al. [79] propose extensions to P2P SIP that provide location and interaction privacy for participants. They develop a signaling protocol for P2P SIP that uses two different Kademlia-based overlay networks for storing information and forwarding traffic, respectively. Their scheme requires a centralized authentication server, which provides verifiable identities at the application/SIP layer. They consider attacks against their scheme, shared with more general anonymity systems (such as Tor). They use analytical models to estimate communication reliability, cryptographic overhead, and end-to-end signaling latency.
Talevski et al. [80] describe the addition of security (in the form of encryption and integrity protection) to a lightweight VoIP protocol suitable for mobile devices. Kuntze et al. [81] propose a mechanism for providing non-repudiation of voice content by using digital signatures, taking into consideration packet losses by reporting to the sender which packets were actually received.
Wang et al. [82] extend the SIP call setup to include a Diffie-Hellman-based key exchange that results in multiple shared keys that the parties switch among during the call in a deterministic (but unknown to an adversary) fashion. Their stated goal is to impede cryptanalytic attacks that depend on the same shared secret key being used throughout a call. They conduct a performance evaluation using a prototype implementation of their scheme on software phones, concluding that the overhead is negligible. The likely adoption of DTLS-SRTP would probably supersede this effort. Gurbani and Kolesnikov [83] discuss DTLS-SRTP and SDES (another proposed protocol for media protection), and propose a lightweight scheme that mitigates some of the performance concerns and security weaknesses of DTLS-SRTP.
Hlavacs et al. [84] propose the integration of computational puzzles in ZRTP as a way to mitigate the man-in-the-middle attack described earlier [73]. Effectively, their scheme places an upper bound on the amount of time a ZRTP exchange may take, placing the attacker under (hopefully) severe time constraint and making them unable to carry out the independent but parallel calls that are necessary. The authors propose a new puzzle scheme based on computing selected eigenvectors of real symmetric matrices. An additional protection mechanism suggested is to randomly delay (by short amounts of time) the receiving of calls, again trying to make more difficult the attacker’s task of orchestrating and playing against each other two independent calls.
Palmieri and Fiore [85] describe an adaptation of SIP to provide end-to-end security using existing and well-known primitives (e.g., digital signatures and efficient encryption mechanisms). The authors developed a prototype implementation and conducted a performance analysis of their scheme. One drawback of this scheme relative to ZRTP is that it requires a PKI. When compared to at least some proposed deployments of DTLS-SRTP, this scheme provides end-to-end non-repudiation and end-to-end authentication while being resistant to man-in-the-middle attacks.
Zhang and Berthold [86] discuss several passive traffic analysis attacks on VoIP systems. These attacks exploit both signaling and media flow information. They also discuss techniques that can be used to mitigate some of these attacks, and conclude with a list of open problems. Many of the attacks and the countermeasures are shared with those in general-purpose anonymity systems. Zhang and Fischer-Hübner [87] and Melchor et al. [88] also discuss techniques for protecting the privacy of VoIP calls. The former study an approach based on using an anonymization overlay network (such as Tor) with traffic padding (where the overlay knows what traffic to drop because it is marked by the sender). The latter discuss and evaluate (using an analytical model) the use of MIXes to provide strong resistance against traffic analysis for VoIP flows. Their scheme uses dummy traffic, broadcasting, and private information retrieval as building blocks. Srivatsa et al. [89] examine the problem of on-demand construction of QoS-sensitive routes in anonymizing networks.
Elbayoumy and Shepherd [90] propose the use of TEA (Tiny Encryption Algorithm) as a lightweight confidentiality mechanism. Subsequently, they propose an adaptive scheme where the selection of encryption algorithm to be used in protecting traffic is made with consideration of the CPU capabilities of both communicating parties [91], [92].
3) Denial of Service (31 items): Reynolds and Ghosal [93] describe a multi-layer protection scheme against flood-based application- and transport-layer denial of service (DoS) attacks in VoIP. They use a combination of sensors located across the enterprise network, continuously estimating the deviation from the long-term average of the number of call setup requests and successfully completed handshakes. Similar techniques have been used in detecting TCP SYN flood attacks, with
good results. The authors evaluate their scheme via simulation, considering several different types of DoS attacks and recovery models.
Larson et al. [94], [95] experimentally analyzed the impact of distributed denial of service (DDoS) attacks on VoIP call quality. They also established the effectiveness of low-rate denial of service attacks that target specific vulnerabilities and implementation artifacts to cause equipment crashes and reboots. They discuss some of the possible defenses against such attacks and describe Sprint’s approach, which uses regional “cleaning centers” which divert suspected attack traffic to a centralized location with numerous screening and mitigation mechanisms available. They recommend that critical VoIP traffic stay on private networks, the use of general DDoS mechanisms as a front-line defense, VoIP-aware DDoS detection and mitigation mechanisms, traffic policing and rate-limiting mechanisms, the use of TCP for VoIP signaling (which makes IP spoofing, and hence anonymous/unfilterable DoS attacks, very difficult), extended protocol compliance checking by VoIP network elements, and the use of authentication where possible.
Bremler-Barr et al. [96] describe de-registration attacks in SIP, wherein an adversary can force a user to be disassociated from the proxy server and registrar, or even divert that user’s calls to any party (including to the attacker). This attack works even when authentication is used, if the adversary can eavesdrop on traffic between the client and the SIP proxy. They demonstrate the attack against several SIP implementations, and propose a protection mechanism that is similar to one-time passwords.
Chen [97] describes a denial of service detection mechanism that models the SIP transaction state machine and identifies attacks by measuring the number of transaction and application errors, the number of transactions per node, and the traffic volume per transaction. If certain thresholds are exceeded, an alert is generated. Chen does not describe how appropriate thresholds can be established, other than to indicate that historical records can be used.
Sengar et al. [98], [99] describe vFDS, an anomaly detection system that seeks to identify flooding denial of service attacks in VoIP. The approach taken is to measure abnormal variations in the relationships between related packet streams using the Hellinger distance, a measure of the deviation between two probability measures. Using synthetic attacks, they show that vFDS can detect flooding attacks that use SYN, SIP, or RTP packets within approximately 1 second of the commencement of an attack, with small impact on call setup latency and voice quality. A similar approach, using Hellinger distance on traffic sketches, is proposed by Tang et al. [100], overcoming the limitations of the previous schemes against multi-attribute attacks. Furthermore, their scheme does not require the constant calculation of an accurate threshold (defining “normal” conditions).
Zhang et al. [101] describe a denial of service attack wherein adversaries flood SIP servers with calls involving URIs with DNS names that do not exist. Servers attempting to resolve them will then have to wait until the request times out (either locally or at their DNS server), before they can continue processing the same or another call. This attack works against servers that perform synchronous DNS resolution and only maintain a limited number of execution threads. They experimentally show that as few as 1,000 messages per second can cause a well-provisioned synchronous-resolution server to exhibit very high call drops, while simple, single-threaded servers can be starved with even 1 message per second. As a countermeasure, they propose the use of non-blocking DNS caches, which they prototype and evaluate.
Fiedler et al. [102] present VoIP Defender, an open architecture for monitoring SIP traffic, with a primary focus on high-volume denial of service attacks. Their architecture allows for a variety of detection methods to be integrated, and several different attack prevention and mitigation mechanisms to be used. Key design goals include transparency, scalability, extensibility, speed and autonomous operation. Their evaluation of the prototype implementation consists exclusively of performance measurements.
Conner and Nahrstedt [103] describe a semantic-level attack that causes resource exhaustion on stateful SIP proxies by calling parties that (legitimately or in collusion) do not respond. This attack does not require network flooding or other high traffic volume attacks, making it difficult to detect with simple, network-based heuristics used against other types of denial of service attacks. They propose a simple algorithm, called Random Early Termination (RET), for releasing reserved resources based on the current state of the proxy (overloaded or not) and the duration of each call’s ringing. They implement and evaluate their proposed scheme on a SIP proxy running in a local testbed, showing that it reduces the number of benign call failures when under attack, without incurring measurable overheads when no attack is underway.
Luo et al. [104] experimentally evaluate the susceptibility of SIP to CPU-based denial of service attacks. They use an open-source SIP server in four attack scenarios: basic request flooding, spoofed-nonce flooding (wherein the target server is forced to validate the authenticator in a received message), adaptive-nonce flooding (where the nonce is refreshed periodically by obtaining a new one from the server), and adaptive-nonce flooding with IP spoofing. Their measurements show that these attacks can have a large impact on the quality of service provided by the servers. They propose several countermeasures to mitigate such attacks, indicating that authentication by itself cannot solve the problem and that, in some circumstances, it can exacerbate its severity. These mitigation mechanisms include lightweight authentication and whitelisting, proper choice of authentication parameters, and binding of nonces to client IP addresses.
Fuchs et al. [105] apply anomaly detection techniques to protect against VoIP-originated denial of service attacks at the phone call level at public safety service centers (e.g., 911 or 112 operators). Specifically, they use call traces from normal operations to determine the level of calls coming from the PSTN, GSM and VoIP networks during normal operation and at disaster time. They then use these profiles to discriminate against VoIP-based DoS attacks by limiting the accepted number of calls that can originate from that domain, building on previous work that identified the network of origin
as a potential discriminator [106]. Using call traces from a fire department response center, they evaluate the call response rate against the DoS attack intensity. Their analysis shows that it is possible to identify such attacks early and to avoid false positives if VoIP-originated calls under normal scenarios are less than 27% of total call volume.
Hyun-Soo et al. [107] propose a detection mechanism for de-registration and other call disruption attacks in SIP that is based on message retransmission: when a server receives an unauthenticated (but possibly legitimate) message M that could disturb a call or otherwise deny service to a user, it asks the user’s agent to retransmit the last SIP message sent by that agent, as an implicit authenticator. If the retransmission matches M (i.e., this was a legitimate request), the server proceeds with its processing. If the retransmission does not match M, or if multiple retransmissions are received within a short time window (as may be the case when an attacker can eavesdrop on the network link between the SIP proxy and the user, identifying the request for retransmission), M is discarded. However, the scheme requires a new SIP message to signal that a retransmission is needed. Geneiatakis and Lambrinoudakis [108], [109] consider some of the same attacks, and propose mitigation through an additional SIP header that must be included in all messages and can cryptographically validate the authenticity and integrity of control messages.
Ormazabal et al. [110] describe the design and implementation of a SIP-aware, rule-based application-layer firewall that can handle denial of service (and other) attacks in the signaling and media protocols. They use hardware acceleration for the rule matching component, allowing them to achieve filtering rates on the order of hundreds of transactions per second. The SIP-specific rules, combined with state validation of the endpoints, allow the firewall to open precisely the ports needed for only the local and remote addresses involved in a specific session, by decomposing and analyzing the content and meaning of SIP signaling message headers. They experimentally evaluate and validate the behavior of their prototype with a distributed testbed involving synthetic benign and attack traffic generation.
Ehlert et al. [111], [112] propose a two-layer DoS prevention architecture for SIP. The first layer is comprised of a bastion host that protects against well-known network-layer attacks (such as TCP SYN flooding) and SIP-flooding attacks. The second layer is located at the SIP proxy, and is composed of modules that perform signature-based detection of malformed SIP messages and a non-blocking DNS cache to protect against attacks involving SIP URIs with irresolvable DNS names [101]. They conduct a series of evaluations in an experimental testbed, where they validate the effectiveness of their architecture to block or mitigate a number of DoS attacks. Ehlert et al. [113] separately propose and experimentally evaluate (via a testbed) a specification-based intrusion-detection system for denial of service attacks. Geneiatakis et al. [114], [115] use counting Bloom filters to detect messages that are part of a denial of service attack in SIP by determining the normal number of pending sessions for a given system and configuration based on profiling.
Awais et al. [116] describe an anti-DoS architecture based on bio-inspired anomaly detection. They compare their scheme against a cryptography-based mechanism using synthetic traffic. Similar work is described by Rebahi et al. [117]. Akbar and Farooq [118] conduct a comparative evaluation of several evolutionary and non-evolutionary machine learning algorithms using synthetic SIP traffic datasets with different levels of attack intensities and durations. They conclude that different algorithms and settings are best suited for different scenarios. The same authors subsequently apply anomaly detection techniques to identify RTP fuzzing attacks that seek to cause server crashes through malformed packet headers and payloads [119]. They investigate several different classifiers, analyzing their accuracy and performance using synthetic RTP traces. Nassar et al. [120] use support vector machine (SVM) classifiers on 38 distinct features in SIP traffic to identify SPIT and DoS traffic. Their experiments using SIP traffic traces show good performance and high detection accuracy.
Rafique et al. [121] analyze the robustness and reliability of SIP servers under DoS attacks. They launch a number of synthesized attacks against four well-known SIP proxy servers (OpenSER, PartySIP, OpenSBC, and MjServer). Their results demonstrate the ease with which SIP servers can be overloaded with call requests, causing such performance metrics as Call Completion Rate, Call Establishment Latency, Call Rejection Ratio and Number of Retransmitted Requests to deteriorate rapidly as attack volume increases, sometimes with as few as 1,000 packets/second. As an extreme case of such attacks, large volumes of INVITE messages can even cause certain implementations to crash. While valuable in documenting the susceptibility to such attacks, this work proposes no defense strategies or directions.
Akbar et al. [122] conduct an analysis of three anomaly detection algorithms for detecting flood attacks in IMS: adaptive threshold, cumulative sum, and Hellinger distance. They use synthetic traffic data to determine the detection accuracy of these algorithms in the context of a SIP server being flooded with SIP messages.
Battistello [123] introduces a DoS-resistant protocol for authenticated call establishment with key exchange across different domains.
4) Service Abuse (7 items): Truong et al. [124] describe a rules-based intrusion detection system for H.323 that uses an FSM model to detect unexpected messages, aimed at identifying illegitimate RAS (Registration, Admission and Status) messages being forwarded to an H.323 gatekeeper.
Kotulski and Mazurczyk [125], [126], [127] propose the use of steganography and digital watermarking to embed additional information into SIP traffic to provide stronger origin authentication and content integrity guarantees in a bandwidth-sensitive manner. Their scheme encodes the necessary information into unused fields in the IP, UDP and RTP protocol headers, and also into the transmitted voice.
Zhang et al. [128] present a number of exploitable vulnerabilities in SIP that can be used to manipulate billing records in a number of ways, showing their applicability against real commercial VoIP providers. Their focus is primarily on attacks that create billing inconsistencies, e.g., customers being charged for service they did not receive, or over-charged for service received.
Some of these attacks require a man-in-the-middle capability, while others only require some prior interaction with the target (e.g., receiving a call from the victim SIP phone device).
Abdelnur et al. [129] use AVISPA to identify a protocol-level vulnerability in the way SIP handles authentication [130]. AVISPA is a model checker for validating security protocols and applications using a high-level protocol specification and security-goals language that gets compiled into an intermediate format that can be consumed by a number of lower-level checkers. The attack is possible with the SIP Digest Authentication, whereby an adversary can reuse another party’s credentials to obtain unauthorized access to SIP or PSTN services (such as calling a premium or international phone line). This attack is possible because authentication may be requested in response to an INVITE message at any time during a call, and the responder may issue an INVITE message during a call either automatically (because of timer expirations) or through a user action (e.g., placing the caller on hold in order to do a call transfer). While the solution is simple, it requires changes possibly to all end-device SIP implementations.
Geneiatakis et al. [131] address the problem of billing attacks against telephony service providers and their users. They propose an authentication-based scheme that leverages the existing Authentication, Authorization and Accounting (AAA) infrastructure operated by the service provider to provide the latter with explicit and non-repudiable call confirmation by the call initiator. However, the scheme has not been implemented or evaluated, experimentally or formally.
C. Additional Categories (134 items)
We now classify the remainder of the surveyed work (55% of the total) using the following categories: Overviews (19.7%), Field Studies and Analysis (4.9%), Performance Analysis (5.7%), Authentication Protocols (6.1%), Architecture (7.8%), Middleboxes (4.5%), Intrusion Detection (4.5%), and Miscellaneous (0.8%).
1) Overviews and Surveys (50 items): There is a considerable body of work focusing on surveying and summarizing risks and threats in SIP, and describing existing work on defense mechanisms.
a) General overviews (42 items): Ackermann et al. [132] describe threats in VoIP, focusing on specific attacks and vulnerabilities as case studies. Hunter [133], Batchvarov [134], Bradbury [135], and Chau [136] provide summaries of specific security concerns in VoIP.
Sicker and Lookabaugh [137] discuss threats in VoIP and the need for security to be integrated at design and deployment time. Vuong and Bai [138] provide a brief survey of the types of intrusion detection systems that can be used to monitor for specific types of attacks in VoIP.
Geneiatakis et al. [139] describe how SQL injection attacks can be launched through SIP, by including partial SQL statements in certain fields of SIP protocol messages that are likely to be used in subsequent database operations (e.g., parts of the SIP URI in the To: field may be used to look up the location of the user receiving the call). They demonstrate the attack in a lab experiment, and briefly discuss the applicability of general SQL injection defense mechanisms in a SIP environment.
Tucker [140] gives an overview of SIP and H.323, and briefly mentions some security concerns (with an emphasis on denial of service). Posegga and Seedorf [141] offer a similar threat analysis. Edelson [142] discusses denial of service, SPIT, eavesdropping and security of emergency calls, before talking about the particular requirements of VoIP in wireless. She concludes with a brief discussion of intrusion detection for VoIP. Albers et al. [143] give a high-level overview of the types of vulnerabilities that SIP-based systems may be exposed to, and discuss the capabilities and limitations of a number of commercially available (as of 2005) SIP intrusion prevention and testing systems. In a related publication, McGann and Sicker [144] argue that several of the VoIP security tools available in 2005 did not cover the extent of known vulnerabilities, did not provide the coverage claimed by the developers, and were not user-friendly. A short overview of some SIP security mechanisms is given by Geneiatakis et al. [145].
Cao and Malik [146], [147] examine the vulnerabilities that arise from introducing VoIP technologies into the communications systems in critical infrastructure applications. They examine the usual threats and vulnerabilities, and discuss mitigation techniques. They conclude by providing some recommendations and best practices to operators of such systems.
Allain [148] discusses the security challenges in VoIP environments, focusing on a couple of specific issues to highlight the tradeoffs. Adelsbach et al. [149] provide a comprehensive description of SIP and H.323, a list of threats across all networking layers, and various protection mechanisms. A similar analysis was published by the US National Institute of Standards and Technology (NIST) [150]. An updated summary, with practical recommendations to users and operators, is provided by Walsh and Kuhn [151]. Anwar et al. [152] identify some areas where the NIST report remains incomplete: counter-intuitive results with respect to the relative performance of encryption and hash algorithms, the non-use of the standardized Mean Opinion Score to evaluate call quality, and the lack of anticipation of RTP-based denial of service. They then propose the use of design patterns to address the problems of secure traversal of firewalls and NAT boxes, detecting and mitigating DoS attacks in VoIP, and securing VoIP against eavesdropping.
Geneiatakis et al. [153] also survey a number of SIP security vulnerabilities. Geneiatakis et al. [154] categorize potential attacks on VoIP services, and provide recommendations and guidelines for protecting the infrastructure. They use ontologies to represent these recommendations, and first-order logic to translate them to a unified security policy for VoIP.
Me and Verdone [155] describe the security threats and high-level vulnerabilities in SIP when used in 802.11 or other similar wireless environments. Singhai and Sahoo [156] describe the risks of VoIP technologies (focusing on SIP and H.323) and compare them with the public switched telephony network (PSTN). Rippon [157] provides a laundry list of threats and mitigation techniques for VoIP systems. Brief descriptions of some VoIP-related threats are given by Hung
and Martin [158], [159] and Zandi et al. [160].
Xin [161] provides a somewhat more detailed overview of VoIP-related security concerns. Persky gives a very detailed description of several VoIP vulnerabilities [162]. Quinten et al. [163] survey the various techniques for preventing and reducing SPIT, offering some suggestions as to possible combinations that increase overall blocking effectiveness. Hansen and Woodward [164] overview threats in VoIP environments and recommend that VoIP and data networks be logically or physically separated. James and Woodward [165] propose a security framework for end users of VoIP technologies, combining a number of commonly available mechanisms and recommendations.
Butcher et al. [166] overview security issues and mechanisms for VoIP systems, focusing on security-oriented operational practices by VoIP providers and operators. Such practices include the separation of VoIP and data traffic by using VLANs and similar techniques, the use of integrity and authentication for configuration bootstrapping of VoIP devices, authentication of signaling via TLS or IPsec, and the use of media encryption. They briefly describe how two specific commercial systems implement such practices, and propose some directions for future research.
A comprehensive discussion of threats and security solutions is given by Thermos and Takanen [167]. Kurmus and Garet [168] summarize a number of threats and specific vulnerabilities using actual attack tools.
Sisalem et al. [169] provide an in-depth description of SIP and IMS, discussing the security mechanisms available in each part of the architecture. They focus particularly on the DoS and SPIT threats, also describing some available countermeasures.
Gurbani and Kolesnikov [170] discuss in depth and compare SDES, DTLS-SRTP, and ZRTP in terms of features supported (e.g., conferencing, PSTN calling) and security features/weaknesses (e.g., susceptibility to man-in-the-middle attacks and key leakage). They conclude that all three are suitable, but they each offer a feature or suppress a vulnerability that the others do not.
Keromytis [1], [2], [3] surveys over 200 vulnerabilities in SIP implementations that were disclosed in the CVE database from 1999 to 2009. He classifies these vulnerabilities along several dimensions, including the VoIPSA threat taxonomy, the traditional Confidentiality/Integrity/Availability concerns, and a Protocol/Implementation/Configuration axis. He finds that the various types of denial of service attacks constitute the majority of disclosed vulnerabilities, over 90% of which were due to implementation problems and 7% due to configuration.
b) SPIT (6 items): The SPIDER project (SPam over Internet telephony Detection sERvice) released a public report [171] providing an overview of SPIT threats and the relevant European legal framework (both on an EU and national basis).
risks of SPIT in SIP, the latter also taking into consideration feedback from SIP operators. They then classify a number of previously proposed anti-SPIT mechanisms along a prevent/detect/handle axis. Dritsas et al. [175] survey a number of anti-SPIT mechanisms and techniques against a set of criteria that they argue is needed to identify a call as SPIT.
d’Heureuse et al. [176] give an overview of the various anti-SPIT efforts in standardization bodies and propose an architecture for dealing with unwanted communications composed of 5 stages: non-intrusive pre-call message analysis, interaction with the caller, pre-connection callee feedback, call content analysis and real-time callee feedback, and post-call callee feedback.
c) Denial of Service (1 item): Sisalem et al. [177] give an overview of SIP-based DoS attacks, looking at a couple of specific scenarios. They provide some recommendations to implementors of VoIP systems that mitigate some of these attacks.
d) P2P SIP (1 item): Seedorf [178] overviews the security challenges in peer-to-peer (P2P) SIP. Threats specific to P2P-SIP include subversion of the identity-mapping scheme (which is specific to the overlay network used as a substrate), attacks on the overlay network routing scheme, bootstrapping communications in the presence of malicious first-contact nodes, identity enforcement (Sybil attacks), traffic analysis and privacy violation by intermediate nodes, and free riding by nodes that refuse to route calls or otherwise participate in the protocol other than to obtain service for themselves.
2) Field Studies and System/Protocol Analysis (12 items): Wieser et al. [179] extend the PROTOS testsuite [180] with a SIP-specific analysis fuzzing module. They then test their system against a number of commercial SIP implementations, finding critical vulnerabilities in all of them [181].
Berson [182] conducted an evaluation of the Skype system under contract by Skype itself, allowing him access to the source code. The evaluation focused primarily on the cryptographic protocols and algorithms used, and did not discover any significant issues. Baset and Schulzrinne [183] performed a black-box analysis of Skype, identifying some characteristics of the underlying protocol. Biondi and Desclaux [184] dissected the Skype binary in detail, exposing the extensive anti-reverse-engineering and anti-debugging mechanisms built in the program. Their analysis identified a small number of vulnerabilities (including a buffer overflow).
Thermos and Hadsall [185] survey a number of Small Office Home Office (SOHO) VoIP gateways and related equipment, as provided by 3 different commercial VoIP providers with different corporate profiles and customer bases. Their analysis looks at four key factors: manageability, node security, signaling security, and media security. They find numerous problems, including insecure access to the web-based manage-
The second public report [172] focuses on SPIT detection and ment interface, default passwords and inappropriate services,
prevention, summarizing some of the work done in this space lack of encryption to protect signaling and media, and low-
and defining criteria for evaluating the efficiency of anti-SPIT level implementation issues (e.g., presumed buffer overflow
mechanisms. The report classifies prior work according to vulnerabilities and fuzzing-induced crashes). A similar survey
fulfillment of these criteria, expanding on the relative strengths by Scholz [186] looks at protocol and device problems and
and weaknesses of each approach. vulnerabilities at a medium-size German ISP with high rate
Dritsas et al. [173] and Marias et al. [174] survey the of VoIP adoption. He focuses on intentional and uninten-
14 IEEE COMMUNICATIONS SURVEYS & TUTORIALS
tional denial of service attacks, problems in customer-premises equipment (e.g., SIP phones), and protocol-independent issues. A number of problems are found, including DoS through call forks, misconfigured devices, and lawful-interception evasion, among others.
INRIA has been conducting a multi-thrust effort to apply testing and fuzzing toward identifying vulnerabilities in SIP protocols [187], implementations [188] and deployed systems [189], [190]. It is worth noting that this work has resulted in a number of vulnerability disclosures in the Common Vulnerabilities and Exposures (CVE) database and elsewhere.
Gupta and Shmatikov [191] formally analyze the security of the VoIP protocol stack, including SIP, SDP, ZRTP, MIKEY, SDES, and SRTP. Their analysis uncovers a number of flaws, most of which derive from subtle inconsistencies in the assumptions made in designing the different protocols. These attacks include a replay attack in SDES that completely breaks content protection, a man-in-the-middle attack in ZRTP, and a (perhaps theoretical) weakness in the key derivation process used in MIKEY. They also show several minor weaknesses and vulnerabilities in all protocols, primarily enabling denial of service attacks. Floroiu and Sisalem [192] also conduct a comparative analysis of the security aspects of DTLS, ZRTP, MIKEY and SDES. They describe a number of possible attacks against these protocols, and propose mitigation approaches in some cases.
3) Performance Analysis (14 items): Reason and Messerschmitt [193], in one of the earliest works on the subject of the performance impact of security mechanisms on VoIP, looked specifically at the error-expansion properties of encryption and their effect on voice quality. They analytically derive the post-decryption Bit Error Rate (BER) relative to the pre-encryption BER for block and stream ciphers, and analyze the effect of error-expansion mitigation techniques, such as the use of forward error correction, on quality of service. They discuss an error-robust encryption scheme that is analogous to self-synchronizing ciphers.
Elbayoumi and Shepherd [194] conduct a performance comparison of block and stream cipher encryption in the context of securing VoIP calls. They analyze the impact of each on end-to-end delay and subjective quality of perceived voice. A broader view of several performance-impacting parameters is given by the same authors in a concurrent paper in the same journal [195].
Salsano et al. [196] give an overview of the various SIP security mechanisms (as of 2002), focusing particularly on the authentication component. They conduct an evaluation of the processing costs of SIP calls that involve authentication, under different transport, authentication and encryption scenarios. They show that a call using TLS and authentication is 2.56 times more expensive than the simplest possible SIP configuration (UDP, no security). However, a fully protected call takes only 54% longer to complete than a configuration that is more representative than the basic one but still offers no security; the same fully-protected call has the same processing cost if it is transported over TCP with no encryption (TLS). Of the overhead, approximately 70% is attributed to message parsing and 30% to cryptographic processing. With the advent of Datagram TLS (DTLS) [197], it is possible that encryption and integrity for SIP can be had for all configurations (UDP or TCP) at no additional cost.
Barbieri et al. [198] find that when using VoIP over IPsec, performance can drop by up to 63%; however, it is questionable whether these results still hold, given the use of hardware accelerators and the more efficient AES algorithm in IPsec. Simulation-based work by Ranganathan and Kilmartin [199] shows that the use of IPsec with pre-established Security Associations (SAs) increases SIP call setup time by 1.4% and media (voice) transfer by 1.6%. However, when taking into consideration the delay in establishing SAs for the first time using a dynamic key-agreement protocol such as IKE [200] or IKEv2 [201], the call setup delay can increase dramatically. They identify encryption engine queuing delays as a potential concern, as call volumes increase.
A conclusion similar to Salsano et al. [196] is reached by Bilien [202] and Bilien et al. [203], [204], who study the overhead in SIP call setup latency when using end-to-end and hop-by-hop security mechanisms. They consider protocols such as MIKEY, S/MIME, SRTP, TLS, and IPsec, concluding that the overall penalty of using full-strength cryptography is low.
Xiao and Zarrella [205] conduct an experimental evaluation of the impact of security mechanisms on VoIP in wireless environments with a specific voice codec. They specifically look at how the use of IPsec and WEP affects the Mean Opinion Score, packet loss, and delay of VoIP calls in 802.11 networks. They find that WEP has a bigger impact on packet loss than IPsec, but the latter can cause larger packet delays and fewer but more extreme voice artifacts (disturbances) in the call.
Also in the context of VoIP for wireless networks, Lakay and Agbinya [206] summarize similar experiments that show SIP security mechanism processing is responsible for 80% of the call setup delay when using stateless proxies, and 45% for stateful proxies.
Eun-Chul et al. [207] evaluate via simulation the costs of different security protocols (TLS, DTLS and IPsec) with respect to call setup delay using different transport protocols (TCP, UDP and SCTP). They conclude that the most efficient combinations, DTLS/UDP and IPsec/UDP, approximately double the call setup delay. However, since the analysis is purely simulation-based, their results are sensitive to the configured relative costs for processing the various protocols.
Shen et al. [208] also study the performance impact of using TLS as a transport protocol for SIP. In their experiments using a testbed, they use profiling at various system levels (application, library, and kernel), and decompose the costs at a fine level of granularity. They determine that use of TLS can reduce performance by a factor of up to 20 (when compared with the unsecured SIP-over-UDP). The main overhead factor is the cost of RSA signatures during session negotiation, while symmetric key operations impose a relatively small cost. They recommend that operators amortize the setup cost over long-lived connections. Finally, they provide a cost model for provisioning SIP-over-TLS servers, predicting an average performance overhead of 15% under a suggested system configuration.
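The amortization recommendation attributed to Shen et al. [208] above can be illustrated with a toy cost model. The Python sketch below is not the cost model of [208]; the function names and all cost values are hypothetical and chosen only to show the shape of the effect: a large one-time session negotiation cost (dominated, per the text, by RSA signatures) is spread across every SIP message that reuses the connection, while the small symmetric-key cost is paid on each message.

```python
def amortized_cost_per_message(handshake_cost, per_message_cost, messages_per_connection):
    """Average cost per SIP message when one handshake is shared by many messages."""
    if messages_per_connection < 1:
        raise ValueError("messages_per_connection must be at least 1")
    return handshake_cost / messages_per_connection + per_message_cost


def slowdown_vs_plain(handshake_cost, crypto_cost, plain_cost, messages_per_connection):
    """Ratio of secured to unsecured per-message cost under this toy model."""
    secured = amortized_cost_per_message(
        handshake_cost, plain_cost + crypto_cost, messages_per_connection
    )
    return secured / plain_cost


if __name__ == "__main__":
    # Hypothetical cost units for illustration only; not measurements from [208].
    HANDSHAKE, CRYPTO, PLAIN = 19.0, 0.1, 1.0
    for n in (1, 10, 100, 1000):
        print(n, round(slowdown_vs_plain(HANDSHAKE, CRYPTO, PLAIN, n), 3))
```

Under these assumed values, the slowdown is dominated by the handshake when each connection carries a single message, and it approaches the small per-message symmetric-key overhead as connections become long-lived.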
ANGELOS D. KEROMYTIS: A COMPREHENSIVE SURVEY OF VOIP SECURITY RESEARCH 15
Rebahi et al. [209] analyze the performance of RSA as used in SIP for authentication and identity management (via public-key certificates and digital signatures), and describe the use of Elliptic Curve DSA (ECDSA) within this context to improve performance. Using ECDSA, their prototype can handle from 2 to 8 times as many call setup requests per second, with the gap widening as key sizes increase.
4) Authentication Protocols (15 items): Buschel [210] argues for integrated authentication between User Agents and all elements of a SIP infrastructure. Over the years, a number of authentication schemes aiming to replace Digest Authentication have been proposed, using such basic blocks as Diffie-Hellman [211], Elliptic Curve Diffie-Hellman (ECDH) [212], Elliptic Curve Discrete Logarithm Problem (ECDLP) [213], nonces [214], PKI [215], [216], hash functions [217], and others [218], not all of them secure [219].
Cao and Jennings [220] propose a new mechanism for authenticating the responding user’s identity in SIP without exposing said identity to untrusted intermediate elements. Their scheme requires additional headers in SIP messages, and has not been implemented or evaluated.
Insu and Keecheon [221] propose a secret key based mechanism to reduce the performance requirements of using public key certificates to protect signaling (e.g., with TLS) in an enterprise VoIP environment.
Schmidt et al. [222] suggest that administration overheads for implementing strong authentication in SIP could be lowered by grouping users with the same function or role (e.g., agents in a calling center). They propose a proxy-based mechanism for implementing a form of “certificate sharing” among a group of users, without exposing the corresponding private key to any of them. They demonstrate feasibility of the scheme by implementing it in the NIST SIP proxy, with no further evaluation.
Wang and Zhang [223] discuss an authentication and key agreement mechanism for SIP that uses certificate-less public-key cryptography. Certificate-less public-key cryptography [224] is a variant of identity-based cryptography (where the public key of an entity is its public identity); here, the public key for an entity is generated collaboratively between that entity and a trusted third party in such a way that the public key can be verified by any other entity that knows the public parameters under which the trusted third party operates. Compared to previous proposals that used identity-based cryptography [225], their scheme does not require that the trusted third party
5) Architectures (19 items): Singh and Vuong [226] use a mobile agent framework to collect and correlate events from various network components, toward detecting a number of attacks. The stated advantages of their approach are that it does not require a new protocol for exchanging event information and that mitigation and recovery capabilities can be implemented by extending the framework and the agents, with no changes to the VoIP protocols. They also propose using user behavior profiles to detect anomalous behavior. They describe the operation of their system in a number of attack scenarios, including protocol-based denial of service, call hijacking, packet flooding, and abnormal call patterns.
Casola et al. [227], [228] suggest the use of a policy-based approach to design secure VoIP infrastructures. The policies express security goals in measurable terms; suggested infrastructure designs can then be evaluated against these policies to determine whether the goals are met to an acceptable degree.
Wu et al. [229] design an intrusion detection system, called SCIDIVE, that is specific to VoIP environments. Specifically, SCIDIVE aims to detect different classes of intrusions, can operate with different viewpoints (on clients, proxies, or servers), and takes into consideration both signaling (i.e., SIP) and media-transfer protocols (e.g., RTP). SCIDIVE’s ability to correlate cross-protocol behavior theoretically allows for detection of more complex attacks. However, the system is rules-based, which limits its effectiveness against new/unknown attacks. The primary evaluation (conducted on a small testbed) consists of four simple cross-protocol attacks, which would have evaded other contemporary, non-specialized intrusion detection systems. In follow-on work, Apte et al. [230], [231] develop SPACEDIVE, a VoIP-specific intrusion detection system that allows for correlation of events among distributed rules-based detectors. They demonstrate the ability of SPACEDIVE to detect certain classes of attacks using a simple SIP environment with two domains, and compare it with SCIDIVE.
Martin and Hung [232] discuss a high-level policy for VoIP applications, intended to guide the implementation, configuration, and use of VoIP systems.
SNOCER (http://www.snocer.org/), a project funded by the European Union, is “investigating approaches for overcoming temporal network, hardware and software failures and ensuring the high availability of the offered VoIP services based on low cost distributed concepts.” The first public project report [233] provides an overview of VoIP infrastructure components and the threats that must be addressed (staying primarily at the protocol and network level, and avoiding implementation issues with the exception of SQL injection), along with possible defense mechanisms. There is also discussion on scalable service provisioning (replication, redundancy, backups etc.), toward providing reliability and fault tolerance. The second public project report [234] describes an architecture for protecting against malformed messages and related attacks using specification-based intrusion detection, protocol message verification, and redundancy. They use ontologies to describe SIP vulnerabilities, to allow for easy updating of the monitoring components (IDS) [235].
Niccolini et al. [236] design an intrusion detection/intrusion prevention system architecture for use with SIP. Their system uses both knowledge-based and behavior-based detection, arranged as a series in that order. They develop a prototype implementation using the open-source Snort IDS. They evaluate the effectiveness of their system in an attack scenario by measuring the mean end-to-end delay of legitimate SIP traffic in the presence of increasing volumes of malformed SIP INVITE messages.
Marshall et al. [237] describe the AT&T VoIP security architecture. They divide VoIP equipment into three classes:
trusted, trusted-but-vulnerable, and untrusted. The latter consists of the customer premises equipment, which is outside the control of the carrier. The trusted domain includes all the servers necessary to provide VoIP service. Between the two sit various border and security elements that are responsible for protecting the trusted devices while permitting legitimate communications to proceed. They describe the interactions among the various components, and the security mechanisms used in protecting these interactions.
Sher and Magedanz [238] describe a security architecture for IMS service delivery platforms, focusing on time-independent attacks (e.g., software vulnerabilities). The key element of their proposed approach is an intrusion detection and prevention system that inspects all incoming and outgoing SIP messages to the IMS application servers, applying rules that detect and mitigate specific attacks. A brief performance evaluation shows that a prototype can operate with acceptable delay parameters.
Ding and Su [239] propose the combination of specification-based intrusion detection with anomaly detection techniques and attack-specific methods using hierarchical colored Petri nets.
Nassar et al. [240] advocate the use of SIP-specific honeypots to catch attacks targeting the Internet telephony systems, protocols and applications. They design and implement such a honeypot system, and explore the use of a statistical engine for identifying attacks and other misbehavior, based on training on legitimate traces of SIP traffic. The engine is based on their prior work that uses Bayesian-based inference. The resulting SIP honeypot effort is largely exploratory, with performance and effectiveness evaluations left for future work. In follow-on work, Nassar et al. [241] describe an intrusion detection and prevention architecture for VoIP that integrates SIP honeypots and an application-layer event correlation engine.
Barry and Chan [242] describe a host-based intrusion detection architecture for SIP that combines specification-based and signature-based detection, and allows for the correlation of information across modules to identify cross-protocol attacks. They conduct a simulation-based evaluation using OMNeT++ to determine detection accuracy and performance impact.
Rieck et al. [243] apply machine learning techniques to detecting anomalous SIP messages, incorporating a “self-learning” component by allowing for periodic re-training of the anomaly detector using traffic that has been flagged as normal. The features used for clustering are based on n-grams and on tokenization of the SIP protocol. To prevent training attacks, wherein an adversary “trains” the anomaly detector to accept malicious inputs as legitimate, they employ randomization (choosing random samples for the training set), sanitization [244], and verification (by comparing the output of the new and old training models). Their experimental prototype was shown to handle 70 Mbps of SIP traffic, while providing a 99% detection rate with no false positives.
Dantu et al. [245] describe a comprehensive VoIP security architecture, composed of components distributed across the media gateway controller, the proxy server(s), the IP PBX, and end-user equipment. These components explicitly exchange information toward better training of filters, and creating and maintaining whitelists/blacklists. Implicit feedback is also provided through statistical analysis of interactions (e.g., call frequency and duration). The architecture also provisions for a recovery mechanism that incorporates explicit feedback and quarantining.
6) Middleboxes (11 items): Reynolds and Ghosal [246] describe a VoIP-aware middlebox architecture that integrates the enterprise firewall, media gateway, and intrusion detection facilities to allow the secure operation of dynamic VoIP applications. The problem of firewall and NAT traversal by VoIP protocols has been the subject of some research [247], [248], [249], [250], generally involving some kind of signaling (whether in-band or out-of-band) between the end-device and the middlebox.
Bessis et al. [251] discuss the necessary features of a SIP-specific firewall, juxtaposing them with specific threats to SIP messages at each network layer (data link, network, transport and session). They propose a simple, hardware-accelerated SIP proxy as a front-end SIP firewall and argue that this approach would block most of the attacks.
Gurbani et al. [252] propose a mechanism whereby proxies create an overlay network between user agents. This network is used for rendezvous/coordination purposes only. Once user agents establish a session, the proxies become transparent traffic forwarders, with the user agents communicating over an end-to-end secure session. This approach allows users to communicate without exposing (as much) private information to proxies, at the cost of requiring a PKI and a new message extension.
Sengar et al. [253], [254], [255] examine the problem of cross-infrastructure vulnerabilities created by bridging VoIP and PSTN networks. They outline a high-level architecture that integrates firewall-like functionality with trust management, signaling encryption and authentication, and intrusion detection.
Ehlert et al. [256] describe a rule-reduction algorithm for improving the performance of firewalls operating in busy VoIP environments, in balance with security requirements. Their algorithm works by merging similar single-mapped rules into a more general rule, then dropping less important rules, and finally calculating the accuracy of the new ruleset. If needed, their algorithm re-iterates until an acceptable solution is achieved.
7) Intrusion Detection (11 items): Mandjes et al. [257], [258] describe the use of statistical techniques to identify anomalies in VoIP networks. Their work is primarily directed at non-adversarial anomalies, although certain attacks (such as denial of service) would also be detected by their scheme.
Geneiatakis et al. [259], [260] discuss malformed-message attacks against SIP servers and equipment, primarily depending on the PROTOS testsuite for SIP implementations [180]. To detect such attacks, they propose building an intrusion detection system that leverages the SIP syntax grammar [8] to decompose incoming messages, and a grammar for specifying rules that check whether specific constraints are being violated (or specific conditions met) [261], [262]. In subsequent work, Geneiatakis and Keromytis [263] apply entropy theory and “itself information” to the problem of identifying anomalies
in a stream of SIP messages.
However, Hantehzadeh et al. [264] point out that most approaches to anomaly detection in SIP use datasets with large differences between anomalous and normal messages, which make them easy to detect. An analysis using a dataset with minimal such differences (while maintaining the distinction between malicious and normal messages) indicates that existing classification schemes do not perform as well. They propose feature reduction techniques to enhance these classification schemes even on “trickier” datasets. Similar results, focusing specifically on the performance of classifiers using Euclidean distance, are discussed by Mehta et al. [265].
Sengar et al. [255] model the protocol state machine of individual SIP nodes (derived from the SIP specification) and inter-node interactions, in order to have a complete picture of the overall system state towards detecting anomalous behavior and attacks. This is particularly important in VoIP, since nodes can interact in many ways, and with several other nodes during a call and throughout their operation. They conduct a performance evaluation to determine the overhead added to call setup and media transfer by their system, and its overall scalability. While their system can identify known attacks (for which attack patterns can be specified) with high accuracy and low false positives, detecting previously unknown attacks depends on the fidelity of the protocol state machines. This problem is left for future work.
Seo et al. [266] develop a stateful intrusion detection system for SIP, modeling SIP state transitions to match the expected state of the monitored SIP entities. Their system allows the specification of rules that match attacks and misbehavior based not only on the content of the communications but also on the state of the SIP call and of the proxies.
8) Miscellaneous (2 items): Cao et al. [267] describe how to transparently add information in SIP and H.323 messages such that calls can be tracked across the network. A similar approach, leveraging watermarking of VoIP content, was previously described by Steinebach et al. [268].
V. DISCUSSION
In our previous work [1], [2], [3], we surveyed 215 vulnerabilities in SIP implementations that had been disclosed in the CVE database from 1999 to 2009. We classified these vulnerabilities along several dimensions, including the VoIPSA threat taxonomy, the traditional {Confidentiality, Integrity, Availability} concerns, and a {Protocol, Implementation, Configuration} axis. We found that the various types of denial of service attacks constitute the majority of disclosed vulnerabilities, over 90% of which were due to implementation problems and 7% due to configuration.
Considering the research work we have surveyed, we can see that out of a total of 245 publications, almost 20% concern themselves with an overview of the problem space and of solutions, a figure we believe is reasonable, considering the enormity of the problem space and the speed of change in the protocols, standards, and implementations. We also see a considerable amount of effort (roughly 20%) going toward addressing SPIT. While SPIT does not appear to be a major issue with VoIP users at this point, our past and current experiences with email spam and telemarketing seem to provide sufficient motivation for research in this area. Most of the work is focused on identifying SPIT calls and callers based on behavioral traits, although a number of other approaches are under exploration (e.g., CAPTCHAs and real-time content analysis). One of the problems is the lack of a good corpus of data for experimentation and validation of the proposed techniques.
We were also not surprised to see a sizable portion of research (over 15%) directed at design, analysis (both security- and performance-oriented), and attacking of cryptographic protocols as used in VoIP. The cryptographic research community appears to be reasonably comfortable in proposing tweaks and minor improvements to the basic authentication mechanisms, and the systems community appears content with analyzing the performance of different protocol configurations (e.g., TLS vs. IPsec).
Most distressing, however, is the fact that comparatively little research (less than 13%) is going toward addressing the problem of denial of service. Given the numerical dominance of SIP-specific DoS vulnerabilities (as described earlier) and the ease of launching such attacks, it is clear that significantly more work is needed here. What work is being done seems to primarily focus on the server and infrastructure side, despite our finding that half of DoS-related vulnerabilities are present on endpoints. Furthermore, much of the existing work focuses on network-observable attacks (e.g., “obviously” malformed SIP messages), whereas the majority of VoIP DoS vulnerabilities are the result of implementation failures. More generally, additional work is needed in strengthening implementations, rather than introducing middleboxes and network intrusion detection systems, whose effectiveness has been shown to be limited in other domains; taking a black box approach in securing VoIP systems is, in our opinion, not going to be sufficient.
Also disconcerting is the lack of research (2.8%) in addressing service abuse threats, considering the visibility of large fraud incidents [4], [5], [6].
In general, we found little work that took a “big picture” view of the VoIP security problem. What cross-cutting architectures have been proposed focus primarily on intrusion detection. Work is needed to address cross-implementation and cross-protocol problems, above and beyond the few efforts along those lines in the intrusion detection space.
Finally, we note that none of the surveyed works addressed the problem of configuration management. While such problems represent only 7% of known vulnerabilities, configuration issues are easy to overlook and are likely under-represented in our previous analysis due to the nature of vulnerability reporting.
VI. CONCLUSIONS
We have presented a survey of 245 publications on the topic of VoIP security, classifying them according to the VoIPSA threat taxonomy. We juxtaposed this survey against our previous analysis on VoIP security vulnerabilities. We identified two
specific areas (denial of service and service abuse) as being under-represented in research efforts directed at them (relative to their importance in the vulnerability survey). Furthermore, we identify implementation bugs and misconfigurations as two general problem areas that merit considerably more work than they currently attract. We hope that our work will ease the task of conducting research in VoIP security and help guide the often disjoint research efforts.
REFERENCES
[1] A. D. Keromytis, “Voice over IP: Risks, Threats and Vulnerabilities,” in Proceedings of the Cyber Infrastructure Protection (CIP) Conference, June 2009.
[2] A. D. Keromytis, “A Look at VoIP Vulnerabilities,” USENIX ;login: Magazine, vol. 35, pp. 41–50, February 2010.
[3] A. D. Keromytis, “Voice over IP Security: Research and Practice,” IEEE Security & Privacy Magazine, vol. 8, pp. 76–78, March/April 2010.
[4] B. Krebs, “Security Fix: Default Passwords Led to $55 Million in Bogus Phone Charges,” June 2009.
[5] The Register, “Two charged with VoIP fraud.” http://www.theregister.co.uk/2006/06/08/voip_fraudsters_nabbed/, June 2006.
[6] The Register, “Fugitive VOIP hacker cuffed in Mexico.” http://www.theregister.co.uk/2009/02/11/fugitive_voip_hacker_arrested/, February 2009.
[7] VoIP Security Alliance, “VoIP Security and Privacy Threat Taxonomy, version 1.0.” http://www.voipsa.org/Activities/taxonomy.php, October 2005.
[8] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, “SIP: Session Initiation Protocol.” RFC 3261 (Proposed Standard), June 2002. Updated by RFCs 3265, 3853, 4320, 4916, 5393.
[9] J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, and L. Stewart, “HTTP Authentication: Basic and Digest Access Authentication.” RFC 2617 (Draft Standard), June 1999.
[10] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications.” RFC 3550 (Standard), July 2003. Updated by RFC 5506.
[24] R. Dantu and P. Kolan, “Detecting Spam in VoIP Networks,” in Proceedings of the USENIX Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI), pp. 31–37, July 2005.
[25] P. Kolan and R. Dantu, “Socio-technical Defense Against Voice Spamming,” ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 2, March 2007.
[26] Y. Rebahi and D. Sisalem, “SIP Service Providers and the Spam Problem,” in Proceedings of the 2nd VoIP Security Workshop, June 2005.
[27] Y. Rebahi, D. Sisalem, and T. Magedanz, “SIP Spam Detection,” in Proceedings of the International Conference on Digital Telecommunications (ICDT), pp. 29–31, August 2006.
[28] M. Hansen, M. Hansen, J. Möller, T. Rohwer, C. Tolkmit, and H. Waack, “Developing a Legally Compliant Reachability Management System as a Countermeasure against SPIT,” in Proceedings of the 3rd Workshop on Securing Voice over IP, June 2006.
[29] R. Baumann, S. Cavin, and S. Schmid, “Voice Over IP - Security and SPIT,” KryptDet Report FU Br 41, Swiss Army, August/September 2006.
[30] A. Madhosingh, “The Design of a Differentiated SIP to Control VoIP Spam,” Masters Thesis Report SPIT, CAPTCHA, Florida State University, Computer Science Department, 2006.
[31] M. Bertrand, Q. Loudier, Y. Gourhant, F. Bougant, and M. Osty, “SPIT Mitigation by a Network-Level Anti-Spit Entity,” in Proceedings of the 3rd Workshop on Securing Voice over IP, June 2006.
[32] H. Yan, H. Zhang, K. Sripanidkulchai, Z. Shae, and D. Saha, “Incorporating Active Fingerprinting into SPIT Prevention Systems,” in Proceedings of the 3rd Workshop on Securing Voice over IP, June 2006.
[33] V. Balasubramaniyan, M. Ahamad, and H. Park, “CallRank: Combating SPIT Using Call Duration, Social Networks and Global Reputation,” in Proceedings of the 4th Conference on Email and Anti-Spam (CEAS), August 2007.
[34] K. Ono and H. Schulzrinne, “Have I Met You Before? Using Cross-Media Relations to Reduce SPIT,” in Proceedings of the 3rd International Conference on Principles, Systems and Applications of IP Telecommunications (IPTComm), pp. 1–7, June 2009.
[35] H. Guang-Yu, W. Y.-Y. Wen, and H. Zhao, “SPIT Detection and Prevention Method Based on Signal Analysis,” in Proceedings of the 3rd International Conference on Convergence and Hybrid Information Technology (ICCIT), vol. 2, pp. 631–638, November 2008.
[36] P. Patankar, G. Nam, G. Kesidis, and C. R. Das, “Exploring Anti-Spam Models in Large Scale VoIP Systems,” in Proceedings of the 28th International Conference on Distributed Computing Systems (ICDCS), pp. 85–92, June 2008.
[11] I. Johansson and M. Westerlund, “Support for Reduced-Size Real-Time Transport Control Protocol (RTCP): Opportunities and Consequences.”
[37] Y.-S. Wu, S. Bagchi, N. Singh, and R. Wita, “Spam Detection in Voice-
RFC 5506 (Proposed Standard), Apr. 2009.
Over-IP Calls through Semi-Supervised Clustering,” in Proceedings of
[12] J. Postel, “Transmission Control Protocol.” RFC 793 (Standard), Sept.
the 39th Annual IEEE/IFIP International Conference on Dependable
1981. Updated by RFCs 1122, 3168.
Systems and Networks (DSN), pp. 307–316, June 2009.
[13] J. Postel, “User Datagram Protocol.” RFC 768 (Standard), Aug. 1980.
[38] K. Hyung-Jong, K. M. Joo, K. Yoonjeong, and J. H. Cheol, “DEVS-
[14] L. Ong and J. Yoakum, “An Introduction to the Stream Control based Modeling of VoIP Spam Callers’ Behavior for SPIT Level Cal-
Transmission Protocol (SCTP).” RFC 3286 (Informational), May 2002. culation,” Simulation Modeling Practice and Theory, vol. 17, pp. 569–
[15] E. Rescorla and N. Modadugu, “Datagram Transport Layer Security.” 584, April 2009.
RFC 4347 (Proposed Standard), Apr. 2006. [39] C. Sorge and J. Seedorf, “A Provider-Level Reputation System for
[16] M. Handley, E. Rescorla, and IAB, “Internet Denial-of-Service Con- Assessing the Quality of SPIT Mitigation Algorithms,” in Proceedings
siderations.” RFC 4732 (Informational), Dec. 2006. of the IEEE Internation Conference on Communications (ICC), pp. 1–
[17] M. Handley, V. Jacobson, and C. Perkins, “SDP: Session Description 6, June 2009.
Protocol.” RFC 4566 (Proposed Standard), July 2006. [40] S. Phithakkitnukoon and R. Dantu, “Defense Against SPIT Using Com-
[18] B. Ramsdell, “Secure/Multipurpose Internet Mail Extensions munity Signals,” in Proceedings of the IEEE International Conference
(S/MIME) Version 3.1 Message Specification.” RFC 3851 (Proposed on Intelligence and Security Informatics (ISI), June 2009.
Standard), July 2004. [41] C. Pörschmann and H. Knospe, “Analysis of Spectral Parameters of
[19] S. Kent and K. Seo, “Security Architecture for the Internet Protocol.” Audio Signals for the Identification of Spam Over IP Telephony,” in
RFC 4301 (Proposed Standard), Dec. 2005. Proceedings of the 5th Conference on Email and Anti-Spam (CEAS),
[20] K. Srivastava and H. Schulzrinne, “Preventing Spam For SIP-based August 2008.
Instant Messages and Sessions,” Technical Report CUCS-042-04, [42] H. Tschofenig, R. Falk, J. Peterson, J. Hodges, D. Sicker, and J. Polk,
Columbia University, Department of Computer Science, October 2004. “Using SAML to Protect the Session Initiation Protocol (SIP),” IEEE
[21] R. Dantu and P. Kolan, “Preventing Voice Spamming,” in Proceedings Network, vol. 20, pp. 14–17, September/October 2006.
of the IEEE Global Telecommunications Conference (GLOBECOM), [43] N. d’Heureuse, J. Seedorf, and S. Niccolini, “A Policy Framework for
Workshop on VoIP Security Challenges and Solutions, December 2004. Personalized and Role-Based SPIT Prevention,” in Proceedings of the
[22] R. MacIntosh and D. Vinokurov, “Detection and Mitigation of Spam 3rd International Conference on Principles, Systems and Applications
in IP Telephony Networks Using Signaling Protocol Analysis,” in of IP Telecommunications (IPTCOMM), July 2009.
Proceedings of the IEEE/Sarnoff Symposium on Advances in Wired [44] Y. Soupionis, S. Dritsas, and D. Gritzalis, “An Adaptive Policy-Based
and Wireless Communication, pp. 49–52, April 2005. Approach to SPIT Management,” in Proceedings of the 13th European
[23] N. Croft and M. Olivier, “A Model for Spam Prevention in Voice over Symposium on Research in Computer Security (ESORICS), pp. 446–
IP Networks using Anonymous Verifying Authorities,” in Proceedings 460, October 2008.
of the 5th Annual Information Security South Africa Conference [45] N. Banerjee, S. Saklikar, and S. Saha, “Anti-vamming Trust En-
(ISSA), July 2005. forcement in Peer-to-peer VoIP Networks,” in Proceedings of the
ANGELOS D. KEROMYTIS: A COMPREHENSIVE SURVEY OF VOIP SECURITY RESEARCH 19
International Conference on Communications and Mobile Computing [69] C. V. Wright, L. Ballard, F. N. Monrose, and G. M. Masson, “Language
(IWCMC), pp. 201–206, July 2006. Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice
[46] J. Quittek, S. Niccolini, S. Tartarelli, M. Stiemerling, M. Brunner, and and Bob?,” in Proceedings of 16th USENIX Security Symposium,
T. Ewald, “Detecting SPIT Calls by Checking Human Communication pp. 1–12, August 2007.
Patterns,” in Proceedings of the IEEE International Conference on [70] C. V. Wright, L. Ballard, S. Coulls, F. N. Monrose, and G. M. Masson,
Communications (ICC), pp. 1979–1984, June 2007. “Spot Me If You Can: Recovering Spoken Phrases in Encrypted VoIP
[47] T. Wang, “A VoIP anti-Spam System based on Reverse Turing Test,” Conversations,” in Proceedings of IEEE Symposium on Security and
Masters Thesis ETD-05072007-173147, North Carolina State Univer- Privacy, pp. 35–49, May 2008.
sity, May 2007. [71] X. Wang, R. Zhang, X. Yang, X. Jiang, and D. Wijesekera, “Voice
[48] J. L. Janne and M. Komu, “Cure for Spam Over Internet Telephony,” Pharming Attack and the Trust of VoIP,” in Proceedings of the 4th
in Proceedings of the 4th IEEE Consumer Communications and International Conference on Security and Privacy in Communication
Networking Conference (CCNC), pp. 896–900, January 2007. Networks (SecureComm), pp. 1–11, September 2008.
[49] S. Niccolini, “SPIT Prevention: State of the Art and Research Chal- [72] O. Verscheure, M. Vlachos, A. Anagnostopoulos, P. Frossard, E. Bouil-
lenges,” in Proceedings of the 3rd Workshop on Securing Voice over let, and P. S. Yu, “Finding ”who is talking to whom” in VoIP
IP, June 2006. Networks via Progressive Stream Clustering,” in Proceedings of the
[50] M. Haberler and O. Lendl, “Secure Selective Peering with Federations,” 6th International Conference on Data Mining (ICDM), p. 667=677,
in Proceedings of the 3rd Workshop on Securing Voice over IP, June December 2006.
2006. [73] M. Petraschek, T. Hoeher, O. Jung, H. Hlavacs, and W. N. Gansterer,
[51] S. Saklikar and S. Saha, “Identity Federation for VoIP-based Services,” “Security and Usability Aspects of Man-in-the-Middle Attacks on
in Proceedings of the ACM Workshop on Digital Identity Management, ZRTP,” Journal of Universal Computer Science, vol. 14, no. 5, pp. 673–
pp. 62–71, November 2007. 692, 2008.
[52] D. Shin and C. Shim, “Voice Spam Control with Gray Leveling,” in [74] R. Zhang, X. Wang, R. Farley, X. Yang, and X. Jiang, “On the
Proceedings of the 2nd VoIP Security Workshop, June 2005. Feasibility of Launching the Man-In-The-Middle Attacks on VoIP
[53] D. Shin, J. Ahn, and C. Shim, “Progressive Multi Gray-Leveling: A from Remote Attackers,” in Proceedings of the 4th International ACM
Voice Spam Protection Algorithm,” IEEE Network, vol. 20, pp. 18–24, Symposium on Information, Computer, and Communications Security
September/October 2006. (ASIACCS), pp. 61–69, March 2009.
[54] R. Schlegel, S. Niccolini, S. Tartarelli, and M. Brunner, “SPam over [75] J.-I. Guo, J.-C. Yen, and H.-F. Pai, “New Voice over Internet Protocol
Internet Telephony (SPIT) Prevention Framework,” in Proceedings Technique with Hierarchical Data Security Protection,” IEE Proceed-
of the IEEE Global Telecommunications Conference (GLOBECOM), ings — Vision, Image and Signal Processing, vol. 149, pp. 237–243,
pp. 1–6, November/December 2006. August 2002.
[55] J. Quittek, S. Niccolini, S. Tartarelli, and R. Schlegel, “Prevention of [76] C. Li, S. Li, D. Zhang, and G. Chen, “Cryptanalysis of a Data Security
Spam over IP Telephony (SPIT),” NEC Technical Journal, vol. 1, no. 2, Protection Scheme for VoIP,” IEE Proceedings—Vision, Image and
pp. 114–119, 2006. Signal Processing, vol. 153, pp. 1–10, February 2006.
[77] S. M. Bellovin, M. Blaze, and S. Landau, “The Real National-Security
[56] J. Quittek, S. Niccolini, S. Tartarelli, and R. Schlegel, “On Spam
Needs for VoIP,” Communications of the ACM (CACM), vol. 48, p. 120,
over Internet Telephony (SPIT) Prevention,” IEEE Communications
November 2005.
Magazine, vol. 46, pp. 80–86, August 2008.
[78] J. Seedorf, “Using Cryptographically Generated SIP-URIs to Protect
[57] Y. Rebahi, S. Ehlert, S. Dritsas, G. F. Marias, D. Gritzalis, B. Pannier,
the Integrity of Content in P2P-SIP,” in Proceedings of the 3rd
O. Capsada, T. Golubenco, J. F. Juell, and M. Hoffmann, “General
Workshop on Securing Voice over IP, June 2006.
Anti-Spam Security Framework for VoIP Infrastructures,” Tech. Rep.
[79] A. Fessi, N. Evans, H. Niedermayer, and R. Holz, “Pr2-P2PSIP: Privacy
Deliverable WP2/D2.3, SPIDER COOP-32720, July 2007.
Preserving P2P Signaling for VoIP and IM,” in Proceedings of the 4th
[58] B. Mathieu, S. Niccolini, and D. Sisalem, “SDRS: A Voice-over- Annual ACM Conference on Principles, Systems and Applications of
IP Spam Detection and Reaction System,” IEEE Security & Privacy IP Telecommunications (IPTCOMM), pp. 141–152, August 2010.
Magazine, vol. 6, pp. 52–59, November/December 2008. [80] A. Talevski, E. Chang, and T. Dillon, “Secure Mobile VoIP,” in Pro-
[59] D. Gritzalis and Y. Mallios, “A SIP-oriented SPIT Management Frame- ceedings of the International Conference on Convergence Information
work,” Computers & Security, vol. 27, pp. 136–153, October 2008. Technology, pp. 2108–2113, November 2007.
[60] P. Kolan, R. Dantu, and J. W. Cangussu, “Nuisance of a Voice Call,” [81] N. Kuntze, A. U. Schmidt, and C. Hett, “Non-Repudiation in Internet
ACM Transactions on Multimedia Computing, Communications and Telephony,” in Proceedings of the IFIP International Information
Applications (TOMCCAP), vol. 5, pp. 6:1–6:22, October 2008. Security Conference, pp. 361–372, May 2007.
[61] S. Dritsas, V. Dritsou, B. Tsoumas, P. Constantopoulos, and D. Gritza- [82] C.-H. Wang, M.-W. Li, and W. Liao, “A Distributed Key-Changing
lis, “OntoSPIT: SPIT management through ontologies,” Computer Mechanism for Secure Voice over IP (VoIP) Service,” in Proceedings of
Communications, vol. 32, pp. 203–212, January 2009. the IEEE International Conference on Multimedia and Expo, pp. 895–
[62] L. Kong, V. B. Balasubramaniyan, and M. Ahamad, “A Lightweight 898, July 2007.
Scheme for Securely and Reliably Locating SIP Users,” in Proceedings [83] V. K. Gurbani and V. Kolesnikov, “Work in Progress: A secure
of the 1st IEEE Workshop on VoIP Management and Security (VoIP and lightweight scheme for media keying in the Session Initiation
MaSe), pp. 9–17, April 2006. Protocol (SIP),” in Proceedings of the 4th Annual ACM Conference
[63] X.Wang, S. Chen, and S. Jajodia, “Tracking Anonymous Peer-to- on Principles, Systems and Applications of IP Telecommunications
Peer VoIP Calls on the Internet,” in Proceedings of the 12th ACM (IPTCOMM), pp. 35–44, August 2010.
Conference on Computer and Communications Security (CCS), pp. 81– [84] H. Hlavacs, W. N. Gansterer, H. Schabauer, J. Zottl, M. Petraschek,
91, November 2005. T. Hoeher, and O. Jung, “Enhancing ZRTP by using Computational
[64] S. Chen, X. Wang, and S. Jajodia, “On the Anonymity and Traceability Puzzles,” Journal of Universal Compter Science, vol. 14, no. 5,
of Peer-to-Peer VoIP Calls,” IEEE Network, vol. 20, pp. 32–37, pp. 693–716, 2008.
September/October 2006. [85] F. Palmieri and U. Fiore, “Providing True End-to-End Security in
[65] M. Srivatsa, A. Iyengar, and L. Liu, “Privacy in VoIP networks: A k- Converged Voice over IP Infrastructures,” Computers & Security,
Anonymity Approach,” in Proceedings of the 28th IEEE Conference on vol. 28, pp. 433–449, September 2009.
Computer Communication (INFOCOM), pp. 2856–2860, April 2009. [86] G. Zhang and S. Berthold, “Hidden VoIP Calling Records from
[66] G. Shah, A. Molina, and M. Blaze, “Keyboards and Covert Channels,” Networking Intermediaries,” in Proceedings of the 4th Annual ACM
in Proceedings of 15th USENIX Security Symposium, pp. 59–75, Conference on Principles, Systems and Applications of IP Telecommu-
July/August 2006. nications (IPTCOMM), pp. 15–24, August 2010.
[67] T. Takahashi and W. Lee, “An Assessment of VoIP Covert Channel [87] G. Zhang and S. Fischer-Hub̈ner, “Peer-to-Peer VoIP Communica-
Threats,” in Proceedings off the 3rd International Conference on tions Using Anonymisation Overlay Networks,” in Proceedings of the
Security and Privacy in Communications Networks (SecureComm), 11th Conference on Communications and Multimedia Security (CMS),
pp. 371–380, September 2007. May/June 2010.
[68] C. Wieser, J. Röning, and A. Takanen, “Security analysis and experi- [88] C. A. Melchor, Y. Deswarte, and J. Iguchi-Cartigny, “Closed-circuit
ments for Voice over IP RTP media streams,” in Proceedings of the 8th Unobservable Voice over IP,” in Proceedings of the 23rd Annual
International Symposium on Systems and Information Security (SSI), Computer Security Applications Conference (ACSAC), pp. 119–128,
November 2006. December 2007.
20 IEEE COMMUNICATIONS SURVEYS & TUTORIALS
[89] M. Srivatsa, L. Liu, and A. Iyengar, “Preserving Caller Anonymity in ment,” Telecommunication Systems, vol. 36, pp. 153–159, December
Voice-over-IP Networks,” in Proceedings of the IEEE Symposium on 2007.
Security and Privacy (S&P), pp. 50–63, May 2008. [109] D. Geneiatakis and C. Lambrinoudakis, “A Cost-Effective Mechanism
[90] A. D. Elbayoumy and S. Shepherd, “A High Grade Secure VoIP System for Protecting SIP Based Internet Telephony Services Against Signaling
Using the Tiny Encryption Algorithm,” in Proceedings of the 7th Attacks,” in Proceedings of the IMS and Mobile Multimedia Workshop,
Annual International Symposium on Advanced Radio Technologies, July 2008.
pp. 342–350, March 2005. [110] G. Ormazabal, S. Nagpal, E. Yardeni, and H. Schulzrinne, “Secure
[91] A. D. Elbayoumy and S. Shepherd, “QoS Control Using an Endpoint SIP: A Scalable Prevention Mechanism for DoS Attacks on SIP Based
CPU Capability Detector in a Secure VoIP System,” in Proceedings VoIP Systems,” in Proceedings of the 2nd International Conference
of the 10th IEEE Symposium on Computers and Communications, on Principles, Systems and Applications of IP Telecommunications
pp. 175–181, June 2005. (IPTComm), pp. 107–132, July 2008.
[92] A. D. Elbayoumy and S. Shepherd, “A High Grade Secure VoIP [111] S. Ehlert, G. Zhang, D. Geneiatakis, G. Kambourakis, T. Dagiuklas,
System Using an Endpoint CPU Capability Detector,” in Proceedings J. Markl, and D. Sisalem, “Two Layer Denial of Service Prevention
of the ITA05 International Conference on Internet Technologies and on SIP VoIP Infrastructures,” Computer Communications, vol. 31,
Applications, pp. 173–180, September 2005. pp. 2443–2456, June 2008.
[93] B. Reynolds and D. Ghosal, “Secure IP Telephony using Multi-layered [112] S. Ehlert, Y. Rebahi, and T. Magedanz, “Intrusion Detection System for
Protection,” in Proceedings of the ISOC Symposium on Network and Denial-of-Service Flooding Attacks in SIP Communication Networks,”
Distributed Systems Security (NDSS), February 2003. International Journal of Security and Networks, vol. 4, pp. 189–200,
[94] J. Larson, T. Dawson, M. Evans, and J. C. Straley, “Defending VoIP July 2009.
Networks from Distributed DoS (DDoS) Attacks,” in Proceedings [113] S. Ehlert, C. Wang, T. Magedanz, and D. Sisalem, “Specification-based
of the IEEE Global Telecommunications Conference (GLOBECOM), Denial-of-Service Detection for SIP Voice-over-IP Networks,” in Pro-
November/December 2004. ceedings of the 3rd International Conference on Internet Monitoring
[95] J. Larson, T. Dawson, M. Evans, and J. C. Straley, “Defending VoIP and Protection, pp. 59–66, July 2008.
Networks from DDoS Attacks,” in Proceedings of the 2nd Workshop [114] D. Geneiatakis, N. Vrakas, and C. Lambrinoudakis, “Utilizing Bloom
on Securing Voice over IP, June 2005. Filters for Detecting Flooding Attacks against SIP Based Services,”
[96] A. Bremler-Barr, R. Halachmi-Bekel, and K. Kangasharju, “Unregister Computers and Security, vol. 28, pp. 578–591, October 2009.
Attacks in SIP,” in Proceedings of the 2nd IEEE Workshop on Secure [115] D. Geneiatakis, N. Vrakas, and C. Lambrinoudakis, “Performance
Network Protocols, pp. 32–37, November 2006. Evaluation of a Flooding Detection Mechanism for VoIP Networks,”
[97] E. Y. Chen, “Detecting DoS Attacks on SIP Systems,” in Proceedings in Proceedings of the 16th International Workshop on Systems Signals
of the 1st IEEE Workshop on VoIP Management and Security (VoIP and Image Processing, pp. 1–5, June 2009.
MaSe), April 2006. [116] A. Awais, M. Farooq, and M. Y. Javed, “Attack Analysis & Bio-inspired
Security Framework for IP Multimedia Subsystem,” in Proceedings
[98] H. Sengar, H. Wang, D. Wijesekera, and S. Jajodia, “Fast Detection
of the GECCO Conference Companion on Genetic and Evolutionary
of Denial-of-Service Attacks on IP Telephony,” in Proceedings of the
Computation, pp. 2093–2098, July 2008.
14th IEEE International Workshop on Quality of Service (IWQoS),
[117] Y. Rebahi, M. Sher, and T. Magedanz, “Detecting Flooding Attacks
pp. 199–208, June 2006.
Against IP Multimedia Subsystem (IMS) Networks,” in Proceedings
[99] H. Sengar, H. Wang, D. Wijesekera, and S. Jajodia, “Detecting VoIP
of the IEEE/ACS International Conference on Computer Systems and
Floods Using the Hellinger Distance,” IEEE Transactions on Parallel
Applications, pp. 848–851, March/April 2008.
and Distributed Systems, vol. 19, pp. 794–805, June 2008.
[118] M. A. Akbar and M. Farooq, “Application of Evolutionary Algorithms
[100] J. Tang, Y. Cheng, and C. Zhou, “Sketch-based SIP Flooding Detection in Detection of SIP based Flooding Attacks,” in Proceedings of the
Using Hellinger Distance,” in Proceedings of the IEEE Global Telecom- Genetic and Evolutionary Computation Conference (GECCO), July
munications Conference (GLOBECOM), pp. 1–6, November/December 2009.
2009. [119] M. A. Akbar and M. Farooq, “RTP-Miner: A Real-time Security
[101] G. Zhang, S. Ehlert, T. Magedanz, and D. Sisalem, “Denial of Framework for RTP Fuzzing Attacks,” in Proceedings of the 20th
Service Attack and Prevention on SIP VoIP Infrastructures Using International Workshop on Network and Operating Systems Support
DNS Flooding,” in Proceedings of the 1st International Conference for Digital Audio and Video (NOSSDAV), June 2010.
on Principles, Systems and Applications of IP Telecommunications [120] M. Nassar, R. State, and O. Festor, “Monitoring SIP Traffic Using
(IPTCOMM), pp. 57–66, July 2007. Support Vector Machines,” in Proceedings of the Symposium on Recent
[102] J. Fiedler, T. Kupka, S. Ehlert, T. Magedanz, and D. Sisalem, “VoIP Advances in Intrusion Detection (RAID), pp. 311–330, September
Defender: Highly Scalable SIP-based Security Architecture,” in Pro- 2008.
ceedings of the 1st International Conference on Principles, Systems [121] M. Z. Rafique, M. A. Akbar, and M. Farooq, “Evaluating DoS Attacks
and Applications of IP Telecommunications (IPTComm), pp. 11–17, Against SIP-Based VoIP Systems,” in Proceedings of the IEEE Global
July 2007. Telecommunications Conference (GLOBECOM), November/December
[103] W. Conner and K. Nahrstedt, “Protecting SIP Proxy Servers from 2009.
Ringing-based Denial-of-Service Attacks,” in Proceedings of the 10th [122] M. A. Akbar, Z. Tariq, and M. Farooq, “A Comparative Study of
IEEE International Symposium on Multimedia (ISM), pp. 340–347, Anomaly Detection Algorithms for Detection of SIP Flooding in IMS,”
December 2008. in Proceedings of the International Conference on Internet Multimedia
[104] M. Luo, T. Peng, and C. Leckie, “CPU-based DoS Attacks Against Services Architecture and Applications (IMSAA), December 2008.
SIP Servers,” in Proceedings of the IEEE Network Operations and [123] P. Battistello, “Work in Progress: Inter-Domain and DoS-Reistant
Management Symposium (NOMS), pp. 41–48, April 2008. Call Establishment Protocol (IDDR-CEP),” in Proceedings of the 4th
[105] C. Fuchs, N. Aschenbruck, F. Leder, and P. Martini, “Detecting Annual ACM Conference on Principles, Systems and Applications of
VoIP-based DoS Attacks at the Public Safety Answering Point,” in IP Telecommunications (IPTCOMM), pp. 25–34, August 2010.
Proceedings of the ACM Aymposium on Information, Computer and [124] P. Truong, D. Nieh, and M. Moh, “Specification-based Intrusion
Communications Security (ASIACCS), pp. 148–155, March 2008. Detection for H.323-based Voice over IP,” in Proceedings of the
[106] N. Aschenbruck, M. Frank, P. Martini, J. Tolle, R. Legat, and H.-D. IEEE International Symposium on Signal Processing and Information
Richmann, “Present and Future Challenges Concerning DoS-attacks Technology, December 2005.
against PSAPs in VoIP Networks,” in Proceedings of the 4th IEEE [125] W. Mazurczyk and Z. Kotulski, “New Security and Control Protocol
International Workshop on Information Assurance (IWIA), pp. 103– for VoIP Based on Steganography and Digital Watermarking,” tech-
108, April 2006. nical report, Institute of Fundamental Technological Research, Polish
[107] C. Hyun-Soo, R. Jea-Tek, R. Byeong-hee, K. Jeong-Wook, and Academy of Sciences, June 2005.
J. Hyun-Cheol, “Detection of SIP De-Registration and Call-Disruption [126] Z. Kotulski and W. Mazurczyk, “Covert Channel for Improving VoIP
Attacks Using a Retransmission Mechanism and a Countermeasure Security,” in Proceedings of the 13th International Multi-Conference
Scheme,” in Proceedings of the IEEE International Conference on on Advanced Computer Systems (ACS), pp. 311–320, October 2006.
Signal Image Technology and Internet Based Systems (SITIS), pp. 650– [127] W. Mazurczyk and Z. Kotulski, “New VoIP Traffic Security Scheme
656, November/December 2008. with Digital Watermarking,” in Proceedings of International Con-
[108] D. Geneiatakis and C. Lambrinoudakis, “A Lightweight Protection ference on Computer Safety, Reliability, and Security (SafeComp),
Mechanism against Signaling Attacks in a SIP-based VoIP Environ- pp. 170–181, September 2006.
ANGELOS D. KEROMYTIS: A COMPREHENSIVE SURVEY OF VOIP SECURITY RESEARCH 21
[128] R. Zhang, X. Wang, X. Yang, and X. Jiang, “Billing Attacks on SIP- [152] Z. Anwar, W. Yurcik, R. E. Johnson, M. Hafiz, and R. H. Campbell,
based VoIP Systems,” in Proceedings of the 1st USENIX Workshop “Multiple Design Patterns for Voice over IP (VoIP) Security,” in
On Offensive Technologies (WOOT), pp. 1–8, August 2007. Proceedings of the IEEE Workshop on Information Assurance (WIA),
[129] H. Abdelnur, T. Avanesov, M. Rusinowitch, and R. State, “Abusing SIP held in conjunction with the 25th IEEE International Performance
Authentication,” in Proceedings of the 4th International Conference on Computing and Communications Conference, (IPCCC), April 2006.
Information Assurance and Security (ISIAS), pp. 237–242, September [153] D. Geneiatakis, T. Dagiuklas, G. Kambourakis, C. Lambrinoudakis,
2008. S. Gritzalis, K. S. Ehlert, and D. Sisalem, “Survey of Security Vulner-
[130] R. State, O. Festor, H. Abdelanur, V. Pascual, J. Kuthan, R. Coeffic, abilities in Session Initiation Protocol,” IEEE Communications Surveys
J. Janak, and J. Floroiu, “SIP digest authentication relay attack.” draft- & Tutorials, vol. 8, pp. 68–81, 3rd Quarter 2006.
state-sip-relay-attack-00, March 2009. [154] D. Geneiatakis, C. Lambrinoudakis, and G. Kambourakis, “An On-
[131] D. Geneiatakis, G. Kambourakis, and C. Lambrinoudakis, “A Mecha- tology Based Policy for Deploying Secure SIP-based VoIP Services,”
nism for Ensuring the Validity and Accuracy of the Billing Services in Computers and Security, vol. 27, pp. 285–297, October 2008.
IP Telephony,” in Proceedings of the 5th International Conference on [155] G. Me and D. Verdone, “An Overview of Some Techniques to Exploit
Trust, Privacy & Security in Digital Business (TrustBus), pp. 59–68, VoIP over WLAN,” in Proceedings of the International Conference on
September 2008. Digital Telecommunications (ICDT), pp. 67–73, August 2006.
[132] R. Ackermann, M. Schumacher, U. Roedig, and R. Steinmetz, “Vul- [156] R. Singhai and A. Sahoo, “VoIP Security,” technical report, Indian
nerabilities and Security Limitations of current IP Telephony Systems,” Institute of Technology, Mumbai — School of Information Technology,
in Proceedings of the Conference on Communications and Multimedia 2006.
Security (CMS), pp. 53–66, May 2001. [157] W. J. Rippon, “Threat Assessment of IP Based Voice Systems,” in
[133] P. Hunter, “VOIP the Latest Security Concern: DoS Attack the Greatest Proceedings of the 1st IEEE Workshop on VoIP Management and
Threat,” Network Security, vol. 2002, pp. 5–7, November 2002. Security (VoIP MaSe), pp. 19–28, April 2006.
[134] A. Batchvarov, “Security Issues and Solutions for Voice over IP [158] P. C. K. Hung and M. V. Martin, “Security Issues in VoIP Applica-
Compared to Circuit Switched Networks,” tech. rep., INFOTECH tions,” in Proceedings of the Canadian Conference on Electrical and
Seminar Advanced Communication Services (ACS), 2004. Computer Engineering (CCECE), pp. 2361–2364, May 2006.
[135] D. Bradbury, “The Security Challenges Inherent in VoIP ,” Computers [159] P. C. K. Hung and M. V. Martin, “Through the looking glass: Security
& Security, vol. 26, pp. 485–487, December 2007. issues in VoIP applications,” in Proceedings of the IADIS International
[136] J. Chau, “Security Issues Around the Deployment of VoIP and Multi- Conference on Applied Computing, February 2006.
media Protocols in Wireless and Firewalled Environments,” Computer [160] M. Zandi, M. V. Martin, and P. C. K. Hung, “Overview of Security
Fraud & Security, vol. 2006, pp. 14–16, August 2006. Issues of VOIP,” in Proceedings of the IASTED European Conference
[137] D. Sicker and T. Lookabaugh, “VoIP Security: Not an Afterthought,” on Internet and Multimedia Systems and Applications (IMSA), pp. 254–
ACM Queue Magazine, vol. 2, pp. 56–64, September 2004. 259, March 2007.
[138] S. Vuong and Y. Bai, “A Survey of VoIP Intrusions and Intrusion [161] J. Xin, “Security Issues and Countermeasure for VoIP ,” white paper,
Detection Systems,” in Proceedings of the 6th International Confer- SANS Institute, 2007.
ence on Advanced Communication Technology (ICACT), pp. 317–322, [162] D. Persky, “VoIP Security Vulnerabilities,” white paper, SANS Institute,
February 2004. 2007.
[139] D. Geneiatakis, G. Kambourakis, C. Lambrinoudakis, T. Dagiuklas, and [163] V. M. Quinten, R. van de Meent, and A. Pras, “Analysis of Techniques
S. Gritzalis, “SIP Message Tampering: THE SQL code INJECTION for Protection Against Spam over Internet Telephony,” in Proceedings
attack,” in Proceedings of 13th IEEE International Conference on of the 13th Open European Summer School and IFIP TC6.6 Workshop
Software, Telecommunications and Computer Networks (SoftCOM), (EUNICE), pp. 70–77, July 2007.
September 2005. [164] P. Hansen and A. Woodward, “Network Security—Is IP Telephony
[140] G. S. Tucker, “Voice Over Internet Protocol (VoIP) and Security,” white Helping The Cause?,” in Proceedings of the 5th Australian Information
paper, SANS Institute, 2005. Security Management Conference, pp. 73–79, December 2007.
[141] J. Posegga and J. Seedorf, “Voice Over IP: Unsafe at any Bandwidth?,” [165] P. James and A. Woodward, “Securing VoIP: A Framework to Mitigate
in Proceedings of the Eurescom Summit: Ubiquitous Services and or Manage Risks,” in Proceedings of the 5th Australian Information
Applications Exploiting the Potential, Apri 2005. Security Management Conference, pp. 103–116, December 2007.
[142] E. Edelson, “Voice over IP: Security Pitfalls,” Network Security, [166] D. Butcher, X. Li, and J. Guo, “Security Challenge and Defense in VoIP
vol. 2005, pp. 4–7, February 2005. Infrastructures,” IEEE Transactions on Systems, Man, and Cybernetics,
[143] J. Albers, B. Hahn, S. McGann, S. Park, and R. Zhu, “An Analysis Part C: Applications and Reviews, vol. 37, pp. 1152–1162, November
of Security Threats and Tools in SIP-Based VoIP Systems,” M.Sc. 2007.
Capstone Paper, Univesity of Colorado, Boulder, 2005. [167] P. Thermos and A. Takanen, Securing VoIP Networks. Pearson
Education, 2008.
[144] S. McGann and D. Sicker, “An Analysis of Security Threats and Tools in SIP-Based VoIP Systems,” in Proceedings of the 2nd VoIP Security Workshop, June 2005.
[145] D. Geneiatakis, G. Kambourakis, T. Dagiuklas, C. Lambrinoudakis, and S. Gritzalis, “SIP Security Mechanisms: A state-of-the-art review,” in Proceedings of the 5th International Network Conference (INC), July 2005.
[146] F. Cao and S. Malik, “Security Analysis and Solutions for Deploying IP Telephony in the Critical Infrastructure,” in Proceedings of the workshop of the 1st International Conference on Security and Privacy for Emerging Areas in Communication Networks, pp. 171–180, September 2005.
[147] F. Cao and S. Malik, “Vulnerability Analysis and Best Practices for Adopting IP Telephony in Critical Infrastructure Sectors,” IEEE Communications Magazine, vol. 44, pp. 138–145, April 2006.
[148] B. Allain, “VoIP Security Challenges and Approaches,” in Proceedings of the 2nd Workshop on Securing Voice over IP, June 2005.
[149] A. Adelsbach, A. Alkassar, K.-H. Garbe, M. Luzaic, M. Manulis, E. Scherer, J. Schwenk, and E. Siemens, “Voice over IP: Sichere Umstellung der Sprachkommunikation auf IP-Technologie.” Bundesanzeiger Verlag, 2005.
[150] D. R. Kuhn, T. J. Walsh, and S. Fries, “Security Considerations for Voice Over IP Systems.” US National Institute of Standards and Technology (NIST) Special Publication SP 800-58, January 2005.
[151] T. J. Walsh and D. R. Kuhn, “Challenges in Securing Voice over IP,” IEEE Security & Privacy Magazine, vol. 3, pp. 44–49, May/June 2005.
[168] A. Kurmus and J.-F. Garet, “Studying and Experimenting with Threats Against Voice over IP Systems,” Tech. Rep. Masters Thesis, EURECOM, 2009.
[169] D. Sisalem, J. Floroiu, J. Kuthan, U. Abend, and H. Schulzrinne, SIP Security. Wiley, 2009.
[170] V. K. Gurbani and V. Kolesnikov, “A Survey and Analysis of Media Keying Techniques in the Session Initiation Protocol (SIP),” IEEE Communications Surveys and Tutorials (to appear), 2011.
[171] Y. Rebahi, S. Ehlert, M. Theoharidou, J. Mallios, S. Dritsas, G. F. Marias, L. Mitrou, T. Dagiuklas, M. Avgoustianakis, D. Gritzalis, B. Pannier, O. Capsada, and J. Markl, “SPIT Threat Analysis,” Deliverable WP2/D2.1, SPIDER COOP-32720, January 2007.
[172] G. F. Marias, S. Dritsas, M. Theoharidou, J. Mallios, L. Mitrou, D. Gritzalis, T. Dagiuklas, Y. Rebahi, S. Ehlert, B. Pannier, O. Capsada, and J. F. Juell, “SPIT Detection and Handling Strategies for VoIP Infrastructures,” Tech. Rep. Deliverable WP2/D2.2, SPIDER COOP-32720, March 2007.
[173] S. Dritsas, J. Mallios, M. Theoharidou, G. F. Marias, and D. Gritzalis, “Threat Analysis of the Session Initiation Protocol Regarding Spam,” in Proceedings of the 26th IEEE International Performance Computing and Communications Conference (IPCCC), pp. 426–433, April 2007.
[174] G. F. Marias, S. Dritsas, M. Theoharidou, J. Mallios, and D. Gritzalis, “SIP Vulnerabilities and Anti-SPIT Mechanisms Assessment,” in Proceedings of the 16th International Conference on Computer Communications and Networks (ICCCN), pp. 597–604, August 2007.
[175] S. Dritsas, Y. Soupionis, M. Theoharidou, Y. Mallios, and D. Gritzalis, “SPIT Identification Criteria Implementation: Effectiveness and Lessons Learned,” in Proceedings of the 23rd IFIP TC11 International Information Security Conference (SEC), pp. 381–395, September 2008.
[176] N. d’Heureuse, J. Seedorf, S. Niccolini, and T. Ewald, “Protecting SIP-Based Networks and Services from Unwanted Communications,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM), pp. 1–5, November/December 2008.
[177] D. Sisalem, J. Kuthan, and S. Ehlert, “Denial of Service Attacks Targeting a SIP VoIP Infrastructure: Attack Scenarios and Prevention Mechanisms,” IEEE Network, vol. 20, pp. 26–31, September/October 2006.
[178] J. Seedorf, “Security challenges for peer-to-peer SIP,” IEEE Network, vol. 20, pp. 38–45, September/October 2006.
[179] C. Wieser, M. Laakso, and H. Schulzrinne, “Security Testing of SIP Implementations,” Tech. Rep. CUCS-024-03, Columbia University, Department of Computer Science, 2003.
[180] R. Kaksonen, M. Laakso, and A. Takanen, “Software Security Assessment through Specification Mutations and Fault Injection.”
[181] “CVE-2003-1109.” http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2003-1109, 2003.
[182] T. Berson, “Skype Security Evaluation,” October 2005.
[183] S. A. Baset and H. Schulzrinne, “An Analysis of the Skype Peer-to-Peer Telephony Protocol,” in Proceedings of IEEE INFOCOM, April 2006.
[184] P. Biondi and F. Desclaux, “Silver Needle in the Skype,” in BlackHat Europe Conference, March 2006. www.blackhat.com/presentations/bh-europe-06/bh-eu-06-biondi/bh-eu-06-biondi-up.pdf, (reverse engineer).
[185] P. Thermos and G. Hadsall, “Vulnerabilities in SOHO VoIP Gateways,” in Proceedings of the 2nd VoIP Security Workshop, June 2005.
[186] H. Scholz, “Attacking VoIP Networks,” in Proceedings of the 3rd Workshop on Securing Voice over IP, June 2006.
[187] H. Abdelnur, R. State, and O. Festor, “Fuzzing for Vulnerabilities in the VoIP Space,” in Proceedings of the 17th Annual Conference of the European Institute for Computer Anti-Virus Research (EICAR), May 2008.
[188] H. Abdelnur, R. State, and O. Festor, “KiF: A Stateful SIP Fuzzer,” in Proceedings of the 1st International Conference on Principles, Systems and Applications of IP Telecommunications, pp. 47–56, July 2007.
[189] H. Abdelnur, V. Cridlig, R. State, O. Festor, and J. Bourdellon, “VoIP Security Assessment: Methods and Tools,” in Proceedings of the 1st IEEE Workshop on VoIP Management and Security (VoIP MaSe), pp. 29–34, April 2006.
[190] H. Abdelnur, R. State, I. Chrisment, and C. Popi, “Assessing the Security of VoIP Services,” in Proceedings of the 10th IFIP/IEEE Symposium on Integrated Management (IM), pp. 373–382, May 2007.
[191] P. Gupta and V. Shmatikov, “Security Analysis of Voice-over-IP Protocols,” in Proceedings of the 20th IEEE Computer Security Foundations Symposium (CSFW), pp. 49–63, July 2007.
[192] J. Floroiu and D. Sisalem, “A Comparative Analysis of the Security Aspects of the Multimedia Key Exchange Protocols,” in Proceedings of the 3rd International Conference on Principles, Systems and Applications of IP Telecommunications (IPTComm), pp. 2:1–2:10, July 2009.
[193] J. Reason and D. Messerschmitt, “The Impact of Confidentiality on Quality of Service in Heterogeneous Voice over IP Networks,” in Proceedings of the IEEE Conference on Management of Multimedia Networks and Services, pp. 175–192, November 2001.
[194] A. D. Elbayoumy and S. Shepherd, “Stream or Block Cipher for Securing VoIP?,” International Journal of Network Security, vol. 5, pp. 128–133, September 2007.
[195] A. D. Elbayoumy and S. Shepherd, “A Comprehensive Secure VoIP Solution,” International Journal of Network Security, vol. 5, pp. 233–240, September 2007.
[196] S. Salsano, L. Veltri, and D. Papalilo, “SIP Security Issues: The SIP Authentication Procedure and its Processing Load,” IEEE Network, vol. 16, pp. 38–44, November/December 2002.
[197] N. Modadugu and E. Rescorla, “The Design and Implementation of Datagram TLS,” in Proceedings of the ISOC Symposium on Network and Distributed Systems Security (NDSS), February 2004.
[198] R. Barbieri, D. Bruschi, and E. Rosti, “Voice over IPsec: Analysis and Solutions,” in Proceedings of the 18th Annual Computer Security Applications Conference (ACSAC), pp. 261–270, December 2002.
[199] M. K. Ranganathan and L. Kilmartin, “Performance Analysis of Secure Session Initiation Protocol Based VoIP Networks,” Computer Communications, vol. 26, pp. 552–565, April 2003.
[200] D. Harkins and D. Carrel, “The Internet Key Exchange (IKE).” RFC 2409 (Proposed Standard), Nov. 1998. Obsoleted by RFC 4306, updated by RFC 4109.
[201] C. Kaufman, “Internet Key Exchange (IKEv2) Protocol.” RFC 4306 (Proposed Standard), Dec. 2005. Updated by RFC 5282.
[202] J. Bilien, “Key Agreement for Secure Voice over IP,” Master of Science Thesis IMIT/LCN 2003-14, Royal Institute of Technology, Sweden, December 2003.
[203] J. Bilien, E. Eliasson, and J.-O. Vatn, “Call Establishment Delay for Secure VoIP,” in Proceedings of the Workshop on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), March 2004.
[204] J. Bilien, E. Eliasson, J. Orrblad, and J.-O. Vatn, “Secure VoIP: Call Establishment and Media Protection,” in Proceedings of the 2nd Workshop on Securing Voice over IP, June 2005.
[205] H. Xiao and P. Zarrella, “Quality Effects of Wireless VoIP Using Security Solutions,” in Proceedings of the IEEE Military Communications Conference (MILCOM), vol. 3, pp. 1352–1357, October/November 2004.
[206] E. T. Lakay and J. I. Agbinya, “Security Issues in SIP Signaling in Wireless Networks and Services,” in Proceedings of the International Conference on Mobile Business, pp. 639–642, July 2005.
[207] C. Eun-Chul, C. Hyoung-Kee, and C. Sung-Jae, “Evaluation of Security Protocols for the Session Initiation Protocol,” in Proceedings of the 16th International Conference on Computer Communications and Networks (ICCCN), pp. 611–616, August 2007.
[208] C. Shen, E. Nahum, H. Schulzrinne, and C. P. Wright, “The Impact of TLS on SIP Server Performance,” in Proceedings of the 4th Annual ACM Conference on Principles, Systems and Applications of IP Telecommunications (IPTComm), pp. 63–74, August 2010.
[209] Y. Rebahi, J. J. Pallares, G. Kovacs, N. T. Minh, S. Ehlert, and D. Sisalem, “Performance Analysis of Identity Management in the Session Initiation Protocol (SIP),” in Proceedings of the IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), pp. 711–717, March/April 2008.
[210] A. Buschel, “Authentication in VoIP,” in Proceedings of the 2nd Workshop on Securing Voice over IP, June 2005.
[211] C. C. Yang, R. C. Wang, and W. T. Liu, “Secure Authentication Scheme for Session Initiation Protocol,” Computers and Security, vol. 24, pp. 381–386, August 2005.
[212] A. Durlanik and I. Sogukpinar, “SIP Authentication Scheme Using ECDH,” World Academy of Science, Engineering and Technology (WASET), vol. 8, pp. 350–353, September 2005.
[213] E.-J. Yoon and K.-Y. Yoo, “A New Authentication Scheme for Session Initiation Protocol,” in Proceedings of the International Conference on Complex, Intelligent and Software Intensive Systems, pp. 549–554, March 2009.
[214] J. L. Tsai, “Efficient Nonce-based Authentication Scheme for Session Initiation Protocol,” International Journal of Network Security (IJNS), vol. 9, pp. 12–16, July 2009.
[215] R. Srinivasan, V. Vaidehi, K. Harish, K. L. Narasimhan, S. L. Babu, and V. Srikanth, “Authentication of Signaling in VoIP Applications,” in Proceedings of the 11th Asia-Pacific Conference on Communications (APCC), pp. 530–533, October 2005.
[216] A. Mohammadi-nodooshan, Y. Darmani, R. Jalili, and M. Nourani, “A Robust and Efficient SIP Authentication Scheme,” in Proceedings of the 13th International CSI Computer Conference (CSICC), pp. 551–558, March 2008.
[217] H.-F. Huang and W.-C. Wei, “A New Efficient Authentication Scheme for Session Initiation Protocol,” in Proceedings of the Joint Conference on Information Sciences (JCIS), 9th International Conference on Computer Science and Informatics, October 2006.
[218] C.-C. Chang, Y.-F. Lu, A.-C. Pang, and T.-W. Kuo, “Design and Implementation of SIP Security,” in Proceedings of the International Conference On Information Networking (ICION), pp. 669–678, February 2005.
[219] C. C. Lee, “On Security of An Efficient Nonce-based Authentication Scheme for SIP,” International Journal of Network Security, vol. 9, pp. 201–203, November 2009.
[220] F. Cao and C. Jennings, “Providing Response Identity and Authentication in IP Telephony,” in Proceedings of the 1st International Conference on Availability, Reliability and Security (ARES), April 2006.
[221] K. Insu and K. Keecheon, “Secure Session Management Mechanism in VoIP Service,” in Proceedings of the Workshop on Ubiquitous Processing for Wireless Networks (UPWN), held in conjunction with the
5th International Symposium on Parallel and Distributed Processing and Applications (ISPA), pp. 96–104, August 2007.
[222] H. Schmidt, C.-T. Dang, and F. J. Hauck, “Proxy-based Security for the Session Initiation Protocol (SIP),” in Proceedings of the 2nd International Conference on Systems and Networks Communications (ICSNC), pp. 42–47, August 2007.
[223] F. Wang and Y. Zhang, “A New Provably Secure Authentication and Key Agreement for SIP Using Certificateless Public-Key Cryptography,” Computer Communications, vol. 31, pp. 2142–2149, June 2008.
[224] S. Al-Riyami and K. Paterson, “Certificateless Public Key Cryptography,” in Proceedings of AsiaCrypt, pp. 452–473, November/December 2003.
[225] J. Ring, K.-K. R. Choo, E. Foo, and M. Looi, “A New Authentication Mechanism and Key Agreement Protocol for SIP Using Identity-based Cryptography,” in Proceedings of AusCERT, R&D Stream, pp. 61–72, May 2006.
[226] K. Singh and S. Vuong, “Blaze: A Mobile Agent Paradigm for VoIP Intrusion Detection Systems,” in Proceedings of the 1st International Conference on E-Business and Telecommunication Networks (ICETE), August 2004.
[227] V. Casola, R. Chianese, A. Mazzeo, N. Mazzocca, and M. Rak, “A Policy-based Design Methodology and Performance Evaluation Framework for a Secure VoIP Infrastructure,” in Proceedings of the International Conference on E-Business and Telecommunication Networks (ICETE), August 2004.
[228] V. Casola, M. Rak, A. Mazzeo, and N. Mazzocca, “Security Design and Evaluation in a VoIP Secure Infrastructure: A Policy Based Approach,” in Proceedings of the International Conference on Information Technology: Coding and Computing, pp. 727–732, April 2005.
[229] Y. Wu, S. Bagchi, S. Garg, and N. Singh, “SCIDIVE: A Stateful and Cross Protocol Intrusion Detection Architecture for Voice-over-IP Environments,” in Proceedings of the Conference on Dependable Systems and Networks (DSN), pp. 433–442, June/July 2004.
[230] V. Apte, Y.-S. Wu, S. Bagchi, S. Garg, and N. Singh, “SPACEDIVE: A Distributed Intrusion Detection System for Voice-over-IP Environments (Fast Abstract),” in Proceedings of the International Conference on Dependable Systems and Networks (DSN), pp. 25–28, June 2006.
[231] Y.-S. Wu, V. Apte, S. Bagchi, S. Garg, and N. Singh, “Intrusion Detection in Voice over IP Environments,” International Journal of Information Security, vol. 8, pp. 153–172, June 2009.
[232] M. V. Martin and P. C. K. Hung, “Towards a Security Policy for VoIP Applications,” in Proceedings of the Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 65–68, May 2005.
[233] D. Sisalem, S. Ehlert, D. Geneiatakis, G. Kambourakis, T. Dagiuklas, J. Markl, M. Rokos, O. Botron, J. Rodriguez, and J. Liu, “Towards a Secure and Reliable VoIP Infrastructure,” Tech. Rep. Deliverable D2.1, SNOCER COOP-005892, May 2005.
[234] T. Dagiuklas, D. Geneiatakis, G. Kambourakis, D. Sisalem, S. Ehlert, J. Fiedler, J. Markl, M. Rokis, O. Botron, J. Rodriguez, and J. Liu, “General Reliability and Security Framework for VoIP Infrastructures,” Tech. Rep. Deliverable D2.2, SNOCER COOP-005892, September 2005.
[235] D. Geneiatakis and C. Lambrinoudakis, “An Ontology Description for SIP Security Flaws,” Computer Communications, vol. 30, pp. 1367–1374, April 2007.
[236] S. Niccolini, R. G. Garroppo, S. Giordano, G. Risi, and S. Ventura, “SIP Intrusion Detection and Prevention: Recommendations and Prototype Implementation,” in Proceedings of the 1st IEEE Workshop on VoIP Management and Security (VoIP MaSe), pp. 47–52, April 2006.
[237] W. Marshall, A. F. Faryar, K. Kealy, G. de los Reyes, I. Rosencrantz, R. Rosencrantz, and C. Spielman, “Carrier VoIP Security Architecture,” in Proceedings of the 12th International Telecommunications Network Strategy and Planning Symposium, pp. 1–6, November 2006.
[238] M. Sher and T. Magedanz, “Protecting IP Multimedia Subsystem (IMS) Service Delivery Platform from Time Independent Attacks,” in Proceedings of the 3rd International Symposium on Information Assurance and Security (IAS), pp. 171–176, August 2007.
[239] Y. Ding and G. Su, “Intrusion Detection System for Signal-based SIP Attacks Through Timed HCPN,” in Proceedings of the 2nd International Conference on Availability, Reliability and Security (ARES), pp. 190–197, April 2007.
[240] M. Nassar, R. State, and O. Festor, “VoIP Honeypot Architecture,” in Proceedings of the 10th IFIP/IEEE International Symposium on Integrated Network Management, pp. 109–118, May 2007.
[241] M. Nassar, S. Niccolini, R. State, and T. Ewald, “Holistic VoIP Intrusion Detection and Prevention System,” in Proceedings of the 1st International Conference on Principles, Systems and Applications of IP Telecommunications (IPTComm), pp. 1–9, July 2007.
[242] B. I. A. Barry and H. A. Anthony, “On the Performance of a Hybrid Intrusion Detection Architecture for Voice over IP Systems,” in Proceedings of the 4th International Conference on Security and Privacy in Communication Networks (SecureComm), pp. 1–10, September 2008.
[243] K. Rieck, S. Wahl, P. Laskov, P. Domschitz, and K.-R. Müller, “A Self-learning System for Detection of Anomalous SIP Messages,” in Proceedings of the 2nd International Conference on Principles, Systems and Applications of IP Telecommunications. Services and Security for Next Generation Networks: Second International Conference, (IPTComm), pp. 90–106, July 2008.
[244] G. F. Cretu, A. Stavrou, M. E. Locasto, S. J. Stolfo, and A. D. Keromytis, “Casting out Demons: Sanitizing Training Data for Anomaly Sensors,” in Proceedings of the IEEE Security and Privacy Symposium, pp. 81–95, May 2008.
[245] R. Dantu, S. Fahmy, H. Schulzrinne, and J. Cangussu, “Issues and Challenges in Securing VoIP,” Computers & Security (to appear), 2010.
[246] B. Reynolds and D. Ghosal, “STEM: Secure Telephony Enabled Middlebox,” IEEE Communications Magazine, vol. 40, pp. 52–58, October 2002.
[247] U. Roedig, R. Ackermann, and R. Steinmetz, “Evaluating and Improving Firewalls for IP-Telephony Environments,” in Proceedings of the 1st IP Telephony Workshop, April 2000.
[248] J. Kuthan, “Internet Telephony Traversal Across Decomposed Firewalls and NATs,” in Proceedings of the 2nd IP Telephony Workshop, April 2001.
[249] P. Sijben, W. van Willigenburg, M. de Boer, and S. van der Gaast, “Middleboxes: Controllable Media Firewalls,” Bell Labs Technical Journal, vol. 7, pp. 141–157, August 2002.
[250] D. T. Stott, “SAFENeT: Server-based Architecture For Enterprise NAT and Firewall Traversal,” in Proceedings of the 2nd VoIP Security Workshop, June 2005.
[251] T. Bessis, A. Rana, and V. K. Gurbani, “Session Initiation Protocol (SIP) Firewall for Internet Multimedia Subsystem (IMS) Core,” Bell Labs Technical Journal, Fall 2010.
[252] V. K. Gurbani, D. Willis, and F. Audet, “Cryptographically Transparent Session Initiation Protocol (SIP) Proxies,” in Proceedings of the IEEE International Conference on Communications (ICC), pp. 1185–1190, June 2007.
[253] H. Sengar, D. Wijesekera, S. Jajodia, and R. Dantu, “Securing VoIP from Signaling Network Vulnerabilities,” in Proceedings of the 2nd Workshop on Securing Voice over IP, June 2005.
[254] H. Sengar, R. Dantu, and D. Wijesekera, “Securing VoIP and PSTN from Integrated Signaling Network Vulnerabilities,” in Proceedings of the 1st IEEE Workshop on VoIP Management and Security, pp. 1–7, April 2005.
[255] H. Sengar, D. Wijesekera, H. Wang, and S. Jajodia, “VoIP Intrusion Detection Through Interacting Protocol State Machines,” in Proceedings of the International Conference on Dependable Systems and Networks (DSN), pp. 393–402, June 2006.
[256] S. Ehlert, G. Zhang, and T. Magedanz, “Increasing SIP Firewall Performance by Ruleset Size Limitation,” in Proceedings of the 19th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 1–6, September 2008.
[257] M. Mandjes, I. Saniee, and A. L. Stolyar, “Load Characterization and Anomaly Detection for Voice over IP Traffic (Extended Abstract),” in Proceedings of the ACM SIGMETRICS Conference, June 2001.
[258] M. Mandjes, I. Saniee, and A. L. Stolyar, “Load Characterization and Anomaly Detection for Voice over IP Traffic,” IEEE Transactions on Neural Networks, vol. 16, pp. 1019–1026, September 2005.
[259] D. Geneiatakis, G. Kambourakis, T. Dagiuklas, C. Lambrinoudakis, and S. Gritzalis, “A Framework for Detecting Malformed Messages in SIP Networks,” in Proceedings of the 14th IEEE Workshop on Local and Metropolitan Area Networks (LANMAN), September 2005.
[260] D. Geneiatakis, G. Kambourakis, T. Dagiuklas, C. Lambrinoudakis, and S. Gritzalis, “A Framework for Detecting Malformed Messages in SIP Networks,” Computer Networks: The International Journal of Computer and Telecommunications Networking, vol. 51, pp. 2580–2593, July 2007.
[261] D. Geneiatakis, T. Dagiuklas, C. Lambrinoudakis, G. Kambourakis, and S. Gritzalis, “Novel Protecting Mechanism for SIP-based Infrastructure against Malformed Message Attacks: Performance Evaluation Study,” in Proceedings of the 5th International Conference on Communication Systems, Networks and Digital Signal Processing (CSNDSP), pp. 261–266, July 2006.