The Serverless Computing Survey: A Technical Primer For Design Architecture
ZIJUN LI, LINSONG GUO, JIAGAN CHENG, and QUAN CHEN, Shanghai Jiao Tong University
BINGSHENG HE, National University of Singapore
MINYI GUO, Shanghai Jiao Tong University
The development of cloud infrastructures inspires the emergence of cloud-native computing. As the most
promising architecture for deploying microservices, serverless computing has recently attracted growing attention in both industry and academia. Due to its inherent scalability and flexibility, serverless computing has become attractive and increasingly pervasive for ever-growing Internet services. Despite the momentum in the cloud-native community, existing challenges and compromises still await more advanced research and solutions to further explore the potential of the serverless computing model. As a contribution to this
knowledge, this article surveys and elaborates the research domains in the serverless context by decoupling
the architecture into four stack layers: Virtualization, Encapsule, System Orchestration, and System Coordi-
nation. Inspired by the security model, we highlight the key implications and limitations of these works in
each layer, and make suggestions for potential challenges to the field of future serverless computing.
CCS Concepts: • Computer systems organization → Cloud computing; n-tier architectures; • Networks
→ Cloud computing; • Theory of computation → Parallel computing models;
Additional Key Words and Phrases: Serverless computing, architecture design, FaaS, Lambda paradigm
1 INTRODUCTION
1.1 Definition of Serverless Computing
Traditional Infrastructure-as-a-Service (IaaS) deployment mode demands a long-term running
server for sustainable service delivery. However, this exclusive allocation needs to retain resources
regardless of whether the user application is running or not. Consequently, resource utilization in current data centers stays low, at only about 10% on average, especially for online services with a diurnal pattern. This contradiction motivates the development of a platform-managed, on-demand service model that attains higher resource utilization and lowers cloud computing costs.
To this end, serverless computing was put forward, and most large cloud vendors such as Amazon,
Google, Microsoft, IBM, and Alibaba have already offered such elastic computing services.
In the following, we will first review the definition given in Berkeley View [65], and then we
will give a broader definition. We believe that a narrow perception of the Function-as-a-Service
(FaaS)-based serverless model may weaken its advancement. So far, there is no formal definition of serverless computing. The commonly acknowledged definitions from the Berkeley View [65] are presented as follows:
• Serverless Computing = FaaS (Function-as-a-Service) + BaaS (Backend-as-a-Service). One fallacy is that serverless is interchangeable with FaaS, as revealed in a recent interview [78]. To be precise, both are essential to serverless computing. The FaaS model enables function isolation and invocation, whereas Backend-as-a-Service (BaaS) provides overall backend support for online services.
• In the FaaS model (aka the Lambda paradigm), an application is sliced into functions or
function-level microservices [26, 45, 57, 65, 117, 141]. The function identifier, the language
runtime, the memory limit of one instance, and the function code blob URI (Uniform Re-
source Identifier) together define the existence of a function [94].
• BaaS covers the wide range of services that an application relies on—for example, cloud storage (Amazon S3 and DynamoDB), message bus systems for passing (Google Cloud Pub/Sub), message notification services (Amazon SNS), and DevOps tools (Microsoft Azure DevOps).
To depict the serverless computing model, we take the asynchronous invocation in Figure 1 as an
example. The serverless system receives triggered API queries from the users, validates them, and
invokes the functions by creating new sandboxes (aka the cold startup [15, 28, 65]) or reusing run-
ning warm ones (aka the warm startup). The isolation ensures that each function invocation runs
in an individual container or a Virtual Machine (VM) assigned from an access-control controller.
Due to the event-driven and single-event processing nature, the serverless system can be triggered
to provide on-demand isolated instances and scale them horizontally according to the actual ap-
plication workload. Afterward, each execution worker accesses a backend database to save execu-
tion results [23]. By further configuring triggers and bridging interactions, users can customize the execution for complex applications (e.g., building internal event calls in a {Fn_A, Fn_B, Fn_C} pipeline).
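To make this dispatch logic concrete, the following minimal sketch (our own illustration; names such as Sandbox, warm_pool, and results_db are hypothetical and do not come from any vendor's implementation) decides between a cold and a warm startup and saves the result to a backend store:

```python
# Minimal sketch of the asynchronous invocation path described above.
import time, uuid

warm_pool = {}    # function name -> list of idle warm sandboxes
results_db = {}   # stands in for the backend result store (BaaS)

class Sandbox:
    def __init__(self, fn_name):
        self.fn_name = fn_name
        time.sleep(0.5)   # cold startup: create the sandbox, load runtime and code

def invoke(fn_name, handler, event):
    idle = warm_pool.setdefault(fn_name, [])
    sandbox = idle.pop() if idle else Sandbox(fn_name)   # warm vs. cold startup
    try:
        results_db[uuid.uuid4().hex] = handler(event)    # save result to backend
    finally:
        idle.append(sandbox)                             # keep the sandbox warm

invoke("hello", lambda ev: f"hi {ev['user']}", {"user": "alice"})
```

The second invocation of the same function reuses the warm sandbox and skips the 0.5-second cold startup penalty, which is exactly the behavior the scaling-to-zero discussion below trades against.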
In the broader scenario, we believe that the serverless computing model should be identified
with the following features:
ACM Computing Surveys, Vol. 54, No. 10s, Article 220. Publication date: September 2022.
The Serverless Computing Survey 220:3
• Auto-scaling: Auto-scalability should not be only narrowed to the FaaS model (e.g., container
black boxes as scheduling units in OpenWhisk [134]). The indispensable factor in identify-
ing a serverless system is performing horizontal and vertical scaling when accommodat-
ing workload dynamics. Allowing an application to scale the number of instances to zero
also introduces a worrisome challenge—cold startup. When a function experiences the cold
startup, instances need to start from scratch, initialize the software environment, and load
application-specific code. These steps can significantly drag down the service response, lead-
ing to QoS (Quality-of-Service) violations.
• Flexible scheduling: Since the application is no longer bound to a specific server, the server-
less controller dynamically schedules applications according to the resource usage in the
cluster while ensuring load balancing and performance assurances. Moreover, the server-
less platform also takes the multi-region collaboration into account [154]. For a more robust
and available serverless system, flexible scheduling allows the workload queries to be dis-
tributed across a broader range of regions [119]. It avoids serious performance degradation
or damage to the service continuity in case of unavailable or crash nodes.
• Event-driven: The serverless application is triggered by events such as the arrival of RESTful
HTTP queries, the update of a message queue, or new data to a storage service. By binding
events to functions with triggers and rules, the controller and functions can use metadata
encapsulated in context attributes. It makes relationships between events and the system
detectable, enabling different collaboration responses to different events. The Cloud-Native
Computing Foundation (CNCF) serverless group also published CloudEvents specifications
for commonly describing event metadata to provide interoperability.
• Transparent development: On the one hand, managing underlying host resources will no
longer be a bother for application maintainers, as they are agnostic about the execution
environment. Simultaneously, cloud vendors should ensure isolated sandboxes, reliable ex-
ecution environment, available physical nodes, software runtimes, and computing power
while making them transparent to maintainers. On the other hand, serverless computing
should also integrate DevOps tools to help deploy and iterate more efficiently.
• Pay-as-you-go: The serverless billing model shifts the cost of computing power from a capital
expense to an operating expense. This model eliminates the requirement from users to buy
exclusive servers based on the peak load. By sharing network, disk, CPU, memory, and other resources, the pay-as-you-go model bills only the resources that applications actually use [1, 2, 26], no matter whether the instances are running or idle.
We regard an elastic computing model with the preceding five features incorporated as the
key to the definition of serverless computing. Along with the serverless emergence, application
maintainers would find it more attractive that resource pricing is billed based on the actual pro-
cessing events of an application rather than the pre-assigned resources [2]. Today, serverless computing is commonly applied in backend scenarios for batch jobs, including data analytics (e.g., the distributed computing model in PyWren [64]), ML (Machine Learning) tasks (e.g., deep learning) [78, 111], and event-driven web applications.
Fig. 2. General layered implementation of the serverless architecture, and security models (bottom-up logic)
in the Virtualization, Encapsule, and System layers.
Without a common architectural abstraction, implementations will lack high portability and compatibility across various serverless systems. To this end, this survey proposes a layered design and summarizes the research domains from different views, helping researchers and practitioners further understand the nature of serverless computing.
As shown in Figure 2, we analyze its design architecture with a bottom-up logic and decouple the
serverless computing architecture into four stack layers: Virtualization, Encapsule, System Orches-
tration, and System Coordination. We also abstract the security model in each layer (the System
Orchestration layer and System Coordination layer are merged).
Virtualization layer. The Virtualization layer enables function isolation within a performance
and functionality secured sandbox. The sandbox serves as the runtime for application service
code, runtime environment, dependencies, and system libraries. To prevent access to resources
in the multi-application or multi-tenant scenarios, cloud vendors usually adopt containers/VMs
to achieve isolation. Currently, the popular sandbox technologies are Docker [41], gVisor [49],
Kata [67], Firecracker [3], and Unikernel [86]. The security model answers how to provide reliable
runtime environments for different tenants and guarantee security on the cloud platform. Section 2
introduces these function isolation solutions and analyzes their pros and cons.
Encapsule layer. Various middlewares in the Encapsule layer enable customized function trig-
gers and executions, as well as collecting data metrics for communicating and monitoring. We call
all these additional middlewares the sidecar. It separates other features from the service business
logic and enables loose coupling between the functions and the underlying platform. Meanwhile,
to speed up instance startup and initialization, the prewarm pool is commonly used in the En-
capsule layer [44, 97, 104, 105, 118, 146]. Serverless systems may use prediction by analyzing the load pattern to prewarm each function in a one-to-one approach, or build a template for all functions that dynamically installs requirements (REQs) according to runtime characteristics in a one-for-all approach. The security model resolves privacy concerns by introducing a user-level or system-level
analyzer when loading users’ private requirements. We introduce those concepts in Section 3.
System Orchestration layer. The System Orchestration layer allows users to configure triggers
and bind rules, ensuring the high availability and stability of the user application by dynamically
adjusting as load changes. Through the cloud orchestrator, the combination of online and offline
scheduling can avoid resource contention, recycle idle resources, and ease the performance degra-
dation for co-located functions. The preceding implementations are also typically integrated into
container orchestration services (e.g., Google Kubernetes and Docker Swarm). However, in the
serverless system, the resource monitor, controller, and load balancer are consolidated to resolve
scheduling challenges [4, 32, 50, 57, 66, 70, 88, 139]. They enable the serverless system to achieve
scheduling optimizations in three different levels: resource-level, instance-level, and application-
level, respectively. The security model deals with robust performance when serverless applications
have more fragmented boundaries. Section 4 analyzes the methodology from three angles.
System Coordination layer. The System Coordination layer consists of a series of BaaS compo-
nents that use unified APIs and SDKs to integrate backend services into functions. Distinctly, it
differs from the traditional middlewares that use local physical services outside the cloud. These
BaaS services provide the storage, queue service [94, 99], trigger binding [75, 77], API gateway,
data cache [6, 7], DevOps tools [24, 25, 63, 122], and other customized components for better meet-
ing the System Orchestration layer’s flexibility requirements. Section 5 discusses these essential
BaaS components in a serverless system.
Each stack layer plays an essential role in the serverless architecture. Therefore, based on the
preceding hierarchy, we conclude the contributions of this survey as follows:
(1) Introduce the serverless definition and summarize the features.
(2) Elaborate the architecture design based on a four-layer hierarchy, and review the significant
and representative works in each layer.
(3) Analyze the security model of each layer based on the four-layered architecture.
(4) Explore the challenges, limitations, and opportunities in serverless computing.
The rest of the survey is organized as follows. Sections 2 through 5 introduce the four stack
layers and elaborate current research domains in serverless computing. Section 6 analyzes several
factors that degrade performance and compares the current production serverless systems. Finally,
we summarize and outline the challenges and opportunities in Sections 7 and 8.
2 VIRTUALIZATION LAYER
A user function invoked in the serverless runtime will be loaded and executed within a virtualized
sandbox. A function can either reuse a warm sandbox or create a new one, but usually not co-run
with different user functions. On this premise, most of the concerns in virtualization are isolation,
flexibility, and low startup latency. The isolation ensures that each application process runs in
the demarcated resource space, and the running process can avoid interference by others. The
flexibility requires the ability to test and debug, and the additional support for extending the system.
Low startup latency requires a fast response for the sandbox creation and initialization. The current
sandboxing mechanism in the Virtualization layer is broken into four representative categories:
traditional VM, container, secure container, and Unikernel. Table 1 compares these mainstream
approaches in several respects.
In the table, “Startup Latency” represents the response latency of cold startup. “Isolation Level”
indicates the capacity of functions to run without interference from others. “OS kernel” shows whether the guest kernel is shared with the host. “Hotplug” allows the function instance to start with
minimal resources (CPU, memory, virtio blocks) and add additional resources at runtime. “OCI Supported” indicates whether it complies with the Open Container Initiative (OCI), an open governance structure for expressing container formats and runtimes. Moreover, a “✓” in all survey tables means that the technique or strategy is used; a blank cell means it is not.
Table 1. Comparison of Mainstream Virtualization Approaches

| Virtualization | Startup Latency (ms) | Isolation Level | OS kernel | Hotplug | Hypervisor | OCI Supported | Backed by |
|---|---|---|---|---|---|---|---|
| Traditional VM | >1,000 | Strong | Unsharing | ✓ | ✓ | | / |
| Docker [41] | 50–500 | Weak | Host-sharing | ✓ | | ✓ | Docker |
| SOCK [101] | 10–50 | Weak | Host-sharing | ✓ | | ✓ | / |
| Hyper-V [58] | >1,000 | Strong | Unsharing | ✓ | ✓ | ✓ | Microsoft |
| gVisor [49] | 100–500 | Strong | Unsharing | ✓ | | ✓ | Google |
| Kata [67] | 100–500 | Strong | Unsharing | ✓ | ✓ | ✓ | OpenStack |
| FireCracker [3] | 100–500 | Strong | Unsharing | ✓ | ✓ | | Amazon |
| Unikernel [86] | 10–50 | Strong | Built-in | | ✓ | | Docker |
The traditional VM-based isolation adopts a Virtual Machine Manager (VMM) (e.g., hyper-
visor) that provides virtualization capabilities to guests. It can also mediate access to all shared
resources through provided interfaces (or using QEMU/KVM). With snapshots, the VM shows high flexibility in quick failsafe when patches are performed on applications within each VM instance. Though the VM provides a strong isolation mechanism and flexibility, it lacks the benefit of low startup latency for user applications (usually >1,000 ms). This tradeoff is fundamental in serverless computing, where a function itself is negligible in size and duration while the relative overhead of the VMM and guest kernel is high.
Container customization: Provide high flexibility and performance. Another common
function isolation mechanism in serverless computing is using containers. The container engine
leverages the Linux kernel to isolate resources and create containers as different processes in the
host [19, 92]. Each container shares the host kernel with the read-only attribute, typically includ-
ing binaries and libraries. High flexibility also comes from the UnionFS (Union File System), which combines read-only and read-write layers into a layered container image. Essentially, a container achieves isolation through Linux namespaces, which let processes share the same system kernel while remaining separated, and cgroups, which set resource limits. Without
hardware isolation, container-based sandboxing shows lower startup latency than coarse-grained
consolidation strategies [11, 147] in hypervisor-based VMs.
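The two kernel mechanisms just named can be exercised directly; the sketch below is our own minimal illustration (it assumes a Linux host with cgroup v2 mounted and root privileges, and the cgroup path /sys/fs/cgroup/fn-sandbox is hypothetical), not how any production container engine is implemented:

```python
# Minimal sketch: namespaces provide the isolation "walls", cgroups the limits.
import ctypes, os

CLONE_NEWUTS, CLONE_NEWNS, CLONE_NEWPID = 0x04000000, 0x00020000, 0x20000000
libc = ctypes.CDLL("libc.so.6", use_errno=True)

def run_isolated(cmd, mem_limit="256M"):
    # Give subsequent children new UTS/mount/PID namespaces via unshare(2).
    if libc.unshare(CLONE_NEWUTS | CLONE_NEWNS | CLONE_NEWPID) != 0:
        raise OSError(ctypes.get_errno(), "unshare failed")
    pid = os.fork()
    if pid == 0:                       # child: PID 1 inside the new namespace
        os.execvp(cmd[0], cmd)
    # Parent: cap the child's memory through a cgroup v2 limit.
    cg = "/sys/fs/cgroup/fn-sandbox"   # hypothetical cgroup path
    os.makedirs(cg, exist_ok=True)
    with open(os.path.join(cg, "memory.max"), "w") as f:
        f.write(mem_limit)
    with open(os.path.join(cg, "cgroup.procs"), "w") as f:
        f.write(str(pid))              # host-view PID of the sandboxed process
    os.waitpid(pid, 0)

run_isolated(["/bin/sh", "-c", "hostname fn-box && hostname"])
```

The hostname change is visible only inside the new UTS namespace, which is the same property that keeps one function's environment invisible to another on a shared host kernel.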
The representative container engine is Docker [41]. Docker packages software into a standard-
ized RunC container adapted to the environment requirements, including libraries, system tools,
code, and runtime. The Docker container has been widely employed in various serverless systems
for its lightweight nature. Some works further optimize the container runtime for better adaption
to the application requirements in the serverless system. SOCK [101] proposes an integration solu-
tion for serverless RunC containers, where redundant features in Docker containers are discarded
in this lean container. By only constructing a root file system, creating communication channels,
and imposing isolation boundaries, the SOCK container makes serverless systems run more effi-
ciently in startup latency and throughput. The startup latency of the SOCK container is reduced
to 10 to 50 ms, compared with Docker containers that usually take 50 to 500 ms. Taking the opposite approach to condensing redundancy in lean containers, since additional tools (e.g., debuggers, editors, coreutils, shell) enrich
the container and increase the image size, CNTR [130] splits the container image into “fat” and
“slim” parts. A user can independently deploy the “slim” image and expand it with additional tools
by dynamically attaching the “fat” image to the former. The evaluation of CNTR shows that the
proposed mechanism can significantly improve the overall performance and effectively reduce the
image size when extensively applied in the data center.
Secure container: Compromise security with high flexibility and performance. By re-
viewing our security model of the Virtualization layer in Figure 2, security concerns arise for
the relatively low isolation level of containers. Side-channel attacks such as Meltdown [84],
Zombieload [114], and Spectre [72] prompt mitigation approaches toward vulnerabilities. On the
one hand, container isolation should prevent privilege escalation and information or communication disclosure through side channels [3]. On the other hand, untrusted code from user functions should not be allowed full access to the host kernel. Any process-based solution implies a relaxation of the security model, which makes it insufficient for mutually untrusted functions. It requires carefully crafting function containers and arbitrarily restricting permissions in the case of a shared kernel architecture. The state-of-the-art solution to this issue is leveraging secure containers. For
example, Microsoft proposes their Hyper-V Container for Windows [58]. Hyper-V offers enhanced
security and broader compatibility. Each instance runs inside a highly optimized microVM and does
not share its kernel with others on the same host. However, it is still a heavyweight virtualization that can introduce more than 1,000 ms of startup latency. In Google gVisor [49], a guest kernel acts as a nonprivileged process that intercepts and restricts syscalls issued in userspace. However, the overhead introduced by intercepting and processing syscalls in the sandbox is high. As a result, it is not well suited for applications with heavy syscalls. To isolate different tenants with affordable over-
head, FireCracker [3] creates microVMs by customizing VMM for cloud-native applications. Each
Firecracker sandbox runs in userspace and is restricted by Seccomp, cgroup, and Namespace poli-
cies. Hardware and hypervisor-based virtualization help FireCracker limit access to the privileged
domain and host kernel for guests. With a container engine built-in microVMs, Kata [67] adopts
an agent to communicate with the kata-proxy located on the host through the hypervisor, thus
achieving a secure environment in a lightweight manner. Both FireCracker and Kata containers can significantly reduce startup latency and memory consumption; each needs only 100 to 500 ms to start a sandbox. Secure containers can provide complete and strong isolation for the host
kernel and other tenants, at the cost of the limited flexibility in condensed microVMs. However,
the startup latency of an instance is still long due to the additional application initialization, such
as JVM or Python interpreter setup.
Specialized Unikernel: Enhance flexibility with high security and performance. Another
emerging virtualization technique is Unikernel [86], which leverages libraryOS, including a series
of essential dependent libraries to construct a specialized, single-address-space machine image.
Because the Unikernel runs as a built-in GuestOS, the compile-time invariance rules out runtime
management, which significantly reduces the applicability and flexibility of Unikernel. However,
unnecessary programs or tools such as ls, cd, and tar are not included, so the image size of a Unikernel is smaller (e.g., 2 MB for mirage-skeleton [95] compiled on Xen), the startup latency is much lower (e.g., starting within 10 ms), and the security is stronger than containers. Based on it, LightVM [90] replaces the time-consuming XenStore and implements a split toolstack, separating functionality that can be prepared periodically in advance from that which must be carried out at VM creation time, thus
improving efficiency and reducing VM startup latency. From the perspective of software ecosys-
tem, to solve the challenge that traditional applications are struggling to be transplanted to the
Unikernel model [86, 113], Olivier et al. [102] propose HermitTux, a Unikernel model compati-
ble with Linux binary. HermitTux makes the Unikernel model compatible with Linux Application
Binary Interface while retaining the benefits of Unikernel. However, Unikernel is not adaptable
for developers once built, making it inherently inflexible for applications, let alone the terrible
DevOps environment. Furthermore, in heterogeneous clusters, the heterogeneity of the underly-
ing hardware forces Unikernel to update as drivers change, making it the antithesis of serverless
philosophy.
Tradeoffs among security, performance, and flexibility. Last, we plot these four technologies in Figure 3 to show the tradeoffs among security, performance,
and flexibility. To conclude, hypervisor-based VM shows better isolation and flexibility, whereas
the container can make the instance start faster and flexible to customize the runtime environ-
ment. The secure container offers high security and relatively low startup latency with flexibility
Fig. 3. The flexibility, startup latency, and isolation level of four virtualization mechanisms.
compromise. Unikernel demonstrates great potential in terms of performance and security, but it
loses flexibility. When offering adaptable images in the production environment under any virtualization mechanism, it is also critical to ensure that built images are signed and do not originate from an unsafe pedigree, for example with solutions [69, 128] that keep a continuous vulnerability assessment and remediation program.
3 ENCAPSULE LAYER
A cold startup in serverless computing may occur when the function fails to capture a warm run-
ning container or experiences a bursty load. In the former, a function is invoked for the first time or
scheduled with a longer invocation interval than the instance lifetime. The typical characteristic is
that instances (or pods) must start from scratch. In the latter case of a bursty load, instances need to
perform horizontal scaling during a surge in user workloads. Function instances will auto-scale as
load changes to ensure adequate resource allocation. While preparing a sandbox in the Virtualization layer takes less than 1 second, the initialization of the software environment (such as loading Python libraries) and application-specific user code can dwarf the former [42, 65, 83, 101, 117]. Although
we can provide a more lightweight sandboxing mechanism to reduce the cold startup latency in
the Virtualization layer, the state-of-the-art sandboxing mechanism may not demonstrate perfect
compatibility for containers or VMs when migrated to the existing serverless architecture. In re-
sponse to the tradeoff between performance and compatibility, an efficient solution is to prewarm
instances in the Encapsule layer. This approach is known as the prewarm startup, which has been
widely researched. Representative work about instance prewarm is listed in Table 2.
Before giving a detailed analysis and comparison, we first describe the taxonomy in each column.
“Template” reflects whether the cold startup instance comes from a template. “Static Image” shows
whether the VM/container image for prewarm disables dynamically updating in each cold startup.
“Pool” indicates whether a prewarm pool is used for function cold startups. “Exclusive” and “Fixed
Size” represent whether the prewarmed instance is exclusive and the prewarm pool is size-fixed.
“Predict/Heuristic” indicates whether the prediction algorithm or heuristic-based method are used
to prewarm instances. “REQs” reflects whether the runtime requirements are dynamically loading
and updating in the prewarm instance. “C/R” reflects whether it supports checkpoint and restore
to accelerate the startup. “Sidecar Based” represents whether the relevant technologies can be
implemented or integrated into the sidecar. “Imp” indicates where it is implemented.
There are two common prewarm startup approaches: one-to-one prewarm startup and one-for-all
prewarm startup. In the one-to-one prewarm startup, each function instance is prewarmed from
a size-fixed pool or by dynamic prediction based on the historical workload traces, whereas in
the one-for-all prewarm startup, instances of all functions are prewarmed from cached sandboxes,
which are pre-generated according to a common configuration file. When a cold startup occurs,
the function only needs to specialize these pre-initialized sandboxes by importing function-specific
code blob URI and settings. C/R (Checkpoint/Restore) is also used with prewarmed instances in
a serverless system for higher scalability and lower instance initialization latency. C/R is a tech-
nique that can freeze a running instance, make a checkpoint into a list of files, and then restore
the running state of the instance at the frozen point. A common pattern in serverless implemen-
tations is to pause the instance when idle to save resources and then recover it for reusing when
invoked [55, 94].
One-to-one prewarm by size-fixed pool: Makes sense but resource-unfriendly. The one-to-one strategy prewarms exclusive instances in a size-fixed prewarm pool for each function and loads code whenever invocations arrive. The security model in the Encapsule layer is usually referred to
as privacy concerns. In the one-to-one prewarm pattern, the user-level analyzer for each function
makes user privacy inviolable, and only this user-related analyzer has access to the private pack-
ages. The function portrait cannot leak from the one-to-one prewarm pool, and hardware-based
isolation further ensures that malicious code cannot access the user-level analyzer through priv-
ilege escalation. It is a safe strategy without introducing other security concerns. By building an
exclusive and over-subscribed prewarm pool for each function, serverless providers can maximize
the availability and stability of the user applications. For example, Azure Functions [105] warms
up instances of each function by setting up a fixed-size prewarm pool. Once the always-ready in-
stance is occupied, prewarmed instances will be active and continue to buffer until reaching the
limit. The open-sourced Fission [44] also prewarms like Azure Functions. It introduces a component called poolmgr, which manages a pool of generic instances with a fixed pool size and injects function code into the idle instances to reduce the cold startup latency.
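The fixed-size pool pattern can be sketched in a few lines; this is our own illustration (the names PrewarmPool, POOL_SIZE, and the load method on sandbox objects are assumptions), not Fission's poolmgr or Azure's actual code:

```python
# Minimal sketch of a fixed-size one-to-one prewarm pool (illustrative only).
import collections, threading

POOL_SIZE = 3   # "always-ready" buffer per function, as in a fixed-size pool

class PrewarmPool:
    def __init__(self, start_sandbox):
        self.start_sandbox = start_sandbox          # cold-start routine
        self.pools = collections.defaultdict(list)  # fn name -> idle generics
        self.lock = threading.Lock()

    def refill(self, fn_name):
        # Keep the pool topped up to POOL_SIZE in the background.
        with self.lock:
            while len(self.pools[fn_name]) < POOL_SIZE:
                self.pools[fn_name].append(self.start_sandbox())

    def acquire(self, fn_name, code_blob):
        with self.lock:
            idle = self.pools[fn_name]
            inst = idle.pop() if idle else self.start_sandbox()
        inst.load(code_blob)   # specialize the generic instance with user code
        threading.Thread(target=self.refill, args=(fn_name,)).start()
        return inst

class DemoSandbox:
    def load(self, blob): self.code = blob

pool = PrewarmPool(DemoSandbox)
inst = pool.acquire("resize", b"def handler(event): ...")
```

The resource cost is visible in the sketch itself: POOL_SIZE idle instances are held per function regardless of demand, which is exactly the "resource-unfriendly" aspect discussed next.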
One-to-one prewarm by predictive warm-up: Ways to make it resource-friendly. The one-
to-one strategy prewarms instances for each function, which means that it is crucial to determine
the warm-up time. Otherwise, a slow warm-up cycle can reduce cold startup efficiency, whereas
a quick cycle will produce massive idle instances in the background and make the serverless sys-
tem resource-unfriendly. Such a deficiency inspires researchers to propose more flexible prewarm
strategies like using prediction-based and heuristic-based methods. Xu et al. [146] design an AWU
(Adaptive Warm-up) strategy by leveraging the LSTM (Long Short-Term Memory) networks to
discover the dependence relationships based on the historical traces. It predicts the invoking time
of each function to prewarm instances and initializes the prewarmed containers according to the
ACPS (Adaptive Container Pool Scaling) strategy once AWU fails. Shahrad et al. [118] propose a
practical resource management policy for the one-to-one prewarm startup. By characterizing the
FaaS workloads, they dynamically change the instance lifetime of the recycling and provisioning
instances according to the time series prediction. CRIU (Checkpoint/Restore In Userspace) [39]
is a software tool on Linux to implement C/R functions. Replayable Execution [140] makes im-
provements based on CRIU, using mmap to map checkpoint files to memory and leveraging the
Copy-on-Write in OS to share cold data among multiple containers. By exploiting the intensive-
deflated execution characteristics, it reduces the container’s cold startup time and memory usage.
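As a concrete illustration of such resource-friendly warm-up, the sketch below implements a simple histogram-style policy in the spirit of Shahrad et al. [118]; it is our simplification, and the percentile cut-offs, margins, and class names are assumptions rather than the published policy:

```python
# Simplified keep-alive/prewarm policy derived from a function's invocation
# inter-arrival history (inspired by, but not identical to, [118]).
import bisect

class KeepAlivePolicy:
    def __init__(self):
        self.idle_times = []      # sorted inter-arrival times, in seconds
        self.last_invoke = None

    def observe(self, now):
        if self.last_invoke is not None:
            bisect.insort(self.idle_times, now - self.last_invoke)
        self.last_invoke = now

    def _pct(self, p):
        i = min(int(p * len(self.idle_times)), len(self.idle_times) - 1)
        return self.idle_times[i]

    def window(self):
        # Prewarm shortly before the 5th-percentile gap elapses; keep the
        # instance alive until the 99th-percentile gap, then release it.
        if len(self.idle_times) < 10:
            return 0.0, 600.0     # not enough history: conservative keep-alive
        return 0.9 * self._pct(0.05), 1.1 * self._pct(0.99)
```

A regular (e.g., diurnal) function yields a tight window and frees memory quickly, whereas an erratic one falls back to the conservative default, matching the strengths and weaknesses of prediction-based prewarming discussed later in this section.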
One-for-all prewarm with caching-aware: Try to make the prewarm generalized and
resource-friendly with privacy guaranteed. The one-for-all prewarm startup shares a similar mechanism with the template method: a template instance is hatched and has already pre-imported most of the bins/libs after being informed over the socket. When a new invocation arrives and requires a new instance, it only needs to initialize or specialize from the templates. For example, the famous open-
sourced Apache OpenWhisk [103] resolves it by allowing that users can assign private packages
in a zip or virtualenv to specialize the prewarmed container dynamically [104]. Catalyzer [42]
optimizes the restore process in C/R by accelerating the recovery on the retrenched critical path.
Meanwhile, it proposes a sandbox fork to leverage a template sandbox that already has pre-loaded
the specific function for state reusing. To reduce cold startup initialization and flatten the startup latency, Mohan et al. [97] propose a self-evolving pause-container pool that pre-allocates virtual network interfaces with lazy binding.
As performance improves, so arises vulnerability. The security model of the one-for-all prewarm
is weakened by introducing the system-level analyzer where different function portraits may ag-
gregate, and the pre-imported and pre-allocated requirements will implicitly embody user privacy.
Therefore, when designing a one-for-all based prewarm strategy, the security model should answer
how to make private packages/libraries (REQs) inaccessible and avoid potential privacy disclosure
in case of malicious codes reusing a prewarm container. SOCK [101] explicitly seeks to address
this problem by introducing a tree cache for packages and using the benefit-to-cost model to dy-
namically update packages in the prewarm containers. Although SOCK still uses a system-level
analyzer to collect the internal characteristics of workloads and prewarm zygotes, each handler
container may be only forked from a zygote that has not imported any additional packages other
than the ones the handler specifies/needs. Given that a zygote with a superset of packages needed
by the function may exist, SOCK does not use it for security reasons. The cache tree-based security
model of the one-for-all prewarm in SOCK provides a minimal set of user privacy.
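The subset rule that SOCK's security model enforces can be illustrated as follows; this is a minimal sketch under our own naming, and the real SOCK tree cache additionally tracks benefit-to-cost scores when deciding which packages to keep:

```python
# Illustrative zygote selection under a SOCK-style security rule [101]:
# a handler may only fork from a zygote whose pre-imported packages are a
# SUBSET of what the handler itself declares -- never a superset.
def pick_zygote(required_pkgs, zygotes):
    """required_pkgs: set of package names the function declares.
    zygotes: list of (pre_imported_pkgs, sandbox) cached templates."""
    best = None
    for pkgs, sandbox in zygotes:
        if pkgs <= required_pkgs:                # subset check: no privacy leak
            if best is None or len(pkgs) > len(best[0]):
                best = (pkgs, sandbox)           # most-specialized safe zygote
    return best                                  # None -> fall back to cold start

zygotes = [(set(), "base"), ({"numpy"}, "z1"), ({"numpy", "pandas"}, "z2")]
print(pick_zygote({"numpy", "requests"}, zygotes))   # picks z1, never z2
```

Rejecting the superset zygote z2 costs some startup time (requests must still be imported), but it prevents a handler from observing packages it never asked for, which is the privacy property discussed above.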
One-to-one and one-for-all prewarm: The challenging points. For one-to-one prewarm
startup and one-for-all prewarm startup, both can be beneficial for optimizing the cold startup
in the Encapsule layer of serverless architecture. Their respective flaws are also apparent. The
one-to-one prewarm startup achieves significantly lower initialization latency by trading memory resources for it. According to the research [118], it meets the challenge that the right warm-up time point is usually hard to measure or predict while ensuring a reasonable allocation of memory re-
sources. On the one hand, prediction-based and heuristic-based methods are particularly effective
when historical data is sufficient to build an accurate model but degrade when the trace is scarce.
On the other hand, the prediction and iteration operations can introduce high CPU overhead when
massive applications and function chains co-exist.
The template mechanism in the one-for-all prewarm startup is adopted to ease the high cost
of functions cold startup from scratch. In addition, maintaining a global prewarm pool introduces
less additional memory resource consumption than the one-to-one prewarm startup. However, it
still suffers from several challenges, including the huge template image size [8, 51], conflicts among various pre-imported libraries, and potential privacy disclosure. It may also reveal the vicinity in which applications with a similar portrait are widely deployed. It is nontrivial to
“suit the remedy to the case” for cold startups in different scenarios. For example, it is much more
efficient to generate a template by one-for-all prewarm startup when the function is invoked for
the first time or with poor predictions during the trace analysis. The one-to-one prewarm startup
performs better for functions with general rules or diurnal patterns, and vice versa.
4 ORCHESTRATION LAYER
The main challenge in the System Orchestration layer is the friendly and elastic support for dif-
ferent services. Even though the current serverless orchestrators are implemented differently, the
challenges they face are much the same. As hundreds of functions co-exist on a serverless node,
scheduling massive functions with inextricable dependencies becomes challenging. In addition, managing granular permissions for hundreds or thousands of functions is hard. Therefore, the secu-
rity model is more referred to as performance security than functional security. It resolves the
challenges in making “just the right amount” of resource provision robust to performance while
answering the colocation interference and load balancing for applications. Similar to the tradi-
tional solutions [26, 35, 59, 76, 126], the serverless model should concern the ability to predict the
on-demand computing resources and an efficient scheduling strategy for services. As shown in
Figure 4, researchers usually propose to introduce the load balancer and resource monitor compo-
nents into the controller to resolve provision and scheduling challenges. The load balancer aims
to coordinate resource usage to avoid overloading any single resource. Meanwhile, the resource
monitor keeps watching the resource utilization of each node and passes the updated information
to the load balancer. With the resource monitor and load balancer, a serverless controller can per-
form better scheduling strategies in three levels: resource-level, instance-level, and application-level.
We summarize the hierarchy in Table 3.
Specifically, “Focused Hierarchy” indicates that the resource adjusting is designed in addition to
the essential resource auto-provision, which can be classified into “R” (resource-level), “I” (instance-
level), or “A” (application-level). “Resource Adjusting” shows whether the scheduling provides an
adjustment for resource provision. “SLO” reflects whether SLO constraints are considered. “Intf”
represents whether the resource contention or interference is discussed. “Usage Feedback” re-
flects whether the resource feedback in a physical node is considered. “Dynamic Strategy” indi-
cates whether it is a dynamic or runtime scheduling strategy. “Trace Driven” indicates whether
making choices depends on traces or collected data metrics. “Predict/Heuristic” reflects whether a
prediction/heuristic-based method is used.
serverless scheduler based on DRL for ML training jobs. It can dynamically adjust the number of
function instances needed and their memory size to balance high model quality and the training
cost.
The keys to making resource provision robust to performance. With a view to the perfor-
mance robustness requirement of the security model in the System Orchestration layer, recent
works take the SLA into account to ensure stability and reliability when functions are invoked
in a shared-resource cloud. CherryPick [5] leverages the Bayesian optimization, which estimates
a confidence interval of an application’s running time and cost, to help search the optimal re-
source configurations. Unlike static searching solutions, it builds a performance model to distin-
guish the unnecessary iteration trials, thus accelerating the convergence. However, CherryPick’s performance model targets big data applications specifically and does not generalize to other applications.
Similarly, Lin and Khazaei [81] build an analytical model to help general serverless applications
deployment. It can predict the application’s end-to-end response time and the average cost un-
der a given configuration. They also propose a PRCP (Probability Refined Critical Path Greedy)
algorithm based on the transition probability, recursively searching the critical path of execution
order. With PRCP, they can achieve the best performance with a specific configuration under bud-
get constraints or less cost under QoS constraints. Besides SLA, shared-resource contention should
also be noticed in the multi-tenant environment. HoseinyFarahabady et al. [57] discuss this topic. Their proposed MPC approach optimizes the serverless controller for predictive resource allocation by introducing a set of cost functions. It reduces QoS violations, stabilizes CPU utilization, and
avoids serious resource contention. However, these resource and workload estimations based on
ML or AI (Artificial Intelligence) usually achieve a tradeoff between an optimal global solution and
robust performance to inaccurate workload information [26, 31, 60, 133]. Whether they can avoid
fragile robustness and improve resource utilization in the production environment is unknown
and remains a critical avenue to explore.
for the Kubernetes-based system. It can provide a variety of runtime information to the scheduler,
including system resource utilization and the QoS performance of an application. The flaw of the study is that it does not provide a sophisticated resource scheduling algorithm. Kaffes et al. [66] pro-
pose a centralized and core-granular scheduler. Centralization provides a global view of the clus-
ter to the scheduler so that it can eliminate heavy-weight function migrations. Core-granularity
binds cores with functions and therefore avoids core-sharing among functions and promises per-
formance stability. However, they only consider the scheduling of CPU resources but ignore other important resources like memory. FnSched [127] regulates CPU shares to accommodate the in-
coming application invocations by checking the available resource. A key advantage of employing
a greedy algorithm is that fewer invoker instances are scheduled by concentrating invocations in
response to varying workloads. Although FnSched makes a tradeoff between scalable efficiency
and acceptable response latencies, it is limited by the assumption that function execution times are
not variable. Guan et al. [50] propose an AODC-based (Application Oriented Docker Container) re-
source allocation algorithm by considering both the available resources and the required libraries.
They model the container placement and task assignment as an optimization problem, then take a
Linear Programming Solver to find the feasible solution. The Pallet container performs the AODC
algorithm, serving as both a load balancer and resource monitor. The downside is that plenty of
containers will occupy the memory space as the number of functions increases.
Take the performance interference and QoS constraints into consideration. While im-
proving utilization, load balancing strategies also bring the interference challenge that sharing
resources between instances may result in performance degradation and QoS violation. The per-
formance robustness of the security model drives the scheduling to make tradeoffs between higher
resource utilization and fewer user QoS violation due to the interference. Different functions’ sen-
sitivities to different resources may vary, which means that we should avoid physical colocation of
functions that are sensitive to the same resource (e.g., CPU-sensitive containers may cause serious
CPU contention when co-located). The load balancer should notice and moderate the interference
when scheduling containers. McDaniel et al. [93] manage the I/O of containers at both the cluster
and node levels to effectively reduce resource contention and eliminate performance degradation.
Based on a resource monitor in Docker Swarm, it refines the container I/O management by pro-
viding a client-side API, thus enforcing proportional shares among containers for I/O contention.
Kim et al. [70] present a fine-grained CPU cap control solution by automatically and distributedly
adjusting the allocation of CPU capacity. Based on performance metrics, applications are grouped
and allowed to make adjustment decisions, and application processes of the group consume only
up to the quota of CPU time. Hence, it minimizes the response time skewness and improves the
robustness of the controller to performance degradation. Smart spread [88] proposes an ML-based
function placement strategy that considers several resource utilization statistics. It can predictively find the best-performing placement and incur the least performance degradation for the instance.
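To make the colocation idea concrete, the sketch below scores candidate nodes by penalizing the placement of functions sensitive to the same resource; this is our own toy model, and the sensitivity vectors and scoring are assumptions, not taken from Smart spread [88] or the other cited systems:

```python
# Toy interference-aware placement: prefer the node where the new function's
# resource sensitivities overlap least with the functions already placed.
def interference_score(candidate, node_functions):
    """candidate/node_functions entries: dict of resource -> sensitivity [0,1]."""
    score = 0.0
    for placed in node_functions:
        for res, s in candidate.items():
            score += s * placed.get(res, 0.0)   # co-sensitivity raises the score
    return score

def place(candidate, nodes):
    # nodes: mapping node name -> list of sensitivity vectors already placed
    return min(nodes, key=lambda n: interference_score(candidate, nodes[n]))

nodes = {"n1": [{"cpu": 0.9}], "n2": [{"io": 0.8}]}
print(place({"cpu": 0.7, "mem": 0.2}, nodes))    # -> "n2": avoids CPU contention
```

A CPU-heavy newcomer lands on the I/O-bound node and vice versa, which is the colocation rule the preceding paragraph motivates.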
Fig. 5. Two invocation patterns for functions and two execution models of workflows.
Invocation patterns and workflow execution models. As shown in Figure 5(a), if a function
is invoked from user queries via the RESTful API or other triggers, it is called external invocation.
The instance-level load balancing can perform well in external invocation scenarios. However, the
emerging cloud applications may consist of several functions, and there are data dependencies be-
tween multiple functions. For example, the implementation of a real-world social network consists
of around 170 functions [2]. In this case, functions in such an application will get active by various
triggers from the user query or another function. If a function is initialized or invoked by other functions, it follows the internal invocation pattern. Currently, researchers raise their vision to the
data-driven scheduling for internal invocations from the perspective of application-level topology.
Workflow is the most common implementation of internal invocations, where functions are ex-
ecuted in a specified order to satisfy complex business logic. The execution models of these data-
driven workflows can be extracted into two approaches: sequence-based workflow and DAG (Di-
rected Acyclic Graph)-based workflow. As shown in Figure 5(b), functions are invoked in a pipeline
through a registered dataflow in the sequence-based workflow. The sequence-based workflow is
the basic and the most common pattern in the serverless workflow, and most cloud vendors pro-
vide such execution mode for application definition. Obviously, there is more than one sequenced
workflow in one complex application, and the same functions can be executed in various sequences.
If we regard each function as a node and dataflow between nodes as a vector edge, such an appli-
cation with multiple interlaced sequenced workflows can be defined by the DAG (hence the name
“DAG-based workflow”). Today, few cloud vendors provide services for the application definition
in the DAG form, aka serverless workflows [1, 21, 27].
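A DAG-based workflow and its execution model can be expressed compactly; the sketch below is purely illustrative (the dag structure and the invoke stub are our assumptions, not any vendor's workflow DSL) and executes functions in topological order so every node sees its upstream outputs:

```python
# Illustrative DAG-based workflow executor: nodes are functions, edges carry
# dataflow; a node runs once all of its upstream outputs are ready.
from graphlib import TopologicalSorter  # Python 3.9+

def invoke(fn_name, inputs):
    print(f"invoking {fn_name} with {inputs}")
    return f"out({fn_name})"            # stands in for a real FaaS invocation

# node -> set of upstream dependencies: two interlaced sequences (A->B->D and
# A->C->D) sharing nodes, which only a DAG -- not a single pipeline -- expresses.
dag = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

outputs = {}
for fn in TopologicalSorter(dag).static_order():
    outputs[fn] = invoke(fn, [outputs[dep] for dep in dag[fn]])
```

Registering the edges once and letting the engine derive the order is what distinguishes a DAG-based workflow service from recursively chaining internal invocations by hand.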
The scheduling overhead introduced in serverless sequences. With massive functions com-
municating with each other, scheduling dataflow introduces more complexities. However, the
existing serverless systems in the production environment commonly treat these workflows as
simple recursion of internal invocations. It raises the challenge of reducing the overhead in the
System Orchestration layer by scheduling function sequences [16]. Current policy to manage the
function sequences is quite simple—functions are triggered following the first-come-first-served
algorithm [129]. However, as the length of the function sequence increases, cascading cold start
overheads should be addressed to avoid seriously end-to-end latencies degradation of sequenced
workflows [20, 40]. To this end, Xanadu [40] combines the prewarm strategy with a most-likely-
path (MLP) estimation in the workflow execution. It prewarms instances by a speculative-based
strategy and makes just-in-time resource provisioning. However, the prediction miss would in-
troduce additional memory waste, especially in the scenario of multi-branch or DAGs. Moreover,
serverless workflow engines prefer the Master-Worker architecture where ready functions are
identified by the state and invoked directly by the master without a queue [9, 17, 30, 47, 89], in-
cluding AWS Step Functions [43] and Fission Workflows [46]. As shown in Figure 5(a), the defi-
ciency is that the additional overhead is introduced in the function workflow through unnecessary
middlewares (e.g., unnecessary storage in an internal invocation).
Enhance the data locality for efficient serverless DAG executions. To help function work-
flow avoid undesired middlewares, researchers usually co-locate the functions into subgraphs to
enhance the data locality, as shown in Figure 4(c). For example, Viil and Srirama [136] use multi-
level k-way graph partitioning to provision and configure scientific workflows automatically into
multi-cloud environments. However, their partition algorithm may not match well with serverless
applications, where each node in the graph can auto-scale multiple replicas in such as foreach
steps. In this case, the connections and edge weights become unpredictable. In serverless context,
WUKONG [29, 30] implements a decentralized DAG engine based on AWS Lambda, which com-
bines static and dynamic scheduling. It divides the workflow of an application into subgraphs, be-
fore and during execution, thus improving parallelism and data locality simultaneously. However,
WUKONG’s colocation of multi-functions within a Lambda executor may introduce additional
security vulnerabilities due to its weakened isolation. SAND [4] presents a new idea of grouping these workflow functions into the same instance so that libraries can be shared across forked processes to reduce initialization cost, and additional transmission can be eliminated in the workflow due to the data locality. SAND provides a better isolation mechanism than WUKONG by using process forking for function invocations; however, it ignores the colocation interference resulting from
the resource contention. When exchanging intermediate data of DAGs, SONIC [87] proposes to
use the VM-storage-based transmission strategy when functions are co-located on the same node.
The optimal transferring depends on application-specific parameters, such as the input size and
node parallelism. SONIC dynamically performs the data passing selection with a communication-
aware function placement, predicting such runtime metrics of functions in the workflow. Glob-
alFlow [154] considers a geographically distributed scenario where functions reside in one region
and data in another. It groups the co-located functions into subgraphs and connects them with
lightweight functions, so it improves data locality and reduces transmission latency. The combina-
tion of local and cross-region strategies in a holistic manner makes sense.
Summary of the challenges in the scheduling of serverless workflows. Workflow schedul-
ing is an NP-hard problem, and researchers have been designing various strategies for it [1, 91].
Such optimization in the workflow aims to minimize the makespan, reduce the execution cost,
and improve resource utilization while satisfying single or multiple constraints. Considering the
preceding challenges, serverless computing focuses on leveraging enhanced data locality. The chal-
lenge is that the end-to-end latency of a workflow query could increase significantly due to fre-
quent interactions with the storage from different nodes. Resource volatility becomes another fo-
cus in the serverless system, which can be unpredictable as the number of functions increases in the
production environment. It introduces more difficulty to find an efficient workflow placement and
scheduling strategy within a short decision time (e.g., 10 ms for load balancing). To evaluate the efficiency and performance of future workflow-based research, DAG-based or DG-based serverless benchmarks also urgently need to be published. They would better be adapted from real applications rather than simple microbenchmarks [110] or function self-loops [81, 148]. Keeping guaranteed QoS performance is also significant for applications in serverless computing.
5 SYSTEM COORDINATION LAYER
Fig. 6. Techniques and works about BaaS components in the System Coordination layer.
Different phases of storage during the function execution. During a serverless invocation,
there are three phases where the database service is required: Authentication, In-Function, and
Log. Authentication is usually performed ahead of controller scheduling to avoid security issues,
and it should get fast response for access. Using an MMDB (Main Memory DataBase) to imple-
ment the Authentication phase is recommended in a serverless system, such as Redis, a high-
performance key-value database. During the function execution, the calls of storage APIs make
up the In-Function phase. Users can choose to use either a DRDB (Disk-Resident DataBase, e.g.,
MySQL) or an MMDB by different BaaS interfaces for ephemeral storage. The Log phase builds
the bridge for users to retrieve invocation results, especially for functions invoked in an asynchronous manner. A detailed record in JSON format, including runtime, execution time, queue time, and states, will be ephemerally or permanently stored and returned (e.g., in CouchDB for OpenWhisk). It is recommended that serverless storage follow the invocation patterns, billing only the queries consumed by storage operations and the storage space consumed when logging. How-
ever, the throughput of existing storage is a major bottleneck due to the frequent and vast functions
interactions [64, 65]. Although current serverless systems support NAS (Network Attached Stor-
age) to help reduce storage API calls, these shared access protocols are still network-based data
communication essentially.
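The three phases can be sketched as follows; this is our own illustration using the real redis-py and sqlite3 APIs (it assumes a running Redis server, and the token:* key scheme and activations schema are hypothetical):

```python
# Illustrative mapping of the three storage phases: Authentication against an
# MMDB (Redis), the In-Function phase, and a JSON record in the Log phase.
import json, sqlite3, time
import redis  # pip install redis

r = redis.Redis()                         # MMDB for fast authentication lookups
log_db = sqlite3.connect(":memory:")      # stands in for a DRDB/CouchDB log store
log_db.execute("CREATE TABLE activations (id TEXT, record TEXT)")

def handle(invocation_id, token, handler, event):
    if not r.get(f"token:{token}"):       # Authentication phase: fast key lookup
        raise PermissionError("invalid token")
    start = time.time()
    result = handler(event)               # In-Function phase: user code runs
    record = {"runtime": "python3", "executionTime": time.time() - start,
              "state": "success", "result": result}
    log_db.execute("INSERT INTO activations VALUES (?, ?)",
                   (invocation_id, json.dumps(record)))   # Log phase
    return invocation_id                  # async callers poll the log by id
```

The split also shows why the phases want different backends: the per-invocation auth lookup is latency-critical (hence an MMDB), while the activation record tolerates a slower, durable store.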
IO bottlenecks in storage: Modeling in serverless context. Traditional solutions use predictive
methods [38, 100, 137] and active storage [109, 131, 132, 143, 145, 152] to automatically scale re-
sources and optimize the data locality on demand. Researchers also explore using a hybrid method
to ease the I/O bottleneck for serverless storage. For example, Pocket [71] strictly separates responsibilities across the control, metadata, and data planes. Using heuristics and combining
several storage technologies, it dynamically rightsizes resources and achieves a balance between
cost and performance efficiency. To alleviate the extremely inefficient execution for data analyt-
ics workloads in the serverless platform, Locus [106] models a mixture of cheap but slow storage
with expensive but fast storage. It makes a cost-performance tradeoff to choose the most appro-
priate configuration variable and shuffle implementation. Middleware Zion [108] enables a data-
driven serverless computing model for stateless functions. It optimizes the elasticity and resource
contention by injecting computations into data pipelines and running on dataflows in a scalable
manner.
Due to the data-shipping architecture of serverless applications, current works usually focus on
designing more elastic serverless storage and enhancing data locality to ease the I/O contention of
function communication on the DB side. However, given the potential heterogeneity of different
functions, this uncertainty still makes such techniques challenging to apply in practice.
A common way to handle internal events is using the queue trigger, by which functions get triggered whenever an
invocation enqueues. For instance, Kubeless [74] provides a Kafka-based queue trigger bound to
a Kafka topic so that users can invoke the function by writing messages into the topic. Specific
purposes also require more extensive triggers. For example, a timer trigger in Kubernetes can in-
voke a function periodically. It creates a CronJob [75] object, written as a Cron expression
representing the set of invocation times, to schedule a job accordingly. An event trigger invokes a
function in response to an event, the atomic piece of information that describes something
that happened in the system. A convincing example of such an implementation is Triggerflow [85],
which maps a workflow by setting an event trigger on each edge.
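The queue-trigger semantics reduce to "invoke the function for every message written to the bound topic," which Kubeless wires up declaratively. A minimal hand-rolled sketch using the kafka-python client (an assumed dependency; topic and broker address are illustrative) looks like this:

```python
import json

from kafka import KafkaConsumer  # pip install kafka-python (assumption)

def handler(event):
    # Stands in for the user function bound to the trigger.
    print("function invoked with", event)

consumer = KafkaConsumer(
    "my-function-topic",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:  # blocks until an invocation enqueues
    handler(message.value)
```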
Checkpoint cache: Enabling functions with fault tolerance. The demand for fault tolerance
also inspires researchers to adapt relevant techniques to the serverless context, such as
C/R-based (checkpoint/restore) [80, 153] and log-based [85, 142] approaches. One example of such an
implementation in serverless computing is AFT [124], which builds an interposition layer between a
storage engine and a common serverless platform by providing an atomic fault-tolerance shim. It
leverages a data cache and the shared storage to guarantee atomic read isolation, avoid storage
lookups for frequently accessed data, and prevent significant consistency anomalies.
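As a schematic illustration of the shim idea (not AFT's real API), the sketch below buffers writes and caches reads inside one invocation so that shared storage never observes partial state; the `storage` object is an assumed key-value interface with `get`/`put_batch` methods.

```python
class AtomicShim:
    """Schematic read-atomic, all-or-nothing shim for one invocation."""

    def __init__(self, storage):
        self.storage = storage
        self.read_cache = {}    # avoids repeated storage lookups
        self.write_buffer = {}  # nothing reaches storage before commit

    def get(self, key):
        if key in self.write_buffer:      # read-your-own-writes
            return self.write_buffer[key]
        if key not in self.read_cache:    # first read fixes the snapshot
            self.read_cache[key] = self.storage.get(key)
        return self.read_cache[key]

    def put(self, key, value):
        self.write_buffer[key] = value

    def commit(self):
        # The whole write set is installed as one batch; a failure before
        # this point leaves no partial invocation state in shared storage.
        self.storage.put_batch(self.write_buffer)
        self.write_buffer.clear()
```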
In addition to the implementations we discussed earlier, other caching mechanisms following
the pay-as-you-go mode can be explored and integrated into any layer of our proposed serverless
architecture. In summary, data caching is still an essential component for higher flexibility and
better performance.
Fig. 7. Cold startup latency under different language runtimes, container runtimes, and memory limits.
Table 4. Comparing Metrics of Four Serverless Vendors [77, 94] ("CCI" Means the Concurrent Invocations)

| Item | Amazon Lambda | Google Functions | Microsoft Azure Functions | IBM OpenWhisk |
| GFLOPS per function | 19.63 | 4.35 | 2.15 | 3.19 |
| TFLOPS in 3,000 CCI | 66.30 | 13.04 | 7.94 | 12.30 |
| Throughput of 1–5 CCI | 20–55 TPS | 1–25 TPS | 60–150 TPS | 1 TPS |
| Throughput of 2,000 CCI | 400 TPS | 40 TPS | 120 TPS | 210 TPS |
| CCI tail latency | Best | Superior | Worst | Inferior |
| CI/CD performance | Best | Fails frequently | Long latency | Balanced |
| Read/Write (1–100 CCI) | 153/83–93/39.5 MB/s | 56/9.5–54/3.5 MB/s | 424/44 MB/s–NA | 68/8–34/0.5 MB/s |
| File I/O (1–100 CCI) | 2–3.5 s | 10–30 s | 3.5 s–NA | 15–60 s |
| Object I/O (1–100 CCI) | 1.3–2.4 s | 5–8 s | 12 s–NA | 1–30 s |
| Trigger throughput (HTTP-Object-DB) | 55-25-860 | 20-25-NA | 145-250-NA | 50-NA-40 |
| Language runtime overhead | Balanced, 0.05 s avg | (−0.06) 0.22 s (+0.1) | (−0.02) 0.22 s (+0.03) | (−0.02) 0.17 s (+0.02) |
| Dependencies overhead | (−0.5) 1.1 s (+0.2) avg | (−0.5) 1.9 s (+0.4) | (−1.3) 3.4 s (NA) | NA |
| Maximum memory | 3,008 MB | 2,048 MB | 1,536 MB | 512 MB |
| Execution timeout | 5 minutes | 9 minutes | 10 minutes | 5 minutes |
| Price per memory | $0.0000166/GB-s | $0.0000165/GB-s | $0.0000016/GB-s | $0.000017/GB-s |
| Price per execution | $0.2 per 1M | $0.4 per 1M | $0.2 per 1M | NA |
| Free tier | First 1M Exec | First 2M Exec | First 1M Exec | Free Exec/40,000 GB-s |
Besides the cold startup analysis of different language runtimes and memory limits, SAND [4]
also measures several sandbox isolation mechanisms for function executions, and we show their
results in Figure 7(c). Native executions (exec and fork) are the fastest methods, whereas a Unikernel
(Xen MirageOS) performs similarly to a Docker container. Because the paused container retains the
user code in memory, using the Docker client interface to start a warm function
(Docker exec C) is much faster than a cold startup (Docker run C).
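This cold-versus-warm gap can be roughly reproduced against a local Docker daemon; the sketch below times `docker run` (cold) against `docker exec` into an already-running container (warm). The timings include client-side overhead and the image name is an assumption, so the absolute numbers are only indicative.

```python
import subprocess
import time

def timed(cmd):
    start = time.time()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.time() - start

# Cold startup: create and run a fresh container ("Docker run C").
cold = timed(["docker", "run", "--rm", "python:3.9-slim",
              "python", "-c", "print('hi')"])

# Warm startup: exec into a long-running container ("Docker exec C").
subprocess.run(["docker", "run", "-d", "--name", "warm",
                "python:3.9-slim", "sleep", "600"], check=True)
warm = timed(["docker", "exec", "warm", "python", "-c", "print('hi')"])
subprocess.run(["docker", "rm", "-f", "warm"], check=True)

print(f"cold: {cold:.2f}s, warm: {warm:.2f}s")
```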
As a supplement to the preceding factors that affect serverless cold startup performance,
Shahrad et al. [117] explore other factors that may affect function cold startup and execution
time, such as MKPI (mispredictions per kilo-instruction), LLC (Last-Level Cache) size, and memory
bandwidth. First, they find that a longer execution time usually comes with a noticeably lower
branch MKPI within a function. This is easy to understand: functions with short execution
times spend most of their time on language runtime startup, and thus the branch predictor misses
more often while it is still being trained. Second, the LLC size is not a significant factor affecting cold
startup latency and execution time. A larger LLC cannot improve serverless function execution
performance because functions are largely insensitive to it. Only when the LLC size is very small (e.g., less than 2 MB)
does it become a bottleneck for function execution and cold startup. Therefore, cloud vendors
usually set a default LLC size and pre-profile it in the serverless system to avoid serious performance
degradation. BabelFish [121] also finds that lazy page table management can result in heavy TLB
stress in a containerized environment. Therefore, to avoid the redundant kernel work produced in page
table management, it shares translations across containers in the TLB and page tables.
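One way to approximate the branch MKPI measurement on Linux is to wrap a function run with `perf stat` (assumed to be installed and permitted; counter names and the CSV layout can vary across kernels), as in the hedged sketch below.

```python
import subprocess

def branch_mkpi(cmd):
    """Branch mispredictions per kilo-instruction for one command run."""
    # `-x ,` makes perf stat emit CSV lines of the form
    # <value>,<unit>,<event>,... on stderr.
    out = subprocess.run(
        ["perf", "stat", "-x", ",",
         "-e", "branch-misses,instructions"] + cmd,
        capture_output=True, text=True).stderr
    counts = {}
    for line in out.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[0].isdigit():
            counts[fields[2]] = int(fields[0])
    return 1000 * counts["branch-misses"] / counts["instructions"]

# Short-running functions spend most of their time in runtime startup,
# where the predictor is still training, so their MKPI tends to be higher.
print(branch_mkpi(["python3", "-c", "print(sum(range(10**6)))"]))
```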
According to Table 4, AWS Lambda shows higher capacity and throughput for concurrent function invocations, although
it performs poorly in trigger throughput. From another aspect, Microsoft Azure Functions enables
fast read and write speeds when queries are invoked in sequence, but shows relatively higher func-
tion cold startup latency. Undoubtedly, all cloud vendors are aware of the challenges in the serverless
architecture and are actively optimizing the function invocation bottlenecks.
The API lock-in problem is derived from the tight coupling between the user functions and other BaaS components, which
can add difficulty to code migration between different FaaS platforms.
The over-simplified benchmark is another problem that accompanies API lock-in. Easy-to-build microbench-
marks are over-emphasized and used in 75% of current works [110]. We call for the estab-
lishment and open sourcing of cross-platform, real-world application benchmarks beyond scientific
workflows [64, 89, 119]. However, when decomposing a large service into different functions and
then building fine-grained node interconnections, the mismatch between the pre-defined control
plane and the actual data plane makes the granularity of each function challenging to determine.
9 CONCLUSION
The rapid development of the cloud-native concept inspires developers to reorganize cloud ap-
plications into microservices, and elastic serverless computing has become the best practice for
deploying them. This survey explicates and reviews the fundamental aspects of serverless comput-
ing and provides a comprehensive depiction of its four-layered design architecture: the Virtualization,
Encapsule, System Orchestration, and System Coordination layers. We elaborate on the respon-
sibility and significance of each layer, enumerate relevant works, and give practical implications
for adopting these state-of-the-art techniques. Serverless computing will undoubtedly continue
to gain prominence, and much of its potential remains to be unlocked in the forthcoming years.
REFERENCES
[1] Mainak Adhikari, Tarachand Amgoth, and Satish Narayana Srirama. 2019. A survey on scheduling strategies for
workflows in cloud environment and emerging trends. ACM Comput. Surv. 52, 4 (2019), Article 68, 36 pages. https:
//doi.org/10.1145/3325097
[2] Gojko Adzic and Robert Chatley. 2017. Serverless computing: Economic and architectural impact. In Proceedings of
the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’17). ACM, New York, NY, 884–889.
https://doi.org/10.1145/3106237.3117767
[3] Alexandru Agache, Marc Brooker, Alexandra Iordache, and Anthony Liguori. 2020. Firecracker: Lightweight virtu-
alization for serverless applications. In Proceedings of the 17th USENIX Symposium on Networked Systems Design and
Implementation (NSDI’20). 419–434.
[4] Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Klaus Satzke, Andre Beck, Paarijaat Aditya, and
Volker Hilt. 2018. SAND: Towards high-performance serverless computing. In Proceedings of the 2018 USENIX Annual
Technical Conference (ATC’18). 923–935.
[5] Omid Alipourfard, Hongqiang Harry Liu, and Jianshu Chen. 2017. CherryPick: Adaptively unearthing the best cloud
configurations for big data analytics. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and
Implementation (NSDI’17). 469–482.
[6] Amazon. 2021. Enabling API caching to enhance responsiveness. AWS. Retrieved February 8, 2022 from https://docs.
aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html.
[7] Amazon. 2021. Amazon DynamoDB Accelerator (DAX): A fully managed, highly available, in-memory cache service.
AWS. Retrieved February 8, 2022 from https://aws.amazon.com/dynamodb/dax/.
[8] Ali Anwar, Mohamed Mohamed, Vasily Tarasov, Michael Littley, and Lukas Rupprecht. 2018. Improving Docker
registry design based on production workload analysis. In Proceedings of the 16th USENIX Conference on File and
Storage Technologies (FAST’18). 265–278.
[9] Lixiang Ao, Liz Izhikevich, Geoffrey M. Voelker, and George Porter. 2018. Sprocket: A serverless video processing
framework. In Proceedings of the ACM Symposium on Cloud Computing (SoCC’18). ACM, New York, NY, 263–274.
[10] Apex. 2021. Home Page. Retrieved February 8, 2022 from https://apex.sh/.
[11] Vincent Armant, Milan De Cauwer, Kenneth N. Brown, and Barry O’Sullivan. 2018. Semi-online task assignment
policies for workload consolidation in cloud computing systems. Future Gener. Comput. Syst. 82 (2018), 89–103. https:
//doi.org/10.1016/j.future.2017.12.035
[12] Dulcardo Arteaga, Jorge Cabrera, Jing Xu, Swaminathan Sundararaman, and Ming Zhao. 2016. CloudCache: On-
demand flash cache management for cloud computing. In Proceedings of the 14th USENIX Conference on File and
Storage Technologies (FAST’16). 355–369.
[13] Naylor G. Bachiega, Paulo S. L. Souza, Sarita Mazzini Bruschi, and Simone do Rocio Senger de Souza. 2018. Container-
based performance evaluation: A survey and challenges. In Proceedings of the 2018 IEEE International Conference on
Cloud Engineering (IC2E’18). IEEE, Los Alamitos, CA, 398–403.
[14] M. Bacis, R. Brondolin, and M. D. Santambrogio. 2020. BlastFunction: An FPGA-as-a-service system for accelerated
serverless computing. In Proceedings of the 2020 Design, Automation, and Test in Europe Conference and Exhibition
(DATE’20). 852–857. https://doi.org/10.23919/DATE48585.2020.9116333
[15] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, et al. 2017.
Serverless computing: Current trends and open problems. In Research Advances in Cloud Computing. Springer, 1–20.
[16] Ioana Baldini, Perry Cheng, Stephen J. Fink, and Nick Mitchell. 2017. The serverless trilemma: Function composition
for serverless computing. In Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New
Paradigms, and Reflections on Programming and Software, Onward! ACM, New York, NY, 89–103.
[17] Bartosz Balis. 2016. HyperFlow: A model of computation, programming approach and enactment engine for complex
distributed workflows. Future Gener. Comput. Syst. 55 (2016), 147–162. https://doi.org/10.1016/j.future.2015.08.015
[18] Christian Bargmann and Marina Tropmann-Frick. 2019. A survey on secure container isolation approaches for multi-
tenant container workloads and serverless computing. In Proceedings of the 8th Workshop on Software Quality Anal-
ysis, Monitoring, Improvement, and Applications (SQAMIA’19). http://ceur-ws.org/Vol-2508/paper-bar.pdf
[19] S. Barlev, Z. Basil, S. Kohanim, R. Peleg, S. Regev, and Alexandra Shulman-Peleg. 2016. Secure yet usable: Protecting
servers and Linux containers. IBM J. Res. Dev. 60, 4 (2016), 12. https://doi.org/10.1147/JRD.2016.2574138
[20] David Bermbach, Ahmet-Serdar Karakaya, and Simon Buchholz. 2020. Using application knowledge to reduce cold
starts in FaaS services. In Proceedings of the 35th ACM/SIGAPP Symposium on Applied Computing (SAC’20). ACM,
New York, NY, 134–143. https://doi.org/10.1145/3341105.3373909
[21] Kahina Bessai, Samir Youcef, Ammar Oulamara, Claude Godart, and Selmin Nurcan. 2012. Bi-criteria workflow tasks
allocation and scheduling in cloud computing environments. In Proceedings of the 2012 IEEE 5th International Con-
ference on Cloud Computing. IEEE, Los Alamitos, CA, 638–645. https://doi.org/10.1109/CLOUD.2012.83
[22] Nilton Bila, Paolo Dettori, Ali Kanso, Yuji Watanabe, and Alaa Youssef. 2017. Leveraging the serverless architecture
for securing Linux containers. In Proceedings of the 37th IEEE International Conference on Distributed Computing
Systems Workshops (ICDCS Workshops’17). IEEE, Los Alamitos, CA, 401–404.
[23] Sol Boucher, Anuj Kalia, David G. Andersen, and Michael Kaminsky. 2018. Putting the “Micro” back in microservice.
In Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC’18). 645–650. https://www.usenix.org/
conference/atc18/presentation/boucher.
[24] Mark Boyd. 2021. Serverless: IOpipe Launches a Monitoring Tool for AWS Lambda. Retrieved February 8, 2022 from
https://thenewstack.io/iopipe-launches-lambda-monitoring-tool-aws-summit/.
[25] Frank Budinsky. 2021. Canary Deployments Using Istio. Retrieved February 8, 2022 from https://istio.io/latest/blog/
2017/0.1-canary/.
[26] Rajkumar Buyya, Satish Narayana Srirama, Giuliano Casale, and Rodrigo N. Calheiros. 2019. A manifesto for future
generation cloud computing: Research directions for the next decade. ACM Comput. Surv. 51, 5 (2019), Article 105,
38 pages. https://doi.org/10.1145/3241737
[27] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, James Broberg, and Ivona Brandic. 2009. Cloud computing
and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Gener. Comput.
Syst. 25, 6 (2009), 599–616. https://doi.org/10.1016/j.future.2008.12.001
[28] James Cadden, Thomas Unger, Yara Awad, and Han Dong. 2020. SEUSS: Skip redundant paths to make serverless
fast. In Proceedings of the 15th EuroSys Conference (EuroSys’20). ACM, New York, NY, Article 32, 15 pages.
[29] Benjamin Carver, Jingyuan Zhang, Ao Wang, Ali Anwar, Panruo Wu, and Yue Cheng. 2020. Wukong: A scalable and
locality-enhanced framework for serverless parallel computing. In Proceedings of the 11th ACM Symposium on Cloud
Computing (SoCC’20). ACM, New York, NY, 1–15. https://doi.org/10.1145/3419111.3421286
[30] Benjamin Carver, Jingyuan Zhang, Ao Wang, and Yue Cheng. 2019. In search of a fast and efficient serverless DAG
engine. CoRR abs/1910.05896 (2019). http://arxiv.org/abs/1910.05896.
[31] Israel Casas, Javid Taheri, Rajiv Ranjan, and Albert Y. Zomaya. 2017. PSO-DS: A scheduling engine for scientific
workflow managers. J. Supercomput. 73, 9 (2017), 3924–3947. https://doi.org/10.1007/s11227-017-1992-z
[32] Chia-Chen Chang, Shun-Ren Yang, En-Hau Yeh, Phone Lin, and Jeu-Yih Jeng. 2017. A Kubernetes-based monitoring
platform for dynamic cloud resource provisioning. In Proceedings of the 2017 IEEE Global Communications Conference
(GLOBECOM’17). IEEE, Los Alamitos, CA, 1–6.
[33] Liuhua Chen and Haiying Shen. 2017. Considering resource demand misalignments to reduce resource over-
provisioning in cloud datacenters. In Proceedings of the 2017 IEEE Conference on Computer Communications
(INFOCOM’17). IEEE, Los Alamitos, CA, 1–9.
[34] Liuhua Chen, Haiying Shen, and Stephen Platt. 2016. Cache contention aware virtual machine placement and mi-
gration in cloud datacenters. In Proceedings of the 24th IEEE International Conference on Network Protocols (ICNP’16).
IEEE, Los Alamitos, CA, 1–10.
[35] Shuang Chen, Christina Delimitrou, and José F. Martínez. 2019. PARTIES: QoS-aware resource partitioning for multi-
ple interactive services. In Proceedings of the 24th International Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS’19). ACM, New York, NY, 107–120.
[36] Xinyu Chen, Yao Chen, Ronak Bajaj, Jiong He, Bingsheng He, Weng-Fai Wong, and Deming Chen. 2020. Is FPGA
useful for hash joins? In Proceedings of the 10th Conference on Innovative Data Systems Research (CIDR’20).
[37] Xinyu Chen, Hongshi Tan, Yao Chen, Bingsheng He, Weng-Fai Wong, and Deming Chen. 2021. ThunderGP: HLS-
based graph processing framework on FPGAs. In Proceedings of the 2021 ACM/SIGDA International Symposium on
Field-Programmable Gate Arrays (FPGA’21). ACM, New York, NY, 69–80. https://doi.org/10.1145/3431920.3439290
[38] Eli Cortez, Anand Bonde, and Alexandre Muzio. 2017. Resource central: Understanding and predicting workloads for
improved resource management in large cloud platforms. In Proceedings of the 26th Symposium on Operating Systems
Principles. ACM, New York, NY, 153–167.
[39] GitHub. 2021. CRIU: A Utility to Checkpoint/Restore Linux Tasks in Userspace. Retrieved February 8, 2022 from
https://github.com/checkpoint-restore/criu.
[40] Nilanjan Daw, Umesh Bellur, and Purushottam Kulkarni. 2020. Xanadu: Mitigating cascading cold starts in serverless
function chain deployments. In Proceedings of the 21st International Middleware Conference (Middleware’20). ACM,
New York, NY, 356–370.
[41] Docker. 2021. Home Page. Retrieved February 8, 2022 from https://www.docker.com/.
[42] Dong Du, Tianyi Yu, Yubin Xia, Binyu Zang, Guanglu Yan, Chenggang Qin, Qixuan Wu, and Haibo Chen. 2020.
Catalyzer: Sub-millisecond startup for serverless computing with initialization-less booting. In Architectural Support
for Programming Languages and Operating Systems (ASPLOS’20). ACM, New York, NY, 467–481. https://doi.org/10.
1145/3373376.3378512
[43] AWS. 2021. Elastic Load Balancing: Application Load Balancers. Retrieved February 8, 2022 from https://docs.aws.
amazon.com/elasticloadbalancing/latest/application/elb-ag.pdf.
[44] Fission. 2021. Execute Mode in Fission. Retrieved February 8, 2022 from https://fission.io/docs/usage/function/
executor/.
[45] Erwin Van Eyk, Lucian Toader, and Sacheendra Talluri. 2018. Serverless is more: From PaaS to present cloud com-
puting. IEEE Internet Comput. 22, 5 (2018), 8–17.
[46] GitHub. 2021. Fission Workflows: Fast, Reliable and Lightweight Function Composition for Serverless Functions.
Retrieved February 8, 2022 from https://github.com/fission/fission-workflows.
[47] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Balasubramaniam, William Zeng, Rahul Bhalerao,
Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, fast and slow: Low-latency video processing
using thousands of tiny threads. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and
Implementation (NSDI’17). 363–376.
[48] Etienne Tremel. 2021. Deployment Strategies on Kubernetes. Retrieved February 8, 2022 from https://www.cncf.io/
wp-content/uploads/2020/08/CNCF-Presentation-Template-K8s-Deployment.pdf.
[49] GitHub. 2021. Google Container Runtime Sandbox. Retrieved February 8, 2022 from https://github.com/google/
gvisor.
[50] Xinjie Guan, Xili Wan, Baek-Young Choi, Sejun Song, and Jiafeng Zhu. 2017. Application oriented dynamic resource
allocation for data centers using Docker containers. IEEE Commun. Lett. 21, 3 (2017), 504–507.
[51] Tyler Harter, Brandon Salmon, Rose Liu, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Slacker:
Fast distribution with lazy Docker containers. In Proceedings of the 14th USENIX Conference on File and Storage
Technologies (FAST’16). 181–195. https://www.usenix.org/conference/fast16/technical-sessions/presentation/harter.
[52] Hassan B. Hassan, Saman A. Barakat, and Qusay I. Sarhan. 2021. Survey on serverless computing. J. Cloud Comput.
10, 1 (2021), 39. https://doi.org/10.1186/s13677-021-00253-7
[53] Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga Govindaraju, Qiong Luo, and Pedro Sander. 2008. Relational joins
on graphics processors. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
(SIGMOD’08). ACM, New York, NY, 511–524. https://doi.org/10.1145/1376616.1376670
[54] Joseph M. Hellerstein, Jose M. Faleiro, and Joseph Gonzalez. 2019. Serverless computing: One step forward, two steps
back. In Proceedings of the 9th Biennial Conference on Innovative Data Systems Research (CIDR’19).
[55] Scott Hendrickson, Stephen Sturdevant, Edward Oakes, Tyler Harter, Venkateshwaran Venkataramani, Andrea C.
Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. Login Usenix Mag.
41, 4 (2016), 14–19. https://www.usenix.org/publications/login/winter2016/hendrickson.
[56] Honeycomb. 2021. Home Page. Retrieved February 8, 2022 from https://www.honeycomb.io/.
[57] M. Reza HoseinyFarahabady, Albert Y. Zomaya, and Zahir Tari. 2018. A model predictive controller for managing
QoS enforcements and microarchitecture-level interferences in a lambda platform. IEEE Trans. Parallel Distrib. Syst.
29, 7 (2018), 1442–1455.
[58] Microsoft. 2021. Isolation Modes. Retrieved February 8, 2022 from https://docs.microsoft.com/en-us/virtualization/
windowscontainers/manage-containers/hyperv-container.
[59] Shigeru Imai, Thomas Chestna, and Carlos A. Varela. 2013. Accurate resource prediction for hybrid IaaS clouds using
workload-tailored elastic compute units. In Proceedings of the IEEE/ACM 6th International Conference on Utility and
Cloud Computing (UCC’13). IEEE, Los Alamitos, CA, 171–178. https://doi.org/10.1109/UCC.2013.40
[60] Shigeru Imai, Stacy Patterson, and Carlos A. Varela. 2018. Uncertainty-aware elastic virtual machine scheduling for
stream processing systems. In Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud, and Grid
Computing (CCGRID’18). IEEE, Los Alamitos, CA, 62–71. https://doi.org/10.1109/CCGRID.2018.00021
[61] Vitalii Ivanov and Kari Smolander. 2018. Implementation of a DevOps pipeline for serverless applications. In Product-
Focused Software Process Improvement. Lecture Notes in Computer Science, Vol. 11271. Springer, 48–64.
[62] David Jackson and Gary Clynch. 2018. An investigation of the impact of language runtime on the performance
and cost of serverless functions. In Proceedings of the 2018 IEEE/ACM International Conference on Utility and Cloud
Computing Companion (UCC Companion’18). IEEE, Los Alamitos, CA, 154–160.
[63] Jenkins. 2021. DevOps CI Tool. Retrieved February 8, 2022 from https://www.jenkins.io/.
[64] Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the cloud: Distributed
computing for the 99%. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC’17). ACM, New York, NY,
445–451. https://doi.org/10.1145/3127479.3128601
[65] Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar,
et al. 2019. Cloud programming simplified: A Berkeley view on serverless computing. CoRR abs/1902.03383 (2019).
http://arxiv.org/abs/1902.03383.
[66] Kostis Kaffes, Neeraja J. Yadwadkar, and Christos Kozyrakis. 2019. Centralized core-granular scheduling for server-
less functions. In Proceedings of the ACM Symposium on Cloud Computing (SoCC’19). ACM, New York, NY, 158–164.
[67] Kata Containers. 2021. Home Page. Retrieved February 8, 2022 from https://katacontainers.io/.
[68] Alireza Keshavarzian, Saeed Sharifian, and Sanaz Seyedin. 2019. Modified deep residual network architecture de-
ployed on serverless framework of IoT platform based on human activity recognition application. Future Gener.
Comput. Syst. 101 (2019), 14–28.
[69] Asif Khan. 2017. Key characteristics of a container orchestration platform to enable a modern application. IEEE Cloud
Comput. 4, 5 (2017), 42–48. https://doi.org/10.1109/MCC.2017.4250933
[70] Young Ki Kim, M. Reza HoseinyFarahabady, Young Choon Lee, and Albert Y. Zomaya. 2020. Automated fine-grained
CPU cap control in serverless computing platform. IEEE Trans. Parallel Distrib. Syst. 31, 10 (2020), 2289–2301. https:
//doi.org/10.1109/TPDS.2020.2989771
[71] Ana Klimovic, Yawen Wang, Patrick Stuedi, and Animesh Trivedi. 2018. Pocket: Elastic ephemeral storage for
serverless analytics. In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation
(OSDI’18). 427–444.
[72] Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, et al. 2020. Spectre
attacks: Exploiting speculative execution. Commun. ACM 63, 7 (2020), 93–101.
[73] Ricardo Koller and Alan Dawson. 2021. Vulnerability Advisor—Secure your Dev + Ops Across Containers. Retrieved
February 8, 2022 from https://www.ibm.com/blogs/cloud-archive/2016/11/vulnerability-advisor-secure-your-dev-
ops-across-containers/.
[74] GitHub. 2021. Kubeless. Retrieved February 8, 2022 from https://kubeless.io/.
[75] Kubernetes. 2021. CronJob. Retrieved February 8, 2022 from https://kubernetes.io/docs/concepts/workloads/
controllers/cron-jobs/.
[76] Anthony Kwan, Jonathon Wong, Hans-Arno Jacobsen, and Vinod Muthusamy. 2019. HyScale: Hybrid and network
scaling of dockerized microservices in cloud data centres. In Proceedings of the 39th IEEE International Conference on
Distributed Computing Systems (ICDCS’19). IEEE, Los Alamitos, CA, 80–90. https://doi.org/10.1109/ICDCS.2019.00017
[77] Hyungro Lee, Kumar Satyam, and Geoffrey C. Fox. 2018. Evaluation of production serverless computing environ-
ments. In Proceedings of the 11th IEEE International Conference on Cloud Computing (CLOUD’18). IEEE, Los Alamitos,
CA, 442–450.
[78] Philipp Leitner, Erik Wittern, Josef Spillner, and Waldemar Hummer. 2019. A mixed-method empirical study of
function-as-a-service software development in industrial practice. J. Syst. Softw. 149 (2019), 340–359. https://doi.org/
10.1016/j.jss.2018.12.013
[79] Huiba Li, Yifan Yuan, Rui Du, Kai Ma, Lanzheng Liu, and Windsor Hsu. 2020. DADI: Block-level image service for
agile and elastic application deployment. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX
ATC’20). 727–740.
[80] Wubin Li and Ali Kanso. 2015. Comparing containers versus virtual machines for achieving high availability. In
Proceedings of the 2015 IEEE International Conference on Cloud Engineering (IC2E’15). IEEE, Los Alamitos, CA, 353–
358.
[81] Changyuan Lin and Hamzeh Khazaei. 2021. Modeling and optimization of performance and cost of serverless appli-
cations. IEEE Trans. Parallel Distrib. Syst. 32, 3 (2021), 615–632.
[82] W. Ling, L. Ma, C. Tian, and Z. Hu. 2019. Pigeon: A dynamic and efficient serverless and FaaS framework for private
cloud. In Proceedings of the 2019 International Conference on Computational Science and Computational Intelligence
(CSCI’19). 1416–1421. https://doi.org/10.1109/CSCI49370.2019.00265
[83] David Lion, Adrian Chu, Hailong Sun, Xin Zhuang, Nikola Grcevski, and Ding Yuan. 2017. Don’t get caught in
the cold, warm up your JVM. Login Usenix Mag. 42, 1 (2017), 46–51. https://www.usenix.org/publications/login/
spring2017/lion.
[84] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Jann Horn, Stefan Mangard, et al. 2020.
Meltdown: Reading kernel memory from user space. Commun. ACM 63, 6 (2020), 46–56.
[85] Pedro García López, Aitor Arjona, Josep Sampé, Aleksander Slominski, and Lionel Villard. 2020. Triggerflow: Trigger-
based orchestration of serverless workflows. In Proceedings of the 14th ACM International Conference on Distributed
and Event-Based Systems (DEBS’20). ACM, New York, NY, 3–14. https://doi.org/10.1145/3401025.3401731
[86] Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David J. Scott, Balraj Singh, Thomas Gazagnaire, Steven
Smith, Steven Hand, and Jon Crowcroft. 2013. Unikernels: Library operating systems for the cloud. In Proceedings of
Architectural Support for Programming Languages and Operating Systems (ASPLOS’13). ACM, New York, NY, 461–472.
https://doi.org/10.1145/2451116.2451167
[87] Ashraf Mahgoub, Karthick Shankar, Subrata Mitra, Ana Klimovic, Somali Chaterji, and Saurabh Bagchi. 2021. SONIC:
Application-aware data passing for chained serverless applications. In Proceedings of the 2021 USENIX Annual Tech-
nical Conference (USENIX ATC’21). 285–301.
[88] Nima Mahmoudi, Changyuan Lin, Hamzeh Khazaei, and Marin Litoiu. 2019. Optimizing serverless computing: In-
troducing an adaptive function placement algorithm. In Proceedings of the 29th Annual International Conference on
Computer Science and Software Engineering (CASCON’19). ACM, New York, NY, 203–213.
[89] Maciej Malawski, Adam Gajek, Adam Zima, Bartosz Balis, and Kamil Figiela. 2020. Serverless execution of scientific
workflows: Experiments with HyperFlow, AWS Lambda and Google Cloud Functions. Future Gener. Comput. Syst.
110 (2020), 502–514.
[90] Filipe Manco, Costin Lupu, Florian Schmidt, Jose Mendes, Simon Kuenzer, Sumit Sati, Kenichi Yasukata, Costin Raiciu,
and Felipe Huici. 2017. My VM is lighter (and safer) than your container. In Proceedings of the 26th Symposium on
Operating Systems Principles. ACM, New York, NY, 218–233. https://doi.org/10.1145/3132747.3132763
[91] Mohammad Masdari, Sima ValiKardan, Zahra Shahi, and Sonay Imani Azar. 2016. Towards workflow scheduling in
cloud computing: A comprehensive analysis. J. Netw. Comput. Appl. 66 (2016), 64–82.
[92] Massimiliano Mattetti, Alexandra Shulman-Peleg, Yair Allouche, Antonio Corradi, Shlomi Dolev, and Luca Foschini.
2015. Securing the infrastructure and the workloads of Linux containers. In Proceedings of the 2015 IEEE Conference
on Communications and Network Security (CNS’15). IEEE, Los Alamitos, CA, 559–567. https://doi.org/10.1109/CNS.
2015.7346869
[93] Sean McDaniel, Stephen Herbein, and Michela Taufer. 2015. A two-tiered approach to I/O quality of service in Docker
containers. In Proceedings of the 2015 IEEE International Conference on Cluster Computing (CLUSTER’15). IEEE, Los
Alamitos, CA, 490–491.
[94] M. Garrett McGrath and Paul R. Brenner. 2017. Serverless computing: Design, implementation, and performance.
In Proceedings of the 37th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS Work-
shops’17). IEEE, Los Alamitos, CA, 405–410. https://doi.org/10.1109/ICDCSW.2017.36
[95] GitHub. 2021. Mirage-Skeleton with Simple MirageOS Applications. Retrieved February 8, 2022 from https://github.
com/mirage/mirage-skeleton.
[96] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A.
Riedmiller. 2013. Playing Atari with deep reinforcement learning. CoRR abs/1312.5602 (2013).
[97] Anup Mohan, Harshad Sane, Kshitij Doshi, and Saikrishna Edupuganti. 2019. Agile cold starts for scalable serverless.
In Proceedings of the 11th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud’19). https://www.usenix.
org/conference/hotcloud19/presentation/mohan.
[98] Diana M. Naranjo, Sebastián Risco, Carlos de Alfonso, Alfonso Pérez, Ignacio Blanquer, and Germán Moltó. 2020.
Accelerated serverless computing based on GPU virtualization. J. Parallel Distrib. Comput. 139 (2020), 32–42. https:
//doi.org/10.1016/j.jpdc.2020.01.004
[99] Hylson Vescovi Netto, Lau Cheuk Lung, Miguel Correia, Aldelir Fernando Luiz, and Luciana Moreira Sá de Souza.
2017. State machine replication in containers managed by Kubernetes. J. Syst. Archit. 73 (2017), 53–59.
[100] Hiep Nguyen, Zhiming Shen, Xiaohui Gu, Sethuraman Subbiah, and John Wilkes. 2013. AGILE: Elastic distributed
resource scaling for infrastructure-as-a-service. In Proceedings of the 10th International Conference on Autonomic
Computing (ICAC’13). 69–82.
[101] Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea C. Arpaci-Dusseau, and Remzi H.
Arpaci-Dusseau. 2018. SOCK: Rapid task provisioning with serverless-optimized containers. In Proceedings of
the 2018 USENIX Annual Technical Conference (USENIX ATC’18). 57–70. https://www.usenix.org/conference/atc18/
presentation/oakes.
[102] Pierre Olivier, Daniel Chiba, Stefan Lankes, Changwoo Min, and Binoy Ravindran. 2019. A binary-compatible uniker-
nel. In Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
(VEE’19). ACM, New York, NY, 59–73. https://doi.org/10.1145/3313808.3313817
[103] GitHub. 2021. OpenWhisk: Serverless Functions Platform for Building Cloud Applications. Retrieved February 8,
2022 from https://github.com/apache/openwhisk.
[104] GitHub. 2021. Prewarm in Apache OpenWhisk. Retrieved February 8, 2022 from https://github.com/apache/
openwhisk/blob/master/docs/actions-python.md.
[105] Microsoft. 2021. Azure Functions Premium Plan. Retrieved February 8, 2022 from https://docs.microsoft.com/en-
us/azure/azure-functions/functions-premium-plan.
[106] Qifan Pu, Shivaram Venkataraman, and Ion Stoica. 2019. Shuffling, fast and slow: Scalable analytics on server-
less infrastructure. In Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation
(NSDI’19). 193–206.
[107] K. V. Rashmi, Mosharaf Chowdhury, Jack Kosaian, Ion Stoica, and Kannan Ramchandran. 2016. EC-cache: Load-
balanced, low-latency cluster caching with online erasure coding. In Proceedings of the 12th USENIX Symposium on
Operating Systems Design and Implementation (OSDI’16). 401–417.
[108] Josep Sampé, Marc Sánchez Artigas, Pedro García López, and Gerard París. 2017. Data-driven serverless functions
for object storage. In Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference. ACM, New York, NY, 121–133.
[109] Josep Sampé, Pedro García López, and Marc Sánchez Artigas. 2016. Vertigo: Programmable micro-controllers
for software-defined object storage. In Proceedings of the 9th IEEE International Conference on Cloud Computing
(CLOUD’16). IEEE, Los Alamitos, CA, 180–187.
[110] Joel Scheuner and Philipp Leitner. 2020. Function-as-a-service performance evaluation: A multivocal literature re-
view. J. Syst. Softw. 170 (2020), 110708.
[111] Joel Scheuner and Philipp Leitner. 2020. The state of research on function-as-a-service performance evaluation: A
multivocal literature review. CoRR abs/2004.03276 (2020). https://arxiv.org/abs/2004.03276.
[112] Johann Schleier-Smith, Vikram Sreekanti, Anurag Khandelwal, Joao Carreira, Neeraja Jayant Yadwadkar, Raluca Ada
Popa, Joseph E. Gonzalez, Ion Stoica, and David A. Patterson. 2021. What serverless computing is and should become:
The next phase of cloud computing. Commun. ACM 64, 5 (2021), 76–84.
[113] Florian Schmidt. 2017. Uniprof: A unikernel stack profiler. In Posters and Demos Proceedings of the Conference of the
ACM Special Interest Group on Data Communication (SIGCOMM’17). ACM, New York, NY, 31–33. https://doi.org/10.
1145/3123878.3131976
[114] Michael Schwarz, Moritz Lipp, Daniel Moghimi, Jo Van Bulck, Julian Stecklina, Thomas Prescher, and Daniel Gruss.
2019. ZombieLoad: Cross-privilege-boundary data sampling. In Proceedings of the 2019 ACM SIGSAC Conference on
Computer and Communications Security (CCS’19). ACM, New York, NY, 753–768. https://doi.org/10.1145/3319535.
3354252
[115] Srinath T. V. Setty, Chunzhi Su, and Jacob R. Lorch. 2016. Realizing the fault-tolerance promise of cloud storage using
locks with intent. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation
(OSDI’16). 501–516.
[116] Hossein Shafiei, Ahmad Khonsari, and Payam Mousavi. 2021. Serverless computing: A survey of opportunities, chal-
lenges and applications. arXiv:1911.01296 [cs.NI].
[117] Mohammad Shahrad, Jonathan Balkind, and David Wentzlaff. 2019. Architectural implications of function-as-a-
service computing. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO’19). ACM, New York, NY, 1063–1075. https://doi.org/10.1145/3352460.3358296
[118] Mohammad Shahrad, Rodrigo Fonseca, Iñigo Goiri, and Gohar Chaudhry. 2020. Serverless in the wild: Characterizing
and optimizing the serverless workload at a large cloud provider. In Proceedings of the 2020 USENIX Annual Technical
Conference (USENIX ATC’20). 205–218. https://www.usenix.org/conference/atc20/presentation/shahrad.
[119] Vaishaal Shankar, Karl Krauth, and Qifan Pu. 2018. Numpywren: Serverless linear algebra. CoRR abs/1810.09679
(2018).
[120] Arjun Singhvi, Junaid Khalid, Aditya Akella, and Sujata Banerjee. 2020. SNF: Serverless network functions. In Pro-
ceedings of the ACM Symposium on Cloud Computing (SoCC’20). ACM, New York, NY, 296–310.
[121] Dimitrios Skarlatos, Umur Darbaz, Bhargava Gopireddy, and Nam Sung Kim. 2020. BabelFish: Fusing address transla-
tions for containers. In Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture
(ISCA’20). IEEE, Los Alamitos, CA, 501–514.
[122] Sonarqube. 2021. Code Quality and Code Security. Retrieved February 8, 2022 from https://www.sonarqube.org/.
[123] Sparta. 2021. A Go Framework for AWS Lambda Microservices. Retrieved February 8, 2022 from http://gosparta.io/.
[124] Vikram Sreekanti, Chenggang Wu, and Saurav Chhatrapati. 2020. A fault-tolerance shim for serverless computing.
In Proceedings of the 15th EuroSys Conference (EuroSys’20). ACM, New York, NY, Article 15, 15 pages.
[125] Vikram Sreekanti, Chenggang Wu, Xiayue Charles Lin, and Johann Schleier-Smith. 2020. Cloudburst: Stateful
functions-as-a-service. Proc. VLDB Endow. 13, 11 (2020), 2438–2452.
[126] Satish Narayana Srirama and Alireza Ostovar. 2018. Optimal cloud resource provisioning for auto-scaling enterprise
applications. Int. J. Cloud Comput. 7, 2 (2018), 129–162. https://doi.org/10.1504/IJCC.2018.10014880
[127] Amoghavarsha Suresh and Anshul Gandhi. 2019. FnSched: An efficient scheduler for serverless functions. In Pro-
ceedings of the 5th International Workshop on Serverless Computing (WOSC@Middleware’19). ACM, New York, NY,
19–24.
[128] Byungchul Tak, Canturk Isci, Sastry Duri, Nilton Bila, Shripad Nadgowda, and James Doran. 2017. Understanding
security implications of using containers in the cloud. In Proceedings of the 2017 USENIX Annual Technical Conference
(USENIX ATC’17). 313–319.
[129] Ali Tariq, Austin Pahl, Sharat Nimmagadda, Eric Rozner, and Siddharth Lanka. 2020. Sequoia: Enabling quality-of-
service in serverless computing. In Proceedings of the ACM Symposium on Cloud Computing (SoCC’20). ACM, New
York, NY, 311–327.
[130] Jörg Thalheim, Pramod Bhatotia, Pedro Fonseca, and Baris Kasikci. 2018. Cntr: Lightweight OS containers. In Proceed-
ings of the 2018 USENIX Annual Technical Conference (USENIX ATC’18). 199–212. https://www.usenix.org/conference/
atc18/presentation/thalheim.
[131] Raúl Gracia Tinedo, Pedro García López, Marc Sánchez Artigas, and Josep Sampé. 2016. IOStack: Software-defined
object storage. IEEE Internet Comput. 20, 3 (2016), 10–18.
[132] Raúl Gracia Tinedo, Josep Sampé, and Edgar Zamora-Gómez. 2017. Crystal: Software-defined storage for multi-
tenant object stores. In Proceedings of the 15th USENIX Conference on File and Storage Technologies (FAST’17). 243–256.
[133] László Toka, Gergely Dobreff, Balázs Fodor, and Balázs Sonkoly. 2020. Adaptive AI-based auto-scaling for Kubernetes.
In Proceedings of the 20th IEEE/ACM International Symposium on Cluster, Cloud, and Internet Computing (CCGRID’20).
IEEE, Los Alamitos, CA, 599–608.
[134] GitHub. 2021. Creating and Invoking Docker Actions. Retrieved February 8, 2022 from https://github.com/apache/
openwhisk/blob/master/docs/actions-docker.md.
[135] Alexandre Verbitski, Anurag Gupta, and Debanjan Saha. 2017. Amazon Aurora: Design considerations for high
throughput cloud-native relational databases. In Proceedings of the 2017 ACM International Conference on Manage-
ment of Data (SIGMOD’17). ACM, New York, NY, 1041–1052.
[136] Jaagup Viil and Satish Narayana Srirama. 2018. Framework for automated partitioning and execution of scientific
workflows in the cloud. J. Supercomput. 74, 6 (2018), 2656–2683.
[137] Muhammad Wajahat, Anshul Gandhi, Alexei A. Karve, and Andrzej Kochut. 2016. Using machine learning for black-
box autoscaling. In Proceedings of the 7th International Green and Sustainable Computing Conference (IGSC’16). IEEE,
Los Alamitos, CA, 1–8.
[138] Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan,
and Yue Cheng. 2020. InfiniCache: Exploiting ephemeral serverless functions to build a cost-effective memory cache.
In Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST’20). 267–281.
[139] Hao Wang, Di Niu, and Baochun Li. 2019. Distributed machine learning with a serverless architecture. In Proceedings
of the 2019 IEEE Conference on Computer Communications (INFOCOM’19). IEEE, Los Alamitos, CA, 1288–1296.
[140] Kai-Ting Amy Wang, Rayson Ho, and Peng Wu. 2019. Replayable execution optimized for page sharing for a managed
runtime environment. In Proceedings of the 14th EuroSys Conference (EuroSys’19). ACM, New York, NY. https://doi.
org/10.1145/3302424.3303978
[141] Liang Wang, Mengyuan Li, Yinqian Zhang, Thomas Ristenpart, and Michael M. Swift. 2018. Peeking behind the
curtains of serverless platforms. In Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC’18).
133–146.
[142] Stephanie Wang, John Liagouris, and Robert Nishihara. 2019. Lineage stash: Fault tolerance off the critical path. In
Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP’19). ACM, New York, NY, 338–352.
[143] Jake Wires and Andrew Warfield. 2017. Mirador: An active control plane for datacenter storage. In Proceedings of the
15th USENIX Conference on File and Storage Technologies (FAST’17). 213–228.
[144] Mingyu Wu, Zeyu Mi, and Yubin Xia. 2020. A survey on serverless computing and its implications for JointCloud
computing. In Proceedings of the 2020 IEEE International Conference on Joint Cloud Computing. 94–101. https://doi.
org/10.1109/JCC49151.2020.00023
[145] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: An active storage framework for object storage
platform. Future Gener. Comput. Syst. 56 (2016), 746–758.
[146] Zhengjun Xu, Haitao Zhang, Xin Geng, Qiong Wu, and Huadong Ma. 2019. Adaptive function launching acceleration
in serverless computing platforms. In Proceedings of the 25th IEEE International Conference on Parallel and Distributed
Systems (ICPADS’19). IEEE, Los Alamitos, CA, 9–16. https://doi.org/10.1109/ICPADS47876.2019.00011
[147] Kejiang Ye, Zhaohui Wu, Chen Wang, Bing Bing Zhou, Weisheng Si, Xiaohong Jiang, and Albert Y. Zomaya. 2015.
Profiling-based workload consolidation and migration in virtualized data centers. IEEE Trans. Parallel Distrib. Syst.
26, 3 (2015), 878–890. https://doi.org/10.1109/TPDS.2014.2313335
[148] Tianyi Yu, Qingyuan Liu, and Dong Du. 2020. Characterizing serverless platforms with serverlessbench. In Proceed-
ings of the ACM Symposium on Cloud (SoCC’20). ACM, New York, NY, 30–44.
[149] Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief. 2018. SP-Cache: Load-balanced,
redundancy-free cluster caching with selective partition. In Proceedings of the International Conference for High Per-
formance Computing, Networking, Storage, and Analysis. IEEE, Los Alamitos, CA, Article 1, 13 pages.
[150] Chengliang Zhang, Minchen Yu, Wei Wang, and Feng Yan. 2019. MArk: Exploiting cloud services for cost-effective,
SLO-aware machine learning inference serving. In Proceedings of the 2019 USENIX Annual Technical Conference
(USENIX ATC’19). 1049–1062. https://www.usenix.org/conference/atc19/presentation/zhang-chengliang.
[151] Haoran Zhang, Adney Cardoza, and Peter Baile Chen. 2020. Fault-tolerant and transactional stateful serverless work-
flows. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI’20).
1187–1204.
[152] Tian Zhang, Dong Xie, Feifei Li, and Ryan Stutsman. 2019. Narrowing the gap between serverless and its state with
storage functions. In Proceedings of the ACM Symposium on Cloud Computing (SoCC’19). ACM, New York, NY, 1–12.
[153] Wen Zhang, Vivian Fang, Aurojit Panda, and Scott Shenker. 2020. Kappa: A programming framework for serverless
computing. In Proceedings of the ACM Symposium on Cloud Computing (SoCC’20). ACM, New York, NY, 328–343.
[154] Ge Zheng and Yang Peng. 2019. GlobalFlow: A cross-region orchestration service for serverless computing services.
In Proceedings of the 12th IEEE International Conference on Cloud Computing (CLOUD’19). IEEE, Los Alamitos, CA,
508–510.
[155] Wenjia Zheng, Michael Tynes, Henry Gorelick, Ying Mao, Long Cheng, and Yantian Hou. 2019. FlowCon: Elastic
flow configuration for containerized deep learning applications. In Proceedings of the 48th International Conference
on Parallel Processing (ICPP’19). ACM, New York, NY, Article 87, 10 pages.
[156] Jianlong Zhong and Bingsheng He. 2014. Medusa: Simplified graph processing on GPUs. IEEE Trans. Parallel Distrib.
Syst. 25, 6 (June 2014), 1543–1552. https://doi.org/10.1109/TPDS.2013.111