
Analyzing The Impact of GDPR On Storage Systems

The document discusses the impact of GDPR compliance on storage systems. It finds that over 30% of GDPR's 99 articles are related to storage. The authors modified Redis, a key-value store, to be GDPR compliant. This introduced new features but significantly reduced Redis' performance, lowering its throughput by 20 times. Strict real-time compliance poses challenges for storage systems designed for performance. The document identifies open research problems around efficient deletion, logging, and metadata indexing to enable strict GDPR compliance without compromising efficiency.

Analyzing the Impact of GDPR on Storage Systems

Aashaka Shah*1, Vinay Banakar*3, Supreeth Shastri1, Melissa Wasserman2, and Vijay Chidambaram1

1 Computer Science, University of Texas at Austin
2 School of Law, University of Texas at Austin
3 Hewlett Packard Enterprise
* Aashaka Shah and Vinay Banakar contributed equally.

The recently introduced General Data Protection Regulation (GDPR) is forcing several companies to make significant changes to their systems to achieve compliance. Motivated by the finding that more than 30% of GDPR articles are related to storage, we investigate the impact of GDPR compliance on storage systems. We illustrate the challenges of retrofitting existing systems into compliance by modifying Redis to be GDPR-compliant. We show that despite needing to introduce only a small set of new features, strict real-time compliance (e.g., logging every user request synchronously) lowers Redis’ throughput by 20×. Our work reveals how GDPR allows compliance to be a spectrum, and what its implications are for system designers. We discuss the technical challenges that need to be solved before strict compliance can be efficiently achieved.

1 Introduction

“In law, nothing is certain but the expense.”
Samuel Butler

Privacy and protection of personal data (or more aptly, the lack thereof) has become a topic of concern for modern society. The gravity of personal data breaches is evident not only in their frequency (∼1300 in 2017 alone [19]) but also in their scale (the Equifax breach [14] compromised the financial information of ∼145 million consumers) and scope (the Cambridge Analytica scandal [21] harvested personal data to influence the U.K. Brexit referendum and the 2016 U.S. Presidential elections). In response to this alarming trend, the European Union (EU) adopted a comprehensive privacy regulation called the General Data Protection Regulation (GDPR) [18].

GDPR defines the privacy of personal data as a fundamental right of all European people, and accordingly regulates the entire lifecycle of personal data. Thus, any company dealing with EU people’s personal data is legally bound to comply with GDPR. While essential, achieving compliance is not trivial: Gartner estimates [13] that less than 50% of the companies affected by GDPR would likely be compliant by the end of 2018. This challenge is exacerbated for the vast majority of companies that rely on third parties for infrastructure services, and hence do not have control over the internals of such services. For example, a company building a service on top of the Google cloud storage system would not be compliant if that cloud subsystem is violating the GDPR norms. In fact, GDPR prevents companies from using any third-party services that violate its standards.

Though GDPR governs the behavior of most of the infrastructure and operational components of an organization, its impact on storage systems is potent: 31 of the 99 articles that make up GDPR directly pertain to storage systems. Motivated by this finding, we set out to investigate the impact of GDPR on storage systems. In particular, we ask the following questions: (i) What features should a storage system have to be GDPR-compliant? (ii) How does compliance affect the performance of different types of storage systems? (iii) What are the technical challenges in achieving strict compliance in an efficient manner?

By examining the GDPR articles, we identify a core set of (six) features that must be implemented in the storage layer to achieve compliance. We hypothesize that despite needing to support only a small set of new features, storage systems would experience a significant performance impact. This stems from a key observation: GDPR’s goal of data protection by design and by default sits at odds with the traditional system design goals (especially for storage systems) of optimizing for performance, cost, and reliability. For example, the regulation on identifying and notifying data breaches requires that a controller shall keep a record of all the interactions with personal data. From a storage system perspective, this turns every read operation into a read followed by a write.

To evaluate our hypothesis, we design and implement the changes required to make Redis, a widely used key-value store, GDPR-compliant. This not only illustrates the challenges of retrofitting existing systems into GDPR compliance
but also quantifies the resulting performance overhead. Our benchmarking using YCSB demonstrates that the GDPR-compliant version experiences a 20× slowdown compared to the unmodified version.

We share several insights from our investigation. First, though GDPR is clear in its high-level goals, it is intentionally vague in its technical specifications. This allows GDPR compliance to be a continuum and not a fixed target. We define real-time compliance and eventual compliance to describe a system’s approach to completing GDPR tasks. Our experiments show the performance impact of this choice. For example, by storing the monitoring logs in a batch (say, once every second) as opposed to synchronously, Redis’ throughput improves by 6× while exposing it to the risk of losing one second’s worth of logs. Such tradeoffs present design choices for researchers and practitioners building GDPR-compliant systems. Second, some GDPR requirements sit at odds with the design principles and performance guarantees of storage systems. This could lead to storage systems offering differing levels of native support for GDPR compliance (with missing features expected to be handled by other infrastructure or policy components). Finally, we identify three key research challenges (namely, efficient deletion, efficient logging, and efficient metadata indexing) that must be solved to make strict compliance efficient.

2 Background on GDPR

GDPR [18] is laid out in 99 articles that describe its legal requirements, and 173 recitals that provide additional context and clarifications to these articles. GDPR is an expansive set of regulations that covers the entire lifecycle of personal data. As such, achieving compliance requires interfacing with infrastructure components (including compute, network, and storage systems) as well as operational components (processes, policies, and personnel). However, since our investigation is primarily concerned with GDPR’s impact on storage systems, we focus on articles that describe the behavior of storage systems. These fall into two broad categories: the rights of the data subjects (i.e., the people whose personal data has been collected) and the responsibilities of the data controllers (i.e., the companies that collect personal data).

2.1 Rights of the Data Subject

There are 12 articles that codify the rights and freedoms of people. Among these, four directly concern storage systems. The first one, Article 15: Right of access by the data subject, allows any person whose personal data has been collected by a company to obtain detailed information about its usage, including (i) the purposes of processing, (ii) the recipients to whom it has been disclosed, (iii) the period for which it will be stored, and (iv) its use in any automated decision-making. Thus, the storage system should not only be designed to store these metadata but also be organized to allow timely access. Related to this is Article 21: Right to object, which allows a person to object at any time to the use of their personal data for the purposes of marketing, scientific research, historical archiving, or profiling. This requires storage systems to know both whitelisted and blacklisted purposes associated with personal data at all times, and to control access to it dynamically.

Most prominently, however, Article 17: Right to be forgotten grants people the right to require the data controller to erase their personal data without undue delay¹. This right is broadly construed whether or not the personal data was obtained directly from the customer, or if the customer had previously given consent. From a storage perspective, the article demands that the requested data be erased in a timely manner, including all its replicas and backups. Finally, Article 20: Right to data portability states that people have the right to obtain all their personal information in a commonly used format as well as the right to have it transmitted to another company directly. Thus, storage systems should have the capability to access and transmit all data belonging to a particular user in a timely fashion.

¹ Article 17 covers only the personal data, not the insights derived from it; nor can it be used to violate the rights of other people or law enforcement.

2.2 Responsibilities of the Data Controller

Among the articles that outline the responsibilities of data controllers, 10 concern storage systems. Three articles elucidate the high-level principles of data security and privacy that must be followed by all controllers. Article 24: Responsibility of the controller establishes that the ultimate responsibility for the security of all personal data lies with the controller that has collected it; Article 32: Security of processing requires the controller to implement risk-appropriate and state-of-the-art security measures, including encryption and pseudonymization; and lastly, Article 25: Data protection by design and by default specifies that all systems must be designed, configured, and administered with data protection as a primary goal.

There are several articles that set guidelines for the collection, processing, and transmission of personal data. The purpose limitation of Article 5: Processing of personal data mandates that personal data should only be collected for specific purposes and not be used for any other purposes. From a storage standpoint, this translates to maintaining associated (purpose-)metadata that can be accessed and updated by systems that process personal data. Interestingly, Article 13 also establishes that data subjects have the right to know the specific purposes for which their personal data would be used as well as the duration for which it will be stored. The latter requirement means that storage systems have to support time-to-live mechanisms in order to automatically erase expired personal data.
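The purpose and retention requirements described above can be made concrete with a small sketch. The Python example below is illustrative only and is not the authors' implementation: the `PersonalRecord` type, its field names, and the `may_process` and `purge_expired` helpers are all hypothetical. It assumes each stored value carries the purposes it was collected for (Article 5), any purposes the data subject has objected to (Article 21), and an absolute expiry time (Article 13).

```python
import time
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    """A stored value plus GDPR metadata (hypothetical layout)."""
    owner: str                                     # data subject the value belongs to
    value: str
    purposes: set = field(default_factory=set)     # purposes the data was collected for
    objections: set = field(default_factory=set)   # purposes the subject objected to
    expires_at: float = float("inf")               # absolute expiry time in seconds


def may_process(record, purpose, now=None):
    """Return True only if the record is unexpired and the purpose is permitted."""
    now = time.time() if now is None else now
    if now >= record.expires_at:
        return False
    return purpose in record.purposes and purpose not in record.objections


def purge_expired(store, now=None):
    """Delete every expired record from the dict `store`; return the removed keys."""
    now = time.time() if now is None else now
    expired = [key for key, rec in store.items() if now >= rec.expires_at]
    for key in expired:
        del store[key]
    return expired


store = {
    "user:1:email": PersonalRecord("user:1", "a@example.com",
                                   purposes={"billing", "marketing"},
                                   objections={"marketing"},
                                   expires_at=100.0),
    "user:2:email": PersonalRecord("user:2", "b@example.com",
                                   purposes={"billing"},
                                   expires_at=50.0),
}

billing_ok = may_process(store["user:1:email"], "billing", now=10.0)
marketing_ok = may_process(store["user:1:email"], "marketing", now=10.0)
removed = purge_expired(store, now=60.0)
```

In this sketch the objection overrides the earlier consent, so the marketing check fails while the billing check succeeds, and the purge at time 60 removes only the record whose expiry (50) has passed. A real system would also have to propagate such deletions to replicas and backups, which this in-memory example does not attempt.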
No.    | GDPR article                        | Key requirement                                        | Storage feature
5.1    | Purpose limitation                  | Data must be collected and used for specific purposes  | Metadata indexing
5.1    | Storage limitation                  | Data should not be stored beyond its purpose           | Timely deletion
5.2    | Accountability                      | Controller must be able to demonstrate compliance      | All
13     | Conditions for data collection      | Get user’s consent on how their data would be managed  | All
15     | Right of access by users            | Provide users a timely access to all their data        | Metadata indexing
17     | Right to be forgotten               | Find and delete groups of data                         | Timely deletion
20     | Right to data portability           | Transfer data to other controllers upon request        | Metadata indexing
21     | Right to object                     | Data should not be used for any objected reasons       | Metadata indexing
25     | Protection by design and by default | Safeguard and restrict access to data                  | Access control, Encryption
30     | Records of processing activity      | Store audit logs of all operations                     | Monitoring
32     | Security of data                    | Implement appropriate data security measures           | Access control, Encryption
33, 34 | Notify data breaches                | Share insights and audit trails from concerned systems | Monitoring
46     | Transfers subject to safeguards     | Control where the data resides                         | Manage data location

Table 1: Key GDPR articles that significantly impact the design, interfacing, or performance of storage systems. The table maps the requirements of these articles into storage system features.

Finally, while Article 30: Records of processing activities requires the controller to maintain logs of all activities concerning personal data, Article 33: Notification of personal data breach mandates them to notify the authorities and users within 72 hours of any personal data breach. In conjunction with the Accountability clause of Article 5, which puts the onus of proving compliance on the controller, these articles impose stringent requirements on storage systems: to monitor and maintain detailed logs of all control-path and data-path interactions. For instance, every read operation now has to be followed by a (logging-)write operation. Table 1 summarizes these articles and translates their key requirements into specific storage features.

3 Designing for Compliance

Based on our analysis of GDPR, we identify six key features that a storage system must support to be GDPR-compliant. Then, we characterize how systems show variance in their support for these features.

3.1 Features of GDPR-Compliant Storage

Timely Deletion. Under GDPR, no personal data can be retained for an indefinite period of time. Therefore, the storage system should support mechanisms to associate time-to-live (TTL) counters with personal data, and then automatically erase the data from all internal subsystems in a timely manner. GDPR allows TTL to be either a static time or a policy criterion that can be objectively evaluated.

Monitoring and Logging. In order to demonstrate compliance, the storage system needs an audit trail of both its internal actions and external interactions. Thus, in a strict sense, all operations, whether in the data path (say, read or write) or control path (say, changes to metadata or access control), need to be logged.

Indexing via Metadata. Storage systems should have interfaces to allow quick and efficient access to groups of data. Examples include accessing all personal data that could be processed under a specific purpose, or exporting all data belonging to a user. Additionally, the system should have the ability to quickly retrieve and delete large amounts of data that match a criterion.

Access Control. As GDPR aims to limit access to personal data to only permitted entities, for established purposes, and for a predefined duration of time, the storage system must support fine-grained and dynamic access control.

Encryption. GDPR mandates that personal data be encrypted both at rest and in transit. While pseudonymization may help reduce the scope and size of data needing encryption, encryption is still required and likely results in degradation of storage system performance.

Managing Data Location. Finally, GDPR restricts the geographical locations where personal data may be stored. This implies that storage systems should provide an ability to find and control the physical location of data at all times.

3.2 Degree of Compliance

Though GDPR is clear in its high-level goals, it is intentionally vague in its technical specifications. For example, GDPR mandates that no personal data can be stored indefinitely and that it must be deleted after its expiry time. However, it does not specify how soon after its expiry the data should be erased: seconds, hours, or even days? GDPR is silent on this, only mentioning that the data should be deleted without undue delay. What this means for system designers is that GDPR compliance need not be a fixed target, but rather a spectrum. We capture this variance along two dimensions: response time and capability.
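The "Indexing via Metadata" feature from §3.1 can be illustrated with a minimal sketch. The Python example below is a hypothetical in-memory store, not the interface of Redis or of the authors' system: the `IndexedStore` class and its method names are assumptions made for illustration. It keeps two secondary indices, one by owner and one by purpose, so that group operations such as exporting or erasing all data of a user do not need to scan every key.

```python
from collections import defaultdict


class IndexedStore:
    """Key-value store with secondary indices on owner and purpose (hypothetical)."""

    def __init__(self):
        self._data = {}                      # key -> (owner, purposes, value)
        self._by_owner = defaultdict(set)    # owner -> set of keys
        self._by_purpose = defaultdict(set)  # purpose -> set of keys

    def put(self, key, value, owner, purposes):
        self.delete(key)  # drop stale index entries if the key already exists
        self._data[key] = (owner, frozenset(purposes), value)
        self._by_owner[owner].add(key)
        for purpose in purposes:
            self._by_purpose[purpose].add(key)

    def delete(self, key):
        entry = self._data.pop(key, None)
        if entry is None:
            return False
        owner, purposes, _ = entry
        self._by_owner[owner].discard(key)
        for purpose in purposes:
            self._by_purpose[purpose].discard(key)
        return True

    def keys_for_purpose(self, purpose):
        """All keys that may be processed under the given purpose."""
        return sorted(self._by_purpose.get(purpose, ()))

    def export_user(self, owner):
        """All values belonging to one user (data portability)."""
        return {k: self._data[k][2] for k in sorted(self._by_owner.get(owner, ()))}

    def erase_user(self, owner):
        """Delete every key of one user (right to be forgotten); return the count."""
        keys = list(self._by_owner.get(owner, ()))
        for key in keys:
            self.delete(key)
        return len(keys)


db = IndexedStore()
db.put("k1", "alice@example.com", owner="alice", purposes={"billing"})
db.put("k2", "Alice", owner="alice", purposes={"billing", "marketing"})
db.put("k3", "bob@example.com", owner="bob", purposes={"marketing"})

marketing_keys = db.keys_for_purpose("marketing")
alice_export = db.export_user("alice")
erased = db.erase_user("alice")
marketing_after = db.keys_for_purpose("marketing")
```

The design choice here is to pay a small cost on every write (updating two indices) so that the group lookups cost time proportional to the size of the group rather than the size of the store.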

Real-time vs. Eventual Compliance. Real-time compliance is when a system completes the GDPR task (e.g., deleting expired data or responding to user queries) synchronously in real time. Otherwise, we categorize it as eventually compliant. Given the steep penalties (up to 4% of global revenue or €20M, whichever is higher) for violating compliance, companies would do well to be at the strict end of the spectrum. However, as we demonstrate in §4, achieving real-time compliance results in significantly high overhead unless the challenges outlined in §5.1 are solved. This problem is further exacerbated for organizations that operate at scale. For example, Google cloud platform informs [2] its users that it could take up to 6 months for deleted data to be completely removed from all its internal systems.

Full vs. Partial Compliance. Distinct from the response time, systems exhibit varying levels of feature granularities and capabilities. Such discrepancies arise because many GDPR requirements sit at odds with the design principles and performance guarantees of certain systems. For example, file systems do not implement indexing into files as a core operation since that feature is commonly supported via application software like grep. Similarly, many relational databases only partially and indirectly support TTL, as that operation could be realized using user-defined triggers, albeit inefficiently. Thus, we define full compliance to be natively supporting all the GDPR features, and partial compliance as enabling feature support in conjunction with external infrastructure or policy components.

We use the term strict compliance to reflect that a system has achieved both full and real-time compliance.

4 GDPR-Compliant Redis

Redis [3] is a prominent example of key-value stores, a class of storage where unstructured data (i.e., value) is stored in an associative array and indexed by unique keys. Our choice of Redis as the reference system is motivated by two reasons: (i) it is a modern storage system with active open-source development, and (ii) key-value stores, in general, are not only widely deployed in Internet-scale systems [10, 16, 22] but are also an active area of research [6, 7, 9, 12, 15, 17, 23].

From amongst the features outlined in §3.1, Redis fully supports monitoring, metadata indexing, and managing data locations; partially supports timely deletion; and offers no native support for access control and encryption. Below, we discuss our changes (some involving implementation, others simply concerning policy and configurations) towards making Redis GDPR-compliant. This effort resulted in ∼120 lines of code and configuration changes within Redis.

Then, we evaluate the performance impact of our modifications to Redis (v4.0.11) using the Yahoo Cloud Serving Benchmark (YCSB) [8]. We configure YCSB workloads to use 2M operations, and run them on a Dell Precision Tower 7810 with a quad-core Intel Xeon 2.8GHz processor, 16 GB RAM, and a 1.2TB Intel 750 SSD.

[Figure 1 plot: bar chart of throughput (op/sec, axis from 0 to 25000) for the YCSB workloads Load-A, A, B, C, D, Load-E, E, and F, comparing three configurations: Unmodified, AOF w/ sync, and LUKS + TLS.]

Figure 1: Performance overhead of GDPR-compliant Redis. YCSB benchmarking shows that monitoring and encryption will each reduce Redis’ throughput to ∼30% of the original.

4.1 Monitoring and Logging

Redis offers several mechanisms to generate complete audit logs: a debugging command called MONITOR, configuring the server with the slowlog option, and piggybacking on the append-only file (AOF). Our microbenchmarking revealed that since Redis performs its journaling via AOF anyway, the first two options result in more overhead than AOF. Also, MONITOR streams the logs over a network, thus requiring additional encryption. So, we selected the AOF approach. However, AOF records only those operations that modify the dataset. Thus, we had to update the AOF code to include all of Redis’ interactions. Our benchmarking shows that when we set AOF to fsync every operation to the disk synchronously, Redis’ throughput drops to ∼5% of its original. But as Figure 1 shows, when we relaxed the fsync frequency to once every second, the performance improved by 6×, i.e., throughput dropped only to ∼30% of the original.

Key takeaway: Even fully supported features like logging can cause significant performance overheads. Interestingly, the overheads vary significantly based on how strictly the compliance is enforced.

4.2 Encryption

In lieu of natively extending Redis’ limited security model, we incorporate third-party modules for encryption. For data at rest, we use the Linux Unified Key Setup (LUKS) [1], and for data in transit, we set up transport layer security (TLS) using Stunnel [4]. Figure 1 shows that Redis performs at a third of its original throughput when encryption is enabled. We observed that most of the overhead was due to TLS: the TLS proxies in our setup reduced the average available network bandwidth from 44 Gbps to 4.9 Gbps, thereby affecting both latency and throughput of YCSB. While there are alternatives to the LUKS-TLS approach like key-level encryption, our investigation using the open-source Themis [5] cryptographic library showed similar performance overheads.

Key takeaway: Retrofitting new features, especially those that do not align with the core design philosophies, will result in excessive performance overheads.
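The logging trade-off measured in §4.1 (throughput at ∼5% of the original with a synchronous fsync per operation versus ∼30% with one fsync per second, a ratio of 30/5 = 6×) can be sketched as follows. This Python example is a simplified illustration and does not reproduce the authors' changes to the Redis AOF code: the `AuditLog` and `AuditedStore` classes and the `count_fsyncs` helper are hypothetical, and the policy names merely mirror Redis' `appendfsync always` and `appendfsync everysec` settings. A manually advanced clock is assumed so that the example is deterministic.

```python
import os
import tempfile
import time


class AuditLog:
    """Append-only audit log with a configurable fsync policy (hypothetical)."""

    def __init__(self, path, policy="always", interval=1.0, clock=time.monotonic):
        if policy not in ("always", "everysec"):
            raise ValueError("policy must be 'always' or 'everysec'")
        self._file = open(path, "a", encoding="utf-8")
        self._policy = policy
        self._interval = interval
        self._clock = clock
        self._last_sync = clock()
        self.fsync_calls = 0

    def record(self, op, key):
        self._file.write(f"{op} {key}\n")
        due = self._clock() - self._last_sync >= self._interval
        if self._policy == "always" or due:
            self._sync()

    def _sync(self):
        self._file.flush()
        os.fsync(self._file.fileno())  # force the log entry to stable storage
        self._last_sync = self._clock()
        self.fsync_calls += 1

    def close(self):
        self._sync()
        self._file.close()


class AuditedStore:
    """In-memory store in which every operation, including reads, is logged."""

    def __init__(self, log):
        self._data = {}
        self._log = log

    def set(self, key, value):
        self._data[key] = value
        self._log.record("SET", key)

    def get(self, key):
        value = self._data.get(key)
        self._log.record("GET", key)  # a read is now followed by a log write
        return value


def count_fsyncs(policy):
    """Run 8 operations spaced 250 ms apart (simulated) and count fsync calls."""
    now = [0.0]
    with tempfile.TemporaryDirectory() as tmp:
        log = AuditLog(os.path.join(tmp, "audit.log"), policy, clock=lambda: now[0])
        store = AuditedStore(log)
        for i in range(8):
            now[0] += 0.25
            if i % 2 == 0:
                store.set(f"k{i}", i)
            else:
                store.get(f"k{i - 1}")
        calls = log.fsync_calls
        log.close()
    return calls


always_calls = count_fsyncs("always")
everysec_calls = count_fsyncs("everysec")
```

Under the `always` policy every one of the eight operations pays for an fsync, whereas under `everysec` only two do, at the cost of possibly losing up to one second of audit entries on a crash. This is the real-time versus eventual compliance choice in miniature.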

4.3 Timely Deletion

While GDPR does not mandate a timeline for erasing personal data after a request has been issued, it does specify that such data be removed from everywhere without undue delay. Redis offers three groups of primitives to erase data: (i) DEL & UNLINK to remove one or more specified keys immediately, (ii) EXPIRE & EXPIREAT to delete a given key after a specified timeout period, and (iii) FLUSHDB & FLUSHALL to delete all the keys present in a given database or in all existing databases, respectively. The current mechanisms and policies of Redis present two hindrances.

The first issue concerns the lag between the time of request and the time of actual removal. While most of the above commands erase the data proactively, taking time proportional to the size of the data being removed, the EXPIRE* commands take a passive approach. The only way to guarantee the removal of an expired key is for a client to proactively access it. In the absence of this, Redis runs a lazy probabilistic algorithm: once every 100ms, it samples 20 random keys from the set of keys with the expire flag set; if any of these twenty have expired, they are actively deleted; if fewer than 5 keys were deleted, it waits till the next iteration, else it repeats the loop immediately. Thus, as the percentage of keys with an associated expiry increases, the probability of their timely deletion decreases.

To quantify this delay in erasure, we populate Redis with keys, all of which have an associated expiry time. The time-to-live values are set up such that 20% of the keys will expire in the short term (5 minutes) and 80% in the long term (5 days). Figure 2 then shows the time Redis took to completely erase the short-term keys once 5 minutes had elapsed. As expected, the time to erasure increases with the database size. For example, when there are 128k keys, cleanup of the expired keys (∼25k of them) took nearly 3 hours. To support stricter compliance, we modify Redis to iterate through the entire list of keys with an associated EXPIRE. Then, we re-run the same experiment to verify that all the expired keys are erased within sub-second latency for sizes of up to 1 million keys.

[Figure 2 plot: bar chart of time to erase (sec) against total keys in data store, for Redis (w/ fast active expiry). Bar labels: 1k: 41; 2k: 94; 4k: 256; 8k: 511; 16k: 1090; 32k: 2228; 64k: 4830; 128k: 10728.]

Figure 2: The graph shows the delay in erasing the expired keys (20% of total keys in each case) beyond their TTL. In contrast, our GDPR-compliant Redis erases all the expired keys within sub-second latency.

The second concern relates to the persistence of deleted data in subsystems beyond the main storage engine. For example, in Redis’ AOF persistence model, any deleted data persists in the AOF until its compaction, either via a policy-triggered or a user-induced BGREWRITEAOF operation. Though Redis prevents any legitimate access to data that is already deleted, its decision to let such data persist in various subsystems, purely for performance reasons, is antithetical to the GDPR goals, besides exposing itself to side-channel attacks. A naive approach to guaranteeing immediate removal of deleted personal data is to trigger AOF compaction every time a key gets deleted. However, since GDPR only mandates a reasonable time for cleanup, it may be prudent to configure a periodic (say, hourly) AOF compaction, which in turn would guarantee that no deleted key persists beyond an hour boundary.

Key takeaway: Even when the system supports a GDPR feature, system designers should carefully analyze its internal data structures, algorithms, and configuration parameters to gauge the degree of compliance.

5 Concluding Remarks

We analyze the impact of GDPR on storage systems. We find that achieving strict compliance efficiently is hard; a naive attempt at strict compliance results in significant slowdown. We modify Redis to be GDPR-compliant and measure the performance overhead of each modification. Below, we identify three key research challenges that must be addressed to achieve strict GDPR compliance efficiently.

5.1 Research Challenges

Efficient Logging. For strict compliance, every storage operation, including reads, must be synchronously written to persistent storage; persisting to solid state drives or hard drives results in significant performance degradation. New non-volatile memory technologies, such as Intel 3D XPoint, can help reduce such overheads. Efficient auditing may also be achieved through the use of eidetic systems. For example, Arnold [11] is able to remember past state with only 8% overhead; adapting Arnold for GDPR remains a challenge.

Efficient Deletion. With all personal data possessing an expiry timestamp, we need data structures to efficiently find and delete (possibly large amounts of) data in a timely manner. Like timeseries databases, data can be indexed by their
expiration time, then grouped and sorted by that index to speed up this process. However, GDPR is vague in its interpretation of deletions: it neither advocates a specific timeline for completing the deletions nor mandates any specific techniques. Thus, it remains to be seen if efforts like Google cloud’s guarantee [2] to not retain customer data after 180 days of delete requests will be considered compliant behavior.

Efficient Metadata Indexing. Several articles of GDPR require efficient access to groups of data based on certain attributes. Examples include accessing all the keys that allow processing for a particular purpose while ignoring those that object to that purpose, or collating all the files of a particular user to be ported to a new controller. While traditional databases natively offer this ability via secondary indices, not all storage systems have efficient or configurable support for this capability.

5.2 Limitations and Importance

Given its preliminary nature, our work has several limitations. First, we investigate one particular storage system, Redis, using one benchmark suite, YCSB. Expanding the scope to a broader range of storage systems like relational databases and file systems would increase the confidence in our findings. Next, it is likely that the performance of our GDPR-compliant Redis could be further improved with a deeper knowledge of Redis internals. Finally, while we focus exclusively on storage systems, researchers have shown [20] how GDPR compliance requires organization-wide changes to the systems that process personal data.

With the growing relevance of privacy regulations around the world, we expect this paper to trigger interesting conversations. This is one of the first efforts to systematically analyze the impact of GDPR on storage systems. We would be keen to engage the storage community in identifying and addressing the research challenges in this space.

References

[1] Cryptsetup and LUKS - open-source disk encryption. https://gitlab.com/cryptsetup/cryptsetup, Accessed May 2019.

[2] Data Deletion on Google Cloud Platform. https://cloud.google.com/security/deletion/, Accessed May 2019.

[3] Redis Data Store. https://redis.io, Accessed May 2019.

[4] Stunnel. https://www.stunnel.org, Accessed May 2019.

[5] Themis. https://github.com/cossacklabs/themis, Accessed May 2019.

[6] Anirudh Badam, KyoungSoo Park, Vivek Pai, and Larry Peterson. HashCache: Cache Storage for the Next Billion. In USENIX NSDI, 2009.

[7] Oana Balmau, Diego Didona, Rachid Guerraoui, Willy Zwaenepoel, Huapeng Yuan, Aashray Arora, Karan Gupta, and Pavan Konka. TRIAD: Creating synergies between memory, disk and log in log structured key-value stores. In USENIX ATC, 2017.

[8] Brian Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking cloud serving systems with YCSB. In ACM SoCC, 2010.

[9] Biplob Debnath, Sudipta Sengupta, and Jin Li. SkimpyStash: RAM space skimpy key-value store on flash-based storage. In ACM SIGMOD, 2011.

[10] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s Highly Available Key-Value Store. In USENIX OSDI, 2007.

[11] David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M Chen. Eidetic Systems. In USENIX OSDI, 2014.

[12] Robert Escriva, Bernard Wong, and Emin Gün Sirer. HyperDex: A distributed, searchable key-value store. In ACM SIGCOMM, 2012.

[13] Amy Ann Forni and Rob van der Meulen. Organizations are unprepared for the 2018 European Data Protection Regulation. In Gartner, May 2017.

[14] Todd Haselton. Credit reporting firm equifax says data breach could potentially affect 143 million US consumers. In CNBC, Sep 7 2017.

[15] Hyeontaek Lim, Bin Fan, David Andersen, and Michael Kaminsky. SILT: A memory-efficient, high-performance key-value store. In ACM SOSP, 2011.

[16] Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, and Venkateshwaran Venkataramani. Scaling Memcache at Facebook. In USENIX NSDI, 2013.

[17] Pandian Raju, Rohan Kadekodi, Vijay Chidambaram, and Ittai Abraham. PebblesDB: Building Key-Value Stores using Fragmented Log-Structured Merge Trees. In ACM SOSP, 2017.

[18] General Data Protection Regulation. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Official Journal of the European Union, 59(1-88), 2016.
[19] Victor Reklaitis. How the number of data breaches is soaring. In MarketWatch, May 25 2018.

[20] Supreeth Shastri, Melissa Wasserman, and Vijay Chidambaram. The Seven Sins of Personal-Data Processing Systems under GDPR. In USENIX HotCloud, 2019.

[21] Olivia Solon. Facebook says Cambridge Analytica may have gained 37M more users’ data. In The Guardian, Apr 4 2018.

[22] Roshan Sumbaly, Jay Kreps, Lei Gao, Alex Feinberg, Chinmay Soman, and Sam Shah. Serving Large-scale Batch Computed Data with Project Voldemort. In USENIX FAST, 2012.

[23] Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. LSM-trie: an LSM-tree-based Ultra-Large Key-Value Store for Small Data. In USENIX ATC, 2015.
