0% found this document useful (0 votes)

16 views64 pages

2022 Thesis

Uploaded by

Erum Ashraf

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

16 views64 pages

2022 Thesis

Uploaded by

Erum Ashraf

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 64

DEPLOYMENT OF A LAB ENVIRONMENT TO IDENTIFY AND

PROTECT SENSITIVE DATA IN THE CLOUD

A Degree Thesis
Submitted to the Faculty of the
Escola Tècnica d'Enginyeria de Telecomunicació de
Barcelona
Universitat Politècnica de Catalunya
by
Blanca Ruiz Díaz

In partial fulfilment
of the requirements for the degree in
TELECOMMUNICATIONS TECHNOLOGIES AND SERVICES
ENGINEERING

Advisor: Israel Martín Escalona

Barcelona, June 2022

Abstract
This thesis is focused on the study of Cloud Data Loss Prevention, a cloud native DLP
tool offered by Google Cloud Platform to help in data security. The aim of the project
is to assess its behavior under different scenarios configured in a lab environment,
where I can study how the service inspects the information in search of sensitive data.

The capabilities of the tool and its key features must be analyzed to know its scope,
allowing me to design use cases to exemplify them and propose improvements to the
tool.

The behavior in the first use case with documents was positive, although the generated
custom infotype is not capable of locating all the information because it has a generic
pattern. In the second use case, the errors in the job inspection for images was
significant, reflecting the lack of maturity of the tool for these files. The new de-
identification method to pixelate the information works well, but it will be necessary
to improve it in a future work.

1
Resum
Aquesta tesi està enfocada en l’estudi de Cloud Data Loss Prevention, una eina DLP
nativa de cloud que ofereix Google Cloud Platform per ajudar en la seguretat de les
dades. L’objectiu del projecte és avaluar-ne el comportament en diferents escenaris
configurats en un entorn de laboratori, on puc estudiar com el servei inspecciona la
informació per localitzar dades de caràcter sensible.

Cal analitzar les capacitats de l’eina i les seves característiques principals per conèixer
l’abast, de manera que em permeti disenyar casos d’ús per exemplificar-les i proposar
millores en el servei.

El comportament del primer cas d’ús amb documents ha sigut positiu, malgrat que el
tipus d’informació personalitzat que s’ha generat no és capaç de localitzar tota la
informació al estar definit com a patró genèric. En el segon cas d’ús, els errors en la
inspecció per a imatges han sigut significatius, reflexant una falta de maduresa per part
de l’eina amb aquest tipus d’arxius. El nou mètode de desidentificació per pixelar la
informació funciona bé, tot i que caldrà millorar-lo en un treball futur.

2
Resumen
Esta tesis se centra en el estudio de Cloud Data Loss Prevention, una herramienta DLP
nativa en la nube que ofrece Google Cloud Platform para ayudar en la seguridad de los
datos. El objetivo del proyecto es evaluar su comportamiento en diferentes escenarios
configurados en un entorno de laboratorio, donde puedo estudiar la manera en la que
el servicio inspecciona la información para localizar datos sensibles.

Se deben analizar las capacidades de la herramienta y sus principales características

para conocer su alcance, permitiéndome diseñar casos de uso y ejemplificarlos,
además de proponer mejoras en la herramienta.

El comportamiento del primer caso de uso con documentos fue positivo, aunque el
infotipo personalizado generado no es capaz de localizar toda la información al tener
un patrón genérico. En el segundo caso de uso, los errores en la inspección para
imágenes fueron significativos, lo que reflejaría la falta de madurez de la herramienta
para este tipo de archivos. El nuevo método de desidentificación para pixelar la
información funciona bien, pero será necesario mejorarlo en un trabajo futuro.

3
Acknowledgements
During these months of research and project development, I have had the opportunity
to work with people who have helped me to understand the architecture of the cloud
and its security, because I started this thesis without any knowledge about cloud.
Thanks to the support of my advisor, José Luis, and my coworker Álvaro, I have been
able to carry out this project. I have learnt a lot from them during this time.

Also, I want to thank the support that my friends and my family have given me. They
have suffered and enjoyed at the same time all this journey with me. I dedicate this
thesis to all of them.

Finally, I want to thank to my project supervisor, Israel Martín, for all the
recommendations and the supervision given to successfully deliver this document, as
well as his advice during the thesis. Thank you so much.

4
Revision history and approval record

Revision Date Purpose

0 18/05/2022 Document creation

1 10/06/2022 Document modification

2 17/06/2022 Document modification

3 20/06/2022 Document revision

DOCUMENT DISTRIBUTION LIST

Name E-mail

Blanca Ruiz Díaz blanca.ruiz@estudiantat.upc.edu

Israel Martín Escalona israel.martin@upc.edu

Written by: Reviewed and approved by:

Date 18/05/2022 Date 21/06/2022

Name Blanca Ruiz Díaz Name Israel Martín Escalona

Position Project Author Position Project Supervisor

5
Table of contents

Abstract ........................................................................................................................................................ 1
Resum ........................................................................................................................................................... 2
Resumen ...................................................................................................................................................... 3
Acknowledgements ................................................................................................................................. 4
Revision history and approval record ............................................................................................. 5
Table of contents ...................................................................................................................................... 6
List of Figures ............................................................................................................................................ 8
List of Tables: ............................................................................................................................................. 9
1. Introduction .................................................................................................................................... 10
1.1. Project requirements and specifications ..................................................................... 11
1.2. Work Plan and milestones ................................................................................................ 12
1.3. Gantt Diagram………………………………….…………………………………………………….14
2. Cloud Computing……………………………………………………………………………………………15
2.1 Introduction to Cloud Computing………………………………………………………………15
2.2. Main service models in the cloud……………….………………….…………………………16
2.3. Main deployment models in the cloud………………………………………………………17
2.4. Data Security……………………………………………………………………………………………19
3. State of the art of the technology used or applied in this thesis: ................................ 20
3.1. Introduction to Data Loss Prevention .......................................................................... 20
3.2. DLP products .......................................................................................................................... 22
3.3. Approach to Cloud Native DLP tools……….…………………………………………………24
3.4. Capabilities that cloud native DLP should/could provide…………..………………25
3.5. Main Cloud Native DLP tools……………………………………………………………………..31
3.6. Comparison between the main Cloud Native DLP tools……………...……………….32
4. Methodology / project development:.................................................................................... 34
4.1. Key features of Cloud DLP…………………………………………….…………………….…..34
4.2. Design of use cases……………………..………………………………………………………...…39
4.3. Use case 1: Identification of sensitive data in documents…………………….……41
4.4. Use case 2: Detection and de-identification of sensitive data in images.….…44

6
5. Results ............................................................................................................................................. ..46
5.1. Results of use case 1………………………………………………………………….……………..46
5.2. Results of use case 2………………………………………………………………………………...51
6. Budget ................................................................................................................................................ 54
7. Conclusions and future development: .................................................................................. 56
Bibliography: ........................................................................................................................................... 57
Appendices ............................................................................................................................................... 59
Glossary ..................................................................................................................................................... 62

7
List of Figures

Figure 1: Work Breakdown Structure……………………………………………………………………12

Figure 2: Gantt Diagram……………………………………………………………………………………….14
Figure 3: Main service models in the cloud……………………………………………………………17
Figure 4: Cloud computing scheme………………………………………………………………….…….18
Figure 5: Cloud Data Loss Prevention elements……………………………………………………..20
Figure 6: Mapping DLP controls……………………………………………………………………………23
Figure 7: Cloud-native DLP…………………………………………………………………………………...24
Figure 8: Regular expression pattern………...…………………………………………………………..26
Figure 9: Cloud DLP demo………………………………………………………………………………….....36
Figure 10: Template configuration………………………………………………………………………..37
Figure 11: Job inspection configuration…………………………………………………………………38
Figure 12: Methodology for use cases....………………………………………………………………...39
Figure 13: Use case 1 scheme………………………………………………………………………………..41
Figure 14: testing_bucket_tfg bucket..……………………………………………………………………42
Figure 15: Custom infotype configuration…..…………………………………………………………42
Figure 16: Masking pipeline………………………………………………………………………………….43
Figure 17: Bounding box code………………………………………………………………………………45
Figure 18: Pixelate function………………………………………………………………………………….45
Figure 19: Findings use case 1………………………………………………………………………………46
Figure 20: Percentage of findings per infotype……………………………………………………….46
Figure 21: Person name false positives………………………………………………………………….47
Figure 22: Street address false positive…………………………………………………………………48
Figure 23: Distribution of false positives.....……………………………………………………………48
Figure 24: Analyzing a bill…………………………………………………………………………………….50
Figure 25: First name false positive……………………………………………………………………….52
Figure 26: Generic ID false positive……………………………………………………………………….52
Figure 27: result2.jpg pixelated…………………………………………………………………………….53

8
List of Tables:

Table 1: Work Package 1………………………………………………………………………………………12

Table 2: Work Package 2………………………………………………………………………………………13

Table 3: Work Package 3………………………………………………………………………………………13

Table 4: Milestones………………………………………………………………………………………………14

Table 5: Comparison of main cloud native DLP tools………………………………………………33

Table 6: False positives use case 1.....……………………………………………………………………..47

Table 7: Unidentified data use case 1…………………………………………………………………….49

Table 8: False positives use case 2…………………………………………………………………………51

Table 9: Unidentified data use case 2…………………………………………………………………….52

Table 10: General workers salary………………………………………………………………………….54

Table 11: Equipment……………………………………………………………………….…………………...54

Table 12: Amortization………………………………………………………………………………………...55

Table 13: Software resources………………………………………………………………………………..55

Table 14: Total cost of the project………………………………………………………………………….55

9
1. Introduction
This thesis was carried out in the Cloud Security Department at Accenture S.L, which
is a global consulting and professional services firm offering advisory, strategy
consulting, technology, and operation services.

This is a project aimed out to understand how information is protected in the cloud
and how the Data Loss Prevention (DLP) service works to prevent sensitive data
leakage from organizations. It will focus on studying Google Cloud Data Loss
Prevention, a cloud native DLP service of Google Cloud Platform (GCP), as well as the
capabilities it supports. The project will exemplify these capabilities by deploying a lab
environment with different use cases designed in order to observe the behavior of the
tool and its scope.

The project main goals are:

1. Provide a comparison report on the main cloud native tools that address the
identification of sensitive data in the cloud.

2. List the capabilities of Google Cloud Data Loss Prevention to know its scope.

3. Design use cases to exemplify the capabilities of the tool and propose new
improvements.

4. Deploy a lab environment to conduct some use cases to assess the performance
of the service.

10
1.1. Project requirements and specifications

Project requirements:

• Use the Google Cloud DLP tool in the project.

• Create 2-3 use cases that are suitable to study the capacity of the service and
its weak points.

• Provide new types of sensitive data that are not covered by the service.

• Implement possible masking solutions once the sensitive data is identified.

Project specifications:

• The proposed use cases should be clear and reflect possible scenarios where
the company’s sensitive information must be protected.

• Make a demo with a logic data set to represent the function of the service, as
well as clearly exemplify its capabilities.

• Propose new implementations in each use case to be useful in a real scenario.

11
1.2. Work Plan and Milestones
This section includes the Work Breakdown Structure of the project and the updated
Work Packages, tasks, and milestones. I have needed to modify the dedication weeks
in the work packages because the research part has taken me more time than I initially
expected. During the thesis everything has progressed correctly, with no incidents.

Figure 1: Work Breakdown Structure

Project: Research WP ref: (WP1)

Major constituent: Research Sheet 1 of 3

Short description: Research information about cloud and its Planned start date:01/03/2022
function, how to protect the data in it and native tools.
Planned end date: 24/03/2022

Start event:01/03/2022
End event: 29/04/22

Internal task T1: General information of the cloud Deliverables: - Dates: -

Internal task T2: Specific information of the data protection

Internal task T3: Comparison of the main native tools with DLP
Table 1: Work Package 1

12
Project: Designing use cases WP ref: (WP2)

Major constituent: Designing use cases Sheet 2 of 3

Short description: Design 2-3 use cases in order to exemplify the Planned start date: 27/03/2022
capabilities of the Cloud DLP service.
Planned end date: 29/04/2022

Start event: 25/04/22

End event: 18/05/22

Internal task T1: Analyzing the weak points of Cloud DLP Deliverables: - Dates: -

Internal task T2: Doing the draft of the possible use cases and
masking solutions

Table 2: Work Package 2

Project: Deployment of the lab environment WP ref: (WP3)

Major constituent: Deploying the lab environment Sheet 3 of 3

Short description: Create the demo with the new improvements Planned start date: 02/05/2022
Planned end date: 20/06/2022

Start event: 18/05/22

End event: 20/06/2022

Internal task T1: Creating the lab environment and configuring Deliverables: - Dates: -
the use cases

Internal task T2: Introducing the new improvements and testing

Internal task T3: Implementing masking solutions

Internal task T4: Performance assessment

Table 3: Work Package 3

13
Milestones

WP# Task# Short title Milestone / deliverable Date (week)

WP1 Task1 General information Research Week 1
WP1 Task2 Specific information Research Week 3
WP1 Task3 Comparison of DLP tools Research Week 8
WP2 Task1 Analyze weak points DLP Design use cases Week 9
WP2 Task2 Draft Design use cases Week 10
WP3 Task1 Configure lab environment Deploying the demo Week 13

WP3 Task2 Introducing improvements Deploying the demo Week 13

WP3 Task3 Masking solutions Deploying the demo Week 14
WP3 Task4 Performance assessment Deploying the demo Week 15

Table 4: Milestones

1.3. Gantt Diagram

I have included the Gantt Diagram followed in the project development, with all the
work packages, tasks, and the duration times.

Figure 2: Gantt Diagram

14
2. Cloud Computing
2.1. Introduction to Cloud Computing
The term Cloud is known as a space where we can store resources and consume
services without the need to keep them on our electronic devices. This allows us to
have our information in the cloud without consuming local resources, freeing up
storage space and improving the performance of our machines.
The cloud is made up by a global network of remote servers located in linked data
centers to process, share or store data and consume them quickly anywhere from any
device with Internet access. This technology is known as Cloud Computing. [1]
Cloud Computing is responsible for providing computing resources on request
through the Internet. This concept has allowed the creation of new service models,
where users and companies can access applications or repositories, as well as build
their own infrastructures without the need to have them in a specific place. [2]

This technology has given a lot of benefits in the Information and Communications
Technology (ICT) environment: [3]

▪ Cost reduction: The development of the Internet and the new communication
technologies have allowed companies to avoid physical Infrastructure
Technology (IT) infrastructures, as well as the need to acquire software
licenses, turning them into subscriptions with cloud computing service
providers such as Amazon Web Services (AWS), Microsoft Azure or GCP.

▪ Data control and security: Cloud will offer many advanced security features
that guarantee the data security stored, such as access management with roles
or monitoring the user activity.

▪ Scalability and flexibility: This feature will allow to scale up or down their IT
departments efficiently according to demand, without having to invest in
powerful infrastructures.

▪ Unlimited capacity: Thanks to the quick scalability, the cloud has unlimited
storage capacity to store any type of data in different storages, avoiding the
need to use local resources to host our information.

▪ Pay for what you use: This is a payment model in cloud where users only pay
for what is consumed. This allows to avoid unnecessary payments for wasted
resources, which represent significant losses for the companies.

15
2.2. Main service models in the cloud
Cloud Computing is mainly offered in the following three service models:

▪ Software as a Service (SaaS)

This service allows users to use applications that are hosted in the cloud
through the Internet and consume their services. The consumer will not have
to manage or maintain the software, as it will be done by an external provider,
which usually applies a subscription model: the user must register in the
application to use it. In other cases, it can be configured for free or under
payment model, depending on the services that you want to consume.

There are several types of SaaS, such as email systems like Gmail, team
collaboration such as Microsoft Teams or social networks like Twitter, as well
as streaming platforms such as Twitch or Netflix. [4]

▪ Platform as a Service (PaaS)

This service is made up of a set of tools that allow developers to create SaaS
applications and manage data, as well as use different design services or APIs
to accelerate the creation process and make them more complete.

They can be hosted and tested in a controlled environment, without the need
to manage infrastructure maintenance, facilitating the development of the
application and reducing costs. Some examples of PaaS are Google App Engine
and Heroku. [5]

▪ Infrastructure as a Service (IaaS)

This service provides an environment with computing, network, and storage
resources on demand. IaaS enables users to scale and reduce services when it
is necessary. The cloud provider manages the infrastructure, so users can focus
only on installing, managing, and configuring the applications, as well as save
their data in storages.

Amazon Web Services, Microsoft Azure and Google Cloud Platform are
examples of Infrastructure as a Service. All of them allow users to work with
virtual machines in the cloud and customize storage resources, CPU, and
operating systems. [6]

16
Figure 3: Main service models in the cloud

2.3. Main deployment models in the cloud

Cloud Computing is mainly categorized by the following deployment models:

▪ Shared cloud
It will offer resources from one provider over infrastructures shared among
multiple clients. Being a shared environment, it will be less expensive and will
facilitate scalability, although it could cause performance problems. The shared
cloud will be useful where there are peaks in demand because it can create
elastic platforms that will be modified depending on the needs of each moment,
paying only for the resources used. [7]

▪ Private cloud
This deployment model is focused on those organizations that want an exclusive
cloud environment, where the resources will not be shared. It will be made up
of its own infrastructure and machines under the requested demand. Although
its scalability and management are more limited, this model offers more security

17
and control than the shared cloud, avoiding performance issues from third
parties. Deploying a private cloud is usually quite expensive, so it is used for data
and applications where performance and security need to be ensured. [7]

▪ Hybrid cloud
It will combine private and public cloud resources. This model is useful for
those organizations that have their own infrastructure but want to take
advantage of the services provided by an external provider. Both clouds will
interact to provide agility by sharing workloads depending on the need and the
cost that it may assume. Hybrid cloud environments are effective not only in
improving computing and data security, but also in cost savings and reducing
dependencies with on-premises infrastructures. [7]

▪ Multi cloud
Also known as community cloud. It appears when an organization decides to
combine multiple clouds, either public or private. This strategy provides more
flexibility because it is possible to determine which cloud services are used,
reducing reliance on a single cloud provider. If it includes on-premises or
private cloud infrastructure, it is considered a hybrid multi cloud model. [8]

Figure 4: Cloud Computing scheme

18
2.4. Data Security
When the information is stored or shared in the cloud, the need to protect the data
against possible attacks or vulnerabilities that affect their privacy becomes evident.
Data is considered sensitive when contains personal identification information (PII),
which will be very protected by General Data Protection Regulation (GDPR). The
compliance of these protection laws imposed by this regulation will be mandatory if
the organizations do not want to receive economic sanctions, as well as important
consequences such as the loss of the reputation or bankruptcy of the company.
Protecting sensitive data in the cloud will require different approaches, having
different security needs depending on its state. Data that is attached to an email will
not be protected in the same way as if it is stored on a repository.
The three possible states will be the following: [11]

▪ Data at Rest: This state will be for the information that is not being used or
processed, it is simply stored in the cloud through devices and systems such as
repositories, databases, servers, or computers.

▪ Data in Motion: It is information that is travelling between an origin and a

destination. This state is founded when data is sent through an email and
instant messages, web or in a communication channel such as Teams, Slack, as
well as other collaboration tools.

▪ Data in Use: It is the data that is being consumed. Normally, is accessed by an

application where a user requests the data to view it or modify it, as well as
copy it in an external device.

Sensitive information will be protected from being stolen or leaked if the company
applies the following prevention actions:

▪ Good practices of employees to educate them, learning how to handle sensitive

information.

▪ Information encrypted when is stored at rest, and in motion when is through

unreliable networks.

▪ Implement measures and security policies to protect sensitive data.

▪ Implementation of Data Loss Prevention systems for data protection.

19
3. State of the art of the technology used in this thesis
3.1. Introduction to Data Loss Prevention
DLP is a solution focused on preventing data leaks within an organization, protecting
the information in its three possible states. It is defined as a set of tools which apply
content inspection techniques and contextual analysis to analyze user actions related
to data usage. This will prevent unauthorized information disclosure outside the
corporate network. [9]
DLP generally consists of three main elements: [10]

1. Identification: The organization’s information is examined to detect sensitive

data based on predefined security policies. This method performs the analysis
through content inspection and data classification to identify data that should
not leave the corporate network.

2. Detection: The data activity is tracked to assess if the actions performed by the
users is accepted or not by the organization and its security policies, protecting
it from anomalies.

3. Prevention: Actions are applied to the data according to the results obtained
in the inspection. DLP policies will determine the prevention method in each
case. These include blocking actions such as editing or sharing files, acting
against suspicious users and remove their access rights, or sending
notifications and alerts to users to warn them of the violations they were going
to commit. This ensures that unauthorized information is leaked or extracted
outside the organization.

Figure 5: Data Loss Prevention elements

20
DLP POLICIES
In order to avoid data loss in the above situations, the state of the data must be known,
as well as how to protect it. DLP techniques monitor data activity and evaluate if the
actions attempted in a particular scenario are accepted by predefined DLP policies
within a specific context.
These policies will be accompanied by a set of rules and conditions that will determine
if the expected data behavior is occurring. It also contains actions to be applied in case
of any anomaly, as well as notifications or alerts.
This prevention system makes it possible to obtain information for specific periods of
time to assess the impact. You can discover processes where policies are being violated,
review them, and locate the unprotected information within the organization to apply
actions in time. [12]

HOW DLP WORKS

DLP solutions will analyze information based on two main approaches: Context
analysis and content awareness. It is important to differentiate their approach to
understand how DLP works. In this way, the tool’s techniques can be listed more easily,
which will help in the project development. [13]

▪ Content awareness: This method involves the content inspection of sensitive

data. It will analyze the information to detect policy violations in the corporate
network.

▪ Context analysis: This method of analysis will focus on anything that does not
include the content of the information. It will examine the properties of the
metadata, such as its format, location, creation and modification date, or size.

21
3.2. DLP products
DLP products will be categorized in the market as dedicated or integrated:

▪ Dedicated DLP products (E-DLP): Also known as Enterprise DLP. These

are standalone products that are specifically designed to prevent the
leakage of sensitive information from data at rest and in transit. They will
have contextual and content analysis capabilities, as well as DLP security
policies. They will also generate reports used for auditing purposes. They
are more expensive as they contain more complex capabilities. These
products will be useful for enterprises that have an extensive network,
because the level of protection required is higher as they contain a large
amount of sensitive information. Examples include Forcepoint DLP, McAfee
DLP and Digital Guardian DLP, among others. [14]

▪ Integrated DLP products (I-DLP): Integrated products are defined as a

reduced version of dedicated DLPs as they have limited capabilities,
becoming a complement to security tools. They will focus on enforcing
compliance policies and will be less expensive than dedicated DLPs, since
their implementation is faster and easier as they do not require additional
installations. It is a product focused on small and medium enterprises. Some
examples of I-DLP solutions are cloud security, email encryption or device
control. [14]

Implementing a DLP architecture can be complex because different DLP vendors will
have different configurations and capabilities in their tools. Before deciding to have
multiple DLP providers in an organization’s environment, it is necessary to
understand how to manage the possible issues such as management overhead and lack
of consistency in the obtained results. This will affect its accuracy and its incident
response. This problem happens because there is no way to have a DLP integration
between them, they will work independently and must be configured separately as
they have different approaches and needs. These errors increasing the unnecessary
storage, poor quality in the analysis obtained, and consequences of making poor
decisions. To avoid this, a single E-DLP provider could be used, but if the organization
decides to use different I-DLP providers, it will have various levels of data risk at
different points in its security architecture.

22
Each DLP product may be focused on one or several types of DLP solutions. To know
which ones will be needed for data security, the scope of each DLP solution must be
known to understand the control they will provide and apply them depending on the
origin and destination of the data to protect. [14]

The Figure 7 shows the types of DLP control tools available:

Figure 6: Mapping DLP controls

Endpoint DLP is the most powerful option for any DLP architecture. It provides data
discovery and control for local storage, as well as control in removable media, data-
via-browser, and data-via-email. The control capabilities include blocking or masking,
and some vendors include classification. However, most endpoint DLP solutions come
from E-DLP vendors and are fully integrated.
Cloud Access Security Brokers (CASB) is a multifunctional tool which provides
security around cloud environments. It is an API-based solution which will be able to
discover and manage data security within the cloud environment as a data at rest
approach.
Secure Web Gateway (SWG) will redirect browser traffic to an inspection system
before forwarding to the intended destination, and Secure Email Gateway (SEG) will
do the same but with email traffic. Firewall DLP can be useful to address data leakage
issues from unmanaged devices on the network, restricting data movement.

23
3.3. Approach to Cloud Native DLP solutions
The objective of this section will be to define Cloud-Native DLP and list the capabilities
that this type of solution should provide to control sensitive data in an organization’s
corporate network. Also, I will study the native DLP solutions offered by different
cloud providers, such as GCP, Microsoft Azure and AWS to make a comparison of the
capabilities that they support. [14]

CLOUD-NATIVE DLP
Cloud-native DLP is an environment directly used by users which is designed to
process unstructured data. This DLP control only protect the data where the provider
is offering its cloud space. Cloud providers are increasing the capabilities of Cloud-
Native DLP, expanding them to the endpoint. [14]

Figure 7: Cloud-native DLP

One of the disadvantages of cloud native DLP solutions are the lack of consistency.
Unlike CASB solutions, each cloud provider will offer different features in their
products. This will make difficult to define the capabilities that this type of solution
should have.[14]

24
3.4. Capabilities that Cloud Native DLP should/could provide
In this section, the capabilities that Cloud native DLP should provide are listed.

1. Data classification
The information in an organization’s network must be identified to categorize
it according to its level of confidentiality. With data classification, the DLP tool
can take the appropriate actions based on the DLP policies configured for a
specific category. It will allow, through contextual analysis and content
inspection, to evaluate the information and assign it a category to recognize the
value of the data. In this way, the organization will have control over the
information and will protect it from security risks. In order to carry out this
categorization, different inspection and analysis methods may be used: [15]

▪ Pre-built data identifiers

Based on content inspection, it is the most used method in data
classification. These identifiers are predefined in DLP tools, and they are
ready to use in analysis to detect sensitive information. Some examples are
credit card numbers, locations, person names or social security numbers.
[13]

▪ Custom data identifiers

Aside from the data identifiers already predefined by the cloud providers,
there is a customizable option that allows users to define their own. When
the inspection in DLP is configured with a custom data identifier, it finds
that type of information if it matches with the specified pattern. [13]

▪ Database fingerprinting
It will be responsible of identifying sensitive data from databases.
When fingerprinting analysis is used, it helps to have a more accurate
classification and will reduce the number of false positives. [13]

▪ Keywords
This pattern allows the detection of sensitive information from key words
or phrases. It is necessary to specify them as a rule in order to find them if
the pattern matches. Also, it is possible to tag the document that contains
these keywords to be protected. Some examples of keyword rules could be
“confidential” or “internal use”. [16]

25
▪ Regular expressions
It allows DLP to detect matches of complex sensitive information. This type
of pattern is made up of a string with simple and special characters, where
each of them will have a meaning. The language type must be understood to
correctly define the pattern and find the desired match. One type of
information detected from a regular expression could be an email address:
[13]

Figure 8: Regular expression pattern

▪ Data labeling
It uses labeling to classify data based on its type or its sensitivity level in
order to simplify the analysis process. It allows us to know the value of its
information and take actions to avoid risks. By using classification, the
organization gains more security over the data, as it will be categorized to
increase control and protect it if an anomaly is detected. [18]

In general, the sensitivity level of the data can be labeled using the following
labels:

▪ HIGH SENSIBILITY
This labeling includes two types of data:

o Critical data: Highly sensitive and restricted information that

may lead to a regulatory breach, where sanctions would be
applied, and a great negative impact will occur in the
organization if the data is destroyed or modified. In this
category, personally identifiable information or
authentication data would be found, which would facilitate the
loss of anonymity of an individual and cause the violation of
their privacy.

26
o Confidential data: Information that can be a risk to the
company’s operation if the data is lost or theft, but it is not
considered critical. Some examples would be customer
information or worker wages.

▪ MEDIUM SENSIBILITY
Information that is not confidential but is not publicly available. It
resides in the corporate network for internal use only. This data
does not pose a danger to the integrity of the organization if it was
stolen. Marketing strategies, as well as emails or documents without
confidential or critical information, could be categorized as internal
data.

▪ LOW SENSITIVITY
Unrestricted information will be categorized in this labeling, such as
data for public use which is accessible to everyone. This is the case
of content found on a public web page, such as descriptions about an
organization or a product, as well as the addresses of a company,
among others.

Some cloud native DLP tools allow to label the risk level of the data to high,
moderate, or low. In addition, they can include labeling on users to determine
if they have access to that information, as well as what actions they can make
on it: read only, permissions to modify, print or forward, among others.

Data labeling will help to speed up the process of identifying sensitive

information on the corporate network and will increase its security. [18]

File Type
This context analysis method will be able to identify a document by its type.
Some cloud providers allow to add rules during the inspection in order to
specify which types of data should be inspected and which will be ignored. [19]

27
2. Data Discovery
It is crucial for companies to have control over their data to comply with today’s
strict security regulations, as well as fast interventions to protect themselves
from possible leaks and handle them before they occur. This prevention
technique is responsible for performing deep inspection of large amounts of
data to simplify the analysis and detection of sensitive information hosted on
the corporate network.
It provides a broader and more accurate view by exploring multiple data
sources, identifying relationships between them that were previously
unknown, causing an optimization impact. In this way, it will be possible to
discover and locate important data that had not been identified before, linking
data sources that were dispersed, and identify hidden information patterns
that must be integrated and evaluated. All this will improve the decisions that
the organization makes about the data and will facilitate its integrity and
confidentiality thanks to the prompt response they offer.

Data discovery will be able to analyze different format files: Text files, Microsoft
Documents (PowerPoint, Word, Excel), images or compressed files. [20]

3. Data Egress
In the activity of a network, we find the data egress to external locations. This
process must be managed to prevent information leakage if the data reaches an
unauthorized destination by the organization. Some of the data output
channels in the cloud are via email and external repositories. Thanks to the
prevention techniques of Data Discovery and Data Classification, protection
measures can be applied in the cloud environment to ensure that sensitive data
does is not exposed. [21]

4. Optical Character Recognition (OCR)

This advanced capability is used in the identification of sensitive data contained
in images, whether in a photography or in a scanned document. Its function will
be to extract the text from the images in order to classify it according to the
established rules and the security policies imposed by the organization once
the inspection is triggered. This allows to increase the data protection that
resides in the cloud and increase DLP capabilities.

The supported formats to inspect images are PDF, JPEG, PNG, JPG and TIFF
format, which can be stored in repositories or attached to emails. Depending
on the capabilities of the cloud provider, images formats such as GIF or BMP
can also be analyzed. [22]

28
5. Detailed reporting
When the inspection job is finished, a report will appear as a resume with all
relevant information. If sensitive data is detected during the scan, a finding will
be created. This report will include the number of data identifiers found, as well
as the total bytes analyzed and the total number of findings.

Detailed reports will be possible if the sensitive data is sent to another native
cloud services, such as databases, monitoring tools or security control services,
where will be able to show the real sensitive data founded in plain text and
other features like the type of data identifier, its source, and its location. [23]

6. Monitoring
Monitoring the network activity allows to track the data flow and control its
status. In case of detecting any anomaly, it will be able to notify it. This
capability does not usually integrate in the DLP tool, but it is possible to send
the results of the inspection to a native monitoring tool. [24]

7. User activity
This capability monitors and tracks end user behavior. This will help to detect
and stop insider threats, whether unintentional or intentional purposes.
Thanks to this, organizations can protect sensitive data while ensuring
compliance with data privacy and security regulations. [24]

8. Integration
Cloud native DLP tools will be able to integrate their service with other native
tools from their providers such as repositories, databases, or email services.
This will facilitate their inspection and the communication between other
native tools to implement new services related with process optimization. [25]

9. Leakage handling
Through inspection techniques, DLP systems can recognize which data is
sensitive and track it to check if the security policies established by the
organization are being complied with. If an incident occurs with that
information, DLP will try to handle it to prevent data leakage or theft.

DLP systems will be able to anticipate leaks of sensitive data with an adequate
prevention approach to prevent them from occurring through specific
techniques: [26]

29
▪ Data encryption: Encryption can be used to protect sensitive
information by making it incomprehensible unless the secret
encryption key is known.

▪ Access control: This technique allows you to restrict access to the

information, configuring who can access to that data, as well as the
actions that can be performed on it, such as editing or read-only.

▪ De-identification: Some DLP services can de-identify sensitive data.

This process will remove the identified information and will apply a
de-identification technique such as masking, redacting or
tokenization to transform it.

▪ Whitelisting: This prevention method will help to whitelist domains

or users, as well as data identifiers in exclusion rules to avoid
security risks.

They may also have leak management capabilities once an incident has been
detected. Different actions will be taken to remedy the loss of this data: [26]

▪ Blocking actions will be responsible for interrupting the data flow,

preventing the infraction from being committed. It is an invasive
method which will not be possible to implement in all DLP systems.

▪ Notification actions do not interrupt the data flow, they will send a
warning to the users about the policy violation that they have
committed. In this way, employees are educated by giving them the
opportunity to reverse the action.

▪ Alert actions will be responsible for sending a notification to the

personal in charge of supervising the data flow and the security of
the organization’s information.

▪ Audit actions will be the least invasive technique because only lets
a log of the incident occurred.

30
3.5. Main Cloud Native DLP tools
The cloud native DLP tools of the main cloud providers are presented below:

Google Cloud Data Loss Prevention

Also known as Cloud DLP. It is the cloud native DLP tool offered
by Google Cloud Platform to help in data security. This service
allows customers to inspect their data for potentially sensitive
information stored in the Google infrastructure.

The service will be able to de-identify data and optionally re-identify them applying
redact and masking solutions to hide information thanks to OCR technique and data
identifiers. It has many capabilities which are accessible through APIs, requiring the
user to build a DLP solution given the low-level coding blocks available. [25]

Azure Information Protection (AIP)

It is the cloud native DLP tool offered by Microsoft Azure, which
can be used in local storage as well as in the cloud. Its main
characteristic is data classification labeling to protect documents,
especially on e-mails. This type of data identifier will help to label
information depending on its level of sensitivity.

It is useful because it allows persistent protection: the label travels with the data,
regardless of where it is stored, sent, or shared. Azure can prevent data leak thanks to
its data tracking. This will allow us to monitor protected documents, seeing who is
accessing them and when. If security issues are found, AIP will have prevention
capabilities to limiting actions such as file expiration dates or revoke access. [27]

Amazon Macie
Amazon Web Services offers Amazon Macie as its data
discovery tool for data loss prevention. It will be able to
detect personal data within native buckets thanks to the
pattern matching and protect it in AWS environment.

It automates evaluations and monitors data for its security and keeps data privacy.
Macie will create detailed reports with findings to review and remediate potential

31
issues detected. Also integrates other native tools to submit incidents but does not
provide any blocking or prevention capability directly. [28]

3.6. Comparison between the main Cloud Native DLP tools

The following table summarizes the capabilities supported by the presented cloud
native DLP tools, making a comparison of the services they can offer: [25][27][28][29]

CAPABILITIES OF Google Azure Amazon

CLOUD-NATIVE DLP Cloud Information Macie
DLP Protection

Pre-built data Yes Yes Yes

identifiers

Custom data Yes Yes Yes

identifiers

DATA Regular expressions Yes Yes Yes

CLASSIFICATION
Fingerprinting No No Yes

Tagging Custom Yes Custom

Keywords Yes Yes Yes

File Type Yes No Yes

Text Yes Yes Yes

Documents Yes Yes Yes

Compression No No Yes
archives
DATA
DISCOVERY Images Yes Yes No

Video No No No

Apache Avro object Yes No Yes

containers

32
Access Control Custom Yes Yes

Data quarantine Custom No No

Encryption Yes Yes Yes

Redacting Yes No No

LEAKAGE Masking Yes No No

HANDLING
Tokenization Yes No No

Expiration time Custom Yes No

Blocking Custom Yes No

Notifying Custom Yes Custom

Alerting Custom Yes Custom

Logging Custom Yes Custom

Limiting user Custom Yes Custom

actions

Whitelisting Yes Yes Yes

DATA EGRESS Yes No Yes

OCR Yes Yes No

DETAILED REPORTING Yes Yes Yes

MONITORING Custom Yes Custom

INTEGRATION Yes Yes Yes

USER ACTIVITY Custom Yes Custom

Table 5: Comparison of main cloud native DLP tools

33
4. Methodology / project development:
Once the DLP services provided by the main cloud providers were defined, as well as
the basic and advanced capabilities they should support on their tools, it is verified
that Cloud DLP is the most complete option, directly providing leakage handling
methods like masking and OCR capabilities to inspect and redact images.
This section will be fully exploiting all the features of the native Google Cloud tool to
make a general technical description that allows understanding its operation and
configuration. It will also find weak points in the supporting capabilities that the
service may have. The goal is to study if Cloud DLP actually performs as it claims in its
documentation, or if its capabilities are more limited.

As introduced in the previous section, Cloud DLP is a cloud-based solution for

protection and detection of sensitive data. Its main function is the data discovery and
classification through inspection methods. It will allow users to inspect their
repositories and optionally de-identify them.
Several capabilities will be accessible through API calls. DLP service is provided as an
independent REST API that will provide methods for the detection, inspection, and de-
identification of sensitive data, whether in text documents, images or in native
storages.

4.1. Key features of Cloud DLP

CONTENT INSPECTION IN NATIVE STORAGE SERVICES

Cloud DLP may be used to inspect native repositories tools to locate sensitive
information in the following storage services: [25]

▪ Cloud Storage: This service allows you to store and retrieve

any amount of data at any time in buckets. You should enter
its name in Cloud DLP inspection to scan it or includes an URL
to scan only a specific directory or file. [30]

▪ Big Query: It is a Google’s data warehouse designed to

improve data analyst productivity when performing SQL
queries. For content inspection, enter the ID of the project,
the dataset, and the table you want to analyze. [31]

34
▪ Google Datastore: It is a NoSQL database using in web and
mobile applications, which will use to do SQL data queries.
It will be necessary to introduce the ID of the project and the
type of data you want to analyze in the inspection. [32]

DATA CLASSIFICATION
In order to protect the sensitive data stored on the corporate work, Cloud DLP will
perform content inspections to classify it and find out its location. This will allow us to
have the information identified, as well as its type and how it is being used. The user
can enable automatic data classification in its GCP repositories. It will be able to scan
structured and unstructured information in text files, images, Microsoft documents,
PDFs, and binary data. [33]

USING INFOTYPES
Cloud DLP will be able to inspect our information if we provide a list of the types of
sensitive data which we want to locate. It will use infotypes to define a type of data to
be located in the corporate network. The service can recognize them thanks to
infotype detectors, which are configured to detect the desired information if they
match with their matching criteria.
By default, DLP has 150 different infotype detectors built and ready to use, such as
social security numbers, emails, phone numbers, or person names. If you want to
identify an email address in a table, there is the EMAIL_ADDRESS infotype detector
defined to recognize this kind of sensitive data.
If none of the default infotype detectors is valid in our use case, Cloud DLP allows you
to add custom infotype detectors. You can implement new types of sensitive data and
configure the detection behavior, such as regular expression pattern or a list of
keywords or phrases that must match with the information analyzed. [34]

MATCH LIKELIHOOD
Scan results are categorized based on how likely they are to represent a match. Cloud
DLP uses match likelihood to indicate how likely it is that a piece of data matches a
given infotype. There are 5 possible values for likelihood: VERY_LIKELY, LIKELY,
POSSIBLE, UNLIKELY, VERY_UNLIKELY. They are arranged in order from highest to
lowest probability of a match. [35]

35
When you want to start an inspection, it is recommendable to set in the request the
minimum level of likelihood to scan data that you want to retrieve. To prove this, I
used the Google Data Loss Prevention Demo to check this characteristic and see how
likelihood works when I introduce the same information but in different context. The
infotype PHONE_NUMBER appears as a finding, but in different bucketized
representation: [36]

Figure 9: Cloud DLP demo

The likelihood affects in the number of matching findings that are returned in the
response. If your inspection is configured with likelihood LIKELY, the response will
not include the second finding because it will contain findings with likelihood LIKELY
and VERY_LIKELY. If you set the minimum likelihood as VERY_UNLIKELY, all findings
will appear in the response, although in some cases it is not a reliable option as it
introduces false positives.

36
DATA DE-IDENTIFICATION
Once Cloud DLP performs the information scan, and the results of the inspection are
obtained, DLP will be able to take preventive actions de-identifying sensitive data in
text content. This process will remove identifying information from data applying a
de-identification transformation to redact, mask, tokenize or transform text and
images and guarantee data privacy in storages and tables. Some of these types are one-
way and they can’t be reversed once the action is done, such as masking or redaction.
Others may be re-identified, like encryption transformation. [37]

TEMPLATES CONFIGURATION
Templates can be used to create and maintain configurations information for reuse in
Cloud DLP. This characteristic is useful to speed up the inspection process
configuration and define the de-identification transformations to apply in sensitive
data.
The service supports two types of templates: [38]
▪ Inspection templates: It will contain configuration information related to data
analysis and data classification. It is mandatory to include the infotypes to scan
for, and it is recommended to specify the likelihood in confidence threshold
section.

▪ De-identification templates: This template will include the de-identification

configuration to apply in information transformation purposes. As in
inspection templates, infotypes should be defined to be processed, as well as
the likelihood according to the needs.

Once the template is created, it appears in configuration section ready for its use.

Figure 10: Template configuration

37
JOB INSPECTIONS
To trigger content inspection in storages and databases, a job inspection must be
configured in Cloud DLP.

Figure 11: Job inspection configuration

To create a job inspection, it will be necessary to fill up the following fields: [39]

▪ Input Data configuration: The name of the job inspection, as well as its ID
which must be unique, will be specified first. The resource location where will
be hosted should be defined, and the path or table ID where is the input data to
inspect. In addition, Cloud DLP allows sampling, so we can optionally scan only
part of the information depending on the needs of the organization and the
volume of data stored.

▪ Detection configuration: This section will allow us to choose the infotype

detectors to be identified in the inspection job. It is possible to define them by
including the ID of a previously configured inspection template or by selecting
them directly in the job and defining the custom infotypes if it is necessary. In
addition, exclusion and keyword rules can be added to enhance infotypes and
improve inspection results. Also, it is important to include confidence threshold,
setting the minimum range accepted.

▪ Add actions: Cloud DLP offers different actions that will be performed once the
inspection job is finished. You can decide where you want to save the findings
founded during the analysis. The most common option is through Big Query
notifications, where the sensitive data and its characteristics will send to a
specific table. Also, you can enable the email option, which notifies you when
the inspection is finished, or publish the results in other native tools such as
Cloud Monitoring.

38
▪ Scheduling: We can schedule the inspection job at a certain time interval, or
turn into a job trigger, which runs periodically. Another option will be run
immediately once, without applying scheduling.

All this configuration will summarize in a JSON representation, which will send to the
Cloud DLP API through content.inspect method. We could also do the configuration
programmatically instead of in the tool, which allows you to interact with the Cloud
API deeply and access to other types of capabilities and methods that are not allowed
to be configured from the tool, such as deidentification techniques. [39]

4.2. Design of use cases

To carry out the correct design of the use cases, the following methodology is defined:

1 2 3 4 5

Figure 12: Methodology for use cases

1. Analyze the channels in which sensitive information is stored, whether locally,

in repositories, or relational or non-relational databases.

2. Analyze the sensitive information stored, as well as the configuration of the

environment, to understand the level of risk which is exposed.

3. Design the scenario of the use case and propose custom configurations or new
improvements and how to implement them.

4. Generate the inspection work, which will contain the rules, conditions and
actions adjusted to each use case to identify and control sensitive information.
Save the results to see the detailed report in another native tool accepted by
Cloud DLP.

5. Apply a de-identification job in order to protect and hide the findings, as well
as take actions to handle the leakage.

39
The infrastructure used to represent the uses cases will be Google Cloud Platform,
where you can find all its native tools available in the cloud, offering different IaaS,
PaaS, SaaS services. In this environment, users can test and develop their applications,
as well as store their data in storage solutions and configure the security and network
management. [40]
Having described the main features of the Google Cloud DLP solution and listing the
capabilities that they can support; I will be able to understand the needs that must be
covered by organizations and users in the service. The following sections will show
the design of two use cases to exemplify the capabilities of the tool and how responds
in different scenarios.

40
4.3. Use case 1: Identification of sensitive data in documents
Scenario
This use case studies how Cloud DLP responds when it inspects a repository that
contains unstructured information. For this, a bucket in Cloud Storage is created,
storing PDF documents with sensitive content. Data analysis is done by launching an
inspection job configured to classify and identify the infotypes defined in it, as well as
include where the findings must to be sent. A new custom infotype will be created to
observe the behavior of this capability. This job will generate a report that will indicate
the status of the inspection and the number of findings obtained from each infotype.
The detailed report will be available in a Big Query table, which will show all the
findings with their inspection results and their characteristics, such as their location
and their likelihood. A de-identification template will be created to mask the sensitive
data found to the API through the content.deidentify method.
For this, I will use the Google Data Fusion tool, which allows me to create a pipeline to
transform data and save them in another native storage tool. In this case, I will store
the masked results in a redact_bucket_results bucket in Cloud Storage, in a CSV format.

The following scheme shows the implementation needed to do the testing:

Figure 13: Use case 1 scheme

Implementation
A cloud storage bucket called testing_bucket_tfg is configured in the laboratory, which
will store 15 different bills in PDF format with sensitive information:

41
Figure 14: testing_bucket_tfg bucket

In Cloud DLP, the inspection job must be configured. The job trigger-inspect-bills must
be capable to analyze the input data stored in Figure 14 with no sampling. The infotype
detectors PERSON_NAME, SPAIN_NIF_NUMBER, PHONE_NUMBER, STREET_ADDRESS,
and EMAIL_ADDRESS will be integrated in the service, but there is no infotype in the
tool capable of detecting amounts. So, it is necessary to create a new custom infotype
to identify this data in the bills.
This custom infotype will be known as GENERIC_AMOUNT, defined with a regular
expression. To define it, I have considered that negative values could appear in the bill,
as well as decimals and high values. Any information with the ‘€’ symbol will be an
amount. That is why I have defined the minimum probability of the pattern as LIKELY.

Figure 15: Custom infotype configuration

42
Once the custom infotype is declared, and the infotype detectors are included in the
inspection, I decided to set the probability threshold to LIKELY to find the maximum
results with the lowest possible rate of false positives. In Big Query, I have created a
table called results_1 where all the results found in the inspection will be saved. I have
had to specify it in the inspection to get details about the location and likelihood of
each particular result. Otherwise, I will only have general statistics about the number
of infotypes found, without the concrete values.

This table will be the input data in a pipeline created in Cloud Data Fusion, called
Masking_ Pipeline_1. Cloud DLP will take this data and will transform the infotypes
masking them with ‘#’ symbol as will be defined in the de-identification job and in the
transformation template. Then, the process will save the masked results in the
redact_bucket_results storage, where they can be consulted in CSV format. [41]

Figure 16: Masking pipeline

43
4.4. Use case 2: Detection and de-identification of sensitive data in
images
Scenario
It is about observing the behavior that Cloud DLP has when it inspects and de-
identifies an image that contains sensitive information.
For this use case, I have decided to analyze a personal DNI and trigger a job inspection
to identify the critical data contained. Once located, I will protect the user by de-
identifying the information found, as well as any other information that could put their
privacy at risk.

Implementation
The personal DNI will be stored in a local path in JPG format called DNI.jpg. To inspect
it, I configure an inspection job in Node.js to request the analysis called Image.js. In the
script, I will include all the necessary parameters to run the API call and import the
required Google Cloud and Node.js libraries. As the input data is an image, Cloud DLP
will be able to inspect it if it is sent in base64 encoding. It will be necessary to convert
before call the content.inspect method.
Also, I will define the infotypes to detect: FIRST_NAME, LAST_NAME, DATE_OF_BIRTH,
SPAIN_DNI_NUMBER, GENDER, DATE, STREET_ADDRESS and GENERIC_ID. The
minimum likelihood will be VERY_UNLIKELY to get the maximum number of results.
Once I get the inspection results in the console, I extend the code generated for the
inspection to performs a de-identification job to transform the information found.
Cloud DLP has configured redact.image method, which can redact the results in a black
box to mask them. [42]

This transformation process is not reversible. For this reason, the masked image will
save in an output path declared in the code. In this use case, the result will be stored
in the same path as the input data, in result.jpg.

Redacting the image is useful to hide the sensitive information, but the result is not
very delicate. A new masking solution could be proposed to make the result cleaner
and more attractive than the existing one. It is intended to implement a transformation
process by pixelating the sensitive information founded to achieve a less aggressive
effect on the image. For this, NodeJS has the Jimp library to performs the image
processing, which allow us to manipulate an image thanks to its included methods.

44
The pixelate() function is an unbuilt function which applies pixelation effect over an
image or region. Its syntax will take the size of the pixel defined as a constant and the
bounding box values that contains the coordinates where sensitive data resides. In this
way, it will be possible to apply the pixelation in the region passed by parameter to
achieve this new way of data transformation.
If Cloud DLP inspects an image, it will save in declared variables the coordinates when
sensitive data was located:

Figure 17: Bounding box code

Once the bounding boxes are saved, the pixilation function will be applied in the region
with pixels of size 7, which I have considered adequate to de-identify the information
and achieve the desired clean transformation:

Figure 18: Pixelate function

The complete code for this implementation, as well as the inspection job configuration
is attached in appendices section.

45
5. Results
In this section, I will include the results obtained in the use cases implemented before.

5.1. Results use case 1

Cloud DLP has found 623 findings during the content inspection shown in Figure 14,
with the following number of findings obtained from each infotype:

Figure 19: Findings use case 1

With the information provided by the report, the percentage of each infotype can be
calculated to analyze the impact they had on the inspection. According to Cloud DLP,
the GENERIC_AMOUNT infotype configured in the tool is the most found in the bucket,
representing slightly more than half of the findings of the configured job:

Figure 20: Percentage of findings per infotype

46
To draw conclusions about the behavior of the service when inspected the bucket, it is
necessary to analyze the results saved in Big Query table, which contains the detailed
analysis report.

Analyzing false positives (i.e. Precision)

As the table contains relevant information on each of the located infotypes, such as
how many infotypes have been found in each bill and the probability of coincidence
that they have had, I can know how many false positives have been detected in the
inspection.

In the following table, I include all the false positives found in the detailed report,
separating by their infotype detectors, as well as the total findings of each type:

Infotypes Total findings Number of false positives

EMAIL_ADDRESS 19 0

GENERIC_AMOUNT 345 0

PERSON_NAME 105 51

PHONE_NUMBER 36 0

STREET_ADDRESS 98 9

SPAIN_NIF_NUMBER 20 0

TOTAL 623 60

Table 6: False positives use case 1

I have found 60 false positives in the inspection result. This represents de 9,63% of
the total findings, which is a high number since it is an inspection that includes results
with LIKELY and VERY_LIKELY coincidence. Only the infotypes PERSON_NAME and
STREET_ADDRESS have false positives in their detections. Observing these errors, the
service introduces rare characters as a match for the pattern defined in
PERSON_NAME infotype detector:

Figure 21: Person name false positives

47
This problem is caused by analyzing sensitive data in unstructured data. Data
identifiers can generate false positives and not be entirely accurate during detection if
they are used on this type of data. They are more effective in structured data as they
are easier to categorize.

In the case of STREET_ADDRESS, there are errors caused by the inaccurate declared
pattern as shown in the following figure:

Figure 22: Street address false positive

The following graph represents the number of false positives found in both infotypes
with probability measures:

Figure 23: Distribution of false positives

Analyzing unidentified data

In addition to studying the false positives detected as findings during the inspection,
performing an analysis of the data that has not been detected by the configured job is
necessary to understand the behavior of the service.

Knowing the sensitive information contained in the bills, as well as the detailed report
shown in results_1 table, I have been able to locate the information that has not been
detected by Cloud DLP. I summarized this analysis in the following table, which I
specified the percentage of the unidentified data about the total of each infotype:

48
Infotype detector Number of unidentified Percentage with
data reference to total

EMAIL_ADDRESS 2 9,52%

GENERIC_AMOUNT 90 20,69%

PERSON_NAME 0 0%

PHONE_NUMBER 13 26,53%

STREET_ADDRESS 7 7,29%

SPAIN_NIF_NUMBER 0 0%

TOTAL 112 16,59%

Table 7: Unidentified data use case 1

I detected 112 unidentified data in the job inspection. Having set the probability of
matching to LIKELY, it is logical that unidentified data appear if they have not met with
this threshold rule.

In the case of PHONE_NUMBER, the 13 results are public and custom service numbers.
They do not represent a threat by not revealing private information about the client
or the company. It is the same case presented by the data found from EMAIL_ADDRESS
and STREET_ADDRESS, so they will not be considered as an error during the
conclusions of the service behavior. They are not sensitive data, so their matching with
the detector will be lower than a personal data identified.

Analyzing the custom infotype

The created custom infotype GENERIC_AMOUNT has worked correctly since it has
been able to identify a large part of the amounts defined in the bills. However, it was
unable to discover 90 of them, leaving them unclassified.

This is due to the regular expression pattern defined. As entered in the configuration,
Cloud DLP can identify an amount if it contains the ‘€’ symbol. If not, it will not
consider that this data corresponds to the infotype created, so it will not be added as
a GENERIC_AMOUNT finding.

Here is an example with a fictitious bill that I have created for the analysis. In it, you
can see how there are two ways to represent an amount: one of them uses the €
symbol to define it, and the other appears with no symbol, only as a value.

49
That is why the inspection job has not been able to locate them. It is very difficult for
DLP tools to interpret this type of data, which requires very complex patterns.
Depending on how the information is presented to us, data identifiers may be more
accurate during inspect.

Figure 24: Analyzing a bill

The custom infotype GENERIC_AMOUNT was created without considering the type of
document to inspect in this use case. It was intended to find a way to define an amount
in a generic way to be used by default in the service as a new built-in identifier. So, the
most logical way to find it is next to its currency identifier.

If the infotype had been implemented for this particular use case, it would not have
been defined the regular expression in this way. If it is known that the bucket only
contains bills, the custom infotype would be defined without the symbol restriction,
making sure that it captures all the amounts. Although the false positives would be
higher when entering all the values in the results, those that correspond to the
amounts would be covered and identified.

In relation of the data transformation, a CSV file appears in the redact_bucket_results

bucket with the requested masked information once the pipeline finishes the
execution, so the de-identification job in structured data and the generated template
worked successfully.

50
In conclusion, the behavior of the service in this use case is positive. Although it
introduces false positives in the results, the data that has not been identified is high,
but it is not considered critical. In the case of the custom infotype, unidentified data is
significant. Although it is confidential information that should be protected, it does not
pose a danger to the company if it is leaked. It should be targeted with a change in the
regular expression defined and with the help of keywords or exclusion rules that allow
Cloud DLP to identify these amounts, giving it a custom approach.

5.2. Results use case 2

After sending the content inspection request, Cloud DLP found 18 results. These will
be the maximums that could be found because the minimum matching range in the
request was VERY_UNLIKELY.

The results are printed on the console showing the value of the infotype, its type, and
the matching likelihood. In this way, the input data and the findings of the inspection
can be analyzed to conclude if the behavior of the service has been adequate.

Analyzing false positives

Checking the results obtained, it is possible to find out if any false positive has been
detected in any of the defined infotypes. It is summarized in the following table:

Infotype detector Total findings Number of false positives

FIRST_NAME 1 1

LAST_NAME 1 0

LOCATION 3 0

SPAIN_DNI_NUMBER 1 0

GENDER 0 0

DATE_OF_BIRTH 1 0

DATE 2 0

STREET_ADDRESS 1 1

GENERIC_ID 8 2

TOTAL 18 4

Table 8: False positives use case 2

51
There are 4 false positives detected during the analysis, representing the 22,22% of
the total findings. This value is very significant, but it is logical by the type of analysis
accepted with a VERY_UNLIKELY likelihood.

The FIRST_NAME infotype detected my surname as my name in a VERY_LIKELY

probability. Therefore, it is an error that Cloud DLP would show regardless of the
defined likelihood in the inspection. The STREET_ADDRESS detected is a group of
sensitive information made up of person names and identifiers with a LIKELY
likelihood, which is not related with an address. About the GENERIC_ID false positives,
they correspond to errors when confusing address or dates, although the service
define this match as VERY_UNLIKELY, so it could be discarded if the probability
threshold increased.

Figure 25: First name false positive Figure 26: Generic ID false positive

Analyzing unidentified data

By comparing the findings located in the inspection, as well as sensitive information

in the image, I can find out what data has not been identified by the tool:

Infotype detector Number of Percentage with

unidentified data reference to total

FIRST_NAME 1 100%

LAST_NAME 1 50%

LOCATION 0 0%

SPAIN_DNI_NUMBER 0 0%

GENDER 1 100%

DATE_OF_BIRTH 0 0%

DATE 0 0%

GENERIC_ID 0 0%

TOTAL 3 17,64%

Table 9: Unidentified data use case 2

52
There are 3 sensitive data that the service can not detect. The FIRST_NAME defined in
the image is not located by the inspection, which confused it with the surname as
commented in the previous analysis. In the case of GENDER, the pattern definition
does not allow to find out that the ‘F’ letter in the DNI means woman.

As it is, the inspection has too many errors to be usable in images. Although Cloud DLP
has the OCR additional capability with a great potential, it is still quite immature to use
in real cases. The results are not reliable and consistent enough to ensure its correct
behavior in images. The image inspection needs to improve its detection capabilities,
otherwise its correct behavior can not be guaranteed. If a company decides to use the
Cloud DLP API to detect sensitive data in images, they should be aware of this
limitation by part of the tool. Using this without revision would cause serious
problems for the organization, where the loss of this data would involve critical
consequences such as huge economic sanctions for revealing user sensitive
information.

Analyzing the pixelate de-identification method

The result obtained once the pixelate function has finished is showed in Figure 27.
With the generated code, I have achieved that the last finding found in the inspection
is the one that appears masked.

I have not been able to get the pixelation applied to the rest of findings since the
implementation of that part of the code requires more research that it seemed, so I
have not had enough time to include it in the thesis.

Despite this, the result shows that the pixelation masking proposal works correctly
and could be implemented in the service once the method is sufficient mature.

Figure 27: result2.jpg pixelated

53
6. Budget
This section will analyze all the costs of the project during 16 weeks of work.
As it is basically software, I have not needed any kind of components to do it.

The following table shows an estimation of the general salary of the team. I have
included the hours worked per week, as well as the cost per hour, obtaining the gross
salary. Apart from this, I have considered the 30% for Social Security (SS) paid by the
company to obtain the total salary of all the workers:

Role Weeks Cost/h Hours/week Gross SS Charges TOTAL

Salary
Senior engineer 16 40€/h 7h/week 4.480€ 1.344€ 5.824€
Junior 16 15€/h 20h/weeek 4.800€ 1.440€ 6.240€
developer
Analyst 16 25€/h 25h/week 10.000€ 3.000€ 13.000€

Table 10: General workers salary

To do the project, we have needed four powerful computers to be autonomous in our

work and be able to work correctly:

Material Price (€) Units TOTAL

Equipment 1.200 € 4 4.800 €

Table 11: Equipment

This will be an investment for the project, so it is necessary to consider the

amortization of this material. I only have to calculate the amortization of all the
computers considering the 15% of the amortization coefficient and 5 years of useful
life.

54
Once I have made the calculation, I obtained the total depreciation of the material
used:

Amortization Price Useful Coefficient Residual Units TOTAL

(€) life (%) value (€)
(years)
Equipment 1.200 € 5 15 180 € 4 4.080 €

Table 12: Amortization

Regarding to the software resources, the total cost of the lab environment is the
following, which includes all the tools used in the use cases and the functionalities
enabled during the implementation:

Software resources Estimated monthly cost Months TOTAL

Laboratory cost 1.300 € 4 5.200 €

Table 13: Software resources

To sum up, the final budget for the project is the following:

Costs Total
Worker salaries 25064 €
Equipment 4.800 €
Amortization 4.080 €
Software 5.200 €
TOTAL COST 39.144 €

Table 14: Total cost of the project

55
7. Conclusions and future development:
In conclusion, the thesis has reflected the importance of detecting sensitive
information in the cloud, as well as de-identifying it to prevent possible attacks or
unintentional leaks.
The comparison of the main cloud native tools shows that Cloud DLP is the most
complete tool and the one that includes a great number of capabilities in its service,
but with limitations. Through the designed use cases, I assessed its behavior in
different situations with input unstructured data, where the behavior in the inspection
of documents is adequate. The custom infotype generates more errors in the analysis
as it is generic, so this type of capability will be aimed to improve the results for
specific use cases.

On the other hand, the behavior when the tool inspects images failed. Data protection
is critical for any business, and in this use case it has not been able to detect all the
sensitive information contained in the DNI. Cloud DLP cannot ensure a high efficiency
in this type of file due to a lack of mature. About the proposed de-identification method,
I was able to save the coordinates where the infotype was located to apply the function
in that region. The data transformation works correctly, but only in the last finding
found. It will be necessary to extend the code, making it useful and efficient to be
introduced as a new data transformation method.

As a future work, the tool could be oriented as a hierarchical classifier, where

depending on the type of document, specific rules are applied to obtain greater
efficiency in the work done. In addition, it could be studied how to eliminate the false
positives that have appeared in the inspections, improving the results and the
effectiveness of the tool.

56
Bibliography:

[1] https://www.powerdata.es/cloud
[2] https://www.salesforce.com/mx/cloud-computing/
[3] https://intelequia.com/blog/post/2055/ventajas-del-cloud-computing
[4] https://www.redhat.com/es/topics/cloud-computing/what-is-saas
[5] https://www.redhat.com/es/topics/cloud-computing/what-is-paas
[6] https://www.redhat.com/es/topics/cloud-computing/what-is-iaas
[7] https://nexica.com/es/blog/modelos-de-despliegue-cloud-cloud-privado-cloud-p%C3%BAblico-y-cloud-
h%C3%ADbrido
[8] https://www.redeszone.net/tutoriales/redes-cable/que-es-multicloud-ventajas/
[9] https://www.cybrary.it/blog/introduction-to-data-loss-prevention/
[10] https://dspace.library.uvic.ca/bitstream/handle/1828/11339/Alhindi_Hanan_PhD_2019.pdf?sequence=1&i
sAllowed=y. Page 22: Data Loss Prevention
[11] https://www.sealpath.com/es/blog/tres_estados_info/
[12] https://www.itdigitalsecurity.es/reportajes/2019/01/dlp-o-como-prevenir-la-fuga-de-datos
[13] https://www.mcafee.com/blogs/enterprise/cloud-security/do-you-dlp-understanding-the-difference-
between-content-awareness-and-contextual-
analysis/#:~:text=Data%20loss%20prevention%20(DLP)%2C,file%20servers%20or%20in%20cloud
[14] Gartner: Data Loss Prevention: Comparing Architecture Options. Published 3rd December 2020 – ID
G00731429
[15] https://digitalguardian.com/blog/what-data-classification-data-classification-definition
[16] https://docs.trendmicro.com/all/ent/imsec/v1.6/en-us/imsec_1.6_olh/dac_keywords.html
[17] Gartner: Guide to Cloud Data Security Concepts. Published 21st September 2021 – ID G00756156
[18] https://www.microsoft.com/en-us/insidetrack/using-azure-information-protection-to-classify-and-label-
corporate-data
[19] https://techdocs.broadcom.com/us/en/symantec-security-software/information-security/data-loss-
prevention/15-8/about-data-loss-prevention-policies-v27576413-d327e9/supported-formats-for-file-type-
identification-v41600705-d327e133471.html
[20] https://digitalguardian.com/dskb/data-discovery
[21] https://digitalguardian.com/dskb/data-egress
[22] https://knowledge.broadcom.com/external/article/160504/detect-sensitive-data-in-an-image-file-w.html
[23] https://cloud.google.com/dlp/docs/analyzing-and-reporting
[24] https://securityintelligence.com/data-activity-monitoring-and-data-loss-prevention-a-balanced-approach-
to-securing-your-critical-assets/
[25] https://cloud.google.com/dlp?hl=es
[26] https://is.muni.cz/th/asqds/thesis.pdf. Page 10: Leakage handling
[27] https://docs.microsoft.com/es-es/azure/information-protection/what-is-information-protection
[28] https://docs.aws.amazon.com/macie/latest/user/macie-user-guide.pdf#what-is-macie
[29] https://cloudacademy.com/course/introduction-to-google-cloud-data-loss-prevention/introduction-to-
google-cloud-data-loss-prevention/
https://github.com/MicrosoftDocs/Azure-RMSDocs/blob/master/Azure-RMSDocs/rms-client/client-
admin-guide-file-types.md
https://cloud.google.com/dlp/docs/infotypes-reference?hl=es-419

57
https://docs.aws.amazon.com/macie/latest/user/discovery-supported-formats.html
https://cloud.google.com/dlp/docs/inspecting-storage?hl=es_419
https://stealthbits.com/blog/using-the-azure-information-protection-aip-scanner-to-discover-sensitive-
data/
https://docs.aws.amazon.com/macie/latest/user/discovery-supported-formats.html
https://cloud.google.com/dlp/docs/sensitivity-risk-calculation
https://docs.microsoft.com/es-es/azure/information-protection/aip-classification-and-protection
https://cloud.google.com/dlp/docs/concepts-image-redaction
https://techcommunity.microsoft.com/t5/security-compliance-and-identity/azure-information-protection-
documentation-update-for-november/ba-p/287364
https://aws.amazon.com/blogs/security/how-to-create-custom-alerts-with-amazon-macie/
[30] https://cloud.google.com/storage?hl=es
[31] https://cloud.google.com/bigquery?hl=es
[32] https://cloud.google.com/datastore?hl=es
[33] https://cloud.google.com/dlp/docs/classification-redaction#storage_classification
[34] https://cloud.google.com/dlp/docs/concepts-infotypes?hl=es_419
[35] https://cloud.google.com/dlp/docs/likelihood
[36] https://cloud.google.com/dlp/demo/#!/#!%2F
[37] https://cloud.google.com/dlp/docs/deidentify-sensitive-data
[38] https://cloud.google.com/dlp/docs/concepts-templates
[39] https://cloud.google.com/dlp/docs/creating-job-triggers?hl=es_419
[40] https://cloud.google.com/gcp/?hl=es
[41] https://cloud.google.com/data-fusion/docs/create-data-pipeline
[42] https://cloud.google.com/dlp/docs/redacting-sensitive-data-images

58
Appendices

Image.js
// Imports the Google Cloud libraries
const DLP = require('@google-cloud/dlp');
//const vision = require('@google-cloud/vision');
var jimp = require('jimp');
const gm = require('gm');

// Imports required Node.js libraries

const mime = require('mime');
const fs = require('fs');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();
//const client = new vision.ImageAnnotatorClient();

// The project ID to run the API call under

const projectId = 'tfg-dlp-test';

// The path to a local file to inspect. Can be a JPG or PNG image file.
const filepath = 'blanca2.jpg';

// The minimum likelihood required before redacting a match

const minLikelihood = 'VERY_UNLIKELY';

// The maximum number of findings to report per request (0 = server

maximum)
const maxFindings = 0;

// Whether to include the matching string

const includeQuote = true;

//infoTypes to identify and redact

const infoTypes = [{ name: 'FIRST_NAME' }, { name: 'LAST_NAME' },{ name:
'LOCATION' }, { name: 'GENDER' }, { name: 'DATE' }, { name: 'DATE_OF_BIRTH'
}, { name: 'STREET_ADDRESS' }, { name: 'SPAIN_DNI_NUMBER' }, { name:
'GENERIC_ID' }];

// The local path to save the resulting image to.

const outputPath = 'result2.png';

59
async function inspectAndPixelateImage() {
// Construct file data to inspect
const imageRedactionConfigs = infoTypes.map(infoType => {
return {infoType: infoType};
});
const fileTypeConstant =
['image/jpeg', 'image/bmp', 'image/png', 'image/svg'].indexOf(
mime.getType(filepath)
) + 1;
const fileBytes =
Buffer.from(fs.readFileSync(filepath)).toString('base64');
const item = {
byteItem: {
type: fileTypeConstant,
data: fileBytes,
},
};

// Construct Inspect request

const inspectRequest = {
parent: `projects/${projectId}/locations/global`,
inspectConfig: {
infoTypes: infoTypes,
//customInfoTypes: customInfoTypes,
minLikelihood: minLikelihood,
includeQuote: includeQuote,
limits: {
maxFindingsPerRequest: maxFindings,
},
},
item: item,
};

//Construct Redact request

const redactRequest = {
parent: `projects/${projectId}/locations/global`,
byteItem: {
type: fileTypeConstant,
data: fileBytes,
},
inspectConfig: {
minLikelihood: minLikelihood,
infoTypes: infoTypes,
},
imageRedactionConfigs: imageRedactionConfigs,
};

60
// Run request
const [responseInspect] = await dlp.inspectContent(inspectRequest);
const findings = responseInspect.result.findings;

//const [responseRedact] = await dlp.redactImage(redactRequest);

//const image = responseRedact.redactedImage;

var results = 0;

if (findings.length > 0) {
console.log('Findings:');
findings.forEach(finding => {
if (includeQuote) {
results = results + 1;
console.log(`\tQuote: ${finding.quote}`);
console.log(`\tInfo type: ${finding.infoType.name}`);
console.log(`\tLikelihood: ${finding.likelihood}`);
console.log(`\n`);

finding.location.contentLocations.forEach(location => {
location.imageLocation.boundingBoxes.forEach(box => {
console.log(`\t\tTop: ${box.top}`);
console.log(`\t\tLeft: ${box.left}`);
console.log(`\t\tHeight: ${box.height}`);
console.log(`\t\tWidth: ${box.width}`);
console.log(`\n`);

const size = 7;
var top = box.top;
var left = box.left;
var height = box.height;
const width = box.width;

jimp.read('blanca2.jpg').then(coord => {
return coord
.pixelate(size, left, top, width, height)
.write(outputPath);
})
})
});
}})
console.log(`Total findings: ${results}`);
// fs.writeFileSync(outputPath, image);
console.log(`Saved image redaction results to path: ${outputPath}`);
}}

inspectAndPixelateImage();

61
Glossary

DLP: Data Loss Prevention

GCP: Google Cloud Platform

ICT: Information and Communications Technology

IT: Information Technology

IaaS: Infrastructure as a Service

PaaS: Platform as a Service

SaaS: Software as a Service

PII: Personally Identifiable Information

GDPR: General Data Protection Regulation

I-DLP: Integrated DLP

E-DLP: Enterprise DLP

CASB: Cloud Access Security Broker

SEG: Secure Email Gateway

SWG: Secure Web Gateway

62
AIP: Azure Information Protection

AWS: Amazon Web Services

OCR: Optical Character Recognition

Project - Report
No ratings yet
Project - Report
75 pages
Chapter One
No ratings yet
Chapter One
7 pages
Cloud Report Final
No ratings yet
Cloud Report Final
99 pages
Synopsis of ML Project
100% (1)
Synopsis of ML Project
6 pages
A Dynamic Cloud With Data Privacy Preservation
No ratings yet
A Dynamic Cloud With Data Privacy Preservation
151 pages
Helpfulnoteschecklist PDF
No ratings yet
Helpfulnoteschecklist PDF
80 pages
Datacommunicationsecurityprivacy PDF
No ratings yet
Datacommunicationsecurityprivacy PDF
80 pages
Thatsliehopethishelps PDF
No ratings yet
Thatsliehopethishelps PDF
80 pages
Context Sensitive Privacy Algorithm Method For Users in Cloud
No ratings yet
Context Sensitive Privacy Algorithm Method For Users in Cloud
5 pages
I Paper 3
No ratings yet
I Paper 3
5 pages
(IJCST-V3I6P17) :mr. Chandan B, Mr. Vinay Kumar K
No ratings yet
(IJCST-V3I6P17) :mr. Chandan B, Mr. Vinay Kumar K
4 pages
A Secure Data Privacy
No ratings yet
A Secure Data Privacy
15 pages
E3sconf Iconnect2023 04040
No ratings yet
E3sconf Iconnect2023 04040
9 pages
Fulltext01
No ratings yet
Fulltext01
100 pages
Data Security in The Cloud: Project Summary Gerardo Pineda Betancourth
No ratings yet
Data Security in The Cloud: Project Summary Gerardo Pineda Betancourth
4 pages
Mini Web Cloud
No ratings yet
Mini Web Cloud
66 pages
Secure IoT Data Sharing in Blockchain
0% (1)
Secure IoT Data Sharing in Blockchain
49 pages
Assignment 3
No ratings yet
Assignment 3
10 pages
NIST Draft SP 800 144 - Cloud Computing PDF
No ratings yet
NIST Draft SP 800 144 - Cloud Computing PDF
60 pages
Om Report 2025
No ratings yet
Om Report 2025
24 pages
NIST Cloud Computing Architecture
No ratings yet
NIST Cloud Computing Architecture
204 pages
Cloud Security Architecture Guide
No ratings yet
Cloud Security Architecture Guide
204 pages
Detecting Secrets
No ratings yet
Detecting Secrets
27 pages
Security and Privacy in Cloud Computing
No ratings yet
Security and Privacy in Cloud Computing
110 pages
Cloud Computing: A Nobel Cryptosystem For Group Data Sharing in Cloud Storage
No ratings yet
Cloud Computing: A Nobel Cryptosystem For Group Data Sharing in Cloud Storage
10 pages
First Review PPT Template
No ratings yet
First Review PPT Template
16 pages
Sharna Cs Done
No ratings yet
Sharna Cs Done
15 pages
Ijet 26263
No ratings yet
Ijet 26263
9 pages
Cloud Data Security Solutions
No ratings yet
Cloud Data Security Solutions
3 pages
Final Thesis Document 2014-1 PDF
No ratings yet
Final Thesis Document 2014-1 PDF
226 pages
Research On Preserving User Confidentiality in Cloud Computing - Design of A Confidentiality Framework
No ratings yet
Research On Preserving User Confidentiality in Cloud Computing - Design of A Confidentiality Framework
18 pages
15 The End 2o
No ratings yet
15 The End 2o
71 pages
Research Essay Privacy Issue Human
No ratings yet
Research Essay Privacy Issue Human
11 pages
Privacy Preservation and Secured Data Storage in Cloud Computing
No ratings yet
Privacy Preservation and Secured Data Storage in Cloud Computing
10 pages
Cloud Security for CS Students
100% (1)
Cloud Security for CS Students
9 pages
C6-REST-666 Report Project
No ratings yet
C6-REST-666 Report Project
57 pages
Seminar On Security in Cloud Computing
No ratings yet
Seminar On Security in Cloud Computing
32 pages
Report of Internship123
No ratings yet
Report of Internship123
19 pages
Crypto 23 PDF
No ratings yet
Crypto 23 PDF
12 pages
Cloud Security Issues
No ratings yet
Cloud Security Issues
19 pages
Azeefa
100% (3)
Azeefa
103 pages
TS Report
No ratings yet
TS Report
38 pages
Data Security in Cloud Storage
No ratings yet
Data Security in Cloud Storage
4 pages
Data Storage Security in Cloud Computing
No ratings yet
Data Storage Security in Cloud Computing
18 pages
TECHNICAL Seminar Gurukiran
No ratings yet
TECHNICAL Seminar Gurukiran
32 pages
Googlescolar
No ratings yet
Googlescolar
4 pages
Machine Learning Boosts Cloud Security
No ratings yet
Machine Learning Boosts Cloud Security
5 pages
A New Perspective On Cloud Computing
No ratings yet
A New Perspective On Cloud Computing
10 pages
Cloud Clocking - AV - Report Final
No ratings yet
Cloud Clocking - AV - Report Final
70 pages
Secure Cloud Using RGB Value and Homomorphic Encryption For Shared Data in Cloud
No ratings yet
Secure Cloud Using RGB Value and Homomorphic Encryption For Shared Data in Cloud
4 pages
Major 5N4
No ratings yet
Major 5N4
73 pages
Data Leakage Detection Using Cloud Computing
No ratings yet
Data Leakage Detection Using Cloud Computing
6 pages
Technical Seminar Darshan
No ratings yet
Technical Seminar Darshan
32 pages
A Study On Secure Data Storage in Public Clouds: T. Dhanur Bavidhira C. Selvi
No ratings yet
A Study On Secure Data Storage in Public Clouds: T. Dhanur Bavidhira C. Selvi
5 pages
Research Paper Vignesh
No ratings yet
Research Paper Vignesh
8 pages
TV 1&2
100% (1)
TV 1&2
48 pages
Instant Access To Global Marketing Management 6th Edition by Masaaki Kotabe Ebook Full Chapters
No ratings yet
Instant Access To Global Marketing Management 6th Edition by Masaaki Kotabe Ebook Full Chapters
24 pages
System & Network Security Course
No ratings yet
System & Network Security Course
3 pages
Lesson Eleven
No ratings yet
Lesson Eleven
3 pages
The Adventures in Harmony Course
100% (11)
The Adventures in Harmony Course
183 pages
Thomas L. Brodie-Genesis As Dialogue - A Literary, Historical, and Theological Commentary-Oxford University Press, USA (2001) PDF
100% (5)
Thomas L. Brodie-Genesis As Dialogue - A Literary, Historical, and Theological Commentary-Oxford University Press, USA (2001) PDF
605 pages
Modern Political Theories
0% (1)
Modern Political Theories
3 pages
Definition of Marriage
No ratings yet
Definition of Marriage
1 page
Form 4 - Tes Batch 13.1 (94 Grantees) 1st Sem
No ratings yet
Form 4 - Tes Batch 13.1 (94 Grantees) 1st Sem
2 pages
Sugar Sun Series Order
No ratings yet
Sugar Sun Series Order
1 page
PJ-700 Series Brochure - LR
No ratings yet
PJ-700 Series Brochure - LR
5 pages
Admission of A Partner
No ratings yet
Admission of A Partner
8 pages
Type Form
No ratings yet
Type Form
2 pages
A Commentary On Mark's Gospel
100% (2)
A Commentary On Mark's Gospel
41 pages
AAT MATS LRP Assessment Answers 2022
No ratings yet
AAT MATS LRP Assessment Answers 2022
16 pages
Bonafide For Copy Evaluation
No ratings yet
Bonafide For Copy Evaluation
6 pages
Tourism's Economic Impact in Trinidad
No ratings yet
Tourism's Economic Impact in Trinidad
9 pages
6.1. Review 456
No ratings yet
6.1. Review 456
2 pages
RA 1425 Memo 247
100% (1)
RA 1425 Memo 247
3 pages
Assess The Ability of Technology To Ensure Human Happiness in The Present Century
100% (1)
Assess The Ability of Technology To Ensure Human Happiness in The Present Century
3 pages
Generalized Poisson Equation FDM
No ratings yet
Generalized Poisson Equation FDM
19 pages
Rotor Resistance Control of Induction Motor
No ratings yet
Rotor Resistance Control of Induction Motor
4 pages
Family Nursing Assessment Guide
No ratings yet
Family Nursing Assessment Guide
23 pages
MentFX Symbols Sorted and Categorized v4 RESTORED
No ratings yet
MentFX Symbols Sorted and Categorized v4 RESTORED
4 pages
The Quranists Quranism 11256 en
No ratings yet
The Quranists Quranism 11256 en
4 pages
Communication and Flight Information
No ratings yet
Communication and Flight Information
94 pages
Eu Data Act Addendum
No ratings yet
Eu Data Act Addendum
1 page
Hermeneutics Elound Questions and Answers
No ratings yet
Hermeneutics Elound Questions and Answers
8 pages
Canal Escape
No ratings yet
Canal Escape
6 pages
A Thing of Beauty Ii
60% (5)
A Thing of Beauty Ii
16 pages