Access Control Models in NoSQL Databases: An Overview
Ashwaq A. Alotaibi, Reem M. Alotaibi, and Nermin Hamza
                                               Doi: 10.4197/Comp. 8-1.1
   Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi
                                           Arabia
                                         aalotaibi0553@stu.kau.edu.sa
             Abstract. Non-relational databases, known as NoSQL, have recently become very popular for handling huge amounts of data. Many organizations are moving from relational databases towards NoSQL databases due to the growing popularity of cloud computing and big data. A NoSQL database is designed to handle unstructured data, such as documents, e-mails, and social media, efficiently. It uses distributed and cooperating devices to store and retrieve data. As a large number of people store sensitive data in NoSQL databases, security issues have become critical concerns. NoSQL has many advantages, such as scalability and availability, but it suffers from some security issues, such as weak authorization mechanisms. This paper reviews the different models of NoSQL databases and the security issues concerning these databases. In addition, we present the existing access control models in different NoSQL databases.
             Keywords: Access control; Big data; Databases; NoSQL database; NoSQL models; Security
                       issues.
       The increasing use of big data and cloud computing has led organizations to move from relational databases towards NoSQL databases [10]. These NoSQL databases are developing rapidly and spreading in many companies, and compared to relational databases, NoSQL is considered a suitable choice. Applications of web services, e-commerce, mobile computing, and social media require NoSQL databases to store and process huge amounts of data [6].

       Many organizations and websites have ranked NoSQL databases by popularity [7]. The top five open source NoSQL databases are MongoDB, Cassandra, CouchDB, Hypertable, and Redis. Even though the advantages of NoSQL databases are making them very popular, these systems have critical security issues [8]. They suffer from many security issues, such as the lack of proper authentication, encryption, and fine-grained authorization [9].

       One of the most important issues relates to the poor data protection mechanisms they currently provide [10]. Access control is the basic unit of data protection in any database management system [8]. The granularity level in access control refers to the size of the data unit to which users are granted access [11]. Most NoSQL databases adopt basic access control mechanisms operating at a coarse-grained level [8]. For example, some document-based databases grant access to the whole database or to none of it. This is not adequate to provide customized data protection levels, which could increase the usability and expansion of these systems.

       Since big data platforms often process user data with personal characteristics, it is important that data access control operates at the finest granularity levels [12]. Fine-grained authorization enables object-level security, such as field-level or row-level security [13]. The integration of Fine-Grained Access Control (FGAC) features into data management systems that process sensitive data could benefit them greatly [8].

       FGAC is a fundamental requirement in many applications for the efficient protection of critical data. As noted above, most NoSQL databases enforce access control only at a coarse-grained level, for example at the level of the whole database in some document-based databases, which is not adequate to provide customized data protection levels. Only a few NoSQL databases offer native support for FGAC, such as Accumulo (a key-value database), which enforces access control at the cell level. However, the vast majority of existing NoSQL databases do not enforce FGAC.

       In this work, we review different access control models in NoSQL databases. The rest of the paper is organized as follows: Section 2 presents the different NoSQL database models, Section 3 discusses the security issues in NoSQL databases, and Section 4 presents the existing access control models in NoSQL databases. Finally, Section 5 concludes the paper.

                         2. NoSQL Database Models

       NoSQL databases are categorized into different models. They can generally be classified into the following four groups depending on the data storage model [14]. Figure 1 shows the NoSQL categories with examples for each model.

       The key-value model is the simplest of all NoSQL models. It stores data values in a system from which they can be recalled later using a key (hash) [4]. The key is used as an index, similar to a hash table. It is a simple, efficient
and powerful model that stores data in a schema-less form.

       Fig. 1 NoSQL database models with examples: key-value database (Redis, Oracle NoSQL), column-based database (HBase, Cassandra), document-based database (MongoDB, CouchDB), and graph-based database (Neo4j).

A. Key-value Database

       The key-value database has a very simple application programming interface (API). It is helpful for quickly retrieving data from the database and for processing large amounts of data, and it uses less storage capacity to store data [15]. It is used in forums and online shopping websites [5]. Redis, Oracle NoSQL, and Accumulo are some examples of this model.

B. Column-based Database

       The column-based database model stores data in a way similar to the key-value database, but the key is a combination of row, column, and/or timestamp, and it refers to one or many columns (a column family) [4]. The column family is equivalent to a table in relational databases.

       This model works well with both complex datasets and large amounts of data in distributed systems [2]. It offers faster queries than relational databases. New columns can be added easily by creating a new file, whereas a relational database needs to rebuild the table. The model is suitable for analytic applications, data mining, and web applications [15]. HBase and Cassandra are examples of column-based databases.

C. Document-based Database

       The document-based database model consists of two main elements, the key and the document [5]. This database stores data as documents that contain one or more fields. Each document has a unique key that is used to insert, delete, and update data in the document. The document-based database supports structured data, semi-structured data (XML files), and unstructured data (text). It also offers high performance and horizontal scalability [4].

       A document inside a database is comparable to a record in a relational database and has a dynamic structure that allows fields to be modified, added, or deleted. Indexing on specified fields allows fast data retrieval. The document-based database is more flexible than the relational database since it is schema-less. It is used for blog software and content management systems [15]. MongoDB and CouchDB are the most popular document-based databases.

D. Graph-based Database

       The graph-based model stores data in a graph form composed of edges and nodes. A node represents an object, while an edge is a relationship between objects [15]. The database is schema-less and stores data efficiently.

       The graph-based database is used in many applications, such as recommendation software, social networking applications, and content management. It is scalable but complex [11]. The graph-based database uses shortest path algorithms to improve the efficiency of data queries. Neo4j is an example of a graph-based database.

                    3. NoSQL Database Security Issues

       Many studies discuss and analyze security issues in NoSQL databases. In this section, we focus on some of them.
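       Because the studies reviewed below compare authorization granularity across the four storage models of Section 2, it may help to see one record expressed in each model. The following is only an illustrative sketch using plain Python containers; the record, the key names, and the layouts are hypothetical and do not reproduce the storage format of any particular product.

```python
# Illustrative only: one hypothetical user record expressed in the four
# NoSQL storage models of Section 2, using plain Python containers.

# Key-value: an opaque value recalled by a key, as in a hash table.
key_value = {"user:1": '{"name": "Sara", "city": "Jeddah"}'}

# Column-based: the key combines row, column family:column, and timestamp.
column_based = {
    ("user:1", "profile:name", 1700000000): "Sara",
    ("user:1", "profile:city", 1700000000): "Jeddah",
}

# Document-based: a unique key maps to a document with named fields.
document_based = {"user:1": {"name": "Sara", "city": "Jeddah"}}

# Graph-based: nodes are objects, edges are relationships between them.
nodes = {"user:1": {"name": "Sara"}, "city:jeddah": {"name": "Jeddah"}}
edges = [("user:1", "LIVES_IN", "city:jeddah")]


def city_of(user_key):
    """Follow the LIVES_IN edge from a user node to the city name."""
    for source, label, target in edges:
        if source == user_key and label == "LIVES_IN":
            return nodes[target]["name"]
    return None
```

       The sketch also hints at why granularity differs between models: the key-value layout exposes only an opaque value, whereas the document and column layouts expose individual fields or cells that a policy could, in principle, refer to.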
       Zaki [16] discussed the security threats in NoSQL databases. The main threats concern integrity, authentication, injection attacks, consistency, and insider attacks. He showed that these databases do not offer security features in the database itself and provide only a very weak security layer. To overcome the security problems of NoSQL databases, Zaki suggested that developers enforce the security mechanism in the middleware without affecting scalability.

       Okman et al. [10] analyzed the authentication and authorization features of MongoDB and Cassandra and proposed possible strategies to enhance them. In Cassandra, the authorization mechanism is enforced at the column family level. In MongoDB, both read-only and read-write permissions can be set for users in unsharded mode. However, there is no support for authorization in sharded mode. Both databases provide simple authorization mechanisms, so improved authorization techniques are needed.

       NoSQL did not support proper authentication and role management when it started [17]. It is now possible to manage proper authentication and authorization on the most popular NoSQL databases [9]. Gayatri and Rustom [18] compared the Role-Based Access Control (RBAC) [19] of the most popular cloud and NoSQL databases. They selected MongoDB and Cassandra as the NoSQL databases. MongoDB supports RBAC, which controls access at the document collection level based on the privileges granted to roles. In Cassandra, authorization is done at the column family level.

       Milić et al. [20] analyzed security features such as authentication, access control, auditing, data encryption, data access encryption, replication, and integrity in some NoSQL databases that are used in web applications. They focused on MongoDB and CouchDB, which are document-based databases. These databases fulfill minimum criteria for the security features. For access control, MongoDB supports RBAC in unsharded mode but not in sharded mode. It also supports Mandatory Access Control (MAC) [21] and Task-Based Access Control (TBAC) [22]. CouchDB supports only RBAC.

       The main security problem of some NoSQL databases is the lack of access control. Dadapeer et al. [4] compared and analyzed security features, including authentication, authorization, data encryption, data access encryption, and auditing, in some popular NoSQL databases such as Cassandra, MongoDB, CouchDB, Redis, and HBase. They found that authorization techniques vary from one NoSQL database to another and that NoSQL databases have inefficient authorization mechanisms. Most of them implement authorization at a higher level instead of a lower level. More precisely, the authorization mechanism is implemented at the database level instead of the collection level. NoSQL databases suffer from many security issues, such as the lack of fine-grained access control, proper authentication, and encryption [9]. Next, we present some research studies that proposed access control models for different NoSQL databases.

                 4. Access Control Models in NoSQL Databases

       Most NoSQL databases implement access control mechanisms at the coarse-grained level [8]. However, fine-grained access control (FGAC) is a fundamental requirement in many applications. Several research studies focus on the integration of FGAC into NoSQL databases.

       For column-based databases, Kulkarni [23] proposed an access control model that operates at the fine-grained level. The model implements access control policies at various levels, such as column family, column, or row. It was designed to work with the Cassandra database and was then extended to operate with HBase. However, the proposed model has a dedicated implementation, so it cannot be adapted easily to other databases.

       For document-based databases, many research studies focus on MongoDB, which is one of the most popular NoSQL databases. MongoDB supports the RBAC model, which implements access control at the collection level. To enforce access control at the document level, the authors of [24] proposed incorporating a purpose-based model working at the document level into MongoDB. They extended MongoDB RBAC to support purpose-based policy specification and refined its granularity to the document level by integrating some purpose-related concepts.

       In the proposed model, the authors developed an enforcement monitor called Mem (MongoDB enforcement monitor). It was designed to work with any MongoDB deployment. Mem works as a proxy between MongoDB clients and the server. It observes and may change the flow of messages exchanged between the clients and the server. However, the approach is specific to MongoDB, and generalizing it to operate with multiple NoSQL databases is required. Refining the access control to operate at the field level is also needed.

       Many recent applications use context-related information to support highly customized services. Enhancing NoSQL databases with fine-grained context-aware access control that works at the field level is therefore required. The authors of [25] proposed an access control model that works at the field level for the MongoDB database. The proposed model supports both content-based and context-based access control policies.

       The authors extended the RBAC model of MongoDB with fine-grained context-based and content-based policies. An enforcement monitor called ConfinedMem (context aware fine-grained MongoDB enforcement monitor) was designed to implement the model. The monitor is designed to integrate into any MongoDB deployment. It is defined as a MongoDB Wire protocol interpreter and acts as a proxy that analyzes and possibly alters the messages exchanged between MongoDB clients and the server. However, the enforcement mechanism cannot be implemented in the same efficient way for all query types due to technological restrictions of MongoDB.

       To generalize the proposed solutions [24] [25], Colombo and Ferrari [8] discussed the issues of integrating FGAC into NoSQL databases. They used MongoDB to identify some methods to define and integrate FGAC into NoSQL databases. Enhancing NoSQL databases to support FGAC requires identifying suitable engineering solutions for encoding policies, defining an enforcement monitor, and integrating it into a target NoSQL database.

       All previous studies focused on enhancing RBAC in NoSQL databases. No commercial NoSQL database integrates Attribute Based Access Control (ABAC) [26], which uses attributes to define access control rules. The authors of [27] proposed an ABAC model for Big Data applications and traditional data management. They claim that the proposed model can be used in a relational database, a NoSQL database, and Hadoop. The model is based on a query modification method which combines the ABAC mechanism into the source code of user transactions.

       However, the enforcement mechanism is specified for SQL queries, and SQL cannot address the data variability of non-relational models. There is a need for a general ABAC framework that operates with NoSQL databases. To satisfy this requirement, the authors of [28] proposed a general approach to enforce fine-grained ABAC in NoSQL databases.

       The proposed approach implements different ABAC policies at the field level for documents with different formats, without any prior knowledge of the document structures. It is based on an SQL++ query rewriting approach and targets any document database that supports SQL++.

       In addition, the ABAC model is flexible enough to support the implementation of content-based, context-based, and purpose-based policies [28]. It also supports any access control rule that consists of subject attributes, object attributes, and environment attributes at the document or field level. However, a tool for specifying policies and binding them is needed to simplify this process. Monitors should also be implemented to facilitate the integration of the proposed approach with different databases that support SQL++.

       Among key-value databases, Redis, which is one of the most popular NoSQL databases, does not provide enough security [16]. Zaki and Indiramma [29] proposed a Redis client that supports security services such as authentication, authorization, and encryption for different types of data.

       The main idea is that a separate key is created and stored in the database. The value of this key is data encrypted with a symmetric key using the AES algorithm. This data consists of all other key values, concatenated and encrypted. When a query is created, the system first retrieves the data entities and then uses the symmetric key to decrypt the data.

       The authors also developed a user interface for the system to determine the efficiency of the system with the proposed security implementation. Using AES improves security but increases the data length, so extra bandwidth is required for the Redis environment. The security extensions in the proposed system also add computational overhead, which should be reduced.

       To improve security in graph-based databases, the work in [30] proposed a security model that performs access control for NoSQL graph-oriented databases. The proposed model uses metadata and provides Data Definition Language (DDL) and Data Manipulation Language (DML) operations. The goal of the model is to allow different applications to implement their own access control when using the graph-oriented database.

       The model provides a structure with authorization principles to implement access control. It was implemented for the Neo4j database and succeeded in preventing unauthorized access. However, the model should be implemented in the core of Neo4j and its performance assessed to evaluate its feasibility. Extending the access control to a finer granularity level is also required.

       To examine the feasibility of granular security on a graph database, the work in [31] used Neo4j, which is the most popular graph-based database. It used graph concepts to find a technique that permits access to data while maintaining security. The method uses mathematical formulas to determine two-hop connections that exit from and return to a security layer of the network. These connections can be disclosed to a user without breaching the security specified by the security layer.

       Table 1 summarizes the previously mentioned access control models for different NoSQL databases. As shown in the table, some popular NoSQL databases do not have fine-grained access control.
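       To make the contrast between coarse-grained and fine-grained enforcement in this section concrete, the following minimal Python sketch shows a collection-level RBAC check followed by a field-level, attribute-based filter applied by a proxy-style monitor. It is an illustration under stated assumptions, not the Mem or ConfinedMem monitors of [24] [25] and not the SQL++ rewriting of [28]: the function names, the rule table, and the sample patient documents are all hypothetical, and the filtering is done on in-memory dictionaries rather than on database messages.

```python
# Hypothetical sketch of coarse-grained versus field-level access control.
# It is NOT the Mem/ConfinedMem monitors or the SQL++ rewriting of [24][25][28].

# Coarse-grained RBAC: a role is granted whole collections.
ROLE_COLLECTIONS = {"nurse": {"patients"}, "clerk": {"patients"}}

# Field-level ABAC: each rule maps a field to a predicate over
# subject, object (the document), and environment attributes.
FIELD_RULES = {
    "name": lambda subj, doc, env: True,
    "ward": lambda subj, doc, env: True,
    "diagnosis": lambda subj, doc, env: (
        subj["role"] == "nurse"
        and subj["ward"] == doc["ward"]
        and env["purpose"] == "treatment"
    ),
}


def deny(subj, doc, env):
    """Default rule for fields that have no explicit policy."""
    return False


def rbac_allows(subject, collection):
    """Collection-level check: all documents and fields, or nothing."""
    return collection in ROLE_COLLECTIONS.get(subject["role"], set())


def filter_document(subject, doc, env):
    """Return a copy of doc keeping only fields whose rule grants access."""
    return {
        field: value
        for field, value in doc.items()
        if FIELD_RULES.get(field, deny)(subject, doc, env)
    }


def monitored_find(subject, collection, documents, env):
    """Proxy-style monitor: apply RBAC first, then prune fields per document."""
    if not rbac_allows(subject, collection):
        return []
    return [filter_document(subject, doc, env) for doc in documents]


patients = [
    {"name": "A", "ward": "W1", "diagnosis": "flu", "ssn": "000"},
    {"name": "B", "ward": "W2", "diagnosis": "asthma", "ssn": "111"},
]
nurse = {"role": "nurse", "ward": "W1"}
clerk = {"role": "clerk", "ward": "W1"}
env = {"purpose": "treatment"}

nurse_view = monitored_find(nurse, "patients", patients, env)
clerk_view = monitored_find(clerk, "patients", patients, env)
```

       With RBAC alone, the nurse and the clerk would both see every field of every patient document. In this sketch the field rules additionally hide the diagnosis from the clerk and from a nurse on a different ward, and any field without a rule is denied by default, which is one possible design choice rather than a requirement of the surveyed models.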
     551–562, 2004.
[12] Colombo, P. and Ferrari, E., “Access control in the era of big data: State of the art and research directions,” In: Proceedings of the 23rd ACM Symposium on Access Control Models and Technologies, pp: 185–192, 2018.
[13] Gupta, N. and Agrawal, R., “NoSQL security,” In: Advances in Computers, Elsevier, 109: 101–132, 2018.
[14] Fidelis Cybersecurity, “Current Data Security Issues of NoSQL Databases,” Jan. 2014.
[15] Nayak, A., Poriya, A. and Poojary, D., “Type of NoSQL databases and its comparison with relational databases,” International Journal of Applied Information Systems, 5(4): 16–19, 2013.
[16] Zaki, K., “NoSQL Databases: New Millennium Database for Big Data, Big Users, Cloud Computing and Its Security Challenges,” International Journal of Research in Engineering and Technology (IJRET), 3(3): May 2014.
[17] Factor, M., Hadas, D., Harnama, A., Har’El, N., Kolodner, E.K., Kurmus, A., Shulman-Peleg, A. and Sorniotti, A., “Secure logical isolation for multi-tenancy in cloud storage,” In: IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST), pp: 1–5, Oct. 2013.
[18] Kapadia, G.S., “Comparative study of role-based access control in cloud databases and NoSQL databases,” International Journal of Advanced Research in Computer Science, 8(5), 2017.
[19] Ferraiolo, D.F., Sandhu, R., Gavrila, S., Kuhn, D.R. and Chandramouli, R., “Proposed NIST standard for role-based access control,” ACM Transactions on Information and System Security (TISSEC), 4(3): 224–274, Nov. 2001.
[20] Milić, P., Kuk, K., Trajković, S., Ranđelović, D. and Popović, B., “Security analysis of open source databases in web application development,” pp: 310–315, 2016.
[21] Hu, V.C., Kuhn, D.R., Xie, T. and Hwang, J., “Model checking for verification of mandatory access control models and properties,” International Journal of Software Engineering and Knowledge Engineering, 21(01): 103–127, 2011.
[22] Deng, J.B. and Hong, F., “Task-based access control model,” Journal of Software, 14(1): 76–82, Sep. 2003.
[23] Kulkarni, D., “A fine-grained access control model for key-value systems,” In: Proceedings of the third ACM conference on Data and application security and privacy, pp: 161–164, 2013.
[24] Colombo, P. and Ferrari, E., “Enhancing MongoDB with purpose-based access control,” IEEE Transactions on Dependable and Secure Computing, 14(6): 591–604, May 2017.
[25] Colombo, P. and Ferrari, E., “Towards virtual private NoSQL datastores,” In: IEEE 32nd International Conference on Data Engineering (ICDE), pp: 193–204, 2016.
[26] Hu, V.C., Kuhn, D.R., Ferraiolo, D.F. and Voas, J., “Attribute based access control,” Computer, 48(2): 85–88, 2015.
[27] Longstaff, J. and Noble, J., “Attribute based access control for big data applications by query modification,” In: IEEE Second International Conference on Big Data Computing Service and Applications (Big Data Service), pp: 58–65, 2016.
[28] Colombo, P. and Ferrari, E., “Towards a unifying attribute-based access control approach for NoSQL datastores,” In: IEEE 33rd International Conference on Data Engineering (ICDE), pp: 709–720, Jun. 2017.
[29] Zaki, K. and Indiramma, M., “A novel Redis security extension for NoSQL database using authentication and encryption,” In: IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT), pp: 1–6, 2015.
[30] Morgado, C., Baioco, G.B., Basso, T. and Moraes, R., “A security model for access control in graph-oriented databases,” In: IEEE International Conference on Software Quality, Reliability and Security (QRS), pp: 135–142, 2018.
[31] Crawford, B., “Granular security in a graph database,” Tech. rep., Naval Postgraduate School, Monterey, United States, Jul. 2017.
   Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
                                   arushdi@kau.edu.sa
   Abstract. Modern non-relational databases, known as NoSQL, have become more popular for handling huge amounts of data. Many organizations have moved from relational databases towards NoSQL databases because of the growing popularity of cloud computing and big data. A NoSQL database is designed to handle unstructured data, such as documents, e-mail, and social media, efficiently. It uses distributed devices that work cooperatively to store and retrieve data. Because a large number of people store sensitive data in NoSQL databases, security problems have become important issues. NoSQL has many advantages, such as scalability and availability, but it suffers from some security problems, such as weak access authorization mechanisms. This paper reviews the different models of NoSQL databases and the security problems related to these databases. In addition, we present the data access control models that exist in different NoSQL databases.
   Keywords: Access control; Big data; Databases; NoSQL database; NoSQL models; Security issues.