
Informatica Economică vol. 27, no. 2/2023

An Overview of Data Vault Methodology and Its Benefits

Andreea VINEȘ, Radu-Eleonor SAMOILĂ


Bucharest University of Economic Studies
andreea.vines@csie.ase.ro, radusamoila2@gmail.com

Business Intelligence plays a vital role in helping organizations to extract meaningful insights
from their data and make informed decisions. However, choosing the right data modeling
approach can be challenging, as different methodologies have unique advantages and
limitations. The focus of the current paper is to provide an insight into the Data Vault
architecture, which is an emerging approach to data modeling, compared to the established
methods of Kimball and Inmon. The paper conducts a comparative analysis of these
methodologies, with a particular emphasis on the benefits of Data Vault. This study highlights
the advantages of the Data Vault model in managing data from multiple sources and its
compatibility with agile implementation practices. Overall, this paper sheds light on the
relevance of Data Vault in contemporary data management practices.
Keywords: Data Vault, Data Warehouse, Architecture, Data modelling, ETL
DOI: 10.24818/issn14531305/27.2.2023.02

1 Introduction
In the era of digitalization, most functions of business and society depend on data that comes in many forms. To gain a competitive advantage and extract insights from the data, it is necessary to apply data analytics on top of it. [1] The volume, velocity and variety of data have increased rapidly over the last decades, so an advanced method that overcomes the shortcomings of traditional data warehouse solutions became necessary. For example, as the volume keeps increasing, traditional data modelling approaches can become difficult to maintain when storing all the information. Another example is that data sources change constantly, while the classical data warehouse model is quite rigid and cannot handle all business requirements. Moreover, multiple issues affect the classical systems, such as missing source keys, data issues, data quality, performance gaps and a lack of database tuning or partitioning. [2][3] As traditional data warehouses were designed to handle structured data, not the semi-structured or unstructured data that comes from social media, mobile devices or other sources, vendors started building a new generation of data warehouses with analytics and forecasting capabilities that support both structured and unstructured data. Thus, enterprises become capable of making better decisions for their organizations. [4]

To address the challenges posed by big data, new data warehousing techniques have emerged that are better suited to handling large and varied data sources. One such technique is the Data Vault methodology, developed by Dan Linstedt in the early 2000s. Its main aim is to address the challenges of building and maintaining a data warehouse in the context of big data, providing a flexible, scalable and auditable approach to data modelling and integration. Data Vault is designed to accommodate a wide variety of data sources, including both structured and semi-structured data, making it well suited to the challenges of big data. In this context, it is important to explore the principles and practices of Data Vault, and to understand how it can be used to create effective data warehousing solutions in the era of big data.

The purpose of this paper is to conduct a comprehensive analysis of the evolution of data warehouse models, with a specific focus on the Data Vault methodology proposed by Dan Linstedt and its relevance in the context of Big Data. The paper includes an overview of this area, together with a use case that demonstrates how data from multiple sources can be integrated using the Data Vault architecture. To achieve this objective, the current paper is structured as follows: the first section provides an introduction; the second section describes the three data warehouse methodologies developed by Kimball, Inmon and Linstedt; the third section focuses specifically on Data Vault, presenting its architecture and core components; the fourth section presents a use case that illustrates how data can be modelled using Data Vault 2.0; finally, the last section presents the conclusions of the study.

2 Data Warehouse models – from Kimball to Linstedt
Data Vault represents a conceptual and logical data model used to build data warehouses for enterprise-scale analytics. It comes as an alternative to the architectures proposed by Bill Inmon and Ralph Kimball. The concept of Data Vault appeared in 1990, when Dan Linstedt started publishing papers on the method, and a final version of the architecture was proposed in 2000. In the following 10 years, the author adjusted the method and adapted it to current necessities and design patterns, resulting in an adaptive method called Data Vault 2.0.
Kimball's architecture [5], proposed in 1996, consists in a two-layer model. Raw data is copied into a staging area, and all transformations happen before the data is ingested into the second layer, the data warehouse. With this approach, end users connect to the same data warehouse model and all the dimensions are common. The main advantage is that it is easy to implement, but it can also become quite challenging if a second model has to be built from the same landing data, as everything needs to be loaded again. [5]

Fig. 1. Kimball approach architecture


Source - [7]

Inmon's architecture [6] has three layers to overcome the challenges presented earlier, where the new layer consists of data marts: mini data warehouses from which users can extract the information they require. This also requires more data processing in order to create all the data marts. The advantage of separating the data warehouse from the data marts is beneficial in the context of models with multiple use cases. It also offers scalability when integrating new requirements or defining new data marts. [6]

Fig. 2. Inmon approach architecture


Source - [7]

Data Vault 2.0 was designed in 2010 by Linstedt out of the necessity for a model that is scalable in the context of big data, able to handle unstructured data and to integrate with more systems. One of the main differences is the use of hash keys as primary keys instead of sequential numbers. This change allows all the hubs and satellites (which will be presented in detail in Section 3 – Data Vault architecture) to be processed in parallel and removes the dependencies on the data processing side. The hash key is calculated from the combination of columns that constitute the business key of the table, using methods such as SHA-1 (Secure Hash Algorithm 1) or MD5 (Message-Digest Algorithm). Several of the differences between Data Vault 1.0 and Data Vault 2.0 are synthesized in Table 1. [7]

Table 1. Data Vault 1.0 & 2.0 Comparison

Data Vault 1.0 | Data Vault 2.0
Sequence numbers for surrogate keys. | Hash keys for surrogate keys.
DV 1.0 also used MPP, but only to a certain extent. | DV 2.0 takes advantage of MPP-style platforms and is designed with MPP in mind.
Limited support for real-time load. | Real-time ready, cloud ready, NoSQL ready and big data friendly.
Not very flexible towards automation and virtualization. | DV 2.0 has a very strong focus on both automation and virtualization.
DV 1.0 had a major focus on modelling, and many of the modelling concepts are similar. | DV 2.0 is a complete system of Business Intelligence, covering everything from concept to delivery.
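As an illustration of the hash-key mechanism above, the following minimal Python sketch (not taken from the paper; the helper name and delimiter are illustrative) derives a Data Vault 2.0-style surrogate key from business-key columns:

    import hashlib

    def hash_key(*parts: str) -> str:
        """Derive a deterministic surrogate key by hashing the
        concatenated business-key columns (Data Vault 2.0 style)."""
        # Join with a delimiter so ("AB", "C") and ("A", "BC") differ.
        normalized = "||".join(p.strip().upper() for p in parts)
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    # The same business key always yields the same key, so independent
    # loaders can compute it in parallel without a shared sequence.
    print(hash_key("P-1001"))        # stable digest for this key
    print(hash_key("P-1001", "S2"))  # an extra column changes the digest

Unlike a shared sequence generator, such a key can be computed independently by any process, which is what enables the parallel, dependency-free loading mentioned above.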

The model proposed by Dan Linstedt represents a third approach to data modelling, and it comes with a series of benefits, such as: [8]
• It contributes to the organization's agility and improves the speed at which the business can learn and exploit more opportunities.
• It delivers a modernized data service.
• It enables new business capabilities such as data-driven decision-making and data science, and it can be considered the key to new business models.
It represents an agile, structured solution with flexibility for refactoring.
According to the paper of Cernjeka et al. [11], who developed a metamodel, Data Vault can be used to integrate semi-structured data as well. The authors translated NoSQL document stores that use JSON files into a Data Vault model using the key-value structure of the files. The following translation rules were applied (a code sketch of the rules follows the list):
• TR0: a MongoDB collection is translated into a set of hubs, links and satellites.
• TR1: each document id is translated into a business key, and a hash key is calculated and added to the hub.
• TR2: the non-id fields of the document are translated as attributes of the satellite entity, where the value of the satellite attribute is the value part of the document.
• TR3: a document reference to another document is translated into a Data Vault link entity connecting the current hub and a different hub (the referenced document).
• TR4: an embedded document is translated into a Data Vault entity connected to the current hub; in this case, TR1-TR3 are applied accordingly.
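A minimal sketch of rules TR0-TR3 applied to one hypothetical MongoDB-style document (the field names and the hkey helper are illustrative assumptions, not taken from [11]):

    import hashlib

    def hkey(*parts: str) -> str:
        # Hash key over the concatenated business-key parts (see Section 2).
        return hashlib.md5("||".join(parts).encode("utf-8")).hexdigest()

    # Hypothetical document from a "products" collection (TR0: the
    # collection maps to a set of hubs, links and satellites).
    doc = {"_id": "p-1", "name": "Laptop", "price": 999.0,
           "category_ref": "c-7"}

    # TR1: the document id becomes the business key of a hub.
    hub_product = {"hash_key": hkey(doc["_id"]), "business_key": doc["_id"]}

    # TR2: non-id fields become attributes of the hub's satellite.
    sat_product = {"parent_hash_key": hub_product["hash_key"],
                   "attributes": {k: v for k, v in doc.items()
                                  if k not in ("_id", "category_ref")}}

    # TR3: the reference to another document becomes a link entity
    # connecting the product hub with the referenced category hub.
    link_row = {"hash_key": hkey(doc["_id"], doc["category_ref"]),
                "product_hash_key": hkey(doc["_id"]),
                "category_hash_key": hkey(doc["category_ref"])}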
However, this metamodel does not provide details about nested attributes in JSON files, or about how JSON files with different structures or attributes can be integrated into the Data Vault model, which will be the subject of future work.
On the other hand, the authors of paper [12] provided a methodology for mapping a Data Vault schema to a standard XML schema, which offers flexibility and also helps when the data from the source systems comes in various formats that all need to be stored in a centralized data warehouse. The authors also mentioned that using an XML schema might have some limitations, such as identifying the hub key or handling data type variations.
As Data Vault is a quite new approach, most of the studies found in the literature focus on comparing the methodologies of Inmon and Kimball. However, some papers also include Data Vault in the comparison [7][8]. The conclusion of these works is that all three methods have advantages and disadvantages, depending on the aim of the implementation. In the two mentioned studies, it was highlighted that a Data Vault implementation requires simple and standard ETL rules to load data, compared to Kimball's approach, where the transformations that create the dimension and fact tables can be complex. Another aspect worth mentioning is historization, where all three approaches can handle data versioning. Regarding the lifecycle of the model, it was already mentioned that Data Vault provides an easy-to-change model, where the existing tables are not affected by changes. To conclude, for all the modelling approaches presented, developers and data architects are required to understand the requirements and choose the option that best fits.

3 Data Vault architecture
The main objective of Data Vault is to ensure quick adaptation to changes when a new business object is defined or new sources are ingested. The architecture is defined on three layers: [9]
• Staging area – ensures ingesting data from different sources and keeps it in its raw format; this layer does not keep history, so it gets truncated every time a new batch is processed. It allows applying only hard business rules that do not impact the meaning of the data (for example, applying date-time transformations, defining hash keys, defining Unicode formats etc.; see the sketch after Fig. 3).
• Enterprise data warehouse area – represents the second layer, in which data is modelled using the Data Vault 2.0 architecture while also keeping the history of the data. This component is designed using hub, satellite and link tables. This layer allows applying soft business rules and calculating different KPIs. This area can also be divided into a Raw Vault, where raw data is transposed into a data vault, and a Business Vault, where all the business rules are applied and data is modelled using Data Vault.
• Information delivery area – allows end users to access the data through data marts and use it to get different insights. Data is aggregated and prepared in order to be used for different use cases, such as decision-making, prediction, artificial intelligence or machine learning.

Fig. 3. Data Vault architecture


Source - [7]
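To make the staging layer's hard business rules concrete, here is a minimal sketch, assuming a raw record with an id column (all names are hypothetical): only technical standardization and metadata are added, and the layer is truncated on every batch.

    import hashlib
    from datetime import datetime, timezone

    def stage_record(raw: dict, record_source: str) -> dict:
        """Apply hard rules only: add technical metadata without
        changing the business meaning of the data."""
        staged = dict(raw)
        staged["record_source"] = record_source
        staged["load_date"] = datetime.now(timezone.utc).isoformat()
        staged["hash_key"] = hashlib.md5(
            f"{staged['id']}||{record_source}".encode("utf-8")).hexdigest()
        return staged

    staging: list = []

    def load_batch(rows: list, record_source: str) -> None:
        staging.clear()  # the staging area keeps no history
        staging.extend(stage_record(r, record_source) for r in rows)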

Compared with the dimensional approach, where the main components of a data warehouse are fact and dimension tables, Data Vault defines three core components: hub, link and satellite. [9]
A hub represents a core business concept (customer, product, sale, vendor); it is created as a table that contains the business key and some metadata about when the key was first loaded and where it came from. With the Data Vault 2.0 model, the primary key of the hub table was replaced with a hash key of the business column (in the Data Vault 1.0 version, the primary key was created using a sequence number).
A link represents relationships between multiple business objects. It contains a primary key, also defined as a hash key, and foreign keys to the hubs that are being interconnected, like a bridge table. Links make the model more flexible: for every new functionality added, it is only required to create a new hub table and connect it to a link table.
A satellite stores information about the hubs and the relationships between them; it contains all the attributes, together with information about when the data was loaded and its source, and it keeps the full history needed to track changes. Its structure consists of a primary key created from the parent hash key and the load date, plus further columns for the source, the load date and the descriptive attributes. Each hub can have multiple satellites, which correspond to the different data sources integrated into the system that share the same business key. Another alternative is to define a new satellite when new attributes are ingested into the system, instead of updating the current tables. The typical record layouts of the three components are sketched below.
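A minimal sketch, with column names following the descriptions above rather than any specific implementation:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class HubRow:
        hash_key: str          # PK: hash of the business key
        business_key: str
        load_date: str
        record_source: str

    @dataclass
    class LinkRow:
        hash_key: str            # PK: hash of the combined hub keys
        product_hash_key: str    # FK to one hub
        category_hash_key: str   # FK to the other hub
        load_date: str
        record_source: str

    @dataclass
    class SatelliteRow:
        parent_hash_key: str           # PK together with load_date
        load_date: str
        load_end_date: Optional[str]   # None while the row is active
        record_source: str
        attributes: dict = field(default_factory=dict)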
The popularity of Data Vault is based on the following characteristics [13]:
i. Flexibility, which allows new sources to be easily integrated into the current model with no, or only small, updates to the existing tables. Once a new entity is defined, it is only required to create a hub table and update one of the link tables to connect it with the new hub.
ii. Loading efficiency, as all the entities can be loaded in parallel without requiring many data dependencies. The only dependency is that the entities need to be processed in the following order: hubs, links and satellites; within each group, everything can be loaded in parallel (for example, all hub tables can be loaded in parallel, followed by the link tables and then the satellites; see the sketch after this list).
iii. Auditability, which keeps track of all the changes made to a source system: every change is tracked through the satellites, meaning that each change to the current data is stored as a new record, with a timestamp indicating the expiration date of the previous one, similar to slowly changing dimensions type 2.
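A minimal sketch of the loading order from characteristic ii, with hypothetical table names and a stubbed loader:

    from concurrent.futures import ThreadPoolExecutor

    def load_table(name: str) -> None:
        print(f"loading {name}")  # stand-in for the real loader

    waves = [
        ["hub_products", "hub_categories"],      # wave 1: hubs in parallel
        ["link_product_category"],               # wave 2: links
        ["sat_products_s1", "sat_products_s2",
         "sat_categories_s1"],                   # wave 3: satellites
    ]

    with ThreadPoolExecutor() as pool:
        for wave in waves:
            # Tables within a wave run in parallel; consuming the map
            # result forces the wave to finish before the next starts.
            list(pool.map(load_table, wave))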
4 A simplified implementation of Data Vault
This section presents one alternative for implementing a Data Vault model and the benefits it brings when integrating new sources into the current model.
The initial data model contains two objects: Products, which initially gets data from two sources (S1 and S2), and Categories, which ingests data from only one source (S1). All data comes in a structured format from the client's systems, and it needs to be ingested into the final data warehouse in order to be used in the presentation layer, where users consume the data through reports and dashboards. The table structure of the input data, with the column names, descriptions and data types, is presented in Tables 2 and 3:
Table 2. Products table structure (S1 & S2 sources)

Column name | Description | Data type
id | Unique identifier | String
Name | Name of the product | String
Description | Description of the product | String
Price | Current price | Float
Category_id | Category of the product | String

Table 3. Categories table structure (S1 source)

Column name | Description | Data type
Category_id | Unique identifier | String
Name | Name of the category | String
Description | Description of the category | String
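Anticipating the model described next, a worked sketch of how one hypothetical S1 row could be mapped into hub and link rows (the hash construction follows the description below; all values are invented):

    import hashlib

    def hkey(*parts: str) -> str:
        return hashlib.md5("||".join(parts).encode("utf-8")).hexdigest()

    row = {"id": "1001", "Name": "Laptop", "Description": "14-inch",
           "Price": 999.0, "Category_id": "C-7"}

    # Hub key = business key + record source, so identical ids coming
    # from S1 and S2 still produce distinct hub entries.
    product_hk = hkey(row["id"], "S1")
    category_hk = hkey(row["Category_id"], "S1")

    link_row = {"hash_key": hkey(product_hk, category_hk),  # PK
                "product_hash_key": product_hk,             # FK
                "category_hash_key": category_hk}           # FK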

Considering the two data sources presented above, the proposed Data Vault model is represented in the figure below. It contains two hubs (one for Categories and one for Products), and the two hubs are connected using a link table. The Products hub is connected to two satellites, one for each source. The Categories hub has only one satellite, whose attributes hold the category name and description.
The primary key of the hub is created as a hash key of the business key of the Products source (which is the id) and the record source. In this case, if data coming from different sources has the same id, the records will still be distinguished in the hub, because the record source differs and hence the generated hash keys will be different.
The link table contains the hash key of each hub as a foreign key, and a new hash key (created as a combination of the two mentioned) as its primary key.
One of the main principles of Data Vault is that data is never deleted. [14] Considering that, the information about when the data was ingested is kept for every source using 'Load Date' columns. As the satellites keep the history of the data, the 'Load End Date' column is used to track when a record becomes inactive: for example, when a new record for the same key is added to the table (the record was modified in the source), the old record is marked as inactive by populating its 'Load End Date' column, as sketched after Fig. 4.

Fig. 4. Initial Data Vault model
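A minimal in-memory sketch of the 'Load End Date' mechanism described above (the function and column names are illustrative):

    from datetime import datetime, timezone

    def upsert_satellite(sat_rows: list, parent_hk: str, attrs: dict) -> None:
        """Append-only historization: close the active version of the
        key, then insert the new one (similar to SCD type 2)."""
        now = datetime.now(timezone.utc).isoformat()
        for r in sat_rows:
            if r["parent_hash_key"] == parent_hk and r["load_end_date"] is None:
                r["load_end_date"] = now  # mark the old version inactive
        sat_rows.append({"parent_hash_key": parent_hk, "load_date": now,
                         "load_end_date": None, "attributes": attrs})

Nothing is deleted: the full history of every record remains queryable through its load and load-end dates.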

If a new source of products (S3) is ingested, the approach is to create a new satellite for it. Considering that this data source has a new attribute, size, the other sources that do not have this attribute are not affected, and neither is the current ETL process that ingests data and loads it into the existing satellites (a small sketch of this isolation follows Table 4). To distinguish between data sources, the hash key of the hub is created using the business key (which is the id) and the record source. In this case, the hash key will be unique, and it will not be affected by the same id value appearing in multiple sources. The structure of the new source is presented in Table 4:

Table 4. Products table structure (S3 source)

Column name | Description | Data type
id | Unique identifier | String
Name | Name of the product | String
Description | Description of the product | String
Price | Current price | Float
Size | Current size of the product | String
Category_id | Category of the product | String
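A sketch of why the change is so contained, assuming a simple source-to-satellite registry (all names are hypothetical): integrating S3 means adding one entry and creating its table, while the S1/S2 loaders stay untouched.

    # Each source writes only to its own satellite, so schema drift in
    # one source (such as the new 'Size' column) never touches the rest.
    SATELLITES = {
        "S1": "Products_satellite_S1",
        "S2": "Products_satellite_S2",
        "S3": "Products_satellite_S3",  # the only addition for S3
    }

    def target_satellite(record_source: str) -> str:
        return SATELLITES[record_source]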

From a data integration standpoint, when comparing all three methodologies, it becomes apparent that Data Vault simplifies the process of adding new sources by keeping the satellites separate. In contrast, with the traditional Inmon and Kimball approaches, if a new data source is added to the system, the ETL process must be adjusted to incorporate it.
The new data model can be observed in Fig. 5 below, where the only change compared to the previous version was the creation of a new satellite table (Products_satellite_S3).

Fig. 5. Final Data Vault model

5 Conclusions
This paper presented a comprehensive overview of the integration of the Data Vault architecture within data warehouse solutions. The study starts by comparing this approach with the methodologies introduced by Kimball and Inmon. The traditional approaches, Kimball and Inmon, are prone to challenges generated by changes in the current data sources, and they require considerable data engineering work to adjust the current ETL flow.
Data Vault architecture, on the other hand, provides the flexibility and scalability needed to overcome these issues. Although this modelling architecture is relatively new, its agile methodology is increasingly being used by more and more companies to develop data solutions.
It should also be highlighted that all three approaches have their respective advantages and disadvantages, depending on the defined purpose. The traditional approaches are an excellent fit if the system is stable and the requirements are well known. In contrast, the Data Vault architecture is a powerful approach if the data warehouse to be built has to be flexible and scalable, due to an architecture that allows easy integration of new sources.
Therefore, the selection of an appropriate approach should be based on the specific needs and objectives of the data warehouse solution to be implemented. While Kimball and Inmon are conventional methodologies that have been in use for a long time and have proved their efficiency in certain contexts, Data Vault provides a modern and flexible solution that is better suited for rapidly evolving environments. Moreover, Data Vault provides support for semi-structured data, which is more and more used to develop modern data warehouse solutions. The Data Vault architecture enables the construction of incremental models at a low cost, also making it easy to modify the business rules. One of its main benefits is that it helps in the context of Big Data when it comes to data variety, making it possible to integrate unstructured data or to perform near-real-time data loading. [15] Another advantage for Big Data processing refers to the volume of data that can be ingested into the system, allowing petabyte-scale management. [9]
Some papers have explored the development of metamodels for semi-structured data, but there is a need for a more in-depth analysis in this area. As future work, the paper can be extended to include the implementation of semi-structured data, considering nested JSON or XML files and how they can be modelled using Data Vault. Another aspect that could be considered when developing solutions for semi-structured data is how to handle cases where the attributes differ from one file to another, and how they can be modelled effectively within the Data Vault architecture. As semi-structured data becomes more prevalent in the modern data landscape, exploring the integration of this data type within the Data Vault architecture is critical to ensure that data warehousing solutions remain flexible and scalable. By addressing these issues, the Data Vault architecture can continue to provide a comprehensive and robust solution for data warehousing in the future.

References
[1] J. C. Margulies, "Data as competitive advantage," Winterberry Group, Tech. Rep. 2015-02, Oct. 2015, pp. 1-28.
[2] D. Linstedt, K. Graziano, and H. Hultgreen, "The business of data vault modeling," 2nd ed., in Proceedings of the 2009 International Conference on Information and Knowledge Engineering (IKE), Las Vegas, NV, USA, July 2009, pp. 232-238.
[3] D. Linstedt and K. Graziano, "Super Charge your Data Warehouse," 1st ed., CreateSpace Independent Pub, Scotts Valley, CA, USA, 2011.
[4] Deloitte CIO Journal Editor, "The Future of Data Warehouses in the Age of Big Data." [Online]. Available: http://deloitte.wsj.com/cio/2013/07/17/the-future-of-data-warehouses-in-the-age-of-big-data/. [Accessed: Apr. 14, 2023].
[5] R. Kimball and M. Ross, "The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling," 3rd ed., John Wiley & Sons, Indianapolis, IN, USA, 2013.
[6] W. H. Inmon and B. D. Eggleston, "Building the Data Warehouse," 4th ed., John Wiley & Sons, Indianapolis, IN, USA, 2005.
[7] L. Yessad and A. Labiod, "Comparative Study of Data Warehouses Modeling Approaches: Inmon, Kimball and Data Vault," in 2016 International Conference on System Reliability and Science (ICSRS), Algiers, Algeria, 2016, pp. 77-82.
[8] D. Schneider, A. Martino, and M. Eschermann, "Comparison of Data Modeling Methods for a Core Data Warehouse," Trivadis, 2014, pp. 20-21.
[9] D. Linstedt, "Building a Scalable Data Warehouse with Data Vault 2.0," 1st ed., Morgan Kaufmann, Boston, MA, USA, 2016.
[10] "What is Data Vault? - Data Vault," Data Vault, 2019. [Online]. Available: https://www.data-vault.co.uk/what-is-data-vault/. [Accessed: Apr. 14, 2023].
[11] K. Cernjeka, D. Jaksic, and V. Jovanovic, "NoSQL Document Store Translation to Data Vault Based EDW," in Proceedings of the 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, May 2018, pp. 1105-1110.
[12] C. Knowles and V. Jovanovic, "Extensible Markup Language (XML) Schemas for Data Vault Models," Journal of Computer Information Systems, vol. 53, no. 4, pp. 12-21, 2013.
[13] C. Giebler, C. Gröger, E. Hoos, H. Schwarz, and B. Mitschang, "Modeling Data Lakes with Data Vault: Practical Experiences, Assessment, and Lessons Learned," in Proceedings of the 38th Conference on Conceptual Modeling (ER 2019), 2019, pp. 90-99.
[14] D. Linstedt, "Super Charge Your Data Warehouse: Invaluable Data Modeling Rules to Implement Your Data Vault," 1st ed., CreateSpace Independent Publishing Platform, 2011.
[15] I. Nogueira, M. Romdhane, and J. Darmont, "Modeling Data Lake Metadata with a Data Vault," in Proceedings of the 22nd International Database Engineering & Applications Symposium (IDEAS '18), New York, USA, 2018, pp. 253-261.

Andreea VINEȘ graduated from the Faculty of Economic Statistics, Cybernetics and Informatics in 2019. Following the bachelor's degree, she also pursued a Master's program at the Bucharest University of Economic Studies in Information Systems for the Management of Processes and Economic Resources. She is currently working as a Data Engineer and brings a wealth of experience in designing, developing and maintaining data solutions; her expertise includes building and managing data pipelines, data warehousing and ETL processes, with a focus on cloud technologies. Her main fields of interest are cloud technologies, big data, data warehousing and BI.

Radu SAMOILA graduated from the Faculty of Accounting, Audit and Information Technology in 2009. He also holds a master's degree in Economy, Audit, and Information Technology, obtained in 2011 at the Bucharest University of Economy. He has held different managerial roles in multinational companies operating in the energy industry, business and financial consultancy, and FMCG sectors. His main roles covered internal and external audit, business advisory and corporate governance activities. He has been a PhD student at the Bucharest University of Economy since 2019. His main fields of interest are business process optimization and automation, as well as the continuous improvement concept.
