Aggregate Data Models

The document discusses aggregate data models and how they differ from relational data models. Some key points: 1. Aggregate data models group related data elements into complex records called aggregates, allowing for nested data structures like lists and maps. 2. This differs from relational models which store data in normalized tables without nested structures. 3. Aggregate models match how NoSQL databases like key-value and document stores work better than relational models by allowing application developers to work with and manipulate data at the aggregate level.

Uploaded by

chitraalavani

Aggregate Data Models

Data Model
• A data model is a representation that we use to perceive and manipulate our data.
• It allows us to represent:
– The data elements under analysis, and
– How they are related to each other
• This representation depends on our perception.
Data Model: Database View
• In the database field, it describes how we interact with the data in the database.
• This is distinct from the storage model:
– The storage model describes how the database stores and manipulates the data internally.
• In an ideal world:
– We should be ignorant of the storage model, but
– In practice we need at least some insight into it to achieve decent performance
Data Models: Example
• In conversation, "data model" often means the model of the specific data in an application
• A developer might point to an entity-relationship diagram and refer to it as the data model containing
– customers,
– orders and
– products
Data Model: Definition
• In this course we will use “data model” to refer to the model by which the database organizes data.
• More formally, this is a metamodel.
The Data Model of the Last Decades
• The dominant data model of the last decades was the relational data model.
1. It can be represented as a set of tables.
2. Each table has rows, with each row representing some entity of interest.
3. We describe entities through columns.
4. A column may refer to another row in the same or a different table (a relationship).
NoSQL Data Model
• It moves away from the relational data model
• Each NoSQL database has a different model
– Key-value,
– Document,
– Column-family,
– Graph, and
– Sparse (Index based)
• Of these, the first three share a common
characteristic (Aggregate Orientation).
Relational Model vs Aggregate Model
Relational Model
• The relational model takes the information that
we want to store and divides it into tuples (rows).
• However, a tuple is a limited data structure.
• It captures a set of values.
• So, we can’t nest one tuple within another to get
nested records.
• Nor can we put a list of values or tuples within another.
Relational Model
• This simplicity characterizes the relational model
• It allows us to think of data manipulation as operations that:
– Take tuples as input, and
– Return tuples
• Aggregate orientation takes a different approach.
Aggregate Model
• It recognizes that we often want to operate on data in units that have a more complex structure than a set of tuples.
• We can think in terms of a complex record that allows:
– Lists,
– Maps,
– And other data structures to be nested inside it
• Key-value, document, and column-family databases all use this more complex record.
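As a rough illustration of the difference, the sketch below holds the same order first as flat tuples linked by ids and then as one nested record. All field names and values are made up for this example.

```python
# Relational style: flat tuples only, linked by ids (illustrative data).
orders = [(99, 1)]                                  # (order_id, customer_id)
order_items = [(99, 27, 32.45), (99, 31, 12.00)]    # (order_id, product_id, price)

# Aggregate style: one complex record with a nested list and a nested map.
order = {
    "id": 99,
    "customer_id": 1,
    "items": [
        {"product_id": 27, "price": 32.45},
        {"product_id": 31, "price": 12.00},
    ],
    "shipping_address": {"city": "Chicago"},
}

# Reassembling the order from tuples needs a join on order_id...
items_from_tuples = [(p, price) for (o, p, price) in order_items if o == 99]
# ...whereas the aggregate already carries its items.
items_from_aggregate = [(i["product_id"], i["price"]) for i in order["items"]]
```

Both forms hold the same information; the aggregate form simply keeps together what is normally read together.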
Aggregate Model
• Aggregate is a term that comes from Domain-Driven Design [Evans03]
– An aggregate is a collection of related objects that we wish to treat as a unit. It is a unit for data manipulation and for the management of consistency.
• We like to update aggregates with atomic operations
• We like to communicate with our data storage in terms of aggregates
Aggregate Models
• This definition matches well how key-value, document, and column-family databases work.
• With aggregates it is easier to work on a cluster, since the aggregate is a natural unit for replication and sharding.
• Aggregates are also often easier for application programmers to work with, since they reduce the impedance mismatch seen with relational databases.
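A minimal sketch of why the aggregate is a convenient unit for sharding: if the whole aggregate is stored under one key, hashing that key decides which node holds all of it. The function name `node_for`, the key format, and the plain modulo scheme are assumptions for illustration; real databases use more elaborate placement schemes.

```python
import hashlib

def node_for(aggregate_key: str, node_count: int) -> int:
    """Pick a node for an aggregate by hashing its key (hypothetical scheme)."""
    digest = hashlib.sha256(aggregate_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count

# Three toy "nodes", each just a dict from key to aggregate.
cluster = [dict() for _ in range(3)]

order = {
    "id": 99,
    "items": [{"product_id": 27}],
    "shipping_address": {"city": "Chicago"},
}
key = "order:99"

# The whole aggregate is stored under one key, so it lands on exactly one node.
cluster[node_for(key, len(cluster))][key] = order
```

Reading the order later needs only the key: the same hash leads back to the single node that holds every part of it.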
Example of Relational Model
• Assume we are building an e-commerce website;
• We have to store information about: users, products, orders, shipping addresses, billing addresses, and payment data.
Example of Relational Model
• As good relational soldiers:
– Everything is normalized
– No data is repeated in multiple tables
– We have referential integrity
Example of Relational Model
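A minimal sketch of such a normalized design, using Python's built-in `sqlite3` module. The table and column names, and the sample rows, are assumptions chosen for illustration and cover only part of the e-commerce domain described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce references
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE address  (id INTEGER PRIMARY KEY, city TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    shipping_address_id INTEGER NOT NULL REFERENCES address(id)
);
CREATE TABLE order_item (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    product_name TEXT NOT NULL,
    price REAL NOT NULL
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Martin')")
conn.execute("INSERT INTO address VALUES (77, 'Chicago')")
conn.execute("INSERT INTO orders VALUES (99, 1, 77)")
conn.execute("INSERT INTO order_item VALUES (100, 99, 'NoSQL Distilled', 32.45)")

# Each fact is stored once; reading one order back means joining several tables.
row = conn.execute("""
    SELECT c.name, a.city, i.product_name
    FROM orders o
    JOIN customer c ON c.id = o.customer_id
    JOIN address a ON a.id = o.shipping_address_id
    JOIN order_item i ON i.order_id = o.id
    WHERE o.id = 99
""").fetchone()
```

The address is stored in one row and referenced by id, so changing it changes what every referencing order sees.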
Example of Aggregate Model
• We have two aggregates: Customer and Order
• We use the UML black-diamond composition marker to show how data fits into the aggregate structure

A possible aggregation
Example of Aggregate Model
• The customer contains a list of billing addresses;
• The order contains a list of: order items, a shipping address, and
payments
• The payment itself contains a billing address for that payment
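The two aggregates can be sketched as nested records. The field names and values below are illustrative assumptions, written as Python dictionaries; a document database would typically hold the same shape as JSON.

```python
# Customer aggregate: owns a list of billing addresses.
customer = {
    "id": 1,
    "name": "Martin",
    "billing_addresses": [{"city": "Chicago"}],
}

# Order aggregate: owns its items, shipping address and payments.
order = {
    "id": 99,
    "customer_id": 1,  # link to another aggregate, by id only
    "order_items": [
        {"product_id": 27, "price": 32.45, "product_name": "NoSQL Distilled"},
    ],
    "shipping_address": {"city": "Chicago"},
    "payments": [
        {
            "card_number": "1000-1000-1000-1000",
            "billing_address": {"city": "Chicago"},  # copied, not referenced
        },
    ],
}

# The same address value appears three times, each as an independent copy.
copies = [
    customer["billing_addresses"][0],
    order["shipping_address"],
    order["payments"][0]["billing_address"],
]
```

Because each copy is independent, editing the customer's billing address later leaves the addresses recorded on the order untouched.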
Example of Aggregate Model
• A single address appears 3 times, but instead of being referenced by an id it is copied each time
• This fits a domain where we don’t want the shipping, payment and billing addresses recorded for an order to change
• What is the difference with respect to a relational representation?
Example of Aggregate Model
• The link between customer and the order is a
relationship between aggregates
Example of Aggregate Model
• A link from an order item would cross into a separate aggregate structure for products (not considered here)
• This is a kind of denormalization, similar to the tradeoffs made in relational databases, but it is more common with aggregates because we want to minimize the number of aggregates we access.
Example of Aggregate Model
• We aggregate to minimize the number of
aggregates we access during data interaction
• The important thing to notice is that:
– We have to think about how the data will be accessed
– We make this part of our thinking when developing the application data model
• We could draw our aggregates differently; it really depends on how we access the data.
• No universal answer for how to draw aggregate boundaries
• It depends entirely on how you tend to manipulate data!
– Accesses on a single order at a time: first solution
– Accesses on customers with all orders: second solution
• Context-specific
– some applications will prefer one or the other
– even within a single system
• Focus on the unit of interaction with the data storage
• Pros:
– it helps greatly with running on a cluster: data will be manipulated
together, and thus should live on the same node!
• Cons:
– an aggregate structure may help with some data interactions but be
an obstacle for others.
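A small sketch of the two boundary choices mentioned above, using toy in-memory dictionaries in place of a database. The store names and helper functions are hypothetical, for illustration only.

```python
# First solution: orders are their own aggregates, keyed by order id.
orders_by_id = {
    99: {"id": 99, "customer_id": 1, "order_items": [{"product_id": 27}]},
    100: {"id": 100, "customer_id": 1, "order_items": [{"product_id": 31}]},
}

# Second solution: one customer aggregate that embeds all of its orders.
customers_by_id = {
    1: {
        "id": 1,
        "name": "Martin",
        "orders": [
            {"id": 99, "order_items": [{"product_id": 27}]},
            {"id": 100, "order_items": [{"product_id": 31}]},
        ],
    },
}

def fetch_order(order_id):
    """One key lookup under the first solution."""
    return orders_by_id[order_id]

def fetch_customer_with_orders(customer_id):
    """One key lookup under the second solution."""
    return customers_by_id[customer_id]

def orders_of_customer_first_solution(customer_id):
    """Under the first solution this needs a scan over all order aggregates."""
    return [o for o in orders_by_id.values() if o["customer_id"] == customer_id]
```

Each layout makes one access pattern a single lookup and the other a scan, which is why the boundary has to follow how the application reads and writes its data.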
Consider a Student information system consisting of 3 entities namely,
Student_info, Course_info, and Marksheet.
Following are the frequent queries in the workload:
1. List the details of students admitted to ‘F.Y.B.Sc’ course.
2. List the details of students staying in ‘Kothrud’ area and studying in
‘T.Y.B.Sc’
3. Find the maximum score value for ‘Databases’ subject
4. List the number of students failing in the subject ‘Computer networks’
(marks < 40)

Given the above workload, derive an aggregate boundary, for aggregating the
three entities. Justify your answer.
Consequences of Aggregate Models
No Distributable Storage
• A relational mapping can capture data elements and their relationships well.
• It does not need any notion of an aggregate entity, because it uses foreign-key relationships.
• But it cannot distinguish relationships that represent aggregation from those that don’t.
• As a result, we cannot use that knowledge to decide how to store and distribute our data.
Marking Aggregate Tools
• Many data modeling techniques provide ways to mark aggregate structures in relational models
• However, they do not provide semantics that help distinguish an aggregate relationship from any other
• When working with aggregate-oriented databases, we have a clearer view of the semantics of the data.
• We can focus on the unit of interaction with the data storage.
Aggregate Ignorant
• Relational databases are aggregate-ignorant, since they have no concept of an aggregate
• Graph databases are also aggregate-ignorant.
• This is not always bad.
• In domains where it is difficult to draw aggregate boundaries, aggregate-ignorant databases are useful.
Aggregate and Operations
• An order is a good aggregate when:
– A customer is making and reviewing an order, and
– When the retailer is processing orders
• However, when the retailer wants to analyze its product sales over the last months, the aggregate structure gets in the way.
• We need to dig into every order aggregate to extract the sales history.
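A minimal sketch of that analysis over order aggregates. The product names and prices are made up, and the function `sales_by_product` is a hypothetical helper, not part of any database API.

```python
from collections import defaultdict

# Illustrative order aggregates; names and prices are made up.
orders = [
    {"id": 99, "order_items": [{"product": "book", "price": 30.0},
                               {"product": "pen", "price": 2.0}]},
    {"id": 100, "order_items": [{"product": "book", "price": 30.0}]},
]

def sales_by_product(order_aggregates):
    """Total revenue per product; has to open every order aggregate."""
    totals = defaultdict(float)
    for order in order_aggregates:
        for item in order["order_items"]:
            totals[item["product"]] += item["price"]
    return dict(totals)

totals = sales_by_product(orders)
```

The loop touches every order even though only the line items matter, which is the cost of having drawn the boundary around the order.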
Aggregate and Operations
• Aggregates may help with some operations and not with others.
• In cases where there is no clear primary structure for manipulating the data, an aggregate-ignorant database is the best option.
• But remember the point that drove us to aggregate models (cluster distribution).
• Running databases on a cluster is needed when dealing with huge quantities of data.
Running on a Cluster
• A cluster gives several advantages in computing power and data distribution
• However, we need to minimize the number of nodes we have to query when gathering data
• By explicitly including aggregates, we give the database important information about which data should be stored together
• But we still have the problem of querying historical data
Aggregates and Transactions
ACID transactions
• Relational databases allow us to manipulate any combination of rows from any tables in a single transaction.
• ACID transactions are:
– Atomic,
– Consistent,
– Isolated, and
– Durable
• For this discussion, the key property is atomicity.
Atomicity & RDBMS
• Many rows spanning many tables are updated as a single atomic operation
• The operation either succeeds or fails entirely
• Concurrent operations are isolated from each other, so we cannot see partial updates
• However, relational databases can still fail.
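A minimal sketch of multi-table atomicity using Python's built-in `sqlite3` module. The two tables and the `place_order` helper are assumptions for illustration; the point is only that a failure in the second statement also undoes the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE stock  (product TEXT PRIMARY KEY,
                     qty INTEGER NOT NULL CHECK (qty >= 0));
INSERT INTO stock VALUES ('book', 1);
""")

def place_order(order_id, product, qty):
    """Insert an order and decrement stock as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
            conn.execute("UPDATE stock SET qty = qty - ? WHERE product = ?",
                         (qty, product))
        return True
    except sqlite3.IntegrityError:
        return False

ok = place_order(1, "book", 1)      # both tables change together
failed = place_order(2, "book", 1)  # CHECK fails, so the order row is undone too
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the second call the `orders` table still holds only the first order: the insert that preceded the failed stock update was rolled back with it.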
Atomicity & NoSQL
• Aggregate-oriented NoSQL databases generally don’t support atomicity that spans multiple aggregates; they support atomic manipulation of a single aggregate at a time.
• This means that if we need to update multiple aggregates atomically, we have to manage that in the application code.
• Thus atomicity is one of the considerations when deciding how to divide up our data into aggregates
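A sketch of what "manage that in the application code" can look like, assuming a toy in-memory store where each `put` replaces one whole aggregate atomically. The class `AggregateStore`, the function `add_order`, and the choice to compensate by marking the order as cancelled are all illustrative assumptions, not a prescribed technique.

```python
import copy
import threading

class AggregateStore:
    """Toy in-memory store: each put replaces one whole aggregate atomically."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return copy.deepcopy(self._data[key])

    def put(self, key, aggregate):
        with self._lock:
            self._data[key] = copy.deepcopy(aggregate)

def add_order(store, customer_key, order_key, order):
    """Update two aggregates; the store gives no atomicity across them."""
    store.put(order_key, order)                    # step 1: atomic on its own
    try:
        customer = store.get(customer_key)
        customer["order_ids"].append(order["id"])
        store.put(customer_key, customer)          # step 2: atomic on its own
    except Exception:
        # Application-level compensation: mark the order written in step 1
        # as cancelled, then let the caller see the original error.
        store.put(order_key, dict(order, status="cancelled"))
        raise

store = AggregateStore()
store.put("customer:1", {"id": 1, "order_ids": []})
add_order(store, "customer:1", "order:99", {"id": 99, "status": "placed"})
```

Between step 1 and step 2 another reader could observe the order without the customer's reference to it. Putting both pieces of data into one aggregate removes that window, which is why atomicity needs influence where the boundaries go.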
Aggregates Models on NoSQL
Key-Value and Document
• Key-value and document databases are strongly aggregate-oriented.
• Both types of database consist of many aggregates, each with a key used to get at the data.
• The two types of database differ in that:
– In a key-value store the aggregate is opaque to the database (a blob)
– In a document database the database can see a structure in the aggregate.
Key-Value and Document
• The advantage of opacity is that we can store
whatever we like in the aggregate.
• The database may impose some size limit, but otherwise we have complete freedom
• A document store imposes limits on what we can place in it, defining the allowable structures and types of the data.
Key-Value and Document
• With a key-value store, we can only access an aggregate by its key
• With a document store:
– We can submit queries based on fields,
– We can retrieve part of the aggregate, and
– The database can create indexes based on the fields of the aggregate.
• In practice, though, the two kinds of store are used differently
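The contrast can be sketched with two toy in-memory classes. `KeyValueStore`, `DocumentStore`, and their methods are hypothetical names for this illustration and do not correspond to any real product's API.

```python
import json

class KeyValueStore:
    """Toy key-value store: values are opaque bytes, reachable only by key."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, blob: bytes):
        self._blobs[key] = blob

    def get(self, key) -> bytes:
        return self._blobs[key]

class DocumentStore:
    """Toy document store: the database can see inside each aggregate."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc: dict):
        self._docs[key] = doc

    def find(self, field, value):
        """Query by a top-level field instead of by key."""
        return [d for d in self._docs.values() if d.get(field) == value]

    def project(self, key, field):
        """Retrieve only part of an aggregate."""
        return self._docs[key][field]

order = {"id": 99, "customer_id": 1, "shipping_address": {"city": "Chicago"}}

kv = KeyValueStore()
kv.put("order:99", json.dumps(order).encode("utf-8"))  # the store sees only bytes
restored = json.loads(kv.get("order:99"))              # the application decodes

docs = DocumentStore()
docs.put("order:99", order)
by_customer = docs.find("customer_id", 1)
city = docs.project("order:99", "shipping_address")["city"]
```

With the key-value store, any filtering by customer would have to happen in the application after fetching and decoding blobs; the document store can answer it because it understands the structure.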
Key-Value and Document
• In practice, the line between key-value and
document gets a bit blurry.
• People often put an ID field in a document database to do a key-value style lookup
• With key-value databases, we expect to look up aggregates mostly by key
• With document databases, we mostly expect to submit some form of query on the internal structure of the documents.
Column-Family Stores
• One of the most influential NoSQL databases
was Google’s BigTable [Chang et al.]
• Its name suggests a tabular structure, which it realizes with sparse columns and no schema.
• It is better thought of as a two-level map than as a table.
Column-Family Stores
• These BigTable-style data models are often referred to as column stores.
• Pre-NoSQL column stores like C-Store used SQL and the relational model.
• What makes column stores different is how they physically store data.
• Most databases use the row as the unit of storage, which helps write performance
Column-Family Stores
• However, there are many scenarios where:
– Writes are rare, but
– You need to read a few columns of many rows at once
• In these situations, it’s better to store groups of columns for all rows as the basic storage unit.
• These kinds of databases are called column stores or column-family databases
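A rough sketch of the two layouts using plain Python structures. The student-style records are invented for illustration, and real column stores add compression and on-disk organization that this ignores.

```python
# Row-oriented layout: one record per row.
row_store = [
    {"id": 1, "name": "Ann", "city": "Pune", "score": 71},
    {"id": 2, "name": "Raj", "city": "Mumbai", "score": 38},
    {"id": 3, "name": "Lee", "city": "Pune", "score": 90},
]

# Column-oriented layout: one list per column, aligned by position.
column_store = {
    "id": [1, 2, 3],
    "name": ["Ann", "Raj", "Lee"],
    "city": ["Pune", "Mumbai", "Pune"],
    "score": [71, 38, 90],
}

# Reading one column touches every record in the row layout...
max_from_rows = max(r["score"] for r in row_store)
# ...but only a single list in the column layout.
max_from_columns = max(column_store["score"])

# Writing one new record is a single append in the row layout,
# but one append per column in the column layout.
row_store.append({"id": 4, "name": "Mia", "city": "Nashik", "score": 55})
new_record = {"id": 4, "name": "Mia", "city": "Nashik", "score": 55}
for column, value in new_record.items():
    column_store[column].append(value)
```

This is the tradeoff in miniature: the column layout makes "a few columns of many rows" cheap to read at the cost of more work per write.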
Column-Family Stores
• Column-family databases have a two-level aggregate
structure.
• As in a key-value store, the first key is the row identifier.
• The difference is that retrieving a key returns a map of more detailed values.
• These second-level values are referred to as columns.
• For a given row, we can access all of its column families or a particular element.
Example of Column Model
Column-Family Stores
• They organize their columns into families.
• Each column is part of a single column family, and the column family acts as a unit of access.
• The assumption is that data for a particular column family will usually be accessed together.
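The two-level structure can be sketched as nested maps: row key, then column family, then column. The row key, family names, column names, and accessor functions below are illustrative assumptions rather than any particular database's API.

```python
# row key -> column family -> column -> value (illustrative data)
rows = {
    "1234": {
        "profile": {"name": "Martin", "billing_address": "Chicago"},
        "orders": {"ODR1001": "book", "ODR1002": "pen"},
    },
}

def get_row(row_key):
    """Fetch the whole row aggregate."""
    return rows[row_key]

def get_family(row_key, family):
    """Fetch one column family, the unit of access."""
    return rows[row_key][family]

def get_column(row_key, family, column):
    """Pick out a single column value."""
    return rows[row_key][family][column]

name = get_column("1234", "profile", "name")
order_columns = sorted(get_family("1234", "orders"))
```

Note that the `orders` family uses the order ids themselves as column names, so different rows can have entirely different columns; this is what "sparse columns and no schema" allows.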
Column-Family Stores:
How to structure data
• Row-oriented view:
– each row is an aggregate (for example, the customer with id 456),
– with column families representing useful chunks of data (profile, order history) within that aggregate
• Column-oriented view:
– each column family defines a record type (e.g. customer profiles) with rows for each of the records.
– You can think of a row as the join of records in all column families
Key Points
• An aggregate is a collection of data that we interact with as
a unit.
• Aggregates form the boundaries for ACID operations with
the database
• Key-value, document, and column-family databases can all
be seen as forms of aggregate-oriented database
• Aggregates make it easier for the database to manage data
storage over clusters
• Aggregate-oriented databases work best when most data
interaction is done with the same aggregate
• Aggregate-ignorant databases are better when interactions
use data organized in many different formations
