
Data Modeling Interviews

This document provides guidance on conducting data modeling interviews. It discusses establishing the purpose and scope of the modeling effort, involving relevant stakeholders, exploring the domain being modeled through questioning, collaboratively developing high-level data and process models during interviews, and refining the models after interviews. The goal is to abstractly define standard data structures and vocabularies to enable data sharing and integration across organizations.

Uploaded by

Michael Corsello


Corsello Research Foundation

Data Modeling Interviews

Lines of Questioning

Basics
Data modeling is about defining standard structures for data
Many data sets may share a common structure
Each thing in the real world should have only one data structure
Each data structure may appear in multiple data models

Data models come in 2 primary flavors

Domain model
Models all entities specific to a domain
Aligns to task automation and workflows

Entity model
Models entities regardless of domain
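
The two flavors can be sketched in a few lines. This is only an illustration: the `Person` entity and the payroll and training domains are hypothetical names, not taken from the slides. The entity structure is defined once and reused by each domain model.

```python
from dataclasses import dataclass


# Entity model: describes a thing regardless of the domain that uses it.
@dataclass(frozen=True)
class Person:
    person_id: str
    name: str


# Domain models: each reuses the same Person structure and adds
# only what its own workflows need.
@dataclass
class PayrollRecord:          # hypothetical "payroll" domain
    employee: Person
    monthly_salary: float


@dataclass
class TrainingEnrollment:     # hypothetical "training" domain
    attendee: Person
    course_code: str


alice = Person(person_id="p-1", name="Alice")
payroll = PayrollRecord(employee=alice, monthly_salary=4200.0)
enrollment = TrainingEnrollment(attendee=alice, course_code="DM-101")
```

Both domain records point at the same `Person` structure, which is the "one data structure per real-world thing, many data models" idea from the Basics slide.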


Software
Software works with or on data
Software is actually a form of data as well
Software should be keyed to a data model

Software may be built for dynamic data models

Allows for mapping to a specific implementation of a data model
Results in general-purpose software
Generally, lower performance
Lower specialization, higher generality

Software may be keyed to a specific data model

Allows for high-performance, specialized tooling
Allows for integration with workflows specific to the domain
Lower generality

Neither model is better, just different
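
A minimal sketch of the trade-off, assuming a hypothetical `Site` entity with three fields. The dynamic routine treats the model as data and works for any model; the keyed class knows one model and can carry domain-specific behaviour.

```python
from dataclasses import dataclass

# Dynamic approach: the data model is itself data, so one generic
# routine can serve any model, at the cost of per-field lookups.
SITE_MODEL = {"name": str, "latitude": float, "longitude": float}


def validate_dynamic(model: dict, record: dict) -> list:
    """Return a list of problems found when checking record against model."""
    problems = []
    for field_name, field_type in model.items():
        if field_name not in record:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(record[field_name], field_type):
            problems.append(f"wrong type for field: {field_name}")
    return problems


# Keyed approach: the code knows one model and can offer
# domain-specific behaviour directly.
@dataclass
class Site:
    name: str
    latitude: float
    longitude: float

    def is_northern(self) -> bool:
        return self.latitude > 0.0


record = {"name": "Site A", "latitude": 10.0, "longitude": 20.0}
dynamic_problems = validate_dynamic(SITE_MODEL, record)
site = Site(**record)
```

`validate_dynamic` would accept a different model dictionary without any code change, while `Site.is_northern` only makes sense for this one model.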


Data Stores
A collection of data based upon a single data model in a coherent repository is a data store
A relational database is a form of repository for data stores
A single RDBMS instance may contain multiple data stores

Data stores may be abstracted by software in numerous ways to enable access

Web-based services (SOAP/REST/JSON/RSS)
Database API (e.g. ODBC/JDBC)

Remote Service (non-web, like CORBA/DCOM/IIOP)

Native API (e.g. code library/dll/jar)
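
As a rough sketch of such an abstraction, the same hypothetical `site` data can sit behind one access interface with two repositories: an in-process dictionary (native API style) and a relational table reached through Python's standard `sqlite3` database API. The `SiteStore` interface and its method names are assumptions made for this example.

```python
import sqlite3
from typing import Optional, Protocol


class SiteStore(Protocol):
    """Access interface that hides how and where sites are stored."""

    def put(self, site_id: str, name: str) -> None: ...
    def get(self, site_id: str) -> Optional[str]: ...


class MemorySiteStore:
    """Native-API style store: a plain in-process dictionary."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, site_id: str, name: str) -> None:
        self._rows[site_id] = name

    def get(self, site_id: str) -> Optional[str]:
        return self._rows.get(site_id)


class SqliteSiteStore:
    """Database-API style store: the same model held in a relational table."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE site (site_id TEXT PRIMARY KEY, name TEXT)")

    def put(self, site_id: str, name: str) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO site (site_id, name) VALUES (?, ?)",
            (site_id, name),
        )

    def get(self, site_id: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT name FROM site WHERE site_id = ?", (site_id,)
        ).fetchone()
        return row[0] if row else None


def copy_site(source: SiteStore, target: SiteStore, site_id: str) -> None:
    """Caller code depends only on the interface, not on the repository."""
    name = source.get(site_id)
    if name is not None:
        target.put(site_id, name)


memory_store = MemorySiteStore()
sqlite_store = SqliteSiteStore()
memory_store.put("s-1", "Site A")
copy_site(memory_store, sqlite_store, "s-1")
```

A web-service or remote-service store could implement the same two methods without `copy_site` changing.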

Data Models
Data models serve several purposes

Standard data models enable standard data formats, which enable sharing
Standard data models enable standard software implementations, which enable application integration
Data models provide standard vocabularies for communicating
Data models provide references for standardizing workflows
A workflow will require and produce data from the model
Better enables defining standard entry and exit criteria

Data format standards are not data models

A standard XML schema is a standard encoding of data that implies structure, but is not itself a data model
A data model is more abstract: it does not constrain implementation, encoding or use
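
The distinction can be illustrated with one abstract model and two encodings of it. This is a sketch only: the `Sample` entity, its fields, and the XML element names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Sample:
    """Abstract model: a sample has an identifier and a depth in metres."""
    sample_id: str
    depth_m: float


def to_json(sample: Sample) -> str:
    """One possible encoding of the model."""
    return json.dumps(asdict(sample), sort_keys=True)


def to_xml(sample: Sample) -> str:
    """A different encoding of the same model."""
    root = ET.Element("sample", attrib={"id": sample.sample_id})
    ET.SubElement(root, "depth", attrib={"unit": "m"}).text = str(sample.depth_m)
    return ET.tostring(root, encoding="unicode")


sample = Sample(sample_id="s-7", depth_m=1.5)
json_text = to_json(sample)
xml_text = to_xml(sample)
```

Neither the JSON nor the XML form is the model; both are formats that happen to carry it.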

Data models are only part of the bigger picture of standardization of practice


Parts and Pieces

The goal is consistency, repeatability, measurability and reuse (sharing)

This goal requires multiple facets:

Standard data models
Standard methodologies
Technical models, algorithms and approaches

Standard business processes

Delineation of responsibility
Processes and procedures
Workflow models

In short, standards
Standards do not require agreement, only acceptance
Standards do not need to fit everyone's needs, only the cross-section of needs
Standards should be composable to get more detail; that is how to support everyone (a web of standards)


People
All activities are performed for and/or by people
A task is automated to remove the need for a person to perform it; however, the result of the task will flow to a person
People will appreciate the results of standardization, if done well, but:
There is a fear that automation is meant to put them out of work
There is a dislike for being required to do things in a different manner than they are used to (neophobia)
People want results, and standardization is not quick


Coping with change

To enable standardization to work well, expect long timelines

Expect people to not support the timelines

Deliver results in the interim, without the promise of the standardization
The grand vision of the resulting utopia from standardization should be avoided
There is no silver bullet, only hard work and good intent
Don't hide the goals, but emphasize the short-term goals
Don't let the short-term goals undermine the grand vision

The long-term goals are the most important to maintain relevance

The short-term goals are the most important to maintain support


Interviewing (finally)
When holding a data modeling / business process session, remember it is a collaborative interview
Get relevant people involved:
Average user in the domain
Hotshot or Hero in the domain
Trouble child or Technophobe in the domain
As few managers as possible in general meetings
Meet with management in a separate meeting, both before and after, for differing views

Get a cross-section of what the domain is


The Session(s)
Ask questions to spur discussion
The people are a cross-section of the domain to ensure active discussion

The facilitator / modeler does not actually create the model; the audience does
Maintain enough control and direction to stay on topic
Some discussions need to go off-topic to get to a point

The modeler guides the model development based upon their knowledge of modeling practices, not the domain
The modeler should understand the domain well enough to know what is on or off track

The outcome of the meetings is a high-level abstract data model and process model
One is of little use without the other in a specific domain
Entity data model sessions

Should result in a domain map indicating what domains this entity model is relevant to
Map should directly intersect the audience


Questions
There are no fixed questions to ask
It is imperative to teach data model basics in most cases

The line of questioning should be exploratory
Try to answer:

What does your domain do (and not do)
Establishes boundary of the domain

Who does your domain contain (and not contain)

Establishes a list of organizations of responsibility and regulatory environment
Establishes a relative size for the domain

Who do you serve and interact with

Establishes a list of consumers of what the domain produces
Establishes a list of suppliers the domain consumes from

How does your domain accomplish this

Establishes a list of processes / practices
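
One way to keep track of which lines of questioning have been covered is a simple record of the answers. The sketch below is an assumption about how a modeler might take notes, not a prescribed form; every field name and sample answer is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DomainProfile:
    """Notes captured while exploring a domain (all field names hypothetical)."""
    does: list = field(default_factory=list)          # what the domain does
    does_not: list = field(default_factory=list)      # what it does not do
    members: list = field(default_factory=list)       # who the domain contains
    consumers: list = field(default_factory=list)     # who it serves
    suppliers: list = field(default_factory=list)     # who it consumes from
    processes: list = field(default_factory=list)     # how it accomplishes this

    def open_questions(self) -> list:
        """Name the lines of questioning that still have no answers."""
        return [name for name, answers in vars(self).items() if not answers]


profile = DomainProfile()
profile.does.append("collects water quality samples")
profile.consumers.append("regional reporting office")
remaining = profile.open_questions()
```

Here `remaining` lists the questions that the session has not yet touched, which can help steer the discussion back on topic.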


Modeling
Continue elaborating the previous questions

Extract from the answers

What do you use (tools, data, techniques)
Where do you use X (for each data entity X)
What is the same/different about each data entity

Establish a baseline of entities

Forms the core data model
Extract fields/attributes

Extract metadata (descriptions)

Extract relations/multiplicities
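
The baseline of entities, with their attributes, metadata and relations, can be written down in a very small structure. This is a sketch under assumed names: `Entity`, `Attribute`, `Relation`, and the `Site` and `Sample` examples are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    description: str          # metadata captured during the interview
    required: bool = True


@dataclass
class Relation:
    target: str               # name of the related entity
    multiplicity: str         # e.g. "1", "0..1", "0..*", "1..*"


@dataclass
class Entity:
    name: str
    description: str
    attributes: list = field(default_factory=list)
    relations: list = field(default_factory=list)


site = Entity(
    name="Site",
    description="A place where samples are collected",
    attributes=[Attribute("name", "Human-readable site name")],
    relations=[Relation(target="Sample", multiplicity="0..*")],
)
sample = Entity(
    name="Sample",
    description="Material collected at a site",
    attributes=[
        Attribute("collected_on", "Date of collection"),
        Attribute("notes", "Free-text remarks", required=False),
    ],
)
baseline = {entity.name: entity for entity in (site, sample)}
```

The `required` flag leaves room for the later questions about what cannot be lived without and what is inherently optional.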

Build a Model
Still during the meeting

Depict graphically:
Data entities
Entity relations
Process uses / domain mappings
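
If a whiteboard is not available, a diagram can be produced from the notes. The sketch below assumes Graphviz is used for rendering and only generates DOT text; the `to_dot` helper and the entity names are hypothetical.

```python
def to_dot(entities: dict, relations: list) -> str:
    """Render entities and relations as Graphviz DOT text.

    entities maps an entity name to its attribute names;
    relations is a list of (source, target, multiplicity) tuples.
    """
    lines = ["digraph model {", "  node [shape=record];"]
    for name, attributes in entities.items():
        # "\l" left-justifies each attribute line inside a record node.
        fields = "\\l".join(attributes) + "\\l" if attributes else ""
        lines.append(f'  {name} [label="{{{name}|{fields}}}"];')
    for source, target, multiplicity in relations:
        lines.append(f'  {source} -> {target} [label="{multiplicity}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(
    {"Site": ["name"], "Sample": ["collected_on"]},
    [("Site", "Sample", "0..*")],
)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` tool so the audience can react to the picture during the meeting.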

Probe users for issues with the model

What is missing
What is not always true with the model
What is domain specific about the model

What cannot be lived without

What is too costly to require or is inherently optional


Build the Real Model

After the meeting is over

Decompose the model into a logical data model representation (e.g. in UML)
Partition the model
Find natural break points in the entities
Isolate each entity

Resolve dependencies into a parent and child

Extends the relational concept in that the parent data model owns the link to the child; the child is not required to know about the link
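
A minimal sketch of a parent-owned link, assuming hypothetical `Survey` (parent) and `Sample` (child) entities. In a typical relational design the child row would hold a foreign key to the parent; here the parent holds the child identifiers and the child carries no back-reference.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Child entity: carries no reference back to any parent."""
    sample_id: str
    depth_m: float


@dataclass
class Survey:
    """Parent entity: owns the links to its children by identifier."""
    survey_id: str
    sample_ids: list = field(default_factory=list)

    def link(self, sample: Sample) -> None:
        self.sample_ids.append(sample.sample_id)


def resolve(survey: Survey, samples_by_id: dict) -> list:
    """Follow the parent's links; children that cannot be found are skipped."""
    return [samples_by_id[s] for s in survey.sample_ids if s in samples_by_id]


sample = Sample(sample_id="s-7", depth_m=1.5)
survey = Survey(survey_id="v-1")
survey.link(sample)
survey.sample_ids.append("s-404")   # a link whose child lives elsewhere
found = resolve(survey, {"s-7": sample})
```

Because the child knows nothing about the link, it can live in another partition or another organization's store, and the parent tolerates links it cannot currently resolve.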

Address partition consistency issues

Define any mandatory constraints in the model

Expect that implementations will not be 100% able to enforce constraints

Expect implementations to be fully distributed, loosely coupled and inter-organizational
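
Since enforcement cannot be assumed, one option is to express mandatory constraints as checks that report violations rather than reject data. This is a sketch under that assumption; the `Constraint` type and the two example rules are hypothetical.

```python
from typing import Callable, NamedTuple


class Constraint(NamedTuple):
    name: str
    check: Callable[[dict], bool]


# Mandatory constraints of a hypothetical Sample entity.
SAMPLE_CONSTRAINTS = [
    Constraint("sample_id is present", lambda r: bool(r.get("sample_id"))),
    Constraint("depth_m is not negative", lambda r: r.get("depth_m", 0) >= 0),
]


def violations(record: dict, constraints: list) -> list:
    """Report which constraints a record breaks instead of rejecting it."""
    return [c.name for c in constraints if not c.check(record)]


incoming = {"sample_id": "", "depth_m": -2.0}
report = violations(incoming, SAMPLE_CONSTRAINTS)
```

A consumer receiving data from a loosely coupled partner can run the same checks on arrival and decide for itself how to handle records that fail.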


Review and Splanations

Provide the real model to the community
Expect concerns and issues
No word generally means nobody understands, or nobody cares
Expect most issues will be addressed not by changing the model, but by explaining the concepts of the model

Educate, explain and provide examples

Most users will want to directly relate a model to an implementation of the model
It is extremely hard to convey the difference
It is critical to maintain a complete separation of the model from its implementation
If (when) example implementations are shown, they should test the boundary of what is compliant with the model


Questions

