Data Dictionaries in Management Information Systems (MIS)
Introduction
A data dictionary is a centralized repository that contains definitions and descriptions of data
elements used within an information system. It provides metadata—data about data—including
names, types, formats, allowable values, and relationships among data elements.
Role of Data Dictionaries in MIS
In MIS, data dictionaries are critical for ensuring that data is consistently defined and accurately
used across the system. They support database design, data integrity, system documentation, and
user training.
Key Components
Data Element Name: Identifier of the data element.
Description: Explanation of the element’s purpose.
Data Type: Nature of data (e.g., text, numeric, date).
Length: Maximum size/character limit.
Allowable Values: Permissible inputs.
Validation Rules: Conditions the data must meet.
Source: Origin system or application.
Relationships: Connections like primary/foreign keys.
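The components above can be pictured as the fields of a single record. The following is a minimal sketch, not a standard schema: the class name DataElement and the helper conforms are hypothetical names chosen for illustration, and the helper checks only length and allowable values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataElement:
    """One data dictionary entry; each attribute maps to a component above."""
    name: str                                                  # Data Element Name
    description: str                                           # Description
    data_type: str                                             # Data Type
    length: Optional[int] = None                               # Length (None if N/A)
    allowable_values: List[str] = field(default_factory=list)  # Allowable Values
    validation_rules: List[str] = field(default_factory=list)  # Validation Rules
    source: str = ""                                           # Source
    relationships: str = ""                                    # Relationships


def conforms(element: DataElement, value: str) -> bool:
    """Check a text value against the element's length and allowable values only."""
    if element.length is not None and len(value) > element.length:
        return False
    if element.allowable_values and value not in element.allowable_values:
        return False
    return True


# Illustrative entry (hypothetical values).
gender = DataElement(
    name="gender",
    description="Gender of the student",
    data_type="text",
    length=6,
    allowable_values=["Male", "Female"],
    validation_rules=["Must be one of the listed values"],
    source="User input",
)

print(conforms(gender, "Female"))   # True
print(conforms(gender, "Unknown"))  # False: too long and not an allowable value
```

Keeping every entry in the same shape is what makes a dictionary easy to search, export, and check automatically.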
Importance of Data Dictionaries
Consistency: Standardizes definitions to prevent confusion.
Communication: Enhances cross-team understanding.
Accuracy: Supports data validation and quality checks.
Documentation: Assists in compliance and auditing.
Benefits of Using a Data Dictionary
| Benefit | Description |
|---|---|
| Data Consistency | Ensures reliability and quality of data. |
| Data Analysis | Provides context for accurate data interpretation. |
| Data Transparency | Promotes visibility and shared understanding. |
| Self-Serve Data | Enables users to access and use data independently. |
| Data Security | Better support for data security and access control. |
Types of Data Dictionaries
1. Active Data Dictionary
Definition:
An active data dictionary is integrated into the database management system (DBMS)
and is updated automatically whenever the database structure changes (for example,
when tables, columns, or constraints are added or altered).
Key Features:
Maintains real-time synchronization with the database.
Changes to tables, views, or constraints are immediately reflected.
Built into relational DBMSs such as Oracle, SQL Server, and PostgreSQL.
Typically read-only for users; only the DBMS can update it.
Examples:
Oracle’s USER_TABLES and USER_TAB_COLUMNS views
SQL Server’s INFORMATION_SCHEMA views
MySQL’s INFORMATION_SCHEMA tables
Advantages:
Ensures accuracy and consistency.
Supports automated validation and integrity checks.
Critical for performance tuning and query optimization.
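The automatic updating described above can be observed in any DBMS that exposes its catalog. The sketch below uses SQLite, which ships with Python, as a stand-in for the Oracle, SQL Server, and MySQL views named in the examples. SQLite's PRAGMA table_info plays the role of the catalog query here. The table and the helper describe are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "student_id INTEGER PRIMARY KEY, "
    "first_name VARCHAR(50) NOT NULL)"
)


def describe(conn, table):
    """Read column metadata from SQLite's built-in catalog.

    PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk.
    The table name is interpolated directly, so pass trusted names only.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(r[1], r[2], bool(r[3]), bool(r[5])) for r in rows]


before = describe(conn, "students")

# Change the schema; nobody edits the dictionary by hand.
conn.execute("ALTER TABLE students ADD COLUMN email VARCHAR(100)")

after = describe(conn, "students")

print(len(before), len(after))  # 2 3
print(after[-1][0])             # email
```

The new column appears in the catalog as soon as the ALTER TABLE statement completes, which is the behaviour that distinguishes an active dictionary from a manually maintained one.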
2. Passive Data Dictionary
Definition:
A passive data dictionary is manually maintained and not automatically updated by
the DBMS. It is often created by analysts or developers for documentation purposes.
Key Features:
Exists outside the database, usually in documents, spreadsheets, or modeling
tools.
Requires manual updates when changes are made to the database schema.
Useful in early design stages or for project documentation.
Examples:
Excel spreadsheets listing tables and fields.
ERD documentation tools like Lucidchart or Draw.io.
PDF/Word-based system documentation.
Advantages:
Customizable for stakeholders (e.g., business users, auditors).
Helpful for planning and communication between teams.
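Because a passive dictionary is updated by hand, it can drift away from the real schema. One way to catch this, sketched below under stated assumptions, is to compare the documented columns with what the database reports. The documented mapping stands in for a spreadsheet export, SQLite stands in for the production DBMS, and find_drift is a hypothetical helper.

```python
import sqlite3

# Hypothetical passive dictionary: column name -> documented data type.
documented = {
    "student_id": "INTEGER",
    "first_name": "VARCHAR(50)",
    "last_name": "VARCHAR(50)",
}

# The live schema has since changed: last_name was dropped, email was added.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "student_id INTEGER PRIMARY KEY, "
    "first_name VARCHAR(50), "
    "email VARCHAR(100))"
)


def find_drift(conn, table, documented):
    """Compare a documented column list with the live table definition."""
    # PRAGMA table_info rows: cid, name, type, notnull, dflt_value, pk
    live = {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}
    shared = live.keys() & documented.keys()
    return {
        "undocumented": sorted(live.keys() - documented.keys()),
        "stale": sorted(documented.keys() - live.keys()),
        "type_mismatch": sorted(c for c in shared if live[c] != documented[c]),
    }


drift = find_drift(conn, "students", documented)
print(drift)
# {'undocumented': ['email'], 'stale': ['last_name'], 'type_mismatch': []}
```

A check like this could run on a schedule so that the manual document is corrected soon after the schema changes.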
Centralized Data Dictionary
Definition:
A centralized data dictionary is stored and maintained in one single location or system,
usually managed by a central data team or IT department.
Key Characteristics:
All metadata is stored in a central repository.
Ensures consistency and standardization of data definitions.
Easier to maintain, audit, and secure.
Often integrated with enterprise data catalogs.
Advantages:
Single source of truth.
Simplifies governance and compliance.
Reduces data redundancy and confusion.
Challenges:
May become a bottleneck if access is restricted or updates are slow.
Less flexibility for individual departments.
Example:
An enterprise-wide data dictionary used by all departments at Mount Kenya University,
managed by the IT office.
Decentralized Data Dictionary
Definition:
A decentralized data dictionary is distributed across multiple departments or
systems, each maintaining its own set of data definitions.
Key Characteristics:
Managed independently by different teams or units.
Tailored to specific departmental needs or applications.
May result in inconsistent definitions across units.
Advantages:
Greater flexibility for departments to define and manage data.
Enables faster updates and adaptations to local needs.
Challenges:
Higher risk of data inconsistency or duplication.
Difficult to ensure organization-wide data governance.
Example:
Each faculty at Mount Kenya University maintains its own data dictionary for academic
records, without a central authority.
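The inconsistency risk of a decentralized approach can be made concrete with a small check. The sketch below assumes two hypothetical faculty dictionaries that disagree on how student_id is defined. The faculty names, definitions, and the helper find_conflicts are illustrative only.

```python
from collections import defaultdict

# Hypothetical faculty dictionaries: element name -> (data type, length).
faculty_dictionaries = {
    "Science": {"student_id": ("Integer", None), "gender": ("Varchar", 6)},
    "Business": {"student_id": ("Varchar", 12), "gender": ("Varchar", 6)},
}


def find_conflicts(dictionaries):
    """Return elements that are defined differently by different owners."""
    definitions = defaultdict(dict)
    for owner, elements in dictionaries.items():
        for name, definition in elements.items():
            definitions[name][owner] = definition
    return {
        name: by_owner
        for name, by_owner in definitions.items()
        if len(set(by_owner.values())) > 1
    }


conflicts = find_conflicts(faculty_dictionaries)
print(conflicts)
# {'student_id': {'Science': ('Integer', None), 'Business': ('Varchar', 12)}}
```

A centralized dictionary avoids this class of problem by construction, since each element has exactly one definition.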
How to Create a Data Dictionary:
1. Identify & Define Elements: List and describe all key data fields.
2. Establish Relationships: Map dependencies and keys.
3. Document the Dictionary: Store details in a central format (e.g., spreadsheet).
4. Regular Updates: Keep it current as data evolves.
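The first three steps above can be sketched in a few lines: define the elements, record their relationships, and write them out in a spreadsheet-friendly format. This is an illustration under assumptions, not a prescribed procedure. The column set FIELDS and the helper to_csv are hypothetical, and the output is CSV text that a spreadsheet can open.

```python
import csv
import io

# Hypothetical column layout for the exported dictionary.
FIELDS = ["field_name", "data_type", "length", "description", "relationships"]

# Steps 1 and 2: identify and define elements, then record relationships.
elements = [
    {
        "field_name": "student_id",
        "data_type": "Integer",
        "length": "",
        "description": "Unique ID for each student",
        "relationships": "Linked to Enrollments",
    },
    {
        "field_name": "department_id",
        "data_type": "Integer",
        "length": "",
        "description": "Linked department ID",
        "relationships": "Linked to Departments",
    },
]


def to_csv(elements):
    """Step 3: document the dictionary as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(elements)
    return buffer.getvalue()


csv_text = to_csv(elements)
print(csv_text.splitlines()[0])
# field_name,data_type,length,description,relationships
```

Step 4 then amounts to regenerating or editing this file whenever the underlying data changes.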
Mount Kenya University Data Dictionary
Table: Students
| Field Name | Data Type | Length | Description | Allowable Values | Validation Rules | Source | Relationships |
|---|---|---|---|---|---|---|---|
| student_id | Integer | N/A | Unique ID for each student | Auto-generated | Primary Key | Internal | Linked to Enrollments |
| first_name | Varchar | 50 | Student's first name | Alphabetic | Not Null | User input | – |
| last_name | Varchar | 50 | Student's last name | Alphabetic | Not Null | User input | – |
| gender | Varchar | 6 | Gender of the student | Male, Female | Must be one of listed values | User input | – |
| date_of_birth | Date | N/A | Student's date of birth | Valid date | Must be over 16 years old | User input | – |
| email | Varchar | 100 | Student's email address | Valid email format | Must be unique and in email format | User input | – |
| department_id | Integer | N/A | Linked department ID | Valid dept ID | Foreign Key | Internal | Linked to Departments |
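The validation rules in the Students table can be turned into executable checks. The sketch below is one possible reading, with several assumptions: "over 16 years old" is taken to mean the student has reached their 16th birthday, the email pattern is deliberately simple, "alphabetic" is checked with str.isalpha (which rejects hyphens and apostrophes), and uniqueness of email is left to the database because it cannot be checked on a single record. The function names and sample record are hypothetical.

```python
import re
from datetime import date

ALLOWED_GENDERS = {"Male", "Female"}
# Simplified pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def age_on(born: date, today: date) -> int:
    """Age in completed years on the given day."""
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


def validate_student(record: dict, today: date) -> list:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    for name in ("first_name", "last_name"):
        value = record.get(name)
        if not value:
            errors.append(f"{name}: must not be null")
        elif len(value) > 50 or not value.isalpha():
            errors.append(f"{name}: must be alphabetic, at most 50 characters")
    if record.get("gender") not in ALLOWED_GENDERS:
        errors.append("gender: must be Male or Female")
    born = record.get("date_of_birth")
    if not isinstance(born, date) or age_on(born, today) < 16:
        errors.append("date_of_birth: must be a valid date, aged 16 or over")
    email = record.get("email") or ""
    if len(email) > 100 or not EMAIL_PATTERN.match(email):
        errors.append("email: must be a valid address, at most 100 characters")
    return errors


# Hypothetical sample record.
example = {
    "first_name": "Amina",
    "last_name": "Otieno",
    "gender": "Female",
    "date_of_birth": date(2000, 5, 17),
    "email": "amina@example.org",
}
print(validate_student(example, today=date(2025, 1, 1)))  # []
```

Passing the reference date in as a parameter keeps the age rule testable without depending on the day the code is run.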
Best Practices:
Assign ownership to ensure accountability.
Involve stakeholders for a comprehensive view.
Encourage communication for continual refinement.
Review and update regularly for relevance.
Data Dictionary vs. Data Catalog:
Data Catalog: A data catalog is a comprehensive inventory of all data assets across an
entire organization, regardless of where they are stored. It acts as a searchable directory
and provides rich metadata and context to help users discover, understand, and use
data.
Data Dictionary: A data dictionary is a detailed, technical document or repository that
provides definitions and explanations of the data elements within a specific database or
system. It focuses on the technical metadata of individual data assets.
Key Differences:
| Feature | Data Dictionary | Data Catalog |
|---|---|---|
| Scope | Narrow (single database/system) | Broad (all data assets across the organization) |
| Focus | Technical details, structure, consistency | Discoverability, understanding, governance, collaboration |
| Content | Primarily technical metadata | Technical, business, operational, and social metadata |
| Audience | Technical users (DBAs, developers, engineers) | Broader audience (analysts, business users, data scientists) |
| Primary Goal | Standardize data definitions within a system | Enable data discovery, understanding, and trust across the enterprise |
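The difference in scope can be illustrated with a toy example: each dictionary describes one system, while a catalog sits above several dictionaries and adds business context such as ownership. Everything in this sketch is hypothetical, including the system names, owners, and the helper search_catalog. Real catalog products offer far more than a name search.

```python
# Hypothetical per-system data dictionaries: element -> technical definition.
dictionaries = {
    "student_records": {
        "student_id": "Integer, primary key",
        "email": "Varchar(100), unique",
    },
    "library": {
        "member_id": "Integer, primary key",
        "email": "Varchar(120)",
    },
}

# Hypothetical catalog-level metadata that no single dictionary holds.
catalog_metadata = {
    "student_records": {"owner": "Registrar"},
    "library": {"owner": "Library services"},
}


def search_catalog(term):
    """Find matching elements across all systems, with their owning team."""
    hits = []
    for system, elements in dictionaries.items():
        for element in elements:
            if term in element:
                hits.append((system, element, catalog_metadata[system]["owner"]))
    return sorted(hits)


print(search_catalog("email"))
# [('library', 'email', 'Library services'), ('student_records', 'email', 'Registrar')]
```

The search also surfaces that the two systems define email with different lengths, the kind of cross-system finding a single dictionary cannot provide.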
Examples in Kenyan Institutions
a) Universities:
Kenyatta University uses data dictionaries to standardize student record systems.
Mount Kenya University ensures data quality across multiple campuses through a centralized
data dictionary.
b) Government Agencies:
The Kenya Revenue Authority (KRA) employs data dictionaries for consistency in taxpayer
information systems.
The Ministry of Health uses them for managing health records across counties.
Conclusion
Data dictionaries are foundational tools in MIS that help organizations maintain accurate,
consistent, and understandable data. Their use is crucial for efficient data management and
effective decision-making.
A data dictionary is essential for effective data governance. It improves data clarity, consistency,
and usability while supporting analysis, transparency, and independent data access. When
integrated with a data catalog, it contributes to a robust, organized, and efficient data management
ecosystem.