24-09-2024
What is Data?
Sources of Data Collection
Sources of Data for Data Analysis:
Data collected for analysis is divided mainly into two types:
1. Primary data
2. Secondary data
Nature of Data
Ordinal examples                     Nominal examples
Good, better, best                   Hair colour (black, white, grey)
Poor, rich                           Nationality (names of countries)
Star rating                          Courses (B.Tech, B.Pharma, BCA, BA)
State of mind (happy, sad, angry)    Random categories
Hot, cool, etc.
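The practical difference is that ordinal values carry an order, while nominal
values only support equality checks and counts. A minimal Python sketch of this,
assuming an illustrative rating scale and made-up sample values (none of the
data below comes from the slides):

```python
from collections import Counter
from statistics import median_low

# Ordinal scale: the position in the list defines the rank (assumed ordering).
RATING_SCALE = ["good", "better", "best"]

def rank(value):
    """Return the rank of an ordinal value on RATING_SCALE."""
    return RATING_SCALE.index(value)

# Hypothetical sample data, for illustration only.
ratings = ["good", "best", "better", "better", "good"]
hair_colours = ["black", "grey", "black", "white", "black"]

# Ordinal data can be sorted and summarised by a median.
sorted_ratings = sorted(ratings, key=rank)
median_rating = RATING_SCALE[median_low([rank(r) for r in ratings])]

# Nominal data has no order; the most frequent category (the mode)
# is the usual summary.
most_common_colour = Counter(hair_colours).most_common(1)[0][0]

print(sorted_ratings)
print(median_rating)
print(most_common_colour)
```

Sorting or taking a median of the hair colours would be meaningless, which is
why the sketch only counts them.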
Classification of Data
Structured Data
Unstructured Data
Semi-structured Data
Characteristics of Data
Introduction to Big Data Platform
Big Data is a collection of data that is huge in volume and still
growing exponentially with time. Its size and complexity are such
that no traditional data management tool can store or process it
efficiently. In short, big data is still data, only at a much
larger scale.
Characteristics of Big Data
Big data can be described by the following characteristics:
Volume
Velocity
Variety
Variability
Veracity
Vulnerability
Visualization
Need of Data Analytics
Data Analytics Process
The data analysis process consists of the following phases,
which are iterative in nature:
Data Requirements Specification
Data Collection
Data Processing
Data Cleaning
Data Analysis
Communication
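The phases above can be sketched as a small pipeline. This is a minimal
illustration, not a prescribed implementation: the requirement ("average sales
per region"), the records, and all function names are hypothetical.

```python
from statistics import mean

# Data Requirements Specification (assumed): report average sales per region.

# Data Collection: hypothetical raw records, for illustration only.
RAW_RECORDS = [
    {"region": "north", "sales": "120"},
    {"region": "south", "sales": ""},       # missing value
    {"region": "north", "sales": "80"},
    {"region": "south", "sales": "200"},
    {"region": "south", "sales": "abc"},    # malformed value
]

def process(records):
    """Data Processing: convert fields to analysis-friendly types."""
    processed = []
    for rec in records:
        try:
            sales = float(rec["sales"])
        except ValueError:
            sales = None  # unusable value; dealt with during cleaning
        processed.append({"region": rec["region"], "sales": sales})
    return processed

def clean(records):
    """Data Cleaning: drop records with missing or malformed values."""
    return [rec for rec in records if rec["sales"] is not None]

def analyze(records):
    """Data Analysis: compute average sales per region."""
    by_region = {}
    for rec in records:
        by_region.setdefault(rec["region"], []).append(rec["sales"])
    return {region: mean(values) for region, values in by_region.items()}

def communicate(results):
    """Communication: format the findings for a reader."""
    return [f"{region}: average sales {avg:.1f}"
            for region, avg in sorted(results.items())]

report = communicate(analyze(clean(process(RAW_RECORDS))))
for line in report:
    print(line)
```

Because the phases are iterative, a finding in the analysis step (for example,
too many dropped records) would normally send the analyst back to collection or
cleaning rather than straight on to communication.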
Evolution of analytics scalability:
In the traditional approach to analytic scalability, the data
must first be pulled together into a separate analytics
environment before analysis can begin.
A Massively Parallel Processing (MPP) system is the most
mature, proven, and widely deployed mechanism for
storing and analyzing large amounts of data.
An MPP database breaks the data into independent pieces,
each managed by its own storage and central processing
unit (CPU) resources.
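The idea of splitting data into independent pieces can be sketched in Python.
This is only a single-machine simulation under stated assumptions: threads
stand in for MPP nodes, round-robin is an assumed distribution scheme, and the
function names are hypothetical. A real MPP database runs each piece on
separate hardware with its own storage and CPU.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, num_nodes):
    """Split rows into independent pieces, one per simulated node (round-robin)."""
    pieces = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        pieces[i % num_nodes].append(row)
    return pieces

def local_sum(piece):
    """Work done by one simulated node, using only its own piece."""
    return sum(piece)

def parallel_total(rows, num_nodes=4):
    """Aggregate per piece in parallel, then combine the partial results."""
    pieces = partition(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, pieces))
    return sum(partials)

total = parallel_total(list(range(1, 101)))
print(total)  # 5050
```

Each simulated node never looks at another node's piece; only the small partial
results are combined at the end, which is what lets this pattern scale out.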
Data Analytics Tool
Analytics vs Reporting
Key roles for a successful analytics project:
1. Business User:
The business user understands the domain area of the project and
typically benefits from its results.
This user advises and consults with the project team on the value
of the results obtained and on how the outputs will be put into
operation.
A business manager, line manager, or deep subject matter expert
in the project domain usually fulfills this role.
2. Project Sponsor:
The project sponsor is responsible for initiating the project. The
sponsor provides the actual requirements for the project and
presents the core business issue.
The sponsor generally provides the funding and gauges the degree of
value from the final output of the project team.
This person sets the priorities and frames the desired output.
3. Project Manager:
This person ensures that the key milestones and objectives of the
project are met on time and at the expected quality.
4. Business Intelligence Analyst:
The Business Intelligence Analyst provides business domain expertise
based on a detailed and deep understanding of the data, key
performance indicators (KPIs), key metrics, and business intelligence
from a reporting point of view.
This person generally creates dashboards and reports and knows about
the data feeds and sources.
5. Database Administrator (DBA):
The DBA provisions and configures the database environment to support
the analytics needs of the team working on a project.
The DBA's responsibilities may include granting access to key
databases or tables and making sure that the appropriate security
levels are in place for the data repositories.
6. Data Engineer:
The data engineer brings deep technical skills to assist with tuning SQL
queries for data management and data extraction, and provides support
for data ingestion into the analytic sandbox.
The data engineer works closely with the data scientist to help shape
the data in the right ways for analysis.
7. Data Scientist:
The data scientist provides subject matter expertise in
analytical techniques and data modelling, and applies the right
analytical techniques to a given business issue.
The data scientist ensures that the overall analytical objectives are met.
Data scientists design and apply analytical methods and approaches
to the data available for the project.
Data Analytics Life Cycle
Applications of Data Analytics