APPROACHING ANALYTICS PROBLEMS
The Data Analytics Lifecycle is designed specifically for Big Data problems and data science projects.
The lifecycle has six phases, and project work can occur in several phases at once.
For most phases in the lifecycle, movement can be either forward or backward.
Business Understanding:
Before solving any problem in the business domain, the problem itself must be properly understood.
A sound business understanding forms a concrete base that makes later questions easier to resolve.
Analytic Understanding:
The approaches can be of four types:
Descriptive approach (reports the current status and available information)
Diagnostic approach (also called statistical analysis; explains what is happening and why it is happening)
Predictive approach (forecasts trends or the probability of future events)
Prescriptive approach (recommends how the problem should actually be solved)
Data Requirements:
The chosen analytical approach determines the data content, formats, and sources that need to be gathered.
While defining data requirements, one should find answers to questions such as "what", "where", "when", "why", "how", and "who".
Data Collection:
Collected data can arrive in any arbitrary format. According to the approach chosen and the output to be obtained, the collected data should therefore be validated.
Data Understanding:
Data understanding answers the question "Is the data collected representative of the problem to be solved?".
Descriptive statistics are calculated on the data to assess its content and quality.
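For example, a quick check of content and quality can be done with descriptive statistics in Python using pandas; the file name and columns below are hypothetical placeholders, not part of the original material:

```python
import pandas as pd

# Hypothetical dataset; file and column names are placeholders for illustration only
df = pd.read_csv("customer_transactions.csv")

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns
print(df.describe())

# Missing values per column and basic type information help judge data quality
print(df.isna().sum())
print(df.dtypes)
```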
Data Preparation:
This process includes transformation, normalization, and similar steps that make the data ready for analysis.
Modelling:
Modelling determines whether the prepared data is appropriate for processing or requires further refinement.
This phase focuses on building predictive or descriptive models.
Evaluation:
Model evaluation is performed during model development. It assesses the quality of the model and checks whether it meets the business requirements.
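As a minimal illustration, a classification model's held-out predictions can be checked against standard metrics; the labels and predictions below are invented toy values, not real project results:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy held-out labels and model predictions (hypothetical values for illustration)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Accuracy:", accuracy_score(y_true, y_pred))   # fraction of correct predictions
print(confusion_matrix(y_true, y_pred))              # rows: actual class, cols: predicted class
```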
Deployment:
The deployment phase checks how well the model holds up in the external (production) environment and whether it performs well compared to alternatives.
Feedback:
Feedback serves the necessary purpose of refining the model and assessing its performance and impact.
KEY ROLES FOR A SUCCESSFUL
ANALYTICS PROJECT
Business User
Project Sponsor
Project Manager
Business Intelligence Analyst
Database Administrator (DBA)
Data Engineer
Data Scientist
Business User:
Someone who understands the domain area and usually
benefits from the results.
This person can consult and advise the project team on
the context of the project, the value of the results, and how
the outputs will be operationalized.
Project Manager:
Ensures that key milestones and objectives are met on time and at
the expected quality.
Business Intelligence Analyst:
Provides business domain expertise based on a deep understanding of the data, key performance indicators (KPIs), key metrics, and business intelligence from a reporting perspective.
Business Intelligence Analysts generally create dashboards and
reports and have knowledge of the data feeds and sources.
Database Administrator (DBA):
Responsibilities include providing access to key databases or tables and ensuring the appropriate security levels are in place for the data repositories.
Data Engineer:
Leverages deep technical skills to assist with tuning SQL
queries for data management and data extraction, and
provides support for data ingestion into the analytic
sandbox.
While the DBA sets up and configures the databases to be used, the data engineer executes the actual data extractions and performs substantial data manipulation to facilitate the analytics.
Data Scientist:
Provides subject matter expertise for analytical techniques and data modeling, and applies valid analytical techniques to given business problems.
Ensures overall analytics objectives are met.
PHASE 1: DISCOVERY
The data science team must learn and investigate the
problem, develop context and understanding, and
learn about the data sources needed and available for
the project.
Learning the Business Domain
Resources
Framing the Problem
Identifying Key Stakeholders
Interviewing the Analytics Sponsor
Developing Initial Hypotheses
Identifying Potential Data Sources
Learning the Business Domain:
Understanding the domain area of the problem is
essential.
Data scientists have deep knowledge of the
methods, techniques, and ways for applying
heuristics to a variety of business and conceptual
problems
Resources:
As part of the discovery phase, the team needs to
assess the resources available to support the
project.
In this context, resources include technology, tools,
systems, data, and people.
Does the requisite level of expertise exist within the
organization today, or will it need to be cultivated?
The team will need to determine whether it must collect
additional data, purchase it from outside sources, or
transform existing data.
Ensure the project team has the right mix of domain
experts, customers, analytic talent, and project
management to be effective.
Framing the Problem:
Framing is the process of stating the analytics problem to be solved.
It is crucial to state the analytics problem, as well as why and to whom it is important.
It is also important to identify the main objectives of the project, identify what needs to be achieved in business terms, and identify what needs to be done to meet those needs.
It is best practice to share the statement of goals
and success criteria with the team and confirm
alignment with the project sponsor's expectations.
Establishing criteria for both success and failure helps the participants avoid unproductive effort and remain aligned with the project sponsors.
Identifying Key Stakeholders:
An important step is to identify the key stakeholders and their interests in the project.
During these discussions, the team can identify the
success criteria, key risks, and stakeholders
When interviewing stakeholders, learn about the domain
area and any relevant history from similar analytics projects.
Depending on the number of stakeholders and
participants, the team may consider outlining the type of
activity and participation expected from each stakeholder
and participant.
This will set clear expectations with the participants and
avoid delays later
Interviewing the Analytics Sponsor:
The team should plan to collaborate with the stakeholders
to clarify and frame the analytics problem.
Sponsors may have a predetermined solution that may not necessarily realize the desired outcome.
In these cases, the team must use its knowledge and expertise to identify the true underlying problem and appropriate solution.
The data science team typically has a more objective understanding of the problem set than the stakeholders, who may be suggesting solutions.
Some tips for interviewing project sponsors:
Prepare for the interview; draft questions, and review with
colleagues.
Use open-ended questions; avoid asking leading
questions.
Document what the team heard, and review it with the sponsors.
Developing Initial Hypotheses:
This step involves forming ideas that the team
can test with data.
In this way, the team can compare its answers with the
outcome of an experiment or test to generate additional
possible solutions to problems
Another part of this process involves gathering and
assessing hypotheses from stakeholders and domain
experts who may have their own perspective on what
the problem is, what the solution should be, and how
to arrive at a solution.
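As a simple illustration of checking an initial hypothesis against data, the sketch below compares two hypothetical groups with a two-sample t-test; the segment names and values are invented for illustration only:

```python
from scipy import stats

# Hypothetical metric observed for two customer segments (invented values)
segment_a = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5]
segment_b = [14.2, 13.9, 14.8, 15.1, 13.7, 14.5]

# Initial hypothesis: the two segments behave the same on this metric
t_stat, p_value = stats.ttest_ind(segment_a, segment_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A small p-value suggests the data do not support the initial hypothesis
```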
Identifying Potential Data Sources:
The team should perform five main activities during this
step of the discovery phase:
Identify data sources
Capture aggregated data sources
Review the raw data
Evaluate the data structures and tools needed
Scope the sort of data infrastructure needed for this type of problem
PHASE 2: DATA PREPARATION
The second phase of the Data Analytics Lifecycle involves
data preparation, which includes the steps to explore,
preprocess, and condition data prior to modeling and
analysis.
To get the data into the sandbox, the team needs to perform ETLT, a combination of extracting, transforming, and loading data into the sandbox. Once the data is in the sandbox, the team needs to learn about the data and become familiar with it.
The team may perform data visualizations to help team members understand the data, including its trends, outliers, and relationships among data variables. This phase involves:
Preparing the Analytic Sandbox
Performing ETLT
Learning About the Data
Data Conditioning
Survey and Visualize
Common Tools for the Data Preparation Phase
Preparing the Analytic Sandbox
When developing the analytic sandbox, it is a best practice
to collect all kinds of data there, as team members need
access to high volumes and varieties of data for a Big
Data analytics project.
The sandbox can include everything from summary-level aggregated data, structured data, and raw data feeds to unstructured text data from call logs or web logs, depending on the kind of analysis the team plans to undertake.
Expect the sandbox to be large. It may contain raw data, aggregated data, and other data types that are less commonly used in organizations.
Sandbox size can vary greatly depending on the project. A
good rule is to plan for the sandbox to be at least 5-10 times
the size of the original data sets, partly because copies of the
data may be created that serve as specific tables or data
stores for specific kinds of analysis in the project.
Performing ETLT:
In ETL, users perform extract, transform, load processes to extract data from a datastore, perform data transformations, and load the data back into the datastore.
In ELT, by contrast, the data is extracted in its raw form and loaded into the datastore, where analysts can choose to transform the data into a new state or leave it in its original, raw condition.
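A minimal sketch of the extract-load-then-transform idea with pandas, assuming a hypothetical raw CSV export; the file names, directory, and columns are placeholders, not part of the original material:

```python
import os
import pandas as pd

os.makedirs("sandbox", exist_ok=True)

# Extract and load the raw data into the sandbox as-is (no transformation yet)
raw = pd.read_csv("raw_call_logs.csv")                       # hypothetical raw extract
raw.to_csv("sandbox/raw_call_logs.csv", index=False)         # keep an untouched raw copy

# Later, analysts may choose to transform a working copy for a specific analysis
work = raw.copy()
work["call_date"] = pd.to_datetime(work["call_date"], errors="coerce")
work = work.dropna(subset=["call_date"])
work.to_csv("sandbox/call_logs_conditioned.csv", index=False)
```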
As part of the ETLT step, it is advisable to make an inventory
of the data and compare the data currently available with
datasets the team needs (Gap Analysis).
Learning About the Data:
A critical aspect of a data science project is to become familiar with the data itself. This activity accomplishes several goals:
Clarifies the data that the data science team has access to at the start of the project
Highlights gaps by identifying datasets within an organization
that the team may find useful
Identifies datasets outside the organization that may be
useful to obtain, through open APIs, data sharing, or
purchasing data to supplement already existing datasets
Data Conditioning:
Data conditioning refers to the process of cleaning data, normalizing datasets, and performing transformations on the data.
It can involve many complex steps to join or merge datasets or otherwise get them into a state that enables analysis in later phases.
Typical questions to ask during data conditioning include:
What are the data sources? What are the target fields (for example, columns of the tables)?
How clean is the data?
How consistent are the contents and files?
Review the content of data columns or other inputs
Look for any evidence of systematic error.
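A brief sketch of common conditioning steps with pandas, assuming hypothetical source tables; the file and column names are placeholders for illustration:

```python
import pandas as pd

df = pd.read_csv("web_logs.csv")  # hypothetical source table

# Clean: drop exact duplicates and rows missing the key field
df = df.drop_duplicates()
df = df.dropna(subset=["user_id"])

# Normalize: scale a numeric column to the 0-1 range
df["session_time_norm"] = (
    (df["session_time"] - df["session_time"].min())
    / (df["session_time"].max() - df["session_time"].min())
)

# Transform/merge: join with another hypothetical table on a shared key
users = pd.read_csv("users.csv")
conditioned = df.merge(users, on="user_id", how="left")
```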
Survey and Visualize:
After the team has collected and obtained at least some of
the datasets needed for the subsequent analysis, a useful
step is to leverage data visualization tools to gain an
overview of the data.
Seeing high-level patterns in the data enables one to
understand characteristics about the data very quickly
Review data to ensure that calculations remained consistent
within columns or across tables for a given data field.
Does the data distribution stay consistent over all the data? If
not, what kinds of actions should be taken to address this
problem?
Assess the granularity of the data, the range of values, and
the level of aggregation of the data.
For time-related variables, are the measurements daily,
weekly, monthly?
Is the data standardized/normalized? Are the scales
consistent?
For geospatial datasets, are state or country abbreviations
consistent across the data?
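For instance, a quick visual survey of distributions and aggregation levels can be done with pandas and matplotlib; the dataset and columns below are hypothetical:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("conditioned_data.csv")  # hypothetical conditioned dataset

# Histogram of a numeric field to see its distribution, range, and outliers
df["call_duration"].hist(bins=50)
plt.xlabel("call_duration")
plt.ylabel("count")
plt.title("Distribution of call duration")
plt.show()

# Aggregation-level check: counts per day reveal gaps or inconsistent granularity
df["call_date"] = pd.to_datetime(df["call_date"], errors="coerce")
print(df.groupby(df["call_date"].dt.date).size().head())
```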
Common Tools for the Data Preparation Phase:
Hadoop: can perform massively parallel and custom analysis for web traffic parsing, GPS location analytics, and genomic analysis.
Alpine Miner: provides a graphical user interface (GUI) for creating analytic workflows, including data manipulations and a series of analytic events such as data-mining techniques.
OpenRefine (formerly called Google Refine) is "a free, open source, powerful tool for working with messy data." It is a popular GUI-based tool.
Data Wrangler is an interactive tool for data cleaning and transformation. Wrangler was developed at Stanford University and can be used to perform many transformations on a given dataset.
MODEL PLANNING
The data science team identifies candidate models to apply to the data for clustering, classifying, or finding relationships in the data, depending on the goal of the project.
Assess the structure of the datasets.
Ensure that the analytical techniques enable the team to meet the business
objectives and accept or reject the working hypotheses.
Determine if the situation warrants a single model or a series of techniques as
part of a larger analytic workflow.
Data Exploration and Variable Selection
Model Selection
Common Tools for the Model Planning Phase
Data Exploration and Variable Selection:
Although some data exploration takes place in the data preparation phase, those activities focus mainly on data hygiene and on assessing the quality of the data itself.
In the model planning phase, the objective of data exploration is to understand the relationships among the variables, to inform selection of the variables and methods, and to understand the problem domain.
The key to this approach is to aim for capturing the
most essential predictors and variables rather than
considering every possible variable that people think
may influence the outcome.
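As an illustration, pairwise correlations can help surface the most essential predictors; the dataset and target column below are hypothetical, and correlation is only one of several possible screening methods:

```python
import pandas as pd

df = pd.read_csv("model_input.csv")  # hypothetical prepared dataset

# Correlation of each numeric variable with a hypothetical binary target column
corr = df.corr(numeric_only=True)["churned"].drop("churned")
print(corr.sort_values(key=abs, ascending=False).head(10))

# Variables with very weak correlation (or that duplicate each other) are
# candidates to drop, keeping the short list of essential predictors
```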
Model Selection
The team's main goal is to choose an analytical
technique, or a short list of candidate techniques, based
on the end goal of the project.
In machine learning and data mining, these techniques are grouped into several general families, such as classification, association rules, and clustering.
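For example, a short list of candidate classification techniques might be compared with cross-validation before committing to one. The sketch below uses scikit-learn on a small public dataset purely for illustration; the candidate list is an assumption, not a prescription:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Candidate techniques chosen to match the project's end goal (classification here)
candidates = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```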
Teams create the initial models using a statistical
software package such as R, SAS, or Matlab.
Although these tools are designed for data mining and
machine learning algorithms, they may have limitations
when applying the models to very large datasets, as is
common with Big Data.
MODEL BUILDING PHASE
The data science team needs to develop data sets for
training, testing, and production purposes.
These data sets enable the data scientist to develop the
analytical model and train it while holding aside some of
the data for testing the model.
During this phase, users run models from analytical
software packages, such as R or SAS, on file extracts
and small data sets for testing purposes. On a small
scale, assess the validity of the model and its results.
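A minimal sketch of developing training and test sets and fitting an initial model on a small extract, using scikit-learn; the file, columns, and model choice are hypothetical placeholders:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical file extract; column names are placeholders
df = pd.read_csv("model_input_extract.csv")
X = df.drop(columns=["churned"])
y = df["churned"]

# Hold aside 30% of the data for testing the model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit an initial model on the training set, then assess it on the held-out set
model = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X_train, y_train)
print("Hold-out accuracy:", model.score(X_test, y_test))
```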
COMMON TOOLS FOR THE MODEL BUILDING
PHASE
SAS Enterprise Miner allows users to run predictive and
descriptive models based on large volumes of data from
across the enterprise.
SPSS Modeler offers methods to explore and analyze
data through a GUI.
Matlab provides a high-level language for performing a
variety of data analytics, algorithms, and data exploration
Alpine Miner provides a GUI front end for users to develop analytic workflows and interact with Big Data tools and platforms on the back end.
STATISTICA and Mathematica are popular and well-regarded data mining and analytics tools.
Open-source tools:
R and PL/R: PL/R is a procedural language for PostgreSQL with R.
Octave: a free software programming language for computational modeling that has some of the functionality of Matlab.
WEKA is a free data mining software package with an analytic workbench.
Python is a programming language that provides toolkits for machine learning and analysis, such as scikit-learn, NumPy, SciPy, and pandas, and related data visualization using matplotlib.
COMMUNICATE RESULTS
After executing the model, the team needs to compare
the outcomes of the modeling to the criteria established
for success and failure.
When conducting this assessment, determine if the
results are statistically significant and valid
The best practice in this phase is to record all the findings
and then select the three most significant ones that can
be shared with the stakeholders.
The team will have documented the key findings and
major insights derived from the analysis.
ANALYSIS OVER DIFFERENT MODELS
Comparing analysis across different models helps identify which one offers:
Better performance
Longer lifetime
Easier retraining
Speedy production
OPERATIONALIZE
The team communicates the benefits of the project more
broadly and sets up a pilot project to deploy the work in a
controlled way before broadening the work to a full enterprise
or ecosystem of users.
This allows the team to learn from the deployment and make
any needed adjustments before launching the model across
the enterprise.
The presentation needs to include supporting information
about analytical methodology and data sources
The team should create a mechanism for ongoing monitoring of model accuracy and, if accuracy degrades, find ways to retrain the model.
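A minimal sketch of such a monitoring check, assuming a hypothetical scoring job that logs weekly accuracy; the values, baseline, and threshold are placeholders only:

```python
# Hypothetical weekly accuracy log for the deployed model (newest last)
weekly_accuracy = [0.91, 0.90, 0.89, 0.87, 0.84]

BASELINE = 0.90        # accuracy accepted at deployment time (placeholder)
ALLOWED_DROP = 0.05    # degradation that triggers retraining (placeholder)

latest = weekly_accuracy[-1]
if BASELINE - latest > ALLOWED_DROP:
    # In practice this could open a ticket or kick off a retraining pipeline
    print(f"Accuracy dropped to {latest:.2f}; schedule model retraining.")
else:
    print(f"Accuracy {latest:.2f} is within tolerance.")
```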
The Business Intelligence Analyst needs to know whether the reports and dashboards they manage will be affected and need to change.
The Data Engineer and Database Administrator (DBA) typically need to share their code from the analytics project and create a technical document describing how to implement it.
The Data Scientist needs to share the code and explain the model to managers and other stakeholders.
MOVING MODEL TO DEPLOYMENT ENVIRONMENT
Developing Core Material for Multiple Audiences
Project Goals
Main Findings
Approach
Model Description
Model Details
Providing Technical Specifications and Code
ANALYTICS PLAN
Discovery / business problem framed
Initial hypotheses
Data and scope
Model planning / analytic techniques
Results and key findings
Business impact
KEY DELIVERABLES OF ANALYTICS PROJECT
Developing Core Material for Multiple Audiences
Project Goals
Main Findings
Approach
Model Description
Model Details
Providing Technical Specifications and Code
PRESENTING YOUR RESULTS TO THE PROJECT
SPONSOR
The project sponsor is the person who wants the data science result, generally for the business need it will fill.
1. Summarize the motivation behind the project and its goals.
2. State the project's results.
3. Back up the results with details (code), as needed.
4. Discuss recommendations, outstanding issues, and possible future work.
Project sponsor presentation takeaways
Keep it short.
Keep it focused on the business issues, not the
technical ones.
Your project sponsor might use your presentation to
help sell the project or its results to the rest of the
organization.
PROVIDING TECHNICAL SPECIFICATIONS AND
CODE
The team should anticipate questions from IT related to
how computationally expensive it will be to run the model
in the production environment.
Teams should also write technical documentation for their code and specifications.
Introduce your results early in the presentation, rather
than building up to them.