Applying The CRISP-DM Framework For Teaching Business Analytics
Alison Kelly
Suffolk University, Boston, MA, 02108, e-mail: akelly@suffolk.edu
ABSTRACT
   Experiential learning opportunities have proven effective in teaching applied
   and complex subjects such as business analytics. Current business analytics
   pedagogy tends to focus heavily on the modeling phase, with students often lacking
   a comprehensive understanding of the entire analytics process, including dealing
   with real-life data that are not necessarily "clean" and/or small. Similarly, the
   emphasis on analytical rigor often comes at the expense of storytelling, which is
   among the most important aspects of business analytics. In this article, we
   demonstrate how the philosophy of the Cross Industry Standard Process for Data
   Mining (CRISP-DM) framework can be infused into the teaching of business analytics
   through a term-long project that simulates the real-world analytics process. The
   project focuses on problem formulation, data wrangling, modeling, performance
   evaluation, and storytelling, using real data and the programming language R for
   illustration. We also discuss the pedagogical theories and techniques involved in
   the application of the CRISP-DM framework. Finally, we document how the CRISP-DM
   framework has proved effective in helping students navigate complex analytics
   issues by offering a structured approach to solving real-world problems.
Subject Areas: analytics project, business analytics, CRISP-DM, data
wrangling, experiential learning, storytelling, R.
INTRODUCTION
The importance of incorporating business analytics in pedagogy has been well
documented (Asamoah et al., 2017; Henke et al., 2016). This trend is further
evidenced by the proliferation of business analytics courses and programs
across universities and by the increasing industry demand for analytics
professionals. Although there
† Corresponding Author. sjaggia@calpoly.edu
                                                 Jaggia, Kelly, Lertwachara, and Chen 613
     Hipp, 2000).
       Students often fail to realize the interplays between the phases and how they
       collectively contribute to the success of analytics projects. Moreover, most
       data sets introduced in these courses tend to be small and "clean," with little,
       if any, data wrangling required. As such, the data understanding phase and the
       data preparation phase receive minimal coverage. Finally, as a technical field,
       analytics education tends to overlook the importance of storytelling, the
       process of translating data points, analytical methodologies, and findings into
       interesting stories that are more palatable to the intended audience. The project
       described in this teaching brief aims to address these deficiencies in current
       analytics education.
      THE PROJECT
       The project can be assigned to teams of students enrolled in a business
       analytics course in the upper-division undergraduate curriculum or at the
       graduate level. Unlike the pedagogical approach used by Anderson and
       Williams (2019)
616 Teaching Business Analytics
The Data
The data used in this project are the ERIM data set provided by the James M.
Kilts Center of the University of Chicago's Booth School of Business
(https://www.chicagobooth.edu/research/kilts/datasets/erim). The ERIM data
set contains demographic information on 3,189 households in two midwestern
cities in the United States and their purchases in several product categories
(e.g., frozen dinners, yogurt, ketchup, margarine) from participating stores
over a 3-year period (1985-1988). For demonstration purposes, the application
used in this teaching brief focuses only on yogurt and frozen dinners. The
instructor may determine the scope of the project that best aligns with the
objectives of the course. One approach is to ask each team to select a product
category to analyze, and another approach
     is to design the project as a competition where student teams all focus on the
     same product category or categories.
            It is worth noting that while the project described in this teaching brief
     was designed around an existing data set, real-life business analytics projects
     would likely start with business managers identifying problems that require
     data-driven solutions instead of asking what questions they can answer with
     the existing data. The students must understand that the identified business
     questions should drive the entire analytics process including the acquisition of
     relevant data. The instructor should explain this important limitation of the
     project to the students in order to provide them with a realistic expectation of
     what they are likely to encounter in real projects.
     Business Understanding
      The first phase of the project deals with formulating business questions
      through an understanding of the business context. As most students are
      familiar with the retail industry, they can identify the potential business
      opportunities the data set presents to retailers and manufacturers in marketing
      and selling these products. For consistency across student teams, we suggest
      that the following two business questions be included in the assignment:
            1. Which households are likely to purchase yogurt and frozen dinner
               products?
            2. How much money is each household likely to spend on each product?
             Students also need to understand that in practice, documenting available
       resources, estimating potential costs and benefits of the project, and
       identifying success criteria are also part of this first phase. In addition,
       business analysts often work with a wide range of stakeholders, and therefore,
       identifying relevant stakeholders and understanding their values and
       requirements are critical to the success of the project. During this phase of the
      project, the instructor may also impress upon students the general differences
      between supervised and unsupervised techniques. For example, students may
      be asked to consider whether supervised (predictive modeling) or unsupervised
      (pattern recognition) learning would be appropriate for achieving the analytical
      objectives and whether the current data set supports these techniques. These
      questions can be further explored during the data understanding phase of the
      CRISP-DM framework.
      Data Understanding
      Depending on the prerequisite knowledge of the students, the instructor can
      choose to require students to download the original data from the ERIM
      website, which require a fair amount of data integration and preprocessing.
      The original household data set contains 3,189 observations with 62 variables.
      Given that we are interested in variables that may influence a household’s
      decision to purchase yogurt or frozen dinners, we remove irrelevant variables
      from the data set such as the head of household’s first name and whether the
household had a washer and dryer. Also, because nearly all male and female
heads of household are white in this data set, we delete the race variables.

Table 2: Summary statistics for the yogurt and frozen dinner expenditure (target) variables.

Variable          Min    Q1     Median   Mean    Q3      Max
Yogurt            0.00   1.20   10.33    40.60   35.84   3,258.40
Frozen Dinners    0.00   0.00   0.00     55.77   10.45   4,073.01

After integrating the household data with
detailed purchase records, we produce the modified data set, now renamed
ERIMData, which contains 3,189 observations and 18 variables. We provide a
complete list and description of the variables in Table B1 in Appendix B; all of
the data used in this application are available upon request.
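The integration step can be sketched in base R. The data frames and column names below (hhData, purchases, ID, WasherDryer) are illustrative stand-ins, not the actual ERIM file layouts:

```r
# Illustrative sketch with made-up data; the actual ERIM files differ.
hhData <- data.frame(ID = 1:4,                      # household attributes
                     WasherDryer = c(1, 0, 1, 1))   # an irrelevant variable
purchases <- data.frame(ID = c(1, 1, 2, 4),         # detailed purchase records
                        YogExp = c(3.5, 2.0, 6.1, 1.2))

# Aggregate purchase records to one row per household
yogTotals <- aggregate(YogExp ~ ID, data = purchases, FUN = sum)

# Left-join totals onto the household data; non-purchasers get NA, recoded to 0
ERIMData <- merge(hhData, yogTotals, by = "ID", all.x = TRUE)
ERIMData$YogExp[is.na(ERIMData$YogExp)] <- 0

# Remove a variable judged irrelevant to the purchase decision
ERIMData$WasherDryer <- NULL
```

The same pattern, repeated for each product category, yields a single analysis-ready table with one row per household.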
       Even with a more manageable data set, data preparation and wrangling
are still necessary prior to model development and analysis. In this application,
data wrangling and analytical modeling are performed using the R language;
however, these are common tasks that can also be completed with other
software packages or programming languages. For those who wish to learn R,
basic tutorials can be found at www.r-project.org/about.html and
www.rstudio.com/online-learning. During this phase, students also explore the
data and identify possible variables that may add value to subsequent analysis
phases. In Appendix C, we provide a portion of the R code used for data
wrangling and selected business analytic models. The complete R code is
available upon request.
       It is a common practice to produce summary statistics, look for symmetry,
and/or identify outliers for key variables. Table 2 provides a summary for the two
expenditure (target) variables. Even with only descriptive statistics, students can
draw insights from the data. For both target variables, the median is notably less
than the mean and the maximum value is dramatically higher than the third
quartile. Thus, it is likely that both distributions are positively skewed and have
outliers. Students are encouraged to use data visualization, such as boxplots, to
reinforce this finding, and to explore other visualization tools, such as histograms,
stacked column charts, scatterplots, bubble plots, and heat maps, to discover other
interesting patterns and stories.
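A quick numerical and visual check of skewness might look as follows in R; the simulated expenditures are only a stand-in for the actual ERIM variables:

```r
# Simulated, positively skewed expenditures (not the real ERIM data)
set.seed(1)
yogExp <- rexp(500, rate = 1/40)

summary(yogExp)                            # Min, quartiles, mean, and max, as in Table 2
skewHint <- mean(yogExp) > median(yogExp)  # TRUE when the right tail dominates

# Visual confirmation of skewness and outliers
boxplot(yogExp, horizontal = TRUE, main = "Yogurt expenditure")
hist(yogExp, breaks = 30, main = "Yogurt expenditure")
```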
       The strong evidence of positive skewness and/or outliers suggests the
following two approaches for conducting predictive analytics:
      1. Log-transform the yogurt and dinner expenditure variables for prediction
         models.
      2. Bin the yogurt and dinner expenditure variables for classification models.
      These transformations along with other potential data-related issues can
then be dealt with in the data preparation phase of the CRISP-DM framework.
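Both transformations are one-liners in R. The DinExp column and the bin boundaries below are assumptions for illustration; because both expenditure variables contain zeros, the sketch uses log1p() (the log of 1 plus x) rather than a plain log:

```r
# Hypothetical expenditure column; the real variable names may differ
ERIMData <- data.frame(DinExp = c(0, 0, 12.5, 55.8, 4073.0))

# 1. Log-transform for prediction models (log1p handles zero expenditures)
ERIMData$LogDinExp <- log1p(ERIMData$DinExp)

# 2. Bin into classes for classification models (boundaries are illustrative)
ERIMData$DinClass <- cut(ERIMData$DinExp,
                         breaks = c(-Inf, 0, 50, Inf),
                         labels = c("None", "Low", "High"))
```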
Data Preparation
Although the ERIM data set is from the 1980s, it highlights many of the
relevant issues that business analysts still confront daily, starting with
transforming unwieldy raw data into useful, actionable information. During
the data preparation stage, we ask students to develop a comprehensive plan
for data wrangling,
      Model Development
       In the model development subphase, the instructor should highlight the
       strengths and limitations of various modeling techniques. Students will
       identify and choose the appropriate analytical techniques after considering
       their advantages and limitations. Initially, students may start with a model
       that offers a higher level of interpretability. For example, both logistic
       regression for classification and multiple
linear regression for prediction are highly interpretable and quantify the impact
of each predictor variable on the target variable.
      Table 4 shows the logistic regression results for classifying whether or
not a household purchases yogurt and frozen dinner products, respectively. As
part of the project assignments, the instructor should ask students to consider
the following questions based on the initial modeling results: (a) which
predictor variables are the most influential?, (b) how much impact does each
predictor variable have on the probability of a household purchasing
yogurt/dinner?, and (c) which type of household is likely to purchase
yogurt/dinner? The instructor may also ask students to compare answers to
these questions to their initial assumptions gained from data exploration during
the earlier phases.
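In R, such a model is fit with glm(). The sketch below uses simulated data and only two of the predictors from Table 4; exponentiating the coefficients gives odds ratios, a common way to discuss each predictor's impact:

```r
# Simulated stand-in for the ERIM data with two illustrative predictors
set.seed(2)
n <- 300
df <- data.frame(Income  = rnorm(n, mean = 50, sd = 15),
                 Married = rbinom(n, 1, 0.7))
trueProb  <- plogis(-2 + 0.03 * df$Income + 0.7 * df$Married)
df$Yogurt <- rbinom(n, 1, trueProb)

# Logistic regression for classifying yogurt purchasers
fit <- glm(Yogurt ~ Income + Married, data = df, family = binomial)
summary(fit)       # estimates and p-values, as reported in Table 4
exp(coef(fit))     # odds ratios for interpreting each predictor's impact
```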
       Students are also asked to estimate multiple linear regression models on
the expenditure variables. As mentioned before, given the skewness of the
numerical target variables, these target variables must first be transformed into
natural logs. The instructor should provide the correct interpretation of the
estimated coefficients of log-linear models. Qualitatively, the results from the
multiple linear regression models are similar to those of the logistic models;
the results are not reported in the article for the sake of brevity.
      While logistic and linear regression models offer important insights on
how the predictor variables influence the target variables, the instructor can
remind students that predictive modeling focuses more on the model's ability
to classify or predict a future case correctly than on interpreting or drawing
inferences from the model. In other words, a well-performing explanatory
model may not necessarily be a good predictive model. Data-driven
techniques, such as naïve Bayes, ensemble trees, and k-nearest neighbors, may
result in better predictive models even though they suffer in interpretability.

Table 4: Estimated logistic regression models for yogurt and frozen dinners.

Variable                  Yogurt               Frozen Dinners
Intercept                 0.3047 (0.447)       (0.349)
HH Income                 0.0582 ** (0.007)    0.1367
Cable                     –0.0052 (0.959)      (0.365)
Single Family Home        –0.0614 (0.714)      0.1359
Own Home                  0.3565 ** (0.022)    (0.326)
Pets                      0.0740 (0.456)       0.2186 ** (0.007)
Married                   0.6901 ** (0.004)    0.4194
College Educated Both     0.6808 ** (0.001)    (0.109)
College Educated One      0.3977 * (0.073)     –0.4930 ** (0.000)
Work Hours                0.0014 (0.717)       –0.2248
Other HH Members          0.2455 ** (0.000)    (0.250)
Age                       –0.0095 ** (0.044)   –0.0045
Female HH                 0.9031 ** (0.000)    (0.165)
                          –0.0870              0.1735 ** (0.000)
                          (0.814)              –0.0214 ** (0.000)
                          –0.0387 ** (0.025)   0.1424
                          –0.0784              (0.585)

      Notes. Parameter estimates with p-values in parentheses; * and ** represent
      significance at the 10% and 5% level, respectively.
      Model Assessment
       In the model assessment subphase, the instructor should stress the importance
       of evaluating model performance using the validation or test data set instead of
       the training set. Performance measures should evaluate how well an estimated
       model will perform in an unseen sample, rather than making the evaluation
       solely on the basis of the sample data used to build the model. The validation
       data set not only provides measures for evaluating model performance in an
       unbiased manner but also helps optimize the complexity of predictive models.
              Students are asked to evaluate model performance using the validation
       data set each time a model is developed and focus on measures of predictive
       performance rather than on goodness-of-fit statistics as in a traditional
       analytical process. For classification models, performance measures include
       the accuracy rate, sensitivity, and specificity. For prediction models, common
       performance measures are the mean error (ME), the root mean square error
       (RMSE), and the mean absolute error (MAE). Furthermore, performance charts,
       such as the cumulative lift chart, the decile-wise lift chart, and the receiver
       operating characteristic (ROC) curve, are also used to evaluate model
       performance.
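The prediction-side measures can be computed directly once the data are partitioned; the sketch below simulates a small data set and uses a 60/40 split, both arbitrary choices for illustration:

```r
# Simulated data standing in for the ERIM set
set.seed(4)
n  <- 400
df <- data.frame(x = rnorm(n))
df$y <- 2 + 3 * df$x + rnorm(n)

# Partition: 60% training, 40% validation (an arbitrary split for illustration)
idx   <- sample(n, size = 0.6 * n)
train <- df[idx, ]
valid <- df[-idx, ]

# Fit on training data only; evaluate on the validation set
fit <- lm(y ~ x, data = train)
err <- valid$y - predict(fit, newdata = valid)

ME   <- mean(err)          # mean error (detects systematic bias)
RMSE <- sqrt(mean(err^2))  # root mean square error (penalizes large misses)
MAE  <- mean(abs(err))     # mean absolute error
c(ME = ME, RMSE = RMSE, MAE = MAE)
```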
       To illustrate the teaching points, we partitioned the data to re-estimate
and assess the logistic regression model for classifying whether a household
would purchase yogurt or frozen dinners, respectively; see Table 5 for the
performance measures. We present two sets of performance measures, using
the cutoff value of 0.5 (the default cutoff value for binary classification
models) and the cutoff value equal to 0.82 for yogurt and 0.33 for frozen
dinners (the actual proportion of households in the data that purchased yogurt
and frozen dinners, respectively). It is important to point out to students that
classification performance measures are highly sensitive to the cutoff values
used. A higher cutoff value classifies fewer cases into the target class,
whereas a lower cutoff value classifies more cases into the target class.
As a result, the choice of the cutoff value can influence the confusion matrix
and the resulting performance measures. In cases where there are asymmetric
misclassification costs or an uneven class distribution in the data, it is
recommended that the proportion of target class cases be used as the cutoff
value. For example, by setting the cutoff value to 0.33 for frozen dinners, the
model generates a sensitivity value of 0.5671 meaning that 56.71% of the
target class cases are correctly classified, versus a sensitivity value of 0.1106 if
the cutoff value is 0.5.
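The effect of the cutoff on sensitivity can be demonstrated with a few hypothetical predicted probabilities; the values below are invented, with a target-class share of 0.3 to echo the frozen-dinner case:

```r
# Hypothetical validation cases: 3 of 10 belong to the target class
actual <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
prob   <- c(0.60, 0.45, 0.35, 0.55, 0.48, 0.40, 0.30, 0.25, 0.20, 0.10)

# Sensitivity: share of target-class cases correctly classified at a cutoff
sens <- function(cutoff) {
  pred <- as.integer(prob >= cutoff)
  sum(pred == 1 & actual == 1) / sum(actual == 1)
}

sens(0.5)  # only 1 of 3 target cases caught at the default cutoff
sens(0.3)  # all 3 caught when the cutoff equals the class proportion
```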
       It is sometimes more informative to have graphical representations to
assess model performance. Figure 2 displays these charts for the logistic
regression model for frozen dinners. Note that the charts are created using the
validation data set. Unlike the numeric performance measures, the performance
charts are not sensitive to the choice of cutoff value. Students need to be able to
articulate the performance of various models based on the performance charts.
For example, Figure 2 suggests that the logistic regression model offers
improvement in prediction accuracy over the baseline model (random
classifier). The lift curve lies above the diagonal line, suggesting that the model
is able to identify a larger percentage of target class cases (households that
purchase frozen dinners) by looking at a smaller percentage of the validation
cases with the highest predicted probabilities of belonging to the target class.
The decile-wise lift chart conveys similar information but presents the
information in 10 equal-sized intervals. Finally, the ROC curve also suggests
that the model performs better than the baseline model in terms of sensitivity
and specificity across all possible cutoff values. The area under the ROC curve
(AUC) summarizes this advantage in a single measure.
      Figure 2: Performance charts for the logistic regression model (frozen dinners).
      Evaluation
      While the efficacy of the predictive models with regard to various performance
      measures is assessed during the modeling phase, the evaluation phase focuses
      on determining whether the models have properly achieved the business
      objectives specified in the earlier phases. This phase reminds students that data
      mining is not merely an academic exercise but a field designed to impact
      decision making, and it requires students to take off the hat of a technical
      expert and put on the business hat.
Figure 3: A regression tree model and its performance measures.
One important process within the evaluation phase is to review the steps
executed to construct the model to ensure that no important business issues
were overlooked. Therefore, each team selects two students outside the team to
review and critique the team’s modeling process and validate the logic behind
the process.
       This phase also impresses on students the importance of domain
knowledge and business acumen in understanding the findings of analytics.
During this phase, students frequently realize that the strongest patterns and
relationships identified by the models are often obvious, less useful, or simply
reflect business rules, and in many cases, the most technically elegant or
sophisticated solutions yield little relevant insight to answer the business
questions. In addition to self-assessment, student teams are asked to
collaborate with domain experts to evaluate how well their predictive models
achieve the business objectives. In the case of the ERIM project, each team
conducts interviews with the instructor, who plays the role of a retail expert, to
discuss the team’s findings and explore the possible actionable decisions that
retailers and manufacturers can make based on the findings.
       For example, the classification models reveal interesting differences and
similarities between households that are likely to purchase yogurt and those
that are likely to purchase frozen dinners. Households that are likely to
purchase yogurt tend to have higher income and education levels and consist
of a married couple or are led by a female head of household; whereas
households that are likely to
      purchase frozen dinners tend to have lower income and education levels and
      have at least one pet. Relatively young head(s) of households with large
      families are expected to purchase both yogurt and frozen dinners. Students are
      encouraged to develop compelling data stories that help depict the profiles of
      these households for the audience and provide actionable recommendations
      that would lead to marketing and advertising, store placement, and product
      design strategies. Such discussion often leads to teams backtracking to earlier
      phases to augment data preparation and modeling processes.
             Putting on the business hat encourages students to look at the models
       from a different perspective. For example, while prediction models produce
       predicted values of the target variable, a marketing executive may decide to
       place more emphasis on the ranking of the predicted values rather than the
       values themselves. Similarly, in order to achieve our business objective, we are
       more likely to be interested in identifying households that would spend more
       on frozen dinners so that we can target these households for future marketing
       efforts rather than accurately predicting how much each household spends on
       frozen dinners. The performance measures, such as RMSE, often fail to
       provide us with this critical information. To understand the model's ability to
       correctly rank spending, performance charts, such as the lift chart and the
       decile-wise lift chart, can be more helpful. A critical evaluation of the
       analytics findings from the business perspective in this phase helps the teams
       refocus on the objectives of the project and create compelling data stories and
       recommendations for business decision makers.
      Deployment
       The final phase of the project involves a written report and presentation of the
       key findings. This phase stresses the importance of storytelling to
       communicate analytical findings to their intended audience effectively.
       Storytelling, or data storytelling, refers to crafting and delivering compelling
       data-driven stories to decision makers for the purpose of converting insights
       into actions; this constitutes the final phase of the CRISP-DM framework.
             Contrary to popular belief, storytelling is not the same as data
       visualization, although presenting data through visually engaging figures and
       diagrams is a part of storytelling. Students are asked to focus on three key
       elements of storytelling: data, visualization, and narrative, and how they
       complement one another to create a compelling story about the findings.
       Simply presenting the analytical process and findings from a technical
       perspective would be of limited use to decision makers. To engage the
       audience, students must learn to focus on the context around the data that helps
       demonstrate the business value of the analysis and use appropriate and
       engaging visualizations to help reveal the underlying patterns and
       relationships. Students are advised to present business insights gained from
       data analysis from a nontechnical standpoint and craft the story around the
       data by focusing on answering the following three questions:
            (1) Why should the decision maker care about the findings?
            (2) How do these findings affect the business?
            (3) What actions do you recommend to the decision maker?
      Storytelling gives the dry topic of data analysis an interesting spin and
makes the content of the report and presentation more palatable to the audience,
who are often business decision makers with little training and/or interest in
analytical methodologies. We find that storytelling is often intimidating at first
to students in the business analytics course, especially to those with a technical
mindset. However, it is a career-building skill that can be improved with
practice and guidance from the instructor.
      Critical reflection is an essential component of the experiential learning
cycle (Kolb, 2015). It helps enhance students' understanding of the experiential
activities in the context of the learning objectives of the course. Upon the
completion of the project, the students are asked to reflect on their analytics
work. A critical reflection framework, such as the self-reflective model by
Rolfe et al. (2001), can reinforce students' learning experience. Their reflective
model is based on three simple steps: "What?," "So what?," and "Now what?."
In the "What?" step, students reflect upon important questions, such as "What
happened in the project?," "What was the role of each team member?," and
"What was the problem being solved?" During the "So what?" step, students
may consider questions such as "What other issues and opportunities arose
from the project?," "What conclusions did you draw from the project?," and
"What did you learn about the project and other team members?" Finally,
during the "Now what?" step, students contemplate questions such as "How
will you apply what you learned from the project?," "If you need to complete a
similar project again, what would you do differently?," and "What other skills
might be beneficial to learn before you proceed to the next project?"
      CONCLUSIONS
      Experiential learning opportunities that mimic real-world projects have been
      proven effective in teaching applied subjects such as business analytics. The
      project described in this teaching brief provides students with a holistic
      experience of converting data into insights and actionable business strategies.
      The infusion of the CRISP-DM framework throughout the project creates a
      structured approach to a creative problem-solving process. This extensive
      project is valued by both the instructor and students. The structure of the
      project and instructional experience gained by the instructor from this project
      can be readily applied to other large data sets and business problem contexts
      including consulting projects. Students who have undertaken this project gain a
      better understanding of the CRISP-DM framework, which is a widely adopted
      industry standard for analytics projects, and an integrated view of the
      knowledge points that permeate throughout the business analytics curriculum.
      The project provides an effective and engaging experiential learning activity
      that helps improve career readiness for business students.
      REFERENCES
      Abbasi, A., Sarker, S., & Chiang, R. H. (2016). Big data research in
           information systems: Toward an inclusive research agenda. Journal of
           the Association for Information Systems, 17(3).
       Anderson, J. S., & Williams, S. K. (2019). Turning data into better decision
            making: Asking questions, collecting and analyzing data in a personal
            analytics project. Decision Sciences Journal of Innovative Education,
            17(2), 126–145.
Asamoah, D. A., Sharda, R., Hassan, Z. A., & Kalgotra, P. (2017). Preparing a
     data scientist: A pedagogic experience in designing a big data analytics
     course. Decision Sciences Journal of Innovative Education, 15(2), 161–
     190.
Burch, G. F., Giambatista, R., Batchelor, J. H., Burch, J. J., Hoover, J. D., &
     Heller, N. A. (2019). A meta-analysis of the relationship between
     experiential learning and learning outcomes. Decision Sciences Journal
     of Innovative Education, 17(3), 239–273.
Cardozo, R. N., Durfee, W. K., Ardichvili, A., Adams, C., Erdman, A. G.,
     Hoey, M., Iaizzo, P. A., Mallick, D. N., Bar-Cohen, A., Beachy, R., &
     Johnson, A. (2002). Perspective: Experiential education in new product
     design and business development. Journal of Product Innovation
     Management, 19(1), 4–17.
Dykes, B. (2016). Data storytelling: The essential data science skill everyone
     needs. Forbes. https://www.forbes.com/sites/brentdykes/2016/03/31/data-
     storytelling-the-essential-data-science-skill-everyone-needs/
Heim, G. R., Tease, J., Rowan, J., & Comerford, K. (2005). Experiential
     learning in a management information systems course: Simulating IT
     consulting and CRM system procurement. Communications of the
     Association for Information Systems, 15, 428–463.
Henke, N., Bughin, J., Chui, M., Manyika, J., Saleh, T., Wiseman, B., &
     Sethupathy, G. (2016). The age of analytics: Competing in a data-driven
     world. McKinsey Global Institute. https://www.mckinsey.com/business-
     functions/mckinsey-analytics/our-insights/the-age-of-analytics-
     competing-in-a-data-driven-world
Institute for Operations Research and the Management Sciences. (2019). Best
     definition of analytics. https://www.informs.org/About-INFORMS/News-
     Room/O.R.-and-Analytics-in-the-News/Best-definition-of-analytics
Kolb, D. (2015). Experiential learning: Experience as the source of learning
     and development. New Jersey: Pearson Education, Inc.
Northwestern University. (2019). Course Descriptions and Schedule.
       https://sps.northwestern.edu/masters/data-science/program-courses.php?
       course_id=4790
Rolfe, G., Freshwater, D., & Jasper, M. (2001). Critical reflection in nursing
and the helping professions: A user’s guide. Basingstoke: Palgrave Macmillan.
Rudin, C. (2012). Teaching 'Prediction: Machine Learning and Statistics'.
     Proceedings of the 29th International Conference on Machine Learning,
     Edinburgh, Scotland.
Silvester, K. J., Durgee, J. F., McDermott, C. M., & Veryzer, R. W. (2002).
     Perspective: Integrated market-immersion approach to teaching new
     product development in technologically-oriented teams. Journal of
     Product Innovation Management, 19(1), 18–31.
    University of Chicago. (2019). Curricular Updates Spring 2018. https://
        grahamschool.uchicago.edu/news/curricular-updates-spring-2018
Watson, H. J. (2013). Business case for analytics. BizEd, 49–54.
Wilder, C. R., & Ozgur, C. O. (2015). Business analytics curriculum for
     undergraduate majors. INFORMS Transactions on Education, 15(2),
     180–187.
Wirth, R., & Hipp, J. (2000). CRISP-DM: Towards a standard process model
     for data mining. Proceedings of the Fourth International Conference on
     the Practical Application of Knowledge Discovery and Data Mining,
     Manchester, United Kingdom, 29–39.
      Data preparation: Determine and perform the necessary data wrangling and
       preparation tasks based on the decisions made during the business and data
       understanding phases. Explain the rationale for these tasks and document the
       changes that you have made to the data set.
      Modeling: Consider the strengths and weaknesses of different modeling
       techniques. Implement the appropriate techniques, explain the rationale for
       your selections, and present relevant analysis results and interpretation. For
       the supervised techniques, determine whether to use classification or
       prediction models and explain your decision. Use appropriate data partitioning
       and performance measures to evaluate the competing models implemented in
       the modeling phase. Identify the best model(s).
Evaluation: Refocus on the business objectives of the project. Review the steps
 executed to construct the model to ensure no key business issues were
 overlooked. Evaluate whether the models have properly achieved the
 business objectives outlined during the business understanding phase.
 Formulate actionable recommendations based on the findings.
Deployment: Communicate the findings and relevant business insights with a
 written report and oral presentation that incorporate appropriate statistical
 information and visuals. The main focus should be placed on providing
 actionable business recommendations for a managerial and non-technical
 audience.
       Data Wrangling
       # Create a binary purchase indicator: 1 if the household bought any yogurt, 0 otherwise
       myData$Yogurt <- ifelse(myData$YogExp > 0, 1, 0)

       1 The master code for a comprehensive list of analytical techniques, including the ones not detailed in
       the article, is available upon request.