WEEK 2
Here are the notes summarizing the key points from Professor Chatterjee's lecture (Week 2,
Session 1):
Topic: What Consumers Want - Introduction and Quantitative Analysis
Key Concepts:
   Need, Want, Desire, Demand: Understanding the differences is crucial for marketers.
        Need: Basic necessities for survival.
        Want: Higher-level desires beyond basic needs.
        Desire: A specific want.
        Demand: Desire backed by willingness to pay.
   Sources of Information on Customer Wants:
        Directly asking customers: Surveys (qualitative and quantitative).
        Review websites (Amazon, TripAdvisor): analyze customer reviews.
        Complaint websites (ConsumerComplaints.com): understand negative feedback.
        Online forums (Salesforce.com, etc.): user discussions and recommendations.
        Tracking customer behavior: Choice models (binomial, multinomial).
        Consumer experiments: Conjoint analysis (creating scenarios and asking for
        preferences).
R Code and Analysis:
 1. Data Preparation (Review from previous week):
       Set working directory, load data ( read.csv() ).
       Handle missing values and outliers.
 2. Scaling Variables:
        scale() : Standardizes the dependent (overall rating) and independent (aspect ratings)
       variables to mean 0 and standard deviation 1, so the regression coefficients can be
       compared directly across aspects.
 3. Regression Analysis:
        lm() : Linear regression to determine the relative importance of hotel aspects on overall
       rating.
       Example: fit <- lm(review_overall_rating ~ rating_value + rating_location +
       ..., data = Data)
       Interpreting Results:
            F-statistic and p-value: Assess overall model significance.
            Adjusted R-squared: Share of the variance in the overall rating explained by the
            model, adjusted for the number of predictors.
            Coefficients: Show the impact of each aspect on the overall rating.
 4. Impact of Co-traveler Type:
       Include review_type in the regression model.
       R automatically creates dummy variables.
       Interpretation: The coefficient for each review_type level (business, solo, family,
       friends) is the difference from the reference (dropped) level (as a couple).
 5. Impact of Time Distance (between travel and review):
       Convert date of review and month of visit to date format using as.Date() .
            Handle different date formats (e.g., "dd-mm-yyyy" vs. "mm/dd/yyyy").
       Calculate the time difference.
       Include the time difference variable in the regression model.
       Interpretation: A significant coefficient suggests that the time between travel and review
       affects overall satisfaction.
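Steps 2 to 5 above can be sketched in R roughly as follows. This is only an illustration on simulated data, not the course dataset: the column names rating_value, rating_location, review_type and review_overall_rating follow the notes, while review_date, visit_month, visit_date, time_gap and all values are assumptions made up for the example.

```r
# Simulated stand-in for the hotel review data (values are made up for illustration)
set.seed(42)
n <- 200
Data <- data.frame(
  rating_value    = sample(1:5, n, replace = TRUE),
  rating_location = sample(1:5, n, replace = TRUE),
  review_type     = factor(sample(c("as a couple", "business", "family", "friends", "solo"),
                                  n, replace = TRUE)),
  review_date     = sample(c("15-03-2019", "02-07-2019", "28-11-2019"), n, replace = TRUE),
  visit_month     = sample(c("01/15/2019", "02/10/2019", "03/01/2019"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
Data$review_overall_rating <- 0.6 * Data$rating_value + 0.3 * Data$rating_location + rnorm(n)

# Step 2: standardize to mean 0, sd 1 (as.numeric() drops the matrix shape scale() returns)
num_cols <- c("review_overall_rating", "rating_value", "rating_location")
Data[num_cols] <- lapply(Data[num_cols], function(x) as.numeric(scale(x)))

# Step 5: the two columns use different date formats, so each gets its own format string
Data$review_date <- as.Date(Data$review_date, format = "%d-%m-%Y")
Data$visit_date  <- as.Date(Data$visit_month, format = "%m/%d/%Y")
Data$time_gap    <- as.numeric(Data$review_date - Data$visit_date)  # gap in days

# Steps 3 to 5: R builds the review_type dummies itself; the first level
# alphabetically ("as a couple") becomes the reference level
fit <- lm(review_overall_rating ~ rating_value + rating_location + review_type + time_gap,
          data = Data)
summary(fit)
```

In the summary output, the review_type rows (for example review_typebusiness) are read relative to "as a couple", and the time_gap row shows whether the delay between visit and review is related to the rating.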
Key Findings (from the example):
   Value for Money is the most important factor influencing overall hotel rating.
   Business and Family travelers tend to have lower satisfaction scores compared to
   couples.
   The time distance between travel and review can significantly impact the rating.
Future Topics (Mentioned):
   Choice modeling (binomial, multinomial).
   Conjoint analysis.
   Text mining of reviews.
Important Note: The professor emphasizes the importance of understanding basic statistics,
marketing management, and introductory business analytics.
Here are the short and most important notes from Professor Chatterjee's lecture (Week 2,
Session 2):
Topic: What Consumers Want - Conjoint Analysis Introduction
Key Concept: Conjoint Analysis (or "Features CONsidered JOINTly")
   Goal: Identify what attributes and attribute levels consumers value and how they make
   trade-offs.
   Principle: Break down offerings into combinations of multiple attributes and assess
   customer preferences for these combinations.
   Consumer Surplus: Customers aim to maximize the difference between the perceived
   value (utility) of a product and its price (Value - Price).
Challenges in Understanding Customer Value & Pricing:
   Increasing Value vs. Increasing Price: Higher value often means higher cost, potentially
   leading to higher prices and lower demand.
   Finding the "Right Price": Balancing customer willingness to pay with profitability.
Limitations of Classical Market Research:
   Direct Questions: Can yield vague and non-actionable answers.
   Importance Ratings/Rankings: Don't reveal how consumers make trade-offs between
   different attribute levels.
Conjoint Analysis Steps:
 1. Define the Product/Service: The focus of the analysis (e.g., laptop, smartphone).
 2. Select Attributes and Levels:
       Independent Attributes: Attributes should not be highly correlated.
       Varying Levels: Each attribute must have multiple distinct levels.
       Unambiguous Levels: Levels should be clear and specific (e.g., brand names, specific
       RAM sizes, concrete prices).
       Mutually Exclusive Levels: A single option cannot possess multiple levels of the same
       attribute simultaneously.
       Balanced Levels (Ideally): Similar number of levels across attributes.
 3. Create Combinations (Profiles): Generate meaningful combinations of the attribute levels.
    The number of combinations increases rapidly with more attributes and levels.
 4. Collect Preference Data:
       Ranking-based: Respondents rank the combinations from most to least preferred.
       Rating-based: Respondents rate each combination on a scale.
       Choice-based: Respondents choose their most preferred option from a set of
       combinations.
 5. Analyze Data: Use statistical techniques (to be discussed in the next video) to determine
   the part-worth utilities (value) of each attribute level.
 6. Interpret Results: Understand customer preferences, estimate willingness to pay, predict
   market share for new products, and optimize product positioning.
Example (Laptop):
   Attributes: Brand, Processor, RAM Size, Monitor Size, Price.
   Levels: Specific brand names (HP, Lenovo, Dell), processor speeds (e.g., 2 GHz, 3 GHz),
   RAM (e.g., 8GB, 16GB), screen sizes (e.g., 14 inch, 15.6 inch), prices (e.g., $500, $700).
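As a rough illustration of step 3, the full set of laptop profiles can be generated with expand.grid(). The levels below are the ones listed in the example; the column names are assumptions chosen for this sketch.

```r
# Full factorial design: every combination of the listed attribute levels
profiles <- expand.grid(
  brand     = c("HP", "Lenovo", "Dell"),
  processor = c("2 GHz", "3 GHz"),
  ram       = c("8GB", "16GB"),
  monitor   = c("14 inch", "15.6 inch"),
  price     = c("$500", "$700")
)

nrow(profiles)  # 3 * 2 * 2 * 2 * 2 = 48 profiles
head(profiles)
```

Even this small example produces 48 profiles, which is already too many for one respondent to rate or rank comfortably.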
Key Takeaway: Conjoint analysis is a powerful technique to understand how consumers value
different product features jointly and make trade-offs, providing more actionable insights than
traditional methods. The next session will focus on the analysis techniques.
Here are the short and most important notes from Professor Chatterjee's lecture (Week 2,
Session 3):
Topic: What Consumers Want - Conjoint Analysis Hands-on
Key Concepts:
   Orthogonal Design: A subset of all possible combinations (full factorial design) used to
   reduce the number of choices presented to respondents. In an orthogonal design, the
   attributes are uncorrelated.
   Rating-based Conjoint Analysis: Respondents rate each product profile (combination of
   attribute levels) on a scale.
   Ranking-based Conjoint Analysis: Respondents rank the product profiles in order of
   preference.
   Choice-based Conjoint Analysis: Respondents choose their most preferred product
   profile from a set of options.
   Reverse Coding: In ranking-based conjoint analysis, ranks are reversed so that larger
   numbers mean stronger preference (e.g., rank 1, the most preferred profile, gets the
   highest score, and the last rank gets the lowest score).
   Part-worth Utilities (Inferred Preferences): The relative value a consumer places on each
   level of each attribute, derived from the conjoint analysis.
   Importance Scores: The relative importance of each attribute, calculated from the range of
   part-worth utilities for that attribute.
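Reverse coding, as described above, might look like this in R. The ranks are hypothetical values for one respondent and nine profiles.

```r
# Hypothetical ranks for 9 profiles (1 = most preferred, 9 = least preferred)
ranks <- c(3, 1, 9, 5, 2, 8, 4, 7, 6)

# Reverse code so that a higher score means a stronger preference
score <- max(ranks) + 1 - ranks
score
```

The reversed scores can then be used as the dependent variable in the same way as ratings.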
R Code and Analysis:
 1. Data Preparation:
       Load the data ( read.csv() ).
       Convert the categorical variables (fuel, capacity, price) to factors ( factor() ).
 2. Regression Analysis:
       Use linear regression ( lm() ) to model the relationship between the product profile
       attributes and the rating.
       The regression coefficients are the part-worth utilities of each attribute level,
       measured relative to the reference (dropped) level of that attribute.
 3. Interpreting Regression Results:
       Assess model fit (F-statistic, p-value, adjusted R-squared).
       The coefficients show how each attribute level affects the rating.
 4. Calculating Attribute Importance:
       For each attribute, calculate the range between the highest and lowest part-worth
       utilities.
       Normalize these ranges to sum to 1 (or 100%) to get the relative importance scores.
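The four steps above can be sketched as follows for the car example. The ratings are simulated (the effect sizes are invented for illustration and are not the lecture's results), and the helper objects fuel_effect, capacity_effect, price_effect and part_worths are hypothetical names.

```r
# All 27 profiles of the car example (3 fuel x 3 capacity x 3 price levels)
cars <- expand.grid(
  fuel     = c("Diesel", "Petrol", "CNG"),
  capacity = c("8-seater", "6-seater", "4-seater"),
  price    = c("12 lakhs", "8 lakhs", "4 lakhs"),
  stringsAsFactors = FALSE
)

# Simulated ratings: invented effects plus noise, for illustration only
set.seed(1)
fuel_effect     <- c("Diesel" = 0, "Petrol" = 0.5, "CNG" = -0.5)
capacity_effect <- c("8-seater" = 0, "6-seater" = 0.8, "4-seater" = 0.3)
price_effect    <- c("12 lakhs" = 0, "8 lakhs" = 1.5, "4 lakhs" = 3)
cars$rating <- unname(5 + fuel_effect[cars$fuel] + capacity_effect[cars$capacity] +
                        price_effect[cars$price] + rnorm(nrow(cars), sd = 0.3))

# Step 1: convert the attributes to factors; the first level is the reference level
cars$fuel     <- factor(cars$fuel,     levels = names(fuel_effect))
cars$capacity <- factor(cars$capacity, levels = names(capacity_effect))
cars$price    <- factor(cars$price,    levels = names(price_effect))

# Steps 2 and 3: linear regression of the rating on the attribute levels
fit <- lm(rating ~ fuel + capacity + price, data = cars)
b <- coef(fit)

# Part-worths per attribute: reference level fixed at 0, other levels from the coefficients
attrs <- c("fuel", "capacity", "price")
part_worths <- lapply(attrs, function(a) {
  lv <- levels(cars[[a]])
  setNames(c(0, b[paste0(a, lv[-1])]), lv)
})
names(part_worths) <- attrs

# Step 4: importance = range of part-worths, normalized to sum to 1
ranges     <- sapply(part_worths, function(pw) max(pw) - min(pw))
importance <- ranges / sum(ranges)
round(100 * importance, 1)
```

Because the reference level of each attribute has a part-worth of 0 by construction, the range is taken over the reference level and the estimated coefficients together.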
Applications of Conjoint Analysis:
   Pricing Decisions:
        Estimate the impact of attribute level changes on preference.
        Use the regression coefficients to determine the price that compensates for a change
        in an attribute level.
        Alternatively, directly ask for willingness to pay.
Example (Car):
   Attributes: Fuel Type (Diesel, Petrol, CNG), Capacity (8-seater, 6-seater, 4-seater), Price
   (12 lakhs, 8 lakhs, 4 lakhs).
   The analysis determines the relative importance of these attributes, the part-worth utilities
   for each level, and how price changes might compensate for changes in other attributes.
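One possible way to work out a compensating price change is shown below. The part-worths are hypothetical numbers, not estimates from the lecture data, and the sketch assumes that utility changes linearly with price inside the tested range (4 to 12 lakhs).

```r
# Hypothetical part-worths (illustrative values only)
pw_price <- c("12 lakhs" = 0, "8 lakhs" = 1.5, "4 lakhs" = 3.0)
pw_fuel  <- c("Diesel" = 0, "Petrol" = 0.5, "CNG" = -0.5)

# Utility gained per lakh of price reduction, averaged over the tested range
utility_per_lakh <- (pw_price[["4 lakhs"]] - pw_price[["12 lakhs"]]) / (12 - 4)

# Utility lost when the car moves from Petrol to CNG
utility_drop <- pw_fuel[["Petrol"]] - pw_fuel[["CNG"]]

# Price cut (in lakhs) that would offset that loss
compensating_cut <- utility_drop / utility_per_lakh
compensating_cut
```

With these made-up numbers, a switch from Petrol to CNG would need a price cut of about 2.67 lakhs to leave the predicted rating unchanged.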
Key Takeaways:
   Conjoint analysis provides valuable insights into customer preferences for different product
   features.
   It allows for the quantification of the value customers place on specific attribute levels.
   The results can be used to inform product design, pricing, and marketing strategies.
   R is used to run the analysis.
Here are the short and most important notes from Professor Chatterjee's lecture (Week 2,
Session 4):
Topic: What Consumers Want - Conjoint Analysis Applications Continued
Key Applications of Conjoint Analysis:
 1. Brand Premium:
       The extra amount consumers are willing to pay for a branded product over a
       non-branded one, keeping all other attributes constant.
       Can be directly estimated if "Brand" is included as an attribute in the conjoint study,
       especially if willingness to pay is the response variable.
       The difference in part-worth utilities (or willingness to pay) between different brand
       levels represents the brand premium.
 2. Market Share Modeling:
       Predict the market share of different product configurations based on consumer
       preferences derived from conjoint analysis.
       Assumptions:
            Utility is a power function of the rating: \(U_i = \text{Rating}_i^{\alpha}\).
            Probability of choosing product i is its utility divided by the sum of the
            utilities of all available products: \(P_i = U_i / \sum_j U_j\).
       Steps:
            Calculate the predicted utility (based on the regression equation from conjoint
            analysis) for each competitive product.
            Use an alpha parameter to transform utility (rating).
            Calculate predicted market share based on the formula.
            Optimize the alpha parameter (e.g., using Solver) to minimize the difference
            between predicted and actual market shares (RMSE).
            Once a good alpha is found, the model can be used to predict the market share of
            new product introductions.
 3. New Product Introduction:
       Evaluate the potential market share of different new product configurations before
       launch.
       Calculate the predicted utility and market share for various potential product offerings
       using the calibrated market share model.
       This helps in deciding which new product configuration has the highest potential for
       success (highest predicted demand).
       Consider cost and profitability in addition to market share for the final decision.
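The market share calibration in application 2 can be sketched in R as below. The ratings and actual shares are hypothetical, and optimize() is used here as a stand-in for the Solver step mentioned in the notes; the names predict_share, rmse and alpha_hat are assumptions for this sketch.

```r
# Hypothetical predicted ratings (from the conjoint regression) and observed market shares
rating       <- c(A = 7.2, B = 6.1, C = 4.8)
actual_share <- c(A = 0.50, B = 0.32, C = 0.18)

# Share model from the notes: utility = rating^alpha, share = utility / sum of utilities
predict_share <- function(rating, alpha) {
  utility <- rating^alpha
  utility / sum(utility)
}

# Calibration criterion: RMSE between predicted and actual shares
rmse <- function(alpha) {
  sqrt(mean((predict_share(rating, alpha) - actual_share)^2))
}

# One-dimensional search for alpha over an assumed interval
opt       <- optimize(rmse, interval = c(0, 10))
alpha_hat <- opt$minimum

# Predicted shares after a hypothetical new product D (predicted rating 6.5) enters
new_market <- c(rating, D = 6.5)
round(predict_share(new_market, alpha_hat), 3)
```

The same calibrated alpha can be reused to compare several candidate configurations for product D before weighing in cost and profitability.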
Other Types of Conjoint Analysis (Briefly Mentioned):
   Choice-Based Conjoint (CBC):
        Respondents are presented with sets of product profiles (typically 3-4) and asked to
        choose the one they prefer most.
        The dependent variable is the choice made (categorical).
        Analyzed using conditional logit or similar choice models (to be discussed in the next
        video).
   Adaptive/Hybrid Conjoint:
        The product profiles presented to respondents are adapted based on their previous
        responses.
        Aims to efficiently identify the most important attributes and fine-tune preferences for
        specific levels.
        Particularly useful in online environments where the process can be automated.
Key Takeaways:
   Conjoint analysis has diverse applications beyond just understanding feature preferences.
   It can be used to quantify brand value and predict market outcomes.
   Market share modeling involves an additional step of calibration using actual market data.
   Different types of conjoint analysis cater to different research objectives and data collection
   methods.
Here are the short and most important notes from Professor Chatterjee's lecture (Week 2,
Session 5):
Topic: Choice Modeling - Binomial Choice (Yes/No)
Key Concepts:
   Choice Modeling: Quantitatively modeling how customers make choices. This session
   focuses on binomial choice (buy/don't buy, switch/don't switch).
   Mobile Number Porting: The example used in this session, where customers can switch
   service providers while keeping their phone number.
   Switching Barriers: Factors that hinder customers from switching service providers.
        Economic: Financial costs associated with switching (e.g., losing prepaid balance).
        Social and Psychological: Habit, loyalty, and relationships with other users on the
        same network.
        Procedural: Difficulty and complexity of the switching process.
        Option-related: Availability and attractiveness of alternative service providers.
   Generalized Linear Model (GLM): A flexible framework for modeling various types of
   response variables. Here, it is used for logistic regression.
   Logistic Regression: A statistical method used to model the probability of a binary
   outcome (e.g., switch/don't switch).
   Covariates: Control variables included in the model that are not the primary focus of the
   analysis.
   Model Objectives:
        Explanation: Understanding the relationships between predictor variables (e.g., price,
        service quality) and the outcome.
        Prediction: Accurately predicting the outcome (e.g., whether a customer will switch).
   Training and Testing Data: When the objective is prediction, the data is split into two sets:
        Training Data: Used to build the model.
        Testing Data: Used to evaluate the model's predictive performance.
   Stepwise Regression: A method to select the best set of predictor variables for a model
   (often used in prediction).
   Confusion Matrix: A table that summarizes the performance of a classification model (e.g.,
   predicted vs. actual switching behavior).
   Overall Accuracy: The percentage of cases that the model correctly predicts.
R Code and Analysis:
 1. Data Preparation:
       Load the data ( read.csv() ).
       Convert relevant columns to appropriate data types (e.g., factors).
 2. Model Building (Logistic Regression):
       Use glm() with family = binomial(link = "logit") to perform logistic regression.
       Specify the outcome variable (switching behavior) and predictor variables (service
       quality, price, switching barriers, etc.).
       Include covariates if necessary.
 3. Model Interpretation (for Explanation):
       Examine the coefficients of the predictor variables to understand their impact on the
       probability of switching.
       Consider the meaning of dummy variables for categorical predictors.
 4. Model Evaluation (for Prediction):
       Split the data into training and testing sets.
       Build the model using the training data.
       Use predict() with type = "response" to generate predicted probabilities on the testing data.
       Create a confusion matrix to assess the model's accuracy.
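A minimal sketch of the prediction workflow above, run on simulated data. The variable names (service_quality, price, barrier, plan, switched), the 70/30 split and the 0.5 cutoff are all assumptions for illustration and are not taken from the lecture dataset.

```r
# Simulated stand-in for the number porting data (values are made up)
set.seed(123)
n <- 500
d <- data.frame(
  service_quality = rnorm(n),
  price           = rnorm(n),
  barrier         = rnorm(n),
  plan            = factor(sample(c("postpaid", "prepaid"), n, replace = TRUE))
)
p <- plogis(-0.5 - 1.0 * d$service_quality + 0.8 * d$price - 0.7 * d$barrier)
d$switched <- rbinom(n, 1, p)  # 1 = switched provider, 0 = stayed

# Split into training (70%) and testing (30%) data
idx   <- sample(seq_len(n), size = 0.7 * n)
train <- d[idx, ]
test  <- d[-idx, ]

# Logistic regression on the training data
fit <- glm(switched ~ service_quality + price + barrier + plan,
           data = train, family = binomial(link = "logit"))
summary(fit)

# Stepwise selection by AIC (typically used when the goal is prediction)
fit_step <- step(fit, trace = 0)

# Predicted probabilities on the testing data, classified with a 0.5 cutoff
prob <- predict(fit_step, newdata = test, type = "response")
pred <- factor(as.integer(prob > 0.5), levels = c(0, 1))

# Confusion matrix and overall accuracy
conf_mat <- table(Predicted = pred, Actual = factor(test$switched, levels = c(0, 1)))
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
conf_mat
accuracy
```

When the goal is explanation rather than prediction, the split and the stepwise step would usually be skipped and the coefficients from summary(fit) interpreted on the full data.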
Key Takeaways:
   Choice modeling, specifically logistic regression, can be used to understand and predict
   customer switching behavior.
   Factors like price, service quality, and switching barriers influence switching decisions.
   The approach differs depending on whether the goal is explanation or prediction.
   The next video will discuss choice-based conjoint analysis.
Here are the short and most important notes from Professor Chatterjee's lecture (Week 2,
Session 6):
Topic: Choice-Based Conjoint Analysis
Key Concepts:
   Cognitive Load: Choice-based conjoint reduces the cognitive effort required from
   respondents compared to ranking or rating many options.
   Choice Sets: Respondents are presented with a limited set of product profiles (typically
   3-4) and asked to choose their most preferred option.
   No Choice Option: Allowing respondents to indicate that they would not choose any of the
   presented options.
   Multiple Held-Constant Alternatives: Including certain fixed options across different
   choice sets.
   Mimics Real World: Choice-based conjoint mirrors the actual consumer decision-making
   process involving awareness sets, consideration sets, and choice sets.
   Investigates Interactions: Can capture how different attribute levels jointly influence
   choice.
   Alternative Specific Effects: Can account for unique preferences related to specific
   alternatives.
   Larger Sample Size: Requires more respondents compared to rating-based or
   ranking-based conjoint to obtain sufficient data points.
   Fewer Attributes Are Ideal: Too many attributes can lead to heuristic-based responses
   (shortcuts in decision-making).
   Complex Analysis: The analytical methods are more sophisticated than simple linear
   regression (e.g., conditional logit).
Conditional Logit:
   The primary analytical technique for choice-based conjoint.
   Models the probability of choosing a specific option from a choice set, conditional on the
   attributes of all options within that set.
   Formula: \(P(\text{choose } j \mid \text{choice set } k) = \frac{e^{U_j}}{\sum_{m \in k} e^{U_m}}\),
   where \(U\) represents the utility of each option.
   Reduces to binary logistic regression when the choice set has only two options (choose
   vs. not choose).
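The conditional logit formula can be checked numerically with a few lines of base R. The utilities below are hypothetical, and choice_prob is an illustrative helper, not a function from any package.

```r
# Choice probabilities within one choice set: exp(U_j) / sum of exp(U_m)
choice_prob <- function(U) {
  expU <- exp(U - max(U))  # subtracting max(U) avoids overflow; the ratios are unchanged
  expU / sum(expU)
}

# Hypothetical utilities for a choice set of 4 options
U <- c(opt1 = 1.2, opt2 = 0.4, opt3 = -0.3, opt4 = 0.0)
round(choice_prob(U), 3)

# Two-option case: the probability equals the logistic function of the utility difference
U2 <- c(choose = 0.8, not_choose = 0)
choice_prob(U2)[["choose"]]
plogis(0.8 - 0)
```

The last two lines return the same value, which is the sense in which the two-option conditional logit coincides with binary logistic regression.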
Case Study: Healthcare Choices of Urban Slum Dwellers:
   Attributes and Levels: Distance (≤5km, >5km), Reputation (Low, High), Delivery Method
   (Telemedicine, Face-to-face), Payment (Upfront, Non-upfront), Price (Low, High).
   Data Collection: 303 respondents, each presented with 8 choice sets of 4 options.
   Analysis: Conditional logit was used to determine the influence of each attribute level on
   healthcare choices.
   Key Findings (Direct Effects):
        Shorter distance, higher reputation, telemedicine (slight preference), upfront payment,
        and lower price increased the likelihood of choosing a healthcare option.
        Reputation and price were the most influential factors.
   Interaction Effects (with Gender and Health Insurance):
        Male respondents were less sensitive to distance compared to female respondents.
        Male respondents and those with health insurance were less sensitive to price.
        The effect of health insurance on price sensitivity needs further investigation due to the
        small sample size of insured individuals.
   Other Findings (from the actual paper):
        Individuals with bank accounts were less sensitive to price and distance and preferred
        reputed doctors.
        Availability of local ambulance facilities reduced price and distance sensitivity and
        increased preference for face-to-face interactions and EMI payments.
Key Takeaways:
   Choice-based conjoint provides a more realistic assessment of consumer preferences by
   presenting choices as they occur in the real world.
   Conditional logit is the primary statistical method for analyzing choice data.
   Interaction effects can reveal how preferences for certain attribute levels vary across
   different consumer segments.
   Conjoint analysis can inform the design of new products and services, even for specific and
   vulnerable populations.