The Business of Diwali
The Business of Diwali: A Sales Analysis
Vedanta Bhattacharya
PGDM-International Business
Roll No:183052
Data Analysis using Excel & Python (DAEP)
Submitted for evaluation to
Prof. Prashant Kumar
Assistant Professor (Visiting)
FORE School of Management, New Delhi
September 2024
Summary of project
Abstract:
This report presents an analytical study of Diwali sales data using Python to uncover key
consumer trends and patterns. Visualizations, made possible by Matplotlib, are used to highlight
significant findings, offering a clear, data-driven view of the Diwali sales landscape. This
analysis aids in understanding consumer behavior during the festival, providing valuable insights
for future marketing strategies and inventory planning.
1. Introduction
“Diwali is the festival of lights, and it reminds us that even in the darkest times, hope and
kindness will shine through.” This report aims to analyse the Diwali sales data using Python,
focusing on consumer demographics and purchasing behaviour during the festive season. The
analysis covers various aspects, including Gender Count, Age Distribution, State-wise sales,
Occupation-based trends, and product category preferences. Through comprehensive data
visualization, we seek to understand the key trends that influence Diwali shopping.
Purpose/Objective:
• To examine gender and age distribution among customers.
• To analyse the geographic spread of Diwali sales.
• To study occupation-wise purchasing patterns.
• To investigate the popularity of product categories during Diwali.
Scope:
Using Python’s powerful libraries such as Pandas, Matplotlib, and Seaborn, we will process and
visualize the Diwali sales data. The focus will be on understanding customer behaviour by
categorizing sales based on gender, age, state, occupation, and product types. This analysis will
provide actionable insights for businesses looking to capitalize on Diwali season sales.
2. About Dataset:
User_ID: Unique identifier for each customer.
Cust_name: Customer’s name.
Product_ID: Unique identifier for each product.
Gender: Gender of the customer (Male or Female).
Age Group: Age group of the customer (e.g., 26-35).
Age: Actual age of the customer.
Marital_Status: Indicates whether the customer is married (1) or not (0).
State: The state where the customer resides.
Zone: The region of India where the state belongs (e.g., Western, Southern).
Occupation: Customer’s occupation (e.g., Healthcare, Automobile).
Product_Category: Category of the purchased product (e.g., Auto).
Orders: Number of orders placed by the customer.
Amount: Total amount spent by the customer.
Status: Possibly indicates the status of the transaction, but it has missing values.
Unnamed1: An additional column, likely unnecessary, possibly a placeholder or artifact from
data entry.
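As a minimal sketch of how a table with these columns might be loaded and tidied with Pandas: the sample rows below are invented placeholders, not the project dataset, and the helper name load_and_clean is hypothetical. The sketch assumes that the Status and Unnamed1 columns carry no usable data and that rows without an Amount should be dropped; the column names follow the list above, with Marital_Status written with an underscore as it appears elsewhere in this report.

```python
import io

import pandas as pd

# Invented sample rows that only mimic the columns described above;
# a real run would pass the path of the project CSV instead.
SAMPLE_CSV = """User_ID,Cust_name,Product_ID,Gender,Age Group,Age,Marital_Status,State,Zone,Occupation,Product_Category,Orders,Amount,Status,Unnamed1
1001,Asha,P001,F,26-35,28,0,Maharashtra,Western,Healthcare,Auto,1,23952.0,,
1002,Ravi,P002,M,26-35,35,1,Karnataka,Southern,Automobile,Auto,3,,,
1003,Meera,P003,F,18-25,22,0,Gujarat,Western,Healthcare,Auto,2,15000.0,,
"""


def load_and_clean(source):
    """Read the sales table and apply the assumed clean-up steps."""
    df = pd.read_csv(source)
    # Drop the two columns flagged as empty or unnecessary, if they exist.
    df = df.drop(columns=["Status", "Unnamed1"], errors="ignore")
    # Rows without an Amount cannot contribute to revenue figures.
    df = df.dropna(subset=["Amount"])
    # Store Amount as whole numbers once the missing values are gone.
    df = df.assign(Amount=df["Amount"].astype(int))
    return df.reset_index(drop=True)


df = load_and_clean(io.StringIO(SAMPLE_CSV))
print(df.shape)
print(df[["Gender", "Age Group", "State", "Orders", "Amount"]])
```

With the real data, the same function could be called with the CSV path in place of the in-memory sample.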
Impact of the Labels:
User_ID, Cust_name: Help in identifying individual customers for tracking and personalization.
Product_ID, Product_Category: Essential for analyzing sales performance by product type.
Gender, Age Group, Age, Marital_Status: Useful for demographic segmentation and targeting
of marketing efforts.
State, Zone: Help analyze geographic trends in sales, allowing for region-specific marketing
strategies.
Occupation: Provides insights into customer spending patterns based on profession.
Orders, Amount: Critical for understanding sales volume and revenue generation.
Status: Could indicate transaction completion or order fulfillment, though it has missing data.
Each of these labels enables a detailed understanding of customer behavior and sales
performance, crucial for optimizing marketing and sales strategies during the Diwali season.
3. Results with screenshots of Python code:
Figure 1: Plotting a bar chart for Gender and its count
Figure 2: Plotting a bar chart for Age Group and its count
Figure 3: Distribution of Number of orders (Top 10 States)
Figure 4: Marital Status of consumers
Figure 5: Distribution of consumers based on occupation
Figure 6: Distribution of Product Categories and its count
Figure 7: Product-ID v/s Orders
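The code screenshots are not reproduced in this text. As a rough illustration only, charts in the style of Figures 1 and 3 could be drawn along the following lines. The small DataFrame is invented for the example and the variable names are assumptions; this is not the code used to produce the figures.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented stand-in for the cleaned sales table.
sales = pd.DataFrame({
    "Gender": ["F", "M", "F", "F", "M", "M"],
    "State": ["Maharashtra", "Karnataka", "Maharashtra", "Delhi", "Delhi", "Kerala"],
    "Orders": [2, 1, 4, 3, 1, 2],
})

fig, (ax_gender, ax_state) = plt.subplots(1, 2, figsize=(11, 4))

# Figure 1 style: number of rows (customers) per gender.
sns.countplot(data=sales, x="Gender", ax=ax_gender)
ax_gender.set_title("Customers by gender")

# Figure 3 style: total orders per state, keeping at most the ten largest.
top_states = (
    sales.groupby("State", as_index=False)["Orders"].sum()
    .sort_values("Orders", ascending=False)
    .head(10)
)
sns.barplot(data=top_states, x="State", y="Orders", ax=ax_state)
ax_state.set_title("Orders by state (top 10)")

fig.tight_layout()
```

Calling fig.savefig with a file name, or plt.show() in an interactive session, would then render the two panels.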
4. Discussion on the key findings
The analysis of Diwali sales data provided significant insights into consumer behaviour during
the festive season. Below are the main findings:
1. Gender-Based Trends: A bar chart displaying the count of customers based on gender
revealed that purchasing behaviour slightly skewed towards males. This suggests that marketing
strategies during Diwali might be more effective if tailored towards male customers, while not
neglecting female shoppers.
2. Age Distribution: The age distribution analysis showed a strong preference for Diwali
shopping among younger demographics, particularly in the 18–35 age group. This indicates that
businesses could benefit from targeting this age range through digital marketing, social media,
and product offerings that appeal to younger consumers.
3. State-Wise Sales: The geographic distribution of sales highlighted the top 10 states
contributing the most to Diwali purchases. This allows businesses to focus their logistical and
marketing efforts on these key regions, optimizing inventory placement and promotions.
4. Occupation-Based Purchases: An analysis of the occupations of consumers revealed
that salaried individuals formed a significant portion of the customer base. This insight suggests
the importance of targeting professionals, particularly around salary dates or offering relevant
schemes such as EMI options or festive bonuses.
5. Product Categories: The product category distribution indicated high sales of certain
categories, with electronics and home decor being particularly popular. These categories could
be prioritized in promotions and inventory stocking for future Diwali sales.
6. Purchase Behaviour by Marital Status: Data on the marital status of consumers showed
that married individuals tended to spend more, likely driven by family-oriented shopping.
Businesses may capitalize on this by offering family bundles, discounts on home essentials, or
promotions tied to household needs.
These findings offer valuable insights for businesses looking to optimize their marketing,
inventory, and sales strategies for Diwali, aligning them with consumer preferences to drive
higher engagement and profitability.
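Findings of this kind usually rest on grouped summaries of the Amount column. The sketch below shows one way such summaries could be computed with Pandas. The figures are made up purely to demonstrate the calculation and do not reproduce the report's results, and spend_summary is a hypothetical helper name.

```python
import pandas as pd

# Made-up records for demonstration only; they are not the project data.
orders = pd.DataFrame({
    "Gender": ["F", "F", "M", "M", "F", "M"],
    "Age Group": ["26-35", "18-25", "26-35", "36-45", "26-35", "18-25"],
    "Marital_Status": [1, 0, 1, 0, 1, 0],
    "Amount": [12000, 8000, 15000, 5000, 9000, 7000],
})


def spend_summary(df, by):
    """Return record count, total and mean Amount per group, largest total first."""
    return (
        df.groupby(by)["Amount"]
        .agg(records="count", total="sum", average="mean")
        .sort_values("total", ascending=False)
    )


by_marital = spend_summary(orders, "Marital_Status")
by_age = spend_summary(orders, "Age Group")
print(by_marital)
print(by_age)
```

Comparing the total and average columns side by side helps separate "more customers" from "higher spend per customer", which the count-based bar charts alone cannot show.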
Use of Seaborn:
Seaborn is a Python visualization library for statistical graphics. It provides attractive default
styles and colour palettes that make statistical plots easier to read. It is built on top of the
Matplotlib library and is closely integrated with the data structures from Pandas.
Seaborn aims to make visualization the central part of exploring and understanding data. It
provides dataset-oriented APIs so that we can switch between different visual representations for
the same variables for a better understanding of the dataset.
Different categories of plot in Seaborn
Plots are used to visualize the relationship between variables. Those variables can be
either numerical or categorical, such as a group, class, or division. Seaborn groups its
plots into the following categories:
• Relational plots: Used to understand the relation between two variables.
• Categorical plots: Used to visualize how a quantity varies across the levels of a
categorical variable.
• Distribution plots: Used for examining univariate and bivariate distributions.
• Regression plots: Primarily intended to add a visual guide that helps to emphasize
patterns in a dataset during exploratory data analysis.
• Matrix plots: Used to display matrix-like data as colour-encoded grids, such as heatmaps
and cluster maps.
• Multi-plot grids: A useful approach for drawing multiple instances of the same plot on
different subsets of the dataset.
Use of Matplotlib:
Matplotlib is a powerful plotting library in Python used for creating static, animated, and
interactive visualizations. Matplotlib’s primary purpose is to provide users with the tools and
functionality to represent data graphically, making it easier to analyze and understand. It was
originally developed by John D. Hunter in 2003 and is now maintained by a large community of
developers.
Key Features of Matplotlib:
Versatility: Supports various types of plots like line, scatter, bar, histograms, and pie charts.
Customization: Offers extensive control over plot details like line styles, colors, markers, labels,
and annotations.
NumPy Integration: Works seamlessly with NumPy, allowing easy plotting of data arrays.
Extensibility: Serves as the foundation for libraries such as Seaborn and the plotting methods in
Pandas, and can be extended with toolkits like Basemap for specialized plotting.
Cross-Platform: Runs on multiple operating systems like Windows, macOS, and Linux.
Interactive Plots: Supports interactive features like widgets and event handling for dynamic data
exploration.
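A short sketch of the customization and NumPy integration features listed above: a bar chart with explicit colours, axis labels, a title, and a value annotation above each bar. The category names and counts are placeholders and are not taken from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Placeholder categories and counts, not taken from the dataset.
categories = ["Category A", "Category B", "Category C"]
order_counts = np.array([120, 95, 60])

fig, ax = plt.subplots(figsize=(6, 4))

# Bars are drawn directly from the NumPy array, with custom colours.
bars = ax.bar(categories, order_counts, color="tab:orange", edgecolor="black")

ax.set_xlabel("Product category")
ax.set_ylabel("Number of orders")
ax.set_title("Orders per category (illustrative data)")

# Annotate each bar with its value, placed slightly above the bar top.
for bar in bars:
    count = int(bar.get_height())
    ax.annotate(
        f"{count}",
        xy=(bar.get_x() + bar.get_width() / 2, count),
        xytext=(0, 3),
        textcoords="offset points",
        ha="center",
    )

# Leave headroom so the annotations are not clipped.
ax.set_ylim(0, order_counts.max() * 1.15)
fig.tight_layout()
```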
5. Conclusion:
In conclusion, this report has highlighted the effectiveness of Python's data analysis and
visualization capabilities in uncovering key trends in Diwali sales. By leveraging libraries such
as Pandas, Matplotlib, and Seaborn, we were able to analyze consumer behavior across various
demographic factors including gender, age, occupation, and geographic location. The insights
drawn from this analysis provide actionable recommendations for businesses to tailor their
marketing strategies, optimize inventory, and enhance consumer engagement during the festive
season. This Python-driven approach serves as a valuable tool for data-driven decision-making,
reinforcing its importance in modern business analytics.
References
Dataset(csv): G-Drive
Code: G-Drive