Diwali Sales Analysis: A Minor Degree (Major) Project Report
Submitted by
SAHIL PATHANIA
(Enrolment No.: H230421)
(Course Code: CS306(b))
DEC, 2024
Department of Computer Science
Career Point University
Hamirpur, Himachal Pradesh
CERTIFICATE
It gives me immense pleasure to certify that the project entitled “Diwali Sales Analysis”
embodies an original work done by “Sahil Pathania,” having examination Roll Number
H230421, under my supervision and is worthy of consideration for the award of the degree of
Master of Computer Application (Department of Computer Science), Career Point University,
Hamirpur, Himachal Pradesh.
Department of Computer Science
Career Point University
Hamirpur, Himachal Pradesh
DECLARATION
I hereby affirm that the work presented in this project “Diwali Sales Analysis” is exclusively
my own and there are no collaborators. It does not contain any work for which a degree/
diploma has been awarded by any other university/institution.
ACKNOWLEDGEMENT
Apart from my own efforts, the success of B. Tech project work depends largely
on the encouragement and guidance of many others. Hard work and dedication are not the
only things required for the completion of my project work; proper guidance and inspiration
are equally important. I take this opportunity to express my deepest and sincere gratitude to
Mr. Anshul Thakur, HOD of CSE Department and my supervisor Ms. Manju Bala, Assistant
Professor, CSE Department, Career Point University, Hamirpur for their sustained enthusiasm,
creative suggestions, motivation, and exemplary guidance throughout the course of my project
work. I am deeply grateful to them for providing me with the necessary facilities and excellent
supervision and guidance to complete this work. My heartfelt thanks are due to all my family
members, friends, classmates, research scholars and juniors who helped me immensely
during my work.
Sahil Pathania
Data Science combines mathematics, computer programming, data mining and advanced
computational technology with domain knowledge to extract useful information from
data. A data scientist uses scientific methods, processes, algorithms and systems to achieve
these goals while benefiting from the latest advancements (e.g. cloud computing, GPUs) in
computer technology. The data presented for a particular problem can be either structured
(e.g. database tables, spreadsheets, or XML files) or unstructured (e.g. human-language
text, images, videos). Datasets can be small or medium-sized, or they
can be very large, running into several gigabytes (Big Data). Hence the term data science
covers a huge range of possibilities, and its scope is growing with every passing day.
Historically speaking, statistical analysis and trend reporting are the oldest applications of
Data Science in business, but today its applications extend far beyond to include advanced
techniques like Hypothesis Testing, Predictive Analytics, Machine Learning, Natural
Language Processing, Computer Vision and several more. In the next section, we
discuss the significance of Diwali for retail sales and why we chose this
topic for our report.
1.2 Background
Diwali, also known as the Festival of Lights, is one of the most significant and widely
celebrated festivals in India and among Indian communities worldwide. It typically falls
between October and November and symbolizes the victory of light over darkness and good
over evil. The festival is marked by various traditions, including lighting oil lamps, decorating
homes, exchanging gifts, and indulging in festive foods. For businesses, Diwali represents a
peak sales period, as consumers engage in extensive shopping for clothing, electronics, home
decor, and gifts.
Economic Significance
The economic impact of Diwali is substantial, with retail sales often experiencing a significant
surge during this period. According to various industry reports, the festive season can account
for a considerable percentage of annual sales for many retailers, particularly in sectors such as
fashion, electronics, and consumer goods. Businesses invest heavily in marketing campaigns,
promotional offers, and inventory management to capitalize on the increased consumer
spending associated with the festival.
Consumer Behaviour
Consumer behaviour during Diwali is influenced by cultural traditions, social norms, and
economic factors. Shoppers often seek to purchase new items for their homes and families,
driven by the belief that buying new goods during the festival brings prosperity and good
fortune. Additionally, the trend of gifting during Diwali encourages consumers to spend more
on products that symbolize love and appreciation. Understanding these behavioural patterns is
crucial for businesses aiming to optimize their sales strategies during this period.
The Need for Data-Driven Insights
In light of these challenges, there is a growing need for businesses to adopt data-driven
approaches to sales analysis during Diwali. By leveraging advanced analytics, businesses can
gain valuable insights into customer behaviour, sales trends, and product performance. This
enables them to make informed decisions regarding inventory management, marketing
strategies, and customer engagement.
The implementation of a comprehensive sales analysis system can help businesses overcome
the challenges associated with the Diwali sales period. By utilizing modern technologies such
as cloud computing, data warehousing, and advanced analytics tools, organizations can
enhance their ability to analyse sales data, optimize inventory, and tailor marketing efforts to
meet customer needs effectively.
1.3 Objective
1. Analyse Sales Trends
Seasonal Patterns: Identify how sales volumes fluctuate during the pre-Diwali, Diwali,
and post-Diwali periods.
Revenue Drivers: Examine key contributors to overall revenue, such as high-demand
categories and product bundles.
Growth Analysis: Compare sales performance with previous years to assess growth or
decline during the festive season.
5. Optimize Future Sales Strategies
Forecasting: Use the insights gained to predict future sales trends for upcoming Diwali
seasons.
Personalization: Develop personalized offers and recommendations for different
customer segments based on their preferences.
Operational Efficiency: Suggest improvements in logistics, inventory management, and
customer support to enhance the overall shopping experience.
1.4.2 Scope
The scope of the Diwali Sales Analysis project includes an in-depth examination of sales
performance, customer behaviour, marketing effectiveness, and operational efficiency during
the Diwali festive season. It involves analysing transactional data to uncover trends in revenue,
profit margins, and product categories that drive sales. The project also encompasses customer
segmentation based on demographics, preferences, and purchasing behaviour to identify
high-value customer groups and their impact on sales.
Marketing and promotional strategies are assessed to evaluate their contribution to overall
sales, focusing on campaigns, discounts, and the effectiveness of various sales channels such
as e-commerce platforms and physical stores. Product performance is analysed to identify
top-selling and underperforming items, along with inventory management insights to improve
stock levels and minimize losses. Additionally, regional sales trends are explored to understand
geographical variations and inform region-specific strategies. The project aims to provide
actionable recommendations for enhancing marketing, inventory, and customer engagement
strategies, ensuring businesses can optimize their operations and maximize profitability during
future Diwali seasons.
CHAPTER 2
SYSTEM ANALYSIS
2.1 Existing System
The existing systems for analyzing Diwali sales data often rely on traditional methods that can
hinder efficiency and accuracy. Data collection is frequently done through manual entry into
spreadsheets or basic point-of-sale (POS) systems, which may not integrate well with advanced
analytics tools. This reliance on spreadsheets can lead to errors and inconsistencies, especially
as data volume increases. While some organizations utilize local databases, they often lack the
scalability and accessibility of cloud-based solutions. Analysis techniques are typically limited
to basic reporting and descriptive analytics, providing only surface-level insights without
deeper exploration of customer behavior or sales trends.
Customer insights are often underutilized, with demographic analysis and segmentation being
rudimentary, which prevents targeted marketing efforts. Additionally, inventory management
tends to be reactive rather than proactive, as many systems do not provide real-time data to
respond to changing customer demands. The siloed nature of data across departments
complicates the ability to gain a holistic view of sales performance, leading to time-consuming
analysis processes that delay decision-making. Overall, these limitations highlight the need for
a more integrated, automated, and analytics-driven approach to enhance the effectiveness of
Diwali sales analysis, enabling businesses to make informed, data-driven decisions that
improve sales performance and customer satisfaction during the festival and beyond.
3. Advanced Analytics and Visualization
Descriptive, Predictive, and Prescriptive Analytics: Utilize advanced analytics
techniques to not only describe past sales performance but also predict future trends
and prescribe actionable strategies based on data insights.
Interactive Dashboards: Develop user-friendly, interactive dashboards using tools like
Tableau, Power BI, or Python libraries (e.g., Dash, Streamlit) to visualize key
performance indicators (KPIs), sales trends, and customer demographics in real time.
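As a rough illustration of the KPIs such a dashboard might display, the following sketch computes a few headline figures with pandas. The sample rows and the helper name compute_kpis are hypothetical. The column names State, Orders and Amount follow those used in the exploratory analysis in Chapter 4, and treating Amount divided by Orders as the average order value is an assumption about how the data is recorded.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the ones used in Chapter 4.
sales = pd.DataFrame({
    "State": ["Uttar Pradesh", "Maharashtra", "Karnataka", "Maharashtra"],
    "Orders": [2, 1, 3, 4],
    "Amount": [2000.0, 1500.0, 900.0, 3600.0],
})


def compute_kpis(df: pd.DataFrame) -> dict:
    """Return a few headline KPIs for a non-empty sales table."""
    total_revenue = df["Amount"].sum()
    total_orders = df["Orders"].sum()
    return {
        "total_revenue": float(total_revenue),
        "total_orders": int(total_orders),
        # Assumption: Amount is the total value of all orders in the row.
        "avg_order_value": float(total_revenue / total_orders),
        # State with the highest summed Amount.
        "top_state": df.groupby("State")["Amount"].sum().idxmax(),
    }


kpis = compute_kpis(sales)
print(kpis)
```

A dashboard tool could call a function of this kind whenever the underlying data is refreshed and display the returned values as summary cards.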
2.3 Hardware Requirements
Processor: Minimum 1 GHz; recommended 2 GHz or more.
Hard Drive: Minimum 32 GB; recommended 64 GB or more.
Memory (RAM): Minimum 1 GB; recommended 4 GB or more.
Operating System: Windows 11 or earlier.
CHAPTER 3
SYSTEM DESIGN
3.1 Module Division
The Diwali sales analysis system can be effectively divided into several key modules, each
serving a distinct purpose to streamline the overall process.
The Data Collection Module is responsible for gathering sales data from various sources,
including Point of Sale (POS) systems for real-time transaction capture, online sales platforms
for e-commerce data, and customer feedback through surveys. This module utilizes APIs and
middleware for integration, ensuring comprehensive data acquisition.
The Data Processing Module focuses on cleaning and preprocessing the collected data to prepare
it for analysis. This involves removing duplicates, correcting errors, and handling missing
values, often using tools like Python's Pandas library or ETL tools. Data transformation is also
a critical function here, normalizing formats and validating data integrity to ensure accuracy.
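The cleaning steps described above could look roughly like the following pandas sketch. The function name clean_sales and the sample rows are hypothetical, and dropping rows whose Amount is missing or non-numeric is only one possible way of handling missing values; imputation would be another.

```python
import pandas as pd


def clean_sales(raw: pd.DataFrame) -> pd.DataFrame:
    """Remove duplicates, normalize formats, and drop rows without a usable Amount."""
    df = raw.drop_duplicates().copy()
    # Coerce non-numeric amounts (e.g. stray text) to NaN so they can be dropped.
    df["Amount"] = pd.to_numeric(df["Amount"], errors="coerce")
    # Normalize text formatting so that ' m' and 'M' count as the same category.
    df["Gender"] = df["Gender"].str.strip().str.upper()
    # Assumption: rows without a valid Amount are discarded rather than imputed.
    df = df.dropna(subset=["Amount"]).astype({"Amount": int})
    return df.reset_index(drop=True)


# Hypothetical raw input with a duplicate row, untidy text, and bad amounts.
raw = pd.DataFrame({
    "User_ID": [1, 1, 2, 3, 4],
    "Gender": ["F", "F", " m", "f ", "M"],
    "Amount": ["2300", "2300", "1500", None, "abc"],
})
cleaned = clean_sales(raw)
print(cleaned)
```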
The Data Analysis Module then takes the processed data to derive actionable insights. It conducts
sales trend analysis to identify patterns over time, evaluates product performance to determine
top sellers, and analyses customer behaviour to understand purchasing habits. This module
leverages statistical analysis tools and machine learning libraries to provide deep insights.
The User Management Module ensures secure access and role management within the system. It
implements user authentication, defines role-based access controls, and tracks user activity for
security and auditing purposes. This module is crucial for maintaining data integrity and
security.
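A minimal sketch of role-based access control with activity tracking is given below, using only the Python standard library. The role names, the permission table, and the user names are hypothetical, and user authentication is deliberately left out; a real system would verify the user's identity before this check.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    ANALYST = "analyst"
    VIEWER = "viewer"


# Hypothetical permission table: which actions each role may perform.
PERMISSIONS = {
    Role.ADMIN: {"view_dashboard", "run_analysis", "manage_users"},
    Role.ANALYST: {"view_dashboard", "run_analysis"},
    Role.VIEWER: {"view_dashboard"},
}

# Each entry records (user, action, allowed) for security auditing.
audit_log = []


def is_allowed(user: str, role: Role, action: str) -> bool:
    """Check a permission and record the attempt in the audit log."""
    allowed = action in PERMISSIONS[role]
    audit_log.append((user, action, allowed))
    return allowed


print(is_allowed("asha", Role.ANALYST, "run_analysis"))
print(is_allowed("ravi", Role.VIEWER, "manage_users"))
```

Keeping the permission table as plain data makes it straightforward to add roles or actions without changing the checking logic.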
The Integration Module facilitates connections with external systems and services, allowing for
API integrations with third-party tools, data export/import functionalities, and notification
services to keep users informed. This module enhances the system's interoperability and
ensures seamless communication with other platforms.
3.2 Proposed Model Diagram
CHAPTER 4
IMPLEMENTATION AND TESTING
Exploratory Data Analysis
Gender
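import matplotlib.pyplot as plt
import seaborn as sns

# df is assumed to be the sales DataFrame loaded earlier with pandas
# plotting a bar chart for Gender and its count
ax = sns.countplot(x='Gender', data=df)
for bars in ax.containers:
    ax.bar_label(bars)  # annotate each bar with its count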
Age
# number of buyers per age group, split by gender
ax = sns.countplot(data=df, x='Age Group', hue='Gender')
for bars in ax.containers:
    ax.bar_label(bars)
# Total Amount vs Age Group
sales_age = (
    df.groupby('Age Group', as_index=False)['Amount']
    .sum()
    .sort_values(by='Amount', ascending=False)
)
sns.barplot(x='Age Group', y='Amount', data=sales_age)
State
# total number of orders from top 10 states
sales_state = (
    df.groupby('State', as_index=False)['Orders']
    .sum()
    .sort_values(by='Orders', ascending=False)
    .head(10)
)
sns.set(rc={'figure.figsize': (15, 5)})
sns.barplot(data=sales_state, x='State', y='Orders')
# total amount/sales from top 10 states
sales_state = (
    df.groupby('State', as_index=False)['Amount']
    .sum()
    .sort_values(by='Amount', ascending=False)
    .head(10)
)
sns.set(rc={'figure.figsize': (15, 5)})
sns.barplot(data=sales_state, x='State', y='Amount')
Marital Status
# set the figure size before plotting so that it applies to this chart
sns.set(rc={'figure.figsize': (7, 5)})
ax = sns.countplot(data=df, x='Marital_Status')
for bars in ax.containers:
    ax.bar_label(bars)
# total amount by marital status and gender
sales_marital = (
    df.groupby(['Marital_Status', 'Gender'], as_index=False)['Amount']
    .sum()
    .sort_values(by='Amount', ascending=False)
)
sns.set(rc={'figure.figsize': (6, 5)})
sns.barplot(data=sales_marital, x='Marital_Status', y='Amount', hue='Gender')
Occupation
sns.set(rc={'figure.figsize': (20, 5)})
ax = sns.countplot(data=df, x='Occupation')
for bars in ax.containers:
    ax.bar_label(bars)
# total amount by occupation
sales_occupation = (
    df.groupby('Occupation', as_index=False)['Amount']
    .sum()
    .sort_values(by='Amount', ascending=False)
)
sns.set(rc={'figure.figsize': (20, 5)})
sns.barplot(data=sales_occupation, x='Occupation', y='Amount')
Product Category
sns.set(rc={'figure.figsize': (20, 5)})
ax = sns.countplot(data=df, x='Product_Category')
for bars in ax.containers:
    ax.bar_label(bars)
# total amount from the top 10 product categories
sales_category = (
    df.groupby('Product_Category', as_index=False)['Amount']
    .sum()
    .sort_values(by='Amount', ascending=False)
    .head(10)
)
sns.set(rc={'figure.figsize': (20, 5)})
sns.barplot(data=sales_category, x='Product_Category', y='Amount')
# top 10 most sold products
sales_product = (
    df.groupby('Product_ID', as_index=False)['Orders']
    .sum()
    .sort_values(by='Orders', ascending=False)
    .head(10)
)
sns.set(rc={'figure.figsize': (20, 5)})
sns.barplot(data=sales_product, x='Product_ID', y='Orders')
# top 10 most sold products (same thing as above, using pandas plotting)
fig1, ax1 = plt.subplots(figsize=(12, 7))
df.groupby('Product_ID')['Orders'].sum().nlargest(10).plot(kind='bar', ax=ax1)
Unit Testing: Developers write test cases for each function or method, using frameworks like
JUnit (for Java) or pytest (for Python).
Integration Testing: Use integration testing frameworks to validate data flow between
modules, such as the Data Collection Module and Data Processing Module.
Functional Testing: Execute test cases based on functional specifications, using manual
testing or automated testing tools like Selenium.
User Acceptance Testing (UAT): Involve actual users in testing the system in a real-world
scenario, gathering feedback on usability and functionality.
Performance Testing: Use performance testing tools like JMeter or LoadRunner to simulate
high traffic and measure response times.
Data Quality Testing: Validate data at various stages, from collection to processing and
reporting.
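To make the unit testing and data quality testing steps above more concrete, the following sketch shows a small validation helper together with two tests written in the style that pytest discovers automatically. The helper name check_data_quality and the specific rules (no duplicate rows, no missing or negative Amount values) are assumptions chosen for illustration, not the project's actual test suite.

```python
import pandas as pd


def check_data_quality(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems found in a sales table."""
    problems = []
    if df.duplicated().any():
        problems.append("duplicate rows")
    if df["Amount"].isna().any():
        problems.append("missing Amount values")
    if (df["Amount"].dropna() < 0).any():
        problems.append("negative Amount values")
    return problems


# Functions named test_* are collected automatically when run under pytest.
def test_clean_table_has_no_problems():
    df = pd.DataFrame({"Orders": [1, 2], "Amount": [500.0, 1200.0]})
    assert check_data_quality(df) == []


def test_problems_are_reported():
    # Rows 0 and 1 are identical, and row 2 has no Amount.
    df = pd.DataFrame({"Orders": [1, 1, 3], "Amount": [500.0, 500.0, None]})
    assert check_data_quality(df) == ["duplicate rows", "missing Amount values"]
```

Saved in a file whose name starts with test_, these functions can be run with the pytest command; the same helper could also be called at each pipeline stage to validate data between collection, processing and reporting.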
CHAPTER 5
RESULTS AND DISCUSSION
5.2.1 Gender: Gender is a critical demographic variable that influences consumer behavior,
preferences, and purchasing decisions. In the context of Diwali sales analysis, understanding
gender dynamics can provide valuable insights into market trends, customer segmentation, and
effective marketing strategies.
Fig. 5.2.1 The graph above shows that most of the buyers are female, and that the purchasing
power of female buyers is greater than that of male buyers.
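An observation of this kind can also be checked numerically rather than read off a chart. The sketch below uses hypothetical rows, not the project's data, and assumes the Gender and Amount columns used in Chapter 4; note that it counts rows, which equals the number of buyers only if each buyer appears once.

```python
import pandas as pd

# Hypothetical rows for illustration only; not the project's dataset.
df = pd.DataFrame({
    "Gender": ["F", "F", "M", "F", "M"],
    "Amount": [1000, 3000, 1500, 2000, 500],
})

# Number of purchase records and total amount spent per gender.
summary = df.groupby("Gender")["Amount"].agg(buyers="count", total_amount="sum")
# Each gender's share of the overall amount spent.
summary["share_of_amount"] = summary["total_amount"] / summary["total_amount"].sum()
print(summary)
```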
5.2.2 Age: Age is a fundamental demographic variable that significantly influences consumer
behavior, preferences, and purchasing decisions. In the context of sales analysis, particularly
during culturally significant events like Diwali, understanding the age distribution of customers
can provide valuable insights into market trends, customer segmentation, and effective
marketing strategies.
Fig. 5.2.2 The graph above shows that most of the buyers are females in the 26-35
age group.
5.2.3 State: States can be defined by various factors, including geographical boundaries,
cultural differences, economic conditions, and demographic characteristics. Each state may
exhibit unique consumer behaviors and preferences influenced by local culture, traditions, and
economic factors. Understanding these differences is essential for businesses aiming to
effectively target and engage their customers.
Fig. 5.2.3 The graphs above show that most of the orders and the highest total sales amount
come from Uttar Pradesh, Maharashtra, and Karnataka, in that order.
5.2.4 Marital Status: Marital status is a significant demographic variable that influences
consumer behavior, preferences, and purchasing decisions. Understanding the marital status of
customers can provide valuable insights for businesses, particularly during key sales periods
such as festivals, holidays, or special occasions. This section explores the implications of
marital status for consumer behavior, market segmentation, and effective marketing strategies.
Fig. 5.2.4 The graph above shows that most of the buyers are married women, and that they
have high purchasing power.
Fig. 5.2.5 The graph above shows that most of the buyers work in the IT, Healthcare
and Aviation sectors.
5.2.6 Product Category: Product categories are essential classifications that group similar
products based on shared characteristics, functions, or target markets. Understanding product
categories is crucial for businesses as they influence consumer behavior, purchasing decisions,
and marketing strategies. This section explores the implications of product categories for
consumer behavior, market segmentation, and effective sales strategies.
Fig. 5.2.6 The graphs above show that most of the products sold are from the Food, Clothing
and Electronics categories.
CHAPTER 6
CONCLUSION AND FUTURE WORK
6.1 Conclusion
The Diwali sales analysis provides critical insights into consumer behaviour, preferences, and
market trends during one of the most significant shopping seasons in many cultures,
particularly in India. This analysis highlights the importance of understanding various
demographic factors—such as gender, age, marital status, occupation, and geographic
location—in shaping purchasing decisions during the festival. Key findings from the analysis
indicate that consumer preferences vary significantly across different demographic segments.
For instance, certain product categories, such as traditional clothing, sweets, and home decor,
tend to attract more attention from specific age groups and marital statuses. Additionally, the
analysis reveals that marketing strategies tailored to resonate with these diverse consumer
segments can lead to increased engagement and higher sales.
The analysis also underscores the impact of cultural and regional factors on consumer
behaviour during Diwali. Understanding local customs, traditions, and economic conditions
allows businesses to optimize their product offerings and promotional strategies, ensuring they
meet the unique needs of their target audience. Furthermore, the Diwali sales analysis
emphasizes the importance of data-driven decision-making. By continuously monitoring sales
performance and consumer trends, businesses can adapt their strategies in real-time, responding
effectively to changing market dynamics and consumer preferences. In summary, the Diwali
sales analysis serves as a valuable tool for businesses looking to enhance their marketing
efforts, improve customer engagement, and drive sales during this festive season. By
leveraging insights gained from the analysis, companies can create targeted campaigns,
optimize their product assortments, and ultimately foster long-term customer loyalty,
positioning themselves for success in a competitive marketplace.