Major Project 2025
Project Title: Daily Transactions
Tools: Visual Studio Code / Jupyter Notebook
Domain: Finance Analyst
Project Difficulty Level: Intermediate
Dataset: The dataset is available at the given link. You can download it at your
convenience.
Click here to download the dataset
About Dataset
The "Daily Transactions" dataset contains information on dummy
transactions made by an individual on a daily basis. The dataset includes
data on the products that were purchased, the amount spent on each
product, the date and time of each transaction, the payment mode of each
transaction, and the source of each record (Expense/Income).
This dataset can be used to analyze purchasing behavior and money
management, forecast expenses, and optimize savings and budgeting
strategies. It is well suited to data analysis and machine learning
applications: it can be used to train predictive models and support
data-driven decisions.
Column Descriptors
● Date: The date and time when the transaction was made
● Mode: The payment mode used for the transaction
● Category: The category to which each transaction is assigned
● Subcategory: A finer breakdown of each category into subcategories
of transactions
● Note: A brief description of the transaction made
● Amount: The transactional amount
● Income/Expense: The indicator of each transaction representing
either expense or income
● Currency: All transactions are recorded in the official currency of India
Example: The outline below gives a basic idea of how you can build the
project.
It sketches a financial analyst project that works with a dataset of daily
transactions. It includes steps to clean the data, perform analysis, and
generate a report, with code examples in Python using popular libraries
such as Pandas, NumPy, Matplotlib, and Seaborn.
1. Project Overview
Objective:
● Analyze daily financial transactions to identify trends, patterns, and
insights.
● Generate a comprehensive report with visualizations.
2. Dataset Description
● Date: Date of the transaction.
● Transaction_ID: Unique identifier for each transaction.
● Account_ID: Unique identifier for the account.
● Category: Category of the transaction (e.g., Sales, Purchase,
Salary).
● Amount: Amount of money involved in the transaction.
● Type: Type of transaction (Credit or Debit).
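Note that the example columns above differ from the Column Descriptors given for the downloadable dataset, which lists Income/Expense rather than Type and has no Transaction_ID or Account_ID. The following is a minimal sketch, assuming the downloaded file uses exactly the header names from the Column Descriptors, that checks the headers and renames Income/Expense to Type so the example code in the steps below can run. The helper name align_columns and the sample row are hypothetical.

```python
import pandas as pd

# Header names taken from the Column Descriptors; the real file may differ.
EXPECTED_COLUMNS = [
    "Date", "Mode", "Category", "Subcategory",
    "Note", "Amount", "Income/Expense", "Currency",
]

def align_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Strip header whitespace, check the expected columns exist, and
    rename 'Income/Expense' to 'Type' to match the example code."""
    out = df.rename(columns=lambda c: str(c).strip())
    missing = [c for c in EXPECTED_COLUMNS if c not in out.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return out.rename(columns={"Income/Expense": "Type"})

# One made-up row, for illustration only (note the stray space in 'Amount ').
sample = pd.DataFrame({
    "Date": ["2024-01-05 10:00:00"],
    "Mode": ["Cash"],
    "Category": ["Food"],
    "Subcategory": ["Snacks"],
    "Note": ["Tea"],
    "Amount ": [30.0],
    "Income/Expense": ["Expense"],
    "Currency": ["INR"],
})
aligned = align_columns(sample)
print(list(aligned.columns))
```

With the real file, the same call would be applied to the DataFrame returned by pd.read_csv.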
3. Steps to Complete the Project
Step 1: Import Libraries and Load Data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv('daily_transactions.csv')

# Display the first few rows of the dataset
df.head()
Step 2: Data Cleaning
● Handle missing values.
● Correct data types.
● Remove duplicates.
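The three cleaning tasks above could be sketched as follows. This is one possible approach, not the only one: it assumes the columns are named Date, Amount, and Category, drops rows whose date or amount cannot be parsed, and fills missing categories with the placeholder "Unknown". The function name clean_transactions and the sample rows are made up for illustration.

```python
import pandas as pd

def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: typed columns, no missing essentials, no duplicates."""
    out = df.copy()

    # Correct data types: values that cannot be parsed become NaT / NaN.
    out["Date"] = pd.to_datetime(out["Date"], errors="coerce")
    out["Amount"] = pd.to_numeric(out["Amount"], errors="coerce")

    # Handle missing values: a row without a date or amount cannot be analyzed.
    out = out.dropna(subset=["Date", "Amount"])
    if "Category" in out.columns:
        out["Category"] = out["Category"].fillna("Unknown")

    # Remove exact duplicate rows.
    return out.drop_duplicates().reset_index(drop=True)

# Made-up rows: one duplicate, one bad date, one missing category.
raw = pd.DataFrame({
    "Date": ["2024-01-05 10:00:00", "2024-01-05 10:00:00",
             "not a date", "2024-02-01 09:30:00"],
    "Category": ["Food", "Food", "Transport", None],
    "Amount": ["250", "250", "40", "1000.5"],
})
cleaned = clean_transactions(raw)

# Verify data types
print(cleaned.dtypes)
```

Whether to drop or impute incomplete rows depends on how many there are; dropping is the simplest choice and is used here only as a starting point.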
Step 3: Exploratory Data Analysis (EDA)
● Summary statistics.
● Distribution of transaction amounts.
● Transaction counts by category and type.
# Verify data types
df.dtypes

# Summary statistics
df.describe()

# Distribution of transaction amounts
plt.figure(figsize=(10, 6))
sns.histplot(df['Amount'], bins=50, kde=True)
plt.title('Distribution of Transaction Amounts')
plt.xlabel('Amount')
plt.ylabel('Frequency')
plt.show()

# Transaction counts by category (bars ordered from most to least frequent)
plt.figure(figsize=(12, 6))
sns.countplot(data=df, x='Category', order=df['Category'].value_counts().index)
plt.title('Transaction Counts by Category')
plt.xlabel('Category')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.show()

# Transaction counts by type
plt.figure(figsize=(10, 6))
sns.countplot(data=df, x='Type')
plt.title('Transaction Counts by Type')
plt.xlabel('Type')
plt.ylabel('Count')
plt.show()
Step 4: Time Series Analysis
● Trend analysis.
● Monthly and daily trends.
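# Make sure Date is a datetime column (resample and .dt require it);
# pass format= or dayfirst= if the file's date layout needs it
df['Date'] = pd.to_datetime(df['Date'])

# Resample data to monthly frequency ('MS' labels each month by its first day).
# Only Amount is summed, so text columns such as Category are not aggregated.
monthly_data = df.resample('MS', on='Date')[['Amount']].sum()
plt.figure(figsize=(14, 7))
plt.plot(monthly_data.index, monthly_data['Amount'], marker='o')
plt.title('Monthly Transaction Amounts')
plt.xlabel('Month')
plt.ylabel('Total Amount')
plt.grid(True)
plt.show()

# Daily trends
daily_data = df.groupby(df['Date'].dt.date)[['Amount']].sum()
plt.figure(figsize=(14, 7))
plt.plot(daily_data.index, daily_data['Amount'], marker='o')
plt.title('Daily Transaction Amounts')
plt.xlabel('Date')
plt.ylabel('Total Amount')
plt.grid(True)
plt.show()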
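Summing Amount across all rows mixes income with expenses, so a trend line of totals can hide whether money is being saved or overspent. A hedged sketch of a monthly split follows. It assumes Date is already a datetime column and that the indicator column is named Income/Expense with the labels "Income" and "Expense"; the exact labels in the file may differ. The function name monthly_net and the sample rows are hypothetical.

```python
import pandas as pd

def monthly_net(df: pd.DataFrame, flag_col: str = "Income/Expense") -> pd.DataFrame:
    """Monthly income, expense, and net (income minus expense) totals."""
    monthly = (
        df.groupby([pd.Grouper(key="Date", freq="MS"), flag_col])["Amount"]
        .sum()
        .unstack(fill_value=0)  # one column per label, 0 where a month has none
    )
    # Guarantee both columns exist even if the data holds only one label.
    for label in ("Income", "Expense"):
        if label not in monthly.columns:
            monthly[label] = 0
    monthly["Net"] = monthly["Income"] - monthly["Expense"]
    return monthly

# Made-up transactions, for illustration only.
tx = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-03", "2024-01-10", "2024-01-20", "2024-02-05"]),
    "Amount": [1000, 200, 300, 400],
    "Income/Expense": ["Income", "Expense", "Expense", "Expense"],
})
net = monthly_net(tx)
print(net)
```

The resulting Net column can be plotted with the same plt.plot calls used for the monthly totals.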
Step 5: Correlation Analysis
● Analyze the correlation between transaction categories and
amounts.
# Create a pivot table for correlation analysis
pivot_table = df.pivot_table(index='Date', columns='Category', values='Amount',
                             aggfunc='sum', fill_value=0)

# Calculate correlation matrix
correlation_matrix = pivot_table.corr()

# Plot correlation heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Heatmap of Transaction Categories')
plt.show()
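Because the Date column records both date and time, pivoting directly on Date produces one row per timestamp, where most categories are zero. One possible variation, shown here as a sketch, aggregates to calendar days before correlating. It assumes Date is already a datetime column; the function name daily_category_correlation and the sample rows are hypothetical.

```python
import pandas as pd

def daily_category_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Correlation between categories based on per-day spending totals."""
    day = df["Date"].dt.normalize()  # drop the time of day, keep the date
    daily = df.pivot_table(index=day, columns="Category", values="Amount",
                           aggfunc="sum", fill_value=0)
    return daily.corr()

# Made-up transactions: two Food purchases on the first day, one per day after.
tx = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 18:00", "2024-01-01 12:00",
        "2024-01-02 09:00", "2024-01-02 12:00",
        "2024-01-03 09:00", "2024-01-03 12:00",
    ]),
    "Category": ["Food", "Food", "Transport",
                 "Food", "Transport",
                 "Food", "Transport"],
    "Amount": [60, 40, 10, 200, 20, 300, 30],
})
corr = daily_category_correlation(tx)
print(corr)
```

The returned matrix can be passed to sns.heatmap in place of correlation_matrix.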
Step 6: Generate Report
● Summarize findings and visualizations.
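A plain-text summary of the headline figures can be produced directly from the cleaned data. The sketch below is one simple way to do it, assuming Date is a datetime column and that Category and Amount exist; the function name build_report, the report layout, and the sample rows are illustrative choices, not part of the original brief.

```python
import pandas as pd

def build_report(df: pd.DataFrame) -> str:
    """Assemble a short plain-text summary of the transactions."""
    by_category = (
        df.groupby("Category")["Amount"].sum().sort_values(ascending=False)
    )
    lines = [
        "Daily Transactions Report",
        f"Transactions: {len(df)}",
        f"Period: {df['Date'].min():%Y-%m-%d} to {df['Date'].max():%Y-%m-%d}",
        f"Total amount: {df['Amount'].sum():.2f}",
        "Amount by category:",
    ]
    lines += [f"  {cat}: {amt:.2f}" for cat, amt in by_category.items()]
    return "\n".join(lines)

# Made-up transactions, for illustration only.
tx = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-01"]),
    "Category": ["Food", "Transport", "Food"],
    "Amount": [250.0, 40.0, 100.0],
})
report = build_report(tx)
print(report)
```

The string can be written to a file with open(...).write(report) or pasted into the notebook alongside the saved figures.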
4. Report Summary
The financial transactions dataset was analyzed to identify key trends and
insights. The data cleaning process involved handling missing values,
correcting data types, and removing duplicates. Exploratory Data Analysis
(EDA) revealed the distribution of transaction amounts, transaction counts
by category and type, and significant patterns over time. Time series
analysis highlighted monthly and daily transaction trends.
Correlation analysis identified relationships between different transaction
categories.