Beginner Certification Project – Olist E-Commerce Analysis with RFM, Pareto, and customer segmentation

hfz1988/Olist-Sales-Analysis

Olist Sales Data Analysis / Analisis Data Penjualan Olist

English

Project Overview

This project analyzes Olist's e-commerce sales data, covering customer behavior, order trends, product performance, payment preferences, and customer segmentation.
The analysis is based on the public Olist dataset and was originally developed as part of my Beginner Data Analyst Certification at GROWIA.

Objectives

  1. Identify key sales trends and seasonal patterns.
  2. Highlight top-performing product categories.
  3. Analyze customer satisfaction using review scores.
  4. Segment customers using RFM and cohort retention analysis.
  5. Provide actionable business recommendations.

Key Insights & Visualizations

1. Monthly Sales Trend

[Figure: Monthly Sales Trend]
Sales increase toward the end of the year, especially in November, likely influenced by seasonal events such as Black Friday or year-end promotions.
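
A monthly trend like this can be computed by bucketing orders into calendar months and summing revenue. The following is a minimal sketch on toy data, not the notebook's actual code; the column names `order_purchase_timestamp` and `payment_value` follow the public Olist CSVs, but their use here is an assumption.

```python
import pandas as pd

# Toy stand-in for the joined orders/payments data (values are made up).
# Column names mirror the public Olist CSVs; this is an assumption.
orders = pd.DataFrame({
    "order_purchase_timestamp": pd.to_datetime([
        "2017-10-03", "2017-10-21", "2017-11-24", "2017-11-24", "2017-12-05",
    ]),
    "payment_value": [100.0, 50.0, 200.0, 120.0, 80.0],
})

# Bucket each order into its calendar month, then sum revenue per month.
monthly_sales = (
    orders
    .assign(month=orders["order_purchase_timestamp"].dt.to_period("M"))
    .groupby("month")["payment_value"]
    .sum()
)

print(monthly_sales)
```

Plotting `monthly_sales` as a line chart then makes a year-end rise easy to spot.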

2. Review Score

[Figure: Review Score]
Most reviews have a score of 5. Low ratings (1–2) are relatively rare but should be addressed to maintain customer satisfaction.

3. Pareto Analysis of Product Sales

[Figure: Pareto Chart]
The Pareto chart shows that a small number of product categories generate most of the revenue. Focusing on these top categories can improve sales efficiency.
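
The cumulative revenue share behind a Pareto chart can be sketched as follows. The revenue figures are invented for illustration, and the 80% cutoff is the conventional Pareto threshold rather than a value taken from the notebook.

```python
import pandas as pd

# Hypothetical revenue per category (illustrative numbers only).
items = pd.DataFrame({
    "category": ["bed_bath_table", "health_beauty", "sports_leisure", "toys", "books"],
    "revenue": [500.0, 250.0, 150.0, 60.0, 40.0],
})

# Sort by revenue, then compute each category's cumulative share of the total.
pareto = items.sort_values("revenue", ascending=False).reset_index(drop=True)
pareto["cum_share"] = pareto["revenue"].cumsum() / pareto["revenue"].sum()

# Smallest set of top categories whose combined share reaches 80%.
n_top = int((pareto["cum_share"] < 0.80).sum()) + 1
top_categories = pareto["category"].head(n_top).tolist()

print(pareto)
print(top_categories)
```

The chart itself is usually a bar plot of `revenue` with `cum_share` drawn as a line on a secondary axis.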

4. Top Product Categories

[Figure: Top Product Categories]
The top categories are bed_bath_table, health_beauty, and sports_leisure. Marketing effort should focus on these.

5. Payment Method Distribution

[Figure: Payment Method Distribution]
Most customers pay by credit card. Boleto and voucher payments make up a relatively small share.

6. Sales by Day of the Week

[Figure: Sales by Day]
Monday and Tuesday have higher transaction volumes than other days, which makes them candidates for targeted campaigns.

7. RFM Segmentation - Clustering

[Figure: RFM Clustering]
Customers are segmented into four clusters, color-coded in the plot:

  • Green: Loyal & high-value customers
  • Orange: New customers with growth potential
  • Red: Passive customers needing reactivation
  • Blue: Customers close to churn
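
Before clustering, each customer needs the three RFM features: recency (days since last purchase), frequency (number of orders), and monetary value (total spend). The sketch below shows one common way to derive them with pandas. The toy data, the column names, and the choice of snapshot date (one day after the latest order) are assumptions, not details confirmed by the notebook.

```python
import pandas as pd

# Toy order history; column names follow the public Olist CSVs (assumption).
orders = pd.DataFrame({
    "customer_unique_id": ["a", "a", "b", "c", "c", "c"],
    "order_id": ["o1", "o2", "o3", "o4", "o5", "o6"],
    "order_purchase_timestamp": pd.to_datetime([
        "2018-01-10", "2018-06-01", "2017-03-15",
        "2018-05-20", "2018-07-01", "2018-08-15",
    ]),
    "payment_value": [120.0, 80.0, 45.0, 300.0, 150.0, 250.0],
})

# Reference date for recency: one day after the most recent order (assumption).
snapshot = orders["order_purchase_timestamp"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_unique_id").agg(
    # Days between the snapshot and the customer's latest purchase.
    recency=("order_purchase_timestamp", lambda ts: (snapshot - ts.max()).days),
    # Number of distinct orders placed.
    frequency=("order_id", "nunique"),
    # Total amount spent.
    monetary=("payment_value", "sum"),
)

print(rfm)
```

Because the three features sit on very different scales, they are typically standardized before being passed to a clustering algorithm.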

8. Cohort Retention Analysis

[Figure: Cohort Retention]
Customer retention drops sharply after the first month, which suggests introducing post-purchase engagement or loyalty programs.
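
A cohort retention table groups customers by the month of their first purchase and tracks what fraction of each cohort orders again in later months. The following is a minimal sketch on toy data; the column names are assumed from the public Olist CSVs and the helper columns (`cohort`, `months_since_first`) are hypothetical.

```python
import pandas as pd

# Toy order history (made-up values); column names are an assumption.
orders = pd.DataFrame({
    "customer_unique_id": ["a", "a", "b", "b", "c", "d"],
    "order_purchase_timestamp": pd.to_datetime([
        "2018-01-05", "2018-02-10", "2018-01-20",
        "2018-03-02", "2018-01-25", "2018-02-14",
    ]),
})

ts = orders["order_purchase_timestamp"]
# Each customer's first purchase date, repeated on every one of their rows.
first = orders.groupby("customer_unique_id")["order_purchase_timestamp"].transform("min")

# Cohort label = month of first purchase; offset = whole months since then.
orders["cohort"] = first.dt.strftime("%Y-%m")
orders["months_since_first"] = (
    (ts.dt.year - first.dt.year) * 12 + (ts.dt.month - first.dt.month)
)

# Distinct active customers per cohort and month offset.
counts = (
    orders.groupby(["cohort", "months_since_first"])["customer_unique_id"]
    .nunique()
    .unstack()
)

# Divide each row by the cohort size (offset 0) to get retention rates.
retention = counts.div(counts[0], axis=0)

print(retention)
```

Cells with no activity come out as NaN; the table is usually rendered as a heatmap, for example with Seaborn.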

9. Elbow Method

[Figure: Elbow Method]
The optimal number of clusters is 4. Beyond this point, adding clusters reduces inertia only marginally.
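
The elbow method runs k-means for a range of cluster counts and looks for the point where inertia (the within-cluster sum of squared distances) stops falling quickly. The sketch below uses a small NumPy-only implementation of Lloyd's algorithm on synthetic data so that it stays within the listed tech stack. The notebook may well use a library such as scikit-learn instead, and the helper `kmeans_inertia` and the synthetic blobs are hypothetical.

```python
import numpy as np


def kmeans_inertia(X, k, n_init=10, n_iter=100, seed=0):
    """Return the lowest inertia found over n_init runs of Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_init):
        # Initialise centers from k distinct data points.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Squared distance from every point to every center, shape (n, k).
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Move each center to the mean of its points (keep it if empty).
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        best = min(best, float(d2.min(axis=1).sum()))
    return best


# Synthetic stand-in for scaled RFM features: four blobs in 3-D (assumption).
rng = np.random.default_rng(42)
blob_centers = np.array([[0, 0, 0], [5, 5, 0], [0, 5, 5], [5, 0, 5]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in blob_centers])

# Inertia for k = 1..8; the "elbow" is where the curve flattens.
inertias = {k: kmeans_inertia(X, k) for k in range(1, 9)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting `inertias` against k gives the elbow curve; on real RFM data the bend is usually less sharp than on clean synthetic blobs, so the choice of k involves some judgment.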

Dataset

Tech Stack

  • Python (Pandas, NumPy, Matplotlib, Seaborn)
  • Jupyter Notebook
  • GitHub for version control

Contact

📧 Email: hfzmustafa07@gmail.com
LinkedIn

