Lab report
Course code: CSE326
Course Title: Data Mining and Machine Learning Lab
Lab report: 01
Topic: Basic Implementation.
Submitted To:
Name: Sadman Sadik Khan
Designation: Lecturer
Department: CSE
Daffodil International University
Submitted By:
Name: Fardus Alam
ID: 222-15-6167
Section: 62-G
Department: CSE
Daffodil International University
Submission Date: 15-03-2025
Code:
from google.colab import drive
drive.mount('/content/drive')
Explanation:
Here I mount (connect) my Google Drive in Google Colab so that files stored on the drive can be read from the notebook.
Code:
import pandas as pd
import numpy as np
data = pd.read_csv('/content/drive/MyDrive/lab dataset data mining/healthcare-dataset-stroke-data2.csv')
Explanation:
Import the pandas and numpy libraries needed for the rest of the code, then read the CSV file from Google Drive into the “data” variable.
Code:
data.head()
Output:
Explanation:
Shows rows from the beginning of the “data” data frame. By default, head() returns the first 5 rows.
Code:
data.tail()
Output:
Explanation:
Shows rows from the end of the “data” data frame. By default, tail() returns the last 5 rows.
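Both head() and tail() also accept an optional row count. The following is a minimal sketch on a small made-up frame; the name toy and its values are assumptions for illustration only and are not taken from the stroke data set.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real data set.
toy = pd.DataFrame({
    "age": [67, 61, 80, 49, 79, 81, 74],
    "bmi": [36.6, 28.1, 32.5, 34.4, 24.0, 29.0, 27.4],
})

first_three = toy.head(3)  # first 3 rows instead of the default 5
last_two = toy.tail(2)     # last 2 rows instead of the default 5

print(first_three)
print(last_two)
```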
Code:
data.loc[4]
Output:
Explanation:
loc is a label-based indexer. Here it selects the single row whose index label is 4.
Code:
data.loc[10:15]
Output:
Explanation:
Slices the rows labelled 10 to 15. Unlike ordinary Python slicing, loc includes the end label, so row 15 is part of the result.
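The two loc lookups above can be tried on a small frame to see what each one returns. This is only a sketch; the frame toy and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame with the default integer labels 0..5.
toy = pd.DataFrame({"age": [10, 20, 30, 40, 50, 60]})

row = toy.loc[4]      # a single label returns that row as a Series
block = toy.loc[2:4]  # a label slice returns rows 2, 3 and 4 (end label included)

print(type(row).__name__)
print(block)
```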
Code:
data.loc[0:4,'age']
Output:
Explanation:
Slices rows 0 to 4 (4 included) and keeps only the “age” column.
Code:
data.loc[20:24,['gender','bmi']]
Output:
Explanation:
Slices rows 20 to 24 (24 included) and shows only the two columns “gender” and “bmi”.
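One detail worth noting about the two column selections above: passing a single column name to loc gives back a Series, while passing a list of names gives back a DataFrame. A small sketch, using a made-up frame toy whose values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical toy frame with the same column names as the lab data set.
toy = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male"],
    "age": [67, 61, 80, 49],
    "bmi": [36.6, 28.1, 32.5, 34.4],
})

ages = toy.loc[0:2, "age"]                # one column name -> Series
subset = toy.loc[1:3, ["gender", "bmi"]]  # list of names -> DataFrame

print(type(ages).__name__)
print(type(subset).__name__)
```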
Code:
data.iloc[0:4,0:3]
Output:
Explanation:
iloc is an integer-position-based indexer, and it excludes the end index when slicing. Here I slice rows 0 to 4 (4 excluded) and columns 0 to 3 (3 excluded), which gives rows 0 to 3 and columns 0 to 2.
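The difference between loc and iloc is easiest to see when the index labels do not start at 0. The sketch below uses a made-up frame toy with labels 10 to 14; the frame and its index are assumptions chosen only to make the contrast visible.

```python
import pandas as pd

# Hypothetical toy frame whose labels (10..14) differ from its positions (0..4).
toy = pd.DataFrame({"age": [67, 61, 80, 49, 79]},
                   index=[10, 11, 12, 13, 14])

by_position = toy.iloc[0:2]  # positions 0 and 1; position 2 is excluded
by_label = toy.loc[10:12]    # labels 10, 11 and 12; the end label is included

print(by_position)
print(by_label)
```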
Code:
data.dtypes
Output:
Explanation:
Shows the data type of every column in the “data” data frame.
Code:
data.info()
Output:
Explanation:
The info() method gives an overview of the data set: the non-null count and data type of each column.
Code:
data.isnull().sum()
Output:
Explanation:
Shows the number of null (missing) values in each column. Here “age” has 64 and “bmi” has 201 null values.
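The count works in two steps: isnull() builds a True/False mask, and sum() adds it up column by column, treating True as 1. A minimal sketch on a made-up frame (toy and its values are assumptions, not the lab data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with a few missing values.
toy = pd.DataFrame({
    "age": [67.0, np.nan, 80.0, 49.0],
    "bmi": [36.6, np.nan, np.nan, 34.4],
    "gender": ["Male", "Female", "Female", "Male"],
})

missing_mask = toy.isnull()              # True wherever a value is missing
missing_per_column = missing_mask.sum()  # each True counts as 1

print(missing_per_column)
```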
Code:
print(f'bmi column maximum value = {max(data.bmi)}')
print(f'age column minimum value = {min(data.age)}')
Output:
Explanation:
Shows the maximum value of the “bmi” column and the minimum value of the “age” column.
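Because both columns contain missing values, it may be worth knowing that Python's built-in max() and min() are not NaN-aware: every comparison with NaN is False, so the result can depend on where the NaN sits in the column. The Series methods max() and min() skip missing values by default. The sketch below is an alternative on a made-up Series, not the code used in this lab, and its values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a missing value in the first position.
bmi = pd.Series([np.nan, 36.6, 28.1, 32.5])

builtin_max = max(bmi)  # built-in max: the leading NaN is never replaced
series_max = bmi.max()  # pandas method: NaN is skipped by default
series_min = bmi.min()

print(builtin_max)
print(series_max, series_min)
```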
Code:
pd.unique(data['work_type'])
Output:
Explanation:
Shows the unique values of the “work_type” column.
Code:
data['gender'].value_counts()
Output:
Explanation:
Shows the count of each unique value in the “gender” column.
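To compare the two calls side by side: pd.unique() returns the distinct values in order of first appearance, while value_counts() returns how often each value occurs, most frequent first. A small sketch on a made-up Series (the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical column of categories.
gender = pd.Series(["Male", "Female", "Female", "Male", "Female"])

labels = pd.unique(gender)      # distinct values, in order of first appearance
counts = gender.value_counts()  # occurrences per value, most frequent first

print(list(labels))
print(counts)
```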