COMPUTER SCIENCE & ENGINEERING
(Experiment: 1.4)
Student Name: Sarthak Kumar Singh UID: 21BCS11763
Branch: CSE Section/Group: 643-A
Semester: 5th Date of Performance: 25/08/23
Subject Name: AI/ML Subject Code: 21CSH-316
1. Aim: The objective of this experiment is to demonstrate the implementation of Python libraries for
machine learning applications, specifically Pandas and Matplotlib.
2. Objective: The objective of this experiment is to demonstrate the implementation of Python libraries for
machine learning applications, specifically Pandas and Matplotlib.
3. Input/Apparatus Used:
The input for this experiment consists of a dataset in CSV file format. The required apparatus includes a
computer with Python installed and the following libraries: NumPy, Pandas, and Matplotlib.
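As a minimal sketch of how a CSV dataset might be loaded, the snippet below reads a small in-memory CSV with pd.read_csv. The CSV text is a hypothetical stand-in for the experiment's dataset file, which is not named in this report; with a real file, a path would be passed instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the experiment's dataset file.
csv_text = """country,capital,area,population
Brazil,Brasilia,8.516,200.4
India,New Delhi,3.286,1252
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.dtypes)   # column types inferred by Pandas
```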
4. Procedure/Algorithm:
Pandas: Pandas is a powerful library that provides easy-to-use data structures and data analysis tools for Python.
It's particularly useful for handling structured data and performing data preprocessing tasks. Key components
include:
a) DataFrame: A two-dimensional table-like data structure that is central to data manipulation in Pandas.
b) Series: A one-dimensional labeled array capable of holding various data types.
c) Data Cleaning: Pandas offers functions to deal with missing data, duplicate data, and data type conversions.
d) Data Transformation: You can filter, sort, group, aggregate, and pivot data easily.
e) Data Integration: Merging and joining datasets based on common columns.
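The Pandas components listed above can be sketched in one short script. The country and area values reuse the data from this experiment; the duplicated row, the missing value, and the variable names are assumptions added purely for illustration.

```python
import pandas as pd

# DataFrame: illustrative table with one duplicate row and one missing value
# (both added here on purpose to demonstrate cleaning).
df = pd.DataFrame({
    "country": ["Brazil", "Russia", "India", "India", "China"],
    "area": [8.516, 17.10, 3.286, 3.286, None],
})

# Series: selecting a single column returns a one-dimensional labeled array.
areas = df["area"]

# Data cleaning: drop the duplicate row, then drop rows with a missing area.
clean = df.drop_duplicates().dropna(subset=["area"]).reset_index(drop=True)

# Data transformation: filter rows and sort by a column.
large = clean[clean["area"] > 5].sort_values("area", ascending=False)

# Data integration: merge with a second table on the common "country" column.
capitals = pd.DataFrame({
    "country": ["Brazil", "Russia", "India"],
    "capital": ["Brasilia", "Moscow", "New Delhi"],
})
merged = clean.merge(capitals, on="country", how="left")

print(large)
print(merged)
```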
Matplotlib: Matplotlib is a widely used library for creating static, interactive, and animated visualizations in
Python. It provides a high degree of customization and control over the appearance of plots. Key features include:
a) Line Plots: Creating simple and complex line plots to visualize trends and relationships.
b) Scatter Plots: Visualizing individual data points and correlations.
c) Bar Plots: Representing categorical data with bars, suitable for comparisons.
d) Histograms: Displaying the distribution of a continuous variable.
e) Box Plots: Showing the summary statistics and distribution of a dataset.
f) Heatmaps: Displaying matrix-like data with color-coded values.
g) Annotations and Labels: Adding text, titles, legends, and annotations to enhance plot understanding.
h) Subplots: Organizing multiple plots within a single figure.
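Several of the plot types listed above can be combined in one figure using subplots. The following is a rough sketch with made-up sample values; the "Agg" backend and the in-memory PNG buffer are assumptions chosen so the script runs without a display, and plt.show() could be used instead in an interactive session.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display window needed
import matplotlib.pyplot as plt

# Made-up sample values for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

# Subplots: a 2x2 grid of axes inside a single figure.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, y, label="trend")      # line plot
axes[0, 0].set_title("Line Plot")
axes[0, 0].legend()                       # legend built from the label above

axes[0, 1].scatter(x, y)                  # scatter plot
axes[0, 1].set_title("Scatter Plot")

axes[1, 0].bar(["A", "B", "C"], [3, 7, 5])  # bar plot of categorical data
axes[1, 0].set_title("Bar Plot")

axes[1, 1].hist([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5)  # histogram
axes[1, 1].set_title("Histogram")

fig.tight_layout()

# Render the figure to an in-memory PNG instead of opening a window.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```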
5. Code:
(a)
import numpy as np

# Create a one-dimensional NumPy array and print it
a = np.array([5, 8, 12])
print(a)
(b)
import pandas as pd

# Each dictionary key becomes a column of the DataFrame
data = {
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    "population": [200.4, 143.5, 1252, 1357, 52.98],
}
data_table = pd.DataFrame(data)
print(data_table)
(c)
import matplotlib.pyplot as plt

# Plot the values 1, 2, 3 against their default indices 0, 1, 2
plt.plot([1, 2, 3])
plt.title("Line Plot")
plt.show()
6. Output:
(a)
(b)
(c)
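7. Learning Outcomes:
• Learn to build and print a Pandas DataFrame from a Python dictionary.
• Learn to draw and title a basic line plot with Matplotlib.
• Learn to import Python libraries and create a NumPy array.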