Experiment No: 2
Title: Study of Python Libraries for ML application such as Pandas and Matplotlib, Keras and
TensorFlow
Objective:
• To understand data preprocessing and analysis using Pandas library
• To understand data visualization in the form of 2D graphs and plots using Matplotlib library
Theory/Description:
List important ML libraries
o Python Libraries for Machine Learning
▪ Numpy
▪ Scipy
▪ Scikit-learn
▪ Theano
▪ TensorFlow
▪ Keras
▪ PyTorch
▪ Pandas
▪ Matplotlib
Importance of Pandas library
Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data
structures and data analysis tools for the Python programming language.
Pandas makes importing, analyzing, and visualizing data much easier. It builds on packages like
NumPy and matplotlib to give you a single, convenient, place to do most of your data analysis
and visualization work.
Advantages of Pandas Library
There are many benefits of the Python Pandas library; listing them all would probably take longer than learning the library itself. These, therefore, are the core advantages of using the Pandas library:
1. Data representation
Pandas provides extremely streamlined forms of data representation. This helps to analyze and understand data better. Simpler data representation facilitates better results for data science projects.
2. Less writing and more work done
This is one of the best advantages of Pandas. What would have taken multiple lines of plain Python without any support libraries can be achieved in one or two lines with Pandas.
Thus, using Pandas helps to shorten the procedure of handling data. With the time saved, we can focus more on data analysis algorithms.
3. An extensive set of features
Pandas is really powerful. It provides a huge set of important commands and features for analyzing your data easily. We can use Pandas to perform various tasks such as filtering data according to certain conditions, or segmenting and segregating the data according to preference.
4. Efficiently handles large data
Wes McKinney, the creator of Pandas, designed the library primarily to handle large datasets efficiently. Pandas helps to save a lot of time by importing large amounts of data very fast.
5. Makes data flexible and customizable
Pandas provides a huge feature set to apply to your data so that you can customize, edit, and pivot it according to your needs. This helps to bring the most out of your data.
6. Made for Python
Python has become one of the most sought-after programming languages in the world, with its extensive feature set and the sheer productivity it provides. Being able to use Pandas in Python therefore lets you tap into the power of the many other features and libraries you use with Python, such as NumPy, SciPy, and Matplotlib.
Pandas Library
The primary two components of pandas are the Series and DataFrame.
A Series is essentially a column, and a DataFrame is a multi-dimensional table made up of a
collection of Series.
DataFrames and Series are quite similar in that many operations that you can do with
one you can do with the other, such as filling in null values and calculating the mean.
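As a quick illustration (with made-up names and values), a Series and a DataFrame can be built by hand, and a column extracted from a DataFrame is itself a Series:

```python
import pandas as pd

# A Series is a single labeled column of values.
ages = pd.Series([25, 32, 19], name="age")

# A DataFrame is a table: a collection of Series sharing an index.
people = pd.DataFrame({
    "name": ["Asha", "Ravi", "Mei"],
    "age": ages,
})

print(type(people["age"]))   # each column of a DataFrame is a Series
print(people["age"].mean())  # many operations work on both types
```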
❖ Reading data from CSVs
With CSV files all you need is a single line to load in the data:
df = pd.read_csv('purchases.csv')
df
Let's load in the IMDB movies dataset to begin:
movies_df = pd.read_csv("IMDB-Movie-Data.csv", index_col="Title")
We're loading this dataset from a CSV and designating the movie titles to be our index.
❖ Viewing your data
The first thing to do when opening a new dataset is print out a few rows to keep as a visual
reference. We accomplish this with .head():
movies_df.head()
Another fast and useful attribute is .shape, which outputs just a tuple of (rows, columns):
movies_df.shape
Note that .shape has no parentheses and is a simple tuple of format (rows, columns). So we have 1000 rows and 11 columns in our movies DataFrame.
You'll reach for .shape a lot when cleaning and transforming data. For example, you might filter some rows based on some criteria and then want to know quickly how many rows were removed.
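For instance, a quick sketch with a toy DataFrame (the column name and values are invented here) shows how .shape reveals how many rows a filter removed:

```python
import pandas as pd

df = pd.DataFrame({"rating": [8.1, 5.2, 9.0, 6.7]})
before = df.shape[0]

# keep only highly rated rows
high = df[df["rating"] >= 7.0]
removed = before - high.shape[0]

print(high.shape)  # (2, 1)
print(removed)     # 2
```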
❖ Handling duplicates
This dataset does not have duplicate rows, but it is always important to verify you
aren't aggregating duplicate rows.
To demonstrate, let's simply double up our movies DataFrame by concatenating it with itself (DataFrame.append() was removed in pandas 2.0, so we use pd.concat()):
temp_df = pd.concat([movies_df, movies_df])
temp_df.shape
Out:
(2000, 11)
pd.concat() returns a new copy without affecting the original DataFrame. We capture this copy in temp_df so we aren't working with the real data.
Notice that calling .shape quickly proves our DataFrame rows have doubled.
Now we can try dropping duplicates:
temp_df = temp_df.drop_duplicates()
temp_df.shape
Out:
(1000, 11)
By default, the drop_duplicates() method also returns a copy of your DataFrame, this time with duplicates removed. Calling .shape confirms we're back to the 1000 rows of our original dataset.
It's a little verbose to keep assigning DataFrames to the same variable like in this
example. For this reason, pandas has the inplace keyword argument on many of
its methods. Using inplace=True will modify the DataFrame object in place:
temp_df.drop_duplicates(inplace=True)
Now our temp_df will have the transformed data automatically.
Another important argument for drop_duplicates() is keep, which has three
possible options:
• first: (default) Drop duplicates except for the first occurrence.
• last: Drop duplicates except for the last occurrence.
• False: Drop all duplicates.
Since we didn't define the keep argument in the previous example, it defaulted to first. This means that if two rows are the same, pandas will drop the second row and keep the first row. Using last has the opposite effect: the first row is dropped.
False, on the other hand, drops all duplicates. If two rows are the same, both will be dropped. Watch what happens to temp_df:
temp_df = pd.concat([movies_df, movies_df]) # make a new copy
temp_df.drop_duplicates(inplace=True, keep=False)
temp_df.shape
Out:
(0, 11)
Since all rows were duplicates, keep=False dropped them all, resulting in zero rows being left over. If you're wondering why you would want to do this, one reason is that it allows you to locate all duplicates in your dataset using a conditional selection.
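One way to locate, rather than drop, every duplicated row is the .duplicated() method with keep=False, sketched here on a toy DataFrame (the column names and values are invented):

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["A", "B", "A", "C"],
    "year": [2001, 2002, 2001, 2003],
})

# keep=False marks every member of a duplicate group as True
mask = df.duplicated(keep=False)
print(df[mask])  # both copies of the "A" row
```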
❖ Column cleanup
Many times, datasets will have verbose column names with symbols, upper- and lowercase words, spaces, and typos. To make selecting data by column name easier, we can spend a little time cleaning up their names.
Here's how to print the column names of our dataset:
movies_df.columns
Out:
Index(['Rank', 'Genre', 'Description', 'Director', 'Actors', 'Year',
'Runtime (Minutes)', 'Rating', 'Votes', 'Revenue (Millions)',
'Metascore'],
dtype='object')
Not only does .columns come in handy if you want to rename columns by allowing for simple copy and paste, it's also useful if you need to understand why you are receiving a KeyError when selecting data by column.
We can use the .rename() method to rename certain or all columns via a dict. We
don't want parentheses, so let's rename those:
movies_df.rename(columns={
'Runtime (Minutes)': 'Runtime',
'Revenue (Millions)': 'Revenue_millions'
}, inplace=True)
movies_df.columns
Out:
Index(['Rank', 'Genre', 'Description', 'Director', 'Actors',
'Year', 'Runtime',
'Rating', 'Votes', 'Revenue_millions', 'Metascore'],
dtype='object')
Excellent. But what if we want to lowercase all names? Instead of using .rename()
we could also set a list of names to the columns like so:
movies_df.columns = ['rank', 'genre', 'description', 'director',
'actors', 'year', 'runtime',
'rating', 'votes', 'revenue_millions', 'metascore']
movies_df.columns
Out:
Index(['rank', 'genre', 'description', 'director', 'actors',
'year', 'runtime',
'rating', 'votes', 'revenue_millions', 'metascore'],
dtype='object')
But that's too much work. Instead of just renaming each column manually we can do a
list comprehension:
movies_df.columns = [col.lower() for col in movies_df]
movies_df.columns
Out:
Index(['rank', 'genre', 'description', 'director', 'actors',
'year', 'runtime',
'rating', 'votes', 'revenue_millions', 'metascore'],
dtype='object')
list (and dict) comprehensions come in handy a lot when working with pandas
and data in general.
It's a good idea to lowercase, remove special characters, and replace spaces
with underscores if you'll be working with a dataset for some time.
❖ How to work with missing values
When exploring data, you'll most likely encounter missing or null values, which are essentially placeholders for non-existent values. Most commonly you'll see Python's None or NumPy's np.nan, each of which is handled differently in some situations.
There are two options in dealing with nulls:
1. Get rid of rows or columns with nulls
2. Replace nulls with non-null values, a technique known as imputation
Let's calculate the total number of nulls in each column of our dataset. The first step is to check which cells in our DataFrame are null:
movies_df.isnull()
Notice isnull() returns a DataFrame where each cell is either True or False depending on that
cell's null status.
To count the number of nulls in each column, we use an aggregate function for summing:
movies_df.isnull().sum()
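A minimal sketch of both strategies for dealing with nulls, using a toy column with one missing value (the data is invented):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"revenue": [10.0, np.nan, 30.0]})

print(df.isnull().sum())  # revenue: 1 null cell

# Option 1: drop rows containing nulls
dropped = df.dropna()

# Option 2: impute nulls, here with the column mean
imputed = df.fillna(df["revenue"].mean())
print(imputed["revenue"].tolist())  # [10.0, 20.0, 30.0]
```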
❖ DataFrame slicing, selecting, extracting
Up until now we've focused on some basic summaries of our data. We've learned about simple
column extraction using single brackets, and we imputed null values in a column using fillna().
Below are the other methods of slicing, selecting, and extracting you'll need to use constantly.
It's important to note that, although many methods are the same, DataFrames and Series have different attributes, so you'll need to be sure which type you are working with or else you will receive attribute errors.
Let's look at working with columns first.
By column
You already saw how to extract a column using square brackets like this:
genre_col = movies_df['genre']
type(genre_col)
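Selecting with single brackets returns a Series, while passing a list of column names returns a DataFrame; a small sketch with invented data:

```python
import pandas as pd

movies = pd.DataFrame({"genre": ["Action", "Drama"], "rating": [7.5, 8.2]})

genre_col = movies["genre"]    # single brackets: a Series
genre_df = movies[["genre"]]   # double brackets: a DataFrame

print(type(genre_col).__name__)  # Series
print(type(genre_df).__name__)   # DataFrame
```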
Importance of Matplotlib library
To make necessary statistical inferences, it becomes necessary to visualize your data and
Matplotlib is one such solution for the Python users. It is a very powerful plotting library
useful for those working with Python and NumPy. The most used module of Matplotib is
Pyplot which provides an interface like MATLAB but instead, it uses Python and it is
open source.
❖ General Concepts
A Matplotlib figure can be categorized into several parts as below:
1. Figure: It is the whole figure, which may contain one or more axes (plots). You can think of a Figure as a canvas which contains plots.
2. Axes: It is what we generally think of as a plot. A Figure can contain many Axes. Each Axes contains two (or three, in the case of 3D) Axis objects and has a title, an x-label, and a y-label.
3. Axis: These are the number-line-like objects that take care of generating the graph limits.
4. Artist: Everything one can see on the figure is an artist, such as Text objects, Line2D objects, and collection objects. Most Artists are tied to Axes.
Matplotlib Library
Pyplot is a module of Matplotlib which provides simple functions to add plot
elements like lines, images, text, etc. to the current axes in the current figure.
❖ Make a simple plot
import matplotlib.pyplot as plt
import numpy as np
List of all the methods as they appear:
• plot(x-axis values, y-axis values) — plots a simple line graph with x-axis values against y-axis values
• show() — displays the graph
• title("string") — sets the title of the plot as specified by the string
• xlabel("string") — sets the label for the x-axis as specified by the string
• ylabel("string") — sets the label for the y-axis as specified by the string
• figure() — used to control figure-level attributes
• subplot(nrows, ncols, index) — adds a subplot to the current figure
• suptitle("string") — adds a common title to the figure specified by the string
• subplots(nrows, ncols, figsize) — a convenient way to create subplots in a single call. It returns a tuple of a figure and an array of axes.
• set_title("string") — an axes-level method used to set the title of subplots in a figure
• bar(categorical variables, values, color) — used to create vertical bar graphs
• barh(categorical variables, values, color) — used to create horizontal bar graphs
• legend(loc) — used to make a legend for the graph
• xticks(index, categorical variables) — gets or sets the current tick locations and labels of the x-axis
• pie(values, categorical variables) — used to create a pie chart
• hist(values, number of bins) — used to create a histogram
• xlim(start value, end value) — used to set the limit of values of the x-axis
• ylim(start value, end value) — used to set the limit of values of the y-axis
• scatter(x-axis values, y-axis values) — plots a scatter plot with x-axis values against y-axis values
• axes() — adds an axes to the current figure
• set_xlabel("string") — axes-level method used to set the x-label of the plot specified as a string
• set_ylabel("string") — axes-level method used to set the y-label of the plot specified as a string
• scatter3D(x-axis values, y-axis values) — plots a three-dimensional scatter plot with x-axis values against y-axis values
• plot3D(x-axis values, y-axis values) — plots a three-dimensional line graph with x-axis values against y-axis values
Here we import Matplotlib's Pyplot module and the NumPy library, as most of the data we will be working with will be in the form of arrays.
We pass two arrays as input arguments to Pyplot's plot() method and use the show() method to display the required plot. Note that the first array appears on the x-axis and the second array appears on the y-axis of the plot. Now that our first plot is ready, let us add the title and label the x-axis and y-axis using the methods title(), xlabel() and ylabel() respectively.
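Putting those steps together, a minimal sketch (the values here are chosen arbitrarily):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4])
y = x ** 2

plt.plot(x, y)            # first array on the x-axis, second on the y-axis
plt.title("Simple plot")  # title of the plot
plt.xlabel("x values")    # label for the x-axis
plt.ylabel("y values")    # label for the y-axis
plt.show()
```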
We can also specify the size of the figure using the figure() method, passing a (width, height) tuple in inches to the figsize argument.
With every X and Y argument, you can also pass an optional third argument in the form of a string which indicates the colour and line type of the plot. The default format is 'b-', which means a solid blue line. In the figure below we use 'go', which means green circles. Likewise, we can make many such combinations to format our plot.
We can also plot multiple sets of data by passing in multiple sets of arguments of X and
Y axis in the plot() method as shown.
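Both ideas can be sketched together, assuming arbitrary data; 'go' draws green circles and the second x, y pair adds a second dataset to the same plot:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 5)

# one call can draw several datasets, each with its own format string:
# "go" = green circles, "r--" = red dashed line
plt.plot(x, x, "go", x, x ** 2, "r--")
plt.show()
```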
❖ Multiple plots in one figure:
We can use the subplot() method to add more than one plot in one figure. In the image below, we used this method to separate the two graphs which we plotted on the same axes in the previous example. The subplot() method takes three arguments: nrows, ncols and index. They indicate the number of rows, the number of columns and the index number of the sub-plot. For instance, in our example, we want to create two sub-plots in one figure such that they come in one row and two columns, and hence we pass the arguments (1,2,1) and (1,2,2) to the subplot() method. Note that we have separately used the title() method for both subplots. We use the suptitle() method to make a centralized title for the figure.
If we want our sub-plots in two rows and single column, we can pass arguments
(2,1,1) and (2,1,2)
The above way of creating subplots becomes a bit tedious when we want many subplots in our figure. A more convenient way is to use the subplots() method; notice the extra 's' in the name. This method takes two arguments, nrows and ncols, as the number of rows and number of columns respectively. It creates two objects, a figure and an array of axes, which we store in the variables fig and ax and can use to change figure-level and axes-level attributes respectively. Note that these variable names are chosen arbitrarily.
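A sketch of the subplots() call described above, with invented data and a one-row, two-column layout:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 5)

# one row, two columns: fig holds figure-level attributes,
# ax is an array of two Axes objects
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

ax[0].plot(x, x)
ax[0].set_title("Linear")

ax[1].plot(x, x ** 2)
ax[1].set_title("Square")

fig.suptitle("Two subplots in one figure")
plt.show()
```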
Keras API
Keras is a high-level neural network API, written in Python. It is powerful and easy to use for developing and evaluating deep learning models, and it runs seamlessly on CPU and GPU. Keras can use TensorFlow, Theano, MXNet, and CNTK (Microsoft) as backends.
❖ Why Keras?
➢ Allows easy and fast prototyping
➢ Supports convolutional networks, recurrent networks, and combinations of the two
➢ Provides clear and actionable feedback for user error
➢ Follows best practices for reducing cognitive load
❖ Installation of Keras
⮚ Install Keras in virtualenv:
A. pip3 install keras
⮚ Install Keras from the GitHub source:
A. Clone Keras using git:
i. git clone https://github.com/keras-team/keras.git
B. cd to the Keras folder and run the install command:
i. cd keras
ii. sudo python setup.py install
❖ Creating a Keras Model
➢ Architecture Definition: Number of layers, number of nodes in layers, and the activation function to be used
➢ Compile: Defines the loss function and details about how optimization works
➢ Fit: Trains the model through back propagation and optimization of weights on the input data
➢ Predict: Makes predictions with the trained model
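The steps above can be sketched as follows; the layer sizes and the random data are invented for illustration, and this assumes TensorFlow's bundled Keras:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# toy data: 100 samples, 4 features, binary labels (invented)
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=(100,))

# 1. Architecture definition: layers, nodes, activations
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# 2. Compile: loss function and optimizer
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# 3. Fit: train via back propagation
model.fit(X, y, epochs=2, verbose=0)

# 4. Predict with the trained model
preds = model.predict(X[:5], verbose=0)
print(preds.shape)  # (5, 1)
```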
TensorFlow Library
Tensor: A multidimensional array
Flow: A graph of operations
A popular open-source library for deep learning and machine learning, developed by the Google Brain Team and released in 2015. TensorFlow uses a dataflow graph to represent computation; dataflow is a common programming model for parallel computing. TensorFlow is used mainly for classification, perception, understanding, discovery, prediction, and creation.
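A tiny sketch of the "tensor" part: multidimensional arrays flowing through operations (the values here are arbitrary):

```python
import tensorflow as tf

# a tensor is a multidimensional array
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # identity matrix

# operations on tensors form the computation graph
c = tf.matmul(a, b)  # matrix product; multiplying by identity returns a
print(c.numpy())     # [[1. 2.] [3. 4.]]
```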
❖ Why TensorFlow?
➢ Flexibility (highly efficient C++ implementations of ML operations, with the flexibility to create all sorts of computations)
➢ Parallel Computation (Supports Distributed Computing)
➢ Multiple Environment Friendly (Linux, macOS, iOS, Android, Raspberry Pi, Windows)
➢ Large Community (Popular and growing)
❖ Installation of TensorFlow
➢ TensorFlow 2 packages require a pip version >19.0.
A. pip install --upgrade pip
Conclusion: In this experiment, we learned to use Python libraries for ML applications such as Pandas, Matplotlib, Keras, and TensorFlow.