Python NumPy for Beginners

NumPy is a Python library that provides multidimensional array and matrix objects, along with tools to work with these arrays. The core data structure in NumPy is the ndarray, which represents a multidimensional, homogeneous array of fixed-size items. NumPy arrays have many capabilities for efficient mathematical and other operations on large data sets.

Uploaded by

Shree Shak

INTRODUCTION TO NUMPY
By: Aniruddh Kadam
Reg. no.: 12109237
Lovely Professional University
What is NumPy?

The NumPy library is the core library for scientific computing in Python.

It provides a high-performance multidimensional array object and tools for working with these arrays.

The key to NumPy is the ndarray object, an n-dimensional array of homogeneous data types, with many operations being performed in compiled code for performance.
What is NumPy?
There are several important differences between NumPy arrays and the standard Python sequences:
NumPy arrays have a fixed size. Modifying the size means creating a new array.

The elements of a NumPy array must all be of the same data type, but this can include Python objects.

NumPy arrays support more efficient mathematical operations than built-in sequence types.
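The three differences listed above can be checked with a short sketch. The variable names are illustrative and not taken from the slides.

```python
import numpy as np

# A Python list grows in place; a NumPy array has a fixed size.
values = [1, 2, 3]
values.append(4)                  # same list object, now four items

arr = np.array([1, 2, 3])
bigger = np.append(arr, 4)        # returns a NEW array; arr is unchanged
print(arr.size, bigger.size)      # 3 4

# All elements share one dtype; mixed numeric input is upcast.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                # float64

# Arithmetic applies to every element without an explicit loop.
doubled_list = [2 * v for v in values]
doubled_arr = 2 * arr
print(doubled_list)               # [2, 4, 6, 8]
print(doubled_arr)                # [2 4 6]
```

Note that np.append copies the data into a new array each time, which is why repeatedly growing an array is expensive.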


import numpy as np

A = [[1, 2, 3], [4, 5, 6]]
print(A)          # Output: [[1, 2, 3], [4, 5, 6]]
type(A)           # Output: list

A = np.array(A)
print(A)          # Output: [[1 2 3]
                  #          [4 5 6]]
type(A)           # Output: numpy.ndarray
print(np.ndim(A))     # Ans: 2

print(np.shape(A))    # Ans: (2, 3)

rows = np.shape(A)[0]
columns = np.shape(A)[1]
print("number of rows = ",rows)
print("number of columns = ", columns)

Output:
number of rows = 2
number of columns = 3
import numpy as np
a = np.arange(15).reshape(3, 5)
print(a)               # [[ 0  1  2  3  4]
                       #  [ 5  6  7  8  9]
                       #  [10 11 12 13 14]]
print(a.ndim)          # 2
print(a.shape)         # (3, 5)
print(a.dtype.name)    # int64
print(a.itemsize)      # 8
print(a.size)          # 15

b = np.array([6, 7, 8])
print(b)               # [6 7 8]
type(b)                # numpy.ndarray
Array Creation

a = np.array([2, 3, 4])
print(a)       # [2 3 4]
a.dtype        # dtype('int64')
b = np.array([1.2, 3.5, 5.1])
print(b)       # [1.2 3.5 5.1]
b.dtype        # dtype('float64')

array transforms sequences of sequences into two-dimensional arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.

b = np.array([(1.5, 2, 3), (4, 5, 6)])
print(b)       # [[1.5 2.  3. ]
               #  [4.  5.  6. ]]
The type of the array can also be explicitly specified at creation time:

c = np.array([[1, 2], [3, 4]], dtype=complex)
print(c)       # [[1.+0.j 2.+0.j]
               #  [3.+0.j 4.+0.j]]

c = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(c)       # [ 1  2  3  4  5  6  7  8  9 10]
D = np.reshape(c, (2, 5))
print(D)       # [[ 1  2  3  4  5]
               #  [ 6  7  8  9 10]]
Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.
The function zeros creates an array full of zeros, the function ones creates an array full of ones, and the function empty creates an array whose initial content is arbitrary and depends on the state of the memory. By default, the dtype of the created array is float64.
print(np.zeros((3, 4)))
# [[0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]]

print(np.ones((2, 3, 4), dtype=np.int16))
# [[[1 1 1 1]
#   [1 1 1 1]
#   [1 1 1 1]]
#
#  [[1 1 1 1]
#   [1 1 1 1]
#   [1 1 1 1]]]

print(np.empty((2, 3)))    # uninitialized, so the values vary from run to run
# [[1.39069238e-309 1.39069238e-309 1.39069238e-309]
#  [1.39069238e-309 1.39069238e-309 1.39069238e-309]]
To create sequences of numbers, NumPy provides a function analogous to range that returns arrays instead of lists.
np.arange(10, 30, 5)     # array([10, 15, 20, 25])

np.arange(0, 2, 0.3)     # array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

When arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the finite floating point precision. For this reason, it is usually better to use the function linspace, which receives as an argument the number of elements that we want.

from numpy import pi

print(np.linspace(0, 2, 9))
x = np.linspace(0, 2*pi, 100)
print(x)
f = np.sin(x)
print(f)
Ans: ?
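To make the contrast between arange and linspace concrete, here is a minimal sketch. The names stepped and spaced are illustrative only.

```python
import numpy as np

# With a float step, the number of elements follows from rounding, so the
# length of an arange result is hard to predict in general.
stepped = np.arange(0, 2, 0.3)
print(len(stepped), stepped)

# linspace takes the number of points directly; both endpoints are
# included by default.
spaced = np.linspace(0, 2, 9)
print(len(spaced), spaced)
```

With linspace the spacing is derived from the requested count (here 0.25), so the length never depends on floating point rounding.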
Basic Operations
Arithmetic operators on arrays apply elementwise.
A new array is created and filled with the result.

a = np.array([20, 30, 40, 50])
print(a)
b = np.arange(4)
print(b)
c = a - b
print(c)

print(b**2)

print(10*np.sin(a))

print(a < 35)
Ans: ?
Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays.
The matrix product can be performed using the @ operator (in Python >= 3.5) or the dot function or method.
A = np.array([[1, 1],
              [0, 1]])
print(A)
B = np.array([[2, 0],
              [3, 4]])
print(B)
print("The Element wise product")
print(A * B)

print("The Matrix Product")
print(A @ B)

print("The Matrix Product using dot function")
print(A.dot(B))
Ans: ?
a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)
print("The 1D", a)
print("The 2D", b)
print("The 3D", c)
d = np.arange(10, 25, 5)
print(d)
e = np.full((2, 2), 7)
print("The full array")
print(e)
f = np.eye(3)
print("The 3*3 identity matrix")
print(f)
print("the random array")
print(np.random.random((2, 2)))
print("The subtraction of a & b:")
print(np.subtract(a, b))
Ans: ?

Similarly try
np.add(b, a)
np.divide(a, b)
np.multiply(a, b)
np.exp(b)
np.sqrt(b)
np.sin(a)
np.cos(b)

Also try comparison operations
a == b
a < 2
np.array_equal(a, b)
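Aggregate Functions

print(b.sum())
print(np.sum(b))

Similarly try
a.sum()
a.min()
b.max(axis=0)
b.cumsum(axis=1)
a.mean()
np.median(b)      # median is a NumPy function, not an array method
np.corrcoef(a)    # corrcoef is a NumPy function, not an array method
np.std(b)
Ans: ?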
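The effect of the axis argument is easiest to see on a small 2D array. The sketch below reuses the array b from the earlier slides; the values in the comments follow from simple addition.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

print(b.sum())             # 21.5, the sum over every element
print(b.sum(axis=0))       # column sums: 5.5, 7.0, 9.0
print(b.sum(axis=1))       # row sums: 6.5, 15.0
print(b.max(axis=0))       # column maxima: 4.0, 5.0, 6.0
print(b.cumsum(axis=1))    # running totals along each row
print(np.median(b))        # 3.5
```

axis=0 collapses the rows (one result per column), while axis=1 collapses the columns (one result per row).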
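Copying Arrays
h = a.view()      # a view of the array that shares the same data
print(h)

C = np.copy(b)    # a copy of the array
print(C)

h = a.copy()      # a copy of the array with its own data
print(h)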
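The practical difference between a view and a copy shows up once the original array is modified. A minimal sketch, with the illustrative names h_view and h_copy:

```python
import numpy as np

a = np.array([1, 2, 3])

h_view = a.view()    # new array object that shares a's data
h_copy = a.copy()    # independent array with its own data

a[0] = 99            # modify the original in place

print(h_view)        # the view sees the change: 99, 2, 3
print(h_copy)        # the copy does not: 1, 2, 3
print(h_view.base is a, h_copy.base is None)    # True True
```

The base attribute records which array owns the memory, which is a quick way to tell a view from a copy.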

Sorting Arrays
b = np.array([5, 7, 2, 4, 1, 9, 6, 0])
print(b)
print(np.sort(b))

a.sort()
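The two sorting calls above behave differently: np.sort returns a sorted copy, while the sort method reorders the array in place. A short sketch (sorted_copy is an illustrative name):

```python
import numpy as np

b = np.array([5, 7, 2, 4, 1, 9, 6, 0])

sorted_copy = np.sort(b)    # returns a sorted copy
print(b)                    # original order is untouched
print(sorted_copy)

result = b.sort()           # sorts b itself and returns None
print(b)
print(result)               # None
```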
Subsetting
a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

a[2]       # 3: the element at index 2
b[1, 2]    # 6.0: the element at row 1, column 2
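Slicing
a[0:2]       # array([1, 2]): items at index 0 and 1
a[::-1]      # array([3, 2, 1]): the array reversed
b[0:2, 1]    # array([2., 5.]): items in rows 0 and 1, column 1
b[:1]        # array([[1.5, 2. , 3. ]]): all items in row 0
b[:2]        # Ans: ?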
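Indexing
a[a < 2]                            # array([1]): elements less than 2
b[[1, 0, 1, 0], [0, 1, 2, 0]]       # array([4. , 2. , 6. , 1.5]): elements (1,0), (0,1), (1,2) and (0,0)
b[[1, 0, 1, 0]][:, [0, 1, 2, 0]]    # a subset of the matrix's rows and columns:
# array([[4. , 5. , 6. , 4. ],
#        [1.5, 2. , 3. , 1.5],
#        [4. , 5. , 6. , 4. ],
#        [1.5, 2. , 3. , 1.5]])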
Transposing
i = np.transpose(b)
print(i)
i.T

Changing Array Shape

b.ravel()           # Flatten the array
g = a - b           # [[-0.5  0.   0. ]
                    #  [-3.  -3.  -3. ]]
g.reshape(3, -1)    # Reshape, but don't change data; -1 lets NumPy infer that dimension

After reshape:
array([[-0.5,  0. ],
       [ 0. , -3. ],
       [-3. , -3. ]])
INTRODUCTION TO PANDAS
A LIBRARY FOR DATA MANIPULATION AND ANALYSIS USING POWERFUL DATA STRUCTURES
TYPES OF DATA STRUCTURE IN PANDAS
Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.
SERIES
• Series is a one-dimensional array-like structure with homogeneous data. For example, the following series is a collection of integers 10, 23, 56, …
10 23 56 17 52 61 73 90 26 72
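As a minimal sketch, the series of integers shown on this slide can be built with pd.Series. The variable name s is illustrative.

```python
import pandas as pd

s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

print(s.dtype)    # one dtype for all values (an integer type here)
print(s.index)    # default labels 0 to 9
print(s[2])       # 56, the value stored under label 2
```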
DataFrame
• DataFrame is a two-dimensional array with heterogeneous data. For example,
Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   4.6
Vin     45    Male     3.9
Katie   38    Female   2.78

Data Type of Columns
Column   Type
Name     String
Age      Integer
Gender   String
Rating   Float
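The table above can be reproduced as a DataFrame built from a dict of columns. This is only a sketch of one possible construction; printing dtypes shows that each column keeps its own type.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Steve", "Lia", "Vin", "Katie"],
    "Age": [32, 28, 45, 38],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Rating": [3.45, 4.6, 3.9, 2.78],
})

print(df)
print(df.dtypes)    # Age is an integer column, Rating a float column
```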
PANEL
• Panel is a three-dimensional data structure with heterogeneous data. It is hard to represent the panel graphically, but a panel can be illustrated as a container of DataFrames. Note that Panel was deprecated and has been removed from recent pandas releases.
DataFrame
• A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
• Features of DataFrame
• Columns can potentially be of different types
• Size-mutable
• Labeled axes (rows and columns)
• Can perform arithmetic operations on rows and columns
Structure
pandas.DataFrame
pandas.DataFrame(data, index, columns, dtype, copy)
• data
• data takes various forms like ndarray, Series, map, lists, dict, constants and also another DataFrame.
• index
• The row labels to use for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
• columns
• The column labels to use for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
• dtype
• Data type of each column.
• copy
• Specifies whether data is copied from the input. The default given for this parameter is False.
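A minimal sketch of how the index, columns and dtype parameters affect the result. The labels r1 to r3 and x, y are made up for illustration.

```python
import numpy as np
import pandas as pd

data = np.array([[1, 2], [3, 4], [5, 6]])

# No index or columns passed: both default to 0, 1, 2, ...
default_labels = pd.DataFrame(data)
print(default_labels)

# Explicit row labels, column labels and dtype.
labelled = pd.DataFrame(
    data,
    index=["r1", "r2", "r3"],
    columns=["x", "y"],
    dtype=float,
)
print(labelled)
```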
• Create DataFrame
• A pandas DataFrame can be created using various inputs like −
• Lists
• dict
• Series
• Numpy ndarrays
• Another DataFrame
Example
import pandas as pd
data = [ ]
df = pd.DataFrame(data)
print(df)
Example 2
import pandas as pd
data = {'Name': [' ', ' '], 'Age': [ ]}
df = pd.DataFrame(data)
print(df)
Example
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)    # keys missing from a dict become NaN
print(df)

import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print(df)
The following example shows how to create a DataFrame with a list of dictionaries, row indices, and column indices.
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
# With two column indices, values same as dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
# With two column indices, one of them not a dictionary key (that column is filled with NaN)
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
Create a DataFrame from Dict of Series
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)    # rows are the union of both indexes; 'one' is NaN for 'd'
print(df)
Column Addition
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with column label by passing new series
print("Adding a new column by passing as Series:")
df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(df)
print("Adding a new column using the existing columns in DataFrame:")
df['four'] = df['one'] + df['three']
print(df)
Column Deletion

# Using the previous DataFrame, we will delete a column
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
     'three': pd.Series([10, 20, 30], index=['a', 'b', 'c'])}
df = pd.DataFrame(d)
print("Our dataframe is:")
print(df)
# using del function
print("Deleting the first column using DEL function:")
del df['one']
print(df)
# using pop function
print("Deleting another column using POP function:")
df.pop('two')
print(df)
Slicing in Python

import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df[2:4])    # rows at positions 2 and 3
Addition of rows
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = pd.concat([df, df2])    # DataFrame.append was removed in pandas 2.0
print(df)

Deletion of rows
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = df.drop(0)    # drops every row labeled 0
print(df)
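The row examples above do not show how df was created. The sketch below assumes a starting frame with the same columns a and b (that starting frame is an assumption, not from the slides) and uses pd.concat, since DataFrame.append is no longer available in pandas 2.0 and later.

```python
import pandas as pd

# Assumed starting frame; the slides do not define it.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Addition of rows: stack df2 under df. The row labels repeat (0, 1, 0, 1).
df = pd.concat([df, df2])
print(df)

# Deletion of rows: drop removes EVERY row whose label is 0.
df = df.drop(0)
print(df)
```

Because labels are duplicated after concatenation, a single drop(0) removes two rows here. Passing ignore_index=True to pd.concat would renumber the rows instead.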
Reindexing
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])
df1 = df1.reindex_like(df2)    # df1 now has the same row and column labels as df2
print(df1)
Concatenating objects
import pandas as pd
one = pd.DataFrame({'Name': [' '], 'subject_id': [' '], 'marks': [' ']}, index=[])
two = pd.DataFrame({'Name': [' '], 'subject_id': [' '], 'marks': [' ']}, index=[])
print(pd.concat([one, two]))
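The template above leaves the values blank. As a runnable sketch, the version below fills it with made-up names, subject ids and marks (all of them hypothetical) to show what pd.concat returns.

```python
import pandas as pd

# All values below are invented for illustration.
one = pd.DataFrame(
    {"Name": ["Alex", "Amy"], "subject_id": ["sub1", "sub2"], "marks": [98, 90]},
    index=[1, 2],
)
two = pd.DataFrame(
    {"Name": ["Billy", "Brian"], "subject_id": ["sub2", "sub4"], "marks": [89, 80]},
    index=[1, 2],
)

combined = pd.concat([one, two])                         # keeps the original row labels
renumbered = pd.concat([one, two], ignore_index=True)    # relabels rows 0..3

print(combined)
print(renumbered)
```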
INTRODUCTION TO MATPLOTLIB
• Matplotlib is an amazing visualization library in Python for 2D plots of arrays.
• Matplotlib is a multi-platform data visualization library built on NumPy arrays and designed to work with the broader SciPy stack. It was introduced by John Hunter in the year 2002.
• One of the greatest benefits of visualization is that it gives us visual access to huge amounts of data in easily digestible visuals.
• Matplotlib offers several plot types, such as line, bar, scatter, histogram, etc.
Importing matplotlib
• from matplotlib import pyplot as plt or
• import matplotlib.pyplot as plt
Basic plots in Matplotlib:
• Matplotlib comes with a wide variety of plots.
• Plots help us understand trends and patterns, and make correlations.
• They're typically instruments for reasoning about quantitative information.
Line plot:
# importing matplotlib module
from matplotlib import pyplot as plt

# x-axis values
x = [5, 2, 9, 4, 7]

# y-axis values
y = [10, 5, 8, 4, 2]

# function to plot
plt.plot(x, y)

# function to show the plot
plt.show()
Bar plot:
# importing matplotlib module
from matplotlib import pyplot as plt

# x-axis values
x = [5, 2, 9, 4, 7]

# y-axis values
y = [10, 5, 8, 4, 2]

# function to plot the bar
plt.bar(x, y)

# function to show the plot
plt.show()
Histogram:
# importing matplotlib module
from matplotlib import pyplot as plt

# y-axis values
y = [10, 5, 8, 4, 2]

# function to plot histogram
plt.hist(y)

# function to show the plot
plt.show()
Scatter plot:
# importing matplotlib module
from matplotlib import pyplot as plt

# x-axis values
x = [5, 2, 9, 4, 7]

# y-axis values
y = [10, 5, 8, 4, 2]

# function to plot scatter
plt.scatter(x, y)

# function to show the plot
plt.show()
Functional Approach:
Matplotlib allows us to easily create multi-plots on the same figure using the .subplot() method. This .subplot() method takes in three parameters, namely:
• nrows: the number of rows the Figure should have.
• ncols: the number of columns the Figure should have.
• plot_number: which refers to a specific plot in the Figure.
Using .subplot() we will create two plots on the same canvas:
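The original slide showed this step as an image. A minimal sketch of two plots on one canvas might look as follows; the data and colours are made up, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")    # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

plt.figure()

plt.subplot(1, 2, 1)     # 1 row, 2 columns, plot number 1
plt.plot(x, y, "r")

plt.subplot(1, 2, 2)     # 1 row, 2 columns, plot number 2
plt.plot(y, x, "b")

axes_count = len(plt.gcf().axes)
print(axes_count)        # 2
```

In an interactive session, drop the matplotlib.use line and finish with plt.show() to display the figure.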
Object-oriented Interface:
This is generally the more flexible way to create plots.
The idea here is to create Figure objects and call methods off them.
Let's create a blank Figure using the .figure() method.
Next step
• Now we need to add a set of axes to the Figure
• .add_axes()
• It takes four values (left, bottom, width, and height), each given as a fraction of the Figure size
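The two steps above can be sketched as follows. The data, labels and title are illustrative, and the Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")    # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig = plt.figure()                         # blank Figure, no axes yet
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])    # left, bottom, width, height (fractions of the Figure)

ax.plot(x, y)
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("Object-oriented plot")
```

Every later change (labels, limits, title) is made by calling a method on ax, which keeps it clear which plot is being modified.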

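Something interesting: a figure in a figure, made by adding a second set of axes inside the first.
We can also create a matrix of subplots, for example 3*3.
Add plt.tight_layout() to keep the subplots from overlapping.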
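Both ideas can be sketched in a few lines. The axes positions and data are arbitrary choices for illustration, and the Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")    # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Figure in figure: a small set of axes placed inside a larger one.
fig = plt.figure()
outer = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inner = fig.add_axes([0.2, 0.5, 0.3, 0.3])
outer.plot(x, y)
inner.plot(y, x)

# A 3*3 matrix of subplots.
grid_fig, grid_axes = plt.subplots(nrows=3, ncols=3)
for ax in grid_axes.flat:
    ax.plot(x, y)
plt.tight_layout()       # adjusts spacing on the current figure (the grid)
```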

The only difference between plt.figure() and plt.subplots() is that plt.subplots() automatically does what the .add_axes() method of .figure() will do for you, based on the number of rows and columns you specify.
Figure size, aspect ratio, and DPI can be set both in figure() and in subplots().
A Figure can also be saved as a picture via fig.savefig('name.png').
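A minimal sketch of these settings. The sizes are arbitrary; the figure is saved into an in-memory buffer here only to avoid writing a file, whereas passing a file name such as 'name.png' would write to disk. The Agg backend is an assumption made so that the code runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")    # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; dpi is dots per inch.
fig = plt.figure(figsize=(8, 4), dpi=100)
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot([0, 1, 2], [0, 1, 4])

# The same keyword arguments work with subplots().
fig2, axes2 = plt.subplots(nrows=1, ncols=2, figsize=(8, 4), dpi=100)

# Save the first figure as PNG data.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)    # True
```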
Legends
• Legends allow us to distinguish between plots. With legends, you can use label texts to identify or differentiate one plot from another. For example, say we have a figure with two plots:
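The original slide showed this as an image. A minimal sketch with two labelled lines could look like this; the data and label texts are made up, and the Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")    # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()
ax.plot(x, [v ** 2 for v in x], label="x squared")
ax.plot(x, [v ** 3 for v in x], label="x cubed")

legend = ax.legend(loc="best")    # builds the legend from the label texts
labels = [text.get_text() for text in legend.get_texts()]
print(labels)                     # ['x squared', 'x cubed']
```

Only lines that were given a label appear in the legend.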
Plot Types
• HISTOGRAM
• HELPS US UNDERSTAND THE DISTRIBUTION OF NUMERIC VALUES IN A WAY THAT YOU CANNOT WITH MEAN, MEDIAN, OR MODE
Time series (Line Plot)
• is a chart that shows a trend over a period of time.
• It allows you to test various hypotheses under certain conditions, like what happens on different days of the week or at different times of the day.
EXAMPLE
Scatter plots
• offer a convenient way to visualize how two numeric values are related in your data.
• They help in understanding relationships between multiple variables.
• Using the .scatter() method, we can create a scatter plot:
• EXAMPLE
Bar graphs
• are convenient for comparing numeric values of several groups. Using the .bar() method, we can create a bar graph:
WHAT YOU HAVE LEARNED AND WHY
• FOR DATA VISUALIZATION
• WE CAN INTEGRATE THE VISUALIZATION TECHNIQUES INTO OUR ML MODELS
• WE CAN SELL COMPANIES VISUALIZATIONS OF THEIR DATA
• HOWEVER, IF YOU WANT TO BECOME
• >>> DATA ANALYST
• >>>>>>>> MASTER THIS LIBRARY AND YOU WILL GET HIRED AT A PACKAGE OF
• *** 5.5 LPA ***
• PLOTLY, SEABORN AND MATPLOTLIB ARE ALTERNATIVES TO EACH OTHER
