PRACTICAL-1
Create a pandas Series from a dictionary of values and from an ndarray.
import pandas as pd
import numpy as np
students={"Ankita":98,"Pooja":73,"Rahul":90}
s=pd.Series(students)
print("Series created from dictionary is:")
print(s)
arr=np.arange(7)
print("array is",arr)
s1=pd.Series(arr)
print("Series created from array is:")
print(s1)
OUTPUT:-
Series created from dictionary is:
Ankita 98
Pooja 73
Rahul 90
dtype: int64
array is [0 1 2 3 4 5 6]
Series created from array is:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
dtype: int32
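The `int32` shown above most likely reflects the default NumPy integer on the machine that produced this output; other machines typically print `int64`. A minimal sketch, assuming a fixed dtype is wanted and using hypothetical letter labels, shows how to pin the dtype and give the Series a custom index:

```python
import numpy as np
import pandas as pd

# Fixing the dtype keeps the result the same on every platform.
arr = np.arange(7, dtype="int64")

# index supplies custom labels; it needs one label per element.
labels = list("abcdefg")   # hypothetical labels chosen for this sketch
s1 = pd.Series(arr, index=labels)
print(s1)
```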
PRACTICAL-2
Given a series, print all the elements that are above the 75th percentile.
import pandas as pd
test=pd.Series([2,4,6,7,9,12,15,18,21,23])
Percentile=test.quantile(.75)
print("75th percentile is",Percentile)
print("Values in series greater than 75th percentile are :")
for x in range(len(test)):
    if test[x] > Percentile:
        print(test[x])
OUTPUT:-
75th percentile is 17.25
Values in series greater than 75th percentile are :
18
21
23
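The same result can be obtained without an explicit loop. The following is a sketch of the boolean-mask approach; the name `above` is introduced here only for illustration:

```python
import pandas as pd

test = pd.Series([2, 4, 6, 7, 9, 12, 15, 18, 21, 23])
percentile = test.quantile(0.75)   # linear interpolation by default

# Comparing the Series with a scalar gives a boolean Series;
# indexing with it keeps only the elements where the mask is True.
above = test[test > percentile]
print(above.tolist())
```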
PRACTICAL-3
Create a Data Frame quarterly sales where each row contains the Item Category, Item Name
and Expenditure.
Group the rows by Category and print the total expenditure per category.
import pandas as pd
quarterly_sales={"ItemCategory":["cold drink","chips","chocolate","chips","cold drink"],
"ItemName":["coke","lays","dairy milk","kurkure","fanta"],
"Expenditure":[35,20,40,20,65]}
df_sales=pd.DataFrame(quarterly_sales)
print(df_sales)
print(df_sales.groupby("ItemCategory")["Expenditure"].sum())
OUTPUT:-
ItemCategory ItemName Expenditure
0 cold drink coke 35
1 chips lays 20
2 chocolate dairy milk 40
3 chips kurkure 20
4 cold drink fanta 65
ItemCategory
chips 40
chocolate 40
cold drink 100
Name: Expenditure, dtype: int64
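If more than one summary per category is needed, `agg` accepts a list of function names. A small sketch on a cut-down copy of the data (the `ItemName` column is left out here for brevity):

```python
import pandas as pd

df_sales = pd.DataFrame({
    "ItemCategory": ["cold drink", "chips", "chocolate", "chips", "cold drink"],
    "Expenditure": [35, 20, 40, 20, 65],
})

# One row per category, one column per aggregate.
summary = df_sales.groupby("ItemCategory")["Expenditure"].agg(["sum", "count"])
print(summary)
```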
PRACTICAL-4
Create a Data Frame for examination results and display the row labels, column labels, data types
of each column and the dimensions.
import pandas as pd
result_data={"English":[70,50,80,82],
"Physics":[80,60,90,62],
"Chemistry":[85,70,95,70],
"Maths":[92,82,95,95],
"IP":[96,92,98,96]}
result_df=pd.DataFrame(result_data,index=('Rohit','Atiksh','Rahul','Pooja'))
print(result_df)
print(result_df.index)
print(result_df.columns)
print(result_df.dtypes)
print(result_df.ndim)
OUTPUT:-
English Physics Chemistry Maths IP
Rohit 70 80 85 92 96
Atiksh 50 60 70 82 92
Rahul 80 90 95 95 98
Pooja 82 62 70 95 96
Index(['Rohit', 'Atiksh', 'Rahul', 'Pooja'], dtype='object')
Index(['English', 'Physics', 'Chemistry', 'Maths', 'IP'], dtype='object')
English int64
Physics int64
Chemistry int64
Maths int64
IP int64
dtype: object
2
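Note that `ndim` reports the number of axes, which is 2 for every DataFrame. If "dimensions" is taken to mean the number of rows and columns, `shape` may be the more useful attribute. A sketch on a two-column subset of the data above:

```python
import pandas as pd

result_data = {"English": [70, 50, 80, 82],
               "Physics": [80, 60, 90, 62]}
result_df = pd.DataFrame(result_data, index=["Rohit", "Atiksh", "Rahul", "Pooja"])

print(result_df.ndim)    # number of axes
print(result_df.shape)   # (rows, columns)
print(result_df.size)    # total number of cells
```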
PRACTICAL-5
Filter out rows based on different criteria such as duplicate rows.
import pandas as pd
students=[('Jack',34,'Sydeny'),
('riti',30,'Delhi'),
('Aadi',16,'NewYork'),
('Riti',30,'Delhi'),
('Riti',30,'Delhi'),
('Riti',30,'Mumbai'),
('Aadi',40,'london'),
('Sachin',30,'Delhi'),
('Riti', 30, 'Delhi') ]
df=pd.DataFrame(students,columns=['Name','age','city'])
print(df)
duplicaterowsdf=df[df.duplicated()]
print("Duplicate rows are :")
print(duplicaterowsdf)
OUTPUT:-
Name age city
0 Jack 34 Sydeny
1 riti 30 Delhi
2 Aadi 16 NewYork
3 Riti 30 Delhi
4 Riti 30 Delhi
5 Riti 30 Mumbai
6 Aadi 40 london
7 Sachin 30 Delhi
8 Riti 30 Delhi
Duplicate rows are :
Name age city
4 Riti 30 Delhi
8 Riti 30 Delhi
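By default `duplicated()` marks only the second and later copies of a row, which is why row 3 is missing from the output above. The sketch below, on a shortened hypothetical list of students, shows three related options: flagging every copy, comparing only some columns, and removing the repeats.

```python
import pandas as pd

students = [("Riti", 30, "Delhi"), ("Riti", 30, "Delhi"),
            ("Riti", 30, "Mumbai"), ("Aadi", 16, "NewYork")]
df = pd.DataFrame(students, columns=["Name", "age", "city"])

# keep=False flags every member of a duplicated group, including the first.
all_copies = df[df.duplicated(keep=False)]

# subset limits the comparison to the listed columns.
same_name_age = df[df.duplicated(subset=["Name", "age"])]

# drop_duplicates returns the frame with repeated rows removed.
unique_rows = df.drop_duplicates()
print(unique_rows)
```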
PRACTICAL-6
Replace all negative values in a data frame with a 0.
import pandas as pd
# named data rather than dict so the built-in dict type is not shadowed
data={
"English":[70,-50,80,82],
"Physics":[80,60,-90,62],
"Chemistry":[85,-70,95,70],
"Maths":[92,82,95,-95],
"IP":[-96,92,98,96]
}
df=pd.DataFrame(data)
print(df)
df[df<0]=0
print(df)
OUTPUT:-
English Physics Chemistry Maths IP
0 70 80 85 92 -96
1 -50 60 -70 82 92
2 80 -90 95 95 98
3 82 62 70 -95 96
English Physics Chemistry Maths IP
0 70 80 85 92 0
1 0 60 0 82 92
2 80 0 95 95 98
3 82 62 70 0 96
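The assignment `df[df<0]=0` changes the frame in place. Where the original values should be kept, a non-mutating alternative can be sketched with `clip` or `where`, shown here on a smaller two-column frame:

```python
import pandas as pd

df = pd.DataFrame({"English": [70, -50, 80], "Physics": [80, 60, -90]})

# clip(lower=0) raises every value below 0 up to 0 and returns a new frame.
clipped = df.clip(lower=0)

# where keeps values that satisfy the condition and substitutes 0 elsewhere.
masked = df.where(df >= 0, 0)

print(clipped)
```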
PRACTICAL-7
Create a Data Frame df and display first 3 rows and last 3 rows of the data frame df.
import pandas as pd
# named data rather than dict so the built-in dict type is not shadowed
data={
'A':[70,50,80,82,56,67,67],
'B':[80,60,90,62,67,56,45],
'C':[85,70,95,70,34,34,23],
'D':[92,82,95,95,56,45,34],
'E':[96,92,98,96,34,55,34],
'F': [80,60,90,62,34,23,45],
'G': [85,70,95,70,56,34,34]
}
df=pd.DataFrame(data)
print(df)
print(df.head(3))
print(df.tail(3))
OUTPUT:-
A B C D E F G
0 70 80 85 92 96 80 85
1 50 60 70 82 92 60 70
2 80 90 95 95 98 90 95
3 82 62 70 95 96 62 70
4 56 67 34 56 34 34 56
5 67 56 34 45 55 23 34
6 67 45 23 34 34 45 34
A B C D E F G
0 70 80 85 92 96 80 85
1 50 60 70 82 92 60 70
2 80 90 95 95 98 90 95
A B C D E F G
4 56 67 34 56 34 34 56
5 67 56 34 45 55 23 34
6 67 45 23 34 34 45 34
PRACTICAL-8
Create a data frame named aid that stores the aid given by different states such as
Andhra, Odisha, M.P and U.P. The aid contains the items toys, books, uniform and shoes. Also display
the aid for the states Andhra and Odisha for books and uniform.
import pandas as pd
D={"Toys":[7916,8508,7226,7617],
"Books":[6189,8208,6149,6157],
"Uniform":[610,508,611,457],
"Shoes":[8810,6798,9611,6457]}
aid=pd.DataFrame(D,index=["Andhra", "Odisha", "M.P","U.P"])
print(aid)
print(aid.loc['Andhra':'Odisha','Books':'Uniform'])
OUTPUT:-
Toys Books Uniform Shoes
Andhra 7916 6189 610 8810
Odisha 8508 8208 508 6798
M.P 7226 6149 611 9611
U.P 7617 6157 457 6457
Books Uniform
Andhra 6189 610
Odisha 8208 508
PRACTICAL-9
Create a dataframe with Boolean indexes and filter out rows using
Boolean indexes.
import pandas as pd
Days=['Monday','Tuesday','Wednesday','Thursday','Friday']
Classes=[6,7,9,0,5]
D={'Days':Days,'No.of classes':Classes}
Class_df=pd.DataFrame(D,index=[True,False,True,False,True])
print(Class_df)
print(Class_df.loc[True])
OUTPUT:-
Days No.of classes
True Monday 6
False Tuesday 7
True Wednesday 9
False Thursday 0
True Friday 5
Days No.of classes
True Monday 6
True Wednesday 9
True Friday 5
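Boolean filtering is more often done with a mask computed from the data than with Boolean row labels. A sketch, assuming the goal is to keep the days with more than 5 classes (a threshold chosen only for illustration):

```python
import pandas as pd

class_df = pd.DataFrame({
    "Days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "No.of classes": [6, 7, 9, 0, 5],
})

# The comparison yields one True/False per row; .loc keeps the True rows.
busy = class_df.loc[class_df["No.of classes"] > 5]
print(busy)
```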
PRACTICAL-10
Create a dataframe df and apply the rename, del and drop
operations on the dataframe.
import pandas as pd
D={"Rollno.":[101,102,103,104],
"Name":['Komal','Apoorva','Rohit','Poonam'],
"Marks":[97,98,99,87]
}
df_top=pd.DataFrame(D,index=["SecA", "SecB", "SecC","SecD"])
print(df_top)
df_new=df_top.rename(index={"SecA":'A',"SecB":'B',"SecC":'C',"SecD":'D'})
print(df_new)
del df_new['Marks']
print (df_new)
df=df_new.drop(['A','B'],axis=0)
print(df)
OUTPUT:-
Rollno. Name Marks
SecA 101 Komal 97
SecB 102 Apoorva 98
SecC 103 Rohit 99
SecD 104 Poonam 87
Rollno. Name Marks
A 101 Komal 97
B 102 Apoorva 98
C 103 Rohit 99
D 104 Poonam 87
Rollno. Name
A 101 Komal
B 102 Apoorva
C 103 Rohit
D 104 Poonam
Rollno. Name
C 103 Rohit
D 104 Poonam
PRACTICAL-11
Write a program to calculate the total points earned by both teams in each
round.
import pandas as pd
d1={
'p1':[700,975,970,900],
'p2':[490,460,570,590]
}
d2={
'p1':[1100,1275,1270,1400],
'p2':[1400,1260,1500,1190]
}
df1=pd.DataFrame(d1,index=[1,2,3,4])
df2=pd.DataFrame(d2,index=[1,2,3,4])
print("Team1 performance:")
print(df1)
print("team2 performance:")
print(df2)
print("Points earned by both teams")
print(df1+df2)
OUTPUT:-
Team1 performance:
p1 p2
1 700 490
2 975 460
3 970 570
4 900 590
team2 performance:
p1 p2
1 1100 1400
2 1275 1260
3 1270 1500
4 1400 1190
Points earned by both teams
p1 p2
1 1800 1890
2 2250 1720
3 2240 2070
4 2300 1780
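The `+` operator aligns the two frames on their row and column labels, so it works cleanly here because both teams have the same rounds. A sketch with hypothetical, deliberately mismatched data shows what happens otherwise and how `add` with `fill_value` handles it:

```python
import pandas as pd

df1 = pd.DataFrame({"p1": [700, 975]}, index=[1, 2])
df2 = pd.DataFrame({"p1": [1100, 1275, 1270]}, index=[1, 2, 3])

# Round 3 has no partner in df1, so plain addition gives NaN there.
plain = df1 + df2

# fill_value=0 treats the missing entry as 0 before adding.
filled = df1.add(df2, fill_value=0)
print(filled)
```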
PRACTICAL-12
Create a dataframe employees and add a new column Designation in
dataframe employees.
import pandas as pd
D={"Empcode":[101,102,103,104,105],
"Enpname":["Ananya","Rohit","Poonam","Dheeraj","Nitika"],
"Empage":[34,45,46,38,48]}
employees=pd.DataFrame(D)
print(employees)
employees['Designation']=['Physics','Chemistry','Maths','IP','English']
print(employees)
OUTPUT:-
Empcode Enpname Empage
0 101 Ananya 34
1 102 Rohit 45
2 103 Poonam 46
3 104 Dheeraj 38
4 105 Nitika 48
Empcode Enpname Empage Designation
0 101 Ananya 34 Physics
1 102 Rohit 45 Chemistry
2 103 Poonam 46 Maths
3 104 Dheeraj 38 IP
4 105 Nitika 48 English
PRACTICAL-13
Write a program to understand the concept of loc and iloc
import pandas as pd
D={"Empcode":[101,102,103,104,105],
"Enpname":["Ananya","Rohit","Poonam","Dheeraj","Nitika"],
"Empage":[34,45,46,38,48],
"Designation":['Physics','Chemistry','Maths','IP','English']}
employees=pd.DataFrame(D,index=['A','B','C','D','E'])
print(employees)
print(employees.iloc[0:2,1:3])
print(employees.loc['A':'C','Empcode':'Empage'])
OUTPUT:-
Empcode Enpname Empage Designation
A 101 Ananya 34 Physics
B 102 Rohit 45 Chemistry
C 103 Poonam 46 Maths
D 104 Dheeraj 38 IP
E 105 Nitika 48 English
Enpname Empage
A Ananya 34
B Rohit 45
Empcode Enpname Empage
A 101 Ananya 34
B 102 Rohit 45
C 103 Poonam 46
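The output above illustrates the key difference: an `iloc` slice stops before its end position, while a `loc` slice includes its end label. A shorter sketch on a three-row version of the frame makes the comparison direct:

```python
import pandas as pd

employees = pd.DataFrame({"Empcode": [101, 102, 103],
                          "Empage": [34, 45, 46]},
                         index=["A", "B", "C"])

by_position = employees.iloc[0:2]    # positions 0 and 1; position 2 is excluded
by_label = employees.loc["A":"B"]    # labels A and B; the end label is included

# A single cell can be addressed either way.
age_by_label = employees.loc["C", "Empage"]
age_by_position = employees.iloc[2, 1]
print(by_position.equals(by_label), age_by_label, age_by_position)
```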
PRACTICAL-14
Create a dataframe df and print one row and one column at a time
import pandas as pd
D={"Empcode":[101,102,103,104,105],
"Enpname":["Ananya","Rohit","Poonam","Dheeraj","Nitika"],
"Empage":[34,45,46,38,48],
"Designation":['Physics','Chemistry','Maths','IP','English']}
df=pd.DataFrame(D,index=['A','B','C','D','E'])
print(df)
print("one row at a time:-")
for i, j in df.iterrows():
    print(j)
    print("----------------")
print("one column at a time:-")
for i, j in df.items():
    print(j)
    print("----------------")
OUTPUT:-
Empcode Enpname Empage Designation
A 101 Ananya 34 Physics
B 102 Rohit 45 Chemistry
C 103 Poonam 46 Maths
D 104 Dheeraj 38 IP
E 105 Nitika 48 English
one row at a time:-
Empcode 101
Enpname Ananya
Empage 34
Designation Physics
Name: A, dtype: object
----------------
Empcode 102
Enpname Rohit
Empage 45
Designation Chemistry
Name: B, dtype: object
----------------
Empcode 103
Enpname Poonam
Empage 46
Designation Maths
Name: C, dtype: object
----------------
Empcode 104
Enpname Dheeraj
Empage 38
Designation IP
Name: D, dtype: object
----------------
Empcode 105
Enpname Nitika
Empage 48
Designation English
Name: E, dtype: object
----------------
one column at a time:-
A 101
B 102
C 103
D 104
E 105
Name: Empcode, dtype: int64
----------------
A Ananya
B Rohit
C Poonam
D Dheeraj
E Nitika
Name: Enpname, dtype: object
----------------
A 34
B 45
C 46
D 38
E 48
Name: Empage, dtype: int64
----------------
A Physics
B Chemistry
C Maths
D IP
E English
Name: Designation, dtype: object
----------------
PRACTICAL-15
Importing and Exporting data between pandas and CSV file.
#Importing
import pandas as pd
df=pd.read_csv("C:\\Users\\User\\Desktop\\result.csv")
print(df)
OUTPUT (df):-
Rollno Name Physics Chemistry Maths Biology English Ip
0 101 Pooja 78 89 67 34 56 89
1 102 Ananya 78 56 78 56 89 90
2 103 Rohit 78 67 90 87 56 45
3 104 Atiksh 89 76 56 78 89 67
CSV File-: (result.csv)
Rollno,Name,Physics,Chemistry,Maths,Biology,English,Ip
101,Pooja,78,89,67,34,56,89
102,Ananya,78,56,78,56,89,90
103,Rohit,78,67,90,87,56,45
104,Atiksh,89,76,56,78,89,67
# Export:-
quarterly_sales={"ItemCategory":["cold drink","chips","chocolate","chips","cold drink"],
"ItemName":["coke","lays","dairy milk","kurkure","fanta"],
"Expenditure":[35,20,40,20,65]}
df_sales=pd.DataFrame(quarterly_sales)
df_sales.to_csv("sales.csv")
print('File writing is successful')
OUTPUT:-
File writing is successful
CSV FILE(sales.csv):-
,ItemCategory,ItemName,Expenditure
0,cold drink,coke,35
1,chips,lays,20
2,chocolate,dairy milk,40
3,chips,kurkure,20
4,cold drink,fanta,65
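The leading comma in the header of sales.csv comes from the row index, which `to_csv` writes by default. A sketch, assuming the index is not wanted in the file, uses `index=False`; it works on an in-memory string so that no file path has to be assumed:

```python
import io
import pandas as pd

df_sales = pd.DataFrame({"ItemName": ["coke", "lays"],
                         "Expenditure": [35, 20]})

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df_sales.to_csv(index=False)
print(csv_text)

# Reading the text back gives the original columns with no extra index column.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```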
PRACTICAL-16
Write a program to plot a line chart to depict the changing weekly onion prices for four
weeks. Give appropriate axis labels.
import matplotlib.pyplot as plt
week=[1,2,3,4]
prices=[40,80,100,50]
plt.plot(week,prices)
plt.xlabel("week")
plt.ylabel("onion prices")
plt.show()
OUTPUT:-
PRACTICAL-17
Write a program to plot a bar chart of the medals won by Australia.
In the same chart, plot the medals won by India too.
import matplotlib.pyplot as plt
Info=['Gold','Silver','Bronze','Total']
Australia=[80,59,59,198]
India=[26,20,20,66]
plt.bar(Info,Australia)
plt.bar(Info,India)
plt.xlabel("Medal type")
plt.ylabel("Australia, India Medal count")
plt.show()
OUTPUT:-
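In the program above the India bars are drawn on top of the Australia bars at the same positions, and there is no legend to tell them apart. One possible variation, sketched below, places the two series side by side; the bar width of 0.4 is an assumption made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

info = ["Gold", "Silver", "Bronze", "Total"]
australia = [80, 59, 59, 198]
india = [26, 20, 20, 66]

x = np.arange(len(info))   # one slot per medal type
width = 0.4                # assumed bar width; two bars fill 0.8 of each slot

fig, ax = plt.subplots()
ax.bar(x - width / 2, australia, width, label="Australia")
ax.bar(x + width / 2, india, width, label="India")
ax.set_xticks(x)
ax.set_xticklabels(info)
ax.set_xlabel("Medal type")
ax.set_ylabel("Medal count")
ax.legend()
# plt.show()  # uncomment to display the chart in an interactive session
```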
PRACTICAL-18
Create a dataframe and plot appropriate chart from the given
dataframe.
import pandas as pd
import matplotlib.pyplot as plot
quarterly_sales={
"Country":["India","Australia","England","Canada","New Zealand","South Africa","Wales","Scotland"],
"Gold":[80,45,26,15,15,13,10,9],
"Silver":[59,45,20,40,16,11,12,13],
"bronze":[59,46,20,27,15,13,14,22]
}
df_sales=pd.DataFrame(quarterly_sales)
df_sales.plot.bar("Country","Gold")
plot.show()
OUTPUT:-
PRACTICAL-19
Plot a graph with labels, title and legend.
import pandas as pd
import matplotlib.pyplot as plt
year=[1998,1999,2000,2001,2002,2003,2004,2005]
Physics=[80,45,99,100,45,67,99,98]
Chemistry=[59,45,20,40,16,11,12,13]
IP=[59,46,20,27,15,13,14,100]
plt.plot(year,Physics,'k',label='physics')
plt.plot(year,Chemistry,'r',label='chemistry')
plt.plot(year,IP,'b',label='ip')
plt.legend(loc='upper left')
plt.title("A simple line graph")
plt.xlabel("years")
plt.ylabel("marks obtained")
plt.savefig('C:\\Users\\User\\Desktop\\fig.png')
plt.show()
OUTPUT:-
PRACTICAL-20
Create a histogram from the given data.
Weight measurement for 16 small orders of French fries.
78,72,69,81,63,67,65,75,79,74,71,83,71,79,80,69
import matplotlib.pyplot as plt
Weight=[78,72,69,81,63,67,65,75,79,74,71,83,71,79,80,69]
plt.hist(Weight)
plt.show()
OUTPUT:
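`plt.hist` chooses 10 equal-width bins unless told otherwise. To see how the weights fall into explicit classes, the counts can be computed with `np.histogram`; the class boundaries below (width 5, from 60 to 85) are an assumption chosen for this sketch.

```python
import numpy as np

weight = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which includes both edges.
edges = [60, 65, 70, 75, 80, 85]
counts, _ = np.histogram(weight, bins=edges)

for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {n}")
```

Passing the same list as `bins=edges` to `plt.hist` draws the histogram with these classes.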