Informatics Practices
S.Sarvesh Mariappan
DATA FRAME
A DataFrame is a data structure that organizes data into a
2-dimensional table of rows and columns, much like a
spreadsheet. DataFrames are one of the most common data
structures used in modern data analytics because they are a
flexible and intuitive way of storing and working with data.
DATA FRAME CREATION
SYNTAX
pandas.DataFrame(data, index, columns)
data: The dataset from which the DataFrame is created. It can be a list,
dictionary, scalar value, Series, ndarray, etc.
index: Optional. Defines the row labels explicitly. By default, the rows
are labelled 0 to n-1, where n is the number of rows.
columns: Optional. Provides the column names of the DataFrame. If neither
the data nor this parameter supplies names, the columns are labelled
0 to n-1, where n is the number of columns.
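The effect of the three parameters can be sketched as follows. This is a minimal illustration only: the values and the labels 'r1', 'r2', 'x' and 'y' are made up for the example and do not come from the slides.

```python
import pandas as pd

# Illustrative data: two rows and two columns (values are made up)
data = [[1, 2], [3, 4]]

# No index or columns given: rows and columns are labelled 0 to n-1
default_df = pd.DataFrame(data)

# Explicit row labels (index) and column names (columns)
labeled_df = pd.DataFrame(data, index=['r1', 'r2'], columns=['x', 'y'])

print(list(default_df.index), list(default_df.columns))  # [0, 1] [0, 1]
print(list(labeled_df.index), list(labeled_df.columns))  # ['r1', 'r2'] ['x', 'y']
```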
CREATING EMPTY DATA FRAME
import pandas as pd
df = pd.DataFrame()
print(df)
OUTPUT
Empty DataFrame
Columns: []
Index: []
DATA FRAME CREATION METHODS
I. List of Lists
II. List of Dictionary
III. List of Array
IV. Nested Array
V. List of Series
VI. Dict of Lists
VII. Dict of Dictionary
VIII. Dict of Array
IX. Dict of Series
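These methods are different routes to the same kind of table. As a small sketch (reusing the 'Name' and 'Age' columns from the list-of-lists slide, with shortened data), a row-oriented list of lists and a column-oriented dict of lists can describe the same DataFrame:

```python
import pandas as pd

# Row-oriented input: each inner list is one row
from_lists = pd.DataFrame([['tom', 10], ['nick', 15]], columns=['Name', 'Age'])

# Column-oriented input: each dictionary key is one column
from_dict = pd.DataFrame({'Name': ['tom', 'nick'], 'Age': [10, 15]})

# equals() compares shape, labels, values and dtypes
print(from_lists.equals(from_dict))
```

Which form is more convenient usually depends on whether the data arrives record by record or column by column.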
CREATION OF DATA FRAME BY LIST OF LISTS
import pandas as pd
data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
OUTPUT
   Name  Age
0   tom   10
1  nick   15
2  juli   14
CREATION OF DATA FRAME BY LIST OF DICTIONARY
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10}]
df1 = pd.DataFrame(data, index=['first', 'second'])
print(df1)
OUTPUT
        a   b
first   1   2
second  5  10
CREATION OF DATAFRAME BY LIST OF ARRAYS
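import pandas as pd
import numpy as np
a = np.array(['Jai', 'Msc'])
b = np.array(['Princi', 'MA'])
c = np.array(['Gaurav', 'MCA'])
d = np.array(['Anuj', 'Phd'])
# Each array becomes one row of the DataFrame
e = [a, b, c, d]
df = pd.DataFrame(e, columns=['Name', 'Qualification'])
print(df)
OUTPUT
     Name Qualification
0     Jai           Msc
1  Princi            MA
2  Gaurav           MCA
3    Anuj           Phd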
CREATION OF DATA FRAME BY NESTED ARRAY
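import pandas as pd
import numpy as np
# A NumPy array has a single dtype, so mixing strings and numbers
# stores every value as a string
array = np.array([['CEO', 20, 5], ['CTO', 22, 4.5],
                  ['CFO', 21, 3], ['CMO', 24, 2]])
row = [1, 2, 3, 4]
column = ['Names', 'Age', 'Net worth in Millions']
df = pd.DataFrame(array, index=row, columns=column)
print(df)
OUTPUT
  Names Age Net worth in Millions
1   CEO  20                     5
2   CTO  22                   4.5
3   CFO  21                     3
4   CMO  24                     2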
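Because a NumPy array holds a single dtype, the numbers in a mixed nested array end up as strings in the DataFrame. The sketch below uses a shortened version of the same data and shows one possible way to restore numeric types. The use of astype with a column-to-dtype mapping is a suggestion here, not something the slides prescribe.

```python
import numpy as np
import pandas as pd

# Mixing strings and numbers makes NumPy store everything as strings
array = np.array([['CEO', 20, 5], ['CTO', 22, 4.5]])
df = pd.DataFrame(array, columns=['Names', 'Age', 'Net worth in Millions'])

# Convert the numeric columns back from strings
typed = df.astype({'Age': int, 'Net worth in Millions': float})

print(array.dtype.kind)  # 'U' means a Unicode string array
print(typed.dtypes)
```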
CREATION OF DATAFRAME BY LIST OF SERIES
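import pandas as pd
A = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
B = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
# Each Series becomes one row and its index labels become the column names,
# so 'one' and 'two' are passed as row labels (index), not as columns
df = pd.DataFrame([A, B], index=['one', 'two'])
print(df)
OUTPUT
      a   b   c   d
one  10  20  30  40
two  10  20  30  40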
CREATION OF DATA FRAME BY DICT OF LIST
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],'marks': [99, 98, 95, 90]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3', 'rank4'])
print(df)
OUTPUT
       Name  marks
rank1   Tom     99
rank2  Jack     98
rank3  nick     95
rank4  juli     90
CREATION OF DATA FRAME BY DICT OF DICTIONARY
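import pandas as pd
data = {'Key1': {0: 1, 1: 2, 2: 3},
        'Key2': {0: "Hank", 1: "Steve", 2: "Lisa"},
        'Key3': {0: 1.2, 1: 3.1, 2: 3.1}}
# idx is not defined on the original slide; it is assumed here to be
# the keys of the inner dictionaries
idx = [0, 1, 2]
dp = pd.DataFrame(data, index=idx)
print(dp)
OUTPUT
   Key1   Key2  Key3
0     1   Hank   1.2
1     2  Steve   3.1
2     3   Lisa   3.1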
CREATION OF DATA FRAME BY DICT OF ARRAY
import pandas as pd
import numpy as np
array1 = np.array(['Array', 'Stack', 'Queue'])
array2 = np.array([20, 21, 19])
data = {'Category': array1, 'Marks': array2}
df = pd.DataFrame(data)
print(df)
OUTPUT
  Category  Marks
0    Array     20
1    Stack     21
2    Queue     19
CREATION OF DATA FRAME BY DICT OF SERIES
import pandas as pd
a = pd.Series(['Ankit', 'Golu', 'Sanjay'])
b = pd.Series([21, 10, 55])
# Columns appear in the order of the dictionary keys
c = {'Name': a, 'Rollno': b}
df = pd.DataFrame(c)
print(df)
OUTPUT
     Name  Rollno
0   Ankit      21
1    Golu      10
2  Sanjay      55
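When a DataFrame is built from a dict of Series, pandas lines the Series up by their index labels. The sketch below uses hypothetical data in which the two Series share only some labels; the index values 1, 2 and 3 are assumptions made for the illustration.

```python
import pandas as pd

# Hypothetical data: 'name' has no entry for index label 3
name = pd.Series(['Ankit', 'Golu'], index=[1, 2])
rollno = pd.Series([21, 10, 55], index=[1, 2, 3])

# Rows are the union of both indexes; missing entries become NaN
df = pd.DataFrame({'Name': name, 'Rollno': rollno})
print(df)
```

A dict of lists or arrays, by contrast, needs all values to have the same length, since there are no labels to align on.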