Unit No Unit Name Marks
1 Data Handling using Pandas and Data Visualization 25
Data Handling using Pandas -I
- Pandas, Matplotlib.
- Series and data frames. Series: Creation of series from dictionary, scalar value;
mathematical operations; series attributes, head and tail functions; selection, indexing and slicing.
s, display,
iteration. Operations on rows and columns: add ( insert /append) , select, delete (drop column and row),
rename, Head and Tail functions, indexing using labels, Boolean indexing.
f plots using Matplotlib (line plot, bar
graph, histogram). Customizing plots: adding label, title, and legend in plots.
TYPE A OBJECTIVE TYPE QUESTIONS [1 Mark] Multiple Choice Questions
1. To create an empty Series object, you can use:
(a) pd.Series(empty)
(b) pd.Series(np.NaN)
(c) pd.Series( )
(d) all of these
ANS:- (c) pd.Series( )
2. To specify datatype int16 for a Series object, you can write :
(a) pd.Series(data = array, dtype = int16)
(b) pd.Series(data = array, dtype = numpy.int16)
(c) pd.Series(data = array, dtype = pandas.int16)
(d) all of the above
ANS:- (b) pd.Series(data = array, dtype = numpy.int16)
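Questions 1 and 2 can be tried out with a minimal sketch like the one below. The names `empty`, `arr` and `s16` are illustrative only, and the explicit `dtype` on the empty Series is an assumption made to keep the result the same across pandas versions.

```python
import numpy as np
import pandas as pd

# An empty Series; pd.Series() also works, the dtype is given only for consistency
empty = pd.Series(dtype="float64")
print(len(empty))

# A Series whose items are stored as 16-bit integers
arr = np.array([10, 20, 30])
s16 = pd.Series(data=arr, dtype=np.int16)
print(s16.dtype)
```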
3. To get the number of dimensions of a Series object, ______ attribute is displayed.
(a) index
(b) size
(c) itemsize
(d) ndim
ANS:- (d) ndim
4. To get the size of the datatype of the items in Series object, you can display ______ attribute.
(a) index
(b) size
(c) itemsize
(d) ndim
ANS:- (c) itemsize
5. To get the number of elements in a Series object, ______ attribute may be used.
(a) index
(b) size
(c) itemsize
(d) ndim
ANS:- (b) size
6. To get the number of bytes of the Series data, ______ attribute is displayed.
(a) hasnans
(b) nbytes
(c) ndim
(d) dtype
ANS:- (b) nbytes
7. To check if the Series object contains NaN values, ______ attribute is displayed.
(a) hasnans
(b) nbytes
(c) ndim
(d) dtype
ANS:- (a) hasnans
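The attributes asked about in questions 3 to 7 can be checked on a small Series, as in the sketch below. The sample values are made up. One caution: `Series.itemsize` existed in older pandas releases but is not available in recent ones, so this sketch reads the item size from the dtype instead.

```python
import numpy as np
import pandas as pd

# Four float64 values, one of them missing
s = pd.Series([1.5, np.nan, 3.0, 4.5])

print(s.ndim)            # number of dimensions of a Series
print(s.size)            # number of elements
print(s.dtype.itemsize)  # bytes per item (older pandas also had s.itemsize)
print(s.nbytes)          # total bytes of the data = size * item size
print(s.hasnans)         # True when at least one NaN is present
```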
8. To display third element of a Series object S, you will write
(a) S[:3]
(b) S[2]
(c) S[3]
(d) S[:2]
ANS:- (b) S[2]
9. To display first three elements of a Series object S, you may write
(a) S[:3]
(b) S[3]
(c) S[3rd]
(d) all of these
ANS:- (a) S[:3]
10. To display last five rows of a Series object S, you may write
(a) head()
(b) head(5)
(c) tail( )
(d) tail(5)
ANS:- (c) tail( ) and (d) tail(5)
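Questions 8 to 10 can be verified with a short sketch such as the following. It assumes a Series `S` with the default 0, 1, 2, ... index; the values are illustrative.

```python
import pandas as pd

S = pd.Series([10, 20, 30, 40, 50, 60, 70])

third = S[2]            # third element, because the default index starts at 0
first_three = S[:3]     # slice: positions 0, 1 and 2
last_five = S.tail()    # tail() returns 5 rows by default, same as tail(5)

print(third)
print(first_three)
print(last_five)
```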
11. ______ Pandas object cannot grow in size.
(a) Dataframe (b) Panel (c) Series (d) None of these
ANS:- (c) Series
12. Given a Pandas series called Sequences, the command which will display the first 4 rows is
(a) print(Sequences.head(4))
(b) print(Sequences.Head(4))
(c) print(Sequences.heads(4))
(d) print(Sequences.Heads(4))
ANS:- (a) print(Sequences.head(4))
13. If a Dataframe is created using a 2D dictionary, then the indexes/row labels are formed from
(a) dictionary's values
(b) inner dictionary's keys
(c) outer dictionary's keys
(d) none of these
ANS:- (b) inner dictionary's keys
14. If a dataframe is created using a 2D dictionary, then the column labels are formed from
(a) dictionary's values
(b) inner dictionary's keys
(c) outer dictionary's keys
(d) none of these
ANS:- (c) outer dictionary's keys
15. The axis 0 identifies a dataframe's
(a) rows
(b) columns
(c) values
(d) datatype
ANS:- (a) rows
16. The axis 1 identifies a dataframe's
(a) row (b) columns (c) values (d) datatype
ANS:- (b) columns
17. To get the number of elements in a dataframe, ______ attribute may be used.
(a) size
(b) shape
(c) values
(d) ndim
ANS:- (a) size
18. To get a number representing number of axes in a dataframe, ______ attribute may be used.
(a) size
(b) shape
(c) values
(d) ndim
ANS:- (d) ndim
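Questions 13 to 18 can be illustrated with one small dataframe built from a 2D dictionary. This is only a sketch; the subject names, student names and marks are hypothetical.

```python
import pandas as pd

# Outer keys become column labels, inner keys become row labels
marks = {
    "Maths": {"Asha": 90, "Ravi": 75},
    "Science": {"Asha": 85, "Ravi": 70},
}
df = pd.DataFrame(marks)

print(list(df.columns))   # outer dictionary's keys
print(list(df.index))     # inner dictionary's keys
print(df.size)            # total number of elements
print(df.ndim)            # number of axes
print(df.sum(axis=0))     # axis 0 runs along the rows, giving one total per column
```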
19. To extract row/column from a dataframe, ______ may be used.
(a) row( ) (b) column( ) (c) loc( ) (d) all of these
ANS:- (c) loc( )
21. To display the 3rd, 4th and 5th columns from the 6th to 9th rows of a dataframe you can write
(a) DF.loc[6:9, 3:5]
(b) DF.loc[6:10, 3:6]
(c) DF.iloc[6:10, 3:6]
(d) DF.iloc[6:9, 3:5]
ANS:- (c) DF.iloc[6:10, 3:6]
22. To change the 5th column's value at 3rd row as 35 in dataframe DF, you can write
(a) DF[4, 6] = 35
(b) DF.iat[4, 6] = 35
(c) DF[3, 5] = 35
(d) DF.iat[3, 5] = 35
ANS:- (d) DF.iat[3, 5] = 35
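The two answers above can be explored with the sketch below, which assumes a hypothetical 12 x 10 dataframe of zeros named `DF`. Note that `iloc` and `iat` count positions from 0 and that the stop value of a slice is excluded.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe: 12 rows, 10 columns, all zeros
DF = pd.DataFrame(np.zeros((12, 10), dtype=int))

# Row positions 6, 7, 8, 9 and column positions 3, 4, 5
sub = DF.iloc[6:10, 3:6]
print(sub.shape)

# Set a single cell by integer position (row position 3, column position 5)
DF.iat[3, 5] = 35
print(DF.iat[3, 5])
```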
23. Which among the following options can be used to create a DataFrame in Pandas ?
(a) A scalar value
(b) An ndarray
(c) A python dict
(d) All of these
ANS:- (d) All of these
24. Identify the correct statement:
(a) Data frames can change their size.
(b) Series act in a way similar to that of an array.
(c) Both (a) and (b)
(d) None of the above
ANS:- (c) Both (a) and (b)
25. To delete a column from a DataFrame, you may use ______ statement.
(a) remove
(b) del
(c) drop
(d) cancel
ANS:- (b) del
26. To delete a row from a DataFrame, you may use ______ function.
(a) remove
(b) del
(c) drop
(d) cancel
ANS:- (c) drop
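A minimal sketch of questions 25 and 26 follows. The dataframe and its column names are hypothetical. `del` removes a column in place, while `drop()` returns a new dataframe and can remove either rows (default, axis = 0) or columns (axis = 1).

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],
    "Class": [12, 12, 11],
    "Marks": [90, 75, 85],
})

# del statement: removes the column from df itself
del df["Class"]

# drop(): returns a copy without the row labelled 1
no_row1 = df.drop(1)

# drop() with axis=1: returns a copy without the named column
no_marks = df.drop("Marks", axis=1)

print(list(df.columns))
print(list(no_row1.index))
print(list(no_marks.columns))
```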
27. To iterate over horizontal subsets of dataframe, ______ function may be used.
(a) iterate( )
(b) iterrows( )
(c) itercols( )
(d) iteritems( )
ANS:- (b) iterrows( )
28. To iterate over vertical subsets of a dataframe, ______ function may be used.
(a) iterate( )
(b) iterrows( )
(c) itercols( )
(d) iteritems( )
ANS:- (d) iteritems( )
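Row-wise and column-wise iteration can be sketched as below. One caution: `iteritems()` is the name used in older pandas; pandas 2.0 removed it in favour of `items()`, which behaves the same way and is also available in older versions, so the sketch uses `items()`. The dataframe itself is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"Qtr1": [100, 200], "Qtr2": [150, 250]},
    index=["North", "South"],
)

# Horizontal subsets: each step gives (row label, row as a Series)
row_labels = []
for label, row in df.iterrows():
    row_labels.append(label)
    print(label, list(row))

# Vertical subsets: each step gives (column label, column as a Series)
col_labels = []
for label, col in df.items():
    col_labels.append(label)
    print(label, list(col))
```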
29. To add two dataframes' values, ______ function may be used.
(a) plus
(b) rplus
(c) add
(d) radd
ANS:- (c) add, (d) radd
30. To subtract the values of two dataframes, ______ function may be used.
(a) sub
(b) difference
(c) minus
(d) rsub
ANS:- (a) sub, (d) rsub
31. To divide the values of two dataframes, ______ function may be used.
(a) divide
(b) div
(c) rdiv
(d) division
ANS:- (b) div, (c) rdiv
32. Which of the following two functions will produce the same result ?
(a) add
(b) radd
(c) sub
(d) rsub
ANS:- (a) add, (b) radd
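The difference between the plain and the "r" (reversed) versions in questions 29 to 32 can be seen in a small sketch. The dataframes `a` and `b` are made-up examples. Addition gives the same result in either order, while subtraction and division do not.

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
b = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

print(a.add(b))    # a + b
print(a.radd(b))   # b + a, same values as a + b
print(a.sub(b))    # a - b
print(a.rsub(b))   # b - a
print(a.div(b))    # a / b
print(a.rdiv(b))   # b / a
```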
33. To get the 3 bottommost rows from a dataframe, you may use ______ function.
(a) bottom( )
(b) bottom(3)
(c) tail( )
(d) tail(3)
ANS:- (d) tail(3)
34. Which of the following arguments lets you specify index labels of dataframe through Dataframe( ) ?
(a) index
(b) columns
(c) label
(d) all of these
ANS:- (a) index
35. To get top 5 rows of a dataframe, you may use
(a) head( )
(b) head(5)
(c) top( )
(d) top(5)
ANS:- (a) head( ), (b) head(5)
36. Which of the following can be used to specify data for creating a Dataframe ?
(a) Series
(b) DataFrame
(c) Structured ndarray
(d) All of these
ANS:- (d) All of these
37. All Pandas' data structures are ______ mutable but not always ______ mutable.
(a) size, value
(b) semantic, size
(c) value, size
(d) none of these
ANS:- (c) value, size
38. Which of the following statement will import Pandas library ?
(a) import pandas as pd
(b) import pandas as py
(c) import panda as py
(d) All of these
ANS:- (a) import pandas as pd, (b) import pandas as py
39. What will be the output for the following code ?
import pandas as pd
S = pd.Series([1,2,3,4,5], index = ['a', 'b', 'c', 'd', 'e'])
print(S['a'])
(a) 2
(b) 1
(c) 3
(d) 4
ANS:- (b) 1
40. What will be the output for the following code ?
import pandas as pd
import numpy as np
S = pd.Series(np.random.randn(2))
print(S.size)
(a) 0
(b) 1
(c) 2
(d) 3
ANS:- (c) 2
41. What will be the output for the following code ?
import pandas as pd
import numpy as np
S = pd.Series(np.random.randn(4))
print(S.ndim)
(a) 0
(b) 1
(c) 2
(d) 3
ANS:- (b) 1
42. What is the purpose of using the ndim attribute ?
(a) It returns the number of elements in the given data structure.
(b) It returns the Series object in the form of an ndarray.
(c) It returns a list of the indexes / labels.
(d) It returns the number of dimensions of the given data structure.
ANS:- (d) It returns the number of dimensions of the given data structure.
43. PyPlot is an interface of Python's ______ library.
(a) seaborn
(b) plotly
(c) ggplot
(d) matplotlib
ANS:- (d) matplotlib
44. For 2D plotting using a Python library, which library interface is often used ?
(a) seaborn
(b) plotly
(c) matplotlib
(d) matplotlib.pyplot
ANS:- (d) matplotlib.pyplot
45. Which of the following is not a valid chart type ?
(a) Statistical
(b) Box
(c) Pie
(d) plot( )
ANS:- (a) Statistical , (b) Box
46. Which of the following is not a valid plotting function of pyplot ?
(a) pie( )
(b) plot( )
(c) bar( )
(d) line( )
ANS:- (d) line( )
47. Which of the following plotting functions does not plot multiple data series ?
(a) plot( )
(b) barh( )
(c) bar( )
(d) pie( )
ANS:- (d) pie( )
48. The plot which tells the trend between two graphed variables is the ______ graph/chart.
(a) scatter
(b) pie
(c) bar
(d) line
ANS:- (d) line
49. Which of the following functions is used to create a line chart ?
(a) line( )
(b) plot( )
(c) chart()
(d) plotline( )
ANS:- (b) plot( )
50. Which of the following function will produce a bar chart ?
(a) plotbar( )
(b) plot( )
(c) bar( )
(d) barh( )
ANS:- (c) bar( )
51. Which of the following function will create a vertical bar chart ?
(a) plot( )
(b) bar( )
(c) plotbar()
(d) barh( )
ANS:- (b) bar( )
52. Which of the following function will create a horizontal bar chart ?
(a) plot( )
(b) bar( )
(c) plotbar( )
(d) barh( )
ANS:- (d) barh( )
53. The data points plotted on a graph are called
(a) points
(b) pointers
(c) marks
(d) markers
ANS:- (d) markers
54. A ______ graph is a type of chart which displays information as a series of data points connected by straight line segments.
(a) line
(b) bar
(c) pie
(d) boxplot
ANS:- (a) line
55. Which argument of bar() lets you set the thickness of bar ?
(a) thick
(b) thickness
(c) width
(d) barwidth
ANS:- (c) width
56. Which function lets you set the title of the plot ?
(a) title( )
(b) graphtitle( )
(c) plottitle( )
(d) All of these
ANS:- (a) title( )
57. The command used to give a heading to a graph is
(a) plt.show()
(b) plt.plot()
(c) plt.xlabel( )
(d) plt.title( )
ANS:- (d) plt.title( )
58. Which function would you use to set the limits for x-axis of the plot?
(a) limits( )
(b) xlimits( )
(c) xlim()
(d) lim( )
ANS:- (c) xlim()
59. Which function is used to show legends?
(a) display( )
(b) show( )
(c) legend( )
(d) legends( )
ANS:- (c) legend( )
60. Which argument must be set with plotting functions for legend( ) to display the legends ?
(a) data (b) label (c) name (d) sequence
ANS:- (b) label
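The customization functions from questions 56 to 60 can be put together in one line chart, as in the sketch below. The data, the title and the label text are invented for illustration. The `matplotlib.use("Agg")` line is an assumption that lets the sketch run without a display; in an interactive session it can be dropped and `plt.show()` called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

weeks = [1, 2, 3, 4]
sales = [120, 150, 90, 180]

plt.figure()
# label is what legend() later displays for this line
plt.plot(weeks, sales, marker="o", label="Sales")
plt.title("Weekly Sales")       # heading of the graph
plt.xlabel("Week")
plt.ylabel("Units sold")
plt.xlim(0, 5)                  # limits of the x-axis
plt.legend()                    # shows the legend using the label given above
```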
61. Which function is used to create a histogram ?
(a) histogram( )
(b) histo( )
(c) hist()
(d) histtype
ANS:- (c) hist()
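A bar chart with a chosen bar thickness and a histogram can be sketched as follows. All values are illustrative, and the `Agg` backend is again an assumption so that the code runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

# Vertical bar chart; width sets the thickness of each bar
plt.figure()
bars = plt.bar(["A", "B", "C"], [5, 9, 3], width=0.4)

# Histogram: hist() groups the values into bins and counts them
plt.figure()
values = [1, 2, 2, 3, 3, 3, 4, 4, 5]
counts, bin_edges, patches = plt.hist(values, bins=5)
print(counts)
```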
62. Which of the following is not a valid plotting function of pyplot ?
(a) plot( )
(b) bar( )
(c) line()
(d) pie( )
ANS:- (c) line()
63. Which of the following plotting functions does not plot multiple data series ?
(a) plot( )
(b) bar( )
(c) pie()
(d) barh( )
ANS:- (c) pie()
64. The plot which tells the trend between two graphed variables is the graph/chart.
(a) line
(b) scatter
(c) bar
(d) pie
ANS:- (a) line
65. A CSV file can take ______ character as separator.
(a) , (b) - (c) | (d) \t (e) only (a) (f) all of these
ANS:- (f) all of these
66. In order to work with CSV files from Pandas, you need to import ______, other than pandas.
(a) csv
(b) pandas.io
(c) no extra package required
(d) newcsv
ANS:- (c) no extra package required
67. The correct statement to read from a CSV file in a dataframe is :
(a) <DF>.read_csv(<file>)
(b) <File>.read_csv( )(<DF>)
(c) <DF> = pandas.read(<file>)
(d) <DF> = pandas.read_csv(<files>)
ANS:- (d) <DF> = pandas.read_csv(<files>)
68. Which argument do you specify with read_csv( ) to specify a separator character ?
(a) character
(b) char
(c) separator
(d) sep
ANS:- (d) sep
69. To suppress first row as header, which of the following arguments is to be given in read_csv( ) ?
(a) noheader = True
(b) header = None
(c) skipheader = True
(d) header = Null
ANS:- (b) header = None
70. To read specific number of rows from a CSV file, which argument is to be given in read_csv( ) ?
(a) rows = <n>
(b) nrows = <n>
(c) n_rows = <n>
(d) number_rows = <n>
ANS:- (b) nrows = <n>
71. To skip first 5 rows of CSV file, which argument will you give in read_csv( ) ?
(a) skip_rows = 5
(b) skiprows = 5
(c) skip = 5
(d) noread = 5
ANS:- (b) skiprows = 5
72. To skip 1st, 3rd and 5th rows of CSV file, which argument will you give in read_csv( ) ?
(a) skiprows = 11315
(b) skiprows = [1, 3, 5]
(c) skiprows = [1, 5, 1]
(d) Any of these
ANS:- (b) skiprows = [1, 3, 5]
73. While reading from a CSV file, to use a column's values as index labels, the argument given in read_csv( ) is :
(a) index
(b) index_col
(c) index_values
(d) index_label
ANS:- (b) index_col
74. While writing a dataframe onto a CSV file, which argument would you use in to_csv( ) for NaN values' representation as NULL ?
(a) NaN = NULL
(b) na_rep = NULL
(c) na_value = NULL
(d) na = NULL
ANS:- (b) na_rep = NULL
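The read_csv( ) and to_csv( ) arguments from questions 65 to 74 can be tried without creating a file, by reading from an in-memory string. This is a sketch only: the CSV text, the `;` separator and the column names are made up. Note that a list given to `skiprows` holds 0-based line numbers, with the header line counted as line 0.

```python
import io
import pandas as pd

text = "Roll;Name;Marks\n1;Asha;90\n2;Ravi;\n3;Meena;85\n"

# sep sets the separator; index_col uses a column's values as row labels
df = pd.read_csv(io.StringIO(text), sep=";", index_col="Roll")

# nrows reads only the first n data rows
first_two = pd.read_csv(io.StringIO(text), sep=";", nrows=2)

# header=None treats the first line as data, not as column names
no_header = pd.read_csv(io.StringIO(text), sep=";", header=None)

# skiprows with a list skips those line numbers (here the Asha and Meena lines)
skipped = pd.read_csv(io.StringIO(text), sep=";", skiprows=[1, 3])

# na_rep sets the text written in place of NaN; with no path, to_csv returns a string
out = df.to_csv(na_rep="NULL")
print(out)
```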
b) Pip install pandas
c) Python install python
d) Pip install pandas
ANS:- d) Pip install pandas
ii. Numpy array
iii. Dataframe
iv. Panel
ANS:- iii. Dataframe
i. DataFrame is size mutable
ii. DataFrame is value mutable
iii. DataFrame is immutable
iv. DataFrame is capable of holding multiple types of data
ANS:- iii. DataFrame is immutable
i. Comma separated value
ii. Comma separated variables
iii. Column separated values
iv. Column separated variables
ANS:- i. Comma separated value
i. iterrows()
ii. iteritems()
iii. mod()
iv. median()
ANS:- ii. iteritems()
i. iterrows()
ii. iteritems()
iii. mod()
iv. median()
ANS:- i. iterrows()
i. std()
ii. hist()
iii. groupby()
iv. rename()
ANS:- iv. rename()
i. pd.Series(empty)
ii. pd.Series(np.NaN)
iii. pd.Series()
iv. All of these
ANS:- iii. pd.Series()
TYPE B (FILL IN THE BLANKS) [1 Mark] Multiple Choice Questions
i. remove()
ii. del()
iii. drop()
iv. cancel()
ANS:- iii. drop()
i. remove()
ii. del()
iii. drop()
iv. cancel()
ANS:- ii. del()
i. plus
ii. eplus
iii. add
iv. radd
ANS:- iv. radd
i. NaN
ii. Na
iii. skipna
iv. All of these
ANS:- iv. All of these
i. histogram()
ii. hist(numeric_only=True)
iii. hist()
iv. All of these
ANS:- ii. hist(numeric_only=True)
i. line
ii. scatter
iii. bar
iv. pie
ANS:- ii. scatter
i. line
ii. plot
iii. chart
iv. plotline
ANS:- ii. plot
i. plot
ii. bar
iii. plotbar
iv. barh
ANS:- ii. bar
i. line
iii. pie
iv. boxplot
ANS:- i. line
i. ,
ii. _
iii. !
iv. \t
v. All of these
ANS:- v. All of these
i. .csv
ii. pandas.io
iii. newcsv
iv. No extra module required
ANS:- i. .csv
a)import pandas
95. In the given code, dataframe D1 has ______ rows and ______ columns.
import pandas as pd
D1 = pd.DataFrame(LoD)
a. 3, 3
b. 3, 4
c. 3, 5
ANS: c. 3, 5
96. D1[ : ] = 77, will set __________
a. Only First Row
b. Only First Column
c. All
ANS:- c. All
97. The following statement will _________
df = df.drop(['Name', 'Class', 'Rollno'], axis = 1)
#df is a DataFrame object
a. delete three columns having l
c. delete any three columns
98. Which of the following are ways of indexing to access Data elements in a DataFrame?
a. Label based indexing
b. Boolean Indexing
c. All of the above
ANS:- c. All of the above
99. The following statement will display ________
print(df.loc[[True, False,True]])
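The statement in question 99 can be tried on a hypothetical three-row dataframe, as sketched below. A Boolean list passed to `loc` keeps the rows where the value is True, and the list must have one entry per row.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],
    "Marks": [90, 75, 85],
})

# Keeps the rows at positions where the list holds True
picked = df.loc[[True, False, True]]
print(picked)

# The same idea with a condition, which produces the Boolean values automatically
high = df[df["Marks"] > 80]
print(high)
```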
100. NumPy stands for ____
a. Number Python
b. Numerical Python
c. Numbers in Python
d. None of the above
ANS:- b. Numerical Python
101. PANDAS stands for _____________
a. Panel Data Analysis
b. Panel Data analyst
c. Panel Data
d. Panel Dashboard
ANS:- c. Panel Data
102. _________ is used when data is in Tabular Format
a. NumPy
b. Pandas
c. Matplotlib
d. All of the above
ANS:- b. Pandas
103. When you print/display any series then the left most column is showing _________ value.
a. Index
b. Data
c. Value
d. None of the above
104. When we create a series from dictionary then the keys of dictionary become ________________
a. Index of the series
b. Value of the series
c. Caption of the series
d. None of the above
ANS:- a. Index of the series
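Creating a Series from a dictionary and from a scalar value, both listed in the syllabus above, can be sketched as follows. The month names and numbers are illustrative only.

```python
import pandas as pd

# From a dictionary: keys become the index, values become the data
days = pd.Series({"Jan": 31, "Feb": 28, "Mar": 31})
print(days)   # the left-most column of the output is the index

# From a scalar: the value is repeated for every index label given
const = pd.Series(5, index=["a", "b", "c"])
print(const)

# A mathematical operation applies to every element
doubled = days * 2
print(doubled)
```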