Practice Set 1 Solution
Descriptive statistics and data manipulation using pandas
Note: The solutions provided are just one of the possible answers. Any other equivalent
solutions are also valid.
Question 1
(a) shapes.iloc[0:4,0:2]
(b) shapes[shapes["sides"] > 6]
(c) shapes[shapes["sides"] > 6]["area of shape"].mean() (Note: Any other expression which
evaluates to the same answer is also valid)
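Note: The three answers above can be checked on a small example. The sketch below assumes a made-up shapes data frame; only the column names "sides" and "area of shape" come from the answers, while the "name" column and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data; only "sides" and "area of shape" appear in the answers.
shapes = pd.DataFrame({
    "name": ["triangle", "square", "pentagon", "heptagon", "octagon"],
    "sides": [3, 4, 5, 7, 8],
    "area of shape": [6.0, 16.0, 43.0, 90.0, 120.0],
})

# (a) first four rows and first two columns (end positions are excluded)
first_block = shapes.iloc[0:4, 0:2]

# (b) rows describing shapes with more than 6 sides
many_sides = shapes[shapes["sides"] > 6]

# (c) mean area of those shapes
mean_area = many_sides["area of shape"].mean()
```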
Question 2
This statement adds a new column named quadrilateral to the data frame. The expression on
the right side, shapes["sides"] == 4, evaluates to a Series of Boolean values, one per row.
Therefore, the column quadrilateral will have True in rows where the sides of the shape are
equal to 4 and False in all other rows.
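Note: A minimal sketch of this behaviour, assuming the statement in the question has the form shapes["quadrilateral"] = shapes["sides"] == 4. The values in the data frame are hypothetical.

```python
import pandas as pd

shapes = pd.DataFrame({"sides": [3, 4, 5, 4]})  # hypothetical values

# The comparison is evaluated element by element and yields a Boolean Series.
mask = shapes["sides"] == 4

# Assigning the Series creates the new column, aligned row by row.
shapes["quadrilateral"] = mask
```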
Question 3
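a) df['Year']: returns series
b) df['Year', 'Ranking']: returns none (raises a KeyError, because the two names are read as a single column label)
c) df['Name"]: returns none (as written, the mismatched quotes are a syntax error)
d) df[['Year', 'Ranking']]: returns data frame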
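Note: The difference between single and double brackets can be verified with a short sketch. The data frame below is hypothetical and only assumes that the columns Year and Ranking exist.

```python
import pandas as pd

df = pd.DataFrame({"Year": [2020, 2021], "Ranking": [1, 2]})  # hypothetical values

single = df["Year"]               # one label -> Series
double = df[["Year", "Ranking"]]  # list of labels -> data frame

# Two labels without the inner list are treated as one tuple-valued label,
# which does not exist as a column here.
try:
    df["Year", "Ranking"]
    raised = False
except KeyError:
    raised = True
```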
Question 4
a) Returns a data frame containing the rows 0 and 2
b) Returns a data frame containing the rows 0-2
c) Returns a data frame containing the rows 1-4
d) Returns a data frame containing all rows except the last one
Question 5
(a) pd.pivot_table(cameras, index='Release date', values='Price', aggfunc='mean')
(b) pd.pivot_table(cameras, index='Release date', values='Price', aggfunc='max')
(c) pd.pivot_table(cameras, index='Release date', values='Price', aggfunc='min')
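Note: A small sketch of the pivot table for the mean price, using made-up camera data (only the column names come from the answers). It passes the aggregation as the string "mean", which gives the same result here as passing the NumPy function, and compares it with an equivalent groupby expression.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answers.
cameras = pd.DataFrame({
    "Release date": [2005, 2005, 2006, 2006],
    "Price": [100.0, 300.0, 250.0, 150.0],
})

# One row per release date, holding the mean price for that date.
mean_price = pd.pivot_table(cameras, index="Release date",
                            values="Price", aggfunc="mean")

# An equivalent formulation with groupby, returned as a Series.
by_group = cameras.groupby("Release date")["Price"].mean()
```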
Question 6
(a) cameras.sort_values(by="Weight (inc. batteries)", ascending=False, inplace=True)
(b) cameras['Release date'].value_counts()
(c) cameras['Release date'].unique()
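Note: The three calls can be tried on a small, hypothetical cameras data frame (values are made up; column names follow the answers).

```python
import pandas as pd

cameras = pd.DataFrame({
    "Release date": [2005, 2006, 2005, 2007],
    "Weight (inc. batteries)": [320.0, 180.0, 450.0, 270.0],
})

# (a) sort in place, heaviest camera first
cameras.sort_values(by="Weight (inc. batteries)", ascending=False, inplace=True)

# (b) number of cameras per release date
counts = cameras["Release date"].value_counts()

# (c) distinct release dates, as a NumPy array
years = cameras["Release date"].unique()
```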
Question 7
cereal.groupby(['manufacturer'], sort=True)['sugars'].max()
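Note: A minimal sketch of the grouped maximum, assuming a hypothetical cereal data frame with made-up manufacturer codes and sugar values.

```python
import pandas as pd

cereal = pd.DataFrame({
    "manufacturer": ["K", "G", "K", "G"],  # hypothetical codes
    "sugars": [6, 8, 5, 3],
})

# Largest sugar value per manufacturer; sort=True orders the group keys.
max_sugar = cereal.groupby(["manufacturer"], sort=True)["sugars"].max()
```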
Question 8
(a) cereal[(cereal["calories"] > 70) & (cereal["protein"] < 4)]["name"]
(b) cereal[(cereal["calories"] > 70) | (cereal["protein"] < 4)]["name"]
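Note: The difference between & (both conditions) and | (at least one condition) shows up on a small, hypothetical cereal data frame. The parentheses around each comparison are required, because & and | bind more tightly than > and <.

```python
import pandas as pd

cereal = pd.DataFrame({
    "name": ["A", "B", "C", "D"],       # hypothetical cereals
    "calories": [50, 110, 120, 60],
    "protein": [2, 3, 6, 5],
})

# (a) calories above 70 AND protein below 4
both = cereal[(cereal["calories"] > 70) & (cereal["protein"] < 4)]["name"]

# (b) calories above 70 OR protein below 4
either = cereal[(cereal["calories"] > 70) | (cereal["protein"] < 4)]["name"]
```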
Question 9
(a) cereal.loc[0:3, ['name', 'manufacturer']]
(b) cereal.set_index('name', inplace=True)
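Note: Unlike iloc, a loc slice includes its end label, so 0:3 selects four rows when the index is the default 0, 1, 2, ... The sketch below uses a hypothetical cereal data frame to show this, followed by the in-place change of index.

```python
import pandas as pd

cereal = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],          # hypothetical cereals
    "manufacturer": ["K", "G", "K", "P", "G"],
    "sugars": [6, 8, 5, 0, 8],
})

# (a) labels 0 through 3 inclusive, two named columns
subset = cereal.loc[0:3, ["name", "manufacturer"]]

# (b) "name" becomes the index and is no longer a regular column
cereal.set_index("name", inplace=True)
```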
Question 10
a) employee = employee.dropna()
b) employee[employee['salary'] > 100000]
c) employee['Senior Management'] = employee['Senior Management'].astype(int)
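Note: The three steps can be sketched on a hypothetical employee data frame. The values are made up, and it is an assumption that Senior Management holds Boolean values before the conversion.

```python
import numpy as np
import pandas as pd

employee = pd.DataFrame({
    "salary": [120000.0, np.nan, 90000.0, 150000.0],   # hypothetical values
    "Senior Management": [True, False, True, False],   # assumed Boolean
})

# a) drop every row that contains a missing value
employee = employee.dropna()

# b) rows with a salary above 100000
high_paid = employee[employee["salary"] > 100000]

# c) convert True/False to 1/0
employee["Senior Management"] = employee["Senior Management"].astype(int)
```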