
GE - Computer Scien 4ogygeb

This document is a question paper for a course on Data Analysis and Visualization Using Python, intended for Computer Science students in their second semester. It consists of two sections, A and B, with various questions covering topics such as data manipulation with pandas, numpy operations, and data visualization techniques. The paper includes instructions for candidates, marking schemes, and specific coding tasks related to Python programming.

Uploaded by

brpsriya2908

Page 16

(ii) a.rank(method='first')

(iii) a.rank(ascending=False)

Page 1

[This question paper contains 16 printed pages.]

Your Roll No. ...............

Sr. No. of Question Paper : 2012 F

Unique Paper Code : 2344001201

Name of the Paper : Data Analysis and Visualization Using Python

Name of the Course : Computer Science: Generic Elective (G.E.)

Semester : II

Duration : 3 Hours    Maximum Marks : 90

Instructions for Candidates

1. Write your Roll No. on the top immediately on receipt of this question paper.

2. This question paper has two sections A and B.

3. Question 1 in Section A is compulsory.

4. Attempt any 4 questions from Section B.

5. Parts of a question must be attempted together.

6. Section A carries 30 marks and each question in Section B carries 15 marks.

7. Use of Calculator is not allowed.
Page 2

Section A

Assume numpy has been imported as np and pandas has been imported as pd.

(a) Explain unimodal, bimodal and multimodal distribution with the help of examples. (5)

(b) Consider the DataFrames First and Second given below: (5)

    One  Two      One  Two
    0             0
    2  'B'
    1
    5  | 'D'  5
    6  'C'    2

    First         Second

Consider the following python code segment:

right = pd.merge(first, second, how='right', on='One')
left = pd.merge(first, second, how='inner', on='Two')

Show the content of the new DataFrames right and left.

Page 15

(b) Give the output of the following code segment : (4)

arr = np.array([89, 54, 76, 32, 47, 21, 92, 39, 82])
arr1 = arr[5:9]
arr2 = arr[5:9].copy()
arr1 = 36
arr2 = 7
print(arr)
print(arr1)
print(arr2)

(c) Consider the series a given below and give the output of the following commands : (3)

a = pd.Series([4, 1, 7, 1, 8, 9, 0, 8, 2, 3, 9])

(i) a.rank()
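The array and rank snippets above turn on two behaviours that are easy to mix up: whether a NumPy slice shares memory with its source, and how pandas breaks ties when ranking. The sketch below illustrates both on made-up values (the names base, view, copied, rebound and s are hypothetical and are not the exam's data), so it should be read as an illustration of the mechanics rather than a model answer.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen for illustration; they are not the exam's data.
base = np.array([10, 20, 30, 40, 50])

view = base[1:4]            # basic slicing returns a view that shares memory with base
copied = base[1:4].copy()   # .copy() allocates independent storage

view[:] = 0                 # in-place write through the view also changes base
copied[:] = -1              # in-place write to the copy leaves base alone

rebound = base[1:4]
rebound = 99                # plain assignment only rebinds the name; base is untouched

print(base)                 # base now holds 10, 0, 0, 0, 50

s = pd.Series([3, 1, 3, 2])
avg_rank = s.rank()                   # default: tied values share the average rank
first_rank = s.rank(method="first")   # ties are ranked in order of appearance
desc_rank = s.rank(ascending=False)   # the largest value receives rank 1

print(avg_rank.tolist(), first_rank.tolist(), desc_rank.tolist())
```

Note the difference between writing into a slice (view[:] = 0) and assigning to the name that held it (rebound = 99): only the former can reach the original array.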
Page 14

7. (a) Consider the DataFrame df given below : (8)

    EmployeeID  Department  Salary  Age
    1001        English     1000    ?3
    1002        English     1002    34
    1003        English     1004    39
    1004        English     1005    43
    1003        Maths       1004    34
    1004        Maths       1005    43
    1001        Maths       1006    53
    1002        Maths       1002    43

Write the python code to perform the following operations:

(i) Create a hierarchical index on Department and Employee ID.

(ii) Give the summary level statistics for each column.

(iii) Give the output for the following:

1. df.stack()

2. df.unstack()

Page 3

(c) Write python commands to create a figure object using matplotlib. The Figure object has one subplot that contains 3 line graphs. Define legend and chart title of the graph. Define a different style and colour for each line in the subplot. Import appropriate libraries. (5)

(d) List and describe the steps involved in process of Data Analysis. (5)

(e) Give the output of the following code snippets: (4)

(i) y = np.arange(12).reshape(4,3)
print(y)
y[(y > 5)] = -1
print(y)

(ii) x = np.array([[2,4],[5,1]])
z = np.ones_like(x)
print(z)
w = np.eye(2) * x
print(w)
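The hierarchical-index operations in question 7 (a) can be sketched as follows. This is a minimal illustration, assuming a small made-up frame that reuses the question's column names; the rows, and the names indexed, summary, stacked and unstacked, are hypothetical.

```python
import pandas as pd

# Hypothetical rows for illustration; column names follow the question's table.
df = pd.DataFrame({
    "EmployeeID": [1, 2, 1, 2],
    "Department": ["X", "X", "Y", "Y"],
    "Salary": [100, 200, 300, 400],
    "Age": [30, 40, 50, 60],
})

indexed = df.set_index(["Department", "EmployeeID"])  # two-level (hierarchical) row index
summary = indexed.describe()    # count, mean, std, min, quartiles, max per numeric column
stacked = indexed.stack()       # column labels become the innermost index level
unstacked = indexed.unstack()   # innermost index level (EmployeeID) moves into the columns

print(indexed.index.nlevels, stacked.shape, unstacked.shape)
```

Here stack() yields a Series with a three-level index, while unstack() yields a DataFrame whose columns are a two-level index of (column, EmployeeID) pairs.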
Page 4

(f) Consider the series S1 and S2 given below: (6)

       S1          S2
    A  1        A  5
    B  2        B  6
    C  3        D  7
    D  4        E  8

Give the output of the following python pandas commands:

(i) S1[:3] * 10

(ii) S1 + S2

(iii) S2[::1] * 5

Section B

2. (a) Consider the DataFrame Frame given below : (7)

    Name   Age  Weight       Height
    Ram    15   45.6         140
    Ravi   23   34.9         160
    Reena  32   [illegible]  145
    Rita   20   60.1         155
    Rashi  33   54.1         170
    Romi   21   34.6         144

Page 13

(ii) student[student['Age'] > 20]

(iii) student[student['Age'] > 20]['Name']

(iv) avg_marks = np.mean(student.Marks)
student[student['Marks'] > avg_marks]

(v) first = student[student['Year'] == 1]['Marks']
np.mean(first)

(b) Consider the following list l1. (5)

l1 = [10, 10, 20, 40, 50, 60, 70, 80, 90, 90]

Discretise the l1 into 4 bins using cut() and qcut(). Give the names ['first', 'second', 'third', 'fourth'] to the bins. What type of object is returned by the pandas after binning? What output is generated by attributes codes and categories of binning object?
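The difference between cut() and qcut() asked about in the binning question is easiest to see on data with one extreme value. The sketch below uses a made-up list (values is hypothetical and is not the exam's list) and the same four labels; it is an illustration under those assumptions, not a worked answer.

```python
import pandas as pd

# Hypothetical values for illustration; the single large value makes the two methods differ.
values = [1, 2, 3, 4, 5, 6, 7, 100]
labels = ["first", "second", "third", "fourth"]

by_width = pd.cut(values, 4, labels=labels)      # 4 bins of equal width over [min, max]
by_quantile = pd.qcut(values, 4, labels=labels)  # 4 bins holding roughly equal counts

print(type(by_width).__name__)       # for list input, a Categorical is returned
print(by_width.codes.tolist())       # integer bin position of each value
print(by_quantile.codes.tolist())
print(list(by_width.categories))     # the bin labels, in order
```

With equal-width bins almost every value lands in the first bin, whereas quantile bins place two values in each; codes gives the bin number per element and categories gives the label set.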
Page 12

(iv) Delete the column Longitude from data.

(v) Save data as a csv with separator as ';'.

(b) Write a python code to create a figure and a subplot using matplotlib functions. Plot a rectangle of size 3.5 x 8.5 at point (2.0,7.0), a circle of radius 2.5 at point (7.0,2.0) as patches in the subplot, using functions for plotting. Set the colour of rectangle as 'Green' and colour of circle as 'Blue'. Set the x-scale and y-scale to 1-10. Import appropriate libraries. (5)

6. (a) Consider the following dataset student. (10)

    Year  Name   Roll No  Marks  Age
    1     Rani   23       1A     18
    2     Rita   24       75     20
    3     Raj    25       80     22
    1     Rahul  26       65
    2     Rohit  21       80     28

Give the output of the following python commands :

(i) student[['Roll No', 'Name']]

Page 5

Write python commands to perform the following operations:

(i) Compute the correlation of Age with both Weight and Height.

(ii) Sort Frame in descending order of Age.

(iii) To find the index for the row with minimum Age.

(iv) Calculate cumulative sum for Weight for all Students.

(v) To set height of 'Rita' and 'Romi' to NA.

(vi) Replace the value 32 with 18 and 33 with 19 in Age column.

(vii) Define map function to convert values of Name column to upper case.
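The Frame operations listed for question 2 (a) map onto a handful of standard pandas calls. The following is a minimal sketch on a made-up three-row frame that borrows the question's column names; the rows and the result names (age_corr, by_age_desc, youngest_label, running_weight, updated) are hypothetical, and other approaches would be equally valid.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; column names follow the question's Frame.
frame = pd.DataFrame({
    "Name": ["Asha", "Bala", "Chitra"],
    "Age": [30, 20, 25],
    "Weight": [60.0, 50.0, 55.0],
    "Height": [170.0, 150.0, 160.0],
})

age_corr = frame[["Weight", "Height"]].corrwith(frame["Age"])  # Age against each column
by_age_desc = frame.sort_values("Age", ascending=False)        # oldest first
youngest_label = frame["Age"].idxmin()                         # index label of minimum Age
running_weight = frame["Weight"].cumsum()                      # cumulative sum

updated = frame.copy()
# Set Height to missing for selected names (boolean row mask + column label).
updated.loc[updated["Name"].isin(["Bala", "Chitra"]), "Height"] = np.nan
# Replace specific Age values through a mapping of old value to new value.
updated["Age"] = updated["Age"].replace({30: 18, 25: 19})
# Apply a function to every Name with map.
updated["Name"] = updated["Name"].map(str.upper)

print(updated)
```

Height is created as a float column here so that assigning NaN does not force a dtype change.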
Page 6

(b) Refer to the DataFrame Frame given in question 2 (a). Write a python program to perform the following operations in the given dataset with 5 columns Name, Age, Weight, Height. (8)

(i) Create a figure and include 2 subplots in it.

(ii) In the first subplot create a scatter plot between two variables Age and Height.

(iii) In the second subplot draw a horizontal bar plot between Name and Weight.

(iv) Set the title for the figure as 'Data Analysis'.

(v) Give appropriate labels for x and y axis.

(vi) Save the figure to file with name 'analysis.png'.

Page 11

(vi) Multiply each element in mat with 25.

5. (a) Give the python commands to create a dictionary with 5 keys - 'A', 'B', 'C', 'D', 'E' and value as follows. (10)

    Key  Value
    A    List of numbers from 1 to 10 skipping 2 at a time.
    B    List of Strings from A to E.
    C    List of 5 numbers obtained using random normal distribution function.
    D    List of 5 random integers from 20 to 30.
    E    Square root of 5 random numbers from 50 to 70.

Give python commands to perform the following operations:

(i) Create DataFrame data using the above dictionary.

(ii) Convert Column A to index.

(iii) Rename the rest of the columns as Area, Temperature, Latitude and Longitude.
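The DataFrame steps in question 5 (a), including the column deletion and semicolon-separated CSV export listed in its later parts, can be sketched as below. The dictionary contents are fixed, made-up values rather than the random ones the question asks for, so that the result is reproducible; raw and csv_text are hypothetical names.

```python
import pandas as pd

# Hypothetical, fixed values for illustration (the question asks for random ones).
raw = {
    "A": [1, 2, 3],
    "B": ["x", "y", "z"],
    "C": [0.5, 1.5, 2.5],
    "D": [20, 25, 30],
    "E": [7.0, 8.0, 9.0],
}

data = pd.DataFrame(raw)
data = data.set_index("A")                 # column A becomes the row index
data = data.rename(columns={
    "B": "Area",
    "C": "Temperature",
    "D": "Latitude",
    "E": "Longitude",
})
data = data.drop(columns="Longitude")      # delete one column by label
csv_text = data.to_csv(sep=";")            # no path given, so the CSV text is returned

print(csv_text)
```

Passing a file name as the first argument of to_csv would write the same content to disk instead of returning a string.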
Page 10

(iv) Replace all null values by the last known valid observation.

(b) What are outliers? How can you detect outliers using boxplots? (5)

(c) Consider the given numpy array mat: (6)

mat = np.array([[[-1,2],[3,4]],[[-5,6],[7,8]]])

Write numpy commands to perform the following operations:

(i) Create an array of zeros with the same shape as mat.

(ii) Print the shape of the mat.

(iii) Print the datatype of the elements in mat.

(iv) Print the elements which are greater than 6 in mat.

(v) Convert all the elements in mat as float type.

Page 7

3. (a) Consider the following numpy array matrix: (10)

    [[15,10,20],
     [20,13,43],
     [34,27,67],
     [12,46,77]]

Give the output of the following numpy commands :

(i) matrix.T

(ii) matrix[:1,1:]

(iii) matrix[[1,3,0],[2,1,0]]

(iv) matrix[[-2,-4]]

(v) matrix[[True, False, False, True]]

(vi) matrix[3][:2]

(vii) matrix[::1]
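The indexing forms in question 3 (a) and the array utilities in the mat question follow a few general NumPy rules, sketched here on a made-up 4 x 3 array (m is hypothetical and is not the exam's matrix). The comments state what each form selects in this sketch.

```python
import numpy as np

# Hypothetical 4 x 3 array for illustration: rows are [0,1,2], [3,4,5], [6,7,8], [9,10,11].
m = np.arange(12).reshape(4, 3)

corner = m[:1, 1:]                       # slices: first row, columns 1 onward -> 2-D result
picked = m[[1, 3, 0], [2, 1, 0]]         # paired index lists: elements (1,2), (3,1), (0,0)
neg_rows = m[[-2, -4]]                   # one index list: whole rows, counted from the end
mask_rows = m[[True, False, False, True]]  # boolean list: rows where the flag is True
chained = m[3][:2]                       # row 3 first, then its first two elements

zeros = np.zeros_like(m)                 # zeros with the same shape and dtype as m
large = m[m > 6]                         # boolean mask flattens to a 1-D array of matches
as_float = m.astype(float)               # new array with float64 elements

print(m.shape, m.dtype, m.ndim)
```

Slicing keeps the number of dimensions, while paired integer lists select individual elements and return a 1-D array.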
Page 8

(viii) matrix.ndim

(ix) np.swapaxes(matrix, 1, 0)

(x) matrix + 10

4. (a) Consider the DataFrame data given below. (4)

    Items   Sugar Type  Price
    Yogurt  Low Fat     45
    Chips   Regular     30
    Soda    Low Fat     50
    Yogurt  High Fat    10
    Cake    Regular     140
    Chips   Low Fat     40
    Yogurt  Regular     50

Write python commands to perform the following operations:

(i) List the name of unique items sold.

(ii) Count the number of times each value in Items is stored.

(iii) Delete the rows which have duplicate values of Items.

Page 9

(iv) Give the average price of all Low Fat items.

(v) Check if 'Juice' is one of the items sold.

(b) Consider the following DataFrame df. (5)

    One  Two  Three  Four  Five
    1    74   34     NaN   NaN
    34   21   NaN    12    NaN
    NaN  23   NaN    2     NaN
    34   21   32     33    NaN

Give commands to perform the following operations:

(i) Drop columns with any null values.

(ii) Replace the null values with the mean of each column.

(iii) Drop the null values where there are at least 2 null values in a row.
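The missing-value operations in question 4 (b), together with the forward fill asked for in question 5, can be sketched with dropna, fillna and ffill. The frame below is made up for illustration (columns p, q, r are hypothetical), and reading "at least 2 null values in a row" as "drop rows with two or more nulls" is an assumption about the question's intent.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration; it is not the exam's df.
df = pd.DataFrame({
    "p": [1.0, np.nan, 3.0],
    "q": [np.nan, np.nan, 6.0],
    "r": [7.0, 8.0, 9.0],
})

no_null_cols = df.dropna(axis=1)      # drop every column that contains a null
mean_filled = df.fillna(df.mean())    # fill each column's nulls with that column's mean

# thresh counts required NON-null values, so "fewer than 2 nulls" means
# at least (number of columns - 1) non-null entries in the row.
few_nulls = df.dropna(thresh=df.shape[1] - 1)

forward_filled = df.ffill()           # carry the last valid observation downward

print(few_nulls)
```

A leading null has no earlier observation to copy, so ffill leaves it in place.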
