
Dav 2024 Pyq

Uploaded by

somya.234017

This question paper contains 12 printed pages.

Your Roll No. ...............


Sr. No. of Question Paper : 1673 G
Unique Paper Code : 2343012002
Name of the Paper : Data Analysis and Visualization
Name of the Course : B.Sc. (Hons.) Computer Science
Semester : III

Duration : 3 Hours Maximum Marks : 90

Instructions for Candidates


1. Write your Roll No. on the top immediately on receipt of this question paper.
2. Section A is compulsory.
3. Attempt any four questions from Section B.
4. Parts of a question must be answered together.

Section A

Assume that the following libraries have already been imported :

import numpy as np
import pandas as pd

P.T.O.
1. (a) Given rainfall = [5, 2, 7, ... 5, 1, 9] captured for 5 days of a month and days = [1, 3, 8, 2, ...], write code in Python to plot a line graph with days and rainfall as x and y axis respectively. Mark each point with a red circle of size 20. Add a title to the graph. (Make use of appropriate libraries.) (5)

(b) Consider the following dataframe, company, having details of employees of an organization: (5)

    Name      Age
1   Sangeeta  18
2   Sarika    30
3   Sangeeta  45
4   Babita
    Sarika    32

Using appropriate libraries, write Python statements to do the following and also show the output :

(i) Display the total number of distinct names of the employees.

(ii) Compute the average age of the employees with the same name.

(c) Consider the following tables named section1 and section2, each having details viz. RollNo and Name of students in each class: (5)

section1
RollNo  Name
1       Abhav
2       Vihaan
3       Chitra
4       Devansh

section2
RollNo  Name
1       Roni
2       Kabeer
3       Ishani
4       Vihaan

Write Python statements to do the following:

(i) Create a dataframe named section1 for the table section1.

(ii) Display details of all students of section 2 along with details of students of section 1 with the same Name.

(iii) Display details of students with the same Name and RollNo in both sections.

(d) Find the output that will be produced on the execution of the following code snippet: (5)

a1 = np.zeros((2, 3))
print(a1)
a2 = [[3, 4, 5], [7, 8, 9]]
print(np.add(a1, a2))
a1 = np.append(a1, a2, axis = 0)
print(a1)
print('Shape of array:', a1.shape)
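As an illustration of Question 1(c), parts (ii) and (iii) can be approached with pandas `merge`. The sketch below is one possible answer, not an official solution. The table contents are an assumed reading of the scanned tables, and the suffix names are chosen here only for readability.

```python
import pandas as pd

# Assumed table contents: the scanned tables are only partly legible.
section1 = pd.DataFrame({"RollNo": [1, 2, 3, 4],
                         "Name": ["Abhav", "Vihaan", "Chitra", "Devansh"]})
section2 = pd.DataFrame({"RollNo": [1, 2, 3, 4],
                         "Name": ["Roni", "Kabeer", "Ishani", "Vihaan"]})

# (ii) Every student of section2, with section1 details attached
# wherever the Name matches (left join keeps all section2 rows).
by_name = pd.merge(section2, section1, on="Name", how="left",
                   suffixes=("_section2", "_section1"))
print(by_name)

# (iii) Students whose Name and RollNo both match in the two sections
# (inner join on both columns).
both = pd.merge(section1, section2, on=["RollNo", "Name"], how="inner")
print(both)
```

With the assumed data, the inner join on both columns is empty because Vihaan has a different RollNo in each section; a different reading of the tables would change that result.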
(e) Consider a NumPy array, empSalary, containing salary of 10 employees. Write Python statements to do the following : (5)

(i) Find total number of employees earning salary > 5000.

(ii) Create a new array, incentive, to store incentives given to each employee where incentive is 10% of the salary.

(f) Find the output that will be produced on the execution of the following code snippet : (5)

data = pd.DataFrame([[2, 4, 6], [np.NaN, 8, 10], [np.NaN, 12, np.NaN], [np.NaN, np.NaN, np.NaN]])
print(data)
print(data.dropna(thresh = 2))
print(data.fillna(method = "ffill", limit = 2))

Section B

Assume that the following libraries have already been imported

import numpy as np
import pandas as pd

2. (a) Consider the following dataframe, df: (6)

df = pd.DataFrame(np.arange(12).reshape(4, 3),
index = [['North', 'North', 'South', 'South'], [1, 2, 1, 2]],
columns = [['Delhi', 'Delhi', 'Chandigarh'], ['Green', 'Red', 'Green']])

Find the output that will be produced on the execution of the following code snippet :

(i) print(df)

(ii) df.index.names = ['key1', 'key2']
print(df)
df1 = df.swaplevel('key1', 'key2')
print(df1)

(iii) df2 = df1.sort_index(level=0)
print(df2)

(b) Construct a NumPy array, markSheet, to store marks obtained by 2 students in 3 subjects, where marks are between 60 and 100. Write Python statements to display the data type, shape and dimension of markSheet. (5)

(c) Consider the following dataframe, itemRate: (4)

    Item     Rate
0   Apples   220
1   Oranges  90
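Part (c) goes on to ask for the Rate column to be doubled and for the item with the minimum rate. A minimal sketch follows, assuming itemRate holds only the two rows legible in the scan (the printed table may contain more) and uses the default integer index.

```python
import pandas as pd

# Only the two legible rows are used here; this is an assumption.
itemRate = pd.DataFrame({"Item": ["Apples", "Oranges"],
                         "Rate": [220, 90]})

# (i) Double the Rate of each item (vectorised column arithmetic).
itemRate["Rate"] = itemRate["Rate"] * 2

# (ii) Item with the minimum rate: idxmin gives the row label of the
# smallest Rate, which is then used to look up the Item.
cheapest = itemRate.loc[itemRate["Rate"].idxmin(), "Item"]

print(itemRate)
print(cheapest)
```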
Write Python statements to do the following:

(i) Double the value of the column Rate of each item.

(ii) Display the type of item with minimum rate.

3. (a) Consider the following dataframe, dfStudent, consisting of student details : (8)

    Name    Hours_studied  Marks_obtained
0   Mohan   2.5            40
1   Sohan   4.0            52
2   Rajeev  6.0            64
3   Jeevan  8.0            70
4   Gita    10.0           90
5   Meenu   1.0            10
6   Gopal   5.0            60

Write Python code to answer the following. (Make use of appropriate libraries.) :

(i) Find names of students who got maximum marks.

(ii) Find the average number of hours studied by the students.

(iii) Compute the correlation and covariance between Hours_studied and Marks_obtained.

(iv) Plot the heatmap of columns Hours_studied and Marks_obtained of the dataframe dfStudent.

(b) Find the output on the execution of the following code snippet : (7)

b1 = np.arange(6)
b2 = np.array([[1, 2, 3], [4, 6, 8]])
print('i.\n', b1)
print('ii.\n', b2)
print('iii.\n', 2/b2)
print('iv.\n', b1[1], b2[1])
print('v.\n', b1[:1], b2[::2])

4. (a) Using diagrams give example of each of the following data distributions : (6)

(i) unimodal
(ii) bimodal
(iii) multimodal

(b) Consider the following dataframe, showing details of sales done by salespersons of a company in two quarters : (9)
    person  sales  quarter  country
0   A       1000   1
1   B       300             US
2   C       400    1        Japan
3   D       500    1        Brazil
4           800    1        UK
5   A       1000   2        US
6   B       500    2        Brazil
7   C       700    2        Japan
8   D                       Brazil
9           50     2        US

Write Python statements to do the following

(i) Find the maximum and minimum sales for Brazil.

(ii) Display total sales for each country.

(iii) Display the name of the salesperson with maximum average sales.

(iv) Display statistical summary of the numerical attributes only.

(v) Draw a boxplot of the sales.

5. (a) Find the output that will be produced on the execution of the following code snippet: (5)

c1 = np.arange(0, 24, 2)
c2 = c1.reshape((2, 6))
print(c1, c2, sep = '\n')
print(c2.reshape((3, 4)))
c2[:3, 3:] = 0
print(c2)
print(c1 * 2)

(b) Assume that the data given below is saved in an excel file 'data.xlsx' (with 4 columns Employee id, Department, Salary and Age) : (10)

Employee id  Department        Salary  Age
101          Computer Science  2000    23
102          Computer Science  2002    34
103          Computer Science  2040    39
104          English           2045    43
105          English           2030    34
106          English           2006    53

Write Python statements to do the following (Make use of appropriate libraries.) :

(i) Read data from the given excel file 'data.xlsx' into a dataframe, df1. Set Employee id as the index of the df1.
(ii) Create a figure and add two subplots in it. In the first subplot, create a scatter plot between Salary and Age. Give labels to the x-axis as Salary and the y-axis as Age. Also, give a title to this plot. Discretize Salary into 3 equal bins. In the second subplot, draw a figure to visualize the count of the number of employees in each of these bins.

(iii) Save the plotted figure to a file named 'Employees.png'.

6. (a) Find the output that will be produced on the execution of the following code snippet :

(i) s1 = pd.Series([5, 0, -4, 8])
print(s1)
print(s1.rank()) (2)

(ii) data1 = pd.DataFrame({'One': ['a', 'b'] * 2 + ['b'], 'Two': [21, 22, 21, 23, 24]})
print(data1)
data2 = data1.drop_duplicates(['One', 'Two'], keep = 'last')
print(data1)
print(data2) (3)

(iii) df1 = pd.DataFrame({'A': [21, 32], 'B': [27, 30]})
df2 = pd.DataFrame({'A': [23, 41]})
print(df1)
print(df2)
df2['A'][1] = df1['A'][1] + 10
print(df2)
print(df2 > df1['B'].min()) (5)

(b) Consider an array, ages, consisting of age of 12 people [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]. Using appropriate libraries, write code to :

(i) Create four bins of the array ages, using right side closed intervals (18-25], (25-35], (35-60], (60-100]. Name the categories as 'Youth', 'YoungAdult', 'MiddleAged' and 'Senior' respectively. Display the number of values in each category.

(ii) Create four equal-sized categories of the array ages. (5)

7. Given the following CSV file 'employee.csv' consisting of details of employees :
Gender  Role            Experience  Salary
Male    Data Analyst    1           48000
Male    Data Analyst    1           42000
Male    Data Analyst    3           51000
Male    Data Scientist  5           62000
Female  Data Scientist  6           71000
Female  Data Scientist              73000
Male    Manager         10          82000
Female  Manager         11          87000
Female  Manager         12          91000

Write Python statement(s) to do the following. (Make use of appropriate libraries.):

(a) Read data from the given CSV file 'employee.csv' into a dataframe empData.

(b) Calculate and display the total salary for each role.

(c) Display the total number of females along with their average salary.

(d) Compare the highest and lowest salary for each gender using bar plot.

(e) Delete records with salary less than the average salary of all employees. (15)
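One way the five parts of Question 7 could be answered with pandas is sketched below. It is an illustrative answer rather than a marking scheme. To keep it self-contained, the file is simulated with `io.StringIO` (an assumption; with the real file, `pd.read_csv('employee.csv')` would be used), the one Experience value that is not legible in the table is left empty, and the result variable names are chosen here for illustration.

```python
import io
import pandas as pd

# Stand-in for 'employee.csv'; the empty Experience field is the
# value that cannot be read in the scanned table.
csv_text = """Gender,Role,Experience,Salary
Male,Data Analyst,1,48000
Male,Data Analyst,1,42000
Male,Data Analyst,3,51000
Male,Data Scientist,5,62000
Female,Data Scientist,6,71000
Female,Data Scientist,,73000
Male,Manager,10,82000
Female,Manager,11,87000
Female,Manager,12,91000
"""

# (a) Read the data into a dataframe.
empData = pd.read_csv(io.StringIO(csv_text))

# (b) Total salary for each role.
total_by_role = empData.groupby("Role")["Salary"].sum()
print(total_by_role)

# (c) Number of females and their average salary.
females = empData[empData["Gender"] == "Female"]
n_females = len(females)
avg_female_salary = females["Salary"].mean()
print(n_females, avg_female_salary)

# (d) Highest and lowest salary per gender. With matplotlib installed,
# salary_range.plot(kind="bar") would draw the comparison as a bar plot.
salary_range = empData.groupby("Gender")["Salary"].agg(["max", "min"])
print(salary_range)

# (e) Drop the records whose salary is below the overall average.
mean_salary = empData["Salary"].mean()
kept = empData.drop(empData[empData["Salary"] < mean_salary].index)
print(kept)
```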
