Diploma in Information Technology: Centralized Question Bank

The document outlines the practical examination for a Diploma in Information Technology, specifically focusing on Data Science and Big Data. It includes a series of tasks involving data manipulation, statistical analysis, and visualization using Excel and Python, with a detailed allocation of marks for various components of the exam. Students are required to complete a set of operations on datasets, including loading, cleaning, analyzing, and visualizing data.

Uploaded by

imran Basith


DIPLOMA IN INFORMATION TECHNOLOGY

CENTRALIZED QUESTION BANK

4052653 - Data Science and Big Data Practical

DIRECTORATE OF TECHNICAL EDUCATION
GOVERNMENT OF TAMILNADU

DIPLOMA END SEMESTER / YEAR EXAMINATION – 2023
Course : Information Technology
Subject : Data Science and Big Data Practical        QP Code : 4052653
Time : 3 Hours        Date :        Session :        Max Marks : 100

ANSWER ALL THE QUESTIONS

1. Load the data about the exam fee paid by the students of all branches of your college.
Perform the following operations on it using Excel.

a. Arrange the data branch-wise and, within each branch, by register number.
Replace all names with CAPITAL letters.
b. Count the number of students in each branch and semester.
c. Calculate the total fee paid by the students of each branch.
d. Find the minimum and the maximum fee paid by a student.
e. Find the sum, average, max and min of the fee paid in each branch.

2. Load the data collected from all students during online answer paper submission with the
following details for each exam: regno, name, course_code, subject_code, semester,
number_of_pages (nop), mode_of_dispatch, email_id, mobile_number.
Perform the following operations using Excel.

a. Check the file for any missing data in the columns.
b. Count the number of students who appeared for the exam.
c. Count the number of papers (subjects) submitted by each student (using the
register number).
d. Create a new column by concatenating the register number and the subject code.
Using this column, apply the VLOOKUP function to find the number of pages
(nop) written by the student in that subject, and the mode of dispatch.
e. Count the number of students who appeared (submitted) for each subject.
f. Count the number of different (unique) subject_codes that have been submitted.

3. Read the data set from the Auto-MPG repository and perform descriptive
statistics on the data using Excel Data Analysis. Verify the same using the statistical
functions of Excel.

4. Read the data set from the Auto-MPG repository and
a. Identify the relationship between the variables using correlation.
b. Identify the independent and the dependent variables.
c. Perform linear regression on the related variables and find the
regression equation.
d. Estimate the performance of the regression model.

5. Load any external CSV data file and store it in a Pandas DataFrame.

a. Check the shape and column types of the DataFrame (rows and columns). [Note:
Use df.info() and df.shape; shape is an attribute, not a method.]
b. Subset the data columns by name, by index, by range.
c. Subset data based on index label, row index, multiple rows.
d. Subset based on rows and columns.
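
The subsetting operations in question 5 can be sketched as follows. This is only an illustration: the inline CSV text and its column names (name, mpg, cylinders, weight) are hypothetical stand-ins for the external file, and pd.read_csv accepts a file path in the same way.

```python
import io
import pandas as pd

# Hypothetical sample standing in for an external CSV file.
csv_text = """name,mpg,cylinders,weight
car_a,18.0,8,3504
car_b,15.0,8,3693
car_c,24.0,4,2372
car_d,27.0,4,2130
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="name")

# a. Shape and column types. shape is an attribute, so no parentheses.
df.info()
print(df.shape)                          # (rows, columns)

# b. Subset columns by name, by position, and by range.
by_name = df[["mpg", "weight"]]
by_index = df.iloc[:, [0, 2]]
by_range = df.loc[:, "mpg":"cylinders"]  # label slices include the end label

# c. Subset rows by index label, by row position, and multiple rows.
one_label = df.loc["car_b"]
one_row = df.iloc[0]
many_rows = df.iloc[1:3]                 # positional slices exclude the end

# d. Subset on rows and columns together.
block = df.loc[["car_a", "car_c"], ["mpg", "weight"]]
print(block)
```

loc selects by label and iloc by integer position, which is why the two range slices above behave differently at the end point.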

6. DESCRIPTIVE STATISTICS using Python-Pandas

a. Write a Python script to find basic descriptive statistics on the AUTO-MPG
dataset.
b. Find the values of the descriptive statistics.
c. Determine the measures of central location, such as the mean; markers such as
quartiles or percentiles; and measures of variability or spread, such as the
standard deviation.
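
A minimal sketch of the statistics asked for in question 6. The five rows below are hypothetical values standing in for the mpg and weight columns of the Auto-MPG data; with the real file, the DataFrame would come from pd.read_csv instead.

```python
import pandas as pd

# Hypothetical values standing in for two Auto-MPG columns.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 24.0, 27.0, 22.0],
    "weight": [3504, 3693, 2372, 2130, 2833],
})

# describe() reports count, mean, std, min, quartiles and max per column.
summary = df.describe()
print(summary)

# Central location.
mean_mpg = df["mpg"].mean()
median_mpg = df["mpg"].median()

# Markers: quartiles and an arbitrary percentile.
quartiles = df["mpg"].quantile([0.25, 0.5, 0.75])
p90 = df["mpg"].quantile(0.90)

# Spread: sample standard deviation (ddof=1 by default), variance, range.
std_mpg = df["mpg"].std()
var_mpg = df["mpg"].var()
range_mpg = df["mpg"].max() - df["mpg"].min()
```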

7. READING AND WRITING DIFFERENT TYPES OF DATASETS

a. Read different types of data sets (.txt, .csv) from the Web and from disk, and write
them to a file in a specific disk location.
b. Read an Excel data sheet using Pandas.
c. Export the values from the DataFrame to several other formats.
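
One way the read and write steps of question 7 might look. The tab-separated text, the file names and the temporary output directory are all assumptions made for illustration; pd.read_csv also accepts a disk path or a URL in place of the in-memory buffer.

```python
import io
import os
import tempfile
import pandas as pd

# Hypothetical tab-separated text standing in for a .txt data file.
txt = "regno\tname\tfee\n101\tASHA\t250\n102\tRAVI\t300\n"
df = pd.read_csv(io.StringIO(txt), sep="\t")

# Write to a specific disk location (a temporary directory here).
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "fees.csv")
json_path = os.path.join(out_dir, "fees.json")
df.to_csv(csv_path, index=False)
df.to_json(json_path, orient="records")

# Read the files back to confirm the round trip.
from_csv = pd.read_csv(csv_path)
from_json = pd.read_json(json_path, orient="records")
print(from_csv)

# Excel needs an engine such as openpyxl, so it is shown but not run here:
# df.to_excel(os.path.join(out_dir, "fees.xlsx"), index=False)
# pd.read_excel(os.path.join(out_dir, "fees.xlsx"))
```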

8. DATA VISUALIZATION

a. Load the Auto-MPG dataset from a CSV file into pandas.
b. Analyze the behavior of the number of cylinders and horsepower using a
box plot.
c. Find the relationship between horsepower and weight using a scatter plot
of the data from Auto-MPG.
d. Find the outliers using a plot.
e. Plot the histogram, bar chart and pie chart on sample data.
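
A rough sketch of parts b to d of question 8, assuming matplotlib is installed. The eight rows are hypothetical stand-ins for Auto-MPG, with one deliberately extreme horsepower value so that the outlier step has something to find.

```python
import matplotlib
matplotlib.use("Agg")   # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for Auto-MPG; the last car is an outlier.
df = pd.DataFrame({
    "cylinders": [4, 4, 4, 4, 8, 8, 8, 8],
    "horsepower": [88, 90, 92, 95, 150, 155, 160, 400],
    "weight": [2130, 2372, 2300, 2500, 3504, 3693, 3600, 3800],
})

fig, (ax_box, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# b. Box plot of horsepower grouped by cylinder count.
df.boxplot(column="horsepower", by="cylinders", ax=ax_box)

# c. Scatter plot of horsepower against weight.
ax_scatter.scatter(df["weight"], df["horsepower"])
ax_scatter.set_xlabel("weight")
ax_scatter.set_ylabel("horsepower")

# d. The 1.5 * IQR rule that box plots draw as fliers, applied per group.
grouped = df.groupby("cylinders")["horsepower"]
q1 = grouped.transform(lambda s: s.quantile(0.25))
q3 = grouped.transform(lambda s: s.quantile(0.75))
fence = 1.5 * (q3 - q1)
is_outlier = (df["horsepower"] < q1 - fence) | (df["horsepower"] > q3 + fence)
outliers = df[is_outlier]
print(outliers)
```

Calling plt.show() or fig.savefig with a file name of your choice displays or stores the figure.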

9. COVARIANCE and CORRELATION

a. Find the correlation and covariance between two variables.
b. Plot the correlation plot on the data set and visualize it, giving an overview of the
relationships among the data.
c. Fit a simple linear regression model using libraries such as NumPy or scikit-learn.
(Import LinearRegression from sklearn.linear_model.)

• Import the packages and classes you need.
• Provide data for the independent and dependent variables.
• Create a regression model and fit it with existing data.
• Check the results of model fitting to know whether the model is satisfactory.
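
The four steps above can be sketched with NumPy alone. This uses np.polyfit rather than scikit-learn's LinearRegression to keep the dependencies small; the x and y arrays are hypothetical data chosen so that the expected line is known in advance.

```python
import numpy as np

# Hypothetical data: x is the independent variable, y the dependent one.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])   # exactly y = 2x + 1

# Sample covariance (off-diagonal of the 2x2 matrix) and Pearson correlation.
cov_xy = np.cov(x, y)[0, 1]
corr_xy = np.corrcoef(x, y)[0, 1]

# Least-squares fit of a degree-1 polynomial: y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, 1)

# Check the fit with the coefficient of determination R^2.
predicted = slope * x + intercept
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"cov={cov_xy:.2f} r={corr_xy:.2f} "
      f"y = {slope:.2f}x + {intercept:.2f} R^2={r_squared:.2f}")
```

An R^2 close to 1 indicates that the line explains most of the variation in y; real data will give a lower value than this constructed example.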

10. OUTLIER Detection

When analyzing data collected as part of a science experiment it may be desirable to
remove the most extreme values before performing other calculations. Write a function
that takes a list of values and a non-negative integer, n, as its parameters.

The function should create a new copy of the list with the n largest elements and the n
smallest elements removed. Then it should return the new copy of the list as the
function's only result. The order of the elements in the returned list does not have to
match the order of the elements in the original list.
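
One possible shape for such a function, offered as a sketch rather than a model answer. The name remove_extremes is hypothetical, and returning an empty list when 2n is at least the list length is an assumption the question does not settle.

```python
def remove_extremes(values, n):
    """Return a new list without the n smallest and n largest values."""
    if n < 0:
        raise ValueError("n must be non-negative")
    ordered = sorted(values)   # sorted() copies, so the input list is untouched
    # Slicing up to len - n avoids the ordered[n:-0] pitfall when n == 0.
    return ordered[n:len(ordered) - n]


data = [5, 1, 9, 3, 7]
print(remove_extremes(data, 1))
```

Sorting first makes the extremes easy to slice off, at the cost of the original order, which the question explicitly allows.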

11. Text Processing

b. Open a text file and read all the lines of the file.
c. Tokenize (separate the words of) the text.
d. Count the total number of lines, the total number of words and the unique words.
e. Sort the words alphabetically.
f. Find the most frequent and least frequent words.
g. List the words having certain suffixes.
Note: You can open a Tamil text file using 'UTF-16' encoding.
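
A small sketch of these steps using only the standard library. The sample text and the suffix "ing" are hypothetical; for a real file the text would come from open(path, encoding=...) with the encoding that matches the file. Tokenization here is plain whitespace splitting, so punctuation handling is left out.

```python
from collections import Counter

# Hypothetical sample standing in for the contents of a text file.
text = "the cat is running\nthe dog is jumping\nthe bird sang\n"

lines = text.splitlines()
words = text.lower().split()          # whitespace tokenization

total_lines = len(lines)
total_words = len(words)
unique_words = set(words)
sorted_words = sorted(unique_words)   # alphabetical order

# Frequencies: the single most common word, and every word tied for rarest.
counts = Counter(words)
most_common_word, most_common_count = counts.most_common(1)[0]
least_count = min(counts.values())
least_frequent = sorted(w for w, c in counts.items() if c == least_count)

# Words ending in a chosen suffix.
with_suffix = [w for w in sorted_words if w.endswith("ing")]

print(total_lines, total_words, len(unique_words))
print(most_common_word, least_frequent, with_suffix)
```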

12. Text Processing-II

Load a text file containing a list of words into a DataFrame. Apply the following
functions and verify the results:
replace(), repeat(), count(pattern), startswith(pattern), endswith(pattern),
find(pattern), findall(pattern).
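
These functions correspond to methods on the pandas .str accessor, sketched below. The four words are hypothetical; a file with one word per line could be loaded with pd.read_csv(path, header=None, names=["word"]) instead.

```python
import pandas as pd

# Hypothetical word list standing in for the text file contents.
df = pd.DataFrame({"word": ["banana", "apple", "bandana", "cherry"]})
s = df["word"].str

replaced = s.replace("an", "AN", regex=False)  # literal substring replace
repeated = s.repeat(2)                         # each word written twice
count_a = s.count("a")                         # pattern is treated as a regex
starts_ba = s.startswith("ba")
ends_y = s.endswith("y")
first_an = s.find("an")                        # -1 where the substring is absent
all_an = s.findall("an")                       # list of matches per word

print(pd.DataFrame({"word": df["word"], "replaced": replaced,
                    "count_a": count_a, "first_an": first_an}))
```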

DETAILED ALLOCATION OF MARKS

Writing answer for any one program from the list    45 Marks
Executing the program                               35 Marks
Result with printout of the program                 10 Marks
Demonstration of Mini Project                        5 Marks
VIVA-VOCE                                            5 Marks
TOTAL                                              100 Marks
