Question Bank

The document is a question bank for the course AD3491 Fundamentals of Data Science and Analytics at SRM TRP Engineering College, consisting of short answer, long answer, and essay questions. It covers various topics such as data analysis, statistics, exploratory data analysis, hypothesis testing, ANOVA, and regression. The questions are designed to assess students' understanding of key concepts and methodologies in data science and analytics.

Uploaded by

sakthivel

SRM TRP Engineering College

Department of Computer Science and Engineering

Question Bank for AD3491 FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS

Part A – Short Answer Questions

1. List down any five skills required for a data analyst.

2. Outline the significance of Exploratory Data Analysis (EDA).

3. Tabulate the differences between univariate, bivariate, and multivariate analysis. Give an
example.

4. Give an example of a dataset with a non-Gaussian distribution.

5. Explain the term 'Normal Distribution'.

6. Write briefly about Type I and Type II errors in statistics. Identify the relationship between
standard error and margin of error.

7. Assuming the null hypothesis is correct, what does it mean when p-values are high
and low?

8. Define the term one-factor ANOVA.

9. Outline a few approaches to detect outliers. Explain different ways to deal with them.

10. Give an approach to handle missing values in a dataset.

Part B – Long Answer Questions

11. a) Write briefly about exploratory data analysis in dataset analysis and the knowledge discovery process.

Or

11. b) i) Outline the purpose of data cleansing. How are missing and nullified data attributes
handled and modified during the preprocessing stage? (6)

ii) Explain the Data Analytic life cycle. Write briefly about Regression Analysis. (7)

12. a) i) Indicate whether each of the following distributions is positively or negatively skewed.
(1) Incomes of taxpayers have a mean of $48,000 and a median of $43,000.

(2) GPAs for all students at some college have a mean of 3.01 and a median of 3.20 (6)

ii) Consider the following number of online examination attempts by 15 students: 2, 17, 5, 3,
28, 7, 5, 8, 5, 6, 2, 12, 10, 4, 3.

(1) Find the mode, median, and mean for these data.

(2) Draw the distribution for balanced, positively skewed, or negatively skewed. (7)
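
One way to check an answer to part (1) is a short Python sketch using only the standard library. The variable names are illustrative, and the final comparison of mean and median is only a common rule of thumb for the direction of skew, not a formal test.

```python
import statistics

# Attempt counts copied from the question (15 students).
attempts = [2, 17, 5, 3, 28, 7, 5, 8, 5, 6, 2, 12, 10, 4, 3]

mode = statistics.mode(attempts)      # most frequent value
median = statistics.median(attempts)  # middle value of the sorted data
mean = statistics.mean(attempts)      # arithmetic average

# Rule of thumb (an assumption, not a proof): a mean above the median
# suggests positive skew, a mean below it suggests negative skew.
if mean > median:
    shape = "positively skewed"
elif mean < median:
    shape = "negatively skewed"
else:
    shape = "balanced"

print(f"mode={mode}, median={median}, mean={mean:.2f}, shape={shape}")
```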

Or

12. b) i) Assume that SAT math scores approximate a normal curve with a mean of 500 and a
standard deviation of 100. Sketch a normal curve and shade in the target areas described by each
of the following statements:

• More than 570

• Less than 515

• Between 520 and 540. (5)

ii) Assume that the burning times of electric light bulbs approximate a normal curve with a
mean of 1200 hours and a standard deviation of 120 hours. If a large number of new lights are
installed at the same time (possibly along a newly opened freeway), at what time will:

• 1 percent fail? (2)

• 50 percent fail? (2)

• 95 percent fail? (4)
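
Both parts of question 12 b) can be checked numerically. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are illustrative. Results computed this way may differ slightly from hand calculations that use z-scores rounded to two decimals from a printed table.

```python
from statistics import NormalDist

# Part i): SAT math scores, mean 500 and standard deviation 100.
sat = NormalDist(mu=500, sigma=100)
p_above_570 = 1 - sat.cdf(570)               # area to the right of 570
p_below_515 = sat.cdf(515)                   # area to the left of 515
p_520_to_540 = sat.cdf(540) - sat.cdf(520)   # area between 520 and 540

# Part ii): bulb burning times, mean 1200 hours and standard deviation 120 hours.
bulbs = NormalDist(mu=1200, sigma=120)
# inv_cdf(p) gives the time by which a proportion p of the bulbs has failed.
fail_times = {p: bulbs.inv_cdf(p) for p in (0.01, 0.50, 0.95)}

print(f"P(X > 570) = {p_above_570:.4f}")
print(f"P(X < 515) = {p_below_515:.4f}")
print(f"P(520 < X < 540) = {p_520_to_540:.4f}")
for p, hours in fail_times.items():
    print(f"{p:.0%} of bulbs have failed by about {hours:.0f} hours")
```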

13. a) Among 100 couples who had undergone marital counseling, 60 couples described their
relationships as improved, and among this latter group, 45 couples had children. The remaining
couples described their relationships as unimproved, and among this group, 5 couples had
children.

(1) What is the probability of randomly selecting a couple who described their relationship as
improved? (4)

(2) What is the probability of randomly selecting a couple with children? (4)

(3) What is the conditional probability of randomly selecting a couple with children, given that
their relationship was described as improved? (5)
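
The three probabilities in question 13 a) follow directly from the counts given. A minimal sketch with exact fractions is shown below; the variable names are illustrative, and the counts are taken as stated in the question.

```python
from fractions import Fraction

# Counts stated in the question.
total = 100
improved = 60
improved_with_children = 45
unimproved_with_children = 5

# (1) P(improved)
p_improved = Fraction(improved, total)

# (2) P(children): couples with children come from both groups.
p_children = Fraction(improved_with_children + unimproved_with_children, total)

# (3) P(children | improved) = P(children and improved) / P(improved)
p_children_and_improved = Fraction(improved_with_children, total)
p_children_given_improved = p_children_and_improved / p_improved

print(p_improved, p_children, p_children_given_improved)
```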

Or

13. b) State any two reasons why the research hypothesis is not tested directly. Explain them in
brief.

14. a) Explain ANOVA in detail with an example.

Or

14. b) The F test describes the ratio of two sources of variability: that for subjects treated
differently and that for subjects treated similarly. Is there any sense in which the t test for two
independent groups can be viewed likewise?

15. a) Define autocorrelation. How is it calculated? What does a negative correlation
convey?

Or

15. b) What is the philosophy of logistic regression? What kind of model is it? What does logistic
regression predict? Tabulate the cardinal differences between linear and logistic regression.

Part C – Essay Questions (1 x 15 = 15 marks)

16. a) Explain population and samples and their difference.

Or

16. b) Imagine a simple population consisting of only 5 observations: 2, 4, 6, 8, 10. List all
possible samples of size two. Construct a relative frequency table showing the sampling
distribution of the mean.
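
The enumeration in question 16 b) can be sketched in a few lines of Python. The question does not say whether sampling is with or without replacement; the sketch below assumes ordered sampling with replacement, which gives 25 samples. For sampling without replacement, `itertools.combinations` could be used in place of `itertools.product`. All names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10]

# Assumption: ordered samples of size two drawn WITH replacement (5 * 5 = 25).
samples = list(product(population, repeat=2))

# Mean of each sample, kept exact with Fraction.
sample_means = [Fraction(a + b, 2) for a, b in samples]

# Relative frequency of each distinct sample mean.
counts = Counter(sample_means)
relative_frequency = {
    m: Fraction(c, len(samples)) for m, c in sorted(counts.items())
}

for m, rf in relative_frequency.items():
    print(f"mean = {float(m):4.1f}   relative frequency = {rf}")
```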
