Question Bank

The document outlines a comprehensive curriculum for an Exploratory Data Analysis course, covering key topics in Python, NumPy, Pandas, data visualization with Matplotlib, and machine learning. Each module includes specific questions and programming tasks designed to assess understanding and application of data analysis techniques. The course emphasizes practical skills through coding exercises and theoretical concepts essential for data manipulation and analysis.

Uploaded by

parvithac31

IMPORTANT QUESTIONS

Exploratory Data Analysis (BDS613B)

Module 1: Introduction to Python and NumPy:

1. Explain the role of IPython and Jupyter in data analysis. (5 Marks)
2. Describe the enhanced interactive features of IPython. (5 Marks)
3. What are NumPy arrays? Explain their importance in scientific computing. (10 Marks)
4. Write a Python program to demonstrate the creation and manipulation of a NumPy array. (10 Marks)
5. Explain how NumPy handles structured data with examples. (10 Marks)
6. Compare NumPy arrays with Python lists in terms of performance and functionality. (5 Marks)
7. How are sorted arrays utilized in data analysis tasks? (5 Marks)
8. What is the significance of NumPy’s structured arrays in handling complex datasets? (10 Marks)
9. Illustrate the process of indexing and slicing in NumPy arrays with examples. (10 Marks)
10. Explain the key differences between 1D, 2D, and multi-dimensional arrays in NumPy. (10 Marks)
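
As a starting point for the programming questions in this module (array creation, indexing and slicing, structured arrays, and sorting), the following is a minimal sketch rather than a model answer. The variable names, field names, and sample values are made up for illustration.

```python
import numpy as np

# Creation: a 1D range reshaped into a 3x4 two-dimensional array.
a = np.arange(12).reshape(3, 4)

# Indexing and slicing: one element, one row, and a sub-block.
element = a[1, 2]      # row 1, column 2
first_row = a[0]       # all of row 0
block = a[:2, 1:3]     # rows 0-1, columns 1-2

# Vectorized arithmetic applies to every element without a Python loop.
doubled = a * 2

# Structured array: each record has named, typed fields.
# The field names and records are hypothetical sample data.
people = np.array(
    [("Asha", 31, 55.5), ("Ravi", 25, 72.0), ("Meena", 42, 61.2)],
    dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")],
)
older_than_30 = people[people["age"] > 30]["name"]

# Sorting: np.sort returns a sorted copy; np.argsort returns the ordering,
# which can then be used to reorder a related field.
ages_sorted = np.sort(people["age"])
names_by_age = people["name"][np.argsort(people["age"])]
```

Printing each variable in a Jupyter cell is a quick way to connect the code to the written explanations the questions ask for.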

Module 2: Data Manipulation with Pandas – I:

1. Explain the concept of Pandas objects and their types. (5 Marks)
2. How can missing data be handled in Pandas? Provide examples. (10 Marks)
3. Write a Python program to demonstrate hierarchical indexing in Pandas. (10 Marks)
4. What are pivot tables in Pandas, and how are they useful in data analysis? (10 Marks)
5. Discuss the significance of DataFrame and Series in Pandas. (5 Marks)
6. Compare Pandas with NumPy for data manipulation tasks. (5 Marks)
7. Explain the use of .groupby() in Pandas for aggregating data. (10 Marks)
8. How do you load and save data using Pandas? Illustrate with examples. (10 Marks)
9. Discuss the advantages of using Pandas for handling time-series data. (10 Marks)
10. Write a Python program to demonstrate basic data manipulation using Pandas. (10 Marks)
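
The questions above on missing data, .groupby(), pivot tables, and hierarchical indexing can be explored with a small sketch like the one below. It is one possible illustration, not a prescribed answer; the DataFrame contents and column names are invented sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing sales value.
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],
    "year": [2023, 2024, 2023, 2024],
    "sales": [100.0, np.nan, 80.0, 120.0],
})

# Missing data: count the gaps, then fill them with the column mean.
missing_count = int(df["sales"].isna().sum())
filled = df.fillna({"sales": df["sales"].mean()})

# Aggregation: total sales per city.
totals = filled.groupby("city")["sales"].sum()

# Pivot table: cities as rows, years as columns.
pivot = filled.pivot_table(values="sales", index="city",
                           columns="year", aggfunc="sum")

# Hierarchical indexing: a two-level (city, year) index.
indexed = filled.set_index(["city", "year"]).sort_index()
delhi_2024 = indexed.loc[("Delhi", 2024), "sales"]
```

Filling with the mean is only one strategy; dropna(), forward fill, or interpolation may suit a given dataset better.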

Module 3: Data Manipulation with Pandas – II:

1. Explain vectorized string operations in Pandas with examples. (10 Marks)
2. How does Pandas handle time-series data? Provide examples. (10 Marks)
3. What are the benefits of using the eval and query methods in Pandas? (5 Marks)
4. Write a Python program to demonstrate the use of eval in Pandas for high-performance operations. (10 Marks)
5. Discuss the challenges and solutions in working with large datasets in Pandas. (10 Marks)
6. Illustrate the use of .merge() and .concat() for data combination in Pandas. (10 Marks)
7. How can you perform indexing and selection in Pandas DataFrames? (5 Marks)
8. Explain the process of reshaping and pivoting data in Pandas. (10 Marks)
9. Compare vectorized operations with iterative approaches in Pandas. (5 Marks)
10. Write a Python script to demonstrate handling and analyzing time-series data. (10 Marks)
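
For the programming questions in this module, the sketch below touches vectorized string methods, .merge() and concat, eval and query, and basic time-series resampling. All data and names are hypothetical, and the example is deliberately tiny; the performance benefit of eval generally only shows on much larger frames.

```python
import pandas as pd

# Vectorized string operations: missing values pass through as missing.
names = pd.Series(["  alice smith", "BOB JONES ", None])
clean = names.str.strip().str.title()

# Combining data: an inner join on a key, and row-wise concatenation.
left = pd.DataFrame({"id": [1, 2, 3], "score": [70, 85, 90]})
right = pd.DataFrame({"id": [2, 3, 4], "grade": ["B", "A", "C"]})
merged = left.merge(right, on="id", how="inner")
stacked = pd.concat([left, left], ignore_index=True)

# eval and query: column expressions written as strings.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
df["c"] = df.eval("a + b")
big = df.query("c > 15")

# Time series: a daily series resampled to 2-day sums, plus a rolling mean.
ts = pd.Series(range(6),
               index=pd.date_range("2024-01-01", periods=6, freq="D"))
two_day_sum = ts.resample("2D").sum()
rolling_mean = ts.rolling(window=3).mean()
```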

Module 4: Data Visualization with Matplotlib:

1. Explain the general tips for creating visualizations with Matplotlib. (5 Marks)
2. Write a Python program to create a simple line plot using Matplotlib. (10 Marks)
3. How can scatter plots be created and customized in Matplotlib? (10 Marks)
4. Discuss the role of Seaborn in enhancing data visualizations. (5 Marks)
5. Compare Matplotlib with Seaborn in terms of functionality and ease of use. (10 Marks)
6. Illustrate how to create a histogram using Matplotlib. (5 Marks)
7. Explain the use of color maps in visualizing data with Matplotlib. (10 Marks)
8. Write a Python script to create multiple subplots in a single figure. (10 Marks)
9. How can Seaborn be used for correlation heatmaps? Provide an example. (10 Marks)
10. Discuss the best practices for designing effective visualizations. (10 Marks)
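
The plotting questions above (line plot, scatter plot with a color map, histogram, and multiple subplots) can be combined into one figure, as in the sketch below. The data is randomly generated, the output filename is hypothetical, and the Agg backend is chosen only so the script runs without a display; in a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 100)
samples = rng.normal(size=500)

# One figure with three subplots side by side.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Line plot with title, axis labels, and a legend.
axes[0].plot(x, np.sin(x), label="sin(x)")
axes[0].set_title("Line plot")
axes[0].set_xlabel("x")
axes[0].set_ylabel("sin(x)")
axes[0].legend()

# Scatter plot: a third variable is encoded through the color map.
sc = axes[1].scatter(samples[:100], samples[100:200],
                     c=samples[200:300], cmap="viridis", s=20)
axes[1].set_title("Scatter plot")
fig.colorbar(sc, ax=axes[1], label="third variable")

# Histogram of all samples in 20 bins.
counts, bin_edges, _ = axes[2].hist(samples, bins=20)
axes[2].set_title("Histogram")

fig.tight_layout()
fig.savefig("eda_plots.png")  # hypothetical output filename
```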

Module 5: Introduction to Machine Learning:

1. Define machine learning and explain its significance in data analysis. (5 Marks)
2. Write a Python program to demonstrate the use of Scikit-Learn for simple linear regression. (10 Marks)
3. What are hyperparameters in machine learning models? Explain their importance. (10 Marks)
4. How is model validation performed in Scikit-Learn? (5 Marks)
5. Compare supervised and unsupervised learning with examples. (10 Marks)
6. Explain the process of splitting datasets into training and testing sets. (5 Marks)
7. Discuss the steps involved in building a machine learning pipeline in Scikit-Learn. (10 Marks)
8. How is cross-validation used for model evaluation? Provide an example. (10 Marks)
9. Write a Python script to demonstrate classification using decision trees in Scikit-Learn. (10 Marks)
10. Explain the challenges faced in machine learning projects and how they can be addressed. (10 Marks)
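
Several of the questions above (simple linear regression, train/test splitting, pipelines, cross-validation, and decision-tree classification) can be tried out with a short Scikit-Learn sketch such as the following. The regression data is synthetic, the Iris dataset is used only because it ships with Scikit-Learn, and the hyperparameter values (max_depth=3, cv=5, test_size=0.25) are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Simple linear regression on synthetic data: y = 3x + 2 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)
reg = LinearRegression().fit(X, y)
slope, intercept = reg.coef_[0], reg.intercept_

# Classification: hold out 25% of the Iris data for testing.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25,
    random_state=0, stratify=iris.target,
)

# Pipeline: scaling followed by a decision tree.
# max_depth is a hyperparameter, set before training rather than learned.
clf = make_pipeline(
    StandardScaler(),
    DecisionTreeClassifier(max_depth=3, random_state=0),
)
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

# 5-fold cross-validation on the full dataset.
cv_scores = cross_val_score(clf, iris.data, iris.target, cv=5)
```

Decision trees do not need feature scaling; the scaler is included only to show how pipeline steps chain together.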
