Practical 1 and 2-1

The document contains code snippets and output related to data analysis tasks using NumPy and Pandas. The tasks involve computing statistics of arrays, manipulating multi-dimensional arrays, handling missing values in data frames, sorting and filtering data frames. Correlation and covariance are also calculated between columns of a data frame.

Uploaded by

SURAJ BISWAS

31/01/2024

Practical → 1st (a)

Q1 Write programs in Python using NumPy library to do the following:
A. Compute the mean, standard deviation, and variance of a two-dimensional
random integer array along the second axis.
CODE:
import numpy as np
# Two-dimensional array of random integers in [0, 100) with 4 rows and 5 columns
x = np.random.randint(0, 100, size=(4, 5))
print("\nOriginal array:")
print(x)
# axis=1 is the second axis, so each statistic is computed once per row
r1 = np.mean(x, axis=1)
print("\nMean:", r1)
r1 = np.std(x, axis=1)
print("\nstd:", r1)
r1 = np.var(x, axis=1)
print("\nvariance:", r1)

OUTPUT:
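As a cross-check on practical 1(a), the sketch below uses a small fixed array (an illustrative assumption, not the random array from the practical) so that the per-row results can be verified by hand. It also shows the `ddof` argument: NumPy's `var` and `std` divide by N by default, while `ddof=1` divides by N - 1 to give the sample estimate.

```python
import numpy as np

# Hypothetical fixed data so the results can be checked by hand
x = np.array([[1, 2, 3],
              [4, 6, 8]])

row_mean = x.mean(axis=1)               # one mean per row
row_var = x.var(axis=1)                 # population variance (divides by N)
row_std = x.std(axis=1)                 # population standard deviation
row_var_sample = x.var(axis=1, ddof=1)  # sample variance (divides by N - 1)

print("Mean per row:", row_mean)
print("Population variance per row:", row_var)
print("Population std per row:", row_std)
print("Sample variance per row:", row_var_sample)
```

Which divisor is appropriate depends on whether the rows are treated as whole populations or as samples; the practical does not say, so both are shown.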
01/02/2024

Practical → 1st (b)

Q1 Write programs in Python using NumPy library to do the following:
B. Create a 2-dimensional array of size m x n integer elements, also print the shape,
type and data type of the array and then reshape it into an n x m array, where n
and m are user inputs given at the run time
CODE:
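import numpy as np
# Read the dimensions at run time
m = int(input("Enter the number of rows (m): "))
n = int(input("Enter the number of columns (n): "))
# m x n array of random integers in [0, 100)
x = np.random.randint(0, 100, size=(m, n))
print("First Array : ")
print(x)
print("Type of Array")
print(type(x))
print("Shape of Array")
print(x.shape)
print("Data type of Array")
print(x.dtype)
# Reshape the m x n array into an n x m array
reshaped2 = np.reshape(x, (n, m))
print("Second Reshaped Array : ")
print(reshaped2)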

OUTPUT:
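Reshaping an m x n array to n x m is easy to confuse with transposing it. The sketch below, which uses fixed dimensions in place of user input (an assumption made so it runs without prompts), shows that `reshape` keeps the elements in their flattened order while `.T` swaps rows and columns.

```python
import numpy as np

m, n = 2, 3  # hypothetical dimensions standing in for the user input
x = np.arange(m * n).reshape(m, n)

reshaped = x.reshape(n, m)  # same flattened order: 0, 1, 2, 3, 4, 5
transposed = x.T            # rows and columns swapped

print("Original:\n", x)
print("Reshaped to n x m:\n", reshaped)
print("Transposed:\n", transposed)
```

Both results have shape (n, m), but only the reshaped array reads 0 to 5 in row-major order, which is what the practical asks for.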
21/02/2024

Practical → 1st (c)

Q1 Write programs in Python using NumPy library to do the following:
C. Test whether the elements of a given 1D array are zero, non-zero, and NaN.
Record the indices of these elements in three separate arrays.
CODE:
import numpy as np
arr = np.array([0, 1, 2, 0, np.nan, 5, 0])
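# NaN compares unequal to every number, so exclude it explicitly from the non-zero test
zero_indices = np.where(arr == 0)[0]
non_zero_indices = np.where((arr != 0) & ~np.isnan(arr))[0]
nan_indices = np.where(np.isnan(arr))[0]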
print("Zero indices:", zero_indices)
print("Non-zero indices:", non_zero_indices)
print("NaN indices:", nan_indices)

OUTPUT:
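The three tests in practical 1(c) can be wrapped in a small helper. The function name `classify_indices` is hypothetical and introduced only for illustration. The point it demonstrates is that `arr != 0` is True for NaN, so the non-zero mask has to exclude NaN explicitly if the three index arrays are meant to be disjoint.

```python
import numpy as np

def classify_indices(arr):
    """Return (zero, non-zero, NaN) index arrays for a 1D array."""
    arr = np.asarray(arr, dtype=float)
    is_nan = np.isnan(arr)
    zero_idx = np.flatnonzero(arr == 0)
    # NaN != 0 evaluates to True, so mask NaN out of the non-zero set
    non_zero_idx = np.flatnonzero((arr != 0) & ~is_nan)
    nan_idx = np.flatnonzero(is_nan)
    return zero_idx, non_zero_idx, nan_idx

zero_idx, non_zero_idx, nan_idx = classify_indices([0, 1, 2, 0, np.nan, 5, 0])
print("Zero indices:", zero_idx)
print("Non-zero indices:", non_zero_idx)
print("NaN indices:", nan_idx)
```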
21/02/2024

Practical → 1st (d)

Q1 Write programs in Python using NumPy library to do the following:
D. Create three random arrays of the same size: Array1, Array2, and
Array3. Subtract Array 2 from Array3 and store in Array4. Create
another array Array5 having two times the values in Array1. Find
Covariance and Correlation of Array1 with Array4 and Array5
respectively
CODE:
import numpy as np

# Create three random arrays of the same size

size = 100
Array1 = np.random.rand(size)
Array2 = np.random.rand(size)
Array3 = np.random.rand(size)

# Subtract Array2 from Array3 and store in Array4

Array4 = Array3 - Array2

# Create Array5 having two times the values in Array1

Array5 = 2 * Array1

# Find Covariance of Array1 with Array4

covariance_1_4 = np.cov(Array1, Array4)[0][1]

# Find Correlation of Array1 with Array5

correlation_1_5 = np.corrcoef(Array1, Array5)[0][1]

print("Covariance of Array1 with Array4:", covariance_1_4)

print("Correlation of Array1 with Array5:", correlation_1_5)

OUTPUT:
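Because Array5 is an exact multiple of Array1, the correlation in practical 1(d) should come out as 1 (up to rounding), and the covariance of Array1 with Array5 should be twice the sample variance of Array1. The sketch below checks both; the seeded generator and the short names `a1` and `a5` are assumptions made only so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
a1 = rng.random(100)
a5 = 2 * a1

# np.cov and np.corrcoef return 2 x 2 matrices; [0, 1] is the cross term
cov_1_5 = np.cov(a1, a5)[0, 1]
corr_1_5 = np.corrcoef(a1, a5)[0, 1]

print("Covariance of a1 with a5:", cov_1_5)
print("Twice the sample variance of a1:", 2 * np.var(a1, ddof=1))
print("Correlation of a1 with a5:", corr_1_5)
```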
21/02/2024

Practical → 1st (e)

Q1 Write programs in Python using NumPy library to do the following:
E. Create two random arrays of the same size 10: Array1, and Array2. Find the sum
of the first half of both the arrays and the product of the second half of both
arrays.
CODE:
import numpy as np

# Create two random arrays of the same size 10

size = 10
Array1 = np.random.rand(size)
Array2 = np.random.rand(size)

# Find the sum of the first half of both arrays

sum_first_half_Array1 = np.sum(Array1[:size//2])
sum_first_half_Array2 = np.sum(Array2[:size//2])

# Find the product of the second half of both arrays

product_second_half_Array1 = np.prod(Array1[size//2:])
product_second_half_Array2 = np.prod(Array2[size//2:])

print("Sum of the first half of Array1:", sum_first_half_Array1)

print("Sum of the first half of Array2:", sum_first_half_Array2)
print("Product of the second half of Array1:", product_second_half_Array1)
print("Product of the second half of Array2:", product_second_half_Array2)

OUTPUT:
28/02/2024

Practical → 2nd (a)

Q2 Do the following using PANDAS Series:
A. Create a series with 5 elements. Display the series sorted on index and also sorted
on values separately.
CODE:
import pandas as pd

# Create a series with 5 elements

series = pd.Series([10, 5, 8, 2, 7], index=['e', 'a', 'd', 'c', 'b'])

# Display the series sorted on index

sorted_by_index = series.sort_index()

# Display the series sorted on values

sorted_by_values = series.sort_values()

print("Series sorted on index:")

print(sorted_by_index)

print("\nSeries sorted on values:")

print(sorted_by_values)

OUTPUT:
28/02/2024

Practical → 2nd (b)

Q2 Do the following using PANDAS Series:
B. Create a series with N elements with some duplicate values. Find the minimum
and maximum ranks assigned to the values using the ‘first’ and ‘max’ methods.
CODE:
import pandas as pd
series = pd.Series([2, 4, 6, 2, 8, 6, 3, 7, 4, 5])
min_ranks_first = series.rank(method='first')
max_ranks_max = series.rank(method='max')

print("Series:")
print(series)
print("\nMinimum ranks (using 'first' method):")
print(min_ranks_first)
print("\nMaximum ranks (using 'max' method):")
print(max_ranks_max)

OUTPUT:
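The difference between the rank methods is easiest to see on a very short series. The four-element series below is a hypothetical example, not the one from practical 2(b). With `'first'`, tied values are ranked in order of appearance; with `'min'` and `'max'`, every tied value gets the lowest or the highest rank of its group.

```python
import pandas as pd

s = pd.Series([2, 4, 2, 6])  # the value 2 appears twice

ranks = pd.DataFrame({
    'value': s,
    'first': s.rank(method='first'),  # ties broken by position
    'min': s.rank(method='min'),      # ties share the lowest rank
    'max': s.rank(method='max'),      # ties share the highest rank
})
print(ranks)
```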
28/02/2024

Practical → 2nd (c)

Q2 Do the following using PANDAS Series:
C. Display the index value of the minimum and maximum element of a
Series.
CODE:
import pandas as pd

# Create a sample series

series = pd.Series([10, 5, 8, 2, 7])

# Find index value of the minimum element

min_index = series.idxmin()

# Find index value of the maximum element

max_index = series.idxmax()

print("Index value of the minimum element:", min_index)

print("Index value of the maximum element:", max_index)

OUTPUT:
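With the default integer index used in practical 2(c), the index label and the position of an element coincide. The sketch below reuses the labelled series from practical 2(a) to show that `idxmin`/`idxmax` return labels, while `argmin`/`argmax` return integer positions.

```python
import pandas as pd

s = pd.Series([10, 5, 8, 2, 7], index=['e', 'a', 'd', 'c', 'b'])

min_label, max_label = s.idxmin(), s.idxmax()  # index labels
min_pos, max_pos = s.argmin(), s.argmax()      # integer positions

print("Label of minimum:", min_label, "at position", min_pos)
print("Label of maximum:", max_label, "at position", max_pos)
```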
06/03/2024

Practical → 3rd (a)

Q3 Create a data frame having at least 3 columns and 50 rows to store numeric data
generated using a random function. Replace 10% of the values by null values whose
index positions are generated using random function. Do the following:
a. Identify and count missing values in a data frame.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
missing_values_count = df.isnull().sum()
print("Missing values count:")
print(missing_values_count)

OUTPUT:
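Every part of practical 3 rebuilds the same data frame, so the setup can be sketched once as a helper. The function name `make_frame_with_nulls`, its parameters, and the seed are hypothetical choices made for illustration. It draws flat positions over all 150 cells, so exactly 10% of the values (15) become NaN.

```python
import numpy as np
import pandas as pd

def make_frame_with_nulls(n_rows=50, n_cols=3, null_fraction=0.1, seed=0):
    """Random numeric frame with a fixed fraction of cells set to NaN."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_rows, n_cols))
    n_null = int(round(null_fraction * data.size))
    # Flat positions index the array as if it were one-dimensional
    positions = rng.choice(data.size, size=n_null, replace=False)
    data.flat[positions] = np.nan
    columns = [f'Column{i + 1}' for i in range(n_cols)]
    return pd.DataFrame(data, columns=columns)

df = make_frame_with_nulls()
per_column = df.isnull().sum()
total_missing = int(per_column.sum())
print(per_column)
print("Total missing values:", total_missing)
```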
06/03/2024

Practical → 3rd (b)

b. Drop the column having more than 5 null values.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
# Keep only columns with at least 45 non-null values, i.e. drop columns with more than 5 nulls
df=df.dropna(axis=1, thresh=45)
print(df)

OUTPUT:
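The `thresh` argument of `dropna` counts non-null values, not nulls, which is why practical 3(b) passes 45 for a 50-row frame. The sketch below uses a small hypothetical frame and a hypothetical limit `max_nulls` to show the conversion, next to an equivalent boolean-mask form.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1.0, 2.0, 3.0, 4.0],           # 0 nulls
    'B': [1.0, np.nan, 3.0, 4.0],        # 1 null
    'C': [np.nan, np.nan, np.nan, 4.0],  # 3 nulls
})

max_nulls = 1  # drop columns with more than this many nulls
# thresh is the minimum number of non-null values a column needs to survive
kept = df.dropna(axis=1, thresh=len(df) - max_nulls)
# Equivalent form that counts nulls directly
kept_mask = df.loc[:, df.isnull().sum() <= max_nulls]

print(kept)
```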
06/03/2024

Practical → 3rd (c)

c. Identify the row label having maximum of the sum of all values in a row and drop that
row.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
row_sums = df.sum(axis=1)
max_row_label = row_sums.idxmax()
df = df.drop(index=max_row_label)
print(df)

OUTPUT:
06/03/2024

Practical → 3rd (d)

d. Sort the data frame on the basis of the first column.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df_sorted = df.sort_values(by='Column1')
print(df_sorted)

OUTPUT:
06/03/2024

Practical → 3rd (e)

e. Remove all duplicates from the first column.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df_unique=df.drop_duplicates(subset=['Column1'])
print(df_unique)

OUTPUT:
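With continuous random values, exact duplicates in the first column are unlikely, so in practice the rows affected by practical 3(e) are mostly those where Column1 is NaN: `drop_duplicates` treats NaN values as equal to each other and keeps only the first such row. The small frame below is hypothetical and only illustrates that behaviour.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Column1': [1.0, np.nan, 1.0, np.nan, 2.0],
    'Column2': [10, 20, 30, 40, 50],
})

# Keeps the first row for each distinct Column1 value, NaN included
unique_rows = df.drop_duplicates(subset=['Column1'])
print(unique_rows)
```

If rows with a missing Column1 should all be kept, they would need to be handled separately before calling `drop_duplicates`.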
06/03/2024

Practical → 3rd (f)

f. Find the correlation between first and second column and covariance between second
and third column.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
# Calculate correlation between first and second column
correlation_first_second = df['Column1'].corr(df['Column2'])

# Calculate covariance between second and third column

covariance_second_third = df['Column2'].cov(df['Column3'])

print("Correlation between first and second column:", correlation_first_second)

print("Covariance between second and third column:", covariance_second_third)

OUTPUT:
06/03/2024

Practical → 3rd (g)

g. Discretize the second column and create 5 bins.
CODE:
import pandas as pd
import numpy as np
# Generate the data and replace 10% of all values with NaN at random positions
data = np.random.randn(50, 3)
null_positions = np.random.choice(data.size, size=data.size // 10, replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df['Column2_bins'] = pd.cut(df['Column2'], bins=5)
print(df)

OUTPUT:
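`pd.cut` with an integer `bins` argument splits the observed range into equal-width intervals, and missing values stay missing rather than being assigned to a bin. The sketch below uses a short hypothetical series so the bin assignment can be followed by eye; `labels=False` returns the bin number instead of the interval.

```python
import numpy as np
import pandas as pd

values = pd.Series([0.0, 2.5, 5.0, 7.5, 10.0, np.nan])

binned = pd.cut(values, bins=5)               # interval labels
codes = pd.cut(values, bins=5, labels=False)  # bin numbers 0..4
counts = binned.value_counts(sort=False)      # observations per bin

print(binned)
print(codes)
print(counts)
```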
13/03/2024

Practical → 6th
Q Consider the following data frame containing a family name, gender of the family
member and her/his monthly income in each record.

CODE:
import pandas as pd

# Creating the DataFrame

data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}

df = pd.DataFrame(data)
print(df)
OUTPUT:
13/03/2024

Practical → 6th (a)

Q Write a program in Python using Pandas to perform the following:
a. Calculate and display familywise gross monthly income.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
family_income = df.groupby('Name')['MonthlyIncome'].sum()
print("Familywise gross monthly income:")
print(family_income)
print()
OUTPUT:
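The familywise total in practical 6(a) can be extended to several statistics in one pass with `agg`. This is a sketch on the same data frame; the choice of `sum`, `mean` and `max` is an assumption made for illustration and is not part of the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats',
             'Kumar', 'Shah', 'Shah', 'Kumar', 'Vats'],
    'Gender': ['Male', 'Male', 'Female', 'Female', 'Female',
               'Male', 'Male', 'Female', 'Female', 'Male'],
    'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
                      103000.00, 55000.00, 112400.00, 81030.00, 71900.00],
})

# One row per family, one column per statistic
summary = df.groupby('Name')['MonthlyIncome'].agg(['sum', 'mean', 'max'])
print(summary)
```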
13/03/2024

Practical → 6th (b)

Q Write a program in Python using Pandas to perform the following:
b. Calculate and display the member with the highest monthly income.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
highest_income_member = df.loc[df['MonthlyIncome'].idxmax()]
print("Member with the highest monthly income:")
print(highest_income_member)
print()
OUTPUT:
13/03/2024

Practical → 6th (c)

Q Write a program in Python using Pandas to perform the following:
c. Calculate and display monthly income of all members with income greater than Rs.
60000.00.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
high_income_members = df[df['MonthlyIncome'] > 60000.00]
print("Monthly income of members with income greater than Rs. 60000.00:")
print(high_income_members[['Name', 'MonthlyIncome']])
print()
OUTPUT:
13/03/2024

Practical → 6th (d)

Q Write a program in Python using Pandas to perform the following:
d. Calculate and display the average monthly income of the female members
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
female_avg_income = df[df['Gender'] == 'Female']['MonthlyIncome'].mean()
print("Average monthly income of female members:", female_avg_income)

OUTPUT:
21/03/2024

Practical → 7th (a)

Q7. Using the Titanic dataset, do the following:
A. Find the total number of passengers with age less than 30.

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# a. Total number of passengers with age less than 30
passengers_under_30 = titanic_df[titanic_df['Age'] < 30]
total_passengers_under_30 = passengers_under_30.shape[0]
print("Total number of passengers with age less than 30:",
total_passengers_under_30)

OUTPUT:
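A comparison such as `Age < 30` is False for missing ages, so passengers without a recorded age are silently left out of the count in practical 7(a). The sketch below uses a tiny hypothetical stand-in for the `Age` column (it does not read titanic.csv) to show the effect and how to report the number of missing ages alongside the count.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'Age' column of the Titanic data
sample = pd.DataFrame({'Age': [22.0, 35.0, np.nan, 4.0, 29.0]})

under_30 = int((sample['Age'] < 30).sum())     # NaN compares as False
age_missing = int(sample['Age'].isna().sum())  # rows with no recorded age

print("Passengers under 30:", under_30)
print("Passengers with missing age:", age_missing)
```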
21/03/2024

Practical → 7th (b)

Q7. Using the Titanic dataset, do the following:
B. Find total fare paid by passengers of first class.

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# b. Total fare paid by passengers of first class
first_class_fare = titanic_df[titanic_df['Pclass'] == 1]['Fare'].sum()
print("Total fare paid by passengers of first class:", first_class_fare)

OUTPUT:
21/03/2024

Practical → 7th (c)

Q7. Using the Titanic dataset, do the following:
C. Compare number of survivors of each passenger class

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# c. Number of survivors of each passenger class
survivors_by_class = titanic_df.groupby('Pclass')['Survived'].sum()
print("Number of survivors of each passenger class:")
print(survivors_by_class)

OUTPUT:
21/03/2024

Practical → 7th (d)

Q7. Using the Titanic dataset, do the following:
D. Compute descriptive statistics for any numeric attribute genderwise

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# d. Descriptive statistics for age attribute genderwise
descriptive_stats_genderwise = titanic_df.groupby('Sex')['Age'].describe()
print("Descriptive statistics for age attribute genderwise:")
print(descriptive_stats_genderwise)

OUTPUT:
17/04/2024

Practical → 4th
Q4. Consider two Excel files containing the attendance records of two workshops. Each file
has three fields, ‘Name’, ‘Date’ and ‘Duration’ (in minutes), where names are unique within
a file. Note that duration may take one of three values (30, 40, 50) only. Import the data
into two data frames.
CODE:
import pandas as pd

# Create dummy data for workshop1

workshop1_data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Date': ['2024-04-01', '2024-04-02', '2024-04-03', '2024-04-04'],
'Duration': [30, 40, 50, 30]
}

# Create dummy data for workshop2

workshop2_data = {
'Name': ['Alice', 'Eve', 'Charlie', 'Frank'],
'Date': ['2024-04-03', '2024-04-04', '2024-04-05', '2024-04-06'],
'Duration': [30, 40, 50, 30]
}

# Create data frames from the dummy data

df1 = pd.DataFrame(workshop1_data)
df2 = pd.DataFrame(workshop2_data)

# Display the first few rows of each data frame to verify the data
print("Data Frame 1:")
print(df1)

print("\nData Frame 2:")

print(df2)

OUTPUT:
Q. Import the data into two data frames and do the following:
a. Perform a merging of the two data frames to find the names of students
who had attended both workshops.
b. Find the names of all students who have attended a single workshop only.
c. Merge two data frames row-wise and find the total number of records in
the data frame.
d. Merge two data frames row-wise and use two columns viz. names and
dates as multi-row indexes. Generate descriptive statistics for this
hierarchical data frame.
CODE:
# a. Perform merging of the two data frames to find the names of students who had attended both workshops.
attended_both = pd.merge(df1, df2, how='inner', on='Name')
print("\nNames of students who attended both workshops:")
print(attended_both['Name'].unique())

# b. Find names of all students who have attended a single workshop only.
attended_either = pd.merge(df1, df2, how='outer', on='Name', indicator=True)
attended_single = attended_either[attended_either['_merge'].isin(['left_only', 'right_only'])]
print("\nNames of students who attended a single workshop only:")
print(attended_single['Name'].unique())

# c. Merge two data frames row-wise and find the total number of records in the data frame.
merged_df = pd.concat([df1, df2], ignore_index=True)
print("\nTotal number of records in the merged data frame:", len(merged_df))

# d. Merge two data frames row-wise and use two columns viz. names and dates as multi-row indexes.
# Generate descriptive statistics for this hierarchical data frame.
merged_df_multi_index = pd.concat([df1.set_index(['Name', 'Date']), df2.set_index(['Name',
'Date'])], axis=0)
print("\nDescriptive statistics for the hierarchical data frame:")
print(merged_df_multi_index.describe())
OUTPUT:
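Because names are unique within each file, parts (a) and (b) of practical 4 can be cross-checked with plain set operations on the `Name` columns. This is an alternative check on the same dummy names, not a replacement for the merge the question asks for.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David']})
df2 = pd.DataFrame({'Name': ['Alice', 'Eve', 'Charlie', 'Frank']})

names1, names2 = set(df1['Name']), set(df2['Name'])

both = sorted(names1 & names2)    # intersection: attended both workshops
single = sorted(names1 ^ names2)  # symmetric difference: exactly one workshop

print("Attended both workshops:", both)
print("Attended a single workshop only:", single)
```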
24/04/2024

Practical → 5th (a)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
a. Plot bar chart to show the frequency of each class label in the data.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# a. Plot bar chart to show the frequency of each class label in the data.
plt.figure(figsize=(8, 6))
sns.countplot(x='species', data=iris_df)
plt.title('Frequency of each class label')
plt.xlabel('Species')
plt.ylabel('Frequency')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (b)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
b. Draw a scatter plot for petal width vs sepal width and fit a regression line.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# b. Draw a scatter plot for Petal width vs sepal width and fit a regression line
plt.figure(figsize=(8, 6))
sns.regplot(x='petal_width', y='sepal_width', data=iris_df)
plt.title('Petal width vs Sepal width')
plt.xlabel('Petal width (cm)')
plt.ylabel('Sepal width (cm)')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (c)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
c. Plot density distribution for feature petal length.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# c. Plot density distribution for feature petal length.
plt.figure(figsize=(8, 6))
# fill=True shades the area under the curve (replaces the deprecated shade=True)
sns.kdeplot(iris_df['petal_length'], fill=True)
plt.title('Density distribution of Petal length')
plt.xlabel('Petal length (cm)')
plt.ylabel('Density')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (d)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
d. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# d. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.
# pairplot creates its own figure, so no plt.figure() call is needed here
sns.pairplot(iris_df, hue='species')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (e)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
e. Draw heatmap for the four numeric attributes.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Drop the 'species' column
numeric_df = iris_df.drop(columns='species')
# Draw heatmap for the four numeric attributes
plt.figure(figsize=(8, 6))
sns.heatmap(numeric_df.corr(), annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Heatmap for numeric attributes')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (f)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
f. Compute mean, mode, median, standard deviation, confidence interval and
standard error for each feature
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Select the four numeric features
numeric_df = iris_df.select_dtypes(include='number')
# Standard error of the mean (sample standard deviation / sqrt(n))
standard_error = numeric_df.sem()
# Half-width of the 95% confidence interval (normal approximation, z = 1.96)
margin = 1.96 * standard_error
# Combine the statistics into one table with one row per feature
feature_stats = pd.DataFrame({
'mean': numeric_df.mean(),
'mode': numeric_df.mode().iloc[0],  # smallest mode when there are ties
'median': numeric_df.median(),
'std': numeric_df.std(),
'standard error': standard_error,
'95% CI (low)': numeric_df.mean() - margin,
'95% CI (high)': numeric_df.mean() + margin,
})
print("\nFeature statistics:")
print(feature_stats)

OUTPUT:
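The standard error and confidence interval in practical 5(f) follow from two short formulas: the standard error is the sample standard deviation divided by the square root of n, and the 95% interval is the mean plus or minus 1.96 standard errors under a normal approximation. The sketch below works them out on a small hypothetical sample (not the Iris data) so that each step can be checked by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the mean is 5 and the squared deviations sum to 32
sample = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

n = len(sample)
mean = sample.mean()
std = sample.std()      # sample standard deviation (ddof=1)
sem = std / np.sqrt(n)  # standard error of the mean
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print("mean:", mean, "std:", std, "standard error:", sem)
print("95% CI:", (ci_low, ci_high))
```

For a sample this small a t-based multiplier would usually be preferred over 1.96; the normal value is kept here only to match the practical.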
24/04/2024

Practical → 5th (g)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
g. Compute correlation coefficients between each pair of features and plot heatmap.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Exclude non-numeric column 'species'
numeric_columns = iris_df.select_dtypes(include=[float, int]).columns
iris_numeric_df = iris_df[numeric_columns]
# Compute correlation coefficients between each pair of features and plot heatmap
correlation_matrix = iris_numeric_df.corr()
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation heatmap')
plt.show()
OUTPUT:
