Maths lab
Week 1 (07-01-2025):
1. A researcher collects data on the heights (in cm) of students in a class. The data is
recorded in ungrouped format.
Ungrouped Data
The recorded heights of 9 students are:
12, 15, 14, 14, 16, 15, 12, 18, 14
1. Calculate the Mean, Median, and Mode of the dataset.
2. Determine the Variance, Standard Deviation, and Coefficient of Variation of the
dataset.
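As a quick hand check before running the code: sorted, the data are 12, 12, 14, 14, 14, 15, 15, 16, 18, so
\[ \text{Mean} = \frac{130}{9} \approx 14.44, \qquad \text{Median} = 14, \qquad \text{Mode} = 14. \]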
Code:
# Ungrouped data
ungrouped_data <- c(12, 15, 14, 14, 16, 15, 12, 18, 14)
# Mean
mean_value <- mean(ungrouped_data)
# Median
median_value <- median(ungrouped_data)
# Mode (most frequent value)
mode_value <- names(which.max(table(ungrouped_data)))
# Variance (sample variance, n - 1 denominator)
variance_value <- var(ungrouped_data)
# Standard Deviation
sd_value <- sd(ungrouped_data)
# Coefficient of Variation
cv_value <- (sd_value / mean_value) * 100
# Display results
cat("Ungrouped Data Results:\n")
cat("Mean:", mean_value, "\n")
cat("Median:", median_value, "\n")
cat("Mode:", mode_value, "\n")
cat("Variance:", variance_value, "\n")
cat("Standard Deviation:", sd_value, "\n")
cat("Coefficient of Variation:", cv_value, "%\n")
Week 2 (21-01-2025):
Grouped Data
The heights of another set of students are grouped into the following classes along with
their respective frequencies:

Height Range (cm)   Frequency
10 - 19             15
20 - 29             18
30 - 39             14
40 - 49             12
50 - 59             20
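The code below uses the standard grouped-data formulas
\[ \bar{x} = \frac{\sum f_i m_i}{\sum f_i}, \qquad \text{Median} = L + \frac{n/2 - \mathrm{cf}}{f}\,h, \qquad \text{Mode} = L + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)}\,h, \]
where the m_i are class midpoints, L is the lower limit of the median (or modal) class, cf is the cumulative frequency before that class, f (or f1) is its frequency, f0 and f2 are the frequencies of the neighbouring classes, and h is the class width.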
Code:
# Grouped data
classes <- c("10-19", "20-29", "30-39", "40-49", "50-59")
frequencies <- c(15, 18, 14, 12, 20)
midpoints <- c(14.5, 24.5, 34.5, 44.5, 54.5)
h <- 10  # class width

# Mean
mean_value <- sum(midpoints * frequencies) / sum(frequencies)

# Median
cumulative_freq <- cumsum(frequencies)
n <- sum(frequencies)
median_class_index <- which(cumulative_freq >= n / 2)[1]
L <- as.numeric(sub("-.*", "", classes[median_class_index]))  # lower limit of median class
f <- frequencies[median_class_index]
cf_previous <- ifelse(median_class_index == 1, 0, cumulative_freq[median_class_index - 1])
median_value <- L + ((n / 2 - cf_previous) / f) * h

# Mode
modal_class_index <- which.max(frequencies)
L_mode <- as.numeric(sub("-.*", "", classes[modal_class_index]))
f1 <- frequencies[modal_class_index]
f0 <- ifelse(modal_class_index == 1, 0, frequencies[modal_class_index - 1])
f2 <- ifelse(modal_class_index == length(frequencies), 0, frequencies[modal_class_index + 1])
mode_value <- L_mode + ((f1 - f0) / ((f1 - f0) + (f1 - f2))) * h

# Variance, Standard Deviation, Coefficient of Variation
variance <- sum((midpoints - mean_value)^2 * frequencies) / sum(frequencies)
sd_value <- sqrt(variance)
cv <- (sd_value / mean_value) * 100

cat("Mean:", mean_value, "\n")
cat("Median:", median_value, "\n")
cat("Mode:", mode_value, "\n")
cat("Standard Deviation:", sd_value, "\n")
cat("Coefficient of Variation:", cv, "%", "\n")
Week 3 (04-02-2025):
Two golfers, Golfer A and Golfer B, recorded their scores over 20 rounds of golf. Their scores
are as follows:
Golfer A's Scores:
74, 75, 78, 72, 83, 85, 70, 73, 74, 71, 65, 68, 70, 55, 80, 89, 90, 95, 78, 80
Golfer B's Scores:
82, 84, 86, 88, 87, 85, 83, 81, 80, 82, 89, 90, 92, 91, 78, 79, 82, 87, 85, 88
1. Compute the Coefficient of Variation (CV) for both golfers' scores.
2. Compare the CVs and determine which golfer is more consistent in performance.
3. Interpret the results: A lower CV indicates greater consistency. Based on your
calculations, which golfer demonstrates less variation in scores?
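The comparison rests on the coefficient of variation
\[ \mathrm{CV} = \frac{s}{\bar{x}} \times 100\%, \]
where s is the sample standard deviation (R's sd() uses the n - 1 denominator).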
Code:
# Sample datasets
data1 <- c(74, 75, 78, 72, 83, 85, 70, 73, 74, 71, 65, 68, 70, 55, 80, 89, 90, 95, 78, 80)
data2 <- c(82, 84, 86, 88, 87, 85, 83, 81, 80, 82, 89, 90, 92, 91, 78, 79, 82, 87, 85, 88)

# Function to calculate the Coefficient of Variation
cv <- function(x) {
  return((sd(x) / mean(x)) * 100)
}

# Compute CVs
cv1 <- cv(data1)
cv2 <- cv(data2)

# Print CV values
cat("Coefficient of Variation for Golfer A:", cv1, "%\n")
cat("Coefficient of Variation for Golfer B:", cv2, "%\n")

# Compare CVs for consistency
if (cv1 < cv2) {
  cat("Golfer A is more consistent.\n")
} else if (cv1 > cv2) {
  cat("Golfer B is more consistent.\n")
} else {
  cat("Both golfers have the same level of consistency.\n")
}
Output:
Week 4 (11-02-2025):
A fair coin is tossed 10 times. Let X be the number of heads obtained.
1. Find the probability of getting exactly 5 heads.
2. Find the probability of getting at most 5 heads.
3. Compute the mean and variance of the binomial distribution.
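For reference, the hand computations are
\[ P(X = 5) = \binom{10}{5}(0.5)^{10} = \frac{252}{1024} \approx 0.2461, \qquad E[X] = np = 5, \qquad \mathrm{Var}(X) = np(1-p) = 2.5. \]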
Code:
n <- 10   # number of tosses
p <- 0.5  # probability of heads for a fair coin

# P(X = 5): exactly 5 heads
prob_5_heads <- dbinom(5, size = n, prob = p)
cat("Probability of exactly 5 heads:", prob_5_heads, "\n")

# P(X <= 5): at most 5 heads
prob_at_most_5_heads <- pbinom(5, size = n, prob = p)
cat("Probability of at most 5 heads:", prob_at_most_5_heads, "\n")

# Mean and variance of the binomial distribution
mean_value <- n * p
variance_value <- n * p * (1 - p)
cat("Mean of the binomial distribution:", mean_value, "\n")
cat("Variance of the binomial distribution:", variance_value, "\n")
Output:
Week 5 (18-02-2025):
A researcher is studying the relationship between the heights of fathers and their sons. The
recorded heights (in inches) of 8 father-son pairs are given below:
Father's Heights (in inches): 65, 66, 67, 67, 68, 69, 70, 72
Son's Heights (in inches): 67, 68, 65, 68, 72, 72, 69, 71
1. Compute the Karl Pearson correlation coefficient to determine the strength and
direction of the relationship between the heights of fathers and their sons.
2. Interpret the correlation coefficient:
- Is the relationship positive or negative?
- Is it weak, moderate, or strong?
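cor() with method = "pearson" computes
\[ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}. \]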
Code:
father_height <- c(65, 66, 67, 67, 68, 69, 70, 72)
son_height <- c(67, 68, 65, 68, 72, 72, 69, 71)

# Karl Pearson correlation coefficient
correlation <- cor(father_height, son_height, method = "pearson")
cat("Karl Pearson correlation coefficient:", correlation, "\n")

# Scatter plot with fitted regression line
plot(father_height, son_height,
     main = "Scatter Plot: Father's Height vs Son's Height",
     xlab = "Father's Height (inches)", ylab = "Son's Height (inches)",
     pch = 16, col = "blue")
abline(lm(son_height ~ father_height), col = "red", lwd = 2)
Output:
Week 6 (25-02-2025):
The English and Mathematics scores of 5 students are given below:
English Marks: 75, 40, 52, 65, 60
Maths Marks: 25, 42, 35, 29, 33
1. Compute Spearman’s Rank Correlation Coefficient manually using the formula.
2. Verify the result using a built-in function.
3. Interpret the correlation – does a higher English score relate to a higher or lower
Maths score?
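A hand computation for this data: the English ranks are (5, 1, 2, 4, 3) and the Maths ranks are (1, 5, 4, 2, 3), so d = (4, -4, -2, 2, 0) and the squared differences sum to 40, giving
\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 40}{5 \times 24} = -1, \]
a perfect negative rank correlation: higher English marks correspond to lower Maths marks.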
Code:
# Data
english_marks <- c(75, 40, 52, 65, 60)
maths_marks <- c(25, 42, 35, 29, 33)

# Rank the data
rank_english <- rank(english_marks)
rank_maths <- rank(maths_marks)

# Rank differences and their squares
d <- rank_english - rank_maths
d_squared <- d^2

# Number of observations
n <- length(english_marks)

# Apply Spearman's formula (valid when there are no tied ranks)
rho_manual <- 1 - (6 * sum(d_squared)) / (n * (n^2 - 1))

# Verify using the built-in function
rho_builtin <- cor(english_marks, maths_marks, method = "spearman")

# Print results
print(paste("Spearman's Rank Correlation (manual):", rho_manual))
print(paste("Spearman's Rank Correlation (built-in):", rho_builtin))
Output:
2. Find the linear regression line for the following data:
X  1.0  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0
Y  8.5  7.5  7.9  6.5  7.5  4.2  3.8  7.9  9.2  8.1  8.3
Also find the error (residual) for each x and the sum of the squares of the errors.
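lm() computes the least-squares estimates
\[ b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x}, \]
and the errors are the residuals e_i = y_i - (a + b x_i), whose squared sum is the SSE.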
Code:
# Given data
x <- c(1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0)
y <- c(8.5, 7.5, 7.9, 6.5, 7.5, 4.2, 3.8, 7.9, 9.2, 8.1, 8.3)
# Perform linear regression
model <- lm(y ~ x)
# Get predicted values
y_pred <- predict(model)
# Compute residuals (errors)
errors <- y - y_pred
# Compute Sum of Squared Errors (SSE)
SSE <- sum(errors^2)
print(summary(model))
# Print each error (residual)
print("Errors (Residuals) for each case:")
error_table <- data.frame(x, y, y_pred, errors)
print(error_table)
# Print Sum of Squared Errors
print(paste("Sum of Squared Errors (SSE):", round(SSE, 2)))
# Plot data and regression line
plot(x, y, main = "Linear Regression", xlab = "X Values", ylab = "Y Values",
     pch = 16, col = "blue")
abline(model, col = "red", lwd = 2)
Week 7 (04-03-2025):
1. Test the following hypothesis using R:
H0: the coin is fair, p(H) = 0.5
H1: the coin is biased, p(H) = 0.6
n = 100, x = 40 heads, alpha = 0.05
Test the hypothesis at the alpha = 0.05 significance level.
Code:
# Given values
n <- 100 # sample size
x <- 40 # number of heads
p0 <- 0.5 # null hypothesis proportion (fair coin)
alpha <- 0.05 # significance level
# Sample proportion
phat <- x / n
# Standard error under H0
se <- sqrt(p0 * (1 - p0) / n)
# Z-test statistic
z <- (phat - p0) / se
# Two-tailed p-value
p_value <- 2 * pnorm(-abs(z))
# Critical Z-value for alpha = 0.05
z_critical <- qnorm(1 - alpha / 2)
# Results
cat("Sample proportion:", phat, "\n")
cat("Z statistic:", z, "\n")
cat("Critical Z value:", z_critical, "\n")
cat("P-value:", p_value, "\n")
# Decision
if (abs(z) > z_critical) {
cat("Reject the null hypothesis: The coin is biased.\n")
} else {
cat("Fail to reject the null hypothesis: The coin is fair.\n")
}
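As a cross-check (an added sketch, not part of the original lab code), base R's exact binomial test can be run on the same counts:
# Optional exact-test cross-check (added sketch)
binom.test(x = 40, n = 100, p = 0.5, alternative = "two.sided")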
Output:
2. Using R, find a quadratic regression model for the following data:
X  3    4    5    6    7
Y  2.5  3.2  3.8  6.5  11.5
Also find the regression of X on Y.
Code:
# Given data
X <- c(3, 4, 5, 6, 7)
Y <- c(2.5, 3.2, 3.8, 6.5, 11.5)
# Create a data frame
data <- data.frame(X, Y)
# Fit a quadratic regression model
model <- lm(Y ~ X + I(X^2), data = data)
# Display model summary
summary(model)
# Print the quadratic equation coefficients
coefficients <- coef(model)
cat("Quadratic Regression Model: Y =", round(coefficients[3], 4), "X² +", round(coefficients[2],
4), "X +", round(coefficients[1], 4), "\n")
# Plot the data and regression curve
plot(X, Y, pch = 16, col = "blue", main = "Quadratic Regression", xlab = "X", ylab = "Y")
curve(coefficients[1] + coefficients[2] * x + coefficients[3] * x^2, add = TRUE, col = "red", lwd
= 2)
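The "X on Y" part is not covered by the code above; a minimal added sketch, assuming a simple linear regression of X on Y is intended:
# Regression of X on Y (added sketch; assumes a linear X-on-Y fit is intended)
model_x_on_y <- lm(X ~ Y, data = data)
summary(model_x_on_y)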
Output:
Week 8 (11-03-2025):
1. Fit a straight line using the method of least squares:
X  1     2     3     4     5     6     7     8     9     10
Y  52.5  58.7  65    70.2  75.4  81.1  87.2  95.5  102.2 108.4
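No code was recorded for this item; a minimal R sketch with lm(), in the style of the other weeks, would be:
Code:
# Least-squares straight-line fit (added sketch)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(52.5, 58.7, 65, 70.2, 75.4, 81.1, 87.2, 95.5, 102.2, 108.4)
fit <- lm(y ~ x)    # least-squares estimates of intercept and slope
print(coef(fit))    # fitted line: y = a + b * x
plot(x, y, pch = 16, col = "blue", main = "Least Squares Fit", xlab = "X", ylab = "Y")
abline(fit, col = "red", lwd = 2)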
Output:
2. It is claimed that a random sample of 49 tyres had a mean life of 15200 km. This sample
was drawn from a population whose mean is 15150 km and whose standard deviation is 1200 km.
Test the significance at the 5% level.
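By hand, the test statistic is
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{15200 - 15150}{1200/\sqrt{49}} = \frac{50}{171.43} \approx 0.29, \]
which is well below the two-tailed critical value of 1.96.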
Code:
sample_mean <- 15200
pop_mean <- 15150
sd <- 1200
n <- 49
alpha <- 0.05
# H0: pop_mean = 15150
# H1: pop_mean != 15150 --> two-tailed test
z_score <- (sample_mean - pop_mean) / (sd / sqrt(n))
critical_value <- qnorm(1 - alpha / 2)
cat("Z-score:", round(abs(z_score), 3), "\n")
cat("critical value : for Z alpha/2", round(critical_value, 4), "\n")
if (abs(z_score) > critical_value) {
cat("Reject the null hypothesis H0: There is significant evidence that the mean differs from
15150 km.\n")
} else {
cat("Fail to reject the null hypothesis H0: There is not enough evidence to conclude that the
mean differs from 15150 km.\n")
}
Output:
Week 9 (18-03-2025):
1. An experiment was performed to compare the abrasive wear of two different laminated
materials. Twelve pieces of material 1 were tested by exposing each piece to a machine
measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of
wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with
a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with
a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the
abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the
populations to be approximately normal with equal variances.
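The code implements the pooled two-sample t test,
\[ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}, \qquad t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{1/n_1 + 1/n_2}}, \]
with d_0 = 2 and n_1 + n_2 - 2 = 20 degrees of freedom.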
Code:
n1<-12
n2<-10
x1<-85
x2<-81
sd1<-4
sd2<-5
alpha<-0.05
diff_0 <- 2 #null hypothesis
sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
t_stat <- ((x1 - x2) - diff_0) / (sp * sqrt(1/n1 + 1/n2))
df <- n1 + n2 - 2  # degrees of freedom
t_critical <- qt(1 - alpha, df)
p_value <- 1 - pt(t_stat, df)
cat("Test Statistic (t):", t_stat, "\n")
cat("Critical Value (t_critical):", t_critical, "\n")
cat("p-value:", p_value, "\n")
if (t_stat > t_critical) {
cat("Reject the null hypothesis: Material 1's wear exceeds Material 2's by more than 2
units.\n")
} else {
cat("Fail to reject the null hypothesis: No significant evidence that Material 1's wear exceeds
Material 2's by more than 2 units.\n")
}
Output:
Week 10 (01-04-2025):
1. Two materials are tested for wear.
Material 1: n1 = 12, mean = 85, SD = 4
Material 2: n2 = 10, mean = 81, SD = 5
Test if Material 1's wear exceeds Material 2's by more than 2 units, using alpha = 0.05.
Assume equal variances.
State the test statistic, critical value, p-value, and conclusion.
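Note that the code recorded below actually carries out a left-tailed F test for equality of variances on a different dataset (two companies with s1 = 0.035, s2 = 0.062, n1 = n2 = 12). Its statistic is
\[ F = \frac{s_1^2}{s_2^2} \approx 0.319, \]
which under H0 follows an F distribution with (n_1 - 1, n_2 - 1) degrees of freedom; H0 is rejected when F falls below the lower critical value qf(alpha, n1 - 1, n2 - 1).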
Code:
# Given data
s1 <- 0.035 # Standard deviation of Company 1
s2 <- 0.062 # Standard deviation of Company 2
n1 <- 12 # Sample size for Company 1
n2 <- 12 # Sample size for Company 2
alpha <- 0.05 # Level of significance
# Compute the F-test statistic
F_stat <- (s1^2) / (s2^2)
# Compute critical value for left-tailed test
F_critical <- qf(alpha, df1 = n1 - 1, df2 = n2 - 1)
# Display results
cat("F-test statistic:", F_stat, "\n")
cat("Critical value (left-tailed):", F_critical, "\n")
# Decision rule
if (F_stat < F_critical) {
cat("Reject the null hypothesis: There is evidence that Company 1 has less variability.\n")
} else {
cat("Fail to reject the null hypothesis: No significant evidence that Company 1 has less
variability.\n")
}
Output:
2. A machine produces metal rods. A sample of 15 rods has a standard deviation of 0.61 mm.
Test if the population standard deviation is greater than 0.50 mm at the 5% significance level.
Use the chi-square test. State the test statistic, critical value, and conclusion.
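By hand, the statistic is
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{14 \times 0.61^2}{0.50^2} \approx 20.84, \]
to be compared with the right-tailed critical value \( \chi^2_{0.05,\,14} \approx 23.68 \).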
Code:
# Given data
s <- 0.61 # Sample standard deviation
sigma_0 <- 0.50 # Hypothesized standard deviation
n <- 15 # Sample size
alpha <- 0.05 # Level of significance
# Compute the chi-square test statistic
chi_sq_stat <- (n - 1) * (s^2) / (sigma_0^2)
# Compute critical value for right-tailed test
df <- n - 1 # Degrees of freedom
chi_critical <- qchisq(1 - alpha, df)
# Display results
cat("Chi-square test statistic:", chi_sq_stat, "\n")
cat("Critical value:", chi_critical, "\n")
# Decision rule
if (chi_sq_stat > chi_critical) {
cat("Reject the null hypothesis: The standard deviation is significantly greater than 0.50
mm.\n")
} else {
cat("Fail to reject the null hypothesis: No significant evidence that the standard deviation is
greater than 0.50 mm.\n")
}
Output: