Experiment-7 Probability Distributions
1. A dataset consists of the following values:
12, 19, 15, 22, 26, 21, 24, 22, 20, 23
a) Calculate the mean, median, and mode.
b) Write an R script to compute these values.
Code:
data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)
mean_value <- mean(data)
median_value <- median(data)
get_mode <- function(x) {
uniq_x <- unique(x)
uniq_x[which.max(tabulate(match(x, uniq_x)))]
}
mode_value <- get_mode(data)
mean_value
median_value
mode_value
Output:
> data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)
>
> mean_value <- mean(data)
>
> median_value <- median(data)
>
> get_mode <- function(x) {
+ uniq_x <- unique(x)
+ uniq_x[which.max(tabulate(match(x, uniq_x)))]
+ }
> mode_value <- get_mode(data)
>
> mean_value
[1] 20.4
> median_value
[1] 21.5
> mode_value
[1] 22
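The helper above returns only the first most-frequent value, so a tie between two values would go unreported. A minimal sketch of a variant that returns every tied value follows; the name get_all_modes is hypothetical and not part of the original script.

```r
# Hypothetical helper: return every value tied for the highest frequency
get_all_modes <- function(x) {
  counts <- table(x)                                # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)])  # keep the value(s) with the top count
}

data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)
get_all_modes(data)               # one mode for this dataset
get_all_modes(c(1, 1, 2, 2, 3))   # two values tie here, both are returned
```

For the dataset in this question the two approaches agree, because 22 is the only repeated value.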
2. For the dataset (10, 12, 14, 16, 18, 20, 22, 24, 26, 28), calculate:
a) Variance
b) Standard Deviation
c) Write an R script to compute these values.
Code:
values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
variance <- var(values)
standard_deviation <- sd(values)
variance
standard_deviation
Saad Asif | Roll Number: 58 | SY-Mech | Batch A3 | Experiment 7
Output:
> values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
> variance <- var(values)
> standard_deviation <- sd(values)
> variance
[1] 36.66667
> standard_deviation
[1] 6.055301
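Note that var() and sd() in R use the sample formulas, which divide by n - 1. The sketch below recomputes both by hand and shows the population variant (dividing by n) for comparison; the variable names are illustrative only.

```r
values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
n <- length(values)

ss <- sum((values - mean(values))^2)   # sum of squared deviations from the mean
sample_var <- ss / (n - 1)             # same definition as var(values)
population_var <- ss / n               # population variance, divides by n
sample_sd <- sqrt(sample_var)          # same definition as sd(values)

sample_var
population_var
sample_sd
```

If the ten values are treated as the whole population rather than a sample, the population figure is the appropriate one to report.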
Binomial Distribution
3. A factory produces screws, and 5% of them are defective. If a random sample of 15
screws is taken, what is the probability that exactly 2 screws are defective? Write an R
script to calculate this probability.
Code:
dbinom(2, size=15, prob=0.05)
Output:
> dbinom(2, size=15, prob=0.05)
[1] 0.1347523
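As a cross-check, the same probability can be obtained from the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The short sketch below uses illustrative variable names.

```r
n <- 15    # sample size
k <- 2     # number of defective screws
p <- 0.05  # probability that a single screw is defective

# Binomial probability mass function written out term by term
manual_prob <- choose(n, k) * p^k * (1 - p)^(n - k)
manual_prob
```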
4. A call center receives 5% of its calls as complaints. If the center receives 50 calls, what
is the probability that:
a) Exactly 3 calls are complaints?
b) At most 2 calls are complaints?
Code:
#a
dbinom(3, size=50, prob=0.05)
#b
pbinom(2, size=50, prob=0.05)
Output:
> #a
> dbinom(3, size=50, prob=0.05)
[1] 0.2198748
> #b
> pbinom(2, size=50, prob=0.05)
[1] 0.5405331
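Part (b) relies on pbinom() returning the cumulative probability P(X <= 2). A small sketch of what that means, summing the individual probabilities for 0, 1, and 2 complaints, is given below; the complement "at least 3" is included only as an illustration.

```r
# P(X <= 2) as the sum of the individual probabilities for 0, 1 and 2 complaints
at_most_2 <- sum(dbinom(0:2, size = 50, prob = 0.05))

# Complement: P(X >= 3), using the upper tail directly
at_least_3 <- pbinom(2, size = 50, prob = 0.05, lower.tail = FALSE)

at_most_2
at_least_3
```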
Normal Distribution
5. A standardized test has scores that follow a normal distribution with a mean of 80 and a
standard deviation of 10.
a) What is the probability that a randomly selected student scores above 90?
b) Write an R script to compute this probability.
Code:
1 - pnorm(90, mean = 80, sd = 10)
Output:
> 1 - pnorm(90, mean = 80, sd = 10)
[1] 0.1586553
6. IQ scores follow a normal distribution with a mean of 100 and a standard deviation of
15.
a) What is the probability that a randomly chosen person has an IQ above 120?
b) What is the IQ score at the 90th percentile?
Code:
# Probability of IQ above 120
1 - pnorm(120, mean = 100, sd = 15)
# IQ score at the 90th percentile
qnorm(0.90, mean = 100, sd = 15)
Output:
> # Probability of IQ above 120
> 1 - pnorm(120, mean = 100, sd = 15)
[1] 0.09121122
> # IQ score at the 90th percentile
> qnorm(0.90, mean = 100, sd = 15)
[1] 119.2233
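Both answers can also be reached by standardizing to a z-score, z = (x - mean) / sd, and working with the standard normal distribution. The sketch below does this and uses lower.tail = FALSE in place of the 1 - pnorm(...) form; variable names are illustrative.

```r
iq_mean <- 100
iq_sd <- 15

# (a) Standardize 120, then take the upper tail of the standard normal
z <- (120 - iq_mean) / iq_sd
p_above_120 <- pnorm(z, lower.tail = FALSE)

# (b) 90th percentile: standard normal quantile rescaled to the IQ scale
iq_90 <- iq_mean + iq_sd * qnorm(0.90)

p_above_120
iq_90
```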
Poisson Distribution
7. A hospital receives an average of 6 emergency cases per hour.
a) What is the probability that exactly 4 emergency cases occur in an hour?
b) What is the probability that more than 7 emergency cases occur in an hour?
Code:
# Average number of emergency cases per hour
lambda <- 6
# Probability of exactly 4 emergency cases
prob_exact_4 <- dpois(4, lambda)
prob_exact_4
# Probability of more than 7 emergency cases
prob_more_than_7 <- 1 - ppois(7, lambda)
prob_more_than_7
Output:
> # Average number of emergency cases per hour
> lambda <- 6
> # Probability of exactly 4 emergency cases
> prob_exact_4 <- dpois(4, lambda)
> prob_exact_4
[1] 0.1338526
> # Probability of more than 7 emergency cases
> prob_more_than_7 <- 1 - ppois(7, lambda)
> prob_more_than_7
[1] 0.2560202
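For reference, the Poisson probability in part (a) follows from P(X = k) = e^(-lambda) lambda^k / k!, and the upper tail in part (b) can be requested directly. A brief sketch with illustrative names:

```r
lambda <- 6  # average number of emergency cases per hour

# (a) Poisson probability mass function written out for k = 4
manual_exact_4 <- exp(-lambda) * lambda^4 / factorial(4)

# (b) P(X > 7) taken directly from the upper tail
more_than_7 <- ppois(7, lambda, lower.tail = FALSE)

manual_exact_4
more_than_7
```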
Central Limit Theorem
8. A bakery sells pastries, and individual sales are right-skewed with a mean of $5.50 and a
standard deviation of $2.00. If we take a random sample of 40 customers, what is the
probability that the sample mean spending is greater than $6.00?
Code:
std_dev <- 2.00 # Population standard deviation
n <- 40 # Sample size
# Compute standard error
SE <- std_dev / sqrt(n)
# Compute probability
1 - pnorm(6, mean = 5.5, sd = SE)
Output:
> std_dev <- 2.00 # Population standard deviation
> n <- 40 # Sample size
> # Compute standard error
> SE <- std_dev / sqrt(n)
> # Compute probability
> 1 - pnorm(6, mean = 5.5, sd = SE)
[1] 0.05692315
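The normal calculation above is an approximation, since individual sales are right-skewed. One rough way to gauge its accuracy is to assume a specific skewed population and compare. The sketch below assumes, purely for illustration, that sales follow a gamma distribution with the stated mean and standard deviation; the question does not specify the actual distribution.

```r
pop_mean <- 5.5
pop_sd <- 2
n <- 40

# ASSUMPTION: gamma-distributed sales with matching mean and sd
shape <- (pop_mean / pop_sd)^2  # shape / rate   = mean
rate <- pop_mean / pop_sd^2     # shape / rate^2 = variance

# The mean of n independent Gamma(shape, rate) values is Gamma(n * shape, n * rate)
exact_gamma <- pgamma(6, shape = n * shape, rate = n * rate, lower.tail = FALSE)

# Central Limit Theorem approximation, as in the answer above
clt_approx <- pnorm(6, mean = pop_mean, sd = pop_sd / sqrt(n), lower.tail = FALSE)

exact_gamma
clt_approx
```

Under this assumed population the two values should be close, which is what the Central Limit Theorem suggests for a sample of 40.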
t-Test
9. A fitness trainer claims that his program increases strength significantly. Before training,
10 clients had an average bench press of 100 kg. After 6 weeks of training, their average
bench press increased to 105 kg with a standard deviation of 5 kg. Is the strength
increase significant at 5% level?
Code:
n <- 10
mean_diff <- 5
sd_diff <- 5
set.seed(123)
# Simulate differences for 10 clients
differences <- rnorm(n, mean = mean_diff, sd = sd_diff)
# Perform a one-sample t-test on the differences (testing if mean > 0)
t_test_result <- t.test(differences, mu = 0, alternative = "greater")
print(t_test_result)
Output:
> n <- 10
> mean_diff <- 5
> sd_diff <- 5
> set.seed(123)
>
> # Simulate differences for 10 clients
> differences <- rnorm(n, mean = mean_diff, sd = sd_diff)
>
> # Perform a one-sample t-test on the differences (testing if mean > 0)
> t_test_result <- t.test(differences, mu = 0, alternative = "greater")
>
> print(t_test_result)
One Sample t-test
data: differences
t = 3.5629, df = 9, p-value = 0.003046
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
2.608675 Inf
sample estimates:
mean of x
5.373128
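Because the script simulates the ten differences, its sample mean (5.37) and t statistic depend on the random seed and do not exactly reproduce the stated summary figures. A sketch that works from the summary statistics directly is given below. It assumes, as the simulation does, that the 5 kg standard deviation refers to the per-client differences, which the question does not state explicitly.

```r
n <- 10         # number of clients
mean_diff <- 5  # 105 kg - 100 kg
sd_diff <- 5    # ASSUMPTION: standard deviation of the paired differences

# One-sample t statistic for H0: mean difference = 0
t_stat <- mean_diff / (sd_diff / sqrt(n))

# One-sided p-value (H1: mean difference > 0) and the 5% critical value
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)
t_crit <- qt(0.95, df = n - 1)

t_stat
p_value
t_crit
```

Under this assumption the t statistic is about 3.16, which exceeds the one-sided 5% critical value of about 1.83 for 9 degrees of freedom, so the increase in strength is significant at the 5% level. This agrees with the conclusion of the simulated run.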
Chi-Square Test
10. A researcher investigates whether gender influences product preference. Product A is
preferred by 30 males and 25 females, and Product B is preferred by 20 males and 25
females. Does gender have a significant effect on product preference at the 5%
significance level?
Code:
data <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE)
rownames(data) <- c("Male", "Female")
colnames(data) <- c("Product_A", "Product_B")
print(data)
# Chi-Square Test for Independence
test_result <- chisq.test(data)
print(test_result)
Output:
> data <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE)
> rownames(data) <- c("Male", "Female")
> colnames(data) <- c("Product_A", "Product_B")
>
> print(data)
       Product_A Product_B
Male          30        20
Female        25        25
>
> # Chi-Square Test for Independence
> test_result <- chisq.test(data)
>
> print(test_result)
Pearson's Chi-squared test with Yates' continuity correction
data: data
X-squared = 0.64646, df = 1, p-value = 0.4214
Since the p-value (0.4214) is greater than the 0.05 significance level, we fail to reject
the null hypothesis. This means that, based on the data, there is not enough evidence
to conclude that gender has a significant effect on product preference.
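The statistic reported by chisq.test() can be traced back to the expected counts under independence, (row total x column total) / grand total. The sketch below recomputes it by hand, including the Yates continuity correction that R applies by default to 2x2 tables; the variable names are illustrative.

```r
observed <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE,
                   dimnames = list(c("Male", "Female"),
                                   c("Product_A", "Product_B")))

# Expected counts if gender and preference were independent
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic with Yates' correction (subtract 0.5 from each |O - E|)
yates_stat <- sum((abs(observed - expected) - 0.5)^2 / expected)
p_value <- pchisq(yates_stat, df = 1, lower.tail = FALSE)

expected
yates_stat
p_value
```

Passing correct = FALSE to chisq.test() gives the uncorrected Pearson statistic instead, if that version is required.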
Hypothesis Testing
11. A company claims that their batteries last on average 50 hours. A sample of 12 batteries
has the following lifespans: 48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48. Test at the 5%
significance level whether the mean lifespan is different from 50 hours.
Code:
battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)
# Perform one-sample t-test against the claimed mean of 50 hours
test_result <- t.test(battery_lifespans, mu = 50, alternative = "two.sided")
print(test_result)
Output:
> battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)
>
> # Perform one-sample t-test against the claimed mean of 50 hours
> test_result <- t.test(battery_lifespans, mu = 50, alternative = "two.sided")
> print(test_result)
One Sample t-test
data: battery_lifespans
t = -0.67088, df = 11, p-value = 0.5161
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
48.21636 50.95031
sample estimates:
mean of x
49.58333
Since the p-value (0.5161) is greater than 0.05, we fail to reject the null hypothesis. At
the 5% significance level, there is insufficient evidence to conclude that the mean
battery lifespan is different from 50 hours.
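The figures reported by t.test() can be reproduced step by step from the sample mean, the standard error, and the t distribution with n - 1 degrees of freedom. A short sketch with illustrative variable names:

```r
battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)
n <- length(battery_lifespans)

se <- sd(battery_lifespans) / sqrt(n)            # standard error of the mean
t_stat <- (mean(battery_lifespans) - 50) / se    # test statistic for H0: mean = 50
p_value <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided p-value

# 95% confidence interval for the mean
ci <- mean(battery_lifespans) + c(-1, 1) * qt(0.975, df = n - 1) * se

t_stat
p_value
ci
```

The interval contains 50, which is consistent with failing to reject the claimed mean.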
ANOVA (Analysis of Variance)
12. Three different fertilizers are tested on crops. The yields from each fertilizer are:
Fertilizer A: 22, 25, 20, 23, 24
Fertilizer B: 30, 28, 27, 29, 31
Fertilizer C: 18, 20, 19, 17, 22
Perform an ANOVA test to determine if at least one fertilizer has a significantly different
effect on yield.
Code:
Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)
# Combine into a data frame
data <- data.frame(Fertilizer, Yield)
print(data)
#one-way ANOVA
anova_result <- aov(Yield ~ Fertilizer, data = data)
summary(anova_result)
Output:
> Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
> Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)
>
> # Combine into a data frame
> data <- data.frame(Fertilizer, Yield)
> print(data)
   Fertilizer Yield
1           A    22
2           A    25
3           A    20
4           A    23
5           A    24
6           B    30
7           B    28
8           B    27
9           B    29
10          B    31
11          C    18
12          C    20
13          C    19
14          C    17
15          C    22
>
> #one-way ANOVA
> anova_result <- aov(Yield ~ Fertilizer, data = data)
> summary(anova_result)
            Df Sum Sq Mean Sq F value   Pr(>F)
Fertilizer   2  245.7   122.9   37.23 7.15e-06 ***
Residuals   12   39.6     3.3
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
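Since the p-value (7.15e-06) is far below 0.05, we reject the null hypothesis of equal
mean yields. At the 5% significance level, there is sufficient evidence to conclude that
at least one fertilizer produces a significantly different yield compared to the others.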
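The ANOVA result says only that the group means are not all equal. It does not say which fertilizers differ. One common follow-up, not requested in the question and shown here only as a sketch, is Tukey's honest significant difference test, using TukeyHSD() from base R.

```r
Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)

anova_result <- aov(Yield ~ Fertilizer)

# Pairwise comparisons of group means with a family-wise 95% confidence level
tukey <- TukeyHSD(anova_result, conf.level = 0.95)
print(tukey)
```

Each row of the output compares one pair of fertilizers; the "diff" column gives the difference in mean yield and "p adj" the adjusted p-value for that pair.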