Exp 7

The document presents a series of statistical experiments involving probability distributions, including calculations of mean, median, mode, variance, and standard deviation using R scripts. It also covers binomial, normal, and Poisson distributions, as well as hypothesis testing, t-tests, and ANOVA, providing code examples and outputs for each analysis. The results indicate various statistical conclusions, such as the significance of strength increases and fertilizer effects on crop yields.

Experiment-7 Probability Distributions


1. A dataset consists of the following values:
12, 19, 15, 22, 26, 21, 24, 22, 20, 23
a) Calculate the mean, median, and mode.
b) Write an R script to compute these values.
Code:
data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)

mean_value <- mean(data)

median_value <- median(data)

get_mode <- function(x) {
  uniq_x <- unique(x)
  uniq_x[which.max(tabulate(match(x, uniq_x)))]
}
mode_value <- get_mode(data)
mean_value
median_value
mode_value

Output:
> data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)
>
> mean_value <- mean(data)
>
> median_value <- median(data)
>
> get_mode <- function(x) {
+ uniq_x <- unique(x)
+ uniq_x[which.max(tabulate(match(x, uniq_x)))]
+ }
> mode_value <- get_mode(data)
>
> mean_value
[1] 20.4
> median_value
[1] 21.5
> mode_value
[1] 22
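
One limitation of get_mode is that which.max returns only the first maximum, so a dataset with tied modes would report just one of them. The sketch below uses a hypothetical helper, get_modes (not part of the original script), that returns every value sharing the highest frequency. It is one possible approach, not the only one.

```r
# Hypothetical helper: return every value tied for the highest frequency
get_modes <- function(x) {
  counts <- table(x)                                  # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)])    # keep all values at the maximum
}

data <- c(12, 19, 15, 22, 26, 21, 24, 22, 20, 23)

modes_unimodal <- get_modes(data)          # 22 is the only repeated value
modes_bimodal  <- get_modes(c(data, 20))   # adding a second 20 creates a tie with 22

print(modes_unimodal)
print(modes_bimodal)
```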

2. For the dataset (10, 12, 14, 16, 18, 20, 22, 24, 26, 28), calculate:
a) Variance
b) Standard Deviation
c) Write an R script to compute these values.
Code:
values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
variance <- var(values)
standard_deviation <- sd(values)
variance
standard_deviation

Saad Asif | Roll Number: 58 | SY-Mech | Batch A3 | Experiment 7



Output:
> values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
> variance <- var(values)
> standard_deviation <- sd(values)
> variance
[1] 36.66667
> standard_deviation
[1] 6.055301
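
Note that var() and sd() in R compute the sample statistics, dividing by n - 1 rather than n. As a cross-check, the sketch below rebuilds the variance from the sum of squared deviations; the variable names (ss, sample_var, pop_var) are chosen here for illustration.

```r
values <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
n <- length(values)

# Sum of squared deviations from the mean (mean is 19)
ss <- sum((values - mean(values))^2)

sample_var <- ss / (n - 1)   # what var() returns
pop_var    <- ss / n         # population variance, for comparison only
sample_sd  <- sqrt(sample_var)

print(c(ss = ss, sample_var = sample_var, pop_var = pop_var, sample_sd = sample_sd))
```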

Binomial Distribution

3. A factory produces screws, and 5% of them are defective. If a random sample of 15
screws is taken, what is the probability that exactly 2 screws are defective? Write an R
script to calculate this probability.
Code:
dbinom(2, size=15, prob=0.05)

Output:
> dbinom(2, size=15, prob=0.05)
[1] 0.1347523
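
For reference, dbinom evaluates the binomial probability mass function P(X = k) = C(n, k) p^k (1 - p)^(n - k). A minimal sketch that recomputes the same value by hand (the name manual_prob is illustrative):

```r
n <- 15     # sample size
k <- 2      # number of defective screws
p <- 0.05   # probability that a single screw is defective

# Binomial pmf written out explicitly
manual_prob <- choose(n, k) * p^k * (1 - p)^(n - k)

print(manual_prob)
print(dbinom(k, size = n, prob = p))
```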

4. A call center receives 5% of its calls as complaints. If the center receives 50 calls, what
is the probability that:
a) Exactly 3 calls are complaints?
b) At most 2 calls are complaints?
Code:
#a
dbinom(3, size=50, prob=0.05)
#b
pbinom(2, size=50, prob=0.05)

Output:
> #a
> dbinom(3, size=50, prob=0.05)
[1] 0.2198748
> #b
> pbinom(2, size=50, prob=0.05)
[1] 0.5405331
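
pbinom(2, ...) is the cumulative probability P(X <= 2), i.e. the sum of dbinom over 0, 1 and 2. The sketch below verifies that and, as an illustrative extension not asked for in the exercise, computes the complementary event of more than 2 complaints with lower.tail = FALSE.

```r
n <- 50
p <- 0.05

# P(X <= 2) as an explicit sum of point probabilities
at_most_2 <- sum(dbinom(0:2, size = n, prob = p))

# P(X > 2), the complement, taken directly from the upper tail
more_than_2 <- pbinom(2, size = n, prob = p, lower.tail = FALSE)

print(at_most_2)
print(more_than_2)
```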


Normal Distribution

5. A standardized test has scores that follow a normal distribution with a mean of 80 and a
standard deviation of 10.
a) What is the probability that a randomly selected student scores above 90?
b) Write an R script to compute this probability.
Code:
1 - pnorm(90, mean = 80, sd = 10)

Output:
> 1 - pnorm(90, mean = 80, sd = 10)
[1] 0.1586553
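
The same probability can be obtained by standardizing: a score of 90 lies z = (90 - 80) / 10 = 1 standard deviation above the mean. The sketch below uses the standard normal with lower.tail = FALSE, which avoids the explicit 1 - pnorm(...) subtraction. Variable names are illustrative.

```r
mu    <- 80
sigma <- 10
score <- 90

z <- (score - mu) / sigma               # standardized score
p_above <- pnorm(z, lower.tail = FALSE) # upper-tail area of the standard normal

print(z)
print(p_above)
```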

6. IQ scores follow a normal distribution with a mean of 100 and a standard deviation of
15.
a) What is the probability that a randomly chosen person has an IQ above 120?
b) What is the IQ score at the 90th percentile?
Code:
# Probability of IQ above 120
1 - pnorm(120, mean = 100, sd = 15)
# IQ score at the 90th percentile
qnorm(0.90, mean = 100, sd = 15)

Output:
> # Probability of IQ above 120
> 1 - pnorm(120, mean = 100, sd = 15)
[1] 0.09121122
> # IQ score at the 90th percentile
> qnorm(0.90, mean = 100, sd = 15)
[1] 119.2233
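
Both answers can also be written in terms of the standard normal distribution: the percentile is mean + z * sd, where z = qnorm(0.90). A short sketch under that formulation (names such as iq_90 are illustrative):

```r
mu    <- 100
sigma <- 15

# a) Upper-tail probability for IQ > 120 via the z-score
z_120   <- (120 - mu) / sigma
p_above <- pnorm(z_120, lower.tail = FALSE)

# b) 90th percentile: shift and scale the standard normal quantile
iq_90 <- mu + sigma * qnorm(0.90)

print(p_above)
print(iq_90)
```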


Poisson Distribution
7. A hospital receives an average of 6 emergency cases per hour.
a) What is the probability that exactly 4 emergency cases occur in an hour?
b) What is the probability that more than 7 emergency cases occur in an hour?
Code:
# Average number of emergency cases per hour
lambda <- 6
# Probability of exactly 4 emergency cases
prob_exact_4 <- dpois(4, lambda)
prob_exact_4
# Probability of more than 7 emergency cases
prob_more_than_7 <- 1 - ppois(7, lambda)
prob_more_than_7

Output:
> # Average number of emergency cases per hour
> lambda <- 6
> # Probability of exactly 4 emergency cases
> prob_exact_4 <- dpois(4, lambda)
> prob_exact_4
[1] 0.1338526
> # Probability of more than 7 emergency cases
> prob_more_than_7 <- 1 - ppois(7, lambda)
> prob_more_than_7
[1] 0.2560202
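
dpois evaluates the Poisson formula P(X = k) = e^(-lambda) lambda^k / k!. The sketch below recomputes part (a) by hand and obtains part (b) from the upper tail directly; the names manual_exact_4 and upper_tail are illustrative.

```r
lambda <- 6

# a) Poisson pmf written out explicitly for k = 4
manual_exact_4 <- exp(-lambda) * lambda^4 / factorial(4)

# b) P(X > 7) taken from the upper tail instead of 1 - ppois(...)
upper_tail <- ppois(7, lambda, lower.tail = FALSE)

print(manual_exact_4)
print(upper_tail)
```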

Central Limit Theorem


8. A bakery sells pastries, and individual sales are right-skewed with a mean of $5.50 and a
standard deviation of $2.00. If we take a random sample of 40 customers, what is the
probability that the sample mean spending is greater than $6.00?
Code:
std_dev <- 2.00 # Population standard deviation
n <- 40 # Sample size
# Compute standard error
SE <- std_dev / sqrt(n)
# Compute probability
1 - pnorm(6, mean = 5.5, sd = SE)

Output:
> std_dev <- 2.00 # Population standard deviation
> n <- 40 # Sample size
> # Compute standard error
> SE <- std_dev / sqrt(n)
> # Compute probability
> 1 - pnorm(6, mean = 5.5, sd = SE)
[1] 0.05692315
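
The calculation above relies on the Central Limit Theorem: with n = 40 the sample mean is treated as approximately normal even though individual sales are right-skewed. To get a feel for how good that approximation is, the sketch below assumes, purely for illustration, that individual spending follows a gamma distribution with the stated mean and standard deviation. The exercise does not specify the distribution, so this is an assumption. Under it, the sum of 40 independent purchases is again gamma, which gives an exact tail probability to compare with the normal approximation.

```r
mu    <- 5.50   # population mean
sigma <- 2.00   # population standard deviation
n     <- 40     # sample size

# ASSUMPTION: gamma-distributed spending matched to the given mean and sd
shape <- (mu / sigma)^2
rate  <- mu / sigma^2

# Sample mean > 6 is equivalent to sample total > n * 6; the sum of n
# independent gamma variables with a common rate is gamma with shape n * shape
p_exact <- pgamma(n * 6, shape = n * shape, rate = rate, lower.tail = FALSE)

# Normal (CLT) approximation used in the exercise
p_clt <- pnorm(6, mean = mu, sd = sigma / sqrt(n), lower.tail = FALSE)

print(c(exact_under_gamma = p_exact, clt_approximation = p_clt))
```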


t-Test
9. A fitness trainer claims that his program increases strength significantly. Before training,
10 clients had an average bench press of 100 kg. After 6 weeks of training, their average
bench press increased to 105 kg with a standard deviation of 5 kg. Is the strength
increase significant at 5% level?
Code:
n <- 10
mean_diff <- 5
sd_diff <- 5
set.seed(123)

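# Raw per-client measurements are not given, so the differences are simulated;
# the test result therefore depends on the random seed
# Simulate differences for 10 clients
differences <- rnorm(n, mean = mean_diff, sd = sd_diff)

# Perform a one-sample t-test on the differences (testing if mean > 0)
t_test_result <- t.test(differences, mu = 0, alternative = "greater")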

print(t_test_result)

Output:
> n <- 10
> mean_diff <- 5
> sd_diff <- 5
> set.seed(123)
>
> # Simulate differences for 10 clients
> differences <- rnorm(n, mean = mean_diff, sd = sd_diff)
>
> # Perform a one-sample t-test on the differences (testing if mean > 0)
> t_test_result <- t.test(differences, mu = 0, alternative = "greater")
>
> print(t_test_result)

One Sample t-test

data: differences
t = 3.5629, df = 9, p-value = 0.003046
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
2.608675 Inf
sample estimates:
mean of x
5.373128
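
Because the differences above are simulated, the reported t and p-value reflect the random draw (sample mean 5.373) rather than the stated summary figures. A sketch of the test computed directly from the summary statistics follows. It assumes, as the script above does, that the 5 kg standard deviation applies to the per-client differences; the problem statement does not say so explicitly.

```r
n         <- 10   # number of clients
mean_diff <- 5    # mean increase: 105 kg - 100 kg
sd_diff   <- 5    # ASSUMPTION: standard deviation of the differences

# One-sample t statistic for H0: mean difference = 0 vs H1: mean difference > 0
t_stat  <- mean_diff / (sd_diff / sqrt(n))
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# One-sided critical value at the 5% level
t_crit <- qt(0.95, df = n - 1)

print(c(t = t_stat, critical = t_crit, p = p_value))
```

Here t is sqrt(10), about 3.162, which exceeds the critical value of about 1.833, so under these assumptions the null hypothesis is rejected and the strength increase is significant at the 5% level.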


Chi-Square Test
10. A researcher investigates whether gender influences product preference. Product A
is preferred by 30 males and 25 females, and Product B is preferred by 20 males and
25 females. Does gender have a significant effect on product preference at the 5%
significance level?
Code:
data <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE)
rownames(data) <- c("Male", "Female")
colnames(data) <- c("Product_A", "Product_B")

print(data)

# Chi-Square Test for Independence
test_result <- chisq.test(data)

print(test_result)

Output:
> data <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE)
> rownames(data) <- c("Male", "Female")
> colnames(data) <- c("Product_A", "Product_B")
>
> print(data)
       Product_A Product_B
Male          30        20
Female        25        25
>
> # Chi-Square Test for Independence
> test_result <- chisq.test(data)
>
> print(test_result)

Pearson's Chi-squared test with Yates' continuity correction

data: data
X-squared = 0.64646, df = 1, p-value = 0.4214

Since the p-value (0.4214) is greater than the 0.05 significance level, we fail to reject
the null hypothesis. This means that, based on the data, there is not enough evidence
to conclude that gender has a significant effect on product preference.
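
For a 2 x 2 table, chisq.test applies Yates' continuity correction by default, which is why the output header mentions it. The sketch below rebuilds the statistic by hand from the expected counts (row total x column total / grand total). The names observed, expected, stat_yates and stat_plain are illustrative.

```r
observed <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE,
                   dimnames = list(c("Male", "Female"),
                                   c("Product_A", "Product_B")))

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Statistic with Yates' correction (subtract 0.5 from each |O - E|)
stat_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Uncorrected Pearson statistic, for comparison
stat_plain <- sum((observed - expected)^2 / expected)

p_yates <- pchisq(stat_yates, df = 1, lower.tail = FALSE)

print(expected)
print(c(yates = stat_yates, uncorrected = stat_plain, p_yates = p_yates))
```

Passing correct = FALSE to chisq.test would report the uncorrected statistic instead; either way the conclusion at the 5% level is the same for these data.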


Hypothesis Testing
11. A company claims that their batteries last on average 50 hours. A sample of 12 batteries
has the following lifespans: 48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48. Test at the 5%
significance level whether the mean lifespan is different from 50 hours.
Code:
battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)

# Perform one-sample t-test against the claimed mean of 50 hours
test_result <- t.test(battery_lifespans, mu = 50, alternative = "two.sided")
print(test_result)

Output:
> battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)
>
> # Perform one-sample t-test against the claimed mean of 50 hours
> test_result <- t.test(battery_lifespans, mu = 50, alternative = "two.sided")
> print(test_result)

One Sample t-test

data: battery_lifespans
t = -0.67088, df = 11, p-value = 0.5161
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
48.21636 50.95031
sample estimates:
mean of x
49.58333

At the 5% significance level, there is insufficient evidence to conclude that the mean
battery lifespan is different from 50 hours.
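
The quantities reported by t.test can be reproduced from their definitions: t = (sample mean - 50) / (s / sqrt(n)), a two-sided p-value from the t distribution with n - 1 degrees of freedom, and a confidence interval of mean +/- t_crit * SE. A minimal sketch (variable names are illustrative):

```r
battery_lifespans <- c(48, 52, 49, 50, 47, 53, 51, 46, 50, 49, 52, 48)
n  <- length(battery_lifespans)
mu0 <- 50   # claimed mean lifespan

se     <- sd(battery_lifespans) / sqrt(n)        # standard error of the mean
t_stat <- (mean(battery_lifespans) - mu0) / se   # test statistic

# Two-sided p-value: double the tail area beyond |t|
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# 95% confidence interval for the mean
ci <- mean(battery_lifespans) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(c(t = t_stat, p = p_value))
print(ci)
```

The interval contains 50, which is consistent with failing to reject the null hypothesis.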


ANOVA (Analysis of Variance)


12. Three different fertilizers are tested on crops. The yields from each fertilizer are:
Fertilizer A: 22, 25, 20, 23, 24
Fertilizer B: 30, 28, 27, 29, 31
Fertilizer C: 18, 20, 19, 17, 22
Perform an ANOVA test to determine if at least one fertilizer has a significantly different
effect on yield.
Code:
Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)

# Combine into a data frame
data <- data.frame(Fertilizer, Yield)
print(data)

#one-way ANOVA
anova_result <- aov(Yield ~ Fertilizer, data = data)
summary(anova_result)

Output:
> Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
> Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)
>
> # Combine into a data frame
> data <- data.frame(Fertilizer, Yield)
> print(data)
   Fertilizer Yield
1           A    22
2           A    25
3           A    20
4           A    23
5           A    24
6           B    30
7           B    28
8           B    27
9           B    29
10          B    31
11          C    18
12          C    20
13          C    19
14          C    17
15          C    22
>
> #one-way ANOVA
> anova_result <- aov(Yield ~ Fertilizer, data = data)
> summary(anova_result)
            Df Sum Sq Mean Sq F value   Pr(>F)
Fertilizer   2  245.7   122.9   37.23 7.15e-06 ***
Residuals   12   39.6     3.3
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

At the 5% significance level, there is sufficient evidence to conclude that at least one
fertilizer produces a significantly different yield compared to the others.
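
The ANOVA table can be reconstructed from the between-group and within-group sums of squares, which shows where the F value comes from. The sketch below does this by hand; names such as ss_between and ss_within are illustrative.

```r
Fertilizer <- factor(rep(c("A", "B", "C"), each = 5))
Yield <- c(22, 25, 20, 23, 24, 30, 28, 27, 29, 31, 18, 20, 19, 17, 22)

group_means <- tapply(Yield, Fertilizer, mean)     # A: 22.8, B: 29.0, C: 19.2
group_sizes <- tapply(Yield, Fertilizer, length)   # 5 observations per group
grand_mean  <- mean(Yield)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((Yield - group_means[as.character(Fertilizer)])^2)

df_between <- nlevels(Fertilizer) - 1                # 2
df_within  <- length(Yield) - nlevels(Fertilizer)    # 12

f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(c(ss_between = ss_between, ss_within = ss_within, F = f_stat, p = p_value))
```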
