Task 1

1. The document presents 10 exercises in descriptive statistics related to data distributions, calculation of measures of central tendency and dispersion, construction of graphs and frequency tables. 2. The exercises include calculations of mean, median, variance, standard deviation, quartiles, and percentiles for data sets on salaries, gasoline consumption, relative humidity, hours of exercise, and subjects passed. 3. They also ask to construct histograms.

NATIONAL POLYTECHNIC SCHOOL

PROBABILITY AND STATISTICS

Chair of Probability and Statistics November 2023

PROPOSED EXERCISES DESCRIPTIVE STATISTICS

1. The head of personnel of a company wants to know whether the distribution of salaries in
the company is as equitable as the manager claims, but only knows that it is a symmetric
distribution with an arithmetic mean of 750 dollars, that the maximum salary is 1500 dollars, and that,
when the salaries of the 100 workers are arranged into five classes of equal width, only 10 of them
earn less than 300 dollars and twenty earn between 900 dollars and 1200 dollars. Is the manager right?
Justify your answer; to do so, construct the frequency distribution table and explain.

2. Six vehicles are selected from those that are permitted to park, and the following data
are recorded:

Vehicle  Type        Brand            Carpool?  One-way trip distance (miles)  Vehicle age (years)
1        Car         Honda            No        23.6                           6
2        Car         Toyota           No        17.2                           3
3        Truck       Toyota           No        10.1                           4
4        Van         Dodge            Yes       31.7                           2
5        Motorcycle  Harley-Davidson  No        25.5                           1
6        Car         Chevrolet        No        5.4                            9

a) What are the experimental units or individuals?


b) What are the variables being measured? What types of variables are they?
c) Are these data univariate, bivariate, or multivariate?

3. Suppose that categories have been assigned to the jars according to the number of grams of X
that they contain.

Category A: Those jars that are within the top 25% in terms of the
amount of X that they contain.
Category C: Those jars that are within the lowest 35% in terms of the
amount of X that they contain.


Category B: The rest of the jars.

Obtain the values, in grams, that identify the extremes of the three categories in the sample
presented.

4. The following measurements of the drying time (in hours) of a certain brand of
enamel paint are recorded.

3.4 2.5 4.8 2.9 3.6

2.8 3.3 5.6 3.7 2.8

4.4 4.0 5.2 3.0 4.8

Assume that the measurements constitute a simple random sample.

a) What is the size of the previous sample?


b) Calculate the sample mean for these data.
c) Calculate the median of the sample.
d) Plot the data using a dot plot.
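
As a way of checking parts a) to c) by machine, here is a minimal sketch that uses only the Python standard library. The variable name `drying_times` is an illustrative choice, and the values are copied row by row from the exercise.

```python
import statistics

# Drying times (hours) from exercise 4, copied row by row.
drying_times = [
    3.4, 2.5, 4.8, 2.9, 3.6,
    2.8, 3.3, 5.6, 3.7, 2.8,
    4.4, 4.0, 5.2, 3.0, 4.8,
]

n = len(drying_times)                      # sample size
mean = statistics.mean(drying_times)       # arithmetic mean
median = statistics.median(drying_times)   # middle value of the sorted data

print(f"n = {n}, mean = {mean:.3f}, median = {median}")
```

With an odd sample size the median is simply the middle observation of the sorted data, so no averaging of two central values is needed here.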

5. The following data have been obtained regarding the gallons of gasoline used daily
by a transportation company over 90 days.

49 56 53 41 49 59 56 59 57 43 47 53 47
56 49 47 55 41 44 55 49 59 56 49 57 41
47 48 41 47 43 56 44 53 47 43 47 43 59
53 43 47 49 42 47 49 42 57 55 44 42 49
59 56 48 59 59 57 42 41 47 48 44 56 53
53 47 56 56 48 41 56 55 56 42 59 57

Determine:

a) The number of days on which less than 48 gallons of gasoline were used.
Answer(s): 38 days.
b) The number of days on which a minimum of 45 gallons and less than 56 gallons were used.
Answer(s): 39 days.
c) The total number of gallons of gasoline used in the 32 days of highest consumption.
Answer(s): 1808 gallons.
d) The total number of gallons of gasoline used in the 25 days of lowest consumption.
Answer(s): 1059 gallons.
e) The range and the third quartile of the sample.
Answer(s): Range = 18; Q3 = 56.
f) The mean, the median, and the standard deviation.
Answer(s): Mean = 49.6; Q2 = 49; s = 5.93.


6. Historical monthly records of relative humidity (in percent) have been taken from a station
over a period of 5 years:

81 75 76 77 78 76 80 77 75 77 83 82
79 82 80 81 79 81 79 82 79 81 79 79
77 81 79 78 75 83 77 81 76 83 78 80
76 80 78 83 79 79 75 82 83 76 83 80
81 80 80 77 77 76 80 77 80 80 75 78

a) Calculate the number of months in which the relative humidity was greater than 76 and no more than
81 percent.
b) Calculate the relative humidity percentage of the 37th month and of the 55th month when the
months are ordered from lowest to highest relative humidity.
c) Calculate the mean monthly relative humidity and the standard deviation over the five years
considered.
d) Create a histogram of the monthly relative humidity observed over the 5 years. Use classes
of width equal to 1.

Answer(s): a) 39 months; b) 83 %; c) 79.0167, 2.4040
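
The counts and summary measures for exercise 6 can be checked with a short sketch like the following. It assumes the sample standard deviation (divisor n - 1), which is the convention that reproduces the stated value 2.4040, and it reads "the k-th month of lowest relative humidity" as the k-th value of the sorted data. The variable names are illustrative.

```python
import statistics

# Monthly relative humidity (%), 5 years x 12 months, copied row by row.
humidity = [
    81, 75, 76, 77, 78, 76, 80, 77, 75, 77, 83, 82,
    79, 82, 80, 81, 79, 81, 79, 82, 79, 81, 79, 79,
    77, 81, 79, 78, 75, 83, 77, 81, 76, 83, 78, 80,
    76, 80, 78, 83, 79, 79, 75, 82, 83, 76, 83, 80,
    81, 80, 80, 77, 77, 76, 80, 77, 80, 80, 75, 78,
]

# a) months with humidity greater than 76 and no more than 81
months_in_range = sum(1 for h in humidity if 76 < h <= 81)

# b) order statistics: k-th smallest value is at index k - 1
ordered = sorted(humidity)
h37, h55 = ordered[36], ordered[54]

# c) mean and sample standard deviation (assumption: divisor n - 1)
mean = statistics.mean(humidity)
s = statistics.stdev(humidity)

print(months_in_range, h37, h55, round(mean, 4), round(s, 4))
```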

7. A civil engineer monitors water quality by measuring the amount of suspended solids
in a river water sample. During 11 weekends, he observed the following suspended solids
(parts per million).

14 12 21 28 30 63 29 63 55 19 20

a) Draw a dot diagram.


b) Find the median and the mean. Locate both in the dot plot.
c) Determine the variance and the standard deviation.
d) Build a box plot.
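
For parts b) and c), a minimal sketch with the standard library is shown below. It assumes the data are treated as a sample, so `statistics.variance` and `statistics.stdev` (divisor n - 1) are used; for a population, `pvariance` and `pstdev` would be the counterparts.

```python
import statistics

# Suspended solids (parts per million) from exercise 7.
solids = [14, 12, 21, 28, 30, 63, 29, 63, 55, 19, 20]

mean = statistics.mean(solids)
median = statistics.median(solids)
variance = statistics.variance(solids)   # sample variance, divisor n - 1
std_dev = statistics.stdev(solids)       # square root of the sample variance

print(f"mean = {mean:.2f}, median = {median}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```

The two values of 63 and the 55 pull the mean above the median, which is worth noting when both are located on the dot plot.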

8. According to the publication Chemical Engineering, an important property of a fiber is its
water absorbency. A random sample of 20 pieces of cotton fiber is taken, the absorbency
of each piece is measured, and the values are the following:

18.71 21.41 20.72 21.81 19.29 22.43 20.17


23.71 19.44 20.50 18.92 20.33 23.00 22.85
19.25 21.77 22.11 19.77 18.04 21.12

a) Calculate the mean and the median of the sample for the values of the previous sample.
b) Calculate the 10% trimmed mean.
c) Create a scatter plot with the absorption data.
d) Find the quintiles of the sample.

e) What percentage of the data is between the second and third quintile?
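
The 10% trimmed mean in part b) can be sketched as follows. The helper `trimmed_mean` is a hypothetical name, and it assumes the usual convention of discarding the lowest 10% and the highest 10% of the sorted observations (2 values at each end for n = 20) before averaging.

```python
import statistics

# Absorbency values from exercise 8, copied row by row.
absorbency = [
    18.71, 21.41, 20.72, 21.81, 19.29, 22.43, 20.17,
    23.71, 19.44, 20.50, 18.92, 20.33, 23.00, 22.85,
    19.25, 21.77, 22.11, 19.77, 18.04, 21.12,
]


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the sorted data at each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # observations dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


mean = statistics.mean(absorbency)
median = statistics.median(absorbency)
tmean = trimmed_mean(absorbency, 0.10)

print(f"mean = {mean:.3f}, median = {median:.3f}, trimmed mean = {tmean:.3f}")
```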


9. The following data represent the number of hours of exercise per week done by 50
people from a certain condominium.

12 12 13 12 11 10 5 5 7 8
9 7 8 1 12 7 7 9 8 12
13 11 20 22 3 7 8 22 22 12
13 15 12 17 12 2 2 15 17 12
13 15 19 20 12 15 15 17 18 7

a) Find the mean, median, mode, variance, and standard deviation.


b) What are the quartiles of the population?
c) What is the proportion of data that is above the average?
d) What proportion of the data is within 1 standard deviation of the mean?
e) Create the box diagram of this population.

Answer(s): a) x̄ = 11.66, Med = 12, Mod = 12; b) Q1 = 8, Q2 = 12, Q3 = 15; c) 0.58; d) 68 %
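
Parts a), c) and d) can be checked with a sketch along these lines. Because the exercise calls the 50 people a population, the population standard deviation (`statistics.pstdev`) is assumed. "Within 1 standard deviation" is read as |x - mean| <= sigma.

```python
import statistics

# Weekly hours of exercise for the 50 people in exercise 9.
hours = [
    12, 12, 13, 12, 11, 10, 5, 5, 7, 8,
    9, 7, 8, 1, 12, 7, 7, 9, 8, 12,
    13, 11, 20, 22, 3, 7, 8, 22, 22, 12,
    13, 15, 12, 17, 12, 2, 2, 15, 17, 12,
    13, 15, 19, 20, 12, 15, 15, 17, 18, 7,
]

mean = statistics.mean(hours)
median = statistics.median(hours)
mode = statistics.mode(hours)
sigma = statistics.pstdev(hours)   # population standard deviation (assumption)

# c) proportion of observations strictly above the mean
above_mean = sum(1 for h in hours if h > mean) / len(hours)

# d) proportion of observations within one standard deviation of the mean
within_one_sd = sum(1 for h in hours if abs(h - mean) <= sigma) / len(hours)

print(mean, median, mode, round(sigma, 3), above_mean, within_one_sd)
```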

10. From a faculty with 786 students, a representative sample of 80 has been taken regarding
the number of subjects passed up to the date when the sample was obtained, and the data
have been organized in the attached frequency table.

Number of subjects 18 19 20 21 22 23 24 25 26 27 28
Number of students 2 4 5 8 13 12 9 11 7 6 3

Calculate:

a) The total number of subjects passed by the 15 students in the sample who
have passed the fewest subjects.
b) The total number of subjects passed by the 14 students in the sample who
have passed the most subjects.
c) The number of students in the sample and in the faculty who have passed at least 20
subjects and less than 26 subjects.
d) The number of students, in the sample and in the faculty, who have passed more than 25
subjects.
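
One way to work from the frequency table is to expand it back into individual observations, as in the sketch below. The faculty figures are obtained by scaling the sample proportion to the 786 students, which assumes the sample is representative, as the statement says. All names are illustrative.

```python
# Frequency table from exercise 10.
subjects = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
students = [2, 4, 5, 8, 13, 12, 9, 11, 7, 6, 3]

sample_size = sum(students)
faculty_size = 786

# Expand the table into one value per student, already in ascending order.
expanded = [s for s, f in zip(subjects, students) for _ in range(f)]

# a) and b) totals for the students with the fewest and the most subjects
total_lowest_15 = sum(expanded[:15])
total_highest_14 = sum(expanded[-14:])

# c) at least 20 and fewer than 26 subjects; d) more than 25 subjects
sample_c = sum(f for s, f in zip(subjects, students) if 20 <= s < 26)
sample_d = sum(f for s, f in zip(subjects, students) if s > 25)

# Scale the sample proportions to the whole faculty (assumes representativeness).
faculty_c = round(sample_c / sample_size * faculty_size)
faculty_d = round(sample_d / sample_size * faculty_size)

print(total_lowest_15, total_highest_14, sample_c, faculty_c, sample_d, faculty_d)
```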

11. The following scores represent the grade in a final exam for a course of
Probability and Statistics:

23 60 79 32 57 74 52 70 82
36 80 77 81 95 41 65 92 85
55 76 52 10 64 75 78 25 80
98 81 67 41 71 83 54 64 72
88 62 74 43 60 78 89 76 84
48 84 90 15 79 34 67 17 82
69 74 63 80 85 61


a) Determine a frequency distribution for the students' scores.


b) Develop a histogram of relative frequencies.
c) Calculate the mean, the median, and the standard deviation of the sample.

Answer(s): Mean ≈ 65.48; Median = 71.5; Standard deviation ≈ 21.13

12. When measuring the height (in cm) that a group of schoolchildren can jump, before and after
certain sports training, the following values were obtained. Do you think that the
training was effective?

Height jumped in cm.


Student Ana Bea Carol Diana Elena Fanny Gia Hilda Inés Juana
Before the training 115 112 107 119 115 138 126 105 104 115
After the training 128 115 106 128 122 145 132 109 102 117
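
Since each student is measured twice, one reasonable descriptive approach (an assumption, not the only possible one) is to look at the per-student gain, after minus before. A minimal sketch:

```python
import statistics

# Heights jumped (cm) from exercise 12, in the order Ana ... Juana.
before = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
after = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

# Paired differences: positive values mean the student improved.
gains = [a - b for a, b in zip(after, before)]

mean_gain = statistics.mean(gains)
median_gain = statistics.median(gains)
improved = sum(1 for g in gains if g > 0)

print(f"gains = {gains}")
print(f"mean gain = {mean_gain} cm, median gain = {median_gain} cm")
print(f"{improved} of {len(gains)} students improved")
```

Working with the differences removes the large variation between students, so the effect of the training is easier to see than by comparing the two group means alone.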

13. Knowing that the absolute frequency of students who have 3 siblings is 30 and that the cumulative
absolute frequency of students who have up to 3 siblings is 80, how many students have 2 siblings
or fewer?

14. The following table represents the possible scores, between 0 and 100, obtained by a group
of workers in an aptitude test. It is also known that fr4 − fr5 = 0.12 and that the width of
each interval is 16.

Score  Class midpoint (xi)  Absolute frequency (fi)  Relative frequency (fri)  Cumulative absolute frequency (Fi)  Cumulative relative frequency (Fri)
- 4
- 0.10
- 0.36
- 58.5
- 10
- 0.12

a) Complete the frequency table


b) What percentage of workers are below average?
c) Draw the box diagram.

15. The following data show the interest rates for loans at different financial institutions
in two cities, A and B:

Interest rates of financial institutions
City A  7.1%  7.3%  7.0%  6.9%  6.6%  6.9%  6.5%  7.3%  6.85%
City B  7.1%  7.3%  6.3%  6.7%  6.8%  6.85%  7.5%

a) Calculate the mean, median, and mode for the interest rates of each of the cities


b) Based on the results of the previous part, determine whether there is any type of skewness in
both distributions.
c) Which of the cities appears to have the most stable interest rates? (Justify your answer.)

16. The following graph represents the distribution of the money spent on insurance over the last
month by the 200 workers of a health insurance company. Determine:

[Graph: relative frequency (vertical axis, 0% to 22% in steps of 2%) against money spent (horizontal axis, class marks from 0 to 280 in steps of 20)]

a) The frequency table that shows the data represented in the graph.
b) The average amount spent, the most frequent amount, and the maximum amount spent by
a worker among the 50% of workers who spent the least on insurance.
c) The minimum amount spent by the 20% of employees with the highest spending. What
percentage of the company's total does this group correspond to?

d) If in the following month the insurance company decided to increase the cost of insurance
by 5% for all workers, and also to add a bonus of 50 dollars to all
policies, calculate the new average spending on insurance, the most frequent amount, and the maximum amount
that will be spent by the 50% of workers who will spend the least on insurance.
e) From the insurance spending of another company in the same sector, it is known that the average
spending per worker is $120 with a standard deviation of $2.2.
Which company has the more homogeneous insurance spending among its employees?
Justify your answer.
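
Parts d) and e) rest on two standard relations, sketched here under the assumption that the new spending is the linear transformation y = 1.05x + 50 of the old spending x (the symbols y, Mo, Me and CV are notation introduced for this note). Measures of position follow the transformation, while the standard deviation is only rescaled:

$$y = 1.05\,x + 50 \;\Rightarrow\; \bar{y} = 1.05\,\bar{x} + 50,\quad Mo_y = 1.05\,Mo_x + 50,\quad Me_y = 1.05\,Me_x + 50,\quad s_y = 1.05\,s_x$$

To compare the homogeneity of two groups with different means, the coefficient of variation is the usual choice. For the second company it follows directly from the figures given:

$$CV = \frac{s}{\bar{x}},\qquad CV_{\text{other}} = \frac{2.2}{120} \approx 0.018$$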

17. From a production of 8000 packages, a sample was obtained whose frequency distribution
by class intervals, considering the weight of the packages, is given by:


i   Interval (weight in grams)   fi
1   4.5 - 11.5                   17
2   11.5 - 18.5                  23
3   18.5 - 25.5                  18
4   25.5 - 32.5                  26
5   32.5 - 39.5                  19
6   39.5 - 46.5                  14
7   46.5 - 53.5                  23
8   53.5 - 60.5                  27
9   60.5 - 67.5                  21
10  67.5 - 74.5                  19

a) The production cost of each unit is 1.20 dollars. Units that weigh up to 29
grams sell for 1.40 dollars. Units that weigh more than 29 and up to 50 grams
sell for 1.70 dollars. Units that weigh more than 50 grams sell for 1.90 dollars.
Calculate the profit that would be expected if the sample is representative of the population
and all produced units are sold.
b) Calculate the maximum weight that can be statistically accepted for the units that
make up the lowest 32% of the sample.
c) Calculate the minimum weight that can be statistically accepted for the units that
make up the highest 26% of the sample.

18. Regarding the types of defects and their frequency in a production process of glass
containers, the following information has been obtained.

Type of defect  Frequency  Percentage (%)
Tension         72         9.30
Striped         236        30.49
Bubble          83         10.72
Fracture        176        22.74
Stain           117        15.12
Rupture         51         6.59
Resistance      18         2.33
Density         21         2.71
Total           774        100.00

a) What type of variable is studied in this case? Explain


b) Graphically represent the information using the most suitable diagrams for
this case. Interpret the graphs.

19. From the 9860 apples produced, a sample has been taken regarding their diameter in mm, from
which the distribution given in the table was obtained.


i Diameter (mm) Apples
1 26.5-35.5 18
2 35.5-44.5 8
3 44.5-53.5 15
4 53.5-62.5 14
5 62.5-71.5 25
6 71.5-80.5 21
7 80.5-89.5 19

a) Calculate the number of apples, in the sample and in the production, whose diameter is
expected to be no more than 0.8 times the mean of the sample.
b) Calculate the ninth decile.
c) Calculate the number of apples, in the sample and in the production, whose diameter is
expected to exceed 1.2 times the mean of the sample.

d) Calculate the median of the sample.

Answer(s): a) 34; 2760; b) 83.8; c) 36; d) 2919
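
For grouped data, deciles and the median are usually obtained by linear interpolation inside the class that contains the target cumulative frequency. The sketch below applies that idea to the apple diameters. The function name `grouped_percentile` is hypothetical, and the method assumes observations are spread evenly within each class.

```python
def grouped_percentile(classes, p):
    """Interpolated value below which a fraction p of grouped data lies.

    classes: list of (lower, upper, frequency) tuples in ascending order.
    Assumes observations are uniformly spread inside each class.
    """
    n = sum(f for _, _, f in classes)
    target = p * n                      # cumulative frequency to reach
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    return classes[-1][1]


# Diameter classes (mm) and frequencies from exercise 19.
apples = [
    (26.5, 35.5, 18), (35.5, 44.5, 8), (44.5, 53.5, 15), (53.5, 62.5, 14),
    (62.5, 71.5, 25), (71.5, 80.5, 21), (80.5, 89.5, 19),
]

ninth_decile = grouped_percentile(apples, 0.9)
median = grouped_percentile(apples, 0.5)

print(f"D9 = {ninth_decile:.1f} mm, median = {median:.1f} mm")
```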

20. To decide on the quality of a certain type of perfume, the amount of substance X has been measured.
From a sample of 200 jars, the following distribution was obtained:

i  Quantity (grams)  Jars
1  6.5-15.5          15
2  15.5-24.5         19
3  24.5-33.5         20
4  33.5-42.5         26
5  42.5-51.5         30
6  51.5-60.5         29
7  60.5-69.5         31
8  69.5-78.5         30

a) In a production of 4800 jars, with which of the following alternatives would more jars
be left unreprocessed?
1) Every jar containing an amount of substance X less than the
mean minus one standard deviation is reprocessed, as is every jar containing an amount greater
than the mean plus one standard deviation.
2) All jars containing up to 25 grams of substance X are reprocessed, as are all jars
that contain more than 65 grams of substance X.
b) Suppose that it has been decided to assign categories to the jars according to the number of grams
of X that they contain.

Category A: Those jars that are within the top 25% in terms of
the amount of X that they contain.


Category C: Those jars that are within the lowest 35% in terms of
the amount of X they contain.
Category B: The rest of the jars.
Obtain the values, in grams, that identify the extremes of the three categories.

21. For the income tax settlement in a small company, the annual incomes (in dollars)
of all employees were calculated. The frequency distribution table is as
follows:

Annual income  fi
2400 - 3000 3
3000 - 4200 20
4200 - 5400 35
5400 - 7250 25
7250 - 9000 15
9000 - 12000 2

a) Determine the mean, median, mode, and standard deviation of the annual income of the
employees of the company.

b) Determine the quartiles of the annual income of the employees.

Answer(s): a) 5491, 5125.75, 1682.96 b) Q1 = 4268.57, Q2 = 5125.75, Q3 = 6658
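
With unequal class widths, the grouped mean and standard deviation are still computed from the class midpoints weighted by frequency. The sketch below assumes the population form of the standard deviation (divisor n), since the table covers all employees. The names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for the annual incomes in exercise 21.
income_classes = [
    (2400, 3000, 3), (3000, 4200, 20), (4200, 5400, 35),
    (5400, 7250, 25), (7250, 9000, 15), (9000, 12000, 2),
]

n = sum(f for _, _, f in income_classes)

# Each class is represented by its midpoint.
midpoints = [((lo + hi) / 2, f) for lo, hi, f in income_classes]

mean = sum(x * f for x, f in midpoints) / n
variance = sum(f * (x - mean) ** 2 for x, f in midpoints) / n   # divisor n (assumption)
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```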

22. A person is driving a car on a highway at 70 km/h and notices that the number of
cars that pass her is equal to the number of cars that she passes. Is 70 km/h the mean,
the median, or the mode of the speeds of the cars on the highway? Why?

23. If the average annual compensation paid to the senior executives of three engineering firms is
$175,000, can one of them receive $550,000?

24. The following are the numbers of minutes that a person had to wait for the bus to go to
work on 15 working days:

10, 1, 13, 9, 5, 9, 2, 10, 3, 8, 6, 17, 2, 10, 15

a) Find the average.


b) Find the median.
c) Draw a box plot.
d) On what proportion of days did the person wait less than the mean waiting
time?

25. A material that is manufactured continuously, before being cut and wound into large rolls, must
have its thickness (gauge) monitored. A sample of 10 measurements on paper, in millimeters,
gave the following result:


32.2, 32.0, 30.4, 31.0, 31.2, 31.2, 30.3, 29.6, 30.5, 30.7

Find the mean and the quartiles for this sample.

26. The average annual salaries paid to top-level managers in three companies are $164,000,
$172,000, and $169,000. If the respective numbers of top-level managers in these companies
are 4, 15, and 11, find the average salary paid to those 30 executives.
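
Because the three companies have different numbers of managers, the overall average is a weighted mean rather than the plain average of the three company means. A minimal sketch, with illustrative variable names:

```python
# Company-level mean salaries (dollars) and number of managers, exercise 26.
mean_salaries = [164_000, 172_000, 169_000]
managers = [4, 15, 11]

# Weighted mean: each company mean counts once per manager.
total_paid = sum(m * k for m, k in zip(mean_salaries, managers))
weighted_mean = total_paid / sum(managers)

# Plain average of the three means, shown only for contrast.
unweighted_mean = sum(mean_salaries) / len(mean_salaries)

print(f"weighted mean = {weighted_mean:.2f}")
print(f"unweighted mean of the three means = {unweighted_mean:.2f}")
```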

27. A contract for the maintenance of the high-power locomotives of a national railway
was awarded to an important private company. After a year of experience
with the maintenance program, those in charge of the program believed they could make
significant improvements in the reliability of the locomotives. To document the current status,
they gathered data on the cost of materials to rebuild traction motors and
obtained the following results, in thousands of dollars:

1.41 1.70 1.03 0.99 1.68 1.09 1.68 1.94


1.53 2.25 1.60 3.07 1.78 0.67 1.76 1.17
1.54 0.99 0.99 1.17 1.54 1.68 1.62 0.67
0.67 1.78 2.12 1.52 1.01 0.69 1.63 2.23

a) Build a frequency distribution for the cost of reconstruction materials.


b) Calculate the sample mean.
c) Calculate the standard deviation of the sample.
d) Determine the maximum cost of reconstruction materials that could be observed in the 30%
of reconstructions that are cheapest.

28. The response and assistance times (in minutes) for 400 emergency calls
to 911 are recorded, given by:

Time [min]
Lower limit  Upper limit  Cases
5            10           15
11           20           25
21           30           35
31           40           40
41           60           65
61           90           80
91           120          140
Total                     400

a) Determine the coefficient of variation of the response time and establish whether this time
it can be considered homogeneous or not.

b) Build the box diagram.


c) Considering that a response after 50 minutes is critical, what percentage
of the cases received such a response?

29. A company in the city of Quito has 1300 employees, of whom 20% belong to the
administrative area, 42% to the technical area, and the rest to the sales area. It is also known that
430 employees have a high school education (baccalaureate), 650 have higher education, and the
rest have postgraduate education. In addition, 226 employees in the technical area have a high school education, 328
employees in the sales area have higher education, 118 employees in the administrative
area have a postgraduate education, and 74 sales employees have a high school education.
Complete the following table with the number of employees according to the criteria
indicated in the statement:

Education \ Work area  Administrative  Technical  Sales  Total
Baccalaureate
Higher
Postgraduate
Total
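
The table can be completed by repeatedly using the fact that each row and each column must add up to its total. The sketch below follows one possible order of deductions. The dictionary names are illustrative, and integer arithmetic is used for the percentages to avoid rounding issues.

```python
total = 1300

# Column totals (work areas) from the stated percentages.
area_totals = {"administrative": total * 20 // 100, "technical": total * 42 // 100}
area_totals["sales"] = total - sum(area_totals.values())

# Row totals (education levels).
edu_totals = {"baccalaureate": 430, "higher": 650}
edu_totals["postgraduate"] = total - sum(edu_totals.values())

# Cells given directly in the statement.
table = {
    "baccalaureate": {"technical": 226, "sales": 74},
    "higher": {"sales": 328},
    "postgraduate": {"administrative": 118},
}

# Fill the remaining cells from row and column totals.
table["baccalaureate"]["administrative"] = edu_totals["baccalaureate"] - 226 - 74
table["higher"]["administrative"] = (
    area_totals["administrative"]
    - table["baccalaureate"]["administrative"]
    - table["postgraduate"]["administrative"]
)
table["postgraduate"]["sales"] = area_totals["sales"] - 74 - 328
table["higher"]["technical"] = (
    edu_totals["higher"] - table["higher"]["administrative"] - table["higher"]["sales"]
)
table["postgraduate"]["technical"] = (
    area_totals["technical"] - 226 - table["higher"]["technical"]
)

for level, row in table.items():
    print(level, row, "total:", sum(row.values()))
```

The last cell is never forced to fit: the postgraduate row adding up to its own total serves as a consistency check on the whole table.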

30. The (actual) deaths in Ecuador are shown by month and by year (2017, 2018,
2019):

Recorded deaths by year

Month 2017 2018 2019
January 6107 6760 6801
February 5706 5827 6017
March 6732 6133 6672
April 6092 5840 6252
May 6108 5961 6072
June 5785 5828 5960
July 5949 6057 6156
August 5766 6187 6366
September 5369 5906 6296
October 5779 6030 6093
November 5624 5872 6151
December 5824 6388 6519
Total population 16 444 969 16 746 074 17 052 692
Source: Administrative death records, INEC

Elaborate and/or determine

a) Frequency histograms of deaths by year and compare.


b) In which month of each of the recorded years does the highest proportion of
deaths occur?


c) In which month of each of the recorded years does the lowest proportion of
deaths occur?
d) Defining the rate of change of deaths between consecutive years as

$$\frac{(\text{deaths in year } t) - (\text{deaths in year } t-1)}{(\text{deaths in year } t-1)}$$

what is the rate of change of deaths for 2017/2018 and for 2018/2019?
e) The General Mortality Rate is defined as the ratio between the total number of deaths and
the total population, multiplied by 1000. What can you say about the trend of the general
mortality rate in Ecuador?
Note: the result of this indicator is interpreted as the number of deaths that occurred
per 1000 inhabitants.
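
Parts b) to e) can be explored with a short sketch like the one below, which computes the yearly totals, the months with the highest and lowest counts, the rate of change between consecutive years, and the General Mortality Rate per 1000 inhabitants. The data structures and names are illustrative choices.

```python
months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Monthly deaths per year, copied column by column from the table.
deaths = {
    2017: [6107, 5706, 6732, 6092, 6108, 5785, 5949, 5766, 5369, 5779, 5624, 5824],
    2018: [6760, 5827, 6133, 5840, 5961, 5828, 6057, 6187, 5906, 6030, 5872, 6388],
    2019: [6801, 6017, 6672, 6252, 6072, 5960, 6156, 6366, 6296, 6093, 6151, 6519],
}
population = {2017: 16_444_969, 2018: 16_746_074, 2019: 17_052_692}

totals = {year: sum(values) for year, values in deaths.items()}

# b) and c) months with the highest and lowest share of each year's deaths
peak_month = {y: months[v.index(max(v))] for y, v in deaths.items()}
low_month = {y: months[v.index(min(v))] for y, v in deaths.items()}

# d) rate of change between consecutive years
rate_of_change = {
    y: (totals[y] - totals[y - 1]) / totals[y - 1] for y in (2018, 2019)
}

# e) General Mortality Rate: deaths per 1000 inhabitants
mortality_rate = {y: totals[y] / population[y] * 1000 for y in totals}

for year in totals:
    print(year, totals[year], peak_month[year], low_month[year],
          round(mortality_rate[year], 3))
print({y: round(r, 4) for y, r in rate_of_change.items()})
```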

31. Suppose that the following data, corresponding to the cost (in dollars) of
electricity over a month and obtained from a sample of 50 houses in a residential area
of Guayaquil, are to be analyzed:

96 171 202 178 147 102 153 129 127 82


157 185 90 116 172 111 148 213 130 165
141 149 206 175 123 128 144 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158

a) Describe the population, the sample, the individual, the statistical variable, and the type of variable.


b) Create a frequency table by grouping the data into class intervals. Use a width
equal to 17 for all classes.
c) Create a histogram of absolute frequencies that includes the frequency polygon.
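
A frequency table with classes of width 17 can be built as sketched below. Two choices here are assumptions rather than part of the exercise: the first class starts at the smallest observation, and classes are half-open intervals [lower, upper), so a value equal to a boundary falls in the class above it.

```python
# Monthly electricity cost (dollars) for the 50 houses in exercise 31.
costs = [
    96, 171, 202, 178, 147, 102, 153, 129, 127, 82,
    157, 185, 90, 116, 172, 111, 148, 213, 130, 165,
    141, 149, 206, 175, 123, 128, 144, 168, 109, 167,
    95, 163, 150, 154, 130, 143, 187, 166, 139, 149,
    108, 119, 183, 151, 114, 135, 191, 137, 129, 158,
]

width = 17
start = min(costs)                                  # assumed first lower limit
n_classes = (max(costs) - start) // width + 1

# Absolute frequency of each class [lower, lower + width).
counts = [0] * n_classes
for c in costs:
    counts[(c - start) // width] += 1

cumulative = 0
print("class        f    fr     F")
for i, f in enumerate(counts):
    lower = start + i * width
    cumulative += f
    print(f"[{lower:3d}, {lower + width:3d})  {f:2d}  {f / len(costs):.2f}  {cumulative:3d}")
```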

32. A manufacturer of a certain electronic component is interested in determining the lifespan (in
hours) of these devices, for which a sample of 12 observations has been taken:

123, 116, 120, 130, 122, 110, 175, 126, 125, 110, 119, ?.

One of the data points has been lost, but it is known that the average of the 12 data points is 124 hours.

a) Find the missing data.


b) Calculate the median, first and third quartile.
c) Find the range, variance, and standard deviation.
d) Draw the box diagram.
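
For part a), the missing value follows from the definition of the mean: the 12 observations must add up to 12 times the stated mean. A minimal sketch, with illustrative names:

```python
import statistics

# The 11 known lifespans (hours) from exercise 32.
known = [123, 116, 120, 130, 122, 110, 175, 126, 125, 110, 119]

n = 12
stated_mean = 124

# Sum of all n observations equals n * mean, so the missing one is the remainder.
missing = n * stated_mean - sum(known)

lifespans = known + [missing]
median = statistics.median(lifespans)
data_range = max(lifespans) - min(lifespans)

print(f"missing = {missing}, median = {median}, range = {data_range}")
```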

33. A random sample is taken with data on the cost in dollars for electricity consumption in
a residential area of Guayaquil


96 171 202 178 147


157 185 90 116 172
141 149 206 175 123
95 163 150 154 130
108 119 183 151 114

a) Obtain the table of absolute and relative frequencies.


b) Create a histogram of absolute and relative frequencies and include the frequency polygon.
c) Calculate the measures of central tendency: arithmetic mean, median, mode, and the quartiles.
d) Calculate the measures of dispersion: standard deviation, variance, interquartile range and the
coefficient of variation.

Briefly explain in your own words the meaning of each of these measures in terms
of the origin of the data.

Task 1: Exercises: 1, 3, 4, 7, 11, 13, 16, 25.
