Pde 710

The document outlines the introduction of e-course books for Post Graduate Diploma in Education students at the National Teachers' Institution, Kaduna, highlighting the shift from printed materials to digital formats in response to changing learner preferences. It emphasizes the importance of integrating ICT in education and the potential of e-books to alleviate issues related to course material scarcity. The document also discusses the significance of statistics in educational research and decision-making, detailing various statistical methods and their applications in educational contexts.

Uploaded by

russellsimeon567
NATIONAL TEACHERS' INSTITUTE, KADUNA (PGDE/DLS)

E-COURSE BOOK

FOREWORD

The introduction of e-books for the Post Graduate Diploma in Education (PGDE) students of the Institute clearly indicates the commitment of the 9th Executive Management members to the integration of ICT in all operations of the Institute, as contained in its 5-Year Strategic Plan. There is no doubt that the use of printed course books as a learner support mechanism is no longer fashionable or affordable. The advent and availability of smart and Android phones in the hands of learners has shifted learners' interest from learning through books to learning through devices, which is more convenient and accessible than printed copies of books.

I commend the foresight of the Director and Chief Executive, Professor Musa Garba Maitafsir, for the courage to accept the reality that ICT has compelled education managers to employ ICT-driven techniques in supporting learners, owing to the shift in learners' interest in learning. The e-course books will unavoidably bring an end to the era of course book scarcity, complaints of over- and under-production of the books, and the problems of access to selected courses bedeviling some centres of the Institute.

I sincerely commend NTI for this giant stride, with the hope that it will be extended to its other programmes. Finally, I implore the Management of the Institute to look at the possibility of providing access to its e-books to all the teachers of Nigeria by dedicating an Open Educational Resource (OER) site for use free of charge.
Malam Adamu Adamu
Honourable Minister of Education
January, 2023

NATIONAL TEACHERS' INSTITUTE, KADUNA
POSTGRADUATE DIPLOMA IN EDUCATION (PGDE/DLS)
E-COURSE BOOK ON PDE 710: STATISTICAL METHODS IN EDUCATION

TABLE OF CONTENTS

MODULE 1: STATISTICAL METHODS IN EDUCATION I
Unit 1: The Meaning of Statistics
Unit 2: Descriptive Statistics
Unit 3: Measures of Central Tendency and Location: Mean, Mode, Median and Graphical Location of Mode, Median, Quartiles, Deciles and Percentiles
Unit 4: Measures of Variability or Dispersion, Standard Scores (Z-Scores and T-Scores) and the Normal Curve
Unit 5: Measures of Relationship or Correlation and Regression
Unit 6: Probability and Its Laws
Unit 7: Distribution Functions of a Random Variable

MODULE 2: STATISTICAL METHODS IN EDUCATION II
Unit 1: Testing Statistical Significance
Unit 2: Sampling
Unit 3: Parametric Statistics
Unit 4: Non-Parametric Statistics: Computation Procedures

MODULE 1
PDE 710: STATISTICAL METHODS IN EDUCATION

INTRODUCTION

Educational research is usually an attempt to answer educational questions in a systematic, objective and precise manner. In order to do this, the research is designed and carried out. Measurements of various kinds are taken, and from these a jumbled mass of numbers is obtained. In order to make meaning out of these numbers, they are organized. From this array of data, calculations are made and relationships described. In answering the research question, some decisions are made. In carrying out all these, the researcher needs statistics in addition to an appropriate research design.

Statistics means different things to different people. This is because statistics has its tentacles in virtually every area of human endeavour. Whenever the need for sound judgment and decision making arises in any life situation, reliance on statistics is considered wise.
This is because figures don't lie, though liars can figure!

Quality Teachers, Great Nation

UNIT ONE: THE MEANING OF STATISTICS

INTRODUCTION

The word "statistics" conveys a variety of meanings to people. To some, it is a collection of tables, charts, data or numbers. To others, it is an advanced component of mathematics. However, to the researcher, it is a tool for collecting, presenting and analyzing data which will be used in decision making. Statistics here is seen in its rigorous analytical and applied perspective. In this Unit, we shall explore the meaning of statistics and some of the concepts associated with it so that we can clearly understand its use, significance and purpose in education.

OBJECTIVES

By the end of this Unit, you should be able to:
(1) Define the term "statistics" correctly.
(2) Distinguish between statistics and statistic.
(3) Discuss the place of statistics in education.
(4) Explain the relationship between statistics and probability.
(5) Explain clearly some basic statistical concepts and notations.

WHAT IS STATISTICS?

Every day, we are bombarded with statements such as:
The number of accidents recorded on our roads between September and January this year is more than that of the same period last year.
The Federal Government is to reduce the civil service workforce by 33% in its reform agenda.
The following statistics were provided as the allocation to states in this quarter.
In all these cases, statistics is used to inform the public.

The use of statistics probably began as early as the first century A.D., when governments used a census of land and properties for tax purposes. This was gradually extended to such local events as births, deaths and marriages. The science of statistics, which uses a sample to predict or estimate some characteristics of a population, began its development during the nineteenth century.
Statistics is defined as the science comprising rules and procedures for collecting, organizing, summarizing, describing, analyzing, presenting and interpreting numerical data which are used in making decisions, valid estimates, predictions and generalizations.

Apart from being used to inform people, statistics plays a significant role in modern-day business and educational decision making and forecasting. Statistical methods offer us the opportunity to evaluate an uncertain future, using limited information to assess the likelihood of future events occurring. Because of this contemporary use, statistics has three distinct parts: Descriptive Statistics, Inferential Statistics and Experimental Statistics.

In Descriptive Statistics, the event or the outcome of events is described without drawing conclusions. It is concerned only with the collection, organization, summarizing, analysis and presentation of an array of numerical qualitative or quantitative data. Descriptive statistics include the mean, mode, median, standard deviation, range, percentile, kurtosis, correlation coefficient and proportions.

Experimental Statistics relates the design of experiments to establishing causes and effects. Such designs as experimental, quasi-experimental, factorial, block, ANOVA and ANCOVA belong to this group.

Inferential Statistics builds on descriptive statistics by going a step further to make interpretations. The focus of inferential statistics is surmising the properties of a population from the known properties of a sample of that population. Based on probability theory, valid and reliable decisions, generalizations, predictions and conclusions can be made using these statistics. Inferential statistics is useful in stochastic (random) processes, queuing theory, game theory, quality control and so on. Statistical procedures like chi-square, t-test and F-test belong to inferential statistics.
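The descriptive measures named above can be illustrated with a short Python sketch. It is only a minimal illustration: the scores are invented for this example and are not taken from the course book, and the standard deviation shown is the population form (it treats the list as the whole group of interest), which is an assumption of this sketch.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [50, 52, 52, 47, 49, 52, 60, 46]

mean_score = statistics.mean(scores)      # arithmetic average of the scores
median_score = statistics.median(scores)  # middle value of the ordered scores
mode_score = statistics.mode(scores)      # most frequently occurring score
score_range = max(scores) - min(scores)   # highest score minus lowest score
# Population standard deviation: the list is treated as the whole group.
std_dev = statistics.pstdev(scores)

print("Mean:", mean_score)
print("Median:", median_score)
print("Mode:", mode_score)
print("Range:", score_range)
print("Standard deviation:", round(std_dev, 2))
```

Each of these values only describes the eight scores; drawing a conclusion about a wider group of students from them would be a matter for inferential statistics.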
As a student of education, you need to study statistics because of its usefulness in making predictions and taking decisions on educational matters. Downie and Heath (1970) indicated the following basic reasons for studying statistics:

(1) Daily Use: Statistics is of immediate and practical utility. It helps the educator to get work done quickly and efficiently. It helps the educator in forecasting, testing, record keeping, test reporting, interpretation and so on.

(2) Problem Solving: When action research is conducted to solve immediate problems, statistical methods are applied to the data. Issues bordering on curriculum improvement, deciding on a better method of teaching, or predicting students' enrolment and the required school plant will involve the use of statistics.

(3) Theoretical Research: Theories predict what we expect to observe in specific circumstances. Most research in the behavioural sciences is now very sophisticated and therefore more quantitative. Theories serve to organize the information. In order to test these theories in education and the social sciences, we resort to statistical methods. The advantages of statistical methods in research include:
(i) They permit the most exact kind of description.
(ii) They force us to be definite and exact in our procedures and in our thinking.
(iii) They enable us to summarize our results in a meaningful and convenient form. They give order to our data so that we can see the forest as well as the individual trees.
(iv) They enable us to draw general conclusions in accordance with accepted rules. They further establish how much faith can be put in a conclusion and how far we can extend our generalization.
(v) They enable us to predict "how much" of a given event will occur under specified conditions that are known and measured.
(vi) They enable us to analyze some of the causal factors underlying complex and otherwise bewildering events.
Causal factors are usually best uncovered and proved by means of experiments. In education and the social sciences, this may not be possible in most cases. Statistical methods are therefore often a necessary substitute for, and a constant companion of, experiment. Thus, knowledge of some basic statistical procedures is essential for those proposing to carry out research, so that they can summarize and interpret their data well and communicate their findings.

(4) Comprehension and Use of Research: The competent educator and researcher must be able to read, with understanding, reports of applied and theoretical research. Learning in any field comes largely through reading. In any specialized field, reading is largely a matter of enlarging vocabulary. Reading research reports means encountering statistical symbols, concepts and ideas which must be understood. The reader should also be able to determine whether a given statistical procedure has been appropriately used in order to assess the conclusions reached. To do this, he or she must be at grips with statistical ideas and methods.

(5) Employment: Statistical logic, statistical thinking and statistical operations are necessary components of the teaching profession. To the extent that teachers use common technical instruments such as tests in their practice, they will depend upon a statistical background in administering them and in interpreting the results. Teachers who are unfamiliar with these procedures may have difficulty in evaluating their students' abilities and achievements. They will also find it difficult to review research in their areas of specialization and to acquire up-to-date information. Knowledge of statistics is also advantageous in other employment situations such as engineering, accountancy and economics. Statistics has wide application in several human endeavours. Training in statistics is also training in scientific method.
Statistical inference is inductive inference: the making of general statements from the study of particular cases. Many instances of this are encountered in life and in teaching.

ACTIVITY I
1. Give a clear definition of statistics.
2. Explain lucidly four reasons for studying statistics in education.

THE PURPOSE OF STATISTICS

As mentioned earlier, statistics is used in a variety of forms by different people. However, the primary purposes of statistics are to:
(i) reduce large arrays of data to a manageable and comprehensible form;
(ii) aid in the study of populations and samples;
(iii) aid in making reliable inferences about events based on observational data; and
(iv) help in arriving at valid and reliable decisions and generalizations.

EDUCATIONAL STATISTICS

Educational statistics is simply the application of the science of statistics to solve problems connected with various facets of education. It helps us to organize, summarize, present and interpret results and data from educational measurements. Through it, the degrees of association between educational variables are measured, and inferences or predictions are made in order to accomplish certain educational tasks. According to Boyinbode (1984), such educational tasks may include the organization and presentation of data, the measurement and description of individual or group performance, the measurement of relationships, the design of experiments and the testing of the significance of their results, the drawing of inferences or the formation of models, and educational forecasting.

There are various role players in education: educational managers and administrators, teachers, guidance and career counsellors, examiners and examining bodies, researchers, parents and students. Each of these stakeholders in education will need information in order to perform their roles well. Reliable information will be arrived at through the use of statistics.
Also, for them to manipulate the information to a useful and productive end will involve the use of statistics. Thus, each uses statistics in specific ways to achieve specific educational tasks. It is therefore not surprising that statistics is used in education in the following areas:

> Determination of the educational needs of the community: population, age distribution, state finance, priorities, manpower, growth rate, existing institutions, personnel, etc.

> Planning for physical resources (school plant), e.g. when determining the number of classrooms R, the formula below is often applied:

$$R = \frac{C \times P \times D - W}{P \times D}$$

where
C = number of streams in the school
P = number of periods held per day
D = number of school days per week
W = number of periods per week spent outside normal classroom teaching for recess, PHE, gardening, break, practicals in the laboratory, etc.

> Planning for human resources. Accurate projections should be made based on population. From these, the number of classes, teachers, students and non-teaching staff would be determined. The total number of teachers required in a school used to be set by a fixed ratio between teachers and the number of class streams. However, in recent times, the number of pupils enrolled for each subject offered in the school, the number of periods per subject per week, the level of difficulty of each subject, the level of academic attainment of students in each subject and the content volume of each subject are input variables in the equation.

The other important areas where statistics is applied in education include:
> Educational budgeting
> Inspection and school record keeping
> Test development, test scoring and test reporting
> Continuous assessment, record keeping and reporting

In all these areas, statistics is applied by various stakeholders to solve educational problems. Statistics is therefore of immense importance in education.

ACTIVITY
Briefly discuss the role of statistics in education.
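As a worked illustration of the classroom-planning formula in this section, the sketch below assumes the formula is read as R = (C x P x D - W) / (P x D), with R taken to be the number of classrooms required. The function name, the example figures and the final rounding up to a whole room are assumptions made for this illustration, not part of the course text.

```python
import math


def classrooms_needed(streams, periods_per_day, days_per_week, periods_outside):
    """Estimate classrooms required, assuming R = (C*P*D - W) / (P*D)."""
    # Total teaching periods per week across all streams (C * P * D).
    total_periods = streams * periods_per_day * days_per_week
    # Periods that actually need a normal classroom (subtract W).
    in_room_periods = total_periods - periods_outside
    # Periods one classroom can host in a week (P * D).
    room_capacity = periods_per_day * days_per_week
    return in_room_periods / room_capacity


# Hypothetical school: 12 streams, 8 periods a day, 5 days a week,
# 40 periods a week held outside normal classrooms.
exact = classrooms_needed(12, 8, 5, 40)
rooms = math.ceil(exact)  # round up, since part of a room cannot be built
print(exact, rooms)
```

Under these assumed figures, the 480 weekly periods less the 40 held elsewhere leave 440 periods to be shared among rooms that each host 40 periods a week.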
SOME BASIC STATISTICAL CONCEPTS AND NOTATIONS

Variables and Constants: A variable is a characteristic or property that can take on different values. It refers to a property whereby the members of a group or set differ from one another. Individuals in a class may differ in sex, age, intelligence, height, etc. These properties are variables. Constants, on the other hand, do not assume different values. Variables could be those that vary in quality or those that vary in quantity.

Quantitative Variables take values that vary in terms of magnitude. They are easy to measure and compare with one another. These may be scores obtained in a test, weight, height, age, distance, number, etc.

Qualitative Variables are those that differ in kind. They are only categorized. Examples are marital status, gender, nationality, socio-economic status and educational qualifications.

Quantitative variables may be discrete or continuous. A discrete variable is one which can take only a finite set of values, implying that fractional values are usually not allowed. These variables are generated by a counting process, usually in whole numbers, e.g. the number of goals scored in a football match, the number of teachers in a school, or the number of girls and boys in a class. A continuous variable is one which can take on any value over a range of feasible values. Measured data can be whole numbers or fractions, e.g. weight, height and distance values.

Variables could also be dependent or independent, depending on their functions in a given context. A variable that is dependent in one context may be independent in another. The Independent Variable is one that is manipulated or treated. The effect of this manipulation is manifested in the dependent variable. The value of the dependent variable thus depends on that of the independent variable. Also, the value of the dependent variable is usually predicted from that of the independent variable.
When comparing the effects of two teaching methods on students' learning achievement, the teaching methods are the independent variable while learning achievement is the dependent variable. Note that in graphing, the dependent variable is placed on the vertical Y-axis while the independent variable is placed on the horizontal X-axis.

There are two types of independent variables: Treatment or Active variables and Organismic or Attribute variables.

A Treatment or Active Variable is defined as one that can be directly manipulated by the researcher and to which he or she assigns subjects. This group includes method of teaching, method of grouping and reinforcement procedures.

Organismic or Attribute Variables are those variables that cannot be actively manipulated by the researcher. These variables are sometimes called assigned variables, and they are characteristics of individuals that cannot be manipulated at will. Such independent variables as age, sex, aptitude, social class, race and intelligence level have already been determined, but the researcher can decide to include or remove them as variables to be studied.

Confounding Variables: Confounding variables are those aspects of a study or sample that might influence the dependent variable or the outcome measure, and whose effect may be confused with the effects of the independent variable. There are two types: Intervening and Extraneous variables.

Intervening variables are those variables that cannot be measured directly or controlled but may have an important effect upon the outcome. They are usually modifying variables that interfere between the cause and the effect. These may include anxiety, fatigue and motivation. These variables cannot be ignored in experiments and must be controlled as much as practicable through the use of designs.
Extraneous variables: These are variables not manipulated by the researcher (uncontrolled variables) that may have a significant effect on the outcome of a study. These may include such variables as teacher competence or enthusiasm, and the age, socio-economic status or academic ability of the students in the study. Though it is impossible to eliminate all extraneous variables in classroom research, using robust experimental designs enables the researcher to neutralize their influence to a large extent. Other methods include removing the variable, randomization, matching cases, group matching or balancing cases, and analysis of covariance.

Data: This is a collection of information, qualitative or quantitative.

Distribution: This is the arrangement of a set of numbers classified according to some property.

Population: This refers to the group of measurements that are of interest, i.e. the aggregate of units to be covered. This may be people, objects, materials, measurements or things. Populations could be finite or infinite. When the population is not too large and can be easily counted, it is finite, e.g. the number of students in a school or the number of candidates that wrote an examination. However, when the members of a population are so many, like the grains of sand or the number of women in West Africa, we say it is infinite.

Sample: This is a part or subset of a population. It is any subgroup or sub-aggregate drawn by some appropriate method from a population. The sample is usually the portion of the population appropriately selected for observation.

Parameter: This is a descriptive measure or characteristic (true value) of a population. When such characteristics as the mean, standard deviation or variance of a population are computed, they are called parameters.

Statistic: This refers to a descriptive measure or characteristic of a sample. When we calculate the average age of candidates who wrote JME, we are talking of a parameter.
However, if we compute the average age of candidates from a given school or state, then the average age is a statistic. Note that this is also different from statistics as a discipline. Different symbols are used to denote statistics and parameters:

Characteristic          Parameter    Statistic
Mean                    μ            X̄ or M
Standard Deviation      σ            SD or S
Variance                σ²           SD² or S²

Dichotomy: A categorical variable with only two categories, e.g. Male/Female.

Categorical Variable: A nominal variable on which positions and scores are not recorded as numbers.

Scores: Any position on a numerical variable.

Skewness of a Distribution: A skewed distribution has a longer tail at one end than at the other. It is an asymmetrical distribution.

Kurtosis: This is the extent of peakedness in a distribution.

Normal Distribution: This is a symmetrical distribution having its mean, mode and median equal. Also, the frequencies of the variable extend equally both to the left and to the right of the mode.

Parametric Tests: These are tests whose efficacy depends on whether the variable being studied is at least approximately normally distributed.

Non-Parametric Tests: These tests are developed without reference to the distribution of variables.

x: a variable
f: frequency of occurrence of observations
n: the sample size, i.e. the number of observations selected from a population
N = Σf: total number of observations comprising a population of interest
Σ: pronounced "sigma"; a summation sign which instructs us to "take the sum of" or add
√: the square root sign, which directs us to find the square root of a number, e.g. √36 = 6
Xⁿ: this directs us to raise a quantity to the indicated power

ACTIVITY
Clearly distinguish between:
(i) statistics and "statistic"
(ii) continuous and discrete variables
(iii) parametric and non-parametric tests
(iv) discrete and continuous variables
(v) qualitative and quantitative variables
(vi) dependent and independent variables

SUMMARY
• In this unit, we have defined statistics as the science which comprises the rules and procedures for collecting, organizing, summarizing, describing, analyzing, presenting and interpreting numerical data which are used for decision making, predictions and generalizations.
• The importance of statistics in general terms was discussed.
• Educational statistics is the application of statistics in the field of education.
• The uses and purposes of statistics in education were enumerated.
• Some basic statistical concepts and notations were also explained.

ASSIGNMENT
Discuss one application of statistical methods in teacher education.

REFERENCES
Ary, Donald et al. (1979): Introduction to Research in Education. U.S.A.: Holt, Rinehart and Winston, Inc.
Best, J. W. and Kahn, J. V. (1986): Research in Education. London: Prentice Hall Inter.
Boyinbode, I. R. (1984): Fundamental Statistical Methods in Education and Research. Ile-Ife: DCSS Books.
Gay, L. G. (1970): Educational Research: Competencies for Analysis and Application. Ohio: Charles E. Merrill.
Guilford, J. P. and Fruchter, B. (1973): Fundamental Statistics in Psychology and Education.
McCall, R. B. (1980): Fundamental Statistics for Psychology. U.S.A.: Harcourt Brace Jovanovich Inc.

UNIT TWO: DESCRIPTIVE STATISTICS

INTRODUCTION

We have seen that educational data can be obtained in various ways. These data must be summarized and presented in a form that is easily understood. Statistics is used to do this. The type of statistics used will depend largely on the nature of the data involved. In this unit, we shall discuss the various scales of measurement, ways of organizing and presenting data, and the calculation of some statistics.

OBJECTIVES

By the end of this Unit, you should be able to:
1.
describe the four scales of measurement;
2. describe the organization and presentation of data using charts and graphs;
3. define terms associated with frequency distribution;
4. construct a frequency table for any set of data;
5. draw a histogram to represent a given set of data;
6. draw frequency polygons from frequency distributions;
7. draw a frequency curve for a large set of data; and
8. identify the different types of frequency curves.

SCALE OF MEASUREMENT

Quantification has been defined as a numerical method of describing observations of materials or characteristics. When a defined portion of the material or characteristic is used as a standard for measuring any sample, a valid and precise method of data description is provided.

Measurement is a fundamental step in the conduct of research. Measurement is defined as the process through which observations are translated into numbers. It is the assignment of numerals to objects or events according to certain rules. Starting with variables, some rules are then used to determine how these variables will be expressed in numerical form. This may be through tests or actual measurements. The nature of the measurement process that produces the numbers determines the interpretation that can be made from them and the statistical procedures that can be meaningfully used with them. Scientists distinguish among four levels of measurement, categorized in the scales of measurement as Nominal, Ordinal, Interval and Ratio.

NOMINAL SCALE

Nominal data are counted data. Each individual can be a member of only one of a set of mutually exclusive categories. Examples of such categories include nationality, gender, socio-economic status, occupation, role and religious affiliation.

Numbers are often used at the nominal level, but only in order to identify the categories. The numbers arbitrarily assigned to the categories serve mainly as labels or names.
The numbers do not represent absolute or relative amounts of any characteristic. For instance, the numbers given to football players do not represent their degree of skillfulness but serve only for recognition and positions.

The identifying numbers in a nominal scale cannot be arithmetically manipulated through addition, subtraction, multiplication or division. However, statistical procedures based on mere counting, such as reporting the number of observations in a category, can be used. Thus, with this type of scale, we can only find the mode and percentages, draw charts, and perform the chi-square test and some special types of correlation.

ORDINAL SCALE

Nominal scales show that things are different, but an ordinal scale also shows the direction of the differences. It shows the relative position of one thing to another but cannot specify the magnitude of the interval between two measures. Ordinal scales thus only permit the ranking of items or individuals from highest to lowest. The criterion for highest-to-lowest ordering is expressed as relative position or rank in a group: 1st, 2nd, 3rd, ..., nth. This is why the ordinal scale is also called the rank-order scale. Ordinal measures have no absolute values, and real differences between adjacent ranks may not be equal.

Neither the difference between the numbers nor their ratio has meaning. When the numbers 1, 2, 3 and so on are used, there is no implication that rank 1 is as much higher than rank 2 as rank 2 is than rank 3, and so on.

In ordinal measurement, the empirical procedure used for ordering objects must satisfy the transitivity postulate. This postulate holds that the relationship must be such that if object a is greater than object b, and object b is greater than object c, then object a is greater than object c. This is written as: if (a > b) and (b > c), then (a > c). Other expressions such as "is stronger than", "precedes" and "has more of the attribute than" can be substituted for "is greater than" in other situations.
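The point that ranks preserve order but not the size of differences can be sketched in a few lines of Python. The names and scores below are hypothetical, chosen only to make the unequal gaps visible.

```python
# Hypothetical scores for four students, invented for illustration only.
scores = {"A": 90, "B": 88, "C": 70, "D": 40}

# Order the students from highest to lowest score.
ordered = sorted(scores, key=scores.get, reverse=True)

# Assign ordinal ranks: 1 for the highest score, 2 for the next, and so on.
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# Actual score differences between students holding adjacent ranks.
gaps = [scores[ordered[i]] - scores[ordered[i + 1]]
        for i in range(len(ordered) - 1)]

print(ranks)  # adjacent ranks always differ by exactly 1
print(gaps)   # the underlying score differences are far from equal
```

Here the ranks step evenly from 1 to 4, yet the score gaps between adjacent ranks are 2, 18 and 30, which is why differences and ratios of ranks carry no meaning.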
The arithmetical operations of addition, subtraction, multiplication and division cannot be meaningfully used with ordinal scales. The statistics that can be used with the nominal scale can also be used with the ordinal scale.

INTERVAL SCALE

This is an arbitrary scale based on equal units of measurement, which indicates how much of a given characteristic is present. It provides equal intervals from an arbitrary origin. An interval scale not only orders objects or events according to the amount of the attribute they represent but also establishes equal intervals between the units of measure. Equal differences in the numbers represent equal differences in the amount of the attribute being measured. The difference in the amount of the characteristic possessed by persons with scores of 60 and 65 is assumed to be equivalent to that between persons with scores of 70 and 75.

The limitation here is the lack of a true zero. The zero point is arbitrary. An interval scale lacks the ability to measure the complete absence of the trait, and a measure of 30 does not mean that the person has twice as much of the trait as someone who scored 15. You should note that in most cases where we use interval scales, the intervals are equal in terms of the measuring instrument itself but not necessarily in terms of the ability we are measuring.

Common examples of interval data include time, temperature as measured on the Centigrade and Fahrenheit scales, and scores obtained in achievement tests. We can also force an ordinal scale into an interval scale, as in the case of ratings like:

a) 1. Strongly agree (SA)       b) Excellent
   2. Agree (A)                    Good
   3. Undecided (UND)              Average
   4. Disagree (D)                 Weak
   5. Strongly disagree (SD)       Poor

If this is regarded as a continuum, where it is possible to choose any point, then we can regard it as an interval scale. Because interval scales lack a true zero, multiplication and division of the numbers are not appropriate.
This is because ratios between the numbers on an interval scale are meaningless. However, additions and subtractions are possible. Any statistical procedure based on adding may be used with this scale, along with the procedures earlier mentioned as appropriate for the lower-level scales. These include the mean, standard deviation, t-tests, Pearson r, analysis of variance, etc.

THE RATIO SCALE

The fourth and final type of scale is the ratio scale. It provides a true zero point as well as equal intervals. The numerals of the ratio scale have the qualities of real numbers and can be added, subtracted, multiplied, divided and expressed in ratio relationships, e.g. 10 g is one half of 20 g, 30 cm is three times 10 cm, etc. Examples of ratio data are usually found in the physical sciences and are seldom, if ever, obtained in education and the behavioural sciences. In education, these are limited to educational performance and other physiological measurements. All types of statistical procedures are appropriate with a ratio scale.

ACTIVITY
Describe each type of measurement scale and give a situation in which each can be applied.

THE ORGANIZATION OF DATA

It is always difficult to make sense of a large body of data that has not been arranged. This may be data from your research work or students' scores on tests. You need a method to organize the data in order to interpret them. Organizing research data is a fundamental step in statistics. There are two ways of organizing such data: (i) arranging the measures into frequency distributions and (ii) presenting them in graphic forms.

When you have a small set of ungrouped raw data, it is wise to arrange the values in descending order of magnitude to produce what is known as an array of data. This process of arranging the raw data to get an array of data is called Ranking.
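As a minimal sketch of ranking, the short Python snippet below arranges a small set of raw scores into an array in descending order. The scores and the function name `rank_scores` are hypothetical, chosen only for illustration; they are not taken from this course book.

```python
def rank_scores(scores):
    """Return the raw scores arranged in descending order (an array of data)."""
    # sorted() builds a new list, so the raw data is left untouched.
    return sorted(scores, reverse=True)


# Hypothetical raw scores, used only to illustrate the idea.
raw_scores = [62, 47, 81, 55, 70]
array_of_data = rank_scores(raw_scores)
print(array_of_data)
```

Once the data is in an array, the highest and lowest values sit at the two ends of the list, which makes later steps easier to carry out by eye.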
For example, the scores of the students in your class on Statistics are as follows:

Musa       70
Lawrence   45
David      50
Ade        52
Audu       88
Ofodile    48
Hanatu     60
Benedict   55
Bunu       90
Osun       40

Ranking: 90 88 70 60 55 52 50 48 45 40

This is an array of raw data. The array provides a more convenient arrangement, the highest score being 90 and the lowest 40. From this, the range can be easily calculated. The range is the difference between the highest score, H, and the lowest score, L, which is 90 - 40 = 50.

FREQUENCY DISTRIBUTION

Given below are the raw scores of 60 students in another statistics test:

55 52 50 53 53 47 60 52 50 52
47 50 70 49 50 49 46 47 55 50
40 44 49 53 42 58 38 52 52 46
49 53 50 49 47 46 50 60 49 47
44 55 46 52 58 44 53 55 52 37
37 35 32 47 58 37

In order to make meaning out of these data, you will arrange them from highest to lowest. A systematic arrangement of individual measures from lowest to highest or vice versa is called a frequency distribution. The rank-ordered scores may be put in a frequency distribution table as given below:

SCORES   TALLIES       FREQUENCIES
70       |             1
60       ||            2
58       |||           3
57       ||||          4
55       |||||         5
53       ||||| |       6
52       ||||| |||     8
50       ||||| ||      7
49       ||||| |       6
47       ||||| |       6
46       ||||          4
44       ||||          4
42       ||            2
40       |             1
38       |             1

FREQUENCY DISTRIBUTIONS

When summarizing large masses of raw data, it is often good to distribute the data into classes or categories and to determine the number of individuals belonging to each class, called the class frequency.

Definition: A tabular arrangement of data by classes together with the corresponding class frequencies is called a frequency distribution or frequency table.

As a preliminary to a full-scale traffic
survey, it was necessary to have some information about the number of occupants of cars entering a certain town on Saturday afternoons, and an occupancy count was made on each of 40 cars. The results were: 1, 3, 2, ... (40 counts in all).

Are these variates discrete or continuous? These are discrete variates, but figures like these dazzle you, and you find yourself unable to make any meaning out of numerical data like these just by looking at them. A simple picture of the occupancy of the cars is obtained if the data is given in the form of a table showing the number of cars with 1 occupant, the number with 2 occupants, and so on. To tabulate the data in this way, you will probably find it easiest to work your way systematically through the 40 counts, assigning each to the appropriate category using a tally mark as shown and working in blocks of five to facilitate the final totalling. In fact, the observer might as well have recorded the data in this way in the first place.

Number of occupants   Tally strokes          Number of cars
1                     ||||| ||||| |||||      15
2                     ||||| ||||| ||         12
3                     ||||| |||              8
4                     ||||                   4
5                     |                      1
Total                                        40
Table 2.1

This is a simple example of a frequency distribution (frequency table). The variate (which will henceforth be denoted by X) is in this case "number of occupants". The number of cars with X occupants shows the frequency with which that value of X occurred. F is usually used for frequency.

In order to get a better picture, raw data can also be grouped into a grouped frequency distribution. In doing this, we have to decide on the number of groups required as well as the size of each interval. There is no fixed number of groups that is appropriate. However, it is advised that between 5 and 20 groups are enough, depending on the range of the scores.

GROUPED DATA

Let us consider the result of life-testing of 80 tungsten filament electric lamps.
The life of each lamp is given to the nearest hour:

 854  1284  1001  [illegible]  1168  1357  1090  1082  1494  1684
1355  1502  1281  1666   778  1550   628  1325  1073  1273
1608  1367  1152  1393  1399  1199  1155   822  1448  1623
1058  1930  1365  1291   683   811  1137  1185   392   937
 963  1279  1494   798  1599  1281   590   960  1310  1848
1200   845  1454   919  1571  1710  1734  1928  1416  1465
1026  1299  1242  1508   705  1084  1220  1650  1091   210
1399  1198   518  1199  2074   945  1215   905  1810  1265

We now present this data in a grouped frequency table.

NOTE THE STEPS
1. The range (2074 - 210) is found and divided into 10 groups.
2. Each group has a width of 200.
3. The tally method is used to determine the frequency in each group or class.
4. Always check that the sum of the frequencies is equal to the number of observations in the data, 80 in this case.

Table 2.3 shows the grouped data. STOP! Compare this grouped data with the ungrouped form of the same data. What differences do you observe? We observe the characteristics of the distribution more clearly and more quickly with the grouped data, and further statistics are readily facilitated, as you will see later in this unit.

Grouped Frequency Distribution

Life (hours)    Tally marks                   Number of lamps
X                                             F
201 - 400       |                             1
401 - 600       ||                            2
601 - 800       |||||                         5
801 - 1000      ||||| ||||| ||                12
1001 - 1200     ||||| ||||| ||||| ||          17
1201 - 1400     ||||| ||||| ||||| |||||       20
1401 - 1600     ||||| ||||| ||                12
1601 - 1800     ||||| ||                      7
1801 - 2000     |||                           3
2001 - 2200     |                             1
Table 2.3

TERMS USED IN FREQUENCY DISTRIBUTIONS

The table below is a frequency distribution of the masses (to the nearest kg) of 100 male students at a certain College of Education in Nigeria.

Masses of 100 Students at a Certain College of Education

Mass (kg)    Number of students
X            F
60 - 62      5
63 - 65      18
66 - 68      42
69 - 71      27
72 - 74      8
Total        100
Table 2.4

You should notice that with the groups for this table defined in the way shown, there is always a gap between the right-hand end point of one group and the left-hand end point of the next one (i.e.
between 62 and 63, 65 and 66, etc.). This may appear to make the data discrete rather than continuous. However, a mass recorded as 62 kg would in reality have been between 61.5 and 62.5 kg (see rounding off numbers in Unit 1), and similarly 63 kg covers true values between 62.5 and 63.5 kg. Thus, in reality the data is continuous. The true end points of the groups are as shown below, with continuous coverage along the scale.

End points given as      End points given as
measured values          true values
60 - 62                  59.5 - 62.5
63 - 65                  62.5 - 65.5
66 - 68                  65.5 - 68.5
Table 2.5

It is important to choose groups whose end points do not coincide with actual observed data. The above explanation brings us to the following terms.

Class Boundaries: The numbers indicated above by the points 59.5, 62.5, etc. are called class boundaries or true class limits. For the first class, the smaller number 59.5 is the lower class boundary and the larger number 62.5 is the upper class boundary. How to calculate class boundaries is discussed below.

Class Intervals and Class Limits: A symbol defining a group, such as 60 - 62 in the above table, is called a class interval or class. The end numbers 60 and 62 are called class limits; the larger number is called the upper class limit while the smaller one is the lower class limit.

An open class interval is one which has no upper class limit or no lower class limit, such as the class "75 years and over".

The Size of a Class Interval: The size or width of a class interval, also referred to as the class width, class size or class strength, is the difference between the lower and upper class boundaries. For example, in the data of Table 2.5 the class size is 62.5 - 59.5 = 65.5 - 62.5 = 3, etc. Since the classes are of equal size, it can also be found from successive class limits as 63 - 60 = 66 - 63 = 3 or 65 - 62 = 68 - 65 = 3.

Calculation of Class Boundaries: Class boundaries are obtained by adding the upper class limit of one class to the lower class limit of the next higher class and dividing by 2.
For example, the upper class boundary of the first class (60 - 62) of the data given in Table 2.4 is $\frac{62 + 63}{2} = 62.5$, which is also the lower class boundary of the second class (63 - 65). The upper class boundary of the second class (63 - 65) is $\frac{65 + 66}{2} = 65.5$, which is also the lower class boundary of the third class (66 - 68), and so on. The lower class boundary of the first class (60 - 62) is $\frac{59 + 60}{2} = 59.5$.

CLASS MARK: The class mark, also called the class midpoint or class centre, is obtained by adding the lower and upper class limits and dividing by two. Thus, the class mark of the class 60 - 62 is $\frac{60 + 62}{2} = 61$.

ARRAYS: An array is an arrangement of numerical data in ascending or descending order of magnitude. The difference between the largest and smallest numbers is the range of the data.

Example: Looking at Table 2.3 of the length of life of 80 lamps, find:
a. The lower limit of the 4th class.
b. The upper limit of the 5th class.
c. The class mark of the 3rd class.
d. The class boundaries of the 8th class.
e. The size of the 6th class.
f. Are all the classes of the same size?
g. The frequency of the 7th class.
h. Which class has the highest frequency?

Solution
a. The 4th class is 801 - 1000. The lower limit is 801.
b. The 5th class is 1001 - 1200. The upper limit is 1200.
c. The 3rd class is 601 - 800. The class mark is $\frac{601 + 800}{2} = 700.5$.
d. The 8th class is 1601 - 1800. The lower class boundary is $\frac{1600 + 1601}{2} = 1600.5$. The upper class boundary is $\frac{1800 + 1801}{2} = 1800.5$.
e. The 6th class is 1201 - 1400 and its size is 1400.5 - 1200.5 = 200.
f. To determine whether all the classes are equal: 600 - 400 = 800 - 600 = 1000 - 800 = 200, etc., or 401 - 201 = 601 - 401 = 801 - 601 = 200, etc. All the classes are of equal size.
g. The 7th class is 1401 - 1600 and its frequency is 12.
h. The 6th class, 1201 - 1400, has the highest frequency, 20.
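The rules above for class marks, class boundaries and class size can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are illustrative, and the code assumes that the gap between adjacent class limits is the same throughout the table (1 hour here), so that the outermost boundaries can be found by extending the first and last classes by half that gap. The class limits and frequencies are those of the grouped lamp-life table.

```python
# Class limits and frequencies of the grouped lamp-life table.
class_limits = [(201, 400), (401, 600), (601, 800), (801, 1000), (1001, 1200),
                (1201, 1400), (1401, 1600), (1601, 1800), (1801, 2000), (2001, 2200)]
frequencies = [1, 2, 5, 12, 17, 20, 12, 7, 3, 1]


def class_mark(lower, upper):
    """Class mark (midpoint): lower limit plus upper limit, divided by two."""
    return (lower + upper) / 2


def class_boundaries(limits, index):
    """Lower and upper class boundaries of the class at position `index` (0-based).

    Assumption: every pair of adjacent classes is separated by the same gap,
    so each limit is moved outwards by half of that gap.
    """
    gap = limits[1][0] - limits[0][1]  # e.g. 401 - 400 = 1
    lower, upper = limits[index]
    return lower - gap / 2, upper + gap / 2


def class_size(limits, index):
    """Class size: upper class boundary minus lower class boundary."""
    lower_boundary, upper_boundary = class_boundaries(limits, index)
    return upper_boundary - lower_boundary


print(class_mark(*class_limits[2]))        # 3rd class: 700.5
print(class_boundaries(class_limits, 7))   # 8th class: (1600.5, 1800.5)
print(class_size(class_limits, 5))         # 6th class: 200.0
print(sum(frequencies))                    # check on the total: 80
```

The printed values agree with parts c, d and e of the worked solution, and the last line repeats the check that the frequencies add up to the number of lamps.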
Table 1.4: Marks obtained in Mathematics by 80 Students

Marks      Frequency
X          F
50 - 54    1
55 - 59    2
60 - 64    11
65 - 69    10
70 - 74    12
75 - 79    21
80 - 84    6
85 - 89    9
90 - 94
95 - 99

With reference to this table, determine:
a. The lower limit of the 6th class.
b. The upper limit of the 4th class.
c. The class mark of the 10th class.
d. The class boundaries of the 5th class.
e. The size of the 9th class.
f. Are all the classes of equal size?
g. What is the frequency of the 6th class?

GRAPHICAL REPRESENTATION OF DATA - THE HISTOGRAM

Data may be presented in two-dimensional graphs to allow more comparison than is possible with textual matter alone. There are a number of graphs for doing this. These include line graphs, bar graphs, pictographs, pie graphs, histograms, frequency polygons, ogives and the smooth curve. Histograms, frequency polygons and the smooth curve are most commonly used in education.

HISTOGRAM

The histogram is a graph which uses bars to depict the way two variables are related. Each bar has the class interval as its base and the class frequency as its length.

Definition: The chart of a frequency distribution is called a histogram.

Example: Let us consider the frequency distribution of the length of life of the lamps discussed earlier.

[Fig. 1.1: Histogram; horizontal axis: life (hrs)]

The diagram of the length of life (in hours) of 80 lamps is as shown in Fig. 1.1. The base of each rectangle extends from the lower class boundary to the upper class boundary on a scale representing the variable, in this case the length of life in hours. The true class boundaries must be used, so that the horizontal scale representing length of life is covered continuously with no breaks between the rectangles. Notice that the widths of the rectangles are equal, as shown. This is because the frequency distribution has equal class intervals. The bars are of different heights because the frequencies of the classes are different.
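The way each bar of the histogram is built can be sketched in Python as follows. The sketch is illustrative only: the names `histogram_bars` and `is_continuous` are hypothetical, the bars are printed as rows of `#` characters rather than drawn to scale, and the code assumes equal class widths and a constant gap between adjacent class limits, as in the lamp-life table.

```python
# Class limits and frequencies of the grouped lamp-life table.
class_limits = [(201, 400), (401, 600), (601, 800), (801, 1000), (1001, 1200),
                (1201, 1400), (1401, 1600), (1601, 1800), (1801, 2000), (2001, 2200)]
frequencies = [1, 2, 5, 12, 17, 20, 12, 7, 3, 1]


def histogram_bars(limits, freqs):
    """Return one (left edge, right edge, height) triple per class.

    The edges are the true class boundaries, found by moving each class limit
    outwards by half the gap between adjacent classes.
    """
    gap = limits[1][0] - limits[0][1]
    return [(lower - gap / 2, upper + gap / 2, f)
            for (lower, upper), f in zip(limits, freqs)]


def is_continuous(bars):
    """True if each bar starts exactly where the previous bar ends."""
    return all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))


bars = histogram_bars(class_limits, frequencies)
print("Scale covered without breaks:", is_continuous(bars))

# A rough text picture: one '#' per lamp, one row per class.
for left, right, height in bars:
    print(f"{left:7.1f} to {right:7.1f} | {'#' * height} ({height})")
```

Because the edges are class boundaries rather than class limits, the right edge of one bar coincides with the left edge of the next, which is exactly the "no breaks" requirement stated above.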
Looking at the histogram, you will notice that the heights of the rectangles represent the frequencies (where the classes are of equal size). Note also that the left-hand edge of each rectangle represents the lower class boundary and the right-hand edge represents the upper class boundary. For the class 1201 - 1400, AB represents 1200.5 and CD represents 1400.5, which are the lower and upper class boundaries respectively.

In general, when you combine n classes, the frequency (the height) of the new class becomes $\frac{1}{n}$ of the sum of the frequencies. Thus, if classes two and three are combined, the frequency becomes $\frac{1}{2}(2 + 5) = 3.5$.

FREQUENCY POLYGONS AND FREQUENCY CURVES

Frequency Polygons: The line graph of a frequency distribution is called a frequency polygon. The graph is obtained by plotting the class frequencies against the class marks. It can also be obtained by connecting the midpoints of the tops of the rectangles in the histogram (where the histogram is already drawn).

ACTIVITY
Draw the frequency polygon from the histogram of the length of life (in hrs) of 80 lamps.

Solution: All we need do here is join the midpoints of the tops of the rectangles of the histogram already drawn. The extremes are adjusted accordingly.

Frequency Curves: Most data are samples of a large population. Where the population is very large, many observations are possible. It therefore becomes theoretically possible (for continuous data) to choose very small class intervals and still have quite a number of observations falling within each class. Thus, the frequency polygon for a large population will have so many small broken line segments that it closely approximates a curve, which we call a frequency curve. Frequency curves can be obtained by smoothing frequency polygons. For this reason a frequency curve is sometimes called a smoothed frequency polygon. The smoothing removes irregularities in the curve but still approximates the same area.

ACTIVITIES
1.
(a) Arrange the numbers 12, 56, 42, 21, 5, 18, 10, 3, 61, 34, 65, 24 in an array and (b) determine the range.
2. If the class marks in a frequency distribution of lengths of laurel leaves are 129, 138, 147, 156, 165, 174 and 183 mm, find the class interval size, boundaries and limits.

UNIT THREE: MEASURES OF CENTRAL TENDENCY AND LOCATION: MEAN, MODE, MEDIAN AND GRAPHICAL LOCATION OF MODE, MEDIAN, QUARTILES, DECILES AND PERCENTILES

INTRODUCTION

We have so far been dealing with the qualitative aspects of a distribution. However, some aspects of a distribution can be described in quantitative terms by calculating certain values from it. An average is a value which is typical or representative of a set of data. Since such typical values tend to lie centrally within a set of data arranged in an array, averages are also called measures of central tendency. There are several types of averages, the most common being the midrange, the arithmetic mean, the mode and the median. This unit concerns itself with these averages.
These measures reveal the position or location of scores in a distribution.

OBJECTIVES

By the end of this unit, you should be able to:
(i) define and calculate the mean, median and mode of a distribution;
(ii) make observations about the mean, mode and median of a distribution;
(iii) find the median and mode using a graph; and
(iv) locate the quartiles, deciles and percentiles by means of a graph.

THE ARITHMETIC MEAN

When buying electric lamp bulbs, you can pay a little extra to get the "longer life" type. When tested, the lives (in hours) of 5 "standard" bulbs and 5 "longer life" bulbs were as follows:

"Standard"      1281  1090  1555  1494  1823
"Longer life"   2048  2741  2212  3319  3041

Here, it would be useful to have a measure which, for each type of bulb, would give a general indication of the time lasted. This is sometimes termed a "measure of location", as its aim is to indicate whereabouts the observations are located (in this case, on the time scale). These measures are also called measures of central tendency, since their values tend to lie centrally within a set of data arranged in an array. The measure most often used to meet this need is the arithmetic mean. There are other types of means, such as the geometric mean and the harmonic mean, but they are not widely used, and it is the arithmetic mean which is referred to when the word "mean" is used.

Arithmetic mean = (sum of all observations) / (total number of observations)

Its symbol is $\bar{x}$ (pronounced "x bar"). If $\sum$ (pronounced "sigma") means the sum or addition of a series and there is a set of N numbers $x_1, x_2, x_3, \ldots, x_N$, then

$\bar{x} = \dfrac{\sum_{i=1}^{N} x_i}{N}$

The symbol $\sum_{i=1}^{N} x_i$ is used to denote the sum of all the $x_i$ from i = 1 to i = N.

Since the mean is an arithmetic average, it is classified as an interval statistic. Its use is appropriate for interval or ratio data but not for nominal or ordinal data.
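The definition above can be sketched in Python as follows. The function name `arithmetic_mean` is illustrative; the two lists are the bulb lives quoted above.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by the total number of observations."""
    if not values:
        raise ValueError("the mean of an empty set of observations is undefined")
    return sum(values) / len(values)


# Lives (in hours) of the two types of bulb quoted in the text.
standard = [1281, 1090, 1555, 1494, 1823]
longer_life = [2048, 2741, 2212, 3319, 3041]

print(arithmetic_mean(standard))      # 7243 / 5 = 1448.6
print(arithmetic_mean(longer_life))   # 13361 / 5 = 2672.2
```

Each mean is a single figure on the time scale that locates its set of five observations, which is why the two types of bulb can be compared at a glance.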
Example: Using the data for the "standard" electric lamp bulbs gives their mean length of life as

$\dfrac{1281 + 1090 + 1555 + 1494 + 1823}{5} = \dfrac{7243}{5} = 1448.6$ hours

What is the mean length of life of the "longer life" bulbs? You should have done it like this:

$\dfrac{2048 + 2741 + 2212 + 3319 + 3041}{5} = \dfrac{13361}{5} = 2672.2$ hours

More generally, if $x_1, x_2, \ldots, x_n$ are n values of a variable x, then their arithmetic mean $\bar{x}$ is given by

$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$

[Fig. 3.1: horizontal axis: length (mm)]

GRAPHICAL REPRESENTATION

Graphically, the median is the value of X (the abscissa) corresponding to the vertical line which passes through the 50th percentile point on the cumulative frequency curve and divides the frequency into two equal halves.

Example 2: Using the data of the 40 laurel leaves in Example 1 above, obtain the median length of the 40 laurel leaves graphically.

Solution: The first step is to draw a smooth cumulative frequency curve or percentage ogive for the given data (see Fig. 3.2).

[Fig. 3.2: Cumulative frequency curve and percentage ogive; vertical axes: cumulative frequency and cumulative relative frequency; horizontal axis: length (mm)]

We know that the median should be the 20th item (where N/2 is used) or the 20.5th item (where (N + 1)/2 is used). Therefore, the value corresponding to a cumulative frequency of 20 or 20.5 is read along the X axis, which is 146.75 mm as shown on the graph of Fig. 3.2. A percentage ogive could be drawn and the value corresponding to the 50th percentile read. It is again 146.75 mm, as shown on the same graph.

THE MODE

As already defined, the mode of a set of values is the one which occurs with the greatest frequency.

Geometric Representation

The mode of a set of values can be obtained from the histogram of the distribution. To illustrate this, we present the following example.

Example: Find the modal age of adult males in a certain company from the following distribution.
Ages       Frequency
21 - 25    2
26 - 30    14
31 - 35    29
36 - 40    43
41 - 45    33
46 - 50    9

Solution: The fourth class, 36 - 40, is the modal class since it has the highest frequency. The mode must therefore lie within the modal class. To find the mode, only the histogram of three classes need be drawn, that is, the histogram of the class before the modal class (31 - 35), the modal class (36 - 40) and the class after the modal class (41 - 45). See the graph of Fig. 3.3.

[Fig. 3.3: Estimate of mode]

The lines AC and BD are drawn. The mode is determined by the X (abscissa) value of their intersection. In this case, the mode is found to be 38.9.

QUARTILES

ACTIVITY
1. What are quantiles? Define quartiles, deciles and percentiles.

If you cannot remember these definitions, go back to the beginning of this unit and study them again. Just as the median splits the area under the curve into two equal portions (see the diagram below), so also can a frequency curve be split into other numbers of equal portions.

[Fig. 3.4: Frequency curve divided at the median; Area A = Area B]

Extending this idea, we can split a frequency curve into as many equal portions as we wish. You will recall that the general name given to those values that split a curve into equal parts is quantiles. You will also recall the following:

1. The three values that split a distribution into four equal portions are known as quartiles. In order of magnitude, they are usually represented by Q1, Q2, Q3 and called the first, second and third quartiles respectively.

[Fig. 3.5]

The second quartile is the median, since it divides the area under the curve into two equal portions.

2. The nine values that split a distribution into ten equal portions are known as deciles and are represented by D1, D2, ..., D9, the fifth decile D5 being the median. See Fig. 5.4 (b).

3.
The ninety-nine values that split a distribution into one hundred equal portions are known as percentiles and are represented by P1, P2, ..., P99, where again P50 is the median.

GRAPHICAL REPRESENTATION OF QUANTILES

Example 1: The following data gives the weights of 1200 duck eggs.

Weight (midpoint, g):  57  60  63   66   69   72   75   78   81  84  87  90  93  96
No. of eggs:            7  13  68  144  197  204  208  160  101  54  25  13   4   2

Find the median, the quartiles, D8 (the 8th decile) and P37 (the 37th percentile) using the graphical method.

Solution: All we need do is draw the percentage ogive of the distribution. From the percentage ogive, it becomes relatively easy to find the median, which is the 50th percentile. The first quartile is the 25th percentile, the second quartile is the 50th percentile (the median) and the third quartile is the 75th percentile. The eighth decile D8 is the 80th percentile and P37 is the 37th percentile. (See the solution on the graph, Fig. 3.6.) The working is as shown in Table 3.1.

Weight less than (g)    f      c.f.    % c.f.
58.5                    7      7       0.6
61.5                    13     20      1.7
64.5                    68     88      7.3
67.5                    144    232     19.3
70.5                    197    429     35.8
73.5                    204    633     52.8
76.5                    208    841     70.1
79.5                    160    1001    83.4
82.5                    101    1102    91.8
85.5                    54     1156    96.3
88.5                    25     1181    98.4
91.5                    13     1194    99.5
94.5                    4      1198    99.8
97.5                    2      1200    100
Table 3.1

[Fig. 3.6: Percentage ogive; horizontal axis: weight (gms), less than]

Median = 73.35 gms
1st quartile = 71.10 gms
2nd quartile = 73.35 gms
3rd quartile = 77 gms
8th decile (D8) = 79 gms
37th percentile (P37) = 70.7 gms

ACTIVITY
1. The annual salaries of five men were N5,500, N4,800, N7,000, N8,000 and N32,000.
a. Find the arithmetic mean of their salaries.
b. Find their median salary.
c. Would you say the mean is typical of the salaries?
d. Which of the two, (a) or (b), gives a more reliable average, and why?
2. The grades of a student in eight examinations were 50, 60, 75, 85, 67, 60, 56 and 72.
a. Find the mode of the grades and
b.
Find the median of the grades.
c. Is the mode unique?

DERIVATION AND USE OF FORMULAE FOR THE MEASURES OF CENTRAL TENDENCY FOR A FREQUENCY DISTRIBUTION

In the early part of this unit, we learnt about measures of central tendency and how to derive them from a set of numbers. Later, we also learnt how to locate them graphically. In this section, we will learn how to derive them from a frequency distribution.

What is the arithmetic mean, or mean, of a set of numbers? If you cannot state what the mean is, go back to the opening section of this unit. You will recall that the mean is the sum of all the items in a group divided by the number of items in that group. We have also learnt how to calculate the mean for a set or group. Let us now see how to calculate the mean for a frequency distribution.

MEAN FOR A FREQUENCY DISTRIBUTION

For a discrete frequency distribution taking the values $(x_1, x_2, \ldots, x_n)$ with corresponding frequencies $(f_1, f_2, \ldots, f_n)$, the mean $\bar{x}$ is given by

$\bar{x} = \dfrac{\sum f x}{\sum f}$

Proof: Now $x_1$ occurs exactly $f_1$ times, $x_2$ occurs $f_2$ times, and so on, so that the sum total of all items is $f_1 x_1 + f_2 x_2 + \cdots + f_n x_n = \sum f x$ and the total number of items is clearly $f_1 + f_2 + \cdots + f_n = \sum f$. But $\bar{x}$ is defined as the sum of all items divided by the number of items, hence

$\bar{x} = \dfrac{\sum f x}{\sum f}$

Note: $\sum$ is a summation notation. The process of adding $x_1, x_2, x_3, \ldots, x_n$ can be written as $x_1 + x_2 + x_3 + \cdots + x_n$ and, using the $\sum$ notation, as $\sum_{i=1}^{n} x_i$, i.e. the sum of all observations $x_1, x_2, \ldots$ up to and including $x_n$.

Worked Example

Example: A group of 10 has a mean of 36 and a second group of 16 has a mean of 20. Find the mean of the combined group of 26.

Solution:

x        f      fx
36       10     360
20       16     320
Total    26     680

$\bar{x} = \dfrac{\sum f x}{\sum f} = \dfrac{680}{26} = 26.15$

Continuous Frequency Distribution

For a continuous frequency distribution or a grouped discrete distribution, the last method cannot be used directly, since we do not have distinct x values but ranges of values of x.
What is done in this case is simply to take the midpoint of each class to represent the x value and proceed in the usual way, as in the last example.

Example 1: The weights (in kg) of 65 female adults of a certain female adult school are shown in the frequency distribution below. Find their mean weight.

Solution:

Class weight (kg)    Midpoint x    Frequency f    fx
5.00 - 5.49          5.245         12             62.940
5.50 - 5.99          5.745         32             183.840
6.00 - 6.49          6.245         11             68.695
6.50 - 6.99          6.745         8              53.960
7.00 - 7.49          7.245         2              14.490
Total                              65             383.925

$\sum f = 65$, $\sum f x = 383.925$

$\bar{x} = \dfrac{383.925}{65} = 5.91$

The mean weight is 5.91 kg.

Example 2: 26 people were asked how many coins they had in their pockets and the following results were obtained. Find the mean number of coins.

No. of coins    No. of people
0 - 4           6
5 - 7           8
8 - 10          8
11 - 12         4

Solution:

Class      Midpoint x    Frequency f    fx
0 - 4      2             6              12
5 - 7      6             8              48
8 - 10     9             8              72
11 - 12    11.5          4              46
Total                    26             178

$\bar{x} = \dfrac{178}{26} = 6.85$, i.e. 7 coins to the nearest whole number of coins.

NOTE:
1. The fact that we have unequal class intervals makes no difference to the calculation of the mean.
2. The calculated mean (6.85) is not a typical member of the distribution, since the data consists of whole numbers. However, when calculating statistical measures for discrete distributions, we often give the answer in continuous form unless otherwise specified, in which case an approximated mean value such as 7 can be used.

THE CODING METHOD

When dealing with large, awkward values of a variable, the calculation of the mean by the methods so far employed can become tedious. For this reason the coding method is introduced. The method involves subtracting a number from (or adding a number to) each of the original values and, if possible and convenient, dividing (or multiplying) these new values by another number to obtain a set of X values which should be more manageable. We say that the x values have been coded (or transformed) into X values.
We then find the mean of the X values, $\bar{X}$, and by using a suitable decoding formula obtain $\bar{x}$.

Definition: If (a) the set $(x_1, x_2, \ldots, x_n)$ is transformed to $(X_1, X_2, \ldots, X_n)$, or (b) the frequency distribution with values $x_1, \ldots, x_n$ and frequencies $f_1, \ldots, f_n$ is transformed to the distribution with values $X_1, \ldots, X_n$ and the same frequencies, by means of the coding formula

$X = \dfrac{x - a}{b}$

and $\bar{X}$ is found, we obtain $\bar{x}$ by means of the decoding formula

$\bar{x} = a + b\bar{X}$

NOTE: a and b are chosen for convenience in order to make the X values as simple as possible.

Example: Find the mean of the set (15, 21, 24, 27, 30, 33, 36, 39, 42) using a method of coding.

Solution: Subtract 27 (a central value) from each item, so that a = 27 and b = 1. The coded values X = x - 27 are -12, -6, -3, 0, 3, 6, 9, 12, 15, whose sum is 24. Hence $\bar{X} = 24/9 = 2.67$ and $\bar{x} = 27 + 2.67 = 29.67$.

MEDIAN

Recall this: The median of a set of numbers $x_1, x_2, \ldots, x_n$ is defined as the middle value of the set when arranged in order of magnitude, or the mean of the two middle values if the set has an even number of items.

For a Frequency Distribution

For a discrete frequency distribution taking the values $(x_1, x_2, \ldots, x_n)$ with corresponding frequencies $(f_1, f_2, \ldots, f_n)$, the median is the $\left(\dfrac{\sum f + 1}{2}\right)$th value when the values are ranked. Here there is no distinction as to whether there is an even or odd number of items. The $\dfrac{\sum f + 1}{2}$ is sometimes replaced by $\dfrac{\sum f}{2}$ if $\sum f$ is fairly large.

It is usually desirable to include a column of cumulative frequencies when calculating the median for a discrete frequency distribution, as shown in the following example.

Example 1: Find the median of the following discrete distribution:

x    0    1    2     3     4     5     6
f    6    4    10    20    20    30    10

Solution:

x        f      Cum f
0        6      6
1        4      10
2        10     20
3        20     40
4        20     60
5        30     90
6        10     100
Total    100

The median is the $\frac{100 + 1}{2}$ = 50.5th item. Using the cumulative frequency column, the 50.5th item falls at x = 4, the fifth row. Hence the median is 4.

Grouped Data

When dealing with a continuous (or grouped discrete) distribution, we can only estimate a value for the median.

Example: Consider the following distribution.

x            f     Cum f
10 - 19.9    2     2
20 - 29.9    14    16
30 - 39.9    38    54
40 - 49.9    23    77
50 - 59.9    6     83
60 - 69.9    1     84

N = 84.
Therefore the median should be the $\frac{84 + 1}{2}$ = 42.5th item, which falls in the class 30 - 39.9. This class is called the median class. We need to find where in the median class the median is expected to lie. From the frequency distribution we see that there are 16 items up to 29.95 and 54 items up to 39.95. We require the 42.5th item. We therefore need to find m such that there are 42.5 items up to m. Since there are 16 items up to 29.95 and 42.5 items up to m, there must be 42.5 - 16 = 26.5 items from 29.95 to m. Similarly, there must be 54 - 42.5 = 11.5 items from m to 39.95. Now there is a total of 38 items in the median class, therefore m must lie a fraction $\frac{26.5}{38}$ of the way along from 29.95 to 39.95. The actual distance into the class must be $\frac{26.5}{38} \times 10$ (since 10 is the class width). The median therefore lies at the point

$29.95 + 10 \times \dfrac{26.5}{38} = 36.92$

Note that all the numbers in the above expression are well-defined quantities. 29.95 is the lower class boundary of the median class. 26.5 is 42.5 - 16, that is, $\frac{N + 1}{2}$ minus the cumulative frequency up to the lower class boundary (lcb) of the median class. 38 is the median class frequency and 10 is the median class width or interval. This technique for estimating a median value is called the method of interpolation.

The general working formula for a continuous (or grouped discrete) frequency distribution is therefore

$\text{Median} = l + \left(\dfrac{\frac{N + 1}{2} - (\sum f)_l}{f_{\text{median}}}\right) C$

or, with $\frac{N}{2}$ used in place of $\frac{N + 1}{2}$,

$\text{Median} = l + \left(\dfrac{\frac{N}{2} - (\sum f)_l}{f_{\text{median}}}\right) C$

where
l = lower class boundary of the median class
N = number of items in the data
$(\sum f)_l$ = sum of the frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of the median class
C = median class width

Example: Find the median length of 40 laurel leaves using the interpolation formula and the interpolation method.

Length (mm)    f     Cum f
118 - 126      3     3
127 - 135      5     8
136 - 144      9     17
145 - 153      12    29
154 - 162      5     34
163 - 171      4     38
172 - 180      2     40

Solution: We include the cumulative frequency column and find the following.

Using the formula:
$\frac{N + 1}{2}$ = 20.5 (or $\frac{N}{2}$ = 20), N = 40, l = 144.5, $(\sum f)_l$ = 17, $f_{\text{median}}$ = 12, C = 9

$\text{Median} = 144.5 + \left(\dfrac{20.5 - 17}{12}\right) \times 9 = 147.12$ mm

Using interpolation: The median is the $\frac{N + 1}{2}$th = 20.5th item. Now the sum of the frequencies of the first three classes is 17. To give the desired 20.5, we require 3.5 more of the 12 cases in the fourth class. The median must therefore lie $\frac{3.5}{12}$ of the way between 144.5 and 153.5. The median therefore is

$144.5 + \dfrac{3.5}{12}(153.5 - 144.5) = 147.12$ mm

QUARTILES

We have just found how to calculate the median using a formula. Let us now look at other quantiles. For small sets of data, there is little value in calculating quantiles such as deciles or percentiles. However, they become useful for frequency distributions with a large number of items. Their location in an ordered set or a frequency distribution is calculated in a manner similar to that of the median. Since quartiles split a distribution into four equal portions, the first and third quartiles Q1 and Q3 will be the $1\left(\dfrac{n + 1}{4}\right)$th and $3\left(\dfrac{n + 1}{4}\right)$th items respectively in a distribution.

Similarly, D7 will be the $7\left(\dfrac{n + 1}{10}\right)$th item.

Also, P23 is the $23\left(\dfrac{n + 1}{100}\right)$th item.
In general, if a particular quantile splits a distribution into $s$ equal parts, the $j$th quantile will be the $j\left(\frac{n+1}{s}\right)$th item of the size-ordered distribution.

Let us now look at the following grouped distribution:

    x            f     Cum f
    70 - 72      5       5
    73 - 75     18      23
    76 - 78     42      65
    79 - 81     27      92
    82 - 84      8     100
               100

In this distribution, $n = \sum f = 100$. The 1st quartile $Q_1$ is given by the $\left(\frac{100+1}{4}\right)$th item and the 3rd quartile $Q_3$ is given by the $3\left(\frac{100+1}{4}\right)$th item. It follows therefore that $Q_1$ is the 25.25th item and $Q_3$ the 75.75th item. Since $Q_1$ occurs in the class 76 to 78, that class is the $Q_1$ class; similarly, 79 - 81 is the 3rd quartile ($Q_3$) class.

The general formulae, similar to the median interpolation method, for obtaining the 1st and 3rd quartiles are as follows:

$$Q_1 = l_1 + \left(\frac{\frac{N+1}{4} - (\sum f)_1}{fQ_1}\right) C_1$$

$$Q_3 = l_3 + \left(\frac{\frac{3(N+1)}{4} - (\sum f)_3}{fQ_3}\right) C_3$$

where
$l_1$ and $l_3$ are the lower class boundaries of the 1st and 3rd quartile classes
$N$ = total number of items in the distribution
$(\sum f)_1$ and $(\sum f)_3$ = cumulative frequencies lower than the respective quartile classes
$fQ_1$ and $fQ_3$ = frequencies of the 1st and 3rd quartile classes
$C_1$ and $C_3$ = widths of the 1st and 3rd quartile classes

Example: Find, using the interpolation formula method, the median, the quartiles and $P_{37}$ of the weights of 1200 ducks given below.

    Weight (gms)     f     Cum f
    56 - 58          7        7
    59 - 61         13       20
    62 - 64         68       88
    65 - 67        144      232
    68 - 70        197      429
    71 - 73        204      633
    74 - 76        208      841
    77 - 79        160     1001
    80 - 82        101     1102
    83 - 85         54     1156
    86 - 88         25     1181
    89 - 91         13     1194
    92 - 94          4     1198
    95 - 97          2     1200

Solution: The cumulative frequency is calculated above.

(a) The median is the $\left(\frac{1200+1}{2}\right)$th = 600.5th item.
Median class = 71 - 73
$l_1 = 70.5$, $(\sum f)_l = 429$, $f(\text{median}) = 204$, $C = 3$

$$\text{Median} = 70.5 + 3\left(\frac{600.5 - 429}{204}\right) = 73.02$$

(b) $Q_1$ is the $\left(\frac{1200+1}{4}\right)$th = 300.25th item.
$Q_1$ class = 68 - 70
Thus, $l_1 = 67.5$, $(\sum f)_1 = 232$, $fQ_1 = 197$, $C = 3$

$$Q_1 = 67.5 + 3\left(\frac{300.25 - 232}{197}\right) = 68.54$$

$Q_3$ is the $3\left(\frac{1200+1}{4}\right)$th = 900.75th item.
$Q_3$ class = 77 - 79
$l_3 = 76.5$, $(\sum f)_3 = 841$, $fQ_3 = 160$, $C = 3$

$$Q_3 = 76.5 + 3\left(\frac{900.75 - 841}{160}\right) = 77.62$$

(c) $P_{37}$ is the $37\left(\frac{1200+1}{100}\right)$th = 444.37th item.
This lies in the class 71 - 73.
$l_{37} = 70.5$, $(\sum f)_{37} = 429$, $fP_{37} = 204$, $C = 3$

$$P_{37} = 70.5 + 3\left(\frac{444.37 - 429}{204}\right) = 70.73$$

THE MODE

The mode of a set of values is defined as the one which occurs with the greatest frequency. For continuous or grouped discrete data, a method similar to interpolation is used. This is illustrated by the following example. Consider the following distribution.

    Class        f
    21 - 25      2
    26 - 30      4
    31 - 35     29
    36 - 40     43
    41 - 45     33
    46 - 50      9

The modal class is 36 - 40 since it is the class with the highest frequency. It is obvious that the modal value should lie in this class. Since the class (41 - 45) following the modal class has a larger frequency than the class (31 - 35) previous to it, the mode should be larger than the modal class midpoint. In general, the mode lies above or below the modal class midpoint depending on whether the class following the modal class is larger or smaller than the class previous to the modal class.

[Fig. 3.1]

The formula for the mode is given by

$$\text{Mode} = l_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$$

where
$l_1$ = lower class boundary of the modal class
$\Delta_1$ = difference in frequencies between the modal class and the previous class
$\Delta_2$ = difference in frequencies between the modal class and the following class
$c$ = width of the modal class

Note: The value $\frac{\Delta_1}{\Delta_1 + \Delta_2}$ always lies between 0 and 1.

Using the given illustration, $l_1 = 35.5$, $\Delta_1 = 43 - 29 = 14$, $\Delta_2 = 43 - 33 = 10$ and $c = 5$.

$$\text{Mode} = 35.5 + \left(\frac{14}{24}\right) 5 = 38.4$$

Example: The following is the distribution of marks of 62 students in a statistics test.

    x                f
     9.3 -  9.7      2
     9.8 - 10.2      5
    10.3 - 10.7     12
    10.8 - 11.2     18
    11.3 - 11.7     14
    11.8 - 12.2      6
    12.3 - 12.7      4
    12.8 - 13.2      1

Find the mode.

Solution:
Modal class = 10.8 - 11.2
$l_1 = 10.75$, $\Delta_1 = 18 - 12 = 6$, $\Delta_2 = 18 - 14 = 4$, $c = 0.5$

$$\text{Mode} = 10.75 + \left(\frac{6}{10}\right)(0.5) = 11.05$$

ASSIGNMENTS

1. The weights in kilogrammes recorded by 50 final year students are as follows:

    Weight (kg)     Number of Students
    54 - 57          5
    58 - 61          7
    62 - 65         10
    66 - 69         12
    70 - 73          6
    74 - 77          5
    78 - 81          4
    82 - 85          1

Find the median, $Q_1$, $Q_3$ and the 60th percentile.

REFERENCES

Ary, Donald et al. (1979): Introduction to Research in Education. U.S.A.: Holt, Rinehart and Winston, Inc.
Best, J. W. and Kahn, J. V. (1986): Research in Education. London: Prentice Hall Inter.
Boyinbode, I. R.
(1984): Fundamental Statistical Methods in Education and Research.
Gay, L. G. (1970): Education Research: Competencies for Analysis and Application. Ohio: Charles E. Merrill.
Guilford, J. P. and Fruchter, B. (1973): Fundamental Statistics in Psychology and Education.
McCall, R. B. (1980): Fundamental Statistics for Psychology. U.S.A.: Harcourt Brace Jovanovich Inc.

UNIT FOUR: MEASURES OF VARIABILITY OR DISPERSION, STANDARD SCORES (Z - SCORES AND T - SCORES) AND THE NORMAL CURVE

INTRODUCTION

There is the need to determine the above in any distribution, particularly when considering students' performance. You are aware that when a teacher or an examiner marks or grades students' or candidates' answer scripts, he/she assigns some marks or scores out of a maximum obtainable score. The fixed maximum obtainable score may be 10, 20, 30, 50 or, most often, 100. Scores may also be values of a variable (age, height, life span or weight of materials). Scores as presented above are referred to as raw scores: raw in the sense that such scores are not yet standardized or normed.

Performance scores, barring examination malpractices or irregularities, depend upon the easiness or difficulty indices of items/tasks and the generosity or severity tendency of the teacher or examiner. Other variable scores may depend upon defects in, or errors of reading, the calibrations of measuring instruments. All these defects or errors make the interpretation of scores difficult.

Moreover, when a candidate/student gets a score of 70% in an examination, what would you make of it? Is the 70% a high score in terms of the standards of the task undertaken, or in relation to the scores of the other candidates/students who also took the examination? Supposing 70% is the highest or the least of all the scores earned by all the students, how far apart are the other scores?
To overcome the above errors, defects or undue influences on scores, norming and/or standardization of scores are devised and used.

OBJECTIVES

By the end of this unit, you should be able to:
1. define variability and give its measures;
2. calculate standard deviation;
3. convert raw scores to z - scores and vice versa;
4. transform a given z - score to a T - score and vice versa;
5. convert raw-score overall performance of students in a given set of tests or courses to percentage scores and T - scores when the necessary statistics are given/obtained;
6. draw the normal curve;
7. interpret the areas of the normal curve.

MEASURE OF VARIABILITY (SPREAD) OR DISPERSION

Measures of spread, dispersion or variability indicate the degree to which the various points in a distribution deviate from the average. Measures of central tendency only describe a distribution in terms of its average value or typical measure, not the total picture of the distribution. The mean and the median may be identical for some distributions without us knowing their spread. This is why measures of spread are necessary. For illustration, consider the following distributions of scores of students in two subjects:

                  Distribution A     Distribution B
                        95                 76
                        90                 78
                        85                 77
                        80                 71
                        75                 79
                        70                 73
                        65                 72
                        60                 74
                        55                 75
    $\sum x$     =     675                675
    $N$          =       9                  9
    $\bar{x}$    =      75                 75
    Md           =      75                 75

The scores in distribution B are homogeneous, with little difference between adjacent scores. The scores in distribution A are heterogeneous, spreading far apart, and performance ranged from superior to very poor. However, the mean and median in both distributions are the same. Therefore, there is the need for indices that describe the spread or dispersion of scores in a distribution. Several such measures are available. These include the range, quartile deviation, mean deviation, variance and the standard deviation.

Range

The range is the simplest of all indices of variability.
It is the difference between the highest and lowest scores in a distribution. The range may be inclusive or exclusive. The exclusive range is usually quoted as the difference between the largest and the smallest scores in a distribution. However, the inclusive range is the difference between the upper
