UNIT – 1
ELEMENTS OF STATISTICS
                              What is Statistics?
The word “Statistics” is derived from the Latin word “Status”, the Italian
word “Statista” or the German word “Statistik”. Each of these words means
Political State. Initially, Statistics was used to collect information about
the people of the state, such as their income, health, illiteracy and wealth.
Nowadays, Statistics has become an important subject with useful
applications in various fields of day-to-day life.
                         Statistics in Plural Sense
In the plural sense, Statistics refers to information in terms of numbers or
numerical data, such as Population Statistics, Employment Statistics etc.
However, not all numerical information is statistics.
Example: “Ram gets ₹100 per month as pocket allowance” is not Statistics, as it
is neither an aggregate nor an average. In contrast, “the average pocket
allowance of the students of Class X is ₹100 per month” and “there are 80
students in Class XI and 8 students in Class XII” are Statistics.
The following table shows which data are Statistics and which are not.
Data which are not Statistics        Data which are Statistics
A cow has 4 legs.                    Average height of males aged 26 and
                                     above in India is 6 feet, compared to
                                     5 feet in Nepal.
Ram has 200 rupees in his pocket.    Birth rate in India is 18 per thousand,
                                     compared to 8 per thousand in USA.
A young lady was run over by a       Over the past 10 years, India has won
speeding truck at 100 km per hour.   60 test matches in cricket and lost 50.
From the above information we can say that “All Statistics are data, but not
all data are Statistics.”
Definition:
According to Bowley - “Statistics are numerical statements of facts in any
department of enquiry placed in relation to each other.”
According to Yule and Kendall - “By Statistics we mean quantitative
data affected to a marked extent by multiplicity of causes.”
                 Characteristics of Statistics in Plural Sense
Main characteristics of Statistics in terms of numerical data are as follows:
   (1) Aggregate of Facts - A single number does not constitute Statistics. We
       cannot draw any conclusion from a single number; conclusions can be
       drawn only from an aggregate of facts.
       For example, if it is stated that there are 1,000 students in our college,
       the figure on its own has little significance. But if it is stated that
       there are 300 students in arts, 400 students in commerce and 300 in
       science in our college, it makes statistical sense, as these data convey
       statistical information. Similarly, if it is stated that the population
       of India is 130 crore or that the value of total exports from India is
       ₹11,66,439 crore, then these aggregates of facts will be termed Statistics.
   (2) Numerically Expressed - Statistics are expressed in terms of numbers.
       Qualitative aspects like small or big, rich or poor etc. are not statistics.
       For instance, if we say that Irfan Pathan is tall and Sachin is short, then
       this statement has no statistical sense. However, if it is stated that the
       height of Irfan Pathan is 6 ft 2 in and the height of Sachin is 5 ft 4 in,
       then these numerical facts will be called Statistics.
   (3) Affected by Multiplicity of Causes - Statistics are not affected by any
       single factor; they are affected by many factors. For instance, a 30% rise
       in prices may be due to several causes like reduction in supply,
       increase in demand, shortage of power, rise in wages, rise in taxes, etc.
   (4) Reasonable Accuracy - A reasonable degree of accuracy must be kept in
       view while collecting statistical data. This accuracy depends on the
       purpose of investigation, its nature, size and available resources.
   (5) Pre-determined Purpose - Statistics are collected with some
       pre-determined objective. Any information collected without a definite
       purpose will only be a numerical value and not Statistics. If data
       pertaining to the farmers of a village are collected, there must be some
       pre-determined objective: the data may be collected to learn the farmers’
       economic position, the distribution of land among them, or their total
       population. All such objectives must be pre-determined.
   (6) Collected in a Systematic Manner - Statistics should be collected in a
       systematic manner. Before collecting the data, a plan must be prepared.
       No conclusion can be drawn from data collected in a haphazard manner.
       For instance, data regarding the marks secured by the students of a
       college without any reference to the class, subject, examination, or
       maximum marks etc. will lead to no conclusion.
                          Statistics in Singular Sense
   In the singular sense, statistics means the science of statistics or
   statistical methods. It refers to the techniques or methods relating to the
   collection, classification, presentation, analysis and interpretation of
   quantitative data.
   Definition:
   According to Croxton and Cowden - “Statistics may be defined as the
   collection, presentation, analysis and interpretation of numerical data.”
   According to Lovitt - “Statistics is the science which deals with the
   collection, classification and tabulation of numerical facts as a basis for
   the explanation, description and comparison of phenomena.”
                          Subject Matter of Statistics
   Subject matter of statistics includes two components:
   1. Descriptive Statistics                        2. Inferential Statistics
   1. Descriptive Statistics: Descriptive Statistics refers to those methods
      which are used for the collection, presentation and analysis of
      data. These methods relate to the measurement of central tendency,
      measurement of dispersion, measurement of correlation etc. For
      example, descriptive statistics is used when you calculate the average
      height of the secondary students in your school. It is also used when
      you find that the marks in science and mathematics of the students in
      all classes are closely related to each other.
   2. Inferential Statistics: Inferential Statistics refers to all such methods
      by which conclusions are drawn about the universe or population
      on the basis of a given sample. For example, if your class teacher
      estimates the average weight of the entire class on the basis of the
      average weight of only a sample of students of the class, then
      inferential statistics is being used.
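
   The two components can be illustrated with a short Python sketch. All the
   figures below (heights, marks, weights) are made-up values used only for
   illustration, and the helper name pearson is an assumption of this sketch,
   not a standard function. The first part describes data we already have
   (average, dispersion, correlation); the second part estimates the average
   weight of a whole class from a random sample.

```python
import math
import random
import statistics

# --- Descriptive statistics: summarise the data we actually have ---
# Hypothetical heights (cm) of ten secondary students.
heights = [150, 152, 155, 149, 160, 158, 151, 157, 153, 155]
mean_height = statistics.mean(heights)      # measure of central tendency
height_spread = statistics.pstdev(heights)  # measure of dispersion

# Hypothetical marks of five students in science and mathematics.
science = [40, 50, 60, 70, 80]
maths = [45, 55, 62, 75, 83]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson(science, maths)  # close to 1: the marks are closely related

# --- Inferential statistics: estimate a population value from a sample ---
random.seed(1)  # fixed seed so the sketch gives the same sample each run
class_weights = [random.randint(40, 70) for _ in range(40)]  # whole class (kg)
sample = random.sample(class_weights, 10)    # only 10 students are weighed
estimated_mean = statistics.mean(sample)     # estimate for the entire class

print("Average height:", mean_height)
print("Correlation of science and maths marks:", round(r, 3))
print("Estimated average weight of the class:", estimated_mean)
```

   In the descriptive part every student is measured, so the results are exact
   for that group. In the inferential part the class average is only
   estimated, and a different sample would give a slightly different estimate.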
                  Important terminology in statistics
1. Population: By population we mean a well-defined set or group of all the
   objects for a particular study. The objects may be persons, plants, books,
   fishes in ponds, shops etc. The population will consist of certain elements
   like the plants of a certain kind in a specified field, the fishes in a pond,
   the unemployed persons in India, the books in a library and so on. For
   instance, if we want to study the properties of students in a school, then
   the population consists of all the students of the school. Similarly, if we
   want to study the books in a library, then the population includes
   all the books of the library. If the number of elements is limited, then
   the population is finite. On the other hand, if the number of elements is
   not limited, then the population is infinite. Mostly we deal with finite
   populations.
2. Sample: It is a part of the population selected by some sampling
   procedure. The process of selecting a sample is known as sampling. The
   number of objects in the sample is called the size of the sample. A
   sample is expected to be representative of the population.
   For instance, suppose a research worker is required to study the weight
   of fishes in a pond after a particular period of growth, and suppose
   that there are 3,000 fishes in the pond. He may either measure
   the weight of all the fishes in the pond or select a small
   group of fishes and measure their weights. The first approach, of
   measuring the weight of all the fishes, is called complete enumeration or
   census. The other approach, in which only a small group of fishes is
   considered, is called a sample survey. In brief, in complete
   enumeration information is collected on all the units of the universe, and
   in a sample survey only a part of the universe is considered.
3. Variable: A property of objects which differs from object to object and is
   expressible numerically is known as a variable.
   For instance, the marks in Mathematics of the students in a class can be
   expressed as numbers, so marks are a variable property which is
   expressible quantitatively.
4. Attribute: A property or characteristic of objects which is not expressible
   quantitatively is known as an attribute. Such data can only be expressed
   qualitatively. Examples are smoking, color, honesty etc.
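
The fish-pond example above can be sketched in Python. The weights are
simulated, not real measurements: the mean of 500 g, the spread of 50 g and the
sample size of 100 are assumptions chosen only to show the difference between
a complete enumeration (census) and a sample survey.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Population: simulated weights (grams) of all 3,000 fishes in the pond.
population = [random.gauss(500, 50) for _ in range(3000)]

# Complete enumeration (census): every unit of the population is measured.
census_mean = statistics.mean(population)

# Sample survey: only a part of the population is measured.
sample_size = 100
sample = random.sample(population, sample_size)  # sampling without replacement
sample_mean = statistics.mean(sample)

print(f"Census mean weight: {census_mean:.1f} g (3000 fishes weighed)")
print(f"Sample mean weight: {sample_mean:.1f} g ({sample_size} fishes weighed)")

# Variable vs attribute for one hypothetical fish:
fish = {"weight_g": 512.4, "color": "silver"}
# weight_g is a variable (expressed numerically);
# color is an attribute (expressed qualitatively).
```

The sample mean will usually lie close to the census mean although only a
small part of the population was weighed, which is why sample surveys are
used when a census would be too costly.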
                        CHARACTERISTICS OF STATISTICS
       1. Statistics are aggregate of facts.
       2. Statistics are numerically expressed.
       3. Statistics are affected to a marked extent by multiplicity of causes.
       4. Statistics are either enumerated or estimated with reasonable standard
          of accuracy
       5. Statistics are collected in a systematic manner.
       6. Statistics are collected for a pre-determined purpose.
       7. Statistics should be placed in relation to each other.
    In the absence of the above characteristics numerical data can’t be called
Statistics, and hence “all statistics are numerical statements of facts, but not
all numerical statements of facts are statistics.”
       According to the above definitions, Statistics is both a science and an
       art. It is related to the study and application of the principles and
       methods applicable in the collection, presentation, analysis,
       interpretation and forecasting of data or statistical facts that are
       influenced by several factors and related to any area of knowledge or
       research, so that concrete and intelligent decisions may be taken in
       the face of uncertainty.
                               NATURE OF STATISTICS
    Statistics as a science: Science refers to a systematized body of knowledge. It
studies cause and effect relationships and attempts to make generalizations in the
form of scientific principles or laws. “Science, in short, is like a light house that
gives light to the ships to find out their own way but doesn’t indicate the direction
in which they should go.” Like other sciences, statistical methods are also used to
answer questions such as how an investigation should be conducted and in what way
valid and reliable conclusions can be drawn. Statistics is called the science of
scientific methods.
   In the words of Croxton and Cowden, “Statistics is not a science, it is a scientific
method.” According to Tippet, “as science, the statistical method is a part of the
general scientific method and is based on the same fundamental ideas and
processes.”
     Statistics as an art: We know that science is a body of systematized
knowledge. How this knowledge is to be used for solving a problem is the work of
an art. An art is applied knowledge. It refers to the skill of handling facts so as
to achieve a given objective. It is concerned with the ways and means of presenting
and handling data, making inferences logically and drawing relevant conclusions.
The art aspect of statistics tells us how to use statistical rules and principles to
study problems and find their solutions. The collection of statistics (data), its
use and its utility are themselves an art.
    Statistics is both science and art: Considering the science and art aspects
of statistics together, it is used not only to gain knowledge but also to
understand the facts and draw important conclusions from them. If science is
knowledge, then art is action. Looking from this angle, statistics may also be
regarded as an art. It involves the application of given methods to obtain facts,
derive results and finally use them for devising action.
              STAGES IN A STATISTICAL INVESTIGATION
       A statistical investigation has 5 stages: Collection, Organisation,
       Presentation, Analysis and Interpretation.
       1. Collection: This is the primary step in a statistical study and data
           should be collected with care by the investigator. If data are faulty, the
           conclusions drawn can never be reliable. The data may be available from
           existing published or unpublished sources or else may be collected by
           the investigator himself. The first hand collection of data is one of the
           most difficult and important tasks faced by a statistician.
       2. Organization: Data collected from published sources are generally in
           organized form. However, a large mass of figures collected from a
           survey frequently needs organization. Organizing involves 3 steps:
              (A) Editing (B) Classification (C) Tabulation.
       (A) Editing: The collected data must be edited very carefully so that the
           omissions, inconsistencies, irrelevant answers and wrong computations
           in the returns from a survey may be corrected or adjusted.
       (B) Classification: Classification is the process of arranging the data
           according to some common characteristics possessed by the items
           constituting the data.
       (C) Tabulation: Tabulation is the arrangement of the data in columns and
           rows.
    Hence collected data are organized properly so that the desired information
may be highlighted and undesirable information avoided.
       3. Presentation: Arranged data alone do not easily convey their meaning
          to a layman. Thus, it is necessary that data be presented with the
          help of tables, diagrams and graphs. By these devices facts can be
          understood easily.
       4. Analysis: A major part of statistics is devoted to the methods used
          in analyzing the presented data, mostly in tabular form. For this
          analysis, a number of statistical tools are available, such as
          averages, correlation, regression etc.
       5. Interpretation: The interpretation of data is a difficult task and
          necessitates a high degree of skill and experience in statistical
          investigation, because decisions are made on the basis of the
          conclusions drawn.
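
    The stages above can be followed on a tiny data set in Python. The survey
    returns, the class labels and the rule that marks must lie between 0 and
    100 are all assumptions made for this sketch; a real investigation would
    define its own editing rules.

```python
import statistics
from collections import defaultdict

# 1. Collection: hypothetical survey returns as (class, marks out of 100).
raw = [("X", 72), ("XI", 65), ("X", None), ("XII", 88),
       ("XI", 140), ("X", 58), ("XII", 91)]

# 2(A). Editing: drop omissions (None) and inconsistent values (not in 0..100).
edited = [(cls, m) for cls, m in raw if m is not None and 0 <= m <= 100]

# 2(B). Classification: group the marks by a common characteristic (the class).
groups = defaultdict(list)
for cls, m in edited:
    groups[cls].append(m)

# 2(C) and 3. Tabulation and presentation: arrange the data in rows and columns.
print(f"{'Class':<8}{'Students':>10}{'Average':>10}")
for cls in sorted(groups):
    marks = groups[cls]
    print(f"{cls:<8}{len(marks):>10}{statistics.mean(marks):>10.1f}")

# 4. Analysis: apply a statistical tool (here an average) to the edited data.
overall_mean = statistics.mean([m for _, m in edited])
print(f"Overall average marks: {overall_mean:.1f}")

# 5. Interpretation is left to the investigator: the code produces figures,
#    but judging what they mean requires skill and knowledge of the context.
```

    Note that the two faulty returns (a missing value and a mark of 140) are
    removed at the editing step; if they were kept, the averages computed
    later would be unreliable.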
                                SCOPE OF STATISTICS
     In its early stages, the scope of statistics was very limited. It was confined
mainly to the administration of government and was, therefore, called the “Science
of Kings”. But in modern times the scope of statistics has widened: practically all
facts that can be expressed in quantitative terms, directly or indirectly, come
within the purview of statistics. That is why Croxton & Cowden observed, “Today
there is hardly a phase of endeavor which does not find statistical devices at least
occasionally useful.” It is not unfair to say, “science without statistics bears no
fruit and statistics without science has no root.” The applications of statistics
are so numerous that it is often remarked, “Statistics is what statisticians do.” Now
let us examine a few fields or areas in which statistics is applied.
       1. Statistics and the State: In recent years the functions of the state
          have increased tremendously. The concept of the state has changed from
          that of simply maintaining law and order to that of a welfare state.
          Statistical data and statistical methods are of great help in promoting
          human welfare. The government in most countries is the biggest collector
          and user of statistical data. These statistics help in framing suitable
          policies.
       2. Statistics in Business and Management: With growing size and
          increasing competition, the problems of business enterprises have
          become complex. Statistics is now considered an indispensable tool in
          the analysis of activities in the fields of business, commerce and
          industry. Business objectives can be pursued through properly conducted
          market surveys and research, which depend greatly on statistical
          methods. The trends in sales and production can be determined by
          statistical methods like time-series analysis, which are essential for
          future planning. Statistical concepts and methods are also used in
          controlling the quality of products to the satisfaction of the consumer
          and the producer. Bankers use the objective analysis furnished by
          statistics and then temper their decisions on the basis of qualitative
          information.
       3. Statistics and Economics: R.A. Fisher complained of “the painful
          misapprehension that statistics is a branch of economics.” Statistical
          data and methods are of immense help in the proper understanding of
          economic problems and in the formulation of economic policies. In
          the field of exchange, we study markets, the law of prices based on
          supply and demand, cost of production, banking and credit instruments
          etc. The development of various economic theories owes greatly to
          statistical methods, e.g., “Engel’s law of family expenditure” and the
          “Malthusian theory of population”. The impact of mathematics and
          statistics has led to the development of new disciplines like
          “Econometrics” and “Economic Statistics”. In fact, the concept of
          planning, so vital for the growth of nations, would not have been
          possible in the absence of data and proper statistical analysis.
       4. Statistics and Psychology and Education: Statistics has found wide
          application in psychology and education. Statistical methods are used
          to measure human abilities such as intelligence, aptitude, personality,
          interest etc. by tests. The theory of learning is also based on
          statistical principles. Applications of statistics in psychology and
          education have led to the development of a new discipline called
          “Psychometrics”.
       5. Statistics and Natural Science: Statistical techniques have proved to
          be extremely useful in the study of all natural sciences like biology,
          medicine, meteorology, botany etc. For example, in diagnosing the
          correct disease the doctor has to rely heavily on factual data like the
          temperature of the body, pulse rate, B.P. etc. In botany, the study of
          plant life, one has to rely heavily on statistics in conducting
          experiments about the plants, the effect of temperature, type of soil
          etc. In agriculture, statistical techniques like “analysis of variance”
          and “design of experiments” are useful for isolating the role of manure,
          rainfall, watering process, seed quality etc. In fact, it is difficult
          to find any scientific activity where statistical data and methods are
          not used.
       6. Statistics and Physical Science: The physical sciences, in which
          statistical methods were first developed and applied, seem to be
          making increasing use of statistics, especially in astronomy, chemistry,
          engineering, geology, meteorology and certain branches of physics.
   7. Statistics and Research: Statistics is indispensable in research work.
      Most of the advancement in knowledge has taken place because of
      experiments conducted with the help of statistical methods. Statistical
      methods also affect research in medicine and public health. In fact, there
      is hardly any research work today that one can find complete without
      statistical methods.
   8. Statistics and Computer: The development of statistics has been closely
      related to the evolution of electronic computing machinery. Statistics is a
      form of data processing, a way of converting data into information useful
      for decision-making. Computers can process large amounts of data
      quickly and accurately. This is a great benefit to business and other
      organizations that must maintain records of their operations. Processing
      of raw data is extensively required in the application of many statistical
      techniques.
                      CLASSIFICATION OF STATISTICS
Statistics can be divided into 3 parts:
              DESCRIPTIVE STATISTICS
              INFERENTIAL STATISTICS
              APPLIED STATISTICS
   1. Descriptive Statistics: Descriptive statistics is related to numerical data
      or facts. Such data are collected either by counting or by some other
      process of measurement. It also covers those methods, including
      editing of data, classification, tabulation, diagrammatic or graphical
      presentation, measures of central tendency, measures of dispersion,
      correlation etc., which help to make the description of numerical facts
      simple, systematic, synoptic, understandable and meaningful.
   2. Inferential Statistics: Inferential statistics helps in making
      generalizations about the population or universe on the basis of a study
      of samples. It includes the process of drawing proper and rational
      conclusions about the universe. Among these methods, probability theory
      and the different techniques of sampling tests are important.
   3. Applied Statistics: It involves the application of statistical methods and
      techniques to actual problems and facts. For example, statistics
      related to national income, industrial and agricultural production,
      population, prices etc. are called applied statistics. It can be divided
      into 2 parts:
      (1) Descriptive Applied Statistics - It deals with the study of data
      which are known and which naturally relate to one another. Its main object
      is to provide descriptive information about the past or the present for
      any area. For example, price index numbers and vital statistics come under
      the category of descriptive applied statistics.
      (2) Scientific Applied Statistics - Under this branch of statistical
      science, statistical methods are used to formulate and verify scientific
      laws. For example, an economist may try to establish the law of demand,
      the quantity theory of money, the trade cycle etc. These are established
      and verified with the help of scientific applied statistics.
                            IMPORTANCE OF STATISTICS
     These days, everyone from the common person to the highly qualified person
talks about statistics. This shows how intimately statistics is connected with a
wide range of activities in daily life. People realize that work in their
fields requires some understanding of statistics, which indicates the importance
of the subject. A.L. Bowley says, “Knowledge of statistics is like knowledge of a
foreign language or of algebra. It may prove of use at any time under any
circumstances.”
1. Importance to the State or Government: In the modern era, the role of the
   state has increased, and the governments of the world take care of the
   welfare of their people. Therefore, these governments require much
   greater information in the form of numerical figures. Statistics are
   extensively used as a basis for government plans and policies. For
   example, Five-Year Plans are framed by using reliable statistical data on
   different segments of life.
2. Importance in Human Behavior: Statistical methods, viz. average,
   correlation etc., are closely related to human activities and behavior.
   For example, when a layman wishes to purchase some article, he first
   enquires about its price at different shops in the market. In other words,
   he collects data about the price of a particular article and aims at getting
   an idea of the average of the prices and the range within which the prices
   vary. Thus, it can be concluded that statistics plays an important role in
   every aspect of human activities and behavior.
3. Importance in Economics: Statistics is gaining ever-increasing
   importance in the field of economics. That is why Tugwell said, “The
   science of economics is becoming statistical in its method.” Statistics and
   economics are so interrelated that new disciplines like
   econometrics and economic statistics have been developed. The inductive
   method of generalization used in economics is also based on statistical
   principles. Statistics are used in different segments of economics:
(A) Consumption - From the statistics of consumption we can find the way in
which people in different groups spend their income. The law of demand and
the elasticity of demand in the field of consumption are based on inductive or
inferential statistics.
(B) Production - With the statistics of production, supply is adjusted according
to demand. We can find out the capital invested in different productive units
and their output. The decisions about what to produce, how much to produce and
when to produce are based on facts analyzed statistically.
(C) Distribution - Statistics plays a vital role in the field of distribution. We
calculate the national income of a country by statistical methods and
compare it with that of other countries. At every step we require the help of
figures; without them it is difficult to proceed and draw inferences.
4. Importance in Planning: Statistics plays a vital role in the proper
   utilization of natural and human resources. Planning is indispensable
   for achieving a faster rate of growth through the best use of a nation’s
   resources. It is sometimes said that “Planning without statistics is a ship
   without rudder and compass.” For example, in India a number of
   organizations like the National Sample Survey Organisation (N.S.S.O.) and
   the Central Statistical Organisation (C.S.O.) have been established to
   provide all types of information.
5. Importance in Business: The use of statistical methods in the solution
   of business problems dates almost exclusively to the 20th century.
   Nowadays no business, large or small, public or private, can prosper
   without the help of statistics. Statistics provides a businessman with the
   necessary techniques for the formulation of various policies and for
   planning with regard to his business, in areas such as:
         (A) Marketing - In the field of marketing, it is necessary first to find
         out what can be sold and then to evolve a suitable strategy so that
         goods reach the ultimate consumer. A skillful analysis of data on
         population, purchasing power, habits of people, competition,
         transportation cost etc. should precede any attempt to establish a new
         market.
         (B) Quality Control - To earn a better price in a competitive market, it
         is necessary to watch the quality of the product. Statistical techniques,
         such as control charts, can be used to control the quality of the
         product manufactured by a firm.
         (C) Banking and Insurance Companies - Banks use statistical
         techniques to take decisions regarding the average amount of cash
         needed each day to meet the requirements of day-to-day transactions.
         Various policies of investment and sanction of loans are also based on
         the analysis provided by statistics.
         (D) Accounts Writing and Auditing - Every business firm keeps accounts
         of its revenue and expenditure. Statistical methods are also employed in
         accounting. In particular, the auditing function makes frequent
         application of statistical sampling and estimation procedures, and the
         cost accountant uses regression analysis.
         (E) Research and Development - Many business organizations have
         their own research and development departments, which are responsible
         for the collection of relevant data. These departments also prepare
         charts, graphs and other statistical analyses for the purpose.
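
         The control chart mentioned under Quality Control can be illustrated
         with a minimal Python sketch. The measurements and the names baseline
         and new_items are made up for illustration, and the limits are placed
         at three standard deviations around the mean of individual
         measurements, which is a simplifying assumption of this sketch; control
         charts used in practice are often built from subgroup averages.

```python
import statistics

# Hypothetical measurements (e.g. weight in grams) taken while the
# process was known to be running normally.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]

center_line = statistics.mean(baseline)
sigma = statistics.stdev(baseline)      # sample standard deviation

# Assumed rule for this sketch: limits at 3 standard deviations.
upper_limit = center_line + 3 * sigma
lower_limit = center_line - 3 * sigma

# New items coming off the production line (hypothetical values).
new_items = [10.05, 9.7, 10.6, 10.0]

# An item outside the limits signals that the process needs attention.
out_of_control = [x for x in new_items if not lower_limit <= x <= upper_limit]

print(f"Center line: {center_line:.2f}")
print(f"Control limits: {lower_limit:.2f} to {upper_limit:.2f}")
print("Items outside the limits:", out_of_control)
```

         Here only the item measuring 10.6 falls outside the limits; the item
         measuring 9.7 is low but still within the range expected from normal
         variation.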
                            FUNCTIONS OF STATISTICS
    Statistics performs the function of making the numerical aspects of facts
simple, precise, comparable and reliable. In fact, the various functions performed
by statistics are the basis of its utility. R.W. Burgess says, “The fundamental
gospel of statistics is to push back the domain of ignorance, prejudice, rule of
thumb, arbitrary and premature decisions, tradition and dogmatism and to increase
the domain in which decisions are made and principles are formulated on the basis
of analyzed quantitative facts.”
      1. Numerical and definite expression of facts: The first function of the
         statistics is the collection and presentation of facts in numerical form.
         We know that the numerical presentation helps in having a better
         understanding of the nature of a problem. One of the most important
         functions of statistics is to present general statements in a precise and
         definite form. Statements and facts conveyed in exact quantitative terms
         are always more convincing than vague utterances.
2. Simplifies the data (condensation): Not only does statistics present
   facts in a definite form, but it also helps in condensing a mass of data into
   a few significant figures. According to A.E. Waugh, “the purpose of a
   statistical method is to simplify great bodies of numerical data.” In fact,
   the human mind cannot follow huge, complex and scattered numerical
   facts. So these facts are made simple and precise with the help of various
   statistical methods like averages, dispersion, graphic or diagrammatic
   presentation, classification, tabulation etc., so that a common man can also
   understand them easily.
3. Comparison of facts: Boddington states, “The essence of the statistics
   is not only counting but also comparison.” The function of comparison
   helps in showing the relative importance of data. For example, the
   pass percentage of a college in an examination may be appreciated better
   when it is compared with the result of another college or with the results
   of previous years of the same college.
4. Establishment of relationship between two or more phenomena: To
   investigate the relationship between two or more facts is a main function
   of statistics. For example, demand and supply of a certain commodity,
   prices and wages, and temperature and germination time of seeds are
   interrelated.
5. Enlarges individual experiences: In the words of Bowley, “the proper
   function of statistics indeed is to enlarge individual experience.”
   Statistics is like a master key that is used to solve the problems of
   mankind in every field. It would not be an exaggeration to say that many
   fields of knowledge would have remained closed to mankind forever but for
   the efficient and useful techniques and methodology of the science of
   statistics.
6. Helps in the formulation of policies: statistics helps in formulating
   policies in different fields, especially in economic, social and political
   fields. The government policies like industrial policy, export-import
   policies, taxation policy and monetary policy are determined on the basis
   of statistical data and their movements, plan targets are also fixed with
   the help of data.
7. Helps in forecasting: Statistical methods provide helpful means of
   estimating from the available facts and forecasting for the future. Here
   Bowley’s statement is relevant: “a statistical estimate may be good or bad,
   accurate or the reverse; but in almost all cases it is likely to be more
   accurate than a casual observer’s impression.”
8. Testing of hypothesis: Statistical methods are also employed to test
   hypotheses in theory and to discover newer theories. For example – the
   statement that the average height of the students of a college is 66 inches
   is a hypothesis. Here the students of the college constitute the population.
   It is possible to test the validity of this statement by the use of
   statistical techniques.
                           LIMITATIONS OF STATISTICS
    Newsholme states, “Statistics must be regarded as an instrument of research of
great value but having several limitations which are not possible to overcome and
as such they need our careful attention.”
      1. Statistics does not study qualitative facts: Statistics means an aggregate
         of numerical facts. In statistics only those phenomena are studied
         which can be expressed in numerical terms, directly or indirectly.
         Facts may be (1) directly numerical, like the age, weight and income of
         an individual; (2) not directly but indirectly numerical, like the
         intelligence and achievements of students; or (3) neither directly nor
         indirectly numerical, like morality, affection etc. Facts of the third
         type don’t come under the scope of statistics.
      2. Statistics doesn’t study individuals: According to W. I. King, “Statistics
         from their very nature of subject cannot and will never be able to take
         into account individual causes. When these are important, other means
         must be used for their study.” Statistical studies are done to compare the
         general behavior of a group at different points of time or the behavior of
         different groups at a particular point of time.
      3. Statistical results are true only on the average: Statistical laws
         are not completely true and accurate like the laws of physics. For example
         – the law of gravitation is perfectly true and universal, but statistical
         conclusions are not perfectly true. Suppose the average life span of a
         person in India is 62 years. It does not mean that every person will
         attain this age. On the basis of statistical methods we can speak only
         in terms of probability and not certainty.
      4. Statistics lacks complete accuracy: According to Conner,
         “Statistical data must always be treated as approximations or estimates
         and not as precise measurements.” Statistical results, being based on sample
         or census data, are bound to be true only approximately. For example –
         according to the population census of 2001, the country’s population was
         1,02,70,15,247, but the real population may have been more or less than
         this by a hundred, two hundred and so on.
      5. Statistics is liable to be misused: Statistics deals with figures, and
         figures can easily be manipulated or distorted. In the hands of inexpert
         and unskilled persons, statistics is very likely to be misused. In
         other words, the data should be handled by experts and the methods
         used only by technically sound persons.
      6. Statistics is only one of the methods of studying a phenomenon:
         According to Croxton & Cowden, “It must not be assumed that the
         statistical method is the only method to be used in research; neither should
         this method be considered the best attack for every problem.” The
         conclusions arrived at with the help of statistics must be supplemented
         with other evidence.
      7. Statistical results may be misleading: Without reference to their
         context, statistical results may lead to doubtful conclusions. For example –
         on the basis of an increasing number of prisoners in the prisons, it may be
         concluded that crime is increasing. But it is possible that the number of
         prisoners is increasing because of harsher behavior of the police
         administration while crime is actually decreasing.
    Therefore, it is worth mentioning that every science is based on certain
assumptions and limitations. This does not reduce the importance of the subject
but lays emphasis on the fact that precautions should be taken while dealing with
statistical analysis and interpretation.
                             DISTRUST OF STATISTICS
   From a practical viewpoint statistics is a very useful and important science. We
know that the utility of statistics lies not merely in data but in the correct analysis
and proper interpretation of data. Often, due to ignorance or bias, people
misuse this delicate tool of knowledge, and this creates distrust of data.
   Opinions about the distrust of statistics:
    1st Opinion – Statisticians fully trust statistical conclusions because data
are collected, edited, analyzed and interpreted on the basis of statistical methods.
Thus they see no reason to doubt them and say that “Figures don’t lie” or “Figures
can prove nothing.”
    2nd Opinion – Statistics is looked upon with a suspicious eye and is quite
often condemned: “Statistics are tissues of falsehood.” Disraeli remarked that “there
are three kinds of lies - lies, damned lies and statistics.” Another saying runs,
“There are black lies, white lies, multi-chromatic lies, and statistics is a rainbow
of lies.”
    Many persons feel that data are false, confusing and incorrect and that with their
help truth can be proved wrong and lies can be passed off as truth. Hence it is said that
“Statistics can prove anything” or “Statistics are like clay of which you can make
God or Devil, as you please.”
    In this context, the following observation is worth quoting: “A statistician is a
person who is deeply involved in statistical data. He can freely play with them, misuse
them and cheat common people. So he is just a magician who shows sleight-of-hand
tricks through statistical data. His results can be surprising but not
trustworthy.”
                                    TYPES OF DATA
   Data are the foundation stones and basic raw material of any statistical
   investigation. They are facts that can be counted, classified, measured or
   quantified. The types of data are as follows:
ON THE BASIS OF CHARACTERISTICS OF FACTS
Data may be divided into two types:
        1. Quantitative Data or Numerical Data: These types of data can be
           measured directly, such as age, income, production, marks etc. Such
           facts are called variables, and variables may be discrete or continuous.
Discrete variable – A variable whose values are individually distinct and
discontinuous. There is a definite difference between any two values. According
to Boddington, “A discrete variable is one where the variates (individual values)
differ from each other by definite amounts.” For example – number of students
in a class, number of children in a family, number of cattle etc. It takes
integral values such as 0, 1, 2, 3, 4 … etc.
Continuous variable – A continuous variable is one which can assume all values
within an interval. That is, no definite breaks are visible in this type of series.
For example – age, weight, height …
Question: State which of the following represent discrete data and which
represent continuous data.
   I. No. of accidents on each day in a month
  II. Lengths of 1,000 bolts produced in a factory
 III. Speed of an automobile in kilometres per hour
  IV. No. of books on a library shelf
      2. Qualitative Data or Categorical Data: These include data relating to
         facts which can’t be measured directly but are counted or
         categorized on the basis of attributes. Characteristics such as
         literate, illiterate, unemployed, honest etc. are called attributes.
         For example – a population can be classified on the basis of sex into
         males and females, or males may be classified on the basis of marital
         status, i.e. married or unmarried. Qualitative data may further be
         classified into two categories.
ON THE BASIS OF VARIABLES
On the basis of variables also, data may be of two types:
   (1) Univariate Data: When the frequencies are determined on the basis of
       one variable. For example – number of workers on the basis of wages,
       number of persons on the basis of age etc.
   (2) Bivariate Data: When the data are edited or presented on the basis of
       two variables simultaneously. For this a two-way frequency table is
       constructed, in which one variable is placed horizontally and the second
       one vertically. For example – to present the number of students in one
       table on the basis of marks obtained in two subjects, or to tabulate the
       number of persons in one table on the basis of two variables, i.e.
       height and weight.
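
A two-way frequency table of the kind described above can be sketched in a few lines of Python. This is only an illustration: the marks of the eight students, the class width of 25 and the helper names `mark_class` and `two_way_table` are assumptions made for the example and do not come from these notes.

```python
from collections import Counter

# Hypothetical data (assumed for illustration): marks of 8 students
# in two subjects, written as (marks in subject A, marks in subject B).
records = [(35, 42), (12, 18), (27, 33), (45, 41),
           (18, 25), (33, 38), (8, 15), (41, 29)]


def mark_class(mark, width=25):
    """Return the class label (e.g. '0-25') that contains the mark."""
    lower = (mark // width) * width
    return f"{lower}-{lower + width}"


def two_way_table(pairs):
    """Count how many students fall in each (class of A, class of B) cell."""
    return Counter((mark_class(a), mark_class(b)) for a, b in pairs)


table = two_way_table(records)
classes = ["0-25", "25-50"]

# Subject B is placed horizontally (columns), subject A vertically (rows).
print("A \\ B".ljust(10) + "".join(c.ljust(10) for c in classes))
for row in classes:
    cells = "".join(str(table[(row, col)]).ljust(10) for col in classes)
    print(row.ljust(10) + cells)
```

Each cell of the printed table is the number of students whose marks fall in the row class for one subject and the column class for the other, which is exactly what makes the data bivariate.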
    ON THE BASIS OF ARRANGEMENTS
    Data may be categorized into two types:
    (1) Raw Data: When the data are not yet arranged and analyzed. They are
        called ‘Raw’ because they are unprocessed by statistical methods.
    (2) Arranged Data: When the data are processed and are arranged, summarized,
        classified and tabulated in a proper way.
    Terms like ‘Data Point’ and ‘Data Set’ are also used in order to distinguish
    between the numbers relating to individual or single facts and the aggregate
    of facts. For example – the data of production of sugar for ten years will be
    termed a ‘Data Set’ and the figure for the production of one year will be a
    ‘Data Point’.
                                 CLASSIFICATION
After the collection and editing of data, the first step towards further processing
is classification. Classification is a process in which the collected data are
arranged in separate classes, groups or subgroups according to their
characteristics. According to Secrist, “Classification is the process of arranging
data into sequences and groups according to their common characteristics or
separating them into different classes.”
Thus classification means the arrangement and systematization of
data into different classes, and these classes are determined on the basis of the
nature, objectives and scope of the enquiry.
                       OBJECTIVES OF CLASSIFICATION
Classification is a method or technique for extracting the essential information
supplied by the raw data.
(1)   To condense the data: the main objective of classification is to condense and
       simplify the statistical material, so that it may be easily
       understood.
(2)   To bring out points of similarity and dissimilarity in the data: classification
       brings out clearly the points of similarity and dissimilarity of statistical
       facts because data of similar characteristics are placed in one class, e.g.
       males and females, literates and illiterates, married and unmarried etc.
(3)   To make facts comparable: by arranging the data according to the points of
       similarity and dissimilarity, it helps in comparison.
(4)   To bring out relationships: classification helps in finding cause and effect
       relationships in the data. For example – based on the literacy and criminal
       tendency of a group of people, it can be established whether literacy has any
       impact on criminal tendency or not.
(5)   To prepare the ground for tabulation: tabulation is the basis of statistical
       analysis and classification is the basis for tabulation.
Thus classification occupies an important place in the process of
statistical investigation. The fact is that the process of tabulation, presentation and
analysis can’t even be started without classification.
         CHARACTERISTICS OR RULES FOR A GOOD CLASSIFICATION
(1) Unambiguity: the various classes should be so defined that there is no
     room for doubt and confusion. For example – population is classified as
     literates or illiterates.
(2) Exhaustive and mutually exclusive: classification should be exhaustive
     (cover all items) and no item should find a place in more than one class.
     For example – students of a college are classified into three groups – urban,
     rural and hostellers. This classification is not mutually exclusive because
     among the hostellers some may be urban and some others rural.
(3) Stability: the classification of data into various classes must remain stable
     over the period of the investigation.
(4) Suitability: the classification should conform to the objectives of the enquiry.
     For example – to study the relationship between sex and university education,
     there is no need to classify on the basis of age and religion.
(5) Flexibility: a good classification should be flexible so that adjustments may
     easily be made in the classes according to changed situations. An ideal
     classification is one that can adjust itself to these changes and yet retain its
     stability.
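
The rule that a classification be exhaustive and mutually exclusive can be checked mechanically. The sketch below is only an illustration: the student names and the helper `check_classification` are hypothetical, and the three-group scheme mirrors the urban / rural / hostellers example given in rule (2).

```python
def check_classification(items, classes):
    """Return (unclassified, overlapping) for a proposed classification.

    items   -- the set of all items to be classified
    classes -- dict mapping each class name to the set of items placed in it
    """
    times_placed = {item: 0 for item in items}
    for members in classes.values():
        for item in members:
            if item in times_placed:
                times_placed[item] += 1
    # Exhaustive: no item is left out of every class.
    unclassified = {item for item, n in times_placed.items() if n == 0}
    # Mutually exclusive: no item sits in more than one class.
    overlapping = {item for item, n in times_placed.items() if n > 1}
    return unclassified, overlapping


# Hypothetical students, assumed for illustration only.
students = {"Asha", "Bina", "Chetan", "Dev"}

three_groups = {
    "urban": {"Asha", "Bina"},
    "rural": {"Chetan", "Dev"},
    "hostellers": {"Bina", "Dev"},   # hostellers are also urban or rural
}
two_groups = {
    "urban": {"Asha", "Bina"},
    "rural": {"Chetan", "Dev"},
}

print(check_classification(students, three_groups))
print(check_classification(students, two_groups))
```

A classification satisfies the rule only when both returned sets are empty. The three-group scheme fails because the hostellers are counted twice, while the two-group scheme passes.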
                          METHODS OF CLASSIFICATION
There are four methods of classification:
  1.   Geographical Classification
  2.   Chronological Classification
  3.   Qualitative Classification
  4.   Quantitative Classification
                                    TABULATION
Tabulation is the next step after classification of the data and is designed to
summarise a lot of information in a simple manner. In common language
tabulation is the process of arranging data in a systematic manner in the form
of rows and columns. According to Blair, “Tabulation in its broadest sense is
any orderly arrangement of data in columns and rows.”
                         OBJECTIVES OF TABULATION
  1.   To simplify complex data
  2.   To facilitate comparison
  3.   To economise space
  4.   To facilitate presentation
  5.   To help in analysis of data
  6.   To help in reference
DIFFERENCE BETWEEN CLASSIFICATION & TABULATION
Basis of Difference   Classification              Tabulation
Presentation          Classifies data into        Classifies data into rows and
                      different classes.          columns.
Sequence              First step                  Second step
Method                Method of statistical       Method of data presentation
                      analysis
Use of data           Original data are used.     Derivatives like percentages,
                                                  coefficients, proportions etc.
                                                  may also be used.
                         FREQUENCY DISTRIBUTION
The tabular arrangement of data showing the frequency of each item is called a
frequency distribution. According to Croxton and Cowden, “Frequency
distribution is a statistical table in which different values of variable are shown
in the sequence of magnitude along with corresponding frequencies.”
                   TYPES OF FREQUENCY DISTRIBUTION
   (1) Discrete frequency distribution: It is a discontinuous frequency
       distribution, where observations are independent of each other. Each
       observation is distinct and separate from the others. Example – the
       number of children in 20 families:
                     1,1,2,3,4,3,2,1,1,4,5,2,4,2,2,1,3,3,2,5
Construction of discrete frequency distribution
  1. A table with three columns is prepared.
  2. In 1st column of this table all possible values of the variable are placed.
  3. In the 2nd column, tally marks or tally bars are put, keeping in view the
     repetition of each value.
  4. After putting tally marks, they are counted and this counting is shown in
     the 3rd column, entitled frequency.
                      No. of      Tally Bars      Frequency
                      children
                      1           |||||           5
                      2           ||||| |         6
                      3           ||||            4
                      4           |||             3
                      5           ||              2
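
The construction steps above can also be carried out programmatically. The following Python sketch uses the 20 observations listed in the example. The function names `discrete_frequency_distribution` and `tally` are illustrative choices, and tally bars are drawn here in groups of five strokes.

```python
from collections import Counter

# Number of children in 20 families (the observations from the example).
children = [1, 1, 2, 3, 4, 3, 2, 1, 1, 4, 5, 2, 4, 2, 2, 1, 3, 3, 2, 5]


def discrete_frequency_distribution(values):
    """Return {value: frequency} with the values in ascending order."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}


def tally(frequency):
    """Draw tally bars in groups of five strokes, e.g. 6 -> '||||| |'."""
    groups, rest = divmod(frequency, 5)
    return " ".join(["|||||"] * groups + (["|" * rest] if rest else []))


table = discrete_frequency_distribution(children)

# Three columns: value of the variable, tally bars, frequency.
print(f"{'Children':<10}{'Tally Bars':<12}{'Frequency'}")
for value, frequency in table.items():
    print(f"{value:<10}{tally(frequency):<12}{frequency}")
```

The frequencies add up to 20, the number of families, which is a quick check that no observation was missed or counted twice.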
   (2) Continuous frequency distribution: A continuous frequency
       distribution is a distribution in which data are arranged in classes
       or groups because the individual values are not exactly measurable.
       The groups or class intervals run in a continuous form from the
       beginning of the frequency distribution till the end, within the given
       range of the data. For example – the marks of 27 students:
                      39,25,5,33,19,21,12,48,13,21,9,1,10,
                     9,8,12,17,40,12,46,37,17,27,30,6,2,23
Construction of continuous frequency distribution
   (1)   Range = maximum marks – minimum marks
   (2)   Class size = decide the size of each class (here 10 marks)
   (3)   Width of class interval = upper limit – lower limit of the class
   (4)   Mid value = (lower limit + upper limit) / 2
   (5)   Class frequency = the number of observations falling in a
         particular class is known as the frequency of the class.
                    Marks     Tally Bars      No. of
                                              Students
                    0 – 10    ||||| ||        7
                    10 – 20   ||||| |||       8
                    20 – 30   |||||           5
                    30 – 40   ||||            4
                    40 – 50   |||             3
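
The same table can be built with a short Python sketch. It uses the 27 marks listed in the example and assumes, as in the table, classes of width 10 starting at 0 in which the upper limit of a class belongs to the next class. The function name and its parameters are illustrative.

```python
# Marks of 27 students (the observations from the example).
marks = [39, 25, 5, 33, 19, 21, 12, 48, 13, 21, 9, 1, 10,
         9, 8, 12, 17, 40, 12, 46, 37, 17, 27, 30, 6, 2, 23]


def continuous_frequency_distribution(values, start, width, n_classes):
    """Count whole-number values into classes [lower, upper) of equal width."""
    classes = [(start + i * width, start + (i + 1) * width)
               for i in range(n_classes)]
    frequencies = {cls: 0 for cls in classes}
    for value in values:
        index = (value - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} lies outside the chosen classes")
        frequencies[classes[index]] += 1
    return frequencies


# Step (1): range = maximum marks - minimum marks.
data_range = max(marks) - min(marks)

# Steps (2)-(5): width 10, so five classes cover the marks from 0 to 50.
distribution = continuous_frequency_distribution(
    marks, start=0, width=10, n_classes=5)

print("Range =", data_range)
for (lower, upper), frequency in distribution.items():
    mid_value = (lower + upper) / 2
    print(f"{lower:>2} - {upper:<4} mid value {mid_value:<6} frequency {frequency}")
```

A mark of exactly 10 is counted in the class 10 - 20 and a mark of 30 in the class 30 - 40, which is why the first class holds 7 students and not 8.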
There are two types of series according to class interval:
   (1) Inclusive form: A frequency distribution in which the upper limit of
       each class is included in that class. Such as: 0-9, 10-19, 20-29 …
   (2) Exclusive form: A frequency distribution in which the upper limit of
       one class is the lower limit of the next class interval, and a value
       equal to the upper limit is counted in the next class. Such
       as: 0-10, 10-20, 20-30 …
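
The difference between the two forms shows up only at the class limits. The sketch below assumes whole-number values and a class width of 10, as in the series above. The two function names are illustrative.

```python
def exclusive_class(value, width=10):
    """Exclusive form (0-10, 10-20, ...): the upper limit is NOT in the class."""
    lower = (value // width) * width
    return (lower, lower + width)


def inclusive_class(value, width=10):
    """Inclusive form (0-9, 10-19, ...): the upper limit IS in the class."""
    lower = (value // width) * width
    return (lower, lower + width - 1)


for value in (9, 10, 19, 20):
    print(value, "exclusive:", exclusive_class(value),
          "inclusive:", inclusive_class(value))
```

In both forms the value 10 opens the second class. What changes is how the first class is written: 0-9 in the inclusive form, where 9 is the last value it contains, and 0-10 in the exclusive form, where 10 marks the boundary but is not counted in the class.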