
Random Sampling, Statistics, and Estimators

statistics.tex, Feb. 11, 2003

In general, we assume the rv of interest, $X$, has some known distribution, $f_X(x)$, in the population of interest, but we do not know the values of the parameters in that distribution. That is, we assume that in the population of interest, $X$ has some known distribution. This is a very strong assumption. Given that we know, by assumption, the form of the distribution, the problem is to estimate the values of its parameters. We do this by taking a sample from the population of interest and using the information in the sample to estimate the values of the population parameters. Hereafter, I will use the term population to refer to the population of interest. Estimation always starts by defining the population of interest. If one wants to emphasize the parameters in $f_X(x)$, one might write it $f_X(x; \theta)$, where $\theta$ is the vector of parameters.

Samples and random samples


We would like our sample to be a random sample from $f_X(x)$.

Definition: Define a sample of size $n$ as $x_1, x_2, \ldots, x_n$, where $x_j$ is the $j$th observation in the sample (MGB 223). Note that a sample is a vector of random variables with some joint distribution.

Definition: The sample $x_1, x_2, \ldots, x_n$ is a random sample from $f_X(x)$ if
\[
f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = f_X(x_1)\, f_X(x_2) \cdots f_X(x_n),
\]
where $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ is the joint distribution of the sample (MGB 223 and G74).

In explanation, each variable in $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ is a random variable; that is, observation $j$ can take different values, so observation $j$ is an rv. Denote this random variable $X_j$, and the specific value it takes $x_j$. $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ is therefore a joint density function for the $n$ random variables in the sample. Said in words, the sample is a random sample from $f_X(x)$ if each observation is an independent draw from $f_X(x)$. Just to be clear, let me write out the above in a little more detail. The sample $x_1, x_2, \ldots, x_n$ is a random sample from $f_X(x)$ if
\[
f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = f_{X_1}(x_1)\, f_{X_2}(x_2) \cdots f_{X_n}(x_n) = f_X(x_1)\, f_X(x_2) \cdots f_X(x_n) = f(x_1)\, f(x_2) \cdots f(x_n),
\]
because $f_{X_i}(x_i) = f_X(x_i)$; that is, each observation in the sample has the same distribution. Often we say (G74) a sample is random if the observations in it are independent and identically distributed (i.i.d.). That is, a sample is random if each observation in the sample is independently drawn from the same (identical) distribution. Said loosely, the sample is random if, for each observation, each value of the rv in the population has an equal chance of appearing as the $j$th observation, and this is true for all $j$. Give me an example of a nonrandom sample. Start by identifying the population you are sampling from. Can one tell, by observation, whether a sample is a random sample?
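As a concrete illustration (not part of the original notes), here is a minimal Python sketch of drawing a random sample; the normal population and its parameter values are assumptions made purely for the example.

```python
import numpy as np

# A minimal sketch: a random sample of size n is n independent draws from
# the same population distribution f_X(x). The normal population and its
# parameter values (mu = 5, sigma = 2) are assumed purely for illustration.
rng = np.random.default_rng(seed=42)
mu, sigma, n = 5.0, 2.0, 10

sample = rng.normal(loc=mu, scale=sigma, size=n)  # x_1, x_2, ..., x_n
print(sample)  # one realization of the random vector (X_1, ..., X_n)
```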

Note that random does not mean representative.

However, as $n$ increases, the sample will likely become more representative of the population. Because of sampling variation, samples differ. That is, any two random samples of size $n$ from the same population are unlikely to exhibit the same values of the $n$ random variables $X_1, X_2, \ldots, X_n$. Remember that the $n$ random variables $X_1, X_2, \ldots, X_n$ have some joint density function $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$. We call this the distribution of the samples. Each sample is a draw from this distribution. Picture a sample with two observations; that is, $n = 2$.
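A sketch of that picture, again under an assumed normal population: each sample of size $n = 2$ is one draw from the joint distribution of the sample, and repeated draws differ.

```python
import numpy as np

# Sampling variation: two random samples of size n = 2 from the same
# (assumed normal) population. Each sample is one draw from the joint
# distribution f_{X_1,X_2}(x_1, x_2), so the two draws almost surely differ.
rng = np.random.default_rng(seed=7)
mu, sigma, n = 5.0, 2.0, 2

sample_1 = rng.normal(loc=mu, scale=sigma, size=n)
sample_2 = rng.normal(loc=mu, scale=sigma, size=n)
print(sample_1, sample_2)  # different values of (x_1, x_2) in each draw
```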

The central problem in statistics


1. We desire to study a population which has density $f_X(x; \theta)$, where the form of $f_X(x; \theta)$ is known but $\theta$ is unknown (MGB 226). This statement describes most of the econometrics you will ever do.

2. We take a random sample of size $n$ from $f_X(x; \theta)$: $x_1, x_2, \ldots, x_n$.

3. We then assume some function $t = t(X_1, X_2, \ldots, X_n)$ is an estimate of some element of $\theta$; call that element $\theta_k$. The issue is whether $t(X_1, X_2, \ldots, X_n)$ is a good estimator of $\theta_k$.

Definition: A function of $X_1, X_2, \ldots, X_n$ is called a statistic. That is, a statistic is just a function of the observed data (a function of the observed values of the $n$ random variables in the sample); $t(X_1, X_2, \ldots, X_n)$ is what we mean by a statistic. Note that statistics is just the plural of statistic, so statistics is the study of functions of observed values of random variables, or, said another way, statistics is the study of functions of the data. For example, if one takes a random sample of size $n$ from $f_X(x; \theta)$, the following are all statistics: $X_1$ (the first $X$ drawn);

the smallest (or largest) $X$ drawn; $e^{3X_1}\, e^{6X_2}\, e^{X_4}\, e^{3X_{17}}$; and $\frac{1}{n} \sum_{i=1}^n X_i$. Each of these statistics might or might not be a good estimator of some element of $\theta$. Consider some estimator $t = t(X_1, X_2, \ldots, X_n)$. If we use $t(x_1, x_2, \ldots, x_n)$ as an estimate of $\theta$, we say that $t(X_1, X_2, \ldots, X_n)$ is an estimator of $\theta$ and $t(x_1, x_2, \ldots, x_n)$ is an estimate of $\theta$.
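A short sketch computing several such statistics from one sample; the population, its parameter values, and $n = 20$ are assumed here purely for illustration.

```python
import numpy as np

# All of the following are statistics: functions of the observed data.
# The population (normal, mu = 5, sigma = 2) and n = 20 are assumed values.
rng = np.random.default_rng(seed=0)
n = 20
x = rng.normal(loc=5.0, scale=2.0, size=n)

t1 = x[0]        # X_1, the first X drawn
t2 = x.min()     # the smallest X drawn
t3 = x.max()     # the largest X drawn
# An arbitrary function of the draws (X_4 is x[3], X_17 is x[16]) -- still a statistic:
t4 = np.exp(3 * x[0]) * np.exp(6 * x[1]) * np.exp(x[3]) * np.exp(3 * x[16])
t5 = x.mean()    # (1/n) * sum_i X_i, the sample mean
print(t1, t2, t3, t4, t5)
```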

Estimating the population mean


One population parameter that we often want to estimate is the population mean. Let $\mu_x$ represent the population mean, so the population density is $f_X(x; \mu_x, \sigma_x^2)$. We want an estimator for $\mu_x$. The sample mean, from a random sample drawn from $f_X(x; \mu_x, \sigma_x^2)$, is an estimator of $\mu_x$. That is,
\[
\bar X = \frac{1}{n} \sum_{i=1}^n X_i = t(X_1, X_2, \ldots, X_n)
\]
is an estimator of $\mu_x$, and
\[
\bar x = \frac{1}{n} \sum_{i=1}^n x_i
\]
is an estimate of $\mu_x$ from the specific sample $x_1, x_2, \ldots, x_n$.

As alternative estimators of $\mu_x$, consider $\min(X_1, X_2, \ldots, X_n)$ and $X_3$. These are also both statistics and estimators of $\mu_x$.

In a general sense, every statistic from a sample is an estimator of each of the population parameters; maybe a bad estimator, but an estimator nevertheless. I have proposed three candidates for estimators of $\mu_x$:
\[
\bar X = \frac{1}{n} \sum_{i=1}^n X_i, \qquad \min(X_1, X_2, \ldots, X_n), \qquad \text{and} \qquad X_3.
\]
Do they have any desirable properties?
\[
E[\bar X] = E\left[\frac{1}{n} \sum_{i=1}^n X_i\right] = \frac{1}{n} \sum_{i=1}^n E[X_i] = \frac{1}{n}\left(E[X_1] + E[X_2] + \cdots + E[X_n]\right) = \frac{1}{n}\left(\mu_x + \mu_x + \cdots + \mu_x\right) = \frac{1}{n}\, n \mu_x = \mu_x.
\]
That is, $E[\bar X] = \mu_x$, which seems like a nice property for $\bar X$ to have. Does $X_3$ have this property? Yes: $E[X_3] = \mu_x$. How about $\min(X_1, X_2, \ldots, X_n)$? No. Can you prove it? Consider another estimator for $\mu_x$:
\[
s(X_1, X_2) = .5 X_1 + .25 X_2.
\]

\[
E[s(X_1, X_2)] = E[.5 X_1 + .25 X_2] = .5\, E[X_1] + .25\, E[X_2] = .5 \mu_x + .25 \mu_x = .75 \mu_x \neq \mu_x.
\]
$s(X_1, X_2)$ systematically underestimates $\mu_x$.

Definition: $t(X_1, X_2, \ldots, X_n)$ is an unbiased estimator of $\theta$ if $E[t(X_1, X_2, \ldots, X_n)] = \theta$.
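A Monte Carlo sketch makes the contrast concrete (the normal population and its parameter values are assumed for illustration): averaging each estimator over many samples approximates its expectation, which is roughly $\mu_x$ for $\bar X$ but roughly $.75 \mu_x$ for $s(X_1, X_2)$.

```python
import numpy as np

# Monte Carlo sketch (assumed normal population): average each estimator
# over many independent samples to approximate its expectation.
rng = np.random.default_rng(seed=1)
mu, sigma, n, reps = 5.0, 2.0, 10, 100_000

samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))

xbar = samples.mean(axis=1)                       # X-bar for each sample
s12 = 0.5 * samples[:, 0] + 0.25 * samples[:, 1]  # s(X_1, X_2) for each sample

print(xbar.mean())  # approx mu = 5.0: X-bar is unbiased
print(s12.mean())   # approx .75 * mu = 3.75: biased downward here
```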

Estimating $\sigma_x^2$, the population variance


Consider the following two statistics as estimators of $\sigma_x^2$:
\[
\hat s_x^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2
\]
and
\[
s_x^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar X)^2,
\]
where $\bar X = \frac{1}{n} \sum_{i=1}^n X_i$. If one had to choose between these two estimators, at first blink I might go for the first one because it is the average of the squared deviations. Remember that $\sigma_x^2$ is the expectation of the squared deviations in the population, $E[(X - E[X])^2]$. $\hat s_x^2$ is called the method-of-moments estimator of $\sigma_x^2$. Note that
\[
\lim_{n \to \infty} \hat s_x^2 = \lim_{n \to \infty} s_x^2.
\]
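In numpy, these two estimators correspond to the ddof argument of var (ddof=0 divides by $n$, ddof=1 by $n-1$); a quick sketch, with the population values assumed as before:

```python
import numpy as np

# The two estimators correspond to numpy's ddof argument. Population values
# (mu = 5, sigma = 2, so sigma^2 = 4) and n = 10 are assumed for illustration.
rng = np.random.default_rng(seed=2)
x = rng.normal(loc=5.0, scale=2.0, size=10)

s2_hat = x.var(ddof=0)  # divides by n: the method-of-moments estimator
s2 = x.var(ddof=1)      # divides by n - 1
print(s2_hat, s2)       # s2 = s2_hat * n / (n - 1); the gap shrinks as n grows
```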

Consider the expectation of each. However, before we do this, consider the following algebra, which will turn out to be useful:
\[
\sum_{i=1}^n (x_i - \mu_x)^2 = \sum_{i=1}^n \left[(x_i - \bar x) + (\bar x - \mu_x)\right]^2 = \sum_{i=1}^n \left[(x_i - \bar x)^2 + (\bar x - \mu_x)^2 + 2 (x_i - \bar x)(\bar x - \mu_x)\right]
\]
\[
= \sum_{i=1}^n (x_i - \bar x)^2 + n (\bar x - \mu_x)^2 + 2 (\bar x - \mu_x) \sum_{i=1}^n (x_i - \bar x),
\]
but since
\[
\sum_{i=1}^n (x_i - \bar x) = \sum_{i=1}^n x_i - n \bar x = n \bar x - n \bar x = 0,
\]
we have
\[
\sum_{i=1}^n (x_i - \mu_x)^2 = \sum_{i=1}^n (x_i - \bar x)^2 + n (\bar x - \mu_x)^2.
\]
Solve this for $\sum_{i=1}^n (x_i - \bar x)^2$ to obtain
\[
\sum_{i=1}^n (x_i - \bar x)^2 = \sum_{i=1}^n (x_i - \mu_x)^2 - n (\bar x - \mu_x)^2.
\]
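The identity is easy to verify numerically; here is a minimal sketch, with an assumed population mean $\mu_x = 5$ and an arbitrary simulated sample:

```python
import numpy as np

# Numerical check of the identity, with an assumed population mean mu = 5
# and an arbitrary simulated sample.
rng = np.random.default_rng(seed=3)
mu = 5.0
x = rng.normal(loc=mu, scale=2.0, size=10)
xbar, n = x.mean(), x.size

lhs = np.sum((x - mu) ** 2)
rhs = np.sum((x - xbar) ** 2) + n * (xbar - mu) ** 2
print(np.isclose(lhs, rhs))  # True
```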

This will prove useful; we will use it in our derivation of $E[s_x^2]$:
\[
E[s_x^2] = E\left[\frac{1}{n-1} \sum_{i=1}^n (X_i - \bar X)^2\right] = \frac{1}{n-1}\, E\left[\sum_{i=1}^n (X_i - \bar X)^2\right].
\]
Substituting the algebraic relationship we just derived,
\[
E[s_x^2] = \frac{1}{n-1}\, E\left[\sum_{i=1}^n (X_i - \mu_x)^2 - n (\bar X - \mu_x)^2\right] = \frac{1}{n-1} \left[\sum_{i=1}^n E[(X_i - \mu_x)^2] - n\, E[(\bar X - \mu_x)^2]\right]
\]
\[
= \frac{1}{n-1} \left[\sum_{i=1}^n \sigma_x^2 - n \operatorname{var}(\bar X)\right] = \frac{1}{n-1} \left[n \sigma_x^2 - n\, \frac{\sigma_x^2}{n}\right] = \frac{1}{n-1} \left[n \sigma_x^2 - \sigma_x^2\right] = \frac{n-1}{n-1}\, \sigma_x^2 = \sigma_x^2.
\]
What did we just show? $E[s_x^2] = \sigma_x^2$. That is, $s_x^2$ is an unbiased estimator of $\sigma_x^2$. Therefore $\hat s_x^2$ is a biased estimator of $\sigma_x^2$. Note that the degree of bias in $\hat s_x^2$ decreases as $n$ increases. That is why we prefer $s_x^2$, over $\hat s_x^2$, as an estimator of $\sigma_x^2$. What is the intuition? If one has a sample of $n$ observations, once $\bar X$ is determined there are only $n-1$ independent $(X_i - \bar X)$. That is, if one knows $\bar X$ and $X_1, X_2, \ldots, X_{n-1}$, then $X_n$ is completely determined.
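A Monte Carlo check of the result, again with an assumed normal population ($\sigma_x^2 = 4$): averaging $s_x^2$ over many samples lands near $\sigma_x^2$, while the $1/n$ estimator lands near $\frac{n-1}{n} \sigma_x^2$.

```python
import numpy as np

# Monte Carlo check (assumed normal population with sigma^2 = 4): the average
# of s_x^2 over many samples lands near sigma^2, while the 1/n estimator
# lands near ((n - 1) / n) * sigma^2.
rng = np.random.default_rng(seed=4)
mu, sigma, n, reps = 5.0, 2.0, 10, 100_000

samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))

s2 = samples.var(axis=1, ddof=1)      # divide by n - 1
s2_hat = samples.var(axis=1, ddof=0)  # divide by n

print(s2.mean())      # approx sigma^2 = 4.0: unbiased
print(s2_hat.mean())  # approx (n - 1)/n * sigma^2 = 3.6: bias shrinks with n
```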
