| Satyagopal Mandal |
| Department of Mathematics |
| Office: 624 Snow Hall Phone: 785-864-5180 |
The Distribution of the Sample Mean
As in the chapter on the Binomial Distribution, the final theorem of this chapter is that the sample mean
\(\bar{X} = (X_1 + X_2 + \cdots + X_n)/n\)
has an approximately normal distribution when the sample size n is large.
Given a set of data, the mean or average \(\bar{x}\) (or A) that we computed in the previous chapters is, in fact, the observed value of a random variable \(\bar{X}\), called the sample mean.
Similarly, the standard deviation s that we computed before is the observed value of a random variable S, called the sample standard deviation.
Each time you collect a sample, the computed sample mean \(\bar{x}\) is the value of the random variable \(\bar{X}\) for that sample.
Our point of view is explained in the following example.
Example: Suppose we want to study the height distribution of the US population. So, we collect a sample of size n = 713 as follows:
Data on Height (in inches) of 713 individuals:
| 71 | 62 | 67 | 73 | 61 | 58 |
| 63 | 58 | 69 | 68 | 55 | 57 |
| 51 | 57 | 49 | 63 | 63 | 64 |
| 72 | 59 | 67 | 59 | 57 | 69 |
| 55 | 56 | 65 | 66 | 53 | 53 |
| 51 | 66 | 68 | 71 | 61 | 63 |
| And so on |
Our point of view is that the heights x1, x2, x3, …, xn (in our case 71, 62, 67, …) are, in fact, the observed values of random variables X1, X2, X3, …, Xn, respectively. Here X1 denotes the height of the first member of the sample, which could be the height of anybody in the whole US population; in our sample the value of X1 is 71. Similarly, X2 denotes the height of the second member of the sample, which again could be the height of anybody in the whole US population; in our sample the value of X2 is 62. Each time we collect a sample
x1, x2, x3, …, xn
the observed values will be different, but they are always values of the same set of random variables
X1, X2, X3, …, Xn.
Definition: We define the sample mean \(\bar{X}\) as the random variable
\(\bar{X} = (X_1 + X_2 + \cdots + X_n)/n\).
So, each time we collect a sample of size n, we get a value of \(\bar{X}\), namely the average \(\bar{x}\) of the sample x1, x2, x3, …, xn.
Remark: The main point here is that when we collect a sample and compute the mean (or average) \(\bar{x}\), the value we get is probabilistic or "chancy". So we can, and must, talk about the probability distribution of \(\bar{X}\). If we know the distribution of \(\bar{X}\), then we can answer questions about the probability of the various values of \(\bar{x}\) that we may get.
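The "chancy" nature of the sample mean can be seen in a small experiment. The following is only an illustrative sketch: it treats the 36 heights listed in the table above as a stand-in population (an assumption made for the demonstration, since the full sample of 713 is not shown) and draws several samples of size 10 from it. The names `heights`, `sample_mean` and `observed_means` are hypothetical.

```python
import random

# The 36 heights (in inches) listed in the table above; the remaining
# values of the 713-person sample are not shown in the notes.
heights = [
    71, 62, 67, 73, 61, 58,
    63, 58, 69, 68, 55, 57,
    51, 57, 49, 63, 63, 64,
    72, 59, 67, 59, 57, 69,
    55, 56, 65, 66, 53, 53,
    51, 66, 68, 71, 61, 63,
]


def sample_mean(sample):
    """Observed value x-bar of the sample mean for one sample."""
    return sum(sample) / len(sample)


# ASSUMPTION (illustration only): treat the 36 listed heights as a
# stand-in population and repeatedly draw samples of size n = 10.
rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10
observed_means = [sample_mean(rng.sample(heights, n)) for _ in range(5)]

for i, xbar in enumerate(observed_means, start=1):
    print(f"sample {i}: x-bar = {xbar:.1f}")
```

Each printed value is one observed value of the same random variable, the sample mean; a different sample generally gives a different value.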
We could make similar comments and definitions about the standard deviation. But we may not need them.
If we let X denote the random variable "height of an American", then we also say that
X1, X2, X3, …, Xn
is a sample from the X-population. We used the example of the height distribution of the US population to explain our point of view. But given any random variable X (like weight, wages, or a binomial random variable), we can talk about a sample X1, X2, X3, …, Xn from the X-population.
Properties: Suppose X is a random variable and let
X1, X2, X3, … , Xn
be a sample from the X-population. Then we have the following properties.
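Theorem: Let μ be the population mean and σ the population standard deviation of X. The mean of the sample mean \(\bar{X}\) is equal to the population mean μ. So,
\[\operatorname{mean}(\bar{X}) = \operatorname{mean}(X) = \mu.\]
The standard deviation of the sample mean \(\bar{X}\) is given by
\[\sigma_{\bar{X}} = \sigma/\sqrt{n}.\]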
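The two formulas in the theorem can be checked numerically. The sketch below is a simulation under assumed settings, not part of the theorem: the population is a fair six-sided die (a hypothetical choice), the sample size is n = 25, and 20,000 samples are drawn. It compares the mean and standard deviation of the simulated sample means with the population mean and with the population standard deviation divided by the square root of n.

```python
import random
import statistics

# ASSUMPTION (illustration only): the population is a fair six-sided die.
FACES = [1, 2, 3, 4, 5, 6]
N = 25            # sample size n
TRIALS = 20_000   # number of samples drawn

pop_mean = statistics.fmean(FACES)   # population mean
pop_sd = statistics.pstdev(FACES)    # population standard deviation

rng = random.Random(42)  # fixed seed so the run is repeatable

# One sample mean per trial: average of N independent die rolls.
sample_means = [sum(rng.choices(FACES, k=N)) / N for _ in range(TRIALS)]

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.pstdev(sample_means)
predicted_sd = pop_sd / N ** 0.5     # population sd / sqrt(n), as in the theorem

print(f"mean of sample means: {observed_mean:.3f} (population mean {pop_mean})")
print(f"sd of sample means:   {observed_sd:.3f} (predicted {predicted_sd:.3f})")
```

With this many trials the simulated values should land close to the predicted ones; the agreement is approximate because the simulation itself is random.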
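The Central Limit Theorem: Suppose
X1, X2, X3, …, Xn
is a sample from a population X with mean μ and standard deviation σ. Then, when n is large, the sample mean \(\bar{X}\) has approximately a normal distribution with mean μ and standard deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\). So, with Z a standard normal random variable,
\[P(a < \bar{X} < b) \approx P\left(\frac{a-\mu}{\sigma_{\bar{X}}} < Z < \frac{b-\mu}{\sigma_{\bar{X}}}\right).\]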
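The approximation in the Central Limit Theorem translates directly into a short computation. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of a printed normal table; the function name `prob_sample_mean_between` and the numbers in the usage line (mean 100, standard deviation 15, n = 36) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(a, b, mu, sigma, n):
    """Normal approximation to P(a < X-bar < b) for a sample of size n.

    mu and sigma are the population mean and standard deviation.
    """
    sd_xbar = sigma / sqrt(n)        # standard deviation of the sample mean
    std_normal = NormalDist()        # Z: mean 0, standard deviation 1
    z_low = (a - mu) / sd_xbar
    z_high = (b - mu) / sd_xbar
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Hypothetical numbers: sd of X-bar is 15/6 = 2.5, so this is P(-2 < Z < 2).
p = prob_sample_mean_between(95, 105, mu=100, sigma=15, n=36)
print(round(p, 4))  # about 0.9545
```

A one-sided probability such as P(X-bar > c) can be obtained the same way as 1 minus the value of the normal cdf at the standardized c.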
Problems on Sampling
Ex.1: It is known that the tuition X paid per semester by students in a university has a distribution with mean μ = $2,050 and standard deviation σ = $310. If 64 students are interviewed, what is the approximate probability that the sample mean tuition \(\bar{X}\) paid will be above $2,060?
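Solution: Here we are asked to compute \(P(\bar{X} > 2060)\), with n = 64.
The mean of \(\bar{X}\) is μ = 2,050 and the standard deviation of \(\bar{X}\) is
\[\sigma_{\bar{X}} = \sigma/\sqrt{n} = 310/\sqrt{64} = 310/8 = 38.75.\]
So,
\[P(\bar{X} > 2060) = P\left(\frac{\bar{X}-\mu}{\sigma_{\bar{X}}} > \frac{2060-\mu}{\sigma_{\bar{X}}}\right) \approx P\left(Z > \frac{2060-2050}{38.75}\right) = P(Z > 0.26).\]
Rounding z to 0.3 for the table lookup,
\[P(Z > 0.26) \approx 1 - P(Z < 0.3) = 1 - 0.6179 = 0.3821.\]
(Without rounding z, P(Z < 0.26) = 0.6026 gives 1 - 0.6026 = 0.3974.)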
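The arithmetic in this solution can be reproduced with the standard library, which also shows how much the rounding of z to 0.3 matters. This is a sketch for checking the numbers, with `statistics.NormalDist` standing in for the normal table; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2050, 310, 64

sd_xbar = sigma / sqrt(n)      # 310 / 8 = 38.75
z = (2060 - mu) / sd_xbar      # about 0.258

std_normal = NormalDist()      # standard normal Z
p_unrounded = 1 - std_normal.cdf(z)     # P(Z > z) with no rounding of z
p_rounded = 1 - std_normal.cdf(0.3)     # z rounded to 0.3, as in the solution

print(f"sd of X-bar = {sd_xbar}")
print(f"z = {z:.3f}")
print(f"P(X-bar > 2060), z unrounded:      {p_unrounded:.4f}")
print(f"P(X-bar > 2060), z rounded to 0.3: {p_rounded:.4f}")
```

The two results differ in the second decimal place, which is the cost of rounding z to one decimal before the table lookup.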
Ex.2: The annual rainfall X in a region has a distribution with mean μ = 22 cm and standard deviation σ = 14 cm. What is the approximate probability that over the next 36 years the mean annual rainfall \(\bar{X}\) will exceed 27 cm?
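Solution: Here we are asked to compute \(P(\bar{X} > 27)\), with n = 36.
The mean of \(\bar{X}\) is μ = 22 and the standard deviation of \(\bar{X}\) is
\[\sigma_{\bar{X}} = \sigma/\sqrt{n} = 14/\sqrt{36} = 14/6 \approx 2.33.\]
So,
\[P(\bar{X} > 27) = P\left(\frac{\bar{X}-\mu}{\sigma_{\bar{X}}} > \frac{27-\mu}{\sigma_{\bar{X}}}\right) \approx P\left(Z > \frac{27-22}{2.33}\right) \approx P(Z > 2.1),\]
where z has been rounded to one decimal place. Then
\[P(Z > 2.1) = 1 - P(Z < 2.1) = 1 - 0.9821 = 0.0179.\]
(Keeping two decimals, z = 2.15 and P(Z < 2.15) = 0.9842 give 1 - 0.9842 = 0.0158.)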
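The normal approximation used here can be sanity-checked by simulation. The exercise does not say how annual rainfall is distributed, so the sketch below assumes a gamma distribution with mean 22 and standard deviation 14. That is purely an illustrative assumption (gamma values are positive and skewed), not something stated in the problem. It then estimates the probability that the 36-year mean exceeds 27 from repeated simulated samples.

```python
import random

MU, SIGMA, N = 22.0, 14.0, 36

# ASSUMPTION (illustration only): rainfall follows a gamma distribution.
# For a gamma distribution, mean = shape * scale and variance = shape * scale**2,
# so these choices reproduce the stated mean and standard deviation.
SHAPE = (MU / SIGMA) ** 2
SCALE = SIGMA ** 2 / MU


def mean_rainfall(rng):
    """Mean annual rainfall over N simulated years."""
    return sum(rng.gammavariate(SHAPE, SCALE) for _ in range(N)) / N


rng = random.Random(0)   # fixed seed so the run is repeatable
TRIALS = 20_000
exceed = sum(mean_rainfall(rng) > 27 for _ in range(TRIALS))
estimate = exceed / TRIALS

print(f"simulated P(X-bar > 27) = {estimate:.4f}")
```

The simulated value should be small and of the same order as the table answer, but it need not match exactly: n = 36 is only moderately large for a skewed population, and a different assumed distribution would give a somewhat different estimate.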
Ex.3: The amount X of ice cream in an ice-cream cone has mean μ = 3.2 ounces and standard deviation σ = 0.4 ounces. If there are 50 children at a birthday party, what is the approximate probability that the mean consumption \(\bar{X}\) will be more than 3.3 ounces?
Ex.4: A cigarette manufacturer claims that the mean nicotine content in a cigarette is μ = 2 mg with standard deviation σ = 0.3 mg. If this claim is valid, what is the approximate probability that a sample of n = 900 cigarettes will yield a sample mean nicotine content \(\bar{X}\) of more than 2.02 mg?