Satyagopal Mandal
Department of Mathematics
University of Kansas
Office: 624 Snow Hall  Phone: 785-864-5180
  • e-mail: mandal@math.ukans.edu
  • © Copyright laws apply. My students have permission to copy.

    Testing Hypothesis

    The Philosophy of Testing Hypothesis

    In this chapter on Testing Hypotheses, we will test a hypothesis H0, called the Null hypothesis, against another hypothesis HA, called the alternative hypothesis. Only one of these two hypotheses is true. Based on the collected sample and a testing criterion that we will set up, we will accept one of them and reject the other.

     

    Example 1: We may like to test the hypothesis that the disparity between the wages (annual income) of working men and women does not exist any more. Let m1 be the mean annual income of working men and let m2 be the mean annual income of working women. So, our Null hypothesis H0 and the alternative hypothesis HA would be as follows:

    H0: m1 - m2 = 0

    HA: m1 - m2 > 0

     

     

    Example 2: A TV commentator mentioned that only about 10 years back the average life expectancy of a human being was 75 and now it has increased substantially. We would like to test the claim of this commentator. We let m be the average life expectancy of a human being.

    We set up our Null and alternative hypotheses as follows:

    H0: m = 75

    HA: m > 75

     

    Some of the important definitions and comments:

     

    1. Definition: A statistical hypothesis is a statement, a claim or a proposition regarding a population. Most often, they are about the values of the population parameters. In the above two examples, H0, HA are statistical hypotheses.
    2. Which hypothesis is taken as the Null hypothesis and which as the alternative hypothesis, in a given context, is an important consideration. Essentially, one is the negation of the other.
    3. The Null hypothesis H0 represents the status quo. It is something that you have believed for a long time, or some assumption or method that has been working, reliably enough, for you for a long time. You want to hold on to the Null hypothesis unless there is very strong evidence, in the collected data, that the alternative hypothesis is better.
    4. The alternative hypothesis HA represents a new claim or something out of the ordinary. It could be a researcher's new technology or some sales person's claim that his/her product is better. We are very skeptical about the alternative hypothesis and would accept it only if there is very strong evidence, in the collected data, in favor of it.
    5. Given a Null hypothesis H0 and an alternative hypothesis HA, a test of hypothesis is a rule or a procedure to decide, based on the collected sample, whether to accept H0 or HA. Our test will be based on the value of a test statistic. The rule is also called the decision rule or a test of significance.
    6. Two Types of errors: In this process of testing, we may commit two types of errors.

      1. If we reject H0 when it is in fact true, then it is called a type one error.
      2. If we accept H0 when it is in fact false, then it is called a type two error.
      3. The probability of committing a type one error is called the level of significance and is, normally, denoted by a. Usually, a will be 0.1, 0.05 or 0.01 (or some other small number).

     

     

    Developing a Test

     

    Let X be a random variable with mean m and standard deviation s. Some of the tests of hypotheses that we will carry out look as follows:

     

    H0: m = 75

    HA: m not equal 75

     

    Or

    H0: m = 75

    HA: m > 75

     

    or

    H0: m = 75

    HA: m < 75

     

    More generally, we would test hypotheses like

    H0: m = m0

    HA: m not equal m0

     

    or

     

    H0: m = m0

    HA: m > m0

     

    or

    H0: m = m0

    HA: m < m0

    Let us develop a test procedure for

     

    H0: m = m0

    HA: m not equal m0

     

     

    We will take a sample X1, X2, …, Xn of size n from the X-population and let X = (X1 + X2 + … + Xn)/n be the sample mean.

     

    1. We will assume that the sample size n is large enough (more than 30), so that, by the CLT, X has an approximately normal distribution with mean m and standard deviation s/√n.
    2. We can control the probabilities of both the type one and the type two errors by increasing the sample size n. If the sample size n is fixed, it is not possible to control both simultaneously: if you reduce the probability of type one error, the probability of type two error will go up, and conversely. Since we are more concerned about type one error, we will keep the probability of type one error, which is also called the level of significance, at a small fixed value. So, we would like to develop a test at the level of significance a.
    3. X is a good estimator for m. Our alternative hypothesis is given by "HA: m not equal m0". So, we will reject H0 only if X and m0 are far apart. That is, we reject H0 if the absolute value |X-m0| is large.
    4. Also, if H0 is true, then m = m0 and Z = (X - m0)/(s/√n) has approximately the standard normal distribution. This expression Z will be called a test statistic. We will accept H0 if the computed value |z| of |Z| is small and reject H0 if the computed value |z| of |Z| is large.
    5. If H0 is true then

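    P(Z = (X - m0)/(s/√n) not between -z_(a/2) and z_(a/2)) = a.

    Here z_(a/2) denotes the value that has area a/2 to its right under the standard normal curve.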

     

     

    6. So, at the level of significance a our decision rule is as follows:

    Reject H0 if

    Z = (X - m0)/(s/√n) is not between -z_(a/2) and z_(a/2)

    and accept H0 otherwise.

     

    7. The above decision rule works only if we know the value of s.
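
    The trade-off described in item 2 above can be checked numerically. The following is a minimal Python sketch, using only the standard library, that computes the critical value z_(a/2) and the probability of a type two error for the two-tail rule. The function names and the numbers m0 = 75, true mean 80, s = 15 and n = 36 are illustrative assumptions and do not come from these notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def critical_value(a):
    """Return z_(a/2), the point with area a/2 to its right."""
    return STD_NORMAL.inv_cdf(1 - a / 2)


def type_two_error(a, n, m0, m_true, s):
    """P(accept H0: m = m0) under the two-tail rule when the mean is m_true."""
    z_crit = critical_value(a)
    # When the true mean is m_true, Z is normal with this mean and sd 1.
    shift = (m_true - m0) / (s / sqrt(n))
    # H0 is accepted when Z falls between -z_crit and z_crit.
    return STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)


# Assumed values for illustration only.
for a in (0.10, 0.05, 0.01):
    b = type_two_error(a, n=36, m0=75, m_true=80, s=15)
    print(f"a = {a:.2f}   z_(a/2) = {critical_value(a):.3f}   type two = {b:.3f}")
```

    With n held fixed, the printed type two probability goes up as a is lowered. Calling type_two_error with a larger n and the same a gives a smaller value, which is the sense in which a larger sample controls both errors.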

     

     

    Some Decision Rules: We will assume that the value of s is known. The following tests will be called

     

    Z-Tests:

     

    The Two-tail test: Suppose we are testing

    H0: m = m0

    Against HA: m not equal m0

     

    At the level of significance a, our decision rule is as follows:

    Reject H0 if

    Z = (X - m0)/(s/√n) is not between -z_(a/2) and z_(a/2)

    and accept H0 otherwise.

     

    The Left-tail test: Suppose we are testing

    H0: m = m0

    Against HA: m < m0

     

    At the level of significance a, our decision rule is as follows:

    Reject H0 if

    Z = (X - m0)/(s/√n) < -z_a

    and accept H0 otherwise.

     

    The Right-tail test: Suppose we are testing

    H0: m = m0

    Against HA: m > m0

     

    At the level of significance a, our decision rule is as follows:

    Reject H0 if

    Z = (X - m0)/(s/√n) > z_a

    and accept H0 otherwise.
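
    The three decision rules above can be sketched in a few lines of Python. This is only an illustration: the function name z_test_reject and the labels "two", "left" and "right" for the alternative hypothesis are hypothetical, and the sample numbers at the bottom are assumed for the demonstration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_reject(xbar, m0, s, n, a, tail):
    """Return True if H0: m = m0 is rejected at level of significance a.

    tail names the alternative: "two" (m not equal m0),
    "left" (m < m0) or "right" (m > m0).
    """
    std_normal = NormalDist()
    z = (xbar - m0) / (s / sqrt(n))  # observed value of the test statistic
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - a / 2)  # z_(a/2)
        return abs(z) > z_crit
    z_crit = std_normal.inv_cdf(1 - a)  # z_a
    if tail == "left":
        return z < -z_crit
    if tail == "right":
        return z > z_crit
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Assumed numbers: sample mean 81, m0 = 75, s = 15, n = 25, so z = 2.
print(z_test_reject(81, 75, 15, 25, a=0.05, tail="two"))
print(z_test_reject(81, 75, 15, 25, a=0.01, tail="two"))
print(z_test_reject(81, 75, 15, 25, a=0.05, tail="left"))
```

    The same observed z can lead to rejection at one level of significance and acceptance at a stricter one, because only the critical value changes.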

     

    Definition: Suppose we are using a test statistic Z to test H0 against HA. Let the observed value of Z be z. The P-value is defined as the probability, assuming H0 is true, that Z will take a value at least as extreme as z. In the decision rules given above, the test statistic is Z = (X - m0)/(s/√n).

    Let Z=z be the observed value of Z.

     

    1) For the two-tail test the P-value is given by

    p = P(Z not between -|z| and |z|).

     

    2) For the left-tail test the P-value is given by p = P(Z < z).

    3) For the right-tail test the P-value is given by p = P(Z > z).
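
    As a rough check of these three formulas, the P-value can be computed from the standard normal distribution in Python. The sketch below is one possible way to do it with the standard library. The function name p_value and the tail labels are hypothetical. For the left-tail test it uses the area to the left of the observed z, and for the right-tail test the area to the right of it.

```python
from statistics import NormalDist


def p_value(z, tail):
    """P-value of an observed z for a "two", "left" or "right" tail Z-test."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "two":
        # Area outside the interval from -|z| to |z|.
        return 2 * (1 - cdf(abs(z)))
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Assumed observed values, for illustration only.
print(round(p_value(2.0, "two"), 4))
print(round(p_value(-2.0, "left"), 4))
print(round(p_value(2.0, "right"), 4))
```

    By the symmetry of the normal curve, the two-tail P-value of z is twice the one-tail P-value taken in the direction of z.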

     

    Use Your Calculator: I designed this chapter assuming that you will have a TI-83. A TI-83 will give you a distinct advantage. You must have seen Ztest, Ttest and all that on your stat-menu while working with confidence intervals.

     

    1. Select Test, Ztest and enter.
    2. Select data or stat as the case may be.
    3. Give the value of m0, s, x and n.
    4. Select the alternative hypothesis.
    5. Calculate
    6. Among other things, it will give you the p (the p-value).
    7. At the level of significance a, reject H0 if p < a and otherwise accept H0. (Small p is BAD for H0).

     

     

     

    Z-Test: Problems

    Ex.1: Assume that you have a normal population with mean m and standard deviation s = 15. Suppose you have collected a sample of size 25 and the sample mean was found to be x = 81.

    We want to test the null hypothesis

    H0: m = 75 against HA: m not equal to 75.

    At the 5 percent level of significance, will you reject or accept the Null hypothesis?

     

    Solution: The Calculator gives me the P-value p = 0.045. The level of significance is a = 0.05. Since p = 0.045 < 0.05 = a, we reject H0 at the 5 percent level of significance.

     

    Ex.2: (Change the level of significance to one percent) Assume that you have a normal population with mean m and standard deviation s = 15. Suppose you have collected a sample of size 25 and the sample mean was found to be x = 81.

    We want to test the null hypothesis

    H0: m = 75 against HA: m not equal to 75.

    At the 1 percent level of significance, will you reject or accept the Null hypothesis?

     

    Solution: The Calculator gives me the P-value p = 0.045. The level of significance is a = 0.01. Since p = 0.045 > 0.01 = a, we accept H0 at the 1 percent level of significance.

     

    Ex.3: (Change the alternative hypothesis) Assume that you have a normal population with mean m and standard deviation s = 15. Suppose you have collected a sample of size 25 and the sample mean was found to be x = 81.

    We want to test the null hypothesis

    H0: m = 75 against HA: m > 75.

    At the 5 percent level of significance, will you reject or accept the Null hypothesis?
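
    The three exercises above use the same sample, so they can be checked together without a calculator. The following Python sketch mirrors the calculator's stat input (m0, s, x, n and the alternative hypothesis). The function name z_test_p and the labels for the alternative are hypothetical and not part of the TI-83.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p(m0, s, xbar, n, alternative):
    """Return (z, p) for a Z-test of H0: m = m0 with known s."""
    z = (xbar - m0) / (s / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "not equal":
        p = 2 * (1 - cdf(abs(z)))
    elif alternative == "less":
        p = cdf(z)
    elif alternative == "greater":
        p = 1 - cdf(z)
    else:
        raise ValueError("unknown alternative")
    return z, p


# Data of the exercises above: m0 = 75, s = 15, x = 81, n = 25.
z, p_two = z_test_p(75, 15, 81, 25, "not equal")
_, p_right = z_test_p(75, 15, 81, 25, "greater")

for a in (0.05, 0.01):
    decision = "reject H0" if p_two < a else "accept H0"
    print(f"two-tail: z = {z:.2f}, p = {p_two:.4f}, a = {a}: {decision}")

decision = "reject H0" if p_right < 0.05 else "accept H0"
print(f"right-tail: z = {z:.2f}, p = {p_right:.4f}, a = 0.05: {decision}")
```

    Here z = 2, the two-tail P-value is about 0.0455, and the right-tail P-value is half of that, about 0.0228.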

     

    Ex.3: The time taken by an athlete to run an event has a distribution with mean m and known standard deviation s = 4 seconds. The coach believes that the athlete's mean time has improved from last year's mean of 34 seconds. To test this, the athlete ran the event 35 times and the sample mean was found to be x = 32 seconds. The null and the alternative hypotheses are formulated as

    H0: m = 34 against HA: m < 34.

    At the 5 percent level of significance would the coach accept or reject his "impression"?

     

    Solution: The P-value is p = 0.002 < a = 0.05. So, we reject H0. So, we accept the "impression" of the coach that his mean time has improved.

     

    Ex.4: It is assumed that the lifetime (in hours) of lamps produced in a factory is normally distributed with mean m and standard deviation s = 1148. The mean lifetime of an average lamp in the market is 6000 hours. A sales person claims that his lamps are better. To estimate m, the following data was collected on the lifetime of the lamps:

     

    5110  4671  6441  3331  5055
    5270  5335  4973  1837  5487
    7783  4560  6074  4777  4707
    5263  4978  5418  5123  5017

    To test the claim the null and the alternative hypotheses are formulated as

    H0: m = 6000 against HA: m > 6000.

     

    At the one percent level of significance, would you accept the sales person's claim?

    Solution: Here p = 0.9999 is bigger than a = 0.01. So, we accept H0. So, we reject the sales person's claim.
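
    When the raw data is given, as in Ex.4, the calculator's data option first computes the sample mean and then the test statistic. A minimal Python sketch of the same computation, using only the standard library, could look as follows. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# Lamp lifetimes (in hours) from Ex.4.
lifetimes = [5110, 4671, 6441, 3331, 5055,
             5270, 5335, 4973, 1837, 5487,
             7783, 4560, 6074, 4777, 4707,
             5263, 4978, 5418, 5123, 5017]

m0 = 6000  # mean lifetime under H0
s = 1148   # known standard deviation
a = 0.01   # level of significance

n = len(lifetimes)
xbar = mean(lifetimes)
z = (xbar - m0) / (s / sqrt(n))
p = 1 - NormalDist().cdf(z)  # right-tail test, HA: m > 6000

print(f"n = {n}, sample mean = {xbar}, z = {z:.3f}, p = {p:.4f}")
print("reject H0" if p < a else "accept H0")
```

    The sample mean comes out to 5060.5 hours, which is below 6000. So z is negative and the right-tail P-value is close to 1, in agreement with the solution above.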

     

    Ex.5: To estimate the mean weight (in pounds) of salmon in a river the following sample was collected:

    34.7  33.8  38.2  20.3  27.8
    45.3  43.1  37.3  32.5  32.3
    31.8  41.5  44.5  29.2  25.3
    29.6  39.5  29.1  37.3

     

    Last year the mean weight was found to be 35 pounds and s = 6.5. You want to test whether the mean weight has changed significantly this year. To test this, the null and the alternative hypotheses are formulated as

    H0: m = 35 against HA: m not equal 35.

    At 5 percent level of significance would you reject or accept the null hypothesis?