Confidence Interval for the Difference of Two Population Means

This file is part of a program based on the Bio 4835 Biostatistics class taught at Kean University in Union, New Jersey.  The course uses the following text:
Daniel, W. W. 1999.  Biostatistics: a foundation for analysis in the health sciences.  New York: John Wiley and Sons.  
The file follows this text very closely and readers are encouraged to consult the text for further information.

 

B) Confidence interval for the difference of two population means

Introduction

From each of two populations an independent random sample is drawn.  The sample means, x-bar1 and x-bar2, are calculated.  Their difference, x-bar1 - x-bar2, is an unbiased estimator of the difference between the two population means, mu1 - mu2.  The variance of the estimator is
(sigma1 squared / n1) + (sigma2 squared / n2).
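
The variance expression follows from the independence of the two samples.  A brief sketch in LaTeX notation is given here; the Var operator and the subscripted symbols are notation introduced for this sketch and stand for the same quantities named in the text.

```latex
\begin{aligned}
\operatorname{Var}(\bar{x}_1 - \bar{x}_2)
  &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
     && \text{(the samples are independent)} \\
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
\end{aligned}
```

The standard error of the estimator is the square root of this variance.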

Conditions for use

Assuming the populations are normally distributed, there are three situations where we would determine the 100(1-alpha ) percent confidence interval for mu1 -mu2 .

        a) where the population variances are known (use z)

        b) where the population variances are unknown but equal (use t)

        c) where the population variances are unknown but unequal (use t').  There is some acceptance of t' for cases such as these.  
            The concept of t' is noted here so that readers are aware of its existence but it will not be treated further in this narrative.


Illustrative Examples

Situation a)  Population variances are known (z is used)

When the population variances are known, the 100(1-alpha ) percent confidence interval for mu1 -mu2 is given by
         $$(\bar{x}_1 - \bar{x}_2) \pm z_{1-\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Example 6.4.1

A research team is interested in the difference between serum uric acid levels in patients with and without Down's syndrome.  In a large hospital for the treatment of the mentally retarded, a sample of 12 individuals with Down's syndrome yielded a mean of  x-bar1 = 4.5 mg/100 ml.  In a general hospital a sample of 15 normal individuals of the same age and sex were found to have a mean value of x-bar2= 3.4 mg/100 ml.  If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, find the 95 percent confidence interval for mu1-mu2 .

(1) Given

        n1 = 12, x-bar1 = 4.5, sigma1 squared = 1
        n2 = 15, x-bar2 = 3.4, sigma2 squared = 1.5


(2) Calculations

  • The point estimate for mu1 - mu2 is x-bar1 - x-bar2

            x-bar1 - x-bar2 = 4.5 - 3.4 = 1.1

  • The standard error is

            $$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{1}{12} + \frac{1.5}{15}} = 0.4282$$
 

  • The 95% confidence interval is

            $$(\bar{x}_1 - \bar{x}_2) \pm z_{1-\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
                      1.1 ± 1.96 (0.4282)

                      1.1 ± 0.8393

                       (0.26, 1.94)

Discussion:  Because the population variances are known, this is a z-interval, and the value of z for 95% confidence is 1.96.  We interpret the interval as follows: the difference between the two population means is estimated to be 1.1, and we are 95% confident that the true difference lies between 0.26 and 1.94.
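
The arithmetic of Example 6.4.1 can be checked with a short Python sketch.  It uses only the standard library; the function name z_interval_diff_means and its parameter names are hypothetical, chosen here for illustration, and the z value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_means(xbar1, xbar2, var1, var2, n1, n2, confidence=0.95):
    """Return (lower, upper) for mu1 - mu2 when both population variances are known."""
    alpha = 1.0 - confidence
    # Reliability coefficient z for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    point_estimate = xbar1 - xbar2
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(var1 / n1 + var2 / n2)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin


# Values given in Example 6.4.1.
lower, upper = z_interval_diff_means(4.5, 3.4, 1.0, 1.5, 12, 15)
print(f"({lower:.2f}, {upper:.2f})")  # prints (0.26, 1.94)
```

Passing a different confidence level, for example confidence=0.99, widens the interval because the z value increases.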

Situation b) Population variances are unknown but can be assumed to be equal (t is used)

If it can be assumed that the population variances are equal then each sample variance is actually a point estimate of the same quantity.  Therefore, we can combine the sample variances to form a pooled estimate.

Weighted averages

The pooled estimate of the common variance is a weighted average of the two sample variances, in which each sample variance is weighted by its degrees of freedom.


Pooled estimate of the variance

The pooled estimate of the variance comes from the formula:
        
          $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
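
The effect of weighting each sample variance by its degrees of freedom can be seen in a small Python sketch.  The function name pooled_variance and the numbers passed to it are hypothetical, chosen only to illustrate the weighting; they do not come from the text.

```python
def pooled_variance(var1, n1, var2, n2):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    df1 = n1 - 1
    df2 = n2 - 1
    return (df1 * var1 + df2 * var2) / (df1 + df2)


# Hypothetical sample variances (4.0 and 10.0) with unequal sample sizes:
# the larger sample carries more weight, so the result is closer to 10.0.
print(pooled_variance(4.0, 11, 10.0, 21))  # prints 8.0

# With equal sample sizes the pooled estimate is the simple average.
print(pooled_variance(4.0, 16, 10.0, 16))  # prints 7.0
```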

Standard error of the estimate

The standard error of the estimate is

        $$\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
           
Confidence interval

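The 100(1-alpha ) percent confidence interval for mu1 - mu2 is

        $$(\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

where t has n1 + n2 - 2 degrees of freedom.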

Example

(1) Given

        n1 = 13, x-bar1 = 21.0, s1 = 4.9 (s1 squared = 24.01)
        n2 = 17, x-bar2 = 12.1, s2 = 5.6 (s2 squared = 31.36)

(2) Calculations

  • The point estimate for mu1 - mu2 is x-bar1 - x-bar2

            x-bar1 - x-bar2 = 21.0 - 12.1 = 8.9

  • The pooled estimate of the variance is

            $$s_p^2 = \frac{12(4.9)^2 + 16(5.6)^2}{13 + 17 - 2} = \frac{789.88}{28} = 28.21$$

  • The standard error is

             $$\sqrt{\frac{28.21}{13} + \frac{28.21}{17}} = 1.9569$$

  • The 95% confidence interval is

              $$(\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
                        8.9 ± 2.0484 (1.9569)

                        8.9 ± 4.0085

                        (4.9, 12.9)

Discussion:  The value of t for a 95% confidence interval with n1 + n2 - 2 = 28 degrees of freedom is 2.0484.  We interpret the interval as follows: the difference between the two population means is estimated to be 8.9, and we are 95% confident that the true difference lies between 4.9 and 12.9.
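
The pooled-variance calculation can be retraced with a short Python sketch.  It treats 4.9 and 5.6 as sample standard deviations, which is the reading that reproduces the standard error of 1.9569 and the interval (4.9, 12.9) shown above.  The function name pooled_t_interval is hypothetical, and the t value is passed in as the tabled figure 2.0484 because the Python standard library has no t distribution.

```python
from math import sqrt


def pooled_t_interval(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Return (lower, upper, pooled_var, std_error) for mu1 - mu2.

    s1 and s2 are sample standard deviations (an assumption of this sketch);
    t_crit is the tabled t value for n1 + n2 - 2 degrees of freedom.
    """
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    std_error = sqrt(pooled_var / n1 + pooled_var / n2)
    margin = t_crit * std_error
    diff = xbar1 - xbar2
    return diff - margin, diff + margin, pooled_var, std_error


# Values from the example; 2.0484 is the t value for 28 degrees of freedom.
lower, upper, sp2, se = pooled_t_interval(21.0, 12.1, 4.9, 5.6, 13, 17, t_crit=2.0484)
print(f"pooled variance = {sp2:.2f}, standard error = {se:.4f}")
print(f"({lower:.1f}, {upper:.1f})")  # prints (4.9, 12.9)
```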