This file is part of a
program based on the Bio 4835 Biostatistics class taught at
Daniel, W. W. 1999. Biostatistics: a foundation for analysis in the
health sciences.
The file follows this text very closely and readers are encouraged to consult
the text for further information.
B) Confidence interval for the difference of two population means
Introduction
From each of two populations an independent random sample is drawn.
The sample means, $\bar{x}_1$ and $\bar{x}_2$, are calculated.
The difference $\bar{x}_1 - \bar{x}_2$ is an unbiased estimator of the
difference between the two population means, $\mu_1 - \mu_2$.
The variance of the estimator is $(\sigma_1^2 / n_1) + (\sigma_2^2 / n_2)$.
Conditions for use
Assuming the populations are normally distributed, there are three situations
where we would determine the 100(1-$\alpha$) percent confidence interval for
$\mu_1 - \mu_2$:
a) where the population variances are known (use z)
b) where the population variances are unknown but equal (use t)
c) where the population variances are unknown but unequal (use t')
There is some acceptance of t' for cases such as these. The concept of t' is
noted here so that readers are aware of its existence, but it will not be
treated further in this narrative.
Illustrative Examples
Situation a) Population variances are known (z is used)
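When the population variances are known, the 100(1-$\alpha$) percent
confidence interval for $\mu_1 - \mu_2$ is given by
$$(\bar{x}_1 - \bar{x}_2) \pm z_{1-\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$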
Example 6.4.1
A research team is interested in the difference between serum uric acid levels
in patients with and without Down's syndrome. In a large hospital for the
treatment of the mentally retarded, a sample of 12 individuals with Down's
syndrome yielded a mean of $\bar{x}_1$ = 4.5 mg/100 ml. In a
general hospital a sample of 15 normal individuals of the same age and sex were
found to have a mean value of $\bar{x}_2$ = 3.4 mg/100 ml. If it is
reasonable to assume that the two populations of values are normally
distributed with variances equal to 1 and 1.5, find the 95 percent confidence
interval for $\mu_1 - \mu_2$.
(1) Given
$n_1$ = 12, $\bar{x}_1$ = 4.5, $\sigma_1^2$ = 1
$n_2$ = 15, $\bar{x}_2$ = 3.4, $\sigma_2^2$ = 1.5
(2) Calculations
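$\bar{x}_1 - \bar{x}_2$ = 4.5 - 3.4 = 1.1
$\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} = \sqrt{1/12 + 1.5/15}$ = 0.4282
1.1 ± 1.96 (0.4282)
1.1 ± 0.8393
(0.26, 1.94)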
Discussion: Because this is a z-interval, the correct value of z for 95%
confidence is 1.96. We interpret the interval as follows: the difference
between the two population means is estimated to be 1.1, and we are 95%
confident that the true difference lies between 0.26 and 1.94.
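The arithmetic of Example 6.4.1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `z_interval` and its parameter names are illustrative choices, not part of the original text.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean1, mean2, var1, var2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both population variances are known."""
    diff = mean1 - mean2
    # Standard error of the difference between two independent sample means.
    se = sqrt(var1 / n1 + var2 / n2)
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * se, diff + z * se


# Values taken from Example 6.4.1.
lower, upper = z_interval(4.5, 3.4, 1.0, 1.5, 12, 15)
print(f"({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the limits should agree with the interval (0.26, 1.94) obtained by hand.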
Situation b) Population variances are unknown but can be assumed to be equal
(t is used)
If it can be assumed that the population variances are equal then each sample
variance is actually a point estimate of the same quantity. Therefore, we
can combine the sample variances to form a pooled estimate.
Weighted averages
The pooled estimate of the common variance is made
using weighted averages. This means that each sample variance is weighted
by its degrees of freedom.
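The weighting just described can be sketched in a few lines of Python. The function name `pooled_variance` and the input values are hypothetical, chosen only to make the weighting visible; the function assumes it is given sample standard deviations and squares them.

```python
def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate of a common variance from two sample standard deviations."""
    df1, df2 = n1 - 1, n2 - 1
    # Weighted average of the two sample variances; weights are degrees of freedom.
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)


# Hypothetical inputs: sample variances 4 and 16 with 5 and 10 degrees of freedom.
sp2 = pooled_variance(s1=2.0, s2=4.0, n1=6, n2=11)
print(sp2)  # 12.0
```

The result, 12.0, falls between the two sample variances and sits closer to 16 because the second sample carries more degrees of freedom.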
Pooled estimate of the variance
The pooled estimate of the variance comes from the formula:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Standard error of the estimate
The standard error of the estimate is
$$\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
Confidence interval
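The 100(1-$\alpha$) percent confidence interval for $\mu_1 - \mu_2$ is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
where t has $n_1 + n_2 - 2$ degrees of freedom.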
Example
(1) Given
$n_1$ = 13, $\bar{x}_1$ = 21.0, $s_1$ = 4.9
$n_2$ = 17, $\bar{x}_2$ = 12.1, $s_2$ = 5.6
(2) Calculations
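$\bar{x}_1 - \bar{x}_2$ = 21.0 - 12.1 = 8.9
$s_p^2 = \frac{(12)(4.9)^2 + (16)(5.6)^2}{13 + 17 - 2}$ = 28.21
$\sqrt{28.21/13 + 28.21/17}$ = 1.9569
8.9 ± 2.0484 (1.9569)
8.9 ± 4.0085
(4.9, 12.9)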
Discussion: The correct value of t to use for a 95% confidence interval
with 28 degrees of freedom is 2.0484. We interpret the interval as follows: the
difference between the two population means is estimated to be 8.9, and we are
95% confident that the true difference lies between 4.9 and 12.9.
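The pooled-variance example can be reproduced with a small script. This is a sketch under two assumptions: the values 4.9 and 5.6 are treated as sample standard deviations (which is what yields the standard error 1.9569 quoted above), and the critical value 2.0484 is passed in from a t table rather than computed, since the Python standard library has no t distribution. The function name `pooled_t_interval` is illustrative.

```python
from math import sqrt


def pooled_t_interval(mean1, mean2, s1, s2, n1, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal but unknown population variances."""
    diff = mean1 - mean2
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference, using the pooled variance for both samples.
    se = sqrt(sp2 / n1 + sp2 / n2)
    margin = t_crit * se
    return diff - margin, diff + margin, se, df


# Values from the example; t_crit = 2.0484 is the tabled value for 28 df, 95%.
lower, upper, se, df = pooled_t_interval(21.0, 12.1, 4.9, 5.6, 13, 17, t_crit=2.0484)
print(df, round(se, 4), f"({lower:.1f}, {upper:.1f})")
```

The script should report 28 degrees of freedom, a standard error near 1.9569, and limits that round to (4.9, 12.9). If SciPy is available, `scipy.stats.t.ppf(0.975, 28)` could supply the critical value instead of the table.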