This file is part of a program based on the Bio 4835 Biostatistics class taught
at

Daniel, W. W. 1999. Biostatistics: a foundation for analysis in the
health sciences.

The file follows this text very closely and readers are encouraged to consult
the text for further information.

**E) Confidence interval for the variance of a normally distributed population**

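Measures of dispersion

For a finite population of size N with mean $\mu$, two forms of the population variance are used: $\sigma^2 = \sum (x_i - \mu)^2 / N$ and $S^2 = \sum (x_i - \mu)^2 / (N-1)$. The expected value of the sample variance $s^2$ depends on how the sample is drawn:

$E(s^2) = \sigma^2$ when sampling is with replacement.

$E(s^2) = S^2$ when sampling is without replacement.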

To understand replacement, consider the New Jersey Pick 3 daily lottery. Pick 3 is a game of chance in which three balls, each numbered from 0 to 9, are drawn to form the daily number. With a single lottery machine, each ball drawn would have to be put back into the machine so that it has a chance of being drawn again. To avoid this, the lottery commission uses three machines, each with 10 balls numbered 0 through 9.

The New Jersey Pick 6 is a game of chance drawn without replacement. The machine has balls numbered 1 through 49, and six balls are selected sequentially. As they roll out they become part of the six-number combination but are not replaced. That way, any number that has been drawn cannot be drawn a second time.
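The two lottery designs can be mimicked in a few lines of Python. This is only a sketch: the seed and the variable names `pick3` and `pick6` are illustrative assumptions, and the standard `random` module stands in for the lottery machines.

```python
import random

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable

# Pick 3: three independent draws of a digit 0-9, like three separate machines.
# This is sampling WITH replacement, so a digit may repeat.
pick3 = [rng.randrange(10) for _ in range(3)]

# Pick 6: six balls taken from one machine holding 1-49, none put back.
# This is sampling WITHOUT replacement, so no number can repeat.
pick6 = rng.sample(range(1, 50), 6)

print("Pick 3:", pick3)
print("Pick 6:", sorted(pick6))
```

Running the Pick 3 line many times will sometimes produce repeated digits, while the Pick 6 draw never can.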

When N is large, N and N − 1 are approximately equal, so $\sigma^2$ and $S^2$ will be approximately equal. These results justify why $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$ can be used to compute the sample variance.
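The two expectations can be checked exactly on a small population by listing every possible sample. The sketch below assumes a hypothetical population of five values and samples of size 2; neither comes from the text. It uses `Fraction` so the comparison is exact, with `pvariance` giving the divisor-N variance and `variance` the divisor-(N − 1) variance.

```python
from fractions import Fraction
from itertools import combinations, product
from statistics import mean, pvariance, variance

# Hypothetical population (an assumption for illustration only).
population = [Fraction(v) for v in (2, 4, 6, 8, 10)]
n = 2  # sample size

sigma2 = pvariance(population)  # sum of squares divided by N
S2 = variance(population)       # sum of squares divided by N - 1

# Every ordered sample drawn with replacement (N**n of them).
s2_with = [variance(s) for s in product(population, repeat=n)]

# Every sample drawn without replacement (order does not affect s^2).
s2_without = [variance(s) for s in combinations(population, n)]

mean_with = mean(s2_with)        # average of s^2 over all with-replacement samples
mean_without = mean(s2_without)  # average of s^2 over all without-replacement samples

print("sigma^2 =", sigma2, " mean of s^2 with replacement    =", mean_with)
print("S^2     =", S2, " mean of s^2 without replacement =", mean_without)
```

For this population the average of $s^2$ over all with-replacement samples equals the divisor-N variance, and the average over all without-replacement samples equals the divisor-(N − 1) variance.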

The value of $s^2$ is used as a point estimator of the population variance, $\sigma^2$. Confidence intervals for $\sigma^2$ are based on the sampling distribution of $(n-1)s^2/\sigma^2$. If samples of size n are drawn from a normally distributed population, this quantity has a distribution known as the chi-square ($\chi^2$) distribution with n − 1 degrees of freedom.

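The 100(1 − $\alpha$) percent confidence interval for $(n-1)s^2/\sigma^2$ is the two-tailed interval between $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$, the values of $\chi^2$ with n − 1 degrees of freedom that have areas of $\alpha/2$ and $1-\alpha/2$, respectively, to their left. This interval is given by

$$\chi^2_{\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}$$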
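A small simulation can make this sampling distribution concrete. The sketch below assumes a normal population with a hypothetical mean of 6.0 and standard deviation of 0.4, and draws repeated samples of size 12. For each sample it computes $(n-1)s^2/\sigma^2$ and then checks two properties of a chi-square variable with 11 degrees of freedom: its mean is 11, and about 95 percent of values fall between the table values 3.816 and 21.920 used in the example in this section. The function name `pivot`, the seed, and the number of repetitions are illustrative choices.

```python
import random

def pivot(sample, sigma2):
    """Return (n-1)s^2 / sigma^2 for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    sum_sq = sum((x - xbar) ** 2 for x in sample)  # equals (n-1) * s^2
    return sum_sq / sigma2

rng = random.Random(4835)        # arbitrary fixed seed
n, mu, sigma = 12, 6.0, 0.4      # hypothetical population parameters
reps = 20000

values = [
    pivot([rng.gauss(mu, sigma) for _ in range(n)], sigma ** 2)
    for _ in range(reps)
]

mean_pivot = sum(values) / reps
coverage = sum(3.816 < v < 21.920 for v in values) / reps

print("mean of (n-1)s^2/sigma^2:", round(mean_pivot, 2))
print("share between 3.816 and 21.920:", round(coverage, 3))
```

The results vary slightly with the seed, but the mean should sit near 11 and the share near 0.95.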

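From the sampling distribution of $(n-1)s^2/\sigma^2$, the confidence interval for $\sigma^2$ is derived by solving the inequality for $\sigma^2$. The formula is:

$$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}$$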
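The algebra behind this step is short and can be written out as follows. Here $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ denote the lower and upper chi-square table values for n − 1 degrees of freedom; the layout of the steps is a sketch, not a quotation from the text.

```latex
\begin{align*}
\chi^2_{\alpha/2} &< \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2} \\[4pt]
\frac{\chi^2_{\alpha/2}}{(n-1)s^2} &< \frac{1}{\sigma^2} < \frac{\chi^2_{1-\alpha/2}}{(n-1)s^2}
  && \text{divide each term by } (n-1)s^2 > 0 \\[4pt]
\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} &< \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}
  && \text{take reciprocals, which reverses the inequalities}
\end{align*}
```

Because taking reciprocals reverses the order, the larger table value ends up in the lower limit and the smaller table value in the upper limit.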

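Confidence interval for $\sigma$

To get the 100(1 − $\alpha$) percent confidence interval for $\sigma$, the population standard deviation, the square root of each term is taken. The result is the formula below.

$$\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}$$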

Example

In a study on cholesterol levels a sample of 12 men and women was chosen. The plasma cholesterol levels (mmol/L) of the subjects were as follows: 6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, and 5.8. We assume that these 12 subjects constitute a simple random sample of a population of similar subjects. We wish to estimate the variance of the plasma cholesterol levels with a 95 percent confidence interval.

Solution

(1) Given

6.0 6.4 7.0 5.8 6.0 5.8

5.9 6.7 6.1 6.5 6.3 5.8

Estimate the variance with a 95% confidence interval.

(2) Calculations

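- Value of $s^2$

s = .3918680978, so $s^2 = (.3918680978)^2 = .1535606$

- Values of $\chi^2$ from table (n − 1 = 11 degrees of freedom)

$\chi^2_{1-\alpha/2} = \chi^2_{.975} = 21.920$

$\chi^2_{\alpha/2} = \chi^2_{.025} = 3.816$

- Calculation of the confidence interval

$$\frac{11(.1535606)}{21.920} < \sigma^2 < \frac{11(.1535606)}{3.816}$$

$$.0771 < \sigma^2 < .4427$$

Discussion: The value of $s^2$ from the data can be used as a point estimate of the population variance, $\sigma^2$. We say that the population variance is estimated to be .1536. The value .391868 is the sample standard deviation, s, and must be squared before it enters the interval. From the calculation we are 95% confident that the true population variance is between .0771 and .4427.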
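The arithmetic in this example can be reproduced with a short Python sketch. The function name `variance_ci` and its parameter names are illustrative assumptions. The chi-square values are the table values quoted above for 11 degrees of freedom, passed in by hand so that only the standard library is needed.

```python
from math import sqrt
from statistics import variance

def variance_ci(data, chi2_lower, chi2_upper):
    """Return (s^2, lower limit, upper limit) for the population variance.

    chi2_lower and chi2_upper are the table values chi^2_{alpha/2} and
    chi^2_{1-alpha/2} for len(data) - 1 degrees of freedom.
    """
    n = len(data)
    s2 = variance(data)        # sample variance, divisor n - 1
    sum_sq = (n - 1) * s2
    # The larger table value gives the lower limit, and vice versa.
    return s2, sum_sq / chi2_upper, sum_sq / chi2_lower

cholesterol = [6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8]

s2, var_low, var_high = variance_ci(cholesterol, chi2_lower=3.816, chi2_upper=21.920)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)  # interval for sigma

print(f"s = {sqrt(s2):.6f}, s^2 = {s2:.6f}")
print(f"95% CI for the variance: {var_low:.4f} to {var_high:.4f}")
print(f"95% CI for the standard deviation: {sd_low:.4f} to {sd_high:.4f}")
```

Taking the square root of each limit, as in the last step, gives the corresponding interval for the population standard deviation.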