This file is part of a program based on the Bio 4835 Biostatistics class taught at
Daniel, W. W. 1999. Biostatistics: a foundation for analysis in the health sciences.
The file follows this text very closely and readers are encouraged to consult the text for further information.
E) Confidence interval for the variance of a normally distributed population
Measures of dispersion
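$E(s^2) = \sigma^2$ when sampling is with replacement
$E(s^2) = S^2$ when sampling is without replacement
Here $\sigma^2 = \sum (x_i - \mu)^2 / N$ and $S^2 = \sum (x_i - \mu)^2 / (N-1)$, where N is the population size and $\mu$ is the population mean.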
To understand replacement we can consider the New Jersey Pick 3 daily lottery. The Pick 3 lottery is a game of chance in which three balls, each numbered 0 to 9, are drawn to form the daily number. If a single lottery machine were used, the first ball drawn would have to be put back into the machine so that it has a chance of being drawn again. To avoid this, the lottery commission uses three machines, each with 10 balls numbered 0 through 9. Either way, every digit is available on every draw, so the sampling is with replacement.
The New Jersey Pick 6 is a game of chance drawn without replacement. The machine has balls numbered 1 through 49, and six balls are selected sequentially. As they roll out they become part of the six-number combination but are not replaced. That way, any given number that is drawn cannot be drawn a second time.
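The difference between the two lottery games can be sketched in a few lines of Python. This is only an illustration of the two sampling schemes described above; the function names pick3 and pick6 are made up for this example and the random draws do not model the actual lottery machines.

```python
import random

def pick3(rng):
    """Draw three digits from 0-9 WITH replacement: a digit may repeat."""
    return [rng.randrange(10) for _ in range(3)]

def pick6(rng):
    """Draw six numbers from 1-49 WITHOUT replacement: no number repeats."""
    return rng.sample(range(1, 50), 6)

rng = random.Random(4835)  # fixed seed (arbitrary) so the run is repeatable
print("Pick 3 (with replacement):   ", pick3(rng))
print("Pick 6 (without replacement):", pick6(rng))
```

Repeated calls to pick3 will sometimes show the same digit twice, while pick6 never returns a repeated number.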
Effects of large population size
When N is large, N and N-1 are approximately equal, so $\sigma^2$ and $S^2$ will be approximately equal. These results justify why $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$ can be used to compute the sample variance.
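The two expected values can be checked exactly on a very small population by listing every possible sample. The sketch below assumes a hypothetical population of five values (2, 4, 6, 8, 10) and samples of size 2; the population, the sample size, and all function names are illustrative and are not taken from the text.

```python
from fractions import Fraction
from itertools import combinations, product

def sample_variance(xs):
    """Sample variance s^2 with divisor n-1, computed exactly."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def population_variances(pop):
    """Return (sigma^2, S^2): divisors N and N-1 respectively."""
    N = len(pop)
    mu = Fraction(sum(pop), N)
    ss = sum((x - mu) ** 2 for x in pop)
    return ss / N, ss / (N - 1)

def expected_s2(pop, n, replacement):
    """Average s^2 over all equally likely samples of size n."""
    if replacement:
        samples = list(product(pop, repeat=n))   # ordered, repeats allowed
    else:
        samples = list(combinations(pop, n))     # no element used twice
    return sum(sample_variance(s) for s in samples) / len(samples)

population = [2, 4, 6, 8, 10]  # hypothetical population, N = 5
sigma2, S2 = population_variances(population)
print("sigma^2 =", sigma2, " E(s^2) with replacement    =", expected_s2(population, 2, True))
print("S^2     =", S2, " E(s^2) without replacement =", expected_s2(population, 2, False))
```

Exact fractions are used so that the equalities hold without rounding error. With N as small as 5 the two population quantities differ noticeably; as N grows, the ratio N/(N-1) approaches 1 and the distinction fades.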
Interval estimate of a population variance
The value of $s^2$ is used as a point estimator of the population variance, $\sigma^2$. Confidence intervals for $\sigma^2$ are based on the sampling distribution of $(n-1)s^2/\sigma^2$. If samples of size n are drawn from a normally distributed population, this quantity has a distribution known as the chi-square distribution with n-1 degrees of freedom. The assumption that the sample is drawn from a normally distributed population is crucial.
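A small simulation can make the chi-square claim more concrete. A chi-square distribution with k degrees of freedom has mean k and variance 2k, so for normal samples of size n the simulated values of $(n-1)s^2/\sigma^2$ should average close to n-1 with variance close to 2(n-1). The sketch below is only a rough check; the sample size, standard deviation, number of repetitions, seed, and the name simulate_pivot are all arbitrary choices made for illustration.

```python
import random
import statistics

def simulate_pivot(n, sigma, reps, seed=0):
    """Return `reps` simulated values of (n-1)*s^2/sigma^2 from normal samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)          # sample variance, divisor n-1
        values.append((n - 1) * s2 / sigma ** 2)
    return values

n = 12
values = simulate_pivot(n=n, sigma=2.0, reps=5000)
print("simulated mean:    ", statistics.mean(values), " (theory:", n - 1, ")")
print("simulated variance:", statistics.variance(values), " (theory:", 2 * (n - 1), ")")
```

The agreement is only approximate because the number of repetitions is finite. Repeating the experiment with samples from a clearly non-normal population would generally not reproduce these values, which is why the normality assumption matters.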
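Confidence interval on the $\chi^2$ distribution
The 100(1-$\alpha$) percent confidence interval for $(n-1)s^2/\sigma^2$ is two-sided. It lies between $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$, the values of the chi-square distribution with n-1 degrees of freedom that cut off an area of $\alpha/2$ in the lower and upper tails, respectively. This interval is given by
$$\chi^2_{\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}$$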
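Confidence interval for $\sigma^2$
From the sampling distribution of $(n-1)s^2/\sigma^2$ the confidence interval for $\sigma^2$ is derived. The formula is:
$$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}$$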
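The algebra behind this step can be written out as follows. Here $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ denote the lower and upper critical values of the chi-square distribution with n-1 degrees of freedom. Taking reciprocals of positive quantities reverses the inequalities, which is why the upper critical value ends up in the lower limit.

```latex
\begin{align*}
\chi^2_{\alpha/2} &< \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2} \\[4pt]
\frac{1}{\chi^2_{\alpha/2}} &> \frac{\sigma^2}{(n-1)s^2} > \frac{1}{\chi^2_{1-\alpha/2}} \\[4pt]
\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} &< \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}
\end{align*}
```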
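Confidence interval for $\sigma$
To get the 100(1-$\alpha$) percent confidence interval for $\sigma$, the population standard deviation, the square root of each term is taken. The result is the formula below.
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}$$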
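Both intervals are easy to compute once the two chi-square critical values are known. The sketch below uses only the Python standard library, which has no chi-square quantile function, so it assumes the critical values are looked up in a chi-square table and passed in by the caller. The function names variance_ci and std_dev_ci are hypothetical.

```python
import math
import statistics

def variance_ci(sample, chi2_lower, chi2_upper):
    """Confidence interval for the population variance sigma^2.

    chi2_lower and chi2_upper are the lower- and upper-tail critical values
    of the chi-square distribution with n-1 degrees of freedom, taken from
    a table (an assumption of this sketch, not computed here).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("critical values must satisfy 0 < lower < upper")
    numerator = (n - 1) * statistics.variance(sample)  # (n-1) * s^2
    # The larger critical value gives the lower limit, and vice versa.
    return numerator / chi2_upper, numerator / chi2_lower

def std_dev_ci(sample, chi2_lower, chi2_upper):
    """Confidence interval for sigma: square roots of the variance limits."""
    low, high = variance_ci(sample, chi2_lower, chi2_upper)
    return math.sqrt(low), math.sqrt(high)
```

For a sample of 12 observations and a 95 percent interval there are 11 degrees of freedom, and the tabulated critical values are 3.816 and 21.920, so the call would be variance_ci(sample, 3.816, 21.920). If SciPy is available, scipy.stats.chi2.ppf could supply the critical values instead of a table.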
In a study on cholesterol levels a sample of 12 men and women was chosen. The plasma cholesterol levels (mmol/L) of the subjects were as follows: 6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, and 5.8. We assume that these 12 subjects constitute a simple random sample of a population of similar subjects. We wish to estimate the variance of the plasma cholesterol levels with a 95 percent confidence interval.
6.0 6.4 7.0 5.8 6.0 5.8
5.9 6.7 6.1 6.5 6.3 5.8
Estimate the variance with a 95% confidence interval.
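s = 0.3918680978, so $s^2$ = 0.1535606
With n = 12 there are 11 degrees of freedom, and the chi-square table gives $\chi^2_{.025}$ = 3.816 and $\chi^2_{.975}$ = 21.920. The interval is
$$\frac{11(0.1535606)}{21.920} < \sigma^2 < \frac{11(0.1535606)}{3.816}$$
$$0.0771 < \sigma^2 < 0.4427$$
Discussion: The value of $s^2$ from the data can be used as a point estimate of the population variance, $\sigma^2$. The population variance is estimated to be 0.1536. From the calculation we are 95% confident that the true population variance is between 0.0771 and 0.4427.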