Confidence Interval for the Variance of a Normally Distributed Population


This file is part of a program based on the Bio 4835 Biostatistics class taught at Kean University in Union, New Jersey.  The course uses the following text:
Daniel, W. W. 1999.  Biostatistics: a foundation for analysis in the health sciences.  New York: John Wiley and Sons.  
The file follows this text very closely and readers are encouraged to consult the text for further information.

E) Confidence interval for the variance of a normally distributed population

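Measures of dispersion

For a finite population of N values with mean mu, two measures of dispersion are defined.  The first, sigma squared, divides the sum of squared deviations by N; the second, S squared, divides it by N-1:

        $$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$

        $$S^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N-1}$$

The expected value of the sample variance, s squared, depends on how the sample is drawn:

        E(s squared) = sigma squared when sampling is with replacement.
        E(s squared) = S squared when sampling is without replacement.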

To understand replacement, consider the New Jersey Pick 3 daily lottery.  Pick 3 is a game of chance in which three digits, each from 0 to 9, are drawn to form the daily number.  Because the same digit may appear more than once, the drawing must behave like sampling with replacement.  If it were done with one lottery machine, the first ball drawn would have to be put back into the machine so that it has a chance of being drawn again.  To avoid this, the lottery commission uses three machines, each with 10 balls numbered 0 through 9.

The New Jersey Pick 6 is a game of chance drawn without replacement.  The machine has balls numbered 1 through 49, and six balls are selected sequentially.  As they roll out they become part of the set of six winning numbers and are not replaced.  That way, any number that has been drawn cannot be drawn a second time.
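The two expectations stated under "Measures of dispersion" can be checked by brute force on a small population.  The sketch below is only an illustration: the five-value population and the sample size of 2 are made-up assumptions, not data from the course text.  It lists every equally likely ordered sample, with and without replacement, and averages the sample variance over each list.

```python
from itertools import product, permutations
from statistics import mean, variance, pvariance

# Hypothetical finite population, chosen only for illustration.
population = [2, 4, 6, 8, 10]
n = 2  # sample size

sigma_sq = pvariance(population)  # sum of squared deviations divided by N
S_sq = variance(population)       # sum of squared deviations divided by N - 1

# Every ordered sample of size n drawn WITH replacement.
with_repl = list(product(population, repeat=n))
# Every ordered sample of size n drawn WITHOUT replacement.
without_repl = list(permutations(population, n))

# Average the sample variance s^2 (divisor n - 1) over all samples.
mean_s2_with = mean(variance(s) for s in with_repl)
mean_s2_without = mean(variance(s) for s in without_repl)

print("sigma^2 =", sigma_sq, " mean of s^2 with replacement    =", mean_s2_with)
print("S^2     =", S_sq, " mean of s^2 without replacement =", mean_s2_without)
```

Full enumeration is used instead of random sampling so that the averages are exact rather than approximate; it is practical only for very small populations.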

Effects of large population size

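When N is large, N and N-1 are approximately equal, so sigma squared and S squared will be approximately equal.  These results provide some justification for using n-1 as the divisor when computing the sample variance, s squared: with that divisor, s squared is an unbiased estimator of sigma squared when sampling is with replacement, and of S squared (nearly the same quantity when N is large) when sampling is without replacement.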
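The size of the difference between the two population measures can be written out explicitly.  The short derivation below assumes the usual definitions, in which sigma squared divides the sum of squared deviations from the population mean by N and S squared divides the same sum by N-1.

```latex
S^2 = \frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N-1}
    = \frac{N}{N-1}\cdot\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}
    = \frac{N}{N-1}\,\sigma^2
```

For a population of N = 1000, for instance, the factor N/(N-1) is about 1.001, so the two measures differ by roughly one tenth of one percent.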

Interval estimate of a population variance

The value of s squared is used as a point estimator of the population variance, sigma squared.  Confidence intervals for sigma squared are based on the sampling distribution of (n-1)s squared/sigma squared.  If samples of size n are drawn from a normally distributed population, this quantity has a distribution known as the chi-square distribution with n-1 degrees of freedom.  The assumption that the sample is drawn from a normally distributed population is crucial, because the chi-square result does not hold otherwise.
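A small simulation can make the sampling distribution more concrete.  The sketch below repeatedly draws samples of size n from a normal population and computes (n-1)s squared/sigma squared for each one.  The population parameters, the number of repetitions, and the seed are arbitrary assumptions chosen for illustration.  A chi-square distribution with n-1 degrees of freedom has mean n-1 and variance 2(n-1), so the simulated values should come out near those figures.

```python
import random
from statistics import mean, variance

random.seed(4835)  # fixed seed so the run is repeatable; the value is arbitrary

mu, sigma = 6.0, 0.4  # hypothetical population mean and standard deviation
n = 12                # sample size
reps = 5000           # number of simulated samples

q_values = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    # The quantity whose sampling distribution is chi-square with n - 1 df.
    q_values.append((n - 1) * variance(sample) / sigma**2)

print("mean of simulated values:    ", mean(q_values), "(theory:", n - 1, ")")
print("variance of simulated values:", variance(q_values), "(theory:", 2 * (n - 1), ")")
```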

Confidence interval on the chi-squared distribution

The 100(1 - alpha) percent confidence interval for (n-1)s squared/sigma squared is the two-tailed interval between the lower and upper chi-squared limits, each of which cuts off an area of alpha/2 in its tail of the chi-squared distribution with n-1 degrees of freedom.  This interval is given by

        $$\chi^2_{\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}$$
    
Confidence interval for sigma squared and sigma

From the sampling distribution of (n-1)s squared/sigma squared, the confidence interval for sigma squared is derived by solving the inequality for sigma squared.  The formula is:

        $$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}$$
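The algebra behind this step is short and can be sketched as follows.  The subscripts follow the convention used in the example later in these notes, where the subscript is the area to the left of the tabled chi-squared value.  Every quantity is positive, so taking reciprocals reverses the direction of the inequalities and swaps the two limits.

```latex
\begin{aligned}
\chi^2_{\alpha/2} &< \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2} \\[4pt]
\frac{1}{\chi^2_{1-\alpha/2}} &< \frac{\sigma^2}{(n-1)s^2} < \frac{1}{\chi^2_{\alpha/2}} \\[4pt]
\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} &< \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}
\end{aligned}
```

This is why the larger chi-squared value appears in the lower limit for sigma squared and the smaller value in the upper limit.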

    
Confidence interval for sigma

To get the 100(1 - alpha) percent confidence interval for sigma, the population standard deviation, the square root of each term is taken.  The result is the formula below.

        $$\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}$$

Example

In a study on cholesterol levels a sample of 12 men and women was chosen.  The plasma cholesterol levels (mmol/L) of the subjects were as follows: 6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, and 5.8.  We assume that these 12 subjects constitute a simple random sample of a population of similar subjects.  We wish to estimate the variance of the plasma cholesterol levels with a 95 percent confidence interval.

Solution

(1) Given

        6.0  6.4  7.0  5.8  6.0  5.8
        5.9  6.7  6.1  6.5  6.3  5.8

Estimate the variance with a 95% confidence interval.

(2)  Calculations

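  • Value of s-squared

            s-squared = .1535606    (s = .3918680978)

  • Values of chi-squared from table (n-1 = 11 degrees of freedom)

            chi-squared .975 = 21.920

            chi-squared .025 = 3.816


  • Calculation of the confidence interval

            $$\frac{11(.1535606)}{21.920} < \sigma^2 < \frac{11(.1535606)}{3.816}$$

            $$.0771 < \sigma^2 < .4427$$

Discussion:  The value of s-squared from the data can be used as a point estimate of the population variance, sigma squared.  We say that the population variance is estimated to be .1536.  From the calculation we are 95% confident that the true population variance is between .0771 and .4427.  Taking square roots, the population standard deviation is estimated to be .3919, and we are 95% confident that it lies between .2776 and .6653.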
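The arithmetic in this example can be reproduced with a short script.  The sketch below uses only the Python standard library and takes the two chi-squared table values as inputs rather than computing them; `variance_ci` is a hypothetical helper name introduced here for illustration, not something from the course text.

```python
from math import sqrt
from statistics import variance

# Plasma cholesterol levels (mmol/L) of the 12 subjects in the example.
levels = [6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8]

def variance_ci(data, chi2_lower, chi2_upper):
    """Return (s^2, lower limit, upper limit) for the population variance.

    chi2_lower and chi2_upper are the tabled chi-squared values, with
    n - 1 degrees of freedom, that cut off alpha/2 in each tail.
    """
    n = len(data)
    s2 = variance(data)   # sample variance, divisor n - 1
    ss = (n - 1) * s2     # sum of squared deviations from the sample mean
    # The larger chi-squared value gives the lower limit, and vice versa.
    return s2, ss / chi2_upper, ss / chi2_lower

# Table values for 11 degrees of freedom, as quoted in the example.
s2, lower, upper = variance_ci(levels, chi2_lower=3.816, chi2_upper=21.920)

print(f"s^2 = {s2:.7f}   s = {sqrt(s2):.7f}")
print(f"95% CI for sigma^2: {lower:.4f} to {upper:.4f}")
print(f"95% CI for sigma:   {sqrt(lower):.4f} to {sqrt(upper):.4f}")
```

If SciPy happens to be installed, the table lookup could instead be done with `scipy.stats.chi2.ppf(0.025, 11)` and `scipy.stats.chi2.ppf(0.975, 11)`; that substitution is an optional assumption and is not needed to run the sketch.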