This file is part of a program based on the Bio 4835 Biostatistics class taught
at
Daniel, W. W. 1999. Biostatistics: a foundation for analysis in the
health sciences.
The file follows this text very closely and readers are encouraged to consult
the text for further information.
Hypothesis testing and estimation are used to reach conclusions about a
population by examining a sample of that population. Hypothesis testing
is widely used in medicine, dentistry, health care, biology and other fields as
a means to draw conclusions about the nature of populations.
The purpose of hypothesis testing is to provide information that helps in making
decisions. The administrative decision usually depends on a test between two
hypotheses, and decisions are based on the outcome of that test.
Definitions
Hypothesis: A hypothesis is a statement about one or more
populations. There are research hypotheses and statistical hypotheses.
Research hypotheses: A research hypothesis is the supposition or
conjecture that motivates the research. It may be proposed after numerous
repeated observations. Research hypotheses lead directly to statistical
hypotheses.
Statistical hypotheses: Statistical hypotheses are stated in such a way
that they may be evaluated by appropriate statistical techniques. There
are two statistical hypotheses involved in hypothesis testing.
Rules for hypothesis statements
1. Your expected conclusion, or what you hope to conclude as a result of
the experiment, should be placed in the alternative hypothesis.
2. The null hypothesis should contain an expression of equality, either
$=$, $\le$ or $\ge$.
3. The null hypothesis is the hypothesis that will be tested.
4. The null and alternative hypotheses are complementary. This
means that the two hypotheses together exhaust all possibilities of the
values that the hypothesized parameter can assume.
Note: Neither hypothesis testing nor statistical inference proves the
hypothesis. It only indicates whether the hypothesis is supported by the
data or not.
Example of test statistic
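Testing the mean using z:
$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$$
where
$\bar{x}$ = relevant statistic, the sample mean
$\mu_0$ = hypothesized parameter, the population mean
$\sigma_{\bar{x}}$ = standard error of $\bar{x}$, which is the relevant statistic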
This all depends on the assumptions being correct.
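The z statistic for testing a mean can be sketched in a few lines of Python. This is a minimal illustration that assumes the population standard deviation is known, so that the standard error of the sample mean is sigma divided by the square root of n. The function name `z_statistic` and all numbers are hypothetical, chosen only for illustration.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """Return z = (sample mean - hypothesized mean) / standard error."""
    # Assumption: sigma is the known population standard deviation.
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical values, not taken from the text.
z = z_statistic(sample_mean=27.0, hypothesized_mean=30.0, sigma=6.0, n=16)
print(z)  # -2.0
```

Here the standard error is 6/4 = 1.5, so a sample mean 3 units below the hypothesized mean lies 2 standard errors away.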
Level of significance
The level of significance, $\alpha$, is a probability: the probability of
rejecting a true null hypothesis. For example, with 95% confidence intervals,
$\alpha$ = .05, meaning that there is a 5% chance that the interval does not
contain the parameter. When that happens, an error is made and the conclusion
is false.
Significance and errors
When the computed value of the test statistic falls in the rejection region, it
is said to be significant. We select a small value of $\alpha$, such as
.10, .05 or .01, to make the probability of rejecting a true null hypothesis
small.
Types of errors
When a true null hypothesis is rejected, a Type I error is committed; its
probability is $\alpha$.
When a false null hypothesis is not rejected, a Type II error is committed;
its probability is designated by $\beta$.
A Type I error is considered to be more serious than a Type II error.
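The two error types can be made concrete with a small simulation. The sketch below assumes a two-sided z test on normally distributed data with known standard deviation; the helper names, the seed, and all parameter values are hypothetical. When the null hypothesis is true, the fraction of rejections estimates the Type I error probability and should land near the chosen significance level. When it is false, the fraction of non-rejections estimates the Type II error probability.

```python
import math
import random
from statistics import NormalDist, fmean

def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z test: True if H0: mu = mu0 is rejected at level alpha."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def rejection_rate(true_mu, mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if rejects_null(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials

# H0 is true here (true mean equals mu0), so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=50, mu0=50, sigma=10, n=25,
                             alpha=0.05, trials=4000)

# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=54, mu0=50, sigma=10, n=25,
                                 alpha=0.05, trials=4000)

print(type_1_rate, type_2_rate)
```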
Risk management
Since rejecting a null hypothesis carries a chance of committing a Type I error, we
make $\alpha$ small by selecting an appropriate confidence
level. Generally, we do not control $\beta$, even though it is
generally greater than $\alpha$. As a result, when we fail to
reject a null hypothesis, the risk of error is unknown.
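One way to see why the risk is unknown when the null hypothesis is not rejected: the probability of a Type II error depends on the true value of the parameter, which is exactly what we do not know. The sketch below computes that probability for a two-sided z test under several assumed true means. The function name and all numbers are hypothetical, and the known-sigma normal model is an assumption.

```python
import math
from statistics import NormalDist

def type_2_probability(true_mu, mu0, sigma, n, alpha):
    """Beta for a two-sided z test of H0: mu = mu0 when the mean is true_mu."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true and hypothesized means, in standard errors.
    shift = (true_mu - mu0) / (sigma / math.sqrt(n))
    # H0 is not rejected when z lands between -critical and +critical;
    # under the true mean, z is normal with mean `shift` and variance 1.
    return std_normal.cdf(critical - shift) - std_normal.cdf(-critical - shift)

# Beta shrinks as the true mean moves away from the hypothesized value of 50.
for true_mu in (51, 52, 54, 58):
    beta = type_2_probability(true_mu, mu0=50, sigma=10, n=25, alpha=0.05)
    print(true_mu, round(beta, 3))
```

For true means close to the hypothesized value, the computed probability is far larger than the significance level, which is consistent with the remark that it is generally the greater of the two.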
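Table of error conditions
                         Condition of null hypothesis
Possible action          True               False
Fail to reject $H_0$     Correct action     Type II error
Reject $H_0$             Type I error       Correct action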
Hypothesis testing and scientific reporting
In science, as in other disciplines, certain methods and procedures are
used for performing experiments and reporting results. A research report
in the biological sciences generally has five sections.
I. Introduction
The introduction contains a statement of the problem to be solved, a summary of
what is being done, a discussion of work done before and other basic background
for the paper.
II. Materials and methods
The biological, chemical and physical materials used in the experiments are
described. The procedures used are given or referenced so that the reader
may repeat the experiments if s/he so desires.
III. Results
This section deals with the outcomes of the experiments. The results are
reported and sometimes explained in this section. Other explanations are
placed in the discussion section.
IV. Discussion
The results are explained in terms of their relationship to the solution of the
problem under study and their meaning.
V. Conclusions
Appropriate conclusions are drawn from the information obtained as a result of
performing the experiments.
This method can be modified for use in biostatistics. The materials and
procedures used in biostatistics can be made to fit into these five categories.
Alternatively, we will use an approach that is similar in structure but
contains seven sections.
Procedure for hypothesis testing
(1) Data
(2) Assumptions
(3) Hypotheses
(4) Test statistic
(a) Distribution of test statistic
(b) Decision rule
(5) Calculation of test statistic
(6) Statistical decision
(7) Conclusion
Explanation of procedure for hypothesis testing
(1) Data
The data must be clearly stated and understood. Sometimes certain values
must be calculated before the hypothesis test begins. The data determine
what test statistic will be used.
(2) Assumptions
The appropriate test procedure, like a confidence interval, depends in part on
the assumptions being made. Examples include the assumptions that the population
is normally distributed, that samples are randomly drawn and independent, and
that the variances are equal.
(3) Hypotheses
Hypotheses are explicitly stated:
$H_0$: the null hypothesis
$H_A$: the alternative hypothesis
(4) Test statistic
The test statistic is a statistic that can be computed from the data of the
sample. Examples are z and t, which may be computed in several ways depending
on the data and the hypotheses to be tested.
(a) Distribution of test statistic
The key to statistical inference is the sampling distribution. Assuming
that the population is normally distributed and the other conditions are met, z
follows the standard normal distribution and t follows Student's t
distribution.
(b) Decision rule
Values of the test statistic form a distribution with a nonrejection region in
the center and a rejection region in the tails. The values in the rejection region are
less likely to occur if the null hypothesis is true. The decision rule
says to reject the null hypothesis if the value of the test statistic is in the
rejection region and not to reject the null hypothesis if it falls in the
nonrejection region.
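The decision rule can be written as a small function. This sketch assumes a z test, with the rejection region derived from the standard normal distribution. The function names and the `alternative` labels ("two-sided", "less", "greater") are hypothetical conventions introduced here, not part of the text.

```python
from statistics import NormalDist

def rejection_region(alpha, alternative="two-sided"):
    """Return (lower, upper) critical z values; None means no bound on that side."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    if alternative == "less":
        return (std_normal.inv_cdf(alpha), None)
    if alternative == "greater":
        return (None, std_normal.inv_cdf(1 - alpha))
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

def decide(z, alpha, alternative="two-sided"):
    """Apply the decision rule to a computed z value."""
    lower, upper = rejection_region(alpha, alternative)
    in_rejection_region = ((lower is not None and z < lower)
                           or (upper is not None and z > upper))
    return "reject H0" if in_rejection_region else "do not reject H0"

print(rejection_region(0.05))
print(decide(-2.0, 0.05))
print(decide(1.5, 0.05))
```

The rejection region is fixed from the significance level before the test statistic is computed, so the data cannot influence where the boundary is drawn.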
(5) Calculation of test statistic
The test statistic is calculated from the data in the sample and the result is
compared with the rejection and nonrejection regions that have previously been
specified.
(6) Statistical decision
The statistical decision consists of rejecting or not rejecting the null
hypothesis. It is rejected if the computed value of the test statistic
falls in the rejection region, and it is not rejected if the computed value of
the test statistic falls in the nonrejection region.
(7) Conclusion
If $H_0$ is rejected, we conclude that $H_A$ is true. If $H_0$ is
not rejected, we conclude that $H_0$ may be true.
One should be careful to say "$H_0$ may be true" and not to conclude that
"$H_0$ is true", because there is always a possibility that a Type
II error was made, meaning that a false null hypothesis was not rejected.
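The seven steps can be traced from start to finish in a short script. This is a sketch only: the sample values, the known standard deviation, the hypothesized mean of 30, and the .05 significance level are all invented for illustration, and a two-sided z test is assumed.

```python
import math
from statistics import NormalDist, fmean

# (1) Data: a hypothetical sample, invented for illustration.
sample = [27, 31, 22, 35, 24, 29, 26, 30, 21, 25]

# (2) Assumptions: random sample from a normal population with known sigma.
sigma = 4.0

# (3) Hypotheses: H0: mu = 30 versus HA: mu != 30.
mu0 = 30

# (4) Test statistic: z = (sample mean - mu0) / (sigma / sqrt(n)).
#     (a) Distribution: standard normal if H0 is true.
#     (b) Decision rule: reject H0 if |z| exceeds the critical value.
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)

# (5) Calculation of the test statistic.
n = len(sample)
z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))

# (6) Statistical decision.
reject = abs(z) > critical

# (7) Conclusion.
conclusion = "H0 rejected: conclude HA is true" if reject else "H0 may be true"
print(round(z, 2), round(critical, 2), conclusion)
```

With these made-up numbers the sample mean is 27 and z is about -2.37, which falls in the rejection region, so the null hypothesis is rejected.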
Purpose of hypothesis testing
The purpose of hypothesis testing is to provide information that helps in making
decisions. The administrative decision usually depends on the null
hypothesis. If the null hypothesis is rejected, the
administrative decision will usually follow the alternative hypothesis.
It is important to remember never to base a decision solely on the outcome of
a single test. Statistical testing can be used to provide additional
support for decisions based on other relevant information.
In this unit we will study hypothesis testing for six parameters. These will be the same six parameters studied using confidence intervals. It is important to remember that hypothesis testing and confidence intervals are closely related, like two sides of the same coin. The six parameters are as follows:
A) Hypothesis Testing of a Single Population Mean
B) Hypothesis Testing of the Difference Between Two Population Means
C) Hypothesis Testing of a Single Population Proportion
D) Hypothesis Testing of the Difference Between Two Population Proportions
E) Hypothesis Testing of a Single Population Variance
F) Hypothesis Testing of the Ratio of Two Population Variances
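The remark that hypothesis testing and confidence intervals are two sides of the same coin can be illustrated for the first parameter, a single population mean. The sketch below assumes a known population standard deviation and uses hypothetical summary values. It checks, for several hypothesized means, that a two-sided z test at level alpha rejects exactly when the hypothesized value lies outside the corresponding (1 - alpha) confidence interval.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, alpha):
    """Two-sided (1 - alpha) confidence interval for the mean, sigma known."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return (sample_mean - margin, sample_mean + margin)

def z_test_rejects(sample_mean, mu0, sigma, n, alpha):
    """Two-sided z test of H0: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical summary values, not taken from the text.
sample_mean, sigma, n, alpha = 27.0, 4.0, 10, 0.05
low, high = confidence_interval(sample_mean, sigma, n, alpha)

for mu0 in (24, 26, 28, 30):
    outside_interval = not (low <= mu0 <= high)
    rejected = z_test_rejects(sample_mean, mu0, sigma, n, alpha)
    print(mu0, outside_interval, rejected)
```

In each row the two flags agree, because both procedures compare the same distance between the sample mean and the hypothesized mean against the same multiple of the standard error.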