Precision and sample-size analysis for
confidence intervals
ciwidth
January 23, 2020
Overview
Introduction
Precision and sample-size analysis
Overview of ciwidth
Inference using confidence intervals
Examples
One population mean
One population variance
Two paired means
Two independent means
Sensitivity analysis
Custom tables and precision graphs
Adding your own methods to ciwidth
Summary
Introduction to precision and
sample-size analysis
Precision and sample-size analysis
Hypothesis tests for inference
Test that a population parameter of interest is equal, or smaller, or larger than a
specified value
Power analysis estimates the required sample size for a future study to ensure
that the test will have high power (chance) to detect when the parameter
is different from the specified value
Confidence intervals for inference
Estimate an interval for the population parameter
Precision analysis estimates the required sample size for a future study to ensure
that the estimated interval will have the desired precision so that it is not too wide
Like hypothesis tests, confidence intervals are data dependent and so their precision
will vary across samples
Use precision and sample-size analysis to account for the variability, and plan a
study with the desired precision
Precision and sample-size analysis for confidence intervals is analogous to power and
sample-size analysis for hypothesis tests
Precision and sample-size analysis
How many subjects would be required to ensure that the
confidence interval for the mean is no wider than 2?
If we only have enough resources to include 40 subjects in
our study, what kind of precision would we expect our
confidence interval to have?
How would this precision change as we increase our sample
size to 50, 60, and 70?
The ciwidth command will help you answer these questions
and more
Overview of ciwidth
Perform precision and sample-size analysis for confidence intervals
Population mean
Population variance
Comparison of means from independent samples
Comparison of means from paired samples
Compute
Sample size
CI precision
Probability of CI precision
Perform sensitivity analysis
Present results graphically or in a table
Customize graphs and tables
Add your own method
Easily create tables and graphs as you would with official ciwidth commands
Computation of a
confidence interval
Confidence intervals
Computation
Confidence level
95%, 97%, other
Sample size
How many participants can you afford to have in your study?
Standard deviation
How did weight loss vary in the sample?
Precision
Measured by the confidence-interval width
Ensured by the probability of CI width
Confidence interval for a population mean
A 100(1 − α)% CI:
$$\left[\ \bar{x} - t_{n-1,\,1-\alpha/2}\left(\frac{s}{\sqrt{n}}\right),\ \ \bar{x} + t_{n-1,\,1-\alpha/2}\left(\frac{s}{\sqrt{n}}\right)\ \right]$$
α = significance level
s = sample standard deviation
n = sample size
Each term $t_{n-1,\,1-\alpha/2}\,(s/\sqrt{n})$ is one-half of the CI width.
Sample mean   Width   Sample mean ± half-width   95% confidence interval
5             10      5 - 5, 5 + 5               [0, 10]
5             8       5 - 4, 5 + 4               [1, 9]
5             4       5 - 2, 5 + 2               [3, 7]
Choose the values of the confidence
level (100(1 − α)%), sample standard
deviation, and sample size
that will provide the desired level of
precision, given the resources you have
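As a worked illustration of how these three inputs drive precision, the width of the two-sided interval for the mean can be written out directly. The values s = 6 and n = 30 below are illustrative assumptions, and the t quantiles are rounded.

```latex
% Width of the two-sided t-based CI for a mean
W = 2\, t_{n-1,\,1-\alpha/2}\, \frac{s}{\sqrt{n}}

% Illustrative values (assumed): 95% level, s = 6, n = 30, t_{29,\,0.975} \approx 2.045
W \approx 2 \times 2.045 \times \frac{6}{\sqrt{30}} \approx 4.48

% Quadrupling the sample size to n = 120 (t_{119,\,0.975} \approx 1.980) roughly halves the width
W \approx 2 \times 1.980 \times \frac{6}{\sqrt{120}} \approx 2.17
```

Because s is itself estimated from the sample, the width obtained in a real study varies from sample to sample, which is why the probability of CI width enters the analysis.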
A first example
Precision analysis for a population mean
How long do plug-in air fresheners last?
How many air fresheners would we need to
test to obtain a two-sided 95% CI for the
mean scent duration with a width no larger
than 4 days?
Computing sample size for a population
mean
Computing CI width for a population mean
How big of an interval width would we
expect if we could only afford to
sample 30 air fresheners?
Computing probability of CI width for a
population mean
What's the probability that we'll obtain
a desired CI width of 4, if we only
sample 30 air fresheners?
Syntax overview for a one-mean CI
Estimate sample size
ciwidth onemean, sd(6) probwidth(0.96) width(4)
Estimate CI width
ciwidth onemean, sd(6) probwidth(0.96) n(30)
Estimate probability of CI width
ciwidth onemean, sd(6) width(4) n(30)
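The three computations can be chained in a short do-file. This is a sketch rather than the presenter's own session: it reuses the values from the commands above and assumes that ciwidth leaves the computed sample size in r(N), in line with the naming conventions for stored results shown later in these slides. The local macro name n_required is hypothetical.

```stata
* Step 1: sample size so that a 95% CI is no wider than 4 with probability 0.96
ciwidth onemean, sd(6) probwidth(0.96) width(4)
local n_required = r(N)    // assumed stored result; hypothetical macro name

* Step 2: width we can expect (with probability 0.96) if only 30 units are sampled
ciwidth onemean, sd(6) probwidth(0.96) n(30)

* Step 3: probability that a study of 30 units achieves a width of 4 or less
ciwidth onemean, sd(6) width(4) n(30)

display "Sample size required by step 1: `n_required'"
```

Saving r(N) right after step 1 matters because each later ciwidth call replaces the stored results.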
General syntax for ciwidth
Estimate sample size
ciwidth method …, probwidth(numlist) width(numlist) [options]
Estimate CI width
ciwidth method …, probwidth(numlist) n(numlist) [options]
Estimate probability of CI width
ciwidth method …, width(numlist) n(numlist) [options]
method:
onemean
onevariance
twomeans
pairedmeans
usermethod
options:
table
graph
lower
upper
Finite populations
Infinite vs. finite population
We’ve been performing our analyses for a confidence
interval for the mean scent duration of all plug-in air
fresheners (infinite population).
But the parameters you are estimating may be for a
fixed population. For example, only 500 units of the
Bugatti Chiron were manufactured (finite population).
Apply a finite population correction with the fpc() option.
Finite population
Suppose you work for this car manufacturer and
you’re designing a study to estimate a confidence
interval for the maximum speed of this limited
edition model.
What is the largest CI width you should expect
for the maximum speed if you sample only
50 of the 500 cars that were manufactured,
assuming a standard deviation of 16 mph?
Compute CI width using a finite population correction
Alternative specification for a finite population
correction
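A sketch of what the two specifications might look like for the car example. The probwidth(0.96) value is an assumption carried over from the air-freshener example, since the slides do not state it here. It also assumes that fpc() accepts either the population size or, for values between 0 and 1, the sampling rate. The scalar names are hypothetical.

```stata
* Width for n = 50 with no correction, kept for comparison
ciwidth onemean, sd(16) probwidth(0.96) n(50)
scalar w_infinite = r(width)

* Same design, supplying the population size of 500 to fpc()
ciwidth onemean, sd(16) probwidth(0.96) n(50) fpc(500)
scalar w_finite = r(width)

* Alternative specification: the sampling rate 50/500 = 0.1
ciwidth onemean, sd(16) probwidth(0.96) n(50) fpc(0.1)

display "width without fpc = " w_infinite ",  width with fpc = " w_finite
```

Sampling a tenth of a finite population leaves less uncertainty about its mean, so the corrected width should come out narrower than the uncorrected one.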
Finite population correction
Back to infinite populations of air fresheners,
and on to
one-sided confidence intervals
One-sided confidence intervals
You may want one-sided confidence intervals if
You are certain of the direction of an effect.
Examples include toxicity studies and analysis of occurrences of adverse drug
reactions (Chow, Shao, Wang, & Lokhnygina, 2017).
You are mainly interested in either the lower or upper limit.
An example would be product quality and reliability (Meeker, Hahn, & Escobar,
2017).
In general, use caution with one-sided confidence intervals.
There have been cases where the placebo effect is superior to the
actual drug effect (Meeker, Hahn, & Escobar, 2017).
Lower 95% confidence interval
Two-sided 90% confidence level
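The two cases might be run as follows. The design values (sd of 6, 30 air fresheners, probability of CI width 0.96) are assumptions borrowed from the earlier example, and the scalar names are hypothetical. The sketch also assumes that, for a one-sided interval, ciwidth reports the distance between the mean estimate and the limit as the width.

```stata
* Lower one-sided 95% CI
ciwidth onemean, sd(6) probwidth(0.96) n(30) lower
scalar w_lower = r(width)

* Two-sided CI at the 90% confidence level for the same design
ciwidth onemean, sd(6) probwidth(0.96) n(30) level(90)
scalar w_two90 = r(width)

display "one-sided 95% width = " w_lower ",  two-sided 90% width = " w_two90
```

Both intervals rely on the same t quantile (the 95th percentile), so under the stated assumption the one-sided width should be about half of the two-sided 90% width.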
Sensitivity analysis
Precision and sample-size analysis is centered on a
prospective study, and we are obtaining estimates based
on values that are truly unknown
Estimates may come from historical data or pilot studies
Sensitivity analysis is used to investigate the effect of
varying study parameters on CI precision, probability of CI
width, sample size, and other components of a study.
In Stata, you can perform sensitivity analysis with tables and
graphs.
Probability of CI width for a range of
standard deviations
Table and graph for sensitivity analysis
Sensitivity analysis graph
Probability of CI width for ranges of multiple
parameters
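Sensitivity analysis of this kind comes down to passing lists of values to the options. The ranges below are illustrative assumptions, not the values used in the original slides.

```stata
* Probability of CI width as the standard deviation ranges from 4 to 8
ciwidth onemean, sd(4(1)8) width(4) n(30)

* Same analysis, requesting both the table and a graph
ciwidth onemean, sd(4(1)8) width(4) n(30) table graph

* Vary two parameters at once: three standard deviations by five sample sizes
ciwidth onemean, sd(4 6 8) width(4) n(30(10)70) graph
```

Each combination of the supplied values is evaluated, so the last command produces 15 results.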
Tables for sensitivity analysis
Reordering table columns
Modifying column labels and widths
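A possible way to reorder, relabel, and resize columns uses the table() option. This sketch assumes the column names sd, N, and Pr_width, and the column:"label" and widths() forms that ciwidth shares with power. Check these against the ciwidth table documentation before relying on them. The labels and widths are illustrative.

```stata
* List the columns in the order they should appear
ciwidth onemean, sd(4 6 8) width(4) n(30(10)50) table(sd N Pr_width)

* Attach custom labels and widen two of the columns
ciwidth onemean, sd(4 6 8) width(4) n(30(10)50)          ///
    table(sd:"Std. dev." N:"Sample size" Pr_width:"Pr(width)", ///
          widths(N 12 Pr_width 12))
```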
ciwidth
We computed the following for the CI for a population mean
Sample size
Confidence-interval width
Probability of confidence-interval width
We created a graph and tables for sensitivity analysis
We can perform precision and sample-size analysis for CIs for
any of the following
A population mean
A population variance
A difference between two independent means
A difference between paired means
Precision analysis for a
population variance
Computing probability of CI width for a
population variance
Computing sample size for a population
standard deviation
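For the variance method, the assumed value of the parameter is given right after the method name, and the sd option switches the analysis to the standard-deviation scale. The numbers below (a variance of 36, equivalently a standard deviation of 6, and the widths 45 and 3) are hypothetical.

```stata
* Probability that the CI for a variance of 36 is no wider than 45 when n = 30
ciwidth onevariance 36, width(45) n(30)

* Sample size so that the CI for a standard deviation of 6 is no wider than 3
* with probability 0.96
ciwidth onevariance 6, sd probwidth(0.96) width(3)
```

Widths are expressed in the units of the parameter being estimated, so a width for the variance is not comparable to a width for the standard deviation.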
Sensitivity analysis for a population
standard deviation
Paired samples
Paired means for bad (LDL) cholesterol levels
        Cholesterol       Cholesterol      Difference
        before (mg/dL)    after (mg/dL)
        167               140              -27
        149               138              -11
        192               194              2
        200               180              -20
        162               162              0
        52                41               67.25
Mean    129.58            111.96           -17.62
S.D.    52.27             42.15            6.47
CI for a paired-means difference
Fractional sample sizes
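A sketch of the paired-means computation with and without rounding. It uses the standard deviation of the differences (6.47) from the cholesterol table. The target width of 4 and probability of 0.96 are assumptions, and the scalar names are hypothetical.

```stata
* Number of pairs needed; by default the result is rounded up to an integer
ciwidth pairedmeans, sddiff(6.47) probwidth(0.96) width(4)
scalar n_integer = r(N)

* Same computation, reporting the fractional sample size
ciwidth pairedmeans, sddiff(6.47) probwidth(0.96) width(4) nfractional
scalar n_fractional = r(N)

display "integer N = " n_integer ",  fractional N = " n_fractional
```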
Sample-size determination using the correlation
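When the standard deviation of the differences is not available, it can be implied by the group standard deviations and the correlation between the paired measurements. The sketch below assumes that ciwidth pairedmeans accepts sd() and corr() in the same way power pairedmeans does. The common standard deviation of 10 and the correlations are hypothetical values chosen only for illustration.

```stata
* Sample size for three assumed correlations between the before and after measurements
ciwidth pairedmeans, sd(10) corr(0.5 0.7 0.9) probwidth(0.96) width(4)
```

Stronger correlation makes the differences less variable, so fewer pairs should be needed for the same width.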
CI for a difference of two means
Let's consider the cholesterol example, but
instead randomly sample individuals who take
cholesterol medication and individuals who don't
Two independent samples
Group sizes may differ
Graphs for sensitivity analysis
Define the x axis
Label distinct values on the y axis
Specifying multiple values for multiple parameters
Create subgraphs for values of width
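The graph customizations named on these slides could be requested roughly as follows. The common standard deviations (8 to 12), target widths (5, 6, 7), and probability of CI width (0.96) are hypothetical. The graph suboptions xdimension(), yvalues, and bydimension() are assumed to behave as they do for power graphs.

```stata
* Sample size for one design with a common standard deviation of 10
ciwidth twomeans, sd(10) probwidth(0.96) width(6)

* Put the standard deviation on the x axis and label the distinct y values
ciwidth twomeans, sd(8(1)12) probwidth(0.96) width(5 6 7) ///
    graph(xdimension(sd) yvalues)

* One subgraph per target width
ciwidth twomeans, sd(8(1)12) probwidth(0.96) width(5 6 7) ///
    graph(xdimension(sd) bydimension(width))
```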
Unbalanced sample sizes
Control group (no medication)
Experimental group (medication)
Compute one sample size given another
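Two ways of handling unequal group sizes are sketched below, assuming that ciwidth twomeans supports nratio(), n1(), and compute() as power twomeans does. The common standard deviation of 10, the target width of 6, and the control-group size of 80 are hypothetical.

```stata
* Experimental group twice as large as the control group
ciwidth twomeans, sd(10) probwidth(0.96) width(6) nratio(2)

* Control group fixed at 80; solve for the experimental-group size
ciwidth twomeans, sd(10) probwidth(0.96) width(6) n1(80) compute(N2)
```

The fixed group must be large enough for the target width to be attainable. With a very small control group, no experimental-group size can reach the target.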
Known standard deviations
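When the group standard deviations are treated as known, the interval is based on the normal distribution and its width no longer varies from sample to sample. The sketch below assumes the knownsds option together with sd1() and sd2(), and assumes that probwidth() is not specified in this case. The standard deviations 5 and 8 and the width 4 are hypothetical.

```stata
* Sample size for a normal-based CI with known, unequal standard deviations
ciwidth twomeans, sd1(5) sd2(8) knownsds width(4)
```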
Do you have another method in mind?
Add it to ciwidth, then perform computations in the same way you would with
other ciwidth commands, and create tables and graphs
Adding your own method to ciwidth
Adding your own methods to ciwidth is easy. Suppose
you want to add a method called mymethod to ciwidth.
Simply
1. write an r-class program called ciwidth_cmd_mymethod that
computes sample size, probability of CI width, or CI width and
follows ciwidth's convention for naming common options and
storing results; and
2. place the program where Stata can find it.
You are done. You can now use mymethod within
ciwidth like any other official ciwidth method.
Program for computing CI width
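program ciwidth_cmd_mymean, rclass
    version 16.0
    // n() is required; level() defaults to the current confidence level, stddev() to 1
    syntax, n(integer) [ Level(cilevel) Stddev(real 1) * ]
    // Width of a normal-based CI for the mean, treating stddev() as known
    tempname width
    scalar `width' = 2*invnormal(1/2+`level'/200)*`stddev'/sqrt(`n')
    // Store results under the names ciwidth expects
    return scalar level = `level'
    return scalar N = `n'
    return scalar width = `width'
    return scalar stddev = `stddev'
end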
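The expression inside the program corresponds to the width of a normal-based interval with a known standard deviation. Written out, with an illustrative call using n = 30 and stddev = 6 (assumed values, rounded results):

```latex
% level is the confidence level in percent, so 1/2 + level/200 = 1 - \alpha/2
W = 2\, z_{1-\alpha/2}\, \frac{\sigma}{\sqrt{n}}

% Illustrative values (assumed): level = 95, \sigma = 6, n = 30, z_{0.975} \approx 1.960
W \approx 2 \times 1.960 \times \frac{6}{\sqrt{30}} \approx 4.29
```

No probability of CI width is involved here, because the width does not depend on the sample once the standard deviation is taken as known.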
ciwidth mymean : compute CI width
ciwidth onemean : compute CI width
Naming conventions for stored results
return scalar level = `level'
return scalar N = `n'
return scalar width = `width'
ciwidth mymean : Specifying multiple values
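Calls of the following form are one way to exercise the new method. They assume that the ciwidth_cmd_mymean program shown earlier has been saved as ciwidth_cmd_mymean.ado somewhere on the adopath. The sample sizes, confidence levels, and standard deviation are illustrative values.

```stata
* Single computation: CI width for n = 30 and a known standard deviation of 6
ciwidth mymean, n(30) stddev(6)

* Several sample sizes and confidence levels; ciwidth loops over the combinations
ciwidth mymean, n(20 30 40) level(90 95) stddev(6)

* Same computation displayed as a graph
ciwidth mymean, n(20(10)60) level(90 95) stddev(6) graph
```

Only n() and level() are given multiple values here. A method-specific option such as stddev() needs additional setup before it can accept a list, as noted on the customization slide.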
ciwidth mymean : Automatic graphs
We can also compute results for multiple sample sizes and confidence levels
without any additional effort on our part.
Customizing your ciwidth command
Add method-specific options and set them up to allow
multiple values
Tables
Change column labels, formats, and widths to modify the look and
contents for the table created by default
Graphs
Change the default column labels
Use different symbols to label the results
Summary
Perform precision and sample-size analysis for CIs for
A population mean
A population variance
A difference between two independent means
A difference between paired means
Compute
Sample size, CI width, and probability of CI width
Perform sensitivity analysis graphically and with a table
Implement your own method, and easily create tables
and graphs as if it were an official ciwidth command
References
Dixon, W. J., and F. J. Massey, Jr. 1983. Introduction to
Statistical Analysis. 4th ed. New York: McGraw-Hill.
Chow, S.-C., J. Shao, H. Wang, and Y.
Lokhnygina. 2017. Sample Size Calculations in Clinical
Research. 3rd ed. Boca Raton, FL: Taylor & Francis.
Meeker, W. Q., G. J. Hahn, and L. A. Escobar. 2017.
Statistical Intervals: A Guide for Practitioners and
Researchers. 2nd ed. Hoboken, NJ: Wiley.
Thank you !!