Confidence intervals (CIs)

This is an incredibly useful method. The basic idea is that when you have a collection of data on one or more variables we usually summarise them with a statistic: the mean, the median, the standard deviation, the correlation between scores on two variables, the internal reliability of data from the items of a multi-item measure. These are summary statistics. We can use them purely descriptively: simply as summaries of the data points we have. Alternatively, and often, we want to generalise from the data, and the traditional model is to treat the data you have as a sample from a population to which you are generalising. Since the early twentieth century the dominant approach to this generalisation has been “inferential statistics”. Almost as old, and arguably much more useful for many of the sorts of statistics we have, is estimation, and for our purposes estimation is about confidence intervals.

Details #

A confidence interval around an observed statistic tells us the precision with which we think we have estimated the population value from our dataset’s value. Most of the time you will see 95% confidence intervals, something like:

The mean weight in kilograms of the 87 participants was 67.2 with a 95% confidence interval from 63 to 75.

Invented example!
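
To make that concrete, here is a minimal sketch of how such an interval is conventionally computed in R using the t distribution. The numbers are simulated stand-ins, not the (invented) weights above.

```r
## Minimal sketch: a conventional t-based 95% CI for a mean.
## The data are simulated stand-ins, not real weights.
set.seed(12345)
weights <- rnorm(87, mean = 67, sd = 14)   # 87 invented weights in kg

fit <- t.test(weights)       # one-sample t procedure, default conf.level = .95
round(fit$estimate, 1)       # the observed mean, in kg
round(fit$conf.int, 1)       # the 95% CI around that mean, also in kg
```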

Some statistics, such as a proportion, a correlation coefficient or Cronbach’s alpha, don’t have units: they are “dimensionless”. Others, such as the mean weight in the example above, do have units, and some are more complex than a single dimension, for example the Body Mass Index (BMI), which has units of kg/m². One nice thing about confidence intervals is that they are always in the same units as the statistic, which gives us a very easy sense of the precision. Depending on why we want to generalise, what we wish to do with the estimation, an interval of 12 kilos from 63 to 75 might be fine, or it might be woefully inadequate for the intended use, but the nice thing is that we can appraise it as easily as we can the mean or any individual’s weight: a kilo is a kilo; there’s no moving to rather abstract p values as in inferential statistics.

However, this “95%” looks a bit like the “< .05” of inferential statistics and, for our purposes, it is: it’s no coincidence that .05 is 5%, which is the gap between 95% and 100%. What does this 95% mean? Expanding the fictitious report above, it would become this:

The mean weight in kilograms of the 87 participants was 67.2 and, given the observed data and the assumptions of random sampling and a Gaussian distribution, we estimate that the population mean is somewhere between 63 and 75 kilos. Of course, we can never know for certain that the population value is between 63 and 75; however, if our model/method for computing the confidence interval is correct, then in the long run, across all the occasions on which we estimate things like this, 95% of the intervals we compute around our observed statistic, here this mean weight, will contain the true population value.

Still fictitious data!

I have never seen a confidence interval expanded like that other than in statistical lectures or textbooks but that’s the explanation. Note the crucial “if our model/method for computing the confidence interval is correct”. We can’t hope that 95% of the 95% CIs we see in reports will embrace the true population value; in fact it’s pretty likely that the proportion is lower than that. I find that a useful cautionary thought when reading research reports.
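
That long-run reading is easy to check by simulation. Here is a minimal sketch in R which, assuming the model really is correct (random sampling from a Gaussian population), shows that close to 95% of the computed intervals do contain the true mean.

```r
## Minimal sketch: when the model is correct, about 95% of 95% CIs
## contain the true population value.
set.seed(42)
true.mean <- 67        # the population mean, known here because we invented it
n.studies <- 10000     # number of simulated studies

covered <- replicate(n.studies, {
  x <- rnorm(87, mean = true.mean, sd = 14)   # one simulated study of n = 87
  ci <- t.test(x)$conf.int                    # its 95% CI
  ci[1] <= true.mean & true.mean <= ci[2]     # did the interval contain the truth?
})

mean(covered)   # proportion of intervals covering the true mean: close to .95
```

That is the coverage we get when every assumption holds; in real research they rarely all do.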

Why is it likely to be lower? Well, here are the big influences.

  • The realities of how data are acquired are such that true random sampling essentially never happens in our world.
  • Sometimes the method behind the calculation relies on assumptions about the distribution of the data in the population and that assumption, usually of a Gaussian distribution, may not apply (sometimes bootstrap or jack-knife methods can largely avoid this issue but they are only being adopted slowly; see the sketch after this list).
  • Pretty much always, at some level within the structure of the data, there is an assumption that the observations are independent of each other, but they may not be (see multi-level models).
  • Above all, the relentless pressure to publish, and to publish only exciting-looking things, really does bias what gets published.
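
On the second of those points, here is a minimal sketch of a percentile bootstrap CI in R, using invented and deliberately skewed data. It side-steps the Gaussian assumption, though it does nothing about non-random sampling, dependence or publication bias.

```r
## Minimal sketch: a percentile bootstrap 95% CI for a mean, with no
## Gaussian assumption about the population distribution.
set.seed(7)
x <- rexp(87, rate = 1/10)   # invented, clearly non-Gaussian data

## resample the data with replacement many times and recompute the statistic
boot.means <- replicate(2000, mean(sample(x, replace = TRUE)))

quantile(boot.means, probs = c(.025, .975))   # percentile bootstrap 95% CI
```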

Despite these caveats CIs should be the norm in reporting data in our fields. There are a very few exceptions, situations in which NHST (Null Hypothesis Significance Testing, i.e. inferential statistics, p values) clearly does apply, but even there, though it is frowned on by theoretical statisticians, I like to report both the NHST p value and the CI to get a sense of the precision of estimation.
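
Getting both from the same analysis is usually trivial; here is a minimal sketch in R, with invented change scores, where a single call returns the p value and the CI together.

```r
## Minimal sketch: one call gives both the NHST p value and the CI.
set.seed(1)
change <- rnorm(60, mean = 3, sd = 10)   # invented pre-to-post change scores

fit <- t.test(change, mu = 0)   # test of "no change" plus a 95% CI for the mean change
fit$p.value                     # the NHST p value
fit$conf.int                    # the 95% CI, in the units of the change scores
```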

Try also #

Bootstrapping
Estimation
Jack-knife methods
Multi-level models
Null hypothesis significance testing (NHST) paradigm

Chapters #

Estimation and CIs were described in Chapter 5 of the OMbook.

Online resources #

A number of my shiny apps will give you CIs around observed statistics:
* Around an observed proportion given that proportion and the n.
* Around a difference between observed proportions in two separate datasets given those proportions and the two ns.
* Around an observed mean given the n, the mean and either the observed SD or SE. Assumes Gaussian distributions.
* Around an observed SD or variance given that value and the n. Assumes Gaussian distributions.
* Around an observed Pearson correlation given the observed R and the n. Assumes Gaussian distributions.
* Around an observed Spearman correlation given that value and the n. Assumes Gaussian distributions.
* Around an observed Cronbach alpha value given that value and the n. Assumes Gaussian distributions.

Those are all to enable you to take published summary statistics which didn’t report CIs and to compute CIs to put around them. CIs for proportions don’t have distributional assumptions but the others assume Gaussian distributions as noted above. All give you a default CI of 95% but you can request different CI widths if you want.
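
As a rough guide to the sort of arithmetic behind those apps (the apps themselves may use somewhat different or more refined methods, so treat this purely as a sketch), here is how a t-based CI around a mean from published summary figures, and a standard CI around an observed proportion, can be computed in R.

```r
## Sketch of the arithmetic for CIs from published summary statistics.
## All numbers here are invented; the apps may use different methods.

## 95% CI around a mean given only n, the mean and the SD (assumes Gaussian distribution)
n <- 87; m <- 67.2; s <- 14
se <- s / sqrt(n)                          # standard error of the mean
m + c(-1, 1) * qt(.975, df = n - 1) * se   # lower and upper 95% limits

## 95% CI around an observed proportion given the count and n
prop.test(x = 30, n = 87)$conf.int   # e.g. 30 "yes" responses out of 87
```
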
In addition, now that I have cracked how to write shiny apps into which you can paste or upload data, I am adding apps which will give you CIs for statistics computed from your raw data. So far this is only there for:
* Observed quantiles of a distribution given the data and the quantiles you want. This can give you CIs around medians, around upper and lower quartiles, or any percentiles, say the 5th, 10th, 90th or 95th, that you might want (see the sketch below).
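
As one standard way of getting such intervals from raw data (the app may well use a different method, so again this is only a sketch), here is a percentile bootstrap in R with invented data.

```r
## Sketch: percentile bootstrap 95% CIs for the median and the 90th percentile.
set.seed(99)
x <- rgamma(200, shape = 2, scale = 10)   # invented, skewed raw data

boot.medians <- replicate(2000, median(sample(x, replace = TRUE)))
quantile(boot.medians, probs = c(.025, .975))   # 95% CI around the median

boot.p90 <- replicate(2000, quantile(sample(x, replace = TRUE), probs = .9))
quantile(boot.p90, probs = c(.025, .975))       # 95% CI around the 90th percentile
```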

Dates #

First created 18.v.24, tweaks 20.v.24.
