Survival analysis

This is a brief introduction to a large and important topic. As the name suggests, many of the statistical methods under this heading were developed to test for differences in longevity: how long we (and other things) survive. One classic application was to analyse whether smoking shortened life expectancy.

Details

Why can’t you just compare the mean age at death of smokers and non-smokers? Because those of us who are still alive don’t yet know how old we will be at death. In survival analysis terminology, our age at death is “right censored” if we are still alive. I’m 67 today, so my age at death will be older than that, but at present no-one knows by how much. Hence “right censored”: on a left to right timeline, the amount by which I may exceed 67 is unknown, i.e. censored.

If something shortens life expectancy then, in a population of people born on the same day, fewer of those exposed to the risk factor, say smoking, will still be alive: fewer of them will be “right censored” than in the longer-living group not exposed to it. That means that comparing the mean ages at death of the smokers and non-smokers who have already died will, if the sample is big enough, probably correctly tell us that the non-smokers died older than the smokers (if enough time has elapsed since their shared birthday for smoking to have had its evil way with their lives). However, it will underestimate the effect of smoking, perhaps quite seriously, because of the differential censoring.
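A small simulation may make the underestimate easier to see. This is only a sketch under invented assumptions: the lifetime distributions (normal, means of 80 and 70 years, standard deviation 10), the follow-up age of 75 and the function name `simulate` are all hypothetical and chosen purely for illustration, not taken from any real data.

```python
import random
from statistics import mean


def simulate(n=20000, follow_up_age=75.0, seed=1):
    """Compare the true and the naive difference in mean age at death.

    All numbers here are illustrative assumptions, not real mortality data.
    """
    rng = random.Random(seed)
    # Hypothetical complete lifetimes in years for a cohort born on the same day.
    non_smokers = [rng.gauss(80, 10) for _ in range(n)]
    smokers = [rng.gauss(70, 10) for _ in range(n)]

    # The true effect, which we could only see if everyone had already died.
    true_diff = mean(non_smokers) - mean(smokers)

    # At follow-up we only know the age at death of those who have died;
    # everyone still alive is right censored.
    dead_non = [t for t in non_smokers if t <= follow_up_age]
    dead_smk = [t for t in smokers if t <= follow_up_age]
    naive_diff = mean(dead_non) - mean(dead_smk)

    # Proportion right censored in each group.
    censored_non = 1 - len(dead_non) / n
    censored_smk = 1 - len(dead_smk) / n
    return true_diff, naive_diff, censored_non, censored_smk


if __name__ == "__main__":
    true_diff, naive_diff, cens_non, cens_smk = simulate()
    print(f"true difference in mean lifetime: {true_diff:.1f} years")
    print(f"naive difference among the dead:  {naive_diff:.1f} years")
    print(f"censored: non-smokers {cens_non:.0%}, smokers {cens_smk:.0%}")
```

Under these assumptions the naive comparison has the right sign but is clearly smaller than the true difference, and a larger share of the non-smokers is censored: that is the differential censoring described above.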

Rather than having samples all born on the same day, it’s more realistic to have a sample of people recruited into a study over time, so the participants will differ in how long they have had to live, or to die, at any point at which ages at death are collected.

The family of survival analytic methods handles this right censoring of the data correctly. There are added complexities about exactly which methods to use, but those are beyond this glossary.
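As one illustration of how such methods use censored observations rather than discarding them, here is a minimal sketch of the Kaplan-Meier estimator, a widely used member of this family (it is not named in this entry, so treat the choice as an assumption). The function name `kaplan_meier` and the toy data are hypothetical. Each person contributes an observed time and a flag saying whether the event was seen or the observation was right censored.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival probability)] at each event time.

    times  -- observed time for each person (time of event or of censoring)
    events -- True if the event was observed, False if right censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)  # people still under observation just before time t
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            # Multiply by the proportion of those at risk who survived time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored people leave the risk set without counting as events.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Toy data: five people, two of them right censored (False).
    toy_times = [2, 3, 3, 5, 8]
    toy_events = [True, False, True, True, False]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"time {t}: estimated survival {s:.2f}")
```

The key point is that a censored person stays in the "at risk" count up to the time they were last observed, so the information that they survived at least that long is used.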

These methods can apply to any “once only” event, not just death. Frailty analysis is a related set of methods for handling censored data where the event can happen more than once, e.g. having a psychological crisis, having a baby or self-harming. Both sets of methods are seriously underused in research into psychological issues, which is so often short term or has only a very limited follow-up period. Sadly, this partly reflects how little money is spent on psychological issues and therapies.

Try also

Censored data & censoring
Frailty analysis/models

Chapters

Not mentioned in the OMbook.

Online resources

None yet.

Dates

First created 7.v.24, tweaked 8.v.24.
