What it says: the number of observations in the sample, often written as n. It matters because it determines how generalisable findings from the sample will be, how precise they will be, and how likely they are to be sufficient to argue that what you see is unlikely to be purely random.
Details #
The main idea in the statistical methods used in our field is generalising from findings in a sample to a population. Generally our data aren’t from random samples and it is often hard to define the population: it tends to be “all other similar clients now and in the future”. So it’s more accurate for our data to talk about “dataset size”, but even I am not often that pedantic. The simple principle is that the bigger the n the better!
Try also #
Statistical power
Estimation
Confidence intervals
Sample
Population
Online resources #
My app creating samples from a Gaussian distribution, showing the histogram, ecdf and qqplot, gives an opportunity to see how sample size affects samples even from the same population.
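If you can't get to the app, the same idea can be sketched in a few lines. This is not the app's code, just a rough illustration under my own assumptions: a standard Gaussian population, and a made-up helper `max_ecdf_gap` that measures the largest vertical gap between a sample's ecdf and the population's cumulative distribution.

```python
import random
from statistics import NormalDist

def max_ecdf_gap(sample, dist=NormalDist(0, 1)):
    """Largest vertical gap between the sample's ecdf and the population cdf."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = dist.cdf(x)
        # The ecdf jumps from i/n to (i + 1)/n at x, so check both sides
        gap = max(gap, abs(f - i / n), abs((i + 1) / n - f))
    return gap

rng = random.Random(2024)  # fixed seed so the run is reproducible
for n in (20, 200, 2000):
    # All samples come from the same hypothetical population
    sample = [rng.gauss(0, 1) for _ in range(n)]
    print(f"n = {n:5d}  max ecdf gap = {max_ecdf_gap(sample):.3f}")
```

Small samples can look quite unlike the population they came from; as n grows the ecdf tends to hug the population curve more closely.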
Chapters #
Mostly chapters 5 to 8.
Dates #
Tweaks 5.iii.24.