These are statistical methods for handling non-independence of observations. For our purposes their great strength is that they can handle, in a statistically appropriate way, “nested” data. Ours are typically repeated completions of measures by individual clients, nested within the therapists they worked with and perhaps nested within the services in which the therapists worked. There are other applications in therapy change/outcome data but this is much the dominant one so I’ll keep to this here.
Details #
For our purposes these are always regression models of how a dependent variable may show different values in relation to other variables. In our arena this is typically a model of how score on the measure changes in relation to other variables. The main ideas to hold in mind are of the “level” of the variable and whether it is being estimated as a “free” or a “fixed” variable.
The level of the variable in our model is either client, therapist or service. So we might have client gender and starting score as client level variables; therapist gender and level of experience as therapist level variables; and service size and perhaps funding or local deprivation index as service level variables.
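As a minimal sketch of what “level” means in practice, the Python below builds a tiny, entirely hypothetical dataset in “long” format (one row per completion of the measure) and works out the highest level at which each variable is constant. All the names, IDs and values are invented for illustration, and the sketch assumes client and therapist IDs are unique across the whole dataset.

```python
from collections import defaultdict

# Hypothetical long-format data: one record per completion of the measure.
columns = ("service", "service_size", "therapist", "therapist_gender",
           "client", "client_gender", "session", "score")
records = [
    ("S1", 12, "T1", "M", "C1", "F", 1, 2.4),
    ("S1", 12, "T1", "M", "C1", "F", 2, 1.9),
    ("S1", 12, "T1", "M", "C2", "M", 1, 3.0),
    ("S1", 12, "T1", "M", "C2", "M", 2, 2.8),
    ("S1", 12, "T2", "F", "C3", "F", 1, 1.7),
    ("S1", 12, "T2", "F", "C3", "F", 2, 1.2),
    ("S2", 30, "T3", "F", "C4", "M", 1, 2.2),
    ("S2", 30, "T3", "F", "C4", "M", 2, 2.0),
]
rows = [dict(zip(columns, record)) for record in records]


def is_constant_within(rows, unit, variable):
    """True if `variable` takes a single value within every `unit`."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[unit]].add(row[variable])
    return all(len(values) == 1 for values in seen.values())


def variable_level(rows, variable, levels=("service", "therapist", "client")):
    """Return the highest level at which `variable` is constant.

    Variables that change within clients are reported as "occasion".
    """
    for unit in levels:
        if is_constant_within(rows, unit, variable):
            return unit
    return "occasion"


for name in ("service_size", "therapist_gender", "client_gender", "score"):
    print(name, "->", variable_level(rows, name))
```

Session number and score vary within clients, so they sit below the client level, at the level of the individual completion.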
A fixed variable is modelled as having the same influence on the dependent variable across all participants/clients, so if gender and starting score are modelled as fixed variables their effect on (non-baseline) scores is treated as if it were the same for all participants. If a variable is modelled as a free variable its impact is treated as potentially different for each participant. This would make no sense for gender or starting score as each participant has only one gender and only one starting score. However, we are usually interested in whether scores change with time, by session or simply time from starting session. This could show a different relationship for different clients so it can be modelled as either a fixed (same effect for all participants) or free (varying across clients) variable.
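One way to write down the distinction, as a sketch rather than a definitive specification, is a two-level model of score on time. The symbols are my assumptions, not notation from the book: $y_{ti}$ is the score of client $i$ at occasion $t$, the $\gamma$ terms are effects shared by all clients, and the $u$ terms are client-specific departures. Therapist and service levels are left out to keep it short. What is called a “free” effect here is what much of the statistical literature calls a “random” effect.

```latex
% Occasion level: each client i has their own intercept and slope on time
y_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{time}_{ti} + \varepsilon_{ti},
\qquad \varepsilon_{ti} \sim N(0, \sigma^{2})

% Client level: gender and starting score entered as fixed effects
\beta_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{gender}_{i}
           + \gamma_{02}\,\mathrm{start}_{i} + u_{0i}

% Time as a free effect: the slope varies across clients through u_{1i}
\beta_{1i} = \gamma_{10} + u_{1i},
\qquad (u_{0i}, u_{1i}) \sim N(\mathbf{0}, \boldsymbol{\Sigma})

% Time as a fixed effect: the same slope for every client
\beta_{1i} = \gamma_{10} \quad (\text{i.e. } u_{1i} = 0 \text{ for all } i)
```

Gender and starting score appear only with $\gamma$ coefficients because each client has just one value of each, so there is nothing within a client from which a client-specific effect could be estimated.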
Effects, e.g. of client gender on score change slope, can be reported in terms of whether effects are statistically significant (see inferential testing) or by looking at the confidence intervals (see estimation and confidence intervals) of the effects. In addition the overall model fit to the data can be reported and used to choose between models.
Analyses often start with the most complicated model with many predictor variables and perhaps interactions. (An interaction might be whether the client gender and therapist gender are not simply having additive effects but maybe showing that being of the same gender contributes more than just the effects of the two genders.) Models are then pruned back, removing effects found non-significant in the complex model. Similarly, appropriate effects may initially be entered as free effects but if that shows no better fit than the model with the effect only entered as fixed then the free effect is dropped and only the fixed effect retained.
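The comparison between a model with a free effect and one with only the fixed effect can be sketched as a likelihood ratio test plus an AIC difference. This Python is an illustration under stated assumptions, not a recommended procedure: it assumes both models were fitted by maximum likelihood to the same data, that the free model adds exactly two parameters (the slope variance and the intercept/slope covariance), and the log-likelihood values are invented. The function name and the 0.05 threshold are likewise my choices.

```python
import math


def compare_slope_models(loglik_fixed, loglik_free):
    """Compare a fixed-slope model with a free-slope model of the same data.

    Assumes the free model adds exactly two parameters, so the reference
    chi-square distribution has 2 degrees of freedom.
    """
    # Likelihood ratio statistic; clamped at zero to guard against rounding.
    lr_stat = max(0.0, 2.0 * (loglik_free - loglik_fixed))
    # For 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
    p_value = math.exp(-lr_stat / 2.0)
    # AIC = 2k - 2 * loglik, so AIC(free) - AIC(fixed) = 2 * 2 - lr_stat.
    aic_difference = 4.0 - lr_stat
    return lr_stat, p_value, aic_difference


# Hypothetical log-likelihoods, invented purely for illustration.
lr_stat, p_value, aic_difference = compare_slope_models(-1250.0, -1240.0)

# Retain the free slope only if it fits better on both criteria.
keep_free_slope = p_value < 0.05 and aic_difference < 0
print(lr_stat, p_value, aic_difference, keep_free_slope)
```

Because a variance cannot be negative, the plain chi-square reference tends to be conservative for this kind of comparison, so treat the p value as a rough guide and check what your statistical software actually reports.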
The basic ideas and maths behind these models are wonderfully apt to the structure of typical therapy change data and, used wisely, the methods are rightly becoming dominant in analysing large datasets of change data. However, as with all powerful methods, and perhaps particularly for powerful methods and large datasets of purely quantitative data, there are dangers.
These dangers include purely mathematical/methodological ones which can make the findings misleading. These are generally issues about distributional assumptions. Generally those assumptions are that the continuous variables have Gaussian distributions though they may also involve assumptions of “homoscedasticity”: that within levels of another variable a continuous variable has constant variance. Other issues involve “cell sizes”: whether there are enough values of the dependent variable in the “cells” with the same values of the predictor variables and within the lowest level of the model, here the client/participant.
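A first check on cell sizes at the lowest level is simply to count completions per client. The Python below is a minimal sketch: the client IDs are invented and the threshold of three completions is an arbitrary assumption for illustration, not a rule from the book.

```python
from collections import Counter


def small_cells(client_ids, min_completions=3):
    """Return clients with fewer than `min_completions` completions."""
    counts = Counter(client_ids)
    return {client: n for client, n in counts.items() if n < min_completions}


# Hypothetical data: one client ID per completion of the measure.
completions = ["C1", "C1", "C1", "C2", "C3", "C3", "C3", "C3", "C4", "C4"]
print(small_cells(completions))
```

Clients flagged this way contribute little information about their own change over time, which matters most when time is modelled as a free effect.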
The other dangers are more philosophical and concern whether the ability to estimate effects per individual really addresses the individualities of clients coming into therapies. One catch here is that freely estimated effects at the level of the individual client are “real” but only within the assumptions of the model: essentially that differences between individuals are sufficiently caught by the model. So while some effects are being investigated for individual differences, the assumption is that participants don’t differ in other ways: ways that were not measured, or were measured but not entered into the model, or are in the model but where the analyses have no statistical power to catch individual differences as “statistically significant”. The other catch is that, because these analyses really need very large numbers of participants to estimate the effects, therapy “outcome” research, and its political influence on the therapies offered, becomes dominated by clients and work fitting a “large n”, potentially an industrial, model of therapy.
Try also #
Cell size
Heteroscedasticity
Homoscedasticity
Independence of observations
Intercept
Slope
Statistical power
Chapters #
These methods are touched on in Chapter 8 about analyses of service data. The issues about “cell size” mentioned in the details above meant we didn’t discuss whether the methods could be applied in Chapter 7 about individual therapists analysing their data (as they would generally have been stretched well beyond applicability unless a lot of therapists working similarly pooled data).
Online resources #
I hope to work through some MLM analyses of real data on the Rblog … not there yet.
It’s tempting to think of offering shiny apps to do MLMs but I am sure that really it’s safer if people get into practice research networks and work with statisticians or at least statistically experienced researchers when contemplating using these methods.
Dates #
First created 19.viii.23, tweaked 16.iv.24.