(Statistical) Analysis plan

A (statistical) analysis plan, ‘SAP’ for short, describes how the quantitative or qualitative data that you will collect will be statistically handled. You may add it as a supplement to your protocol.

What is a SAP?

A SAP is a more technical document than the research protocol and includes detailed procedures for executing statistical analyses. Although the SAP was originally intended for clinical trials, other types of research design may also profit from transparent analysis plans. For example, the analysis of qualitative data is likely to benefit from a written plan that describes the underlying (philosophical) approach and covers details such as triangulation, criteria for saturation, and the selection of quotes.

Why write a SAP?

  1. Developing a SAP forces you to think about which data to collect and in which format, which may then guide your decisions on e.g. measurement instruments and the timing of (repeated) measurements. It may also alert you to the fact that you are planning to collect more data than you will use in your analyses. Collecting too much data may burden participants and cause them to (selectively) drop out, which may jeopardize the overall validity of your study. Here is a link to a framework that helps you select the minimally needed set of confounding factors for a valid data analysis. Conversely, planning may reveal data that you must collect: a recent example taught us that the statistical repair of bias due to poor treatment adherence in a trial would have required repeated measurement of time-dependent confounders. Unfortunately, this had been forgotten, and adjustment for adherence became impossible. Note that collecting personal data unnecessarily is unlawful.
  2. SAP development may alert you that the necessary tools or statistical techniques are not available (in your favorite software). SAP development may also flag up that (more) statistical support needs to be organized.
  3. A good SAP, and sticking to it, may save you a lot of time that would otherwise be spent analyzing the data in haphazard and data-driven ways (it prevents “data-dredging” where data are selected based on desired outcomes).
  4. A good reason to put your SAP along with your research protocol in the public domain early on is that the desire (unfortunately far from eradicated) to produce statistically significant results leads many investigators to torture their data until they confess (P-hacking). This approach distorts your work (and, in turn, reviews, guidelines, patient care or services, and the global scientific record as a whole) and should be avoided at all costs. This issue is explained in more detail in the chapter on preregistration.
  5. Given the influence of statistical decisions on study conclusions, well-documented and transparent statistical conduct is essential.

How to write a SAP?

Under construction; see here in the meantime.

Published by Urban Vitality, 12 June 2023