1 Before you get started

Here I will discuss the research-related questions that you need to ask yourself before you get started. I will also point to some relevant R background.

2 General features of QDECR

R is a programming language with extensive functionality related to statistical analyses. QDECR is written to capitalize on some of this functionality, including:

  • Working with imputed missing data
  • Straightforward formula definition
  • Usage of interaction terms, splines, polynomial terms, etc.
  • AsIs (‘as is’) treatment of data within formulas, i.e. being able to manipulate the data while specifying the formula

All these features can be found in QDECR.

The core qdecr function follows a number of steps:

  1. Checking the input arguments
  2. Preparing the statistical model
  3. Loading the vertex-wise data
  4. Running the statistical model per vertex
  5. Applying any corrections over the surface
  6. Saving out all the statistics
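To make steps 2 and 4 more concrete, here is a minimal sketch in base R. It is not QDECR's internal code: the data are simulated, and the object names (`pheno`, `thickness`, `fit_vertex`) are made up for illustration. It only shows the general idea of building one design matrix and then fitting the same linear model at every vertex.

```r
# Simulated stand-in for vertex-wise data: rows are subjects, columns are vertices.
# All names and values here are hypothetical.
set.seed(1)
n_subjects <- 50
n_vertices <- 200
pheno <- data.frame(
  age = rnorm(n_subjects, mean = 10, sd = 2),
  sex = factor(rep(c("F", "M"), length.out = n_subjects))
)
thickness <- matrix(rnorm(n_subjects * n_vertices, mean = 2.5, sd = 0.3),
                    nrow = n_subjects, ncol = n_vertices)

# Step 2: prepare the model once; the design matrix is shared by all vertices.
X <- model.matrix(~ age + sex, data = pheno)

# Step 4: fit the same linear model at each vertex and keep the coefficients.
fit_vertex <- function(y) lm.fit(X, y)$coefficients
coefs <- apply(thickness, 2, fit_vertex)

# One row per coefficient, one column per vertex.
dim(coefs)
rownames(coefs)
```

Because the design matrix does not change between vertices, it only has to be built once; only the outcome column differs per vertex.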

3 QDECR features

3.1 Vertex-wise data

Within QDECR, every vertex measure that FreeSurfer calculates has a default name: the prefix qdecr_ followed by the name of the vertex measure file. A comprehensive list:

  • qdecr_thickness
  • qdecr_area
  • qdecr_area.pial
  • qdecr_curv
  • qdecr_jacobian_white
  • qdecr_pial
  • qdecr_sulc
  • qdecr_volume
  • qdecr_white.H
  • qdecr_white.k

Note that qdecr_w-g.pct does not work yet.

3.2 Formulas

3.2.1 On formulas

Nearly all statistical model functions in R utilize formula objects. The formula object allows users to generate design matrices for subsequent analysis through straightforward syntax:

Y ~ a + b

Let's deconstruct this:

  • Y: The outcome (AKA dependent variable AKA label)
  • ~: Denotes the left-hand side versus the right-hand side of the formula
  • a + b: The additive effect of determinants a and b (AKA independent variables AKA features)

This format allows users to use simple pseudo-math to generate complicated design matrices. R takes care of building the design matrix for incomplete data, converting categorical variables to e.g. dummy variables, etc.
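As a small illustration of what R does with such a formula, the sketch below builds a design matrix from a toy data frame. The data frame `d` and its values are invented for this example; only base R functions (`model.frame`, `model.matrix`) are used.

```r
# Toy data (hypothetical): one missing value in `a`, and a categorical `b`.
d <- data.frame(
  Y = c(2.1, 2.4, 2.2, 2.8, 2.5, 2.3),
  a = c(10, 12, NA, 11, 14, 13),
  b = factor(c("low", "high", "mid", "high", "low", "mid"),
             levels = c("low", "mid", "high"))
)

f <- Y ~ a + b

# With R's default na.action (na.omit), the incomplete row is dropped.
mf <- model.frame(f, data = d)

# The factor `b` is expanded into dummy variables, with "low" as reference level.
X <- model.matrix(f, data = mf)

colnames(X)
nrow(X)
```

The three-level factor `b` becomes two dummy columns, and the row with the missing `a` does not appear in the design matrix.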

3.2.2 Using formulas

QDECR uses the formula object to allow users to easily create design matrices. It further extends this functionality by explicitly including the vertex measure as a variable in the formula:

  • Cortical thickness as outcome: qdecr_thickness ~ a + b
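To show how a vertex measure can sit in a formula like any other variable, the sketch below inspects such a formula with base R. This is an illustration only and not necessarily how QDECR parses formulas internally; `age` and `sex` are hypothetical column names.

```r
# A formula can be created without the variables existing yet,
# so the vertex measure can be written like any other variable name.
f <- qdecr_thickness ~ age + sex

# List every variable used in the formula.
vars <- all.vars(f)

# Pick out the term carrying the qdecr_ prefix, and strip the prefix
# to recover the name of the vertex measure.
vertex_term <- vars[grepl("^qdecr_", vars)]
measure <- sub("^qdecr_", "", vertex_term)

# The remaining variables would have to come from the user's dataset.
covariates <- setdiff(vars, vertex_term)

vertex_term
measure
covariates
```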

3.2.3 Special terms

R also allows users to apply more complicated formulas:

  • Interaction terms: Y ~ a:b
  • Main effects and interaction: Y ~ a * b [equivalent to Y ~ a + b + a:b]
  • Polynomials: Y ~ a + poly(b, 2, raw = TRUE) [equivalent to Y ~ a + b + I(b^2)]
  • Orthogonal polynomials: Y ~ a + poly(b, 2)
  • Splines (e.g. from the splines package): Y ~ bs(a, 3)
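The equivalences listed above can be checked directly in base R. The sketch below uses simulated data (the data frame `d` is made up) and compares the design matrices that the different notations produce.

```r
library(splines)  # ships with R; provides bs()

set.seed(42)
d <- data.frame(a = rnorm(20), b = rnorm(20))  # simulated, for illustration only

# `a * b` expands to both main effects plus the interaction.
star_terms <- attr(terms(~ a * b), "term.labels")

# Raw polynomials give the same columns as writing b and I(b^2) by hand.
X_poly   <- model.matrix(~ a + poly(b, 2, raw = TRUE), data = d)
X_manual <- model.matrix(~ a + b + I(b^2), data = d)
same_columns <- isTRUE(all.equal(unname(X_poly), unname(X_manual),
                                 check.attributes = FALSE))

# Orthogonal polynomials: the linear and quadratic columns are uncorrelated.
P <- poly(d$b, 2)
cross <- crossprod(P)[1, 2]

# bs(a, 3) contributes three basis columns, next to the intercept.
X_bs <- model.matrix(~ bs(a, 3), data = d)

star_terms
same_columns
ncol(X_bs)
```

Raw and orthogonal polynomials describe the same curve; the orthogonal version mainly avoids the strong correlation between b and b^2.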

Furthermore, R features AsIs treatment of objects, meaning that variables can be manipulated within the formula object itself using I(). This allows users to do all kinds of things in the formula itself, including:

  • Standardizing a variable: Y ~ I(scale(a))
  • Combining variables: Y ~ I(a + 2*b)
  • Applying other functions: Y ~ I(cut(a, 3))
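The effect of I() is easiest to see by comparing a formula with and without it. The sketch below uses `lm` on simulated data (the data frame `d` is hypothetical) rather than QDECR itself, since the formula handling is plain R.

```r
set.seed(7)
d <- data.frame(Y = rnorm(30), a = rnorm(30, mean = 50, sd = 10), b = rnorm(30))

# Standardizing inside the formula: the predictor column has mean 0 and SD 1.
fit_std <- lm(Y ~ I(scale(a)), data = d)
std_col <- model.matrix(fit_std)[, 2]

# Inside I(), + and * are arithmetic, so a + 2*b becomes ONE combined predictor.
fit_combined <- lm(Y ~ I(a + 2 * b), data = d)

# Without I(), + separates terms, so a and b are TWO predictors.
fit_separate <- lm(Y ~ a + b, data = d)

names(coef(fit_combined))
names(coef(fit_separate))
```

The first model estimates a single slope for the combined variable, while the second estimates separate slopes for a and b.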

By extension, QDECR has all these features.

3.3 Imputed datasets

Datasets may contain missing information. The missing information can be imputed under certain conditions. Commonly used R packages for imputation are mice and mi. We designed QDECR in such a way that imputed datasets can be used as the input dataset, without any additional specification by the user. QDECR currently supports imputed objects from Amelia, mi, mice, and missForest.
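For readers unfamiliar with analyses on multiply imputed data, the sketch below shows the general idea in base R: fit the same model in every completed dataset, then pool the results with Rubin's rules. This is a generic illustration and an assumption about the workflow, not QDECR's code. To stay free of extra packages, a plain list of data frames stands in for an imputed object, and the "imputation" is a crude random draw from observed values that should not be used in a real analysis.

```r
set.seed(123)
n <- 100
full <- data.frame(a = rnorm(n))
full$Y <- 1 + 0.5 * full$a + rnorm(n)
miss <- sample(n, 20)  # rows where `a` is treated as missing

# Crude stand-in for multiple imputation (hypothetical, illustration only):
# each completed dataset fills the gaps with random draws from observed values.
m <- 5
imputed <- lapply(seq_len(m), function(i) {
  d <- full
  d$a[miss] <- sample(full$a[-miss], length(miss), replace = TRUE)
  d
})

# Fit the same model in every completed dataset.
fits <- lapply(imputed, function(d) summary(lm(Y ~ a, data = d))$coefficients)
est <- sapply(fits, function(s) s["a", "Estimate"])
se  <- sapply(fits, function(s) s["a", "Std. Error"])

# Rubin's rules: average the estimates, and combine within-imputation
# and between-imputation variance for the standard error.
pooled_est <- mean(est)
within     <- mean(se^2)
between    <- var(est)
pooled_se  <- sqrt(within + (1 + 1 / m) * between)

c(estimate = pooled_est, se = pooled_se)
```

The pooled standard error is never smaller than the average within-imputation standard error, which reflects the extra uncertainty caused by the missing values.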

3.4 Parallel performance

Users may want to reduce computation time by utilizing multiple processes. QDECR has the n_cores argument that allows users to specify the number of processes (cores/threads) to use. Note that the benefit of using multiple processes is most evident when increasing the number of imputed datasets.
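The sketch below illustrates the general idea of spreading vertex-wise work over several processes using R's bundled parallel package. It is not QDECR's implementation: the data are simulated, and the local variable `n_cores` merely mirrors the name of the QDECR argument.

```r
library(parallel)  # ships with R

set.seed(1)
design   <- cbind(1, rnorm(40))                  # intercept + one predictor
vertices <- matrix(rnorm(40 * 100), nrow = 40)   # 40 subjects, 100 vertices

n_cores <- 2  # local variable for this sketch only

# Split the vertex indices into one chunk per process.
chunks <- splitIndices(ncol(vertices), n_cores)

# Each worker fits the model for its own chunk of vertices.
fit_chunk <- function(idx, design, y_mat) {
  apply(y_mat[, idx, drop = FALSE], 2,
        function(y) lm.fit(design, y)$coefficients)
}

cl <- makeCluster(n_cores)
res <- parLapply(cl, chunks, fit_chunk, design = design, y_mat = vertices)
stopCluster(cl)

# Glue the chunks back together: one column of coefficients per vertex.
coefs_parallel <- do.call(cbind, res)
dim(coefs_parallel)
```

Starting worker processes and copying data to them takes time, so for small jobs a single process can be just as fast.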

[Next vignette: 4. Post-processing]