## Introduction

This vignette documents the use of the MANOVA.RM package for the analysis of semi-parametric repeated measures designs and multivariate data. The package consists of three parts - one for repeated measurements, one for multivariate data and one for the combination of both - which will be explained in detail below. All functions calculate the Wald-type statistic (WTS) as well as the ANOVA-type statistic (ATS) for repeated measures and a modification thereof (MATS) for multivariate data based on means. These test statistics can be used for arbitrary semi-parametric designs, even with unequal covariance matrices among groups and small sample sizes. For a short overview of the test statistics and corresponding references see the next section. Furthermore, different resampling approaches are provided in order to improve the small sample behavior of the test statistics. The WTS requires non-singular covariance matrices. If the covariance matrix is singular, a warning is returned.

For detailed explanations and examples, see also Friedrich, Konietschke and Pauly (2019).

## Theoretical background

We consider the following model $X_{ik} = \mu_i + \varepsilon_{ik}$ for observation vectors from group $$i$$ and individual $$k$$. Here, $$\mu_i$$ denotes the group mean which we wish to infer and $$\varepsilon_{ik}$$ are the common error terms with expectation 0 and existing (co-)variances. We do not need to assume normally distributed errors for these methods and the covariances may be unequal between groups. Null hypotheses are formulated via contrast matrices, i.e., as $$H \mu = 0$$, where the rows of $$H$$ sum to zero. Then the Wald-type statistic (WTS) is a quadratic form in the estimated mean vector involving the Moore-Penrose inverse of the (transformed) empirical covariance matrix. Under the null hypothesis the WTS is asymptotically $$\chi^2$$ distributed with rank(H) degrees of freedom. However, this approximation is only valid for large sample sizes, see e.g. Brunner (2001). We therefore recommend to use a resampling approach as proposed by Konietschke et al. (2015) or Friedrich et al. (2017).

Since the WTS requires non-singular covariance matrices, two other test statistics have been proposed: the ATS for repeated measures designs (Brunner, 2001) and the MATS for multivariate data (Friedrich and Pauly, 2018). The ATS is a scaled quadratic form in the mean vector, which is usually approximated by an F-distribution with estimated degrees of freedom. This approach is also implemented in the SAS PROC Mixed procedure. However, it is rather conservative for small sample sizes and does not provide an asymptotic level $$\alpha$$ test.

The MATS for multivariate data has the advantage of being invariant under scale transformations of the data, an important feature for multivariate data. Since its asymptotic distribution involves unknown parameters, we again propose to use bootstrap approaches instead (Friedrich and Pauly, 2018).

## Difference between RM and MANOVA design

• In a repeated measures design (RM) the same outcome is observed at different occasions, e.g., different time points or different parts of the body like left/right hemisphere of the brain. Hypotheses about the sub-plot or within-subject factors are possible.
• In a MANOVA design, d-dimensional outcome vectors are observed for every subject. These outcomes are potentially measured on different scales (e.g., kg, m, heart rate, …) and hypotheses are formulated on the whole outcome vectors. The only difference between the functions MANOVA and MANOVA.wide is the format of the data: For MANOVA data need to be in long format (one row per measurement), whereas MANOVA.wide is for data in wide format (one row per individual).

Note that a combination of both types of data, i.e., multivariate longitudinal data, can now also be analyzed with MANOVA.RM, see below.

## The RM function

The RM function calculates the WTS and ATS in a repeated measures design with an arbitrary number of crossed whole-plot (between-subject) and sub-plot (within-subject) factors. The resampling methods provided are a permutation procedure, a parametric bootstrap approach and a wild bootstrap using Rademacher weights. The permutation procedure provides no valid approach for the ATS and is thus not implemented.

### Data Example 1 (One between-subject and two within-subject factors)

For illustration purposes, we consider the data set o2cons, which is included in MANOVA.RM.

library(MANOVA.RM)
data(o2cons)

The data set contains measurements on the oxygen consumption of leukocytes in the presence and absence of inactivated staphylococci at three consecutive time points. More details on the study and the statistical model can be found in Friedrich et al. (2017). Due to the study design, both time and staphylococci are within-subject factors while the treatment (Verum vs. Placebo) is a between-subject factor.

head(o2cons)
##     O2 Staphylococci Time Group Subject
## 1 1.48             1    6     P       1
## 2 2.81             1   12     P       1
## 3 3.56             1   18     P       1
## 4 1.04             0    6     P       1
## 5 2.07             0   12     P       1
## 6 2.81             0   18     P       1

We will now analyze this data using the RM function. The RM function takes as arguments:

• formula: A formula consisting of the outcome variable on the left hand side of a ~ operator and the factor variables of interest on the right hand side. The within-subject factor(s) must be specified last in the formula, e.g. cbind(outcome1, outcome2) ~ between1 * between2 * within1 * within2.
• data: A data.frame containing the variables in formula.
• subject: The column name of the subjects variable in the data frame.
• no.subf: The number of within-subject factors, default is 1. Alternatively, the within-subjects factors can be directly specified using within.
• iter: The number of iterations for the resampling approach. Default value is 10,000.
• alpha: The significance level, default is 0.05.
• resampling: The resampling method, one of ‘Perm’, ‘paramBS’ or ‘WildBS’. Default is set to ‘Perm’.
• para: A logical indicating whether parallel computing should be used. Default is FALSE.
• CPU: The number of cores used for parallel computing. If not specified, cores are detected automatically.
• seed: A random seed for the resampling procedure. If omitted, no reproducible seed is set. Note that the random seeds for the parallel computing differ from those without parallelisation. Thus, to reproduce results obtained with MANOVA.RM version <0.5.1, argument para must be set to TRUE.
• CI.method: The method for calculating the quantiles used for the confidence intervals, either ‘t-quantile’ (the default) or ‘resampling’ (based on quantile of the resampled WTS).
• dec: The number of decimals the results should be rounded to. Default is 3.
model1 <- RM(O2 ~ Group * Staphylococci * Time, data = o2cons,
subject = "Subject", no.subf = 2, iter = 1000,
resampling = "Perm", seed = 1234)
summary(model1)
## Call:
## O2 ~ Group * Staphylococci * Time
## A repeated measures analysis with 2 within-subject factor(s) ( Staphylococci,Time ) and 1 between-subject factor(s).
##
## Descriptive:
##    Group Staphylococci Time  n Means Lower 95 % CI Upper 95 % CI
## 1      P             0    6 12 1.322         1.150         1.493
## 5      P             0   12 12 2.430         2.196         2.664
## 9      P             0   18 12 3.425         3.123         3.727
## 3      P             1    6 12 1.618         1.479         1.758
## 7      P             1   12 12 2.434         2.164         2.704
## 11     P             1   18 12 3.527         3.273         3.781
## 2      V             0    6 12 1.394         1.201         1.588
## 6      V             0   12 12 2.570         2.355         2.785
## 10     V             0   18 12 3.677         3.374         3.979
## 4      V             1    6 12 1.656         1.471         1.840
## 8      V             1   12 12 2.799         2.500         3.098
## 12     V             1   18 12 4.029         3.802         4.257
##
## Wald-Type Statistic (WTS):
##                          Test statistic df  p-value
## Group                    "11.167"       "1" "0.001"
## Staphylococci            "20.401"       "1" "<0.001"
## Group:Staphylococci      "2.554"        "1" "0.11"
## Time                     "4113.057"     "2" "<0.001"
## Group:Time               "24.105"       "2" "<0.001"
## Staphylococci:Time       "4.334"        "2" "0.115"
## Group:Staphylococci:Time "4.303"        "2" "0.116"
##
## ANOVA-Type Statistic (ATS):
##                          Test statistic df1     df2      p-value
## Group                    "11.167"       "1"     "57.959" "0.001"
## Staphylococci            "20.401"       "1"     "Inf"    "<0.001"
## Group:Staphylococci      "2.554"        "1"     "Inf"    "0.11"
## Time                     "960.208"      "1.524" "Inf"    "<0.001"
## Group:Time               "5.393"        "1.524" "Inf"    "0.009"
## Staphylococci:Time       "2.366"        "1.983" "Inf"    "0.094"
## Group:Staphylococci:Time "2.147"        "1.983" "Inf"    "0.117"
##
## p-values resampling:
##                          Perm (WTS)
## Group                    "<0.001"
## Staphylococci            "<0.001"
## Group:Staphylococci      "0.119"
## Time                     "<0.001"
## Group:Time               "<0.001"
## Staphylococci:Time       "0.152"
## Group:Staphylococci:Time "0.146"

The output consists of four parts: model1$Descriptive gives an overview of the descriptive statistics: The number of observations, mean and confidence intervals (based on either quantiles of the t-distribution or the resampling-based quantiles) are displayed for each factor level combination. model1$WTS contains the results for the Wald-type test: The test statistic, degree of freedom and p-values based on the asymptotic $$\chi^2$$ distribution are displayed. Note that the $$\chi^2$$ approximation is very liberal for small sample sizes. model1$ATS contains the corresponding results based on the ATS. This test statistic tends to rather conservative decisions in case of small sample sizes and is even asymptotically only an approximation, thus not providing an asymptotic level $$\alpha$$ test. Finally, model1$resampling contains the p-values based on the chosen resampling approach. For the ATS, the permutation procedure cannot be applied. Due to the above mentioned issues for small sample sizes, the resampling procedure is recommended.

### Data Example 2 (Two within-subject and two between-subject factors)

We consider the data set EEG from the MANOVA.RM package: At the Department of Neurology, University Clinic of Salzburg, 160 patients were diagnosed with either Alzheimer’s Disease (AD), mild cognitive impairments (MCI), or subjective cognitive complaints without clinically significant deficits (SCC), based on neuropsychological diagnostics (Bathke et al.(2018)). This data set contains z-scores for brain rate and Hjorth complexity, each measured at frontal, temporal and central electrode positions and averaged across hemispheres. In addition to standardization, complexity values were multiplied by -1 in order to make them more easily comparable to brain rate values: For brain rate we know that the values decrease with age and pathology, while Hjorth complexity values are known to increase with age and pathology. The three between-subject factors considered were sex (men vs. women), diagnosis (AD vs. MCI vs. SCC), and age ($$< 70$$ vs. $$>= 70$$ years). Additionally, the within-subject factors region (frontal, temporal, central) and feature (brain rate, complexity) structure the response vector.

data(EEG)
EEG_model <- RM(resp ~ sex * diagnosis * feature * region,
data = EEG, subject = "id", within = c("feature", "region"),
resampling = "WildBS",
iter = 1000,  alpha = 0.01, seed = 987)
summary(EEG_model)
## Call:
## resp ~ sex * diagnosis * feature * region
## A repeated measures analysis with 2 within-subject factor(s) ( feature,region ) and 2 between-subject factor(s).
##
## Descriptive:
##    sex diagnosis    feature   region  n  Means Lower 99 % CI Upper 99 % CI
## 1    M        AD  brainrate  central 12 -1.010        -4.881         2.861
## 13   M        AD  brainrate  frontal 12 -1.007        -4.991         2.977
## 25   M        AD  brainrate temporal 12 -0.987        -4.493         2.519
## 7    M        AD complexity  central 12 -1.488       -10.053         7.077
## 19   M        AD complexity  frontal 12 -1.086        -6.906         4.735
## 31   M        AD complexity temporal 12 -1.320        -7.203         4.562
## 3    M       MCI  brainrate  central 27 -0.447        -1.591         0.696
## 15   M       MCI  brainrate  frontal 27 -0.464        -1.646         0.719
## 27   M       MCI  brainrate temporal 27 -0.506        -1.584         0.572
## 9    M       MCI complexity  central 27 -0.257        -1.139         0.625
## 21   M       MCI complexity  frontal 27 -0.459        -1.997         1.079
## 33   M       MCI complexity temporal 27 -0.490        -1.796         0.816
## 5    M       SCC  brainrate  central 20  0.459        -0.414         1.332
## 17   M       SCC  brainrate  frontal 20  0.243        -0.670         1.156
## 29   M       SCC  brainrate temporal 20  0.409        -1.210         2.028
## 11   M       SCC complexity  central 20  0.349        -0.070         0.767
## 23   M       SCC complexity  frontal 20  0.095        -1.037         1.227
## 35   M       SCC complexity temporal 20  0.314        -0.598         1.226
## 2    W        AD  brainrate  central 24 -0.294        -1.978         1.391
## 14   W        AD  brainrate  frontal 24 -0.159        -1.813         1.495
## 26   W        AD  brainrate temporal 24 -0.285        -1.776         1.206
## 8    W        AD complexity  central 24 -0.128        -1.372         1.116
## 20   W        AD complexity  frontal 24  0.026        -1.212         1.264
## 32   W        AD complexity temporal 24 -0.194        -1.670         1.283
## 4    W       MCI  brainrate  central 30 -0.106        -1.076         0.863
## 16   W       MCI  brainrate  frontal 30 -0.074        -1.032         0.885
## 28   W       MCI  brainrate temporal 30 -0.069        -1.064         0.925
## 10   W       MCI complexity  central 30  0.094        -0.464         0.652
## 22   W       MCI complexity  frontal 30  0.131        -0.768         1.031
## 34   W       MCI complexity temporal 30  0.121        -0.652         0.895
## 6    W       SCC  brainrate  central 47  0.537        -0.049         1.124
## 18   W       SCC  brainrate  frontal 47  0.548        -0.062         1.159
## 30   W       SCC  brainrate temporal 47  0.559        -0.015         1.133
## 12   W       SCC complexity  central 47  0.384         0.110         0.659
## 24   W       SCC complexity  frontal 47  0.403        -0.038         0.845
## 36   W       SCC complexity temporal 47  0.506         0.132         0.880
##
## Wald-Type Statistic (WTS):
##                              Test statistic df  p-value
## sex                          "9.973"        "1" "0.002"
## diagnosis                    "42.383"       "2" "<0.001"
## sex:diagnosis                "3.777"        "2" "0.151"
## feature                      "0.086"        "1" "0.769"
## sex:feature                  "2.167"        "1" "0.141"
## diagnosis:feature            "5.317"        "2" "0.07"
## sex:diagnosis:feature        "1.735"        "2" "0.42"
## region                       "0.07"         "2" "0.966"
## sex:region                   "0.876"        "2" "0.645"
## diagnosis:region             "6.121"        "4" "0.19"
## sex:diagnosis:region         "1.532"        "4" "0.821"
## feature:region               "0.652"        "2" "0.722"
## sex:feature:region           "0.423"        "2" "0.81"
## diagnosis:feature:region     "7.142"        "4" "0.129"
## sex:diagnosis:feature:region "2.274"        "4" "0.686"
##
## ANOVA-Type Statistic (ATS):
##                              Test statistic df1     df2      p-value
## sex                          "9.973"        "1"     "36.497" "0.003"
## diagnosis                    "13.124"       "1.343" "36.497" "<0.001"
## sex:diagnosis                "1.904"        "1.343" "36.497" "0.174"
## feature                      "0.086"        "1"     "Inf"    "0.769"
## sex:feature                  "2.167"        "1"     "Inf"    "0.141"
## diagnosis:feature            "1.437"        "1.562" "Inf"    "0.238"
## sex:diagnosis:feature        "1.031"        "1.562" "Inf"    "0.342"
## region                       "0.018"        "1.611" "Inf"    "0.965"
## sex:region                   "0.371"        "1.611" "Inf"    "0.644"
## diagnosis:region             "1.091"        "2.046" "Inf"    "0.337"
## sex:diagnosis:region         "0.376"        "2.046" "Inf"    "0.691"
## feature:region               "0.126"        "1.421" "Inf"    "0.81"
## sex:feature:region           "0.077"        "1.421" "Inf"    "0.864"
## diagnosis:feature:region     "0.829"        "1.624" "Inf"    "0.415"
## sex:diagnosis:feature:region "0.611"        "1.624" "Inf"    "0.51"
##
## p-values resampling:
##                              WildBS (WTS) WildBS (ATS)
## sex                          "0.003"      "0.003"
## diagnosis                    "<0.001"     "<0.001"
## sex:diagnosis                "0.127"      "0.131"
## feature                      "0.769"      "0.769"
## sex:feature                  "0.141"      "0.141"
## diagnosis:feature            "0.072"      "0.251"
## sex:diagnosis:feature        "0.467"      "0.388"
## region                       "0.976"      "0.986"
## sex:region                   "0.693"      "0.704"
## diagnosis:region             "0.154"      "0.344"
## sex:diagnosis:region         "0.857"      "0.817"
## feature:region               "0.799"      "0.936"
## sex:feature:region           "0.888"      "0.942"
## diagnosis:feature:region     "0.111"      "0.507"
## sex:diagnosis:feature:region "0.741"      "0.657"

We find significant effects at level $$\alpha = 0.01$$ of the between-subject factors sex and diagnosis, while none of the within-subject factors or interactions become significant.

### Plotting

The RM() function is equipped with a plotting option, displaying the calculated means along with the $$(1-\alpha)$$ confidence intervals. The plot function takes an RM object as an argument. Furthermore, additional graphical parameters can be used to customize the plots. The optional argument legendpos specifies the position of the legend in higher-way layouts, while the argument gap (default 0.1) specifies the distance between the error bars.

plot(model1, leg = FALSE)

For illustration purposes, we reduce the EEG-model above to a two-way design:

EEG4$resampling[, 2], EEG5$resampling[, 2], EEG6$resampling[, 2]), method = "bonferroni") ## [1] 0.024 0.096 0.006 0.006 0.066 0.000 This reveals that the central variables (comparison 2 and 5) do not contribute to the significant difference between male and female patients. ### Nested Design To create a data example for a nested design, we use the curdies data set from the GFD package and extend it by introducing an artificial second outcome variable. In this data set, the levels of the nested factor (site) are named uniquely, i.e., levels 1-3 of factor site belong to “WINTER”, whereas levels 4-6 belong to “SUMMER”. Therefore, nested.levels.unique must be set to TRUE. The code for the analysis using both wide and long format is presented below. if (requireNamespace("GFD", quietly = TRUE)) { library(GFD) data(curdies) set.seed(123) curdies$dug2 <- curdies$dugesia + rnorm(36) # first possibility: MANOVA.wide fit1 <- MANOVA.wide(cbind(dugesia, dug2) ~ season + season:site, data = curdies, iter = 1000, nested.levels.unique = TRUE, seed = 123) # second possibility: MANOVA (long format) dug <- c(curdies$dugesia, curdies$dug2) season <- rep(curdies$season, 2)
site <- rep(curdies\$site, 2)
curd <- data.frame(dug, season, site, subject = rep(1:36, 2))

fit2 <- MANOVA(dug ~ season + season:site, data = curd, subject = "subject", nested.levels.unique = TRUE, seed = 123, iter = 1000)

# comparison of results
summary(fit1)
summary(fit2)
}
## Call:
## cbind(dugesia, dug2) ~ season + season:site
##
## Descriptive:
##         season site n dugesia   dug2
## 1 SUMMER          4 6   0.419 -0.050
## 2 SUMMER          5 6   0.229  0.028
## 3 SUMMER          6 6   0.194  0.763
## 4 WINTER          1 6   2.049  2.497
## 5 WINTER          2 6   4.182  4.123
## 6 WINTER          3 6   0.678  0.724
##
## Wald-Type Statistic (WTS):
##             Test statistic df  p-value
## season      "6.999"        "2" "0.03"
## season:site "16.621"       "8" "0.034"
##
## modified ANOVA-Type Statistic (MATS):
##             Test statistic
## season              12.296
## season:site         15.064
##
## p-values resampling:
##             paramBS (WTS) paramBS (MATS)
## season      "0.064"       "0.036"
## season:site "0.305"       "0.249"
## Call:
## dug ~ season + season:site
##
## Descriptive:
##         season site n Mean 1 Mean 2
## 1 SUMMER          4 6  0.419 -0.050
## 2 SUMMER          5 6  0.229  0.028
## 3 SUMMER          6 6  0.194  0.763
## 4 WINTER          1 6  2.049  2.497
## 5 WINTER          2 6  4.182  4.123
## 6 WINTER          3 6  0.678  0.724
##
## Wald-Type Statistic (WTS):
##             Test statistic df  p-value
## season      "6.999"        "2" "0.03"
## season:site "16.621"       "8" "0.034"
##
## modified ANOVA-Type Statistic (MATS):
##             Test statistic
## season              12.296
## season:site         15.064
##
## p-values resampling:
##             paramBS (WTS) paramBS (MATS)
## season      "0.064"       "0.036"
## season:site "0.305"       "0.249"

## The multRM() function

The multRM() function provides a combination of the approaches described above. It is suitable for repeated measures designs, in which multiple outcomes have been recorded at each time point. The multRM() function takes as arguments:

• formula: A model formula object. The left hand side contains the matrix of response variables (using cbind()) and the right hand side contains the factor variables of interest. The within-subject factor(s) must be specified last in the formula, e.g. cbind(outcome1, outcome2) ~ between1 * between2 * within1 * within2.
• data: A data.frame, list or environment containing the variables in formula. Data must be in long format and must not contain missing values.
• subject: The column name of the subjects in the data. NOTE: Subjects within different groups of between-subject factors must have individual labels.
• within: Specifies the within-subject factor(s) in the formula.
• iter: The number of iterations used for calculating the resampled statistic. The default option is 10,000.
• alpha: A number specifying the significance level; the default is 0.05.
• resampling: The resampling method to be used, one of “paramBS” (parametric bootstrap approach) and “WildBS” (wild bootstrap approach with Rademacher weights).
• para: Logical, indicating whether parallel computing should be used. Default is FALSE.
• CPU: The number of cores used for parallel computing. If omitted, cores are detected automatically.
• seed: A random seed for the resampling procedure. If omitted, no reproducible seed is set.
• dec: Number of decimals the results should be rounded to. Default is 3.

As an example, we again use the EEG dataset. This time, imagine we have two outcomes (brainrate and complexity) measured for each of the three regions (within-subject factor). We additionally consider the between-subject factor sex. The tidyr package can be used to transform our original data to this format.

if (requireNamespace("tidyr", quietly = TRUE)) {
library(tidyr)
eeg <- spread(EEG, feature, resp)
fit <- multRM(cbind(brainrate, complexity) ~ sex * region, data = eeg, subject = "id", within = "region", iter = 1000)
summary(fit)
}
## Call:
## cbind(brainrate, complexity) ~ sex * region
## A multivariate repeated measures analysis with  1 within-subject factor(s) ( region )and  1 between-subject factor(s).
##
## Descriptive:
##   sex   region   n brainrate  complexity
## 1   M  central  59    -0.254      -0.302
## 2   M  frontal  59    -0.335      -0.399
## 3   M temporal  59    -0.294      -0.386
## 4   W  central 101     0.149       0.176
## 5   W  frontal 101     0.195       0.233
## 6   W temporal 101     0.172       0.226
##
## Wald-Type Statistic (WTS):
##            Test statistic df  p-value
## sex        "12.45"        "2" "0.002"
## region     "0.192"        "4" "0.996"
## sex:region "2.79"         "4" "0.593"
##
## modified ANOVA-Type Statistic (MATS):
##            Test statistic
## sex                54.203
## region              0.048
## sex:region          0.703
##
## p-values resampling:
##            paramBS (WTS) paramBS (MATS)
## sex        "0.002"       "<0.001"
## region     "0.997"       "0.992"
## sex:region "0.609"       "0.431"

The output is similar to that of MANOVA() described above.

## References

• Bathke, A. et al. (2018). Testing Mean Differences among Groups: Multivariate and Repeated Measures Analysis with Minimal Assumptions. Multivariate Behavioral Research, 53(3), 348-359, DOI: 10.1080/00273171.2018.1446320.

• Brunner, E. (2001). Asymptotic and approximate analysis of repeated measures designs under heteroscedasticity. Mathematical Statistics with Applications in Biometry.

• Friedrich, S., Brunner, E. and Pauly, M. (2017). Permuting longitudinal data in spite of the dependencies. Journal of Multivariate Analysis, 153, 255-265.

• Friedrich, S., and Pauly, M. (2018). MATS: Inference for potentially singular and heteroscedastic MANOVA. Journal of Multivariate Analysis, 165, 166-179, DOI: 10.1016/j.jmva.2017.12.008.

• Friedrich, S., Konietschke, F., and Pauly, M. (2019). Resampling-Based Analysis of Multivariate Data and Repeated Measures Designs with the R Package MANOVA.RM. The R Journal, 11(2), 380-400.

• Konietschke, F., Bathke, A. C., Harrar, S. W. and Pauly, M. (2015). Parametric and nonparametric bootstrap methods for general MANOVA. Journal of Multivariate Analysis, 140, 291-301.