We list here some recurrent problems reported by users. Other issues, questions, and concerns may have been reported on GitHub: https://github.com/CecileProust-Lima/lcmm/issues/ .

**Please check GitHub, both closed and open issues, before sending any question. And please ask questions via GitHub only.**

It sometimes happens that a model does not converge correctly. Most of the time, this is due to the very stringent criterion based on the “derivatives”. This criterion uses both the first derivatives and the inverse of the matrix of second derivatives (Hessian). It ensures that the program converges at a maximum. When it cannot be computed correctly (most of the time because the Hessian is not positive definite), the program reports “second derivatives = 1”.

**There are several reasons that may induce non-convergence, e.g.:**

- When the time variable (or more generally a variable with random effects) is expressed in a unit that makes the associated parameters too small (for instance, very small changes per day). In that case, changing the scale (for instance to months or years) may solve the problem.
- In models with splines in the link function (lcmm, multlcmm, Jointlcmm, mpjlcmm) or with splines in the baseline risk function (Jointlcmm, mpjlcmm), a spline parameter very close to zero may prevent correct convergence because it lies at the border of the parameter space. In that case, this parameter can be fixed to 0 and convergence should be reached immediately.
- When the data are not rich enough and/or the model is too complicated. In that case, this is a problem of numerical non-identifiability. There are not many solutions (other than simplifying the model), but some directions may be:
    - Be patient: after some iterations the matrix of second derivatives may become invertible and the model may converge (you can change maxiter or rerun the program from the estimates at the non-convergence point). It is usually not necessary to specify more than 100 or 200 iterations, as the iterative algorithm usually converges in a few dozen iterations.
    - You can try a less stringent threshold (e.g., 0.01), but be careful: convergence might be of lower quality.
    - Run the model from different initial values.
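The directions above can be sketched on the paquid data shipped with the package. This is only an illustration under stated assumptions: the variable `time_dec` and the object names are hypothetical, the "less stringent threshold" is taken to be the `convG` argument (the criterion on the derivatives), and restarting from `m2$best` assumes that this vector is in the order and scale expected by `B`.

```r
library(lcmm)

# Express time since entry in decades rather than years, so that the
# slope parameters are larger in absolute value (time_dec is a made-up name)
paquid$time_dec <- (paquid$age - paquid$age_init) / 10

# One-class model, used below to derive initial values
m1 <- hlme(IST ~ time_dec, random = ~ time_dec, subject = "ID", data = paquid)

# Two-class model with a cap of 200 iterations and a looser threshold
# on the derivative criterion (assumed here to be convG)
set.seed(1234)
m2 <- hlme(IST ~ time_dec, random = ~ time_dec, mixture = ~ time_dec,
           subject = "ID", data = paquid, ng = 2,
           B = random(m1), maxiter = 200, convG = 0.01)

# conv equals 1 when all convergence criteria are satisfied
m2$conv

# Rerun from the estimates reached by the previous run
# (assumption: m2$best can be passed directly as B)
m2b <- hlme(IST ~ time_dec, random = ~ time_dec, mixture = ~ time_dec,
            subject = "ID", data = paquid, ng = 2, B = m2$best)

# Different initial values: 10 random starts of 15 iterations each,
# then full estimation from the start with the best log-likelihood
m2g <- gridsearch(rep = 10, maxiter = 15, minit = m1,
                  hlme(IST ~ time_dec, random = ~ time_dec,
                       mixture = ~ time_dec, subject = "ID",
                       data = paquid, ng = 2))
```

A model obtained with a relaxed threshold should be reported as such, since the estimates may be further from the maximum.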

Selection of the number of latent classes is a complex question. In some cases, the number is known. When not, different tools can be used to guide the decision:

- Several statistical criteria such as BIC, SABIC, ICL or Entropy
- Statistical tests when available: score test for conditional independence in joint models
- Discrimination power as described by the classification table using the command postprob
- Size of the classes (we can consider that classes should be larger than 1% or 5% depending on the context)
- Clinical aspects and interpretation should also be taken into account

Finally, it can be useful to present and contrast models with different numbers of latent classes.

The complexity of the selection of the optimal number of latent classes is illustrated in vignette: https://cran.r-project.org/package=lcmm/vignettes/latent_class_model_with_hlme.html . Indeed, all the criteria may not be concordant in practice.
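As a rough sketch of how the statistical criteria listed above might be gathered side by side, models with 1 to 3 classes can be estimated and passed to `summarytable`. The object names `m1`, `m2` and `m3` are hypothetical, and the set of values accepted by `which` is assumed to match a recent version of the package.

```r
library(lcmm)

# One-class model, also used to generate initial values
m1 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
           subject = "ID", data = paquid)

set.seed(1234)
# Two- and three-class models initialised at random from the one-class fit
m2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
           mixture = ~ I(age - age_init), subject = "ID",
           data = paquid, ng = 2, B = random(m1))
m3 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
           mixture = ~ I(age - age_init), subject = "ID",
           data = paquid, ng = 3, B = random(m1))

# One row per model: fit criteria, entropy and class proportions
summarytable(m1, m2, m3,
             which = c("G", "loglik", "conv", "npm", "AIC", "BIC",
                       "SABIC", "entropy", "ICL", "%class"))
```

The `conv` column is worth checking first, since criteria from a model that did not converge are not comparable with the others.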

Good discrimination of classes is usually sought when fitting latent class mixed models. Discriminatory power can be assessed using the entropy criterion (provided in summarytable) but also using the classification table (with command postprob). The description of the classes may also help comprehend the latent class structure.

(see vignette https://cran.r-project.org/package=lcmm/vignettes/latent_class_model_with_hlme.html for further details)
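A minimal sketch of the classification table, assuming a two-class model estimated on paquid (the object names are illustrative):

```r
library(lcmm)

# One-class model used to initialise the two-class model
mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Posterior classification, classification table and proportions of
# subjects whose posterior probability exceeds given thresholds
postprob(mhlme2)

# The posterior probabilities are also stored in the fitted object
head(mhlme2$pprob)
```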

Different techniques can be used in this package to evaluate the goodness of fit. As in mixed models, one can compare the subject-specific predictions with the observations or plot the subject-specific residuals.

The comparison with more flexible models can also be useful (more flexible link functions, more flexible baseline risk functions, more flexible functions of time, etc.)

Each vignette includes a section on the evaluation of the model.
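The checks mentioned above could look as follows for an hlme model. This is a sketch rather than a recommended procedure; the model is the two-class example on paquid, and `var.time = "age"` is an assumption about which variable to use as the time axis.

```r
library(lcmm)

mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Residual plots (marginal and subject-specific)
plot(mhlme2)

# Mean subject-specific predictions against mean observations,
# by time interval and latent class
plot(mhlme2, which = "fit", var.time = "age", marg = FALSE, shades = TRUE)

# Predictions and residuals are also available for custom checks
head(mhlme2$pred)
```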

This is detailed in vignette on pre-normalizing: https://cran.r-project.org/package=lcmm/vignettes/pre_normalizing.html

The order of the latent classes can be changed for a model estimated with any function (hlme, lcmm, Jointlcmm, mpjlcmm) using the permut function. Here is an example with the estimation of a two-class linear mixed model:

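```
library(lcmm)
# One-class model, used to generate initial values for the two-class model
mhlme <- hlme(IST ~ I(age-age_init), random=~ I(age-age_init), subject="ID", data=paquid)
set.seed(1234)
# Two-class model, initialised at random from the one-class estimates
mhlme2 <- hlme(IST ~ I(age-age_init), random=~ I(age-age_init), subject="ID", data=paquid, ng=2,
               mixture=~ I(age-age_init), classmb=~ CEP, B=random(mhlme))
```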

```
summary(mhlme2)
Heterogenous linear mixed model
fitted by maximum likelihood method
hlme(fixed = IST ~ I(age - age_init), mixture = ~I(age - age_init),
random = ~I(age - age_init), subject = "ID", classmb = ~CEP,
ng = 2, data = paquid)
Statistical Model:
Dataset: paquid
Number of subjects: 494
Number of observations: 2052
Number of observations deleted: 198
Number of latent classes: 2
Number of parameters: 10
Iteration process:
Convergence criteria satisfied
Number of iterations: 27
Convergence criteria: parameters= 1.4e-05
: likelihood= 3.5e-06
: second derivatives= 6.8e-11
Goodness-of-fit statistics:
maximum log-likelihood: -6010.69
AIC: 12041.38
BIC: 12083.4
Maximum Likelihood Estimates:
Fixed effects in the class-membership model:
(the class of reference is the last class)
coef Se Wald p-value
intercept class1 2.82683 1.02233 2.765 0.00569
CEP class1 -3.37138 1.00112 -3.368 0.00076
Fixed effects in the longitudinal model:
coef Se Wald p-value
intercept class1 25.54344 0.49570 51.530 0.00000
intercept class2 32.93898 0.54910 59.988 0.00000
I(...) class1 -0.51254 0.05031 -10.188 0.00000
I(...) class2 -0.57198 0.04930 -11.601 0.00000
Variance-covariance matrix of the random-effects:
intercept I(age - age_init)
intercept 10.89142
I(age - age_init) 0.03929 0.09386
coef Se
Residual standard error: 3.36817 0.06757
```

The order of the latent classes can be changed by running:

```
# Swap the two classes: former class 2 becomes class 1 and vice versa
mhlme2perm <- permut(mhlme2, order=c(2,1))
summary(mhlme2perm)
```

The models in objects mhlme2 and mhlme2perm are the same except for the permutation of the classes as shown with the cross-table:
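The cross-table itself is not reproduced here. A sketch of how it might be obtained, assuming the `xclass` function of the package and the permutation `order = c(2, 1)`:

```r
library(lcmm)

mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Same model with the two classes swapped
mhlme2perm <- permut(mhlme2, order = c(2, 1))

# Cross-classification of the subjects in the two models: every subject
# of class 1 in mhlme2 is expected to fall in class 2 of mhlme2perm
xclass(mhlme2, mhlme2perm)
```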

An object returned by an estimation function of the lcmm package (hlme, lcmm, Jointlcmm, mpjlcmm) provides predictions for the data on which the model was estimated.

Several functions allow the same type of computations on external data:

predictClass function computes the posterior classification and the posterior class-membership probabilities from any latent class model estimated in package lcmm.

For instance, using the 2-class model estimated above, the posterior probabilities and classification can be computed for the newdata, here the data of the second subject of paquid dataset:
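The call is not shown above. A minimal sketch, assuming that rows 2 to 6 of paquid hold the repeated measures of the second subject:

```r
library(lcmm)

mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Posterior class and class-membership probabilities for the new subject
pclass <- predictClass(mhlme2, newdata = paquid[2:6, ])
pclass
```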

predictRE function computes the predicted random-effects of any model estimated within lcmm package for a new subject whose data (i.e., covariates and outcomes) are provided in newdata:
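For illustration, the same rows of paquid can be passed to `predictRE` (again assuming rows 2 to 6 belong to one subject; the object names are hypothetical):

```r
library(lcmm)

mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Predicted random intercept and slope for the new subject,
# computed from the subject's covariates and observed outcomes
re <- predictRE(mhlme2, newdata = paquid[2:6, ])
re
```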

predictY function provides by default the mean outcome value predicted for a profile of covariates (provided in newdata). This prediction (with marg=TRUE by default) is a marginal prediction.
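A sketch of a marginal prediction for a hypothetical covariate profile (entry at age 65, CEP = 1, ages 65 to 95); the data frame `newdat` is made up for the example and is assumed to contain every covariate used in the model:

```r
library(lcmm)

mhlme <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
              subject = "ID", data = paquid)
set.seed(1234)
mhlme2 <- hlme(IST ~ I(age - age_init), random = ~ I(age - age_init),
               mixture = ~ I(age - age_init), classmb = ~ CEP,
               subject = "ID", data = paquid, ng = 2, B = random(mhlme))

# Hypothetical profile of covariates
newdat <- data.frame(age = seq(65, 95, by = 5), age_init = 65, CEP = 1)

# Class-specific marginal predictions of the outcome (marg = TRUE by default)
predm <- predictY(mhlme2, newdata = newdat, var.time = "age")
predm$pred
```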

Subject-specific predictions can currently be obtained only for hlme. In that case, set the option marg=FALSE and make sure that newdata includes the outcome data, so that the random effects can be predicted and included in the prediction.

```
predss <- predictY(mhlme2, paquid[2:6,], marg=FALSE)
predss$pred
ID pred_ss pred_ss1 pred_ss2
2 2 27.25200 26.21337 28.52598
3 2 25.96443 25.05766 27.07668
4 2 23.07531 22.46442 23.82464
5 2 16.73826 16.77635 16.69155
6 2 14.93028 15.15352 14.65646
```

Predictions of the outcomes can also be directly computed for a latent process value. This is useful when nonlinear link functions are used in the models. This is obtained with function predictYcond.
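As an illustration only, the following sketch fits a model with a splines link function on the CESD outcome of paquid and predicts the outcome for a grid of latent process values. The model specification, the variable `age65` and the grid passed to `lprocess` are assumptions made for the example.

```r
library(lcmm)

# Age centred at 65 and expressed in decades (age65 is a made-up name)
paquid$age65 <- (paquid$age - 65) / 10

# Latent process mixed model with a nonlinear (I-splines) link function
mspl <- lcmm(CESD ~ age65 * male, random = ~ age65, subject = "ID",
             data = paquid, link = "5-quant-splines")

# Predicted outcome values for given values of the latent process
pcond <- predictYcond(mspl, lprocess = seq(-1, 5, length.out = 25))
head(pcond$pred)
```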