This vignette introduces the sizeMat package and shows how to estimate size at morphometric and gonad maturity.

Install package

install.packages("sizeMat")
library(sizeMat)

1. Size at morphometric maturity

To estimate morphometric maturity, use the crabdata data set. It contains allometric measurements and other attributes (year, month, sex category) of 223 crabs of the species Chionoecetes tanneri.

Load data

data(crabdata)

head(crabdata)
##   year month carapace_width carapace_length chela_height chela_width
## 1 1974     1            106             107         14.0          22
## 2 1974     1            129             129         27.0          44
## 3 1974     1            119             122         14.6          23
## 4 1974     1            115             118         18.6          29
## 5 1974     1             97              97         11.0          17
## 6 1974     1             94              96         10.0          15
##   sex_category
## 1            m
## 2            m
## 3            m
## 4            m
## 5            m
## 6            m
names(crabdata)
## [1] "year"            "month"           "carapace_width"  "carapace_length"
## [5] "chela_height"    "chela_width"     "sex_category"

The estimation of morphometric maturity involves two steps:

The first is classification, in which the individuals are assigned to one of two groups (juveniles or adults). The second is estimation, which uses that classification to estimate the size at morphometric maturity. The details are given below.

Classify data (print and plot)

The classify_mature function classifies the individuals into two groups (juveniles = 0 and adults = 1). The classification is based on a Principal Components Analysis (PCA) of two log-transformed allometric variables (x: independent variable, y: dependent variable), which makes it possible to distinguish two groups representing juveniles and adults. The individuals are assigned to each group using a hierarchical classification procedure (hierarchical clustering with agglomeration method “Ward.D” and distance measure “euclidean”). This method establishes a predetermined number of groups (in this case, two) and assigns each individual to one of the groups according to its scores on the two PCA axes (Corgos & Freire, 2006).

Using the results of the classification (PCA + cluster), a discriminant analysis (linear or quadratic) is conducted to obtain a discriminant function that allows any individual to be classified as a juvenile or an adult on the basis of the X and Y variables.
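The steps described above (PCA on the log-transformed variables, Ward clustering into two groups, then a discriminant function) can be sketched with base R and the MASS package. This is an illustrative sketch on simulated data and should not be read as the internals of classify_mature. The simulated coefficients, the rule that labels the larger-bodied cluster as adults, and all object names are assumptions made for the example.

```r
# Illustrative sketch only: simulated data, not the sizeMat implementation.
set.seed(1)
n <- 200
x <- runif(n, 80, 160)                          # hypothetical carapace width
is_adult <- x > 118                             # hypothetical true maturity
y <- ifelse(is_adult, -11 + 0.27 * x, -4 + 0.16 * x) + rnorm(n, sd = 1)

logxy <- data.frame(x = log(x), y = log(y))     # log-transform both variables

# 1) PCA on the two log-transformed allometric variables
pca <- prcomp(logxy, center = TRUE, scale. = TRUE)

# 2) Hierarchical clustering of the PCA scores, cut into two groups
hc  <- hclust(dist(pca$x, method = "euclidean"), method = "ward.D")
grp <- cutree(hc, k = 2)

# 3) Assumption: the cluster with the larger mean x is the adult group
mature <- as.integer(grp == which.max(tapply(x, grp, mean)))

# 4) Linear discriminant function built from the cluster labels
fit  <- MASS::lda(mature ~ x + y, data = data.frame(logxy, mature = mature))
pred <- predict(fit)$class

table(cluster = mature, discriminant = pred)
```

The final table compares the cluster labels with the labels reassigned by the discriminant function; replacing MASS::lda with MASS::qda would give the quadratic variant.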

The classify_mature function requires a data.frame (e.g. crabdata) with allometric variables and sex category. The argument varNames requires the names of exactly two allometric variables, and varSex requires the name of the variable containing sex information. If the argument selectSex is NULL, all individuals are used in the classification analysis. Finally, the argument method selects the discriminant analysis to be used (“ld”: linear discriminant analysis, “qd”: quadratic discriminant analysis). We recommend beginning the analysis with method = "ld".

The classify_mature function returns an object of class “classify”, with the allometric variables “x” (independent) and “y” (dependent), and the classification of maturity (juveniles = 0, adults = 1).

# For all individuals
classify_data = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"),
                                varSex = "sex_category", selectSex = NULL, method = "ld")
## all individuals were used in the analysis
# For males only
classify_data_males = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"),
                                      varSex = "sex_category", selectSex = "m", method = "ld")
## only m-sex were used in the analysis

Print the results of the “classify_data” object. The output shows the number of juveniles and adults after classification, the linear regression fitted to each group, and a test of the difference between the two slopes (ANCOVA).

print(classify_data)
## Number in juvenile group = 83 
## 
## Number in adult group = 140 
## 
## -------------------------------------------------------- 
## 1) Linear regression for juveniles 
## 
## Call:
## glm(formula = y ~ x, data = juv)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.77010  -0.57399   0.09397   0.56605   1.99008  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.794687   0.497056  -7.634 3.93e-11 ***
## x            0.161327   0.004701  34.314  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.7320842)
## 
##     Null deviance: 921.306  on 82  degrees of freedom
## Residual deviance:  59.299  on 81  degrees of freedom
## AIC: 213.63
## 
## Number of Fisher Scoring iterations: 2
## 
## -------------------------------------------------------- 
## 2) Linear regression for adults 
## 
## Call:
## glm(formula = y ~ x, data = adt)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.3055  -1.0932  -0.0628   1.1178   3.2759  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -11.246726   1.199496  -9.376   <2e-16 ***
## x             0.273837   0.008648  31.663   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 2.265729)
## 
##     Null deviance: 2584.24  on 139  degrees of freedom
## Residual deviance:  312.67  on 138  degrees of freedom
## AIC: 515.79
## 
## Number of Fisher Scoring iterations: 2
## 
## -------------------------------------------------------- 
## 3) Difference between slopes (ANCOVA) 
##               Estimate  Std. Error   t value     Pr(>|t|)
## (Intercept) -3.7946869 0.757105677 -5.012097 1.109526e-06
## x            0.1613275 0.007161179 22.528064 6.035478e-59
## mature      -7.4520389 1.285219562 -5.798261 2.320729e-08
## x:mature     0.1125093 0.010361046 10.858878 2.956242e-22
## [1] "slopes are different"
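
The third block of the printed output has the shape of a linear model with an interaction between x and the maturity indicator. A minimal sketch of such a slope comparison, using simulated data and base R only, is shown here. The data, the 0.05 threshold and the object names are assumptions for illustration; the package may apply a different decision rule.

```r
# Illustrative sketch only: simulated data, hypothetical 0.05 threshold.
set.seed(3)
n <- 200
x <- runif(n, 80, 160)
mature <- as.integer(x > 118)                   # 0 = juvenile, 1 = adult
y <- ifelse(mature == 1, -11 + 0.27 * x, -4 + 0.16 * x) + rnorm(n, sd = 1)

# The x:mature coefficient is the difference between adult and juvenile slopes
fit   <- lm(y ~ x * mature)
coefs <- summary(fit)$coefficients
p_int <- coefs["x:mature", "Pr(>|t|)"]

coefs
if (p_int < 0.05) "slopes are different" else "slopes are not different"
```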

The following example shows the arguments that can be used when plotting the “classify_data” object:

par(mfrow = c(2,2))
plot(classify_data)

plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", legendPlot = FALSE)
legend("topleft", "Put your legend here", bty = "n")

plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", 
     col = c(2, 3), pch = c(5, 6), legendPlot = TRUE)

plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", 
     col = c(2, 3), pch = c(5, 6), lty_lines = c(1, 2), lwd_lines = c(1, 3), 
     cex = c(1, 3), main = "Classification")

Estimate size at morphometric maturity (maturity ogive estimation)

The morph_mature function uses the logit approach (frequentist or Bayesian). The size at 50% maturity (\(L_{50}\)) is estimated as the length at which a randomly chosen specimen has a 50% chance of being mature (Somerton 1980, Roa et al. 1999).

In the regression analysis, \(X\) (e.g. carapace width) is the explanatory variable and the classification of maturity \(CS\) (juveniles: 0, adults: 1) is the binomial response variable. The variables are fitted to a logit function of the form:

\[P_{CS} = \frac{1}{1+e^{-(\hat{\beta}_{0} + \hat{\beta}_{1}*X)}}\]

where \(P_{CS}\) is the probability that an individual is mature at a given length \(X\), and \(\hat{\beta}_{0}\) (intercept) and \(\hat{\beta}_{1}\) (slope) are the estimated parameters. The \(L_{50}\) is calculated as:

\[L_{50} = -\frac{\hat{\beta}_{0}}{\hat{\beta}_{1}}\]
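
This expression follows directly from the logit function: setting \(P_{CS} = 0.5\) at \(X = L_{50}\) and solving for \(L_{50}\) gives

\[\frac{1}{1+e^{-(\hat{\beta}_{0} + \hat{\beta}_{1}*L_{50})}} = \frac{1}{2} \;\Rightarrow\; e^{-(\hat{\beta}_{0} + \hat{\beta}_{1}*L_{50})} = 1 \;\Rightarrow\; \hat{\beta}_{0} + \hat{\beta}_{1}*L_{50} = 0 \;\Rightarrow\; L_{50} = -\frac{\hat{\beta}_{0}}{\hat{\beta}_{1}}\]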

The morph_mature function requires an object of class “classify” with the allometric variables X and Y and the classification of maturity (juveniles = 0, adults = 1).

The argument method requires a character string indicating which regression will be used. If method = "fq", the logit regression is fitted with a GLM (frequentist). If method = "bayes", a sample from the posterior distribution of a logit regression model is generated using a random walk Metropolis algorithm (see the MCMClogit function of the MCMCpack package).
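
To make the random walk Metropolis idea concrete, here is a small hand-written sampler for a logit regression in base R. It is a sketch under stated assumptions and not the MCMClogit implementation: the data are simulated, the prior is flat, the proposal covariance is taken from a preliminary GLM fit, and the scaling factor 1.1 is an arbitrary choice.

```r
# Illustrative sketch only: a hand-written sampler, not MCMClogit.
set.seed(7)
n <- 300
x <- runif(n, 80, 160)
y <- rbinom(n, 1, plogis(-20 + 0.17 * x))       # simulated maturity (0/1)
X <- cbind(1, x)

# Flat prior (assumption): the log-posterior equals the log-likelihood
log_post <- function(beta) {
  eta <- drop(X %*% beta)
  sum(y * eta - log1p(exp(eta)))
}

fit <- glm(y ~ x, family = binomial)
L   <- t(chol(vcov(fit)))                       # proposal shape from the GLM fit

niter <- 5000
draws <- matrix(NA_real_, nrow = niter, ncol = 2)
beta  <- coef(fit)                              # start the chain at the GLM estimate
lp    <- log_post(beta)

for (i in seq_len(niter)) {
  prop    <- beta + 1.1 * drop(L %*% rnorm(2))  # random walk proposal
  lp_prop <- log_post(prop)
  if (log(runif(1)) < lp_prop - lp) {           # Metropolis acceptance step
    beta <- prop
    lp   <- lp_prop
  }
  draws[i, ] <- beta
}

l50_draws <- -draws[, 1] / draws[, 2]           # L50 for every posterior draw
quantile(l50_draws, c(0.025, 0.5, 0.975))
```

Each posterior draw of the intercept and slope yields one value of \(L_{50}\), so the median and interval of \(L_{50}\) come directly from the sampled values.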

The argument niter requires a number. For the GLM regression (method = "fq"), a non-parametric bootstrap is used: B bootstrap samples are generated by resampling the original data with replacement, and the statistics for each parameter (median and confidence intervals) are then calculated from the bootstrap samples. For method = "bayes", the argument niter is related to the number of Metropolis iterations for the sampler.
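
The non-parametric bootstrap for the frequentist case can be sketched in a few lines of base R. This is an illustration on simulated data, not the code used by morph_mature; the helper l50() and the values of n and B are hypothetical.

```r
# Illustrative sketch only: simulated data, hypothetical helper l50().
set.seed(42)
n <- 300
x <- runif(n, 80, 160)
mature <- rbinom(n, 1, 1 / (1 + exp(-(-20 + 0.17 * x))))
dat <- data.frame(x = x, mature = mature)

# Fit the logit regression and return L50 = -intercept / slope
l50 <- function(d) {
  b <- coef(glm(mature ~ x, family = binomial(link = "logit"), data = d))
  unname(-b[1] / b[2])
}

l50_hat <- l50(dat)                             # estimate from the original data

# Resample rows with replacement B times and refit each time
B <- 200
boot <- replicate(B, l50(dat[sample(nrow(dat), replace = TRUE), ]))

ci <- quantile(boot, c(0.025, 0.5, 0.975))      # 95% interval and median
l50_hat
ci
```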

The output is an object of class “morphMat”. This object contains a data frame with the allometric variables X and Y and the classification of maturity, as well as the fitted values of the logit regression and their 95% confidence intervals. Printing the object shows the median of the estimated size at morphometric maturity (\(L_{50}\)) and the parameter estimates.

#Frequentist regression 
my_ogive_fq = morph_mature(classify_data, method = "fq", niter = 1000)

print(my_ogive_fq)
## formula: Y = 1/1+exp-(A + B*X)
##     Original Bootstrap (Median)
## A   -20.753  -20.9725          
## B   0.1748   0.1767            
## L50 118.7237 118.6334          
## R2  -        0.7111
#Bayesian regression
my_ogive_bayes = morph_mature(classify_data, method = "bayes", niter = 1000)

print(my_ogive_bayes)
## formula: Y = 1/1+exp-(A + B*X)
##     Bootstrap (Median)
## A             -20.6964
## B               0.1744
## L50           118.4079
## R2              0.7111
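
The printed formula is shorthand for the logit function given earlier, with A as the intercept and B as the slope. As a quick check, the rounded “Original” estimates from the frequentist output can be plugged into it by hand. The function name ogive below is a hypothetical helper, and small differences from the printed \(L_{50}\) are expected because the printed coefficients are rounded.

```r
# Rounded "Original" estimates copied from the frequentist print output
A <- -20.753
B <- 0.1748

# Logit function: probability of being mature at size x (hypothetical helper)
ogive <- function(x) 1 / (1 + exp(-(A + B * x)))

L50 <- -A / B                                   # size at 50% maturity
L50
ogive(L50)                                      # probability at L50
ogive(c(100, 110, 120, 130, 140))               # probabilities at a few widths
```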

Plot maturity ogive

To plot the maturity ogive, an object of class “morphMat” is required. The plot function generates four graphics: the first three are histograms of the A and B parameters and of the size at morphometric maturity (\(L_{50}\)), and the last is the maturity ogive.

par(mfrow = c(2,2))
plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"))

## Size at morphometric maturity = 118.6 
## Confidence intervals = 116 - 121.1 
## Rsquare = 0.71
par(mfrow = c(2,2))
plot(my_ogive_bayes, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"))

## Size at morphometric maturity = 118.4 
## Confidence intervals = 115.9 - 120.8 
## Rsquare = 0.71

If you want the maturity ogive plot only, add the argument onlyOgive = TRUE. You can also modify the axes (size, rotation, etc.) and add a legend.

plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE)

## Size at morphometric maturity = 118.6 
## Confidence intervals = 116 - 121.1 
## Rsquare = 0.71

This methodology has been used mainly to estimate morphological sexual maturity in crabs, but it can be extended to other taxa, as reported by Agostinho (2000).

2. Size at gonad maturity

To estimate gonadal maturity, use the matFish data set. This data set contains two variables:

  • total_length: Total length in cm.

  • stage_mat: The gonadal maturation stages: I, II, III, IV, where I is considered immature.

Load data

data(matFish)

head(matFish)
##   total_length stage_mat
## 1           12         I
## 2           12         I
## 3           13         I
## 4           14         I
## 5           14         I
## 6           14         I

The function used to estimate gonadal maturity is gonad_mature. This function uses the logistic approach.

The gonad_mature function requires a data.frame with allometric variables (e.g. total length, fork length, carapace width) and a variable containing the stages of sexual maturity (gonadal maturation stages).

The argument varNames requires the names of one allometric variable and of the stage of sexual maturity variable to be used in the analysis (e.g. varNames = c("total_length", "stage_mat")). It must therefore contain exactly two character strings: the first is the allometric variable and the second is the stage of sexual maturity.

The arguments inmName and matName require character strings giving the names of the stages of sexual maturity as they appear in the data.frame. Each argument can be a single character string or a vector (e.g. inmName = "I", matName = c("II", "III", "IV")). The variable stage_mat in the matFish data set contains the stages of sexual maturity. In this case, stage I is considered immature and stages II, III and IV are mature. The stages of sexual maturity are then transformed into a binomial variable where immature = 0 and mature = 1.
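
The recoding of stages into a 0/1 variable can be illustrated in base R. The stage values below are made up for the example, and the one-line recoding is a sketch of the idea and not the code used inside gonad_mature.

```r
# Illustrative sketch only: made-up stage values.
stage_mat <- c("I", "I", "II", "III", "I", "IV")
inmName   <- "I"                                # immature stage(s)
matName   <- c("II", "III", "IV")               # mature stage(s)

# immature = 0, mature = 1
mature <- as.integer(stage_mat %in% matName)

data.frame(stage_mat, mature)
```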

The argument method requires a character string indicating which regression will be used. If method = "fq", the logit regression is fitted with a GLM (frequentist). If method = "bayes", a sample from the posterior distribution of a logit regression model is generated using a random walk Metropolis algorithm (see the MCMClogit function of the MCMCpack package).

The argument niter requires a number. For the GLM regression (method = "fq"), a non-parametric bootstrap is used: B bootstrap samples are generated by resampling the original data with replacement, and the statistics for each parameter (median and confidence intervals) are then calculated from the bootstrap samples. For method = "bayes", the argument niter is related to the number of Metropolis iterations for the sampler.

The output is an object of class “gonadMat”. This object contains a data frame with the allometric variable “X” and the stage of sexual maturity (immature = 0, mature = 1), as well as the fitted values of the logistic regression and their 95% confidence intervals. Printing the object shows the median of the estimated size at gonad maturity (\(L_{50}\)) and the parameter estimates.

Estimate size at gonad maturity (maturity ogive estimation)

# Frequentist regression
my_ogive_fq = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I",
                           matName = c("II", "III", "IV"), method = "fq", niter = 999)

print(my_ogive_fq)
## formula: Y = 1/1+exp-(A + B*X)
##     Original Bootstrap (Median)
## A   -8.6047  -8.6407           
## B   0.356    0.3576            
## L50 24.1694  24.1714           
## R2  0.5595   -
# Bayesian regression
my_ogive_bayes = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I",
                              matName = c("II", "III", "IV"), method = "bayes", niter = 999)

print(my_ogive_bayes)
## formula: Y = 1/1+exp-(A + B*X)
##     Bootstrap (Median)
## A              -8.4974
## B               0.3522
## L50            24.1295
## R2              0.5595

Plot maturity ogive

To plot the maturity ogive, an object of class “gonadMat” is required. The plot function generates four graphics: the first three are histograms of the A and B parameters and of the size at gonadal maturity (\(L_{50}\)), and the last is the maturity ogive.

par(mfrow = c(2,2))
plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"))

## Size at gonad maturity = 24.2 
## Confidence intervals = 23.8 - 24.6 
## Rsquare = 0.56
par(mfrow = c(2,2))
plot(my_ogive_bayes, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"))

## Size at gonad maturity = 24.1 
## Confidence intervals = 23.7 - 24.5 
## Rsquare = 0.56

If you want the maturity ogive plot only, add the argument onlyOgive = TRUE. You can also modify the axes (size, rotation, etc.) and add a legend.

plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE)