Directly Standardized Rates for Recurrent Outcomes

Matthew Kumar



The dsrrec() function computes directly standardized rates for recurrent and non-recurrent outcomes.

For recurrent outcomes, confidence intervals are computed using the negative binomial approach to variance estimation for directly standardized rates by Stukel et al. (1994).

For non-recurrent outcomes, the gamma distribution approach outlined in Fay et al. (1997) is used.


The dsrrec() function expects person-level event counts and unit-times. On each person-level record, the following variables should be present:

Note: For the standard or reference population, the data must be aggregated by the standardization variables and the unit-time variable name must labeled pop. See example below.

Example Data

The readmission dataset from frailtypack contains rehospitalization times (in days) following surgery in patients diagnosed with colorectal cancer. We will use this data to examine directly (sex) standardized rates for rehospitalizations by Dukes’ tumoral stage, a subgroup variable.

#load necessary packages

Since the data are in a person-period format, we’ll have to calculate the total number of events and total observation time for each person. This will result in a single record per person that summarizes the relevant information.

#Calculate total events and total observation times per person
treadm <- readmission %>%
          group_by(id) %>%
          filter(max(enum)==enum ) %>%
          mutate(events=enum-1,time=t.stop) %>%
          select(id, events, time, sex, dukes)

#View first 6 records
knitr::kable(head(treadm), caption='Person-level Dataset')
Person-level Dataset
id events time sex dukes
1 2 1037 Female D
2 1 1182 Male C
3 1 783 Male C
4 4 2048 Female A-B
5 1 1144 Female C
6 3 1407 Male A-B

The next step is to form the standardized or reference population, which for this analysis, is the total observation time across our study. Since we are standardizing by sex, we’ll compute total observation time by sex. As noted previously, the unit-time must be labelled pop.

#Make the reference data
tref <- treadm %>% 
        group_by(sex) %>% 
        mutate(pop=sum(time)) %>% 
        select(sex, pop) %>% 
        distinct(sex, pop)

knitr::kable(tref, caption='Reference Dataset')
Reference Dataset
sex pop
Female 180198
Male 233093

Directly Standardized Rates for Recurrent Outcomes with dsrrec()

Finally, lets calculate directly (sex) standardized rates for rehospitalizations by Dukes’ tumoral stage.


my_analysis <- dsrrec(data=treadm,
knitr::kable(my_analysis, caption='Analysis Results') 
Analysis Results
Subgroup Numerator Denominator Cr Rate (per 1000) Std Rate (per 1000) 95% LCL (NB) 95% UCL (NB) 95% LCL (Gamma) 95% UCL (Gamma)
D 143 31444 4.548 4.541 3.793 5.289 3.828 5.350
C 193 170190 1.134 1.172 1.006 1.339 1.012 1.351
A-B 169 211657 0.798 0.782 0.663 0.900 0.668 0.909

By default, confidence intervals for recurrent (NB) and non-current (Gamma) outcomes are calculated.