hR: Toolkit for Data Analytics in Human Resources

Dale Kube

2019-06-23

Transform and analyze workforce data in meaningful ways for human resources (HR) analytics. Two functions, ‘hierarchyLong’ and ‘hierarchyWide’, convert standard employee and supervisor relationship data into useful formats. A ‘workforcePlan’ app is available for simple workforce planning.

Install the package from CRAN by running the install.packages("hR") command.

workforceHistory data

The two examples in this document use the sample workforceHistory data set, which reflects an artificial organization’s workforce history. Because hierarchyLong and hierarchyWide expect unique employee identifiers, the sample is first reduced to a data.table that contains one row per active employee or contractor.

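# Load hR for the sample data and hierarchy functions,
# and data.table for the subsetting syntax used below
library(hR)
library(data.table)

data("workforceHistory")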

# Reduce to DATE <= today to exclude future-dated records
dt = workforceHistory[DATE<=Sys.Date()]

# Reduce to max DATE and SEQ per person
# Keep every row on the latest DATE first, so that ties on DATE are resolved by SEQ
dt = dt[dt[,.I[DATE==max(DATE)],by=.(EMPLID)]$V1]
dt = dt[dt[,.I[which.max(SEQ)],by=.(EMPLID,DATE)]$V1]

# Only consider workers who are currently active
# This provides a reliable 'headcount' data set that reflects today's active workforce
dt = dt[STATUS=="Active"]

# Exclude the CEO because she does not have a supervisor
CEO = dt[TITLE=="CEO",EMPLID]
dt = dt[EMPLID!=CEO]
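
The reduction above can be hard to picture without seeing the rows. The following is a minimal base R sketch of the same idea (latest DATE, then highest SEQ, then active only). The `history` table and its values are hypothetical and are not the hR sample data; it uses base R rather than data.table so that it runs without any packages.

```r
# Hypothetical history: two people, several records each (not the hR sample data)
history <- data.frame(
  EMPLID = c(1, 1, 1, 2, 2),
  DATE   = as.Date(c("2019-01-01", "2019-03-01", "2019-03-01",
                     "2019-02-01", "2019-02-15")),
  SEQ    = c(0, 0, 1, 0, 0),
  STATUS = c("Active", "Active", "Active", "Active", "Inactive"),
  stringsAsFactors = FALSE
)

# Sort so that each person's latest DATE, then highest SEQ, comes first
ord <- order(history$EMPLID, -as.numeric(history$DATE), -history$SEQ)
history <- history[ord, ]

# Keep the first (most recent) row per person
latest <- history[!duplicated(history$EMPLID), ]

# Keep only people whose most recent record is active
headcount <- latest[latest$STATUS == "Active", ]
print(headcount)
```

Person 1 has two records on the same DATE, so SEQ decides which one is kept. Person 2 drops out because their most recent record is inactive.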

hierarchyLong

The hierarchyLong function transforms a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.) into an elongated format that can be used to aggregate employee data by a particular line of leadership (e.g. include everyone who rolls up to Susan). The function returns a long data.table consisting of one row per employee for every supervisor above them, up to the top of the tree. The level is the number of reporting steps between the employee and that supervisor, starting with “1” for the employee’s direct supervisor.
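
To make the long format concrete, here is a rough base R sketch of how such a table could be built by walking up the supervisor chain. It is not the hR implementation. `walk_up` is a hypothetical helper, and the employee and supervisor pairs are the Level 1 rows from the example output in this document.

```r
# Employee -> direct supervisor pairs (the Level 1 rows of the example output)
emp  <- c(131356, 199827, 199901, 268831, 534441)
supv <- c(199827, 111355, 199827, 131356, 199827)

# walk_up() is a hypothetical helper, not part of hR.
# It follows one employee's chain of supervisors to the top of the tree.
walk_up <- function(id, emp, supv) {
  rows <- list()
  level <- 0
  current <- id
  repeat {
    i <- match(current, emp)
    if (is.na(i)) break  # no supervisor on record: top of the tree
    level <- level + 1
    if (level > length(emp)) stop("cycle detected in the hierarchy")
    current <- supv[i]
    rows[[level]] <- data.frame(Employee = id, Level = level,
                                Supervisor = current)
  }
  do.call(rbind, rows)
}

# One block of rows per employee, stacked into a single long table
long <- do.call(rbind, lapply(emp, walk_up, emp = emp, supv = supv))
print(long)
```

Each employee contributes one row per supervisor above them, so an employee three steps below the top contributes three rows.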

# Use the hierarchyLong() function to produce an elongated hierarchy data.table
hLong = hierarchyLong(dt$EMPLID,dt$SUPVID)
print(hLong)
#>     Employee Level Supervisor
#>  1:   131356     1     199827
#>  2:   131356     2     111355
#>  3:   199827     1     111355
#>  4:   199901     1     199827
#>  5:   199901     2     111355
#>  6:   268831     1     131356
#>  7:   268831     2     199827
#>  8:   268831     3     111355
#>  9:   534441     1     199827
#> 10:   534441     2     111355

# Who reports up through Susan, the CEO? (direct and indirect reports)
hLong[Supervisor==CEO]
#>    Employee Level Supervisor
#> 1:   131356     2     111355
#> 2:   199827     1     111355
#> 3:   199901     2     111355
#> 4:   268831     3     111355
#> 5:   534441     2     111355
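
Aggregating by line of leadership then becomes a matter of grouping on the Supervisor column. As a small base R sketch, the data frame below copies the printed hLong rows by hand (an assumption made so the example runs without hR) and counts reports per supervisor.

```r
# Long hierarchy copied from the printed hLong output
long <- data.frame(
  Employee   = c(131356, 131356, 199827, 199901, 199901,
                 268831, 268831, 268831, 534441, 534441),
  Level      = c(1, 2, 1, 1, 2, 1, 2, 3, 1, 2),
  Supervisor = c(199827, 111355, 111355, 199827, 111355,
                 131356, 199827, 111355, 199827, 111355)
)

# Total reports (direct and indirect) under each supervisor
total_reports <- table(long$Supervisor)

# Direct reports only: the Level 1 rows
direct_reports <- table(long$Supervisor[long$Level == 1])

print(total_reports)
print(direct_reports)
```

The same grouping works for any employee attribute joined onto the Employee column, such as a hypothetical salary or tenure field.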

hierarchyWide

The hierarchyWide function transforms a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.) into a wide format that can be used to aggregate employee data by a particular line of leadership (e.g. include everyone who rolls up to Susan). The function returns a wide data.table with one column for every level in the hierarchy, starting from the top of the tree, so “Supv1” is likely the CEO in your organization.

# Use the hierarchyWide() function to produce a wide hierarchy data.table
hWide = hierarchyWide(dt$EMPLID,dt$SUPVID)
print(hWide)
#>    Employee  Supv1  Supv2  Supv3
#> 1:   131356 111355 199827   <NA>
#> 2:   199827 111355   <NA>   <NA>
#> 3:   534441 111355 199827   <NA>
#> 4:   199901 111355 199827   <NA>
#> 5:   268831 111355 199827 131356

# Who reports up through Pablo? (direct and indirect reports)
hWide[Supv2==199827]
#>    Employee  Supv1  Supv2  Supv3
#> 1:   131356 111355 199827   <NA>
#> 2:   534441 111355 199827   <NA>
#> 3:   199901 111355 199827   <NA>
#> 4:   268831 111355 199827 131356
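
The filter above works because Pablo is known to sit in the Supv2 column. When a supervisor's level is not known in advance, one option is to check every Supv column. The base R sketch below does this on a hand-copied version of the printed hWide output; `reports_through` is a hypothetical helper, not part of hR.

```r
# Wide hierarchy copied from the printed hWide output
wide <- data.frame(
  Employee = c(131356, 199827, 534441, 199901, 268831),
  Supv1    = c(111355, 111355, 111355, 111355, 111355),
  Supv2    = c(199827, NA, 199827, 199827, 199827),
  Supv3    = c(NA, NA, NA, NA, 131356)
)

# reports_through() is a hypothetical helper, not part of hR.
# It returns every row in which `id` appears in any Supv column.
reports_through <- function(wide, id) {
  supv_cols <- grep("^Supv", names(wide), value = TRUE)
  hit <- rowSums(wide[supv_cols] == id, na.rm = TRUE) > 0
  wide[hit, ]
}

through_199827 <- reports_through(wide, 199827)
through_131356 <- reports_through(wide, 131356)
print(through_199827)
print(through_131356)
```

For 199827 this returns the same four employees as the Supv2 filter, and for 131356 it returns the single employee whose Supv3 is 131356.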