# Introduction to summarytools

#### 2019-04-11

summarytools provides tools to neatly and quickly summarize data. It can also make R a little easier to learn and use, especially for data cleaning and preliminary analysis. Four functions are at the core of the package:

• freq() : frequency tables with proportions, cumulative proportions and missing data information
• ctable() : cross-tabulations between two factors or any discrete data, with total, rows or columns proportions, as well as marginal totals
• descr() : descriptive (univariate) statistics for numerical data
• dfSummary() : Extensive data frame summaries that facilitate data cleaning and firsthand evaluation

An emphasis has been put on both what and how results are presented, so that the package can serve both as an exploration and reporting tool, used on its own for minimal reports, or with other sets of tools such as rmarkdown and knitr.

Building on the strengths of pander and htmltools, the outputs produced by summarytools can be:

• Displayed in plain text in the R console (default behaviour)
• Used in Rmarkdown documents and knitted along with other text and R output
• Written to html files that open up in RStudio’s Viewer or in the default browser
• Written to plain or markdown text files

It is also possible to include summarytools functions in Shiny apps.

### Latest Improvements

Version 0.9 brought many changes and improvements to summarytools. A summary of those changes can be found near the end of this vignette. Changes specific to the latest release can be found in the package’s NEWS file, located in the summarytools directory inside your R library and also available using news(package = "summarytools") in R versions 3.6.0 and above.

### This Vignette’s Setup

Since this vignette was created using Rmarkdown, we’ve set some global options that are appropriate for this format and which avoid redundancy in the code. Here’s what the setup chunk looks like (further explanations will be given below):

# {r setup, include=FALSE}
# library(knitr)
# opts_chunk$set(results = 'asis',      # This is essential (can also be set at the chunk level)
#                comment = NA,
#                prompt = FALSE,
#                cache = FALSE)
#
# library(summarytools)
# st_options(plain.ascii = FALSE,       # This is very handy in all Rmd documents
#            style = "rmarkdown",       # This too
#            footnote = NA,             # Avoids footnotes which would clutter the results
#            subtitle.emphasis = FALSE) # A setting to experiment with: depending on the
#                                       # theme used, it might improve the headings' layout
#
# {r, echo=FALSE}
# st_css()                              # This is a must; without it, expect odd layout,
#                                       # especially with dfSummary()

## The Four Core Functions

## 1 - freq() : Frequency Tables

The freq() function generates a table of frequencies with counts and proportions.

library(summarytools)
freq(iris$Species, plain.ascii = FALSE, style = "rmarkdown")

### Frequencies

iris$Species
Type: Factor

| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |

We’ve added the plain.ascii and style arguments for this first example; however, since we have set these options globally using st_options(), they are not really needed. For this reason, we will not include them from here on.

If we are not concerned about missing data, we can set report.nas = FALSE:

freq(iris$Species, report.nas = FALSE, headings = FALSE)

| | Freq | % | % Cum. |
|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 |
| Total | 150 | 100.00 | 100.00 |

We can simplify the results further and omit the Total row by specifying totals = FALSE, as well as omit the cumulative columns by setting cumul = FALSE.

freq(iris$Species, report.nas = FALSE, totals = FALSE, cumul = FALSE, style = "rmarkdown", headings = FALSE)

| | Freq | % |
|---|---|---|
| setosa | 50 | 33.33 |
| versicolor | 50 | 33.33 |
| virginica | 50 | 33.33 |

To get familiar with the various output styles, try different values for style (“simple”, “rmarkdown” or “grid”) and see how this affects the results in the console.

#### Subsetting Rows in Frequency Tables

The “rows” argument allows subsetting the resulting frequency table; we can use it in 3 different ways:

• To select rows by position, we use a numerical vector; rows = 1:10 will show the frequencies for the first 10 values only
• To select rows by name, we either use
    • a character vector specifying all desired values (row names)
    • a single character string to be used as a regular expression; only the matching values will be displayed

Used in combination with the “order” argument, this can be quite practical. Say we have a character variable containing many distinct values and wish to know which ones are the 10 most frequent. To achieve this, we would simply use order = "freq" along with rows = 1:10.

#### Generating Several Frequency Tables at Once

There is more than one way to do this, but the best approach is to simply pass the data frame object (subsetted if needed) to freq() (results not shown):

freq(tobacco[ ,c("gender", "age.gr", "smoker")])

We can safely pass a whole data frame to freq(); it will figure out which variables to ignore (numerical variables having many distinct values).

## 2 - ctable() : Cross-Tabulations

We’ll now use a sample data frame called tobacco, which is included in summarytools. We want to cross-tabulate two categorical variables: smoker and diseased.
Since markdown does not support multiline headings, we’ll show a rendered html version of the results:

print(ctable(tobacco$smoker, tobacco$diseased, prop = "r"), method = "render")

### Cross-Tabulation, Row Proportions

smoker * diseased
Data Frame: tobacco

| smoker | diseased: Yes | diseased: No | Total |
|---|---|---|---|
| Yes | 125 (41.9%) | 173 (58.1%) | 298 (100.0%) |
| No | 99 (14.1%) | 603 (85.9%) | 702 (100.0%) |
| Total | 224 (22.4%) | 776 (77.6%) | 1000 (100.0%) |

By default, ctable() shows row proportions. To show column or total proportions, use prop = "c" or prop = "t", respectively. To omit proportions, use prop = "n".

In the next example, we’ll create a simple “2 x 2” table (no proportions, no totals):

with(tobacco,
     print(ctable(smoker, diseased, prop = 'n', totals = FALSE),
           headings = FALSE, method = "render"))

| smoker | diseased: Yes | diseased: No |
|---|---|---|
| Yes | 125 | 173 |
| No | 99 | 603 |

#### Chi-square results

To display chi-square results below the table, set the “chisq” parameter to TRUE. This time, instead of with(), we’ll use the %$% operator from the magrittr package, which works in a very similar fashion.

library(magrittr)
tobacco %$% ctable(gender, smoker, chisq = TRUE, headings = FALSE) %>%
  print(method = "render")

| gender | smoker: Yes | smoker: No | Total |
|---|---|---|---|
| F | 147 (30.1%) | 342 (69.9%) | 489 (100.0%) |
| M | 143 (29.2%) | 346 (70.8%) | 489 (100.0%) |
| <NA> | 8 (36.4%) | 14 (63.6%) | 22 (100.0%) |
| Total | 298 (29.8%) | 702 (70.2%) | 1000 (100.0%) |

Χ2 = .5415   df = 2   p = .7628

Note that a warning will be issued when at least one expected cell count is lower than 5.

## 3 - descr() : Descriptive Univariate Stats

The descr() function generates common central tendency statistics and measures of dispersion for numerical data. It can handle single vectors as well as data frames, in which case it will ignore non-numerical columns (and display a message to that effect).

descr(iris, style = "rmarkdown")

Non-numerical variable(s) ignored: Species

### Descriptive Statistics

iris
N: 150

| | Petal.Length | Petal.Width | Sepal.Length | Sepal.Width |
|---|---|---|---|---|
| Mean | 3.76 | 1.20 | 5.84 | 3.06 |
| Std.Dev | 1.77 | 0.76 | 0.83 | 0.44 |
| Min | 1.00 | 0.10 | 4.30 | 2.00 |
| Q1 | 1.60 | 0.30 | 5.10 | 2.80 |
| Median | 4.35 | 1.30 | 5.80 | 3.00 |
| Q3 | 5.10 | 1.80 | 6.40 | 3.30 |
| Max | 6.90 | 2.50 | 7.90 | 4.40 |
| MAD | 1.85 | 1.04 | 1.04 | 0.44 |
| IQR | 3.50 | 1.50 | 1.30 | 0.50 |
| CV | 0.47 | 0.64 | 0.14 | 0.14 |
| Skewness | -0.27 | -0.10 | 0.31 | 0.31 |
| SE.Skewness | 0.20 | 0.20 | 0.20 | 0.20 |
| Kurtosis | -1.42 | -1.36 | -0.61 | 0.14 |
| N.Valid | 150.00 | 150.00 | 150.00 | 150.00 |
| Pct.Valid | 100.00 | 100.00 | 100.00 | 100.00 |

### Transposing, Selecting Statistics

If your eyes/brain prefer seeing things the other way around, just use transpose = TRUE. Here, we also select only the statistics we wish to see, and specify headings = FALSE to avoid reprinting the same information as above.

We specify the stats we wish to report with the stats argument, which also accepts the values “all”, “fivenum”, and “common”. See ?descr for a complete list of available statistics.
descr(iris, stats = "common", transpose = TRUE, headings = FALSE)

Non-numerical variable(s) ignored: Species

| | Mean | Std.Dev | Min | Median | Max | N.Valid | Pct.Valid |
|---|---|---|---|---|---|---|---|
| Petal.Length | 3.76 | 1.77 | 1.00 | 4.35 | 6.90 | 150.00 | 100.00 |
| Petal.Width | 1.20 | 0.76 | 0.10 | 1.30 | 2.50 | 150.00 | 100.00 |
| Sepal.Length | 5.84 | 0.83 | 4.30 | 5.80 | 7.90 | 150.00 | 100.00 |
| Sepal.Width | 3.06 | 0.44 | 2.00 | 3.00 | 4.40 | 150.00 | 100.00 |

## 4 - dfSummary() : Data Frame Summaries

dfSummary() collects information about all variables in a data frame and displays it in a single legible table.

To generate a summary report and have it displayed in RStudio’s Viewer pane (or in the default Web browser if working outside RStudio), we simply do as follows:

library(summarytools)
view(dfSummary(iris))

Of course, it is also possible to use dfSummary() in Rmarkdown documents. It is usually a good idea to exclude a column or two, otherwise the table might be a bit too wide. For instance, since the Valid and NA columns are redundant, we can drop one of them.

dfSummary(tobacco, plain.ascii = FALSE, style = "grid",
          graph.magnif = 0.75, valid.col = FALSE, tmp.img.dir = "/tmp")

While rendering html tables with view() doesn’t require it, here it is essential to specify tmp.img.dir. We’ll explain why further below.

## Tidy Tables With tb()

When generating freq() or descr() tables, it is possible to turn the results into “tidy” tables with the use of the tb() function (think of tb as a diminutive for tibble). For example:

library(magrittr)
iris %>% descr(stats = "common") %>% tb()

# A tibble: 4 x 8
  variable    mean         sd           min   med   max   n.valid pct.valid
  <chr>       <chr>        <chr>        <chr> <chr> <chr> <chr>   <chr>
1 Petal.Leng… 3.758        1.765298233… 1     4.35  6.9   150     100
2 Petal.Width 1.199333333… 0.762237668… 0.1   1.3   2.5   150     100
3 Sepal.Leng… 5.843333333… 0.828066127… 4.3   5.8   7.9   150     100
4 Sepal.Width 3.057333333… 0.435866284… 2     3     4.4   150     100

iris$Species %>% freq(cumul = FALSE, report.nas = FALSE) %>% tb()
# A tibble: 3 x 3
value       freq   pct
<fct>      <dbl> <dbl>
1 setosa        50  33.3
2 versicolor    50  33.3
3 virginica     50  33.3

By definition, no total rows are part of tidy tables, and row.names are converted to regular columns. For now, tb() doesn’t handle split-group tables, but it is certainly in store for a future release of summarytools.

## The print() and view() Functions

summarytools has a generic print method, print.summarytools(). By default, its method argument is set to “pander”. One of the ways in which view() is useful is that we can use it to easily display html outputs in RStudio’s Viewer. The view() function simply acts as a wrapper around print.summarytools(), specifying method = 'viewer'. When used outside RStudio, method falls back to “browser” and the report is shown in the system’s default browser.
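As a minimal sketch of how print() and view() relate, assuming summarytools is installed (the object name freq_tbl is only illustrative): the same object can be sent to different destinations by changing the method. The interactive calls are commented out because they open a viewer or browser window.

```r
library(summarytools)

# Build the object once; it can then be displayed in several ways
freq_tbl <- freq(iris$Species)

# Default method ("pander"): text output in the console
print(freq_tbl, method = "pander")

# Interactive alternatives (uncomment in an interactive session):
# view(freq_tbl)                       # RStudio's Viewer, or the browser outside RStudio
# print(freq_tbl, method = "browser")  # the system's default browser
```

Keeping the object in a variable, rather than printing it directly, is what makes it possible to redirect the same results later without recomputing them.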

## Using stby() to Ventilate Results

We can use stby() the same way as R’s base function by() with the four core summarytools functions. This returns a list-type object containing as many elements as there are categories in the grouping variable.

Why not just use by()? The reason is that by() creates objects of class “by”, which have a dedicated print() method conflicting with summarytools’ way of printing list-type objects. Since print.by() can’t be redefined (as per CRAN policies), the sensible solution was to introduce a function that is essentially a clone of by(), except that the objects it creates have the class “stby”, allowing the desired flexibility.
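The class difference described above can be checked directly. The following is a small sketch, assuming summarytools is installed; the object names by_obj and stby_obj are illustrative only.

```r
library(summarytools)

# Base R: by() returns an object of class "by"
by_obj <- by(iris[, 1:4], iris$Species, colMeans)
class(by_obj)

# summarytools: stby() takes the same arguments but returns class "stby",
# so summarytools' own print method is used instead of print.by()
stby_obj <- stby(data = iris[, 1:4], INDICES = iris$Species, FUN = descr)
class(stby_obj)

# Both hold one element per level of the grouping variable
length(by_obj)
length(stby_obj)
```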

Using the iris data frame, we will now display descriptive statistics by Species.

(iris_stats_by_species <- stby(data = iris,
                               INDICES = iris$Species,
                               FUN = descr,
                               stats = c("mean", "sd", "min", "med", "max"),
                               transpose = TRUE))

Non-numerical variable(s) ignored: Species

### Descriptive Statistics

iris
Group: Species = setosa
N: 50

| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 1.46 | 0.17 | 1.00 | 1.50 | 1.90 |
| Petal.Width | 0.25 | 0.11 | 0.10 | 0.20 | 0.60 |
| Sepal.Length | 5.01 | 0.35 | 4.30 | 5.00 | 5.80 |
| Sepal.Width | 3.43 | 0.38 | 2.30 | 3.40 | 4.40 |

Group: Species = versicolor
N: 50

| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |

Group: Species = virginica
N: 50

| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |

To see an html version of these results, we simply use view(); using print() with method = "viewer" is also possible (results not shown):

view(iris_stats_by_species)
# or
print(iris_stats_by_species, method = "viewer")

A special situation occurs when we want grouped statistics for one variable only. Instead of showing several tables, each having one column, summarytools assembles everything into a single table:

data(tobacco)
with(tobacco, stby(BMI, age.gr, descr, stats = c("mean", "sd", "min", "med", "max")))

### Descriptive Statistics

BMI by age.gr
Data Frame: tobacco
N: 258

| | 18-34 | 35-50 | 51-70 | 71 + |
|---|---|---|---|---|
| Mean | 23.84 | 25.11 | 26.91 | 27.45 |
| Std.Dev | 4.23 | 4.34 | 4.26 | 4.37 |
| Min | 8.83 | 10.35 | 9.01 | 16.36 |
| Median | 24.04 | 25.11 | 26.77 | 27.52 |
| Max | 34.84 | 39.44 | 39.21 | 38.37 |

The transposed version looks like this:

| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| 18-34 | 23.84 | 4.23 | 8.83 | 24.04 | 34.84 |
| 35-50 | 25.11 | 4.34 | 10.35 | 25.11 | 39.44 |
| 51-70 | 26.91 | 4.26 | 9.01 | 26.77 | 39.21 |
| 71 + | 27.45 | 4.37 | 16.36 | 27.52 | 38.37 |

### Using stby() With ctable()

This is a little trickier; the working syntax is as follows:

stby(list(x = tobacco$smoker, y = tobacco$diseased), tobacco$gender, ctable)
# or equivalently
with(tobacco, stby(list(x = smoker, y = diseased), gender, ctable))

## Using summarytools in Rmarkdown Documents

As we have seen, summarytools can generate both text/markdown and html results. Both types of outputs can be used in Rmarkdown documents. The vignette Recommendations for Using summarytools With Rmarkdown provides good guidelines, but here are a few tips to get started:

• Always set the knitr chunk option results = 'asis'. You can do this on a chunk-by-chunk basis, but it is easier to just set it globally in a “setup” chunk:
    knitr::opts_chunk$set(echo = TRUE, results = 'asis')

    Refer to this page for more of knitr’s options.

• To get better results when generating html output with method = 'render', set up your .Rmd document so that it includes summarytools’ css. The st_css() function makes this very easy.

#### Initial Setup – Example

# ---
# title: "RMarkdown using summarytools"
# output: html_document
# ---
#
# {r setup, include=FALSE}
# library(knitr)
# opts_chunk$set(comment = NA, prompt = FALSE, cache = FALSE, results = 'asis')
# library(summarytools)
# st_options(plain.ascii = FALSE,          # This is a must in Rmd documents
#            style = "rmarkdown",          # idem
#            dfSummary.varnumbers = FALSE, # This keeps results narrow enough
#            dfSummary.valid.col = FALSE)  # idem
#
#
# {r, echo=FALSE}
# st_css()
# 

Since results = 'asis' can conflict with other packages’ way of generating results, it is sometimes best to use it for individual chunks only.

### Managing Lengthy dfSummary() Outputs in Rmarkdown Documents

For data frames containing numerous variables, we can use the max.tbl.height argument to wrap the results in a scrollable window having the specified height, in pixels. For instance:

print(dfSummary(tobacco, valid.col = FALSE, graph.magnif = 0.75),
max.tbl.height = 300, method = "render")

### Data Frame Summary

tobacco
Dimensions: 1000 x 9
Duplicates: 2
| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Missing |
|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F 2. M | 489 (50.0%) 489 (50.0%) | | 22 (2.2%) |
| 2 | age [numeric] | Mean (sd) : 49.6 (18.3) min < med < max: 18 < 50 < 80 IQR (CV) : 32 (0.4) | 63 distinct values | | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34 2. 35-50 3. 51-70 4. 71 + | 258 (26.5%) 241 (24.7%) 317 (32.5%) 159 (16.3%) | | 25 (2.5%) |
| 4 | BMI [numeric] | Mean (sd) : 25.7 (4.5) min < med < max: 8.8 < 25.6 < 39.4 IQR (CV) : 5.7 (0.2) | 974 distinct values | | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes 2. No | 298 (29.8%) 702 (70.2%) | | 0 (0%) |
| 6 | cigs.per.day [numeric] | Mean (sd) : 6.8 (11.9) min < med < max: 0 < 0 < 40 IQR (CV) : 11 (1.8) | 37 distinct values | | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes 2. No | 224 (22.4%) 776 (77.6%) | | 0 (0%) |
| 8 | disease [character] | 1. Hypertension 2. Cancer 3. Cholesterol 4. Heart 5. Pulmonary 6. Musculoskeletal 7. Diabetes 8. Hearing 9. Digestive 10. Hypotension [ 3 others ] | 36 (16.2%) 34 (15.3%) 21 (9.5%) 20 (9.0%) 20 (9.0%) 19 (8.6%) 14 (6.3%) 14 (6.3%) 12 (5.4%) 11 (5.0%) 21 (9.5%) | | 778 (77.8%) |
| 9 | samp.wgts [numeric] | Mean (sd) : 1 (0.1) min < med < max: 0.9 < 1 < 1.1 IQR (CV) : 0.2 (0.1) | 0.86! : 267 (26.7%) 1.04! : 249 (24.9%) 1.05! : 324 (32.4%) 1.06! : 160 (16.0%) ! rounded | | 0 (0%) |

## Writing Output to Files

We can use the file argument with print() or view() to indicate that we want to save the results in a file, be it html, Rmd, md, or just plain text (txt). The file extension indicates to summarytools what type of file should be generated.

view(iris_stats_by_species, file = "~/iris_stats_by_species.html")

### Appending Output Files

The append argument allows adding content to existing files generated by summarytools. This is useful if you want to include several statistical tables in a single file. It is a quick alternative to creating an .Rmd document.
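A minimal sketch of this workflow, assuming summarytools is installed: the first print() call creates the file and the second one adds a table to it. The file name and the use of tempdir() are illustrative choices, not something the package requires.

```r
library(summarytools)

# Hypothetical output path; a temporary directory keeps the sketch self-contained
out_file <- file.path(tempdir(), "iris_tables.md")

# The first call creates the file
print(freq(iris$Species), file = out_file)

# The second call adds another table to the same file
print(descr(iris), file = out_file, append = TRUE)
```

Writing the first table without append = TRUE ensures that the file exists before anything is appended to it.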

## Global options

The following options can be set with st_options():

### General Options

| Option name | Default | Note |
|---|---|---|
| style | “simple” | Set to “rmarkdown” in .Rmd documents |
| plain.ascii | TRUE | Set to FALSE in .Rmd documents |
| round.digits | 2 | Number of decimals to show |
| footnote | “default” | Personalize, or set to NA to omit |
| display.labels | TRUE | Show variable / data frame labels in headings |
| bootstrap.css (*) | TRUE | Include Bootstrap 4 css in html outputs |
| custom.css | NA | Path to your own css file |
| escape.pipe | FALSE | Useful for some Pandoc conversions |
| lang | “en” | Language (always 2-letter, lowercase) |

(*) Set to FALSE in Shiny apps

### Function-Specific Options

| Option name | Default | Note |
|---|---|---|
| freq.totals | TRUE | Display totals row in freq() |
| freq.report.nas | TRUE | Display <NA> row and “valid” columns |
| ctable.prop | “r” | Display row proportions by default |
| ctable.totals | TRUE | Show marginal totals |
| descr.stats | “all” | “fivenum”, “common” or vector of stats |
| descr.transpose | FALSE | |
| descr.silent | FALSE | Hide console messages |
| dfSummary.varnumbers | TRUE | Show variable numbers in 1st col. |
| dfSummary.labels.col | TRUE | Show variable labels when present |
| dfSummary.graph.col | TRUE | Show graphs |
| dfSummary.valid.col | TRUE | Include the Valid column in the output |
| dfSummary.na.col | TRUE | Include the Missing column in the output |
| dfSummary.graph.magnif | 1 | Zoom factor for bar plots and histograms |
| dfSummary.silent | FALSE | Hide console messages |
| tmp.img.dir | NA | Directory to store temporary images |

#### Examples

st_options()                      # display all global options values
st_options('round.digits')        # display the value of a specific option
st_options(style = 'rmarkdown')   # change one or several options' values
st_options(footnote = NA)         # Turn off the footnote on all outputs.
# This option was used prior to generating
# the present document.

## Overriding formatting attributes

When a summarytools object is created, its formatting attributes are stored within it. However, you can override most of them when using the print() method or the view() function.

### Overriding Function-Specific Arguments

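| Argument | freq | ctable | descr | dfSummary |
|---|---|---|---|---|
| style | x | x | x | x |
| round.digits | x | x | x | |
| plain.ascii | x | x | x | x |
| justify | x | x | x | x |
| display.labels | x | x | x | x |
| varnumbers | | | | x |
| labels.col | | | | x |
| graph.col | | | | x |
| valid.col | | | | x |
| na.col | | | | x |
| col.widths | | | | x |
| totals | x | x | | |
| report.nas | x | | | |
| display.type | x | | | |
| missing | x | | | |
| split.tables | x | x | x | x |
| caption | x | x | x | x |

Elements of the heading can be overridden in the same way:

| Argument | freq | ctable | descr | dfSummary |
|---|---|---|---|---|
| Data.frame | x | x | x | x |
| Data.frame.label | x | x | x | x |
| Variable | x | x | x | |
| Variable.label | x | x | x | |
| Group | x | x | x | x |
| date | x | x | x | x |
| Weights | x | | x | |
| Data.type | x | | | |
| Row.variable | | x | | |
| Col.variable | | x | | |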

#### Example

Here’s an example in which we override 3 function-specific arguments, and one element of the heading:

(age_stats <- freq(tobacco$age.gr))

### Frequencies

tobacco$age.gr
Type: Factor

| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| 18-34 | 258 | 26.46 | 26.46 | 25.80 | 25.80 |
| 35-50 | 241 | 24.72 | 51.18 | 24.10 | 49.90 |
| 51-70 | 317 | 32.51 | 83.69 | 31.70 | 81.60 |
| 71 + | 159 | 16.31 | 100.00 | 15.90 | 97.50 |
| <NA> | 25 | | | 2.50 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |

print(age_stats, report.nas = FALSE, totals = FALSE, display.type = FALSE,
      Variable.label = "Age Group")

iris$Species
Type: Facteur

| | Fréq. | % Valide | % Valide cum. | % Total | % Total cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |

The language used for producing the object is stored within it as an attribute. This avoids problems when switching languages between the moment the object is stored and the moment it is printed.

### Non-UTF-8 Locales

On most Windows systems, it will be necessary to change the LC_CTYPE element of the locale settings if the character set is not included in the current locale. For instance, in order to get good results (or rather, any results at all) when printing in the console in Russian, we’ll need to do this:

Sys.setlocale("LC_CTYPE", "russian")
st_options(lang = 'ru')

Then, to go back to default settings:

Sys.setlocale("LC_CTYPE", "")
st_options(lang = "en")

### Defining and Using Custom Translations

With the new function use_custom_lang(), you can add your own set of translations. For this, create a copy of the language_template.csv file located in the summarytools/includes directory of your package library, or download it from this location. After you’re done translating the +/- 70 items, simply call the use_custom_lang() function, giving it as its sole argument the path to the csv file you’ve just created. Note that such custom translations will not persist across R sessions. This means that you should always keep this csv file handy if you’re to print objects created with it.

### Defining Specific Keywords

Sometimes, all you might want to do is change just a few keywords; say you would rather have “N” instead of “Freq” in the title row of freq() tables. No need to create a full custom language for that. Rather, use define_keywords().

Calling this function without any arguments will bring up, on systems that support graphical devices (the vast majority, that is), an editable window allowing you to modify only the desired items. After closing the edit window, you will be given the option to export the resulting “custom language” into a .csv file that can be imported later on with use_custom_lang(). Note that it is also possible to define one or several keywords using function arguments. For the list of all possible keywords to define, see ?define_keywords. For instance:

define_keywords(freq = "N")

## Changes and Improvements Since Version 0.9

As stated earlier, version 0.9 brought many improvements to summarytools. Here are the key elements:

• Translations
• Improved printing of list objects
• Objects of class “stby” are automatically printed in the console with optimal results; no more need for view(x, method = "pander"); simply use stby() instead of by()
• Regular lists containing summarytools objects can also be printed with optimal results simply by calling print(x) (as opposed to “stby” objects, their automatic printing will not be optimal; that being said, freq() now accepts data frames as its first argument, so the need for lapply() is greatly diminished)
• Easier management of global settings with st_options()
• st_options() now has as many parameters as there are options to set, making it possible to set all options with only one function call; the legacy way of setting options is still supported
• Several global options were added, with a focus on simplifying Rmarkdown document creation
• Improved magrittr operators support (%>%, %$%)
• Changes to freq()
• As mentioned earlier, the function now accepts data frames as its main argument; this makes practically obsolete the use of lapply() with it
• Improved outputs when using stby()
• Changes to ctable()
• Fully supports stby()
• Improved number alignment
• Changes to descr()
• For the stats argument, the values “fivenum” and “common” are now allowed, the latter representing the collection of mean, sd, min, med, max, n.valid, and pct.valid
• Improved outputs when using stby()
• The variable used for weights (if any) is removed automatically from the data so no stats are produced for it
• Changes to dfSummary()
• Now fully compatible with Rmarkdown with its png graphs
• Number of columns is included in the heading section
• Number of duplicated rows is also shown in the heading section
• Bar plots now more accurately reflect counts, as they are not stretched across table cells (this allows comparison of frequencies across variables)
• Columns with particular content (unary/binary, integer sequences, UPC/EAN codes) are treated differently; more relevant information is displayed, while irrelevant information is hidden
• For html outputs, a new parameter col.widths can be used to set the width of the resulting table’s columns; this addresses an issue with some graphs not being shown at the desired magnification level (although much effort has been put into improving this as well)
• max.tbl.height parameter is added, allowing lengthy summaries to be shown in a scrollable window

### Other Notable Changes

• The omit.headings parameter has been replaced by the more straightforward (and still boolean) headings. omit.headings is still supported but will be deprecated in future releases
• Because it was subject to errors, the Rows Subset heading element has been removed. If there is a strong need for it, I can bring it back in a future release (just let me know by email or on GitHub if you’d like to have it back)
• Under the hood, much has been going on; the lengthier functions have been split into more manageable parts, and several normalizing operations were performed, facilitating maintenance and improving code readability
• The tb() function turns results from freq() and descr() into tidy tibbles

### Backward Compatibility

No changes break backward compatibility, but at least one legacy feature will disappear in some further release. Namely, the boolean parameter omit.headings, which has been replaced by the more straightforward headings. For now, a message is shown whenever the “old” parameter name is used, encouraging users to transition to the newer one.

## Stay Up-to-date

Check out the GitHub project’s page - from there you can see the latest updates and also submit feature requests.

For a preview of what’s coming in the next release, have a look at the development branch.

## Final notes

The package comes with no guarantees. It is a work in progress and feedback / feature requests are welcome. Just send an email to the package maintainer, or open an issue on GitHub if you find a bug or wish to submit a feature request.