foieGras-basics

2019-07-03

Disclaimer

This vignette is an extended set of examples to highlight the foieGras package’s functionality. Please do NOT interpret these examples as instructions for conducting analysis of animal movement data. Numerous essential steps in a proper analysis have been left out of this document. It is the user’s job to understand their data, to ensure they are asking the right questions of their data, and to ensure that the analyses they undertake appropriately reflect those questions. We cannot do this for you!

foieGras models

This vignette provides a (very) brief overview of how to use foieGras to filter animal track locations obtained via the Argos satellite system. foieGras provides two state-space models (SSMs) for filtering (i.e., estimating “true” locations and associated movement model parameters while accounting for error-prone observations): a Random Walk model (“rw”) and a Correlated Random Walk model (“crw”).

Both models are continuous-time models; that is, they account for the time intervals between successive observations and so naturally handle the irregular timing of most Argos data. We won’t dwell on the details of the models here (those will come in a future paper), except to say there may be advantages to choosing one over the other in certain circumstances. The Random Walk model tends not to deal well with small to moderate gaps (relative to a specified time step) in observed locations and can over-fit to particularly noisy data. The Correlated Random Walk model can often deal better with these small to moderate data gaps and smooth through noisy data, but it tends to estimate nonsensical movement through larger data gaps.

input data

foieGras expects data to be provided in one of four possible formats.

  1. a data.frame or tibble that looks like this
#> # A tibble: 6 x 8
#>      id date                lc      lon   lat  smaj  smin   eor
#>   <int> <dttm>              <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 54591 2012-03-05 05:09:33 1      111. -66.4  2442   416    42
#> 2 54591 2012-03-05 13:28:10 A      110. -66.4  2758   569    98
#> 3 54591 2012-03-05 18:46:22 Z      109. -66.8  9350  1415    88
#> 4 54591 2012-03-06 04:55:14 0      110. -66.4 49660   391    90
#> 5 54591 2012-03-06 11:43:57 B      110. -66.4  3264   358    79
#> 6 54591 2012-03-06 18:29:49 B      111. -66.4  4305   478    85

where the Argos data are provided via CLS Argos’ Kalman filter model (KF) and include error ellipse information for each observed location.

  2. a data.frame or tibble that looks like this
#> # A tibble: 6 x 5
#>   id    date                lc      lon   lat
#>   <chr> <dttm>              <chr> <dbl> <dbl>
#> 1 r11   1997-10-27 04:51:17 0      159. -54.6
#> 2 r11   1997-10-27 16:26:39 0      160. -54.6
#> 3 r11   1997-10-28 08:08:46 0      160. -54.7
#> 4 r11   1997-10-28 17:57:13 B      161. -54.5
#> 5 r11   1997-10-29 11:05:20 B      162. -55.1
#> 6 r11   1997-10-30 02:35:14 A      163. -55.6

where the Argos data are provided via CLS Argos’ Least-Squares model (LS) and do not include error ellipse information.

  3. a data.frame or tibble that includes observations with missing KF error ellipse information
#> # A tibble: 6 x 8
#>      id date                lc      lon   lat  smaj  smin   eor
#>   <int> <dttm>              <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 54591 2012-03-05 05:09:33 1      111. -66.4  2442   416    42
#> 2 54591 2012-03-05 13:28:10 A      110. -66.4  2758   569    98
#> 3 54591 2012-03-05 18:46:22 Z      109. -66.8    NA    NA    NA
#> 4 54591 2012-03-06 04:55:14 0      110. -66.4    NA    NA    NA
#> 5 54591 2012-03-06 11:43:57 B      110. -66.4    NA    NA    NA
#> 6 54591 2012-03-06 18:29:49 B      111. -66.4  4305   478    85

In this situation, foieGras treats observations with missing error ellipse information as though they are LS-based observations.

  4. an sf object where observations have any of the previous 3 structures and also include CRS information
#> Simple feature collection with 6 features and 6 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 2416.102 ymin: -913.6784 xmax: 2437.96 ymax: -828.1397
#> epsg (SRID):    3031
#> proj4string:    +proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=km +no_defs
#> # A tibble: 6 x 7
#>      id date                lc     smaj  smin   eor             geometry
#>   <int> <dttm>              <chr> <dbl> <dbl> <dbl>         <POINT [km]>
#> 1 54591 2012-03-05 05:09:33 1      2442   416    42 (2430.943 -912.3109)
#> 2 54591 2012-03-05 13:28:10 A      2758   569    98 (2437.855 -906.0344)
#> 3 54591 2012-03-05 18:46:22 Z      9350  1415    88 (2416.102 -828.1397)
#> 4 54591 2012-03-06 04:55:14 0     49660   391    90  (2437.96 -903.7763)
#> 5 54591 2012-03-06 11:43:57 B      3264   358    79  (2436.429 -908.365)
#> 6 54591 2012-03-06 18:29:49 B      4305   478    85 (2431.585 -913.6784)
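As a rough illustration of the data.frame formats listed above, the sketch below builds two small inputs by hand in base R, reusing the (rounded) values from the printed examples. The object names `ls_dat` and `kf_dat` are hypothetical, and the UTC time zone is an assumption; the column types follow the printed tibble headers.

```r
# LS-type input: id, date, lc, lon, lat (no error ellipse columns)
ls_dat <- data.frame(
  id   = "r11",
  date = as.POSIXct(c("1997-10-27 04:51:17",
                      "1997-10-27 16:26:39",
                      "1997-10-28 08:08:46"), tz = "UTC"),  # tz is assumed
  lc   = c("0", "0", "0"),
  lon  = c(159, 160, 160),
  lat  = c(-54.6, -54.6, -54.7),
  stringsAsFactors = FALSE
)

# KF-type input adds the error ellipse columns smaj, smin and eor;
# NA marks an observation whose ellipse information is missing
kf_dat <- data.frame(
  id   = 54591L,
  date = as.POSIXct(c("2012-03-05 05:09:33",
                      "2012-03-05 13:28:10",
                      "2012-03-05 18:46:22"), tz = "UTC"),  # tz is assumed
  lc   = c("1", "A", "Z"),
  lon  = c(111, 110, 109),
  lat  = c(-66.4, -66.4, -66.8),
  smaj = c(2442, 2758, NA),
  smin = c(416, 569, NA),
  eor  = c(42, 98, NA),
  stringsAsFactors = FALSE
)

# inspect the column types of the KF-type input
str(kf_dat)
```

Here the third row of `kf_dat` is the kind of observation that would be treated as LS-based.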

fitting a foieGras model

Model fitting comprises 2 steps: a prefilter step and the SSM fitting step itself. In the prefilter step, a number of checks are made on the input data (see ?foieGras::prefilter for details), including applying argosfilter::sdafilter to identify extreme outlier observations. Additionally, if the input data are not supplied as an sf object, prefilter guesses at an appropriate projection (typically World Mercator, EPSG 3395) to apply to the data. The SSM is then fit to this projected version of the data. Users invoke this process via the fit_ssm function.
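A minimal sketch of such a call is given here. It assumes the `ellie` example data set bundled with foieGras, and uses the Random Walk model with a 24 h time step purely for illustration. The `requireNamespace()` guard is only there so the snippet is skipped where the package is not installed.

```r
# Skip quietly where foieGras is not installed
if (requireNamespace("foieGras", quietly = TRUE)) {
  library(foieGras)

  # assumed: example Argos data set shipped with the package
  data(ellie, package = "foieGras")

  # the input data, the model ("rw" or "crw") and time.step in hours
  fit <- fit_ssm(ellie, model = "rw", time.step = 24)
}
```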

The minimum arguments required are the input data, the model (“rw” or “crw”) and the time.step (in h) to which locations are predicted. Additional control can be exerted over the prefiltering step via the vmax, ang, distlim, spdf and min.dt arguments; see ?foieGras::fit_ssm for details. The defaults for these arguments are quite conservative, usually leading to relatively few observations being flagged to be ignored by the SSM. Additional control over the SSM fitting step can also be exerted, but these options should rarely need to be accessed by users and will not be dealt with here.

accessing and visualising model fit objects

Simple summary information about the foieGras fit can be obtained by calling the fit object.

A summary plot method allows a quick visual check of the SSM fit to the data.

The predicted values are the state estimates predicted at regular time intervals, specified by time.step (here every 24 h). Fitted values (not shown) are the state estimates corresponding to the time of each observation; their time-series are plotted by default - plot(fit$ssm[[1]]).
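The summary and plot steps described above might look roughly like this. The sketch again assumes the bundled `ellie` data set and an "rw" fit at a 24 h time step. The `what = "predicted"` argument to plot is an assumption, inferred from the text's statement that fitted values are plotted by default.

```r
# Skip quietly where foieGras is not installed
if (requireNamespace("foieGras", quietly = TRUE)) {
  library(foieGras)
  data(ellie, package = "foieGras")  # assumed bundled example data
  fit <- fit_ssm(ellie, model = "rw", time.step = 24)

  # calling (printing) the individual fit object gives summary information
  print(fit$ssm[[1]])

  # default plot: time-series of the fitted values
  plot(fit$ssm[[1]])

  # assumed argument: plot the predicted values instead
  plot(fit$ssm[[1]], what = "predicted")
}
```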

Estimated tracks can be mapped using the foieGras-applied projection (here EPSG 3395). We use the foieGras::grab() function to access the SSM-predicted values. The (low-res) land is added using the rnaturalearth package. The ggspatial package’s annotation_spatial and layer_spatial functions ease plotting of sf class data.

The tracks can also be transformed to other projections, and locations can be coloured by date.

The estimated locations can be accessed for further analysis, custom mapping, etc. by using the grab function. They can be returned as a projected sf object or as a simple unprojected tibble. Note that for all foieGras outputs the x, y, x.se and y.se units are in km.
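As a sketch of the two return types, assuming the bundled `ellie` data set and that grab's `as_sf` argument is what switches between them (the names `plocs_sf` and `plocs` are hypothetical):

```r
# Skip quietly where foieGras is not installed
if (requireNamespace("foieGras", quietly = TRUE)) {
  library(foieGras)
  data(ellie, package = "foieGras")  # assumed bundled example data
  fit <- fit_ssm(ellie, model = "rw", time.step = 24)

  # predicted locations as a projected sf object
  plocs_sf <- grab(fit, what = "predicted", as_sf = TRUE)

  # predicted locations as a simple unprojected tibble
  plocs <- grab(fit, what = "predicted", as_sf = FALSE)

  print(head(plocs))
}
```

Setting `as_sf` explicitly in both calls avoids relying on the installed version's default.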

fit_ssm can be applied to single tracks, as shown, and it can also be fit to multiple individual tracks in a single input tibble or data.frame. The SSM is fit to each individual separately. The resulting output is a compound tibble with rows corresponding to each individual foieGras fit object.

In this compound tibble, the individual id is displayed in the 1st column, all fit output (ssm) in the 2nd column, and the convergence status of each model fit in the 3rd column.

The individual fits can easily be combined and plotted together using the grab function. Fitted values can be grabbed using what = "fitted", or just "f", and predicted values using "p".