A Brief Tutorial on Landscape Genetic Analysis Using HierDpart

Xinghu Qin –School of Biology, University of St Andrews

2019-02-12

Instruction

This vignette demonstrates how to use the latest version of HierDpart to do landscape genetic analysis. This package is designed for population geneticists and community geneticists to solve the gap between community ecology and population genetics - how to partition diversity across metrics and spatial scales, from genetic diversity to species diversity using an integrated method. The method, known as the unifying framework, has been explained in recent paper (Gaggiotti, O. E. et al, 2018, Evol Appl, 11:1176-1193). This package was developed to implement the hierarchical decomposition analysis using not only the common genetic metrics, such as He, Fst, allelic richness, but also new genetic metrics from the unifying framework.

Here, this vignette will briefly show how to do analyses using this package.

Practical analyses

HierDpart can read genepop files both in haploid and diploid which are quite common in genetics.

Install HierDpart

#Install from CRAN
#install.packages("HierDpart")
## or you can get the latest version of HierDpart from github
#library(devtools)

#install_github("xinghuq/HierDpart")

library("HierDpart")
# example genepop file
f <- system.file('extdata',package='HierDpart')
infile <- file.path(f, "Island.gen")

Function to calculate genetic diversity profiles (q=0,1,2)

qD is the function calculating genetic diversity profile (qD, q=0,1,2), the true diversities. If we want to calculate the allelic diversity profiles, we can do the following command line:

qD(infile,q=0,ncode=3)
#>         pop1 pop2 pop3 pop4 pop5 pop6 pop7 pop8 pop9 pop10 pop11 pop12
#> locus1     2    3    2    3    3    2    2    5    5     4     5     3
#> locus2     9    9    8    7    7    7    8    4    5     5     6     7
#> locus3     8    9    8    7    8    8    8    8    9     9     7     6
#> locus4     8    8    8    7    5    6    7    8   10     9     8     5
#> locus5     7    7    9    6    7    5   10    6    9     6     7     7
#> locus6     4    6    5    6    6    3    5    4    6     6     6     5
#> locus7     6    5    6    6    8    7    4    6    7     4     5     4
#> locus8     9    8    9    9    9    8    7    8    6     9     7     5
#> locus9     5    6    8    6    8    6    6    6    6     7     6     5
#> locus10    9    8    7    7    8    6    6    6    6     8     7     7
#>         pop13 pop14 pop15 pop16
#> locus1      3     6     3     4
#> locus2      6     5     5     5
#> locus3      6     7     7     7
#> locus4      6     9     9    10
#> locus5      7     8     9     8
#> locus6      4     5     6     5
#> locus7      4     5     4     5
#> locus8      5     3     6     4
#> locus9      4     8     8     6
#> locus10     6     6     7     8

You can choose which diversity you want to get by defining the value of q.

Plot diversity profiles

If we want to plot the diversity profiles, we can do:

qDplot(infile,q="all",ncode=3)

Hierarchical diversity and differentiation decomposition

For geneticists studying landscape genetics, they may want to partition the metapopulation into different subpopulation, thus having several hierarchical groups. Now I am showing how to decompose genetic diversity into different hierarchical levels.

Hierarchical allelic richness (q=0)

Here it is easy to define or delimit your structure (hierarchy), type ?HierAr will show you how to do that. Here, “nreg” is the number of aggregate ( or population) in your metapopulation, and “r” is the number of subaggregate (or subpopulation) in each aggregate.

HAr=HierAr(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(HAr)
#> $Ar_pop
#>         Arpop 1 Arpop 2 Arpop 3 Arpop 4 Arpop 5 Arpop 6 Arpop 7 Arpop 8
#> locus1        2       3       2       3       3       2       2       5
#> locus2        9       9       8       7       7       7       8       4
#> locus3        8       9       8       7       8       8       8       8
#> locus4        8       8       8       7       5       6       7       8
#> locus5        7       7       9       6       7       5      10       6
#> locus6        4       6       5       6       6       3       5       4
#> locus7        6       5       6       6       8       7       4       6
#> locus8        9       8       9       9       9       8       7       8
#> locus9        5       6       8       6       8       6       6       6
#> locus10       9       8       7       7       8       6       6       6
#>         Arpop 9 Arpop 10 Arpop 11 Arpop 12 Arpop 13 Arpop 14 Arpop 15
#> locus1        5        4        5        3        3        6        3
#> locus2        5        5        6        7        6        5        5
#> locus3        9        9        7        6        6        7        7
#> locus4       10        9        8        5        6        9        9
#> locus5        9        6        7        7        7        8        9
#> locus6        6        6        6        5        4        5        6
#> locus7        7        4        5        4        4        5        4
#> locus8        6        9        7        5        5        3        6
#> locus9        6        7        6        5        4        8        8
#> locus10       6        8        7        7        6        6        7
#>         Arpop 16
#> locus1         4
#> locus2         5
#> locus3         7
#> locus4        10
#> locus5         8
#> locus6         5
#> locus7         5
#> locus8         4
#> locus9         6
#> locus10        8
#> 
#> $Ar_reg
#>   Ar_region 1 Ar_region 2 Ar_region 3 Ar_region 4
#> 1         9.2         8.7           6         7.3
#> 
#> $Ar_ovell
#>       Art Arr    Arp
#> [1,] 10.9 7.8 6.3375

Hierarchical allelic diversity (^1D,q=1)

HiD=HierD(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(HiD)
#>           D_gamma D_alpha.2 D_alpha.1 D_beta.2 D_beta.1 Differentiation.2
#> Locus 1  3.193142  2.725683  2.547081 1.171501 1.070120         0.1234638
#> Locus 2  8.127991  5.570462  4.756040 1.459123 1.171240         0.2947130
#> Locus 3  9.182470  7.039348  5.851804 1.304449 1.202936         0.2073097
#> Locus 4  8.384311  6.830226  5.819550 1.227531 1.173669         0.1599041
#> Locus 5  8.815656  6.014476  5.031749 1.465740 1.195305         0.2982420
#> Locus 6  4.466209  3.846453  3.517153 1.161124 1.093627         0.1165235
#> Locus 7  6.466375  4.320795  3.849778 1.496570 1.122349         0.3144786
#> Locus 8  8.193804  5.671722  4.805933 1.444677 1.180150         0.2869518
#> Locus 9  9.424112  5.279522  4.494791 1.785031 1.174587         0.4519619
#> Locus 10 8.429114  5.828401  5.117923 1.446214 1.138821         0.2877814
#>          Differentiation.1
#> Locus 1         0.04546744
#> Locus 2         0.10604372
#> Locus 3         0.12395858
#> Locus 4         0.10743385
#> Locus 5         0.11968905
#> Locus 6         0.06004503
#> Locus 7         0.07743759
#> Locus 8         0.11112839
#> Locus 9         0.10795825
#> Locus 10        0.08721245

The result returns a list of hierarchical metrics, the first five metrics are diversities. It explains in the below section.

Hierarchical heterozygosity (q=2)

He=HierHe(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(He)
#> $HierHe_perloc
#>              Ht   Hr 1   Hr 2   Hr 3   Hr 4     Hp
#> Locus 1  0.5937 0.6587 0.6587 0.6587 0.6587 0.5529
#> Locus 2  0.8549 0.7518 0.7518 0.7518 0.7518 0.7665
#> Locus 3  0.8781 0.8342 0.8342 0.8342 0.8342 0.8173
#> Locus 4  0.8595 0.8702 0.8702 0.8702 0.8702 0.8149
#> Locus 5  0.8488 0.8541 0.8541 0.8541 0.8541 0.7557
#> Locus 6  0.7036 0.5559 0.5559 0.5559 0.5559 0.6546
#> Locus 7  0.8052 0.6892 0.6892 0.6892 0.6892 0.6873
#> Locus 8  0.8410 0.5448 0.5448 0.5448 0.5448 0.7424
#> Locus 9  0.8758 0.7589 0.7589 0.7589 0.7589 0.7355
#> Locus 10 0.8592 0.8200 0.8200 0.8200 0.8200 0.7754
#> 
#> $HieHr_overall
#>         Ht      Hr     Hp
#> [1,] 0.812 0.73165 0.7302

Hierarchical Fst (allelic fixation)

Besides the hierarchical diversity, this package also calculates hierarchical genetic differentiation measured by Fst and Delta D. The Delta D here, which corresponds to q=1, measures allelic differentiation. Now we are showing how to calculate the traditional hierarchical Fst and new hierarchical Delta D.

For Fst:

Hier_Fst=HierFst(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(Hier_Fst)
#>           region        pop          Ind
#> Total  0.1077832 0.13154297  0.128481750
#> region 0.0000000 0.02663001  0.023198983
#> pop    0.0000000 0.00000000 -0.003524894

This gives you the Fst among subaggregate within in aggregate. See ?HierFst for more details.

Hierarchical Delta D

For DeltaD:

Hier_D=HierD(infile, nreg=4, r=c(7,4,2,3), ncode=3)
print(Hier_D)
#>           D_gamma D_alpha.2 D_alpha.1 D_beta.2 D_beta.1 Differentiation.2
#> Locus 1  3.193142  2.725683  2.547081 1.171501 1.070120         0.1234638
#> Locus 2  8.127991  5.570462  4.756040 1.459123 1.171240         0.2947130
#> Locus 3  9.182470  7.039348  5.851804 1.304449 1.202936         0.2073097
#> Locus 4  8.384311  6.830226  5.819550 1.227531 1.173669         0.1599041
#> Locus 5  8.815656  6.014476  5.031749 1.465740 1.195305         0.2982420
#> Locus 6  4.466209  3.846453  3.517153 1.161124 1.093627         0.1165235
#> Locus 7  6.466375  4.320795  3.849778 1.496570 1.122349         0.3144786
#> Locus 8  8.193804  5.671722  4.805933 1.444677 1.180150         0.2869518
#> Locus 9  9.424112  5.279522  4.494791 1.785031 1.174587         0.4519619
#> Locus 10 8.429114  5.828401  5.117923 1.446214 1.138821         0.2877814
#>          Differentiation.1
#> Locus 1         0.04546744
#> Locus 2         0.10604372
#> Locus 3         0.12395858
#> Locus 4         0.10743385
#> Locus 5         0.11968905
#> Locus 6         0.06004503
#> Locus 7         0.07743759
#> Locus 8         0.11112839
#> Locus 9         0.10795825
#> Locus 10        0.08721245

Here we get alpha, beta, gamma diversity. D alpha1, D alpha2 and Dgamma correspond to number of equivalent alleles at population level, number of equivalent alleles at regional level, and total number of equivalent alleles at ecosystem. Beta.1 is the number of equivalent subaggregate (population or subpopulation depends on the hierarchy), and beta.2 is the number of equivalent aggregate (eg. regions). Differentiation.2 represents genetic differentiation between aggragates, Differentiation.1 represents genetic differentiation between subaggregates.

Untill now you may know how to implement the hierarchical diversity analysis. Furthermore, this package also includes several functions that can plot correlation between geographic distances and genetic differentiations, (see COR_Fstd, and COR_DeltaDd), as well as Delta D matrix and Delta D across loci.

References

’Gaggiotti, O. E., Chao, A., Peres-Neto, P., Chiu, C. H., Edwards, C., Fortin, M. J., … & Selkoe, K. A. (2018). Diversity from genes to ecosystems: A unifying framework to study variation across biological metrics and scales. Evolutionary Applications.

’Weir, B.S. (1996) Genetic Data Analysis II. Sinauer Associates.

’Yang, R.C. (1998). Estimating hierarchical F-statistics. Evolution 52 (4):950-956