Spatial Analysis of Principal Components

Spatial Principal Component Analysis (sPCA) is a multivariate statistical technique that complements traditional Principal Component Analysis (PCA) by incorporating spatial information into the analysis of genetic variation. Although traditional PCA can reveal spatial patterns,^[1] it reduces data dimensionality by identifying uncorrelated principal components that capture maximum variance, and therefore often lacks the power to identify non-trivial spatial genetic patterns.^[1]^[2] By accounting for spatial autocorrelation, sPCA can uncover the spatial structure of datasets in which observations are geographically or topologically linked. This gain in statistical power allows the investigation of cryptic spatial patterns of genetic variability that would otherwise be overlooked.^[3]

sPCA has been applied in various fields, including geography, ecology and genetics. ^[4]^[5]^[6]^[7]

History

sPCA was introduced in 2008 by Thibaut Jombart, Sébastien Devillard, Anne-Béatrice Dufour, and D. Pontier as a spatially explicit method to investigate the spatial pattern of genetic variation among individuals or populations. ^[3]

In 2017, Valeria Montano and Thibaut Jombart published an alternative non-parametric test to evaluate the significance of global and local spatial genetic patterns with improved statistical power.^[8]

Details

sPCA modifies the PCA framework by integrating spatial weights, typically in the form of connectivity matrices or spatial adjacency graphs. These weights represent relationships between observations based on geographic distance or other spatial criteria.^[9] The method identifies principal components (PCs) that optimize the product of genetic variance and spatial autocorrelation, the latter measured by Moran's I.^[8] It separates spatial structure into two components:

Global structures correspond to positive spatial autocorrelation: they reflect broad-scale spatial patterns in which similar values cluster over large regions.

Local structures correspond to negative spatial autocorrelation: they capture fine-scale spatial variation in which neighboring observations differ markedly.
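The difference between the two kinds of structure can be illustrated with Moran's I itself. The following is a minimal sketch in plain Python, not taken from any sPCA package: the function names `morans_i` and `chain_weights`, the six-site chain, and the toy values are all assumptions made for illustration. A smooth gradient along the chain gives a positive Moran's I (a global structure), while an alternating pattern gives a negative one (a local structure).

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square list-of-lists weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # total weight
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )                                           # weighted cross-products of neighbors
    ss = sum(d * d for d in dev)                # total sum of squares
    return (n / s0) * cross / ss


def chain_weights(n):
    """Binary adjacency for n sites on a line: each site neighbors the next one."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


# Hypothetical toy data on a chain of six sites.
w = chain_weights(6)
gradient = [1, 2, 3, 4, 5, 6]      # neighbors are similar: global structure
alternating = [1, 6, 1, 6, 1, 6]   # neighbors differ sharply: local structure

print("gradient:   ", morans_i(gradient, w))     # positive value
print("alternating:", morans_i(alternating, w))  # negative value
```

In sPCA the same statistic is evaluated on the component scores rather than on a single raw variable, so the sign of a component's autocorrelation indicates whether it describes a global or a local structure.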

The core of sPCA relies on the eigenanalysis of a spatially weighted covariance or correlation matrix. The spatial weight matrix can be constructed using techniques such as Delaunay triangulation, nearest-neighbor graphs, or distance-based criteria.
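The eigenanalysis can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the implementation used by any particular package: it assumes a row-standardized k-nearest-neighbor weight matrix (one of several possible graph choices), centers but does not scale the variables, and symmetrizes the weights so that the matrix being decomposed is symmetric. The names `knn_weights` and `spca_sketch` and the toy data are hypothetical.

```python
import numpy as np


def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbor weights (one possible graph choice)."""
    # Pairwise Euclidean distances between all sites.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)              # a site is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]   # indices of the k closest sites
    w = np.zeros(dist.shape)
    rows = np.repeat(np.arange(len(coords)), k)
    w[rows, nearest.ravel()] = 1.0
    return w / w.sum(axis=1, keepdims=True)     # each row sums to 1


def spca_sketch(x, w):
    """Eigenanalysis of a spatially weighted covariance matrix (assumed form)."""
    n = x.shape[0]
    xc = x - x.mean(axis=0)                     # center each variable
    # Symmetrized spatially weighted covariance: X^T (W + W^T) X / (2n).
    m = xc.T @ (w + w.T) @ xc / (2.0 * n)
    eigvals, axes = np.linalg.eigh(m)           # eigh: m is symmetric
    order = np.argsort(eigvals)[::-1]           # most positive eigenvalue first
    eigvals, axes = eigvals[order], axes[:, order]
    scores = xc @ axes                          # site scores on each axis
    return eigvals, axes, scores


# Hypothetical toy data: 20 sites on a line, one smooth cline and one
# alternating variable.
coords = np.column_stack([np.arange(20.0), np.zeros(20)])
cline = np.arange(20.0)
checker = np.where(np.arange(20) % 2 == 0, 1.0, -1.0)
x = np.column_stack([cline, checker])

w = knn_weights(coords, k=2)
eigvals, axes, scores = spca_sketch(x, w)

# Positive eigenvalues point to global structures, negative ones to local structures.
print(eigvals)
```

Unlike ordinary PCA, the eigenvalues here can be negative, because each one combines the variance of a component with its spatial autocorrelation. Sorting them from most positive to most negative places global structures at one end of the spectrum and local structures at the other.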

sPCA should be used only as an exploratory tool.^[1]^[3]

Applications

sPCA has been widely used in many fields, including:

Ecology: To find spatial patterns in species distributions and environmental gradients.^[4]^[5]

Genetics: To analyze population structure and gene flow while accounting for spatial autocorrelation.^[6]

Biogeography: To identify historical dispersal routes and barriers to gene flow, providing insights into species distribution patterns and evolutionary history.^[7]

Software/Source Code

Implementations of sPCA are available in the R packages adegenet and ntbox.^[10]^[11]^[12]

These tools facilitate the application of sPCA by providing functions for constructing spatial weight matrices, performing eigenanalysis, and obtaining spatial principal components in an easy-to-read form.