Spatial Principal Component Analysis (sPCA) is a multivariate statistical technique that complements traditional Principal Component Analysis (PCA) by incorporating spatial information into the analysis of genetic variation. Traditional PCA can be used to find spatial patterns[1], but it focuses on reducing data dimensionality by identifying uncorrelated principal components that capture maximum variance, and so often lacks the power to identify non-trivial spatial genetic patterns [1][2]. By accounting for spatial autocorrelation, sPCA can uncover the spatial structure of datasets in which observations are linked either geographically or topologically. This gain in statistical power allows the investigation of cryptic spatial patterns of genetic variability that would otherwise be overlooked. [3]
sPCA was introduced in 2008 by Thibaut Jombart, Sébastien Devillard, Anne-Béatrice Dufour, and D. Pontier as a spatially explicit method to investigate the spatial pattern of genetic variation among individuals or populations. [3]
In 2017, Valeria Montano and Thibaut Jombart published an alternative non-parametric test to evaluate the significance of global and local spatial genetic patterns with improved statistical power.[8]
Details
sPCA modifies the PCA framework by integrating spatial weights, typically in the form of connectivity matrices or spatial adjacency graphs. These weights represent relationships between observations based on geographic distance or other spatial criteria. [9] The method identifies principal components (PCs) that maximize the product of genetic variance and spatial autocorrelation, as measured by Moran's I. [8] It distinguishes two kinds of spatial structure:
Global structures correspond to positive spatial autocorrelation: they reflect broad-scale spatial patterns in which similar values cluster over large regions.
Local structures correspond to negative spatial autocorrelation: they capture fine-scale spatial variation or localized patterns in which neighboring observations tend to differ.
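The criterion described above can be sketched numerically. The following is a minimal NumPy illustration, not the adegenet implementation: it assumes uniform observation weights, a binary connection matrix W in which every observation has at least one neighbor, and row-standardized weights L. Under those assumptions the product of variance and Moran's I for a score vector Xv equals (1/n) (Xv)ᵀ L (Xv), so the components can be obtained from the eigenvectors of the symmetric matrix (1/2n) Xᵀ(L + Lᵀ)X. The function names spca and morans_i are hypothetical and chosen for this example only.

```python
import numpy as np


def morans_i(x, W):
    """Moran's I of a vector x for a binary connection matrix W.

    Assumes row-standardised weights and no isolated observations.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    L = W / W.sum(axis=1, keepdims=True)
    return float(x @ L @ x / (x @ x))


def spca(X, W):
    """Illustrative sPCA sketch (assumption: uniform observation weights).

    X : (n, p) data matrix, e.g. allele frequencies per individual.
    W : (n, n) binary connection matrix; every row must have a neighbour.
    Returns eigenvalues (descending), axes (p, p) and scores (n, p).
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    n = X.shape[0]

    Xc = X - X.mean(axis=0)                    # centre each variable
    L = W / W.sum(axis=1, keepdims=True)       # row-standardise the weights

    # Symmetric p x p matrix whose quadratic form is var(Xv) * I(Xv).
    H = Xc.T @ (L + L.T) @ Xc / (2.0 * n)

    eigvals, axes = np.linalg.eigh(H)          # ascending order
    order = np.argsort(eigvals)[::-1]          # largest (most global) first
    eigvals, axes = eigvals[order], axes[:, order]

    scores = Xc @ axes                         # spatial principal components
    return eigvals, axes, scores
```

In this sketch, large positive eigenvalues correspond to global structures and large negative eigenvalues to local structures; each eigenvalue equals the variance of the matching score vector multiplied by its Moran's I.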
The core of sPCA relies on the eigenanalysis of a spatially weighted covariance or correlation matrix. The spatial weight matrix can be constructed using techniques such as Delaunay triangulation, nearest-neighbor graphs, or distance-based criteria.
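As an illustration of two of the constructions mentioned above, the sketch below builds a nearest-neighbor graph and a distance-based graph from point coordinates using NumPy only. It is a simplified example under stated assumptions: Euclidean distances, binary weights, and a k-nearest-neighbor graph that is symmetrized by linking two points when either is among the other's k nearest. All function names (pairwise_distances, knn_weights, distance_band_weights, row_standardise) are hypothetical and do not correspond to any package's API.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance matrix for an (n, d) array of coordinates."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def knn_weights(coords, k):
    """Binary k-nearest-neighbour connection matrix, symmetrised."""
    d = pairwise_distances(coords)
    np.fill_diagonal(d, np.inf)                 # a point is not its own neighbour
    n = d.shape[0]
    nearest = np.argsort(d, axis=1)[:, :k]      # indices of the k closest points
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                   # link if either side selects the other


def distance_band_weights(coords, max_dist):
    """Binary connection matrix linking points no further apart than max_dist."""
    d = pairwise_distances(coords)
    W = (d <= max_dist).astype(float)
    np.fill_diagonal(W, 0.0)
    return W


def row_standardise(W):
    """Scale each row to sum to 1; isolated observations are rejected."""
    sums = W.sum(axis=1, keepdims=True)
    if np.any(sums == 0):
        raise ValueError("at least one observation has no neighbour")
    return W / sums
```

A distance threshold can leave some observations without neighbors, which is why row_standardise raises an error in that case; a nearest-neighbor graph with k of at least 1 avoids the problem by construction.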
sPCA should be used only as an exploratory tool. [1][3]
Applications
sPCA has been applied in several fields, including:
Genetics: analysis of population structure and gene flow while accounting for spatial autocorrelation.[6]
Biogeography: identification of historical dispersal routes and barriers to gene flow, providing insights into species distribution patterns and evolutionary history.[7]
Software/Source Code
sPCA implementations are available in the R packages adegenet and ntbox.[10][11][12]
These tools facilitate the application of sPCA by providing functions for constructing spatial weight matrices, performing eigenanalysis, and obtaining spatial principal components in an easy-to-read form.
Jombart, Thibaut (2015-06-23). A tutorial for the spatial Analysis of Principal Components (sPCA) using adegenet 2.0.0. Imperial College London, MRC Centre for Outbreak Analysis and Modelling.