Optical Pooled Screening is a method for single-cell functional genomics with image readouts like Cell Painting (depicted).
Optical pooled screening (OPS) is a type of high-content single-cell genetic screen that profiles the phenotypes of individual cells by optical microscopy. The phenotypic profile of each cell is linked to one or several genetic features by in situ genotyping. OPS is used to determine the effect of genetic elements on the characteristics of cells and tissues. Single-cell screening methods like OPS have been adopted by the biotechnology industry for applications in drug development.[1][2]
High-content pooled single-cell genetic screens became available as a functional genomics technique starting circa 2016.[3][4] The genetic intervention (also known as a "genetic perturbation" in CRISPR screening) can be of any type that can be associated with a genetic sequence in the cell, including modifications in protein-coding or regulatory sequences,[5] but CRISPR systems are the most common methodology for effecting genetic perturbations in OPS efforts.[6] The high-content nature of OPS data enables screens for cellular phenotypes not considered prior to data generation, as well as in-depth analysis of the primary screening data to classify and prioritize screening hits.[3] As an intrinsically single-cell-resolved approach, OPS is recognized as capable of identifying perturbation effects on the distribution of single-cell phenotypes across cells.[7][8]
Researchers use OPS to visually assess how gene disruptions and other genetic perturbations cause changes in cellular characteristics like morphology[9] by Cell Painting,[10][11] protein localization,[12] or intracellular signaling via transduction of signals detected by biochemical receptors in the cell.[13] OPS requires in situ genotyping,[14] for example by in situ sequencing[15][16] of the perturbation in each cell or of a nucleotide sequence "barcode" (analogous to the UPC barcode) that links image-based cell phenotypes to specific genetic alterations at the single-cell level. OPS is used in functional genomics,[17] drug discovery,[10] and disease research.[18]
Context
OPS is one of two approaches (the other being single-cell next-generation sequencing (NGS)) available to generate high-content single-cell screening data.[19] High-content single-cell functional genomic screens[3] differ from previously established pooled genetic screening approaches, which rely on enrichment of perturbation identifier frequency in selected versus non-selected or original cell populations.[20][21] In contrast, high-content single-cell screens like OPS match cell phenotypes and perturbation identifiers at the single-cell level, enabling characterization and possible classification of phenotypes post-hoc based on the primary screening data output.[19] Perturbed cell phenotypes are interpreted based on the nature of the perturbations enriched in a phenotypic class, or a quantitative trait can be directly mapped to a genetic alteration in a regulatory or coding sequence.[22][23]
In contrast to NGS approaches for high-content single-cell screening, OPS directly reads out cellular structures and dynamic molecular and cellular functions in live-cell settings, and can achieve high resolution of cell states.[19] As an imaging method, OPS is applicable where spatial relationships are relevant, for example, the subcellular distribution or localization of organelles or molecular components,[24] and spatial relationships among cells.[25] Imaging assays can also score cell non-autonomous phenotypes such as cell-cell interaction phenotypes, tissue context-dependent phenotypes,[26] and the effect genes have outside the cell.[27][28] As a live cell imaging method, OPS enables studies of cellular dynamics using advanced imaging modalities, such as single molecule fluorescence microscopy.[14]
The capability of OPS to connect the phenotype of each cell in the pooled library to its genotype distinguishes OPS from imaging-based pooled enrichment screens such as robotic picking,[29] Visual Cell Sorting,[30] CRISPR-based microRaft followed by guide RNA identification (CRaft-ID),[31] single-cell isolation following time-lapse imaging (SIFT),[32] AI-photoswitchable screening (AI-PS),[33] optical enrichment,[34] image-enabled cell sorting (ICS),[35] and Photopick.[36] These methods all work by segregating cell populations according to pre-specified single-cell image characteristics, followed by bulk readout of perturbation identifier abundance in the segregated populations.
History
OPS was developed concurrently with single-cell screening methods based on NGS, i.e. Perturb-seq,[37][38] CRISP-seq,[39] and CROP-seq.[40][41] In 2017, the first report of an OPS described a small CRISPR interference screen that perturbed different components regulating a fluorescent reporter protein.[14] In this study, the live-cell phenotyping step was followed by FISH-based readout of barcodes expressed by T7 RNA polymerase from the same plasmid as the CRISPR single guide RNA (sgRNA). Another early report described an OPS with a bacterial library of mutated fluorescent proteins also followed by FISH-based readout of barcodes.[42] Applications in human cells with CRISPR perturbations were subsequently reported with readout of thousands of sgRNA CRISPR perturbations by in situ sequencing[15] of sgRNA and barcode sequences amplified from mRNA using a molecular inversion probe and rolling circle amplification (RCA) and sequencing by synthesis chemistry;[12] and in another example, readout of >100 sgRNA perturbations by FISH.[43] Protein epitopes have also been applied to encode genomic perturbations for enrichment[44] and in vivo OPS with readout from tissue sections.[45][46]
A genome-wide scale loss-of-function CRISPR OPS in human cells was reported in 2023 and included high-content phenotypes recorded from >10 million cells assigned to one of 80,408 sgRNA perturbations.[13] Other genome-wide OPS datasets were reported for infection of human cells by filoviruses,[8] cell signaling,[47] and morphological characterization under different culture conditions.[48] New protocols for nucleotide-level barcode readout incorporate "Zombie" in situ T7 RNA polymerase-driven in vitro transcription[49] for amplification[50][51] or pre-amplification[52][53] of OPS readout. A recent application of OPS is genome-wide tracking of chromosome loci over the cell cycle.[54]
Methodology
Creation and use of genetic libraries
OPS requires genetically perturbed cell populations similar to those used for Perturb-seq,[37][38] CRISP-seq,[39] CROP-seq,[40] and enrichment[21] screens. In mammalian systems, viral transduction is commonly used to introduce elements of the genetic perturbation system such as sgRNAs into cells.[20] A general challenge in perturbation engineering is maintaining linkage between sgRNA and barcode elements or among sgRNAs or barcodes.[55] Specific protocols and construct designs able to maintain the intended linkage have been developed.[56][57] Errors in component synthesis, procedures for production of DNA or viruses, and processes occurring in the cell population for screening can de-link elements. These effects can be mitigated[58] to maintain screen performance, which is particularly important for systems capable of multiple[59] perturbations.
Bacterial libraries for OPS have been generated using episomal and chromosomally integrated genomic perturbations. A preferred method is to express sgRNA or ORFs from plasmids that also encode T7-expressed RNA barcodes.[14] Strain libraries based on chromosomal mutations have been constructed using the phage lambda-derived Red recombination system.[60] For chromosomally expressed barcodes, Zombie in situ T7 in vitro transcription pre-amplification can achieve the target concentration required for detection by in situ sequencing or sequential FISH genotyping protocols.[61][54]
Data analysis methods
OPS data analysis comprises the extraction of phenotype parameter (known as a morphological feature in cell imaging) scores from each cell and matching these scores with perturbation genotype identifiers extracted from each cell using a series of digital image analysis steps.[22] Then, the distributions of phenotype parameter scores can be determined for each perturbation genotype and compared or tested against the distributions observed for cells with control perturbations or a different perturbation genotype.[48]
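The matching and comparison steps described above can be illustrated with a minimal Python sketch. It is not taken from any published OPS pipeline: the cell IDs, feature scores, and perturbation names are invented, and the permutation test on the difference of medians is one assumed choice among many possible distribution comparisons.

```python
import random
from collections import defaultdict
from statistics import median

# Hypothetical per-cell tables keyed by cell ID (all values are made up).
phenotypes = {  # cell ID -> score for one phenotype parameter
    "c1": 10.0, "c2": 11.0, "c3": 9.5, "c4": 10.5,
    "c5": 15.0, "c6": 16.0, "c7": 14.5, "c8": 15.5,
    "c9": 12.0,
}
genotypes = {  # cell ID -> perturbation identifier from in situ genotyping
    "c1": "control", "c2": "control", "c3": "control", "c4": "control",
    "c5": "sgGENE_A", "c6": "sgGENE_A", "c7": "sgGENE_A", "c8": "sgGENE_A",
    # "c9" has no genotype call, so the join below drops it
}

def group_scores(phenotypes, genotypes):
    """Join the two tables on cell ID and group scores by perturbation."""
    groups = defaultdict(list)
    for cell_id, score in phenotypes.items():
        if cell_id in genotypes:
            groups[genotypes[cell_id]].append(score)
    return dict(groups)

def permutation_pvalue(test, control, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)
    observed = abs(median(test) - median(control))
    pooled = list(test) + list(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(median(pooled[:len(test)]) - median(pooled[len(test):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

groups = group_scores(phenotypes, genotypes)
shift = median(groups["sgGENE_A"]) - median(groups["control"])
p_value = permutation_pvalue(groups["sgGENE_A"], groups["control"])
print(f"median shift = {shift:.2f}, permutation p = {p_value:.4f}")
```

In a real screen each perturbation would typically be represented by many more cells and many phenotype parameters, and the comparison would be repeated for every perturbation genotype.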
Primary analysis of phenotype images involves two major steps. The first is cell segmentation and the alignment of segmentation masks across all the available images. The second is feature identification and extraction of feature scores from the pixel-level data.[22] Primary analysis of phenotyping images may involve a range of computational approaches, including feature selection, machine learning methods such as support vector machines, and dimensionality reduction methods such as principal component analysis (PCA), which may be combined with clustering. For live cell imaging, the segmented cells are tracked in time-lapse movies and time-dependent phenotypes can be additionally scored.[9]
Primary analysis of in situ genotype data (e.g., from sequential FISH or in situ sequencing) also involves two major steps. The first is identification of signal loci, association of loci with cells, and analysis of signal sequences, similar to single particle tracking. The second is assignment of perturbation identifiers to signal loci and cells. Primary analysis of genotype images may involve a range of computational approaches, including machine learning approaches.[62] Primary analysis concludes with the merging of single-cell phenotypes and genotypes and identification of the set of cells with matched single-cell phenotype scores and genotype identifiers.
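As a rough illustration of the two phenotype-image steps, the following Python sketch segments a toy image by intensity thresholding with connected-component labeling and then extracts two simple per-object features. The image values, the threshold, and the function names `segment` and `extract_features` are assumptions made for this example; practical OPS pipelines generally rely on dedicated segmentation tools and extract far richer feature sets.

```python
from collections import deque

# Toy single-channel image (made-up intensities): two bright objects on a dark background.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 9, 0, 5, 0],
    [0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 0],
]

def segment(image, threshold):
    """Label 4-connected regions brighter than `threshold` (0 = background)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one object
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels

def extract_features(image, labels):
    """Compute area (pixel count) and mean intensity for each labeled object."""
    sums, areas = {}, {}
    for pixel_row, label_row in zip(image, labels):
        for value, label in zip(pixel_row, label_row):
            if label:
                sums[label] = sums.get(label, 0) + value
                areas[label] = areas.get(label, 0) + 1
    return {
        label: {"area": areas[label], "mean_intensity": sums[label] / areas[label]}
        for label in areas
    }

mask = segment(image, threshold=2)
features = extract_features(image, mask)
for label, scores in features.items():
    print(label, scores)
```

The same label mask can be reused across imaging channels or rounds once the images are aligned, which is why mask alignment is grouped with segmentation in the first step.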
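The genotype assignment steps can be sketched in the same spirit. The Python example below assumes a four-channel in situ sequencing readout with one channel per base, a hypothetical three-barcode library, a tolerance of one mismatch, and a majority-vote rule for assigning a perturbation to a cell; none of these choices is prescribed by the cited protocols.

```python
from collections import Counter

CHANNELS = "ACGT"  # assumed channel order: one fluorescence channel per base

# Hypothetical barcode library: barcode sequence -> perturbation identifier.
LIBRARY = {"ACGT": "sgGENE_A", "TTAG": "sgGENE_B", "GACA": "control"}

def call_read(cycles):
    """Call one base per sequencing cycle as the brightest channel of a spot."""
    read = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        read.append(CHANNELS[brightest])
    return "".join(read)

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def match_barcode(read, library, max_mismatch=1):
    """Return the perturbation if exactly one barcode is within tolerance, else None."""
    hits = [
        perturbation
        for barcode, perturbation in library.items()
        if len(barcode) == len(read) and hamming(read, barcode) <= max_mismatch
    ]
    return hits[0] if len(hits) == 1 else None

def assign_cell(reads, library, min_fraction=0.6):
    """Assign a cell to the majority perturbation among its matched reads."""
    calls = [p for p in (match_barcode(r, library) for r in reads) if p is not None]
    if not calls:
        return None
    perturbation, count = Counter(calls).most_common(1)[0]
    return perturbation if count / len(calls) >= min_fraction else None

# Made-up intensities for three spots in one cell (rows = cycles, columns = channels).
spot_a = [(9, 1, 0, 1), (1, 8, 1, 0), (0, 1, 9, 2), (1, 0, 2, 7)]
spot_b = [(9, 1, 0, 1), (1, 8, 1, 0), (0, 7, 2, 1), (1, 0, 2, 7)]
spot_c = [(1, 1, 1, 9), (0, 1, 2, 8), (0, 9, 1, 1), (8, 2, 0, 0)]

reads = [call_read(spot) for spot in (spot_a, spot_b, spot_c)]
print(reads, "->", assign_cell(reads, LIBRARY))
```

Returning `None` for ambiguous reads or cells keeps them out of the merged phenotype and genotype table, which corresponds to restricting analysis to cells with matched, confident calls.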
Secondary analysis entails testing for perturbation effects and integration with other biological database resources and plausibility considerations based on general biological knowledge. New machine learning approaches for the identification and interpretation of perturbation effects from OPS datasets[63] and for the optimal design of OPS experiments[64] are active areas of development.
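Because a screen tests many perturbations at once, testing for perturbation effects usually calls for some form of multiple-testing correction. The Python sketch below applies the Benjamini-Hochberg procedure, a standard false discovery rate method that is named here as an illustrative assumption and not as the method of any particular OPS study; the p-values and perturbation names are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-perturbation p-values from a comparison against controls.
names = ["sgGENE_A", "sgGENE_B", "sgGENE_C", "sgGENE_D"]
p_values = [0.001, 0.02, 0.03, 0.5]

q_values = benjamini_hochberg(p_values)
hits = [name for name, q in zip(names, q_values) if q < 0.05]
print(hits)
```

Perturbations that pass the chosen threshold would then be carried forward to integration with other database resources and to plausibility checks.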
Applications
OPS has been applied across multiple research areas and for a variety of purposes.
Functional Genomics and Cell Biology: OPS facilitates comprehensive functional studies by revealing how specific genetic changes affect a wide range of cell functions, cell biological characteristics, and molecular processes.[9]
Drug Discovery: By identifying genes that regulate disease-associated cellular pathways, phenotypes, or states,[47] and the gene functions that must be intact for a drug to act, OPS helps researchers discover new drug targets[8] and better understand the molecular mechanisms of drugs.[65]
Disease Research: OPS is used to investigate the etiology and pathophysiology of diseases including cancer,[28] cell models used to study neurodegenerative conditions,[52] and infectious diseases.[13] By identifying genes associated with disease phenotypes and treatment responses in research models, and by exploring in such models the impact of genes and alleles known to be associated with clinically defined disease and treatment response in humans, OPS can contribute to the fundamental understanding of disease.
Diagnostics: OPS has been combined with antibiotic susceptibility testing to identify the species in a mixed sample after the phenotypic susceptibility has been determined for each cell.[66]
Advantages
Causality: As an empirical experimental genetic method, OPS provides data supporting direct causal inference based on the results of genomic perturbations or interventions.
Phenotype discovery: Exploratory analysis of OPS datasets, for example with unsupervised machine learning methods, enables post-hoc discovery of new cell phenotypes and subsequent analysis of gene perturbation effects on such novel phenotypes.
Direct visual readout: OPS provides images of generalized and disease-associated cellular phenotypes and their changes upon genetic perturbation, meeting the "seeing is believing" evidentiary standard in a literal sense.
High Throughput: OPS uses fast and low-cost optical readouts. The estimated cost per cell including commercial instrumentation, commercially available reagents, and labor using a protocol[22] for human cells was $0.0005/cell.[12]
Perturbation method compatibility: OPS is compatible with the same perturbation technologies and perturbation/cell libraries used for many other screening approaches, facilitating integrative analysis across OPS datasets and across OPS and other screening dataset types. Specific and effective genetic perturbations, such as those effected by CRISPR systems including Cas9-based methods, improve statistical power by reducing noise.[20] OPS is also compatible with approaches that require or elect to deliver multiple perturbations or guide RNAs to each cell.[67][59]
Phenotyping method compatibility: Phenotyping can be carried out using any imaging assay and any optical hardware compatible with imaging before or after genotyping that provides cellular throughput sufficient to meet the requirement of the screen designed and preserves mRNA, cDNA or gDNA that can be genotyped to identify the perturbation in each cell. OPS protocols using Zombie in situ T7 RNA polymerase pre-amplification of DNA identifiers pose few restrictions on prior sample processing since only gDNA needs to be preserved.[52][53]
Compatibility with diverse biological systems, including various therapeutically relevant cell types and tissues.[26]
High hit rate: When multiple molecular markers are used for readout and the analysis scores many cellular features, a large fraction of perturbations result in reproducible phenotypic effects.[9]
Live cell and dynamic phenotypes: Phenotyping of live cells avoids fixation artifacts and enables studies of dynamic molecular and cellular phenomena.[12]
High statistical power and hit validation rate: The pooled format of OPS reduces interference from batch effects;[68] matched single-cell genotyping and phenotyping allows stringent quality filtering to restrict analysis to cells with high-quality genotypes and phenotypes;[22] and image features can be scored for all cells without feature dropout.[24]
High interpretability: fine-scale classification of phenotypic hits sets novel hits in the biological context of co-classified genetic perturbations that may have prior functional annotation.[9] This remains the case when no interpretation is available for the scored image features themselves.
Limitations
Diverse expertise and lack of commercial options have limited widespread adoption: OPS requires expertise in traditional pooled screening workflows (library cloning, lentiviral infection) as well as in situ methods, high-content imaging, automated liquid handling, computational image analysis, and single-cell analysis. There are currently limited commercial options for data generation and compute.
Assay development: While some imaging assays for phenotyping are standard (e.g., Cell Painting), there are many specialized assay protocols that may have compatibility conflicts with in situ genotyping protocols.[53]
Perturbation efficacy: OPS is impacted by limitations of the perturbation methodology used.[69] For example, limited perturbation efficiency or specificity will degrade statistical power to detect phenotypic effects associated with the intended perturbation.
Data generation cost: High-content imaging systems and the reagents consumed in processing genetic libraries have significant costs, potentially limiting the accessibility or scalability of OPS.
Data complexity: The high quantity of imaging data generated by OPS requires substantial computational power and advanced software for storage, processing, and analysis, incurring costs and the need for expert attention.
Schirman, Dvir; Gras, Konrad; Kandavalli, Vinodh; Larsson, Jimmy; Fange, David; Elf, Johan (2024-10-30), A dynamic 3D polymer model of the Escherichia coli chromosome driven by data from optical pooled screening, doi:10.1101/2024.10.30.621082
Feldman, David; Singh, Avtar; Garrity, Anthony J.; Blainey, Paul C. (2018-02-08), Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens, doi:10.1101/262121
Soares, Ruben R. G.; García-Soriano, Daniela A.; Larsson, Jimmy; Fange, David; Schirman, Dvir; Grillo, Marco; Knöppel, Anna; Sen, Beer Chakra; Svahn, Fabian (2023-11-17), Pooled optical screening in bacteria using chromosomally expressed barcodes, doi:10.1101/2023.11.17.567382
Haghighi, Marzieh; Cruz, Mario C.; Weisbart, Erin; Cimini, Beth A.; Singh, Avtar; Bauman, Julia; Lozada, Maria E.; Kavari, Sanam L.; Neal, James T.; Blainey, Paul C.; Carpenter, Anne E.; Singh, Shantanu (August 2023). "Pseudo-Labeling Enhanced by Privileged Information and Its Application to in Situ Sequencing Images". Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. pp. 4775–4784. arXiv:2306.15898. doi:10.24963/ijcai.2023/531. ISBN 978-1-956792-03-4.