Poster Presentation 47th Lorne Genome Conference 2026

Beyond the Paradigm: When Foundations Aren’t Enough for Spatial and Single-Cell Omics (133496)

Hamid Alinejad Rokny 1, Sally Chen 1, Roxana Zahedi Nasab 1, Lucy Chhuo 1, Ricky Nguyen 2, Marjan Baghgolshani 3, Mark Grosser 4, Ahmadreza Argha 1, Fatemeh Vafaee 2, Youqiong Ye 5
  1. Graduate School of Biomedical Engineering, UNSW Australia, Randwick, NSW, Australia
  2. School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, Australia
  3. School of Computing, Macquarie University, Sydney, NSW, Australia
  4. 23Strands, Sydney, NSW, Australia
  5. Center for Immune-Related Diseases at Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China

Foundation models (FMs) are redefining single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (SRT) analysis by learning transferable representations of cellular states and tissue organisation. While some adopt large language model (LLM) architectures, others rely on graph-based or hybrid multi-modal designs, yet all aim to generalise across biological contexts and analytical tasks. These models now drive diverse applications, from spatial domain discovery to cross-modality integration and therapeutic response prediction. Despite rapid progress, there remains no unified framework for systematically evaluating their capabilities across the breadth of single-cell and spatial analyses.

Here, we present a comprehensive benchmark and, to the best of our knowledge, the first systematic evaluation of foundation models jointly across single-cell and spatial transcriptomics, comparing six state-of-the-art architectures: Nicheformer, CellPLM, scGPT-spatial, GenePT, scELMo, and Novae. Performance is evaluated across multiple scRNA-seq and SRT datasets encompassing diverse diseases, species, and platforms, and assessed on key tasks including zero-shot and continually pretrained cell type clustering, cell type annotation, differential gene expression analysis, and perturbation prediction.

Our results demonstrate that preprocessing strategies, tokenisation schemes, and biologically informed priors profoundly influence model performance. Importantly, we highlight the urgent need for methods that address domain shifts arising from platform heterogeneity and biological variability. We also uncover persistent limitations in generalisation and interpretability, underscoring the challenge of building models that are both robust and biologically meaningful. This study provides actionable guidance for FM selection and establishes a standardised, extensible benchmarking framework to accelerate the next generation of single-cell and spatial foundation models.