Poster Presentation 47th Lorne Genome Conference 2026

Development of a reproducible Nextflow pipeline for somatic variant detection in high-depth whole-genome sequencing of neurodegenerative disease (133266)

Daniel O'Shaughnessy 1, Andrew N Smith 1, Lyndal Henden 1, Kelly L Williams 1
  1. Motor Neuron Disease Research Centre, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, NSW, Australia

Somatic variants are increasingly recognised as contributors to neurodegenerative disease, yet accurate detection in post-mortem brain tissue remains challenging. In amyotrophic lateral sclerosis (ALS), where ~90% of cases are sporadic, recent evidence suggests that low-frequency, brain-specific somatic mutations may initiate focal neurodegeneration. Detecting these mutations requires specialised workflows that distinguish true somatic variants from false positives in high-depth (250X) whole-genome sequencing (WGS) data. However, many somatic variant detection tools were developed for cancer genomics and are not optimised for this setting.

We therefore developed a reproducible Somatic Variant Detection (SVD) pipeline, implemented in Nextflow, for accurate identification of low-frequency somatic variants from high-depth (250X) WGS of brain tissue. The pipeline uses modular workflows for quality control, variant calling, filtering, and annotation within a fully containerised environment for scalability across HPC systems. It implements two analytical branches: a matched-tissue workflow that uses affected (brain) and unaffected (blood) samples from the same individual, and a single-tissue workflow optimised for studies without a matched sample. A technically matched Panel of Normals (PoN) variant catalogue, constructed from 143 ALS whole-genome datasets (50X coverage), was incorporated to remove recurrent technical artefacts.
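
As an illustration of the Panel of Normals idea only, the following Python sketch builds a catalogue of recurrent calls from a set of normal call sets and removes matching candidates. It is not the pipeline's implementation: the `Variant` type, the function names, and the `min_samples` recurrence threshold are assumptions introduced here, and the abstract does not state how recurrence was defined.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Minimal variant key (hypothetical; real pipelines work on VCF records)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def build_pon(normal_callsets: Iterable[Iterable[Variant]],
              min_samples: int = 2) -> Set[Variant]:
    """Collect variants seen in at least `min_samples` normal samples.

    `min_samples` is an assumed threshold, not a value from the abstract.
    """
    counts: Counter = Counter()
    for callset in normal_callsets:
        # Count each variant at most once per sample.
        counts.update(set(callset))
    return {variant for variant, n in counts.items() if n >= min_samples}


def filter_with_pon(candidates: Iterable[Variant],
                    pon: Set[Variant]) -> List[Variant]:
    """Drop candidate calls that match a recurrent PoN entry."""
    return [variant for variant in candidates if variant not in pon]
```

Under these assumptions, a candidate that recurs across the normal samples is treated as a likely technical artefact and removed, while calls absent from the panel are retained for downstream filtering and annotation.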

Pipeline performance was evaluated using simulated data in which ~5,000 variants were spiked into 250X WGS brain sequences at variant allele frequencies (VAFs) of 10%, 5%, 3%, 2%, and 1%. Across both analysis branches, the SVD pipeline achieved high precision and recall, confidently detecting somatic variants down to a VAF of 3%.
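
A spike-in evaluation of this kind can be scored along the following lines. This is a minimal sketch under stated assumptions, not the authors' evaluation code: variants are represented as hashable keys such as `(chrom, pos, ref, alt)` tuples, the function names are hypothetical, recall is computed per VAF tier, and precision is computed over the whole call set against all spiked variants.

```python
from typing import Dict, Hashable, Set


def recall_by_vaf(truth_by_vaf: Dict[float, Set[Hashable]],
                  called: Set[Hashable]) -> Dict[float, float]:
    """Fraction of spiked-in variants recovered at each VAF tier."""
    return {
        vaf: len(truth & called) / len(truth)
        for vaf, truth in truth_by_vaf.items()
        if truth  # skip empty tiers to avoid division by zero
    }


def precision(truth_by_vaf: Dict[float, Set[Hashable]],
              called: Set[Hashable]) -> float:
    """Fraction of all calls that correspond to a spiked-in variant."""
    all_truth: Set[Hashable] = set().union(*truth_by_vaf.values())
    return len(called & all_truth) / len(called) if called else 0.0
```

For context, at 250X depth a VAF of 3% corresponds to about 7.5 variant-supporting reads on average, and a VAF of 1% to about 2.5.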

This work establishes a reproducible and scalable framework for somatic variant discovery in high-depth WGS data. The SVD pipeline enables identification of brain-specific genomic variation in ALS and related neurodegenerative disorders.