Nkouamedjo Fankep, Rudel Christian ORCID: 0000-0002-8751-8202, Söylev, Arda, Kobiela, Anna-Lena ORCID: 0009-0000-6391-1280, Blom, Jochen, Ernst, Corinna and Motameny, Susanne ORCID: 0000-0003-1186-1108 (2025). SV-MeCa: an XGBoost-based meta-caller approach for structural variant calling from short-read data. BMC Bioinformatics, 26 (1). pp. 1-15. Springer Nature. ISSN 1471-2105

PDF: s12859-025-06246-6.pdf
Made available under the CC license: Creative Commons Attribution.

Identification Number: 10.1186/s12859-025-06246-6

Abstract

[Article No.: 218]

Background: Calling structural variants (SVs), i.e., genomic alterations of ≥ 50 bp, from whole-genome short-read data remains challenging, as existing callers are known to lack accuracy and robustness. Therefore, meta-caller approaches, which combine the results of multiple standalone tools into a consensus set of reported SV calls, are widely used. Here, SV-MeCa (Structural Variant Meta-Caller) is presented, the first SV meta-caller incorporating variant-specific quality metrics from individual VCF outputs rather than relying solely on the number and combination of tools supporting consensus SV calls. In addition, SV-MeCa offers a score for ranking the consensus SV calls according to the evidence that they represent true positive calls, i.e., real-world variants.

Results: SV-MeCa applies seven standalone SV callers and merges the resulting deletion and insertion calls into a union VCF file using SURVIVOR. For each entry in the SURVIVOR-generated consensus, caller-specific quality measures are extracted from the corresponding standalone VCF files and serve as input for either a deletion-specific or an insertion-specific XGBoost decision tree classifier, which was previously trained on the HG002 SV benchmark data provided by the Genome in a Bottle consortium. The SV-MeCa XGBoost models assign each (consensus) SV call a probability of representing a true positive call, which can be used to rank the final output according to evidence. The performance of SV-MeCa and four previously published meta-caller approaches was evaluated based on autosomal SV calls in samples curated by the Human Genome Structural Variation Consortium, Phase 2. With regard to F1 scores, which were 0.58 on average for deletions and 0.42 on average for insertions, SV-MeCa outperformed the other meta-callers. With regard to precision, only ConsensuSV achieved higher values (0.97 versus 0.64 on average for deletions, 0.75 versus 0.53 on average for insertions), and with regard to recall, SV-MeCa was outperformed exclusively by Meta-SV for deletions (0.55 versus 0.53).

Conclusions: SV-MeCa, publicly available at https://github.com/ccfboc-bioinformatics/SV-MeCa, outperforms existing SV meta-caller approaches by taking variant-specific quality measures into account. Moreover, because the XGBoost prediction probabilities serve as scores, the output of SV-MeCa can be continuously adjusted to user needs in terms of sensitivity and precision.
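The closing remark about adjusting the output between sensitivity and precision can be illustrated with a minimal sketch. It assumes each consensus call already carries a probability-like score and, for evaluation only, a benchmark label. The `ScoredCall` structure, the `evaluate_at_threshold` helper, and the toy values are hypothetical and are not taken from SV-MeCa or the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredCall:
    """A consensus SV call with a model-assigned score (hypothetical structure)."""
    call_id: str
    score: float   # assumed probability of being a true positive, in [0, 1]
    is_true: bool  # benchmark label, available only during evaluation


def evaluate_at_threshold(calls, threshold, n_truth):
    """Keep calls with score >= threshold and return (precision, recall, F1).

    n_truth is the total number of benchmark variants, including those
    that were not reported by any caller.
    """
    kept = [c for c in calls if c.score >= threshold]
    tp = sum(c.is_true for c in kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / n_truth if n_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up toy data for illustration; not results from the paper.
calls = [
    ScoredCall("sv3", 0.60, False),
    ScoredCall("sv1", 0.95, True),
    ScoredCall("sv5", 0.10, False),
    ScoredCall("sv2", 0.80, True),
    ScoredCall("sv4", 0.40, True),
]
N_TRUTH = 4  # assumed benchmark size: one true variant was missed entirely

# Rank calls from strongest to weakest evidence.
ranked = sorted(calls, key=lambda c: c.score, reverse=True)

# A stricter threshold favours precision, a looser one favours recall.
for t in (0.9, 0.5, 0.0):
    p, r, f1 = evaluate_at_threshold(ranked, t, N_TRUTH)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy setting, raising the threshold keeps fewer but more reliable calls, while lowering it recovers more true variants at the cost of additional false positives.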

Item Type: Article
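Creators:
Nkouamedjo Fankep, Rudel Christian; Email: UNSPECIFIED; ORCID: 0000-0002-8751-8202; ORCID Put Code: UNSPECIFIED
Söylev, Arda; Email: UNSPECIFIED; ORCID: UNSPECIFIED; ORCID Put Code: UNSPECIFIED
Kobiela, Anna-Lena; Email: UNSPECIFIED; ORCID: 0009-0000-6391-1280; ORCID Put Code: UNSPECIFIED
Blom, Jochen; Email: UNSPECIFIED; ORCID: UNSPECIFIED; ORCID Put Code: UNSPECIFIED
Ernst, Corinna; Email: UNSPECIFIED; ORCID: UNSPECIFIED; ORCID Put Code: UNSPECIFIED
Motameny, Susanne; Email: UNSPECIFIED; ORCID: 0000-0003-1186-1108; ORCID Put Code: UNSPECIFIED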
URN: urn:nbn:de:hbz:38-801154
Identification Number: 10.1186/s12859-025-06246-6
Journal or Publication Title: BMC Bioinformatics
Volume: 26
Number: 1
Page Range: pp. 1-15
Number of Pages: 15
Date: 20 August 2025
Publisher: Springer Nature
ISSN: 1471-2105
Language: English
Faculty: Faculty of Medicine
Divisions: Cologne Center for Genomics
Cologne Center for Genomics > West German Genome Center (WGGC)
Faculty of Medicine > Other > Centrum für integrierte Onkologie (CIO)
Subjects: Medical sciences Medicine
Uncontrolled Keywords: Structural variants; Variant calling; Whole-genome sequencing; Meta-caller (Language: English)
Open Access Funders: Publikationsfonds UzK
Refereed: Yes
URI: http://kups.ub.uni-koeln.de/id/eprint/80115
