Schrinner, Sven, Goel, Manish, Wulfert, Michael, Spohr, Philipp, Schneeberger, Korbinian and Klau, Gunnar W. ORCID: 0000-0002-6340-0090 (2021). Using the longest run subsequence problem within homology-based scaffolding. Algorithms. Mol. Biol., 16 (1). LONDON: BMC. ISSN 1748-7188

Full text not available from this repository.

Abstract

Genome assembly is one of the most important problems in computational genomics. Here, we suggest addressing an issue that arises in homology-based scaffolding, that is, when linking and ordering contigs to obtain larger pseudo-chromosomes by means of a second incomplete assembly of a related species. The idea is to use alignments of binned regions in one contig to find the most homologous contig in the other assembly. We show that ordering the contigs of the other assembly can be expressed by a new string problem, the longest run subsequence problem (LRS). We show that LRS is NP-hard and present reduction rules and two algorithmic approaches that, together, are able to solve large instances of LRS to provable optimality. All data used in the experiments as well as our source code are freely available. We demonstrate its usefulness within an existing larger scaffolding approach by solving realistic instances resulting from partial Arabidopsis thaliana assemblies in short computation time.

Item Type: Journal Article
Creators:
Creators | Email | ORCID | ORCID Put Code
Schrinner, Sven | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Goel, Manish | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Wulfert, Michael | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Spohr, Philipp | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Schneeberger, Korbinian | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Klau, Gunnar W. | UNSPECIFIED | orcid.org/0000-0002-6340-0090 | UNSPECIFIED
URN: urn:nbn:de:hbz:38-585764
DOI: 10.1186/s13015-021-00191-8
Journal or Publication Title: Algorithms. Mol. Biol.
Volume: 16
Number: 1
Date: 2021
Publisher: BMC
Place of Publication: LONDON
ISSN: 1748-7188
Language: English
Faculty: Unspecified
Divisions: Unspecified
Subjects: no entry
Uncontrolled Keywords:
Keywords | Language
GENOME ASSEMBLIES | Multiple languages
Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology | Multiple languages
URI: http://kups.ub.uni-koeln.de/id/eprint/58576
