Beggel, Bastian, Neumann-Fraune, Maria, Kaiser, Rolf, Verheyen, Jens and Lengauer, Thomas (2013). Inferring Short-Range Linkage Information from Sequencing Chromatograms. PLoS One, 8 (12). SAN FRANCISCO: PUBLIC LIBRARY SCIENCE. ISSN 1932-6203

Full text not available from this repository.

Abstract

Direct Sanger sequencing of viral genome populations yields multiple ambiguous sequence positions. It is not straightforward to derive linkage information from sequencing chromatograms, which in turn hampers the correct interpretation of the sequence data. We present a method for determining the variants existing in a viral quasispecies in the case of two nearby ambiguous sequence positions by exploiting the effect of sequence context-dependent incorporation of dideoxynucleotides. The computational model was trained on data from sequencing chromatograms of clonal variants and was evaluated on two test sets of in vitro mixtures. The approach identified the mixture components with high accuracy: 97.4% on a test set in which the positions to be analyzed are only one base apart, and 84.5% on a test set in which the ambiguous positions are separated by three bases. In silico experiments suggest two major limitations of our approach in terms of accuracy. First, due to a basic limitation of Sanger sequencing, it is not possible to reliably detect minor variants with a relative frequency of 10% or less. Second, the model cannot distinguish between mixtures of two or four clonal variants if one of two sets of linear constraints is fulfilled. Furthermore, the approach requires repetitive sequencing of all variants that might be present in the mixture to be analyzed. Nevertheless, the effectiveness of our method on the two in vitro test sets shows that short-range linkage information of two ambiguous sequence positions can be inferred from Sanger sequencing chromatograms without any further assumptions on the mixture composition. Additionally, our model provides new insights into the established and widely used Sanger sequencing technology. The source code of our method is made available at http://bioinf.mpi-inf.mpg.de/publications/beggel/linkageinformation.zip.
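
The linkage problem described in the abstract can be illustrated with a small sketch. It is not the authors' model, which relies on context-dependent peak effects. It only shows why per-position allele frequencies alone leave the variant composition undetermined. The function names and the two-allele-per-position setup are assumptions made for this illustration.

```python
# Illustrative sketch only (not the published method): two ambiguous
# positions, each with two alleles. Variant "11" carries the first allele
# at both positions, "12" the first allele at position 1 only, and so on.

def haplotype_range(p1, p2):
    """Feasible range for the frequency of variant "11".

    p1, p2: frequency of the first allele at position 1 and position 2.
    """
    lower = max(0.0, p1 + p2 - 1.0)
    upper = min(p1, p2)
    return lower, upper


def haplotypes(p1, p2, f11):
    """Frequencies of variants (11, 12, 21, 22) for a chosen f11."""
    return (f11, p1 - f11, p2 - f11, 1.0 - p1 - p2 + f11)


if __name__ == "__main__":
    p1, p2 = 0.5, 0.5  # hypothetical 50/50 ambiguity at both positions
    lower, upper = haplotype_range(p1, p2)
    for f11 in (lower, (lower + upper) / 2, upper):
        print(f11, haplotypes(p1, p2, f11))
```

With equal allele frequencies at both positions, a mixture of two variants (either pairing) and a mixture of all four variants produce the same per-position frequencies, so additional signal from the chromatogram is needed to separate them.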

Item Type: Journal Article
Creators:
Creators | Email | ORCID | ORCID Put Code
Beggel, Bastian | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Neumann-Fraune, Maria | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Kaiser, Rolf | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Verheyen, Jens | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Lengauer, Thomas | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
URN: urn:nbn:de:hbz:38-469935
DOI: 10.1371/journal.pone.0081687
Journal or Publication Title: PLoS One
Volume: 8
Number: 12
Date: 2013
Publisher: PUBLIC LIBRARY SCIENCE
Place of Publication: SAN FRANCISCO
ISSN: 1932-6203
Language: English
Faculty: Unspecified
Divisions: Unspecified
Subjects: no entry
Uncontrolled Keywords:
Keywords | Language
DNA-POLYMERASE; HEPATITIS-B; VARIANTS; THERAPY; PHASE | Multiple languages
Multidisciplinary Sciences | Multiple languages
URI: http://kups.ub.uni-koeln.de/id/eprint/46993
