Patel, Vipul Kumar (2016). Genotyping by sequencing from sparse sequenced genomes representations from bi- and multi- parental mapping population using a HMM approach. PhD thesis, Universität zu Köln.
PDF: Thesis.pdf - Accepted Version (7MB)
Abstract
Genotyping is a key element of successful molecular breeding, gene network discovery, and assessment of genetic diversity. The advent of next-generation sequencing has enabled high-resolution genotyping of thousands or millions of markers per individual in a single analysis. Such dense information can be used to identify genetic loci associated with a trait of interest. The development of multiplexing allows whole populations to be sequenced in a single run, vastly reducing the time and cost per sample. This high-throughput genotyping is known as genotyping-by-sequencing (GBS). However, GBS involves a trade-off: the total number of reads per run must be distributed across all samples, which reduces the coverage per sample. The reads are currently not distributed uniformly, which leaves some samples with only partial sequence coverage. This thesis presents a solution for handling such data by imputing missing markers with Hidden Markov Model (HMM) approaches for bi- and multi-parental mapping populations. The developed methods were not only validated by simulation studies but also applied to several real mapping population datasets. For the bi-parental mapping populations, data were derived from three different taxa (Arabidopsis thaliana, Sorghum bicolor and Fragaria vesca), and for the multi-parental mapping population, the Arabidopsis multi-parental RIL (AMPRIL) population was genotyped. The successful high-resolution genotyping of such mapping populations from sparse sequencing data demonstrates the advantages of the developed method and its positive effects on downstream analyses such as quantitative trait analysis or genome-wide association studies. This thesis additionally provides a theoretical approach to, and an implementation of, hybrid correction of sequencing errors in third-generation sequencing data from Pacific Biosciences.
Item Type: Thesis (PhD thesis)
Creators: Patel, Vipul Kumar
Corporate Creators: Max-Planck-Institut für Züchtungsforschung in Köln, Abteilung für Entwicklungsbiologie der Pflanzen
URN: urn:nbn:de:hbz:38-70297
Date: 7 March 2016
Language: English
Faculty: Faculty of Mathematics and Natural Sciences
Divisions: Faculty of Mathematics and Natural Sciences > Department of Biology > Institute for Genetics
Subjects: Data processing Computer science Life sciences
Date of oral exam: 18 April 2016
Refereed: Yes
URI: http://kups.ub.uni-koeln.de/id/eprint/7029