Università degli Studi di Udine OpenUniud - Institutional archive of doctoral theses

Use this identifier to cite or link to this document: http://hdl.handle.net/10990/773

Authors: Zapparoli, Ettore
Supervisor affiliated with the University: MORGANTE, MICHELE
Title: Identification of structural variation in Zea mays: use of paired-end mapping and development of a novel algorithm based on split reads
Abstract (in English): The present work is part of the ERC-funded project NOVABREED, whose objective is the characterization of the pan-genome of Vitis vinifera and Zea mays through the application and development of in silico methods for the analysis of Next Generation Sequencing (NGS) data. The concept of the pan-genome arises from the observation that some DNA sequences are not shared by all individuals of a species, and that a single genome is therefore not enough to describe the species. The DNA segments shared by all individuals of a species constitute the core genome, while those not present in all individuals compose the dispensable genome. Here, we focused on the genome of Zea mays, a complex and highly repetitive genome of approximately 2.5 Gb (Schnable et al., Science 2009). Structural variants are an important source of genetic variation in plants, mostly due to large (>1000 bp) insertions and deletions of transposable elements (TEs), and are an important component of the dispensable genome. The dispensable fraction of the maize genome was characterized through the analysis of structural variants (SVs) in 7 inbred lines selected from the parental lines of the MAGIC maize population. As part of the project, a new algorithm (Walle) for the detection of insertions relying on split-read (SR) mapping was developed, and its performance was compared with that of existing tools. Results showed that Walle performed better than existing tools. Deletions were detected using publicly available tools, while insertions were detected using tools previously developed in our lab and the tool developed in the present project. A total of 48,904 deletions and 75,370 insertions were identified, accounting respectively for 0.56 Gb of sequences present in the B73 reference genome and absent in at least one other line, and 0.81 Gb of sequences present in at least one other line while absent in B73.
Taken together, these results confirm previous pan-genome estimates (Morgante et al., Curr Opin Plant Biol 2007), in which the authors estimated the relative size of the dispensable genome as 50% of the pan-genome, compared to our estimate of 48%. The composition of the dispensable genome was investigated, confirming that a large fraction of extant variation in maize is due to LTR retrotransposon insertions and that most of them occurred relatively recently. Although most SVs are located in intergenic regions, some are located in genes and may disrupt exons, with possible evolutionary consequences. We therefore assessed the function of genes affected by deletions and insertions. Nested elements were investigated in greater detail, and we confirmed that LTR retrotransposons form nesting structures more often than expected by chance alone, as previously reported (Jiang and Wessler, Plant Cell 2001). Moreover, nesting patterns were investigated, finding that most nesting events occur within a few families of LTR retrotransposons. The main results of the present work are a) a software tool for the accurate identification of insertions in the genome, which has been shown to outperform existing tools, has been used for the identification of insertions in Zea mays, and can be used on the genome of any species, and b) the characterization of the dispensable genome of Zea mays, which yielded important information on the patterns of movement of transposable elements, on their nesting patterns, and on the function of genes affected by the movement of TEs.
Keywords: Zea mays; Maize; Structural variants; Sequencing NGS; Pan-genome; Dispensable genome; Transposable elements; Bioinformatics
MIUR: Sector BIO/18 - Genetics
Sector AGR/07 - Agricultural Genetics
Language: eng
Date: 17 March 2017
Doctoral program: PhD in Agricultural Sciences and Biotechnology (Dottorato di ricerca in Scienze e biotecnologie agrarie)
Doctoral cycle: 29
Degree-granting university: Università degli Studi di Udine
Place of defense: Udine
Other information: Co-supervisor: Fabio Marroni - Affiliated institute: Istituto di Genomica Applicata (IGA)
Citation: Zapparoli, E. Identification of structural variation in Zea mays: use of paired-end mapping and development of a novel algorithm based on split reads. (Doctoral Thesis, Università degli Studi di Udine, 2017).
In: 01 - Tesi di dottorato (doctoral theses)

Full text:

File: tesi_zapparoli_definitiva.pdf
Description: Thesis document
Size: 6.85 MB
Format: Adobe PDF

This document is distributed under a Creative Commons License.
