DIRECTOR
RESEARCH TEAM
José Castresana.
COLLABORATING INSTITUTIONS
DESCRIPTION
Most comparative analyses of genomes first require the construction of appropriate multiple alignments of the genes to be studied. The large number of alignments derived from all identified genes is later used to estimate phylogenetic trees, which greatly aid the interpretation of sequenced genomes. Very few studies have sought to determine which alignments, or which blocks within alignments, are reliable enough for use in subsequent comparative genomics and phylogenomic work.
In this project we propose to apply a wide range of bioinformatic techniques to address the importance of multiple alignments in phylogenomics. We will specifically study the relative importance of horizontal gene transfer as opposed to gene duplications and losses (i.e., the problem of paralogy) in ancient evolution. Horizontal gene transfer may have been overestimated because problems in multiple alignments were not taken into account. We will also determine optimal divergence levels for phylogenetic analysis, with particular attention to the divergence levels of several genomic elements in mammals, such as exons and introns.
First, alignments will be simulated at different levels of divergence, at both the protein and DNA levels, along known trees in order to validate the reliability of current alignment methods. In addition, our method for extracting conserved blocks from alignments, Gblocks, will be applied to the simulated alignments in order to determine the best parameters for the method as well as the improvements it provides. Several new routines will be implemented in Gblocks to achieve full automation in the selection of reliable blocks.
Finally, we will develop a computer program for the rapid visualization of the quality of large numbers of multiple alignments.
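To make the idea of selecting reliable blocks from an alignment concrete, the following minimal Python sketch scores each alignment column and keeps only sufficiently long runs of conserved, gap-free columns. It is an illustrative simplification and not the actual Gblocks procedure: the function names, the scoring rule (fraction of sequences sharing the most common residue), and the parameters min_conservation and min_block_length are assumptions made for this example.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column as the fraction of sequences sharing the most
    common residue. Columns containing a gap ('-') score 0.0.
    (Hypothetical scoring rule, chosen only for illustration.)"""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        if "-" in column:
            scores.append(0.0)
        else:
            top_count = Counter(column).most_common(1)[0][1]
            scores.append(top_count / len(column))
    return scores


def select_blocks(alignment, min_conservation=0.5, min_block_length=3):
    """Return half-open (start, end) column ranges for contiguous runs of
    columns scoring at least min_conservation and spanning at least
    min_block_length columns. Both thresholds are assumed parameters."""
    scores = column_conservation(alignment)
    blocks = []
    start = None
    for i, score in enumerate(scores):
        if score >= min_conservation:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_block_length:
                blocks.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start >= min_block_length:
        blocks.append((start, len(scores)))
    return blocks


def extract_blocks(alignment, blocks):
    """Concatenate the selected column ranges for every sequence."""
    return ["".join(seq[s:e] for s, e in blocks) for seq in alignment]


if __name__ == "__main__":
    # Toy protein alignment, invented for this example.
    toy = ["MKTAYIAK-QR", "MKTAYLAKAQR", "MKSAYIGK-QR"]
    kept = select_blocks(toy)
    print(kept)
    print(extract_blocks(toy, kept))
```

On the toy alignment, the gapped column and the short run after it are discarded, so only the first eight columns are retained. Raising min_conservation or min_block_length makes the selection stricter, which is the kind of parameter trade-off that simulated alignments with known trees could help to calibrate.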