Variable selection for data aggregated from different sources with group of variable structure

Broc, Camilo Lucien

Supervised by:

Borja Calvo Molinos (Supervisor)

Defense university: Universidad del País Vasco - Euskal Herriko Unibertsitatea

Defense date: 14 November 2019

Committee:

Christophe Ambroise (Chair)
Stéphane Robin (Secretary)
Borja Calvo Molinos (Member)
Astrid Jourdan (Member)
Hélène Jacquemin Gadda (Member)

Department:

Ciencia de la Computación e Inteligencia Artificial

Type: Thesis

Teseo: 154306

Abstract

Over the last few decades, the amount of genetic data available on populations has grown dramatically. On the one hand, refinements in chemical technologies have made it possible to obtain the genome of an individual at an accessible cost. On the other hand, consortia of institutions and laboratories around the world have made it possible to collect data on a wide variety of individuals and populations. This wealth of data has raised hopes of understanding the deepest mechanisms involved in the functioning of our cells. Notably, genetic epidemiology is the field that studies the relationship between genetic features and the onset of disease. These analyses have required specific statistical methods, largely because of the dimensions of the available data: in genetics, information is spread over a number of variables that is large compared to the number of observations. This dissertation presents two contributions. The first project, called PIGE (Pathway Interaction Gene Environment), deals with the assessment of gene-environment interactions. The second aims to develop variable selection methods for data that have group structures in both the variables and the observations.