Novel evolutionary-based methods for the robust training of SVR and GMDH regressors

  1. GASCÓN MORENO, JORGE
Supervised by:
  1. Sancho Salcedo Sanz Director
  2. José Antonio Portilla Figueras Co-director

Defence university: Universidad de Alcalá

Defence date: 06 October 2016

Committee:
  1. Silvia Jiménez Fernández Chair
  2. Enrique Alexandre Cortizo Secretary
  3. Javier del Ser Lorente Committee member
  4. Carlos Casanova Mateo Committee member
  5. L. Carro Calvo Committee member

Type: Thesis

Teseo: 524983

Abstract

This Ph.D. thesis proposes several novel improvements to two state-of-the-art Machine Learning algorithms: Support Vector Regression (SVR) and the Group Method of Data Handling (GMDH). For SVR, a new multi-parametric evolutionary SVR is proposed. This new algorithm uses a different value of the γ parameter for each dimension of the feature space. With this many parameters, a classic grid search is not feasible because of its computational requirements, so in this thesis an evolutionary approach is successfully applied to obtain the optimal values of these SVR parameters. For the GMDH network, a novel construction algorithm based on a hyper-heuristic approach is proposed. A hyper-heuristic is a novel concept related to evolutionary computation, in which the algorithm encodes several smaller heuristics that can be applied sequentially to solve a given optimization problem. In this specific application, several basic heuristics are encoded in an evolutionary algorithm to form a hyper-heuristic approach that constructs robust versions of GMDH networks for regression problems. A final contribution of this thesis is the proposal of new validation methods to better estimate the performance of regression techniques in data-driven problems. The idea is to obtain better models from the training phase of the algorithms, so that results on the test set improve, mainly in terms of training time and overall system performance, with respect to classical evaluation methods such as K-fold cross-validation. All the methods proposed and developed in the thesis are experimentally evaluated on benchmark and real-world data-driven regression problems.
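As an illustration of the per-dimension γ idea mentioned in the abstract, the following is a minimal sketch, not the thesis's implementation. It assumes a Gaussian (RBF) kernel, which the abstract does not state explicitly, and the names `anisotropic_rbf` and `grid_size` are hypothetical helpers introduced here. The second helper only counts how many candidate settings an exhaustive grid would contain, which suggests why a grid search stops being practical once every feature has its own γ.

```python
import math


def anisotropic_rbf(x, z, gammas):
    """RBF kernel with one gamma per feature dimension (assumed kernel form)."""
    return math.exp(-sum(g * (a - b) ** 2 for g, a, b in zip(gammas, x, z)))


def grid_size(values_per_parameter, n_parameters):
    """Number of candidate settings in an exhaustive grid search."""
    return values_per_parameter ** n_parameters


x, z = [1.0, 2.0, 3.0], [1.5, 2.0, 1.0]

# Equal gammas reproduce the usual single-gamma RBF kernel.
k_iso = anisotropic_rbf(x, z, [0.1, 0.1, 0.1])

# Different gammas weight each feature separately; a gamma of 0
# makes the kernel ignore that feature entirely.
k_aniso = anisotropic_rbf(x, z, [0.1, 5.0, 0.0])

# Ten candidate values for a single shared gamma, versus ten candidate
# values for each of twenty per-feature gammas (illustrative numbers).
single_gamma_grid = grid_size(10, 1)
per_feature_grid = grid_size(10, 20)
```

Under these illustrative numbers the grid grows from 10 to 10^20 candidate settings, which is the kind of growth that motivates a stochastic search such as an evolutionary algorithm.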