Grouping harmony searchprinciples, novel adaptations and practical applications

  1. LANDA TORRES, ITZIAR
Supervised by:
  1. Sancho Salcedo Sanz Director
  2. Sergio Gil López Co-director

Defence university: Universidad de Alcalá

Fecha de defensa: 19 November 2013

Committee:
  1. Lucas Cuadra Rodriguez Chair
  2. José Antonio Portilla Figueras Secretary
  3. José Miguel Leiva Murillo Committee member
  4. José Ángel Fernández Prieto Committee member
  5. Javier del Ser Lorente Committee member

Type: Thesis

Teseo: 373440 DIALNET

Abstract

Similarly to hard clustering, the baseline paradigm behind grouping problems lies on how to gather elements of a certain set into at least one group such that each object is strictly in one group. However, grouping problems take a step further beyond hard clustering by considering a cost function to be optimized that depends roughly on the composition of the groups, i.e. the properties and/or hierarchical relationship between the elements arranged in a certain group determines the optimality of the clustering topology. Focusing on this particular class of optimization problems, Falkenauer presented in the early 90s a contribution that united previous approaches for tackling grouping problems; he proposed a specic encoding strategy to represent the solution space of such problems at a minimum redundancy, and designed improvisation operators specially tailored for grouping-encoded genetic algorithms. Unfortunately, the related literature thereon has not yet succeeded to further decrease the redundancy when encoding grouping-like problems. This Thesis joins this line of research by undertaking, on the basis of a practical methodology, two related problem formulations: the dissertation first presents a typical grouping approach in which N elements have to be grouped into M groups, followed by a generalization of such a problem where M is assumed to be a priori unknown. To be concise, the contributions of this Thesis are: 1) to tackle the aforementioned problems by means of proposing a novel adaption of the so-called Harmony Search algorithm, giving rise to the Grouping Harmony Search, GHS; and 2) to design improvisation operators specifically tailored to match the characteristics of both the problem and the algorithm itself. Following the aforementioned practical methodology, a set of dynamic application scenarios has been selected as the substrate on which to design, implement and validate the aforementioned technique: first, network deployment problems (as those encountered in telecommunication networks) are modeled as instances of the so-called Switch Location Problem (SLP), which essentially correspond to capacitated - i.e. cost-constrained - grouping problems. The network deployment problem are next spanned towards the Multi-Type Access Node Location Problem (MTANLP), which generalizes the SLP by considering a variable number of groups. As both problem formulations model the design of fixed network deployments, the applicability of the derived algorithms is also veried for the citywide planning of open WiFi access networks. The feasibility and potentiality of the proposed GHS scheme is evidenced by modeling reality based scenarios of diverse complexity, with an emphasis on large-scale problem instances aiming at assessing its scalability. The second important research contribution presented in this Thesis generalizes the acquired knowledge throughout the scenarios to which the proposed grouping scheme has been applied by considering its extension to tackle several classical clustering problems. To date, substantial efforts have been conducted towards establishing the optimal number of clusters in this particular branch of descriptive data analytics, but the activity around analyzing encoding schemes and their redundancy has not developed at the same pace. This Thesis will delve into this issue by thoroughly studying and benchmarking the performance of grouping encoding against other clustering techniques from the literature. The conclusions drawn - that hold for both clustering and grouping problems - reveal that the performance of the proposed algorithm improves when utilizing encoding schemes that univocally (i.e. each solution is represented by just one point in the solution space and vice versa) represent the problem at hand. This premise is buttressed by a novel intelligent sorting method that ensures that each solution is always uniquely encoded. Likewise, the definition of specialized operators is crucial to enhance an optimal information exchange along the search process; based on this premise, 1) improvisation operators are designed based on the metric utilized for evaluating the obtained solutions; 2) constraints imposed in the problem formulation are considered while applying the improvisation operators; and 3) differential schemes are adopted. The dissertation concludes that by implementing this three-fold design approach, not only more accurate solutions are obtained, but also less computational cost is required, which is foreseen to become a key concept when extrapolating the proposed scheme to avant-garde challenges such as Big Data.