# Dissertation/Thesis Abstract

Covering and Clustering with Outliers and Other Constraints
by Inamdar, Tanmay, Ph.D., The University of Iowa, 2020, 187; 28030415
Abstract (Summary)

Covering and clustering are two of the most important areas in the field of combinatorial optimization. Covering problems deal with the question of judiciously selecting a collection of sets (or geometric objects) from the given set system, such that every element (or point) is covered by some chosen set. On the other hand, clustering is the task of partitioning the given set of points into groups, such that points in each cluster are similar to each other. In this thesis, we consider different covering and clustering problems involving additional constraints. Since problems of this nature are invariably NP-hard, i.e., it is conjectured that there are no efficient algorithms to solve these problems optimally, we resort to designing efficient approximation algorithms for them.

Handling the presence of outliers is a recurring theme in most of the problems considered in this thesis. A small number of noisy outliers in the data may result in the generation of undesirable clusters. However, the quality of clustering can be improved if we exclude these outliers from consideration. Note the twofold difficulty here – we have to identify the outliers in the data, while covering or clustering the remaining non-outliers optimally.

At a high level, the problems considered in this thesis are divided into three parts. In the first part, we look at the Geometric Partial Set Cover problem. Here, we are given a set of points and a collection of geometric objects such as disks or squares. We want to find a sub-collection of the geometric objects, such that at least a specified fraction of the points — say 90% — are covered. Our results imply improved approximation guarantees for many versions of Geometric Partial Set Cover. In many cases, these results match that for the corresponding full coverage versions, i.e., where all points must be covered.

Subsequently, we consider a more general version of the scenario, where the set of points is divided into a number of different colors, and each color has a specific coverage requirement. This models the situation where the given population is divided into multiple demographics, and we want to provide the desired amount of coverage to each demographic. We look at this scenario from the lens of covering as well as clustering, and design approximation algorithms for both.

Finally, we look at clustering problems involving other kinds of constraints. Here, we are supposed to find a set of centers that provide some kind service to nearby points. In one setting, we consider the issue of fault tolerance. Here, even if a small number of centers fail, enough points should continue to receive service from a nearby center. In another setting, we want to ensure that there are enough centers to go around, in the interest of not overburdening any particular center beyond its specified capacity.

Another recurring theme in our algorithms is that of black-box reductions. We are often able to reduce a more constrained version of a problem to its simpler version in a black-box manner. Thus, our algorithms build upon the existing algorithms in the literature for the said simpler problems, without having to reinvent the wheel. Another advantage of this approach is that it extends the reusability of the algorithm – an improvement in the algorithm for the simpler problem directly implies an improvement for the more constrained problem, as long as the improved algorithm satisfies certain properties. Therefore, we believe that the algorithms and the techniques developed in this thesis may have wider applicability to other kinds of problems.

Indexing (document details)
 Advisor: Varadarajan, Kasturi Commitee: Chekuri, Chandra, Lin, Qihang, Pemmaraju, Sriram, Stump, Aaron School: The University of Iowa Department: Computer Science School Location: United States -- Iowa Source: DAI-B 82/3(E), Dissertation Abstracts International Source Type: DISSERTATION Subjects: Computer science, Statistical physics, Applied Mathematics Keywords: Approximation Algorithms, Covering and Clustering Problems, Theoretical Computer Science Publication Number: 28030415 ISBN: 9798672127798