Daniel Aloise and Claudio Contardo
Article (2018)
|
Open Access to the full text of this document Accepted Version Terms of Use: All rights reserved Download (209kB) |
Abstract
We consider the problem of clustering a set of points so as to minimize the maximum intra-cluster dissimilarity, which is strongly NP-hard. Exact algorithms for this problem can handle datasets containing up to a few thousand observations, largely insufficient for the nowadays needs. The most popular heuristic for this problem, the complete-linkage hierarchical algorithm, provides feasible solutions that are usually far from optimal. We introduce a sampling-based exact algorithm aimed at solving large-sized datasets. The algorithm alternates between the solution of an exact procedure on a small sample of points, and a heuristic procedure to prove the optimality of the current solution. Our computational experience shows that our algorithm is capable of solving to optimality problems containing more than 500,000 observations within moderate time limits, this is two orders of magnitude larger than the limits of previous exact methods.
Uncontrolled Keywords
Clustering ; Diameter ; Large-scale optimization
Subjects: |
2700 Information technology > 2700 Information technology 2700 Information technology > 2713 Algorithms |
---|---|
Department: | Department of Computer Engineering and Software Engineering |
Funders: | Fonds de recherche Québec - Nature et technologies (FRQNT), CRSNG / NSERC |
Grant number: | 181909, 435824-2013, 2017-05617 |
PolyPublie URL: | https://publications.polymtl.ca/3251/ |
Journal Title: | Journal of Global Optimization (vol. 71, no. 3) |
Publisher: | Springer |
DOI: | 10.1007/s10898-018-0634-1 |
Official URL: | https://doi.org/10.1007/s10898-018-0634-1 |
Date Deposited: | 05 Sep 2018 16:39 |
Last Modified: | 28 Sep 2024 01:09 |
Cite in APA 7: | Aloise, D., & Contardo, C. (2018). A sampling-based exact algorithm for the solution of the minimax diameter clustering problem. Journal of Global Optimization, 71(3), 613-630. https://doi.org/10.1007/s10898-018-0634-1 |
---|---|
Statistics
Total downloads
Downloads per month in the last year
Origin of downloads
Dimensions