Seminars & Colloquia Calendar
K-means clustering with optimization
Soledad Villar - NYU
Location: CoRE 301
Date & time: Wednesday, 11 April 2018 at 11:00AM - 12:00PM
Abstract: K-means clustering aims to partition a set of n points into k clusters in such a way that each observation belongs to the cluster with the nearest mean, and such that the sum of squared distances from each point to its nearest mean is minimal. In the worst case, this is a hard optimization problem, requiring an exhaustive search over all possible partitions of the data into k clusters in order to find the optimal clustering. At the same time, fast heuristic algorithms for k-means are widely used for data science applications, despite only being guaranteed to converge to local minimizers of the k-means objective.
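As an illustration of the gap between local and global minimizers, the following is a minimal sketch of the k-means objective and of Lloyd's algorithm, one common fast heuristic (the abstract does not name a specific heuristic, so this choice is an assumption). The function names and the four-point data set are hypothetical and chosen only so the arithmetic can be checked by hand.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def lloyd(points: Sequence[Point], init: Sequence[Point],
          max_iter: int = 100) -> Tuple[List[Point], float]:
    """Lloyd's heuristic: alternate nearest-center assignment and mean update."""
    centers = list(init)
    for _ in range(max_iter):
        # Assignment step: label each point with its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # fixed point reached: a local minimizer, not necessarily global
        centers = new_centers
    return centers, kmeans_cost(points, centers)


# Hypothetical data: corners of a 4-by-1 rectangle, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, good_cost = lloyd(points, [(0.0, 0.0), (4.0, 0.0)])  # splits left / right
_, bad_cost = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])   # splits bottom / top

print(good_cost)  # 1.0
print(bad_cost)   # 16.0
```

Both runs stop at a fixed point of the iteration, but the second initialization is stuck at a cost sixteen times larger than the first. This is the behavior the abstract refers to when it says such heuristics are only guaranteed to reach local minimizers.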
In this talk, we consider a semidefinite programming relaxation of the k-means optimization problem. We discuss two regimes where the SDP provides an algorithm with improved clustering guarantees compared to previous results in the literature: (a) for points drawn from isotropic distributions supported in separated balls, the SDP recovers the globally optimal k-means clustering under mild separation conditions; (b) for points drawn from mixtures of distributions with bounded variance, the SDP solution can be rounded to a clustering which is guaranteed to classify all but a small fraction of the points correctly.
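The abstract does not write out the relaxation. For orientation, one standard semidefinite relaxation of k-means is the Peng-Wei SDP, sketched below; whether the talk uses exactly this formulation is an assumption. The symbols D, X, A_t and c_t are notation introduced here, not taken from the abstract.

```latex
% Data x_1, ..., x_n; squared-distance matrix D_{ij} = \|x_i - x_j\|^2.
% For a partition A_1, ..., A_k with cluster means c_t, define
%   X = \sum_{t=1}^{k} \frac{1}{|A_t|} \mathbf{1}_{A_t} \mathbf{1}_{A_t}^{\top}.
% Then the k-means objective can be written as
\[
  \sum_{t=1}^{k} \sum_{i \in A_t} \|x_i - c_t\|^2
  \;=\; \tfrac{1}{2} \operatorname{Tr}(D X).
\]
% Relaxing the combinatorial constraint on X gives the SDP
\[
  \begin{aligned}
    \text{minimize}\quad   & \operatorname{Tr}(D X) \\
    \text{subject to}\quad & \operatorname{Tr}(X) = k, \quad X \mathbf{1} = \mathbf{1}, \\
                           & X \ge 0 \ \text{(entrywise)}, \quad X \succeq 0 .
  \end{aligned}
\]
```

Every partition matrix X of the form above is feasible for this program, so half the SDP optimum is a lower bound on the optimal k-means cost. When the SDP optimum is itself a partition matrix, that partition is a globally optimal k-means clustering.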
An interesting feature of the theoretical tools developed for proving (approximate) optimality of partitions under models (a) and (b) is that they can also be used to certify, a posteriori, the (approximate) optimality of k-means clustering solutions on real data, with no model required.
Special Note to All Travelers
Directions: map and driving directions. If you need information on public transportation, you may want to check the New Jersey Transit page.
Unfortunately, cancellations do occur from time to time. Feel free to call our department: 848-445-6969 before embarking on your journey. Thank you.