Website snapshot

2026-01-30 14:03:39 +00:00 · 2020-01-15 21:51:49 -05:00 · 2020-01-15 21:51:49 -05:00 · 50ec3688a5
commit 50ec3688a5
parent ee0ab66d73
281 changed files with 21066 additions and 0 deletions
--- a/content/research/clusteranalysis/notes/lec11-3.md
+++ b/content/research/clusteranalysis/notes/lec11-3.md
@ -0,0 +1,19 @@
+# K-Medians
+
+This is a variation of k-means clustering where instead of calculating the mean for each cluster to determine its centroid we are going to calculate the median instead.
+
+This has the effect of minimizing error over all the clusters with respect to the Manhattan norm as opposed to the Euclidean squared norm which is minimized in K-means
+
+### Algorithm
+
+Given an initial set of $k$ medians, the algorithm proceeds by alternating between two steps.
+
+**Assignment step**: Assign each observation to the cluster whose median has the leas Manhattan distance.
+
+- Intuitively this is finding the nearest median
+
+**Update Step**: Calculate the new medians to be the centroids of the observations in the new clusters
+
+The algorithm is known to have converged when assignments no longer change. There is no guarantee that the optimum is found using this algorithm. 
+
+The result depends on the initial clusters. It is common to run this multiple times with different starting conditions.