mirror of
https://github.com/Brandon-Rozek/website.git
synced 2025-10-10 15:01:15 +00:00
Website snapshot
parent
ee0ab66d73
commit
50ec3688a5
281 changed files with 21066 additions and 0 deletions
24
content/research/clusteranalysis/notes/lec4-2.md
Normal file
@ -0,0 +1,24 @@
# Revisiting Similarity Measures

## Manhattan Distance

Manhattan distance also applies to binary vectors. In that setting it is known as the Hamming distance: the number of positions at which two binary vectors differ.
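
As a minimal sketch of the idea above, the following computes the Manhattan distance between two binary vectors and so counts the differing bits. The function name `hamming_distance` and the sample vectors are illustrative choices, not taken from the lecture.

```python
def hamming_distance(u, v):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # For 0/1 entries, |a - b| is 1 exactly when the bits differ,
    # so the Manhattan sum equals the number of differing bits.
    return sum(abs(a - b) for a, b in zip(u, v))


u = [1, 0, 1, 1, 0]  # illustrative binary vectors
v = [1, 1, 0, 1, 0]
print(hamming_distance(u, v))  # 2
```

The two vectors differ in the second and third positions only, so the distance is 2.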

## Ordinal Variables

Ordinal variables can be treated as if they were on an interval scale.

First, replace each ordinal value by its rank $r_{if}$, where $f$ is the variable and $i$ is the object. Then map the range of each variable onto the interval $[0, 1]$ by replacing $r_{if}$ with

$$
z_{if} = \frac{r_{if} - 1}{M_f - 1}
$$

where $M_f$ is the maximum rank of variable $f$.
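
The rescaling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `scale_ranks` and `ordinal_distance` and the three levels are hypothetical, ranks are assumed to run from $1$ to $M_f$ in the order given, and the distance between two scaled values is assumed to be their absolute difference.

```python
from fractions import Fraction


def scale_ranks(levels):
    """Map ordered levels onto [0, 1] using z = (r - 1) / (M - 1)."""
    max_rank = len(levels)  # M_f: the maximum rank
    if max_rank < 2:
        raise ValueError("need at least two levels to scale")
    # enumerate(..., start=1) assigns ranks 1..M_f in the order given.
    return {
        level: Fraction(rank - 1, max_rank - 1)
        for rank, level in enumerate(levels, start=1)
    }


def ordinal_distance(z, a, b):
    """Assumed distance: absolute difference of the scaled values."""
    return abs(z[a] - z[b])


z = scale_ranks(["low", "medium", "high"])  # hypothetical ordinal levels
print(z["medium"])  # 1/2
print(ordinal_distance(z, "low", "high"))  # 1
```

Exact fractions are used here so the scaled values match the hand-computed ones without floating-point rounding.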

### Example

With four class years, $M_f = 4$, so the ranks $1$ through $4$ map to:

Freshman = $0$, Sophomore = $\frac{1}{3}$, Junior = $\frac{2}{3}$, Senior = $1$

$d(freshman, senior) = |0 - 1| = 1$

$d(junior, senior) = \left|\frac{2}{3} - 1\right| = \frac{1}{3}$