Removing raw HTML
This commit is contained in:
parent e06d45e053
commit 572d587b8e
33 changed files with 373 additions and 386 deletions
@@ -24,11 +24,11 @@ An extensive list of similarity measures for binary data exist, the reason for s
 In some cases, zero-zero matches are equivalent to one-one matches and therefore should be included in the calculated similarity measure

-<u>Example</u>: Gender, where there is no preference as to which of the two categories should be coded as zero or one
+**Example**: Gender, where there is no preference as to which of the two categories should be coded as zero or one

 In other cases the inclusion or otherwise of the matches is more problematic

-<u>Example</u>: When the zero category corresponds to the genuine absence of some property, such as wings in a study of insects
+**Example**: When the zero category corresponds to the genuine absence of some property, such as wings in a study of insects

 The question that then needs to be asked is do the co-absences contain useful information about the similarity of the two objects?
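To make that distinction concrete, here is a minimal Python sketch (the two binary vectors are made up) comparing the simple matching coefficient, which counts co-absences, with the Jaccard coefficient, which ignores them.

```
# Minimal sketch: effect of counting co-absences (0-0 matches) on binary similarity.
# The example vectors are hypothetical; a, b, c, d are the usual 2x2 match counts.

def match_counts(x, y):
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # 1-1 matches
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)  # 0-0 matches (co-absences)
    return a, b, c, d

def simple_matching(x, y):
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)                 # co-absences count as agreement

def jaccard(x, y):
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0   # co-absences ignored

x = [1, 0, 0, 0, 1, 0, 0, 0]   # e.g. presence/absence of traits
y = [1, 0, 0, 0, 0, 0, 0, 1]

print(simple_matching(x, y))   # 0.75  -- shared absences inflate the similarity
print(jaccard(x, y))           # 0.333 -- only shared presences count
```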
@@ -152,7 +152,7 @@ Since for correlation coefficients we have that $-1 \le \phi_{ij} \le 1$ with th
 The use of correlation coefficients in this context is far more contentious than its noncontroversial role in assessing the linear relationship between two variables based on $n$ observations.

-When correlations between two individuals are used to quantify their similarity the <u>rows of the data matrix are standardized</u>, not its columns.
+When correlations between two individuals are used to quantify their similarity the *rows of the data matrix are standardized*, not its columns.

 **Disadvantages**

@@ -164,7 +164,7 @@ In addition, the correlation coefficient is unable to measure the difference in
 However, the use of a correlation coefficient can be justified for situations where all of the variables have been measured on the same scale and precise values taken are important only to the extent that they provide information about the subject's relative profile

-<u>Example:</u> In classifying animals or plants, the absolute size of the organisms or their parts are often less important than their shapes. In such studies the investigator requires a dissimilarity coefficient that takes the value zero if and only if two individuals' profiles are multiples of each other. The angular separation dissimilarity measure has this property.
+**Example:** In classifying animals or plants, the absolute size of the organisms or their parts are often less important than their shapes. In such studies the investigator requires a dissimilarity coefficient that takes the value zero if and only if two individuals' profiles are multiples of each other. The angular separation dissimilarity measure has this property.
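As a rough sketch of that property (the measurements below are invented), one common form of the angular separation dissimilarity is $1 - \cos\theta$ between two row profiles, which is zero exactly when one profile is a positive multiple of the other, regardless of absolute size.

```
import math

def angular_dissimilarity(x, y):
    # 1 - cosine of the angle between two row profiles;
    # zero iff one profile is a positive multiple of the other.
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return 1 - dot / norm

small = [2, 1, 4]    # hypothetical measurements on one organism
large = [6, 3, 12]   # same shape, three times the size
other = [4, 1, 2]    # different shape

print(angular_dissimilarity(small, large))  # 0.0 (same profile up to scale)
print(angular_dissimilarity(small, other))  # > 0
```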

 **Further considerations**
@@ -271,25 +271,26 @@ By also employing within-group correlations, the Mahalanobis distance takes acco
 The use of Mahalanobis implies that the investigator is willing to **assume** that the covariance matrices are at least approximately the same in the two groups. When this is not so, this measure is an inappropriate inter-group measure. Other alternatives exist such as the one proposed by Anderson and Bahadur

-<img src="http://proquest.safaribooksonline.com.ezproxy.umw.edu/getfile?item=cjlhZWEzNDg0N2R0cGMvaS9zMG1nODk0czcvN3MwczM3L2UwLXMzL2VpL3RtYTBjMGdzY2QwLmkxLWdtaWY-" alt="equation">
+![equation](http://proquest.safaribooksonline.com.ezproxy.umw.edu/getfile?item=cjlhZWEzNDg0N2R0cGMvaS9zMG1nODk0czcvN3MwczM3L2UwLXMzL2VpL3RtYTBjMGdzY2QwLmkxLWdtaWY-)

 Another alternative is the *normal information radius* suggested by Jardine and Sibson

-<img src="http://proquest.safaribooksonline.com.ezproxy.umw.edu/getfile?item=cjlhZWEzNDg0N2R0cGMvaS9zMG1nODk0czcvN3MwczM4L2UwLXMzL2VpL3RtYTBjMGdzY2QwLmkxLWdtaWY-" alt="equation">
+![equation](http://proquest.safaribooksonline.com.ezproxy.umw.edu/getfile?item=cjlhZWEzNDg0N2R0cGMvaS9zMG1nODk0czcvN3MwczM4L2UwLXMzL2VpL3RtYTBjMGdzY2QwLmkxLWdtaWY-)
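A small numerical sketch of the Mahalanobis generalized distance referred to above, in its usual form $D^2 = (\bar{x}_A - \bar{x}_B)' S^{-1} (\bar{x}_A - \bar{x}_B)$; the group means and the pooled covariance matrix $S$ below are made up, and $S$ is assumed common to both groups, as noted.

```
import numpy as np

# Hypothetical group summaries: mean vectors for groups A and B,
# and a pooled within-group covariance matrix S assumed common to both groups.
mean_a = np.array([5.0, 3.2])
mean_b = np.array([6.5, 2.1])
S = np.array([[1.0, 0.3],
              [0.3, 0.5]])

diff = mean_a - mean_b
d2 = diff @ np.linalg.solve(S, diff)   # (mean_A - mean_B)' S^{-1} (mean_A - mean_B)
print(d2)

# With an identity covariance matrix this reduces to squared Euclidean distance:
print(diff @ diff)
```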
 ### Inter-group Proximity Based on Group Summaries for Categorical Data

 Approaches for measuring inter-group dissimilarities between groups of individuals for which categorical variables have been observed have been considered by a number of authors. Balakrishnan and Sanghvi (1968), for example, proposed a dissimilarity index of the form

-![]()
+![]()

-where $p_{Akl}$ and $p_{Bkl}$ are the proportions of the lth category of the kth variable in group A and B respectively, , ck + 1 is the number of categories for the kth variable and p is the number of variables.
+where $p_{Akl}$ and $p_{Bkl}$ are the proportions of the lth category of the kth variable in group A and B respectively, , ck + 1 is the number of categories for the kth variable and p is the number of variables.

 Kurczynski (1969) suggested adapting the generalized Mahalanobis distance, with categorical variables replacing quantitative variables. In its most general form, this measure for inter-group distance is given by

-![]()
+![]()

-where  contains sample proportions in group A and  is defined in a similar manner, and  is the m × m common sample covariance matrix, where .
+where  contains sample proportions in group A and  is defined in a similar manner, and  is the m × m common sample covariance matrix, where .

 ## Weighting Variables
@@ -11,9 +11,9 @@ Hierarchal Clustering techniques can be subdivided depending on the method of go
 First there are two different methods in forming the clusters *Agglomerative* and *Divisive*

-<u>Agglomerative</u> is when you combine the n individuals into groups through each iteration
+**Agglomerative** is when you combine the n individuals into groups through each iteration

-<u>Divisive</u> is when you are separating one giant group into finer groupings with each iteration.
+**Divisive** is when you are separating one giant group into finer groupings with each iteration.

 Hierarchical methods are an irrevocable algorithm, once it joins or separates a grouping, it cannot be undone. As Kaufman and Rousseeuw (1990) colorfully comment: *"A hierarchical method suffers from the defect that it can never repair what was done in previous steps"*.
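As an illustration of the agglomerative direction, a toy single-linkage sketch on made-up one-dimensional data: every individual starts in its own group, and the closest pair of groups is merged at each iteration, irrevocably, as noted above.

```
# Toy agglomerative (single-linkage) clustering on hypothetical 1-D points.
# Each point starts as its own cluster; the closest pair of clusters is merged
# at each iteration until the requested number of clusters remains.

def single_link_distance(c1, c2):
    return min(abs(a - b) for a in c1 for b in c2)

def agglomerate(points, n_clusters):
    clusters = [[p] for p in points]              # start: n singleton groups
    while len(clusters) > n_clusters:
        # find the two closest clusters
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: single_link_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]   # merge -- cannot be undone later
        del clusters[j]
    return clusters

print(agglomerate([1.0, 1.2, 5.0, 5.1, 9.7], n_clusters=2))
# -> [[1.0, 1.2, 5.0, 5.1], [9.7]]
```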
@@ -130,7 +130,7 @@ These events coincide every twenty years, because $lcm(4, 10) = 20$.
 We are not always interested in full answers, however. Sometimes the remainder suffices for our purposes.

-<u>Example:</u> Suppose your birthday this year falls on a Wednesday. What day of the week will it fall on next year?
+**Example:** Suppose your birthday this year falls on a Wednesday. What day of the week will it fall on next year?

 The remainder of the number of days between now and then (365 or 366) mod the number of days in a week. $365$ mod $7 = 1$. Which means that your birthday will fall on a Thursday.
@@ -138,7 +138,7 @@ The remainder of the number of days between now and then (365 or 366) mod the nu
 **Addition**: $(x + y)$ mod $n$ $=$ $((x $ mod $n) + (y$ mod $n))$ mod $n$

-<u>Example:</u> How much small change will I have if given \$123.45 by my mother and \$94.67 by my father?
+**Example:** How much small change will I have if given \$123.45 by my mother and \$94.67 by my father?
 $$
 \begin{align*}
 (12345 \text{ mod } 100) + (9467 \text{ mod } 100) &= (45 + 67) \text{ mod } 100 \\
@@ -147,7 +147,7 @@
 $$
 **Subtraction** (Essentially addition with negatives):

-<u>Example:</u> Based on the previous example, how much small change will I have after spending \$52.53?
+**Example:** Based on the previous example, how much small change will I have after spending \$52.53?
 $$
 (12 \text{ mod } 100) - (53 \text{ mod } 100) = -41 \text{ mod } 100 = 59 \text{ mod } 100
 $$
@@ -159,7 +159,7 @@
 $$
 xy \text{ mod } n = (x \text{ mod } n)(y \text{ mod } n) \text{ mod } n
 $$
-<u>Example:</u> How much change will you have if you earn \$17.28 per hour for 2,143 hours?
+**Example:** How much change will you have if you earn \$17.28 per hour for 2,143 hours?
 $$
 \begin{align*}
 (1728 * 2143) \text{ mod } 100 &= (28 \text{ mod } 100)(43 \text{ mod } 100) \\
@@ -170,7 +170,7 @@
 $$
 x^y \text{ mod } n = (x \text{ mod } n)^y \text{ mod } n
 $$
-<u>Example:</u> What is the last digit of $2^{100}$?
+**Example:** What is the last digit of $2^{100}$?
 $$
 \begin{align*}
 2^3 \text{ mod } 10 &= 8 \\
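A short Python sketch checking the modular identities above against the worked examples in these hunks; dollar amounts are in cents, and `pow(2, 100, 10)` is Python's built-in modular exponentiation.

```
# Verify the modular arithmetic identities with the examples from the text.

n = 100

# Addition: small change from $123.45 + $94.67 (amounts in cents)
assert (12345 + 9467) % n == ((12345 % n) + (9467 % n)) % n == 12

# Subtraction: after spending $52.53, -41 mod 100 wraps around to 59
assert (12 - 53) % n == 59

# Multiplication: change from $17.28 per hour for 2,143 hours
assert (1728 * 2143) % n == ((1728 % n) * (2143 % n)) % n

# Exponentiation: last digit of 2^100
assert pow(2, 100, 10) == (2 ** 100) % 10 == 6

print("all identities hold")
```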
@@ -24,7 +24,7 @@ This algorithm is called *iterative policy evaluation*.
 To produce each successive approximation, $v_{k + 1}$ from $v_k$, iterative policy evaluation applies the same operation to each state $s$: it replaces the old value of $s$ with a new value obtained from the old values of the successor states of $s$, and the expected immediate rewards, along all the one-step transitions possible under the policy being evaluated.
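A minimal Python sketch of that update, written as a synchronous Bellman expectation backup over a hypothetical two-state MDP; the transition model `P` and the policy `pi` below are made up for illustration.

```
# Sketch of iterative policy evaluation on a tiny, made-up MDP.
# P[s][a] is a list of (probability, next_state, reward) triples;
# pi[s][a] is the probability of taking action a in state s.

P = {
    "s0": {"stay": [(1.0, "s0", 0.0)], "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 2.0)], "go": [(1.0, "s0", 0.0)]},
}
pi = {"s0": {"stay": 0.5, "go": 0.5}, "s1": {"stay": 1.0, "go": 0.0}}
gamma, theta = 0.9, 1e-8

V = {s: 0.0 for s in P}                # arbitrary initial values
while True:
    new_V = {}
    for s in P:
        # new value of s from the old values of its successor states
        new_V[s] = sum(
            pi[s][a] * sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
            for a in P[s]
        )
    delta = max(abs(new_V[s] - V[s]) for s in P)
    V = new_V
    if delta < theta:                   # stop once a full sweep barely changes the values
        break

print(V)   # approximate v_pi for this made-up policy
```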
-<u>**Iterative Policy Evaluation**</u>
+**Iterative Policy Evaluation**

 ```
 Input π, the policy to be evaluated
@@ -69,7 +69,7 @@ Each policy is guaranteed to be a strict improvement over the previous one (unle
 This way of finding an optimal policy is called *policy iteration*.
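A rough sketch of the improvement half of that loop: given action values computed under the current policy, the new policy is made greedy with respect to them. The `q` numbers below are hypothetical stand-ins for values produced by policy evaluation.

```
# Policy improvement step: make the policy greedy with respect to q_pi.
# q[s][a] stands in for action values already computed by policy evaluation.

q = {
    "s0": {"stay": 1.3, "go": 2.1},
    "s1": {"stay": 4.0, "go": 0.9},
}

def greedy_policy(q):
    return {s: max(actions, key=actions.get) for s, actions in q.items()}

policy = greedy_policy(q)
print(policy)   # {'s0': 'go', 's1': 'stay'}

# Policy iteration alternates: evaluate the current policy to get q,
# improve it greedily, and stop when the greedy policy no longer changes.
```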
-<u>Algorithm</u>
+**Algorithm**

 ```
 1. Initialization
@@ -16,7 +16,7 @@ Recall that the value of a state is the expected return -- expected cumulative f
 Each occurrence of state $s$ in an episode is called a *visit* to $s$. The *first-visit MC method* estimates $v_\pi(s)$ as the average of the returns following first visits to $s$, whereas the *every-visit MC method* averages the returns following all visits to $s$. These two Monte Carlo methods are very similar but have slightly different theoretical properties.
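A small sketch of the first-visit idea on made-up episode data: for each episode, only the return following the first occurrence of a state contributes to that state's average. Here each episode entry is a (state, reward on the following transition) pair, an assumed convention for the example.

```
from collections import defaultdict

# First-visit Monte Carlo prediction on hypothetical episodes.
# Each entry is (state visited, reward received on the following transition).
gamma = 1.0
episodes = [
    [("A", 0.0), ("B", 1.0), ("A", 2.0)],   # A is visited twice; only its first visit counts
    [("B", 0.0), ("A", 3.0)],
]

returns = defaultdict(list)
for episode in episodes:
    first_visit = {}                         # state -> index of its first occurrence
    for t, (s, _) in enumerate(episode):
        first_visit.setdefault(s, t)
    # work backwards to get the return G following each time step
    G_after = [0.0] * (len(episode) + 1)
    for t in reversed(range(len(episode))):
        G_after[t] = episode[t][1] + gamma * G_after[t + 1]
    for s, t in first_visit.items():
        returns[s].append(G_after[t])        # only the first-visit return is recorded

V = {s: sum(g) / len(g) for s, g in returns.items()}
print(V)
```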
-<u>First-visit MC prediction</u>
+**First-visit MC prediction**

 ```
 Initialize:
@@ -45,7 +45,7 @@ This is the general problem of *maintaining exploration*. For policy evaluation
 We made two unlikely assumptions above in order to easily obtain this guarantee of convergence for the Monte Carlo method. One was that the episodes have exploring starts, and the other was that policy evaluation could be done with an infinite number of episodes.

-<u>Monte Carlo Exploring Starts</u>
+**Monte Carlo Exploring Starts**

 ```
 Initialize, for all s ∈ S, a ∈ A(s):
@@ -74,7 +74,7 @@ On-policy methods attempt to evaluate or improve the policy that is used to make
 In on-policy control methods the policy is generally *soft*, meaning that $\pi(a|s) > 0$ for all $a \in \mathcal{A}(s)$. The on-policy methods in this section use $\epsilon$-greedy policies, meaning that most of the time they choose an action that has maximal estimated action value, but with probability $\epsilon$ they instead select an action at random.
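A minimal sketch of the $\epsilon$-greedy choice just described (the action-value numbers are made up): with probability $\epsilon$ an action is drawn uniformly at random, otherwise the action with the highest estimated value is taken.

```
import random

def epsilon_greedy(Q_s, epsilon=0.1):
    # Q_s maps each action available in the current state to its estimated value.
    if random.random() < epsilon:
        return random.choice(list(Q_s))   # explore: uniform over all actions
    return max(Q_s, key=Q_s.get)          # exploit: greedy action

Q_s = {"left": 0.4, "right": 1.2, "stay": 0.7}   # hypothetical action values
counts = {a: 0 for a in Q_s}
for _ in range(10000):
    counts[epsilon_greedy(Q_s, epsilon=0.1)] += 1
print(counts)   # "right" dominates; each other action is chosen roughly epsilon/3 of the time
```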
-<u>On-policy first-visit MC control (for $\epsilon$-soft policies)</u>
+**On-policy first-visit MC control (for $\epsilon$-soft policies)**

 ```