<p>Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables called principal components.</p>
<p>The transformation is defined so that the first principal component has the largest possible variance, that is, it accounts for as much of the variability in the data as possible.</p>
<p>Each succeeding component in turn has the highest possible variance under the constraint that it is orthogonal to the preceding components.</p>
<p>PCA is sensitive to the relative scaling of the original variables: a variable whose units give it a larger numeric variance will dominate the leading components.</p>
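<p>The effect of scaling can be illustrated with a small sketch. It uses only the Python standard library and is restricted to two variables, so the first principal component can be obtained from the closed-form angle of the leading eigenvector of a symmetric 2x2 covariance matrix. The data values and the helper names <code>first_pc_2d</code> and <code>standardize</code> are assumptions made for illustration, and standardizing to unit variance is shown as one common remedy rather than the only one.</p>
<pre><code class="language-python">import math
from statistics import mean, stdev


def first_pc_2d(xs, ys):
    """Return the unit vector of the first principal component of 2-D data."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    # Entries of the sample covariance matrix (dividing by n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed form for a symmetric 2x2 matrix: angle of the leading eigenvector.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


def standardize(values):
    """Rescale values to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical measurements: the same heights in two units, plus weights.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [55, 65, 70, 80, 90]

pc_m = first_pc_2d(heights_m, weights_kg)    # height barely contributes
pc_cm = first_pc_2d(heights_cm, weights_kg)  # height contributes substantially
pc_std = first_pc_2d(standardize(heights_m), standardize(weights_kg))

print(pc_m, pc_cm, pc_std)
</code></pre>
<p>With these assumed values, expressing height in centimeters instead of meters changes the direction of the first component, whereas standardizing both variables first gives them equal weight regardless of units.</p>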
<h3>Results of a PCA</h3>
<p>Results are usually discussed in terms of <em>component scores</em>, the values of the transformed variables for each observation, and <em>loadings</em>, the weights by which each original variable is multiplied to obtain the component score.</p>
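<p>The relationship between loadings and scores can be sketched with NumPy as follows. The toy data matrix is hypothetical, and the loadings are taken to be the unit-length eigenvectors of the sample covariance matrix, matching the definition above; some texts additionally scale them by the square root of the corresponding eigenvalue.</p>
<pre><code class="language-python">import numpy as np

# Hypothetical toy data: 5 observations (rows) of 3 variables (columns).
X = np.array([
    [2.5, 2.4, 1.2],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 1.6],
])

# Center each variable, then eigendecompose the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
loadings = eigvecs[:, order]  # column k holds the weights for component k

# Component scores: each centered observation weighted by the loadings.
scores = Xc @ loadings

print(loadings)
print(scores)
</code></pre>
<p>In this sketch the columns of <code>scores</code> are uncorrelated, and their variances equal the sorted eigenvalues, which is the property the definition of the principal components describes.</p>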
<h2>Assumptions of PCA</h2>
<ol>
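<li>Linearity: the principal components are linear combinations of the original variables</li>
<li>Directions with large variance carry the important structure, while directions with small variance represent noise</li>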
<li>Principal components are orthogonal</li>
</ol>
<h2>Why perform PCA?</h2>
<ul>
<li>Distance measures perform poorly in high-dimensional space (<a href="https://stats.stackexchange.com/questions/256172/why-always-doing-dimensionality-reduction-before-clustering">https://stats.stackexchange.com/questions/256172/why-always-doing-dimensionality-reduction-before-clustering</a>)</li>
<li>Helps eliminate noise from the dataset (<a href="https://www.quora.com/Does-it-make-sense-to-perform-principal-components-analysis-before-clustering-if-the-original-data-has-too-many-dimensions-Is-it-theoretically-unsound-to-try-to-cluster-data-with-no-correlation">https://www.quora.com/Does-it-make-sense-to-perform-principal-components-analysis-before-clustering-if-the-original-data-has-too-many-dimensions-Is-it-theoretically-unsound-to-try-to-cluster-data-with-no-correlation</a>)</li>
<li>A one-time upfront cost that reduces the cost of subsequent computations on the lower-dimensional data</li>
</ul>
<h2>Computing PCA</h2>
<ol>
<li>Subtract the mean of each measurement type (variable) from its values</li>
<li>Compute the covariance matrix</li>
<li>Compute the eigenvalues and eigenvectors of the covariance matrix</li>