---
title: Dimensionality Reduction Independent Study Syllabus
---
Dimensionality reduction is the process of reducing the number of random variables under consideration. This study will last for 10 weeks, meeting twice a week for about an hour.
## Introduction to Dimensionality Reduction (0.5 Weeks)
- Motivations for dimensionality reduction
- Advantages of dimensionality reduction
- Disadvantages of dimensionality reduction
## Feature Selection (3 Weeks)
This is the process of selecting a subset of relevant features. The central premise of this technique is that many features are either redundant or irrelevant and thus can be removed without incurring much loss of information.
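To make the redundancy premise concrete, here is a minimal sketch of a filter-style selector that drops features highly correlated with an already-kept feature. The function name and the 0.95 cutoff are illustrative choices, not part of the syllabus:

```python
import numpy as np

def correlation_filter(X, threshold=0.95):
    """Keep only the first feature of each highly correlated group.

    A simple filter-method selector: compute pairwise absolute Pearson
    correlations and drop any feature that is nearly a duplicate of an
    earlier kept one. `threshold` is an illustrative cutoff.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Three features: the third is a noisy copy of the first (redundant).
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=100)])
print(correlation_filter(X))  # the redundant third column is dropped
```

Dropping the third column loses almost no information, which is exactly the premise stated above.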
### Metaheuristic Methods (1.5 Weeks)
- Filter Method
- Wrapper Method
- Embedded Method
### Optimality Criteria (0.5 Weeks)
- Bayesian Information Criterion
- Mallows' Cp
- Akaike Information Criterion
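The criteria above all trade goodness of fit against model size. A small sketch, assuming a Gaussian model fit by least squares (so that, up to an additive constant, -2 log-likelihood = n·log(RSS/n)):

```python
import numpy as np

def aic_bic(rss, n, k):
    """AIC and BIC for a least-squares fit with k parameters.

    Up to an additive constant, -2 * log-likelihood = n * log(rss / n):
        AIC = n*log(rss/n) + 2*k
        BIC = n*log(rss/n) + k*log(n)
    Lower values indicate a better fit/complexity trade-off.
    """
    base = n * np.log(rss / n)
    return base + 2 * k, base + k * np.log(n)

# Illustrative numbers: a 3-parameter model barely improves the fit of
# a 2-parameter model, so both criteria prefer the smaller model.
n = 100
aic2, bic2 = aic_bic(rss=50.0, n=n, k=2)
aic3, bic3 = aic_bic(rss=49.5, n=n, k=3)
print(aic2 < aic3, bic2 < bic3)  # True True
```

Note how BIC's k·log(n) penalty grows with sample size, so it punishes extra parameters harder than AIC once n > e².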
### Other Feature Selection Techniques (1 Week)
- Subset Selection
- Minimum-Redundancy-Maximum-Relevance (mRMR) feature selection
- Global Optimization Formulations
- Correlation Feature Selection
### Applications of Metaheuristic Techniques (0.5 Weeks)
- Stepwise Regression
- Branch and Bound
## Feature Extraction (6 Weeks)
Feature extraction transforms data in a high-dimensional space into a space of fewer dimensions. In other words, feature extraction reduces the resources required to describe a large set of data.
### Linear Dimensionality Reduction (3 Weeks)
- Principal Component Analysis (PCA)
- Singular Value Decomposition (SVD)
- Non-Negative Matrix Factorization
- Linear Discriminant Analysis (LDA)
- Multidimensional Scaling (MDS)
- Canonical Correlation Analysis (CCA) [If Time Permits]
- Linear Independent Component Analysis [If Time Permits]
- Factor Analysis [If Time Permits]
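The first two topics above are closely related: PCA can be computed directly from the SVD of the centered data matrix. A minimal sketch (illustrative, not the course's reference implementation):

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD: center the data, then project it onto the top
    right singular vectors (the principal directions)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# 100 points in 3-D that lie (noisily) near a 2-D plane, so two
# components capture almost all of the variance.
rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 2))
A = rng.normal(size=(2, 3))
X = latent @ A + 0.01 * rng.normal(size=(100, 3))
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```

The squared singular values give the variance explained by each component, which is the usual way to choose `n_components`.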
### Non-Linear Dimensionality Reduction (3 Weeks)
One approach is to assume that the data of interest lie on a non-linear manifold embedded within a higher-dimensional space.
- Kernel Principal Component Analysis
- Nonlinear Principal Component Analysis
- Generalized Discriminant Analysis (GDA)
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Self-Organizing Map
- Multifactor Dimensionality Reduction (MDR)
- Isomap
- Locally-Linear Embedding
- Nonlinear Independent Component Analysis
- Sammon's Mapping [If Time Permits]
- Hessian Eigenmaps [If Time Permits]
- Diffusion Maps [If Time Permits]
- RankVisu [If Time Permits]
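As a bridge from the linear methods, kernel PCA (the first topic above) replaces inner products with a kernel and performs PCA implicitly in feature space. A sketch using an RBF kernel; the `gamma` value and the two-rings dataset are illustrative assumptions:

```python
import numpy as np

def kernel_pca(X, n_components, gamma=1.0):
    """Kernel PCA with an RBF kernel.

    Builds the kernel matrix, double-centers it in feature space, and
    returns projections onto the top eigenvectors.
    """
    # Pairwise squared distances -> RBF kernel matrix.
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    # Double-centering: Kc = K - 1K - K1 + 1K1.
    n = len(X)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

# Two concentric rings: non-linearly structured, so ordinary (linear)
# PCA cannot separate them, but kernel PCA can.
rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 200)
r = np.repeat([1.0, 3.0], 100)
X = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
Z = kernel_pca(X, 2, gamma=2.0)
print(Z.shape)  # (200, 2)
```

The same "kernelize a linear method" idea underlies Generalized Discriminant Analysis, listed above.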