I blink once, and one month has passed. I blink twice, and years have gone by.
Imagine being tasked with analyzing vast datasets of celestial objects. Traditional methods such as linear regression or classification might struggle to identify subtle patterns or underlying structures within this high-dimensional data. To overcome this, the team needs a technique that can uncover hidden relationships between spectral signatures, potentially revealing new classes of celestial objects or clarifying how existing ones evolve.
Manifold learning is a dimensionality reduction technique that seeks to uncover the underlying low-dimensional structure of high-dimensional data. Unlike linear methods like PCA, manifold learning assumes that the data lies on a nonlinear manifold embedded in a higher-dimensional space.
The goal of manifold learning is to unroll or flatten this manifold into a lower-dimensional space while preserving the intrinsic geometric relationships between data points. This is analogous to flattening a rolled-up Swiss roll (see below), where nearby points on the roll remain close in the flattened version.
There are numerous manifold learning techniques, and for this example we will apply several of them, one at a time, to an S-curve dataset. Here is a list of their goals and visualizations:
Multidimensional Scaling (MDS)
Goal: preserves pairwise distances between data points.
Visualization: shows a reasonable representation of the S-curve, but with some distortion, especially in the denser regions.
Isomap
Goal: preserves geodesic distances (shortest path distances) between data points.
Visualization: provides a better representation of the S-curve than MDS, capturing the underlying manifold structure more accurately.
Locally Linear Embedding (LLE)
Goal: preserves local linear relationships between neighboring data points.
Visualization: effectively captures the S-curve structure, with a clear representation of the underlying manifold.
Hessian LLE
Goal: improves upon LLE by incorporating second-order information.
Visualization: similar to LLE, Hessian LLE provides a good representation of the S-curve, potentially with slightly better local structure preservation.
Modified LLE
Goal: addresses the regularization problem of standard LLE by using multiple weight vectors in each neighborhood.
Visualization: shows a reasonable representation of the S-curve, but without significant visual differences compared to LLE.
LTSA (Local Tangent Space Alignment)
Goal: constructs a local tangent-space coordinate system around each data point and aligns these local systems into a global embedding, preserving local linear relationships while also respecting the global structure of the data.
Visualization: captures the S-curve structure, comparable to LLE and its variants.
Laplacian Eigenmaps
Goal: preserves local neighborhood structure by using the eigenvectors of the graph Laplacian.
Visualization: effectively captures the S-curve, with a clear representation of the underlying manifold.
t-SNE
Goal: preserves local structure by matching pairwise-similarity distributions between the high- and low-dimensional spaces, allowing for non-linear embeddings.
Visualization: often provides visually appealing and informative low-dimensional representations.
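As a minimal sketch of how the techniques above are applied in practice, the snippet below runs several of them on an S-curve dataset with scikit-learn. The neighbor count and output dimensionality are illustrative choices, not tuned values.

```python
from sklearn import datasets, manifold

# 300 three-dimensional points lying on a two-dimensional S-shaped manifold
X, color = datasets.make_s_curve(n_samples=300, random_state=0)

# Illustrative parameter choices, not tuned values
n_neighbors, n_components = 10, 2

models = {
    "MDS": manifold.MDS(n_components=n_components, random_state=0),
    "Isomap": manifold.Isomap(n_neighbors=n_neighbors, n_components=n_components),
    "LLE": manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, method="standard"),
    "Hessian LLE": manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, method="hessian"),
    "Modified LLE": manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, method="modified"),
    "LTSA": manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, method="ltsa"),
    "Laplacian Eigenmaps": manifold.SpectralEmbedding(
        n_neighbors=n_neighbors, n_components=n_components),
}

results = {}
for name, model in models.items():
    # Every technique maps the 3D points down to a 2D embedding
    results[name] = model.fit_transform(X)
    print(name, results[name].shape)
```

Each `fit_transform` call returns a 2D embedding of the same 300 points, so the techniques can be compared side by side on identical input.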
Here is an array of techniques applied to a Swiss Roll dataset:
And here is another array of techniques applied to a severed sphere dataset:
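For the Swiss Roll, scikit-learn provides a ready-made generator, so one of the techniques can be sketched on it directly; Isomap is used here purely as an example of "unrolling" the sheet.

```python
from sklearn import datasets, manifold

# Swiss roll: 3D points lying on a rolled-up 2D sheet
X, color = datasets.make_swiss_roll(n_samples=300, random_state=0)

# Isomap "unrolls" the sheet by preserving geodesic distances
Y = manifold.Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(Y.shape)
```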
The new modules used for this instance of manifold learning are matplotlib's NullFormatter and sklearn's manifold module. The former removes the tick labels (the values along each axis); the latter provides ready-made implementations of these algorithms for a rapid setup.
As in the 24th July entry, the 3D cluster plot requires x-, y-, and z-axis values…
For the flat S-curve, you only need two axis values instead.
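A hedged sketch of this plotting setup: the 3D scatter of the S-curve takes all three axis values, while the flattened embedding takes only two, with NullFormatter stripping the tick labels. Isomap stands in here for whichever embedding is being plotted, and the output filename is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter
from sklearn import datasets, manifold

X, color = datasets.make_s_curve(n_samples=300, random_state=0)

fig = plt.figure(figsize=(10, 4))

# 3D cluster plot: requires x-, y-, and z-axis values
ax3d = fig.add_subplot(1, 2, 1, projection="3d")
ax3d.scatter(X[:, 0], X[:, 1], X[:, 2], c=color, cmap=plt.cm.Spectral)

# Flat S-curve: only two axis values after embedding
Y = manifold.Isomap(n_neighbors=10, n_components=2).fit_transform(X)
ax2d = fig.add_subplot(1, 2, 2)
ax2d.scatter(Y[:, 0], Y[:, 1], c=color, cmap=plt.cm.Spectral)
ax2d.xaxis.set_major_formatter(NullFormatter())  # hide tick labels
ax2d.yaxis.set_major_formatter(NullFormatter())

fig.savefig("s_curve_embedding.png")
```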
The same applies to the other nine perplexity plots below. On the topic of perplexity: as perplexity increases from one plot to the next, the representation of the S-curve becomes more spread out and less defined. Lower perplexity values tend to preserve local structure better, while higher values focus on capturing global patterns and can increasingly distort the original S-curve shape.
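The perplexity sweep described above can be sketched as a simple loop; the specific perplexity values are illustrative, not the ones used for the ten plots.

```python
from sklearn import datasets, manifold

X, color = datasets.make_s_curve(n_samples=300, random_state=0)

# Embed the same data at increasing perplexity values and compare
for perplexity in (5, 30, 50, 100):
    tsne = manifold.TSNE(n_components=2, perplexity=perplexity, random_state=0)
    Y = tsne.fit_transform(X)
    print(f"perplexity={perplexity}: embedding shape {Y.shape}")
```

Plotting each `Y` side by side reproduces the progression from tightly preserved local structure at low perplexity to a more diffuse, globally oriented layout at high perplexity.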
To recap for human interpretation: the perplexity here is a t-SNE hyperparameter, not the language-modelling metric of the same name. It roughly sets the effective number of nearest neighbours each point considers when the embedding is built, balancing attention between local and global aspects of the data; small values emphasise tight local neighbourhoods, while large values pull in more distant points.