2024-03-19

Summary of https://www.quantmetry.com/blog/topological-data-analysis-with-mapper/:

The Mapper algorithm was introduced by Singh, Mémoli, and Carlsson in "Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition" (2007). It is used to:

visualize higher-dimensional data through a particular lens
detect clusters and topological structure
select features

The Algorithm

Given a dataset, we construct a simplicial complex by the following steps:

Project the data to a lower-dimensional space via a filter function $f$, called a lens (typically a projection onto an axis).
Construct a cover $\{U_{i}\}$ of the projected space.
For each cover element $U_{i}$, cluster the points in the preimage $f^{-1}(U_{i})$.
Take the nerve of the clusters: each cluster becomes a vertex, two vertices are joined by an edge if their clusters share points, three vertices span a triangle if all three clusters share a point, and so on.
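
The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the source article: the names `cluster` and `mapper_graph` are hypothetical, the cover is passed in by hand as a list of intervals, clustering is replaced by simple distance-threshold connected components (a single-linkage stand-in), and only the vertices and edges of the nerve are built, not triangles or higher simplices.

```python
import math
from itertools import combinations

def cluster(points, idx, eps):
    """Stand-in clustering: connected components of the graph that links
    the points indexed by `idx` whenever they lie within distance eps."""
    parent = {i: i for i in idx}

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in combinations(idx, 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)
    groups = {}
    for i in idx:
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())

def mapper_graph(points, lens, cover, eps):
    values = [lens(p) for p in points]                 # step 1: apply the lens
    nodes = []
    for a, b in cover:                                 # step 2: cover supplied by the caller
        preimage = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(cluster(points, preimage, eps))   # step 3: cluster each preimage
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]                   # step 4: edges of the nerve
    return nodes, edges

# 40 points on the unit circle, with the height (y-coordinate) as the lens.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
cover = [(-1.0, -0.3), (-0.5, 0.1), (-0.1, 0.5), (0.3, 1.0)]
nodes, edges = mapper_graph(circle, lambda p: p[1], cover, eps=0.2)
print(len(nodes), len(edges))  # 6 6
```

For this input the graph is a cycle of six vertices, so the loop of the circle survives in the output even though the lens is one-dimensional.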

[Figure: example of the Mapper construction]

Lenses, coverings and clustering

The construction depends heavily on the lens $f$, and the right choice depends on the type of dataset and your goals. In the example above, the lens is a simple height function. In biological applications, the distance to the $k$ nearest neighbors is often used.
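
As a rough illustration of the two lenses mentioned here, the sketch below computes both for a small point set. The function names are hypothetical, and "distance to the $k$ nearest neighbors" is interpreted as the mean of the $k$ smallest distances, which is an assumption (other conventions, such as the sum or the $k$-th distance, are also plausible).

```python
import math

def height_lens(points, axis=-1):
    """Lens value = coordinate along one axis (the last axis by default)."""
    return [p[axis] for p in points]

def knn_distance_lens(points, k):
    """Lens value = mean distance from each point to its k nearest neighbors
    (assumed interpretation; brute force, fine for small data)."""
    values = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        values.append(sum(dists[:k]) / k)
    return values

points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 10.0)]
heights = height_lens(points)             # [0.0, 1.0, 2.0, 10.0]
density = knn_distance_lens(points, k=2)  # [1.5, 1.0, 1.5, 8.5]
```

The isolated point at height 10 gets a much larger k-nearest-neighbor value than the others, so this kind of lens tends to separate dense regions from outliers.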

The construction also depends on the cover. A standard choice is a collection of overlapping $d$-dimensional intervals, as in the example above.
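
One way to build such a cover is to split each lens axis into equal-length overlapping intervals and take their product. The sketch below assumes that construction; `interval_cover`, `box_cover`, `in_box`, and the `n_intervals` and `overlap` parameters are hypothetical names chosen for illustration.

```python
from itertools import product

def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n_intervals equal-length intervals; `overlap` is
    the fraction of its length that each interval shares with the next."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]

def box_cover(bounds, n_intervals, overlap):
    """d-dimensional cover: every product of one interval per axis."""
    per_axis = [interval_cover(lo, hi, n_intervals, overlap) for lo, hi in bounds]
    return list(product(*per_axis))

def in_box(value, box):
    """True if a d-dimensional lens value lies inside the box."""
    return all(a <= x <= b for x, (a, b) in zip(value, box))

intervals = interval_cover(0.0, 1.0, 3, 0.5)
# [(0.0, 0.5), (0.25, 0.75), (0.5, 1.0)]
boxes = box_cover([(0.0, 1.0), (0.0, 1.0)], 3, 0.5)
hits = [box for box in boxes if in_box((0.3, 0.3), box)]
```

A lens value in an overlap region belongs to several cover elements at once (four boxes for the point above), and that is what later produces shared points between clusters and therefore edges in the graph.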

Mapper’s strengths

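A common issue with dimensionality reduction algorithms is "projection loss": points may be far apart in the high-dimensional space but end up close together after being projected to a lower dimension. Mapper mitigates this by clustering the preimage of each cover element in the original space, so points that are close under the lens but far apart in the original data fall into different clusters.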
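
A toy example makes the effect concrete. In the sketch below, two groups of points share the same heights but sit far apart horizontally; the helper `components` is a hypothetical distance-threshold clustering used only for illustration, and the data are made up for the example.

```python
import math

def components(points, eps):
    """Connected components of the graph linking points within distance eps."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        found = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            found |= near
            stack.extend(near)
        clusters.append(found)
    return clusters

# Two vertical strips, 10 units apart, with identical heights.
data = [(-5.0, 0.4), (-5.0, 0.5), (-5.0, 0.6),
        (5.0, 0.4), (5.0, 0.5), (5.0, 0.6)]
lens_values = [(y,) for _, y in data]   # height lens, kept as 1-tuples

after_projection = components(lens_values, eps=0.15)  # clusters seen through the lens only
in_original_space = components(data, eps=0.15)        # clusters of the preimage
print(len(after_projection), len(in_original_space))
```

Clustering the lens values alone merges both strips into one group, while clustering the same points in the original space keeps the two strips apart, which is the behavior Mapper relies on.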

Mapper allows us to visualize the topological structure of the higher-dimensional data with a graph representation.

Mapper helps select the features that best discriminate between groups in the data. We can explore the graph to see why data points are connected, and we can color its nodes by any quantity of interest.
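
Coloring can be as simple as averaging the quantity over the points in each node. The sketch below assumes nodes are stored as sets of point indices and uses the mean as the summary; both the representation and the name `node_colors` are assumptions made for illustration.

```python
from statistics import mean

def node_colors(nodes, quantity):
    """One value per node: the mean of `quantity` over the node's points."""
    return [mean(quantity[i] for i in members) for members in nodes]

nodes = [{0, 1}, {1, 2}, {3}]        # point 1 is shared by two nodes
quantity = [1.0, 3.0, 5.0, 10.0]     # e.g. one feature or a model output
colors = node_colors(nodes, quantity)
print(colors)  # [2.0, 4.0, 10.0]
```

Repeating this for each feature and comparing the colorings across regions of the graph is one way to see which features vary most between groups.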

We can use Mapper to explore the structure of the data and gain insights that help improve a predictive model.
