TDA

questions tda

Set of flashcards Details

Flashcards 73
Language German
Category Maths
Level University
Created / Updated 05.08.2025 / 05.08.2025
Weblink
https://card2brain.ch/cards/20250805_tda

How can persistence diagrams be used in machine learning?

  • Persistence statistics: The idea is to summarize the diagram using basic descriptive statistics. For the birth times $b$, death times $d$, the interval midpoints $\frac{b+d}{2}$ and the interval lengths $d-b$, we compute \textbf{means, medians, standard deviations, interquartile ranges, full ranges, and percentiles (10%, 25%, 75%, 90%)}. We also record the number of bars and the entropy. => Output is a vector in $\mathbb{R}^{38}$.
  • Persistence landscapes: Intuition: For each point in the persistence diagram draw a horizontal and a vertical line segment to the diagonal. Then flip the arrangement of line segments such that the diagonal is horizontal. We get something that looks like a mountain range. The persistence landscape is constructed as follows:
    • Let $D = \{(b_1,d_1),\dots,(b_n,d_n)\}$ be a persistence diagram, with $(b_i,d_i)$ a birth-death pair (off-diagonal point). Each such point gives rise to a triangle defined by the points $\{(t, \min[t-b_i, d_i-t]_{+})\,|\,t\in\mathbb{R}\}$.
    • The persistence landscape of $D$ is now a function $\lambda_D:\mathbb{N}\times\mathbb{R}\rightarrow \mathbb{R}$ defined by \[\lambda_D(k,t) := k\text{-th largest value of } \min[t-b_i, d_i-t]_{+} \text{ for } i\in \{1,\dots,n\}.\]
    • For each $k$, we have that $\lambda_D(k, \cdot)$ is a \textit{piece-wise linear function}. This yields a collection of real-valued functions, one for each $k$. Notice that for $p=2$, the norm $||\lambda_D||_p$ turns the space of landscapes into a Hilbert space, making them usable in many ML pipelines.
    • The \textit{p-landscape distance $\Lambda_p$} is defined as $\Lambda_p(D_1,D_2):= ||\lambda_{D_1}-\lambda_{D_2}||_p$. Thm.: Let $D_1, D_2$ be persistence diagrams. Then $\Lambda_{\infty}(D_1,D_2)\leq d_b(D_1,D_2)$.
  • Betti Curves: An easier way to get piece-wise linear (even piece-wise constant) functions from a persistence diagram. Defined as the function $\beta : \mathbb{R}\rightarrow\mathbb{N}_{\geq 0}$ which assigns to each time $t$ the current Betti number of the filtration. => The space of Betti curves is a Hilbert space with the standard 2-norm. Note: The Betti curve throws away the pairing between births and deaths, so some topological information is lost.
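Evaluating a persistence landscape reduces to taking the $k$-th largest "tent" value at each time $t$. A minimal pure-Python sketch (the function name `landscape` is illustrative, not a library API):

```python
def landscape(diagram, k, t):
    """k-th persistence landscape function (k = 1, 2, ...) of a diagram,
    evaluated at t: the k-th largest value of min(t - b, d - t)_+."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram),
                   reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0

diagram = [(0.0, 4.0), (1.0, 3.0)]
# at t = 2: the tents have heights min(2, 2) = 2 and min(1, 1) = 1
assert landscape(diagram, 1, 2.0) == 2.0
assert landscape(diagram, 2, 2.0) == 1.0
assert landscape(diagram, 3, 2.0) == 0.0  # only two off-diagonal points
```

For fixed $k$, sampling `landscape(diagram, k, t)` on a grid of $t$-values yields the vector representation used in ML pipelines.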

What is vectorization? How is it relevant for a machine learning pipeline?

\textbf{Vectorizations} are methods used to turn persistence diagrams into elements of vector spaces with an inner product (e.g. Hilbert spaces, Euclidean space). Many machine learning pipelines require their input points to live in such spaces.

Why does persistent homology sometimes fail to capture voids in the data?

General problem: Fragility to outliers; points in a hole may fill it up prematurely.

Case study:

  • \textbf{Dataset structure}: Each vote is encoded as a vector in $\mathbb{R}^{47}$ with values $+1$ for "Yes", $-1$ for "No", and $0$ for absent. PCA shows a square-like structure, where the corners represent typical voting patterns (bipartisan support/rejection, partisan splits) and the center represents ambiguous or absent votes.
  • \textbf{Problem}: There are quorum calls, where both answers ("Present"/"Not present") are coded as $0$, so we have at least 3 points exactly in the middle of the square. These points are enough to fill in the 1-dim hole formed by the square's boundary (no 1-dim homology in the persistence diagram). => Although the dataset geometrically resembles a square (which has a hole), persistent homology fails to detect the loop because the middle is "filled in" by outlier points.

Practical Fix:

  • Compute the density of each data point, e.g. by using Gaussian kernel density estimation.
  • Remove the 1% of points with the lowest density.
  • Compute persistent homology only on remaining points (high-density points).

=> We now see a homology class with long persistence that reflects the data's geometry.
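The practical fix above can be sketched in pure Python; the kernel bandwidth and the helper names (`gaussian_kde`, `drop_low_density`) are illustrative choices, not from a specific TDA library:

```python
import math

def gaussian_kde(points, bandwidth):
    """Unnormalized Gaussian kernel density estimate at each data point."""
    def density(x):
        return sum(math.exp(-sum((a - b) ** 2 for a, b in zip(x, p))
                            / (2 * bandwidth ** 2)) for p in points)
    return [density(x) for x in points]

def drop_low_density(points, fraction, bandwidth):
    """Keep only the high-density points: remove the given fraction of
    points with the lowest estimated density."""
    dens = gaussian_kde(points, bandwidth)
    order = sorted(range(len(points)), key=lambda i: dens[i])
    cut = max(1, int(fraction * len(points)))
    keep = set(order[cut:])
    return [p for i, p in enumerate(points) if i in keep]

# 8 points on the unit circle plus one outlier sitting in the hole
circle = [(math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8))
          for i in range(8)]
cleaned = drop_low_density(circle + [(0.0, 0.0)], fraction=0.1, bandwidth=0.3)
assert (0.0, 0.0) not in cleaned and len(cleaned) == 8  # outlier removed
```

Persistent homology would then be computed on `cleaned` only, so the loop of the circle survives.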

In 4 steps, how can we use TDA to analyze the space of image patches?

  1. Preprocess: Dataset becomes a high-dimensional point cloud.
  2. Subsample: Persistent homology on densest regions shows a circle.
  3. Expand sample: Include medium-density regions; persistent homology shows 5 circles.
  4. Full dataset: By including all points, we detect a 2-cycle (Klein bottle structure).

How can we use persistent homology to analyze the space of image patches?

  • Sample the densest regions: Large dataset, so we need subsampling. Construct a \textbf{Witness complex} on a chosen subset and compute \textbf{persistent homology} in dimensions $0,1,$ and $2$. The resulting barcodes suggest: \begin{itemize}  \item[-]$\beta_0 = 1$: dataset is connected. \item[-]$\beta_1 = 1$: there exists a significant 1-dim hole. \item[-]$\beta_2 = 0$: no significant 2-dim feature.\end{itemize} This tells us that the point cloud resembles a circle, which is visible via PCA. The circle can be interpreted as the possible angles of a detected "edge" in the images.
  • Include medium-density regions: Subsample data with intermediate density and compute the persistent homology. We end up with $\beta_1 = 5$, so 5 1-cycles (high-density: only one 1-cycle). PCA shows a cross in the middle of the circle seen in the high-density subsample, i.e., we see four circles. This leads to the interpretation that the dataset consists of 3 circles of some $S^2$ that are all orthogonal to one another: one circle representing "edges", one circle describing different translations of horizontal edges, and the third circle describing translations of vertical edges.
  • Include lowest-density regions: When the full dataset is considered, we compute $\beta_2=1$, so a nontrivial 2-dim feature, and the number of 1-cycles reduces to 2 ($\beta_1=2$). This space can look like the Klein bottle (we have a 2-cycle with 3 circles).

 

Discuss the structure of the relevant dataset for the space of image patches.

  • Raw data: ca. 4000 grayscale images
  • Patch size: 5000 patches of size 3×3 pixels per image
  • Filtering: Top 20% most contrasting patches are kept
  • Dimensionality: 3×3 patches are flattened to vectors in $\mathbb{R}^9$
  • Normalization: 1.) Contrast normalization - set darkest pixel to 0, brightest pixel to 1 (affine scaling) 2.) Rescale vector to unit norm ($||x|| = 1$). This gives us a vector on the unit sphere $S^7$.
  • Final structure: ca. 4 million points on the unit sphere $S^7$ => Dataset is a large point cloud on $S^7$.
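The two normalization steps can be sketched in a few lines of Python (`normalize_patch` is a hypothetical helper; the contrast-based filtering and the exact norm used in the original study are omitted):

```python
import math

def normalize_patch(patch):
    """Illustrative sketch of the two normalization steps: contrast
    normalization (darkest pixel -> 0, brightest -> 1), then rescaling
    the flattened patch to unit Euclidean norm."""
    lo, hi = min(patch), max(patch)
    if hi == lo:
        raise ValueError("constant patch has no contrast")
    scaled = [(v - lo) / (hi - lo) for v in patch]
    norm = math.sqrt(sum(v * v for v in scaled))
    return [v / norm for v in scaled]

patch = [12, 40, 7, 99, 55, 3, 80, 20, 61]  # flattened 3x3 grayscale patch
unit = normalize_patch(patch)
assert abs(sum(v * v for v in unit) - 1.0) < 1e-12  # point on the unit sphere
```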

How can we visualize multiparameter persistence modules?

  • The \textbf{Hilbert-function} of a pointwise finite dimensional (p.f.d.) persistence module $\mathbb{U}$ indexed by $P$ sends $p\in P$ to $\dim U_p\in \mathbb{N}$. It can be thought of as a 'homological heatmap'.
  • The \textbf{rank-invariant} of $\mathbb{U}$ sends a pair $(p,p')$ with $p\preceq p'$ to the rank of the map $u_{p,p'}: U_p \rightarrow U_{p'}$, i.e. $\mathrm{rank}(u_{p,p'})$. Note that two modules with $\mathbb{U} \simeq \mathbb{V}$ have the same rank invariant. For modules indexed by $\mathbb{R}$ the converse implication also holds.
  • By restricting a bifiltration to a line $L\subseteq\mathbb{R}^2$, where $L : t \mapsto ta + b$ has non-negative slope, we get a 1-parameter filtration. The barcode of this restriction is called the \textit{fibered barcode along $L$}. The collection over all such lines forms the \textbf{fibered barcode} of the module. The idea can be extended to lines in $\mathbb{R}^n$.

What is a major upside of multiparameter persistence? What is a major
downside? 

As we have seen, the persistence modules arising from the Čech or Vietoris-Rips complexes are stable under small perturbations of the underlying data, but not robust to outliers.

  • Upside: Multiparameter persistence modules are robust to noise and outliers and preserve features that arise from dense clusters of data (greater analytical power). (Robustness)
  • Downside: We lose the simple, complete, and interpretable barcode structure from the 1-parameter case; we can decompose a p.f.d. module uniquely into indecomposables, but there is no way to classify or parametrize these indecomposables in general. (Representability)

What are some common bipersistence modules and how do they relate to their corresponding 1-parameter persistence modules?

  • \textbf{Degree-Rips bifiltration}: Let $X$ be a finite metric space. Then the family $(\mathbb{VR}^r_d(X))_{r\in\mathbb{R},d\in\mathbb{N}}$ defined above is a filtration indexed by $\mathbb{R}\times \mathbb{N}^{op}$. Notice that $r$ denotes the scale, $d$ denotes the degree. By indexing over $\mathbb{R}\times\mathbb{N}^{op}$, $r$ increases while $d$ decreases. For the single param filtration, the Vietoris-Rips complex is not robust to outliers. The bifiltration counteracts this by introducing a \textbf{density parameter $d$}, which requires that simplices be made of only "dense enough" points. So by indexing over $\mathbb{R}\times \mathbb{N}^{op}$ we get: As $r$ increases, we allow longer edges (i.e. looser geometric structure) while $d$ decreases, allowing points with lower connectivity (i.e. less density). Note: Outliers tend to have low degree, so they are filtered out at high $d$.
  • \textbf{Multicover Bifiltration}: Given a finite point cloud $X\subseteq\mathbb{R}^n$, define: \[MC^d_r(X) = \{x\in\mathbb{R}^n\,:\,|B(x,r)\cap X| > d\}.\] This is the set of points covered by more than $d$ of the balls $B(p,r)$ centered at the points $p$ of $X$. The filtration $(MC_r^d(X))_{(r,d)}$ is also indexed over $\mathbb{R}\times\mathbb{N}^{op}$. This is a density-aware generalization of the Čech filtration. Notice that for large $d$, only regions near densely clustered points survive. So with decreasing $d$, the regions of radius $r$ (increasing) become less dense.
  • \textbf{Normalized multicover bifiltration}: Let $X \subseteq \mathbb{R}^n$ be a finite point cloud. For $r, \rho \in \mathbb{R}$, the sets \[NMC^r_\rho(X) := \{x\in\mathbb{R}^n: |B(x,r)\cap X|\geq \rho|X|\} = \{x\in\mathbb{R}^n:\mu_X(B(x,r))\geq \rho\}\] form a bifiltration over $\mathbb{R}\times\mathbb{R}^{op}$ called the \textit{normalized multicover bifiltration}. This persistence module turns out to be stable w.r.t. the Prohorov distance (i.e. it is stable to outliers). As opposed to the non-normalized version, this bifiltration scales well when comparing datasets of different sizes (MC's density threshold is an absolute count $d\in\mathbb{N}$, while NMC's density threshold $\rho|X|$ is proportional to the size of the set).
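A membership test for $MC^d_r(X)$ translates directly into code. This pure-Python sketch (the helper name `multicover_contains` is illustrative) also shows the monotonicity in the $\mathbb{R}\times\mathbb{N}^{op}$ indexing:

```python
def multicover_contains(x, X, r, d):
    """Membership test for the multicover bifiltration: x is in MC^d_r(X)
    iff the ball B(x, r) contains more than d points of X (equivalently,
    x is covered by more than d of the balls B(p, r), p in X)."""
    count = sum(1 for p in X
                if sum((a - b) ** 2 for a, b in zip(x, p)) <= r * r)
    return count > d

X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
q = (0.5, 0.5)  # distance sqrt(0.5) ~ 0.707 to every point of X
assert multicover_contains(q, X, r=0.8, d=2)       # all 3 points counted
assert not multicover_contains(q, X, r=0.5, d=0)   # no point within 0.5
# monotonicity: increasing r while decreasing d keeps q inside the set
assert multicover_contains(q, X, r=1.0, d=1)
```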

How can we define persistence modules over multiple parameters? 

For the definition of persistence modules over multiple parameters, we introduce a framework based on \textit{poset-indexed persistence modules}.\\

  • \textbf{Poset}: A poset (partially ordered set) is a tuple $(P, \preceq)$ consisting of a set $P$ and a binary relation $\preceq$ on $P$ satisfying: \begin{itemize} \item $a\preceq a$ for all $a \in P$. \item $a \preceq b$ and $b \preceq a$ implies $a = b$ for all $a,b\in P$. \item $a \preceq b$ and $b \preceq c$ implies $a \preceq c$ for all $a,b,c \in P$. \end{itemize} We will sometimes simply write $P$ if the (partial) ordering $\preceq$ is clear from context.
  • \textbf{$P$-filtration}: A $P$-filtration is a family of topological spaces $(X_p)_{p\in P}$ such that: $p\preceq p' \implies X_p \subseteq X_{p'}.$
  • \textbf{$P$-Persistence Module}: Given a poset $(P, \preceq)$, a $P$-persistence module over a field $\mathbb{F}$ consists of: \begin{itemize} \item A vector space $U_p$ for each $p\in P$. \item A linear map $u_{p,p'}: U_p \rightarrow U_{p'}$ for all $p\preceq p'$. \item These maps satisfy composition: $u_{p_2,p_3}\circ u_{p_1,p_2}= u_{p_1,p_3}$ for all $p_1, p_2, p_3 \in P$ with $p_1\preceq p_2\preceq p_3$. \end{itemize}

Note: A \textbf{single-parameter} persistence module is a special case of a multiparameter persistence module where the poset is totally ordered (for all $a,b\in P$ either $a\preceq b$ or $b \preceq a$); e.g. $(\mathbb{R},\leq)$ is a \textit{totally ordered set}.

mapper algorithm

see page 85/86

example mapper

see p. 84/85

mapper

Let f : X → Z be well-behaved, and \(\mathcal{U}\) be a (finite) open cover of Z. Then the Mapper is defined as \(M(\mathcal{U}, f) := N(f ^∗ (\mathcal{U}))\). (N is the nerve.)

well-behaved

Let X, Z be topological spaces. Then we call f : X → Z well-behaved if for all path-connected open sets U ⊆ Z,\( f ^{−1} (U)\) has finitely many path-connected components.

Reeb graph interleaving

Two Reeb graphs \(R_f\) and \(R_g\) are ϵ-interleaved if there exists a pair of function-preserving maps \(φ : R_f → S_ϵ(R_g)\) and \(ψ : R_g → S_ϵ(R_f)\) such that the diagram commutes. (see diagram p.82)

\(\epsilon\)-smoothing

For a Reeb graph \(R_f\), consider a function \(f_ϵ : (R_f)_ϵ → \mathbb{R}\) such that \((x, t) \mapsto f(x) + t\). The ϵ-smoothing of \(R_f\), denoted by \(S_ϵ(R_f)\), is the Reeb graph of \((R_f)_ϵ\) with regards to \(f_ϵ\).

\(\epsilon-\)thickening

An ϵ-thickening \(X_ϵ \) of some topological space X is given by \(X_ϵ := X × [−ϵ, +ϵ].\)

horizontal homology group

A p-th homology class \(h ∈ H_p(X)\) is called horizontal if there is a finite set of values \(A = \{a_1, . . . , a_k\}\) such that \( h ∈ im(ι_∗)\) where \( ι_∗\) is the map \(H_p( \bigcup_{ a∈A} X_a) → H_p(X)\) induced by inclusion, where \(X_a = f ^{−1} (a).\)

vertical homology group

The vertical homology group of X with respect to f is the group \(\overset{\vee}{H_p}(X) := H_p(X)/\overline{H_p(X)}\), where \(\overline{H_p(X)}\) denotes the horizontal homology group.

theorem reeb graphs

Given a simplicial complex K with m faces and a piece-wise linear function \(f : |K| → \mathbb{R}\) on it, we can compute the augmented Reeb graph \(R_f\) of K with respect to f in time O(m log m). (proof p.78)

augmented reeb graph

We define the augmented Reeb graph of a simplicial complex with a piecewise-linear-function, as the discretization of the Reeb graph by placing a vertex for every connected component in all the levelsets for the values a for which f(v) = a for some vertex of the simplicial complex.

reeb graph

Let X be some topological space, and f a function \(f : X → \mathbb{R}\). Two points x, y are called equivalent (x ∼ y), iff f(x) = f(y) = α and x and y are in the same path-connected component of \(f^{−1} (α)\). The Reeb graph \(R_f \)is the quotient space X/ ∼. 

stability theorem cech complexes

\(d_b(Dgm_p(\mathbb{C}(P)), Dgm_p(\mathbb{C}(Q))) ⩽ d_H(P, Q)\) for all p ⩾ 0. (Proof p.72)

Hausdorff distance

Let A, B ⊆ X be compact sets. Then the Hausdorff distance between A and B is defined as \(d_H(A, B) := max\{max_{a∈A} d(a, B), max_{ b∈B} d(b, A)\}\).
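For finite point sets, the definition translates directly into code (an illustrative sketch; `hausdorff` is not a library function):

```python
def hausdorff(A, B):
    """Hausdorff distance between two finite point sets in Euclidean space:
    the larger of the two directed distances max_a d(a, B) and max_b d(b, A)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    def directed(S, T):
        return max(min(dist(s, t) for t in T) for s in S)
    return max(directed(A, B), directed(B, A))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (3.0, 0.0)]
assert hausdorff(A, B) == 2.0  # (3,0) is 2 away from its nearest point in A
```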

Theorem bottleneck distance, interleaving distance

If U, V are q-tame persistence modules over \(\mathbb{R}\), then \(d_b(DgmU, DgmV) = d_I(U, V)\)

interleaving distance

\(d_I(U, V) := inf\{ϵ | \text{U and V are ϵ-interleaved}\}\)

\(\epsilon\)-interleaving persistence modules

Let U and V be persistence modules over \(\mathbb{R}\) and let ϵ ⩾ 0. We say that U and V are ϵ-interleaved if there exist two families of linear maps, \(φ_a : U_a → V_{a+ϵ}\) and \(ψ_a : V_a → U_{a+ϵ}\) such that the four diagrams are commutative. (diagrams on p.68)

persistence module

A persistence module V over \(\mathbb{R}\) is a collection \( V = \{V_a\}_{a∈\mathbb{R}}\) of vector spaces \(V_a\) together with linear maps \(v_{a,a′} : V_a → V_{a′ }\) for a ⩽ a ′ , such that \(v_{a,a} = id\) and\( v_{b,c} ◦ v_{a,b} = v_{a,c}\) for all a ⩽ b ⩽ c.

Stability Theorem for simplicial filtrations

Let \( f, g : K → \mathbb{R}\) be simplex-wise monotone functions. Then, ∀p ⩾ 0 we have \(d_b(Dgm_p(\mathcal{F}_f), Dgm_p(\mathcal{F}_g)) ⩽ ||f − g||_\infty\). Proof/Illustration on p. 62.

Wasserstein distance

For p ⩾ 0 and q ⩾ 1, the q-Wasserstein distance is defined as \(d_{W,q}(Dgm_p(\mathcal{F}), Dgm_p(\mathcal{G})) := \left[ inf_{π∈Π} \sum_{x∈Dgm_p(\mathcal{F})} (||x − π(x)||_\infty)^q \right]^{1/q}\)

Bottleneck distance

Let \(Π = \{π : Dgm_p(\mathcal{F}) → Dgm_p(\mathcal{G}) | π \text{ is bijective}\}\) be the set of all bijections between \(Dgm_p(\mathcal{F})\) and \(Dgm_p(\mathcal{G})\). Then, the Bottleneck distance is defined as \(d_b(Dgm_p(\mathcal{F}), Dgm_p(\mathcal{G})) := inf_{π∈Π} sup_{x∈Dgm_p(\mathcal{F})} ||x − π(x)||_\infty\).
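For tiny diagrams of equal cardinality, the definition above can be evaluated by brute force over all bijections (a sketch only: it omits matching points to the diagonal and is exponential in the diagram size; practical implementations use geometric matching algorithms):

```python
from itertools import permutations

def bottleneck(D1, D2):
    """Brute-force bottleneck distance for two diagrams with the same
    number of points, directly following the definition: minimize over
    bijections the largest l-infinity displacement of any point."""
    def linf(x, y):
        return max(abs(x[0] - y[0]), abs(x[1] - y[1]))
    return min(max(linf(x, y) for x, y in zip(D1, pi))
               for pi in permutations(D2))

D1 = [(0.0, 4.0), (1.0, 2.0)]
D2 = [(0.0, 5.0), (1.0, 2.5)]
# matching (0,4)->(0,5) and (1,2)->(1,2.5) costs max(1.0, 0.5) = 1.0
assert bottleneck(D1, D2) == 1.0
```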

parametrized graph induced complex

Let \( G^r (P)\) be the graph on P where pq is an edge if and only if d(p, q) ⩽ 2r. The parameterized graph induced complex \(\mathbb{G}^r (Q, P)\) is defined as \(\mathbb{G}(Q, G^r (P))\).

graph induced complex

Given two finite point sets Q, P in \( \mathbb{R}^ d\), as well as a graph G(P) with vertices in P, we define v : P → Q by sending each point in P to its closest point in Q. The graph induced complex \(\mathbb{G}(Q, G(P)) \) contains a simplex \( σ = \{q_0, . . . , q_k\} ⊂ Q\) if and only if there is a clique \(\{p_0, . . . , p_k\}\) in G(P) for which \(v(p_i) = q_i\).

parametrized Witness complex

Given a finite point set Q and a point set P ⊃ Q in some metric space as well as a real number radius r > 0, the parameterized Witness complex \(\mathbb{W}^r (Q, P)\) is a simplicial complex on Q defined as follows: Every point p ∈ Q defines a vertex in \(\mathbb{W}^r (Q, P)\). Further, an edge pq is in \(\mathbb{W}^r (Q, P)\) if it is weakly witnessed by x ∈ P \ Q and d(p, x) ⩽ r and d(q, x) ⩽ r. A simplex σ ⊆ Q is in \(\mathbb{W}^r (Q, P)\) if all its edges are.

Witness complex

The Witness complex W(Q, P) is the collection of simplices on Q for which every face is weakly witnessed by some point in P \ Q.

weakly witnessed

Given a finite point set Q and a point set P ⊃ Q in some metric space, we say that a simplex σ ⊆ Q is weakly witnessed by x ∈ P \ Q, if d(q, x) ⩽ d(p, x) for every q ∈ σ and p ∈ Q \ σ.
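The condition translates directly into a check over all vertex/non-vertex pairs (`weakly_witnessed` and `euclid` are illustrative names, not library functions):

```python
def euclid(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def weakly_witnessed(sigma, x, Q, dist):
    """Check whether the simplex sigma (a subset of Q) is weakly witnessed
    by x: every vertex of sigma is at least as close to x as every point
    of Q outside sigma."""
    outside = [p for p in Q if p not in sigma]
    return all(dist(q, x) <= dist(p, x) for q in sigma for p in outside)

Q = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
x = (0.5, 0.0)  # a candidate witness from P \ Q
assert weakly_witnessed([(0.0, 0.0), (1.0, 0.0)], x, Q, euclid)
assert not weakly_witnessed([(0.0, 0.0), (5.0, 0.0)], x, Q, euclid)
```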

Alpha complex

Given a finite point set \( P ⊂ \mathbb{R }^d\) in general position as well as a real number radius r > 0, the Alpha complex \(Del^r (P)\) consists of all simplices σ ∈ Del(P) for which the circumscribing ball of σ has radius at most r. 

extended Delaunay complex

Given a finite point set \(P ⊂ \mathbb{R}^ d\), the extended Delaunay complex is the simplicial complex where for every face σ, for d ′ ⩽ d, every d ′ -face of σ is a Delaunay simplex.

Delaunay triangulation

A Delaunay triangulation Del(P) of P is a geometric simplicial complex with the vertex set P where every simplex is a Delaunay simplex and whose underlying space covers the convex hull of P.

Delaunay simplex

Given a finite point set \(P ⊂ \mathbb{R}^{d}\), a Delaunay simplex is a geometric simplex whose vertices are in P and lie on the boundary of a ball whose interior contains no points of P.