SECT statistic

 

The objective of this blog post is to summarize my understanding of the theory and applications behind Smooth Euler Characteristic Transform (SECT) statistic which was first introduced by Crawford et al1 in 2020. The SECT statistic summarizes the shape information associated with an object as a collection of smooth curves that lie in a Hilbert space. The “niceness” of a Hilbert space allows us to use different tools from Functional Data Analysis (FDA). The motivation behind SECT comes from the Persistent Homology Transform (PHT) statistic introduced by Turner et al2 in 2014. The blog post starts with an introduction to PHT, its properties, and drawbacks. The second section discusses the benefits of having a Hilbert space structure from a statistics point of view. The final section discusses the construction of the SECT statistic.

Persistent Homology Transform (PHT)

The authors of both PHT and SECT are trying to tackle the problem of how shape information can be quantified in a way that enables us to perform statistical inference. The PHT statistic, introduced by Turner et al2, is to model objects in $\mathbb{R}^2$ and $\mathbb{R}^3$. PHT captures shape information as a collection of Persistent Diagrams that we have to work with instead of the object itself. Hence, the resulting space of persistence diagrams, $\mathcal{D}$, needs to be “nice” i.e. a space where we can perform statistical inference. For instance, consider the CT scan mesh representations of five teeth taken from Turner et al2. If there exists a distance function $d$ such that $(\mathcal{D}, d)$ is a well-defined metric space, then such a metric $d$ could be used to compare the shapes of these teeth.

teeth

A detailed introduction to Persistence Diagrams can be found in the third chapter of the textbook by Dey and Wang 3. Let $M \subset \mathbb{R}^d$ be such that it can be approximated using a finite simplicial complex. The set of all possible (unit vector) directions in $\mathbb{R}^d$ lie in the $d-1$ sphere $\mathcal{S}^{d-1} \subset \mathbb{R}^d$. Additionally, for a fixed $v \in \mathcal{S}^{d-1}$, $M(v)$ will denote the projection of points in M onto the vector $v$.

\[\mathcal{S}^{d-1} = \lbrace x \in \mathbb{R}^d \text{ } : \text{ } \|\| x \|\|_{2} = 1 \rbrace \quad \text{ and } \quad M(v) \triangleq \lbrace x \cdot v \text{ } : \text{ } x \in M \rbrace \text{ for fixed } v \in \mathcal{S}^{d-1}\]

For each such $v$, we can define a filtration $\lbrace M(v)_{r} : r>0 \rbrace$ (w.r.t. the height function) indexed by height $r$ such that $M(v)_r$ corresponds to the subset of $M$ that is atmost at height $r$, from the baseline, in the direction of $v$.

\[M(v)_r = \lbrace x \in M : x \cdot v \leq r \rbrace \text{ for } r \geq 0\]

For a fixed $v$, topological information in the direction of this vector can be summarized using persistence diagrams associated with the filtration $\lbrace M(v)_{r}: r>0 \rbrace$ (w.r.t height). Intuitively the corresponding persistence diagrams, denoted by $X_k(M,v)$ for $k \in \lbrace 0, 1, \cdots , d-1 \rbrace $, store topological information ( w.r.t. height ) along the direction $v$. Therefore collecting the persistence diagrams $X_k(M,v)$ for all $k$ and all directions $v \in \mathcal{S}^{d-1}$ will contain all possible topological information. This is the intuition behind the Persistent Homology transform of $M$, denoted as $PHT(M)$:

\[PHT(M): \mathcal{S}^{d-1} \longrightarrow \mathcal{D}^d\] \[\quad \quad \quad \quad \quad\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad v \longmapsto (X_0(M,v), X_1(M,v), \cdots, X_{d-1}(M,v))\]

Note that $PHT(M)$ cannot be all possible topological information associated with $M$, instead, the information we have access to is restricted by the choice of filter function (here, it is the height function) and filtration.

  1. Lorin Crawford, Anthea Monod, Andrew X. Chen, Sayan Mukherjee & Raúl Rabadán (2020) Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis, Journal of the American Statistical Association, 115:531, 1139-1150, DOI: 10.1080/01621459.2019.1671198 

  2. Turner, K., Mukherjee, S., and Boyer, D. M. (2014), “Persistent homology transform for modeling shapes and surfaces,” Information and Inference, Oxford University Press (OUP). DOI: https://doi.org/10.1093/imaiai/iau011.  2 3

  3. Dey, T. K., and Wang, Y. (2022), “Computational Topology for Data Analysis,” Cambridge University Press. DOI: https://doi.org/10.1017/9781009099950.