# My personal blog

Machine learning, computer vision, languages

# Computing Gromov Hyperbolicity

22 Jul 2020

Gromov hyperbolicity measures the “tree-likeness” of a dataset. It indicates how well hierarchical embeddings such as Poincaré embeddings [1] would work on a dataset; papers that use this metric include [2] and [3]. A Gromov hyperbolicity close to zero means high tree-likeness.

Here are the results of some NLP datasets:

• WordNet mammals (6540 relations): $$\delta_{avg} \approx 0.00178$$ (tree)

• WordNet nouns (743086 relations): $$\delta_{avg} \approx 4 \cdot 10^{-4}$$ (tree)

• OpenThesaurus nouns (321877 relations): $$\delta_{avg} \approx 0.307$$ (not a tree)

• Hyperlex (2228 relations): $$\delta_{avg} \approx 4 \cdot 10^{-4}$$ (tree)

There are different ways to define Gromov hyperbolicity. I will present two variants here: a sampled average $$\delta_{avg}$$ and the exact worst case $$\delta_{worst}$$. Most papers seem to use sampling, because it is faster and the average over random quadruples is less sensitive to a few worst-case configurations than the exact maximum.

$$\delta_{worst}(G) = \max_{x,y,u,v \in V} \frac{S_1 - S_2}{2}$$

where $$S_1 \geq S_2 \geq S_3$$ are the three pairwise distance sums $$d(x, y) + d(u, v)$$, $$d(x, u) + d(y, v)$$ and $$d(x, v) + d(y, u)$$ sorted in decreasing order.
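As a quick sanity check, take the 4-cycle with vertices $$x, y, u, v$$ in order. The three sums are $$d(x, y) + d(u, v) = 2$$, $$d(x, u) + d(y, v) = 4$$ and $$d(x, v) + d(y, u) = 2$$, so $$S_1 = 4$$, $$S_2 = 2$$ and $$\delta_{worst} = 1$$. In a tree, the four-point condition forces $$S_1 = S_2$$ for every quadruple, so $$\delta_{worst} = 0$$.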

## Exact Gromov hyperbolicity

SageMath already includes a fast algorithm for computing the exact Gromov hyperbolicity. The input is an adjacency matrix; if the dataset is large, it is a good idea to use a sparse matrix.
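SageMath's routine is the practical choice for real datasets, but for small graphs the definition can also be brute-forced directly. Here is a minimal plain-Python sketch (BFS for distances, a scan over all quadruples; the function names and toy graphs are my own):

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def delta_worst(adj):
    """Exact Gromov hyperbolicity: max over all quadruples of (S1 - S2) / 2."""
    nodes = list(adj)
    d = {v: bfs_distances(adj, v) for v in nodes}
    best = 0.0
    for x, y, u, v in combinations(nodes, 4):
        s = sorted((d[x][y] + d[u][v],
                    d[x][u] + d[y][v],
                    d[x][v] + d[y][u]), reverse=True)
        best = max(best, (s[0] - s[1]) / 2)
    return best

# A path (a tree) has delta = 0; the 4-cycle has delta = 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(delta_worst(path), delta_worst(cycle))  # 0.0 1.0
```

This is $$O(n^4)$$ in the number of vertices, which is exactly why SageMath's optimized implementation is preferable beyond toy examples.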

## Sample Gromov hyperbolicity

It is not hard to implement the formula above. This time we will use the NetworkX library. One can also swap in a different distance function $$d(u, v)$$ if the dataset is already embedded in a metric space (see [3]).
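A minimal sketch of the sampled variant with NetworkX (the sample size, seed and test graph are placeholders of my own choosing):

```python
import random

import networkx as nx

def delta_sample(G, n_samples=5000, seed=0):
    """Estimate delta_avg by averaging (S1 - S2) / 2 over random quadruples."""
    rng = random.Random(seed)
    d = dict(nx.all_pairs_shortest_path_length(G))
    nodes = list(G)
    total = 0.0
    for _ in range(n_samples):
        x, y, u, v = rng.sample(nodes, 4)
        s = sorted((d[x][y] + d[u][v],
                    d[x][u] + d[y][v],
                    d[x][v] + d[y][u]), reverse=True)
        total += (s[0] - s[1]) / 2
    return total / n_samples

# On a tree every quadruple gives S1 == S2, so the estimate is exactly 0.
tree = nx.balanced_tree(2, 4)  # binary tree with 31 nodes
print(delta_sample(tree, n_samples=1000))  # 0.0
```

Precomputing all-pairs shortest paths is the dominant cost here; to use a custom distance $$d(u, v)$$ on embedded data, replace the `d[x][y]`-style lookups with calls to that function.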

## References

[1] M. Nickel and D. Kiela, Poincaré Embeddings for Learning Hierarchical Representations, 2017.

[2] I. Chami, Z. Ying, C. Ré and J. Leskovec, Hyperbolic Graph Convolutional Neural Networks, 2019.

[3] A. Tifrea, G. Bécigneul and O.-E. Ganea, Poincaré GloVe: Hyperbolic Word Embeddings, 2018.