Computing Gromov Hyperbolicity

22 Jul 2020

Gromov Hyperbolicity measures the “tree-likeness” of a dataset. This metric is an indicator of how well hierarchical embeddings such as Poincaré embeddings [1] would work on a dataset. Some papers which use this metric are [2] and [3]. A Gromov Hyperbolicity of approximately zero means a high tree-likeness.

Here are the results of some NLP datasets:

WordNet mammals (6540 relations): \(\delta_{avg} \approx 0.00178\) (tree)
WordNet nouns (743086 relations): \(\delta_{avg} \approx 4 \cdot 10^{-4}\) (tree)
OpenThesaurus nouns (321877 relations): \(\delta_{avg} \approx 0.307\) (not a tree)
Hyperlex (2228 relations): \(\delta_{avg} \approx 4 \cdot 10^{-4}\) (tree)

There are different ways to define Gromov hyperbolicity. I will present here two variants: sampling \(\delta_{avg}\) and exact \(\delta_{worst}\). Most papers seem to use sampling, because it is faster and produces better results.

\[\delta_{worst}(G) = \max_{x,y,u,v \in V} \frac{S_1 - S_2}{2}\]

where \(S_i = \{d(x, y) + d(u, v), d(x, u) + d(y, v), d(x, v) + d(y, u)\}\) and \(S_i\) is the \(i\)th largest element.

Exact Gromov hyperbolicity

SageMath includes already a fast algorithm for computing exact Gromov hyperbolicity. The input is an adjacency matrix. It is a good idea to use a sparse matrix if the dataset is too big.

import sage.all
from sage.graphs.hyperbolicity import hyperbolicity
from sage.graphs.graph import Graph
from sage.matrix.constructor import matrix

def exact_hyperbolicity(adjacency_matrix):
    G = Graph(matrix(adjacency_matrix), format='adjacency_matrix')
    assert G.is_connected()
    h, _, _ = hyperbolicity(G, algorithm='BCCM')

    return h

Sample Gromov hyperbolicity

It is not hard to implement the formula from above. This time we will use the library NetworkX. One can also change the distance function \(d(u, v)\) if the dataset is already embedded in a space (see [3]).

import networkx as nx

def sample_hyperbolicity(adjacency_matrix, num_samples=50000):
    G = nx.from_scipy_sparse_matrix(adjacency_matrix)

    hyps = []
    for i in range(num_samples):
        node_tuple = np.random.choice(G.nodes(), 4, replace=False)
        try:
            d01 = nx.shortest_path_length(G, source=node_tuple[0], target=node_tuple[1], weight=None)
            d23 = nx.shortest_path_length(G, source=node_tuple[2], target=node_tuple[3], weight=None)
            d02 = nx.shortest_path_length(G, source=node_tuple[0], target=node_tuple[2], weight=None)
            d13 = nx.shortest_path_length(G, source=node_tuple[1], target=node_tuple[3], weight=None)
            d03 = nx.shortest_path_length(G, source=node_tuple[0], target=node_tuple[3], weight=None)
            d12 = nx.shortest_path_length(G, source=node_tuple[1], target=node_tuple[2], weight=None)
            
            s = [d01 + d23, d02 + d13, d03 + d12]
            s.sort()
            hyps.append((s[-1] - s[-2]) / 2)
        except Exception as e:
            continue

    return np.max(hyps), np.mean(hyps)

References

[1] M. Nickel and D. Kiela, Poincaré Embeddings for Learning Hierarchical Representations, 2017.

[2] I. Chami, Z. Ying, C. Ré and J. Leskovec, Hyperbolic Graph Convolutional Neural Networks, 2019.

[3] A. Tifrea, G. Bécigneul and O.-E. Ganea, Poincaré GloVe: Hyperbolic Word Embeddings, 2018.