The standard activation function for binary outputs is the sigmoid function. However, in a recent paper, I show empirically on several medical segmentation datasets that other functions can be better.

A variety of operations can be performed on (contextual) word vectors. In this blog post, I will implement some common operations using PyTorch and Python.

In this blog post, I will implement some common methods for uncertainty estimation. My main focus lies on classification and segmentation. Therefore, regression-specific methods such as Pinball loss are not covered here.

Predictions are not just about accuracy, but also about probability. In lots of applications it is important to know how sure a neural network is of a prediction. However, the softmax probabilities in neural networks are not always calibrated and don’t necessarily measure uncertainty.
In this blog post, I will implement the most common metrics to evaluate the output probabilities of neural networks.

After having introduced Riemannian SGD in the last blog post, here I will give a concrete application for this optimization method. Poincaré embeddings [1][2] are hierarchical word embeddings which map integer-encoded words to the hyperbolic space.