Posts by Category

general-ml

Deriving and implementing LDA

3 minute read

In this post, I will derive and implement Linear Discriminant Analysis (LDA). See also chapter 4.3 in [1].

Avoiding underflows in Gaussian Naive Bayes

1 minute read

There are two main ways to avoid numerical instability when implementing Gaussian Naive Bayes (GNB). Either one applies the log-sum-exp trick or one takes ...

Calculating the hard-margin SVM by hand

less than 1 minute read

In this blog post, I will show how to calculate the hard-margin SVM by hand. If you are interested in a computational solution, refer to my last post.

Comparing Random Forest and Bagging

3 minute read

I recently read an interesting paper on Bagging [1]. The researchers compared Bagging and Random Subspace (RS) with Random Forest (RF). Their results were th...

Stacking Best Practices

6 minute read

Stacking is a popular ensemble method in data competitions, but general guidelines are nowhere to be found. Most articles just describe how Stacking works.

neural-networks

Object detection from scratch

6 minute read

In this post, I will implement a simple object detector in Keras based on the three YOLO papers [1][2][3]. The complete code can be obtained from here.

Detecting objects using segmentation

3 minute read

To find objects in images, one normally predicts four values: two coordinates, width and height. However, it is also possible to formulate object detection a...

Losses for Image Segmentation

7 minute read

In this post, I will implement some of the most common losses for image segmentation in Keras/TensorFlow. I will only consider the case of two classes (i.e. ...

object-detection

Object detection from scratch

6 minute read

In this post, I will implement a simple object detector in Keras based on the three YOLO papers [1][2][3]. The complete code can be obtained from here.

Detecting objects using segmentation

3 minute read

To find objects in images, one normally predicts four values: two coordinates, width and height. However, it is also possible to formulate object detection a...

Losses for Image Segmentation

7 minute read

In this post, I will implement some of the most common losses for image segmentation in Keras/TensorFlow. I will only consider the case of two classes (i.e. ...

k-means clustering for anchor boxes

3 minute read

In this blog post, I will explain how k-means clustering can be implemented to determine anchor boxes for object detection. Anchor boxes are used in object d...

math

Calculating the hard-margin SVM by hand

less than 1 minute read

In this blog post, I will show how to calculate the hard-margin SVM by hand. If you are interested in a computational solution, refer to my last post.

nlp

n-gram, entropy and entropy rate

5 minute read

n-gram models are used in many areas of computer science, but are often explained only in the context of natural language processing (NLP). In this post, I w...

Portuguese Lemmatizers

4 minute read

In this post, I will compare some lemmatizers for Portuguese. In order to do the comparison, I downloaded subtitles from various television programs. The sen...

various

Wavelet Trees and full-text search indices

5 minute read

The wavelet tree is a useful data structure in many areas of computer science. One of its applications is full-text search. See the articles [1] and [2] ...
