Why use stochastic models?

In this blog post, we explore why stochastic models matter in unsupervised learning. We start with k-means clustering, which deterministically clusters points based on heuristics, and build up to Expectation Maximization (EM), which can use any parameterized probability distribution to cluster data. k-means clustering: k-means clustering is a method […]

No Straight Lines Here: The Wacky World of Non-Linear Manifold Learning

In machine learning and data analysis, understanding the complex relationships within high-dimensional data such as images or videos is rarely easy and often calls for more than basic techniques. The patterns are complex, twisted, and intertwined, defying the simplicity of straight lines. This is where non-linear manifold learning algorithms step in. They […]

Evolution & Taxonomy of Clustering Algorithms

History of Clustering: The history of clustering algorithms dates back to the early 20th century, with origins in anthropology and psychology in the 1930s [1, 2]. Clustering was introduced in anthropology by Driver and Kroeber in 1932 [3] to reduce the ambiguity of empirically based typologies of cultures and individuals. In […]

How to Evaluate Features after Dimensionality Reduction?

Introduction: Dimensionality reduction can be a critical preprocessing step that maps a dataset’s high-dimensional features in input space to a much lower-dimensional latent space. It offers several benefits when training a model, including avoiding the curse of dimensionality, reducing the risk of overfitting, and lowering the computation […]