In Part 1 of this series, we broke down Markov Chains and Markov Decision Processes, two models for fully observable systems. But what happens when the thing you’re trying to model can’t be directly observed?
That’s where Hidden Markov Models (HMMs) come in. These models are essential for making sense of systems where the outcome is visible, but the cause is hidden, like identifying parts of speech in a sentence, or guessing the weather based on how many ice creams someone eats.
Hidden Markov Models are built for partially observable, autonomous systems. That means:

- Partially observable: we never see the underlying states directly; we only see the outputs (observations) they emit.
- Autonomous: the system moves between states on its own, with no agent's actions influencing the transitions (in contrast to the MDPs from Part 1).

Like other Markov models, HMMs assume that:

- the probability of the next state depends only on the current state, not on the full history (the Markov assumption), and
- each observation depends only on the state that emitted it, not on any other states or observations (output independence).
Much like Markov Chains, an HMM has a set of states, a transition probability matrix, and an initial probability distribution. Unlike Markov Chains, however, an HMM builds on that same structure but adds one key ingredient: emission probabilities, the probability that an observation oᵢ is generated at state qᵢ.
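In the notation of Rabiner's 1989 tutorial, these pieces form the triple λ = (A, B, π):

```latex
% Rabiner's notation: an HMM is the triple \lambda = (A, B, \pi)
\begin{align*}
a_{ij}   &= P(q_{t+1} = j \mid q_t = i)  && \text{(transition probabilities)} \\
b_i(o_t) &= P(o_t \mid q_t = i)          && \text{(emission probabilities)}   \\
\pi_i    &= P(q_1 = i)                   && \text{(initial distribution)}
\end{align*}
```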
According to foundational work by Rabiner (1989) and Ferguson (1960s), working with HMMs usually involves solving one of three key problems:

1. Likelihood: given an HMM and an observation sequence, compute the probability of that sequence under the model (solved efficiently by the Forward algorithm).
2. Decoding: given an HMM and an observation sequence, find the most likely sequence of hidden states that produced it (solved by the Viterbi algorithm).
3. Learning: given an observation sequence and the set of states, estimate the transition and emission probabilities (solved by the Baum-Welch, or forward-backward, algorithm).
Let’s look at a fun example from Jason Eisner (2002). Suppose we want to figure out whether a day was hot or cold, not by checking the weather report, but by tracking how many ice creams Jason ate.
Here’s our observation sequence:
3 ice creams, 1 ice cream, 3 ice creams
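To make this concrete, here's a minimal sketch of the model in Python. The exact probabilities below are illustrative placeholders for this walkthrough, not necessarily the values from Eisner's original exercise:

```python
import numpy as np

# Hidden states and the observed sequence (ice creams eaten per day).
states = ["HOT", "COLD"]
obs = [3, 1, 3]

# Initial distribution: pi[i] = P(day 1 is in state i).
pi = np.array([0.8, 0.2])

# Transition matrix: A[i, j] = P(next state is j | current state is i).
A = np.array([
    [0.6, 0.4],  # HOT  -> HOT, HOT  -> COLD
    [0.5, 0.5],  # COLD -> HOT, COLD -> COLD
])

# Emission matrix: B[i, k] = P(eating k+1 ice creams | state i).
B = np.array([
    [0.2, 0.4, 0.4],  # HOT:  P(1), P(2), P(3)
    [0.5, 0.4, 0.1],  # COLD: P(1), P(2), P(3)
])
```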
We don’t know the actual weather. That’s hidden. But we can build a trellis diagram to estimate the most likely state sequence based on emissions (ice cream counts) and our known model parameters.
The trellis has:

- one column per time step, i.e., one for each observation in the sequence,
- one node per hidden state (here, hot and cold) in every column, and
- edges between adjacent columns, weighted by the probability of transitioning to the next state and emitting the observation seen there.
Depending on the algorithm we apply, the same trellis answers different questions:

- the Forward algorithm sums the probabilities of every path through the trellis, giving the overall likelihood of the observation sequence, while
- the Viterbi algorithm keeps only the best path into each node, recovering the single most likely hidden state sequence (a hot/cold labeling of the days).

Both are sketched in code below.
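Here's a compact sketch of both passes over that trellis, reusing the toy parameters defined above (the function names are mine, introduced for illustration):

```python
def forward(obs, pi, A, B):
    """Likelihood: P(observations | model), summing over all paths."""
    alpha = pi * B[:, obs[0] - 1]          # first trellis column
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o - 1]  # sum over predecessors, then emit
    return alpha.sum()

def viterbi(obs, pi, A, B, states):
    """Decoding: the single most likely hidden state sequence."""
    delta = pi * B[:, obs[0] - 1]          # best path probability so far
    backpointers = []
    for o in obs[1:]:
        scores = delta[:, None] * A        # scores[i, j]: best path ending with i -> j
        backpointers.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * B[:, o - 1]
    # Trace the best path backwards from the most probable final state.
    path = [int(delta.argmax())]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    path.reverse()
    return [states[i] for i in path], float(delta.max())

print("P(3, 1, 3) =", forward(obs, pi, A, B))
print("Most likely weather:", viterbi(obs, pi, A, B, states))
```

With these placeholder numbers, the decoder labels the three days HOT, COLD, HOT, matching the intuition that big ice cream days were probably hot ones.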
HMMs were a foundational tool in early NLP and speech recognition systems, and they still underpin concepts in modern sequence modeling. They allow us to:

- score how likely an observation sequence is under a model,
- decode the hidden structure that best explains what we observed, and
- learn a model's parameters directly from sequences of observations.
Whether you're modeling human behavior, decoding audio signals, or aligning tokens in language tasks, the HMM framework offers a powerful starting point. If you're working on sequence-based problems, from chatbot interactions to biological data, these models can help you build intuition and structure for reasoning under uncertainty.
Watch the full explainer video below: