Finite State Machines (FSMs)
You've written authentication flows that move users through states: unauthenticated β logging in β authenticated β session expired. You've read TCP connection states in debugging output: LISTEN, SYN_SENT, ESTABLISHED, CLOSE_WAIT. You've built form validation that changes behavior based on whether input is empty, invalid, or valid. All of these are implementations of the same theoretical model.
This is that model.
A Finite State Machine is a formal model of computation that describes any system which moves through a fixed set of states based on inputs. Understanding FSMs formally gives you a precise vocabulary for designing state-dependent systems β and explains why regex engines, lexers, and network protocols work the way they do.
Understanding FSMs requires computational thinking: decomposing systems into states, recognizing patterns in transitions, abstracting away implementation details, and designing algorithms to process inputs.
Learning Objectives
By the end of this article, you'll be able to:
- Define an FSM using the formal 5-tuple: states, alphabet, transition function, initial state, and accepting states
- Design an FSM from a prose description of a system's behavior
- Distinguish deterministic (DFA) from non-deterministic (NFA) finite automata and explain the tradeoffs
- Implement an FSM in code and explain why explicit state machines improve testability
- Identify what FSMs cannot do β and why that boundary matters for regex and parsers
Where You've Seen This
FSMs appear throughout production software:
- TCP/IP protocol β every TCP connection moves through states (
LISTENβSYN_RECEIVEDβESTABLISHEDβFIN_WAITβCLOSED). The Linux kernel's TCP implementation is essentially one large FSM - Regex engines β every regular expression compiles to an FSM internally. DFA-based engines (Go's
regexp, Rust'sregex,re2) make a guaranteed linear pass through the state machine. Backtracking engines (Pythonre, JavaScript, Java, PHP, Ruby β the majority in production) can be exponential on adversarial input. Understanding FSMs is what explains the difference - Lexers β the first stage of every compiler or parser breaks source text into tokens using FSMs. Each token type (identifier, number, string literal) is recognized by a state machine
- Authentication flows β unauthenticated β authenticating β authenticated β expired β locked is a classic FSM. Implementing it explicitly prevents bugs like "how did a locked user end up in the authenticated state?"
- Game AI β enemies with patrol β alert β attack β retreat states; game characters with idle β running β jumping β falling states
- Network protocols β HTTP, SMTP, FTP, WebSockets all define state machines for connection lifecycle
What is a Finite State Machine?
A Finite State Machine is an abstract model of computation that:
- Has a finite number of states (hence the name)
- Starts in an initial state
- Transitions between states based on inputs
- May have one or more accepting states (success!)
That's it. No infinite memory, no complex calculationsβjust states and transitions.
stateDiagram-v2
direction LR
[*] --> Off
Off --> On: press
On --> Off: press
This is an FSM for a light switch. Two states (On, Off), one input (press), deterministic behavior. Simple, but powerful.
Formal Definition
For the mathematically inclined, an FSM is a 5-tuple \((Q, \Sigma, \delta, q_0, F)\):
| Symbol | Meaning |
|---|---|
| \(Q\) | Finite set of states |
| \(\Sigma\) | Finite alphabet (possible inputs) |
| \(\delta\) | Transition function \((state \times input \to state)\) |
| \(q_0\) | Initial state |
| \(F\) | Set of accepting/final states |
Don't worry if that looks intimidatingβwe'll work with diagrams.
Example: Turnstile
A classic FSM example is a subway turnstile:
stateDiagram-v2
direction LR
[*] --> Locked
Locked --> Locked: push
Locked --> Unlocked: coin
Unlocked --> Locked: push
Unlocked --> Unlocked: coin
States: Locked, Unlocked
Inputs: coin, push
Behavior:
- Start Locked
- Insert coin β Unlocked
- Push while Unlocked β Locked (you go through)
- Push while Locked β stays Locked (nothing happens)
- Insert coin while Unlocked β stays Unlocked (stays unlocked; no state change)
This is a complete specification of turnstile behavior. No ambiguity.
Example: Validating Binary Numbers Divisible by 3
Here's where FSMs get interesting. Can we build a machine that accepts binary numbers divisible by 3?
stateDiagram-v2
direction LR
classDef accepting fill:#326CE5,stroke:#cbd5e0,stroke-width:2px,color:#fff
[*] --> S0
S0 --> S0: 0
S0 --> S1: 1
S1 --> S2: 0
S1 --> S0: 1
S2 --> S1: 0
S2 --> S2: 1
S0 ::: accepting
Formal Definition of this FSM:
- \(Q = \{S0, S1, S2\}\)
- \(\Sigma = \{0, 1\}\)
- \(q_0 = S0\)
- \(F = \{S0\}\)
- \(\delta\) is defined by the following table:
| State | Input '0' | Input '1' |
|---|---|---|
| S0 | S0 | S1 |
| S1 | S2 | S0 |
| S2 | S1 | S2 |
States represent remainders when dividing by 3:
- S0 = remainder 0 (divisible by 3!)
- S1 = remainder 1
- S2 = remainder 2
Trace through "110" (binary for 6):
- Start at S0 (remainder 0)
- Read '1': move to S1 (1 mod 3 = 1)
- Read '1': move to S0 (3 mod 3 = 0)
- Read '0': stay at S0 (6 mod 3 = 0)
- End at S0 β accept! 6 is divisible by 3. β
Try It Yourself
Trace "101" (binary for 5). Does it end in an accepting state?
What about "1001" (binary for 9)?
Deterministic vs Non-Deterministic
Deterministic FSM (DFA)
Every state has exactly one transition for each input. Given a state and input, you always know where to go next.
Non-Deterministic FSM (NFA)
A state might have:
- Multiple transitions for the same input
- Transitions on "Ξ΅" (epsilon) β moving without consuming input
- No transition for some inputs
stateDiagram-v2
direction LR
[*] --> S0
S0 --> S0: a
S0 --> S1: a
S1 --> S2: b
From S0, reading 'a' could go to S0 or S1. Non-deterministic!
The magic: NFAs and DFAs are equally powerful. Any NFA can be converted to an equivalent DFA (though the DFA might have more states). NFAs are often easier to design; DFAs are easier to implement. Best of both worlds.
FSMs and Regular Languages
What's a "Language" in Computer Science?
A language is simply a set of valid strings. Examples:
- "All strings that start with 'a' and end with 'b'"
- "All valid email addresses"
- "All binary numbers divisible by 3"
- "All strings with balanced parentheses"
What Does "Recognize" Mean?
An FSM "recognizes" a language if it can correctly accept valid strings (ending in an accepting state) and reject invalid ones (ending in a non-accepting state). The FSM is essentially a validatorβgive it a string, and it tells you whether it belongs to the language.
The Fundamental Equivalence
FSMs recognize exactly the regular languagesβthe same languages described by regular expressions. This isn't a coincidence. These three formalisms describe exactly the same class of languages:
| Formalism | Description |
|---|---|
| FSM (DFA/NFA) | State diagrams |
| Regular Expression | Pattern syntax (a*b+) |
| Regular Grammar | Production rules |
Practical meaning: If you can write a regex for something, you can build an FSM for it, and vice versa. They're two different notations for the exact same thing.
Examples: Regular vs. Not Regular
β Regular languages (FSM β, Regex β):
- "Strings with even number of a's" β Regex:
(b*ab*ab*)* - "Strings ending with '.com'" β Regex:
.*\.com - "Binary numbers divisible by 3" β (we built this FSM earlier!)
β Not regular languages (FSM β, Regex β):
- "Same number of a's and b's" β needs counting/memory
- "Balanced parentheses" β needs a stack to track nesting
- \(a^nb^n\) (equal a's and b's) β needs unbounded counting
What FSMs Cannot Do
FSMs have no memory beyond their current state. This means they can't:
- Count unbounded quantities ("same number of a's and b's")
- Match nested structures (balanced parentheses)
- Remember arbitrary history
Example: The "Equal A's and B's" Language
The language \(\ L = \{a^nb^n \mid n \geq 0\} \) is NOT regular. This notation means "n a's followed by n b's, where n is any number 0 or greater":
- When n=0: "" (empty string)
- When n=1: "ab"
- When n=2: "aabb"
- When n=3: "aaabbb"
- When n=100: 100 a's followed by 100 b's
Why FSMs Can't Handle This:
To validate these strings, you'd need to:
- Count the a's: "I saw 5 a's"
- Remember that count: "I need to see exactly 5 b's"
- Count the b's and compare: "1... 2... 3... 4... 5... match!"
FSMs can't do step 2. To remember any possible count, you'd need states like:
- state_0_as, state_1_as, state_2_as, state_3_as, ..., state_1000_as, ...
Here's what this impossible FSM would look like:
stateDiagram-v2
direction LR
[*] --> S0
S0 --> S1: a
S1 --> S2: a
S2 --> S3: a
S3 --> S4: a
S4 --> ...: a
note right of S4: This continues forever!<br/>Need infinite states to<br/>count unbounded a's
But there are infinitely many possible counts, and FSMs must have a FINITE number of states. That's the fundamental limitation.
Contrast with "divisible by 3": That FSM only needs 3 states (remainder 0, 1, or 2) because we track the remainder, not the actual count. The remainder is boundedβit's always 0, 1, or 2. Counting to arbitrary numbers is unbounded.
For languages requiring this kind of counting or nesting (like balanced parentheses), we need more powerful models that add memory to the state machine conceptβlike a call stack that tracks where we are in nested structures.
Real-World FSMs
stateDiagram-v2
direction LR
[*] --> Green
Green --> Yellow: timer
Yellow --> Red: timer
Red --> Green: timer
Real traffic lights are more complex (handling multiple directions, pedestrian buttons, sensors), but the core is an FSM.
Enemy behavior in many games:
stateDiagram-v2
direction LR
[*] --> Patrol
Patrol --> Chase: see_player
Chase --> Attack: in_range
Chase --> Patrol: lose_player
Attack --> Chase: player_fled
Attack --> Patrol: player_dead
This creates believable behavior from simple rules. Not bad for a bunch of circles and arrows.
Network protocols are often specified as FSMs:
stateDiagram-v2
direction LR
[*] --> CLOSED
CLOSED --> LISTEN: passive_open
CLOSED --> SYN_SENT: active_open
LISTEN --> SYN_RCVD: recv_SYN
SYN_SENT --> ESTABLISHED: recv_SYN_ACK
SYN_RCVD --> ESTABLISHED: recv_ACK
ESTABLISHED --> FIN_WAIT: close
ESTABLISHED --> CLOSE_WAIT: recv_FIN
(Simplifiedβthe real TCP state diagram has more states and transitions.)
When a compiler reads your code, the first step is tokenizingβbreaking the code into meaningful chunks called tokens.
Example: The code x = 42 + y gets broken into tokens:
x(identifier/variable name)=(operator)42(number)+(operator)y(identifier/variable name)
Compilers use FSMs to recognize different token types. Here are two FSMsβone for numbers, one for identifiers (variable/function names):
State Names
The state names like "InNumber" and "InIdentifier" are descriptive labels that tell us what the FSM is currently doing:
- InNumber = "currently in the middle of reading a number"
- InIdentifier = "currently in the middle of reading an identifier"
Just like "Locked/Unlocked" for a turnstile, these names help us understand what's happening in each state.
Recognizing Numbers:
stateDiagram-v2
direction LR
[*] --> Start
Start --> InNumber: digit (0-9)
InNumber --> InNumber: digit
InNumber --> [*]: space/operator
Recognizing Identifiers:
stateDiagram-v2
direction LR
[*] --> Start
Start --> InIdentifier: letter (a-z)
InIdentifier --> InIdentifier: letter/digit
InIdentifier --> [*]: space/operator
How it works:
- See a digit (0-9) first β InNumber state, keep reading digits until hitting something else (space, operator, etc.) β emit a number token
- See a letter (a-z, A-Z) first β InIdentifier state, keep reading letters/digits until hitting something else β emit an identifier token
Example trace for x42:
- Start state
- See 'x' (letter) β InIdentifier
- See '4' (digit, allowed in identifiers) β stay InIdentifier
- See '2' (digit) β stay InIdentifier
- See space β done, emit identifier token
x42
This is how compilers turn source code text into structured tokens for parsing!
Why This Matters for Production Code
FSMs aren't just a theoretical curiosity β they're a practical design tool.
Explicit state machines prevent impossible states. If you model an order's lifecycle as an FSM (pending β paid β shipped β delivered β refunded), you can enforce that a cancelled order can never become pending again. Without an explicit model, these invalid transitions sneak in as bugs.
FSMs are directly testable. A state machine has a finite, enumerable set of (state, input) pairs. You can write a test for every valid transition and every invalid one. Compare this to implicit state scattered across boolean flags (isLoggedIn, isLoading, hasError) β testing all combinations becomes exponential.
ReDoS attacks exploit FSM backtracking. Certain regex patterns compile to non-deterministic FSMs that backtrack exponentially on crafted inputs. Cloudflare had a major outage in 2019 caused by exactly this. Understanding that regex engines are FSMs explains why (a+)+ matching against "aaaaaaaaab" can hang a server.
Lexers are FSMs. The first stage of every compiler β tokenization β uses FSMs to scan source code. Understanding this explains lexer performance characteristics and why some patterns are tokenized in one pass.
Beyond FSMs: Adding Memory
While FSMs are powerful for many tasks, they hit a fundamental limitation: they can't count or handle nested structures. More powerful computational models extend FSMs by adding memory:
| Feature | FSM | Extended Models |
|---|---|---|
| Memory | Current state only | State + stack/tape |
| Power | Regular languages | Context-free & beyond |
| Recursion | No | Yes |
| Nesting | Can't handle | Handles naturally |
| Simplicity | Simpler | More powerful |
These extended models essentially add a stack or other memory structure to track nesting depth, enabling recognition of nested structures like parentheses, HTML tags, and programming language syntax.
Implementing an FSM
FSMs translate directly into code. Here's a turnstile implementation:
- Initial state - turnstile starts locked
- Process an input event and transition to next state
- Check current state to determine which transitions are valid
- Transition from locked to unlocked when coin inserted
- Transition from unlocked to locked when pushed
- Constructor method called automatically when creating new instance with
new - Instance property using
this- each Turnstile object has its own state - Use strict equality
===for string comparison (preferred over==in JavaScript) - Return current state after transition for convenient chaining and logging
- Define struct type - Go's way of grouping data (like a class without inheritance)
- Constructor function pattern - Go convention to prefix with "New" and return pointer
- Return address (&) of struct initialized with field syntax
- Method with pointer receiver (*Turnstile) - allows modifying the struct's state
- Derive Debug trait to enable printing State values with
{:?}format - Enum for type-safe states - compiler ensures only Locked or Unlocked exist
- Mutable reference (&mut) required to modify state; return immutable reference
- Match on tuple (state, input) - Rust's powerful pattern matching handles all cases
- Wildcard pattern
_matches any unhandled case (no transition, stay in current state) - Return reference to internal state (borrowing, not transferring ownership)
- Private field with public methods (encapsulation) - standard Java pattern
- Constructor with same name as class - initializes state
- Use
.equals()for string comparison (not==which compares references)
- Initializer list - more efficient than assignment in constructor body
- Pass string by const reference to avoid copying (performance optimization)
- C++ allows
==for string comparison (std::string overloads the operator)
For complex FSMs with many states, a transition table (dictionary/map keyed on (state, input)) scales better than nested conditionals β the logic stays data rather than code.
Technical Interview Context
FSMs appear in system design interviews as state modeling problems, and in lower-level discussions about regex engines and protocol parsers.
Design the state model for a payment / order / user account lifecycle
This is an FSM problem whether or not the interviewer uses that term. Define the states, enumerate valid transitions, and sketch a transition table. The exercise of drawing it out prevents invalid state combinations from creeping into the implementation.
How would you enforce that a cancelled order can't become pending again?
Explicit FSM: only permit transitions that appear in the transition table; reject all others. Contrast this with implicit state scattered across boolean flags, where invalid transitions happen when conditions are checked out of order.
How does a regex engine work?
A regex pattern compiles to a finite automaton (NFA or DFA). The engine reads input characters and follows state transitions; acceptance means the string matched. This is why regex is fast for simple patterns and why certain patterns cause exponential blowup on adversarial input.
Why can't a regex match balanced parentheses?
FSMs have no memory beyond their current state β they cannot count. Matching arbitrary nesting requires a stack (pushdown automaton), which is a more powerful computational model than an FSM.
Practice Problems
Practice Problem 1: Design an FSM
Create an FSM that accepts strings over {a, b} that contain an even number of a's.
Hint: How many states do you need? What do they represent?
Solution
Two states are sufficient:
- S0 (initial, accepting) β even number of
as seen so far (including zero) - S1 β odd number of
as seen so far
Transitions:
atoggles between states: S0 β S1, S1 β S0bkeeps you in the same state: S0 β S0, S1 β S1
stateDiagram-v2
direction LR
[*] --> S0
S0 --> S1 : a
S1 --> S0 : a
S0 --> S0 : b
S1 --> S1 : b
S0 --> [*]
Trace for "abba": S0 β(a)β S1 β(b)β S1 β(b)β S1 β(a)β S0. Accepted β
Trace for "aba": S0 β(a)β S1 β(b)β S1 β(a)β S0. Wait β S0 is accepting, so this is accepted. But "aba" has 2 a's, which is even. Correct β
Key insight: You never need to count the actual number of a's β only the parity (even or odd). Two states perfectly capture parity. This is a general pattern: if you only care about a value modulo N, you need exactly N states.
Practice Problem 2: Vending Machine
Design an FSM for a vending machine that:
- Accepts nickels (5Β’), dimes (10Β’), and quarters (25Β’)
- Dispenses when total reaches 30Β’ or more
- Returns to start after dispensing
What are your states? (Hint: think about accumulated amounts)
Solution
States represent accumulated amounts: S0, S5, S10, S15, S20, S25, plus DISPENSE (accepting).
| From | Nickel (+5Β’) | Dime (+10Β’) | Quarter (+25Β’) |
|---|---|---|---|
| S0 | S5 | S10 | S25 |
| S5 | S10 | S15 | DISPENSE |
| S10 | S15 | S20 | DISPENSE |
| S15 | S20 | S25 | DISPENSE |
| S20 | S25 | DISPENSE | DISPENSE |
| S25 | DISPENSE | DISPENSE | DISPENSE |
| DISPENSE | S0 | S0 | S0 |
DISPENSE is the accepting state. It automatically transitions back to S0 after dispensing β no change is returned in this model (a common simplification).
Why this works: The possible accumulated amounts are finite and bounded (0Β’ to 25Β’ before dispensing). The FSM has exactly enough states to track each amount. This is the FSM sweet spot: finite, bounded state space, where each state has a clear meaning.
Why it wouldn't work for a different design: If the machine returned exact change, you'd need to track arbitrary amounts β potentially unbounded. That would require infinite states, which violates the FSM definition.
Practice Problem 3: Prove It's Not Regular
The language \( L = \{a^nb^n \mid n \geq 0\} \) is not regular.
Try to design an FSM for it. Where do you get stuck? What would you need that an FSM doesn't have?
Solution
Where you get stuck:
To build the FSM, you'd need one state for "seen 0 a's," another for "seen 1 a," another for "seen 2 a's," and so on β because you must remember exactly how many a's you saw in order to accept the same number of b's. Since n is unbounded, you'd need an infinite number of states. FSMs have a finite state count by definition. The construction fails.
What you'd need that FSMs don't have: A counter β memory that grows with the input. An FSM's only memory is its current state, which is fixed at design time. A pushdown automaton (PDA) β an FSM extended with a stack β can solve this: push one symbol per a, then pop one per b. When the stack is empty after all the b's, accept.
Formal argument (Pumping Lemma sketch): Suppose L were regular with pump length p. Consider the string a^p b^p. The pump substring must fall entirely within the a's (at the start). Repeating the pump β pumping up β produces a^{p+k} b^p for some k > 0. This string is not in L because it has more a's than b's. Contradiction. Therefore L is not regular.
Why this matters in practice: {a^n b^n} is the canonical example of why regex cannot match balanced parentheses or validate matching XML/HTML tags. Those structures require counting depth, which requires a stack β which is exactly what parsers provide.
Key Takeaways
| Concept | Meaning |
|---|---|
| State | A configuration the machine can be in |
| Transition | Rule for moving between states |
| Initial State | Where computation begins |
| Accepting State | Success! Input is valid |
| DFA | Deterministic β one transition per input |
| NFA | Non-deterministic β multiple options possible |
| Regular Language | What FSMs can recognize |
Further Reading
On This Site
- Regular Expressions: The Formal Model β How regex patterns compile to automata, why backtracking causes ReDoS, and what regex cannot match
External
- Sipser, Michael. Introduction to the Theory of Computation β The standard university text; Chapter 1 covers finite automata and regular languages in full formal detail
- Cox, Russ. "Regular Expression Matching Can Be Simple And Fast" β Explains why DFA-based engines guarantee linear time; directly relevant to the ReDoS discussion above
FSMs are the "hello world" of computation theory β simple enough to fully understand, powerful enough to be genuinely useful. They're also a practical engineering tool: making state explicit, transitions enumerable, and behavior testable. When a system's complexity comes from the number of states rather than the logic within each state, an FSM is usually the right model.