Minesweeper AI Architecture

Constraint Propagation with Subset Analysis

Pure logic - no neural network needed. Finds cells that are guaranteed safe or guaranteed mines.

// Direct deduction
if (remainingMines === hiddenNeighbours.length) → all are MINES
if (remainingMines === 0) → all are SAFE

// Subset analysis: if A ⊂ B, deduce from the difference B \ A
{1,2} has 1 mine, {1,2,3} has 2 mines
→ Cell 3 holds (2 - 1) = 1 mine, so cell 3 is a MINE
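The two rules above can be sketched as a small fixed-point loop. This is a minimal illustration, not the page's actual solver: the constraint shape `{ cells, mines }` and the function names `directDeduce`, `subsetReduce` and `solve` are assumptions made for the example, and the input constraints are assumed to be consistent with a real board.

```javascript
// Assumed constraint shape: { cells: number[], mines: number }, meaning
// "exactly `mines` of these hidden cells are mines". Cell ids are hypothetical.

// Direct deduction on a single constraint.
function directDeduce(c) {
  if (c.cells.length === 0) return { safe: [], mines: [] };
  if (c.mines === 0) return { safe: [...c.cells], mines: [] };
  if (c.mines === c.cells.length) return { safe: [], mines: [...c.cells] };
  return { safe: [], mines: [] };
}

// Subset analysis: if a.cells is a proper subset of b.cells, the cells in
// b but not in a hold exactly b.mines - a.mines mines. Returns null otherwise.
function subsetReduce(a, b) {
  if (a.cells.length >= b.cells.length) return null;
  const bSet = new Set(b.cells);
  if (!a.cells.every((cell) => bSet.has(cell))) return null;
  const aSet = new Set(a.cells);
  return {
    cells: b.cells.filter((cell) => !aSet.has(cell)),
    mines: b.mines - a.mines,
  };
}

// Apply both rules until no new constraint appears.
function solve(constraints) {
  const safe = new Set();
  const mines = new Set();
  const seen = new Set(); // keys of constraints already processed
  const known = [];
  const queue = [...constraints];
  const byId = (x, y) => x - y;
  const key = (c) => [...c.cells].sort(byId).join(",") + ":" + c.mines;

  while (queue.length > 0) {
    const c = queue.pop();
    const k = key(c);
    if (seen.has(k)) continue;
    seen.add(k);

    const d = directDeduce(c);
    d.safe.forEach((cell) => safe.add(cell));
    d.mines.forEach((cell) => mines.add(cell));

    // Compare against every earlier constraint, in both directions.
    for (const other of known) {
      const r1 = subsetReduce(c, other);
      if (r1) queue.push(r1);
      const r2 = subsetReduce(other, c);
      if (r2) queue.push(r2);
    }
    known.push(c);
  }
  return { safe: [...safe].sort(byId), mines: [...mines].sort(byId) };
}
```

Calling `solve([{ cells: [1, 2], mines: 1 }, { cells: [1, 2, 3], mines: 2 }])` reproduces the worked example: the difference `{3}` must hold one mine, so cell 3 is reported as a mine.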

Safe Cell Found?

Click it. Guaranteed correct.

No Safe Cells?

Fall through to neural network...
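The solver-first, network-second decision can be sketched as a single dispatch function. The name `chooseMove`, its arguments, and the returned `{ cell, source }` shape are all hypothetical; `guess` stands in for whatever fallback is used to pick among hidden cells.

```javascript
// Hypothetical helper: pick the next click for the hybrid agent.
// `safeCells` is the constraint solver's output, `hiddenCells` lists all
// unrevealed cells, and `guess` is a fallback that returns one hidden cell.
function chooseMove(safeCells, hiddenCells, guess) {
  if (safeCells.length > 0) {
    // A guaranteed-safe cell exists, so no guessing is needed.
    return { cell: safeCells[0], source: "solver" };
  }
  if (hiddenCells.length === 0) return null; // nothing left to click
  return { cell: guess(hiddenCells), source: "network" };
}
```

Keeping the fallback as a parameter means the guessing strategy can be swapped without touching the deterministic path.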

Deep Q-Network: Learned Guessing

When the constraint solver can't find a guaranteed safe cell, the neural network estimates a Q-value for each cell. The agent clicks the hidden cell with the highest Q-value, which training has taught to correlate with safety.

Board state (input) → One-hot encoding, 11 channels → Conv 128 + BN + ReLU → Residual blocks x6 → Policy head → Q-values per cell (output)
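The one-hot input stage can be illustrated with a small encoder. The source only states that there are 11 channels; the layout used here (channels 0 to 8 for revealed neighbour counts, channel 9 for hidden cells, channel 10 for flagged cells) is an assumption, as are the `HIDDEN` and `FLAGGED` markers and the name `encodeBoard`.

```javascript
// Assumed cell markers for the example; the real board format is not given.
const HIDDEN = -1;
const FLAGGED = -2;
const CHANNELS = 11; // assumed layout: 0-8 = counts, 9 = hidden, 10 = flagged

// Turn a 2D board into CHANNELS binary planes of the same height and width.
function encodeBoard(board) {
  const rows = board.length;
  const cols = board[0].length;
  const planes = Array.from({ length: CHANNELS }, () =>
    Array.from({ length: rows }, () => new Array(cols).fill(0))
  );
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const v = board[r][c];
      let ch;
      if (v === HIDDEN) ch = 9;
      else if (v === FLAGGED) ch = 10;
      else if (Number.isInteger(v) && v >= 0 && v <= 8) ch = v;
      else throw new Error("unexpected cell value: " + v);
      planes[ch][r][c] = 1; // exactly one channel is hot per cell
    }
  }
  return planes;
}
```

Each cell lights up exactly one plane, so the planes sum to 1 at every board position.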

// Training: Deep Q-Learning with experience replay
Q(state, action) ← reward + γ * max(Q(next_state, a'))

// Inference: Pick highest Q-value among valid cells
action = argmax(Q[hiddenCells])
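The update target and the inference rule can be sketched in plain JavaScript, without any neural network library. Two details here are assumptions not stated in the source: the maximum over next actions is restricted to hidden cells, and terminal transitions (a win or a mine hit) use the reward alone. The names `tdTarget` and `bestHiddenCell` are hypothetical.

```javascript
// Target for one replayed transition: reward + gamma * max Q(next_state, a').
// `nextQ` holds Q-values indexed by cell, `nextHidden` lists the cells that
// are still clickable in the next state (assumed masking), and `done` marks
// a terminal state with no future value to bootstrap from (assumed handling).
function tdTarget(reward, gamma, nextQ, nextHidden, done) {
  if (done || nextHidden.length === 0) return reward;
  let best = -Infinity;
  for (const i of nextHidden) best = Math.max(best, nextQ[i]);
  return reward + gamma * best;
}

// Inference: index of the hidden cell with the highest Q-value.
// Revealed cells are never considered, however large their Q-value is.
function bestHiddenCell(q, hiddenCells) {
  let best = hiddenCells[0];
  for (const i of hiddenCells) {
    if (q[i] > q[best]) best = i;
  }
  return best;
}
```

For example, with `q = [0.25, 4, 0.5]` and hidden cells `[0, 2]`, `bestHiddenCell` picks cell 2 even though cell 1 has the largest raw value, because cell 1 is not hidden.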

Results

The constraint solver (CSP) handles ~80% of moves with perfect accuracy.
The neural network handles the remaining ~20%: the genuine guessing situations.

Easy (9x9, 10 mines): ~85-95% win rate
Intermediate (16x16, 40 mines): ~70-80% win rate
Expert (16x30, 99 mines): ~15-25% win rate

The theoretical maximum win rate is limited by unavoidable 50/50 guesses, especially on Expert difficulty.
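A rough way to see the effect of forced guesses: if a game requires `k` independent 50/50 guesses and every other move is played perfectly, the chance of winning cannot exceed 0.5 raised to the power `k`. The helper below is only an illustration of that bound under those assumptions; `winRateCeiling` is a hypothetical name, and real boards rarely have fully independent guesses.

```javascript
// Upper bound on the win probability for a game that forces `k`
// independent 50/50 guesses, assuming all other moves are safe.
function winRateCeiling(k) {
  return Math.pow(0.5, k);
}
```

Under this simplification, a board that forces three such guesses can be won at most 12.5% of the time (`winRateCeiling(3)` is `0.125`), no matter how good the solver or the network is.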
