Hybrid Minesweeper AI

Constraint Propagation Solver + Deep Q-Network

Constraint Propagation with Subset Analysis
Pure logic - no neural network needed. Finds cells that are guaranteed safe or guaranteed mines.
// Direct deduction
if (remainingMines === hiddenNeighbours.length) → all are MINES
if (remainingMines === 0) → all are SAFE

// Subset analysis: if A ⊂ B, deduce from the difference
{1,2} has 1 mine, {1,2,3} has 2 mines
→ Cell 3 must have (2-1) = 1 mine
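The two rules above can be sketched as a small deduction routine. This is a minimal illustration, not the project's actual solver: the `deduce` function and the `{ cells, mines }` constraint shape are hypothetical, and cells are assumed to be identified by integer ids. Each constraint says "exactly `mines` of these hidden cells are mines"; subset analysis derives new constraints from pairs where one cell set contains the other, and direct deduction is then applied to everything known.

```javascript
// Hypothetical constraint shape: { cells: number[], mines: number },
// meaning "exactly `mines` of the hidden cells in `cells` are mines".
function deduce(constraints) {
  const keyOf = (c) =>
    [...c.cells].sort((x, y) => x - y).join(",") + ":" + c.mines;
  const known = constraints.map((c) => ({
    cells: new Set(c.cells),
    mines: c.mines,
  }));
  const seen = new Set(known.map(keyOf));

  // Subset analysis: if A ⊂ B, then B \ A holds (B.mines - A.mines) mines.
  // Repeat until no new constraint is produced.
  let grew = true;
  while (grew) {
    grew = false;
    for (const a of [...known]) {
      for (const b of [...known]) {
        if (a.cells.size === 0 || a.cells.size >= b.cells.size) continue;
        if (![...a.cells].every((x) => b.cells.has(x))) continue;
        const diff = new Set([...b.cells].filter((x) => !a.cells.has(x)));
        const count = b.mines - a.mines;
        // Skip impossible counts (these only arise from inconsistent input).
        if (count < 0 || count > diff.size) continue;
        const derived = { cells: diff, mines: count };
        const k = keyOf(derived);
        if (!seen.has(k)) {
          seen.add(k);
          known.push(derived);
          grew = true;
        }
      }
    }
  }

  // Direct deduction on every original and derived constraint.
  const safe = new Set();
  const mines = new Set();
  for (const c of known) {
    if (c.mines === c.cells.size) c.cells.forEach((x) => mines.add(x));
    else if (c.mines === 0) c.cells.forEach((x) => safe.add(x));
  }
  return { safe, mines };
}

// The example from the text: {1,2} has 1 mine, {1,2,3} has 2 mines.
const result = deduce([
  { cells: [1, 2], mines: 1 },
  { cells: [1, 2, 3], mines: 2 },
]);
console.log([...result.mines]); // cell 3 is deduced to be a mine
```

Deriving constraints by set difference also covers the case where a cell already known to be a mine or safe is removed from a larger constraint, since a single-cell constraint is a subset like any other.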

Safe Cell Found?

Click it. Guaranteed correct.

No Safe Cells?

Fall through to neural network...
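The decision flow above amounts to a simple dispatch. As a rough sketch (the `chooseMove` function, its arguments, and the `source` labels are hypothetical names for illustration, not the game's actual code):

```javascript
// Prefer a cell the constraint solver has proven safe; only when none
// exists, ask the learned model for a guess.
// safeCells: array of cell ids proven safe by the solver.
// guessFn:   callback returning the network's chosen cell id.
function chooseMove(safeCells, guessFn) {
  if (safeCells.length > 0) {
    return { cell: safeCells[0], source: "solver" };
  }
  return { cell: guessFn(), source: "network" };
}

console.log(chooseMove([5, 7], () => 1).source); // "solver"
console.log(chooseMove([], () => 9).source);     // "network"
```

Passing the guess as a callback means the network is only evaluated when the solver has nothing to offer.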

Deep Q-Network: Learned Guessing
When the constraint solver can't find a guaranteed safe cell, the neural network estimates a Q-value for each cell. The agent then clicks the hidden cell with the highest Q-value, which training has shaped to correlate with safety.
Board State (input) → One-Hot (11 ch) → Conv 128 + BN + ReLU → Residual Blocks x6 → Policy Head → Q-values per cell (output)
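To make the input stage of the architecture concrete, here is one way a board could be one-hot encoded into 11 channels. The page gives only the channel count, so the layout used here is an assumption: channels 0 to 8 for revealed neighbour counts, channel 9 for hidden cells, and channel 10 for flagged cells. The `encodeBoard` name and the `"H"` / `"F"` markers are likewise hypothetical.

```javascript
const CHANNELS = 11; // channel count from the diagram; meanings below are assumed
const HIDDEN = 9;    // assumed channel index for hidden cells
const FLAGGED = 10;  // assumed channel index for flagged cells

// board: 2D array whose entries are 0..8 (revealed count), "H" or "F".
// Returns planes[channel][row][col], with exactly one 1 per cell.
function encodeBoard(board) {
  const rows = board.length;
  const cols = board[0].length;
  const planes = Array.from({ length: CHANNELS }, () =>
    Array.from({ length: rows }, () => new Array(cols).fill(0))
  );
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const v = board[r][c];
      const ch = v === "H" ? HIDDEN : v === "F" ? FLAGGED : v;
      planes[ch][r][c] = 1;
    }
  }
  return planes;
}

const planes = encodeBoard([
  [1, "H"],
  ["F", 0],
]);
console.log(planes.length); // 11
```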
// Training: Deep Q-Learning with experience replay
Q(state, action) ← reward + γ * max(Q(next_state, a'))

// Inference: Pick highest Q-value among valid cells
action = argmax(Q[hiddenCells])
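The update target and the masked argmax can be written out as plain functions. This is a sketch under stated assumptions: `tdTarget` and `bestHiddenCell` are hypothetical names, Q-values are flat arrays indexed by cell, and the terminal-state case (returning the bare reward when the game has ended) is standard Q-learning practice rather than something the page specifies.

```javascript
// Target for Q(state, action): reward + gamma * max over a' of Q(next_state, a').
// On a terminal transition there is no next state, so the target is the reward.
function tdTarget(reward, nextQ, gamma, done) {
  if (done || nextQ.length === 0) return reward;
  return reward + gamma * Math.max(...nextQ);
}

// Inference: index of the highest Q-value among hidden cells only.
// q:      Q-value per cell.
// hidden: boolean per cell, true if the cell is still hidden (a valid click).
// Returns -1 when no hidden cell remains.
function bestHiddenCell(q, hidden) {
  let best = -1;
  for (let i = 0; i < q.length; i++) {
    if (hidden[i] && (best === -1 || q[i] > q[best])) best = i;
  }
  return best;
}

console.log(tdTarget(1, [0.5, 2], 0.5, false));                   // 2
console.log(bestHiddenCell([0.9, 0.2, 0.7], [false, true, true])); // 2
```

Masking matters because the network outputs a value for every cell, including revealed ones; in the second call the highest raw Q-value belongs to cell 0, which is already revealed and therefore skipped.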
Results
The constraint propagation solver (CSP) handles ~80% of moves with perfect accuracy.
The neural network handles the remaining ~20%: the genuine guessing situations.

Easy (9x9, 10 mines): ~85-95% win rate
Intermediate (16x16, 40 mines): ~70-80% win rate
Expert (16x30, 99 mines): ~15-25% win rate

The theoretical maximum win rate is limited by unavoidable 50/50 guesses, especially on Expert difficulty.