A probabilistic approach to solving crossword puzzles

Authors:

Abstract

We attacked the problem of solving crossword puzzles by computer: given a set of clues and a crossword grid, try to maximize the number of words correctly filled in. After an analysis of a large collection of puzzles, we decided to use an open architecture in which independent programs specialize in solving specific types of clues, drawing on ideas from information retrieval, database search, and machine learning. Each expert module generates a (possibly empty) candidate list for each clue, and the lists are merged together and placed into the grid by a centralized solver. We used a probabilistic representation as a common interchange language between subsystems and to drive the search for an optimal solution. Proverb, the complete system, averages 95.3% words correct and 98.1% letters correct in under 15 minutes per puzzle on a sample of 370 puzzles taken from the New York Times and several other puzzle sources. This corresponds to missing roughly 3 words or 4 letters on a daily 15×15 puzzle, making Proverb a better-than-average cruciverbalist (crossword solver).
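The merging step described above, in which each expert module returns a possibly empty candidate list and the lists are combined into one probabilistic ranking per clue, can be illustrated with a small sketch. This is not Proverb's actual merger: the function name `merge_candidates`, the expert names, the fixed per-expert weights, and the candidate probabilities are all hypothetical, and a plain weighted mixture is assumed here only for illustration.

```python
from collections import defaultdict


def merge_candidates(expert_lists, expert_weights, length):
    """Combine per-expert candidate distributions into one normalized distribution.

    expert_lists:   {expert_name: {candidate_word: probability}} (may be empty per expert)
    expert_weights: {expert_name: trust weight} (hypothetical fixed weights)
    length:         required answer length for the slot in the grid
    """
    merged = defaultdict(float)
    for name, candidates in expert_lists.items():
        weight = expert_weights.get(name, 0.0)
        for word, prob in candidates.items():
            # Discard candidates that cannot fit the slot.
            if len(word) == length:
                merged[word] += weight * prob
    total = sum(merged.values())
    if total == 0:
        return {}  # no expert proposed a usable candidate
    # Renormalize so the merged scores form a probability distribution.
    return {word: score / total for word, score in merged.items()}


# Hypothetical output of three expert modules for one four-letter clue.
expert_lists = {
    "database": {"AREA": 0.7, "ACRE": 0.3},
    "dictionary": {"ACRE": 0.5, "AREA": 0.25, "ACHE": 0.25},
    "empty_expert": {},  # an expert may return nothing for a clue
}
expert_weights = {"database": 0.6, "dictionary": 0.4, "empty_expert": 0.0}

merged = merge_candidates(expert_lists, expert_weights, length=4)
best = max(merged, key=merged.get)
```

A shared probabilistic scale of this kind is what lets a centralized solver compare candidates from unrelated modules when it searches for a grid filling.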

Keywords: Crossword puzzles, Probabilistic reasoning, Information retrieval, Loopy belief propagation, Probabilistic constraint satisfaction, Posterior probability

Publication history: Available online 28 December 2001.

Official paper URL: https://doi.org/10.1016/S0004-3702(01)00114-X