Word Validation and Dictionaries

Module 4 · Lesson 3 · ~30 min · Godot 4.x

A word game is only fun if the dictionary is. Accept too few words and players feel cheated. Accept too many and obscure Scrabble trivia wins. Shipping a good dictionary is an irritating amount of work, which is why only a few game-ready word lists exist.

Pick a word list

List | Size | License | Notes
ENABLE | ~172k | Public domain | Standard for US word games. Slightly permissive (accepts some obscure words).
SOWPODS / Collins | ~276k | Proprietary | International Scrabble. Can't ship without a license.
TWL06 | ~178k | Proprietary | North American Scrabble. Licensed only.
Wiktionary dumps | varies | CC-BY-SA | Messy. Contains foreign words and proper nouns. Requires cleanup.

Ship with ENABLE. It's public domain, battle-tested, and readily available. Get it from github.com/dolph/dictionary. The file enable1.txt is under 2 MB, which is trivial to bundle.

Loading it into the game

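Copy enable1.txt to res://data/words.txt. Exported builds skip plain .txt files by default, so add *.txt to the non-resource file filter in your export preset. GDScript has no built-in Set type, so load the words at startup into a Dictionary used as a set (the keys are the words; the values are ignored):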

# scripts/dictionary.gd: register as autoload "WordDictionary"
# (Godot rejects "Dictionary" as an autoload name because it collides with the built-in type.)
extends Node

var words: Dictionary = {}

func _ready() -> void:
    _load_words()

func _load_words() -> void:
    var f := FileAccess.open("res://data/words.txt", FileAccess.READ)
    if f == null:
        push_error("Failed to open words.txt")
        return
    while not f.eof_reached():
        var line := f.get_line().strip_edges().to_lower()
        if line.length() >= 2:
            words[line] = true
    f.close()
    print("[Dictionary] loaded ", words.size(), " words")

func is_valid(word: String) -> bool:
    return words.has(word.to_lower())

Dictionary lookup is O(1) on average. On my dev machine the 172k ENABLE list loads in ~80ms; on mid-tier Android ~200ms. That is invisible during a splash screen.
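Load times vary a lot by device, so it is worth measuring on your own targets. A minimal sketch of a timing helper; time_ms is a hypothetical name, not part of the lesson's script, and it only assumes Godot 4's Time.get_ticks_msec():

```gdscript
extends Node

# Hypothetical helper: runs any Callable and returns the elapsed milliseconds.
func time_ms(work: Callable) -> int:
    var start := Time.get_ticks_msec()
    work.call()
    return Time.get_ticks_msec() - start
```

Inside the dictionary script you could call it as print(time_ms(_load_words)) in place of the direct _load_words() call in _ready.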

Autoload timing: autoloads are initialized in the order they appear in the Project Settings autoload list. If the word-list autoload has to exist before Global, put it first. Order matters only for inter-autoload dependencies.

Minimum word length

Lexicon Duel should enforce a minimum length, probably 3. Allowing 2-letter plays lets players spam "aa" for safe chip damage. Scrabble's minimum is 2, but its scoring punishes short plays; you can mirror that with damage scaling:

func compute_damage(word: String, letter_values: Array[int]) -> int:
    if word.length() < 3:
        return 0
    # reduce() returns a Variant, so declare the type instead of inferring it with :=
    var base: int = letter_values.reduce(func(a, b): return a + b, 0)
    var length_bonus := int(pow(word.length() - 2, 1.5))  # 3-letter: 1, 5-letter: 5, 7-letter: 11
    return base + length_bonus

Tune those numbers against actual hands. The Word Validator tool in this curriculum lets you test scoring formulas interactively.
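To see how the bonus curve grows before tuning it, you can tabulate it in isolation. A small sketch under the same formula as compute_damage; length_bonus is a hypothetical helper name introduced only for this illustration:

```gdscript
extends Node

# Hypothetical helper: the length bonus from compute_damage, isolated
# so it can be printed for a range of word lengths.
func length_bonus(word_length: int) -> int:
    if word_length < 3:
        return 0  # too short to score, same rule as compute_damage
    return int(pow(word_length - 2, 1.5))

func _ready() -> void:
    # Print the bonus for 3- to 8-letter words.
    for n in range(3, 9):
        print(n, "-letter bonus: ", length_bonus(n))
```

Changing the exponent (1.5 here) is the quickest way to make long words more or less dominant.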

Performance: don't rebuild the set per check

# BAD
func is_valid(word: String) -> bool:
    var f := FileAccess.open("res://data/words.txt", FileAccess.READ)
    while not f.eof_reached():
        if f.get_line().strip_edges() == word.to_lower():
            return true
    return false

# GOOD: load once in _ready, then check the in-memory dictionary
func is_valid(word: String) -> bool:
    return words.has(word.to_lower())

Obvious, but worth stating. If you ever find yourself calling FileAccess.open inside a hot path, stop and lift it to _ready.

Prefix checking (for autocomplete/hints)

A more advanced feature: "you can still make a word with these tiles" hints. For prefix queries you want a trie, not a flat set:

class_name WordTrie
extends RefCounted

var root := {}

func add(word: String) -> void:
    var node := root
    for c in word:
        if not node.has(c):
            node[c] = {}
        node = node[c]
    node["$"] = true  # marker for "word ends here"

func has_prefix(prefix: String) -> bool:
    var node := root
    for c in prefix:
        if not node.has(c): return false
        node = node[c]
    return true

func has_word(word: String) -> bool:
    var node := root
    for c in word:
        if not node.has(c): return false
        node = node[c]
    return node.has("$")

Building a trie from 172k words takes ~1s and uses more memory than a dictionary. Do it only when you actually need prefix queries (e.g., a hint system or anagram finder). For plain "is this a word" checks, the flat dictionary is faster and simpler.
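As an illustration of why prefix queries matter for hints, here is a sketch of a "can these tiles still make a word?" check. It assumes the WordTrie class above has already been filled with the word list; can_make_word, its parameters, and the choice to pass the hand as a string of letters are all hypothetical:

```gdscript
# Hypothetical hint helper. `tiles` is the hand as a string of letters, e.g. "tca".
# Tries every ordering of the tiles, but abandons a branch as soon as
# no word in the trie starts with the letters chosen so far.
func can_make_word(trie: WordTrie, tiles: String, min_length: int = 3, prefix: String = "") -> bool:
    if prefix.length() >= min_length and trie.has_word(prefix):
        return true
    for i in tiles.length():
        var candidate: String = prefix + tiles[i]
        if not trie.has_prefix(candidate):
            continue  # nothing in the list starts this way: prune the branch
        # Remove the tile just used and recurse on the remainder.
        var rest: String = tiles.substr(0, i) + tiles.substr(i + 1)
        if can_make_word(trie, rest, min_length, candidate):
            return true
    return false
```

With a flat set you would have to generate every permutation and look each one up; the has_prefix check is what lets the search stop early.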

Localization (later)

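If you ever localize Lexicon Duel, you'll swap the dictionary per language. Structure the code for that now by naming each word list after its locale (words_en.txt rather than words.txt) and falling back to English: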

func _load_words() -> void:
    var lang := TranslationServer.get_locale().substr(0, 2)  # "en", "fr"
    var path := "res://data/words_%s.txt" % lang
    if not FileAccess.file_exists(path):
        path = "res://data/words_en.txt"
    # ...

Ship English only for v1. Multilingual word games are far more work than they look: each language's dictionary has its own licensing and curation issues.

Swear filter

ENABLE contains, uh, comprehensive vocabulary. If you want a family-friendly version, maintain a res://data/words_block.txt of words to remove, and filter them out at load time. A common approach is to put this behind an opt-in "censored" toggle in settings.
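The load-time filter can be sketched as follows. This assumes words_block.txt holds one word per line and that it runs after the main list has loaded; apply_block_list and its parameters are hypothetical names:

```gdscript
# Hypothetical helper: removes every blocked word from `word_set`
# (the Dictionary used as a set) and returns how many were removed.
func apply_block_list(word_set: Dictionary, path: String = "res://data/words_block.txt") -> int:
    var f := FileAccess.open(path, FileAccess.READ)
    if f == null:
        return 0  # no block list shipped, nothing to filter
    var removed := 0
    while not f.eof_reached():
        var line := f.get_line().strip_edges().to_lower()
        # erase() returns true only if the key was present.
        if word_set.erase(line):
            removed += 1
    f.close()
    return removed
```

Dictionaries are passed by reference, so calling apply_block_list(words) at the end of _load_words, guarded by the settings toggle, modifies the loaded set in place.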

Do this now

  1. Download enable1.txt from github.com/dolph/dictionary. Save as res://data/words.txt.
  2. Create scripts/dictionary.gd. Register it as an autoload named "WordDictionary" (Godot rejects the name "Dictionary" because it collides with the built-in type).
  3. In your Hand's submit_word, replace the naive length check with:
if word.length() < 3:
    print("too short")
    return
if not WordDictionary.is_valid(word):
    print("not a word: ", word)
    return
Events.word_submitted.emit(word, total)
  4. Try submitting real and nonsense words. Try "QI": it is valid in Scrabble but invalid here, because we require length ≥ 3 (a design choice) and ENABLE does not list it anyway.

Try the Word Validator tool with some hands. It's using the same dictionary logic in the browser.