Complexity vs. Rating — Board Games

Figure 3. Heavier games climb the rating ranks. Each point is one of n = 21,229 ranked games with at least one community complexity vote (a further 492 games with no complexity votes are excluded from this figure only; full hygiene rules in ontology/SPEC.md §5). x: community complexity (weight, 1 = light to 5 = heavy); y: Bayes-weighted average rating (1–10). The y-axis shows only the occupied 3.5–9 band: the Bayes prior pulls all games toward 5.5, so no ranked game approaches 1 or 10, and the baseline is stated here rather than drawn to zero. Point area scales with log10(number of ratings); color = complexity bin, each bin closed on the left (Light < 2.0 ≤ Medium-Light < 2.5 ≤ Medium < 3.0 ≤ Medium-Heavy < 3.5 ≤ Heavy). Black curve: LOESS trend (frac = 0.35). The association is a moderate but highly significant rank-order effect (Spearman ρ = 0.39, p < 10⁻³⁰⁰), yet the conditional mean rises only ≈0.17 rating points across the full weight range. Because the Bayes prior compresses every game toward 5.5, complexity moves games up the rating *ranks* (heavy games dominate the upper tail) far more than it moves the conditional mean. Labeled points: celebrated light outliers (Azul, Codenames, The Crew) that out-rate their weight class; heavy benchmarks (Gloomhaven, Brass: Birmingham, Twilight Imperium 4e); and the lowest-rated widely played heavy game (computed, not hand-picked: lowest Bayes rating among games with weight ≥ 3.5 and ≥ 1,000 ratings).
Source: Kaggle BoardGameGeek dataset (bgg-kaggle-matrix-2022-01-18), boardgamegeek.com. Complexity and ratings are community aggregates as of the snapshot.
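
The complexity bins and the computed "lowest-rated widely-played heavy game" rule from the caption can be sketched as follows. This is a minimal illustration, not the code that produced the figure: the function names, the dict keys (name, weight, bayes_rating, num_ratings), and the sample rows are all hypothetical, and only the bin edges and the two thresholds (weight ≥ 3.5, ≥ 1,000 ratings) come from the caption.

```python
from bisect import bisect_right

# Bin edges and labels as stated in the caption; each bin is closed on the left.
BIN_EDGES = [2.0, 2.5, 3.0, 3.5]
BIN_LABELS = ["Light", "Medium-Light", "Medium", "Medium-Heavy", "Heavy"]


def complexity_bin(weight):
    """Map a community complexity weight (1 to 5) to its color bin."""
    # bisect_right counts how many edges are <= weight, so a weight that
    # sits exactly on an edge (e.g. 2.0) falls into the upper bin.
    return BIN_LABELS[bisect_right(BIN_EDGES, weight)]


def lowest_rated_heavy(games, min_weight=3.5, min_ratings=1000):
    """Return the lowest Bayes-rated game among widely-played heavy games.

    `games` is an iterable of dicts with the hypothetical keys
    'name', 'weight', 'bayes_rating', and 'num_ratings'.
    Returns None if no game passes both filters.
    """
    eligible = [
        g for g in games
        if g["weight"] >= min_weight and g["num_ratings"] >= min_ratings
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda g: g["bayes_rating"])


# Made-up rows for illustration only; these are not values from the dataset.
sample = [
    {"name": "Game A", "weight": 1.8, "bayes_rating": 7.4, "num_ratings": 50000},
    {"name": "Game B", "weight": 3.9, "bayes_rating": 8.1, "num_ratings": 30000},
    {"name": "Game C", "weight": 3.6, "bayes_rating": 5.6, "num_ratings": 1200},
    {"name": "Game D", "weight": 4.2, "bayes_rating": 5.4, "num_ratings": 400},
]

if __name__ == "__main__":
    for game in sample:
        print(game["name"], complexity_bin(game["weight"]))
    print("Lowest-rated heavy:", lowest_rated_heavy(sample)["name"])
```

In the sample rows, Game D has the lowest rating among heavy games but is skipped because it falls under the 1,000-rating floor, which is the point of the "widely-played" condition: the label goes to a game with enough ratings for its average to be meaningful.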