How the leaderboards are computed
The headline ranking on every segment page is the per-segment residual of an OLS regression. This page explains what that means, why it's the metric of choice, and the caveats attached to it.
Why residuals, not raw watches
Letterboxd watch counts correlate strongly with theatrical reach: a film released in 50 countries with a $200M box office will collect millions of LB watches almost regardless of how good it is. Ranking by raw watches mostly ranks by marketing budget.
Residuals strip out the expected effect of theatrical reach and ask the more interesting question: given how widely this film was released, how much more (or less) did Letterboxd embrace it? A positive residual means LB punched above the regression line; a negative one means it punched below.
The two metrics
The site sorts segments by per-segment Metric B residual. Each release type fits its own regression, so a streaming-only film's residual is interpreted relative to other streaming-only films, not against blockbusters.
What residual_b_segment means
For a film in wide_theatrical:
- A residual of +1.0 means its log-watches are 1 above what the wide-theatrical regression predicts from its country count, year, and age — i.e. it's roughly e¹ ≈ 2.7× more watched than expected.
- 0 is exactly on the regression line.
- −1.0 means it's roughly 2.7× less watched than expected.
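The conversion from a residual to a "times more watched than expected" figure is just the exponential of the residual, since the regression works on natural-log watches. A small sketch follows; the helper name residual_to_multiplier is hypothetical and not part of the pipeline.

```python
import math


def residual_to_multiplier(residual):
    """Convert a log-scale residual into a ratio of observed to expected watches."""
    return math.exp(residual)


for r in (1.0, 0.0, -1.0):
    print(f"residual {r:+.1f} -> {residual_to_multiplier(r):.2f}x expected watches")
```

A residual of −1.0 gives a multiplier of about 0.37, which is the same statement as "roughly 2.7× less watched", because 1 / 2.72 ≈ 0.37.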
What different residual values look like
Concrete examples from the current dataset, picked to give the magnitudes some weight.
Important caveats (don't read these as causal)
- Pandemic distortion: 2020-2022 films are handled by a year fixed effect, but the box-office collapse was uneven across segments. We don't add a covid_era × release_type interaction in v1; high-residual 2020-2022 films may be partly artefacts of comparing pandemic-suppressed grosses.
- Watches are cumulative, not lifetime-normalised: a film released last month and a film released five years ago aren't directly comparable on watch count. The log(months_age) covariate partially adjusts. v2+ will switch to "watches at fixed film age" once enough monthly snapshots accumulate.
- Letterboxd selection bias: LB's user base skews young, cinephile, English-fluent. Residuals reveal that demographic's preferences, not "audience appeal" generally.
- No causal claim about features: directors and cast are not in the regression; this is intentional (Q15). Aggregate residuals per director/country/theme are descriptive in notebooks/04_aggregate_residuals.ipynb, not headline output.
- Marketing, awards, critic reception confound everything: a film with a heavy critical campaign over-performs on LB even adjusting for theatrical reach. We can't separate that from LB-specific demographic preference without extra signals.
Per-segment film counts
Films currently in the residual fit per segment:
Distribution of residuals per segment
Residuals are mean-zero by construction (OLS includes an intercept), so the centre of each distribution sits on 0. A wider spread means the segment has more variability in LB outcome relative to theatrical reach.
A segment with fewer than 20 films in the residual fit gets no per-segment residual (films keep their pooled residual_b instead). The cutoff is set by MIN_SEGMENT in src/pipeline/compute_residuals.py.
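The fallback rule can be sketched as follows. This is an assumption about how the selection might look, not the code in src/pipeline/compute_residuals.py: the helper headline_residual, the segment_counts mapping, and the example numbers are all hypothetical, and only the names MIN_SEGMENT, residual_b, and residual_b_segment and the threshold of 20 come from this page.

```python
MIN_SEGMENT = 20  # threshold stated on this page; the real constant lives in the pipeline


def headline_residual(film, segment_counts):
    """Return the per-segment residual if the segment is large enough, else the pooled one."""
    n_films = segment_counts.get(film["release_type"], 0)
    if n_films >= MIN_SEGMENT and film.get("residual_b_segment") is not None:
        return film["residual_b_segment"]
    # Segment too small (or no per-segment value): keep the pooled Metric B residual.
    return film["residual_b"]


# Made-up counts and residuals, purely for illustration.
segment_counts = {"wide_theatrical": 412, "festival": 12}
film_a = {"release_type": "wide_theatrical", "residual_b": 0.4, "residual_b_segment": 0.9}
film_b = {"release_type": "festival", "residual_b": -0.2, "residual_b_segment": None}

print(headline_residual(film_a, segment_counts))
print(headline_residual(film_b, segment_counts))
```

Here film_a uses its per-segment value, while film_b falls back to the pooled residual because its segment has only 12 films.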
Pipeline stages
- build_universe: TMDB discover ∪ Box Office Mojo, deduped by tmdb_id
- enrich_metadata: TMDB per-film metadata + watch/providers (cached)
- scrape_letterboxd: Playwright per-film page (watches, likes, themes)
- the_numbers: best-effort budget gap-fill
- classify_release_type: wide / limited / festival / streaming / hybrid
- merge: single films table + validate outlier flags
- compute_residuals: Metric A + B + per-segment, written back to films.parquet
Source: GitHub repo (private until publish-decision).