⚽ FIFA World Cup 2026 Dataset

Comprehensive historical & predictive data for the 2026 FIFA World Cup — covering all 22 editions (1930–2022), 49,000+ international matches, and a full 48-team 2026 pre-tournament snapshot.

Overview & Business Goals

This is a public sandbox dataset for individuals exploring the Delphina analytics product. It covers international men’s football (soccer) with a focus on the FIFA World Cup — both historical (1930–2022) and the upcoming 2026 edition.

Business Goals

  • Support fans asking about World Cup history, team records, and match results
  • Enable 2026 tournament research — groups, predictions, qualifying stats, coaches
  • Provide historical depth for cross-tournament comparisons and all-time records
  • Support match predictions using FIFA rankings, ELO ratings, and qualifying form

Key Entities

  • Matches — 49,000+ international results with scores, venue, and tournament context
  • Goals — 47,600+ individual goal events with scorer, minute, and type (penalty/own goal)
  • Teams — 48 qualified nations for 2026 with rankings, squad values, and qualifying stats
  • Tournaments — 22 World Cup editions (1930–2022) with champions, top scorers, and formats
  • Coaches — 48 head coaches with tactical profiles, career background, and WC experience
  • Shootouts — 678 penalty shootout records from 1967 to present

2026 Tournament Quick Facts

Detail | Info
Dates | June 11 – July 19, 2026
Hosts | United States, Canada, Mexico (tri-hosted)
Teams | 48 (first expanded format; previously 32)
Groups | 12 groups (A–L) of 4 teams each
Total Matches | 104
Format | Group stage → Round of 32 → Round of 16 → QF → SF → Final
Advancement | Top 2 per group (24) + 8 best third-place teams → 32 in knockout
Final Venue | MetLife Stadium, New Jersey
Opening Match | Mexico vs South Africa, Estadio Azteca
Defending Champion | Argentina
Host Cities | 16 cities across 3 countries (11 USA, 2 Canada, 3 Mexico)
Debut Teams | Curaçao, Cape Verde, Uzbekistan, Jordan

Data Sources & Warehouse Schema

All tables reside in the SPORTS.FIFA_2026 schema in Snowflake. Data is sourced from multiple providers:
Source | Used For
Wikipedia (FIFA WC pages) | Match results, squad lists, tournament stats
World Football Elo Ratings (eloratings.net) | Pre-1993 team strength proxy + current ELO
FIFA Official Rankings (April 2026) | Current team strength (primary indicator)
Transfermarkt methodology | Squad market values (estimates for pre-2002)
ESPN / Yahoo Sports | 2026 group draw confirmation
ELO vs FIFA Rankings: FIFA rankings began in 1993 — they are unavailable for pre-1993 tournaments. ELO ratings go back to the 1870s and serve as a strength proxy for historical analysis. For 2026 analysis, FIFA rankings are the primary signal; ELO is secondary context.

All 17 Tables

Global Tables (All International Football)

Cover all competitions — World Cups, friendlies, continental tournaments, qualifiers. Not limited to World Cup data.
Table | Description | Rows
RESULTS | One row per international match with scores, venue, tournament context. Spans 1872–2026. | ~49,477
GOALSCORERS | One row per goal: scorer, minute, penalty/own-goal flags. Spans 1916–2026. | ~47,663
SHOOTOUTS | One row per penalty shootout: teams and winner. Spans 1967–2026. | ~678
FORMER_NAMES | Maps historical/alternate team names to current canonical names. Static reference. | 46

WC_ Historical Tables (All World Cups 1930–2022)

Cross-tournament comparisons, all-time records, and historical World Cup analysis.
Table | Description | Rows
WC_TOURNAMENTS | One row per World Cup edition: host, champion, goals, attendance, format era. | 22
WC_TEAM_APPEARANCES | One row per team per WC year: results, goals, ELO, stage reached. | ~193
WC_TEAM_ALLTIME_STATS | All-time WC aggregate stats per team: wins, goals, titles, best finish. | 43
WC_MATCHES_HISTORICAL | Curated key WC matches (finals, semis, selected) with pre-match ELO. | 117
WC_HEAD_TO_HEAD | Pre-aggregated head-to-head WC records for 19 prominent teams. | ~61
WC_TOP_SCORES_BY_EDITION | Top 3 goal scorers per WC edition with goals, assists, position. | ~71

WC_2026_ Tables (2026 Tournament Only)

Pre-tournament snapshot for the upcoming 48-team expanded format.
Table | Description | Rows
WC_2026_GROUPS | 48 teams × 12 groups. Rankings, titles, key player, squad value. | 48
WC_2026_FIXTURES | Full 104-match schedule: group stage + knockouts, kickoff UTC, stadium, city. | 104
WC_2026_PREDICTION_FEATURES | Prediction model features: rankings, form, qualifying stats, win probability. | 48
WC_2026_GROUP_DIFFICULTY | Group difficulty scores: opponent ELO, qualification probability, difficulty labels. | 48
WC_2026_COACHES | Head coach profiles: nationality, age, style, WC experience, achievements. | 48
WC_2026_QUALIFYING_SUMMARY | Qualifying campaign stats: goals, points, W/D/L, route, key results. | ~50
WC_2026_TEAMS_SNAPSHOT | Team reference: confederation, ranking, appearances, titles, qualification method. | ~50

Key Metrics

Scoring Metrics

Metric | Description
Total Goals | Count of all goals scored. Source: GOALSCORERS. Filter for WC-only via RESULTS join.
Goals Per Match | AVG(HOME_SCORE + AWAY_SCORE). Typical WC average: ~2.5 goals/match.
Penalty Goals | COUNT(*) WHERE PENALTY = TRUE. ~7% of all goals. Excludes shootout goals.
Own Goals | COUNT(*) WHERE OWN_GOAL = TRUE. ~2% of all goals. TEAM = benefiting team.

Team Performance Metrics

Metric | Description
Win Rate | Wins / Total matches × 100. Unpivot RESULTS into per-team rows.
Goal Difference | Goals scored − goals conceded. FIFA group-stage tiebreaker #4.
Clean Sheets | Matches where team conceded 0 goals. Defensive solidity measure.
Goals Scored / Conceded | Team-level goal tallies from RESULTS (home + away combined).
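The "unpivot RESULTS into per-team rows" step behind these metrics can be sketched in Python. This is a minimal illustration on hypothetical sample rows, not the warehouse query itself; the tuple layout (home team, away team, home score, away score) and the helper names `unpivot`, `win_rate`, and `goal_difference` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows shaped like RESULTS:
# (home_team, away_team, home_score, away_score)
matches = [
    ("Brazil", "Chile", 3, 0),
    ("Chile", "Peru", 1, 1),
    ("Peru", "Brazil", 0, 2),
]

def unpivot(rows):
    """Yield one (team, goals_for, goals_against) row per team per match."""
    for home, away, home_score, away_score in rows:
        yield home, home_score, away_score
        yield away, away_score, home_score

stats = defaultdict(lambda: {"played": 0, "wins": 0, "gf": 0, "ga": 0, "clean_sheets": 0})
for team, gf, ga in unpivot(matches):
    s = stats[team]
    s["played"] += 1
    s["wins"] += gf > ga           # a bool adds as 0 or 1
    s["gf"] += gf
    s["ga"] += ga
    s["clean_sheets"] += ga == 0   # conceded nothing in this match

def win_rate(team):
    """Wins / total matches x 100."""
    return 100 * stats[team]["wins"] / stats[team]["played"]

def goal_difference(team):
    """Goals scored minus goals conceded."""
    return stats[team]["gf"] - stats[team]["ga"]
```

Each match contributes two rows after the unpivot, so home and away appearances are counted together for every team.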

2026 Prediction Metrics

Metric | Description
Win Probability | Model-predicted % of winning WC 2026. Range: 0.1% – 14.2% (France). Sums to ~132.5%.
Squad Market Value | Total squad value (EUR millions). Range: €35M (Curaçao) – €1,350M (France).
Qualification Probability | Probability of advancing past group stage (0–100%). From GROUP_DIFFICULTY table.
Recent Form | Points from last 10 intl matches (3/win, 1/draw). Range: 12–24.
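Two of these metrics lend themselves to a short sketch: rescaling win probabilities that do not sum to 100%, and scoring recent form at 3 points per win and 1 per draw. The helper names `normalize` and `form_points` are hypothetical, and apart from France's 14.2 the input values are invented for illustration.

```python
def normalize(win_prob_pct):
    """Rescale raw win percentages so that they sum to 100."""
    total = sum(win_prob_pct.values())
    return {team: 100 * p / total for team, p in win_prob_pct.items()}

def form_points(results):
    """Points from a sequence of 'W'/'D'/'L' results: 3 per win, 1 per draw."""
    return sum({"W": 3, "D": 1, "L": 0}[r] for r in results)

# Hypothetical raw values; only France's 14.2 comes from the table above.
raw = {"France": 14.2, "Team B": 10.0, "Team C": 4.2}
shares = normalize(raw)

# Hypothetical last-10 record: 6 wins, 3 draws, 1 loss.
last10 = form_points("WWDWLWDWWD")
```

Applied to the full 48-team column, the same rescaling would turn France's 14.2 into roughly 14.2 / 132.5, or about 10.7%.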

Dimensions & Analytical Notes

Goal Type

Classifies goals into 3 mutually exclusive categories:
  • Open Play (~91%) — standard goals
  • Penalty (~7%) — from the penalty spot
  • Own Goal (~2%) — scored into own net
CASE
  WHEN OWN_GOAL = TRUE THEN 'Own Goal'
  WHEN PENALTY = TRUE THEN 'Penalty'
  ELSE 'Open Play'
END
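For analysis outside the warehouse, the same classification can be mirrored in Python. This is a small sketch assuming the PENALTY and OWN_GOAL flags arrive as booleans; the function name `goal_type` and the sample flags are hypothetical.

```python
from collections import Counter

def goal_type(penalty: bool, own_goal: bool) -> str:
    """Mirror the CASE expression: own goal is checked first, then penalty."""
    if own_goal:
        return "Own Goal"
    if penalty:
        return "Penalty"
    return "Open Play"

# Hypothetical (penalty, own_goal) flags for five goals.
goals = [(False, False), (True, False), (False, True), (False, False), (True, True)]
breakdown = Counter(goal_type(p, og) for p, og in goals)
```

Checking the own-goal flag first is what keeps the categories mutually exclusive: a row with both flags set lands in exactly one bucket.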

Match Era

Segments matches by historical era:
  • WC2026 Era — 2026+
  • Post-Pandemic — 2020–2025
  • Modern — 2010–2019
  • Post-Cold War — 1990–2009
  • Classic — 1970–1989
  • Early — pre-1970
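The era boundaries listed above translate directly into a lookup on the match year. The sketch below assumes the year has already been extracted from the match date; the function name `match_era` is hypothetical.

```python
def match_era(year: int) -> str:
    """Map a match year to the era labels listed above."""
    if year >= 2026:
        return "WC2026 Era"
    if year >= 2020:
        return "Post-Pandemic"
    if year >= 2010:
        return "Modern"
    if year >= 1990:
        return "Post-Cold War"
    if year >= 1970:
        return "Classic"
    return "Early"
```

Testing from the most recent era downward means each branch only needs a lower bound.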

Match Stage (WC 2026)

7 stages in the 2026 fixture schedule:
Stage | Matches
group-stage | 72
round-of-32 | 16
round-of-16 | 8
quarter-finals | 4
semi-finals | 2
third-place | 1
final | 1
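These counts follow from the format itself: 12 round-robin groups of 4, then 32 qualifiers halving each round. The short calculation below is a sanity check, not part of the dataset; the variable names are hypothetical.

```python
from math import comb

GROUPS, TEAMS_PER_GROUP = 12, 4

# Round-robin: every pair in a group meets once, so 6 matches per group.
stages = {"group-stage": GROUPS * comb(TEAMS_PER_GROUP, 2)}

# Top two per group plus the 8 best third-placed teams reach the knockouts.
teams_left = GROUPS * 2 + 8
for name in ["round-of-32", "round-of-16", "quarter-finals", "semi-finals"]:
    stages[name] = teams_left // 2   # each match eliminates one team
    teams_left //= 2

stages["third-place"] = 1
stages["final"] = 1
total_matches = sum(stages.values())
```

The total matches the 104 rows expected in WC_2026_FIXTURES.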

Host Cities (16)

Fixtures by city (top venues):
  • Dallas (9) — AT&T Stadium
  • Los Angeles (8) — SoFi Stadium
  • Atlanta (8) — Mercedes-Benz Stadium
  • New York (8) — MetLife Stadium (Final)
  • Boston, Miami, Houston, Vancouver (7 each)
  • + 8 more cities (4–6 fixtures each)

Key Rules & Conventions

1. Country Name Normalization

Always use FORMER_NAMES to map historical team names to current canonical names. Never use CASE WHEN for name mapping — always LEFT JOIN.
-- Normalize team names via join (repeat with a second alias, e.g. fn_away, for AWAY_TEAM)
SELECT
  r.DATE,
  COALESCE(fn_home.CURRENT_NAME, r.HOME_TEAM) AS HOME_TEAM_NORMALIZED
FROM SPORTS.FIFA_2026.RESULTS r
LEFT JOIN SPORTS.FIFA_2026.FORMER_NAMES fn_home
  ON r.HOME_TEAM = fn_home.FORMER_NAME
  AND (r.DATE BETWEEN fn_home.DATE_FROM AND fn_home.DATE_TO
       OR fn_home.DATE_FROM IS NULL);
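The join logic can be illustrated in Python: a former name maps to the current one only when the match date falls inside the validity window, or when the window has no start date, and otherwise the recorded name is kept, as COALESCE does. The rows below use placeholder names and dates, not values from FORMER_NAMES, and `normalize_team` is a hypothetical helper.

```python
from datetime import date

# Placeholder rows shaped like FORMER_NAMES:
# (current_name, former_name, date_from, date_to)
FORMER_NAMES = [
    ("Newland", "Oldland", date(1950, 1, 1), date(1990, 12, 31)),
    ("Newland B", "Oldland B", None, None),   # no date window: always applies
]

def normalize_team(name, match_date):
    """Return the canonical name, falling back to the recorded name."""
    for current, former, start, end in FORMER_NAMES:
        if name != former:
            continue
        in_window = start is not None and end is not None and start <= match_date <= end
        if start is None or in_window:
            return current
    return name   # no mapping matched, like COALESCE(..., r.HOME_TEAM)
```

A name outside its validity window is left untouched, which matters when the same name was reused by different teams in different periods.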

2. FIFA World Cup Filter

RESULTS and GOALSCORERS contain all competitions. To scope to World Cup only:
-- RESULTS: direct filter
SELECT *
FROM SPORTS.FIFA_2026.RESULTS
WHERE TOURNAMENT = 'FIFA World Cup';

-- GOALSCORERS: requires join (no TOURNAMENT column)
SELECT g.*
FROM SPORTS.FIFA_2026.GOALSCORERS g
JOIN SPORTS.FIFA_2026.RESULTS r
  ON g.DATE = r.DATE AND g.HOME_TEAM = r.HOME_TEAM AND g.AWAY_TEAM = r.AWAY_TEAM
WHERE r.TOURNAMENT = 'FIFA World Cup';
Do NOT use ILIKE '%World Cup%' — this leaks 8,771+ qualification rows.
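A toy comparison shows why the pattern match over-selects. The tournament labels below are sample values; the qualification label in particular is an assumption about how those rows are named in RESULTS.

```python
# Sample TOURNAMENT values; "FIFA World Cup qualification" is an assumed label.
tournaments = [
    "FIFA World Cup",
    "FIFA World Cup qualification",
    "Friendly",
    "FIFA World Cup",
]

# Exact equality, as in WHERE TOURNAMENT = 'FIFA World Cup'
exact = [t for t in tournaments if t == "FIFA World Cup"]

# Case-insensitive substring match, as in ILIKE '%World Cup%'
loose = [t for t in tournaments if "world cup" in t.lower()]

leaked = len(loose) - len(exact)   # rows the pattern match adds by mistake
```

Any label that merely contains the phrase is swept in by the substring match, which is how qualification rows end up in World Cup totals.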

3. Completed Matches Filter

RESULTS contains ~52 unplayed future 2026 WC fixtures with NULL scores. Always filter:
WHERE HOME_SCORE IS NOT NULL AND AWAY_SCORE IS NOT NULL
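The effect of skipping this filter is easy to reproduce. The sketch below uses Python's built-in sqlite3 as a stand-in for Snowflake with three hypothetical rows: unplayed fixtures inflate the match count and drag down goals per match, because SUM skips the NULL scores while COUNT(*) still counts the row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE results (home_team TEXT, away_team TEXT, "
    "home_score INTEGER, away_score INTEGER)"
)
con.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", [
    ("Team A", "Team B", 2, 1),
    ("Team C", "Team D", 0, 0),
    ("Team A", "Team C", None, None),   # unplayed future fixture
])

GOALS_PER_MATCH = "SELECT COUNT(*), 1.0 * SUM(home_score + away_score) / COUNT(*) FROM results"
COMPLETED = " WHERE home_score IS NOT NULL AND away_score IS NOT NULL"

all_rows, naive_avg = con.execute(GOALS_PER_MATCH).fetchone()
played, filtered_avg = con.execute(GOALS_PER_MATCH + COMPLETED).fetchone()
```

With the filter applied, both the denominator and the average reflect only matches that were actually played.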

4. Group Stage Validation

Every group must have exactly 4 teams. 12 groups × 4 = 48 teams. Validate:
SELECT GROUP_NAME, COUNT(*) AS team_count
FROM SPORTS.FIFA_2026.WC_2026_GROUPS
GROUP BY GROUP_NAME
HAVING COUNT(*) != 4;  -- Should return 0 rows
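One gap in a HAVING-based check is that a group with no rows at all never appears in the output. The Python sketch below also flags missing groups. It assumes GROUP_NAME holds the bare letters A through L, and the team names and the helper `invalid_groups` are hypothetical.

```python
from collections import Counter
from string import ascii_uppercase

EXPECTED_GROUPS = list(ascii_uppercase[:12])   # "A" through "L" (assumed labels)

def invalid_groups(rows):
    """Return {group: team_count} for every group without exactly 4 teams."""
    counts = Counter(group for group, _team in rows)
    names = sorted(set(EXPECTED_GROUPS) | set(counts))
    return {g: counts[g] for g in names if counts[g] != 4}

# Hypothetical (group, team) rows: 12 groups x 4 placeholder teams.
rows = [(g, f"Team {g}{i}") for g in EXPECTED_GROUPS for i in range(1, 5)]
```

An empty dictionary means all 12 groups are present with 4 teams each. A group that is missing entirely shows up with a count of 0.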

5. Prediction Strength Signals (Priority Order)

  1. FIFA World Ranking (FIFA_RANK_APR2026) — primary strength indicator
  2. Recent form (RECENT_FORM_PTS_LAST10) — last 10 match results
  3. Qualifying campaign stats (QUALIFYING_GF/GA/PTS) — competitive performance
  4. ELO ratings — background context only; not primary predictor

6. WC 2026 Tiebreaker Order

When teams are equal on points in a group:
  1. Head-to-head points between tied teams
  2. Head-to-head goal difference
  3. Head-to-head goals scored
  4. Overall goal difference (all group matches)
  5. Overall goals scored (all group matches)
  6. Team conduct score (fewest cards)
  7. FIFA ranking (final fallback)
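The ordering above can be sketched as a composite sort key. This is a simplified illustration rather than the official procedure: it applies each criterion once to the set of teams level on points and does not re-apply the head-to-head steps to a smaller subset that remains tied. The names `table` and `rank_group`, the sample results, and the convention that a higher conduct score is better are all assumptions.

```python
from itertools import groupby

def table(matches, teams):
    """[points, goal difference, goals for], counting only matches among `teams`."""
    t = {team: [0, 0, 0] for team in teams}
    for home, away, hg, ag in matches:
        if home in t and away in t:
            for team, gf, ga in ((home, hg, ag), (away, ag, hg)):
                t[team][0] += 3 if gf > ga else 1 if gf == ga else 0
                t[team][1] += gf - ga
                t[team][2] += gf
    return t

def rank_group(matches, conduct, fifa_rank):
    teams = {m[0] for m in matches} | {m[1] for m in matches}
    overall = table(matches, teams)
    by_points = sorted(teams, key=lambda x: -overall[x][0])
    ranking = []
    for _, tied in groupby(by_points, key=lambda x: overall[x][0]):
        tied = list(tied)
        h2h = table(matches, tied)   # mini-table among the tied teams only
        tied.sort(key=lambda x: (
            -h2h[x][0], -h2h[x][1], -h2h[x][2],   # criteria 1-3: head-to-head
            -overall[x][1], -overall[x][2],       # criteria 4-5: all group matches
            -conduct[x],                          # criterion 6: higher score assumed better
            fifa_rank[x],                         # criterion 7: lower rank number is better
        ))
        ranking.extend(tied)
    return ranking

# Hypothetical group: A, B and C all finish on 6 points.
matches = [
    ("A", "B", 1, 0), ("A", "C", 0, 1), ("A", "D", 6, 0),
    ("B", "C", 2, 0), ("B", "D", 4, 0), ("C", "D", 1, 0),
]
conduct = {"A": 0, "B": 0, "C": 0, "D": 0}
fifa_rank = {"A": 1, "B": 2, "C": 3, "D": 4}
order = rank_group(matches, conduct, fifa_rank)
```

In this sample, team A has the best overall goal difference (+6 against B's +5), yet B ranks first because head-to-head goal difference among the three tied teams is consulted before overall goal difference.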

7. WC_2026_TEAMS_SNAPSHOT Deduplication

Morocco and Senegal have duplicate rows. Always filter:
WHERE NOTES != 'See above' OR NOTES IS NULL
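The `OR NOTES IS NULL` clause matters because a comparison against NULL is neither true nor false in SQL, so `NOTES != 'See above'` on its own silently drops rows whose NOTES is NULL. The sketch below demonstrates this with Python's built-in sqlite3 as a stand-in for Snowflake; the note text and the assignment of a NULL note to one team are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE snapshot (team TEXT, notes TEXT)")
con.executemany("INSERT INTO snapshot VALUES (?, ?)", [
    ("Morocco", "Hypothetical note text"),
    ("Morocco", "See above"),    # duplicate row to be removed
    ("Senegal", None),           # genuine row with no note
    ("Senegal", "See above"),    # duplicate row to be removed
])

def teams(where):
    query = f"SELECT team FROM snapshot WHERE {where} ORDER BY team"
    return [row[0] for row in con.execute(query)]

naive = teams("notes != 'See above'")
safe = teams("notes != 'See above' OR notes IS NULL")
```

The naive filter loses the team whose only genuine row has a NULL note, while the documented filter keeps exactly one row per team.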

Common Analyses

Question Type | Primary Tables | Key Filters / Notes
WC 2026 group compositions | WC_2026_GROUPS | ORDER BY GROUP_NAME, FIFA_RANK_APR2026
2026 match schedule / fixtures | WC_2026_FIXTURES | STAGE filter for group vs knockout
Team strength / predictions | WC_2026_PREDICTION_FEATURES | Win prob sums to ~132.5%, not 100%
Group difficulty rankings | WC_2026_GROUP_DIFFICULTY | Higher DIFFICULTY_INDEX = easier path for team
Coach profiles / tactics | WC_2026_COACHES | COACHING_STYLE is free-text (47 distinct values)
Qualifying campaign analysis | WC_2026_QUALIFYING_SUMMARY | Exclude QUALIFIED_AS = 'Playoff Finalist'
Historical WC match results | RESULTS | WHERE TOURNAMENT = 'FIFA World Cup'
All-time WC team records | WC_TEAM_ALLTIME_STATS | Pre-aggregated; 43 teams through 2022
Head-to-head WC history | WC_HEAD_TO_HEAD | Check both orderings (A vs B and B vs A)
Top scorers by edition | WC_TOP_SCORES_BY_EDITION | Covers 1930–2022; no 2026 data yet
Goal scorers (all competitions) | GOALSCORERS | Filter SCORER IS NOT NULL AND SCORER != 'None'
Penalty shootout history | SHOOTOUTS | 678 shootouts from 1967 to present
Tournament summaries | WC_TOURNAMENTS | 22 editions; FORMAT_ERA classifies structure
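The "check both orderings" note for WC_HEAD_TO_HEAD can be handled with a lookup that tries the pairing in both directions and flips the record when it is stored the other way round. The column layout, the placeholder teams, the numbers, and the helper `head_to_head` are all hypothetical; the real table's columns may differ.

```python
# Hypothetical rows: (team_a, team_b, team_a_wins, draws, team_b_wins)
H2H = [
    ("Team A", "Team B", 3, 1, 2),
    ("Team C", "Team A", 0, 2, 4),
]

def head_to_head(team, opponent):
    """Return the record from `team`'s point of view, or None if the pair is absent."""
    for a, b, a_wins, draws, b_wins in H2H:
        if (a, b) == (team, opponent):
            return {"wins": a_wins, "draws": draws, "losses": b_wins}
        if (a, b) == (opponent, team):
            # Stored in the opposite direction: swap wins and losses.
            return {"wins": b_wins, "draws": draws, "losses": a_wins}
    return None
```

A `None` result means the pairing is not stored in either direction, which is expected for teams outside the 19 curated nations.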

Known Limitations & Data Quality

Important data quality notes to keep in mind:
  • WC_MATCHES_HISTORICAL is curated, not complete — 117 key matches (finals, semis, selected); for full WC match coverage use RESULTS WHERE TOURNAMENT = 'FIFA World Cup' (~1,036 matches)
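  • Win probabilities sum to ~132.5%: not a calibrated probability distribution. Rounding across 48 teams is cited as the cause, but if values are stored to one decimal place (as the 0.1%–14.2% range suggests), rounding can move the total by at most about ±2.4 points, so treat the values as relative scores and rescale them to 100% before presenting them as probabilities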
  • WC_2026_TEAMS_SNAPSHOT has duplicates — Morocco and Senegal each appear twice; filter with WHERE NOTES != 'See above' OR NOTES IS NULL
  • GOALSCORERS has ~48 rows with NULL/None scorer — filter for scorer-level analysis
  • GOALSCORERS MINUTE is NULL for ~256 rows — older historical matches lacking minute-level data
  • Squad market values pre-2002 are estimates — based on historical equivalence methodology
  • WC_HEAD_TO_HEAD is not fully symmetric — some pairings only appear in one direction; check both orderings
  • WC_HEAD_TO_HEAD covers only 19 curated teams — historically prominent nations only
  • QUALIFYING_SUMMARY scope mismatch — W/D/L columns cover a broader campaign than QUALIFYING_PLAYED (group-stage only for UEFA); don’t re-derive QUALIFYING_PTS from W/D/L
  • RESULTS contains ~52 unplayed future fixtures — always filter HOME_SCORE IS NOT NULL for completed matches
  • 2 duplicate rows in RESULTS — Gibraltar vs Cayman Islands on 2026-06-06 (differ only in CITY)
  • BEST_FINISH_ENCODED inconsistency — England has WC_TITLES=1 but BEST_FINISH_ENCODED=6 (should be 7)

Schema: SPORTS.FIFA_2026 (Snowflake) | Data through: Active — updated as 2026 WC matches are played (June–July 2026) | Scope: Men’s full internationals only