Clean, structured and automatically updated football (soccer) dataset built from Transfermarkt data -- 79,000+ games, 37,000+ players, 1,800,000+ appearances and more, refreshed weekly.
🌍 New: International football data -- the dataset now includes `countries`, `national_teams`, national team competition games (🏆 World Cup, UEFA Euro, Copa América, AFCON, AFC Asian Cup), and `international_caps` / `international_goals` / `current_national_team_id` on every player profile.
The dataset is composed of 12 tables covering competitions, games, clubs, players, appearances, player valuations, club games, game events, game lineups, transfers, countries and national teams. Each table contains the attributes of the entity and IDs that can be used to join them together.
| Table | Description | Scale |
|---|---|---|
| competitions | Leagues, tournaments and national team competitions | 40+ |
| clubs | Club details, squad size, market value | 400+ |
| players | Player profiles, positions, market values, international caps | 37,000+ |
| games | Match results, lineups, attendance | 79,000+ |
| appearances | One row per player per game played | 1,800,000+ |
| player_valuations | Historical market value records | 500,000+ |
| club_games | Per-club view of each game | 150,000+ |
| game_events | Goals, cards, substitutions | 1,100,000+ |
| game_lineups | Starting and bench lineups | 2,800,000+ |
| transfers | Player transfers between clubs | 87,000+ |
| countries | Country details and confederation membership | 100+ |
| national_teams | National team profiles, squad size, FIFA ranking | 100+ |
ER diagram
classDiagram
direction LR
competitions --|> games : competition_id
competitions --|> clubs : domestic_competition_id
clubs --|> players : current_club_id
clubs --|> club_games : opponent/club_id
clubs --|> game_events : club_id
players --|> appearances : player_id
players --|> game_events : player_id
players --|> player_valuations : player_id
games --|> appearances : game_id
games --|> game_events : game_id
games --|> clubs : home/away_club_id
games --|> club_games : game_id
countries --|> national_teams : country_id
national_teams --|> players : current_national_team_id
class competitions {
competition_id
type
}
class games {
game_id
home/away_club_id
competition_id
}
class game_events {
game_id
player_id
}
class clubs {
club_id
domestic_competition_id
}
class club_games {
club_id
opponent_club_id
game_id
}
class players {
player_id
current_club_id
current_national_team_id
international_caps
international_goals
}
class player_valuations{
player_id
}
class appearances {
appearance_id
player_id
game_id
}
class countries {
country_id
country_name
confederation
}
class national_teams {
national_team_id
country_id
confederation
fifa_ranking
}
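The relationships in the diagram translate directly into joins on the shared ID columns. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module with a tiny stand-in schema and made-up rows (`Player A`, `COMP1`, and so on are placeholders, not real data), so that it runs without downloading anything. The query itself is plain SQL and should carry over to the real tables, assuming the column names match the diagram.

```python
import sqlite3

# Stand-in schema: only key columns from the ER diagram, filled with made-up rows.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT, current_club_id INTEGER);
CREATE TABLE games (game_id INTEGER PRIMARY KEY, competition_id TEXT);
CREATE TABLE appearances (appearance_id TEXT PRIMARY KEY, player_id INTEGER, game_id INTEGER);

INSERT INTO players VALUES (1, 'Player A', 10), (2, 'Player B', 20);
INSERT INTO games VALUES (100, 'COMP1'), (101, 'COMP1'), (102, 'COMP2');
INSERT INTO appearances VALUES
    ('100_1', 1, 100), ('101_1', 1, 101), ('102_1', 1, 102), ('100_2', 2, 100);
""")

# appearances links to players via player_id and to games via game_id,
# so two joins are enough to count appearances per player and competition.
APPEARANCES_PER_COMPETITION = """
SELECT p.name, g.competition_id, COUNT(*) AS appearances
FROM appearances AS a
JOIN players AS p ON p.player_id = a.player_id
JOIN games AS g ON g.game_id = a.game_id
GROUP BY p.name, g.competition_id
ORDER BY p.name, g.competition_id
"""

rows = con.execute(APPEARANCES_PER_COMPETITION).fetchall()
for row in rows:
    print(row)
```

The same pattern applies to the other edges in the diagram, for example `players.current_club_id` to `clubs.club_id`, or `national_teams.country_id` to `countries.country_id`.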
The fastest way to explore the dataset is to download the DuckDB database file -- a single file containing all 12 tables, ready to query.
Or via the command line:
curl -LO https://pub-e682421888d945d684bcae8890b0ec20.r2.dev/data/transfermarkt-datasets.duckdb

Open the file with the DuckDB CLI, Python, R, or any compatible client:
-- DuckDB CLI
-- $ duckdb transfermarkt-datasets.duckdb
SHOW TABLES;
SELECT player_id, name, position, market_value_in_eur
FROM players
WHERE position = 'Attack'
ORDER BY market_value_in_eur DESC
LIMIT 10;
-- player_id | name | position | market_value_in_eur
-- 418560 | Erling Haaland | Attack | 200000000
-- 342229 | Kylian Mbappé | Attack | 180000000
-- 371998 | Vinicius Junior | Attack | 180000000
-- 433177 | Bukayo Saka | Attack | 130000000
-- ...

Tip: You can also query individual CSV files remotely with DuckDB -- no download required:

INSTALL httpfs;
LOAD httpfs;
SELECT *
FROM read_csv_auto('https://pub-e682421888d945d684bcae8890b0ec20.r2.dev/data/players.csv.gz')
LIMIT 10;
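For Python users, the top-players query above could be run roughly as follows. This is a minimal sketch, assuming the third-party `duckdb` package is installed (`pip install duckdb`) and the database file has been downloaded to the working directory. The helper names `top_players_sql` and `top_players` are made up for this example and are not part of the project.

```python
def top_players_sql(limit: int = 10) -> str:
    """Build the top-players query with a validated LIMIT (illustrative helper)."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT player_id, name, position, market_value_in_eur\n"
        "FROM players\n"
        "WHERE position = ?\n"
        "ORDER BY market_value_in_eur DESC\n"
        f"LIMIT {limit}"
    )


def top_players(db_path: str, position: str = "Attack", limit: int = 10):
    """Return the highest-valued players for a position from the DuckDB file."""
    import duckdb  # third-party; imported lazily so the query builder works without it

    # Open read-only: the file is a published snapshot and is only queried here.
    con = duckdb.connect(db_path, read_only=True)
    try:
        # The position is passed as a bound parameter rather than interpolated.
        return con.execute(top_players_sql(limit), [position]).fetchall()
    finally:
        con.close()
```

Calling `top_players("transfermarkt-datasets.duckdb")` should return tuples in the same column order as the CLI query above. The actual values depend on the weekly snapshot that was downloaded.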
To keep things tidy, there are two simple guidelines:
- Keep the conversation centralised and public by getting in touch via the Discussions tab.
- Avoid topic duplication by having a quick look at the FAQs.
Maintenance of this project is made possible by sponsors. If you'd like to sponsor this project, you can use the Sponsor button at the top.
Contributions to transfermarkt-datasets are most welcome. If you want to contribute new fields or assets to this dataset, the instructions are quite simple:
- Fork the repo
- Set up your local environment
- Populate the `data` directory
- Start modifying assets or creating new ones in the dbt project
- If it's all looking good, create a pull request with your changes 🚀
If you run into any issues following the instructions above, please get in touch.
For full setup and workflow details, see the Developer guide.