Building Superligaen Analytics
This blog documents the end-to-end journey of building Superligaen Analytics — a live data engineering project that ingests football data from the Danish Superliga, transforms it through a medallion architecture, and serves it on a public dashboard.
The project was built in roughly 10 days in April 2026, and almost nothing went according to the original plan. Every major tool choice had to be revisited at least once. This is the honest account of what happened, why, and what I’d do differently.
Posts in order:
- The Idea — Why I Built This
- Choosing a Data Source
- Building the Bronze Layer — Raw Ingestion
- Silver and Gold — Transforming Data into a Star Schema
- The Dashboard — Discovering Evidence.dev
- The Deployment Saga — Netlify, Cloudflare, and Finally Vercel
- Migrating to dbt — When Raw SQL Isn’t Enough
- Adding Web Analytics — Vercel and Cloudflare
- Global Launch — A Conclusion
- What’s Next — The Road Ahead
Live dashboard: superligaanalytics.vercel.app
Source code: github.com/SaUgKi1773/data-engineering-demo
Posts
- What's Next — The Road Ahead
This project started as a personal challenge: build a real end-to-end data engineering system using only free tools, on a dataset I actually care about. It shipped. It runs nightly. It has real users.
- Adding Web Analytics — Vercel and Cloudflare
Once the dashboard was live, the natural question was: is anyone visiting it? We needed analytics.
- Global Launch — A Conclusion
By April 2026 — roughly ten days after the first real commit — the pipeline was stable, the dashboard had seven pages, and the nightly job was running cleanly. It was time to call it launched.
- The Deployment Saga — Netlify, Cloudflare, and Finally Vercel
This is the chapter I wish someone had written before I started. The deployment story is not a story about bad tools — Netlify, Cloudflare Pages, and Vercel are all good products. It is a story about free-tier constraints that are easy to overlook until you hit them, and about how a project with an unusual build profile (large data files, Node.js compilation, MotherDuck token handling) does not fit neatly into the assumptions any of these platforms make.
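To make the token-handling point concrete: MotherDuck authenticates with a service token, which has to reach the build as an environment variable rather than living in the repo. A minimal sketch of the pattern, assuming the platform injects a `MOTHERDUCK_TOKEN` secret and using `superliga` as a placeholder database name:

```python
# Sketch: connect to MotherDuck in a build step without committing the token.
# The deployment platform injects MOTHERDUCK_TOKEN from its secrets UI.
import os

import duckdb

token = os.environ["MOTHERDUCK_TOKEN"]  # fails loudly if the secret is missing

# "md:" connection strings accept the token as a query parameter;
# the database name "superliga" is a placeholder, not the project's real one.
con = duckdb.connect(f"md:superliga?motherduck_token={token}")
print(con.execute("SELECT 42").fetchone())
```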
- Migrating to dbt — When Raw SQL Isn't Enough
When the silver and gold layers were first built, they ran as plain SQL files executed by Python runner scripts — `run_silver.py` and `run_gold.py`. Each script would read a directory of `.sql` files, connect to MotherDuck, and execute them in a specific order. It worked. The data was correct. But as the number of models grew and the logic became more complex, the cracks in the approach started to show.
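For context, the pre-dbt runner looked roughly like the sketch below. This is a reconstruction from the description above, not the actual script; the directory layout and database name are assumptions:

```python
# run_silver.py (reconstructed sketch): execute every .sql file in a
# directory against MotherDuck, in lexicographic filename order.
from pathlib import Path

import duckdb

SQL_DIR = Path("sql/silver")  # assumed layout

def main() -> None:
    con = duckdb.connect("md:superliga")  # token read from MOTHERDUCK_TOKEN
    # Numeric filename prefixes (01_teams.sql, 02_matches.sql, ...) are the
    # only thing enforcing run order; each file is assumed to hold one statement.
    for sql_file in sorted(SQL_DIR.glob("*.sql")):
        print(f"running {sql_file.name}")
        con.execute(sql_file.read_text())

if __name__ == "__main__":
    main()
```

Ordering by filename is exactly the kind of implicit dependency management that dbt's `ref()` graph replaces.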
- The Dashboard — Discovering Evidence.dev
I knew from the start that I wanted a live public dashboard, not a static report or a screenshot. The question was which tool to use.
- Silver and Gold — Transforming Data into a Star Schema
With 21 tables of raw JSON sitting in MotherDuck, the next step was to make the data actually usable. That meant two more layers: silver (clean, structured tables) and gold (a Kimball dimensional model designed for analytics).
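To make "Kimball dimensional model" concrete, a gold-layer build can look like the sketch below: a dimension with a surrogate key, and a fact table joined to it. The table and column names here are illustrative, not the project's actual schema:

```python
# Gold layer sketch: build one dimension and one fact from silver tables.
import duckdb

con = duckdb.connect("md:superliga")  # placeholder database name
con.execute("CREATE SCHEMA IF NOT EXISTS gold")

# Dimension: one row per team, with a surrogate key for joins.
con.execute("""
    CREATE OR REPLACE TABLE gold.dim_team AS
    SELECT row_number() OVER (ORDER BY team_id) AS team_key,
           team_id,
           team_name
    FROM silver.teams
""")

# Fact: one row per match, carrying foreign keys and additive measures.
con.execute("""
    CREATE OR REPLACE TABLE gold.fct_match AS
    SELECT m.match_id,
           home.team_key AS home_team_key,
           away.team_key AS away_team_key,
           m.match_date,
           m.home_goals,
           m.away_goals
    FROM silver.matches AS m
    JOIN gold.dim_team AS home ON home.team_id = m.home_team_id
    JOIN gold.dim_team AS away ON away.team_id = m.away_team_id
""")
```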
- Building the Bronze Layer — Raw Ingestion
The bronze layer has one job: pull data from the API and store it in the warehouse exactly as it arrived. No transformation, no business logic. If the API gives you a nested JSON blob, you store a nested JSON blob. The philosophy is that raw data is irreplaceable — once you transform it, you lose the original, and if your transformation logic turns out to be wrong you have nothing to go back to.
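In code, "store it exactly as it arrived" can be as simple as the sketch below: fetch, timestamp, and insert the raw response body untouched. The endpoint URL, schema, and table name are placeholders:

```python
# Bronze sketch: land the raw API response as an unmodified string.
from datetime import datetime, timezone

import duckdb
import requests

API_URL = "https://api.example.com/matches"  # placeholder endpoint

def ingest(con: duckdb.DuckDBPyConnection, url: str) -> None:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    con.execute("""
        CREATE TABLE IF NOT EXISTS bronze.raw_matches (
            ingested_at TIMESTAMP,
            payload     VARCHAR  -- response body exactly as received
        )
    """)
    con.execute(
        "INSERT INTO bronze.raw_matches VALUES (?, ?)",
        [datetime.now(timezone.utc), resp.text],
    )

if __name__ == "__main__":
    con = duckdb.connect("md:superliga")  # placeholder DB; token via env var
    con.execute("CREATE SCHEMA IF NOT EXISTS bronze")
    ingest(con, API_URL)
```

Storing the body as text rather than a parsed structure keeps the original recoverable even if the API's shape changes or the parsing logic turns out to be wrong.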
- Choosing a Data Source
- The Idea — Why I Built This
Every project starts somewhere. This one started with two things that happened to collide at the right moment.
subscribe via RSS