Predicting the Euro 2016 Final using Markov Chains

Within the next 24 hours, the soccer world will be head over heels for the final of this season’s championship, UEFA Euro 2016 in France. As usual for such big events, I am curious about the teams’ overall performances, including how to compare two teams that never played against each other. Additionally, I’d like to know the probable winner in advance. According to my predictions, France will win this final with odds of roughly 2:1 (about 63.5%).

Data

To be fair, the data is fairly sparse. I only took matches from the final tournament of Euro 2016, played from June 10, 2016, to today, July 9, 2016. There will likely be an update after the final match, but this post and its prediction are one day ahead of time. No other data, e.g. rankings, world lists, statistics, or player performances, was used; only the match results before the final. So this study is not skewed by history or details.

Method

The method is fairly simple. Being me, I chose Markov Chains to figure out the teams’ performances. From the sparse data of the final tournament, I only took the final score of each match. So if team A plays against team B and the final score is 2:1, then B “owes” A 2 recommendations and gets 1 recommendation in return. In other words, every goal a team concedes obliges it to “honour” the team that scored it.

Each goal scored during the regular 90 minutes, or during the 120 minutes including extra time, counts as exactly one. Goals scored in a penalty shoot-out are discounted by a factor of ten: they only count for one tenth of a regular goal.

As there are 24 teams, I built a large 24 x 24 matrix with all goals from any team to any team. Had one pairing of teams occurred twice, the new recommendation counts would simply have been added on top of the former ones.
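The counting rule above can be sketched in a few lines. This is only an illustration in Python, not the R code from the linked repository; the names `add_match`, `owes`, and `PENALTY_WEIGHT`, as well as the rematch decided on penalties, are made up for the example.

```python
from collections import defaultdict

# Assumption from the text: a shoot-out goal counts one tenth of a regular goal.
PENALTY_WEIGHT = 0.1


def add_match(owes, team_a, team_b, goals_a, goals_b, pens_a=0, pens_b=0):
    """Accumulate one match result into owes[giver][receiver].

    Every goal team_a scores makes team_b owe team_a one recommendation,
    and vice versa. Repeated pairings are simply added on top.
    """
    owes[team_b][team_a] += goals_a + PENALTY_WEIGHT * pens_a
    owes[team_a][team_b] += goals_b + PENALTY_WEIGHT * pens_b


# Sparse stand-in for the 24 x 24 matrix: rows give, columns receive.
owes = defaultdict(lambda: defaultdict(float))

add_match(owes, "A", "B", 2, 1)        # A beats B 2:1
add_match(owes, "A", "B", 1, 1, 5, 3)  # hypothetical rematch, 1:1, then 5:3 on penalties

print(owes["B"]["A"])  # recommendations B owes A
print(owes["A"]["B"])  # recommendations A owes B
```

A nested dictionary is used here only because most team pairings never occur; a dense 24 x 24 array would work just as well.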

This matrix was then normalized so that each row sums to one. From it, I built a Markov Chain, without a Shadow Matrix ε: the data is sparse enough already, and a Shadow Matrix only adds doubt. Having a Shadow Matrix in place tends to even everything out, which does not reflect reality. So in this case, I decided against it.
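As a rough Python sketch of the normalization step (again not the author's R code), each row of recommendation counts is divided by its row sum. The function name `row_normalize` and the 3 x 3 toy matrix are assumptions for illustration. How a team with an all-zero row is treated is not stated in the post; this sketch leaves such a row untouched.

```python
def row_normalize(matrix):
    """Scale each row so it sums to one; all-zero rows are left as they are."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total > 0:
            normalized.append([value / total for value in row])
        else:
            # Assumption: a team that owes nothing keeps an all-zero row.
            normalized.append(list(row))
    return normalized


# Toy recommendation counts for three hypothetical teams (rows give, columns receive).
counts = [
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 2.0],
    [0.5, 1.5, 0.0],
]

P = row_normalize(counts)
for row in P:
    print(row)
```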

The Markov Chain simulates a Random Walk between the teams. The next team is chosen according to the odds given by the recommendations. Having many recommendations, and having recommendations from other top teams, therefore earns a team a top place in the list. The probability of each team being the best overall is the Steady State solution of this Markov Chain. Computing it is like walking over the graph forever, following the recommendations by chance. The Steady State is therefore the probability of being in a given state as time goes to infinity.
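One common way to approximate such a Steady State is power iteration: start from a uniform distribution and keep applying the transition matrix until nothing changes. The Python sketch below assumes that approach; the post does not say which solver the original R code uses, and the function name `steady_state` and the toy matrix are illustrative only.

```python
def steady_state(P, iterations=1000, tol=1e-12):
    """Approximate the stationary distribution pi with pi = pi * P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start the walk at a uniformly random team
    for _ in range(iterations):
        # One step of the random walk: probability mass flows along the rows of P.
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Row-stochastic toy matrix for three hypothetical teams.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.25, 0.75, 0.0],
]

pi = steady_state(P)
print(pi)  # one probability per team; the largest marks the top-ranked team
```

Power iteration only settles when the chain is connected and not periodic, which is the kind of problem a Shadow Matrix would otherwise paper over.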

Team Performances

A friend from England was sad about his team’s performance. They lost in the round of 16 against Iceland. Germany, on the other hand, reached the semi-final. Even so, England was at most as good as Germany; the latter is the current World Champion and was a prospective finalist. England, however, lost to Iceland, which is the second-best team of this tournament so far.

Iceland is most obviously the relative winner. The absolute winner is to be decided tomorrow. Both Iceland and Wales were extraordinary and played excellent matches. The model does not know former performances and is therefore blind to the relative excellence of teams.

The top four teams as of now are:

[Figure: top four teams in two clusters, with the median performance of those teams shown]

This bar chart shows the teams sorted by their probability of being the single best team in this Euro 2016 as of now. The colors correspond to four clusters of teams; two of them span these top four teams. The vertical axis is the median probability of the sample data shown, not of the entire data set.

The above-average teams are:

[Figure: top 50% of all teams in three clusters, with the median performance of those teams shown]

This bar chart shows the teams sorted by their probability of being the single best team in this Euro 2016 as of now. The colors correspond to four clusters of teams; three of them span this top 50% of teams. The vertical axis is the median probability of the sample data shown, not of the entire data set.

The complete list of all teams as of now:

[Figure: all teams in four clusters, with the median performance of the entire data set shown]

This bar chart shows the teams sorted by their probability of being the single best team in this Euro 2016 as of now. The colors correspond to four clusters of teams. The vertical axis is the median probability of the entire data set.

Match Predictions

With Markov Chains, the current Steady State solution can be used to compute the odds of two teams winning a game against each other. These posterior odds s, computed from the a priori values p of the two teams A and B, are

$$s_A = \frac{p_A}{p_A + p_B}, \qquad s_B = \frac{p_B}{p_A + p_B}$$

According to this math, Germany was supposed to lose narrowly to France, with the odds before the match being about 43% for Germany vs 57% for France. Just comment out the semi-final Germany/France and issue

df[df$team=="France",]$prob /
(df[df$team=="Germany",]$prob + df[df$team=="France",]$prob)

to see it.

Doing the same for the final, France is supposed to win by a comfortable margin, with 36.5% for Portugal vs 63.5% for France. That is roughly 2:1, which could be a reasonable score after 90 or 120 minutes.

The interested reader might also like the Goldman Sachs prediction for Euro 2016, which uses historical data and Elo rankings. The link is listed below as well. With more data, but all of it from before the first match, they got astonishingly accurate results. According to my odds formula above, Goldman Sachs would suggest France winning with odds of even 3:1 (about 74% for France vs 26% for Portugal, using priors of 23.1% for France and 8% for Portugal).
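The arithmetic behind the Goldman Sachs comparison can be checked with a tiny Python sketch. It mirrors the R one-liner above, dividing one team's prior by the sum of both priors; the function name `pairwise_odds` is made up for this example, and the priors are the 23.1% and 8% quoted in the text.

```python
def pairwise_odds(p_a, p_b):
    """Probability that team A beats team B, given their steady-state priors."""
    return p_a / (p_a + p_b)


# Priors quoted in the text for the Goldman Sachs forecast.
france_prior = 0.231
portugal_prior = 0.08

france = pairwise_odds(france_prior, portugal_prior)
portugal = pairwise_odds(portugal_prior, france_prior)

print(f"France:   {france:.1%}")
print(f"Portugal: {portugal:.1%}")
print(f"Odds:     {france / portugal:.2f}:1")
```

The two results always sum to one, so only the ratio of the priors matters, not their absolute size.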

Taking out some matches, or tweaking match outcomes, shows how strongly the ranking still changes upon even minor changes. Before the Germany/France match, Italy had by far the top score, and Italy was sent home by Germany. When France then eliminated Germany, France moved to the top and Germany dropped rather far. Germany now does just a little better than average.

I encourage you to test how my score prediction of 1:2 (Portugal vs France) plays out. If Portugal really scores one goal and concedes two from France, there will be a top-three cluster of France/Iceland/Portugal, in this order. France will lead by a wide margin; Portugal can really profit from getting at least one recommendation from France. If Portugal wins, say 2:1, then the top cluster will be France/Portugal, in this order, with a very narrow gap between the two. So Iceland’s Markov-Chain-based performance evaluation depends on the goals of the final as well…

Résumé

You can predict sports events and other rankings using Markov Chains. We already did it on other occasions, such as

  • describing and predicting traffic flow through cities by time and day of week,
  • ranking teams by their sport scores,
  • ranking ideas,
  • ranking individual people’s performances, and
  • ranking vacation venues.

The data is almost too sparse to be quite exact. But it gives you a feeling for a possible ranking, which sharpens with more and more data. With little data, like here, teams may swap places after one game, so the ranking has not settled yet. Incorporating more data makes this effect vanish swiftly.

Thanks for reading my blog post. For any questions and helpful feedback, please do not hesitate to let me know in the comments section below. Thank you.

Source:

github.com/danielschulz/Euro2016-Predictions

Further reading:

[A] Goldman Sachs’ sophisticated pre-Euro-2016 forecast with historical data: goldmansachs.com/our-thinking/macroeconomic-insights/euro-cup-2016
[B] amazon.de/Eine-Einführung-zeit-diskrete-homogene-Markov-Ketten/dp/3640797086
[C] grin.com/de/e-book/164218/eine-einfuehrung-in-zeit-diskrete-homogene-markov-ketten

Disruptive Career Changes to Spur Personal Success

Today, if your career looks like a dead end, you might get more and more bored pursuing it every day. Doing things you are not proud of, day in, day out, might destroy your dreams and ambitions, and will not make you happy either.

Disruptive Changes

Luckily, there are various career paths and business domains to choose from. One of the emerging services today is job coaching on disruptive change management. In the past, there was consulting on how to do your job better by employing Time & Self Management techniques. Having a sparring partner might help as well.

But if your current business domain itself is not fulfilling to you, there is Job Coaching for disruptive career changes as well. Like Change Management on an individual level, disruptive changes help you extend your skills to other domains and bring your well-developed strengths to the table there.

In business and Idea Generation this is a proven technique. Famous examples are:

  1. Clayton Christensen, who changed from consulting to academia in his thirties,
  2. Charles Goodyear, who invented vulcanized rubber instead of becoming a priest, and
  3. Albert Einstein, who was bored at the Bern Patent Office and worked on some of his later famous theories about the universe.

Shift gears: scale up

If you consider changing not only jobs but your business domain, you can get help from job coaches. One from Cologne is linked right here below:

Coaching Studio Schöppe

I wish you success!

FedEx day(s) worked for me

Recently I did my own FedEx day. I was in Cologne for work and to visit friends of mine. But I had one day free for myself. The weather was beautiful and I decided to go into town and enjoy it.

The plan was just to see something of this wonderful city. I sat in parks and cafés, and new ideas rushed through my brain. Although I did not want to do anything special, I could not help it and got some very good ideas. So the lesson is: do not force it until it breaks. Enjoy it instead and ideas flow.

Another evening with friends of mine went the same way. But here, instead of ideas for new things, I got strong insights regarding the world around me. Fantastic.

This is what is special about creativity: if you do not use new-wave techniques for it, relaxing is the best way to break through to ideas. But notice: this can never be the only way, especially if this is your job. If that’s the case, learn about Lateral Thinking. I recommend above all Edward de Bono and Jens-Uwe Maier.

Lateral Ad in Zen Design

Here’s an ad from Germany. It is from an eco-movement party. On it you can read, in youth slang: “Your mother.”

This is an allusion to the “Your mother is …” jokes.

[1] http://gruene-rlp.de/wahl-2011

What are FedEx days?

I am going to explain what FedEx days are. At your job you get new stuff to do day after day, and it is always quite similar. Now you want to be motivated again. So some companies hold one different day each year. It is meant to be the highlight of the working year, and everybody looks forward to it, because for one day they can do whatever they want, as long as it is not related to their everyday work!

Our faculty to make secure cars

Prof. Jana Dittmann from our Faculty for Computer Sciences (FIN) at Otto-von-Guericke University Magdeburg (OvGU) ensures the safety of future cars. They say this topic will be trending in the coming years.

Facebook’s vast Hadoop clusters

Facebook uses Hadoop to gain information about its members. It held the second-biggest Hadoop database behind Yahoo! [4] By May 2010, Facebook had a bigger Hadoop cluster than Yahoo! [2] Google is not known to use Hadoop, because it is a competing product to GFS and Google’s BigTable.

Apache Hadoop has been voted top innovator in 2011. [9]

Apache Hadoop is top innovator 2011


Congratulations to Apache Hadoop: it has been voted the innovator of 2011. The decision was justified by the fact that Hadoop allows discovering and manipulating a huge amount of data. So Hadoop is the “Swiss Army knife for the 21st century.” [5]

Think Quarterly

Google just launched a new magazine to reflect on the past three months. It is simple, easy to read, and beautiful. The mag follows a minimalistic presentation style. This is known from Japan and called Zen. The design is made by The Church Of London, a creative design team.

Google does not make a big deal out of today’s launch. The online version can be read in Flash or plain HTML.

Billions & Billions