By

Renzo Galea

A Dissertation Submitted in Partial Fulfilment of the Requirements

For the Degree of Bachelor of Science (Honours)

Statistics and Operations Research as main area

DEPARTMENT OF STATISTICS AND OPERATIONS RESEARCH

FACULTY OF SCIENCE

UNIVERSITY OF MALTA

MAY 2011

Declaration of Authorship

I, Renzo Galea 25889G, declare that this dissertation entitled:

“Modelling Football Data”, and the work presented in it is my own.

I confirm that:

(1) This work is carried out under the auspices of the Department of Statistics and Operations Research in partial fulfilment of the requirements of the Bachelor of Science (Hons.) course.

(2) Where any part of this dissertation has previously been submitted for a degree or any other qualification at this university or any other institution, this has been clearly stated.

(3) Where I have used or consulted the published work of others, this is always clearly attributed.

(4) Where I have quoted from the works of others, the source is always given. With the exception of such quotations, this dissertation is entirely my own work.

(5) I have acknowledged all sources used for the purpose of this work.

Signature:

_______________________

Date:

_______________________

Abstract

Renzo Galea, B.Sc. (Hons.)

Department of Statistics & Operations Research

May 2011

University of Malta

The main goal of this dissertation is to investigate the performance of Bayesian modelling for football data. An extensive study of Markov processes and the Bayesian statistical approach is carried out. In particular, special reference is made to the Markov Chain Monte Carlo sampling technique. Using real data from the Italian Serie A championship, a Bayesian modelling application (following the rationale of professors Gianluca Baio and Martha A. Blangiardo) is considered and compared with the performance of a comparable generalised linear model.

Acknowledgements

It is my pleasure to thank the several people who have made this dissertation possible with their precious support.

First and foremost I am heartily thankful to my supervisor, Professor Lino Sant (Head of Statistics & Operations Research Department, University of Malta), whose continuous encouragement and professional guidance motivated me to develop a thorough understanding of the subject.

I wish to express my deep gratitude towards two more individuals for their time and availability. The first is Vincent Marmara (from Betfair Group, Malta), who helped me develop a better picture of the sports betting industry. The other is Professor Gianluca Baio (from the Department of Statistical Science, University College London), with whom I have frequently corresponded via email regarding any queries I had about the implementation of his Bayesian mixture model.

Lastly I would also like to extend my sincerest appreciation to all staff and friends within the Statistics & Operations Research Department who have provided the best possible environment in which to learn and grow throughout the whole course.


Contents

Abstract
Acknowledgements
List of Figures
List of Tables

1 Introduction
1.1 Structure of Dissertation

2 Literature Review
2.1 Markov Processes
2.2 Markov Chain Monte Carlo (MCMC)
2.3 Bayesian Inference
2.4 Football Analysis

3 Markov Chains & MCMC Algorithms
3.1 Introduction
3.2 Markov Chains
3.3 Transition Probabilities
3.4 Properties
3.5 General state-space Markov Chains
3.6 Convergence
3.7 MCMC
3.8 The Metropolis-Hastings Algorithm

4 The Bayesian Approach
4.1 General Framework
4.2 Bayes’ Rule
4.2.1 Likelihood Functions
4.2.2 Posterior Distribution
4.2.3 Sufficient Statistics
4.3 Bayesian Estimation
4.3.1 Point Estimation
4.3.2 Interval Estimation
4.3.2.1 Credible Interval
4.3.2.2 Highest Posterior Density Interval
4.4 Hypothesis Testing
4.4.1 Simple Null & Alternative Hypotheses
4.4.2 Simple Null & Composite Alternative Hypotheses

5 Model Estimation using Football Data
5.1 Introduction
5.2 Data Set
5.2.1 Descriptive Statistics
5.3 Defining the Problem
5.4 Bayesian Modelling Procedure
5.4.1 Results
5.5 Poisson Regression Formulation
5.5.1 Results
5.6 Bayesian vs GLM

6 Conclusion

A Matlab m-files
A.1 Calculating team strengths (using previous 6 match days)

B WinBUGS files (Baio & Blangiardo (2010))
B.1 Bayesian mixture model
B.2 Initial values

C SPSS files
C.1 Data (Serie A season 1994/1995)

Bibliography

List of Figures

3.1 Markov Model format.
4.1 The Bayesian synthesis.
5.1 Histogram for the number of home and away goals.
5.2 Posterior densities for the attack parameters.
5.3 Posterior densities for the defense parameters.
5.4 The estimated team parameters vs the final league points.
5.5 Residuals for the 2 generalised linear models.
5.6 Plotting the differences between the home attack parameters.

List of Tables

5.1 Summary statistics of home and away goals.
5.2 Bayesian estimation of the main parameters.
5.3 Final league table and the corresponding team parameters estimates.
5.4 Deviance Information Criterion.
5.5 Respective tests of model effects for the dependent home goals and dependent away goals models.
5.6 Summary statistics of TS variable.
5.7 Newly generated tests of model effects.
5.8 Parameter estimates for the model with home goals as dependent variable.
5.9 Parameter estimates for the model with away goals as dependent variable.
5.10 AICs and BICs for the generalised linear model.

To Mum and Dad


Chapter 1

Introduction

Association football (nowadays simply known as football) is probably one of the most popular sports ever played. It unites millions of fans from all over the world, irrespective of their age, ethnicity or nationality. It is undoubtedly a sport that combines skill, chance and intelligence. However, each football league is distinguished by its own characteristics. The most common league format is the so-called double round-robin tournament, in which each team plays every other team twice (home and away) in order to compensate for the home advantage bias. The standard points scheme follows the 3-1-0 system, in which a win is worth 3 points and a draw just 1.

Many football leagues (such as the Italian Serie A) originally awarded 2 points for a win instead, but this encouraged team managers to settle for draws in away matches and to focus on winning home games. The change of system therefore placed additional value on wins (relative to draws), and so stimulated more attacking play.

In recent years, football modelling and match prediction have attracted considerable interest. It does not take long to see that this has been motivated by the explosive growth of the sports betting industry. Figures published in the European Commission’s green paper on online gambling report that sports betting generated an annual revenue of around €1.97 billion in 2008 (accounting for 32% of the total €6.16 billion registered by the online gambling market). Developments in digital technology and the widespread availability of internet services have surely been the main catalysts of this growth. In addition, the same year saw Malta host as many as 500 registered online gambling operators, making it the EU member state with the highest number of licences in this category.

Such gaming companies have truly revolutionised the betting market by introducing a wider range of exotic betting systems. Once, wagers were simply placed on the probable match outcomes. Nowadays, it is even possible to bet on bookings, shirt numbers or corner kicks, amongst others. This is in fact part of the so-called “Spread Betting” system, where it is possible to place bets while the football event is going on. Other common football betting systems, which will only be mentioned here, are the “Asian Handicap” (also known as “Hang Cheng”), “Fixed-Odds Betting” and “Pari-mutuel Betting”.

Security, speed and pricing are undoubtedly the most significant attributes of a profitable internet sports book. The first two fall under the exclusive responsibility of technology providers, whereas the methodology for compiling prices has its roots in statistics.

Determining accurate probabilities for football match outcomes is, however, not an easy task. It is a very complex process that involves numerous factors, such as playing conditions and the teams’ abilities. Moreover, a team’s ability is itself subject to fluctuation (for example, according to injuries, transfers, suspensions, pressure levels, motivation, etc.). Hence the “right” model should be capable of estimating a very long list of parameters.

This dissertation follows the Bayesian modelling rationale developed by professors Gianluca Baio and Martha A. Blangiardo. The model they proposed is presented for a particular football season, and its performance is compared with that of a comparable generalised linear model. Various software packages are used in doing so. The ability of WinBUGS to update via MCMC sampling is exploited for the Bayesian modelling, whilst Matlab and SPSS are mainly used for fitting the generalised linear model.

1.1 Structure of Dissertation

Having reached the end of this dissertation’s introductory part, we now give a brief overview of the five chapters that follow.

First comes a chapter that covers the literature related to the theory involved and the evolution of the BUGS software project. This chapter also discusses several statistical football models that have been proposed over time, with special reference to those that employ the Bayesian approach. After the literature review we introduce the relevant theoretical underpinnings. Chapters 3 and 4 look at the fundamental concepts of Markov processes and of the Bayesian statistical approach. In particular we delve into the Markov Chain Monte Carlo method, a sampling technique that has been largely integrated within the Bayesian world.

Chapter 5 is the core of this dissertation. It starts off with a detailed description of the data set available, and formulates the problem to be dealt with. The Bayesian mixture model of Baio and Blangiardo (2010) is estimated for the Serie A season 1994/1995 and compared with the performance of a generalised linear model. Finally we conclude by summing up the key outcomes of this dissertation, whilst sparing a last thought for the possible future of football analysis.

Chapter 2

Literature Review

2.1 Markov Processes

What are nowadays known as Markov chains first appeared in 1906. It was the Russian mathematician Andrei Andreevich Markov who introduced the notion of ‘chains’ in the paper “Extension of the law of large numbers to dependent quantities”. As speculated in Medhi (1982), he developed the idea whilst watching Pushkin’s (and Tchaikovsky’s) opera “Evgeni Onegin”. Indeed he eventually published a chain study of the consonant-vowel alternations in the first 20,000 words of Pushkin’s work.

However, in reality Markov’s interest in this field was probably motivated by Nekrasov’s “abuse of mathematics” (as frequently alleged by Markov himself). The first paper to unleash controversy was that of 1898, entitled “General properties of numerous independent events in connection with approximate calculation of functions of very large numbers”. The dispute continued four years later, when Nekrasov not only said that “pairwise independence” would yield the Weak Law of Large Numbers (WLLN) according to Chebychev, but also erroneously declared that “independence is a necessary condition for the law of large numbers”.


In response, Markov aimed to extend Chebychev’s conclusions by applying the WLLN and the Central Limit Theorem specifically to his own sequences of dependent random variables. In fact after studying variables whose dependence diminishes with increasing mutual distances, he concluded his 1906 paper by claiming that the “independence of quantities does not constitute a necessary condition for the existence of the law of large numbers”. For Seneta (1996), this statement encapsulated all the motivation that Markov had for studying new schemes of chain dependence.

Subsequently, Bernstein introduced the phrase “Markov chain” for the first time. Having clearly been influenced by Markov’s work, he (among many others such as Romanovsky and Neyman) managed to take statistics to new levels. However, few know that Poincaré could have taken all the recognition instead. In fact Medhi (1982) recalls how Poincaré had already come across the same sequences of random variables before, but did not delve into the subject as much as Markov did.

Among the many interesting ideas closely associated with Markov chains, one finds the special case of random walks. Bernoulli and Laplace were among the first to consider urn models, and Galton later reused the theory to model the propagation of family names. Markov processes have also had a considerable impact on physics. In statistical mechanics in particular, the systems encountered generally evolve independently of time whilst obeying the memoryless condition. The most eminent figures known to have worked with Markov processes in these areas are Einstein and Smoluchowski.

Another, relatively recent, Markov chain application concerns internet traffic and the navigation of websites. This is not really surprising, since Markov processes have provided the foundations for Queueing Theory. Within statistics, Markov processes have also been used as the basis for a very important class of Monte Carlo techniques (which later became known as Markov Chain Monte Carlo).

2.2 Markov Chain Monte Carlo (MCMC)

MCMC methods have their roots in the Metropolis algorithm, derived when Metropolis et al. (1953) attempted to compute complex integrals by expressing them as expectations with respect to a particular distribution and using samples to estimate those expectations. Thereafter, Hastings (1970) and Peskun (1973, 1981) extended the original algorithm to a more general case and helped to overcome the curse of dimensionality (usually met by Monte Carlo methods).

Meanwhile several earlier pioneers had been sowing the seeds of another important MCMC technique, the Gibbs sampler. In particular Hammersley and Clifford had been developing an argument that recovers a joint distribution from its conditionals. As coined by Besag (1974, 1986), this result is nowadays known as the Hammersley-Clifford theorem. However the real breakthrough of the Gibbs sampler came with Geman and Geman (1984), who carried out a Bayesian study of Gibbs random fields (from which the name Gibbs sampler derives).

Subsequently, one of the most influential papers on MCMC theory was presented by Tierney. Tierney (1994) put forward all the assumptions required to analyse Markov chains and developed their properties. For instance, some of his most important results concern the convergence of ergodic averages and central limit theorems. In addition, Liu et al. (1994, 1995) enriched the MCMC literature by analysing the covariance structure of the Gibbs sampler and establishing the validity of Rao-Blackwellisation (which had previously been used by Gelfand and Smith (1990)).

Among the many other contributors to MCMC theory, prominent names include Gelman, Gilks, Roberts, Rosenthal and Tweedie. More important, however, is that their work has made it possible to generate samples from more complicated target probability distributions. In this respect Bayesian inference has benefited from a wider range of posterior distributions that can be simulated.

2.3 Bayesian Inference

Bayesian statistics owes its origin to the English mathematician and nonconformist Unitarian minister, Thomas Bayes. It all started after his death, when Richard Price (the executor of Bayes’ will) discovered an unpublished paper containing a detailed description of what is nowadays known as Bayes’ Theorem. After having drawn the attention of the Royal Society of London, it was posthumously published in 1763 under the title “An Essay Towards Solving a Problem in the Doctrine of Chances”.

Basically Bayes’ paper introduced the use of a uniform prior distribution on a binomial parameter, and dealt with the problem of predicting new observations. The first generalisation of the theorem was introduced by Laplace, who is also notable for approaching problems in celestial mechanics and medical statistics, amongst others. Laplace (1774) considered an elaborate version of the inference problem for the unknown binomial parameter. Unlike Bayes, he justified the choice of a uniform prior distribution by arguing that the parameter’s posterior distribution should be proportional to the likelihood of the data.

Moreover, in spite of some rivalry and heated conflicts, the term “Bayesian” entered circulation thanks to one of the most significant contributors to classical statistical theory. It was in 1950 that Fisher introduced the adjective in the volume “Contributions to Mathematical Statistics”. Subsequently there were significant explorations by Savage, Jeffreys, Cox, Jaynes and Good, among others. More recently, there have also been considerable contributions by Berger and Bernardo.

Interestingly enough, Stigler (1982) questioned whether Bayes might have originally intended his results in a rather more limited way than they are actually used. Nonetheless, the variety of research and successful applications that have integrated Bayesian statistics is nowadays enormous. It is enough to consider the radical changes brought about in astronomy, cosmology, artificial intelligence, biology and marketing (to mention just a few).

However, a dramatic boost of this kind was only possible with the rise of computing power, and the consequent unleashing of the potential of MCMC. Geman and Geman (1984) and Pearl (1987) were among the very first to stimulate the software developments needed to reduce the computational problems. Among the numerous software packages that have been closely associated with Bayesian statistics, the BUGS development project has arguably been one of the most successful.

Spiegelhalter et al. (1995) published a very important manual for the first version of this software (originally known as Bayesian inference Using Gibbs Sampling). The authors found their motivation in the as yet unexploited power of MCMC theory and, as the name itself indicates, the software was intended to perform Bayesian analysis via the Gibbs sampler. Thereafter, the project triggered a renewed interest in MCMC theory. In particular, Mengersen and Tweedie (1996) explored the speed of convergence of MCMC algorithms to the target distribution, and Roberts et al. (1997) set explicit targets for the acceptance rate of the Metropolis-Hastings algorithm.

Subsequently Lunn et al. (2000) introduced a newer version of the software package, named WinBUGS, which retained most of its predecessor’s features. An interesting remark was made by J. A. Royle, who described WinBUGS as a real ‘MCMC blackbox’. Lunn et al. (2009) later provided a critical evaluation of the whole project, and also offered some thoughts on the future of the open source version of WinBUGS proposed by Thomas et al. (2006).

2.4 Football Analysis

The first statistical analysis of football results dates back to 1956, when M. J. Moroney published the work entitled “Facts from Figures”. In this ‘primitive’ study, he suggested that a modified Poisson distribution (specifically, the Negative Binomial distribution) should provide an adequate fit to football results. The same Negative Binomial distribution was then reused by Reep and Benjamin (1968) to model ball passing between team mates. Having introduced the “r-pass movement” model, in which a series of r successful passes precedes a shot on goal or an intercepted pass, they came to the very important conclusion that “chance does dominate the game”.

Hennessy’s opinion was far more controversial. In 1969 he stated that only chance was involved. Eventually, however, I. D. Hill published the paper “Association Football and Statistical Inference”, in which he expressed his dissent from such unconvincing arguments. With particular reference to Reep and Benjamin (1968), he argued that although good passing sequences are necessary, it is not right to base an entire model on them alone since that “is not the aim of the game”. In addition, Hill (1974) compared Goal’s pre-season forecasts with the final league tables for the season 1971-72, and observed a significant positive correlation. With this in mind he proposed that although “there is obviously a considerable element of chance”, a significant amount of skill should dominate the final outcome, implying that the situation is predictable to some extent.

In the meantime several other attempts to model the qualities of league teams included maximum likelihood estimation by Thompson (1975) and linear model methodology as in Harville (1977). However the first real model to predict football scores was put forward by Maher (1982). According to his model, the goals scored by two opposing teams in a particular match are drawn from independent Poisson distributions. Whilst introducing the home advantage factor, he assigned each team a pair of fixed parameters (α and β), so that the model simply consists of combining the respective attacking and defensive parameters of the opposing teams.

Maher’s Poisson approach in fact laid the basis for several other studies within the field. For instance, Lee (1997) relied on this model to simulate the English Premier League season 1995/1996 around 1000 times, and investigated whether Manchester United really deserved to emerge victorious. However, those who used the independent Poisson model frequently observed a (relatively low) correlation between the opponents’ goals. Subsequently, Dixon and Coles (1997) extended Maher’s model by introducing an indirect kind of dependence. Having foreseen the probable need to vary the parameters $\alpha_i$ and $\beta_i$ with time, they also tapered the likelihood function in order to assign greater weight to the more recent results.

Meanwhile, Griffiths and Milne (1978) had introduced the theoretical foundations for bivariate Poisson models. The idea of two goal variables that follow a bivariate Poisson distribution seemed to be quite a good alternative to Maher’s independent Poisson model. In fact Karlis and Ntzoufras (2003) replaced the goals’ independence assumption with a bivariate Poisson model that included an additional covariance parameter for the respective goals. This allowed for correlation between scores, which is quite plausible for two competing teams. In addition they also considered a diagonal inflation factor to improve the precision with which draws are estimated.

More recently, Karlis and Ntzoufras approached the problem in a completely different manner. In their paper “Bayesian modelling of football outcomes for the goal difference”, published in 2008, they focused on modelling the goal difference instead. Relying upon real data from the English Premier League season 2006/2007, they built their reasoning on one of their own previous papers, entitled “Bayesian analysis of the differences of count data”. Whilst removing scoring correlations and eliminating Poisson marginals, they made use of the Bayesian approach’s ability to incorporate any available prior knowledge.

Similarly the work of Baio and Blangiardo (2010), which has very much inspired this dissertation, proposed a Bayesian hierarchical model that was tested on the Italian Serie A championship 1991/1992. From such recent works, one can note that it took quite a while before Bayesian concepts were integrated within this field. Some of the earlier works nevertheless include Rue and Salvesen (1997) and Knorr-Held (2000). The former applied a Bayesian dynamic generalised linear model and used MCMC to generate dependent samples from the posterior density, whilst the latter made use of recursive Bayesian estimation on the 1996/1997 German Bundesliga, and investigated the possible time dependency of team strengths.

Chapter 3

Markov Chains & MCMC Algorithms

3.1 Introduction

This chapter starts off with Markov chains, and develops the framework for Markov Chain Monte Carlo sampling. Markov chains are best known for the Markov property, under which the future, conditional on the present, is independent of the past. MCMC, on the other hand, is the most sought-after sampling technique in Bayesian statistics. With this in mind, a theoretical overview of both discrete and continuous state-space Markov chains is given, after which the simulation process is explained in detail.

3.2 Markov Chains

A sequence of discrete random variables $\{X_0, X_1, \ldots\}$ is recognised as a Markov chain if it progresses in accordance with the Markov property:

$$\mathbb{P}[X_{n+1} = j \mid X_n = i_n, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1] = \mathbb{P}[X_{n+1} = j \mid X_n = i_n]$$

for all $n \in \mathbb{N}$ and $j, i_n, i_{n-1}, \ldots, i_1 \in \mathbb{N}$.

Figure 3.1 gives a graphical view of one particular example. It exhibits three different states that are interconnected with one another. At each step, the chain can either move to one of the other two states or remain in its current state.

Figure 3.1: Markov Model format

This mechanism is able to model a large variety of discrete processes; however, the probability with which the chain moves from one state to another depends entirely upon the so-called transition probabilities.

3.3 Transition Probabilities

The single-step evolution of states within a discrete-parameter stochastic process $(X_n)_{n \in \mathbb{N}}$ is generally explained by:

$$p_{ij} = \mathbb{P}[X_n = j \mid X_{n-1} = i],$$

such that the whole scenario (of the previous 3-state Markov chain model) is then captured within a transition probability matrix:

$$\Pi = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix}.$$

As a stochastic matrix, $\Pi$ should only contain non-negative probabilities ($p_{ij} \geq 0, \ \forall i, j$), whilst the sum of each row should be equal to unity ($\sum_j p_{ij} = 1, \ \forall i$). Furthermore, it can even be noted that the first observation caters for the eventual non-zero probability of remaining at a current state ($p_{ii} \geq 0$).
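As a minimal sketch of the ideas above, the following Python fragment checks the two stochastic-matrix conditions and simulates a short path of a 3-state chain. The matrix entries, the state labels 0 to 2 and the helper names `is_stochastic` and `simulate_chain` are assumptions made purely for illustration; they do not come from the dissertation's data.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values only).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

def is_stochastic(matrix, tol=1e-12):
    """Check that all entries are non-negative and every row sums to one."""
    return bool(np.all(matrix >= 0) and np.allclose(matrix.sum(axis=1), 1.0, atol=tol))

def simulate_chain(matrix, start, n_steps, rng):
    """Simulate X_0, ..., X_n; each step uses only the row of the current state."""
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        path.append(int(rng.choice(len(matrix), p=matrix[current])))
    return path

rng = np.random.default_rng(seed=0)
path = simulate_chain(P, start=0, n_steps=10, rng=rng)
print(is_stochastic(P), path)
```

Because the next state is drawn from the row of the current state alone, the simulated path satisfies the Markov property by construction.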


Moreover, processes do not necessarily need to evolve in single steps. It might be the case that a process in state i finds itself in state j after n transitions. In this respect, the transition probability formula changes to:

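$$p_{ij}^{(n)} = \mathbb{P}[X_{m+n} = j \mid X_m = i]$$

for all $i, j, m, n \geq 0$.

As a result, the associated transition probability matrix is now:

$$\Pi^{(n)} = \begin{pmatrix} p_{11}^{(n)} & p_{12}^{(n)} & p_{13}^{(n)} \\ p_{21}^{(n)} & p_{22}^{(n)} & p_{23}^{(n)} \\ p_{31}^{(n)} & p_{32}^{(n)} & p_{33}^{(n)} \end{pmatrix}.$$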

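In general, these n-step transition probabilities can be found by conditioning on the state at an intermediate stage, according to the Chapman-Kolmogorov equation:

$$\begin{aligned}
p_{ij}^{(m+n)} &= \mathbb{P}[X_{m+n} = j \mid X_0 = i] \\
&= \sum_{k=0}^{\infty} \mathbb{P}[\{X_{m+n} = j\} \cap \{X_m = k\} \mid \{X_0 = i\}] \\
&= \sum_{k=0}^{\infty} \mathbb{P}[\{X_{m+n} = j\} \mid \{X_m = k\} \cap \{X_0 = i\}] \, \mathbb{P}[X_m = k \mid X_0 = i] \\
&= \sum_{k=0}^{\infty} \mathbb{P}[\{X_{m+n} = j\} \mid \{X_m = k\}] \, \mathbb{P}[X_m = k \mid X_0 = i] \\
&= \sum_{k=0}^{\infty} \mathbb{P}[\{X_n = j\} \mid \{X_0 = k\}] \, \mathbb{P}[X_m = k \mid X_0 = i] \\
&= \sum_{k=0}^{\infty} p_{ik}^{(m)} p_{kj}^{(n)}
\end{aligned}$$

for all n, m ≥ 0 and any states i, j.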
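For a time-homogeneous chain on finitely many states, the n-step probabilities are the entries of the n-th power of the one-step matrix, so the Chapman-Kolmogorov equation can be checked numerically. The sketch below does so for a hypothetical 3-state matrix; the entries and the choice m = 2, n = 3 are illustrative assumptions.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values only).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

def n_step(matrix, n):
    """n-step transition matrix of a time-homogeneous chain."""
    return np.linalg.matrix_power(matrix, n)

m, n = 2, 3
lhs = n_step(P, m + n)              # p_ij^(m+n) for all i, j
rhs = n_step(P, m) @ n_step(P, n)   # sum over k of p_ik^(m) p_kj^(n)

# The same identity written out as an explicit sum for one pair (i, j).
i, j = 0, 2
explicit = sum(n_step(P, m)[i, k] * n_step(P, n)[k, j] for k in range(len(P)))

print(np.allclose(lhs, rhs), np.isclose(lhs[i, j], explicit))
```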

3.4 Properties

For a state j to be considered accessible from state i (written i → j), there must exist a non-zero probability that state i reaches state j in a finite number of transitions. That is, for some integer n ≥ 0,

$$\mathbb{P}[X_n = j \mid X_0 = i] = p_{ij}^{(n)} > 0.$$

If the same applies the other way round (i.e. j → i), the states are said to communicate (i ↔ j). A Markov chain in which all states communicate with one another is said to be irreducible.
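Accessibility and irreducibility can be checked mechanically for a finite chain. The following sketch, under the assumption of a small finite state space, builds a boolean reachability table and tests whether every pair of states communicates. Both example matrices and the function names `accessible` and `is_irreducible` are hypothetical.

```python
import numpy as np

def accessible(matrix):
    """reach[i, j] is True when j is accessible from i in some n >= 0 steps."""
    size = len(matrix)
    # Zero steps (i to itself) or one step with positive probability.
    reach = (np.asarray(matrix) > 0) | np.eye(size, dtype=bool)
    # Each squaring doubles the path length covered; `size` rounds are ample.
    for _ in range(size):
        as_int = reach.astype(int)
        reach = (as_int @ as_int) > 0
    return reach

def is_irreducible(matrix):
    """A chain is irreducible when all pairs of states communicate."""
    return bool(accessible(matrix).all())

# Hypothetical examples: P mixes all states, Q has an absorbing state 0.
P = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]])
Q = np.array([[1.0, 0.0, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]])

print(is_irreducible(P), is_irreducible(Q))
```

In the second matrix, state 0 is accessible from state 1 but not the other way round, so the two states do not communicate.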

The period k of a state i is defined by

$$k = \gcd\{n : \mathbb{P}[X_n = i \mid X_0 = i] > 0\},$$

where gcd stands for the greatest common divisor. If the returns to state i are irregular, k = 1 and the state is considered to be aperiodic. Else if k > 1 state i is periodic, and when k = ∞ the probability for a possible return is equal to zero.
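The period can be approximated numerically by taking the greatest common divisor of the return times observed within a finite horizon. The sketch below assumes a finite chain and a cut-off `max_steps`, both of which are illustrative choices; a return value of 0 is this code's stand-in for the case where no return is observed at all.

```python
from math import gcd

import numpy as np

def period(matrix, state, max_steps=50):
    """gcd of all n in 1..max_steps with a positive n-step return probability."""
    k = 0
    power = np.eye(len(matrix))
    for n in range(1, max_steps + 1):
        power = power @ matrix          # power now holds the n-step matrix
        if power[state, state] > 0:
            k = gcd(k, n)               # gcd(0, n) == n starts the running gcd
    return k                            # 0 means no return within max_steps

# Hypothetical examples.
flip = np.array([[0.0, 1.0], [1.0, 0.0]])                          # alternates 0, 1, 0, ...
cycle = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
lazy = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]])

print(period(flip, 0), period(cycle, 0), period(lazy, 0))
```

A state that can return to itself in a single step, as in the last matrix, is automatically aperiodic.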

Given a state i, the probability for a first return in n steps is defined by:

$$f_{ii}^{(n)} = \mathbb{P}[X_n = i, X_{n-k} \neq i \mid X_0 = i],$$

where 0 < k < n. If the total return probability $f_{ii} = \sum_{n=1}^{\infty} f_{ii}^{(n)} = 1$, state i is said to be recurrent. Further, on addition of $p_{ij} = 0$ for i ≠ j, state i is impossible to leave and is even referred to as absorbing. However, an alternative but equivalent condition for the recurrency of state i comes from $\sum_{n=1}^{\infty} p_{ii}^{(n)} = \infty$.

Furthermore, it is of great interest to model the expected return time. Since $f_{ii}^{(n)}$ has been defined as the probability of first return, the mean number of steps with which such a return occurs is given by:

$$m_i = \sum_{n=1}^{\infty} n f_{ii}^{(n)}.$$

For slow rates of return, $m_i$ is generally infinite, and the state is referred to as null recurrent. If the rate is instead more consistent such that it renders a finite $m_i$, it is then called positive recurrent. On the other hand, non-recurrent states are defined as transient and $f_{ii} < 1$. Thus it is possible that state i will not be revisited in the future (and so $m_i$ is clearly infinite).

The mean return time is in fact important when defining three further results. Given an aperiodic, irreducible and recurrent Markov chain:

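1) $\lim_{n \to \infty} p_{ii}^{(n)} = \dfrac{1}{m_i}$ for positive recurrent states i

2) $\lim_{n \to \infty} p_{ii}^{(n)} = 0$ for null recurrent states i

3) $\lim_{n \to \infty} p_{ji}^{(n)} = \lim_{n \to \infty} p_{ii}^{(n)}$ for all states j.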
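The first of these results can be illustrated numerically on a small chain. The sketch below computes first-return probabilities by forbidding visits to the chosen state between the first and last step, sums them into a truncated mean return time, and compares its reciprocal with a high power of the transition matrix. The matrix, the truncation point and the function names are assumptions for illustration.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values only).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

def first_return_probs(matrix, state, n_max):
    """Return [f^(1), ..., f^(n_max)], the first-return probabilities of `state`."""
    others = [s for s in range(len(matrix)) if s != state]
    taboo = matrix[np.ix_(others, others)]  # moves that avoid `state`
    back = matrix[others, state]            # final move back into `state`
    vec = matrix[state, others]             # first move away from `state`
    probs = [matrix[state, state]]          # n = 1: immediate return
    for _ in range(2, n_max + 1):
        probs.append(vec @ back)            # first return exactly at this step
        vec = vec @ taboo                   # otherwise stay away one more step
    return np.array(probs)

def mean_return_time(matrix, state, n_max=500):
    """Truncated version of the sum over n of n * f_ii^(n)."""
    f = first_return_probs(matrix, state, n_max)
    return float(np.sum(np.arange(1, n_max + 1) * f))

state = 0
f = first_return_probs(P, state, 500)
m = mean_return_time(P, state)
limit = np.linalg.matrix_power(P, 200)[state, state]  # p_ii^(n) for large n
print(f.sum(), m, 1.0 / m, limit)
```

For this example the first-return probabilities sum to one (the state is recurrent) and the large-n return probability agrees with the reciprocal of the mean return time.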


Theorem: Given any irreducible, aperiodic and positive recurrent Markov chain, there exists a unique stationary distribution π. As a result, for all states i and j:

$$\lim_{n \to \infty} p_{ij}^{(n)} = \pi_j = \sum_i \pi_i p_{ij},$$

where $\sum_j \pi_j = 1$ and $p_{ij}$ represent the transition probabilities.
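For a finite chain the stationary distribution can be obtained by solving the linear system formed by the stationarity equations together with the normalisation constraint. The sketch below does this with a least-squares solve and then checks that every row of a high matrix power approaches the same vector. The matrix and the helper name `stationary_distribution` are illustrative assumptions.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values only).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

def stationary_distribution(matrix):
    """Solve pi = pi P together with sum(pi) = 1 as one linear system."""
    size = len(matrix)
    # Rows of (P^T - I) encode pi_j = sum_i pi_i p_ij; the last row normalises.
    system = np.vstack([matrix.T - np.eye(size), np.ones(size)])
    target = np.zeros(size + 1)
    target[-1] = 1.0
    solution, *_ = np.linalg.lstsq(system, target, rcond=None)
    return solution

pi = stationary_distribution(P)
limit = np.linalg.matrix_power(P, 200)   # each row should be close to pi
print(pi, limit[0])
```

Starting the chain from any state gives the same long-run distribution, which is what makes stationarity useful for sampling.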

Stationarity of Markov chains is one fundamental property for MCMC sampling. It is in fact the basis for replicating some target distribution. However, in most problems of interest (such as the estimation of football results), the distribution π is absolutely continuous. As a result MCMC theory is instead based upon discrete-time Markov chains with continuous state spaces, and so the properties mentioned earlier have to be revisited from this perspective.

3.5 General state-space Markov Chains

In this respect, the proper definition of a time-homogeneous Markov chain $\{X_n; n \geq 0\}$ on a continuous state-space E can be regarded as follows:

$$\mathbb{P}[X_{n+1} \in A_{n+1} \mid X_n = x_n, X_{n-1} \in A_{n-1}, \ldots, X_1 \in A_1] = \mathbb{P}[X_{n+1} \in A_{n+1} \mid X_n = x_n]$$

for all $n \in \mathbb{N}$, $x_n \in E$ and $A_1, \ldots, A_{n+1} \subset E$. So differently from before, the chain is no longer restricted to a countable number of possible values. Instead, it can now take any value over some continuous interval.

Furthermore, given an initial distribution the evolution process is governed by a transition kernel P which is defined as:

$$P(x, A) = \mathbb{P}[X_{n+1} \in A \mid X_n = x]$$

for all measurable sets A and x ∈ E. So if ν represents the initial probability distribution on E, and P is the transition kernel, νP renders the position distribution of the Markov chain exactly after one step:

$$\nu P(A) = \int P(x, A) \, \nu(dx).$$

On the other hand, if h is a real valued function on E, one can define two more functions:

$$Ph(x) = \int h(y) \, P(x, dy) \quad \text{and} \quad \nu h = \int h(x) \, \nu(dx).$$

In this context, one should also consider first returns to some set A ⊂ E. Denoted by $\tau_A$, these are defined as:

$$\tau_A = \inf\{ n \ge 1 : X_n \in A \},$$

and by convention, a chain will not return to A if $\tau_A = \infty$.

Another strong condition for the general state-space Markov chains is the detailed balance property. It implies that around any closed cycle of states, no net flow of probability takes place. Moreover a Markov process that satisfies the detailed balance condition is said to be reversible.

Theorem: A Markov chain $\{X_n\}$ on a state-space E is reversible with respect to a probability distribution π on E, if and only if $P(x, dy)$ satisfies the following relation:

$$\pi(dx) \, P(x, dy) = \pi(dy) \, P(y, dx)$$

for all x, y ∈ E.

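Proof: According to the Bayes rule for conditional probability, we have:

$$\mathbb{P}(X_n \in A \mid X_{n+1} \in B) = \frac{\mathbb{P}(X_{n+1} \in B \mid X_n \in A) \, \mathbb{P}(X_n \in A)}{\mathbb{P}(X_{n+1} \in B)}.$$

However, if we assume that the Markov chain is reversible, $\mathbb{P}(X_n \in A \mid X_{n+1} \in B) = \mathbb{P}(X_{n+1} \in A \mid X_n \in B)$ and thus,

$$\mathbb{P}(X_{n+1} \in A \mid X_n \in B) = \frac{\mathbb{P}(X_{n+1} \in B \mid X_n \in A) \, \mathbb{P}(X_n \in A)}{\mathbb{P}(X_{n+1} \in B)}$$

such that, with $X_n$ and $X_{n+1}$ both distributed according to π,

$$\Rightarrow \mathbb{P}(X_n \in B, X_{n+1} \in A) = \mathbb{P}(X_n \in A, X_{n+1} \in B) \quad \text{for all } A, B \subset E$$

$$\Rightarrow \int_{x \in A} \int_{y \in B} \pi(dy) \, P(y, dx) = \int_{x \in A} \int_{y \in B} \pi(dx) \, P(x, dy) \quad \text{for all } A, B \subset E$$

$$\Rightarrow \pi(dx) \, P(x, dy) = \pi(dy) \, P(y, dx) \quad \text{for all } x, y \in E.$$

□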

Theorem: If a Markov chain is reversible, there exists a unique stationary probability measure π for the chain such that:

$$\pi(dy) = \int \pi(dx) \, P(x, dy). \qquad (3.5.1)$$

Proof:

$$\int \pi(dx) \, P(x, dy) = \int \pi(dy) \, P(y, dx) = \pi(dy) \int P(y, dx) = \pi(dy)$$

□
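The step from detailed balance to stationarity can be illustrated on a finite state space. The sketch below is only an illustration under stated assumptions: the symmetric weight matrix `w` is hypothetical, the chain moves from i to j with probability proportional to `w[i][j]`, and π is taken proportional to the row sums of `w`.

```python
# Hypothetical symmetric weights, w[i][j] == w[j][i]; illustrative values only.
w = [
    [1.0, 2.0, 0.0],
    [2.0, 1.0, 3.0],
    [0.0, 3.0, 2.0],
]
n_states = len(w)

row_sums = [sum(row) for row in w]
total = sum(row_sums)

# Transition probabilities and candidate stationary distribution.
P = [[w[i][j] / row_sums[i] for j in range(n_states)] for i in range(n_states)]
pi = [row_sums[i] / total for i in range(n_states)]

# Detailed balance: pi_i * P_ij == pi_j * P_ji for every pair of states.
balanced = all(
    abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12
    for i in range(n_states)
    for j in range(n_states)
)

# Stationarity follows by summing the detailed balance relation over i.
pi_P = [sum(pi[i] * P[i][j] for i in range(n_states)) for j in range(n_states)]

print("detailed balance holds:", balanced)
print("pi   =", pi)
print("pi P =", pi_P)
```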

Then assuming that a Markov chain has transition kernel P and stationary distribution π, the sample path averages $\bar{f}_n$ should converge to the corresponding expectations πf for any initial distribution. And for this to be possible the chain must firstly be irreducible, such that all interesting sets on the state-space can be reached.

Definition: A Markov chain is π-irreducible for a probability distribution π on E if π(A) > 0 for a set A ⊂ E implies that,

$$\mathbb{P}_x\{\tau_A < \infty\} > 0,$$

where $\mathbb{P}_x$ represents the probability law of a Markov chain that starts with $X_0 = x$, for all x ∈ E.

Consequently it can be said that if a Markov chain is π-irreducible with respect to some distribution π, the chain is also irreducible, and π can be considered as an irreducibility distribution for it. In addition, the possibility of repeatedly reaching the same sets in the long run is represented by the recurrence property.

Definition: A π-irreducible Markov chain is recurrent if any set A ⊂ E with π(A) > 0 satisfies:

$$\mathbb{P}_x\{X_n \in A \text{ infinitely often}\} > 0 \quad \text{for all } x,$$

$$\mathbb{P}_x\{X_n \in A \text{ infinitely often}\} = 1 \quad \text{for } \pi\text{-almost all } x.$$

It is thus evident that this recurrence differs from that found in the discrete case. Here it is in fact regarded as a property for an entire irreducible chain, and not defined for individual states anymore. Moreover if an irreducible recurrent chain has a stationary probability distribution, it is said to be positive recurrent. On the other hand if the second condition of the previous definition is strengthened to hold everywhere rather than almost everywhere, such that:

$$\mathbb{P}_x\{X_n \in A \text{ infinitely often}\} = 1$$

for all x ∈ E, we obtain a stronger concept which we call Harris recurrence.

3.6

Convergence

At this point, the ultimate goal (which plays a very important role for the next section) is to actually prove that the transition kernels truly converge to some stationary distribution. In fact, by running a sufficiently long chain, we can deduce that the total variation distance will eventually tend to 0.

Theorem: Assuming an aperiodic, π-irreducible Markov chain with transition kernel P and stationary distribution π, then,

$$\lim_{n\to\infty} \| P^n(x, \cdot) - \pi \| = 0$$

for π-almost all x. If the transition kernel is positive Harris recurrent, this convergence can be further extended to all x.

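Proof: Suppose that the total variation distance can be represented as,

$$\| P^n(x, \cdot) - \pi \| = \int | P^n(x, y) - \pi(y) | \, dy.$$

Also let,

$$P^n(x, y) = {}_{\alpha}P^n(x, y) + \sum_{j=0}^{n} \mathbb{P}_x(\tau_\alpha = j) \, P^{n-j}(\alpha, y) \quad \text{and} \quad P^n(\alpha, y) = \sum_{j=0}^{n} P^j(\alpha, \alpha) \, {}_{\alpha}P^{n-j}(\alpha, y)$$

be the first entrance and last exit decompositions, where ${}_{\alpha}P^n(x, y)$ refers to paths from x to y that avoid α at the intermediate steps. Then if we denote,

$$a_x(n) = \mathbb{P}_x(\tau_\alpha = n), \qquad u(n) = \mathbb{P}_\alpha(X_n = \alpha), \qquad t_y(n) = {}_{\alpha}P^n(\alpha, y),$$

where α is a fixed reference state, and apply the convolution notation $a * b(n) = \sum_{j=0}^{n} a(j) \, b(n - j)$:

$$\sum_{j=0}^{n} a_x(j) \, P^{n-j}(\alpha, y) = a_x * P^{(\cdot)}(\alpha, y)(n)$$

and

$$P^n(\alpha, y) = \sum_{j=0}^{n} u(j) \, t_y(n - j) = u * t_y(n).$$

Thus combining these two equations would give:

$$P^n(x, y) = {}_{\alpha}P^n(x, y) + a_x * u * t_y(n),$$

such that the total variation distance is then bounded by three terms:

$$\| P^n(x, \cdot) - \pi \| \le \int {}_{\alpha}P^n(x, y) \, dy + \int | a_x * u - \pi(\alpha) | * t_y(n) \, dy + \int \pi(\alpha) \sum_{j=n+1}^{\infty} t_y(j) \, dy$$

for any x, y, α ∈ E.

However for the first term,

$$\int {}_{\alpha}P^n(x, y) \, dy = \mathbb{P}_x(\tau_\alpha \ge n) \to 0$$

for all x from Harris recurrence.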

Meanwhile from result (3.5.1) which proves the existence of an invariant measure,

$$\pi(y) = \pi(\alpha) \sum_{j=1}^{\infty} t_y(j),$$

and if we integrate both sides,

$$\int \pi(y) \, dy = \pi(\alpha) \sum_{j=1}^{\infty} \int t_y(j) \, dy < \infty.$$

Thus the third term,

$$\int \pi(\alpha) \sum_{j=n+1}^{\infty} t_y(j) \, dy$$

should also tend to 0 as n goes to infinity.

Furthermore, using the finiteness of $\pi(\alpha) \sum_{j=1}^{\infty} \int t_y(j) \, dy$ and assuming that u is ergodic, such that,

$$\pi(\alpha) = \lim_{n\to\infty} a_x * u(n),$$

the middle term tends to zero as well.

Hence since all the bounds converge to 0, we can confirm that,

$$\lim_{n\to\infty} \| P^n(x, \cdot) - \pi \| = 0$$

for all x.

□

Finally, all these desired properties open doors for the leading method of MCMC sampling.

With the intention of exploring posterior distributions of interest, this can be essentially described as Monte Carlo integration by using Markov chains.

3.7

MCMC

The Markov Chain Monte Carlo technique constructs a Markov chain of the type discussed above, such that it approximates some target distribution π by its stationary distribution.


The general idea behind the process is to compile random samples from some target distribution, properly by using the theory of random walks.

One typical approach involves conditioning. Assuming that X has a distribution π, Y = f(X) while function f is defined on E, we consider,

$$Q(y, A) = \mathbb{P}\{X \in A \mid Y = y\},$$

such that P(x, A) = Q(f(x), A) represents the transition kernel with stationary distribution π.

Moreover P is generally not irreducible, but if one constructs a series of conditioning kernels $P_1, P_2, \dots, P_m$ for a list of several functions, one can derive another kernel $P = P_1 P_2 \cdots P_m$ with stationary distribution π that is also irreducible. For instance, the Gibbs sampling algorithm (which is highly integrated within the key computational software available) relies upon the functions,

$$f_i(x) = f_i(x_1, \dots, x_m) = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_m)$$

for i = 1, ..., m and $x = (x_1, \dots, x_m) \in E$, where E denotes a subset of a product space.

The kernel is important to sample from a conditional distribution $X \mid Y = f(X_n)$ and produce the next state $X_{n+1}$. Moreover, since the conditional distribution is derived from π, the kernel would also have the same stationary distribution π. In case it is not irreducible, a series of kernels can be used again as before. This strategy is employed in the Gibbs sampling, which is however a special case of the original Metropolis algorithm as defined by Hastings.
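The conditioning strategy can be sketched with a two-component Gibbs sampler. This is a minimal illustration under stated assumptions rather than the sampler used in this study: the target is a hypothetical standard bivariate normal distribution with correlation `rho`, for which each full conditional is the normal distribution $N(\rho x_{other}, 1 - \rho^2)$.

```python
import math
import random

def gibbs_bivariate_normal(n_iter, rho, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each sweep applies the two conditioning kernels in turn:
    x1 | x2 ~ N(rho * x2, 1 - rho^2) and x2 | x1 ~ N(rho * x1, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x1, x2 = 0.0, 0.0  # arbitrary starting point
    draws = []
    for _ in range(n_iter):
        x1 = rng.gauss(rho * x2, cond_sd)
        x2 = rng.gauss(rho * x1, cond_sd)
        draws.append((x1, x2))
    return draws

rho = 0.6                                          # hypothetical target correlation
draws = gibbs_bivariate_normal(20000, rho)[1000:]  # drop an initial burn-in

m = len(draws)
mean1 = sum(d[0] for d in draws) / m
mean2 = sum(d[1] for d in draws) / m
var1 = sum((d[0] - mean1) ** 2 for d in draws) / m
var2 = sum((d[1] - mean2) ** 2 for d in draws) / m
cov = sum((d[0] - mean1) * (d[1] - mean2) for d in draws) / m
corr = cov / math.sqrt(var1 * var2)

print("sample means:", mean1, mean2)
print("sample correlation:", corr)
```

Neither conditioning kernel is irreducible on its own, since each one moves along a single coordinate only; it is their composition that explores the whole plane.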

3.8

The Metropolis-Hastings Algorithm

Given a target distribution π with a density (also denoted by π) with respect to a measure μ, the algorithm starts by defining a proper Markov transition kernel,

$$Q(x, dy) = q(x, y) \, \mu(dy).$$

At each state $X_n = x$, a proposal Y is generated from Q(x, ·) for the next state $X_{n+1}$. Then the relation between the current and the proposed state is evaluated according to the acceptance probability,

$$\alpha(x, y) = \begin{cases} \min\left\{ 1, \dfrac{\pi(y) \, q(y, x)}{\pi(x) \, q(x, y)} \right\} & \text{if } \pi(x) \, q(x, y) > 0, \\ 1 & \text{if } \pi(x) \, q(x, y) = 0. \end{cases}$$

Unless it is rejected, the candidate point Y will become the new state. Otherwise the chain has to remain at the same state. So,

$$X_{n+1} = \begin{cases} Y & \text{with probability } \alpha(x, Y), \\ x & \text{with probability } 1 - \alpha(x, Y). \end{cases}$$

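Furthermore, this algorithm can actually be proven to produce a Markov chain $\{X_n\}$ which is reversible with respect to the stationary distribution π(·). In fact according to the detailed balance (reversibility) condition, it satisfies:

$$\pi(dx) \, P(x, dy) = \pi(dy) \, P(y, dx).$$

Proof: Assuming that x ≠ y,

$$\pi(dx) \, P(x, dy) = [\pi(x) \, \mu(dx)] \, [q(x, y) \, \alpha(x, y) \, \mu(dy)]$$

$$= \pi(x) \, q(x, y) \min\left\{ 1, \frac{\pi(y) \, q(y, x)}{\pi(x) \, q(x, y)} \right\} \mu(dx) \, \mu(dy)$$

$$= \min\{ \pi(x) \, q(x, y), \; \pi(y) \, q(y, x) \} \, \mu(dx) \, \mu(dy)$$

$$= \pi(dy) \, P(y, dx),$$

where the last step holds because the minimum is symmetric in x and y.

□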
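The algorithm can be sketched in a few lines. The following is an illustrative random-walk version under stated assumptions, not the sampler used by any particular software: the target is a hypothetical unnormalised density proportional to $x^k e^{-rx}$ on x > 0 (a gamma shape with known mean (k + 1)/r), and the proposal is a symmetric normal step, so that q(x, y) = q(y, x) cancels in the acceptance ratio.

```python
import math
import random

def log_target(x, k=4.0, r=2.0):
    """Log of the unnormalised target density x^k * exp(-r * x) on x > 0."""
    if x <= 0.0:
        return float("-inf")
    return k * math.log(x) - r * x

def metropolis_hastings(n_iter, x0=1.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric normal proposal."""
    rng = random.Random(seed)
    x = x0
    chain = []
    accepted = 0
    for _ in range(n_iter):
        y = rng.gauss(x, step)  # proposal Y drawn from Q(x, .)
        # Symmetric proposal: alpha = min(1, target(y) / target(x)).
        log_alpha = min(0.0, log_target(y) - log_target(x))
        if rng.random() < math.exp(log_alpha):
            x = y               # accept the candidate
            accepted += 1
        chain.append(x)         # on rejection the chain stays where it is
    return chain, accepted / n_iter

chain, acceptance_rate = metropolis_hastings(50000)
kept = chain[1000:]             # discard a burn-in period
estimate = sum(kept) / len(kept)

print("acceptance rate:", acceptance_rate)
print("estimated mean:", estimate, "(exact mean is (4 + 1) / 2 = 2.5)")
```

Only ratios of the target appear in the acceptance step, which is why the normalising constant of the target never has to be computed.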

One can find a variety of different MCMC algorithms which stem from the Metropolis-Hastings algorithm. However the fundamental characteristics of the original version are evidently retained. Differences exist in acceptance probability structures, which sometimes make the process highly comparable to the importance sampling technique. Moreover, irrespective of the algorithm employed, the ultimate goal of any software version remains to make the transition kernels converge to the stationary distribution.

Chapter 4

The Bayesian Approach

4.1

General Framework

Bayesian inference is a branch of statistical inference, where the posterior probability of some statistic is given rather than a decision as to the significance or otherwise of some parameter. This is done in accordance with two distinct sources of information. And in this respect the approach uses the rules of probability to fit distributions over every parameter and unobserved quantity of interest. As a result, a model parameter is thus treated as a random variable rather than as an unknown constant.

With this in mind, a family of distributions that best models the situation under study is firstly selected. Then, the prior beliefs are expressed in terms of a probability density function $f(\theta)$, which represents the possible values within the random distribution parameter Θ. Finally this is all modified with respect to the sample data Y at hand, which is assumed to be interchangeable.

Definition: A sequence of random variables $\{Y_1, \dots, Y_n\}$ is interchangeable if the joint density function remains the same under all permutations of the indices, such that:

$$f(y_1, \dots, y_n) = f(y_{\pi_1}, \dots, y_{\pi_n}),$$

whenever $\pi_1, \dots, \pi_n$ represent a permutation of 1, …, n.

However the main issue regards the way in which the parameters’ prior beliefs are related to the observed evidence. In fact, this is achieved by deriving the conditional density of Θ given some sample data Y = y. Known as the posterior distribution function $f(\theta \mid y)$, this summarises the current state of knowledge about all observable or unobservable parameters of interest. And as the name ‘Bayesian’ itself indicates, the calculations build upon Bayes’ theorem. Figure 4.1 perfectly summarises the whole process.

Figure 4.1: The Bayesian synthesis

4.2

Bayes’ Rule

Assuming that there exists a space Ω where the Y’s and Θ can be defined jointly, we can express the joint probability mass or density function as:

$$f(y_1, \dots, y_n, \theta) = f(y_1, \dots, y_n \mid \theta) \, f(\theta), \quad \text{or} \quad f(y_1, \dots, y_n, \theta) = f(\theta \mid y_1, \dots, y_n) \, f(y_1, \dots, y_n).$$

Then, combining these equations together will produce the posterior probability density function of the distribution parameters given the sample data Y:

$$f(\theta \mid y_1, \dots, y_n) = \frac{f(y_1, \dots, y_n, \theta)}{f(y_1, \dots, y_n)} = \frac{f(y_1, \dots, y_n \mid \theta) \, f(\theta)}{f(y_1, \dots, y_n)}, \qquad (4.2.1)$$

where $f(y_1, \dots, y_n)$ represents the total summation (or integration, depending on the nature of the parameters) over all values of θ available.

In fact $f(y_1, \dots, y_n)$ is actually a marginal function of $(y_1, \dots, y_n)$ only, since θ is integrated out. Herein result (4.2.1) can be simplified into:

$$f(\theta \mid y_1, \dots, y_n) = f(y_1, \dots, y_n \mid \theta) \, f(\theta) \times c, \qquad (4.2.2)$$

where c is a constant of proportionality. However this result is also subject to change according to the introduction of likelihood functions.

4.2.1 Likelihood Functions

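The likelihood is a function that represents all information within the observed sample $(y_1, \dots, y_n)$ about the parameter values θ of our statistical model. So given that the random variables $\{Y_1, \dots, Y_n\}$ are independent and identically distributed,

$$L(\theta ; y_1, \dots, y_n) = f(y_1 \mid \theta) \cdots f(y_n \mid \theta) = \prod_{i=1}^{n} f(y_i \mid \theta).$$

For instance if we assume that goals follow a Poisson distribution, the likelihood function would be:

$$f(y_i \mid \theta) = \frac{\theta^{y_i} e^{-\theta}}{y_i!},$$

and so the likelihood for the whole sample becomes:

$$L(\theta ; y_1, \dots, y_n) = \prod_{i=1}^{n} \frac{\theta^{y_i} e^{-\theta}}{y_i!} = \theta^{\sum y_i} e^{-n\theta} \prod_{i=1}^{n} \frac{1}{y_i!} \propto \theta^{\sum y_i} e^{-n\theta}, \qquad (4.2.3)$$

where $\prod_{i=1}^{n} \frac{1}{y_i!}$ is treated as a constant of proportionality due to being independent from θ.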
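The proportionality in (4.2.3) can be checked numerically. The sketch below is an illustration only, assuming a small hypothetical sample of goal counts `ys` (not taken from the Serie A data set): it evaluates the full Poisson log-likelihood and the reduced kernel, confirms that they differ by a constant that does not depend on θ, and locates the maximiser on a grid.

```python
import math

ys = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2]  # hypothetical goal counts, for illustration
n, t = len(ys), sum(ys)

def log_likelihood(theta):
    """Full Poisson log-likelihood: sum of y*log(theta) - theta - log(y!)."""
    return sum(y * math.log(theta) - theta - math.lgamma(y + 1) for y in ys)

def log_kernel(theta):
    """Log of the kernel theta^(sum y) * exp(-n * theta) from (4.2.3)."""
    return t * math.log(theta) - n * theta

# The difference is the log of prod(1 / y_i!), the same for every theta.
const_a = log_likelihood(0.8) - log_kernel(0.8)
const_b = log_likelihood(2.5) - log_kernel(2.5)

# Grid search for the maximiser; for the Poisson model it is the sample mean.
grid = [0.001 * i for i in range(1, 5001)]
theta_hat = max(grid, key=log_kernel)

print("constant at theta = 0.8:", const_a)
print("constant at theta = 2.5:", const_b)
print("maximiser:", theta_hat, "sample mean:", t / n)
```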


Consequently the next subsection will deal with the appropriate changes to the posterior probability density function that was obtained as result (4.2.2).

4.2.2 Posterior Distribution

Substituting in the likelihood function,

$$f(\theta \mid y_1, \dots, y_n) = L(\theta ; y_1, \dots, y_n) \, f(\theta) \times c.$$

And as a result, we may finally re-write the posterior as proportional to the product of the likelihood and prior:

$$f(\theta \mid y_1, \dots, y_n) \propto L(\theta ; y_1, \dots, y_n) \, f(\theta).$$

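Herein in case we are lacking prior information about goals, we would like to model our prior beliefs by a normal distribution. Hence we would have:

$$f(\theta ; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(\theta - \mu)^2}{2\sigma^2} \right\},$$

and on combination with result (4.2.3), the posterior density function that follows would satisfy:

$$f(\theta \mid y_1, \dots, y_n) \propto \theta^{\sum y_i} e^{-n\theta} \exp\left\{ -\frac{(\theta - \mu)^2}{2\sigma^2} \right\}$$

$$\propto \theta^{\sum y_i} e^{-n\theta} \exp\left\{ -\frac{(\theta^2 - 2\mu\theta + \mu^2)}{2\sigma^2} \right\}$$

$$\propto \theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} \qquad (4.2.4)$$

where any term that is independent of θ is considered as a constant of proportionality.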
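Since (4.2.4) is only known up to a constant, one simple way to work with it is to normalise it numerically on a grid. The following is a minimal sketch under stated assumptions: the goal counts `ys` and the prior parameters `mu` and `sigma` are hypothetical, and the grid is truncated at θ = 8, beyond which the kernel is negligible for these values.

```python
import math

ys = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2]  # hypothetical goal counts
n, t = len(ys), sum(ys)
mu, sigma = 1.5, 1.0                 # hypothetical prior mean and standard deviation

def log_kernel(theta):
    """Log of the posterior kernel (4.2.4)."""
    return (t * math.log(theta) - n * theta
            - (theta ** 2 - 2.0 * mu * theta) / (2.0 * sigma ** 2))

step = 0.001
grid = [step * (i + 0.5) for i in range(8000)]  # midpoints covering (0, 8)

log_vals = [log_kernel(th) for th in grid]
shift = max(log_vals)                           # subtract the maximum to avoid underflow
weights = [math.exp(v - shift) for v in log_vals]

# Normalise so that the density integrates to one (midpoint rule).
c = 1.0 / (sum(weights) * step)
posterior = [c * w for w in weights]

post_mean = sum(th * p for th, p in zip(grid, posterior)) * step
post_mode = grid[max(range(len(grid)), key=lambda i: posterior[i])]

print("posterior mean:", post_mean)
print("posterior mode:", post_mode)
```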

However in this example, the likelihood of this posterior density function depends upon the full sample data. Instead, it is sometimes possible to work with an appropriate function of the same observations in question.


4.2.3 Sufficient Statistics

In view of some sample data $(y_1, \dots, y_n)$, a statistic $T(y_1, \dots, y_n) = t$ is essentially sufficient in the sense that it contains the same amount of relevant information about some unknown parameter of interest Θ, such that:

Definition: The conditional probability distribution of the sample data $(Y_1, \dots, Y_n)$ given t is independent of Θ, and so:

$$\mathbb{P}[Y = y \mid T(Y) = t ; \theta] = \mathbb{P}[Y = y \mid T(Y) = t].$$

And in this respect, a particular theorem by the name of Neyman’s factorisation is frequently consulted because of its ability to isolate the dependence on θ into one multiplicand function.

Theorem: $T = T(Y_1, \dots, Y_n)$ is considered as a sufficient statistic for some parameter θ if and only if there are two functions g and h such that,

$$f(y_1, \dots, y_n \mid \theta) = g(T(y_1, \dots, y_n), \theta) \, h(y_1, \dots, y_n),$$

where h is independent of θ, and g depends on the sample only through T.

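Proof: Consider a discrete random variable, and suppose that T is sufficient for θ. Then if $T(y_1, \dots, y_n) = t$ and $Y_i = y_i$ for all i = 1, …, n,

$$f(y_1, \dots, y_n \mid \theta) = \mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n\}$$

$$= \sum_{t \in T(y_1, \dots, y_n)} \mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = t\} \, \mathbb{P}_\theta\{T(Y_1, \dots, Y_n) = t\}$$

$$= \mathbb{P}_\theta\left[ \bigcup_{t \in T(y_1, \dots, y_n)} \{Y_1 = y_1, \dots, Y_n = y_n \cap T(Y_1, \dots, Y_n) = t\} \right]$$

$$= \mathbb{P}_\theta[\{Y_1 = y_1, \dots, Y_n = y_n\} \cap \{T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\}]$$

$$= \mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\} \times \mathbb{P}_\theta\{T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\}.$$

Hence by definition of sufficiency, $\mathbb{P}\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\}$ is independent of θ and factorisation holds for:

$$g(T(y_1, \dots, y_n), \theta) = \mathbb{P}_\theta\{T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\}$$

and

$$h(y_1, \dots, y_n) = \mathbb{P}\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = T(y_1, \dots, y_n)\}.$$

Conversely, suppose that factorisation holds, then

$$\mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = t\} = \frac{\mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n, T(Y_1, \dots, Y_n) = t\}}{\mathbb{P}_\theta\{T(Y_1, \dots, Y_n) = t\}}$$

$$= \frac{\mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n\}}{\mathbb{P}_\theta\{T(Y_1, \dots, Y_n) = t\}}$$

$$= \frac{g(T(y_1, \dots, y_n), \theta) \, h(y_1, \dots, y_n)}{\sum g(T(y_1, \dots, y_n), \theta) \, h(y_1, \dots, y_n)}$$

$$= \frac{h(y_1, \dots, y_n)}{\sum h(y_1, \dots, y_n)}$$

where the summation extends over all $(y_1, \dots, y_n)$ for which $T(y_1, \dots, y_n) = t$. Then, if:

$T(y_1, \dots, y_n) = t$ ⇒ $\mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = t\}$ does not depend on θ,

$T(y_1, \dots, y_n) \ne t$ ⇒ $\mathbb{P}_\theta\{Y_1 = y_1, \dots, Y_n = y_n \mid T(Y_1, \dots, Y_n) = t\} = 0$ and does not depend on θ as well.

Hence T is a true sufficient statistic for θ.

□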

As a result if we reconsider the goals’ Poisson likelihood that was obtained in (4.2.3), it can be decomposed as:

$$g(t, \theta) = \theta^{t} e^{-n\theta} \quad \text{and} \quad h(y_1, \dots, y_n) = \prod_{i=1}^{n} \frac{1}{y_i!},$$

where $t = \sum_{i=1}^{n} y_i$ is sufficient.
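The practical meaning of sufficiency can be illustrated numerically. In the sketch below, which uses two hypothetical samples of goal counts with the same size and the same total, the likelihood ratio between any two values of θ is identical for both samples, because the factor h cancels and only the total t enters through g.

```python
import math

# Two hypothetical samples with the same n = 10 and the same total t = 13.
sample_a = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2]
sample_b = [5, 0, 0, 4, 0, 0, 0, 0, 3, 1]

def log_likelihood(theta, ys):
    """Full Poisson log-likelihood of the sample ys at rate theta."""
    return sum(y * math.log(theta) - theta - math.lgamma(y + 1) for y in ys)

def log_ratio(theta_1, theta_2, ys):
    """Log of the likelihood ratio L(theta_1) / L(theta_2)."""
    return log_likelihood(theta_1, ys) - log_likelihood(theta_2, ys)

ratio_a = log_ratio(1.0, 1.5, sample_a)
ratio_b = log_ratio(1.0, 1.5, sample_b)

# Closed form implied by g(t, theta): t * log(theta_1 / theta_2) - n * (theta_1 - theta_2).
n, t = len(sample_a), sum(sample_a)
closed_form = t * math.log(1.0 / 1.5) - n * (1.0 - 1.5)

print("log ratio, sample a:", ratio_a)
print("log ratio, sample b:", ratio_b)
print("closed form        :", closed_form)
```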


4.3

Bayesian Estimation

After having defined the fundamental constituents that make up the posterior distribution, we shall next move on to discuss the Bayesian estimation methods for the unknown probability distribution parameters. While starting off with the concepts of point estimation, the focus will eventually turn over the ideas behind interval estimation.

4.3.1 Point Estimation

The first approach to inference is to calculate a single point estimate that should serve as the best possible value of an unknown population parameter. And just as the classical approach has the maximum likelihood estimator, the Bayesian perspective has its own so called generalised maximum likelihood estimation method.

Definition: The generalised maximum likelihood estimate is the value $\hat{\theta}$ of the parameter θ at which the posterior density is at a maximum:

$$\hat{\theta} = \arg\max_{\theta} \{ f(\theta \mid y_1, \dots, y_n) \}.$$

The parameter estimate $\hat{\theta}$ can therefore also be regarded as the mode of the posterior density. Nevertheless rather than presenting such a parameter estimate on its own, it would be appropriate to accompany it with some measure of its statistical behaviour. Herein we can calculate the Bayes risk by referring to the posterior variance.

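Definition: The posterior variance of $\hat{\Theta}$ is defined as:

$$\mathrm{Var}(\hat{\Theta} \mid y_1, \dots, y_n) = \mathbb{E}\left[ (\hat{\Theta} - \mathbb{E}(\hat{\Theta}))^2 \mid y_1, \dots, y_n \right]$$

$$= \mathbb{E}\left[ (\hat{\Theta} - \Theta)^2 \mid y_1, \dots, y_n \right]$$

$$= \int (\hat{\Theta} - \theta)^2 \, f(\theta \mid y_1, \dots, y_n) \, d\theta$$

where $\hat{\Theta}$ is assumed to be an unbiased estimator of the real valued parameter Θ with posterior density $f(\theta \mid y_1, \dots, y_n)$.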

However, although point estimation is very practical and efficient, it is sometimes also desirable to go beyond the generation of single ‘best’ estimates.

4.3.2 Interval Estimation

In contrast to point estimation, this method expands the estimation to an interval which is most likely to host the true value of some parameter. While commonly compared to the confidence intervals found in classical statistical inference, these are known as Bayesian credible intervals.

4.3.2.1 Credible Interval

For some level of significance α, a credible interval can be defined as a range under which the posterior probability that the parameter Θ lies in the same interval is taken as (1 − α). In other words this means that with 100(1 − α) % certainty the selected interval contains the parameter in question, which is however quite different from the interpretation of the confidence interval. In contrast, the 100(1 − α) % of the classical framework refers to the amount of selected confidence intervals that are likely to host the parameter under study.

Definition: Suppose that $F(\theta \mid y_1, \dots, y_n)$ represents the posterior cumulative distribution function for the parameter Θ, then we can specify a credible interval which ranges over the limits a and b such that,

$$(1 - \alpha) = \mathbb{P}[a < \Theta < b \mid y_1, \dots, y_n] = F(b \mid y_1, \dots, y_n) - F(a \mid y_1, \dots, y_n)$$

where α stands for some predetermined level of significance.

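Hence according to result (4.2.4), the credible interval [a, b] is derived from:

$$F(a \mid y_1, \dots, y_n) = c \int_0^a \theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} d\theta = \frac{\alpha}{2},$$

and

$$F(b \mid y_1, \dots, y_n) = c \int_0^b \theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} d\theta = 1 - \frac{\alpha}{2},$$

where c is the normalising constant of the posterior density.

However the problem with such interval estimations is that there is more than just one interval with probability (1 − α). Therefore in absence of this interval uniqueness, we need to define something which filters the selection.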
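The two tail conditions above can be solved numerically once the posterior kernel is normalised on a grid. The following is a minimal sketch of an equal-tailed credible interval, assuming hypothetical goal counts `ys`, hypothetical prior parameters `mu` and `sigma`, and a grid truncated at θ = 8.

```python
import math

ys = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2]  # hypothetical goal counts
n, t = len(ys), sum(ys)
mu, sigma, alpha = 1.5, 1.0, 0.05    # hypothetical prior parameters and level

def log_kernel(theta):
    """Log of the posterior kernel with Poisson likelihood and normal prior."""
    return (t * math.log(theta) - n * theta
            - (theta ** 2 - 2.0 * mu * theta) / (2.0 * sigma ** 2))

step = 0.001
grid = [step * (i + 0.5) for i in range(8000)]
log_vals = [log_kernel(th) for th in grid]
shift = max(log_vals)
dens = [math.exp(v - shift) for v in log_vals]
norm = sum(dens) * step
dens = [d / norm for d in dens]      # normalised posterior density on the grid

# Cumulative distribution function by accumulating mass along the grid.
cdf, running = [], 0.0
for d in dens:
    running += d * step
    cdf.append(running)

# Equal-tailed limits: F(a) = alpha / 2 and F(b) = 1 - alpha / 2.
a = next(th for th, F in zip(grid, cdf) if F >= alpha / 2.0)
b = next(th for th, F in zip(grid, cdf) if F >= 1.0 - alpha / 2.0)
coverage = sum(d * step for th, d in zip(grid, dens) if a < th <= b)

print("equal-tailed 95% credible interval:", (a, b))
print("posterior mass inside:", coverage)
```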

4.3.2.2 Highest Posterior Density Interval

The highest posterior density interval is an extension of the credible interval theory. It applies certain conditions which lead to that specific interval that contains the points with the highest posterior densities.

Definition: For a unimodal posterior density, the highest posterior density intervals are essentially those smallest (1 − α) credible intervals that satisfy:

1. $F(b \mid y_1, \dots, y_n) - F(a \mid y_1, \dots, y_n) = 1 - \alpha$, and

2. $f(\theta_1 \mid y_1, \dots, y_n) \ge f(\theta_2 \mid y_1, \dots, y_n)$, for all $\theta_1 \in [a, b]$ & $\theta_2 \notin [a, b]$.

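In other words the principal condition ensures that each point inside the highest posterior density interval has a greater posterior density than the points outside the interval. As a result, given that by (4.2.4):

$$c \int_a^b \theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} d\theta = 1 - \alpha,$$

where c is the normalising constant, we fix a value k:

$$a^{\sum y_i} \exp\left\{ -na - \frac{(a^2 - 2\mu a)}{2\sigma^2} \right\} = b^{\sum y_i} \exp\left\{ -nb - \frac{(b^2 - 2\mu b)}{2\sigma^2} \right\} = k,$$

such that:

For all θ ∉ [a, b]: $\theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} < k$,

For all θ ∈ [a, b]: $\theta^{\sum y_i} \exp\left\{ -n\theta - \frac{(\theta^2 - 2\mu\theta)}{2\sigma^2} \right\} \ge k$.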
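The threshold construction can be sketched numerically: grid cells are collected in order of decreasing posterior density until their mass reaches 1 − α, and for a unimodal density the collected cells form one interval. This is an illustration only, assuming hypothetical goal counts `ys`, hypothetical prior parameters `mu` and `sigma`, and a grid truncated at θ = 8.

```python
import math

ys = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2]  # hypothetical goal counts
n, t = len(ys), sum(ys)
mu, sigma, alpha = 1.5, 1.0, 0.05    # hypothetical prior parameters and level

def log_kernel(theta):
    """Log of the posterior kernel with Poisson likelihood and normal prior."""
    return (t * math.log(theta) - n * theta
            - (theta ** 2 - 2.0 * mu * theta) / (2.0 * sigma ** 2))

step = 0.001
grid = [step * (i + 0.5) for i in range(8000)]
log_vals = [log_kernel(th) for th in grid]
shift = max(log_vals)
dens = [math.exp(v - shift) for v in log_vals]
norm = sum(dens) * step
dens = [d / norm for d in dens]      # normalised posterior density on the grid

# Collect cells from the highest density downwards until mass >= 1 - alpha.
order = sorted(range(len(grid)), key=lambda i: dens[i], reverse=True)
mass, chosen = 0.0, []
for i in order:
    chosen.append(i)
    mass += dens[i] * step
    if mass >= 1.0 - alpha:
        break

lo, hi = min(chosen), max(chosen)
a, b = grid[lo] - step / 2.0, grid[hi] + step / 2.0
k = min(dens[lo], dens[hi])          # density threshold reached at both limits

print("95% HPD interval:", (a, b))
print("density at the limits:", dens[lo], dens[hi])
```

The densities at the two limits agree up to the grid resolution, which is the numerical counterpart of fixing a common value at a and b.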

Moreover, in the case of higher dimensions the definition of a highest posterior density interval can be easily generalised as that (1 − α) credible region that comprises the Θ values with the highest posterior densities. In addition the highest posterior density interval is guaranteed to hold the uniqueness property as shown by the next theorem.

Theorem: Given that the posterior density for all credible intervals with limits (a, b) is never uniform in any interval of the space of θ, the highest posterior density interval exists and is unique.

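Proof: Assuming a unimodal posterior density, we start by defining the Lagrangian as,

$$\mathcal{L} = b - a + \lambda \left[ \int_a^b f(\theta \mid y_1, \dots, y_n) \, d\theta - (1 - \alpha) \right],$$

where λ is in fact a Lagrange multiplier. Then, if we differentiate partially with respect to a and b respectively, and equate to 0:

$$\frac{\partial \mathcal{L}}{\partial a} = -1 - \lambda \, f(a \mid y_1, \dots, y_n) = 0 \implies f(a \mid y_1, \dots, y_n) = -\frac{1}{\lambda},$$

and

$$\frac{\partial \mathcal{L}}{\partial b} = 1 + \lambda \, f(b \mid y_1, \dots, y_n) = 0 \implies f(b \mid y_1, \dots, y_n) = -\frac{1}{\lambda}.$$

Hence for the probability density to be positive, λ should be negative. Moreover, since a unimodal density is increasing at a and decreasing at b, the second order differential terms are as follows:

$$\frac{\partial^2 \mathcal{L}}{(\partial a)^2} = -\lambda \left[ \frac{\partial f(a \mid y_1, \dots, y_n)}{\partial a} \right] > 0,$$

$$\frac{\partial^2 \mathcal{L}}{(\partial b)^2} = \lambda \left[ \frac{\partial f(b \mid y_1, \dots, y_n)}{\partial b} \right] > 0,$$

$$\frac{\partial^2 \mathcal{L}}{(\partial a)(\partial b)} = \frac{\partial^2 \mathcal{L}}{(\partial b)(\partial a)} = 0,$$

such that the Hessian matrix is positive definite, and so the interval (a, b) is truly a minimum. □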

Another common estimation technique uses the empirical Bayes’ estimator, in which the sample data is utilised to decipher the parameters (better referred to as hyperparameters) of the prior distribution. Hence strictly speaking this process violates the Bayes’ theorem which formally requires that the formulation of the prior distribution is totally independent from the actual data. Moreover, since it lacks a natural standard error, it is a major setback when establishing credible intervals or testing hypotheses. As a result it is not deemed very useful for this study.

4.4

Hypothesis Testing

Although the parameter’s posterior distribution generally summarises all the required information, a researcher might need to investigate whether the random parameter θ lies in any particular part of the parameter space $\Omega_\Theta$. As a result, null and alternative hypotheses are established as:

$$H_0 : \Theta \in \Omega_{\Theta_0},$$

$$H_1 : \Theta \in \Omega_{\Theta_1},$$

where $\Omega_{\Theta_0}$ and $\Omega_{\Theta_1}$ are two partitioned subsets of the parameter space $\Omega_\Theta$.

Rejection of a hypothesis is only based upon comparison of the posterior probabilities $\mathbb{P}[H_0 \mid y_1, \dots, y_n]$ and $\mathbb{P}[H_1 \mid y_1, \dots, y_n]$. In contrast to the Frequentist hypothesis testing, these subjective probabilities are in accordance with the data and prior information. As a result it is important to outline that the accepted hypothesis is not necessarily true, but it is temporarily the best in light of the actual data.

Next we shall illustrate the approach for two different types of hypotheses. In the process the new concept of posterior odds ratio will also be introduced as according to Jeffreys’ hypothesis testing criterion which states that: “If the posterior odds ratio exceeds unity, we accept $H_0$; otherwise, we reject $H_0$ in favor of $H_1$.” [Press 2003]

4.4.1 Simple Null & Alternative Hypotheses

The simplest hypothesis testing occurs when both distribution functions of the null and alternative hypotheses are fully specified. Suppose that $\theta_0$ and $\theta_1$ are constants, then the hypotheses would be:

$$H_0 : \Theta = \theta_0,$$

$$H_1 : \Theta = \theta_1.$$

Assuming these hypotheses to be mutually exclusive and exhaustive, we denote the appropriate test statistic for a sample of n readings by T. Then using the Bayes’ theorem, the posterior probabilities for the two hypotheses given the observed data value are:

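$$\mathbb{P}[H_0 \mid T] = \frac{\mathbb{P}[T \mid H_0] \, \mathbb{P}[H_0]}{\mathbb{P}[T \mid H_0] \, \mathbb{P}[H_0] + \mathbb{P}[T \mid H_1] \, \mathbb{P}[H_1]}$$

and,

$$\mathbb{P}[H_1 \mid T] = \frac{\mathbb{P}[T \mid H_1] \, \mathbb{P}[H_1]}{\mathbb{P}[T \mid H_1] \, \mathbb{P}[H_1] + \mathbb{P}[T \mid H_0] \, \mathbb{P}[H_0]}$$

where $\mathbb{P}[H_0]$ and $\mathbb{P}[H_1]$ represent the prior probabilities of the respective hypotheses.

Consequently we can combine these two equations as:

$$\frac{\mathbb{P}[H_0 \mid T]}{\mathbb{P}[H_1 \mid T]} = \frac{\mathbb{P}[T \mid H_0] \, \mathbb{P}[H_0]}{\mathbb{P}[T \mid H_1] \, \mathbb{P}[H_1]},$$

and since $\mathbb{P}[H_0 \mid T]$ and $\mathbb{P}[H_1 \mid T]$ add up to 1, this gives the posterior odds ratio in favour of $H_0$.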
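The posterior odds ratio for two simple hypotheses can be worked through for a Poisson rate. The sketch below is an illustration under stated assumptions: the values `theta_0`, `theta_1`, the equal prior probabilities and the observed total of 13 goals in 10 matches are all hypothetical, and the test statistic T is taken to be the total number of goals, which is Poisson distributed with mean nθ.

```python
import math

n, t = 10, 13                    # hypothetical: 13 goals observed over 10 matches
theta_0, theta_1 = 1.0, 1.5      # hypothetical rates under H0 and H1
prior_h0, prior_h1 = 0.5, 0.5    # hypothetical prior probabilities

def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

# Likelihood of the observed total under each hypothesis.
lik_h0 = poisson_pmf(t, n * theta_0)
lik_h1 = poisson_pmf(t, n * theta_1)

# Posterior probabilities from Bayes' theorem.
evidence = lik_h0 * prior_h0 + lik_h1 * prior_h1
post_h0 = lik_h0 * prior_h0 / evidence
post_h1 = lik_h1 * prior_h1 / evidence

# Posterior odds ratio in favour of H0.
odds = (lik_h0 * prior_h0) / (lik_h1 * prior_h1)
decision = "accept H0" if odds > 1.0 else "reject H0 in favour of H1"

print("P[H0 | T] =", post_h0, " P[H1 | T] =", post_h1)
print("posterior odds ratio:", odds, "->", decision)
```

With these illustrative numbers the odds ratio falls below unity, so Jeffreys' criterion would favour the alternative.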

4.4.2 Simple Null & Composite Alternative Hypotheses

Another common hypothesis testing case is when the distribution function of the alternative hypothesis is partly specified by a range of possible values for θ. Hence assuming a constant value $\theta_0$, the null and alternative hypotheses would be:

$$H_0 : \Theta = \theta_0,$$

$$H_1 : \Theta \ne \theta_0.$$

Moreover assuming that T retains its same significance, the posterior odds ratio in favour of $H_0$ is differently defined as:

$$\frac{\mathbb{P}[H_0 \mid T]}{\mathbb{P}[H_1 \mid T]} = \frac{\mathbb{P}[T \mid H_0] \, \mathbb{P}[H_0]}{\mathbb{P}[T \mid H_1] \, \mathbb{P}[H_1]} = \frac{\mathbb{P}[T \mid H_0, \theta_0]}{\int \mathbb{P}[T \mid H_1, \theta] \, f_1(\theta) \, d\theta} \left[ \frac{\mathbb{P}[H_0]}{\mathbb{P}[H_1]} \right],$$

with $f_1(\theta)$ being the prior density for θ under $H_1$.

Chapter 5

Model Estimation using

Football Data

5.1

Introduction

Having covered the fundamental concepts of both Markov processes and Bayesian statistics, it is time to put the theory into action. This chapter starts off with a concise description of the available data set, and moves on to formulate the problem that will be regarded. A Bayesian hierarchical model (following Baio and Blangiardo (2010)) is consequently presented for a particular football season, and its performance is compared with reference to a generalised linear model that was estimated using the same data.

5.2

Data Set

The study will focus upon the Italian serie A scudetto season 1994/1995, which saw Juventus F.C. dominate the final league table and the introduction of the 3-1-0 points scheme. The main reason behind this selection is that this prestigious football league features some very slow football (in comparison to other major leagues such as the English Premier League or Spanish Liga). As a result some soccer experts argue that the Italian football style gives rise to sequences of matches whose results are somewhat more predictable than for other leagues. Hence it is quite plausible to expect that such football is more susceptible to mathematical modelling.

The corresponding data set (which is also enclosed in Appendix C(C.1)) was downloaded online from the url address: http://www.football-data.co.uk/italym.php. This is actually a free website that collects match statistics and betting odds data for up to 22 European league divisions. In fact its primary intention is basically that of enhancing the development and analysis of football betting systems.

The data set per se comprises the season’s history list of 306 football matches. Each of these observations consists of the final match score (eg. 0 – 1) along with the corresponding opponent teams (eg. Bari vs Lazio, where Bari would be the home side). Such online data files usually contain much more information (such as team line ups, referees, attendances, shots on goal, etc), but we are not interested in using these in our study. Most importantly, the data set is also ordered by the respective match dates.

5.2.1 Descriptive Statistics

In order to gain a better understanding of the data set at hand, this subsection is dedicated to some descriptive statistics for the variables under study. Table 5.1 gives us the summary statistics for the home and away goals scored in all the 306 matches of the season in question. And as expected, the average home goals (1.5621) is quite high in comparison to the average away goals (0.9542).

Summary Statistics

                   home_goals   away_goals
N (Valid)          306          306
Mean               1.5621       .9542
Median             1.0000       1.0000
Std. Deviation     1.31488      1.09748
Minimum            .00          .00
Maximum            8.00         5.00

Table 5.1: Summary statistics of home and away goals.

Meanwhile from the same table, the maximum number of home goals is 8, which in turn exceeds the maximum number of away goals by a count of 3. In addition the respective standard deviations (1.31488 and 1.09748) further indicate that the variability of the home goals is also slightly greater than that of the away goals.

Consequently, one can observe how both variances (1.72891 and 1.20446) exceed the respective sample means. And in fact, in such cases of overdispersion the use of the Poisson distribution for modelling goals is rather questionable. Furthermore Figure 5.1 represents the respective distributions of the home and away goals.

[Figure: bar chart titled "Goals' Distribution"; horizontal axis "Number of Goals" (0 to 8); vertical axis ticks from 0 to 140; series: home_goals, away_goals.]

Figure 5.1: Histogram for the number of home and away goals.
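The overdispersion noted for Table 5.1 can be reproduced with a few lines of arithmetic. The sketch below uses only the means and standard deviations reported in that table; the dispersion index (variance divided by mean) is an additional summary introduced here for illustration, and it equals 1 for a Poisson distribution.

```python
# Means and standard deviations as reported in Table 5.1.
summary = {
    "home_goals": {"mean": 1.5621, "sd": 1.31488},
    "away_goals": {"mean": 0.9542, "sd": 1.09748},
}

results = {}
for name, stats in summary.items():
    variance = stats["sd"] ** 2
    results[name] = {
        "variance": variance,
        # A Poisson variable has variance equal to its mean, so an index above 1
        # signals overdispersion relative to the Poisson assumption.
        "dispersion_index": variance / stats["mean"],
    }

for name, res in results.items():
    print(name, "variance =", round(res["variance"], 5),
          "dispersion index =", round(res["dispersion_index"], 3))
```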

Nevertheless it is very important to comprehend that the collection of goals illustrated above forms a so called heterogeneous population. Had the set of goals originated from one sole team, it would have been a totally different story. However in reality the goals under study come from a mixed group of participating teams, each of which is endowed with its own different qualities and special abilities.

5.3

Defining the Problem

Given the whole list of match observations, it is thus desirable to estimate the necessary parameters that best explain the (just mentioned) different attributes of the teams. Therefore for each football match the study will follow Baio and Blangiardo (2010), and assume a log linear combination of the teams’ attack and defense parameters, such that:

$$\log \theta_1 = home + att_{h} + def_{a},$$

$$\log \theta_2 = att_{a} + def_{h},$$

where $\theta_1$ and $\theta_2$ are the scoring intensities of the home and away sides, and the subscripts h and a refer to the home and away team of the match.
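The log-linear structure can be made concrete with a small calculation. The sketch below plugs in the posterior means that Table 5.2 reports for the home effect and for the attack and defense effects of Juventus and Bari; the fixture itself is chosen purely for illustration, and exponentiating posterior means gives only a rough plug-in value rather than the posterior mean of the scoring intensity.

```python
import math

# Posterior means as reported in Table 5.2.
home = 0.4919
att = {"Juventus": 0.3164, "Bari": -0.02083}
defense = {"Juventus": -0.325, "Bari": -0.04996}

def scoring_intensities(home_team, away_team):
    """Plug-in scoring intensities for one match under the log-linear model.

    log(theta_1) = home + att[home team] + def[away team]
    log(theta_2) = att[away team] + def[home team]
    """
    log_theta_home = home + att[home_team] + defense[away_team]
    log_theta_away = att[away_team] + defense[home_team]
    return math.exp(log_theta_home), math.exp(log_theta_away)

theta_home, theta_away = scoring_intensities("Juventus", "Bari")

print("expected goals, Juventus at home:", round(theta_home, 3))
print("expected goals, Bari away:", round(theta_away, 3))
```

A negative defense effect lowers the opponent's intensity, so it marks a strong defence under this parameterisation.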

Consequently we shall indulge in an exercise to extract comparable parameters from a generalised linear model. However at this point, it should be clear that this reference model is not expected to provide an exceptional fit. One should keep in mind that the assumption of goals that follow a poisson distribution holds only because the poisson theory is one of the most evolved theories available at the moment. One flaw of this distribution simply arises from the absence of a theoretical upper bound, and hence match goals can unrealistically tend to infinity.

5.4

Bayesian Modelling Procedure

First of all, the data set in question was very slightly modified according to the general requirements of WinBUGS. Apart from the specific text file format, the participating teams were ordered alphabetically and consequently assigned a number from 1 to 18. Following the Bayesian trend, all random parameters were then assigned a suitable prior distribution (with the normal and gamma being the main protagonists). Furthermore all the 18 participating teams were further identified as top, mid or low-table teams according to the general perception of the respective attack and defense potential.

The model (which is attached in Appendix B(B.1)) was then implemented in WinBUGS along with the use of a special file of initial values (also attached in Appendix B(B.2)). In fact this particular file of initial values was directly obtained from professor Gianluca Baio himself, and comprises a series of sensible initial values for all important nodes of the model. He advised me that due to the complexity of the model, one cannot let WinBUGS generate the initial values on its own as it will tend to produce unacceptable values that will freeze the simulation process.

5.4.1 Results

Table 5.2 represents the summary statistics obtained for the posterior distributions of the parameters of interest. These comprise the posterior mean along with the corresponding standard deviation and MC error of each node, as well as the median and the 95 % credible interval. The simulation consisted of 30,000 iterations, but the first 500 were discarded during the burn-in process (in order for the final estimates to become independent from the arbitrary initial values).

node                  mean       sd        MC error   2.5%       median     97.5%
home                  0.4919     0.06946   8.836E-4   0.3569     0.4916     0.6279
att[Bari]             -0.02083   0.1142    0.001061   -0.2683    -0.01662   0.2016
att[Brescia]          -0.5688    0.2154    0.002727   -1.042     -0.5506    -0.2024
att[Cagliari]         -0.02697   0.1152    0.001097   -0.2785    -0.02124   0.1931
att[Cremonese]        -0.2481    0.1549    0.002108   -0.5523    -0.2478    0.05275
att[Fiorentina]       0.3537     0.1237    0.00181    0.1102     0.3528     0.5982
att[Foggia]           -0.2732    0.1558    0.002064   -0.583     -0.2716    0.02916
att[Genoa]            -0.2382    0.1577    0.002163   -0.545     -0.2388    0.07062
att[Internazionale]   -0.158     0.1573    0.002233   -0.4657    -0.1577    0.1468
att[Juventus]         0.3164     0.1227    0.001838   0.06996    0.3171     0.5547
att[Lazio]            0.4007     0.1255    0.001887   0.1567     0.3991     0.6536
att[Milan]            0.2546     0.1305    0.001928   -0.01348   0.2594     0.5
att[Napoli]           -0.02067   0.1144    0.00106    -0.2726    -0.01567   0.198
att[Padova]           -0.1918    0.1593    0.002331   -0.4997    -0.1931    0.1212
att[Parma]            0.2307     0.1335    0.001991   -0.04523   0.2361     0.4793
att[Reggiana]         -0.4164    0.1739    0.002437   -0.7872    -0.4075    -0.1031
att[Roma]             0.0367     0.1126    0.001029   -0.1813    0.03124    0.2816
att[Sampdoria]        0.234      0.1332    0.001988   -0.04135   0.2408     0.4794
att[Torino]           0.02676    0.1144    0.001237   -0.1981    0.02175    0.2725

def[Bari] def[Brescia] def[Cagliari] def[Cremonese] def[Fiorentina] def[Foggia] def[Genoa] def[Internazionale] def[Juventus] def[Lazio] def[Milan] def[Napoli] def[Padova] def[Parma] def[Reggiana] def[Roma] def[Sampdoria] def[Torino] -0.04996

0.2675

-0.2468

-0.2662

0.227

0.03375

0.02213

-0.3145

-0.325

-0.2963

-0.3261

-0.01576

0.2175

-0.3392

0.1913

-0.4264

-0.2775

0.01925

0.1188

0.125

0.1408

0.1393

0.1261

0.1125

0.1107

0.1354

0.1359

0.1372

0.1364

0.1139

0.1266

0.1369

0.1273

0.1545

0.138

0.1122

0.001425

0.001689

0.002035

0.002242

0.001766

0.001149

0.001098

0.002029

0.002123

0.002195

0.002108

0.001257

0.00165

0.002139

0.001631

0.002317

0.002035

0.001167

-0.3214

0.02425

-0.5103

-0.5279

-0.02151

-0.1847

-0.2011

-0.5807

-0.5995

-0.5614

-0.596

-0.2595

-0.03289

-0.616

-0.06106

-0.763

-0.5425

-0.207

-0.04015

0.2669

-0.2527

-0.2712

0.2274

0.02811

0.01866

-0.3149

-0.3234

-0.2993

-0.325

-0.0119

0.2181

-0.3372

0.1912

-0.4145

-0.2814

0.01663

0.1656

0.5166

0.04289

0.02059

0.4744

0.2736

0.2519

-0.04452

-0.05822

-0.01356

-0.05745

0.2068

0.4643

-0.0726

0.4407

-0.152

0.003826

0.2531

Table 5.2: Bayesian estimation of the main parameters

The home effect was estimated at 0.4919 and is assumed to be constant across all participating teams. All other nodes represent the attack and defense parameters of the respective teams, for which we next present the posterior density plots.
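To make the role of these nodes concrete, the following minimal Python sketch turns posterior means from Table 5.2 into expected goals for a single fixture. It assumes the log-linear structure of Baio and Blangiardo (2010), namely log(lambda_home) = home + att[home team] + def[away team] and log(lambda_away) = att[away team] + def[home team]. The function name `scoring_intensity` is hypothetical, and plugging in posterior means is only an approximation to the posterior mean of the intensity itself.

```python
import math

# Posterior means copied from Table 5.2 (two teams only, for illustration).
HOME = 0.4919
ATT = {"Juventus": 0.3164, "Brescia": -0.5688}
DEF = {"Juventus": -0.325, "Brescia": 0.2675}


def scoring_intensity(home_team, away_team):
    """Return (expected home goals, expected away goals).

    Hypothetical helper: assumes the log-linear model described above
    and is not part of the WinBUGS code in Appendix B.
    """
    log_home = HOME + ATT[home_team] + DEF[away_team]
    log_away = ATT[away_team] + DEF[home_team]
    return math.exp(log_home), math.exp(log_away)


lam_home, lam_away = scoring_intensity("Juventus", "Brescia")
print(f"Juventus (home): {lam_home:.2f} goals, Brescia (away): {lam_away:.2f} goals")
```

A positive defense parameter for the away side raises the home side's intensity, which is why a higher defense value corresponds to a weaker defense.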

[Figure 5.2: WinBUGS posterior density plots for att[1] to att[18], each based on a sample of 29500 draws.]

Figure 5.2: Posterior densities for the attack parameters.

Each of these posterior densities is proportional to the product of the likelihood function (which formally expresses the data) and the prior distribution (which models our external beliefs), as explained previously in chapter 4. The posterior densities above represent the attack parameters, whilst those that follow comprise the defense parameters of the teams (numbered respectively from 1 to 18).

On a general note, all posterior plots are unimodal and the bulk of each density lies roughly between – 0.5 and 0.5. However, each plot is centred around a different value, namely the posterior mean of the node presented in Table 5.2. All in all, more defense than attack parameters are centred around a negative mean. The majority of the plots are also symmetric.


Nevertheless, we find several exceptions which do not conform to this general description. The most noticeably different posterior densities are those for att[2], att[15] and def[16]. These represent the attack parameters of Brescia and Reggiana (both relegated) and Roma’s defense parameter respectively.

[Figure 5.3: WinBUGS posterior density plots for def[1] to def[18], each based on a sample of 29500 draws.]

Figure 5.3: Posterior densities for the defense parameters.

Furthermore, it is useful to place the posterior means in a more comparable setting, so the respective attack and defense parameters were embedded within the final league table. It is important to understand that higher parameters indicate more goals (i.e. an effective attack in the case of the attack parameter and a poor defense in the case of the defense parameter).


Team             Played   Wins   Draws   Lost   Goals for   Attack parameter   Goals against   Defense parameter   Final points
Bari             34       12     8       14     40          -0.02083           43              -0.04996            44
Brescia (R)      34       2      6       26     18          -0.5688            65               0.2675             12
Cagliari         34       13     10      11     40          -0.02697           39              -0.2468             49
Cremonese        34       11     8       15     35          -0.2481            38              -0.2662             41
Fiorentina       34       12     11      11     61           0.3537            57               0.227              47
Foggia (R)       34       8      10      16     32          -0.2732            50               0.03375            34
Genoa            34       10     10      14     34          -0.2382            49               0.02213            40
Internazionale   34       14     10      10     39          -0.158             34              -0.3145             52
Juventus (C)     34       23     4       7      59           0.3164            32              -0.325              73
Lazio            34       19     6       9      69           0.4007            34              -0.2963             63
Milan            34       17     9       8      53           0.2546            32              -0.3261             60
Napoli           34       13     12      9      40          -0.02067           45              -0.01576            51
Padova           34       12     4       18     37          -0.1918            58               0.2175             40
Parma            34       18     9       7      51           0.2307            31              -0.3392             63
Reggiana (R)     34       4      6       24     24          -0.4164            59               0.1913             18
Roma             34       16     11      7      46           0.0367            25              -0.4264             59
Sampdoria        34       13     11      10     51           0.234             37              -0.2775             50
Torino           34       12     9       13     44           0.02676           48               0.01925            45

Table 5.3: Final league table and the corresponding team parameters estimates.

From Table 5.3, one can see that Juventus (the crowned champions) are assigned one of the highest attack parameters (0.3164). However, Lazio (0.4007) and Fiorentina (0.3537) obtained higher estimates, which is supported by their respective goal tallies (69 and 61), both exceeding the 59 goals scored by Juventus.

Meanwhile, the lowest attack parameter (-0.5688) unsurprisingly belongs to Brescia, who managed merely 18 goals in total. Brescia also had the worst defense parameter (0.2675), and are accompanied at the poor end by the other relegated teams and by exceptions such as Padova and Fiorentina. On the other hand, the best defense parameter (-0.4264) belongs to Roma, who conceded the fewest goals (25) of all.

In addition, the same two sets of estimated parameters were plotted against the final league points. Figure 5.4 confirms the expected opposite linear dependences: final league points tend to increase with the attack parameter and to decrease with the defense parameter.
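The direction of these two relationships can also be checked numerically. The sketch below computes Pearson correlations between the posterior means and the final points, using the attack parameter, defense parameter and final points columns of Table 5.3; the helper `pearson` is written out by hand for illustration and is not part of the original analysis.

```python
import math

# (attack mean, defense mean, final points) per team, copied from Table 5.3.
TEAMS = {
    "Bari": (-0.02083, -0.04996, 44), "Brescia": (-0.5688, 0.2675, 12),
    "Cagliari": (-0.02697, -0.2468, 49), "Cremonese": (-0.2481, -0.2662, 41),
    "Fiorentina": (0.3537, 0.227, 47), "Foggia": (-0.2732, 0.03375, 34),
    "Genoa": (-0.2382, 0.02213, 40), "Internazionale": (-0.158, -0.3145, 52),
    "Juventus": (0.3164, -0.325, 73), "Lazio": (0.4007, -0.2963, 63),
    "Milan": (0.2546, -0.3261, 60), "Napoli": (-0.02067, -0.01576, 51),
    "Padova": (-0.1918, 0.2175, 40), "Parma": (0.2307, -0.3392, 63),
    "Reggiana": (-0.4164, 0.1913, 18), "Roma": (0.0367, -0.4264, 59),
    "Sampdoria": (0.234, -0.2775, 50), "Torino": (0.02676, 0.01925, 45),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


attack = [v[0] for v in TEAMS.values()]
defense = [v[1] for v in TEAMS.values()]
points = [v[2] for v in TEAMS.values()]

r_attack = pearson(attack, points)
r_defense = pearson(defense, points)
print(f"corr(attack, points)  = {r_attack:+.3f}")
print(f"corr(defense, points) = {r_defense:+.3f}")
```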

[Figure 5.4: scatter plot of the estimated attack and defense parameters (horizontal axis "estimated parameters", from -0.8 to 0.6) against the final league points (vertical axis, from 0 to 80).]

Figure 5.4: The estimated team parameters vs the final league points.

Furthermore, WinBUGS provides a Bayesian method for model comparison known as the Deviance Information Criterion (DIC):

DIC = Dbar + pD = Dhat + 2 pD,

where Dbar is the posterior mean of the deviance, Dhat is a point estimate of the deviance, and pD is the effective number of parameters. Table 5.4 comprises the DIC obtained for our two main variables of interest.

             Dbar        Dhat        pD        DIC
Awaygoals     777.969     769.526     8.443     786.412
Homegoals     923.093     909.608    13.485     936.578
total        1701.060    1679.130    21.927    1722.990

Table 5.4: Deviance Information Criterion
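As a quick consistency check, the two expressions for the DIC can be verified against the reported values. The sketch below reads the columns of Table 5.4 as Dbar, Dhat, pD and DIC, in that order (an assumption about the table layout); the dictionary name `DIC_TABLE` is purely illustrative.

```python
# Values read from Table 5.4 (WinBUGS DIC output), per observed node.
DIC_TABLE = {
    "Awaygoals": {"Dbar": 777.969, "Dhat": 769.526, "pD": 8.443, "DIC": 786.412},
    "Homegoals": {"Dbar": 923.093, "Dhat": 909.608, "pD": 13.485, "DIC": 936.578},
}

checks = {}
for node, v in DIC_TABLE.items():
    pd_from_deviances = v["Dbar"] - v["Dhat"]   # pD = Dbar - Dhat
    dic_from_dbar = v["Dbar"] + v["pD"]         # DIC = Dbar + pD
    dic_from_dhat = v["Dhat"] + 2 * v["pD"]     # DIC = Dhat + 2 pD
    checks[node] = (pd_from_deviances, dic_from_dbar, dic_from_dhat)
    print(node, round(pd_from_deviances, 3),
          round(dic_from_dbar, 3), round(dic_from_dhat, 3))
```

Both routes reproduce the tabulated DIC for each node up to rounding.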

As a matter of fact, the smallest DIC is generally considered to indicate the best replication of a data set. However, in our case the effective numbers of parameters are low for both outcomes (pD = 8.443 for away goals and 13.485 for home goals), and the difference of 13.485 – 8.443 = 5.042 is minimal. This suggests that our model is probably too simplistic to cater for the complexity with which the actual data generating mechanism operates. We therefore compare this model with another which introduces a further key factor.


5.5 Poisson Regression Formulation

The development of a comparable Poisson regression model was both challenging and interesting. The first difference to clarify with respect to the Bayesian approach is that we now estimate two models (one for each dependent variable), which will be referred to as:

(i) the dependent home goals model;

(ii) the dependent away goals model.

As a result, we generate two sets of attack and defense parameters for each team. We started by running the simplest possible model using SPSS, and Table 5.5 below reports the respective tests of model effects.

Tests of Model Effects (Type III)

Dependent home goals model
Source         Wald Chi-Square   df   Sig.
(Intercept)    54.109            1    .000
home_team      41.755            17   .001
away_team      35.574            17   .005

Dependent away goals model
Source         Wald Chi-Square   df   Sig.
(Intercept)    7.238             1    .007
home_team      26.584            17   .064
away_team      38.555            17   .002

Table 5.5: Respective tests of model effects for the dependent home goals and dependent away goals models.

However, after much discussion, we asked whether an extra model predictor could be added to improve these results. We eventually came up with the idea of generating a new sequence of team strengths from the same data set. This variable measures the difference between the two teams’ strengths at the time each match was held, based on the teams’ performances over the previous 6 fixtures.

With this in mind, an appropriate program (attached in Appendix A.1) was constructed, and the variable holding this difference in strengths was called TS. In the absence of prior fixtures, as on match day 1, the final league table of the previous season was used to generate the first 9 TS values.
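The construction of TS can be sketched in a few lines of Python. This is an illustrative re-expression of the idea behind the Matlab program in Appendix A.1, not a translation of it: the function name, the data layout and the four-team toy data are all hypothetical, chosen only to keep the example small.

```python
def team_strength_differences(fixtures, prev_table, window=6, win_points=3):
    """Return one TS value per match, in fixture order.

    fixtures   : list of match days; each is a list of
                 (home, away, home_goals, away_goals) tuples.
    prev_table : dict team -> value from the previous season's final table,
                 used only for match day 1 (no earlier fixtures exist).
    TS         : home team's points minus away team's points, summed over
                 up to `window` preceding match days.
    """
    points_by_day = []  # one dict (team -> points won) per completed match day
    ts = []
    for day, matches in enumerate(fixtures):
        recent = points_by_day[max(0, day - window):day]
        for home, away, _, _ in matches:
            if day == 0:
                ts.append(prev_table[home] - prev_table[away])
            else:
                home_pts = sum(d.get(home, 0) for d in recent)
                away_pts = sum(d.get(away, 0) for d in recent)
                ts.append(home_pts - away_pts)
        day_points = {}
        for home, away, hg, ag in matches:
            if hg > ag:
                day_points[home], day_points[away] = win_points, 0
            elif hg == ag:
                day_points[home], day_points[away] = 1, 1
            else:
                day_points[home], day_points[away] = 0, win_points
        points_by_day.append(day_points)
    return ts


# Hypothetical toy league with four teams and three match days.
prev_table = {"A": 10, "B": 8, "C": 6, "D": 4}
fixtures = [
    [("A", "B", 2, 0), ("C", "D", 1, 1)],
    [("B", "C", 0, 1), ("D", "A", 2, 2)],
    [("A", "C", 0, 0), ("B", "D", 3, 1)],
]
ts_values = team_strength_differences(fixtures, prev_table)
print(ts_values)
```

With 18 teams there are 9 matches per match day, which is why exactly the first 9 TS values rely on the previous season's table.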


5.5.1 Results

Having repeated the SPSS Poisson regression with this new variable, we obtained some very interesting results, on which we comment below. We first present Table 5.6, which introduces some characteristics of the TS variable.

Variable Information

Covariate   N     Minimum   Maximum   Mean      Std. Deviation
TS          306   -16       18        -0.2026   5.15639

Table 5.6: Summary statistics of TS variable.

The TS values range from – 16 up to a maximum of 18, with a mean of – 0.2026 and a standard deviation of 5.15639. Table 5.7 then comprises the new tests of model effects for both the dependent home goals and dependent away goals models.

Tests of Model Effects (Type III)

Dependent home goals model
Source           Wald Chi-Square   df   Sig.
(Intercept)      49.86             1    0.000
home_team        45.823            17   0.000
away_team        39.965            17   0.001
team_strengths   4.514             1    0.034

Dependent away goals model
Source           Wald Chi-Square   df   Sig.
(Intercept)      7.182             1    0.007
home_team        28.792            17   0.036
away_team        39.216            17   0.002
team_strengths   2.631             1    0.105

Table 5.7: Newly generated tests of model effects.

On comparing Table 5.7 with Table 5.5, one can notice that the Wald Chi-Squares of the respective intercepts were reduced, whereas the Wald Chi-Squares for the home and away teams all increased. The new variable itself is significant at the 5% level in the dependent home goals model (p = 0.034), though not in the dependent away goals model (p = 0.105). Such a change implies that the contribution of this new variable is worth considering. Subsequently we present Tables 5.8 and 5.9, which comprise the estimated sets of parameters (along with the corresponding standard errors), the 95% Wald confidence intervals and the respective hypothesis tests.


Parameter Estimates

                                            95% Wald Confidence Interval   Hypothesis Test
Parameter          B        Std. Error      Lower      Upper               Wald Chi-Square   df   Sig.
(Intercept)        0.692    0.2651           0.172     1.212               6.812             1    0.009

home_team
[Bari]            -0.053    0.2836          -0.609     0.503               0.035             1    0.852
[Brescia]         -0.746    0.3428          -1.418    -0.075               4.741             1    0.029
[Cagliari]        -0.006    0.2835          -0.562     0.549               0.001             1    0.982
[Cremonese]       -0.134    0.2897          -0.702     0.434               0.213             1    0.644
[Fiorentina]       0.485    0.2577          -0.021     0.99                3.535             1    0.06
[Foggia]          -0.223    0.2974          -0.806     0.36                0.563             1    0.453
[Genoa]           -0.109    0.2896          -0.677     0.458               0.143             1    0.705
[Internazionale]  -0.159    0.2929          -0.733     0.415               0.295             1    0.587
[Juventus]         0.192    0.281           -0.359     0.743               0.468             1    0.494
[Lazio]            0.738    0.2488           0.251     1.226               8.808             1    0.003
[Milan]            0.04     0.2853          -0.519     0.599               0.02              1    0.889
[Napoli]          -0.04     0.2866          -0.602     0.521               0.02              1    0.888
[Padova]          -0.059    0.2869          -0.621     0.504               0.042             1    0.838
[Parma]            0.316    0.2678          -0.208     0.841               1.396             1    0.237
[Reggiana]        -0.628    0.3314          -1.278     0.021               3.593             1    0.058
[Roma]             0.109    0.2805          -0.441     0.658               0.15              1    0.698
[Sampdoria]        0.328    0.2628          -0.187     0.843               1.558             1    0.212
[Torino]           0 (a)    .                .         .                   .                 .    .

away_team
[Bari]            -0.575    0.2795          -1.123    -0.028               4.238             1    0.04
[Brescia]          0.057    0.2481          -0.429     0.543               0.053             1    0.817
[Cagliari]        -0.268    0.2526          -0.763     0.227               1.125             1    0.289
[Cremonese]       -0.352    0.2653          -0.872     0.168               1.763             1    0.184
[Fiorentina]       0        0.2371          -0.465     0.464               0                 1    0.999
[Foggia]           0.005    0.2418          -0.469     0.479               0                 1    0.983
[Genoa]           -0.122    0.2458          -0.604     0.359               0.248             1    0.618
[Internazionale]  -0.645    0.2804          -1.195    -0.096               5.293             1    0.021
[Juventus]        -0.707    0.2856          -1.267    -0.147               6.126             1    0.013
[Lazio]           -0.743    0.2958          -1.323    -0.163               6.306             1    0.012
[Milan]           -0.618    0.2778          -1.162    -0.073               4.948             1    0.026
[Napoli]          -0.319    0.2579          -0.825     0.186               1.533             1    0.216
[Padova]           0.113    0.2304          -0.339     0.564               0.239             1    0.625
[Parma]           -0.637    0.2803          -1.186    -0.088               5.163             1    0.023
[Reggiana]         0.049    0.2423          -0.425     0.524               0.042             1    0.839
[Roma]            -0.731    0.295           -1.309    -0.153               6.143             1    0.013
[Sampdoria]       -0.654    0.2893          -1.221    -0.087               5.107             1    0.024
[Torino]           0 (a)    .                .         .                   .                 .    .

team_strengths    -0.027    0.0125          -0.051    -0.002               4.514             1    0.034
(Scale)            1 (b)

Table 5.8: Parameter estimates for the model with home goals as dependent variable.
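The columns of such a parameter table are tied together by simple formulas, which can be illustrated on the intercept row of Table 5.8 (B = 0.692, Std. Error = 0.2651). The sketch assumes the usual Wald construction for a single-degree-of-freedom parameter: interval B ± z × SE, statistic (B / SE)², and a p-value from the chi-square distribution with 1 degree of freedom. The function name `wald_summary` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_summary(b, se, level=0.95):
    """Return (lower, upper, wald_chi_square, p_value) for one coefficient."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    wald = (b / se) ** 2
    # Upper tail of chi-square(1): P(|Z| > sqrt(wald)) = erfc(sqrt(wald / 2)).
    p_value = math.erfc(math.sqrt(wald / 2))
    return b - z * se, b + z * se, wald, p_value


lower, upper, wald, p_value = wald_summary(0.692, 0.2651)
print(f"CI = ({lower:.3f}, {upper:.3f}), Wald = {wald:.3f}, p = {p_value:.3f}")
```

Small differences in the last digit relative to the SPSS output are expected, since the published B and standard error are themselves rounded.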


Parameter Estimates

                                            95% Wald Confidence Interval   Hypothesis Test
Parameter          B        Std. Error      Lower      Upper               Wald Chi-Square   df   Sig.
(Intercept)       -0.19     0.374           -0.923     0.543               0.257             1    0.612

home_team
[Bari]             0.615    0.3597          -0.09      1.32                2.925             1    0.087
[Brescia]          1.014    0.3542           0.32      1.708               8.195             1    0.004
[Cagliari]        -0.111    0.4181          -0.931     0.708               0.071             1    0.79
[Cremonese]        0.138    0.3941          -0.635     0.91                0.122             1    0.727
[Fiorentina]       0.542    0.3632          -0.17      1.254               2.226             1    0.136
[Foggia]           0.263    0.3825          -0.486     1.013               0.474             1    0.491
[Genoa]            0.397    0.3736          -0.335     1.129               1.13              1    0.288
[Internazionale]   0.143    0.3941          -0.629     0.916               0.132             1    0.716
[Juventus]        -0.053    0.4134          -0.863     0.757               0.017             1    0.898
[Lazio]            0.284    0.3798          -0.46      1.028               0.559             1    0.455
[Milan]           -0.127    0.421           -0.953     0.698               0.092             1    0.762
[Napoli]           0.435    0.3695          -0.289     1.159               1.384             1    0.239
[Padova]           0.407    0.3736          -0.326     1.139               1.185             1    0.276
[Parma]           -0.236    0.43            -1.079     0.607               0.301             1    0.583
[Reggiana]         0.651    0.3709          -0.076     1.378               3.078             1    0.079
[Roma]            -0.468    0.4583          -1.366     0.431               1.042             1    0.307
[Sampdoria]        0.369    0.3738          -0.364     1.101               0.973             1    0.324
[Torino]           0 (a)    .                .         .                   .                 .    .

away_team
[Bari]            -0.211    0.3462          -0.889     0.468               0.37              1    0.543
[Brescia]         -1.612    0.5555          -2.701    -0.524               8.427             1    0.004
[Cagliari]        -0.211    0.3463          -0.89      0.467               0.373             1    0.541
[Cremonese]       -0.588    0.3808          -1.335     0.158               2.387             1    0.122
[Fiorentina]       0.226    0.3151          -0.392     0.843               0.512             1    0.474
[Foggia]          -0.61     0.3825          -1.359     0.14                2.541             1    0.111
[Genoa]           -0.567    0.3805          -1.313     0.179               2.219             1    0.136
[Internazionale]  -0.075    0.3349          -0.731     0.582               0.05              1    0.824
[Juventus]         0.62     0.3023           0.028     1.213               4.209             1    0.04
[Lazio]            0.006    0.3305          -0.642     0.654               0                 1    0.986
[Milan]            0.445    0.2998          -0.143     1.033               2.203             1    0.138
[Napoli]          -0.151    0.34            -0.817     0.516               0.196             1    0.658
[Padova]          -0.444    0.3694          -1.168     0.28                1.442             1    0.23
[Parma]            0.048    0.3363          -0.611     0.707               0.021             1    0.886
[Reggiana]        -0.799    0.4086          -1.6       0.002               3.826             1    0.05
[Roma]            -0.017    0.3248          -0.654     0.62                0.003             1    0.958
[Sampdoria]       -0.167    0.3402          -0.834     0.499               0.242             1    0.623
[Torino]           0 (a)    .                .         .                   .                 .    .

team_strengths     0.025    0.0156          -0.005     0.056               2.631             1    0.105
(Scale)            1 (b)

Table 5.9: Parameter estimates for the model with away goals as dependent variable.


Unlike the Bayesian results, one parameter in each factor of the generalised linear model has been set to 0 so that it serves as the base parameter, and all other parameters are interpreted as differences from it. In the dependent home goals model the base parameter (0.692) is Torino’s home attack, whilst in the dependent away goals model the base parameter (– 0.19) is the same team’s home defense.

From Table 5.8, the best home attack parameter is 0.692 + 0.738 = 1.43 and is associated with Lazio, while the worst is 0.692 – 0.746 = – 0.054 and belongs to Brescia. It is interesting to note that these teams scored the highest (69) and lowest (18) number of goals respectively. These are also the only two home attack estimates with a p-value below 0.05.

A similar story holds for the away attack, where Brescia again has the worst parameter estimate: from Table 5.9, – 0.19 – 1.612 = – 1.802. The best away attack parameter, however, belongs to the league champions: Juventus’ away attack parameter is – 0.19 + 0.62 = 0.43, and it probably embodies the real team strength behind their success.

For the defense parameters the interpretation is reversed: a higher coefficient indicates more goals conceded. Brescia again confirmed its insecure position with the worst home defense parameter (1.014 – 0.19 = 0.824), whereas in the away defense category Padova did the worst with 0.692 + 0.113 = 0.805. Furthermore, the two Roman teams (Roma and Lazio) had the best home (– 0.19 – 0.468 = – 0.658) and away (0.692 – 0.743 = – 0.051) defense parameter estimates respectively. Roma, however, also had a very strong away defense parameter (0.692 – 0.731 = – 0.039).
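The "intercept plus coefficient" arithmetic used above can be reproduced for every team at once. The sketch below does so for the home attack parameters, reading the home_team rows of Table 5.8 in alphabetical order (an assumption about the table layout) with Torino as the reference level at 0. Variable names are illustrative.

```python
INTERCEPT_HOME_MODEL = 0.692  # intercept of the dependent home goals model

# home_team coefficients (B) from Table 5.8; Torino is the base level.
HOME_TEAM_B = {
    "Bari": -0.053, "Brescia": -0.746, "Cagliari": -0.006,
    "Cremonese": -0.134, "Fiorentina": 0.485, "Foggia": -0.223,
    "Genoa": -0.109, "Internazionale": -0.159, "Juventus": 0.192,
    "Lazio": 0.738, "Milan": 0.04, "Napoli": -0.04, "Padova": -0.059,
    "Parma": 0.316, "Reggiana": -0.628, "Roma": 0.109,
    "Sampdoria": 0.328, "Torino": 0.0,
}

# Combined home attack parameter on the scale of the linear predictor.
home_attack = {team: INTERCEPT_HOME_MODEL + b for team, b in HOME_TEAM_B.items()}

best = max(home_attack, key=home_attack.get)
worst = min(home_attack, key=home_attack.get)
print(f"best:  {best} ({home_attack[best]:.3f})")
print(f"worst: {worst} ({home_attack[worst]:.3f})")
```

The same pattern applies to the other three categories by swapping in the relevant intercept and coefficient block from Table 5.8 or Table 5.9.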

Consequently we should consider the goodness of fit of this generalised linear model application. Table 5.10 comprises the associated criteria

\[ \mathrm{AIC} = -2 \ln L(\hat{\theta}) + 2k \]

and

\[ \mathrm{BIC} = -2 \ln L(\hat{\theta}) + k \ln n, \]

where $L(\hat{\theta})$ represents the model likelihood, $k$ refers to the number of parameters, and $n$ is the number of observations.
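One way to sanity-check these criteria against Table 5.10 is to note that, under the standard definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n (assumed here), the difference BIC - AIC equals k (ln n - 2), so the number of parameters can be recovered from the reported values. The sketch takes n = 306, the number of matches reported as N in Table 5.6.

```python
import math

N_MATCHES = 306  # N reported for the TS covariate in Table 5.6

# AIC and BIC values copied from Table 5.10.
CRITERIA = {
    "home goals": {"AIC": 959.556, "BIC": 1093.605},
    "away goals": {"AIC": 813.567, "BIC": 947.616},
}

implied_k = {}
for model, v in CRITERIA.items():
    # BIC - AIC = k * (ln n - 2)  =>  k = (BIC - AIC) / (ln n - 2)
    implied_k[model] = (v["BIC"] - v["AIC"]) / (math.log(N_MATCHES) - 2)
    print(f"{model}: implied number of parameters = {implied_k[model]:.2f}")
```

Both models give an implied k of about 36, which matches one intercept, 17 home_team and 17 away_team coefficients, and the TS coefficient.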


Goodness of Fit

                                        Dependent Variable:   Dependent Variable:
                                        home goals            away goals
Akaike's Information Criterion (AIC)    959.556               813.567
Bayesian Information Criterion (BIC)    1093.605              947.616

Table 5.10: AICs and BICs for the generalised linear model.

Similarly to the DIC statistic considered earlier for the Bayesian model, smaller values of AIC and BIC indicate better model fits. In this respect, both criteria in Table 5.10 show that the Poisson regression fit was better when the away goals were considered as the dependent variable.

Finally, the last part of this section was designed to identify any possible outliers that could have influenced the results. In this respect Figure 5.5 plots the actual differences between the observed and the predicted goals for both the dependent home goals and the dependent away goals models.

[Figure 5.5: two panels, residuals(homegoals) and residuals(awaygoals), plotting the residual of each match against the match index (horizontal axis from 1 to 287 in steps of 22; vertical axis from -4 to 6).]

Figure 5.5: Residuals for the 2 generalised linear models.

From the above residual plots one can note that, in general, the residuals appear random and show no particular pattern. This is a good indication that the fitted models are adequate.
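The residuals plotted here are simply observed minus predicted goals, and unusually large ones can be flagged mechanically. The following sketch uses made-up observed and fitted values (they are not taken from the data set), and the threshold of 3 goals and the name `flag_outliers` are arbitrary illustrative choices.

```python
def flag_outliers(observed, predicted, threshold=3.0):
    """Return (match index, residual) pairs whose |residual| >= threshold."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return [(i, r) for i, r in enumerate(residuals) if abs(r) >= threshold]


# Hypothetical home goals and fitted means for four matches.
observed = [1, 0, 5, 2]
predicted = [1.4, 0.9, 1.3, 1.8]

flagged = flag_outliers(observed, predicted)
for index, residual in flagged:
    print(f"match {index}: residual {residual:+.1f}")
```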

Both plots feature a few exceptions where large residuals are observed. These are the so-called outliers, which mainly result from the most unexpected match scores. For instance, two of the most unexpected heavy wins featured in the above plots are Torino vs Internazionale (5 – 0) and Padova vs Napoli (5 – 0) respectively.

5.6 Bayesian vs GLM

Overall, the two models considered during this study were both interesting for their own characteristics. Whilst the Bayesian method generated one set of attack and defense parameters along with a fixed home effect, the generalised linear model developed two sets of parameters for each team. For instance, consider Juventus’s attacking strength when playing at home:

From Table 5.2: home + att[Juventus] = 0.4919 + 0.3164 = 0.8083

From Table 5.8: intercept + home attack [Juventus] = 0.692 + 0.192 = 0.884

These two results are not the same, but are however quite comparable. Consequently in order to understand whether this was just an exception, the following figure plots the differences between the respective home attack parameters of both models.

[Figure 5.6: bar chart titled "Home Attack Differences", with one bar per team (Bari to Torino) and a vertical axis from -0.6 to 0.2.]

Figure 5.6: Plotting the differences between the home attack parameters.

This graph illustrates that some discrepancies do exist, and essentially the same situation was observed for the remaining 3 categories of parameter estimates.
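The comparison made for Juventus extends directly to other teams. The sketch below repeats it for four teams using the values in Tables 5.2 and 5.8; the sign convention (Bayesian value minus generalised linear model value) is an assumption, since the text does not state which way the plotted differences were taken.

```python
HOME_EFFECT = 0.4919      # Bayesian home effect, Table 5.2
GLM_INTERCEPT = 0.692     # intercept of the home goals model, Table 5.8

# Bayesian attack means (Table 5.2) and GLM home_team coefficients (Table 5.8).
BAYES_ATT = {"Juventus": 0.3164, "Lazio": 0.4007, "Brescia": -0.5688, "Torino": 0.02676}
GLM_HOME_B = {"Juventus": 0.192, "Lazio": 0.738, "Brescia": -0.746, "Torino": 0.0}

differences = {}
for team in BAYES_ATT:
    bayes_home_attack = HOME_EFFECT + BAYES_ATT[team]
    glm_home_attack = GLM_INTERCEPT + GLM_HOME_B[team]
    differences[team] = bayes_home_attack - glm_home_attack  # assumed sign
    print(f"{team:<9} Bayes {bayes_home_attack:+.4f}  "
          f"GLM {glm_home_attack:+.4f}  diff {differences[team]:+.4f}")
```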


However, in conclusion we can outline that although the respective parameter estimates did differ from each other, the main characteristics of the season (which particularly regarded the performances of Juventus, Lazio, Roma, Brescia and Padova amongst others) were pointed out by both models.

Chapter 6

Conclusion

Throughout this dissertation, Markov theory and Bayesian statistics were fundamental in the modelling of football data. In particular, the Markov Chain Monte Carlo technique was useful and interesting to work with, even if it is computationally demanding when estimating the parameters of the participating teams across the various models. Validating and interpreting such models can be quite tricky, but these methods are nowadays used in all fields of application.

Recapitulating our efforts, the football league under study was the Italian Serie A season 1994/1995. This was chosen because of its tendency to feature some of the slowest played football, so that the style could be somewhat more amenable to mathematical modelling. As in many of the studies consulted, the first data characteristic to be noted was overdispersion. This was mainly because the goals considered originate from different teams (with distinct qualities), and thus the population is not homogeneous.

Eventually it was pointed out that the Bayesian application was probably too simplistic to cope with the complexity with which the actual data generating mechanism operates. When compared with a reference generalised linear model, the respective parameter estimates were not particularly close. The dependent variables used were the home goals and away goals sequences. With regard to the generalised linear model, however, the situation became more interesting with the introduction of more predictors.

For example, the generated sequence of team strengths played a very important role in the generalised linear model. This new TS variable was intended to measure the difference in strength between two opposing teams by considering the results of up to 6 previous match days. As confirmed by the change in the Wald Chi-Square statistics, it improved the contributions of both model predictors used (home and away teams).

However, one must also admit that the final conclusions are very much the same. For instance, both models singled out Brescia’s poor performance whilst recognising the strong performances of Juventus and Lazio amongst others. Thus it can be concluded that the two models are quite comparable when explaining the overall season performance, and in this respect the results are considered quite satisfying.

Meanwhile, the Poisson regression model was not originally expected to provide an exceptional fit. The assumption that goals follow a Poisson distribution has its own limitations. For instance, the absence of an upper bound in this distribution theoretically allows match scores to tend to infinity, which is clearly unrealistic.

In addition, when the whole procedure was repeated for a particular season of the English Premier League (EPL), the results were too inconsistent to be considered here. The situation encountered brings to mind Reep and Benjamin’s statement that “chance does dominate the game”. This was probably because of the faster and more spontaneous football played in the EPL. Apart from the fact that each football season is characterised by its own story, this highlights that different countries offer different dynamics and styles, which might need different modelling techniques.

With regard to the future of this field, there is no doubt that software improvements will continue to enhance interest. In a more technical framework, however, the situation could develop more gradually. At this point, rather than modelling team attacks and defenses, an interesting idea which could entice people in the gambling sector is to focus on systems that make use of direct player evaluations.

Appendix A

Matlab m-files

A.1 Calculating team strengths (using previous 6 match days)

Input: List of season’s participating teams (teams), final league table for previous season (c0(1,:)), sequence of scores (scores), and respective teams (home & away).

    winpoints = 3;
    n = 18;
    c = zeros(2*(n-1), n);      % c(i,j): points won by team j on match day i

    % map the home and away team of every fixture to its index in teams
    for i = 1:n*(n-1)
        team = home(i,:);
        k = 0;
        for j = 1:n
            if team == teams(j,:)
                k = j;
            end
        end
        h(i,1) = k;
        team = away(i,:);
        k = 0;
        for j = 1:n
            if team == teams(j,:)
                k = j;
            end
        end
        a(i,1) = k;
    end

    % points collected on each match day (n/2 matches per match day)
    for i = 1:2*(n-1)
        for j = 1:n/2
            k = j + (i-1)*n/2;
            u = scores(k,1) - scores(k,2);
            if u > 0
                c(i,h(k)) = c(i,h(k)) + winpoints;
            elseif u == 0
                c(i,h(k)) = c(i,h(k)) + 1;
                c(i,a(k)) = c(i,a(k)) + 1;
            else
                c(i,a(k)) = c(i,a(k)) + winpoints;
            end
        end
    end

    % first round, first match: use the previous season's final table
    TS = [];
    for k = 1:n/2
        TS = [TS; c0(h(k)) - c0(a(k))];
    end

    % later match days: points difference over the previous (up to) 6 match days
    for i = 2:2*(n-1)
        for j = 1:n/2
            k = j + (i-1)*n/2;
            l = max(1, i-6);
            u = max(1, i-1);
            v = sum(c(l:u,h(k))) - sum(c(l:u,a(k)));
            TS = [TS; v];
        end
    end


Appendix B

WinBUGS files (Baio & Blangiardo (2010))

B.1 Bayesian mixture model

Input: Home & Away teams, scores, attack & defense prior categories, and initial values.

    model {
      for (i in 1:ngames) {
        # observed no of goals
        Homegoals[i] ~ dpois(lambda[i,1])
        Awaygoals[i] ~ dpois(lambda[i,2])

        # Predictive distribution for the number of goals scored
        ynew[i,1] ~ dpois(lambda[i,1])
        ynew[i,2] ~ dpois(lambda[i,2])

        # scoring intensity
        log(lambda[i,1])…...

Premium Essay

...Football refers to a number of sports that involve, to varying degrees, kicking a ball with the foot to score a goal. The most popular of these sports worldwide is association football, more commonly known as just "football" or "soccer". Unqualified, the word football applies to whichever form of football is the most popular in the regional context in which the word appears, including association football, as well as American football, Australian rules football, Canadian football, Gaelic football, rugby league, rugby union[1] and other related games. These variations of football are known as football codes. Various forms of football can be identified in history, often as popular peasant games. Contemporary codes of football can be traced back to the codification of these games at English public schools in the eighteenth and nineteenth century.[2][3] The influence and power of the British Empire allowed these rules of football to spread, including to areas of British influence outside of the directly controlled Empire,[4] though by the end of the nineteenth century, distinct regional codes were already developing: Gaelic Football, for example, deliberately incorporated the rules of local traditional football games in order to maintain their heritage.[5] In 1888, The Football League was founded in England, becoming the first of many professional football competitions. In the twentieth century, the various codes of football have become amongst the most popular team sports in......

Words: 2571 - Pages: 11

Premium Essay

...My senior year of high school football was my toughest. We had a horrible losing record going into our last game, our senior game. A lot of pressure was on all of the team to go out there and get this last victory for the season. The season going the way it was there wasn’t much support from anyone. Our last game was against Pueblo High School and we needed to win. The practices that week were very serious for the remaining seniors. We all knew how important this game for us and the community. We knew that no matter what we had to win. Come game day everybody was prepared both mentally and physically. All of the team was focused and constantly had one thing on our mind, getting the W. Starting stretching and running through pre-game warm-ups was so intense. The look in every senior’s eyes showed everyone we had one mission and it was going to be done. At the start of the game the sun was barely peeking over the mountains as the moon was coming up. The temperature was refreshing just as the feel of the grass was. The noise from crowd was so loud you could hear it from every near street corner. At the coin toss we shook hands with each other showing sportsmanship but inside we were ready pounce on them in an instant. The rising chants from the sideline continued just the kickoff started. We raced downfield for the first of many bone-crushing hits that night. The First quarter was dead stall as each team was trying to gauge each other’s ability and talent. The second quarter...

Words: 830 - Pages: 4

Free Essay

...Empir Software Eng (2010) 15:455–492 DOI 10.1007/s10664-009-9127-7 An experimental comparison of ER and UML class diagrams for data modelling Andrea De Lucia · Carmine Gravino · Rocco Oliveto · Genoveffa Tortora Published online: 11 December 2009 © Springer Science+Business Media, LLC 2009 Editor: Erik Arisholm Abstract We present the results of three sets of controlled experiments aimed at analysing whether UML class diagrams are more comprehensible than ER diagrams during data models maintenance. In particular, we considered the support given by the two notations in the comprehension and interpretation of data models, comprehension of the change to perform to meet a change request, and detection of defects contained in a data model. The experiments involved university students with different levels of ability and experience. The results demonstrate that using UML class diagrams subjects achieved better comprehension levels. With regard to the support given by the two notations during maintenance activities the results demonstrate that the two notations give the same support, while in general UML class diagrams provide a better support with respect to ER diagrams during verification activities. Keywords Controlled experiments · Entity-relation diagrams · UML class diagrams · Design notations · Comprehension · Maintenance · Verification The work described in this paper is supported by the project METAMORPHOS (MEthods and Tools for migrAting software systeMs towards...

Words: 16567 - Pages: 67

Premium Essay

...Football, to me, is more than just a game. I have probably learned more valuable lessons from it than from school. When I joined the team freshman year, I didn’t realize what I was getting into. Even though I had been playing since fourth grade and knew it was hard work, nothing would prepare me for the effort I would put into football that year. We worked all summer in the weight room and ran on the track to get in physical and mental shape before the season. See, football is more of a mental sport than anything else, so running on the track wasn’t only about getting in shape. We would push our minds by running as hard as we could even if we felt like we were going to pass out. At the beginning, I was immature and only thought of myself, sometimes even giving up when I was tired or hurting. Then after the third game I had a season-ending injury. Imagine working all summer and then only being able to play three games! I needed surgery on my arm and at least five months to recover. Needless to say, I was sidelined for the rest of the season, but this actually helped me realize that since you never know when your last play will be, you should try your hardest in football and life. After freshman year I decided that I would always give my best effort. Playing varsity football has taught me so much more than just what my assignments are on a particular play or how to block. I have learned to think about others first, and realized how important working hard......

Words: 435 - Pages: 2

Premium Essay

...Advanced Modelling in Finance using Excel and VBA Mary Jackson and Mike Staunton JOHN WILEY & SONS, LTD Chichester · New York · Weinheim · Brisbane · Singapore · Toronto Copyright 2001 by John Wiley & Sons, Ltd, Baffins Lane, Chichester, West Sussex PO19 1UD, England National 01243 779777 International (+44) 1243 779777 e-mail (for orders and customer service enquiries): cs-books@wiley.co.uk Visit our Home Page on http://www.wiley.co.uk or http://www.wiley.com All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London W1P 9HE, UK, without the permission in writing of the publisher. Other Wiley Editorial Offices John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, USA Wiley-VCH Verlag GmbH, Pappelallee 3, D-69469 Weinheim, Germany John Wiley & Sons Australia Ltd, 42 McDougall Street, Milton, Queensland 4064, Australia John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809 John Wiley & Sons Canada Ltd, 6045 Freemont Blvd, Mississauga, ONT, L5R 4J3, Canada British Library Cataloguing in Publication Data A catalogue record for this book is available from the......

Words: 57326 - Pages: 230

Premium Essay

...whether or not college football players should get paid to play football. I feel like this is a topic that a lot of football players who are transitioning from high school to college discuss frequently. There have been a few controversies over whether college football players have been paid under the table. When you hear that they have been paid under the table, it means that the college they have chosen has given them money secretly, the head football coach has paid them out of their own pocket, or someone they know personally has paid them to play for that school. College football players should not be paid to play football. There are many reasons why they should not get paid. When you hear a certain college’s name, the first thing you hear come out of someone’s mouth is something about how good or bad its football team is. Football players are shown a lot of attention. On their campuses they are very well known, whether they play in the games or sit on the sidelines. Most college football players end up getting a full ride to school because they play football. If they are not paying for school out of pocket, they shouldn’t get paid to do something they are supposed to do. Another reason why they shouldn’t get paid to play football is the fact that other teams that play sports on the campus don’t get paid. Why, out of all teams, should football be the one to get paid? For example, a baseball team on the campus might be better than the football team but......

Words: 494 - Pages: 2

Premium Essay

...Madden NFL 15 Ultimate Edition. With 30 Madden Ultimate Team Pro Packs including NFL superstars of the past and present and a Draft Class Pack featuring 10 of the first round draft picks from the 2014 NFL Draft, you’ve never been more equipped to dominate the field at the start of Madden Season. Lead your ultimate team using a new breed of defense built with a new arsenal of pass rush moves, an intuitive tackling system, improved coverage logic, and immersive new camera angles, making defense exciting and fun to play. Call plays with confidence thanks to an all-new crowd-sourced recommendation engine built from millions of online games played by the Madden community. Add in all-new NFL Films inspired presentation and it’s not just football, it’s Madden Season! Deliver on Defense Bring the Heat – Utilize a new set of pass rush tools to beat your blocker and disrupt the backfield. New mechanics to jump the snap, shed blocks and steer offensive linemen put you in control and make defensive linemen more dangerous than ever. Risk vs. Reward – Defenders can now make aggressive or conservative tackles in the open field, with proximity cones showing the effective range of each. Aggressive tackles can lead to big plays and fumbles, but conservative tackles are more likely to bring the ball-carrier down without giving up extra yards. The choice is yours, but so are the consequences. A New Point of View – See defense through a whole new lens with all-new camera angles...

Words: 486 - Pages: 2

Free Essay

...Kirsten Rogers Most Common Injuries in Football General Purpose: To inform Specific Purpose: To inform the class about the types of injuries sustained in football Introduction: I. Attention Getter: Everyone has either played football or knows someone who has played, right? 1.5 million young men play football, and 1.2 million of these receive football-related injuries. II. Thesis Statement: Football can be a dangerous sport. III. Statement of Importance/Relate to Audience: Many of you have played football or know someone who has played football. IV. Preview: There are three common types of injuries in football: concussions, knee injuries, and shoulder injuries. Body: I. Main Point: Concussions are a common injury in football. a. Sub-point: Symptoms of concussions include headaches, dizziness, nausea, and blurry vision. b. Sub-point: Long-term effects are memory loss, aggression, and personality changes. II. Main Point: Knee injuries are a very common injury. a. Sub-point: An ACL tear is caused by changing direction rapidly or slowing down while running. b. Sub-point: An MCL tear is caused by a direct blow to the outside of the knee. c. Sub-point: A PCL tear is caused by a direct blow to the front of the knee. d. Sub-point: A torn meniscus is caused by pivoting, decelerating, or being tackled. III. Main Point: Shoulder injuries are a common football injury. a. Sub-point: Dislocation can be caused by a fall or hit, where...

Words: 373 - Pages: 2

Premium Essay

...Word Count: 587 This is my own unaided work. SPREADSHEET MODELLING Spreadsheet modelling As a future financial analyst, my work will be based on analysing data gathered in companies’ spreadsheets. Hence it is crucial for my own career success to understand the importance of spreadsheet modelling and its implementation. According to Susan Coles and Jennifer Rowley (1996), “The decision maker’s judgement must be exercised in the interpretation of the data and the final decision making.” There is no better way of benefiting my future career than developing advanced skills in Excel modelling. As I have learned from the few articles I have read on this subject, there are many advantages to improving my skills, as well as problems I should look out for. Firstly, studying advanced Excel modelling not only helps students to build technological knowledge but “also largely improve their practical analytical skills and abilities.” (Junying, 2010). By studying how to model spreadsheets, students get a real chance to see how actual data is analysed by models built into spreadsheets, which helps them to further their own analytical knowledge and see how it works in real life. I think it is hugely beneficial for students in their further real...

Words: 715 - Pages: 3

Premium Essay

...1. What is an entity supertype, and why is it used? It is an entity with one or more subtypes that contains their common characteristics. It is used to take advantage of inheritance, constraints, and discriminators, and to reduce the number of nulls. 2. What kinds of data would you store in an entity subtype? Entity subtypes contain the more unique characteristics: a subtype holds the data that is specific to that entity. E.g. a supertype entity named Student contains a field for the degree being studied, but there could be a subtype entity for each degree, each containing the data that pertains to that degree. 3. What is a specialization hierarchy? It is an arrangement in which a supertype entity branches into one or more subtype (child) entities. 4. What is a subtype discriminator? Give an example of its use. It is the attribute in the supertype entity that determines to which subtype each supertype occurrence is related. A subtype discriminator may also be based on a comparison condition, e.g. Flight Hours (>1,500 or <=1,500). 5. What is an overlapping subtype? Give an example. It is a subtype that contains non-unique subsets of the supertype entity set. E.g. an employee can be both an Administrator and a Teacher. 6. What is the difference between partial completeness and total completeness? Partial completeness means that a supertype occurrence does not need to be a member of any subtype, whereas total completeness means that every supertype occurrence must be a member of at least one subtype. Partial example:......

Words: 1326 - Pages: 6

Premium Essay

...Football and society I am writing this blog in order to highlight the effects football has had on society and how society has changed or influenced football in particular ways. I have been a football fan for as long as I can remember and it has been a huge part of my life. It has affected the way in which I think and also my physical health. If I had been asked when I was young what I wanted to be when I grew up, I would have given the same answer as the majority of my peers, which was to be a professional football player, playing for my favourite team. I’m sure this made my father gravely disappointed, as he would have imagined a completely different path for me: he would have liked me to be a professional footballer playing for his favourite team. The reason I say ‘completely different path’ isn’t just an attempt to be humorous; it is an indication of how seriously some football fans take the game. Calling it a game, in fact, would almost be blasphemous to some, who see it as more of a way of life. Opposing sides are no longer commonly referred to as competitors; they are known as your rivals, implying they are your enemy. Even in the lowest tiers of football, where your average working person plays, the game has been taken so seriously that people have been badly injured through ‘off the field’ violence. Parents and supporters can be seen hurling obscene abuse at children, and even members of the clubs taking part can be witnessed behaving in such a manner that they either have to be sent...

Words: 2546 - Pages: 11

Free Essay

...level for the proposed structures so that the maximum hydropower benefits are yielded through the scheme without compromising safety. The present study intends to investigate the same for the Marala Hydropower Project (MHP) proposed on the Upper Chenab Canal (UCC) off-shooting from Marala Barrage on the River Chenab. In order to define the optimum crest level of the spillway such that there is no negative impact on the discharge passing capacity of the UCC head regulator, a two-pronged strategy has been applied, i.e. using computational fluid dynamics (CFD) software and an analytical approach. The results of the modelling are first compared with the physical model of the said scheme for validation. This study will be helpful for any future hydropower development schemes on irrigation canals close to barrages. Key Words: Low Head Hydropower Development, Canal Regulation, Numerical Modelling, FLOW-3D 1. INTRODUCTION Water is diverted into a canal from a pool behind a barrage through a structure called the canal head regulator. This structure is also used as a regulation device for controlling the amount of water passing into the canal with the help of adjustable gates [1]. The spillway is one of the most important structures of a dam project. It enables the project to dispose of excess water or negotiate floods in either a controlled or uncontrolled manner in order to ensure the safety of the project. The spillway should be designed with utmost care and importance must......

Words: 3280 - Pages: 14

Premium Essay

...Erosion Modelling Soil erosion is a significant environmental process that degrades the soil on which we rely for food, fuel, clean water, carbon storage, and as a substrate for buildings and infrastructure (Quinton 2011). It is the disruption of the soil mantle (the pedosphere) or the underlying rock base (the lithosphere) by the action of external geomorphic factors, such as water, snow, ice, air, weathered debris, organisms and man (Zachar 1982). Both abiotic and biotic forms of erosion form patterns that are typical for a particular area, depending on factors such as climate, relief, nature of the surface, activity of organisms, and activity of man (Zachar 1982). It is the degradation or aggradation of the Earth’s surface by the movement of soil material by wind, rain, overland flow and gravity (ASSIGNMENT). Problems with Erosion The movement of sediment and associated pollutants over the landscape and into water bodies is of increasing concern with respect to pollution control and environmental protection. With the expected change in climate over the coming decades, there is a need to predict how environmental problems associated with sediment are likely to be affected so that appropriate management systems can be put in place (Morgan & Nearing). Erosion can impact the productivity of agricultural, post-mining and native systems and is a sign of land degradation (ASSIGNMENT). Soil erosion acts as a mechanism for transferring pollutants to surface waters and reduces...

Words: 1174 - Pages: 5

Free Essay

...(25) The advent of high-throughput technologies such as next-generation sequencing has led to the generation of a large amount of biological data, including protein sequence data. A full understanding of the biological roles of proteins requires knowledge of their structures. Experimental protein structure determination methods, consisting of X-ray crystallography and NMR spectroscopy, are time consuming, leaving a gap between the generation of sequences and structure prediction. Computational approaches can be used to develop protein structure models, which can be used for the rational design of biochemical experiments, including site-directed mutagenesis, protein stability and functional analysis of proteins. There are three computational approaches to three-dimensional structure prediction, namely homology modeling, threading and ab initio prediction (Xong, 2006). Homology modeling (comparative modeling) is a computational protein structure modeling technique used to build three-dimensional (3D) models of proteins of unknown structure (the target) on the basis of sequence similarity to proteins of known structure (the template). Two conditions must be met to build a useful model: the similarity between the target sequence and the template must be detectable, and a substantially correct alignment between the target and the template should be calculated. Homology modelling is possible because small changes in a protein’s sequence result in small changes in its 3D structure. The 3D structures of......

Words: 1208 - Pages: 5

Free Essay

...Football is a game played with an oval ball, where two teams compete to kick, carry or throw the ball into the other’s end zone or territory. It has a very interesting and extensive history, and many factors go into being a great football player. A player that I look up to and strive to be like is LaDainian Tomlinson. The first open-air game of football took place in 1409. In the past there were a lot of things you could do that you couldn’t today. For example, at present, if you pull someone by their facemask you will be punished with a ten yard penalty. About 40 years ago, the same act would have earned only a simple slap on the wrist. Due to the very physical nature of this sport, football can be very dangerous. History shows players would cut off certain body parts if it meant they would not miss a game, and people hit so hard that they would sometimes pass out or suffer brain damage. A good football player needs talent, perseverance, and drive. A good football player needs the determination to push through the line, to get to the quarterback, to get to the end zone, to keep the play alive, and to not let their man get past them. A good football player will be able to play any position his coach puts him in and play it to the best of his ability. A player with those skills that I know of is LaDainian Tomlinson. Tomlinson was born on June 23, 1979 in Rosebud, Texas and is a free agent in the National Football League (NFL)....

Words: 448 - Pages: 2