`BrierScore.Rd`

The Brier score is a proper score function that measures the accuracy of probabilistic predictions.

`BrierScore(z, class)`

- z
a matrix containing the predicted probabilities of each observation to be classified in one of the classes. Thus, the number of rows must match the length of

`class`

, and the number of columns the number of known classes.- class
a numeric, character vector or factor containing the known class labels for each observation. If

`class`

is a factor, the number of classes is`nlevels(class)`

with classes`levels(class)`

. If`class`

is a numeric or character vector, the number of classes is equal to the number of classes obtained via`unique(class)`

.

The Brier Score is the mean square difference between the true classes and the predicted probabilities.

This function implements the original multi-class definition by Brier (1950), normalized to \([0,1]\) as in Kruppa et al (2014). The formula is the following: $$ BS = \frac{1}{2n} \sum_{i=1}^n \sum_{k=1}^K (C_{ik} - p_{ik})^2 $$ where \(n\) is the number of observations, \(K\) the number of classes, \(C_{ik} = \{0,1\}\) the indicator of class \(k\) for observation \(i\), and \(p_{ik}\) is the predicted probability of observation \(i\) to belong to class \(k\).

The above formulation is applicable to multi-class predictions, including the binary case. A small value of the Brier Score indicates high prediction accuracy.

The Brier Score is a strictly proper score (Gneiting and Raftery, 2007), which means that it takes its minimal value only when the predicted probabilities match the empirical probabilities.

Brier, G.W. (1950) Verification of forecasts expressed in terms of probability. *Monthly Weather Review*, 78 (1): 1-3.

Gneiting, G. and Raftery, A. E. (2007) Strictly proper scoring rules, prediction, and estimation. *Journal of the American Statistical Association* 102 (477): 359-378.

Kruppa, J., Liu, Y., Diener, H.-C., Holste, T., Weimar, C., Koonig, I. R., and Ziegler, A. (2014) Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications. *Biometrical Journal*, 56 (4): 564-583.

```
# multi-class case
class <- factor(c(5,5,5,2,5,3,1,2,1,1), levels = 1:5)
probs <- matrix(c(0.15, 0.01, 0.08, 0.23, 0.01, 0.23, 0.59, 0.02, 0.38, 0.45,
0.36, 0.05, 0.30, 0.46, 0.15, 0.13, 0.06, 0.19, 0.27, 0.17,
0.40, 0.34, 0.18, 0.04, 0.47, 0.34, 0.32, 0.01, 0.03, 0.11,
0.04, 0.04, 0.09, 0.05, 0.28, 0.27, 0.02, 0.03, 0.12, 0.25,
0.05, 0.56, 0.35, 0.22, 0.09, 0.03, 0.01, 0.75, 0.20, 0.02),
nrow = 10, ncol = 5)
cbind(class, probs, map = map(probs))
#> class map
#> [1,] 5 0.15 0.36 0.40 0.04 0.05 3
#> [2,] 5 0.01 0.05 0.34 0.04 0.56 5
#> [3,] 5 0.08 0.30 0.18 0.09 0.35 5
#> [4,] 2 0.23 0.46 0.04 0.05 0.22 2
#> [5,] 5 0.01 0.15 0.47 0.28 0.09 3
#> [6,] 3 0.23 0.13 0.34 0.27 0.03 3
#> [7,] 1 0.59 0.06 0.32 0.02 0.01 1
#> [8,] 2 0.02 0.19 0.01 0.03 0.75 5
#> [9,] 1 0.38 0.27 0.03 0.12 0.20 1
#> [10,] 1 0.45 0.17 0.11 0.25 0.02 1
BrierScore(probs, class)
#> [1] 0.33144
# two-class case
class <- factor(c(1,1,1,2,2,1,1,2,1,1), levels = 1:2)
probs <- matrix(c(0.91, 0.4, 0.56, 0.27, 0.37, 0.7, 0.97, 0.22, 0.68, 0.43,
0.09, 0.6, 0.44, 0.73, 0.63, 0.3, 0.03, 0.78, 0.32, 0.57),
nrow = 10, ncol = 2)
cbind(class, probs, map = map(probs))
#> class map
#> [1,] 1 0.91 0.09 1
#> [2,] 1 0.40 0.60 2
#> [3,] 1 0.56 0.44 1
#> [4,] 2 0.27 0.73 2
#> [5,] 2 0.37 0.63 2
#> [6,] 1 0.70 0.30 1
#> [7,] 1 0.97 0.03 1
#> [8,] 2 0.22 0.78 2
#> [9,] 1 0.68 0.32 1
#> [10,] 1 0.43 0.57 2
BrierScore(probs, class)
#> [1] 0.13381
# two-class case when predicted probabilities are constrained to be equal to
# 0 or 1, then the (normalized) Brier Score is equal to the classification
# error rate
probs <- ifelse(probs > 0.5, 1, 0)
cbind(class, probs, map = map(probs))
#> class map
#> [1,] 1 1 0 1
#> [2,] 1 0 1 2
#> [3,] 1 1 0 1
#> [4,] 2 0 1 2
#> [5,] 2 0 1 2
#> [6,] 1 1 0 1
#> [7,] 1 1 0 1
#> [8,] 2 0 1 2
#> [9,] 1 1 0 1
#> [10,] 1 0 1 2
BrierScore(probs, class)
#> [1] 0.2
classError(map(probs), class)$errorRate
#> [1] 0.2
# plot Brier score for predicted probabilities in range [0,1]
class <- factor(rep(1, each = 100), levels = 0:1)
prob <- seq(0, 1, by = 0.01)
brier <- sapply(prob, function(p)
{ z <- matrix(c(1-p,p), nrow = length(class), ncol = 2, byrow = TRUE)
BrierScore(z, class)
})
plot(prob, brier, type = "l", main = "Scoring all one class",
xlab = "Predicted probability", ylab = "Brier score")
# brier score for predicting balanced data with constant prob
class <- factor(rep(c(1,0), each = 50), levels = 0:1)
prob <- seq(0, 1, by = 0.01)
brier <- sapply(prob, function(p)
{ z <- matrix(c(1-p,p), nrow = length(class), ncol = 2, byrow = TRUE)
BrierScore(z, class)
})
plot(prob, brier, type = "l", main = "Scoring balanced classes",
xlab = "Predicted probability", ylab = "Brier score")
# brier score for predicting unbalanced data with constant prob
class <- factor(rep(c(0,1), times = c(90,10)), levels = 0:1)
prob <- seq(0, 1, by = 0.01)
brier <- sapply(prob, function(p)
{ z <- matrix(c(1-p,p), nrow = length(class), ncol = 2, byrow = TRUE)
BrierScore(z, class)
})
plot(prob, brier, type = "l", main = "Scoring unbalanced classes",
xlab = "Predicted probability", ylab = "Brier score")
```