A new metric to assess the predictive accuracy of multinomial land cover models
Aim: The Earth's land cover is often represented by discrete classes, and predicting shifts between these classes is a major goal in the field. One increasingly common approach is to build models that predict land cover classes with probabilities rather than discrete outcomes. Current assessment approaches have drawbacks when applied to these types of models. In this paper we present a new metric, κmultinomial, which assesses agreement between model predictions and observations while correcting for chance agreement. Location: Global. Methods: κmultinomial is the product of two metrics: the first component measures the agreement in the ranks of the predicted and observed classes, and the second specifies the certainty of the model in the case of discrete observations. We analysed the behaviour of κmultinomial and two alternative metrics, Cohen's kappa (κ) and an extension of the area under the receiver operating characteristic curve to multiple classes (mAUC), when applied to multinomial predictions and discrete observations. Results: Using real and synthetic datasets, we show that κmultinomial, in contrast to κ, can distinguish between models that are very far off and models that are only slightly off. In addition, κmultinomial assigns higher ranks to models that predict the observed classes with a higher probability on average. In contrast, mAUC gives the same score to all models that discriminate perfectly among classes of outcomes, regardless of the certainty with which those classes are predicted. Main conclusions: With κmultinomial we have provided a tool that directly uses the multinomial probabilities for accuracy assessment. κmultinomial may also be applied to cases where model predictions are evaluated against multiple sets of observations, at multiple spatial scales, or compared to reference models. As models develop, we assess how well new models perform compared to the real world.
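To illustrate the limitation of κ mentioned in the abstract, here is a minimal sketch of the standard Cohen's kappa, (p_o - p_e) / (1 - p_e), applied to probabilistic predictions after reducing them to their most probable class. This is not the authors' code and it is not κmultinomial, whose formula is not given in this record. The function names, class names, and probabilities are hypothetical values chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(observed, predicted):
    """Standard Cohen's kappa for two equal-length sequences of class labels."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)
    # Observed agreement: fraction of cases where the labels match.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n**2
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1.0 - p_e)


def hard_labels(prob_rows, classes):
    """Reduce each row of class probabilities to its most probable class."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in prob_rows]


# Hypothetical data: three land cover classes, four observed cells.
classes = ["forest", "grass", "urban"]
observed = ["forest", "grass", "urban", "forest"]

# Model A is confident; its one error puts only 0.1 on the observed class.
model_a = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9], [0.1, 0.8, 0.1]]
# Model B is hesitant; its one error still puts 0.3 on the observed class.
model_b = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4], [0.3, 0.4, 0.3]]

kappa_a = cohens_kappa(observed, hard_labels(model_a, classes))
kappa_b = cohens_kappa(observed, hard_labels(model_b, classes))
print(kappa_a, kappa_b)
```

Both models yield the same hard labels, so both calls return the same κ even though their probabilities differ in certainty and in how far off the single error is. That lost information is what a metric working directly on the multinomial probabilities can use.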
Main Authors: | , , |
---|---|
Format: | Article/Letter to editor |
Language: | English |
Subjects: | Cohen's kappa, Kappa multinomial, Land cover, Model predictive accuracy, Multinomial models, Multiple class AUC, Validation |
Online Access: | https://research.wur.nl/en/publications/a-new-metric-to-assess-the-predictive-accuracy-of-multinomial-lan |