Probabilistic Prediction in scikit-learn
Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
2021 (English). Conference paper, published paper (Other academic)
Abstract [en]

Adding confidence measures to predictive models should increase their trustworthiness, but only if the models are well-calibrated. Historically, some algorithms, such as logistic regression, but also neural networks, have been considered to produce well-calibrated probability estimates off-the-shelf. Other techniques, like decision trees and naive Bayes, are instead infamous for being significantly overconfident in their probabilistic predictions. In this paper, a large experimental study investigates how well calibrated the models produced by a number of algorithms in the scikit-learn library are out-of-the-box, and whether the built-in calibration techniques Platt scaling and isotonic regression, or Venn-Abers, can be used to improve the calibration. The results show that of the seven algorithms evaluated, the only one obtaining well-calibrated models without external calibration is logistic regression. All other algorithms, i.e., decision trees, AdaBoost, gradient boosting, kNN, naive Bayes and random forest, benefit from using any of the calibration techniques. In particular, decision trees, naive Bayes and the boosted models are substantially improved by external calibration. From a practitioner's perspective, the obvious recommendation becomes to incorporate calibration when using probabilistic prediction. Comparing the different calibration techniques, Platt scaling and Venn-Abers generally outperform isotonic regression on these rather small datasets. Finally, the unique ability of Venn-Abers to output not only well-calibrated probability estimates but also the confidence in these estimates is demonstrated.
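The built-in calibration techniques named in the abstract are exposed in scikit-learn through CalibratedClassifierCV, with method="sigmoid" corresponding to Platt scaling and method="isotonic" to isotonic regression (Venn-Abers is not part of scikit-learn). A minimal sketch, using a synthetic dataset rather than the paper's benchmark data, of how an overconfident decision tree can be wrapped in either built-in calibrator:

```python
# Sketch: calibrating a decision tree with Platt scaling ("sigmoid")
# and isotonic regression via scikit-learn's CalibratedClassifierCV.
# The dataset here is synthetic, not the datasets used in the paper.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unpruned tree outputs 0/1 probabilities, i.e. it is overconfident.
base = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("uncalibrated Brier:",
      brier_score_loss(y_te, base.predict_proba(X_te)[:, 1]))

for method in ("sigmoid", "isotonic"):
    cal = CalibratedClassifierCV(DecisionTreeClassifier(random_state=0),
                                 method=method, cv=5).fit(X_tr, y_tr)
    print(method, "Brier:",
          brier_score_loss(y_te, cal.predict_proba(X_te)[:, 1]))
```

The Brier score is one of the calibration-sensitive metrics commonly used to compare such setups; a lower score after wrapping indicates that the external calibration helped.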

Place, publisher, year, edition, pages
2021.
HSV category
Research programme
Handel och IT
Identifiers
URN: urn:nbn:se:hb:diva-26746
OAI: oai:DiVA.org:hb-26746
DiVA id: diva2:1603345
Conference
The 18th International Conference on Modeling Decisions for Artificial Intelligence, On-line (from Umeå, Sweden), September 27 - 30, 2021.
Available from: 2021-10-15 Created: 2021-10-15 Last updated: 2025-09-24 Bibliographically approved

Open Access in DiVA

fulltext (454 kB), 7664 downloads
File information
File: FULLTEXT01.pdf
File size: 454 kB
Checksum (SHA-512): 3f3a32ad20d6aaf05762b93d4705621b803f742e303100da3396a915d33302cd09a796d8a89225b9d4fc761e1f6519b40062b3664f1be59f7a948fd3c1f55373
Type: fulltext
Mimetype: application/pdf

Person
Sweidan, Dirar; Johansson, Ulf
Total: 7664 downloads
The number of downloads is the sum of downloads of all full texts. It may, for example, include earlier versions that are no longer available.
