Conformal Prediction Using Decision Trees
2013 (English) Conference paper, Published paper (Refereed)
Abstract [en]
Conformal prediction is a relatively new framework
in which the predictive models output sets of predictions with
a bound on the error rate, i.e., in a classification context, the
probability of excluding the correct class label is lower than a predefined
significance level. An investigation of the use of decision
trees within the conformal prediction framework is presented,
with the overall purpose of determining the effect of different
algorithmic choices, including split criterion, pruning scheme and
method for calculating the probability estimates. Since the error rate
is bounded by the framework, the most important property of
conformal predictors is efficiency, which concerns minimizing the
number of elements in the output prediction sets. Results from
one of the largest empirical investigations to date within the
conformal prediction framework are presented, showing that in
order to optimize efficiency, the decision trees should be induced
using no pruning and with smoothed probability estimates. The
choice of split criterion to use for the actual induction of the
trees did not turn out to have any major impact on the efficiency.
Finally, the experimentation also showed that when using decision
trees, standard inductive conformal prediction was as efficient as
the recently suggested method cross-conformal prediction. This
is an encouraging result, since cross-conformal prediction uses
several decision trees, thus sacrificing the interpretability of a
single decision tree.
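
The following is a minimal sketch of inductive conformal classification on top of decision-tree leaf statistics, intended only to illustrate the terms used in the abstract. It is not the paper's implementation. Assumptions not taken from the abstract: Laplace smoothing stands in for "smoothed probability estimates", the nonconformity score is one minus the estimated probability of the label, and the helper names (laplace_estimates, nonconformity, prediction_set) and the leaf class counts are hypothetical.

```python
def laplace_estimates(leaf_counts):
    """Laplace-smoothed class probabilities from the class counts in a tree leaf."""
    total = sum(leaf_counts.values())
    n_classes = len(leaf_counts)
    return {c: (n + 1) / (total + n_classes) for c, n in leaf_counts.items()}


def nonconformity(probs, label):
    """Assumed nonconformity function: 1 minus the estimated probability of the label."""
    return 1.0 - probs[label]


def prediction_set(cal_scores, probs, significance):
    """Return every label whose conformal p-value exceeds the significance level."""
    n = len(cal_scores)
    region = set()
    for label in probs:
        score = nonconformity(probs, label)
        # Fraction of calibration examples at least as nonconforming as this label
        p_value = (sum(1 for s in cal_scores if s >= score) + 1) / (n + 1)
        if p_value > significance:
            region.add(label)
    return region


# Hypothetical calibration set: (class counts of the leaf reached, true label)
calibration = [
    ({"a": 9, "b": 1}, "a"), ({"a": 9, "b": 1}, "a"), ({"a": 1, "b": 9}, "b"),
    ({"a": 6, "b": 4}, "a"), ({"a": 3, "b": 7}, "a"), ({"a": 0, "b": 10}, "b"),
    ({"a": 10, "b": 0}, "a"), ({"a": 7, "b": 3}, "a"), ({"a": 2, "b": 8}, "b"),
]
cal_scores = [nonconformity(laplace_estimates(c), y) for c, y in calibration]

# A test instance falling into a leaf with class counts a=9, b=1
test_probs = laplace_estimates({"a": 9, "b": 1})
print(sorted(prediction_set(cal_scores, test_probs, 0.2)))   # ['a']
print(sorted(prediction_set(cal_scores, test_probs, 0.05)))  # ['a', 'b']
```

In this sketch, efficiency corresponds to the size of the returned set: at significance 0.2 the set contains a single label, while the stricter level 0.05 forces both labels to be included.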
Place, publisher, year, edition, pages
IEEE, 2013.
Keywords [en]
Conformal prediction, Decision trees, Data mining, Machine Learning
National Category
Computer Sciences
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hb:diva-7055
DOI: 10.1109/ICDM.2013.85
ISI: 000332874200034
Local ID: 2320/13010
OAI: oai:DiVA.org:hb-7055
DiVA, id: diva2:887762
Conference
IEEE International Conference on Data Mining
Note
Sponsorship:
Swedish Foundation
for Strategic Research through the project High-Performance
Data Mining for Drug Effect Detection (IIS11-0053) and the
Knowledge Foundation through the project Big Data Analytics
by Online Ensemble Learning (20120192)
2015-12-22 2015-12-22 2020-01-29