Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, at almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
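To make the out-of-bag idea concrete, the following is a minimal sketch of a conformal regressor calibrated on out-of-bag residuals, assuming scikit-learn and the standard (non-normalized) absolute-residual nonconformity function; the function name, hyperparameters and significance level are illustrative rather than the paper's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def oob_conformal_intervals(X_train, y_train, X_test, significance=0.1):
    # With oob_score=True the forest stores out-of-bag predictions,
    # so no separate calibration set has to be held out.
    rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
    rf.fit(X_train, y_train)

    # Nonconformity scores: absolute out-of-bag residuals on the training set.
    alphas = np.sort(np.abs(y_train - rf.oob_prediction_))

    # Smallest score exceeding a fraction (1 - significance) of the others,
    # with the usual finite-sample correction.
    k = int(np.ceil((1 - significance) * (len(alphas) + 1))) - 1
    half_width = alphas[min(k, len(alphas) - 1)]

    y_hat = rf.predict(X_test)
    return np.column_stack([y_hat - half_width, y_hat + half_width])
```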
This paper extends the conformal prediction framework to rule extraction, making it possible to extract interpretable models from opaque models in a setting where either the infidelity or the error rate is bounded by a predefined significance level. Experimental results on 27 publicly available data sets show that all three setups evaluated produced valid and rather efficient conformal predictors. The implication is that augmenting rule extraction with conformal prediction allows extraction of models where test set errors or test set infidelities are guaranteed to be lower than a chosen acceptable level. Clearly, this is beneficial in both typical rule extraction scenarios, i.e., when the purpose is to explain an existing opaque model, and when it is to build a predictive model that must be interpretable.
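As a rough illustration of the two settings mentioned above, the sketch below (a hypothetical helper, scikit-learn-style models and integer-encoded labels assumed) shows how the calibration target changes depending on whether the error rate or the infidelity is to be bounded; the paper's actual rule extraction setups are not reproduced here.

```python
import numpy as np

def calibration_scores(extracted_model, opaque_model, X_cal, y_cal, target="error"):
    # Labels are assumed to be integer-encoded, matching predict_proba columns.
    if target == "error":
        labels = y_cal                         # bound the test-set error rate
    else:
        labels = opaque_model.predict(X_cal)   # bound the test-set infidelity
    proba = extracted_model.predict_proba(X_cal)
    # Nonconformity: one minus the probability the extracted (interpretable)
    # model assigns to the chosen target label.
    return 1.0 - proba[np.arange(len(labels)), labels]
```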
Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available datasets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.
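For reference, the two baseline calibration methods compared above can be set up as follows with scikit-learn; the Venn predictor itself requires category-based probability intervals and is not sketched here. Parameter values are illustrative.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(min_samples_leaf=5, random_state=0)

# Platt scaling: fit a sigmoid to the tree's scores on held-out folds.
platt = CalibratedClassifierCV(tree, method="sigmoid", cv=5)

# Isotonic regression: fit a monotone, piecewise-constant mapping instead.
isotonic = CalibratedClassifierCV(tree, method="isotonic", cv=5)

# Usage: platt.fit(X_train, y_train); platt.predict_proba(X_test)
```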
Online predictive modeling of streaming data is a key task for big data analytics. In this paper, a novel approach for efficient online learning of regression trees is proposed, which continuously updates, rather than retrains, the tree as more labeled data become available. A conformal predictor outputs prediction sets instead of point predictions, which for regression translates into prediction intervals. The key property of a conformal predictor is that it is always valid, i.e., the error rate, on novel data, is bounded by a preset significance level. Here, we suggest applying Mondrian conformal prediction on top of the resulting models, in order to obtain regression trees where not only the tree, but also each and every rule, corresponding to a path from the root node to a leaf, is valid. Using Mondrian conformal prediction, it becomes possible to analyze and explore the different rules separately, knowing that their error rate, in the long run, will not exceed the preset significance level. An empirical investigation, using 17 publicly available data sets, confirms that the resulting rules are independently valid, but also shows that the prediction intervals are smaller, on average, than when only the global model is required to be valid. All in all, the suggested method provides a data miner or a decision maker with highly informative predictive models of streaming data.
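A minimal batch sketch of the per-rule (Mondrian) calibration described above is given below, assuming a fitted regression tree (e.g., scikit-learn's DecisionTreeRegressor) and a held-out calibration set; the paper works with incrementally updated trees on streaming data, which is not reproduced here.

```python
import numpy as np

def mondrian_leaf_intervals(tree, X_cal, y_cal, X_test, significance=0.1):
    # The Mondrian category of an example is the leaf (rule) it falls into.
    cal_leaves = tree.apply(X_cal)
    residuals = np.abs(y_cal - tree.predict(X_cal))

    def quantile(alphas):
        alphas = np.sort(alphas)
        k = int(np.ceil((1 - significance) * (len(alphas) + 1))) - 1
        return alphas[min(k, len(alphas) - 1)]

    # One interval half-width per rule; per-rule validity requires each
    # leaf to receive enough calibration examples.
    half_width = {leaf: quantile(residuals[cal_leaves == leaf])
                  for leaf in np.unique(cal_leaves)}
    fallback = quantile(residuals)  # for leaves unseen in the calibration set

    y_hat = tree.predict(X_test)
    widths = np.array([half_width.get(l, fallback) for l in tree.apply(X_test)])
    return np.column_stack([y_hat - widths, y_hat + widths])
```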
In this paper, we study a generalization of a recently developed strategy for generating conformal predictor ensembles: out-of-bag calibration. The ensemble strategy is evaluated, both theoretically and empirically, against a commonly used alternative ensemble strategy, bootstrap conformal prediction, as well as common non-ensemble strategies. A thorough analysis is provided of out-of-bag calibration, with respect to theoretical validity, empirical validity (error rate), efficiency (prediction region size) and p-value stability (the degree of variance observed over multiple predictions for the same object). Empirical results show that out-of-bag calibration displays favorable characteristics with regard to these criteria, and we propose that out-of-bag calibration be adopted as a standard method for constructing conformal predictor ensembles.
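The core p-value computation shared by the calibration-based strategies compared above can be sketched as follows; this is a generic, smoothed p-value, where the calibration scores are assumed to come from out-of-bag predictions under the studied strategy.

```python
import numpy as np

def smoothed_p_value(cal_scores, test_score, rng=np.random):
    # Ties between the test score and the calibration scores are broken
    # uniformly at random, giving exact validity in expectation.
    greater = np.sum(cal_scores > test_score)
    ties = np.sum(cal_scores == test_score)
    tau = rng.uniform()
    return (greater + tau * (ties + 1)) / (len(cal_scores) + 1)
```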
In this paper, we propose a practically useful means of interpreting the predictions produced by a conformal classifier. The proposed interpretation leads to a classifier with a reject option that allows the user to limit the number of erroneous predictions made on the test set, without any need to reveal the true labels of the test objects. The method described in this paper works by estimating the cumulative error count on a set of predictions provided by a conformal classifier, ordered by their confidence. Given a test set and a user-specified parameter k, the proposed classification procedure outputs the largest possible number of predictions containing on average at most k errors, while refusing to make predictions for test objects where it is too uncertain. We conduct an empirical evaluation using benchmark data sets, and show that we are able to provide accurate estimates for the error rate on the test set.
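One plausible reading of such a procedure is sketched below: per-object error estimates (here assumed to be the second-largest conformal p-value of each test object) are accumulated in order of increasing estimated error until the budget k is spent. The exact estimator used in the paper may differ.

```python
import numpy as np

def predict_with_reject(p_values, k):
    """p_values: (n_test, n_classes) conformal p-values; k: error budget."""
    predictions = p_values.argmax(axis=1)
    # Assumed per-object error estimate: largest p-value among the other labels.
    err_estimates = np.sort(p_values, axis=1)[:, -2]

    # Accept the most confident predictions first, until about k errors
    # are expected in total.
    order = np.argsort(err_estimates)
    accepted = order[np.cumsum(err_estimates[order]) <= k]

    labels = np.full(len(p_values), -1)   # -1 marks a rejected test object
    labels[accepted] = predictions[accepted]
    return labels
```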
In the conformal prediction literature, it appears axiomatic that transductive conformal classifiers possess a higher predictive efficiency than inductive conformal classifiers; however, this depends on whether the nonconformity function tends to overfit misclassified test examples. With the conformal prediction framework’s increasing popularity, it thus becomes necessary to clarify the settings in which this claim holds true. In this paper, the efficiency of transductive conformal classifiers based on decision tree, random forest and support vector machine classification models is compared to the efficiency of corresponding inductive conformal classifiers. The results show that the efficiency of conformal classifiers based on standard decision trees or random forests is substantially improved when used in the inductive mode, while conformal classifiers based on support vector machines are more efficient in the transductive mode. In addition, an analysis is presented that discusses the effects of calibration set size on inductive conformal classifier efficiency.
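For concreteness, a minimal inductive conformal classifier of the kind compared above can be sketched as follows (scikit-learn assumed; labels assumed to be encoded as 0..n_classes-1; the nonconformity function, one minus the predicted class probability, is one common choice and not necessarily the paper's).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def icp_p_values(X_train, y_train, X_cal, y_cal, X_test):
    # Inductive mode: train once on the proper training set ...
    model = RandomForestClassifier(n_estimators=300, random_state=0)
    model.fit(X_train, y_train)

    # ... and compute nonconformity scores on a separate calibration set.
    cal_proba = model.predict_proba(X_cal)
    cal_scores = 1.0 - cal_proba[np.arange(len(y_cal)), y_cal]

    # One p-value per test object and tentative label.
    test_scores = 1.0 - model.predict_proba(X_test)   # (n_test, n_classes)
    counts = (cal_scores[None, None, :] >= test_scores[..., None]).sum(axis=-1)
    return (counts + 1) / (len(cal_scores) + 1)
```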
Conformal classifiers output confidence prediction regions, i.e., multi-valued predictions that are guaranteed to contain the true output value of each test pattern with some predefined probability. In order to fully utilize the predictions provided by a conformal classifier, it is essential that those predictions are reliable, i.e., that a user is able to assess the quality of the predictions made. Although conformal classifiers are statistically valid by default, the error probability of the output prediction regions depends on their size, in such a way that smaller, and thus potentially more interesting, predictions are more likely to be incorrect. This paper proposes, and evaluates, a method for producing refined error probability estimates of prediction regions that takes their size into account. The end result is a binary conformal confidence predictor that is able to provide accurate error probability estimates for those prediction regions containing only a single class label.
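The underlying idea can be illustrated by the hedged sketch below, which is not the paper's exact estimator: on labeled hold-out data, the error probability is estimated separately for the singleton prediction regions at a given significance level.

```python
import numpy as np

def singleton_error_estimate(p_values, y_true, significance):
    """p_values: (n, n_classes) conformal p-values for labeled hold-out data."""
    regions = p_values > significance          # a label is included if p > eps
    singletons = regions.sum(axis=1) == 1
    if not singletons.any():
        return None
    predicted = p_values.argmax(axis=1)
    errors = predicted[singletons] != y_true[singletons]
    return errors.mean()                       # refined estimate for singletons
```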
This paper suggests a modification of the conformal prediction framework for regression that strengthens the associated guarantee of validity. We motivate the need for this modification and argue that our conformal regressors are more closely tied to the actual error distribution of the underlying model, thus allowing for more natural interpretations of the prediction intervals. In the experiments, we provide an empirical comparison of our conformal regressors to traditional conformal regressors and show that the proposed modification results in more robust two-tailed predictions and more efficient one-tailed predictions.
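As a hedged illustration of the kind of modification discussed (signed, rather than absolute, calibration residuals, allowing one- and two-tailed intervals tied to the model's actual error distribution; the paper's exact construction may differ), consider the following sketch. The finite-sample correction is omitted for brevity.

```python
import numpy as np

def signed_residual_interval(y_hat, cal_residuals, significance, tail="two"):
    """cal_residuals: signed calibration errors (y_true - y_pred)."""
    if tail == "two":
        # Split the significance level between the two tails.
        lo = np.quantile(cal_residuals, significance / 2)
        hi = np.quantile(cal_residuals, 1 - significance / 2)
        return y_hat + lo, y_hat + hi
    # One-tailed: spend the whole significance level on a single upper bound.
    hi = np.quantile(cal_residuals, 1 - significance)
    return -np.inf, y_hat + hi
```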
Conformal prediction is a learning framework that produces models that associate with each of their predictions a measure of statistically valid confidence. These models are typically constructed on top of traditional machine learning algorithms. An important result of conformal prediction theory is that the models produced are provably valid under relatively weak assumptions; in particular, their validity is independent of the specific underlying learning algorithm on which they are based. Since validity is automatic, much research on conformal predictors has been focused on improving their informational and computational efficiency. As part of the efforts in constructing efficient conformal predictors, aggregated conformal predictors were developed, drawing inspiration from the field of classification and regression ensembles. Unlike early definitions of conformal prediction procedures, the validity of aggregated conformal predictors is not fully understood: while it has been shown that they might attain empirical exact validity under certain circumstances, their theoretical validity is conditional on additional assumptions that require further clarification. In this paper, we show why validity is not automatic for aggregated conformal predictors, and provide a revised definition of aggregated conformal predictors that gains approximate validity conditional on properties of the underlying learning algorithm.
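A typical aggregation step, as referred to above, simply combines the p-values produced by the individual (e.g., bootstrap-trained) conformal predictors; averaging is shown here for illustration, and, as the text notes, validity of the aggregate is not automatic.

```python
import numpy as np

def aggregate_p_values(p_value_matrices):
    """p_value_matrices: list of (n_test, n_classes) arrays, one per member."""
    # Mean over ensemble members; other combining rules (e.g., median) are also used.
    return np.mean(np.stack(p_value_matrices, axis=0), axis=0)
```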
This study introduces the conformal prediction framework to the task of predicting the presence of adverse drug events in electronic health records with an associated measure of statistically valid confidence. The imbalanced nature of the problem was addressed both by evaluating different machine learning algorithms, and by comparing different types of conformal predictors. A novel solution was also evaluated, where different underlying models, each model optimized towards one particular class, were combined into a single conformal predictor. This novel solution proved to be superior to previously existing approaches.
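A hedged sketch of the "one underlying model per class" idea is shown below: the p-value for each tentative class is computed from the model optimized towards that class, using class-conditional (Mondrian) calibration scores. Function and variable names are illustrative, and each model is assumed to be a probabilistic classifier whose predict_proba columns follow the global class encoding.

```python
import numpy as np

def class_specific_p_values(models, X_cal, y_cal, X_test):
    """models: dict mapping class index -> classifier optimized for that class."""
    p = np.zeros((len(X_test), len(models)))
    for c, model in models.items():
        # Class-conditional calibration: only examples of class c, scored by
        # the model tuned towards class c.
        cal_scores = 1.0 - model.predict_proba(X_cal[y_cal == c])[:, c]
        test_scores = 1.0 - model.predict_proba(X_test)[:, c]
        counts = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
        p[:, c] = (counts + 1) / (len(cal_scores) + 1)
    return p
```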