Evolving decision trees using oracle guides
University of Borås, School of Business and IT. (CSL@BS)
2009 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Some data mining problems require predictive models to be not only accurate but also comprehensible. Comprehensibility enables human inspection and understanding of the model, making it possible to trace why individual predictions are made. Since most high-accuracy techniques produce opaque models, accuracy is, in practice, regularly sacrificed for comprehensibility. One frequently studied technique, often able to reduce this accuracy vs. comprehensibility trade-off, is rule extraction, i.e., the activity where another, transparent, model is generated from the opaque one. In this paper, it is argued that techniques producing transparent models, either directly from the dataset or from an opaque model, could benefit from using an oracle guide. In the experiments, genetic programming is used to evolve decision trees, and a neural network ensemble is used as the oracle guide. More specifically, the datasets used by the genetic programming when evolving the decision trees consist of several different combinations of the original training data and “oracle data”, i.e., training or test data instances together with the corresponding predictions from the oracle. In total, seven different ways of combining regular training data with oracle data were evaluated, and the results, obtained on 26 UCI datasets, clearly show that the use of an oracle guide improved performance. In fact, trees evolved using training data only had the worst test set accuracy of all evaluated setups. Furthermore, statistical tests show that two setups, both using the oracle guide, produced significantly more accurate trees compared to the setup using training data only.
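The oracle-guide idea described in the abstract can be sketched in a few lines. The snippet below is a hedged illustration only, not the paper's method: it uses scikit-learn stand-ins (a random forest as the opaque oracle and a plain CART tree instead of a GP-evolved tree, on a bundled dataset rather than the 26 UCI datasets) to show one possible way of combining original training labels with oracle predictions before fitting a transparent model.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Opaque, high-accuracy model acting as the oracle guide
# (a neural network ensemble in the paper; a random forest here).
oracle = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# "Oracle data": training and (transductively) test instances relabeled with
# the oracle's predictions; one of several possible combinations with the
# original training data.
X_comb = np.vstack([X_train, X_train, X_test])
y_comb = np.concatenate([y_train, oracle.predict(X_train), oracle.predict(X_test)])

# Transparent models: one fitted on training data only, one oracle-guided.
tree_plain  = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
tree_guided = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_comb, y_comb)

print("training data only:", accuracy_score(y_test, tree_plain.predict(X_test)))
print("with oracle guide :", accuracy_score(y_test, tree_guided.predict(X_test)))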

Place, publisher, year, edition, pages
IEEE, 2009.
Keywords [en]
oracle guides, rule extraction, genetic programming, Machine learning
Keywords [sv]
data mining
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hb:diva-6247
Local ID: 2320/5718
ISBN: 9781424427659 (print)
OAI: oai:DiVA.org:hb-6247
DiVA id: diva2:886934
Conference
IEEE Symposium on Computational Intelligence and Data Mining (CIDM)
Available from: 2015-12-22 Created: 2015-12-22 Last updated: 2018-01-10

Open Access in DiVA

fulltext (164 kB), 653 downloads
File information
File name: FULLTEXT01.pdf
File size: 164 kB
Checksum (SHA-512): cf316f7c0e3ae775c17c8d4dc598dcc116aee86d19e3d45d77d295bab6fb680b3d166bc58e124faaa59fd49e5d6c9b09374aba8879bcbeeb54f008ffad8ca86d
Type: fulltext
Mimetype: application/pdf

Authority records

Johansson, Ulf
