Pin-Pointing Concept Descriptions
2010 (English) Conference paper, Published paper (Refereed)
Abstract [en]
In this study, the task of obtaining accurate and comprehensible concept descriptions of a specific set of production instances has been investigated. The suggested method, inspired by rule extraction and transductive learning, uses a highly accurate opaque model, called an oracle, to coach construction of transparent decision list models. The decision list algorithms evaluated are JRip and four different variants of Chipper, a technique specifically developed for concept description. Using 40 real-world data sets from the drug discovery domain, the results show that employing an oracle coach to label the production data resulted in significantly more accurate and smaller models for almost all techniques. Furthermore, augmenting normal training data with production data labeled by the oracle also led to significant increases in predictive performance, but with a slight increase in model size. Of the techniques evaluated, normal Chipper optimizing FOIL’s information gain and allowing conjunctive rules was clearly the best. The overall conclusion is that oracle coaching works very well for concept description.
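The oracle-coaching idea described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the opaque oracle is stood in for by a k-nearest-neighbour vote, the transparent learner is a hypothetical greedy decision list with one condition per rule (it is neither Chipper nor JRip, and it scores rules by purity and coverage rather than FOIL's information gain), and the tiny data set is invented purely for the example.

```python
from collections import Counter

def knn_oracle(train, k=3):
    """Stand-in opaque model (assumption): k-nearest-neighbour majority vote."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def covers(x, feature, op, threshold):
    """True if instance x satisfies the single condition 'feature op threshold'."""
    return x[feature] <= threshold if op == "<=" else x[feature] > threshold

def learn_decision_list(data, max_rules=5):
    """Hypothetical greedy learner: repeatedly pick the purest, then widest, rule."""
    rules, remaining = [], list(data)
    while remaining and len(rules) < max_rules:
        if len({y for _, y in remaining}) == 1:
            break  # everything left has one class; the default rule handles it
        best = None
        for f in range(len(remaining[0][0])):
            for t in sorted({x[f] for x, _ in remaining}):
                for op in ("<=", ">"):
                    covered = [y for x, y in remaining if covers(x, f, op, t)]
                    if not covered or len(covered) == len(remaining):
                        continue  # skip rules covering nothing or everything
                    label, hits = Counter(covered).most_common(1)[0]
                    score = (hits / len(covered), len(covered))
                    if best is None or score > best[0]:
                        best = (score, (f, op, t, label))
        if best is None:
            break
        rule = best[1]
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining
                     if not covers(x, rule[0], rule[1], rule[2])]
    default = Counter(y for _, y in (remaining or data)).most_common(1)[0][0]
    return rules, default

def predict(rules, default, x):
    """Apply the first matching rule, else fall back to the default class."""
    for f, op, t, label in rules:
        if covers(x, f, op, t):
            return label
    return default

# Invented toy data: labeled training set and unlabeled production instances.
train = [((1.0, 2.0), 0), ((2.0, 1.0), 0), ((3.0, 3.0), 0),
         ((7.0, 2.0), 1), ((8.0, 1.0), 1), ((9.0, 3.0), 1)]
production = [(1.5, 1.5), (2.5, 2.5), (7.5, 1.5), (8.5, 2.5)]

oracle = knn_oracle(train)
oracle_labels = [oracle(x) for x in production]         # the oracle coaches...
coached = list(zip(production, oracle_labels))
rules, default = learn_decision_list(coached)           # ...the transparent model
fidelity = sum(predict(rules, default, x) == y for x, y in coached) / len(coached)
print(rules, default, fidelity)
```

The second setup mentioned in the abstract, augmenting the normal training data with oracle-labeled production data, would correspond here to calling `learn_decision_list(train + coached)` instead of `learn_decision_list(coached)`.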
Place, publisher, year, edition, pages
2010.
Keywords [en]
concept description, decision lists, Machine Learning
Keywords [sv]
data mining
National Category
Computer and Information Sciences; Information Systems
Identifiers
URN: urn:nbn:se:hb:diva-6518
DOI: 10.1109/ICSMC.2010.5641998
Local ID: 2320/7460
OAI: oai:DiVA.org:hb-6518
DiVA, id: diva2:887214
Conference
2010 IEEE International Conference on Systems, Man and Cybernetics (SMC)
Note
Sponsorship: This work was supported by the INFUSIS project (www.his.se/infusis) at the University of Skövde, Sweden, in partnership with the Swedish Knowledge Foundation under grant 2008/0502.
2015-12-22 2015-12-22 2018-01-10