Generating Comprehensible QSAR Models
University of Borås, School of Business and IT. (CSL@BS)
University of Borås, School of Business and IT. (CSL@BS)
2009 (English) Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents work in progress from the INFUSIS project and contains initial experimentation, using publicly available medicinal chemistry datasets, on obtaining comprehensible QSAR models. Three techniques are evaluated on both predictive performance, measured as accuracy, and comprehensibility, measured as model size. The chosen techniques are J48 decision trees and JRip and Chipper decision lists. The results show that J48 obtains superior accuracy and that Chipper performs best of the two decision list algorithms on accuracy. Furthermore, it is seen that, regarding accuracy, all techniques benefit from feature reduction, which almost always results in increased accuracy. Regarding comprehensibility, JRip obtains the smallest models, followed by Chipper, with J48 producing the largest models. For model size, feature reduction is not seen to be universally beneficial; only J48 produces smaller models for the reduced datasets, while both decision list algorithms actually produce larger models on average. The overall conclusion is that, for these datasets, there exists a definite tradeoff between accuracy and comprehensibility that needs to be investigated further.

Place, publisher, year, edition, pages
University of Skövde, 2009.
Series
Skövde studies in Informatics, ISSN 1653-2325 ; 2009:3
Keywords [en]
concept description, QSAR, classification, Machine Learning
Keywords [sv]
data mining
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hb:diva-6309
Local ID: 2320/5911
OAI: oai:DiVA.org:hb-6309
DiVA, id: diva2:886996
Conference
3rd Skövde Workshop on Information Fusion Topics 2009, Skövde, Sweden
Available from: 2015-12-22 Created: 2015-12-22 Last updated: 2018-01-10


Authority records

Sönströd, Cecilia; Johansson, Ulf
