Predictive modelling for user preferences in digital libraries: Using sentiment analysis and machine learning
University of Borås, Faculty of Librarianship, Information, Education and IT.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The rapid advancement of digital technologies has transformed information access, making digital libraries essential knowledge repositories. This thesis explores the application of Sentiment Analysis and Artificial Intelligence (AI) in analysing ratings and reviews within digital libraries. By leveraging AI, digital libraries can address challenges in managing vast information volumes and understanding user preferences, thereby enhancing recommendation systems.

This study aims to develop predictive models that use the sentiments expressed in user reviews and ratings to improve recommendation systems. We investigated the correlation between review sentiments and numerical ratings, evaluated regression and classification models, and examined the impact of feature engineering on prediction accuracy. Utilising Orange Data Mining software, we analysed two datasets from Kaggle, focusing on Amazon Books and Kindle ratings and reviews. In addition, VADER and SentiArt were used for sentiment analysis.
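As a rough illustration of the correlation step described above, the following sketch scores a few reviews and correlates the scores with star ratings. It is not the thesis workflow, which used Orange Data Mining with VADER and SentiArt: the lexicon `TOY_LEXICON`, the function `toy_sentiment`, and the sample reviews are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import math

# Hypothetical toy lexicon (an assumption for illustration only);
# VADER and SentiArt rely on far richer resources and rules.
TOY_LEXICON = {"great": 1.0, "loved": 1.0, "good": 0.5,
               "boring": -0.5, "bad": -0.5, "terrible": -1.0}

def toy_sentiment(text):
    """Average lexicon score of the known words in a review (0.0 if none)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up reviews and star ratings, not taken from the Kaggle datasets.
reviews = [
    ("Loved it, a great read!", 5),
    ("Good story overall.", 4),
    ("Boring in places.", 2),
    ("Terrible, bad editing.", 1),
]
sentiments = [toy_sentiment(text) for text, _ in reviews]
ratings = [stars for _, stars in reviews]
r = pearson(sentiments, ratings)
print(f"Pearson r between sentiment and rating: {r:.3f}")
```

On real review data the correlation would be expected to be much weaker than on this hand-picked sample, which is why comparing sentiment tools by their correlation with ratings is informative.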

Results showed VADER outperformed SentiArt in capturing sentiment nuances, with a higher correlation to numerical ratings. Document Embedding (SBERT) demonstrated a moderate correlation with ratings, explaining 32.3% of the variance, whereas Bag of Words (TF-IDF) explained 10.5%. Linear Regression consistently outperformed other models, explaining up to 25% of rating variance. Neural Networks also showed promise in classification tasks, accurately categorising 'Low' and 'High' ratings.
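The phrase "explaining X% of the variance" refers to the coefficient of determination, R². The sketch below shows one common way to compute it for a least-squares line fitted to a single feature. The helper names `fit_line` and `r_squared` and the sample numbers are assumptions for illustration; the thesis models were built in Orange on text-derived features, not with this code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up feature values (e.g. a sentiment score) and star ratings.
feature = [0.1, 0.4, 0.5, 0.9, -0.3, -0.6]
stars = [3, 5, 4, 4, 2, 3]
slope, intercept = fit_line(feature, stars)
r2 = r_squared(feature, stars, slope, intercept)
print(f"R^2 = {r2:.3f} (about {r2 * 100:.1f}% of rating variance explained)")
```

An R² of 0.25 would mean that a quarter of the variation in ratings is accounted for by the model, with the remainder left unexplained by the chosen features.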

In conclusion, this research demonstrates the potential of AI to enhance digital libraries and improve recommendation systems. The findings highlight the benefits of integrating advanced data analytics into digital libraries to boost user satisfaction and service quality. However, the study also recognises challenges such as data privacy and proposes the environmental sustainability of AI applications as a topic for future research.

Place, publisher, year, edition, pages
2024.
Keywords [en]
Machine learning, Sentiment analysis, Classification analysis, Artificial intelligence, Digital libraries
National Category
Information Studies
Identifiers
URN: urn:nbn:se:hb:diva-33047
OAI: oai:DiVA.org:hb-33047
DiVA, id: diva2:1925709
Available from: 2025-01-13; Created: 2025-01-09; Last updated: 2025-09-24. Bibliographically approved.

Open Access in DiVA

fulltext (2288 kB), 345 downloads
File information
File name: FULLTEXT01.pdf
File size: 2288 kB
Checksum (SHA-512):
4ac397b976332056645ffee980686d3ff5106325f875e1842f5912bd3c4999e419e61351b54fd3f5d0b29ade9def5589f3fb7943f22f1156f6fa1271818dc4e2
Type: fulltext
Mimetype: application/pdf

Total: 345 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 475 hits