Semantic knowledge discovery
University of Borås, Faculty of Librarianship, Information, Education and IT. (Knowledge infrastructures) ORCID iD: 0000-0001-5196-7148
University of Borås, Faculty of Librarianship, Information, Education and IT. (Knowledge infrastructures)
2021 (English). Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Many databases lack relevance ranking, so a citation-based approach can be a valuable complement: citation data can indicate centrality, relevance, or visibility in the research community. However, bibliometric methods are often difficult to apply in the humanities, because much of the research literature is not indexed in the traditional citation databases generally used for bibliometric mapping.

 

We introduce a combined bibliometric and semantic approach that extends a network of bibliographic records by incorporating a larger set of records that lack bibliometric features, based on the semantic similarities between their titles. To expand the set of identified relevant articles, we used the Universal Sentence Encoder (USE), developed by Google Research, to generate semantic vectors for the titles.

 

We searched several different databases, of which some include citation data, to create a pool C of candidate documents within the selected subject area. A set A of documents was obtained from a citation database to generate the initial network of articles. We then calculated the bibliographic coupling of articles as quantified by their shared references. 
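
As an illustration of the coupling step, the following is a minimal sketch that counts shared references for every pair of documents. The function name, the document identifiers, and the reference lists are hypothetical and are not taken from the study.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Map each coupled document pair to its number of shared references.

    `references` maps a document id to an iterable of cited reference ids.
    Pairs with no shared references are omitted.
    """
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared > 0:
            strengths[(a, b)] = shared
    return strengths


# Hypothetical reference lists for three documents in set A.
refs = {
    "a1": ["r1", "r2", "r3"],
    "a2": ["r2", "r3", "r4"],
    "a3": ["r5"],
}
coupling = bibliographic_coupling(refs)
```

Here the raw count of shared references serves as the coupling strength; a normalised measure could be substituted without changing the structure of the sketch.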

 

We manually selected a small set S1 ⊂ A of documents representing different topical clusters as a seed for the expansion based on semantic similarities. For each document d ∈ S1, we ranked the documents in C in ascending order of the cosine distance between their title vectors and the title vector of d, and then selected the k documents closest to d. This procedure gave us a set S2 ⊂ C of documents to read.
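
The expansion step can be sketched roughly as follows, assuming the title vectors have already been computed and are non-zero. The function names and the two-dimensional toy vectors are hypothetical; real title embeddings would have many more dimensions.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def expand_seeds(seed_vectors, candidate_vectors, k):
    """Return the union of the k nearest candidates for each seed document."""
    selected = set()
    for seed_vec in seed_vectors.values():
        # Rank candidates in ascending order of distance to the seed title.
        ranked = sorted(
            candidate_vectors,
            key=lambda doc_id: cosine_distance(seed_vec, candidate_vectors[doc_id]),
        )
        selected.update(ranked[:k])
    return selected


# Hypothetical title vectors for seeds (S1) and candidates (C).
seeds = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
candidates = {
    "c1": [1.0, 0.0],
    "c2": [0.9, 0.1],
    "c3": [0.0, 1.0],
    "c4": [0.1, 0.9],
}
s2 = expand_seeds(seeds, candidates, k=1)
```

Because the result is a set union, a candidate that is close to several seeds appears only once in S2.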

The results were evaluated using qualitative analysis to determine whether they were thematically relevant to the present information needs.

Place, publisher, year, edition, pages
2021.
Keywords [en]
citation analysis, machine learning, semantic modelling, bibliographic networks
National Category
Information Studies; Natural Language Processing
Research subject
Library and Information Science
Identifiers
URN: urn:nbn:se:hb:diva-27158
OAI: oai:DiVA.org:hb-27158
DiVA, id: diva2:1626216
Conference
26th Nordic Workshop on Bibliometrics and Research Policy (NWB2021), Odense, Denmark, 3-5 November 2021.
Projects
Data as Impact Lab
Available from: 2022-01-10 Created: 2022-01-10 Last updated: 2025-02-01 Bibliographically approved

Open Access in DiVA

File information
File name: FULLTEXT01.pdf
File size: 144 kB
Checksum (SHA-512):
10f53a2cd2f8cab17464b1ea247106b82bb7585f6384a9d22c6cf83633fdfcd553474a7fc3c1c484eef0c2ed852a34e6ad331f583b5c8de78ad5c24535c7be77
Type: fulltext
Mimetype: application/pdf

Authority records

Nelhans, Gustaf; Eklund, Johan
