Computed Author: author name disambiguation for PubMed


PubMed users frequently include author names in queries when retrieving scientific literature. However, author name ambiguity (different authors sharing the same name) can lead to irrelevant retrieval results. To address this, we developed a machine-learning method that scores features to decide whether a pair of papers carrying the same ambiguous name was written by the same author. Agglomerative clustering then uses those classified pairs to group all papers belonging to the same author. Disambiguation performance was evaluated by manually verifying random samples of pairs from the clustering results, and the reported accuracy is higher than that of other state-of-the-art methods. The method has been integrated into PubMed to facilitate author name searches.
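
The two stages described above (pairwise scoring, then agglomerative clustering) can be illustrated with a minimal sketch. It is not the NLM implementation: the paper records, the three features, the hand-picked weights standing in for a learned model, the average-linkage rule, and the 0.5 stopping threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical records for papers that all carry the ambiguous name "J. Smith".
papers = {
    "p1": {"coauthors": {"A. Lee", "B. Chen"}, "affiliation": "NIH", "journal": "Bioinformatics"},
    "p2": {"coauthors": {"A. Lee"}, "affiliation": "NIH", "journal": "Bioinformatics"},
    "p3": {"coauthors": {"C. Diaz"}, "affiliation": "MIT", "journal": "Physical Review"},
    "p4": {"coauthors": {"C. Diaz", "D. Kim"}, "affiliation": "MIT", "journal": "Physical Review"},
}

# Assumed weights; a real system would learn these from labeled pairs.
WEIGHTS = {"coauthors": 0.5, "affiliation": 0.3, "journal": 0.2}


def pair_score(a, b):
    """Score in [0, 1] for how likely two papers share the same author."""
    union = a["coauthors"] | b["coauthors"]
    jaccard = len(a["coauthors"] & b["coauthors"]) / len(union) if union else 0.0
    same_affiliation = 1.0 if a["affiliation"] == b["affiliation"] else 0.0
    same_journal = 1.0 if a["journal"] == b["journal"] else 0.0
    return (WEIGHTS["coauthors"] * jaccard
            + WEIGHTS["affiliation"] * same_affiliation
            + WEIGHTS["journal"] * same_journal)


def agglomerate(records, threshold=0.5):
    """Average-linkage agglomerative clustering over pairwise scores."""
    scores = {
        frozenset((x, y)): pair_score(records[x], records[y])
        for x, y in combinations(records, 2)
    }

    def linkage(c1, c2):
        values = [scores[frozenset((x, y))] for x in c1 for y in c2]
        return sum(values) / len(values)

    # Start with one cluster per paper, then repeatedly merge the closest pair.
    clusters = [{pid} for pid in sorted(records)]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # no remaining pair of clusters is similar enough
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


if __name__ == "__main__":
    for cluster in agglomerate(papers):
        print(sorted(cluster))
```

Each resulting cluster stands for one presumed author. Raising the threshold favors precision (fewer wrongly merged authors), while lowering it favors recall (fewer split authors).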

Date

Oct 2022

Organization

Department of Health and Human Services, National Institutes of Health (NIH), National Library of Medicine (NLM)

Source URL

https://www.hhs.gov/sites/def…

Organization Type

Government