SingleCite: Improving single citation search in PubMed

A search aimed at finding one specific document in a database is called a single citation search. Such searches are particularly important for scholarly databases such as PubMed, where locating a known citation is a typical information need of users. We have developed SingleCite, an automated algorithm that establishes a query-document mapping by building a regression function that predicts the probability that a retrieved document is the target. The function uses three variables: the score of the highest scoring retrieved document, the difference in score between the two top retrieved documents, and the fraction of the query matched by the candidate citation. SingleCite shows superior performance in benchmarking experiments and is applied to rescue queries that would otherwise fail.
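
As an illustration only, the three variables and the probability estimate could be sketched as below. The abstract does not give the form of the regression function, so the logistic form is an assumption, and the weights and bias are hypothetical placeholders rather than fitted values. The `Candidate` type, its fields, and the function names are likewise invented for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieved citation (hypothetical structure for this sketch)."""
    doc_id: str
    score: float         # retrieval score assigned by some underlying ranker
    matched_tokens: int  # number of query tokens found in this citation


def extract_features(candidates, query_token_count):
    """Return (top score, score gap between top two, fraction of query matched)."""
    if not candidates:
        raise ValueError("at least one retrieved candidate is required")
    if query_token_count <= 0:
        raise ValueError("query must contain at least one token")
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    top = ranked[0]
    # Assumption: with a single retrieved document, the runner-up score is 0.
    runner_up_score = ranked[1].score if len(ranked) > 1 else 0.0
    return (
        top.score,
        top.score - runner_up_score,
        top.matched_tokens / query_token_count,
    )


def target_probability(features, weights=(0.1, 0.5, 3.0), bias=-4.0):
    """Logistic model (assumed form); weights and bias are placeholders."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


retrieved = [Candidate("doc-a", 10.0, 4), Candidate("doc-b", 6.0, 2)]
feats = extract_features(retrieved, query_token_count=5)
probability = target_probability(feats)
```

In a setup like this, the top-ranked document would be accepted as the target only when `probability` exceeds a chosen threshold. The threshold and the weights would have to be estimated from queries with known targets.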