Speeding up Document Ranking with Rank-based Features

Lucchese, Claudio; Nardini, Franco Maria; Orlando, Salvatore; Perego, Raffaele; Tonellotto, Nicola

doi:10.1145/2766462.2767776

Learning to Rank (LtR) is an effective machine learning methodology for inducing high-quality document ranking functions. Given a query and a candidate set of documents, where each query-document pair is represented by a feature vector, a machine-learned function is used to reorder the set. In this paper we propose a new family of rank-based features, which extend the original feature vector associated with each query-document pair. Because they are derived as a function of both the query-document pair and the full set of candidate documents to score, rank-based features provide additional information that helps to rank documents better and return the most relevant ones. We report a comprehensive evaluation showing that rank-based features allow us to achieve the desired effectiveness with ranking models up to 3.5 times smaller than models that do not use them, with a scoring time reduction of up to 70%.
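
To illustrate the idea of a feature that depends on the whole candidate set, here is a minimal sketch in Python. It is not the authors' implementation, and the abstract does not list the exact features they define. The sketch assumes one simple instance: for each original feature, append the document's rank position for that feature among the candidates retrieved for the same query. The function name add_rank_features, the tie handling, and the sample values are all hypothetical.

```python
from typing import List


def add_rank_features(candidates: List[List[float]]) -> List[List[float]]:
    """Extend each feature vector with one rank per original feature.

    Assumption for illustration: the rank of a document for feature j is
    1 + the number of candidates with a strictly higher value of feature j,
    so tied documents share the same rank.
    """
    if not candidates:
        return []
    n_features = len(candidates[0])
    # Copy the rows so the input candidate set is left unchanged.
    extended = [list(row) for row in candidates]
    for j in range(n_features):
        column = [row[j] for row in candidates]
        for i, value in enumerate(column):
            # Quadratic in the candidate set size; sorting would be faster,
            # but this form keeps the definition explicit.
            rank = 1 + sum(1 for other in column if other > value)
            extended[i].append(float(rank))
    return extended


# Hypothetical candidate set for one query: three documents, two features.
candidates = [
    [12.0, 0.3],
    [15.5, 0.1],
    [9.0, 0.3],
]
extended = add_rank_features(candidates)
for row in extended:
    print(row)
```

Each extended vector keeps its original values and gains values that can only be computed once the full candidate set is known, which is the property the abstract attributes to rank-based features.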