Performance analysis of distributed information retrieval architectures using an improved network simulation model

Authors:

Abstract

The increasing number of documents that have to be indexed in different environments, particularly on the Web, and the lack of scalability of a single centralised index have led to the use of distributed information retrieval systems to search for and locate the required information effectively. In this study, we present several improvements that address the two main bottlenecks in a distributed information retrieval system (the network and the brokers). We extend a network simulation model to represent a switched network. The new simulation model is validated by comparing the estimated response times with those obtained using a real system. We show that the use of a switched network reduces the saturation of the interconnection network, especially in a replicated system, and that some further improvements may be achieved using multicast messages and faster connections with the brokers. We also demonstrate that reducing the partial result sets improves the response time of a distributed system by 53%, with a negligible probability of changing the system’s precision and recall values. Finally, we present a simple hierarchical distributed broker model that reduces the response times for a distributed system by 55%.
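
The idea of reducing the partial result sets can be illustrated with a toy sketch. It is not the authors' simulator or implementation: the function names, the scores, and the three-server layout are all hypothetical, and it only shows the general technique in which each query server sends the broker fewer hits than the final answer size `k`, and the broker merges what it receives. It does not reproduce the 53% figure reported above.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, document id); hypothetical representation

# Hypothetical local rankings held by three query servers.
SERVERS: List[List[Hit]] = [
    [(0.90, "a1"), (0.50, "a2"), (0.30, "a3"), (0.10, "a4")],
    [(0.80, "b1"), (0.60, "b2"), (0.20, "b3"), (0.05, "b4")],
    [(0.70, "c1"), (0.40, "c2"), (0.25, "c3"), (0.15, "c4")],
]


def local_top(hits: List[Hit], limit: int) -> List[Hit]:
    """Return the `limit` best hits of one query server, best first."""
    return heapq.nlargest(limit, hits)


def broker_merge(partials: List[List[Hit]], k: int) -> List[Hit]:
    """Merge the partial result sets and keep the k best hits overall."""
    return heapq.nlargest(k, (hit for part in partials for hit in part))


def compare(servers: List[List[Hit]], k: int, reduced: int) -> Dict[str, object]:
    """Compare full partial sets (k hits per server) with reduced ones."""
    full = broker_merge([local_top(h, k) for h in servers], k)
    small = broker_merge([local_top(h, reduced) for h in servers], k)
    return {
        "same_top_k": full == small,
        # Number of hits that travel over the network to the broker.
        "sent_full": sum(min(k, len(h)) for h in servers),
        "sent_reduced": sum(min(reduced, len(h)) for h in servers),
    }


if __name__ == "__main__":
    print(compare(SERVERS, k=4, reduced=2))
```

In this example the broker receives half as many hits and still returns the same top four documents. That holds only because the best documents are spread across the servers. If one server held more than `reduced` of the overall top `k`, the final ranking would change, which is the small risk to precision and recall that the abstract mentions.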

Keywords: Distributed information retrieval, Performance, Simulation, Network model

Review history: Received 11 November 2005, Revised 1 June 2006, Accepted 3 June 2006, Available online 14 August 2006.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.06.002