Replacement sort revisited: The “gold standard” unearthed!

Authors:

Highlights:

Abstract

The present paper shows that, for certain algorithms such as sorting, the parameters of the input distribution must be taken into account in addition to the input size for a more precise evaluation of the computational and time complexity (average case only) of the algorithm in question (the so-called “gold standard”). Some concrete results are presented to warrant a new and improved model for replacement sort (also called selection sort):

$$T_{\mathrm{avg}}(n, p_1, p_2, \ldots, p_k) = a_0 + b_0 \frac{n(n-1)}{2} + c_0\, i(n, p_1, p_2, \ldots, p_k) + \epsilon,$$

where the left-hand side gives the average case time complexity, $n$ is the input size, the $p_i$ are the parameters of the input distribution characterizing the sorting elements, and $i$ is the average number of interchanges, which is a function of both the input size and the parameters. The remaining terms arise from linear regression and have their usual meanings. The error term $\epsilon$ arises because the model fixes only the input size $n$, while the specific input elements and their relative positions in the array vary for a particular distribution [H. Mahmoud, Sorting: A Distribution Theory, John Wiley and Sons, 2000]. The term $n(n-1)/2$ represents the number of comparisons.

We claim this to be an improvement over the conventional model, namely

$$T_{\mathrm{avg}}(n) = a + bn + cn^2 + \epsilon,$$

which stems from the $O(n^2)$ complexity of this algorithm.

We argue that the new model can be a guiding factor in distinguishing this algorithm from other sorting algorithms of similar order of average complexity, such as bubble sort and insertion sort. Note carefully that the dependence of the number of interchanges on the parameters is more prominent for discrete distributions than for continuous ones, and we suspect this to be because the probability of a tie is zero in the continuous case. In discrete cases, however, the presence of ties and their relative positions in the array are crucial, and this is precisely where the parameters of the input distribution come into play. Algorithms in which ties have a greater influence on some of the computations will be more strongly influenced by the parameters of the input distribution.

Another strength of the paper is that it brings out the close connection between algorithmic complexity and computer experiments, a crucial issue that is overlooked in the textbooks on algorithms. This is a paper on modeling rather than speed.
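The quantities in the proposed model can be illustrated with a small experiment. The following is a minimal sketch, not the authors' code: it assumes that replacement sort is the variant that interchanges `a[i]` and `a[j]` as soon as `a[j] < a[i]`, and it uses a Binomial(m, p) input as a stand-in discrete distribution. The names `replacement_sort`, `binomial_sample`, `mean_interchanges`, and the parameters `m`, `trials`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def replacement_sort(values):
    """Sort a copy of `values` and count the work done.

    Assumed variant: a[i] and a[j] are interchanged immediately
    whenever a[j] < a[i]. Returns (sorted_list, comparisons, interchanges).
    """
    a = list(values)
    n = len(a)
    comparisons = 0
    interchanges = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1  # always n(n-1)/2 in total
            if a[j] < a[i]:   # ties never trigger an interchange
                a[i], a[j] = a[j], a[i]
                interchanges += 1
    return a, comparisons, interchanges


def binomial_sample(rng, m, p):
    """Draw one Binomial(m, p) variate as a sum of m Bernoulli(p) trials."""
    return sum(1 for _ in range(m) if rng.random() < p)


def mean_interchanges(n, p, m=5, trials=200, seed=0):
    """Average interchange count over `trials` random arrays of size n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        data = [binomial_sample(rng, m, p) for _ in range(n)]
        _, _, swaps = replacement_sort(data)
        total += swaps
    return total / trials


if __name__ == "__main__":
    n = 50
    print("comparisons:", n * (n - 1) // 2)
    # Same n, different distribution parameter p: the comparison count
    # is fixed, while the mean interchange count is free to vary.
    for p in (0.05, 0.25, 0.5):
        print("p =", p, "mean interchanges =", mean_interchanges(n, p))
```

Running the script for several values of `p` at a fixed `n` gives a rough sense of how the interchange term, unlike the comparison term, can respond to the distribution parameters when ties are possible.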

Keywords: Replacement sort, Parameters of input distribution, Average complexity, Interchanges, Stochastic realization of a deterministic computer experiment

Review history: Available online 8 January 2007.

Official paper URL: https://doi.org/10.1016/j.amc.2006.11.093