# A general framework for the realistic analysis of sorting and searching algorithms. Application to some popular algorithms

Thu Hien Nguyen Thi (GREYC, Caen)

Some algorithms deal with keys viewed as atomic units, while others deal with the individual symbols of the keys, so it is not easy to compare their efficiency. Furthermore, since in most real-life situations a key is complex and cannot be reduced to a single machine word, it is not realistic to count a comparison between two keys as a unit-cost operation. That is why Sedgewick proposed in 1998 to unify the analyses of basic sorting and searching algorithms by studying the average number of symbol comparisons between keys (seen as words). The first results are due to Fill and Janson, and to Fill and Nakama, who analyzed QuickSort and QuickSelect, but only on data composed of uniform random bits. Clément, Fill, Flajolet and Vallée then extended these analyses to data consisting of words emitted by a general source.
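As a concrete illustration (ours, not taken from the abstract), comparing two words symbol by symbol costs one more than the length of their longest common prefix, capped at the shorter word's length, which is why the symbol-comparison cost model differs from the unit-cost key model:

```python
def symbol_compare(u, v):
    """Three-way comparison of the words u and v, symbol by symbol.

    Returns (sign, cost): sign is -1, 0 or +1 as in lexicographic
    order, and cost is the number of symbol comparisons performed,
    i.e. one more than the length of the common prefix (or the
    shorter word's length when one word is a prefix of the other).
    """
    cost = 0
    for a, b in zip(u, v):
        cost += 1
        if a != b:
            return (-1 if a < b else 1), cost
    # One word is a prefix of the other (or they are equal).
    if len(u) == len(v):
        return 0, cost
    return (-1 if len(u) < len(v) else 1), cost
```

For instance, `symbol_compare("abc", "abd")` inspects three symbol positions before deciding, whereas a unit-cost key model would charge a single comparison.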

In the present work, we develop a general framework for such realistic analyses:
we provide an "automatic" method which takes as input the probability that
two random words are compared by a given algorithm and computes an exact
formula for the average number of symbol comparisons. This method is applied
to three popular algorithms: Insertion Sort, Bubble Sort and Minimum
Selection. We also recover the known results for QuickSort and QuickSelect. The
results are the following: with respect to key comparisons, the average-case
complexity of QuickSort is asymptotic to 2*n* log *n*, Insertion Sort to
*n*²/4, BubbleSort to *n*²/2, QuickMin to 2*n*, and Selection Min to *n* − 1.
With respect to the number of symbol comparisons, the complexity becomes of
respective order Θ(*n* log² *n*), Θ(*n*²), Θ(*n*² log *n*) for the sorting
algorithms, and remains Θ(*n*) for the selection algorithms. We further discuss
the change in the complexity measure of the algorithms, from the number of key
comparisons to the number of symbol comparisons, as well as the different
constants arising in the analysis, which reflect the interplay between the
algorithms and the way the data are produced.
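As a sanity-check sketch (our instrumentation, not the paper's analytic method), one can count both key comparisons and symbol comparisons made by Insertion Sort on random binary words and observe that the symbol count dominates the key count:

```python
import random

def insertion_sort_costs(words):
    """Sort a list of words with Insertion Sort, counting both
    key comparisons and symbol comparisons (illustrative
    instrumentation only)."""
    key_cmps = 0
    sym_cmps = 0

    def less(u, v):
        # One key comparison, realized as a run of symbol comparisons.
        nonlocal key_cmps, sym_cmps
        key_cmps += 1
        for a, b in zip(u, v):
            sym_cmps += 1
            if a != b:
                return a < b
        return len(u) < len(v)

    a = list(words)
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and less(x, a[j]):
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return a, key_cmps, sym_cmps

# Random binary words, as in the uniform-bit data model.
random.seed(1)
words = ["".join(random.choice("01") for _ in range(50)) for _ in range(200)]
sorted_words, kc, sc = insertion_sort_costs(words)
```

Each key comparison performs at least one symbol comparison, so `sc >= kc` always holds; on random binary words many comparisons share a common prefix, which is the source of the extra cost in the symbol model.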

Joint work with Julien Clément and Brigitte Vallée.