The Behavior of Deep Statistical Comparison Approach for Different Criteria of Comparing Distributions
T. Eftimov, P. Korošec, B. Koroušić Seljak. Proc. 9th International Joint Conference on Computational Intelligence, Funchal, Madeira, Portugal, 1-3 November 2017, pages 73-82. DOI: 10.5220/0006499900730082
abstract: Deep Statistical Comparison (DSC) is a recently proposed approach for the statistical comparison of meta-heuristic stochastic algorithms for single-objective optimization. The main contribution of the DSC is a ranking scheme based on the whole distribution rather than on a single statistic, such as the commonly used average or median. In contrast to this common approach, the DSC gives more robust statistical results, which are not affected by outliers or by a misleading ranking scheme. The DSC ranking scheme uses a statistical test for comparing distributions in order to rank the algorithms. The DSC was tested using the two-sample Kolmogorov-Smirnov (KS) test. However, distributions can be compared using different criteria, that is, different statistical tests. In this paper, we analyze the behavior of the DSC using two such criteria: the two-sample KS test and the Anderson-Darling (AD) test. Experimental results on benchmark tests consisting of single-objective problems show that both criteria behave similarly. However, when algorithms are compared on a single problem, it is better to use the AD test, because it is more powerful than the KS test and can better detect differences when the distributions vary in shift only, in scale only, in symmetry only, or have the same mean and standard deviation but differ in the tails only. This effect is not pronounced when the approach is used for multiple-problem analysis.
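To make the two criteria concrete, the following is a minimal sketch of comparing the results of two algorithms on a single problem with both statistics. It is an illustration only, not the DSC implementation: the function names (`ks_statistic`, `ad_statistic`, `permutation_pvalue`), the synthetic data, and the sample size of 30 runs are all assumptions made here. The AD statistic is computed in its two-sample rank form, assuming no tied values, and p-values are obtained by a permutation test, which is swapped in for the asymptotic p-values a statistics library would normally report (SciPy, for example, provides `scipy.stats.ks_2samp` and `scipy.stats.anderson_ksamp`).

```python
import random


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < m and j < n:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every observation equal to v.
        while i < m and xs[i] == v:
            i += 1
        while j < n and ys[j] == v:
            j += 1
        d = max(d, abs(i / m - j / n))
    return d


def ad_statistic(x, y):
    """Two-sample AD statistic in rank form (assumes no tied values)."""
    m, n = len(x), len(y)
    total = m + n
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    count_x = 0  # number of x-observations among the i smallest pooled values
    acc = 0.0
    for i, (_, label) in enumerate(pooled[:-1], start=1):
        if label == 0:
            count_x += 1
        # The weight 1 / (i * (total - i)) emphasizes differences in the tails.
        acc += (count_x * total - m * i) ** 2 / (i * (total - i))
    return acc / (m * n)


def permutation_pvalue(stat, x, y, n_perm=1000, seed=0):
    """Share of random relabelings whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = stat(x, y)
    pooled = list(x) + list(y)
    m = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if stat(pooled[:m], pooled[m:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic stand-in for 30 runs of two algorithms on one problem:
# same spread, different location (the "shift only" case).
rng = random.Random(42)
runs_a = [rng.gauss(0.0, 1.0) for _ in range(30)]
runs_b = [rng.gauss(0.5, 1.0) for _ in range(30)]

for name, stat in (("KS", ks_statistic), ("AD", ad_statistic)):
    p = permutation_pvalue(stat, runs_a, runs_b)
    print(f"{name}: statistic={stat(runs_a, runs_b):.4f}, p-value={p:.4f}")
```

Changing how `runs_b` is generated (for example, a different standard deviation or heavier tails) is a simple way to explore the other cases named in the abstract. A single synthetic draw such as this one is not evidence about the relative power of the two tests.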