Publication: Quantifying Performance Changes with Effect Size Confidence Intervals - School of Computing - University of Kent

Tech report with more details of the statistics behind Tomas/Richard's approach. In particular, it describes how to construct the effect size confidence intervals in either a parametric or a non-parametric way, and gives some description of how badly the parametric approach performs when the underlying data aren't normally distributed (not very badly, as it turns out).
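
To make the non-parametric route concrete, here is a minimal sketch of a percentile bootstrap confidence interval. It assumes the effect size of interest is the ratio of mean execution times (new/old); the function name `bootstrap_ratio_ci`, the sample timings, and the resample count are all made up for illustration, and this is not necessarily the exact procedure the report uses.

```python
import random
from statistics import mean

def bootstrap_ratio_ci(old, new, resamples=10000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(new) / mean(old).

    Hypothetical helper for illustration only: each sample is resampled
    with replacement, independently of the other.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    ratios = []
    for _ in range(resamples):
        old_star = rng.choices(old, k=len(old))
        new_star = rng.choices(new, k=len(new))
        ratios.append(mean(new_star) / mean(old_star))
    ratios.sort()
    alpha = 1.0 - level
    lo = ratios[int(resamples * alpha / 2)]
    hi = ratios[int(resamples * (1 - alpha / 2)) - 1]
    return lo, hi

# Invented timings (seconds) for an "old" and a "new" system.
old_times = [10.2, 10.5, 9.9, 10.1, 10.4, 10.0, 10.3, 9.8, 10.6, 10.2]
new_times = [9.1, 9.4, 9.0, 9.3, 8.9, 9.2, 9.5, 9.0, 9.1, 9.3]

ratio = mean(new_times) / mean(old_times)
lo, hi = bootstrap_ratio_ci(old_times, new_times)
print(f"ratio of means = {ratio:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

If the whole interval lies below 1, the data are consistent with a speed-up, and the width of the interval says how precisely that speed-up has been measured.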

on 26 March 2014

FAQ: Why is the Mann-Whitney significant when the medians are equal?

A nice example of why the rank-sum (Mann-Whitney) test *doesn't* test whether the medians are the same. (Unless the distributions are otherwise very similar.)
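
The same effect can be reproduced with a small hand-rolled sketch. The data below are invented so that both samples have median 50 while one sample's values mostly sit above the other's; `mann_whitney_u` is an illustrative helper that uses the normal approximation with no tie or continuity correction, not a replacement for a statistics library.

```python
import math
from statistics import median

def mann_whitney_u(x, y):
    """Return (U, two-sided p) via the normal approximation.

    U counts the (x, y) pairs with x < y; a tie counts as one half.
    No tie or continuity correction is applied (illustration only).
    """
    n1, n2 = len(x), len(y)
    u = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p

m = 20
# Sample a: low half 1..20, median 50, high half 61..80.
a = [float(i) for i in range(1, m + 1)] + [50.0] + \
    [float(i) for i in range(61, 61 + m)]
# Sample b: low half 30..39.5, median 50, high half 101..120.
b = [30.0 + 0.5 * i for i in range(m)] + [50.0] + \
    [float(i) for i in range(101, 101 + m)]

u, p = mann_whitney_u(a, b)
print("medians:", median(a), median(b))  # both are 50.0
print("U =", u, " p =", p)
```

Here the two medians are identical, yet a randomly chosen value from `b` exceeds a randomly chosen value from `a` in roughly three pairs out of four, so the test rejects comfortably (p is well below 0.001).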

on 26 March 2014

