Adam Sampson's benchmarking bookmarks
https://bookmarks.offog.org/ats/benchmarking
Adam Sampson
2016-12-06T20:41:25Z

ripgrep is faster than {grep, ag, git grep, ucg, pt, sift} - Andrew Gallant's Blog
http://blog.burntsushi.net/ripgrep/
2016-12-06T20:41:25Z
The tool isn't hugely compelling, but the collection of benchmarks for a regexp search engine is interesting.

The USE Method
http://www.brendangregg.com/usemethod.html
2014-08-24T21:54:04Z
"For every resource, check utilization, saturation, and errors."

STABILIZER: statistically sound performance evaluation
http://dl.acm.org/citation.cfm?doid=2451116.2451141
2014-04-01T16:29:15Z
Neat trick: this uses some LLVM instrumentation to shuffle memory layout around in a program while it's running, to randomise the effects of layout on performance. As a result of the central limit theorem, this tends to normalise the distribution of timing errors too (provided your program runs long enough to have been thoroughly shuffled).

Welcome to the Evaluate Collaboratory!
http://evaluate.inf.usi.ch/
2014-03-26T17:31:43Z
Tomas and Richard are involved in this project for empirical measurement in CS. Their position paper would be sensible reading for students; it explains some of the common pitfalls of performance measurement.

Quantifying Performance Changes with Effect Size Confidence Intervals - School of Computing, University of Kent
http://www.cs.kent.ac.uk/pubs/2012/3233/
2014-03-26T17:29:50Z
Tech report with more details of the statistics behind Tomas and Richard's approach. In particular, it describes how to do the same thing in either parametric or non-parametric ways, and gives some description of how badly the parametric approach performs when the underlying data isn't normally distributed (not very badly, as it turns out).

Rigorous Benchmarking in Reasonable Time - Kent Academic Repository
http://kar.kent.ac.uk/33611/
2014-03-26T17:25:54Z
Tomas Kalibera and Richard Jones' paper on how to do benchmarking that's actually meaningful -- presenting results as confidence intervals for effect sizes, with techniques to establish i.i.d. results and to work out how many repetitions you need to do. Very nice work for a pretty short paper! (I've spent most of today chasing references from this in the interests of understanding the maths behind it...)

Statistically rigorous Java performance evaluation
http://dl.acm.org/citation.cfm?id=1297033
2014-03-26T17:24:11Z
One of the papers that inspired Tomas and Richard's rigorous benchmarking work. This is a much simpler strategy, involving looking for overlapping confidence intervals -- which is statistically pretty dubious, but common in other disciplines...

Producing wrong data without doing anything obviously wrong!
http://dl.acm.org/citation.cfm?id=1508275
2014-03-26T17:22:30Z
Lots of examples of how environmental factors (e.g. environment variable size, room temperature, link order, ASLR...) can affect experimental results, to the tune of 20% or more. Basically: why pretty much any benchmark you've seen in a paper where the effect size isn't huge is probably nonsense.
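The confidence-interval-for-effect-size idea above can be sketched with a simple percentile bootstrap on the ratio of mean run times. This is my own minimal illustration, not Kalibera and Jones' full method (which also handles hierarchical repetition, e.g. iterations within VM invocations); the timings are simulated.

```python
# Minimal sketch of an effect-size confidence interval: a percentile
# bootstrap on the ratio of mean execution times (old / new).
# NOTE: illustrative only -- not Kalibera and Jones' exact procedure.
import random

random.seed(42)

# Simulated timings (seconds) for two versions of a program.
old = [1.00 + random.gauss(0, 0.05) for _ in range(30)]
new = [0.90 + random.gauss(0, 0.05) for _ in range(30)]

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ratio_ci(a, b, reps=10000, alpha=0.05):
    """Percentile-bootstrap CI for mean(a) / mean(b)."""
    ratios = []
    for _ in range(reps):
        ra = [random.choice(a) for _ in a]   # resample with replacement
        rb = [random.choice(b) for _ in b]
        ratios.append(mean(ra) / mean(rb))
    ratios.sort()
    return ratios[int(reps * alpha / 2)], ratios[int(reps * (1 - alpha / 2))]

lo, hi = bootstrap_ratio_ci(old, new)
print(f"speedup: {mean(old) / mean(new):.3f}x, 95% CI [{lo:.3f}, {hi:.3f}]")
```

If the interval excludes 1.0, the change is distinguishable from noise at that confidence level; reporting the interval on the effect size directly, rather than eyeballing whether two separate intervals overlap, is what makes the comparison sound.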
Statistics with Confidence
http://www.cs.york.ac.uk/nature/tangent/stats.pdf
2014-03-19T10:56:42Z
Susan's stats tutorial (which I first saw at ICARIS 2009). Highly recommended for students who're doing performance measurement.

Perl, Python, Ruby, PHP, C, C++, Lua, tcl, javascript and Java comparison
http://raid6.com.au/~onlyjob/posts/arena/
2013-12-14T15:38:03Z
Comparison of lots of languages on a fairly simple string-handling problem. Interesting for the breadth of languages. I'd take his assertions with a large yellow roadside bin of salt, though.

How not to lie with statistics: the correct way to summarize benchmark results
https://dl.acm.org/citation.cfm?id=5673
2013-12-14T15:17:45Z
"Using the arithmetic mean to summarize normalized benchmark results leads to mistaken conclusions that can be avoided by using the preferred method: the geometric mean."
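The arithmetic-vs-geometric-mean problem above is easy to reproduce: normalize the same benchmark times against two different reference machines and the arithmetic mean can rank the machines both ways, while the geometric mean gives the same answer regardless of reference. A minimal sketch with invented two-benchmark timings:

```python
# The arithmetic mean of normalized benchmark times depends on which
# machine you normalize to, and can reverse the ranking; the geometric
# mean cannot. Timings below are invented purely for illustration.
import math

times = {                 # seconds per benchmark, two benchmarks each
    "A": [1.0, 10.0],
    "B": [10.0, 1.0],
}

def normalize(machine, reference):
    return [t / r for t, r in zip(times[machine], times[reference])]

def amean(xs):
    return sum(xs) / len(xs)

def gmean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Arithmetic mean: each machine "wins" when the other is the reference.
print(amean(normalize("B", "A")))   # ~5.05 -> A looks much faster
print(amean(normalize("A", "B")))   # ~5.05 -> B looks much faster
# Geometric mean: the same ranking whichever reference you pick.
print(gmean(normalize("B", "A")))   # ~1.0  -> a tie
print(gmean(normalize("A", "B")))   # ~1.0  -> a tie
```

The geometric mean is reference-independent because dividing every normalized value by a constant scales the geometric mean by exactly that constant, so ratios between machines are preserved.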