Speaking of outliers, it’s not always obvious when and why a data point is an outlier. The Overview of Outliers package in R by Antony Unwin lets you compare how different detection methods flag the same data.

Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with ‘real’ datasets. A method can be considered successful if it finds the outliers we all agree on, but do we all agree on which cases are outliers?
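To make the disagreement concrete, here is a minimal Python sketch that runs three common textbook rules on the same small sample. It is not the package's code or API. The data, the function names, and the cutoffs (3 standard deviations, 1.5 times the IQR, a modified z-score of 3.5) are all assumptions chosen for illustration.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Indices of points more than `threshold` sample SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {i for i, v in enumerate(values) if abs(v - mean) / sd > threshold}

def tukey_outliers(values, k=1.5):
    """Indices of points outside the boxplot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {i for i, v in enumerate(values) if v < low or v > high}

def mad_outliers(values, threshold=3.5):
    """Indices of points whose modified z-score (median/MAD based) exceeds `threshold`."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return set()  # no spread around the median, so the score is undefined
    return {i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > threshold}

# Made-up sample: ten ordinary values plus two large ones at indices 10 and 11.
values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 25, 40]

flags = {
    "z-score": zscore_outliers(values),
    "tukey": tukey_outliers(values),
    "mad": mad_outliers(values),
}

for method, indices in flags.items():
    print(method, sorted(indices))
```

On this sample the boxplot and MAD rules both flag the last two values, while the z-score rule flags nothing, because the two large values inflate the standard deviation they are measured against. Same data, different answers, which is the question the quote is asking.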

See also Unwin’s talk from 2017 for more about the thinking behind the package.


Nathan Yau
