K-S statistic
The Kolmogorov-Smirnov (K-S) statistic should be used as a relative
indicator of curve fit. While some users may be more familiar with
chi-square goodness-of-fit tests or general tests for normality, the K-S
test has been shown to provide superior estimates of error in
curve-fitting models (Massey, 1951).
The K-S statistic reported is alpha, the significance level of the test
of the hypothesis that the fitted curve is the same as the empirical
curve. In other words, alpha is the probability, if that hypothesis is
true, of seeing a maximum deviation at least as large as the one
observed. K-S should be a high value (Max = 1.0) when the fit is good and
a low value (Min = 0.0) when the fit is not good. When the K-S value goes
below 0.05, you will be informed that the lack of fit is significant.
As an example, if the K-S statistic is 0.4 for a Normal fit and 0.7 for a
Johnson fit, the Normal is rejected at a level of 0.6 (1 - 0.4) and the
Johnson at 0.3 (1 - 0.7). That makes the Johnson better, in that it is
rejected at a lower level and is therefore more consistent with the data.
The Normal fit has a maximum deviation that would be expected to occur by
chance only 40% of the time, and the Johnson fit a deviation that would be
expected 70% of the time. If the deviation were such that it would be
expected to occur 99% of the time, it would be an excellent fit.
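The comparison in this example can be reproduced with a few lines of Python. This is only a sketch: the function name rank_fits and the dictionary of fit names are hypothetical, and the 0.05 threshold is the one quoted above for flagging a significant lack of fit.

```python
def rank_fits(alphas, threshold=0.05):
    """Rank candidate fits from best to worst by reported K-S alpha.

    alphas: dict mapping a fit name to its reported K-S value (0.0 to 1.0).
    Returns a list of (name, alpha, reject_level, lack_of_fit) tuples.
    """
    ranked = sorted(alphas.items(), key=lambda item: item[1], reverse=True)
    result = []
    for name, alpha in ranked:
        reject_level = round(1.0 - alpha, 10)   # level at which the fit is rejected
        lack_of_fit = alpha < threshold         # flagged as significant lack of fit
        result.append((name, alpha, reject_level, lack_of_fit))
    return result


# Values taken from the example in the text.
for name, alpha, reject_level, lack_of_fit in rank_fits({"Normal": 0.4, "Johnson": 0.7}):
    print(name, alpha, reject_level, lack_of_fit)
```

The fit listed first has the highest alpha and so is rejected at the lowest level, which is the sense in which the Johnson fit is preferred here.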
The K-S criterion is based on the expectation that there is likely to be
a difference between a discrete distribution generated from data and the
continuous distribution from which it was drawn, caused by step
difference and random error. As n increases, the size of the difference
is expected to decrease. If the measured maximum difference is smaller
than that expected, then the data are consistent with the fitted
distribution and the reported alpha is high.
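The maximum difference described above can be computed directly from sorted data. The sketch below assumes the usual definition of the K-S distance (the largest gap between the empirical step function and the fitted cumulative distribution, checked on both sides of each step). The function names and the sample values are made up for illustration, and a Normal curve stands in for whatever curve was actually fitted.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution of a Normal curve with mean mu and std dev sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_distance(data, cdf):
    """Largest vertical gap between the empirical CDF of data and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must contain at least one value")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps from i/n up to (i+1)/n at x,
        # so the gap is checked just below and just above the step.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Illustrative (made-up) measurements compared against a Normal(10, 1) curve.
sample = [9.1, 9.8, 10.0, 10.3, 10.9, 11.4]
print(ks_distance(sample, lambda x: normal_cdf(x, 10.0, 1.0)))
```

Because each step of the empirical curve has height 1/n, the step-size part of this distance shrinks as n grows, which is the decrease the paragraph above refers to.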
Note that the K-S criterion is very demanding as n becomes large, because
the maximum deviation is scaled by the square root of n, reflecting an
expected decrease in the step size error. The random error and outliers
then dominate, with outliers having a strong effect on the reported value
for alpha (because K-S is a measure of maximum deviation).
Note: An asymptotic value for the K-S critical value is taken from
Dudewicz and Mishra, where "the exact values differ little from the
asymptotic values unless n (the number of samples) is very small." The
calculation includes a summation of an alternating series whose terms
decrease monotonically in magnitude, which is carried to a precision of
approximately 1E-8, so that any error in the approximation is primarily
in the assumption of sufficiently large n.
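As an illustration of the kind of summation described in this note, the sketch below uses the standard asymptotic Kolmogorov series, alpha = 2 * sum over j >= 1 of (-1)^(j-1) * exp(-2 * j^2 * lambda^2), with lambda = sqrt(n) * D, where D is the maximum deviation. Whether the software uses exactly this form is an assumption. The function name ks_alpha and the max_terms cap are hypothetical, and the 1E-8 tolerance mirrors the precision quoted above.

```python
import math


def ks_alpha(d, n, tol=1e-8, max_terms=1000):
    """Asymptotic K-S significance level for maximum deviation d and sample size n."""
    lam = math.sqrt(n) * d          # deviation scaled by the square root of n
    if lam <= 0.0:
        return 1.0                  # no deviation at all: perfect agreement
    total = 0.0
    sign = 1.0
    for j in range(1, max_terms + 1):
        term = math.exp(-2.0 * j * j * lam * lam)
        total += sign * term        # alternating series, terms shrink with j
        if term < tol:
            break                   # remaining terms are below the tolerance
        sign = -sign
    else:
        return 1.0                  # series did not settle: lam is extremely small
    return min(1.0, max(0.0, 2.0 * total))


# The same maximum deviation becomes far less acceptable as n grows.
for n in (100, 1000, 10000):
    print(n, ks_alpha(0.05, n))
```

The loop at the end shows the effect of the square-root-of-n scaling: a deviation that is unremarkable for a small sample drives alpha toward zero for a large one.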