Abstract
The problem of testing outlying observations, although an old one, is of considerable importance in applied statistics. Many and various types of significance tests have been proposed by statisticians interested in this field of application. In this connection, we bring out in the Historical Comments notable advances toward a clear formulation of the problem and important points which should be considered in attempting a complete solution. In Section 4 we state some of the situations the experimental statistician will very likely encounter in practice, these considerations being based on experience. For testing the significance of the largest observation in a sample of size $n$ from a normal population, we propose the statistic $\frac{S^2_n}{S^2} = \frac{\sum^{n-1}_{i=1} (x_i - \bar x_n)^2}{\sum^n_{i=1} (x_i - \bar x)^2},$ where $x_1 \leq x_2 \leq \cdots \leq x_n,$ $\bar x_n = \frac{1}{n - 1} \sum^{n-1}_{i=1} x_i,$ and $\bar x = \frac{1}{n}\sum^{n}_{i=1} x_i.$ A similar statistic, $S^2_1/S^2$, can be used for testing whether the smallest observation is too low. It turns out that $\frac{S^2_n}{S^2} = 1 - \frac{1}{n - 1} \big(\frac{x_n - \bar x}{s}\big)^2 = 1 - \frac{1}{n - 1} T^2_n,$ where $s^2 = \frac{1}{n}\sum^n_{i=1}(x_i - \bar x)^2$ and $T_n$ is the studentized extreme deviation already suggested by E. S. Pearson and C. Chandra Sekar [1] for testing the significance of the largest observation. Based on previous work by W. R. Thompson [12], Pearson and Chandra Sekar were able to obtain certain percentage points of $T_n$ without deriving its exact distribution. The exact distribution of $S^2_n/S^2$ (or $T_n$) is apparently derived for the first time by the present author.
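The identity relating $S^2_n/S^2$ to the studentized extreme deviation $T_n$ can be checked numerically. The following sketch is ours, not part of the paper; the function names and sample data are illustrative only. It computes the ratio directly by omitting the largest observation, and again via $1 - T^2_n/(n-1)$:

```python
import math

def largest_outlier_ratio(xs):
    """S_n^2 / S^2: sum of squares about the mean of the n-1 smallest
    observations, divided by the total sum of squares about the mean.
    Small values cast suspicion on the largest observation."""
    xs = sorted(xs)
    n = len(xs)
    xbar = sum(xs) / n
    total_ss = sum((x - xbar) ** 2 for x in xs)            # S^2
    xbar_n = sum(xs[:-1]) / (n - 1)                        # mean omitting x_n
    reduced_ss = sum((x - xbar_n) ** 2 for x in xs[:-1])   # S_n^2
    return reduced_ss / total_ss

def studentized_extreme(xs):
    """T_n = (x_n - xbar) / s, with s^2 = (1/n) * sum((x_i - xbar)^2)."""
    xs = sorted(xs)
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)
    return (xs[-1] - xbar) / s

data = [3.1, 4.0, 4.2, 4.5, 4.7, 5.0, 5.2, 9.9]
n = len(data)
direct = largest_outlier_ratio(data)
via_tn = 1 - studentized_extreme(data) ** 2 / (n - 1)
print(direct, via_tn)   # the two routes agree to rounding error
```

Since the relation is an algebraic identity, the two computations agree for any sample; the one-sided test then consists of referring the ratio to the tabled percentage points, rejecting when it is too small.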
For testing whether the two largest observations are too large we propose the statistic $\frac{S^2_{n-1,n}}{S^2} = \frac{\sum^{n-2}_{i=1} (x_i - \bar x_{n-1,n})^2}{\sum^n_{i=1} (x_i - \bar x)^2},\quad\bar x_{n-1,n} = \frac{1}{n - 2} \sum^{n-2}_{i=1} x_i,$ and a similar statistic, $S^2_{1,2}/S^2$, can be used to test the significance of the two smallest observations. The probability distributions of the sample statistics $S^2 = \sum^n_{i=1} (x_i - \bar x)^2,$ where $\bar x = \frac{1}{n} \sum^n_{i=1} x_i$; $S^2_n = \sum^{n-1}_{i=1} (x_i - \bar x_n)^2,$ where $\bar x_n = \frac{1}{n-1} \sum^{n-1}_{i=1} x_i$; and $S^2_1 = \sum^n_{i=2} (x_i - \bar x_1)^2,$ where $\bar x_1 = \frac{1}{n-1} \sum^n_{i=2} x_i,$ are derived for a normal parent, and tables of appropriate percentage points are given in this paper (Table I and Table V). Although the efficiencies of the above tests have not been completely investigated under various models for outlying observations, it is apparent that the proposed sample criteria have considerable intuitive appeal. In deriving the distributions of the sample statistics for testing the largest (or smallest) or the two largest (or two smallest) observations, it was first necessary to derive the distribution of the difference between the extreme observation and the sample mean in terms of the population $\sigma$. This probability distribution was apparently derived first by A. T. McKay [11], who employed the method of characteristic functions. The author was not aware of the work of McKay when the simplified derivation of the distribution of $\frac{x_n - \bar x}{\sigma}$ outlined in Section 5 below was worked out by him in the spring of 1945, McKay's result being called to his attention by C. C. Craig. It has been noted also that K. R.
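The two-outlier criterion follows the same pattern: omit the two largest observations, recompute the sum of squares about the reduced mean, and divide by the total sum of squares. A minimal sketch (ours; the function name and sample values are illustrative, not from the paper):

```python
def two_largest_ratio(xs):
    """S_{n-1,n}^2 / S^2: sum of squares about the mean of the n-2
    smallest observations, divided by the total sum of squares.
    Small values cast suspicion on the two largest observations."""
    xs = sorted(xs)
    n = len(xs)
    xbar = sum(xs) / n
    total_ss = sum((x - xbar) ** 2 for x in xs)               # S^2
    xbar_pair = sum(xs[:-2]) / (n - 2)    # mean omitting x_{n-1} and x_n
    reduced_ss = sum((x - xbar_pair) ** 2 for x in xs[:-2])   # S_{n-1,n}^2
    return reduced_ss / total_ss

# Two suspiciously large values depress the ratio well below 1.
sample = [3.1, 4.0, 4.2, 4.5, 4.7, 5.0, 9.6, 9.9]
print(two_largest_ratio(sample))
```

The companion statistic $S^2_{1,2}/S^2$ for the two smallest observations is obtained the same way, omitting `xs[:2]` instead of `xs[-2:]`.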
Nair [20] independently worked out and published the same derivation of the distribution of the extreme minus the mean arrived at by the present author (see Biometrika, Vol. 35, May 1948). We nevertheless include part of this derivation in Section 5 below, as it was basic to the work in connection with the derivations given in Sections 8 and 9. Our table is considerably more extensive than Nair's table of the probability integral of the extreme deviation from the sample mean in normal samples, since Nair's table runs from $n = 2$ to $n = 9,$ whereas our Table II covers $n = 2$ to $n = 25$. The present work concludes with some examples.