Linearity Measures of the P-P Plot in the Two-Sample Problem

Aplicación de medidas de linealidad del gráfico P-P al problema de dos muestras

FRANCISCO M. OJEDA1, ROSALVA L. PULIDO2, ADOLFO J. QUIROZ3, ALFREDO J. RÍOS4

1 Universidad Simón Bolívar, Departamento de Matemáticas Puras y Aplicadas, Caracas, Venezuela. Professor. Email: fojeda@usb.ve
2 Universidad Simón Bolívar, Departamento de Cómputo Científico y Estadística, Caracas, Venezuela. Professor. Email: rosalvaph@gmail.com
3 Universidad Simón Bolívar, Departamento de Cómputo Científico y Estadística, Caracas, Venezuela. Universidad de Los Andes, Departamento de Matemáticas, Bogotá, Colombia. Professor. Email: aj.quiroz1079@uniandes.edu.co
4 Universidad Simón Bolívar, Departamento de Matemáticas Puras y Aplicadas, Caracas, Venezuela. Professor. Email: alfrios@usb.ve


Abstract

We present a non-parametric statistic for the two-sample problem based on a linearity measure of the P-P plot, obtained by adapting a known statistic originally proposed for goodness of fit to a univariate parametric family. A Monte Carlo study compares the proposed method with the classical Wilcoxon and Ansari-Bradley statistics and with the Kolmogorov-Smirnov and Cramér-von Mises statistics for the two-sample problem, showing that, for certain relevant alternatives, the proposed method offers advantages in terms of power over its classical counterparts. On the theoretical side, the consistency of the proposed statistic is studied and a Central Limit Theorem is established for its distribution.

Key words: Nonparametric statistics, P-P plot, Two-sample problem.
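
The abstract does not state the formula of the statistic, so the following is only a minimal sketch of the general idea under explicit assumptions. It assumes the P-P plot is formed by evaluating both empirical distribution functions at every pooled observation, and it takes the squared Pearson correlation of those points as the linearity measure. The function names (ecdf, pp_points, pp_linearity, permutation_pvalue) are hypothetical, and the permutation calibration is a generic substitute, not the asymptotic result established in the paper.

```python
import random
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF t -> (number of sample values <= t) / n."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda t: bisect_right(ordered, t) / n


def pp_points(x, y):
    """Two-sample P-P plot points (F_m(z), G_n(z)) at each pooled value z."""
    f, g = ecdf(x), ecdf(y)
    pooled = sorted(list(x) + list(y))
    return [(f(z), g(z)) for z in pooled]


def pp_linearity(x, y):
    """Assumed linearity measure: squared Pearson correlation of the P-P points.

    Values near 1 mean the plot is close to a straight line, as expected
    when both samples come from the same distribution.
    """
    pts = pp_points(x, y)
    k = len(pts)
    mean_u = sum(u for u, _ in pts) / k
    mean_v = sum(v for _, v in pts) / k
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    s_vv = sum((v - mean_v) ** 2 for _, v in pts)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    if s_uu == 0.0 or s_vv == 0.0:
        return 0.0  # degenerate plot: one coordinate is constant
    return s_uv * s_uv / (s_uu * s_vv)


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value; small linearity values count as evidence against H0.

    This calibration is an assumption of the sketch, not the paper's method.
    """
    rng = random.Random(seed)
    observed = pp_linearity(x, y)
    pooled = list(x) + list(y)
    m = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        if pp_linearity(pooled[:m], pooled[m:]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With two samples stored as lists of numbers, pp_linearity(x, y) returns a value in [0, 1], and permutation_pvalue(x, y) gives a reproducible p-value for a fixed seed. Only the standard library is used, so the sketch stays self-contained.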


Resumen

Se presenta un estadístico no-paramétrico para el problema de dos muestras, basado en una medida de linealidad del gráfico P-P. El estadístico propuesto es la adaptación de una idea bien conocida en la literatura en el contexto de bondad de ajuste a una familia paramétrica. Se lleva a cabo una comparación Monte Carlo con los métodos clásicos de Wilcoxon y Ansari-Bradley, Kolmogorov-Smirnov y Cramér-von Mises para el problema de dos muestras. Dicha comparación demuestra que el método propuesto ofrece una potencia superior frente a ciertas alternativas relevantes. Desde el punto de vista teórico, se estudia la consistencia del método propuesto y se establece un Teorema del Límite Central para su distribución.

Palabras clave: estadísticos no-paramétricos, gráfico P-P, problema de dos muestras.





[Received February 2010. Accepted October 2011]

This article can be cited in LaTeX using the following BibTeX reference:

@ARTICLE{RCEv35n1a01,
    AUTHOR  = {Ojeda, Francisco M. and Pulido, Rosalva L. and Quiroz, Adolfo J. and R\'ios, Alfredo J.},
    TITLE   = {{Linearity Measures of the P-P Plot in the Two-Sample Problem}},
    JOURNAL = {Revista Colombiana de Estadística},
    YEAR    = {2012},
    volume  = {35},
    number  = {1},
    pages   = {1-14}
}