Choosing Reliable Statistical Software

Abstract
Should we trust the results of our statistical computations? Closely following the development of the mainframe computer, Longley (1967) criticized the accuracy of the first regression programs. Approximately every 10 years thereafter, similar criticisms have been raised against each new generation of statistical software. In a recent criticism, McCullough and Vinod (1999) argue that commonly used statistical packages may give “horrendously inaccurate” results, and that these inaccuracies have gone largely unnoticed (635–37). Moreover, they argue that, as a consequence of these inaccuracies, past inferences are in question, and future work must document and archive statistical software alongside statistical models (660–62). When political scientists discuss accuracy in computer-intensive quantitative analysis, however, we are relatively sanguine. Numerical accuracy is almost never discussed in articles or even in textbooks geared toward the most sophisticated and computationally intensive techniques (e.g., King 1989; Mooney 1997). Notable exceptions are a forthcoming APSR controversy that depends on the meaning and evaluation of numerical accuracy in ecological inference (King 2001; Tam Cho and Gaines 2001) and a study of numerical accuracy issues in replication (Altman and McDonald 2001).