John Tukey, a mathematician who made huge contributions to statistical methodology, once said: ‘Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise’ [3]. A p value provides an exact answer, but often to the wrong question.
For historical reasons, likelihoods and their ratios will probably not replace analyses using other approaches, especially the well-entrenched p value. However, the likelihood approach can supplement or complement other approaches. For some, it will add another instrument to their statistical bag of tricks.
1 Edwards AWF. Likelihood. Baltimore: Johns Hopkins University Press; 1992.
2 Neyman J, Pearson ES. On the use and interpretation of certain test criteria for purposes of statistical inference: part I. Biometrika. 1928; 20A(1/2):175–240.
3 Tukey JW. The future of data analysis. The Annals of Mathematical Statistics. 1962; 33(1):1–67.
1 The Evidence is the Evidence
It is the simple suggestion that the only valid reason for rejecting a statistical hypothesis is that some alternative hypothesis explains the observed events with a greater degree of probability. 1
—E.S. Pearson on receiving a letter from W.S. Gosset [2, p. 242]
1.1 Evidence-Based Statistics
Science advances from evidence, and scientific evidence guides decision-making, practice, and policy. Evidence-based practice encompasses numerous fields: policy, design, management, medicine, education, etc. In medicine, practitioners and patients alike rightly demand and expect that treatments used are evidence-based. To say that the use of a particular therapy is evidence-based means that it has sufficient evidence to support the benefit of its use compared with other possible treatments.
In science, data is obtained in many different ways depending on the methodology. Often the methodology is dictated by the constraints peculiar to the research area. Data can provide evidence on a number of different levels. It may be anecdotal, or it may come from observational or experimental studies. Anecdotal evidence is regarded as the weakest, although it may be the starting point for more systematic research. At the next level, multiple observations provide observational evidence, which is usually correlational in nature. A carefully designed study, such as a randomized controlled trial, can provide causal evidence for the effectiveness of a treatment. Finally, evidence from many research studies may be combined by carrying out meta-analyses and systematic reviews. Each level in the pyramid of evidence has its advantages and drawbacks.
Appropriate statistical practice is fundamental to doing good science. This book is different from most statistical texts. It is an introduction to the likelihood approach and provides practical instructions on how to convert data into statistical evidence. The likelihood approach is fully objective, producing statistical results that depend only on the observed data. As Taper and Lele said, ‘…the use of the likelihood ratio as an evidence measure is that only the models and the actual data are involved. This is quite different from the classical frequentist and error-statistical approaches, where the strength of evidence is the probability of making an error, calculated over all possible configurations of potential data’ [1, p. 538].
The likelihood approach encompasses a range of techniques grounded in established statistical theory. These techniques allow us to express relative evidence as a ratio of likelihoods. The phrases evidential approach and likelihood approach will be used interchangeably. Using the evidential approach frees us from dependence on the subjective considerations that bedevil other approaches. Based only upon observed evidence, it always informs us correctly about the relative strength of evidence for one hypothesis versus another.
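As a minimal sketch of what a ratio of likelihoods looks like in practice, the following Python snippet compares two hypothesized success probabilities for a binomial outcome. The binomial model, the data (7 successes in 10 trials), the hypothesized values (0.7 and 0.5), and the function names are all illustrative assumptions rather than examples taken from this book.

```python
from math import comb


def binomial_likelihood(p, successes, trials):
    """Likelihood of success probability p, given the observed counts."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def likelihood_ratio(p1, p2, successes, trials):
    """Relative evidence for hypothesis p1 versus hypothesis p2."""
    return binomial_likelihood(p1, successes, trials) / binomial_likelihood(
        p2, successes, trials
    )


# Hypothetical data: 7 successes observed in 10 trials.
lr = likelihood_ratio(0.7, 0.5, successes=7, trials=10)
print(f"LR for p = 0.7 versus p = 0.5: {lr:.2f}")  # prints 2.28
```

Under these assumed values, the observed data are about 2.3 times more probable under p = 0.7 than under p = 0.5. The binomial coefficient cancels in the ratio, so only the two hypotheses and the observed data affect the result.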
A fuller discussion of the difficulties with approaches associated with p values is relegated to Appendix C.
The use of evidence based on likelihoods and likelihood ratios (LRs) strikes those unfamiliar with it as highly specialized and esoteric, even arcane. There is a widespread, though misguided, belief that evidential methodology can only be used safely and credibly by highly experienced or professional statisticians. A contributing factor is that, compared with other areas of statistical methodology, there are relatively few books and research papers on the evidential approach. However, the quality of the texts makes up for their scarcity.
The most important book on the subject is Likelihood by Edwards. Originally published in 1972, it was a highly original text. An expanded edition was subsequently published in 1992 [3]. A.W.F. Edwards (pictured below) is a statistician and geneticist who did his PhD with R.A. Fisher, who was also a statistician and geneticist. Edwards's ground-breaking book covers a remarkable range of topics. It is sometimes densely written, and at other times appears to cover important topics, such as the F ratio, in a cursory fashion. The succinct text, peppered with dry humour and understatement, repays careful reading and re-reading. Many glittering gems relevant to applied statistics wait to be mined and polished.
Professor A.W.F. Edwards FRS. Source: Photo from Gonville and Caius College, Cambridge.
Royall's book [4], Statistical Evidence: A Likelihood Paradigm, published 25 years later, is a remarkable monograph, providing a tour de force of carefully argued prose and examples to convince anyone still in doubt about the merits of the evidential approach. The book adds to Edwards's work, for example by explaining how sample size calculations relevant to the evidential approach can be done.
The books by Edwards and Royall are outstanding sources of reference for theory and examples. They appeal to reason in arguing that statistical inferences based on statistical tests and Bayesian methods are flawed, and that only the likelihood approach is valid. These books may appear somewhat inaccessible to readers who lack sufficient mathematical or statistical expertise.
A deep theoretical and philosophical treatment of the likelihood approach is given by Hacking [5]. This may appeal to philosophers and theoreticians but there is little there for the applied statistician or researcher.
There are some excellent books with large sections devoted to the evidential approach. First is Dienes's excellent, cogent, and entertaining Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference [6]. Then there is the very solid and thorough treatment by Baguley in Serious Stats: A Guide to Advanced Statistics for the Behavioral Sciences [7]. Both these books offer limited computer code to perform LR calculations. Taper and Lele edited The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations, a compilation of chapters by notable authors, including Royall, Mayo, and others [1]. The chapters are accompanied by commentaries, including one by D.R. Cox, who was critical of Royall's approach; it was followed by a robust and memorable rejoinder by Royall.
The book by Aitken is a useful addition, but is limited in scope to forensic statistical evidence [8]. Pawitan's In All Likelihood is a useful mathematical treatment of a range of likelihood topics [9]. Clayton and Hills's Statistical Models in Epidemiology [10] is excellent but limits itself to epidemiological statistics. Lindsey's book Introductory Statistics: A Modelling Approach [11] makes extensive use of the likelihood approach. Kirkwood and Sterne's Medical Statistics [12] is a useful practical book that devotes a chapter to likelihood. Armitage et al.'s Statistical Methods in Medical Research [13] is a solid standard reference work for medical statistics that makes passing references to the likelihood approach. There are also some excellent books that use a modelling approach, although without likelihoods, for example Maxwell and Delaney's Designing Experiments and Analyzing Data: A Model Comparison Perspective [14] and Judd et al.'s Data Analysis: A Model Comparison Approach to Regression, ANOVA, and Beyond [15].