1 Cover
2 List of Figures
3 List of Tables
4 List of Algorithms
5 Preface The first version of this book was published in 2001, the year I left the École Nationale de la Statistique et de l'Analyse de l'Information (ENSAI) in Rennes (France) to teach at the University of Neuchâtel in Switzerland. This version grew out of several sets of course materials on sampling theory that I had taught in Rennes. At the ENSAI, the collaboration with Jean‐Claude Deville was particularly stimulating. The editing of this new edition was laborious and was done in fits and starts. I thank all those who reviewed the drafts and provided me with their comments. Special thanks to Monique Graf for her meticulous re‐reading of some chapters. The almost 20 years I spent in Neuchâtel were dotted with multiple adventures. I am particularly grateful to Philippe Eichenberger and Jean‐Pierre Renfer, who successively headed the Statistical Methods Section of the Federal Statistical Office. Their trust and professionalism helped to establish a fruitful exchange between the Institute of Statistics of the University of Neuchâtel and the Swiss Federal Statistical Office. I am also very grateful to the PhD students that I have had the pleasure of mentoring so far. Each thesis is an adventure that teaches both supervisor and doctoral student. Thank you to Alina Matei, Lionel Qualité, Desislava Nedyalkova, Erika Antal, Matti Langel, Toky Randrianasolo, Eric Graf, Caren Hasler, Matthieu Wilhelm, Mihaela Guinand‐Anastasiade, and Audrey‐Anne Vallée, who trusted me and whom I had the pleasure to supervise for a few years. Yves Tillé Neuchâtel, 2018
6 Preface to the First French Edition This book contains teaching material that I started to develop in 1994. All chapters have indeed served as a support for teaching, a course, training, a workshop or a seminar. By grouping this material, I hope to present a coherent and modern set of results on sampling, estimation, and the treatment of nonresponse, in other words, on all the statistical operations of a standard sample survey. In producing this book, my goal is not to provide a comprehensive overview of survey sampling theory, but rather to show that sampling theory is a living discipline, with a very broad scope. Where proofs have been omitted in several chapters, I have always been careful to refer the reader to bibliographical references. The abundance of very recent publications attests to the fertility of the 1990s in this area. All the developments presented in this book are based on the so‐called “design‐based” approach. In theory, there is another point of view based on population modeling. I intentionally left this approach aside, not out of disinterest, but to propose an approach that I deem consistent and ethically acceptable to the public statistician. I would like to thank all the people who, in one way or another, helped me to make this book: Laurence Broze, who entrusted me with my first sampling course at the University Lille 3, Carl Särndal, who encouraged me on several occasions, and Yves Berger, with whom I shared an office at the Université Libre de Bruxelles for several years and who gave me a multitude of relevant remarks. My thanks also go to Antonio Canedo who taught me to use LaTeX, to Lydia Zaïd who has corrected the manuscript several times, and to Jean Dumais for his many constructive comments. I wrote most of this book at the École Nationale de la Statistique et de l'Analyse de l'Information. The warm atmosphere that prevailed in the statistics department gave me a lot of support.
I especially thank my colleagues Fabienne Gaude, Camelia Goga, and Sylvie Rousseau, who meticulously reread the manuscript, and Germaine Razé, who did the work of reproduction of the proofs. Several exercises are due to Pascal Ardilly, Jean‐Claude Deville, and Laurent Wilms. I want to thank them for allowing me to reproduce them. My gratitude goes particularly to Jean‐Claude Deville for our fruitful collaboration within the Laboratory of Survey Statistics of the Center for Research in Economics and Statistics. The chapters on the splitting method and balanced sampling also reflect the research that we have done together. Yves Tillé Bruz, 2001
7 Table of Notations
8 Chapter 1: A History of Ideas in Survey Sampling Theory 1.1 Introduction 1.2 Enumerative Statistics During the 19th Century 1.3 Controversy on the use of Partial Data 1.4 Development of a Survey Sampling Theory 1.5 The US Elections of 1936 1.6 The Statistical Theory of Survey Sampling 1.7 Modeling the Population 1.8 Attempt to a Synthesis 1.9 Auxiliary Information 1.10 Recent References and Development Notes
9 Chapter 2: Population, Sample, and Estimation 2.1 Population 2.2 Sample 2.3 Inclusion Probabilities 2.4 Parameter Estimation 2.5 Estimation of a Total 2.6 Estimation of a Mean 2.7 Variance of the Total Estimator 2.8 Sampling with Replacement
10 Chapter 3: Simple and Systematic Designs 3.1 Simple Random Sampling without Replacement with Fixed Sample Size 3.2 Bernoulli Sampling 3.3 Simple Random Sampling with Replacement 3.4 Comparison of the Designs with and Without Replacement 3.5 Sampling with Replacement and Retaining Distinct Units 3.6 Inverse Sampling with Replacement 3.7 Estimation of Other Functions of Interest 3.8 Determination of the Sample Size 3.9 Implementation of Simple Random Sampling Designs 3.10 Systematic Sampling with Equal Probabilities 3.11 Entropy for Simple and Systematic Designs
11 Chapter 4: Stratification 4.1 Population and Strata 4.2 Sample, Inclusion Probabilities, and Estimation 4.3 Simple Stratified Designs 4.4 Stratified Design with Proportional Allocation 4.5 Optimal Stratified Design for the Total 4.6 Notes About Optimality in Stratification 4.7 Power Allocation 4.8 Optimality and Cost 4.9 Smallest Sample Size 4.10 Construction of the Strata 4.11 Stratification Under Many Objectives
12 Chapter 5: Sampling with Unequal Probabilities 5.1 Auxiliary Variables and Inclusion Probabilities 5.2 Calculation of the Inclusion Probabilities 5.3 General Remarks 5.4 Sampling with Replacement with Unequal Inclusion Probabilities 5.5 Nonvalidity of the Generalization of the Successive Drawing without Replacement 5.6 Systematic Sampling with Unequal Probabilities 5.7 Deville's Systematic Sampling 5.8 Poisson Sampling 5.9 Maximum Entropy Design 5.10 Rao–Sampford Rejective Procedure 5.11 Order Sampling 5.12 Splitting Method 5.13 Choice of Method 5.14 Variance Approximation 5.15 Variance Estimation Exercises
13 Chapter 6: Balanced Sampling 6.1 Introduction 6.2 Balanced Sampling: Definition 6.3 Balanced Sampling and Linear Programming 6.4 Balanced Sampling by Systematic Sampling 6.5 Method of Deville, Grosbras, and Roth 6.6 Cube Method 6.7 Variance Approximation 6.8 Variance Estimation 6.9 Special Cases of Balanced Sampling 6.10 Practical Aspects of Balanced Sampling Exercise
14 Chapter 7: Cluster and Two‐stage Sampling 7.1 Cluster Sampling 7.2 Two‐stage Sampling 7.3 Multi‐stage Designs 7.4 Selecting Primary Units with Replacement 7.5 Two‐phase Designs 7.6 Intersection of Two Independent Samples Exercises
15 Chapter 8: Other Topics on Sampling 8.1 Spatial Sampling 8.2 Coordination in Repeated Surveys 8.3 Multiple Survey Frames 8.4 Indirect Sampling 8.5 Capture–Recapture
16 Chapter 9: Estimation with a Quantitative Auxiliary Variable 9.1 The Problem 9.2 Ratio Estimator 9.3 The Difference Estimator 9.4 Estimation by Regression 9.5 The Optimal Regression Estimator 9.6 Discussion of the Three Estimation Methods
17 Chapter 10: Post‐Stratification and Calibration on Marginal Totals 10.1 Introduction 10.2 Post‐Stratification 10.3 The Post‐Stratified Estimator in Simple Designs 10.4 Estimation by Calibration on Marginal Totals 10.5 Example