Library of Congress Cataloging‐in‐Publication Data
Names: Horgan, Jane M., author.
Title: Probability with R : an introduction with computer science
applications / Jane Mary Horgan.
Description: Second edition. | Hoboken, NJ, USA : Wiley, 2020. | Includes
bibliographical references and index.
Identifiers: LCCN 2019032520 (print) | LCCN 2019032521 (ebook) | ISBN
9781119536949 (hardback) | ISBN 9781119536925 (adobe pdf) | ISBN
9781119536987 (epub)
Subjects: LCSH: Computer science–Mathematics. | Probabilities. | R
(Computer program language)
Classification: LCC QA76.9.M35 H863 2020 (print) | LCC QA76.9.M35 (ebook)
| DDC 004.01/5113–dc23
LC record available at https://lccn.loc.gov/2019032520
LC ebook record available at https://lccn.loc.gov/2019032521
Cover design by Wiley
To the memory of Willie, referee, and father of all the Horgans
Preface to the Second Edition
It is now over 10 years since the publication of the first edition of “Probability with R.” Back then we had just begun to hear of smartphones, fitbits, apps, and Bluetooth; machine learning was in its infancy. It is timely to address how probability applies to new developments in computing. The applications and examples of the first edition are beginning to look somewhat passé and old-fashioned. Here, therefore, we offer an updated and extended version of that first edition.
This second edition is still intended to be a first course in probability, addressed to students of computing and related disciplines. As in the first edition, we favor experimentation and simulation rather than the traditional mathematical approach. We continue to rely on the freely downloadable language R, which has of course evolved over the past 10 years.
Our R programs are integrated throughout the text, to illustrate the concepts of probability, to simulate distributions, and to explore new problems. We have been mindful to avoid as far as is possible mathematical details, instead encouraging students to investigate for themselves, through experimentation and simulation in R. Algebraic derivations, when deemed necessary, are developed in the appendices.
In this second edition, all chapters have been revised and updated. Examples and applications of probability in new areas of computing, as well as exercises and projects, have been added to most chapters. The R code has been improved and expanded, by using procedures and functions that have become available in recent years. Extended use of loops and curve facilities to generate graphs with differing parameters has tidied up our approach to limiting distributions.
Briefly, the changes in this second edition are:
1 Part I, “The R Language,” now contains:
  new and improved R procedures, and an introduction to packages and interfaces (Chapter 1);
  examples on apps to illustrate outliers, to calculate statistics in a data frame and statistics appropriate to skewed data (Chapter 2);
  an introduction to linear regression, with a discussion of its importance as a tool in machine learning. We show how to obtain the line of best fit with the training set, and how to use the testing set to examine the suitability of the model. We also include extra graphing facilities (Chapter 3).
2 In Part II, “Fundamentals of Probability”:
  Chapter 4 has been extended with extra examples on password recognition and new R functions to address hash table collision, server overload and the general allocation problem;
  the concept of “independence” has now been extended from pairs to multiple variables (Chapter 6);
  Chapter 7 contains new material on machine learning, notably the use of Bayes' theorem to develop spam filters.
3 Part III, “Discrete Distributions,” now includes:
  an introduction to bivariate discrete distributions, and programming techniques to handle large conditional matrices (Chapter 9);
  an algorithm to simulate the Markov property of the geometric distribution (Chapter 10);
  an extension of the reliability model of Chapter 8 to the general reliability model (Chapter 11);
  an update of the lottery rules (Chapter 12);
  an extended range of Poisson applications such as network failures, website hits, and virus attacks (Chapter 13).
4 In Part IV, “Continuous Distributions”:
  Chapters 16 and 17 have been reorganized. Chapter 17 now concentrates entirely on queues while Chapter 16 is extended to deal with the applications of the exponential distribution to lifetime models.
5 Part V, “Tailing Off,” has extra exercises on recent applications of computing.
6 We have added three new appendices: Appendix A gives the data set used in Part I, Appendix B derives the coefficients of the line of best fit, and Appendix F contains new proofs of the Markov and Chebyshev inequalities. The original appendices A, B, and C have been relabeled C, D, and E.
7 A separate index containing R commands and functions has been added.
All errors in the first edition have hopefully been corrected. I apologize in advance for any new errors that may escape my notice in this edition; should they arise, they will be corrected on the companion website.
Jane M. Horgan
Dublin City University
Ireland
2019
Preface to the First Edition
This book is offered as a first introduction to probability, and its application to computer disciplines. It has grown from a one‐semester course delivered over the past several years to students reading for a degree in computing at Dublin City University. Students of computing seem to be able happily to think about Database, Computer Architecture, Language Design, Software Engineering, Operating Systems, and then to freeze up when it comes to “Probability,” and to wonder what it might have to do with computing. Convincing undergraduates of the relevance of probability to computing is one of the objectives of this book.
One reason for writing this has been my inability to find a good text in which probability is applied to problems in computing at the appropriate level. Most existing texts on probability seem to be overly rigorous, too mathematical for the typical computing student. While some computer students may be adept at mathematics, there are many who resist the subject. In this book, we have largely replaced the mathematical approach to probability with one of simulation and experimentation, taking advantage of the powerful graphical and simulation facilities of the statistical system R, which is freely available, and downloadable, from the web. The text is designed for students who have taken a first course in mathematics, involving just a little calculus, as is usual in most degree courses in computing. Mathematical derivations in the main text are kept to a minimum: when we think it necessary, algebraic details are provided in the appendices. To emphasize our attitude to the simulation and experimentation approach, we have chosen to incorporate instructions in R throughout the text, rather than relegate them to an appendix.
Features of the book which distinguish it from other texts in probability include:
R is used not only as a tool for calculation and data analysis, but also to illustrate the concepts of probability, to simulate distributions, and to explore by experimentation different scenarios in decision‐making. The R books currently available skim over the concepts of probability, and concentrate on using R for statistical inference and modelling.
Recognizing that the student better understands definitions, generalizations and abstractions after seeing the applications, almost all new ideas are introduced and illustrated by real, computer‐related examples, covering a wide range of computer science applications.