Jane M. Horgan - Probability with R


Probability with R: summary, description, and annotation


Provides a comprehensive introduction to probability with an emphasis on computing-related applications.

This self-contained new and extended edition outlines a first course in probability applied to computer-related disciplines. As in the first edition, experimentation and simulation are favoured over mathematical proofs. The freely downloadable statistical programming language R is used throughout the text, not only as a tool for calculation and data analysis, but also to illustrate concepts of probability and to simulate distributions. The examples in Probability with R cover a wide range of computer science applications, including testing program performance; measuring response time and CPU time; estimating the reliability of components and systems; and evaluating algorithms and queuing systems.

Chapters cover: the R language; summarizing statistical data; graphical displays; the fundamentals of probability; reliability; discrete and continuous distributions; and more.

This second edition includes:
improved R code throughout the text, as well as new procedures, packages and interfaces;
updated and additional examples, exercises and projects covering recent developments of computing;
an introduction to bivariate discrete distributions together with the R functions used to handle large matrices of conditional probabilities, which are often needed in machine translation;
an introduction to linear regression with particular emphasis on its application to machine learning using testing and training data;
a new section on spam filtering using Bayes' theorem to develop the filters;
an extended range of Poisson applications such as network failures, website hits, virus attacks and accessing the cloud;
use of new allocation functions in R to deal with hash table collision, server overload and the general allocation problem.

The book is supplemented with a Wiley Book Companion Site featuring data and solutions to exercises within the book.

Primarily addressed to students of computer science and related areas, Probability with R is also an excellent text for students of engineering and the general sciences. Computing professionals who need to understand the relevance of probability in their areas of practice will find it useful.

Probability with R: preview excerpt


Write a program to calculate this, and apply it to the data in results.txt. Is it a reasonable approximation?

3 Graphical Displays

In addition to numerical summaries of statistical data, there are various pictorial representations and graphical displays available in R that have a more dramatic impact and make for a better understanding of the data. The ease and speed with which graphical displays can be produced is one of the important features of R. By writing

demo(graphics)

you will see examples of the many graphical procedures of R, along with the code needed to implement them. In this chapter, we will examine some of the most common of these.

3.1 BOXPLOTS

A boxplot is a graphical summary based on the median, quartiles, and extreme values. To display the downtime data given in Example 1.1 using a boxplot, write

boxplot(downtime)

which gives Fig. 3.1. In this plot, often called the Box and Whiskers Plot, the box represents the interquartile range, which contains the middle 50% of cases. The whiskers are the lines that extend from the box to the highest and lowest values (excluding any points plotted separately as outliers). The line across the box indicates the median.
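The values that a boxplot draws can also be inspected numerically. The following is a minimal sketch, assuming a small made-up vector downtime_demo (hypothetical values, not the data of Example 1.1); fivenum and boxplot.stats are standard R functions.

```r
# Hypothetical downtimes in minutes; these are NOT the data of Example 1.1
downtime_demo <- c(5, 8, 12, 15, 18, 21, 24, 30, 33)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(downtime_demo)
print(five)

# With no outliers present, boxplot() draws exactly these five values:
# the whisker ends, the two edges of the box, and the median line
box_values <- boxplot.stats(downtime_demo)$stats
print(box_values)

# Length of the box, measured between the hinges
box_length <- box_values[4] - box_values[2]
print(box_length)
```

Note that R draws the box between the hinges returned by fivenum, which can differ slightly from the quartiles given by quantile for some sample sizes.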

Figure 3.1 A Simple Boxplot

To improve the look of the graph, we could label the axes as follows:

boxplot(downtime, xlab = "Downtime", ylab = "Minutes")

which gives Fig. 3.2.

Figure 3.2 A Boxplot with Axis Labels

Multiple boxplots can be displayed on the same axis, by adding extra arguments to the boxplot function. For example,

boxplot(results$arch1, results$arch2, xlab = "Architecture Semesters 1 and 2")

or simply, once the columns of results are directly accessible (for example, after attach(results)),

boxplot(arch1, arch2, xlab = "Architecture Semesters 1 and 2")

gives Fig. 3.3.

Figure 3.3 Multiple Boxplots

Figure 3.3 allows us to compare the performance of the students in Architecture in the two semesters. It shows, for example, that the marks are lower in Architecture in Semester 2 and the range of marks is narrower than those obtained in Architecture in Semester 1.
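A comparison of this kind can be backed up with numbers. The sketch below assumes two made-up vectors, sem1_demo and sem2_demo (hypothetical marks, not the contents of results.txt), and uses the names argument of boxplot to label each box.

```r
# Hypothetical marks for two semesters; NOT the contents of results.txt
sem1_demo <- c(35, 48, 52, 60, 63, 67, 71, 78, 90)
sem2_demo <- c(40, 45, 50, 52, 55, 58, 60, 63, 66)

# names labels each box; boxplot() also returns its summary invisibly
bp <- boxplot(sem1_demo, sem2_demo,
              names = c("Semester 1", "Semester 2"),
              xlab = "Architecture", ylab = "Marks (%)")

# Numerical counterparts of what the plot shows
medians <- c(median(sem1_demo), median(sem2_demo))
ranges  <- c(diff(range(sem1_demo)), diff(range(sem2_demo)))
print(medians)
print(ranges)
```

In this made-up example, the second group has both the lower median and the narrower range, which is the pattern described for Fig. 3.3.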

Notice also in Fig. 3.3 that there are points outside the whiskers of the boxplot in Architecture in Semester 2. These points represent cases over 1.5 box lengths from the upper or lower end of the box and are called outliers. They are considered atypical of the data in general, being either extremely low or extremely high compared to the rest of the data.
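The 1.5 box lengths rule can be checked by hand. The following sketch assumes a made-up vector marks_demo (hypothetical values) and uses boxplot.stats, whose default coefficient is 1.5, to list the points that would be plotted as outliers.

```r
# Hypothetical marks with one unusually high value
marks_demo <- c(5, 8, 12, 15, 18, 21, 24, 30, 45)

s <- boxplot.stats(marks_demo)          # default coef = 1.5 box lengths
box_length <- s$stats[4] - s$stats[2]   # distance between the ends of the box

# Fences: points beyond these are drawn individually as outliers
lower_fence <- s$stats[2] - 1.5 * box_length
upper_fence <- s$stats[4] + 1.5 * box_length

print(c(lower_fence, upper_fence))
print(s$out)        # the values flagged as outliers
print(s$stats[5])   # the upper whisker stops at the largest non-outlying value
```

Here the value 45 lies above the upper fence of 42, so it is flagged, and the upper whisker ends at 30 rather than at the maximum.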

Looking at Exercise 1.1 with the uncorrected data, Fig. 3.4 is obtained using

boxplot(marks ~ gendermarks)

Figure 3.4 A Gender Comparison

Notice the outlier in Fig. 3.4 in the male boxplot, a value that appears large compared to the rest of the data. You will recall that a check on the examination results indicated that this value should have been 46, not 86, and we corrected it using

marks[34] <- 46

Repeating the analysis, after making this correction

boxplot(marks ~ gendermarks)

gives Fig. 3.5.

Figure 3.5 A Gender Comparison (corrected)

You will now observe from Fig. 3.5 that there are no outliers in the male or female data. In this way, a boxplot may be used as a data validation tool. Of course, it is possible that the mark of 86 may in fact be valid, and that a male student did indeed obtain a mark that was much higher than his classmates. A boxplot highlights this and alerts us to the possibility of an error.
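The same validation check can be made without looking at the plot. The sketch below assumes a made-up data frame exam_demo (hypothetical marks and genders, not the data of Exercise 1.1) in which one male mark has been mistyped as 86; calling boxplot with plot = FALSE returns the flagged points instead of drawing them.

```r
# Hypothetical data: the last male mark was mistyped as 86 instead of 46
exam_demo <- data.frame(
  marks  = c(55, 60, 62, 65, 70, 40, 42, 45, 48, 50, 86),
  gender = factor(c(rep("f", 5), rep("m", 6)))
)

# plot = FALSE returns the boxplot summary without drawing anything
before <- boxplot(marks ~ gender, data = exam_demo, plot = FALSE)
print(before$out)                   # values flagged as outliers
print(before$names[before$group])   # the group each flagged value belongs to

# Correct the suspect entry and repeat the check
exam_demo$marks[11] <- 46
after <- boxplot(marks ~ gender, data = exam_demo, plot = FALSE)
print(length(after$out))            # number of outliers remaining
```

A flagged value is only a prompt to go back to the source records; whether it is an error has to be decided from those records, not from the plot.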

To compare the performance of females and males in Architecture in Semester 1, write

gender <- factor(gender, levels = c("f", "m"), labels = c("Female", "Male"))

which changes the labels from “f” and “m” to “Female” and “Male,” respectively. Then

boxplot(arch1 ~ gender, ylab = "Marks (%)", main = "Architecture Semester 1", font.main = 1)

outputs Fig. 3.6.

Figure 3.6 A Gender Comparison

Notice the effect of using main = "Architecture Semester 1", which puts the title on the diagram. Also, the use of font.main = 1 ensures that the main title is in plain font.
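The relabelling step is easy to check in isolation. This is a small sketch, assuming a made-up vector gender_demo (hypothetical values); it shows that factor with levels and labels replaces the codes while keeping the order of the observations.

```r
# Hypothetical gender codes; the real data come from results.txt
gender_demo <- c("f", "m", "m", "f", "m")

# levels gives the codes as stored; labels gives the names to show instead
gender_demo <- factor(gender_demo, levels = c("f", "m"),
                      labels = c("Female", "Male"))

print(levels(gender_demo))   # the new labels, in the order given
print(table(gender_demo))    # counts per label
```

The order of levels also fixes the left-to-right order of the boxes when the factor is used on the right-hand side of a boxplot formula.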

We can display plots as a matrix using the par function: par(mfrow = c(2,2)) causes the outputs to be displayed in a 2 × 2 array.

par(mfrow = c(2,2))
boxplot(arch1 ~ gender, main = "Architecture Semester 1", font.main = 1)
boxplot(arch2 ~ gender, main = "Architecture Semester 2", font.main = 1)
boxplot(prog1 ~ gender, main = "Programming Semester 1", font.main = 1)
boxplot(prog2 ~ gender, main = "Programming Semester 2", font.main = 1)

produces Fig. 3.7.

Figure 3.7 A Lattice of Boxplots

We see from Fig. 3.7 that female students seem to do less well than their male counterparts in Programming in Semester 1, where the median mark of the females is considerably lower than that of the males: it is lower even than the first quartile of the male marks. In the other subjects, there do not appear to be any substantial differences.

To undo a matrix‐type output, write

par(mfrow = c(1,1))

which restores the graphics output to the full screen.
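An alternative way to undo the layout is to save the previous settings and put them back afterwards. The sketch below assumes a made-up list modules_demo (hypothetical marks); it relies on the fact that par returns the old values of the parameters it changes.

```r
# Hypothetical marks for four modules; NOT the contents of results.txt
modules_demo <- list(
  "Module A" = c(42, 55, 58, 61, 70),
  "Module B" = c(35, 48, 52, 66, 74),
  "Module C" = c(50, 53, 57, 62, 68),
  "Module D" = c(30, 45, 59, 63, 81)
)

# par() returns the previous settings, so they can be restored later
old_par <- par(mfrow = c(2, 2))

for (title in names(modules_demo)) {
  boxplot(modules_demo[[title]], main = title, font.main = 1)
}

# Restore whatever layout was in force before the 2 x 2 array
par(old_par)
print(par("mfrow"))
```

Restoring the saved settings has the same effect as par(mfrow = c(1,1)) when the session started from the default layout, and it also works when some other layout was in use.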

3.2 HISTOGRAMS

A histogram is a graphical display of frequencies in categories of a variable and is the traditional way of examining the “shape” of the data.

hist(prog1, xlab = "Marks (%)", main = "Programming Semester 1")

yields Fig. 3.8.

Figure 3.8 A Histogram with Default Breaks

As we can see from Fig. 3.8, hist gives the count of the observations that fall within the categories, or “bins” as they are sometimes called. R chooses a “suitable” number of categories, unless otherwise specified. Alternatively, breaks may be used as an argument in hist to determine the number of categories. For example, to request five categories of equal width, include breaks = 5 as an argument; R treats a single number as a suggestion and may adjust it to give round break points.

hist(prog1, xlab = "Marks (%)", main = "Programming Semester 1", breaks = 5)
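The bins that hist actually uses can be read from its return value. The following sketch assumes a made-up vector prog_demo (hypothetical marks, not the prog1 column of results.txt); it compares a suggested number of breaks with an explicit vector of break points, which fixes the bins exactly.

```r
# Hypothetical marks; NOT the prog1 column of results.txt
prog_demo <- c(12, 25, 33, 41, 47, 52, 58, 63, 67, 71, 74, 78, 85, 91)

# A single number is only a suggested number of bins
h_suggested <- hist(prog_demo, breaks = 5, plot = FALSE)
print(h_suggested$breaks)

# A vector of break points fixes the bins exactly: five bins of width 20
h_exact <- hist(prog_demo, breaks = seq(0, 100, by = 20), plot = FALSE)
print(h_exact$breaks)
print(h_exact$counts)   # number of observations in each bin
```

Supplying the break points explicitly is the safer choice when the same bins are needed across several histograms, since a suggested count may be rounded differently for different data.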
