Iain Pardoe - Applied Regression Modeling

Здесь есть возможность читать онлайн «Iain Pardoe - Applied Regression Modeling» — ознакомительный отрывок электронной книги совершенно бесплатно, а после прочтения отрывка купить полную версию. В некоторых случаях можно слушать аудио, скачать через торрент в формате fb2 и присутствует краткое содержание. Жанр: unrecognised, на английском языке. Описание произведения, (предисловие) а так же отзывы посетителей доступны на портале библиотеки ЛибКат.

Applied Regression Modeling: краткое содержание, описание и аннотация

Предлагаем к чтению аннотацию, описание, краткое содержание или предисловие (зависит от того, что написал сам автор книги «Applied Regression Modeling»). Если вы не нашли необходимую информацию о книге — напишите в комментариях, мы постараемся отыскать её.

Master the fundamentals of regression without learning calculus with this one-stop resource The newly and thoroughly revised 3rd Edition of
delivers a concise but comprehensive treatment of the application of statistical regression analysis for those with little or no background in calculus. Accomplished instructor and author Dr. Iain Pardoe has reworked many of the more challenging topics, included learning outcomes and additional end-of-chapter exercises, and added coverage of several brand-new topics including multiple linear regression using matrices.
The methods described in the text are clearly illustrated with multi-format datasets available on the book's supplementary website. In addition to a fulsome explanation of foundational regression techniques, the book introduces modeling extensions that illustrate advanced regression strategies, including model building, logistic regression, Poisson regression, discrete choice models, multilevel models, Bayesian modeling, and time series forecasting. Illustrations, graphs, and computer software output appear throughout the book to assist readers in understanding and retaining the more complex content.
covers a wide variety of topics, like:
Simple linear regression models, including the least squares criterion, how to evaluate model fit, and estimation/prediction Multiple linear regression, including testing regression parameters, checking model assumptions graphically, and testing model assumptions numerically Regression model building, including predictor and response variable transformations, qualitative predictors, and regression pitfalls Three fully described case studies, including one each on home prices, vehicle fuel efficiency, and pharmaceutical patches Perfect for students of any undergraduate statistics course in which regression analysis is a main focus,
also belongs on the bookshelves of non-statistics graduate students, including MBAs, and for students of vocational, professional, and applied courses like data science and machine learning.

Applied Regression Modeling — читать онлайн ознакомительный отрывок

Ниже представлен текст книги, разбитый по страницам. Система сохранения места последней прочитанной страницы, позволяет с удобством читать онлайн бесплатно книгу «Applied Regression Modeling», без необходимости каждый раз заново искать на чём Вы остановились. Поставьте закладку, и сможете в любой момент перейти на страницу, на которой закончили чтение.

Тёмная тема
Сбросить

Интервал:

Закладка:

Сделать

From a statistical perspective, a distribution (strictly speaking, a probability distribution ) is a theoretical model that describes how a random variable varies. For our purposes, a random variable represents the data values of interest in the population, for example, the sale prices of all single‐family homes in our housing market. One way to represent the population distribution of data values is in a histogram, as described in Section 1.1. The difference now is that the histogram displays the whole population rather than just the sample. Since the population is so much larger than the sample, the bins of the histogram (the consecutive ranges of the data that comprise the horizontal intervals for the bars) can be much smaller than in Figure 1.1. For example, Figure 1.2shows a histogram for a simulated population of sale prices The scale of the vertical axis now represents proportions - фото 97sale prices. The scale of the vertical axis now represents proportions (density) rather than the counts (frequency) of Figure 1.1.

Figure 12Histogram for a simulated population of sale prices together with a - фото 98

Figure 1.2Histogram for a simulated population of картинка 99sale prices, together with a normal density curve.

As the population size gets larger, we can imagine the histogram bars getting thinner and more numerous, until the histogram resembles a smooth curve rather than a series of steps. This smooth curve is called a density curve and can be thought of as the theoretical version of the population histogram. Density curves also provide a way to visualize probability distributions such as the normal distribution. A normal density curve is superimposed on Figure 1.2. The simulated population histogram follows the curve quite closely, which suggests that this simulated population distribution is quite close to normal.

To see how a theoretical distribution can prove useful for making statistical inferences about populations such as that in our home prices example, we need to look more closely at the normal distribution. To begin, we consider a particular version of the normal distribution, the standard normal , as represented by the density curve in Figure 1.3. Random variables that follow a standard normal distribution have a mean of 0 (represented in Figure 1.3by the curve being symmetric about 0, which is under the highest point of the curve) and a standard deviation of 1 (represented in Figure 1.3by the curve having a point of inflection—where the curve bends first one way and then the other—at картинка 100and картинка 101). The normal density curve is sometimes called the “bell curve” since its shape resembles that of a bell. It is a slightly odd bell, however, since its sides never quite reach the ground (although the ends of the curve in Figure 1.3are quite close to zero on the vertical axis, they would never actually quite reach there, even if the graph were extended a very long way on either side).

Figure 13Standard normal density curve together with a shaded area of between - фото 102

Figure 1.3Standard normal density curve together with a shaded area of картинка 103between картинка 104and картинка 105, which represents the probability that a standard normal random variable lies between картинка 106and картинка 107.

The key feature of the normal density curve that allows us to make statistical inferences is that areas under the curve represent probabilities. The entire area under the curve is one, while the area under the curve between one point on the horizontal axis ( картинка 108, say) and another point ( картинка 109, say) represents the probability that a random variable that follows a standard normal distribution is between картинка 110and картинка 111. So, for example, Figure 1.3shows there is a probability of 0.475 that a standard normal random variable lies between картинка 112and картинка 113, since the area under the curve between картинка 114and картинка 115is 0.475.

We can obtain values for these areas or probabilities from a variety of sources: tables of numbers, calculators, spreadsheet or statistical software, Internet websites, and so on. In this book, we print only a few select values since most of the later calculations use a generalization of the normal distribution called the “t‐distribution.” Also, rather than areas such as that shaded in Figure 1.3, it will become more useful to consider “tail areas” (e.g., to the right of point картинка 116), and so for consistency with later tables of numbers, the following table allows calculation of such tail areas: Normal distribution probabilities (tail areas) and percentiles (horizontal axis values)

Upper‐tail area 0.1 0.05 0.025 0.01 0.005 0.001
Horizontal axis value 1.282 1.645 1.960 2.326 2.576 3.090
Two‐tail area 0.2 0.1 0.05 0.02 0.01 0.002

In particular, the upper‐tail area to the right of 1.960 is 0.025; this is equivalent to saying that the area between 0 and 1.960 is 0.475 (since the entire area under the curve is 1 and the area to the right of 0 is 0.5). Similarly, the two‐tail area, which is the sum of the areas to the right of 1.960 and to the left of −1.960, is two times 0.025, or 0.05.

How does all this help us to make statistical inferences about populations such as that in our home prices example? The essential idea is that we fit a normal distribution model to our sample data and then use this model to make inferences about the corresponding population. For example, we can use probability calculations for a normal distribution (as shown in Figure 1.3) to make probability statements about a population modeled using that normal distribution—we will show exactly how to do this in Section 1.3. Before we do that, however, we pause to consider an aspect of this inferential sequence that can make or break the process. Does the model provide a close enough approximation to the pattern of sample values that we can be confident the model adequately represents the population values? The better the approximation, the more reliable our inferential statements will be.

Читать дальше
Тёмная тема
Сбросить

Интервал:

Закладка:

Сделать

Похожие книги на «Applied Regression Modeling»

Представляем Вашему вниманию похожие книги на «Applied Regression Modeling» списком для выбора. Мы отобрали схожую по названию и смыслу литературу в надежде предоставить читателям больше вариантов отыскать новые, интересные, ещё непрочитанные произведения.


Отзывы о книге «Applied Regression Modeling»

Обсуждение, отзывы о книге «Applied Regression Modeling» и просто собственные мнения читателей. Оставьте ваши комментарии, напишите, что Вы думаете о произведении, его смысле или главных героях. Укажите что конкретно понравилось, а что нет, и почему Вы так считаете.

x