
Bhisham C. Gupta - Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP


Title:
Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP
Author:
Bhisham C. Gupta

Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP: summary and description


Introduces basic concepts in probability and statistics to data science students, as well as engineers and scientists.

Aimed at undergraduate/graduate-level engineering and natural science students, this timely, fully updated edition of a popular book on statistics and probability shows how real-world problems can be solved using statistical concepts. It removes Excel exhibits and replaces them with R software throughout, and updates both MINITAB and JMP software instructions and content. A new chapter discussing data mining (including big data, classification, machine learning, and visualization) is featured. Another new chapter covers cluster analysis methodologies in hierarchical, nonhierarchical, and model-based clustering. The book also offers a chapter on Response Surfaces that previously appeared on the book's companion website.

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition:

- Features two new chapters, one on Data Mining and another on Cluster Analysis
- Now contains R exhibits including code, graphical display, and some results
- MINITAB and JMP have been updated to their latest versions
- Emphasizes the p-value approach and includes related practical interpretations
- Offers a more applied statistical focus, and features modified examples to better exhibit statistical concepts
- Supplemented with an Instructor's-only solutions manual on the book's companion website

The book is an excellent text for graduate-level data science students, and engineers and scientists. It is also an ideal introduction to applied statistics and probability for undergraduate students in engineering and the natural sciences.


2.1.1 What Is Statistics?

The term statistics is commonly used in two ways. On the one hand, we use the term in day-to-day communication to refer to a collection of numbers or facts. The following are some examples of statistics:

1. In 2000, the salaries of CEOs from 10 selected companies ranged from $2 million to $5 million.

2. On average, the starting salary of engineers is 40% higher than that of technicians.

3. In 2007, over 45 million people in the United States did not have health insurance.

4. In 2008, the average tuition of private colleges soared to over $40,000.

5. In the United States, seniors spend a significant portion of their income on health care.

6. The R&D budget of the pharmaceutical division of a company is higher than the R&D budget of its biomedical division.

7. In December 2009, a total of 43 states reported rising jobless rates.

On the other hand, statistics is a scientific subject that provides techniques for collecting, organizing, summarizing, analyzing, and interpreting data, with the results serving as input for making appropriate decisions. In a broad sense, the subject of statistics can be divided into two parts: descriptive statistics and inferential statistics.

Descriptive statistics uses techniques to organize, summarize, analyze, and interpret the information contained in a data set to draw conclusions that do not go beyond the boundaries of the data set. Inferential statistics uses techniques that allow us to draw conclusions about a large body of data based on the information obtained by analyzing a small portion of these data. In this book, we study both descriptive statistics and inferential statistics. This chapter discusses the topics of descriptive statistics. Chapters 3 through 7 are devoted to building the tools needed to study inferential statistics, and the rest of the chapters are mostly dedicated to inferential statistics.
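The distinction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the simulated measurements, the names `population` and `sample`, the sample size of 50, and the use of a normal-approximation interval. It is not taken from the book's own exhibits.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is reproducible

# Hypothetical large body of data: 10,000 simulated measurements.
population = [rng.gauss(1000, 100) for _ in range(10_000)]

# A small portion of these data, drawn without replacement.
sample = rng.sample(population, 50)

# Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Inferential statistics: use the sample to say something about the
# larger body of data, here a rough 95% interval for its mean
# (normal approximation, 1.96 standard errors on each side).
std_error = sample_sd / len(sample) ** 0.5
interval = (sample_mean - 1.96 * std_error, sample_mean + 1.96 * std_error)

print(f"sample mean = {sample_mean:.1f}, sample sd = {sample_sd:.1f}")
print(f"approximate 95% interval: ({interval[0]:.1f}, {interval[1]:.1f})")
```

The first two summaries describe only the 50 observed values. The interval is a statement about all 10,000 values, which is why it carries uncertainty.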

2.1.2 Population and Sample in a Statistical Study

In a very broad sense, statistics may be defined as the science of collecting and analyzing data. The tradition of collecting data is centuries old. In European countries, numerous government agencies started keeping records on births, deaths, and marriages about four centuries ago. However, scientific methods of analyzing such data are not old. Most of the advanced techniques of analyzing data have in fact been developed only in the twentieth century, and routine use of these techniques became possible only after the invention of modern computers.

During the last four decades, the use of advanced statistical techniques has grown rapidly. The collection and analysis of various kinds of data have become essential in agriculture, pharmaceuticals, business, medicine, engineering, manufacturing, and product distribution, as well as in government and nongovernment agencies. In a typical field, there is often a need to collect quantitative information on all elements of interest, the collection of which is usually referred to as the population. The problem with collecting values on all elements, however, is that populations are usually so large that examining each element is not feasible. For instance, suppose that we are interested in determining the breaking strength of the filament in a type of electric bulb manufactured by a particular company. Clearly, in this case, examining each and every bulb means that we have to wait until each bulb dies. Thus, it is unreasonable to collect data on all the elements of interest. In other cases, examining all the elements may be quite expensive, time-consuming, or both. Thus, we almost always end up examining only a small portion of a population, which is usually referred to as a sample. More formally, we may define population and sample as follows:

Definition 2.1.1

A population is a collection of all elements that possess a characteristic of interest.

Populations can be finite or infinite. A population where all the elements are easily countable may be considered as finite, and a population where all the elements are not easily countable as infinite. For example, a production batch of ball bearings may be considered a finite population, whereas all the ball bearings that may be produced from a certain manufacturing line are considered conceptually as being infinite.

Definition 2.1.2

A portion of a population selected for study is called a sample.

Definition 2.1.3

The target population is the population about which we want to make inferences based on the information contained in a sample.

Definition 2.1.4

The population from which a sample is being selected is called a sampled population.

The population from which a sample is being selected is called the sampled population, and the population being studied is called the target population. Usually, these two populations coincide, since every effort should be made to ensure that the sampled population is the same as the target population. However, because of financial reasons, time constraints, part of the population not being easily accessible, the unexpected loss of part of the population, and so forth, the sampled population may not be equivalent to the whole target population. In such cases, conclusions drawn about the sampled population are usually not applicable to the target population.
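A small numerical sketch in Python shows why conclusions may not carry over when the two populations differ. The income figures and the assumption that the two highest-income households are inaccessible are hypothetical, chosen only to make the effect visible.

```python
import statistics

# Hypothetical target population: household incomes in thousands of dollars.
target_population = [12, 18, 25, 40, 55, 70, 90, 150, 300, 900]

# Assumption for illustration: the two highest-income households cannot be
# reached, so the sampled population is only a part of the target population.
sampled_population = target_population[:-2]

target_mean = statistics.mean(target_population)
sampled_mean = statistics.mean(sampled_population)

print(f"mean income, target population:  {target_mean}")
print(f"mean income, sampled population: {sampled_mean}")
```

Even a complete census of the sampled population would give a mean of 57.5 here, far below the target population's mean of 166, so no sample drawn from the sampled population can correct the discrepancy.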

In almost all statistical studies, the conclusions about a population are based on the information drawn from a sample. In order to obtain useful information about a population by studying a sample, it is important that the sample be a representative sample; that is, the sample should possess the characteristics of the population under investigation. For example, if we are interested in studying the family incomes in the United States, then our sample must consist of representative families that are very poor, poor, middle class, rich, and very rich. One way to achieve this goal is by taking a random sample.

Definition 2.1.5

A sample is called a simple random sample if each element of the population has the same chance of being included in the sample.
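As a minimal sketch of this definition in Python, we can use the standard library function `random.sample`, which draws without replacement. The population of 20 labelled units, the sample size of 5, and the number of repetitions are assumptions chosen for illustration.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical finite population: 20 units labelled 1 through 20.
population = list(range(1, 21))
n = 5

# One simple random sample of size n, drawn without replacement.
srs = rng.sample(population, n)
print("one simple random sample:", srs)

# Repeat the draw many times and record how often each unit is included.
# Each unit should appear in roughly n / len(population) = 0.25 of the samples.
trials = 20_000
counts = Counter()
for _ in range(trials):
    counts.update(rng.sample(population, n))

freqs = {unit: counts[unit] / trials for unit in population}
print("smallest inclusion frequency:", min(freqs.values()))
print("largest inclusion frequency: ", max(freqs.values()))
```

The inclusion frequencies cluster around 0.25 for every unit, which is the "same chance of being included" property stated in the definition.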

There are several techniques of selecting a random sample, but the concept that each element of the population has the same chance of being included in a sample forms the basis of all random sampling, namely simple random sampling, systematic random sampling, stratified random sampling, and cluster random sampling. These four different types of sampling schemes are usually referred to as sample designs.
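The three designs other than simple random sampling can be sketched in Python as follows. These are common textbook forms and not necessarily the exact procedures the book goes on to describe. The helper names (`systematic_sample`, `stratified_sample`, `cluster_sample`), the population of 100 labelled units, and the particular strata and clusters are all hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
population = list(range(1, 101))  # hypothetical units labelled 1 through 100

def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)
    return units[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical strata (two halves) and clusters (ten boxes of ten units).
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"box{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

sys_s = systematic_sample(population, 10, rng)
strat_s = stratified_sample(strata, 5, rng)
clus_s = cluster_sample(clusters, 2, rng)

print("systematic:", sys_s)
print("stratified:", strat_s)
print("cluster:   ", clus_s)
```

In this sketch every unit has a 1 in 10 chance of inclusion under the systematic and stratified designs and a 2 in 10 chance under the cluster design, so the chance is equal across units within each design.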

Since collecting each data point costs time and money, it is important that in taking a sample, some balance be kept between the sample size and resources available. Too small a sample may not provide much useful information, but too large a sample may result in a waste of resources. Thus, it is very important that in any sampling procedure, an appropriate sampling design is selected. In this section, we will review, very briefly, the four sample designs mentioned previously.

Before taking any sample, we need to divide the target population into nonoverlapping units, usually known as sampling units. It is important to recognize that the sampling units in a given population may not always be the same. Sampling units are in fact determined by the sample design chosen. For example, in sampling voters in a metropolitan area, the sampling units might be individual voters, all voters in a family, all voters living in a town block, or all voters in a town. Similarly, in sampling parts from a manufacturing plant, the sampling units might be an individual part or a box containing several parts.