
Maria Cristina Mariani - Data Science in Theory and Practice


Title: Data Science in Theory and Practice
Author: Maria Cristina Mariani
Genre: unrecognised (in English)

Data Science in Theory and Practice: summary, description, and annotation


DATA SCIENCE IN THEORY AND PRACTICE delivers a comprehensive treatment of the mathematical and statistical models useful for analyzing data sets arising in various disciplines, such as banking, finance, health care, bioinformatics, security, education, and social services. Written in five parts, the book examines some of the most commonly used and fundamental mathematical and statistical concepts that form the basis of data science. The authors go on to analyze various data transformation techniques useful for extracting information from raw data, long memory behavior, and predictive modeling. The book offers readers a multitude of topics all relevant to the analysis of complex data sets. Along with a robust exploration of the theory underpinning data science, it contains numerous applications to specific and practical problems. The book also provides code examples in R and Python, along with pseudo-algorithms for porting the code to any other language.

Ideal for students and practitioners without a strong background in data science, the book also covers topics like:

- Analyses of foundational theoretical subjects, including the history of data science, matrix algebra and random vectors, and multivariate analysis
- A comprehensive examination of time series forecasting, including the different components of time series and transformations to achieve stationarity
- Introductions to both the R and Python programming languages, including basic data types and sample manipulations for both languages
- An exploration of algorithms, including how to write one and how to perform an asymptotic analysis
- A comprehensive discussion of several techniques for analyzing and predicting complex data sets

Perfect for advanced undergraduate and graduate students in Data Science, Business Analytics, and Statistics programs, Data Science in Theory and Practice will also earn a place in the libraries of practicing data scientists, data and business analysts, and statisticians in the private sector, government, and academia.

Data Science in Theory and Practice: excerpt


Definition 2.21 (Probability density function) The pdf $f$ of a continuous random vector $\mathbf{X} = (X_1, \ldots, X_n)^T$ with distribution function $F$ is the function that satisfies

$$F(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(u_1, \ldots, u_n)\, du_n \cdots du_1 .$$

We will discuss these notations in detail in Chapter 20.

Using these concepts, we can define the moments of the distribution. Suppose that $g : \mathbb{R}^n \to \mathbb{R}$ is any function. When the joint density $f$ exists, the expected value of the random variable $g(\mathbf{X})$, where $\mathbf{X} = (X_1, \ldots, X_n)^T$, can be calculated as:

$$E[g(\mathbf{X})] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \ldots, x_n)\, f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n .$$

Now we can define the moments of the random vector. The first moment is a vector

$$E[\mathbf{X}] = \boldsymbol{\mu} = \left(E[X_1], \ldots, E[X_n]\right)^T .$$

The expectation applies to each component in the random vector. Expectations of functions of random vectors are computed just as with univariate random variables. We recall that the expectation of a random variable is its average value.
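As a rough illustration of these expectations, the sketch below estimates a mean vector and $E[g(\mathbf{X})]$ by averaging over simulated draws. The two-dimensional distribution, the function `sample_vector`, and the choice $g(x_1, x_2) = x_1 x_2$ are assumptions made only for this example; they do not come from the book.

```python
import random


def sample_vector(rng):
    """Draw one realization of a hypothetical random vector X = (X1, X2)."""
    x1 = rng.gauss(1.0, 1.0)               # X1 ~ N(mean 1, sd 1)
    x2 = 2.0 * x1 + rng.gauss(0.0, 0.5)    # X2 = 2*X1 + independent noise
    return (x1, x2)


def mean_vector(samples):
    """Componentwise sample average: an estimate of E[X]."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]


def expectation_of(g, samples):
    """Sample average of g(X): an estimate of E[g(X)]."""
    return sum(g(s) for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [sample_vector(rng) for _ in range(50_000)]

mu_hat = mean_vector(samples)                            # true value is (1, 2)
e_g = expectation_of(lambda x: x[0] * x[1], samples)     # true value is 4

print("estimated mean vector:", mu_hat)
print("estimated E[X1 * X2]:", e_g)
```

For this assumed distribution the exact values are $E[\mathbf{X}] = (1, 2)^T$ and $E[X_1 X_2] = 2E[X_1^2] = 4$, so the printed estimates should land close to those numbers.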

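The second moment requires calculating all pairwise combinations of the components, and the result can be presented in matrix form. The second central moment can be presented as the covariance matrix

$$\operatorname{Cov}(\mathbf{X}) = E\left[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T\right] = \begin{pmatrix} \operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_n) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \cdots & \operatorname{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_n, X_1) & \operatorname{Cov}(X_n, X_2) & \cdots & \operatorname{Var}(X_n) \end{pmatrix} \tag{2.1}$$

where $\boldsymbol{\mu} = E[\mathbf{X}]$ and we used the transpose matrix notation. Since $\operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i)$, the matrix is symmetric.

We note that the covariance matrix is positive semidefinite (nonnegative definite), i.e. for any vector $\mathbf{a} \in \mathbb{R}^n$, we have $\mathbf{a}^T \operatorname{Cov}(\mathbf{X})\, \mathbf{a} \ge 0$.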

Now we explain why the covariance matrix has to be positive semidefinite. Take any vector $\mathbf{a} \in \mathbb{R}^n$. Then the product

$$\mathbf{a}^T \mathbf{X} = \sum_{i=1}^{n} a_i X_i \tag{2.2}$$

is a one-dimensional random variable, and its variance must be nonnegative. This is because in the one-dimensional case, the variance of a random variable $Y$ is defined as $\operatorname{Var}(Y) = E\left[(Y - E[Y])^2\right]$. We see that the variance is nonnegative for every random variable, and it is equal to zero if and only if the random variable is constant (almost surely). The expectation of (2.2) is $E[\mathbf{a}^T \mathbf{X}] = \mathbf{a}^T \boldsymbol{\mu}$, with $\boldsymbol{\mu} = E[\mathbf{X}]$. Then we can write (since for any number $b$, $b^2 = b\, b^T$)

$$\operatorname{Var}(\mathbf{a}^T \mathbf{X}) = E\left[(\mathbf{a}^T \mathbf{X} - \mathbf{a}^T \boldsymbol{\mu})^2\right] = E\left[\mathbf{a}^T (\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T \mathbf{a}\right] = \mathbf{a}^T \operatorname{Cov}(\mathbf{X})\, \mathbf{a} .$$

Since the variance is always nonnegative, the covariance matrix must be nonnegative definite (or positive semidefinite). We recall that a square symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semidefinite if $\mathbf{a}^T A\, \mathbf{a} \ge 0$ for all $\mathbf{a} \in \mathbb{R}^n$. The difference from positive definiteness, which requires strict inequality for every nonzero $\mathbf{a}$, is important in the context of random variables: you may be able to construct a linear combination $\mathbf{a}^T \mathbf{X}$ with $\mathbf{a} \ne \mathbf{0}$ whose variance is equal to zero even though the individual components are not constant.

The covariance matrix is discussed in detail in Chapter 3.
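The positive semidefiniteness argument can be checked numerically. The sketch below is a minimal illustration, assuming a made-up three-dimensional vector whose third component is the sum of the first two. It builds the sample covariance matrix, compares the quadratic form $\mathbf{a}^T \operatorname{Cov}(\mathbf{X})\, \mathbf{a}$ with the variance of $\mathbf{a}^T \mathbf{X}$ computed directly, and shows a nonzero $\mathbf{a}$ for which the variance vanishes. All function and variable names are hypothetical.

```python
import random


def covariance_matrix(samples):
    """Sample covariance matrix (1/n normalization) of a list of vectors."""
    n = len(samples)
    dim = len(samples[0])
    mu = [sum(s[i] for s in samples) / n for i in range(dim)]
    return [
        [sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
         for j in range(dim)]
        for i in range(dim)
    ]


def quadratic_form(a, matrix):
    """Compute a^T M a for a vector a and a square matrix M."""
    dim = len(a)
    return sum(a[i] * matrix[i][j] * a[j]
               for i in range(dim) for j in range(dim))


def variance(values):
    """Sample variance (1/n normalization) of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


rng = random.Random(1)  # fixed seed so the run is reproducible
samples = []
for _ in range(5000):
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.0, 2.0)
    samples.append((x1, x2, x1 + x2))  # assumed example: X3 = X1 + X2

cov = covariance_matrix(samples)

# Variance of a^T X computed two ways: directly and via the quadratic form.
a = (1.0, -2.0, 0.5)
projected = [sum(ai * xi for ai, xi in zip(a, s)) for s in samples]
var_direct = variance(projected)
var_quadratic = quadratic_form(a, cov)

# A nonzero a with zero variance: X1 + X2 - X3 is identically zero here.
a_degenerate = (1.0, 1.0, -1.0)
var_degenerate = quadratic_form(a_degenerate, cov)

print("Var(a^T X) directly:      ", var_direct)
print("a^T Cov(X) a:             ", var_quadratic)
print("degenerate quadratic form:", var_degenerate)
```

The two variance computations agree up to floating-point rounding, and the degenerate direction gives a value that is zero up to rounding, which is why the covariance matrix in this example is positive semidefinite but not positive definite.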

We now present examples of multivariate distributions.

2.3.1 The Dirichlet Distribution

Before we discuss the Dirichlet distribution, we define the Beta distribution.

Definition 2.22 (Beta distribution) A random variable $X$ is said to have a Beta distribution with parameters $\alpha$ and $\beta$ if it has a pdf $f(x)$ defined as:

$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad 0 < x < 1,$$

where $\alpha > 0$ and $\beta > 0$.
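The Beta density can be evaluated directly with the standard library. The sketch below assumes the usual form of the pdf, with normalizing constant $\Gamma(\alpha+\beta)/(\Gamma(\alpha)\Gamma(\beta))$ on $0 < x < 1$, and uses a simple midpoint rule to check that it integrates to one. The helper names and the parameter values $\alpha = 2$, $\beta = 5$ are chosen only for illustration.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta(alpha, beta) density at x, assuming the standard Gamma-function form."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not 0.0 < x < 1.0:
        return 0.0  # the density is supported on the open interval (0, 1)
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def midpoint_integral(f, n=20_000):
    """Approximate the integral of f over [0, 1] with the midpoint rule."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


alpha, beta = 2.0, 5.0
total_mass = midpoint_integral(lambda x: beta_pdf(x, alpha, beta))
mean_estimate = midpoint_integral(lambda x: x * beta_pdf(x, alpha, beta))

print("integral of the pdf:", total_mass)
print("numerical mean:     ", mean_estimate)
print("alpha/(alpha+beta): ", alpha / (alpha + beta))
```

The integral should be very close to 1, and the numerical mean should match the known Beta mean $\alpha/(\alpha+\beta)$. With $\alpha = \beta = 1$ the density reduces to the uniform density on $(0, 1)$.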

The Dirichlet distribution $\operatorname{Dir}(\boldsymbol{\alpha})$, named after Johann Peter Gustav Lejeune Dirichlet (1805–1859), is a multivariate distribution parameterized by a vector $\boldsymbol{\alpha}$ of positive parameters.