Philippe J. S. De Brouwer - The Big R-Book


Introduces professionals and scientists to statistics and machine learning using the programming language R. Written by and for practitioners, this book provides an overall introduction to R, focusing on tools and methods commonly used in data science, and placing emphasis on practice and business use. It covers a wide range of topics in a single volume, including big data, databases, statistical machine learning, data wrangling, data visualization, and the reporting of results. The topics covered are all important for someone with a science/math background who is looking to quickly learn several practical technologies to enter or transition to the growing field of data science.

The Big R-Book for Professionals: From Data Science to Learning Machines and Reporting with R

Provides a practical guide for non-experts with a focus on business users

Contains a unique combination of topics including an introduction to R, machine learning, mathematical models, data wrangling, and reporting

Uses a practical tone and integrates multiple topics in a coherent framework

Demystifies the hype around machine learning and AI by enabling readers to understand the provided models and program them in R

Shows readers how to visualize results in static and interactive reports

Supplementary material includes PDF slides based on the book’s content, as well as all the extracted R-code, and is available to everyone on a Wiley Book Companion Site

The book is an excellent guide for science, technology, engineering, or mathematics students who wish to make a successful transition from the academic world to the professional. It will also appeal to all young data scientists, quantitative analysts, and analytics professionals, as well as those who make mathematical models.

Then there is the tidyverse. It is a recent addition to R that is both a collection of often used functionalities and a philosophy.

The developers of the tidyverse promote the following principles:

Use existing and common data structures. So all the packages in the tidyverse will share common S3 class types; this means that in general functions will accept data frames (or tibbles). More low-level functions will work with the base R vector types.

Reuse data structures in your code. The idea here is that there is a better option than always overwriting a variable or creating a new one in every line: pass the output of one line on to the next with a “pipe”: %>%. To be accepted in the tidyverse, the functions in a package need to be able to use this pipe.

Keep functions concise and clear. For example, do not mix side-effects and transformations, use verbs as function names wherever possible (unless they become too generic or meaningless, of course), and keep functions short (they do only one thing, but do it well).

Embrace R as a functional programming language. This means that reflexes that you might have from, say, C++, C#, Python, or PHP will have to be adjusted. For example, it is best to use immutable objects and copy-on-modify semantics and to avoid the refclass model (see Section 6.4 “The Reference Class, refclass, RC or R5 Model” on page 113). Use the generic functions provided by S3 and S4 where possible. Avoid writing loops (such as repeat and for); use the apply family of functions instead (or refer to the package purrr).

Keep code clean and readable for humans. For example, prefer meaningful but long variable names over short but meaningless ones, and be considerate towards people using auto-complete in RStudio (so put the identifying part in the first and not the last letters of a function name).
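As a minimal sketch of two of the principles above, the snippet below contrasts intermediate variables with a pipe, and a for loop with sapply(). It assumes that the magrittr package is installed (it supplies %>%, which dplyr and other tidyverse packages re-export); the vector x and all variable names are made up for illustration.

```r
# Assumption: magrittr is installed; it provides the %>% pipe.
library(magrittr)

x <- c(1, 4, 9, 16)  # illustrative data

# Without the pipe: a new variable for every intermediate step.
roots   <- sqrt(x)
total_1 <- sum(roots)

# With the pipe: the output of one step is passed on to the next.
total_2 <- x %>% sqrt() %>% sum()

# A for loop that fills a pre-allocated vector ...
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# ... and the same computation with a member of the apply family.
squares_apply <- sapply(x, function(v) v^2)
```

Both totals are identical, and so are both vectors of squares; the piped and applied versions simply avoid the bookkeeping of intermediate variables and loop indices.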

Like core R itself and many other packages, the tidyverse is under continuous development. For further and the most up-to-date information, we refer to the website of the tidyverse: http://tidyverse.tidyverse.org.

Tidy Data

Tidy data is in essence data that is easy to understand by people and is formatted and structured with the following rules in mind.

1 a tibble/data-frame for each dataset,

2 a column for each variable,

3 a row for each observation,

4 a value (or NA) in each cell (a “cell” is the intersection between row and column).

The concept of tidy data is so important that we will devote a whole section to tidy data (Section 17.2 “Tidy Data” on page 275) and how to make data tidy (Chapter 17 “Data Wrangling in the tidyverse” on page 265). For now, it is sufficient to have the previous rules in mind. This will allow us to introduce the tools of the tidyverse first and then later come back to making data tidy by using these tools.
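The four rules can be illustrated with a small, made-up dataset. The sketch below deliberately uses only base R, so that it runs without any packages; it is not the tidyr workflow that the book introduces later, and the column names (country, y2019, y2020, year, value) are hypothetical.

```r
# Untidy: the variable "year" is spread over two column names.
untidy <- data.frame(
  country = c("A", "B"),
  y2019   = c(10, 20),
  y2020   = c(11, NA)
)

# Tidy: one column per variable (country, year, value),
# one row per observation, and a value (or NA) in each cell.
tidy <- data.frame(
  country = rep(untidy$country, times = 2),
  year    = rep(c(2019, 2020), each = nrow(untidy)),
  value   = c(untidy$y2019, untidy$y2020)
)

print(tidy)
```

The tidy version has four rows, one for each country-year combination, and the missing measurement appears as an explicit NA instead of being hidden in the layout.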

Tidy Conventions

The tidyverse also enforces some rules to keep code tidy. The aims are to make code easier to read, to reduce potential misunderstandings, etc.

For example, we remember the convention that R uses to implement its S3 object-oriented programming framework from Section 6.2 “S3 Objects” on page 91. In that section we have explained how R finds, for example, the right method (function) to use when printing an object via the generic dispatcher function print(). When an object of class “glm” is passed to print(), the function will dispatch the handling to the function print.glm().

However, this is also true for data-frames: the handling is dispatched to print.data.frame(). This example illustrates how it becomes unclear whether the function print.data.frame() is the specific case of the print() function for a data.frame, or the special case to print a “frame” in the framework of a call to “print.data()”. Therefore, the tidyverse recommends naming conventions that avoid the dot (.) and use snake_style or UpperCase style instead.
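The dispatch mechanism and the ambiguity of the dot can be sketched in base R as follows. The class “account”, its constructor new_account() and the object acc are hypothetical names chosen for this illustration only.

```r
# Hypothetical constructor for an S3 object of class "account".
new_account <- function(owner, balance) {
  structure(list(owner = owner, balance = balance), class = "account")
}

# S3 method: generic "print" + "." + class "account".
print.account <- function(x, ...) {
  cat("Account of", x$owner, "with balance", x$balance, "\n")
  invisible(x)
}

acc <- new_account("Ann", 100)
print(acc)  # print() dispatches to print.account()

# The dot does not always separate generic and class:
# t.test() is a function in its own right (from the package stats),
# not the method of t() for objects of a class "test".
is.function(stats::t.test)
```

A name such as new_account() cannot be mistaken for a method, which is the reason for preferring the underscore over the dot in function and variable names.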

Further information – Tidyverse philosophy

More about programming style in the tidyverse can be found in the online manifesto of the tidyverse website: https://tidyverse.tidyverse.org/articles/manifesto.html.

7.2. Packages in the Tidyverse

Loading the tidyverse will report on which packages are included:

# we assume that you installed the package before:
# install.packages("tidyverse")
# so load it:
library(tidyverse)
## - Attaching packages ----------- tidyverse 1.3.0 -
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## - Conflicts ------------- tidyverse_conflicts() -
## x purrr::compose() masks pryr::compose()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
## x purrr::partial() masks pryr::partial()

So, loading the library tidyverse actually loads a series of other packages. The collection of these packages is called the “core-tidyverse.”

Further, loading tidyverse also informs you about potential conflicts. For example, we see that calling the function filter() will now dispatch to dplyr::filter() (i.e. “the function filter in the package dplyr”), whereas before loading tidyverse, the function stats::filter() would have been called.

Digression – Calling methods of not loaded packages

When a package is not loaded, it is still possible to call its member functions. To call a function from a certain package, we can use the :: operator.

In other words, when we use the :: operator, we specify in which package the function should be found. Therefore, it is possible to use a function from a package that is not loaded, or one that is superseded by a function with the same name from a package that was loaded later.
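A short sketch of both uses of the :: operator, relying only on packages that ship with R (tools and stats). The file name and the vector x are made up for illustration.

```r
# The package tools is installed with R, but is usually not attached.
# With :: we can call one of its functions without library(tools).
ext <- tools::file_ext("report.pdf")

# Naming the package explicitly reaches stats::filter(), even when
# dplyr is attached and masks it with dplyr::filter().
x <- c(1, 2, 3, 4, 5)
moving_avg <- stats::filter(x, rep(1 / 3, 3))  # centred 3-point average

print(ext)
print(moving_avg)  # the first and last values are NA
```

Writing stats::filter() or dplyr::filter() in full also documents for the reader which of the two functions is meant, independently of the order in which packages were loaded.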

R allows you to stand on the shoulders of giants: when making your analysis, you can rely on existing packages. Whenever there is a choice, it is best to use packages that are part of the tidyverse. Doing so makes your code more consistent and readable, and working with R becomes an overall more satisfying experience.

7.2.1 The Core Tidyverse

The core tidyverse includes some packages that are commonly used in data wrangling and modelling. Here is a brief word of explanation for each; later we will explore some of these packages in more detail.

tidyr provides a set of functions that help you tidy up data and make adhering to the rules of tidy data easier. The idea of tidy data is really simple: it is data where every variable has its own column, and every column is a variable. For more information, see Chapter 17.3 “Tidying Up Data with tidyr” on page 277.

dplyr provides a grammar of data manipulation, providing a consistent set of verbs that solve the most common data manipulation challenges. For more information, see Chapter 17 “Data Wrangling in the tidyverse” on page 265.

ggplot2 is a system to create graphics with a philosophy: it adheres to a “Grammar of Graphics” and is able to create really stunning results at a reasonable price (it is a notch more abstract to use than the core-R functionality). For both reasons, we will talk more about it in the sections about reporting: see Chapter 31 “A Grammar of Graphics with ggplot2” on page 687.
