Philippe J. S. De Brouwer - The Big R-Book

Introduces professionals and scientists to statistics and machine learning using the programming language R.

Written by and for practitioners, this book provides an overall introduction to R, focusing on tools and methods commonly used in data science, and placing emphasis on practice and business use. It covers a wide range of topics in a single volume, including big data, databases, statistical machine learning, data wrangling, data visualization, and the reporting of results. The topics covered are all important for someone with a science/math background who is looking to quickly learn several practical technologies to enter or transition to the growing field of data science.

The Big R-Book for Professionals: From Data Science to Learning Machines and Reporting with R:

- Provides a practical guide for non-experts with a focus on business users
- Contains a unique combination of topics including an introduction to R, machine learning, mathematical models, data wrangling, and reporting
- Uses a practical tone and integrates multiple topics in a coherent framework
- Demystifies the hype around machine learning and AI by enabling readers to understand the provided models and program them in R
- Shows readers how to visualize results in static and interactive reports
- Includes supplementary material (PDF slides based on the book’s content, as well as all the extracted R code) that is available to everyone on a Wiley Book Companion Site

The Big R-Book is an excellent guide for science, technology, engineering, or mathematics students who wish to make a successful transition from the academic world to the professional one. It will also appeal to all young data scientists, quantitative analysts, and analytics professionals, as well as those who make mathematical models.

4.3.8.3 Editing Data in a Data Frame

Although one usually reads in large amounts of data and relies on an IDE such as RStudio to view and manually modify data frames, it is useful to know how to do this when no graphical interface is available. The functions below are part of the standard R distribution, so they are available even when working on a server.

de(x)                # fails if x is not defined
de(x <- c(NA))       # works
x <- de(x <- c(NA))  # will also save the changes
data.entry(x)        # de is short for data.entry
x <- edit(x)         # use the standard editor (vi in *nix)

Of course, there are also multiple ways to address data directly in R.

# The following lines do the same (Score is the third column):
data_test$Score[1] <- 80
data_test[1, 3]    <- 80
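
A minimal sketch of addressing one cell in several ways follows. The data frame `df_demo` is a hypothetical stand-in whose columns are assumed to match those of data_test; it is not part of the original text.

```r
# Hypothetical stand-in for data_test (columns: Name, Gender, Score, Age).
df_demo <- data.frame(
  Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 35),
  stringsAsFactors = FALSE
)

# Three ways to set the score in the first row:
df_demo$Score[1]    <- 80  # select the column by name, then element 1
df_demo[1, 3]       <- 80  # row 1, column 3 (Score is the third column)
df_demo[1, "Score"] <- 80  # row 1, column selected by name

# The index order is [row, column]: df_demo[3, 1] is the Name in row 3.
first_score <- df_demo[1, "Score"]
third_name  <- df_demo[3, 1]
```

Selecting the column by name is usually the more robust choice, because it keeps working when columns are reordered.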

4.3.8.4 Modifying Data Frames

Adding Columns to a Data-frame

Typically, the variables are in the columns, so adding a column corresponds to adding a new observed variable. This is done either by assigning to a new column name with $ or via the function cbind().

# Expand the data frame by simply defining the additional column:
data_test$End_date <- as.Date(c("2014-03-01", "2017-02-13", "2014-10-10",
                                "2015-05-10", "2010-08-25"))
print(data_test)
##    Name Gender Score Age   End_date
## 1 Piotr   Male    80  42 2014-03-01
## 2 Pawel   Male    88  38 2017-02-13
## 3 Paula Female    92  26 2014-10-10
## 4  Lisa Female    89  30 2015-05-10
## 5 Laura Female    84  35 2010-08-25
# Or use the function cbind() to combine data frames along columns:
Start_date <- as.Date(c("2012-03-01", "2013-02-13", "2012-10-10",
                        "2011-05-10", "2001-08-25"))
# Use this vector directly:
df <- cbind(data_test, Start_date)
print(df)
##    Name Gender Score Age   End_date Start_date
## 1 Piotr   Male    80  42 2014-03-01 2012-03-01
## 2 Pawel   Male    88  38 2017-02-13 2013-02-13
## 3 Paula Female    92  26 2014-10-10 2012-10-10
## 4  Lisa Female    89  30 2015-05-10 2011-05-10
## 5 Laura Female    84  35 2010-08-25 2001-08-25
# Or first convert the vector to a data frame:
df <- data.frame(Start_date = Start_date)
df <- cbind(data_test, df)
print(df)
##    Name Gender Score Age   End_date Start_date
## 1 Piotr   Male    80  42 2014-03-01 2012-03-01
## 2 Pawel   Male    88  38 2017-02-13 2013-02-13
## 3 Paula Female    92  26 2014-10-10 2012-10-10
## 4  Lisa Female    89  30 2015-05-10 2011-05-10
## 5 Laura Female    84  35 2010-08-25 2001-08-25
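
The following sketch shows two details of cbind() that are easy to overlook: the new column takes its name from the variable unless a name is given explicitly, and the number of rows must match. The objects `scores`, `wider`, `wider2`, and `mismatch` are illustrative names, not part of the original example.

```r
# Small illustrative data frame (hypothetical values).
scores <- data.frame(
  Name  = c("Piotr", "Pawel", "Paula"),
  Score = c(80, 88, 92),
  stringsAsFactors = FALSE
)
Age <- c(42, 38, 26)

# The column name is taken from the variable name ...
wider <- cbind(scores, Age)
# ... unless it is given explicitly.
wider2 <- cbind(scores, Years = Age)

# Combining with a data frame that has a different number of rows fails;
# tryCatch() turns the error into a marker value for this demonstration.
mismatch <- tryCatch(
  cbind(scores, data.frame(Age = c(42, 38))),
  error = function(e) "error"
)
```

Note that cbind() does not align rows by any key: it simply places the columns side by side, so both inputs must already be in the same row order.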

Adding Rows to a Data-frame

Adding rows corresponds to adding observations. This is done via the function rbind().

# To add a row, we need the rbind() function:
data_test.to.add <- data.frame(
  Name     = c("Ricardo", "Anna"),
  Gender   = c("Male", "Female"),
  Score    = c(66, 80),
  Age      = c(70, 36),
  End_date = as.Date(c("2016-05-05", "2016-07-07"))
)
data_test.new <- rbind(data_test, data_test.to.add)
print(data_test.new)
##      Name Gender Score Age   End_date
## 1   Piotr   Male    80  42 2014-03-01
## 2   Pawel   Male    88  38 2017-02-13
## 3   Paula Female    92  26 2014-10-10
## 4    Lisa Female    89  30 2015-05-10
## 5   Laura Female    84  35 2010-08-25
## 6 Ricardo   Male    66  70 2016-05-05
## 7    Anna Female    80  36 2016-07-07
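
As a small sketch of what rbind() expects from data frames: columns are matched by name rather than by position, and a data frame with different column names is rejected. The objects `obs`, `extra`, `bad`, and the column `Points` are hypothetical and serve only as an illustration.

```r
obs <- data.frame(
  Name  = c("Piotr", "Pawel"),
  Score = c(80, 88),
  stringsAsFactors = FALSE
)

# Same columns, but in a different order: rbind() matches them by name.
extra <- data.frame(Score = 66, Name = "Ricardo", stringsAsFactors = FALSE)
combined <- rbind(obs, extra)

# A data frame whose column names differ cannot be appended;
# here the error message is captured instead of stopping the script.
bad <- data.frame(Name = "Anna", Points = 80, stringsAsFactors = FALSE)
result <- tryCatch(rbind(obs, bad), error = function(e) conditionMessage(e))
```

This is why data_test.to.add above repeats every column of data_test, including End_date.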

Merging data frames

Merging extracts the rows of two data frames for which a given set of columns match.

data_test.1 <- data.frame(
  Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 35)
)
data_test.2 <- data.frame(
  Name   = c("Piotr", "Pawel", "notPaula", "notLisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 135)
)
data_test.merged <- merge(x = data_test.1, y = data_test.2,
                          by.x = c("Name", "Age"), by.y = c("Name", "Age"))
# Only records that match in name and age are in the merged table:
print(data_test.merged)
##    Name Age Gender.x Score.x Gender.y Score.y
## 1 Pawel  38     Male      88     Male      88
## 2 Piotr  42     Male      78     Male      78
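
By default merge() keeps only the matching rows. The sketch below, using two small hypothetical data frames `left` and `right`, illustrates how the arguments all.x and all of merge() retain unmatched rows and fill the missing values with NA.

```r
left <- data.frame(
  Name  = c("Piotr", "Pawel", "Paula"),
  Score = c(78, 88, 92),
  stringsAsFactors = FALSE
)
right <- data.frame(
  Name = c("Piotr", "Pawel", "Lisa"),
  Age  = c(42, 38, 30),
  stringsAsFactors = FALSE
)

# Default: only names present in both data frames.
inner <- merge(left, right, by = "Name")

# Keep every row of the first data frame; Age is NA where no match exists.
keep_left <- merge(left, right, by = "Name", all.x = TRUE)

# Keep every row of both data frames.
keep_all <- merge(left, right, by = "Name", all = TRUE)
```

When the key columns have the same names in both data frames, the single argument by can replace the pair by.x and by.y.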

Short-cuts

R allows the use of short-cuts (partial matching of names with the $ operator), provided that they are unique. For example, in the data-frame data_test there is a column Name. There are no other columns whose names start with the letter “N”; hence, this one letter is enough to address this column.

data_test$N
## [1] Piotr Pawel Paula Lisa  Laura
## Levels: Laura Lisa Paula Pawel Piotr

Warning – Short-cuts can be dangerous

Use “short-cuts” sparingly and only when working interactively (not in functions or in code that will be saved and re-run later). If another column is added later, the short-cut may no longer be unique; the behaviour then becomes hard to predict, and it is even harder to spot the programming error in a part of your code that previously worked fine.
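
The risk can be made concrete with a short sketch. The data frame `df_demo` and the added column `Nationality` are hypothetical; the point is only that an ambiguous short-cut does not raise an error but returns NULL.

```r
df_demo <- data.frame(
  Name  = c("Piotr", "Pawel"),
  Score = c(80, 88),
  stringsAsFactors = FALSE
)

# "N" matches only the column Name, so the short-cut works.
unique_match <- df_demo$N

# After adding a second column that starts with "N" ...
df_demo$Nationality <- c("PL", "PL")

# ... the same short-cut is ambiguous and yields NULL without an error.
ambiguous <- df_demo$N

# Double brackets match exactly by default, so they avoid the problem:
exact      <- df_demo[["Name"]]
no_partial <- df_demo[["N"]]   # NULL, because no column is called "N"
```

In saved code, writing the full column name, or using `[[ ]]`, keeps the result independent of columns that are added later.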

Naming Rows and Columns

In the preceding code, we named the columns when we created the data-frame. It is also possible to do that later or to change column names, and it is even possible to name each row individually.

# Get the column and row names.
colnames(data_test)
## [1] "Name"     "Gender"   "Score"    "Age"      "End_date"
rownames(data_test)
## [1] "1" "2" "3" "4" "5"
colnames(data_test)[2]
## [1] "Gender"
rownames(data_test)[3]
## [1] "3"
# Assign new names.
colnames(data_test)[1] <- "first_name"
rownames(data_test) <- LETTERS[1:nrow(data_test)]
print(data_test)
##   first_name Gender Score Age   End_date
## A      Piotr   Male    80  42 2014-03-01
## B      Pawel   Male    88  38 2017-02-13
## C      Paula Female    92  26 2014-10-10
## D       Lisa Female    89  30 2015-05-10
## E      Laura Female    84  35 2010-08-25
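
As a hedged variation on the renaming shown here, the sketch below renames a column by looking up its current name instead of its position, uses the row names to address a cell, and then resets them. The data frame `df_demo` is a hypothetical example.

```r
df_demo <- data.frame(
  Name  = c("Piotr", "Pawel", "Paula"),
  Score = c(80, 88, 92),
  stringsAsFactors = FALSE
)

# Rename by current name, so the code does not depend on column order.
colnames(df_demo)[colnames(df_demo) == "Name"] <- "first_name"

# seq_len() also behaves well for a data frame with zero rows.
rownames(df_demo) <- LETTERS[seq_len(nrow(df_demo))]

# Row names can be used for indexing, just like column names.
score_b <- df_demo["B", "Score"]

# Assigning NULL restores the default row names "1", "2", ...
rownames(df_demo) <- NULL
```

Renaming by name is a little more verbose than colnames(df)[1], but it does not silently rename the wrong column when the layout of the data frame changes.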