# The connection is stored in an R object myConnection and # it needs the database name (db_name), username and passwordmyConnection = dbConnect( MySQL(), user = ‘root’, password = ‘xxx’, dbname = ‘db_name’, host = ‘localhost’) # e.g. list the tables available in this database. dbListTables(myConnection)
dbConnect()
dbSendQuery()
Fetching Data Drom a Database
Once connected to the database, we can select and download the data that we will use for our analysis in R.
# Prepare the query for the databaseresult <- dbSendQuery(myConnection, “SELECT * from tbl_students WHERE age > 33”) # fetch() will get us the results, it takes a parameter n, which # is the number of desired records. # Fetch all the records(with n = -1) and store it in a data frame.data <- fetch(result, n = -1)
It is also possible to manipulate data in the database directly from R. This allows use to prepare data in the database first and then download it and do our analysis, while we keep all the code in one file.
fetch()
The dbSendQuery()
function can be used to send any query, including UPDATE, INSERT, CREATE TABLE and DROP TABLE queries so we can push results back to the database.
sSQL =“” sSQL[1] <-“UPDATE tbl_students SET score = ‘A’ WHERE raw_score > 90;” sSQL[2] <-“INSERT INTO tbl_students (name, class, score, raw_score) VALUES (‘Robert’, ‘Grade 0’, 88,NULL);” sSQL[3] <-“DROP TABLE IF EXISTS tbl_students;” for(k in c(1 :3)){ dbSendQuery(myConnection, sSQL[k]) }
dbSendQuery()
Create Tables from R Data-frames
Finally, it is possible to write back data from a dataset in R to a table in a database.
dbWriteTable(myConnection, “tbl_name”, data_frame_name[, ], overwrite = TRUE)
dbWriteTable()
Warning – Closing the database connection
Even while connections will be automatically closed when the scope of the database object is lost, it is a good idea to close a connection explicitly. Closing a connection explicitly, makes sure that it is closed and does not remain open if something went wrong, and hence the resources of our RDBMS are freed. Closing a database connection can be done with dbDisconnect(myConnection, …)
.
dbDisconnect()
1 1There are different package management systems for different flavours of Linux and discussing them all is not only beyond the scope of this book, but not really necessary.We assume that if you use Genttoo, that you will know what to do or where to choose.
2 aThis behaviour is caused by the dispatcher-function implementation of an object-oriented programming model. To understand how this works and what it means, we refer to Section 6.2 “S3 Objects” on page 122.
3 2The character type in R is what in most other languages would be called a “string.” In other words, that is text that – generally – will not allow much arithmetic.
4 aSee Chapter 7“Tidy R with the Tidyverse” on page 161 and note that tibbles (the data-frame alternative in the tidyverse) do not coerce text to factors.
5 3If your programming experience is limited, this might seem a little confusing. It is best to accept this and read further. Then, after reading Section 5.1 “Environments in R” on page 110 have another look at this section.
6 4A direct link to this file (in zip-format) is here: http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist.zip?c6f8f9a0a5f970e31538be5271051b3c.
7 5With R it is also possible to read files directly from the Internet by supplying the URL in to the function read.csv().
♣5♣ Lexical Scoping and Environments
5.1 Environments in R
Environments can be thought of as a set of existing or declared objects (functions, variables, etc.). When we start R, it will create an environment before the command prompt is available to the user.
The top-level environment is the R command prompt. This is the “global environment” and known as R_GlobalEnv
and can be accessed as .GlobalEnv
.
global environment
R_GlobalEnv
As mentioned earlier, the function ls()
shows which variables and functions are defined in the current environment. The function environment()
will tell us which environment is the current one.
environment()
environment() # get the environment## rm(list= ls()) # clear the environmentls() # list all objects ## character(0) a <-“a” f <- function(x) print(x) ls() # note that x is not part of.GlobalEnv ## [1] “a” “f”
When a function starts to be executed this will create a new environment that is subordonated to the environment that calls the function.
# f # Multiple actions and side effects to illustrate environments # Arguments: # x -- single typef <- function(x){ # define a local function g() within f() g <- function(y){ b <-“local variable in g()” print(” -- function g() -- “) print( environment()) print( ls()) print( paste(“b is”, b)) print( paste(“c is”, c)) } # actions in function f: a <<-‘new value for a, also in Global_env’ x <-‘new value for x’ b <- d # d is taken from the environment higher c <-“only defined in f(), not in g()” g(“parameter to g”) print(” -- function f()-- “) print( environment()) print( ls()) print( paste(“a is”, a)) print( paste(“b is”, b)) print( paste(“c is”, c)) print( paste(“x is”, x)) } # Test the function f():b <-a <-c <-d <-pi rm(x) ## Warning in rm(x): object ‘x’ not found f(a) ## [1] “ -- function g() -- “ ## ## [1] “b” “y” ## [1] “b is local variable in g()” ## [1] “c is only defined in f(), not in g()” ## [1] “ -- function f() -- “ ## ## [1] “b” “c” “g” “x” ## [1] “a is new value for a, also in Global_env” ## [1] “b is 3.14159265358979” ## [1] “c is only defined in f(), not in g()” ## [1] “x is new value for x” # Check the existence and values:a ## [1] “new value for a, also in Global_env” b ## [1] 0 c ## [1] 3.141593 x ## Error in eval(expr, envir, enclos): object ‘x’ not found
Each function within a function or environment will create a new environment that has its own variables. Variable names can be the same but the local environment will always take precedence. A few things stand out in the example above.
Читать дальше