Los Angeles R users group - Dec 14 2010 - Part 2

Published in: Technology, Education

  1. Database Access through R
     Krishna Bhogaonker
     December 14, 2010
  2. Where is this Going?
     ● Introduction to database connection methods
     ● Examples from some common R packages (cheat sheets, a.k.a. eye charts)
     ● Introduction to the sqldf package
     14/12/10 Database Access Through R 2
  3. Database access is about fitting round pegs into square holes.
  4. Issues to Consider when Choosing a Data Access Method for Basic Analysis
     ● How much work does it take to set up?
       ● Lazy ways: GUIs like RCommander, Deducer, JGR, Revolutions, RedR . . .
       ● Diligent ways: database- or protocol-specific R packages
     ● Speed
     ● Stability
     ● Platform
  5. High-Level Database Connection Procedure
     ● Open a database connection object using the appropriate driver (ODBC, JDBC, etc.)
     ● Authenticate the user and confirm the connection
     ● Execute database tasks by calling the appropriate methods on the connection object
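
The three steps above can be sketched with DBI. This is a minimal sketch, assuming the DBI and RSQLite packages are installed. The in-memory SQLite database and the table name "quakes" are stand-ins chosen here so the example runs without a DSN or credentials; they are not from the talk.

```r
library(DBI)

# Step 1: open a connection object with the appropriate driver.
# Assumption: an in-memory SQLite database stands in for a real server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Step 2: confirm the connection is live. SQLite needs no authentication;
# a server back end would take user and password arguments in dbConnect().
connected <- dbIsValid(con)

# Step 3: execute database tasks through methods on the connection object.
dbWriteTable(con, "quakes", datasets::quakes)
tables <- dbListTables(con)
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM quakes")$n

# Release the connection when finished.
dbDisconnect(con)
```

With a server back end, only the dbConnect() call should need to change; the later calls go through the same DBI generics.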
  6. DBI Package
     ● Defines a common database interface for R; driver packages built on it connect to back ends including Oracle, PostgreSQL, ODBC, SQLite, and MySQL
       library(DBI)
       ## choose the proper DBMS driver and connect to the server
       drv <- dbDriver("ODBC")
       con <- dbConnect(drv, "dsn", "usr", "pwd")
       ## the interface can work at a higher level, importing tables as
       ## data.frames and exporting data.frames as DBMS tables
       dbListTables(con)
       dbListFields(con, "quakes")
       ## new.output is a data.frame of results created earlier in the session
       if (dbExistsTable(con, "new_results")) dbRemoveTable(con, "new_results")
       dbWriteTable(con, "new_results", new.output)
  7. RODBC
     ● Provides access to ODBC-compliant databases, including MSSQL, MS Access, and others
       # connect to database
       library(RODBC)
       myconn <- odbcConnect("mydsn", uid="Rob", pwd="aardvark")
       # fetch a whole table (the table name is passed as a string)
       crimedat <- sqlFetch(myconn, "Crime")
       # query data from the database
       pundat <- sqlQuery(myconn, "select * from Punishment")
       # close database connection
       close(myconn)
  8. RJDBC
     ● Uses the DBI interface for the front end and a JDBC driver on the back end
       library(RJDBC)
       # connect to the database
       drv <- JDBC("com.mysql.jdbc.Driver",
                   "/etc/jdbc/mysql-connector-java-3.1.14-bin.jar", "`")
       conn <- dbConnect(drv, "jdbc:mysql://localhost/test")
       # access database tables
       dbListTables(conn)
       data(iris)
       # write to and query tables
       dbWriteTable(conn, "iris", iris)
       dbGetQuery(conn, "select count(*) from iris")
       d <- dbReadTable(conn, "iris")
  9. RMySQL
     ● Database interface and MySQL driver for R, using the DBI standard
       library(RMySQL)
       ## connect and authenticate to a MySQL database
       con <- dbConnect(MySQL(), group = "lasers")
       con2 <- dbConnect(MySQL(), user="opto", password="pure-light",
                         dbname="lasers", host="merced")
       ## list tables and fields in a table
       dbListTables(con)
       dbListFields(con, "table_name")
       ## import and export data frames
       d <- dbReadTable(con, "WL")
       dbWriteTable(con, "WL2", a.data.frame)         ## table from a data.frame
       dbWriteTable(con, "test2", "~/data/test2.csv") ## table from a file
  10. RpgSQL
      ● PostgreSQL interface to R via RJDBC
        library(RpgSQL)
        # the user/password/dbname used here are actually the defaults
        con <- dbConnect(pgSQL(), user = "postgres", password = "", dbname = "test")
        # create table, populate it and display it
        s <- 'create table tt("id" int primary key, "name" varchar(255))'
        dbSendUpdate(con, s)
        dbSendUpdate(con, "insert into tt values(1, 'Hello')")
        dbSendUpdate(con, "insert into tt values(2, 'World')")
        dbGetQuery(con, "select * from tt")
        # transfer a data frame to pgSQL and then display it from the database
        # dbWriteTable is case sensitive
        dbWriteTable(con, "BOD", BOD)
        # table names are lower cased unless double quoted
        dbGetQuery(con, 'select * from "BOD"')
  11. RMongo
      ● Access to MongoDB through R. Modeled on RMySQL. Still in alpha as of Nov 3, 2010.
        library(RMongo)
        # connect to a database
        mongo <- mongoDbConnect("eat2treat_development")
        # show the collections
        dbShowCollections(mongo)
        # perform an all query with a document limit of 2 and offset of 0.
        # the result is a data.frame object. Nested documents are not supported
        # at the moment; they will just be the string output.
        results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)
        names(results)
        # the query document is passed as a JSON string
        results <- dbGetQuery(mongo, "nutrient_metadatas", '{"nutrient_definition_id": 307}')
  12. A Few Words about the sqldf Package
      ● sqldf provides a way to run SQL statements on R data frames.
      ● sqldf works with the SQLite, H2, and PostgreSQL databases.
      ● This package allows you to run most SQL commands against an R data frame: selects, joins, ordering, grouping, averaging, etc.
  13. sqldf Example
        # load sqldf into workspace and execute SELECT queries
        library(sqldf)
        sqldf("select * from iris limit 5")
        sqldf("select count(*) from iris")
        sqldf("select Species, count(*) from iris group by Species")
        # example of a JOIN
        Abbr <- data.frame(Species = levels(iris$Species),
                           Abbr = c("S", "Ve", "Vi"))
        # Sepal_Length relies on sqldf translating the dot in Sepal.Length to an
        # underscore; newer releases keep the dot, so quote "Sepal.Length" there
        sqldf("select Abbr, avg(Sepal_Length)
               from iris natural join Abbr group by Species")
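
A grouped query like the one above can be cross-checked against base R. This is a minimal sketch, assuming the sqldf package is installed with its default SQLite back end. The variable names by_sql, by_r and agree are illustrative only, and the query uses only the Species column so it does not depend on how dotted column names are handled.

```r
library(sqldf)

# Count rows per species with SQL run directly against the iris data frame.
by_sql <- sqldf("select Species, count(*) as n
                 from iris group by Species order by Species")

# The same tally in base R, for comparison.
by_r <- table(iris$Species)

# Look up each SQL result row in the base R table and compare the counts.
agree <- all(by_sql$n == as.vector(by_r[as.character(by_sql$Species)]))
```

Here agree is TRUE when the two tallies match, which is a quick sanity check before moving on to joins.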
  14. Thank You
      “How are you going to run the universe if you can't answer a few unsolvable problems?”