
Harnessing the Power of the Web via R Clients for Web APIs

We often want to harness the power of the internet in our daily data practices: to collect data from the internet, share data on the internet, let a dataset evolve on the internet and analyze it periodically, or put products up on the internet. While many of these goals can be achieved in a browser via mouse clicks, such practices are hard to capture and replicate, so they are not very reproducible and do not scale. Most of what can be done in a browser can also be implemented with code. Web application programming interfaces (APIs) are one tool for facilitating this communication in a reproducible and scriptable way. In this talk we will discuss the general framework of common R clients for web APIs and dive into specific examples. We will focus primarily on the googledrive package, which lets users control their Google Drive from the comfort of their R console, and touch on other common R clients for web APIs, while discussing best practices for efficient and reproducible coding.

Published in: Data & Analytics

  1. HARNESSING THE POWER OF THE WEB VIA R CLIENTS FOR WEB APIS. Lucy D’Agostino McGowan, PhD, Johns Hopkins Bloomberg School of Public Health, @LucyStats
  2-5. JSM 2018. OUTLINE: What is an API?; How are APIs accessed from R?; Case studies
  6-9. API: application programming interface
  14. RESTful API
  15-21. REQUEST VIA HTTP METHODS (HTTP: HyperText Transfer Protocol): GET, POST, PUT, PATCH, DELETE
  22. HTTP Methods
  23. ✓ Choose Method
  24-30. Built a set of URLs that return data, often JSON (https://developer.twitter.com/en/docs.html, https://developers.google.com/drive/api/v3/reference/, https://developer.github.com/v3/)
  31-33. JSON: JavaScript Object Notation
  34-36. JSON, according to json.org: a lightweight data-interchange format; easy for humans to read / write; easy for computers to parse / generate
  37-39. JSON (each entry is a name / value pair)
      [
        {
          "statuses": {
            "created_at": "Thu Jul 19 21:45:08 +0000 2018",
            "id": 1020062005892470000,
            "id_str": "1020062005892468738",
            "full_text": "Our team are looking forward to meeting all the @AmstatNews attendees in Vancouver next week #JSM2018 #Vancouver #Canada https://t.co/XiOrCtuUOk",
            . . .
          }
        }
      ]
  40. JSON; HTTP Methods
  41. ✓ Choose Method ✓ Build URL
  42. HTTP Methods; JSON
  43-45. OAuth: Open Authorization
  46. ✓ Choose Method ✓ Build URL ✓ Get Authorization
  47. OUTLINE: What is an API?; How are APIs accessed from R?; Case studies
  48. httr
  49-50. ☐ Choose Method ☐ Build URL ☐ Get Authorization ☐ Make Request ☐ Process Response
  52-53. REQUEST VIA HTTP METHODS. GET: httr::GET(); POST: httr::POST(); PUT: httr::PUT(); PATCH: httr::PATCH(); DELETE: httr::DELETE()
  54. ✓ Choose Method ☐ Build URL ☐ Get Authorization ☐ Make Request ☐ Process Response
  55-58. BUILD URL
      library(httr)
      url <- modify_url(
        url = "https://api.github.com/",                 # base URL
        path = "repos/tidyverse/googledrive/issues",     # endpoint
        query = list(labels = "docs")                    # parameters
      )
      https://api.github.com/repos/tidyverse/googledrive/issues?labels=docs
  59. ✓ Choose Method ✓ Build URL ☐ Get Authorization ☐ Make Request ☐ Process Response
  60-63. AUTHENTICATION
      1. Register an application: https://github.com/settings/developers
      2. Create an OAuth application
      3. Generate a token
      library(httr)
      app <- oauth_app(
        appname = "NAME",
        key = "KEY",
        secret = "SECRET"
      )
      token <- oauth2.0_token(
        oauth_endpoints("github"),
        app
      )
  65. ✓ Choose Method ✓ Build URL ✓ Get Authorization ☐ Make Request ☐ Process Response
  66-68. REQUEST
      library(httr)
      req <- GET(url, config = token)
      req
      #> Response
      #> [https://api.github.com/repos/tidyverse/googledrive/issues?labels=docs]
      #> Date: 2018-07-20 14:35
      #> Status: 200
      #> Content-Type: application/json; charset=utf-8
      #> Size: 86.6 kB
      #> [
      #> {
      #> "url": "https://api.github.com/repos/tidyverse/googledrive/issues/150",
      #> "repository_url": "https://api.github.com/repos/tidyverse/googledrive",
      #> ...
  69. ✓ Choose Method ✓ Build URL ✓ Get Authorization ✓ Make Request ☐ Process Response
  70-72. PROCESS
      library(httr)
      req <- GET(url, config = token)
      res <- content(req)
      res
      #> [[1]]
      #> [[1]]$url
      #> [1] "https://api.github.com/repos/tidyverse/googledrive/issues/150"
      #>
      #> [[1]]$repository_url
      #> [1] "https://api.github.com/repos/tidyverse/googledrive"
  73-74. PROCESS
      library(httr)
      req <- GET(url, config = token)
      res <- content(req)
      purrr::map_chr(res, "url")
      #> [1] "https://api.github.com/repos/tidyverse/googledrive/issues/150"
      #> [2] "https://api.github.com/repos/tidyverse/googledrive/issues/123"
      #> [3] "https://api.github.com/repos/tidyverse/googledrive/issues/79"
  75. ✓ Choose Method ✓ Build URL ✓ Get Authorization ✓ Make Request ✓ Process Response
  76. gh, rtweet, googledrive, googlesheets, yelpr, meetupr, imguR, aws.*
  77-80. library(googledrive)
      drive_ls()
      #> Waiting for authentication in browser...
      #> Press Esc/Ctrl + C to abort
      #> Authentication complete.
      #> Items so far:
      #> 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 1670
      #> # A tibble: 1,670 x 3
      #>    name                             id                           drive_resource
      #>  * <chr>                            <chr>                        <list>
      #>  1 data.zip                         1ET81D_Uku6Tm_Z4U04jB4qTNrC… <list [37]>
      #>  2 science_manuscript               10X07LsQI5GL_Tdrq3-3PtgcDtW… <list [31]>
      #>  3 talks                            12bBwImIu4Yzg5UaqcvxIZ0rbyG… <list [31]>
      #>  4 presentations                    1-PItelqpv0Sb_LdiEDqb8O3D_R… <list [32]>
      #>  5 science_supplementary_materials  1JNogHk0Zn-H5je6a-DBh0muwBW… <list [31]>
      #>  6 iris-collaboration-analysis-plan 10DrdHfxyPFTuACKO5YqcpFrcPM… <list [32]>
      #>  7 2018-07-16_meeting-minutes       1oUAPeFSvVfo5BBoeCip-2kuOup… <list [32]>
      #>  8 p-hack-athon.Rda                 1cwgv9GkrgcVEImxcKSni-Tdf-B… <list [32]>
      #>  9 rladies-pres                     1-cSy2GqNumuD2YpeIOG_Tv2aQ-… <list [33]>
      #> 10 Women in Statistics              1ZRK0Bonakg_qkg5McU7Lr9fpDx… <list [30]>
      #> # ... with 1,660 more rows
  81. OUTLINE: What is an API?; How are APIs accessed from R?; Case studies
  82. UPLOAD MY SLIDES TO DRIVE
      library(googledrive)
      drive_upload(
        "2018-07_jsm/slides.pptx",
        path = "postdoc/talks/",
        name = "2018-07_jsm_slides",
        type = "presentation"
      )
  83-85. ORGANIZE MY DRIVE FROM THE COMFORT OF MY CONSOLE
      library(googledrive)
      leek_files <- drive_find(q = "'jtleek@gmail.com' in writers")
      postdoc_folder <- drive_mkdir("postdoc")
      purrr::walk(
        leek_files$id,
        ~drive_mv(as_id(.x), path = postdoc_folder)
      )
  86. PULLING MOST FREQUENT SLACK EMOJI: http://livefreeordichotomize.com/2017/07/17/ropensci-slack-emojis/
  87. NAVIGATING TWITTER RESPONSES: http://livefreeordichotomize.com/2017/07/24/twitter-trees/
  88. WE R LADIES: http://livefreeordichotomize.com/2017/07/18/the-making-of-we-r-ladies/
  89. Thank you! @LUCYSTATS, bit.ly/lucystats-jsm2018, LUCYDAGOSTINO@GMAIL.COM. HARNESSING THE POWER OF THE WEB VIA R CLIENTS FOR WEB APIS. Lucy D’Agostino McGowan, PhD, Johns Hopkins Bloomberg School of Public Health
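
The "Build URL" and "Process Response" steps from the slides can be tried without a network connection or a token. The sketch below is an illustration and is not taken from the talk: it assumes the httr and jsonlite packages are installed, the `body` string is a hand-written stand-in for a real GitHub response (only the two fields shown on the slides are included), and base R's `vapply()` stands in for `purrr::map_chr()`.

```r
# Offline sketch of two workflow steps: Build URL and Process Response.
# Assumption: httr and jsonlite are installed; no request is actually sent.
library(httr)
library(jsonlite)

# Build URL: base URL + endpoint + query parameters, as on the BUILD URL slides.
url <- modify_url(
  url   = "https://api.github.com/",
  path  = "repos/tidyverse/googledrive/issues",
  query = list(labels = "docs")
)

# Hypothetical response body: a hand-written JSON array that mimics the shape
# of the issues response shown on the slides. It is not real API output.
body <- '[
  {"url": "https://api.github.com/repos/tidyverse/googledrive/issues/150",
   "repository_url": "https://api.github.com/repos/tidyverse/googledrive"},
  {"url": "https://api.github.com/repos/tidyverse/googledrive/issues/123",
   "repository_url": "https://api.github.com/repos/tidyverse/googledrive"}
]'

# Process Response: parse the JSON into a nested list (one element per issue),
# then pull the "url" field out of every element as a character vector.
res <- fromJSON(body, simplifyVector = FALSE)
issue_urls <- vapply(res, function(issue) issue$url, character(1))

print(url)
print(issue_urls)
```

Keeping `simplifyVector = FALSE` leaves the parsed result as a list of lists, which matches the nested structure printed on the PROCESS slides; with a live request, `content(req)` would supply `res` instead of the hand-written string.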
