Betterment Engineering - Bootstrapping Data Intelligence - Agile R Dashboards
Key points from the presentation
- Bootstrap. Don’t introduce complexity into your environment until you really need it.
- Leverage the skill set of your organization. If your analysts are great with R, productionize an R workflow.
- Automate. Pragmatic engineering can empower your analysts while supporting your process.
- Freemium Cloud, first. IaaS providers like Amazon have a free tier to help you get started. Try it before you buy it.
- Use Hosted Tools and Services. There are powerful hosted tools and services out there, like Travis-CI, to help you automate your workflow. Add them to your toolkit.

For more content from Betterment's engineers, please visit: https://www.betterment.com/blog/topics/engineering/.
Code samples: https://github.com/ygoldman/rwizflowy

  1. Light-weight Dashboarding and Reporting workflows with R. And the GREAT stack... Yuriy Goldman, Jon Mauney
  2. http://www.meetup.com/BigOnData http://www.meetup.com/NYC-Open-Data http://www.meetup.com/FinTech
  3. @betterment #bootstrapbi
  4. Team Polaris @ Betterment: Yuriy, Jon, Avi, Nick, Andrew. https://www.betterment.com/blog/2014/03/07/bootstrap-data-team/
  5. Bootstrapping Business Intelligence. Get Here
  6. Walk before you run...
  7. Leverage existing skillset
  8. Minimally Viable Product
  9. Lean and Efficient
  10. Of
  11. Agenda: • GREAT Stack in Layers • Exercise a workflow for Development, Staging, Deployment • Teamwork or Mingle: we will build an “almost realtime” Dashboard in R
  12. GREAT Stack: GitHub, for source control of scaffolding and scripts. R language: RStudio, knitr, rCharts, YAML. Engineering Elbow Grease: enabling automation, QA. AWS (Amazon Web Services): S3, EC2, RDS. Travis CI, for continuous build and deployments.
  13. Engineer Tested, Analyst Approved
  14. Workflow Overview: AUTHORING, STAGING, DEPLOYING
  15. Local Environment: project; YAML; MySQL; R-scripts, system scripts, deployment-fu; network file storage; s3::rwizflowy-bucket; /mnt/rwizflowy-bucket. (A) Set up Git (B) Open project in RStudio (C) Mount S3 Bucket and Symlink (D) Test DB Connection. WiFi: BettermentGuest: guest, welcome to betterment
  16. Complete environment for R development (But you knew that already)
  17. Collaborative Source Code Management. Continuous Integration hooks. Post Deployment processing.
  18. AWS S3: Like Dropbox, but gives you dependable, static URLs to files you save there (images, HTML pages)
  19. ExpanDrive: Access AWS S3 Bucket as a local drive. Symlink /Volume/rwizflowy-bucket to /mnt/rwizflowy-bucket
  20. Sample data will come from MySQL.
  21. YAML is the syntax for a config file our R scripts will load at runtime. It can tell us how to connect to MySQL or where to output our plots. Any settings that can change between your Local environment and Server environment should be defined here.
  22. Google Sites: Assembly of Dashboards or Reports happens within Google Sites, but any wiki will do. Use whatever you like, as long as IMG and IFrame elements are supported.
  23. Local Environment: project; YAML; MySQL; R-scripts, system scripts, deployment-fu; network file storage; s3::rwizflowy-bucket; /mnt/rwizflowy-bucket. (A) Set up Git (B) Open project in RStudio (C) Mount S3 Bucket and Symlink (D) Test DB Connection
  24. Team Exercise #1: Authoring. Set up Local Environment. Exercise a sample script. Output to S3. 1. Get Code 2. Team Name 3. Connect to S3 and MySQL 4. Run Code, output to S3
  25. Find a team captain for the Authoring Challenge! ● Reconvene in 20 minutes ● Take one of the samples and come up with an original graphic ● Team with the best ‘custom’ content that is web accessible (in the S3 bucket) gets t-shirts! https://s3.amazonaws.com/rwizflowy-bucket/${team-name}
  26. Staging: Add R output to a Wiki. https://s3.amazonaws.com/rwizflowy-bucket/${TEAMNAME}/${FILENAME.png}
  27. Local Environment: project; YAML; MySQL; R-scripts, system scripts, deployment-fu; network file storage; s3::rwizflowy-bucket; /mnt/rwizflowy-bucket
  28. Server Environment: project; YAML; MySQL; R-scripts, system scripts, deployment-fu; network file storage; s3::rwizflowy-bucket; /mnt/rwizflowy-bucket via S3Fuse; cron scheduler
  29. Integration Environment. Diagram labels: Local, Server, git push, hook, pull, build and deploy.
  30. Travis-CI: Continuous Integration and Deployment tool. Connects to your GitHub account and listens for changes to your branches. We tell it what to do via our .travis.yml file (in our project). Travis can execute unit/integration tests. If all is A-OK, it can push to EC2. Awesomeness!
  31. Deploying: Commit to Master. Sit back and enjoy the show...
  32. Server Environment: YAML; MySQL; network file storage; s3::rwizflowy-bucket; /mnt/rwizflowy-bucket via S3Fuse; cron scheduler
  33. Wrap Up, Q&A: AUTHORING, STAGING, DEPLOYING
  34. https://github.com/ygoldman/rwizflowy http://www.betterment.com/jobs https://www.betterment.com/blog Get a Betterment account: https://www.betterment.com/fintech
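
Slide 21 describes a YAML config file that the R scripts load at runtime, holding whatever differs between the Local and Server environments. The sketch below shows one way that could look, assuming the `yaml` package is installed. The key names (`db`, `output_dir`), the host names, and the `RWIZ_ENV` variable are hypothetical and are not taken from the talk or its repository.

```r
library(yaml)

# Hypothetical config: every key and value here is illustrative only.
config_text <- "
local:
  db:
    host: localhost
    port: 3306
  output_dir: /mnt/rwizflowy-bucket
server:
  db:
    host: db.example.internal
    port: 3306
  output_dir: /mnt/rwizflowy-bucket
"

# Parse the YAML and return only the section for the requested environment.
# RWIZ_ENV is an assumed environment variable; it defaults to "local".
load_settings <- function(text, env = Sys.getenv("RWIZ_ENV", unset = "local")) {
  all_settings <- yaml.load(text)
  if (!env %in% names(all_settings)) {
    stop("Unknown environment: ", env)
  }
  all_settings[[env]]
}

settings <- load_settings(config_text, env = "server")
print(settings$db$host)
print(settings$output_dir)
```

In a real project the text would more likely live in a file next to the scripts and be read with `yaml.load_file()`, so the scripts themselves stay identical across environments.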
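
Step 4 of the exercise on slide 24 is "Run Code, output to S3". Because the bucket is mounted as a local path, writing a plot there is ordinary file output. The following is a minimal sketch using base R only; the helper name `save_team_plot`, the team name, the file name, and the data are made up, and a temporary directory stands in for `/mnt/rwizflowy-bucket` so the example runs anywhere.

```r
# Write a plot as a PNG inside a per-team folder under the output root.
# plot_fn is any zero-argument function that draws on the current device.
save_team_plot <- function(output_root, team_name, file_name, plot_fn) {
  team_dir <- file.path(output_root, team_name)
  dir.create(team_dir, recursive = TRUE, showWarnings = FALSE)
  target <- file.path(team_dir, file_name)
  png(target, width = 800, height = 600)
  # Close the device even if plotting fails, so the file is always finalized.
  on.exit(dev.off(), add = TRUE)
  plot_fn()
  invisible(target)
}

# Stand-in for the mounted bucket; on a configured machine this would be
# the mount point named on the slides.
output_root <- tempfile("bucket-")

path <- save_team_plot(output_root, "example-team", "example.png", function() {
  plot(1:10, cumsum(1:10), type = "b", main = "Made-up data")
})
print(path)
```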

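Slide 26 gives the URL pattern for staged output: bucket, then team name, then file name. A small helper can build that URL and the IMG markup to paste into a wiki page. This is a sketch in base R; the function names `s3_public_url` and `img_tag` and the sample team and file names are hypothetical.

```r
# Build the public URL for a file in the bucket, following the
# https://s3.amazonaws.com/<bucket>/<team>/<file> pattern from the slides.
s3_public_url <- function(team_name, file_name, bucket = "rwizflowy-bucket") {
  paste("https://s3.amazonaws.com",
        bucket,
        URLencode(team_name, reserved = TRUE),
        URLencode(file_name, reserved = TRUE),
        sep = "/")
}

# Wrap a URL in a minimal IMG element for embedding in a wiki page.
img_tag <- function(url, alt = "") {
  sprintf('<img src="%s" alt="%s">', url, alt)
}

url <- s3_public_url("example-team", "example.png")
print(url)
print(img_tag(url, alt = "Example plot"))
```

Percent-encoding the team and file names keeps the link valid if a team picks a name containing spaces.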
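Slide 30 notes that Travis can execute unit/integration tests before pushing to EC2. The talk does not show which checks were run, so the following is only an assumed example of a cheap check a CI job could run with `Rscript`: confirm that every R script in a directory parses. The function name and the sample files are hypothetical.

```r
# Return the file names of R scripts in script_dir that fail to parse.
find_unparseable_scripts <- function(script_dir) {
  scripts <- list.files(script_dir, pattern = "\\.[Rr]$", full.names = TRUE)
  parses <- vapply(scripts, function(path) {
    tryCatch({
      parse(file = path)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
  basename(scripts[!parses])
}

# Demo with two throwaway scripts: one valid, one with a syntax error.
demo_dir <- tempfile("scripts-")
dir.create(demo_dir)
writeLines("x <- 1 + 1", file.path(demo_dir, "good.R"))
writeLines("x <- (1 +", file.path(demo_dir, "bad.R"))

broken <- find_unparseable_scripts(demo_dir)
print(broken)
```

In a CI job, a non-empty result would typically end the run with `quit(status = 1)` so the build is marked as failed and nothing is deployed.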