A Data Culture with Embedded Analytics in Action

Data-driven companies need to make their data easily accessible to the people who analyze it. Many organizations have adopted Looker and its LookML modeling language on AWS, pairing a centralized analytical database with a user-friendly interface that lets employees ask and answer their own questions and make informed business decisions.

Join our webinar to learn how our customer Casper, an online mattress retailer, switched from a transactional database to Looker’s data analytics program on Amazon Redshift. Looker on Amazon Redshift can help you greatly shorten your analytics lifecycle with a simplified infrastructure and rapid cloud scaling.

Join us to learn:
• How to utilize LookML to build reusable definitions and logic for your data
• Best practices for architecting a centralized analytical database
• How Casper leveraged Looker and Amazon Redshift to provide all their employees access to their data and metrics

Who should attend: Heads of Analytics, Heads of BI, Analytics Managers, BI Teams, Senior Analysts

Published in: Technology

  1. A Data Culture with Embedded Analytics in Action
     Dave Rocamora • Solutions Architect, AWS
     Erin Franz • Senior Analyst, Alliances, Looker
     Scott Breitenother • VP, Data and Analytics, Casper
  2. Data is Growing
     • 1.7MB of new data will be created every second for every human being on the planet by 2020
     • 58% compound annual growth rate forecasted for the Hadoop market, surpassing $1 billion by 2020
     • <0.5% of all data is ever analyzed and used at the moment
     Sources:
     • http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx
     • http://www.marketanalysis.com/?p=279
     • http://www.technologyreview.com/news/514346/the-data-made-me-do-it/
     • http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf
  3. Big Data is for Everyone
     The market for Big Data technologies is growing more than six times faster than the information technology market as a whole, and those companies that use their data will win.
  4. Why AWS for Big Data?
     • Immediately Available
     • Broad and Deep Capabilities
     • Trusted and Secure
     • Scalable
  5. Collect, Store, Analyze, and Visualize
     It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in.
     • Collect: AWS Import/Export, AWS Snowball, Direct Connect, VM Import/Export
     • Store: Amazon S3, Amazon EMR, Amazon Glacier, Amazon Redshift, DynamoDB, Amazon Aurora
     • Analyze: Amazon Kinesis, AWS Lambda, Amazon EMR, Amazon EC2
  6. AWS Provides the Most Complete Platform for Big Data
     What can you do with Big Data on AWS?
     • Big Data Repositories
     • Clickstream Analysis
     • ETL Offload
     • Machine Learning
     • Online Ad Serving
     • BI Applications
  7. A Data Culture with Embedded Analytics in Action
     Erin Franz • Senior Analyst, Alliances, Looker
  8. Make it easy for everyone to find, explore and understand the data that drives your business
  9. Looker: A Self-Service Data Platform
     Find, explore and understand the data
     • Explore Everything: Find, explore and understand all the data
     • Create Standards: Define your data and business metrics
     • Any SQL Database: Analyze all of your data where it is stored
     • Build a Data Culture: Anyone can ask and answer questions
  10. Looker for Amazon Web Services
     Diagram labels: RDS, Redshift, EMR, Aurora
     • Deployment: Easy deployment on Amazon EC2
     • Data Sources: Connect to Amazon RDS, Amazon Redshift, Amazon Aurora and Amazon EMR (Spark SQL and Presto)
     • Data Modeling Layer: Define your data and business metrics
     • Explore: Find, explore and understand your data
  11. The Technical Pillars that Make it Possible
     100% in Database
       • Leverage all your data
       • Avoid summarizing or moving it
     Modern Web Architecture
       • Access from anywhere
       • Share and collaborate
       • Extend to anyone
     LookML Intelligent Modeling Layer
       • Describe the data
       • Create reusable and shareable business logic
  12. Looker/Redshift Integration Highlights
     In-Database Architecture
     The power of Amazon Redshift is directly leveraged by Looker because all transformation is done in-database
       • As real-time as data in Amazon Redshift
       • Shared compute, scalability, caching all utilized by Looker
     Looker: A Standard for Amazon Redshift
     Some of the most demanding Amazon Redshift deployments choose Looker for data exploration, including:
       • Sony
       • Lyft
       • Yahoo!
       • Kohler
       • Docker
     Highest Level of Looker Features
     We’ve invested in providing Looker features for Amazon Redshift to make the best experience possible, including:
       • Persistent derived tables
       • Symmetric aggregates
       • Query killing
       • Lat/Long location
  13. Companies Winning with Redshift + Looker
     • eCommerce
     • Technology
     • Marketplaces
     • Fin Services
     • Media/Ad Tech
  14. A Data Culture with Embedded Analytics in Action
     Scott Breitenother • VP, Data and Analytics, Casper
  15. Casper Is a NYC Based Sleep Startup
  16. Data Powers Everything that We Do
     Data Team Mission: Enable better, faster decisions through information visibility and analytical expertise
  17. Until We Outgrew Our Data Infrastructure
     Previous tools: Production Databases, Big Excel Files
     • Required data refresh by the data team
     • File speed and data size limitations
     • Intimidating presentation of information
     • Analysis is siloed in files
     • Cannot query across sources, must download data to join
     • Difficult to manage ad hoc queries
     • No one place holds all the information
     • Inconsistent definitions (and a lot of work if you make a small change!)
     Solution was not efficient or scalable
  18. Enter Looker & AWS
     Amazon Redshift
       • Central warehouse for all data
       • Join previously siloed data for better analysis
       • Dialect is very similar to PostgreSQL
       • We use AWS ecosystem (AWS Lambda, Amazon RDS, Amazon EC2)
     Looker
       • Efficient data modeling
       • Easy to manage source of truth
       • Visualization layer
       • Intuitive UI for business users
       • No SQL for business users!!
  19. We Implemented in Phases
     • Phase 1, Copy: Batch copy production databases
     • Phase 2, Copy Faster: Frequent, faster and incremental copy
     • Phase 3, ELT: Build specific data marts
  20. Phase 1: Copy
     How We Did It
       • Open source project from DonorsChoose.org
       • Bash script with regex translations from Postgres to Redshift
       • Full refresh with up to 40 min load time
       • Whitelist of tables to copy for each database
     Results
       • Data updated every 6 hours
       • Missing certain key aggregations
       • Not read performant
       • Unwieldy to manage
       • Poor UX on Looker front end
  21. Phase 2: Copy Faster
     How We Did It
       • Stitch (formerly RJ Metrics Pipeline)
       • Integrates with Postgres as well as other common third party sources
       • 30 minute refresh cycle
       • Point and click to add tables and integrations
     Results
       • Easy to use UI
       • Incremental copy
       • Pre-existing integrations and expertise (multiple engineers, customer support)
       • Fully managed and relatively inexpensive
       • Transparent logging (rows replicated, errors)
  22. Phase 3: ELT
     How We Did It
       • Data Build Tool (dbt) from Fishtown Analytics
       • Looker-like abstraction of tables and views
       • SQL that references other SQL
       • Manages dependency graph
       • Options for materializing SQL (CTE, view, table)
       • Set sort and distribution keys
       • Simple repo deployed to EC2 tiny
     Results
       • Pre-aggregated tables
       • Marts: de-normalized table for an area of the business
       • Lookups: attributes for a product, location
       • Rollups: time-series aggregations for summary reporting
       • Facts: aggregations on key “business objects” (orders, customers)
       • Updates every 30 minutes
  23. This Is What Success Looks Like
  24. This Is What Success Looks Like
     Access for Business Users
       • Find many answers themselves
       • Easy to filter, pivot and visualize
       • Access to all existing analysis
       • Data is refreshed and up-to-date
       • Multiple ways to consume (web, email, links via Slack)
     Simple Management for Data Team
       • Single source of truth
       • Insight into usage
       • Centralized business logic
       • Git managed, easy collaboration
       • Keep pace with evolving business (new countries and products)
  25. Success Story: Supply Chain
     Challenge
     Operations team needed to ensure fast delivery of our highly in-demand products
     Solution
     Monitoring 2 KPIs:
       • Operational Metric: daily Days on Hand (DOH)
       • Success Metric: weekly Order to Ship SLA
     Charts: Order to Ship SLA, Mattress Inventory DOH
  26. Success Story: Executive Reporting
     Challenge
     Executive team needed actionable metrics and the ability to track against goals while seeing trends over time
     Solution
       • Created a dashboard that highlights key metrics from each department
       • Each metric has a goal and includes a weekly, MTD, QTD and trend view
  27. There Will Always Be More Questions
     • Increased Access to Data
     • More Sophisticated Clients
     • Tougher Questions
  28. Q&A
     Dave Rocamora • Solutions Architect, AWS
     Erin Franz • Senior Analyst, Alliances, Looker
     Scott Breitenother • VP, Data and Analytics, Casper
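
Slide 20 (Phase 1: Copy) mentions regex translations from Postgres to Redshift and a whitelist of tables to copy for each database. A minimal sketch of that idea is given here in Python rather than Bash. The rewrite rules, the database and table names, and the function names are all hypothetical, chosen only for illustration; they are not taken from the DonorsChoose.org project.

```python
import re

# Hypothetical type rewrites (assumptions for illustration, not the rules
# from the open source script named on slide 20).
TYPE_RULES = [
    (re.compile(r"\bserial\b", re.IGNORECASE), "integer"),
    (re.compile(r"\bjsonb?\b", re.IGNORECASE), "varchar(65535)"),
    (re.compile(r"\btext\b", re.IGNORECASE), "varchar(65535)"),
    (re.compile(r"\buuid\b", re.IGNORECASE), "char(36)"),
]

# Hypothetical whitelist: database name -> tables allowed to be copied.
WHITELIST = {"shop": {"orders", "customers"}}


def translate_ddl(ddl: str) -> str:
    """Apply each regex rewrite, in order, to a Postgres DDL string."""
    for pattern, replacement in TYPE_RULES:
        ddl = pattern.sub(replacement, ddl)
    return ddl


def tables_to_copy(database: str, tables: list[str]) -> list[str]:
    """Keep only the tables whitelisted for this database, in input order."""
    allowed = WHITELIST.get(database, set())
    return [t for t in tables if t in allowed]


postgres_ddl = "CREATE TABLE orders (id serial, notes text, meta jsonb, token uuid);"
print(translate_ddl(postgres_ddl))
print(tables_to_copy("shop", ["orders", "sessions", "customers"]))
```

The rules match on whole words only, so a column that happens to be named `text` or `json` would also be rewritten. A real script would likely need more careful parsing than this sketch shows.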

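Slide 22 (Phase 3: ELT) describes "SQL that references other SQL" and a managed dependency graph. The sketch below illustrates how such references might be turned into a build order. It relies on dbt's `{{ ref('model_name') }}` templating syntax, but the model names, the `analytics` schema, and the helper functions are hypothetical, and this is not how dbt itself is implemented.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches dbt-style references such as {{ ref('stg_orders') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")

# Hypothetical models: name -> SQL that may reference other models.
MODELS = {
    "rollup_daily_orders": (
        "select order_date, sum(order_count) as orders "
        "from {{ ref('fact_orders') }} group by 1"
    ),
    "fact_orders": (
        "select customer_id, order_date, count(*) as order_count "
        "from {{ ref('stg_orders') }} group by 1, 2"
    ),
    "stg_orders": "select * from raw.orders",
}


def build_order(models: dict[str, str]) -> list[str]:
    """Return model names so that every model follows the models it references."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


def compile_sql(sql: str, schema: str = "analytics") -> str:
    """Replace each reference with a schema-qualified table name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


for name in build_order(MODELS):
    print(name, "->", compile_sql(MODELS[name]))
```

Building in dependency order means a rollup is only created after the fact table it reads from. `TopologicalSorter` raises `CycleError` if two models reference each other.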