Fifth elephant 2017 Data Pipeline workshop

Workshop takeaways and outline

Published in: Engineering

  1. Unless you measure it, you can't improve it
     Data pipelines to track KPIs and KRAs for your business
  2. What are we improving?
     • Airbnb clone: yourbnb [architecture discussion, 15 mins]
     • Initial setup verification [10 mins]
     • Workshop phases
       • Basic instrumentation
         1. Add host metrics and visualization [15 mins]
         2. App/services: instrumentation with meters, gauges, counters, histograms [15 mins]
         3. Audit trails, deployment history [10 mins]
       • Event sourcing
         • Theory and approach [10 mins]
         • Introduce events, measurements, metrics, logs [5 mins discussion + 15 mins hands-on]
       • Data pipeline
         • Architectural pattern and options [10 mins]
         • Changes in ingestion and publishing [20 mins]
       • Dashboards [45 mins]
  3. 1. Monitor all the infrastructure
     • Gather system performance stats (CPU, I/O, network) and send them to a common data store
     • Visualize these stats
     Tech stack
     • Metrics: app and system health library
     • Compute: S3 and Lambda
     • Visualization: Grafana
     • Storage: InfluxDB / Druid [TBD]
  4. 2. Monitor services
     • Add metrics for each service, e.g. for a web API: requests per second for each endpoint and the response status distribution (200, 503, 401, etc.)
     • Average response time
     • Version information for each service and its update history
     Tech stack
     • Metrics: app and system health library
     • Compute: S3 and Lambda
     • Visualization: Grafana
     • Storage: InfluxDB / Druid [TBD]
  5. 3. Audit Trails
     • Change capture system
     • Annotations / markers
  6. 4. Polyglot Persistence
     • Host and service telemetry: time series database
     • Master data: document store / RDBMS
     • App logs: Elasticsearch
  7. 5. Data Pipeline
     • Architectural paradigm
     • Event logs as system of record
     • Open source options
     • Implement
  8. 6. Event Sourcing and CQRS
  9. 7. A/B testing
  10. 8. Dashboards for KPIs
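
The service instrumentation described in the outline (counters, gauges, histograms; per-endpoint request counts and response status distribution) could look roughly like the sketch below. It is only an illustration, not the workshop's code: the `Metrics` class, the `record_request` helper, the metric naming scheme, and the sample endpoint and version string are all hypothetical, and everything is kept in memory rather than shipped to a store such as InfluxDB or Druid.

```python
import statistics
from collections import Counter


class Metrics:
    """Hypothetical in-memory metrics registry (illustration only)."""

    def __init__(self):
        self.counters = Counter()  # event counts, e.g. requests per endpoint/status
        self.gauges = {}           # last observed value per name
        self.histograms = {}       # raw samples per name

    def incr(self, name, amount=1):
        """Counter: add `amount` to the running count for `name`."""
        self.counters[name] += amount

    def set_gauge(self, name, value):
        """Gauge: remember only the most recent value for `name`."""
        self.gauges[name] = value

    def observe(self, name, value):
        """Histogram: keep every sample so a distribution can be summarized."""
        self.histograms.setdefault(name, []).append(value)

    def summary(self, name):
        """Summarize the samples recorded for `name`."""
        samples = self.histograms.get(name, [])
        if not samples:
            return {"count": 0}
        return {
            "count": len(samples),
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "max": max(samples),
        }


def record_request(metrics, endpoint, status, latency_ms):
    """Record one API request: a count per endpoint/status plus its latency."""
    metrics.incr(f"requests.{endpoint}.{status}")
    metrics.observe(f"latency_ms.{endpoint}", latency_ms)


metrics = Metrics()
record_request(metrics, "/listings", 200, 100)
record_request(metrics, "/listings", 200, 300)
record_request(metrics, "/listings", 503, 200)
metrics.set_gauge("service.version", "1.4.2")  # hypothetical version string

print(metrics.counters)
print(metrics.summary("latency_ms./listings"))
```

Keeping the status code in the counter name gives the response distribution directly. Requests per second would then come from sampling these counters over time in whichever time series store is chosen.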
