Fifth elephant 2017 Data Pipeline workshop

Workshop takeaways and outline

  1. Unless you measure it, you can’t improve it: data pipelines to track KPIs and KRAs for your business
  2. What are we improving?
     • Airbnb clone yourbnb [architecture discussion – 15 mins]
     • Initial setup verification – 10 mins
     • Workshop phases
       • Basic instrumentation
         1. Add host metrics and visualization [15 mins]
         2. App/services: instrumentation with meters, gauges, counters, histograms [15 mins]
         3. Audit trails, deployment history – 10 mins
       • Event sourcing
         • Theory and approach – 10 mins
         • Introduce events, measurements, metrics, logs – 5 mins discussion + 15 mins hands-on
       • Data pipeline
         • Architectural pattern and options – 10 mins
         • Changes in ingestion and publishing – 20 mins
       • Dashboards – 45 mins
  3. 1. Monitor all the infrastructure
     • Gather system performance stats (CPU, I/O, network) and send them to a common data store
     • Visualize these stats
     Tech Stack
     • Metrics – App and System health library
     • Compute – S3 and Lambda
     • Visualization – Grafana
     • Storage – InfluxDB / Druid [TBD]
  4. 2. Monitor services
     • Add metrics for each service, e.g. for a web API: requests per second for each API endpoint and the response status distribution (200, 503, 401, etc.)
     • Average response time
     • Version information for each service and its update history
     Tech Stack
     • Metrics – App and System health library
     • Compute – S3 and Lambda
     • Visualization – Grafana
     • Storage – InfluxDB / Druid [TBD]
  5. 3. Audit Trails
     • Change capture system
     • Annotations / markers
  6. 4. Polyglot Persistence
     • Host and service telemetry in a time series database
     • Master data – document store / RDBMS
     • App logs – Elasticsearch
  7. 5. Data Pipeline
     • Architectural paradigm
     • Event logs as system of record
     • Open source options
     • Implement
  8. 6. Event Sourcing and CQRS
  9. 7. A/B testing
  10. 8. Dashboards for KPIs
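
The host-metrics step in the outline (gather system stats, send them to a common data store) could look roughly like the sketch below. It is an illustration only: the slides do not name the collection library and list storage as "TBD", so the choice of InfluxDB line protocol, the function names `collect_host_stats` and `to_line_protocol`, and the measurement and tag names are all assumptions. Tag and field values are not escaped here, which a real writer would need to handle.

```python
import os
import shutil
import time


def collect_host_stats(path="/"):
    """Gather a few host stats using only the standard library (hypothetical helper)."""
    stats = {}
    disk = shutil.disk_usage(path)
    if disk.total:
        stats["disk_used_pct"] = round(100 * disk.used / disk.total, 2)
    if hasattr(os, "getloadavg"):  # not available on Windows
        try:
            stats["load1"] = os.getloadavg()[0]
        except OSError:
            pass  # load average could not be obtained on this host
    return stats


def to_line_protocol(measurement, tags, fields, ts_ns):
    """Format one point as InfluxDB line protocol (assumed storage choice).

    Shape: measurement,tag=value field=value timestamp_ns
    All field values are written as floats; no escaping is performed.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


if __name__ == "__main__":
    # "host_stats" and "web-1" are placeholder names for illustration.
    print(to_line_protocol("host_stats", {"host": "web-1"},
                           collect_host_stats(), time.time_ns()))
```

Each printed line is one point that a time series store could ingest and a dashboard could then chart per host.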
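
To make the service instrumentation items concrete (requests per endpoint, response status distribution, average response time), here is a minimal in-memory sketch. It is not the workshop's "App and System health library", which the slides do not identify; `ServiceMetrics` and its methods are hypothetical names, and a production service would more likely use an established metrics client.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager


class ServiceMetrics:
    """Hypothetical in-memory registry with a counter, a gauge and a histogram."""

    def __init__(self):
        self.requests = defaultdict(int)       # counter: (endpoint, status) -> count
        self.latencies_ms = defaultdict(list)  # histogram samples per endpoint
        self.in_flight = 0                     # gauge: requests being served now

    def record(self, endpoint, status, elapsed_ms):
        self.requests[(endpoint, status)] += 1
        self.latencies_ms[endpoint].append(elapsed_ms)

    @contextmanager
    def timed(self, endpoint):
        """Time a request; the caller sets outcome["status"] before exiting."""
        self.in_flight += 1
        start = time.perf_counter()
        outcome = {"status": 200}
        try:
            yield outcome
        finally:
            self.in_flight -= 1
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.record(endpoint, outcome["status"], elapsed_ms)

    def status_distribution(self, endpoint):
        return {s: n for (e, s), n in self.requests.items() if e == endpoint}

    def requests_per_second(self, endpoint, window_s):
        """Meter: total recorded requests divided by the observation window."""
        return sum(self.status_distribution(endpoint).values()) / window_s

    def avg_response_ms(self, endpoint):
        samples = self.latencies_ms.get(endpoint, [])
        return sum(samples) / len(samples) if samples else 0.0

    def percentile_ms(self, endpoint, pct):
        """Nearest-rank percentile over the recorded samples."""
        samples = sorted(self.latencies_ms.get(endpoint, []))
        if not samples:
            return 0.0
        rank = max(1, math.ceil(pct / 100 * len(samples)))
        return samples[rank - 1]
```

A handler would wrap its work in `with metrics.timed("/listings") as outcome:` and set `outcome["status"]` to the response code; the aggregate methods then feed whichever dashboard is in use.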
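
The "event logs as system of record" and "Event Sourcing and CQRS" items can be sketched as an append-only log on the write side and a KPI projection rebuilt from that log on the read side. The slides give no schema, so the event kinds (`listing_viewed`, `booking_created`), the fields, and the conversion KPI are assumptions chosen to fit the yourbnb example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    kind: str            # hypothetical kinds: "listing_viewed", "booking_created"
    listing_id: str
    amount: float = 0.0  # only meaningful for booking events


@dataclass
class EventLog:
    """Write side: an append-only log; entries are never updated or deleted."""
    _events: list = field(default_factory=list)

    def append(self, event):
        self._events.append(event)

    def replay(self):
        # Iterate over a snapshot so readers cannot mutate the log.
        return iter(tuple(self._events))


def project_kpis(events):
    """Read side: fold the event stream into a KPI view for a dashboard."""
    counts = Counter()
    revenue = 0.0
    for event in events:
        counts[event.kind] += 1
        if event.kind == "booking_created":
            revenue += event.amount
    views = counts["listing_viewed"]
    bookings = counts["booking_created"]
    return {
        "views": views,
        "bookings": bookings,
        "revenue": revenue,
        "conversion": bookings / views if views else 0.0,
    }
```

Because the view is derived entirely from `replay()`, a new KPI can be added later by writing another projection over the same log, without changing what was recorded.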
