SlideShare a Scribd company logo
Monitoring Anomalies in
Experimentation Platform
Deepak Vasthimal – MTS @ eBay
Connect Me - https://www.linkedin.com/in/whatisdeepak
Available on eBay Tech Blog - https://goo.gl/6bUbE9
Overview of Experimentation (A/B Test) Platform
• A/B Testing is comparing different experiences and
measure their performance.
• Variance could be in UI, Components, Algorithms etc.
• Measure Bankable metrics and non-bankable (activity
click rates).
• Enables data driven decisions.
• Avoid making releases features on intuitions to over 150
million users.
Monitoring Anomalies in the Experimentation Platform 2
Experimentation Reporting
• 1500+ experiments
• Migrated from Teradata/SQL to Hadoop using Scala.
• Process 100s of TBs of data daily on Hadoop cluster with
around 400 M/R jobs.
• 200+ metrics generated daily using batch system.
• Built using a mix of open source technologies like Scala,
Scoobi, Hive, Hadoop and proprietary tech like Teradata
and MicroStrategy.
Monitoring Anomalies in the Experimentation Platform 3
High Level Flow
Monitoring Anomalies in the Experimentation Platform 4
Anomalies with Experiments
• Traffic corruption – Traffic between test and control is
skewed by UID/GUID.
• Tag corruption – Data loss/corrupted during logging &
transfer to HDFS.
• GUID Reset – Browser cookie.
• Cache refresh - eBay application servers maintain
caches of experiment configurations. A software or
hardware glitch can cause corruption of cache.
Monitoring Anomalies in the Experimentation Platform 5
Monitoring Anomalies
• Identify & Categorize anomalies within
experiments using Teradata/Hive.
• Store identified anomalies in HDFS and route
to InfluxDB (TSDB)
•Visualize using Grafana.
•I was introduced to Grafana through SpaceX
tweet by Torkel.
Monitoring Anomalies in the Experimentation Platform 6
Reason we chose Grafana
• Visually pleasing graphs.
• Easy setup.
• In Built Query Editor (SQL & UI)
• Instantly change dashboards using duplicate
dashboards/panels feature.
• InfluxDB for its ability to be setup in minutes
when compared to Graphite/Prometheus .
• Entire pipeline took couple of days.
Monitoring Anomalies in the Experimentation Platform 7
Home Page
Monitoring Anomalies in the Experimentation Platform 8
DrillDown
Monitoring Anomalies in the Experimentation Platform 9
Search
Monitoring Anomalies in the Experimentation Platform 10
Scale
• InfluxDB (v 0.11-1) is installed on a single node with
45 GB of memory.
• Grafana (v 3.0.2) is installed on a single node with 45
GB of memory.
• 2000 points are ingested daily which is minuscule.
• Currently have around 10 months of historical data.
Monitoring Anomalies in the Experimentation Platform 11
Grafana <3 is spreading.
• Performance analysis of MapReduce jobs by
Experimentation platform with job counters.
• Monitor Elastic Search Cluster (10k+ points per
second).
• Anomaly detection in Tracking data (10k+ points
per minute).
• Each use case stores data in InfluxDB.
Monitoring Anomalies in the Experimentation Platform 12
Thank You
Monitoring Anomalies in the Experimentation Platform 13

More Related Content

What's hot

How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
DevOps.com
 
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
RightScale
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
Spark Summit
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paper
strange_loop
 
OOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with ParallelOOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with Parallel
Kellyn Pot'Vin-Gorman
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax Academy
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
Apache Apex
 
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer APIApache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of Parameters
Jen Aman
 
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Spark Summit
 
Apache Apex Kafka Input Operator
Apache Apex Kafka Input OperatorApache Apex Kafka Input Operator
Apache Apex Kafka Input Operator
Apache Apex
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Databricks
 
How to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeterHow to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeter
InfluxData
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Apache Apex
 
Extending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache ApexExtending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache Apex
Apache Apex
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
Databricks
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Apache Apex
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
Apache Apex
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
Apache Apex
 

What's hot (20)

How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
 
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale...
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paper
 
OOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with ParallelOOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with Parallel
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
 
Apache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer APIApache Apex connector with Kafka 0.9 consumer API
Apache Apex connector with Kafka 0.9 consumer API
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of Parameters
 
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
 
Apache Apex Kafka Input Operator
Apache Apex Kafka Input OperatorApache Apex Kafka Input Operator
Apache Apex Kafka Input Operator
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
How to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeterHow to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeter
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Extending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache ApexExtending The Yahoo Streaming Benchmark to Apache Apex
Extending The Yahoo Streaming Benchmark to Apache Apex
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
 

Viewers also liked

Razvoj novca u Republici Hrvatskoj
Razvoj novca u Republici HrvatskojRazvoj novca u Republici Hrvatskoj
Razvoj novca u Republici Hrvatskoj
Marko Božac
 
Tablas
TablasTablas
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
Marko Božac
 
When ego gets in the way of marketing
When ego gets in the way of marketingWhen ego gets in the way of marketing
When ego gets in the way of marketing
weinmark
 
Dietética
Dietética Dietética
Dietética
Blanckys Gonzalez
 
Muhammad_Hassan_Karachi
Muhammad_Hassan_KarachiMuhammad_Hassan_Karachi
Muhammad_Hassan_Karachi
Syed Muhammad Hassan Zaidi
 
Demanda sap y otros
Demanda sap y otrosDemanda sap y otros
Demanda sap y otros
Adalberto Cervantes Rodriguez
 
Curso adm 471 aplicacion de flujo de caja
Curso adm 471   aplicacion de flujo de cajaCurso adm 471   aplicacion de flujo de caja
Curso adm 471 aplicacion de flujo de caja
Procasecapacita
 
trabajo final de tecnologia
trabajo final de tecnologiatrabajo final de tecnologia
trabajo final de tecnologia
yuraniscope
 
Nesvrstani
NesvrstaniNesvrstani
Finding Product Market Fit - Peter Reinhardt at SaaStock 16
Finding Product Market Fit - Peter Reinhardt at SaaStock 16Finding Product Market Fit - Peter Reinhardt at SaaStock 16
Finding Product Market Fit - Peter Reinhardt at SaaStock 16
SaaStock
 
Slide smau firenze 08.07.2016
Slide smau   firenze 08.07.2016Slide smau   firenze 08.07.2016
Slide smau firenze 08.07.2016
Claudio Morelli
 
Mantenimiento de estructuras
Mantenimiento de estructurasMantenimiento de estructuras
Mantenimiento de estructuras
Antonio Mogollon
 
Energias alternas o alternativas
Energias alternas o alternativasEnergias alternas o alternativas
Energias alternas o alternativas
yuraniscope
 

Viewers also liked (14)

Razvoj novca u Republici Hrvatskoj
Razvoj novca u Republici HrvatskojRazvoj novca u Republici Hrvatskoj
Razvoj novca u Republici Hrvatskoj
 
Tablas
TablasTablas
Tablas
 
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
Godišnje izvješće Financijskog kluba u mandatu 2014./2015.
 
When ego gets in the way of marketing
When ego gets in the way of marketingWhen ego gets in the way of marketing
When ego gets in the way of marketing
 
Dietética
Dietética Dietética
Dietética
 
Muhammad_Hassan_Karachi
Muhammad_Hassan_KarachiMuhammad_Hassan_Karachi
Muhammad_Hassan_Karachi
 
Demanda sap y otros
Demanda sap y otrosDemanda sap y otros
Demanda sap y otros
 
Curso adm 471 aplicacion de flujo de caja
Curso adm 471   aplicacion de flujo de cajaCurso adm 471   aplicacion de flujo de caja
Curso adm 471 aplicacion de flujo de caja
 
trabajo final de tecnologia
trabajo final de tecnologiatrabajo final de tecnologia
trabajo final de tecnologia
 
Nesvrstani
NesvrstaniNesvrstani
Nesvrstani
 
Finding Product Market Fit - Peter Reinhardt at SaaStock 16
Finding Product Market Fit - Peter Reinhardt at SaaStock 16Finding Product Market Fit - Peter Reinhardt at SaaStock 16
Finding Product Market Fit - Peter Reinhardt at SaaStock 16
 
Slide smau firenze 08.07.2016
Slide smau   firenze 08.07.2016Slide smau   firenze 08.07.2016
Slide smau firenze 08.07.2016
 
Mantenimiento de estructuras
Mantenimiento de estructurasMantenimiento de estructuras
Mantenimiento de estructuras
 
Energias alternas o alternativas
Energias alternas o alternativasEnergias alternas o alternativas
Energias alternas o alternativas
 

Similar to Monitoring anomalies in experimentation platform.

Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)
CIVEL Benoit
 
Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1
CIVEL Benoit
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
ssuserd3a367
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Spark Summit
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Apache Apex
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)
Ilya Ganelin
 
PEARC17: Live Integrated Visualization Environment: An Experiment in General...
PEARC17: Live Integrated Visualization Environment: An Experiment in General...PEARC17: Live Integrated Visualization Environment: An Experiment in General...
PEARC17: Live Integrated Visualization Environment: An Experiment in General...
moneyjh
 
Tale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache FlinkTale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache Flink
Karthik Deivasigamani
 
Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)
KafkaZone
 
SQL on Hadoop benchmarks using TPC-DS query set
SQL on Hadoop benchmarks using TPC-DS query setSQL on Hadoop benchmarks using TPC-DS query set
SQL on Hadoop benchmarks using TPC-DS query set
Kognitio
 
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
Apache Apex
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)
Cloudera, Inc.
 
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland LeusdenTestistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Turkish Testing Board
 
Nonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinNonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the Coin
TechWell
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
Cloudera, Inc.
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
confluent
 
SCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation EnvironmentsSCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation Environments
SCAPE Project
 
Tech talk microservices debugging
Tech talk microservices debuggingTech talk microservices debugging
Tech talk microservices debugging
Andrey Kolodnitsky
 
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Lohika_Odessa_TechTalks
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructure
Fernando Lopez Aguilar
 

Similar to Monitoring anomalies in experimentation platform. (20)

Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)
 
Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)
 
PEARC17: Live Integrated Visualization Environment: An Experiment in General...
PEARC17: Live Integrated Visualization Environment: An Experiment in General...PEARC17: Live Integrated Visualization Environment: An Experiment in General...
PEARC17: Live Integrated Visualization Environment: An Experiment in General...
 
Tale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache FlinkTale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache Flink
 
Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)
 
SQL on Hadoop benchmarks using TPC-DS query set
SQL on Hadoop benchmarks using TPC-DS query setSQL on Hadoop benchmarks using TPC-DS query set
SQL on Hadoop benchmarks using TPC-DS query set
 
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)
 
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland LeusdenTestistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
 
Nonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinNonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the Coin
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
 
SCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation EnvironmentsSCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation Environments
 
Tech talk microservices debugging
Tech talk microservices debuggingTech talk microservices debugging
Tech talk microservices debugging
 
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructure
 

Recently uploaded

Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
Dwarkadas J Sanghvi College of Engineering
 
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
Paris Salesforce Developer Group
 
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
PriyankaKilaniya
 
Ericsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.pptEricsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.ppt
wafawafa52
 
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
sydezfe
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
VANDANAMOHANGOUDA
 
Open Channel Flow: fluid flow with a free surface
Open Channel Flow: fluid flow with a free surfaceOpen Channel Flow: fluid flow with a free surface
Open Channel Flow: fluid flow with a free surface
Indrajeet sahu
 
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
GiselleginaGloria
 
paper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdfpaper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdf
ShurooqTaib
 
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
DharmaBanothu
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
DharmaBanothu
 
SELENIUM CONF -PALLAVI SHARMA - 2024.pdf
SELENIUM CONF -PALLAVI SHARMA - 2024.pdfSELENIUM CONF -PALLAVI SHARMA - 2024.pdf
SELENIUM CONF -PALLAVI SHARMA - 2024.pdf
Pallavi Sharma
 
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTERUNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
vmspraneeth
 
Butterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdfButterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdf
Lubi Valves
 
Blood finder application project report (1).pdf
Blood finder application project report (1).pdfBlood finder application project report (1).pdf
Blood finder application project report (1).pdf
Kamal Acharya
 
Beckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview PresentationBeckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview Presentation
VanTuDuong1
 
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
Kuvempu University
 
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
Sou Tibon
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
IJCNCJournal
 
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
upoux
 

Recently uploaded (20)

Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
 
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
 
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
 
Ericsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.pptEricsson LTE Throughput Troubleshooting Techniques.ppt
Ericsson LTE Throughput Troubleshooting Techniques.ppt
 
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
一比一原版(uoft毕业证书)加拿大多伦多大学毕业证如何办理
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
 
Open Channel Flow: fluid flow with a free surface
Open Channel Flow: fluid flow with a free surfaceOpen Channel Flow: fluid flow with a free surface
Open Channel Flow: fluid flow with a free surface
 
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
 
paper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdfpaper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdf
 
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
 
SELENIUM CONF -PALLAVI SHARMA - 2024.pdf
SELENIUM CONF -PALLAVI SHARMA - 2024.pdfSELENIUM CONF -PALLAVI SHARMA - 2024.pdf
SELENIUM CONF -PALLAVI SHARMA - 2024.pdf
 
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTERUNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
UNIT-III- DATA CONVERTERS ANALOG TO DIGITAL CONVERTER
 
Butterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdfButterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdf
 
Blood finder application project report (1).pdf
Blood finder application project report (1).pdfBlood finder application project report (1).pdf
Blood finder application project report (1).pdf
 
Beckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview PresentationBeckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview Presentation
 
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
 
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
309475979-Creativity-Innovation-notes-IV-Sem-2016-pdf.pdf
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
 
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
 

Monitoring anomalies in experimentation platform.

  • 1. Monitoring Anomalies in Experimentation Platform Deepak Vasthimal – MTS @ eBay Connect Me - https://www.linkedin.com/in/whatisdeepak Available on eBay Tech Blog - https://goo.gl/6bUbE9
  • 2. Overview of Experimentation (A/B Test) Platform • A/B Testing is comparing different experiences and measure their performance. • Variance could be in UI, Components, Algorithms etc. • Measure Bankable metrics and non-bankable (activity click rates). • Enables data driven decisions. • Avoid making releases features on intuitions to over 150 million users. Monitoring Anomalies in the Experimentation Platform 2
  • 3. Experimentation Reporting • 1500+ experiments • Migrated from Teradata/SQL to Hadoop using Scala. • Process 100s of TBs of data daily on Hadoop cluster with around 400 M/R jobs. • 200+ metrics generated daily using batch system. • Built using a mix of open source technologies like Scala, Scoobi, Hive, Hadoop and proprietary tech like Teradata and MicroStrategy. Monitoring Anomalies in the Experimentation Platform 3
  • 4. High Level Flow Monitoring Anomalies in the Experimentation Platform 4
  • 5. Anomalies with Experiments • Traffic corruption – Traffic between test and control is skewed by UID/GUID. • Tag corruption – Data loss/corrupted during logging & transfer to HDFS. • GUID Reset – Browser cookie. • Cache refresh - eBay application servers maintain caches of experiment configurations. A software or hardware glitch can cause corruption of cache. Monitoring Anomalies in the Experimentation Platform 5
  • 6. Monitoring Anomalies • Identify & Categorize anomalies within experiments using Teradata/Hive. • Store identified anomalies in HDFS and route to InfluxDB (TSDB) •Visualize using Grafana. •I was introduced to Grafana through SpaceX tweet by Torkel. Monitoring Anomalies in the Experimentation Platform 6
  • 7. Reason we chose Grafana • Visually pleasing graphs. • Easy setup. • In Built Query Editor (SQL & UI) • Instantly change dashboards using duplicate dashboards/panels feature. • InfluxDB for its ability to be setup in minutes when compared to Graphite/Prometheus . • Entire pipeline took couple of days. Monitoring Anomalies in the Experimentation Platform 7
  • 8. Home Page Monitoring Anomalies in the Experimentation Platform 8
  • 9. DrillDown Monitoring Anomalies in the Experimentation Platform 9
  • 10. Search Monitoring Anomalies in the Experimentation Platform 10
  • 11. Scale • InfluxDB (v 0.11-1) is installed on a single node with 45 GB of memory. • Grafana (v 3.0.2) is installed on a single node with 45 GB of memory. • 2000 points are ingested daily which is minuscule. • Currently have around 10 months of historical data. Monitoring Anomalies in the Experimentation Platform 11
  • 12. Grafana <3 is spreading. • Performance analysis of MapReduce jobs by Experimentation platform with job counters. • Monitor Elastic Search Cluster (10k+ points per second). • Anomaly detection in Tracking data (10k+ points per minute). • Each use case stores data in InfluxDB. Monitoring Anomalies in the Experimentation Platform 12
  • 13. Thank You Monitoring Anomalies in the Experimentation Platform 13