Impatience is a Virtue:
Revisiting Disorder in High-Performance
Log Analytics
Badrish Chandramouli, Jonathan Goldstein, Yinan Li
Microsoft Research
Disordered Data Processing
• Big data systems frequently collect log and telemetry data from
machines, sensors, devices, apps, browsers, …
• Disorder is common in such logs, due to:
• Network delays
• Intermittent machine failures
• Periods of poor connectivity
• Race conditions during log aggregation
• …
• Increasing demand for real-time analysis on such streams
• Examples: Microsoft Trill, Spark Streaming, Google Cloud Dataflow,
Apache Flink
Real Workload Analysis
• Event time: the logical time at which the event occurs.
• Processing time: the time at which the event is ingested into a streaming engine.
[Figure: disorder in two real workloads. CloudLog: chaotic at fine granularity. AndroidLog: chaotic at coarse granularity.]
Disordered Data Processing in Trill
• Trill is a high-performance query processor for streaming analytics
• Widely used in Microsoft products (Azure Stream Analytics, Bing, Office, Halo)
• Highly optimized implementation (columnar storage, code generation, etc…)
• All operators are in-order operators
• Side note: you can now download Trill binaries at http://aka.ms/trill
• Our Goal: make Trill efficiently process out-of-order streams
• Keep using high-performance implementation of in-order operators
• High throughput, low latency, low memory usage
Key Challenges and Solutions
• How to sort streams efficiently?
• Impatience sort: online Patience sort
• How to produce good streaming query plans with sorting operators?
• Sort-as-needed execution strategy: push down order-insensitive operators
• How to cope more flexibly with the latency-completeness tradeoff?
• Impatience framework: deliver early results without losing late events
Impatience Sort:
Problem Definition and Performance Requirements
• Online sorting operator
• Data stream consists of data events and punctuations
• When receiving a punctuation with a timestamp T, sort all events whose
timestamps are less than or equal to T and output the sorted stream.
• Performance requirements:
• Adaptive to sortedness
• Efficient incremental sorting
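To make the punctuation semantics concrete, here is a minimal baseline sketch in Python, assuming events are represented by their numeric timestamps only. The class name `HeapOnlineSorter` and its methods `insert` and `punctuate` are hypothetical names chosen for illustration, not Trill APIs. A binary heap sorts incrementally, but each released event still costs a logarithmic-time pop no matter how ordered the input already is, so it is not adaptive to sortedness.

```python
import heapq


class HeapOnlineSorter:
    """Baseline online sorter: buffer events in a min-heap (illustrative only)."""

    def __init__(self):
        self._heap = []

    def insert(self, timestamp):
        # Buffer a data event until a punctuation releases it.
        heapq.heappush(self._heap, timestamp)

    def punctuate(self, T):
        # Release, in sorted order, every buffered event with timestamp <= T.
        out = []
        while self._heap and self._heap[0] <= T:
            out.append(heapq.heappop(self._heap))
        return out


sorter = HeapOnlineSorter()
for t in [2, 6, 5, 1, 2]:
    sorter.insert(t)
first_batch = sorter.punctuate(2)   # events with timestamp <= 2
for t in [4, 3, 7, 4, 8]:
    sorter.insert(t)
second_batch = sorter.punctuate(4)  # events with timestamp <= 4
```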
Existing sorting algorithms fall short of
at least one of the two requirements
[Figure: online sorting example. Input stream: 2 6 5 1 2 4 3 7 4 8 ∞. Output stream: 1 2 2 3 4 4 5 6 7 8 ∞. Candidate online sorting operator: Impatience sort.]
Background on Patience Sort
• Offline sort inspired by the British card game of Patience (Solitaire)
• Two phases
• Partition phase: for each element, place it into the first sorted run whose last
element is less than or equal to the current element,
or if such a run does not exist, create a new run
• Merge phase: merge all sorted runs
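The two phases can be sketched in Python as follows, assuming numeric timestamps. The function names `patience_partition` and `patience_sort` are illustrative, and the merge uses the standard library's `heapq.merge` rather than whatever merge strategy the authors' implementation uses. The sketch relies on the fact that the run tails stay in strictly decreasing order, so the first run whose tail is less than or equal to the current element can be found by binary search.

```python
import bisect
import heapq


def patience_partition(timestamps):
    """Partition phase: append each element to the first run whose tail is <= it."""
    runs = []       # each run is a non-decreasing list
    neg_tails = []  # negated run tails; increasing, so bisect can search them
    for t in timestamps:
        i = bisect.bisect_left(neg_tails, -t)  # first run with tail <= t
        if i == len(runs):
            runs.append([t])       # no such run exists: create a new one
            neg_tails.append(-t)
        else:
            runs[i].append(t)
            neg_tails[i] = -t
    return runs


def patience_sort(timestamps):
    """Merge phase: k-way merge of all sorted runs."""
    return list(heapq.merge(*patience_partition(timestamps)))


runs = patience_partition([2, 6, 5, 1, 4, 3, 7, 8])
```

For the input 2 6 5 1 4 3 7 8 this produces four runs, and the binary search is what gives the logarithmic run selection cost.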
[Figure: partitioning the input 2 6 5 1 4 3 7 8 into four sorted runs (Run 1 to Run 4).]
# Runs: k = O(√n) (expected, for random input)
Run selection cost: O(log k)
Partition cost: O(n log k)
Sorting cost: O(n log k)
Why Patience Sort?
• Reason 1: Patience sort is naturally adaptive to many common out-of-
order patterns appearing in logs
• If input array is generated by interleaving d sorted runs, we have k ≤ d.
• If there are d natural runs in an input array, we have k ≤ d.
• If there are d distinct values of timestamps in input array, we have k ≤ d.
• Reason 2: its merge-based nature implies a potential solution for
incremental sorting
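The bounds in Reason 1 can be checked on small synthetic inputs. The sketch below is illustrative only: `count_patience_runs` is a hypothetical helper that tracks just the run tails, and the two inputs are made up for the example (three sorted sources interleaved round-robin, and a stream with four distinct timestamp values).

```python
import bisect


def count_patience_runs(timestamps):
    """Return k, the number of runs the Patience sort partition phase creates."""
    neg_tails = []  # negated run tails, kept in increasing order
    for t in timestamps:
        i = bisect.bisect_left(neg_tails, -t)  # first run with tail <= t
        if i == len(neg_tails):
            neg_tails.append(-t)
        else:
            neg_tails[i] = -t
    return len(neg_tails)


# d = 3 sorted sources, interleaved round-robin into one stream.
sources = [list(range(0, 100)), list(range(50, 150)), list(range(-30, 70))]
interleaved = [s[i] for i in range(100) for s in sources]
k_interleaved = count_patience_runs(interleaved)

# d = 4 distinct timestamp values in an otherwise disordered stream.
few_distinct = [(i * 7) % 4 for i in range(1000)]
k_distinct = count_patience_runs(few_distinct)
```

In both cases k stays at or below d regardless of the stream length, which is why the O(n log k) sorting cost is low on such inputs.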
Impatience Sort
• A variant of Patience sort that supports online (incremental) sorting
• Create sorted runs as we receive data
• When we receive a punctuation with timestamp T
• For each sorted run, remove all events whose timestamps ≤ T.
• Merge all removed subsequences and output the merged results.
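A minimal Python sketch of these steps, assuming numeric timestamps. The class name `ImpatienceSorter` and its members are hypothetical, and the sketch omits the optimizations described in the paper. One further assumption: an event arriving at or before the latest punctuation is dropped here, which is a policy choice of this sketch rather than something the slides specify.

```python
import bisect
import heapq
import math


class ImpatienceSorter:
    """Online Patience sort: build runs on insert, drain run prefixes on punctuation."""

    def __init__(self):
        self.runs = []       # non-decreasing lists
        self.neg_tails = []  # negated run tails, kept in increasing order
        self.low_watermark = -math.inf  # latest punctuation seen

    def insert(self, t):
        if t <= self.low_watermark:
            return False     # assumption of this sketch: drop events that are too late
        i = bisect.bisect_left(self.neg_tails, -t)  # first run with tail <= t
        if i == len(self.runs):
            self.runs.append([t])
            self.neg_tails.append(-t)
        else:
            self.runs[i].append(t)
            self.neg_tails[i] = -t
        return True

    def punctuate(self, T):
        heads, kept = [], []
        for run in self.runs:
            cut = bisect.bisect_right(run, T)  # events <= T form a prefix of the run
            if cut:
                heads.append(run[:cut])
            if cut < len(run):
                kept.append(run[cut:])         # emptied runs disappear
        self.runs = kept
        self.neg_tails = [-run[-1] for run in kept]
        self.low_watermark = max(self.low_watermark, T)
        return list(heapq.merge(*heads))       # merge the removed prefixes


sorter = ImpatienceSorter()
output = []
for t in [2, 6, 5, 1, 2]:
    sorter.insert(t)
output += sorter.punctuate(2)
for t in [4, 3, 7, 4]:
    sorter.insert(t)
output += sorter.punctuate(4)
runs_after_second_punctuation = len(sorter.runs)
sorter.insert(8)
output += sorter.punctuate(math.inf)
```

The punctuation positions in this usage are assumed for the example. Runs that become empty are discarded, so later events are compared against fewer run tails.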
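[Figure: Impatience sort example. Input stream: 2 6 5 1 2 4 3 7 4 8 ∞, partitioned into Run 1 to Run 4. Punctuations release events with timestamps ≤ 2, ≤ 4, and ≤ ∞ to the output stream.]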
Impatience Sort (continued)
• Impatience sort can gradually clean up sorted runs created by
severely delayed events → fewer sorted runs → better performance.
The number of sorted runs in Patience and Impatience sort
when sorting the CloudLog dataset
More optimizations in the paper!
Performance Evaluation: offline data
• Implemented all sort algorithms in Trill (in C#)
• Preloaded data in memory
• Single thread execution
[Figure: throughput (million events/sec) on CloudLog and AndroidLog for Impatience sort, Quicksort, Timsort, and Heapsort.]
Impatience sort takes better advantage of existing order in input data
Performance Evaluation: Online Data
[Figure: throughput (million events/sec) vs. gap between punctuations (log scale) on CloudLog and AndroidLog for Impatience sort, Quicksort, Timsort, and Heapsort.]
Impatience sort is less sensitive to frequent punctuations
More results are in the paper!
Outline
• How to sort streams efficiently?
• Impatience sort
• How to produce good streaming query plans with sorting operators?
• Sort-as-needed execution
• How to cope more flexibly with the latency-completeness tradeoff?
• Impatience framework
Optimizations on Query Plans
• Idea: sort data “only as needed” for a given query.
• Solution: push down order-insensitive operators
• Selection and projection operators
• Window operators
• Example: a hopping (sliding) window query that computes over a
one-minute window every second.
• In Trill, this is performed by adjusting timestamps:
eventTime - eventTime % hopSize
• Fewer distinct timestamp values and fewer natural runs → better sorting
performance for Impatience sort.
• Performance: up to 7X speedup
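The timestamp adjustment can be illustrated with a short Python sketch. The helper names `snap_to_hop` and `count_natural_runs` and the sample timestamps (in milliseconds, with a one-second hop) are made up for the example; only the formula comes from the slide.

```python
def snap_to_hop(event_time, hop_size):
    """Round a timestamp down to the start of its hop: eventTime - eventTime % hopSize."""
    return event_time - event_time % hop_size


def count_natural_runs(timestamps):
    """Count maximal non-decreasing contiguous segments."""
    if not timestamps:
        return 0
    return 1 + sum(1 for a, b in zip(timestamps, timestamps[1:]) if b < a)


raw = [1003, 1001, 1999, 1500, 2100, 2001, 2750, 2300]  # disordered arrivals
snapped = [snap_to_hop(t, 1000) for t in raw]

runs_before = count_natural_runs(raw)
runs_after = count_natural_runs(snapped)
distinct_before = len(set(raw))
distinct_after = len(set(snapped))
```

Disorder inside a single hop vanishes once the window operator runs before the sort, so the sort sees fewer distinct values and fewer natural runs.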
Outline
• How to sort streams efficiently?
• Impatience sort
• How to produce good streaming query plans with sorting operators?
• Sort-as-needed execution
• How to cope more flexibly with the latency-completeness tradeoff?
• Impatience framework
Impatience framework
• Impatience framework
- Add support for a user-specified set of reorder
latencies (e.g. {1 sec, 1 min, 1 hour})
- Deliver early results without losing late arrival
events
- Reduce memory usage
[Figure: low latency vs. completeness. 1 sec: 98% complete; 1 hour: 100% complete; low latency and completeness together: ?]
• Pitfalls of sort-based out-of-order data
processing
• Users are forced to make a tradeoff
between completeness and latency
• High memory usage
Impatience framework
• Partition events based on delay, e.g., {< 1 sec, < 1 min, < 1 hour}
• Inject user-provided Trill operators into the framework
• Low throughput overhead
• Reduces memory usage in certain cases
• Unmodified in-order Trill operators
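The partitioning step can be sketched in Python under explicit assumptions: delay is taken to be processing time minus event time (the framework may measure lateness differently), events are `(event_time, processing_time)` pairs in seconds, and events later than the largest reorder latency are collected separately. The function name `partition_by_delay` is hypothetical.

```python
import bisect


def partition_by_delay(events, latencies):
    """Route each event to the smallest reorder latency that exceeds its delay.

    events: iterable of (event_time, processing_time) pairs, in seconds.
    latencies: reorder latencies in ascending order, e.g. [1, 60, 3600].
    """
    partitions = [[] for _ in latencies]
    too_late = []  # delay >= the largest latency (handling is an assumption here)
    for event_time, processing_time in events:
        delay = processing_time - event_time          # assumed definition of delay
        i = bisect.bisect_right(latencies, delay)     # first latency > delay
        if i < len(latencies):
            partitions[i].append((event_time, processing_time))
        else:
            too_late.append((event_time, processing_time))
    return partitions, too_late


events = [(100, 100), (100, 130), (100, 2000), (100, 5000)]
partitions, too_late = partition_by_delay(events, [1, 60, 3600])
```

Each partition can then be sorted with its own reorder latency and fed to unmodified in-order operators, with partial results combined downstream.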
[Figure: example Impatience framework plan. Operators shown: partition, window, filter, sort, count, union, sum, with one sort/count branch per reorder latency (1 sec, 1 min, 1 hour). Legend: user-provided operator; out-of-order stream; in-order stream.]
Performance of Impatience framework
Impatience framework:
High completeness, low latency, high throughput, low memory usage!

Reorder latencies               Completeness   Latency
{1 sec, 1 min, 1 hour}          100%           ~ 1 sec
{1 sec}                         98%            ~ 1 sec
{1 hour}                        100%           ~ 1 hour
{1 sec} + {1 min} + {1 hour}    100%           ~ 1 sec

[Figure: throughput (million/sec) for the Count, SmallGroupBy, LargeGroupBy, and TopK queries under the four configurations above.]
[Figure: memory usage (MB) for the same queries and configurations.]
Conclusion
• End-to-end sort-based solution for processing disordered
streams
• Impatience sort: an efficient streaming sort operator that can take
advantage of existing order in input stream
• Sort-as-needed execution: push down order-insensitive operators
• Impatience framework: deliver early results without losing late
events
• High completeness, low latency, high throughput, low memory usage
http://aka.ms/trill
Thanks
Impatience framework
• Impatience framework
- Adds support for a set of reorder latencies
(e.g. {1 sec, 1 min, 1 hour})
- Delivers early results without losing late
arrival events
[Figure: low latency vs. completeness (1 sec: 98%; 1 hour: 100%), with five chart panels (axes 0 to 100) for a 1 min window refreshed every second.]

Impatience is a Virtue: Revisiting Disorder in High-Performance Log Analytics

  • 1. Impatience is a Virtue: Revisiting Disorder in High-Performance Log Analytics. Badrish Chandramouli, Jonathan Goldstein, Yinan Li (Microsoft Research)
  • 2. Disordered Data Processing
      • Big data systems frequently collect log and telemetry data from machines, sensors, devices, apps, browsers, …
      • Disorder is common in such logs, due to:
          • Network delays
          • Intermittent machine failures
          • Periods of poor connectivity
          • Race conditions during log aggregation
          • …
      • Increasing demand for real-time analysis on such streams
      • Examples: Microsoft Trill, Spark Streaming, Google Cloud Dataflow, Apache Flink
  • 3. Real Workload Analysis
      • Event time: the logical time at which the event occurs.
      • Processing time: the time at which the event is ingested into a streaming engine.
      • [Figures: CloudLog (chaotic at fine granularity) and AndroidLog (chaotic at coarse granularity)]
  • 4. Disordered Data Processing in Trill
      • Trill is a high-performance query processor for streaming analytics
          • Widely used in Microsoft products (Azure Stream Analytics, Bing, Office, Halo)
          • Highly optimized implementation (columnar storage, code generation, etc.)
          • All operators are in-order operators
          • Side note: you can now download Trill binaries at http://aka.ms/trill
      • Our goal: make Trill efficiently process out-of-order streams
          • Keep using the high-performance implementation of in-order operators
          • High throughput, low latency, low memory usage
  • 5. Key Challenges and Solutions
      • How to sort streams efficiently?
          • Impatience sort: online Patience sort
      • How to produce good streaming query plans with sorting operators?
          • Sort-as-needed execution strategy: push down order-insensitive operators
      • How to cope more flexibly with the latency-completeness tradeoff?
          • Impatience framework: deliver early results without losing late events
  • 6. Impatience Sort: Problem Definition and Performance Requirements
      • Online sorting operator
          • Data stream consists of data events and punctuations
          • When receiving a punctuation with a timestamp T, sort all events whose timestamps are less than or equal to T and output the sorted stream.
      • Performance requirements:
          • Adaptive to sortedness
          • Efficient incremental sorting
      • Existing sorting algorithms fall short of at least one of the two requirements
      • [Figure: online sorting (Impatience sort) turns the input stream 2 6 5 1 2 4 3 7 4 8 ∞ into the output stream 1 2 2 3 4 4 5 6 7 8 ∞]
  • 7. Background on Patience Sort
      • Offline sort inspired by the British card game of Patience (Solitaire)
      • Two phases
          • Partition phase: for each element, place it into the first sorted run whose last element is less than or equal to the current element, or if such a run does not exist, create a new run
          • Merge phase: merge all sorted runs
      • # Runs: k = O(n)
      • Run selection cost: O(log k)
      • Partition cost: O(n log k)
      • Sorting cost: O(n log k)
      • [Figure: example input 2 6 5 1 4 3 7 8 partitioned into Run 1, Run 2, Run 3, Run 4]
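The two phases on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the Trill implementation: the names `patience_partition` and `patience_sort` are hypothetical, elements are assumed to be numeric timestamps, and the merge phase simply uses the standard library's k-way merge. Run selection relies on the fact that the last elements of the runs are strictly decreasing from the first run to the last, so a binary search over them finds the first run whose last element is less than or equal to the current element.

```python
import bisect
import heapq


def patience_partition(items):
    """Partition phase: split items into ascending sorted runs."""
    runs = []
    # neg_tails[i] == -runs[i][-1]. Run tails are strictly decreasing,
    # so their negations are ascending and can be binary searched.
    neg_tails = []
    for x in items:
        # First run whose tail is <= x, i.e. whose negated tail is >= -x.
        i = bisect.bisect_left(neg_tails, -x)
        if i == len(runs):
            runs.append([x])        # no suitable run: create a new one
            neg_tails.append(-x)
        else:
            runs[i].append(x)       # append to the first suitable run
            neg_tails[i] = -x
    return runs


def patience_sort(items):
    """Merge phase: k-way merge of all sorted runs."""
    return list(heapq.merge(*patience_partition(items)))


example = [2, 6, 5, 1, 4, 3, 7, 8]
example_runs = patience_partition(example)
example_sorted = patience_sort(example)
```

On the slide's example input the sketch creates four runs, which agrees with the four runs shown in the figure.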
  • 8. Why Patience Sort?
      • Reason 1: Patience sort is naturally adaptive to many common out-of-order patterns appearing in logs
          • If the input array is generated by interleaving d sorted runs, we have k ≤ d.
          • If there are d natural runs in an input array, we have k ≤ d.
          • If there are d distinct timestamp values in the input array, we have k ≤ d.
      • Reason 2: its merge-based nature implies a potential solution for incremental sorting
  • 9. Impatience Sort
      • A variant of Patience sort that supports online (incremental) sorting
      • Create sorted runs as we receive data
      • When we receive a punctuation with timestamp T:
          • For each sorted run, remove all events whose timestamps are ≤ T.
          • Merge all removed subsequences and output the merged results.
      • [Figure: the input stream 2 6 5 1 2 4 3 7 4 8 ∞ is partitioned into Run 1 to Run 4; the output stream is emitted at the punctuations ≤ 2, ≤ 4, and ≤ ∞]
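The online behaviour described on this slide can be sketched as follows. This is a simplified illustration under stated assumptions, not the operator shipped in Trill: the class name `ImpatienceSorter` and its methods `insert` and `punctuate` are hypothetical, events are reduced to bare numeric timestamps, and the sketch assumes no event arrives with a timestamp at or below an earlier punctuation. The optimizations mentioned in the paper are not included.

```python
import bisect
import heapq


class ImpatienceSorter:
    """Online Patience sort: buffer events in sorted runs, emit on punctuation."""

    def __init__(self):
        self.runs = []       # ascending sorted runs of buffered timestamps
        self.neg_tails = []  # neg_tails[i] == -runs[i][-1], kept ascending

    def insert(self, ts):
        # Same run selection as offline Patience sort:
        # first run whose last element is <= ts, else a new run.
        i = bisect.bisect_left(self.neg_tails, -ts)
        if i == len(self.runs):
            self.runs.append([ts])
            self.neg_tails.append(-ts)
        else:
            self.runs[i].append(ts)
            self.neg_tails[i] = -ts

    def punctuate(self, t):
        """Remove every event with timestamp <= t from each run and merge them."""
        removed, kept = [], []
        for run in self.runs:
            cut = bisect.bisect_right(run, t)  # run[:cut] holds timestamps <= t
            if cut:
                removed.append(run[:cut])
            if cut < len(run):
                kept.append(run[cut:])         # emptied runs disappear entirely
        self.runs = kept
        self.neg_tails = [-run[-1] for run in kept]
        return list(heapq.merge(*removed))


sorter = ImpatienceSorter()
for ts in (2, 6, 5, 1):
    sorter.insert(ts)
out_1 = sorter.punctuate(2)
for ts in (4, 3, 7):
    sorter.insert(ts)
out_2 = sorter.punctuate(4)
sorter.insert(8)
out_3 = sorter.punctuate(float("inf"))
```

The usage at the bottom replays the slide's example stream, with punctuations at 2, 4, and ∞. Runs that are emptied by a punctuation are dropped, which is how runs created by severely delayed events get cleaned up over time.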
  • 10. Impatience Sort (continued)
      • Impatience sort can gradually clean up sorted runs created by severely delayed events → fewer sorted runs → better performance.
      • [Figure: the number of sorted runs in Patience and Impatience sort when sorting the CloudLog dataset]
      • More optimizations in the paper!
  • 11. Performance Evaluation: Offline Data
      • Implemented all sort algorithms in Trill (in C#)
      • Preloaded data in memory
      • Single-threaded execution
      • [Figure: throughput (million events/sec) on CloudLog and AndroidLog for Impatience sort, Quicksort, Timsort, and Heapsort]
      • Impatience sort takes better advantage of existing order in input data
  • 12. Performance Evaluation: Online Data
      • [Figures: throughput (million events/sec) versus gap between punctuations (log scale) on CloudLog and AndroidLog for Impatience sort, Quicksort, Timsort, and Heapsort]
      • Impatience sort is less sensitive to frequent punctuations
      • More results are in the paper!
  • 13. Outline
      • How to sort streams efficiently?
          • Impatience sort
      • How to produce good streaming query plans with sorting operators?
          • Sort-as-needed execution
      • How to cope more flexibly with the latency-completeness tradeoff?
          • Impatience framework
  • 14. Optimizations on Query Plans
      • Idea: sort data "only as needed" for a given query.
      • Solution: push down order-insensitive operators
          • Selection and projection operators
          • Window operators
      • Example: a hopping (sliding) window query that computes over a one-minute window every second.
          • In Trill, this is performed by adjusting timestamps: eventTime - eventTime % hop-size
          • Reduces the number of distinct values and the number of natural runs → better sorting performance of Impatience sort.
      • Performance: up to 7X speedup
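A small sketch of why pushing the window operator below the sort helps. It applies the timestamp adjustment from this slide (eventTime - eventTime % hop-size) to a toy input and counts how many Patience-sort runs the input needs before and after. The helper names `snap_to_hop` and `count_runs`, the millisecond timestamps, and the 1000 ms hop size are illustrative assumptions, not Trill APIs.

```python
import bisect


def snap_to_hop(event_time, hop_size):
    """Window adjustment from the slide: eventTime - eventTime % hop-size."""
    return event_time - event_time % hop_size


def count_runs(timestamps):
    """Number of sorted runs that the Patience partition phase would create."""
    neg_tails = []  # negated last element of each run, kept ascending
    for ts in timestamps:
        i = bisect.bisect_left(neg_tails, -ts)
        if i == len(neg_tails):
            neg_tails.append(-ts)   # a new run is needed
        else:
            neg_tails[i] = -ts      # ts extends an existing run
    return len(neg_tails)


# Hypothetical event times in milliseconds, disordered within each second.
raw = [1003, 1002, 1001, 2002, 2001, 2000]
snapped = [snap_to_hop(t, 1000) for t in raw]

runs_before = count_runs(raw)
runs_after = count_runs(snapped)
distinct_after = len(set(snapped))
```

In this toy input the fine-grained disorder inside each second disappears once timestamps are snapped to the hop boundary, so the sort sees fewer distinct values and fewer runs.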
  • 15. Outline
      • How to sort streams efficiently?
          • Impatience sort
      • How to produce good streaming query plans with sorting operators?
          • Sort-as-needed execution
      • How to cope more flexibly with the latency-completeness tradeoff?
          • Impatience framework
  • 16. Impatience framework
      • Pitfalls of sort-based out-of-order data processing
          • Users are forced to make a tradeoff between completeness and latency
          • High memory usage
      • Impatience framework
          • Add support for a user-specified set of reorder latencies (e.g. {1 sec, 1 min, 1 hour})
          • Deliver early results without losing late-arriving events
          • Reduce memory usage
      • [Figure: low latency versus completeness (1 sec, 98%; 1 hour, 100%)]
  • 17. Impatience framework
      • Partition events based on delay, e.g., {< 1 sec, < 1 min, < 1 hour}
      • Inject user-provided Trill operators into the framework
      • Low overhead in throughput
      • Reduces memory usage in certain cases
      • Unmodified in-order Trill operators
      • [Figure: query plan in which partition, window, and filter operators run on the out-of-order stream, followed by a sort and count per partition (1 sec, 1 min, 1 hour) whose in-order outputs are combined by union and sum; the legend distinguishes user-provided operators, out-of-order streams, and in-order streams]
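The partitioning step can be illustrated with a short sketch. It is only an approximation of the idea on this slide: the function name `partition_by_delay` is hypothetical, delay is assumed to be measured against the largest timestamp seen so far, and events that are later than the largest reorder latency are returned separately, since the slide does not say how they are handled.

```python
def partition_by_delay(timestamps, latencies):
    """Route each event to the first partition whose latency bound covers its delay.

    latencies must be ascending, e.g. [1000, 60000, 3600000] for
    {< 1 sec, < 1 min, < 1 hour} in milliseconds.
    """
    partitions = [[] for _ in latencies]
    too_late = []                # assumption: kept aside rather than dropped
    high_watermark = float("-inf")
    for ts in timestamps:
        high_watermark = max(high_watermark, ts)
        delay = high_watermark - ts   # assumption: delay relative to max seen
        for i, bound in enumerate(latencies):
            if delay < bound:
                partitions[i].append(ts)
                break
        else:
            too_late.append(ts)
    return partitions, too_late


# Hypothetical arrival order of event times, in milliseconds.
arrivals = [100000, 101000, 100500, 50000, 102000, 10000]
parts, late = partition_by_delay(arrivals, [1000, 60000, 3600000])
```

Each partition would then be sorted with its own reorder latency and fed to the unmodified in-order operators, with the per-partition results combined afterwards as in the figure.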
  • 18. Performance of Impatience framework
      • Impatience framework: high completeness, low latency, high throughput, low memory usage!
      • Completeness and latency by configuration:
          • {1 sec, 1 min, 1 hour}: completeness 100%, latency ~1 sec
          • {1 sec}: completeness 98%, latency ~1 sec
          • {1 hour}: completeness 100%, latency ~1 hour
          • {1 sec} + {1 min} + {1 hour}: completeness 100%, latency ~1 sec
      • [Figure: throughput (million/sec) for the Count, SmallGroupBy, LargeGroupBy, and TopK queries under the four configurations]
      • [Figure: memory usage (MB, log scale) for the Count, SmallGroupBy, LargeGroupBy, and TopK queries under the four configurations]
  • 19. Conclusion
      • End-to-end sort-based solution for processing disordered streams
          • Impatience sort: an efficient streaming sort operator that can take advantage of existing order in the input stream
          • Sort-as-needed execution: push down order-insensitive operators
          • Impatience framework: deliver early results without losing late events
      • High completeness, low latency, high throughput, low memory usage
      • http://aka.ms/trill
  • 21. Impatience framework
      • Impatience framework
          • Adds support for a set of reorder latencies (e.g. {1 sec, 1 min, 1 hour})
          • Delivers early results without losing late-arriving events
      • [Figure: low latency versus completeness (1 sec, 98%; 1 hour, 100%); panel labels: 1 min, refresh every second]