http://chinesemilitaryreview.blogspot.com/2012/01/plaafs-j-10a-refueling-from-h-6u-badger.html

1
• Motivation
• Data flow update types
• Quantitative/Qualitative metrics
• Update strategies
• Evaluation
• Conclusion

http://www.gilaberttax.com/2013/04/16/targetedpartnership-tax-allocations/

2
Gartner, “Big Data,”
http://www.gartner.com/itglossary/big-data/

Twitter Storm

3
http://smartgrid.usc.edu
“The Smart Grid Explained, The Hype and The Promise”, WESCO, FMEA/FMPA Conference 2009

4
• Mission-critical data flows cannot suffer downtime
– How can continuous dataflow applications be updated with minimal disruption?

• Evaluating dynamic updates
– Performance impact
  • Throughput, latency
– Consistency
  • Data loss
  • Reproducibility

5

http://www.ipandora.net/2009/08/09/pray-steadfastly/
• Formalize the different types of data flow update needs.
• Identify qualitative and quantitative metrics to be considered when designing update strategies
• Introduce five different data flow update strategies and analytically characterize their performance metrics
• Implement a consistent, low-latency update strategy in the Floe continuous dataflow engine and evaluate it against a simple update strategy for a motivating application from the Los Angeles power grid project

http://www.flickr.com/photos/dhammakaya/7095451689/

6
• A continuous data flow τ(Ƿ,С) is a directed graph
– Ƿ: the set of processors
– С: the set of directed edges (channels) connecting processors
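
A minimal sketch of this definition in Python may make it concrete. The class name, method names, and the three-processor topology are illustrative assumptions; the slides do not show the engine's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Dataflow:
    """A continuous data flow: a set of processors plus directed channels."""
    processors: set[str] = field(default_factory=set)
    channels: set[tuple[str, str]] = field(default_factory=set)  # (source, sink)

    def add_channel(self, source: str, sink: str) -> None:
        # A channel may only connect processors that belong to the graph.
        if source not in self.processors or sink not in self.processors:
            raise ValueError(f"unknown processor in channel {source} -> {sink}")
        self.channels.add((source, sink))

    def successors(self, processor: str) -> set[str]:
        # Processors that receive messages directly from `processor`.
        return {dst for src, dst in self.channels if src == processor}


# Hypothetical topology, chosen only for illustration.
flow = Dataflow(processors={"P1", "P2", "P3"})
flow.add_channel("P1", "P2")
flow.add_channel("P1", "P3")
```

Keeping processors and channels as separate sets mirrors the later slides, where an update may touch one set, the other, or both.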

7
• Processor update
• Channel update
• Independent sub-graph update
• Connected sub-graph update

8
[Figure: data flow graph with processors P1 to P6; P2++ marks the updated version of P2]

• Updates to one or more processors
• | Ƿ | remains constant
• С remains constant

9
[Figure: data flow graph with processors P1 to P6]

• Change in number of channels or connectivity
• No changes to processors

10
[Figure: data flow graph with processors P1 to P6 and an updated processor P2++]

• Updates to one or more processors and channels
– No change in number of processors
– Channel connectivity change, Channel addition/removal
11
[Figure: data flow graph with processors P1 to P6 and three nodes labeled P2++]

• A connected sub-graph in the data flow is replaced by another connected sub-graph
12
• Quantitative
– Refresh latency
– Lag latency
– Throughput
– Message loss

• Qualitative
– Consistency
– Interleaved vs Delineated
http://theculturevulture.co.uk/blog/reviews/what-happened-at-whats-next/

13
• Refresh latency
– Time between the update start and the first message emitted by the new workflow component

• Lag latency
– Time between the update start and the time at which the last message from the old workflow is emitted

• Throughput
– Drop in message throughput at update time

• Message loss
– Are messages lost? If so, how many?
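
The two latency definitions can be sketched as small functions over emission timestamps. This is an illustration under assumptions: the function names and the sample timestamps are made up, and the sketch assumes the new version emits at least one message.

```python
def refresh_latency(update_start: float, new_emit_times: list[float]) -> float:
    """Time from update start to the first message emitted by the new version."""
    return min(new_emit_times) - update_start


def lag_latency(update_start: float, old_emit_times: list[float]) -> float:
    """Time from update start to the last message emitted by the old version."""
    # Only emissions at or after the update start count; none means zero lag.
    after = [t for t in old_emit_times if t >= update_start]
    return max(after) - update_start if after else 0.0


# Hypothetical timestamps (seconds); the update starts at t = 10.
update_start = 10.0
old_emits = [8.0, 9.5, 10.5, 12.0]
new_emits = [11.0, 11.5, 13.0]

refresh = refresh_latency(update_start, new_emits)
lag = lag_latency(update_start, old_emits)
```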

14
• Consistency
– Is each message processed consistently through a single version of the data flow?

• Interleaved & Delineated
– Let t_f be the time at which the first message processed by τ_{s+1} is emitted, and t_l the time at which the last message processed by τ_s is emitted
– Delineated if t_f > t_l; interleaved otherwise
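
As a small illustration, the delineation test reduces to comparing the two emission times defined above. The function name and sample timestamps are assumptions made for this sketch.

```python
def is_delineated(old_emit_times: list[float], new_emit_times: list[float]) -> bool:
    """True if every old-version output precedes every new-version output."""
    t_l = max(old_emit_times)  # last message emitted by the old data flow
    t_f = min(new_emit_times)  # first message emitted by the new data flow
    return t_f > t_l


# Hypothetical timestamps: clean cut-over vs. overlapping outputs.
clean_cut = is_delineated([1.0, 2.0, 3.0], [4.0, 5.0])
overlapping = is_delineated([1.0, 2.0, 6.0], [4.0, 5.0])
```

In the second call an old-version message is emitted at time 6, after new-version output has already begun, so the outputs interleave.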

15
• Naïve Consistent Lossy update (NCL)
• Naïve Consistent High-latency update (NCH)
• High-Throughput Inconsistent update (HTI)
• Message Versioned Consistent update (MVC)
• Path Versioned Consistent update (PVC)

https://www.ubat.com/blog/do-i-really-need-a-business-plan/
16
[Figure: data flow graph with processors P1 to P6; the input stream is paused]

• Pause the input stream, terminate the data flow, deploy the new data flow, resume the workflow
– Consistent
– Delineated
– Lag latency = 0
– Refresh latency = Deployment time + Min(wave head time)
– Throughput = 0, starting at the update start time and lasting for the duration of the refresh latency

17
[Figure: data flow graph with processors P1 to P6; the input stream is paused and in-flight messages are flushed]

• Pause the input stream, flush in-flight messages (TTLold), terminate the data flow, deploy the new data flow, resume the workflow
– Consistent
– Delineated
– Refresh latency = DT + TTLold + Min(wave head time)
– Lag latency = TTLold
– No message loss
– Throughput goes to 0
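
The latency formulas on the two pause-based slides can be compared side by side. The sketch below takes the formulas at face value and assumes, from the order of the strategy list, that the terminate-without-flush slide is NCL and the flush slide is NCH. "Wave head time" is not defined in the slides, so it is treated as a given list of numbers, and all sample values are made up.

```python
def ncl_latencies(deploy_time: float, wave_head_times: list[float]) -> dict[str, float]:
    """Pause, terminate, deploy, resume: in-flight messages are dropped."""
    return {
        "refresh": deploy_time + min(wave_head_times),
        "lag": 0.0,  # the old data flow emits nothing after the update starts
    }


def nch_latencies(deploy_time: float, ttl_old: float,
                  wave_head_times: list[float]) -> dict[str, float]:
    """Pause, flush for ttl_old, terminate, deploy, resume: no message loss."""
    return {
        "refresh": deploy_time + ttl_old + min(wave_head_times),
        "lag": ttl_old,  # old outputs keep arriving while the flush drains
    }


# Hypothetical numbers, in seconds.
ncl = ncl_latencies(deploy_time=5.0, wave_head_times=[0.5, 0.75])
nch = nch_latencies(deploy_time=5.0, ttl_old=2.0, wave_head_times=[0.5, 0.75])
```

Under these formulas the flush costs exactly TTLold of additional refresh latency, which is the price of avoiding message loss.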
18
[Figure: data flow graph with processors P1 to P6]

• Perform in-place updates upon request
– Inconsistent
– Interleaving messages
– Low latencies (bounds are derived per update type)

19
[Figure: data flow graph with processors P1 to P6 and a second version of P4; annotation "Update current version"]

• Tags messages at the sources
• Message versions are used to find the correct processor/channel/sub-graph
– Consistent
– Interleaved
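
A minimal sketch of version tagging, assuming a design in which each processor keeps one implementation per data flow version and picks the newest one that is not newer than the message's tag. All class names and the string-processing logic are hypothetical and are not taken from Floe.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    payload: str
    version: int  # data flow version stamped at the source


class Source:
    def __init__(self) -> None:
        self.current_version = 1

    def emit(self, payload: str) -> Message:
        # Every message is tagged with the version current at emission time.
        return Message(payload, self.current_version)


@dataclass
class VersionedProcessor:
    """Holds one implementation per data flow version in which it changed."""
    logic: dict[int, Callable[[str], str]] = field(default_factory=dict)

    def process(self, msg: Message) -> Message:
        # Use the newest implementation that is not newer than the message.
        usable = [v for v in self.logic if v <= msg.version]
        impl = self.logic[max(usable)]
        return Message(impl(msg.payload), msg.version)


source = Source()
parse = VersionedProcessor({1: lambda s: s.strip()})

old_msg = source.emit(" Hi ")               # tagged with version 1
parse.logic[2] = lambda s: s.strip().lower()  # deploy the updated logic
source.current_version = 2                  # "update current version"
new_msg = source.emit(" Hi ")               # tagged with version 2

old_out = parse.process(old_msg)  # still handled by version 1
new_out = parse.process(new_msg)  # handled by version 2
```

The old message is processed after the update was deployed, yet it still runs through version 1 end to end. That is how outputs can interleave while each message stays consistent.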
20
• Extension of MVC
• Each message is tagged with the path it has taken so far
• Dispatch a message to the new version if it has been processed either by the new version or only by components present in both the old and new versions of the workflow
– Consistent
– Interleaved
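
The dispatch rule can be sketched as a function over the recorded path. This is one reading of the rule, not the engine's implementation: the function name, the component labels, and the representation of a path as a list of component names are all assumptions.

```python
def dispatch_version(path: list[str], shared: set[str], new_only: set[str]) -> str:
    """Decide which version of the next component should receive a message.

    path     -- components the message has already passed through
    shared   -- components present in both the old and new data flow
    new_only -- components that exist only in the new data flow
    """
    # Already touched by the new version: keep it on the new version.
    if any(component in new_only for component in path):
        return "new"
    # Seen only shared components so far: safe to enter the new version.
    if all(component in shared for component in path):
        return "new"
    # Otherwise it passed through an old-only component: stay on the old version.
    return "old"


# Hypothetical update in which P2 is replaced by P2++.
shared = {"P1", "P3"}
new_only = {"P2++"}

via_shared = dispatch_version(["P1"], shared, new_only)
via_old = dispatch_version(["P1", "P2"], shared, new_only)
via_new = dispatch_version(["P1", "P2++"], shared, new_only)
```

Compared with tagging only at the sources, this lets a message that is already in flight switch to the new version as long as nothing it has touched so far differs between versions.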

21
• Implemented MVC in the Floe [1] continuous data flow engine.
• Compared MVC against the Naïve Consistent Lossy update.
• Used the message context as a carrier of the data-flow version

[Figure: Floe message layout: Key, Properties<K,V>, Payload]

[1] https://github.com/usc-cloud/floe
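
A rough sketch of carrying the data-flow version in a message's properties, loosely modeled on the three fields named on the slide. This is not Floe's API: the class, the helper functions, and the property name "dataflow.version" are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any

VERSION_KEY = "dataflow.version"  # hypothetical property name, not taken from Floe


@dataclass
class SketchMessage:
    key: str
    payload: Any
    properties: dict[str, str] = field(default_factory=dict)


def tag_version(msg: SketchMessage, version: int) -> SketchMessage:
    # The version travels with the message instead of living in the payload.
    msg.properties[VERSION_KEY] = str(version)
    return msg


def version_of(msg: SketchMessage, default: int = 1) -> int:
    # Untagged messages are treated as belonging to the default version.
    return int(msg.properties.get(VERSION_KEY, default))


tagged = tag_version(SketchMessage(key="meter-42", payload="kwh=3.1"), 2)
untagged = SketchMessage(key="meter-43", payload="kwh=2.7")
```

Keeping the version in the properties map leaves the payload untouched, so processor logic does not need to know about versioning.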
22
• Update processor “Parse” to “Parse++”

23
24
• Online updates to mission-critical continuous data flows are an important problem space.
• Formalized and analyzed
– Update models
– Evaluation metrics
– Update strategies and their trade-offs

• Empirically evaluated the MVC and NCL update strategies.

25
http://thesciencepresenter.wordpress.com/category/behaviour-management/
26

Escience2013-Continuous Data Flow Update Strategies for Mission-Critical Applications

  • 2. • • • • • • Motivation Data flow update types Quantitative/Qualitative metrics Update strategies Evaluation Conclusion http://www.gilaberttax.com/2013/04/16/targetedpartnership-tax-allocations/ 2
  • 4. http://smartgrid.usc.edu “The Smart Grid Explained, The Hype and The Promise”, WESCO, FMEA/FMPA Conference 2009 4
  • 5. • Mission critical data flows cannot suffer downtime – How to update continuous dataflow applications with minimal disruption ? • Evaluating dynamic update. – Performance impact • Throughput , Latency – Consistency • Data loss • Reproducibility 5 http://www.ipandora.net/2009/08/09/pray-steadfastly/
  • 6. • Formalize different types of data flow updates needs. • Identify qualitative and quantitative metrics to be considered when designing update strategies • Introduce five different data flow strategies and analytically characterize their performance metrics • Implement a consistent, low latency update strategy in Floe continuous dataflow engine and evaluate it against a simple update strategy for a motivating application from Los Angeles power grid project http://www.flickr.com/photos/dhammakaya/7095451689/ 6
  • 7. • Continuous data flow τ(Ƿ,С) is a directed graph – Ƿ set of processors – С set of directed edges(channels) connecting processors 7
  • 8. • • • • Processor update Channel update Independent sub graph update Connected sub graph update 8
  • 9. P3 P1 P5 P2 P4 P6 P2++ • Updates to one or more processors • | Ƿ | remains constant • С remains constant 9
  • 10. P3 P1 P5 P2 P4 P6 • Change in number of channels or connectivity • No changes to processors 10
  • 11. P3 P1 P5 P2 P4 P6 P2++ • Updates to one or more processors and channels – No change in number of processors – Channel connectivity change, Channel addition/removal 11
  • 12. P3 P1 P5 P2 P4 P2++ P6 P2++ P2++ • Connected sub-graph in data flow is replaced by another connected sub-graph 12
  • 13. • Quantitative – – – – Refresh latency Lag latency Throughput Message loss • Qualitative – Consistency – Interleaved vs Delineated http://theculturevulture.co.uk/blog/reviews/what-happened-at-whats-next/ 13
  • 14. • Refresh latency – Time between update start and first message from the new workflow component • Lag latency – Time between update start and time at which last message from the old work flow is emitted. • Throughput – Message throughput drop at update time • Message loss – Is there a message loss ? How many ? 14
  • 15. • Consistency – Does message consistently processed through a one version of data flow ? • Interleaved & Delineated • Let tf be the first message processed and emitted from τs+1 and tl be the last message processed and emitted from τs • Delineated if tf > tl 15
  • 16. • • • • • Native Consistent Lossy update (NCL) Native Consistent High latency update (NCH) High-Throughput Inconsistent Update (HTI) Message Versioned Consistent Update (MVC) Path Versioned Consistent Update (PVC) https://www.ubat.com/blog/do-i-really-need-a-business-plan/ 16
  • 17. P3 P5 P2 P4 Pause P1 P6 • Pause input stream , terminate dataflow , deploy new data flow, resume workflow – – – – – Consistent Delineated Lag latency = 0 Refresh latency = Deployment time + Min(wave head time) Throughput = 0 ;starting at update start time for a duration of refresh latency 17
  • 18. P5 P3 Flush Pause P1 P2 P4 P6 • Pause input stream , flush on the fly messages (TTLold), terminate dataflow , deploy new data flow, resume workflow – – – – – – Consistent Delineated Refresh latency = DT + TTLold + Min(wave head time) Lag Latency = TTLold No Message loss Throughput goes to 0 18
  • 19. P3 P1 P5 P2 P4 P6 • Perform in place updates upon request – Inconsistent – Interleaving messages – Low latencies (bounds are derived per update type) 19
  • 20. [Figure: data flow graph with processors P1 to P6 and a second version of P4; label "Update current version"]
      • Messages are tagged with a version at the sources
      • Message versions are used to find the correct processor/channel/sub-graph
        – Consistent
        – Interleaved
  • 21.
      • Extension of MVC
      • Each message is tagged with the path it has taken so far
      • A message is dispatched to the new version if it was processed through the new version, or if it was processed only through components present in both the new and old versions of the workflow
        – Consistent
        – Interleaved
  • 22.
      • Implemented MVC in the Floe [1] continuous data flow engine
      • Compared MVC against the Naïve Consistent Lossy (NCL) update
      • Used the message context as the carrier of the data flow version
      [Figure: Floe message with fields Key, Properties<K,V>, Payload]
      [1] https://github.com/usc-cloud/floe
  • 23.
      • Update processor “Parse” to “Parse++”
  • 24.
  • 25.
      • Online updates to mission-critical continuous data flows are an important problem space.
      • Formalized and analyzed
        – Update models
        – Evaluation metrics
        – Update strategies and their trade-offs
      • Empirically evaluated the MVC and NCL update strategies.
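
The quantitative metrics on slide 14 and the delineation test on slide 15 can be illustrated with a small sketch. It assumes a hypothetical trace of emitted messages, each recorded with an emission time and the data flow version that processed it; the `Emit` record, the function names, and the sample numbers are illustrative assumptions, not part of Floe or the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Emit:
    """One emitted message in a hypothetical trace (not a Floe type)."""
    time: float    # emission timestamp, in seconds
    version: str   # "old" or "new": the data flow version that processed it


def refresh_latency(trace: List[Emit], update_start: float) -> Optional[float]:
    """Time from update start to the first message emitted by the new version."""
    new_times = [e.time for e in trace if e.version == "new"]
    return min(new_times) - update_start if new_times else None


def lag_latency(trace: List[Emit], update_start: float) -> float:
    """Time from update start to the last message emitted by the old version.

    Returns 0.0 when the old version emits nothing after the update starts.
    """
    old_times = [e.time for e in trace
                 if e.version == "old" and e.time >= update_start]
    return max(old_times) - update_start if old_times else 0.0


def is_delineated(trace: List[Emit]) -> bool:
    """Delineated if the first new emission (t_f) is after the last old one (t_l)."""
    new_times = [e.time for e in trace if e.version == "new"]
    old_times = [e.time for e in trace if e.version == "old"]
    if not new_times or not old_times:
        return True  # nothing to interleave with
    return min(new_times) > max(old_times)


if __name__ == "__main__":
    # Made-up trace: the update starts at t = 10.0 s.
    trace = [Emit(9.0, "old"), Emit(10.5, "old"), Emit(11.0, "old"),
             Emit(12.0, "new"), Emit(12.5, "new")]
    print(refresh_latency(trace, 10.0), lag_latency(trace, 10.0),
          is_delineated(trace))
```

Adding an old-version emission after the first new-version emission to the same trace makes `is_delineated` return `False`, which corresponds to the interleaved case.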

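The Message Versioned Consistent (MVC) strategy on slides 20 and 22 tags each message with a data flow version at the source and uses that tag to select the processor version. A minimal in-memory sketch follows. The `Message` fields mirror the key/properties/payload layout shown on slide 22, but every class name, the `"dataflow.version"` property key, and the fallback rule for processors that did not change are assumptions made for illustration, not Floe's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Message:
    """Hypothetical message: key, properties map, payload."""
    key: str
    payload: str
    properties: Dict[str, str] = field(default_factory=dict)


class Source:
    """Tags every outgoing message with the current data flow version."""

    def __init__(self) -> None:
        self.current_version = 1

    def emit(self, key: str, payload: str) -> Message:
        return Message(key, payload,
                       {"dataflow.version": str(self.current_version)})


class VersionedProcessor:
    """Keeps one implementation per data flow version it was updated in."""

    def __init__(self) -> None:
        self.impls: Dict[int, Callable[[str], str]] = {}

    def register(self, version: int, fn: Callable[[str], str]) -> None:
        self.impls[version] = fn

    def process(self, msg: Message) -> Message:
        version = int(msg.properties["dataflow.version"])
        # Assumed rule: use the newest implementation registered at or
        # before the message's version, so unchanged processors keep working.
        usable = max(v for v in self.impls if v <= version)
        return Message(msg.key, self.impls[usable](msg.payload),
                       dict(msg.properties))


if __name__ == "__main__":
    parse = VersionedProcessor()
    parse.register(1, lambda p: "parse:" + p)     # stands in for "Parse"
    parse.register(2, lambda p: "parse++:" + p)   # stands in for "Parse++"

    source = Source()
    old_msg = source.emit("k1", "a")
    source.current_version = 2                    # update current version
    new_msg = source.emit("k2", "b")

    # Old and new messages may interleave, but each uses a single version.
    print(parse.process(new_msg).payload, parse.process(old_msg).payload)
```

Because the version travels with the message, old and new messages can be in flight at the same time without any message being processed by a mix of versions, which matches the "consistent, interleaved" characterization on slide 20.
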
Editor's Notes

  1. The story behind this paper is somewhat interesting. It started during last winter break, when I began implementing a dynamic update for our in-house continuous data flow engine. After the winter break I met with Yogesh and one of my colleagues in the lab who maintained Floe at that time, and we had a disagreement regarding the consistency provided by my implementation. In that discussion we realized that there can be different dynamic update types for distributed continuous data flows, with different trade-offs. This paper is the result of looking into this problem in detail.
  2. Motivation behind the need for dynamic updates to continuous data flows. Different possible types of updates to data flows. Identify metrics to evaluate dynamic data flow updates. Introduce a set of update strategies that offer different performance/quality trade-offs for different update types. Present empirical evaluation results for some update strategies. Finish the presentation with a conclusion.
  3. Big data: high-volume, high-variety, high-velocity data assets. Initial efforts focused on batch processing systems. Cyber-physical systems, sensor networks, and social network streams need data stream processing systems. This is where continuous data flows come into the picture.
  4. Power grids are transforming into smart grids. Smart meters allow electricity consumption events to be transferred in near real time. Intelligent management: demand response optimization predicts and forecasts power grid demand and allows corrective measures to be taken if demand > supply. USC acts as a micro-grid test bed to evaluate forecasting models and curtailment techniques. It processes data streams from over 100 buildings and 50k sensors to measure power usage, equipment status, ambient temperature, etc. Continuous data flows: read data, parse data, extract and validate readings, annotate data, and insert it into the RDF storage used by the smart grid web portal; the parsed data is also directed to an analytics model that does energy forecasting. These trigger actions.
  5. Those workflows can't suffer downtime. They need updates to improve them or to fix issues; for example, the parsing/annotating logic needs to change when sensor streams change or get updated. User needs: no message loss; ordering of new/old messages; reproducibility of data (an update should not affect the reproducibility property). Some applications, such as the web portal, might accept delay, while facility management might not. Shutting down and restarting the data flow versus on-the-fly updates.
  6. I won't bore you by going through all the formalism and theoretical bounds. Rather, I'll go through examples and give an intuition that will be useful when you read the paper.
  7. A continuous data flow is a directed graph with processor nodes and channel edges. Processors do the data processing, while channels connect processors and carry data between them.
  8. We define four types of updates that can be made to a continuous data flow.
  9. A processor update is defined as an update to one or more processors without changing the number of processors, the number of channels, or the channel connectivity.
  10. A channel update changes either the number of channels in the data flow or their connectivity.
  11. Combination of the previous update types. Update one or more processors and channels. No change in the number of processors. No Change in
  12. In a connected sub-graph update, a connected sub-graph in the data flow is replaced by another connected sub-graph.
  13. We identify and formalize quantitative and qualitative metrics that can be used to evaluate dynamic data flow updates. The quantitative metrics are refresh latency (RL), lag latency (LL), throughput (T), and message loss (ML); the qualitative metrics are consistency and interleaved vs. delineated.
  14. Refresh latency: how fast do we see the effect of the update? Lag latency: how long does it take to flush messages from the old workflow? Throughput: is there an impact on throughput? Message loss: does this update strategy cause
  15. Consistency: is the data reproducible? Delineated: can we draw a line between new messages and old messages such that no old messages are emitted from the system after we see the first new message?
  16. We introduce five data flow update strategies that can be implemented with different trade-offs, and we analyze the evaluation metrics against
  17. Pause the input stream, terminate the data flow immediately, deploy the new data flow, and then resume the workflow. Consistent: we terminate the old data flow immediately upon request, and the messages still being processed in the data flow are lost, so every message is processed by either the old version or the new version of the data flow. Delineated: since we terminate the data flow together with its in-flight messages, no old messages are emitted from the data flow after the new data flow is deployed. Lag latency is zero since we terminate the data flow immediately after the update request. Refresh latency