Advanced service-based data analytics:
concepts and designs
Hong-Linh Truong
Distributed Systems Group,
Vienna University of Technology
truong@dsg.tuwien.ac.at
dsg.tuwien.ac.at/staff/truong
ASE Summer 2014
Advanced Services Engineering,
Summer 2014, Lecture 7
Outline
 Principles of elasticity for advanced service-
based data analytics
 Data analytics within a single system
 Data analytics across multiple systems
 Composable cost evaluation
PRINCIPLES OF ELASTICITY FOR DATA
ANALYTICS
Advanced service-based data
analytics (1)
Figure: service-based data analytics for cities (e.g., including 10000+ buildings and 1000000+ sensors).
Services shown: near-realtime analytics, predictive data analytics, visual analytics, Enterprise Resource Planning, emergency management, tracking/logistics, infrastructure monitoring.
Layers and boundaries shown: Infrastructure/Internet of Things, organization-specific boundary, Internet/public cloud boundary.
Advanced service-based data
analytics (2)
Soil moisture analysis for Sentinel-1:
 A lot of input data (L0): ~2.7 TB per day
 A lot of results (L1, L2): e.g., L1 has ~140 MB per day for a grid of 1 km x 1 km
Michael Hornacek,Wolfgang Wagner, Daniel Sabel, Hong-Linh Truong, Paul Snoeij, Thomas Hahmann, Erhard Diedrich, Marcela Doubkova,
Potential for High Resolution Systematic Global Surface Soil Moisture Retrieval Via Change Detection Using Sentinel-1, IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing, April, 2012
Data-as-a-Service
and Platform-as-a-
Service in clouds
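To put the Sentinel-1 data volumes into perspective, the following back-of-the-envelope sketch converts the daily L0 volume into a sustained transfer rate and compares it with the L1 result volume. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and an even spread of data over the day; the function and variable names are illustrative only.

```python
# Back-of-the-envelope rates for the soil moisture example.
# Assumption: decimal units and data arriving evenly over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60

def sustained_rate_mb_per_s(bytes_per_day: float) -> float:
    """Average rate in MB/s needed to move a daily volume within one day."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6

l0_bytes_per_day = 2.7e12   # ~2.7 TB of input data (L0) per day
l1_bytes_per_day = 140e6    # ~140 MB of L1 results per day (1 km x 1 km grid)

l0_rate = sustained_rate_mb_per_s(l0_bytes_per_day)
reduction_factor = l0_bytes_per_day / l1_bytes_per_day

print(f"L0 sustained rate: {l0_rate:.2f} MB/s")
print(f"L0 is about {reduction_factor:.0f} times larger than L1")
```

The large gap between input and result volumes is one reason to keep the analytics close to the L0 data and to share only the results through a data service.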
Advanced service-based data
analytics -- fundamental concepts
Applications: Part A, Part B, ......, Part N
System infrastructures: Cluster, Grid, Local Cloud, Public cloud/Sky
Domains: Domain 1, Domain 2, ..., Domain n
Design questions
 Which system infrastructures are used?
 What are the functions of units?
 Which interfaces are suitable for units?
 Which programming models are used within units?
 What are the fundamental units to be used?
 How do different units interact?
 Which non-functional parameters are important and
how do we measure them?
Part = a (composite) service unit
Fundamental concepts – system
infrastructure unit
System infrastructures
• Cloud
  – Software-based Cloud system
  – Human-based Cloud system
• Cluster
• Grid
• High Performance Server
Fundamental concepts – unit
functions
Function
• Front-end/Presentation
• Data Analytics Service
• Visualization Service
• Middleware
• Enterprise Service Bus
• Publish/Subscription
• Messaging/Queuing
• Data Transfer
Fundamental concepts –
programming model within units
Programming model
• MapReduce
• MPI
• Parallel Database
• Workflow
• Other solutions
Fundamental concepts – interfaces
between units
Interface
• Standard
  – REST
  – SOAP
• APIs
  – Specific APIs
  – Standard APIs (e.g. OpenStack)
• Interaction
  – Pull
  – Push
Fundamental concepts – services
and data concerns
Service and data concerns
• Data concerns: Quality of data, Pricing, Data Right, ...
• Service concerns: QoS, Pricing, ...
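The fundamental concepts above (system infrastructure, function, programming model, interface, interaction, concerns) can be read as the attributes of a service unit. The following is a minimal sketch of such a description, not a model taken from the lecture: the class names, enum members, and the `can_talk_directly` rule are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from enum import Enum

class Infrastructure(Enum):
    CLOUD = "cloud"
    CLUSTER = "cluster"
    GRID = "grid"
    HIGH_PERFORMANCE_SERVER = "high performance server"

class ProgrammingModel(Enum):
    MAPREDUCE = "mapreduce"
    MPI = "mpi"
    PARALLEL_DATABASE = "parallel database"
    WORKFLOW = "workflow"
    OTHER = "other"

class Interface(Enum):
    REST = "rest"
    SOAP = "soap"
    SPECIFIC_API = "specific api"
    STANDARD_API = "standard api"

class Interaction(Enum):
    PULL = "pull"
    PUSH = "push"

@dataclass
class ServiceUnit:
    """Hypothetical description of one part of an analytics application."""
    name: str
    function: str
    infrastructure: Infrastructure
    programming_model: ProgrammingModel
    interface: Interface
    interaction: Interaction
    concerns: dict = field(default_factory=dict)  # e.g. quality of data, pricing

def can_talk_directly(a: ServiceUnit, b: ServiceUnit) -> bool:
    # Simplifying assumption: units interact directly only if they expose
    # the same interface style; otherwise middleware would be needed.
    return a.interface == b.interface

analytics = ServiceUnit("soil-moisture-analytics", "Data Analytics Service",
                        Infrastructure.CLUSTER, ProgrammingModel.MPI,
                        Interface.REST, Interaction.PULL)
storage = ServiceUnit("image-archive", "Data Transfer",
                      Infrastructure.CLOUD, ProgrammingModel.OTHER,
                      Interface.REST, Interaction.PULL,
                      concerns={"pricing": "per GB", "quality_of_data": 0.9})

print(can_talk_directly(analytics, storage))
```

Writing the description down this way makes the design questions concrete: each question corresponds to one field that has to be filled in for every part.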
Complex dependencies in (big)
data analytics
 More data → more computational resources (e.g. more VMs)
 More types of data → more computational models → more analytics processes
 Change quality of results
 Change quality of data
 Change response time
 Change cost
 Change types of result (form of the data output, e.g. tree, visual, story, etc.)
Figure: dependencies among Data (Data x, Data y, Data z), Computational Models, Analytics Processes, the Analytics Result, and the Quality of Result.
Hong-Linh Truong, Schahram Dustdar, "Principles of
Software-defined Elastic Systems for Big Data
Analytics", (c) IEEE Computer Society, IEEE
International Workshop on Software Defined
Systems, 2014 IEEE International Conference on
Cloud Engineering (IC2E 2014), Boston,
Massachusetts, USA, 10-14 March 2014
Complex dependencies in (big)
data analytics
Elasticity principles
can be used to
support this!
Elasticity Principles: Elasticity of
data and computational models
 Multiple types of objects from different sources with
complex dependencies, relevancies, and quality
 Different data and computational models for the same
analytics subject
 New analytics subjects can be defined and analytics
goals can be changed
 Decide/select/define/compose not only computational
models for analytics subjects but also data models
based on existing ones
Management and modeling of elasticity of data and computational
model during the analytics
Elasticity Principles: Elasticity of
data resources
 Data provided, managed and shared by different
providers
 Data associated with different concerns (cost, quality
of data, privacy, contract, etc.)
 Static data, open data, data-as-a-service, opportunistic
data (from sensors and human sensing)
 Not just centralized big data and total data ownership
Data resources can be taken into account in an elastic
manner: similar to VMs, based on their quality,
relevancy, pricing, etc.
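One way to picture treating data resources "similar to VMs" is to rank candidate resources by what they offer per unit of price and acquire them within a budget. The sketch below is only an illustration under assumed inputs: the scoring rule (quality times relevancy divided by price), the greedy strategy, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataResource:
    name: str
    quality: float    # assumed score in [0, 1]
    relevancy: float  # assumed score in [0, 1]
    price: float      # cost units per use; 0 for free/open data

def value_per_price(resource: DataResource) -> float:
    # Free resources are ranked first; otherwise use value per cost unit.
    if resource.price == 0:
        return float("inf")
    return (resource.quality * resource.relevancy) / resource.price

def select_data_resources(resources, budget):
    """Greedy selection: best value per price first, while the budget lasts."""
    chosen, spent = [], 0.0
    for resource in sorted(resources, key=value_per_price, reverse=True):
        if spent + resource.price <= budget:
            chosen.append(resource)
            spent += resource.price
    return chosen

candidates = [
    DataResource("open_data", quality=0.6, relevancy=0.5, price=0.0),
    DataResource("sensor_daas", quality=0.9, relevancy=0.9, price=5.0),
    DataResource("social_feed", quality=0.4, relevancy=0.7, price=4.0),
]
selected = select_data_resources(candidates, budget=6.0)
print([r.name for r in selected])
```

Re-running the selection when the budget or the quality scores change is what makes the set of data resources elastic rather than fixed up front.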
Elasticity Principles: Elasticity of
humans and software as computing
units
 Human in the loop to solve analytics tasks that software
cannot solve
 Human-based compute units can be scaled up/down
with different cost, availability, performance models
 Human-based compute units + software-based
compute units for executing computational models
 Elasticity controls can also be done by humans
Provisioning hybrid compute units in an elastic way for
computational/data/network tasks as well as for
monitoring/control tasks in the analytics process
Elasticity Principles: Elasticity of
quality of results
 Definition of quality of results
 Trade-offs of time, cost, quality of data, forms of
output
 Using quality of results to select suitable
computational models, data resources,
computing units
 Multi-level control for the elasticity based on
quality of results
Able to cope with changes in quality of data,
performance, cost and types of results at runtime
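Using quality of results to select computational models can be sketched as a constraint check followed by a preference. The example below is a hedged illustration, not the control mechanism from the cited work: the `Option` fields, the thresholds, and the "cheapest feasible option" preference are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Option:
    """Hypothetical combination of computational model, data and computing units."""
    name: str
    expected_time_s: float
    expected_cost: float
    expected_quality: float  # assumed score in [0, 1]

def select_option(options: List[Option], max_time_s: float,
                  max_cost: float, min_quality: float) -> Optional[Option]:
    """Return the cheapest option meeting the quality-of-result constraints."""
    feasible = [o for o in options
                if o.expected_time_s <= max_time_s
                and o.expected_cost <= max_cost
                and o.expected_quality >= min_quality]
    if not feasible:
        return None  # no option satisfies the requested trade-off
    return min(feasible, key=lambda o: (o.expected_cost, o.expected_time_s))

options = [
    Option("fast_approximation", expected_time_s=60, expected_cost=1.0, expected_quality=0.70),
    Option("full_batch_run", expected_time_s=3600, expected_cost=8.0, expected_quality=0.95),
    Option("human_review", expected_time_s=7200, expected_cost=20.0, expected_quality=0.99),
]

relaxed = select_option(options, max_time_s=4000, max_cost=10.0, min_quality=0.6)
strict = select_option(options, max_time_s=4000, max_cost=10.0, min_quality=0.9)
print(relaxed.name, strict.name)
```

Raising the required quality at runtime switches the choice from the approximation to the full run, which is the kind of change an elasticity controller would have to react to.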
WE NEED TO START FROM
DATA ANALYTICS WITHIN A
SINGLE SYSTEM
Domain A
Data analytics within a single
system
 They are complex enough but
do not meet all requirements
 In a single domain
 Tightly coupled computing
infrastructures
 E.g., in the same
cloud
 Computation and data are
close
 Several concerns can be
by-passed
Figure: Data service unit and Data Analytics Unit
Not always provisioned under the "Service Unit" model
Data analytics within a single
system
1. Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, and Michael
Stonebraker. 2009. A comparison of approaches to large-scale data analysis. In Proceedings of the 2009 ACM
SIGMOD International Conference on Management of data (SIGMOD '09), Carsten Binnig and Benoit Dageville
(Eds.). ACM, New York, NY, USA, 165-178. DOI=10.1145/1559845.1559865
http://doi.acm.org/10.1145/1559845.1559865
2. Leonardo Neumeyer, Bruce Robbins, Anish Nair, Anand Kesari: S4: Distributed Stream Computing Platform. ICDM
Workshops 2010: 170-177
3. Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel,
Prabhat, and Rob D. Ryne. 2011. Parallel index and query for large scale data analysis. In Proceedings of 2011
International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11). ACM, New
York, NY, USA, Article 30, 11 pages. DOI=10.1145/2063384.2063424 http://doi.acm.org/10.1145/2063384.2063424
4. Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGregor, Prashant J. Shenoy: A platform for scalable one-pass
analytics using MapReduce. SIGMOD Conference 2011: 985-996
5. Fabrizio Marozzo, Domenico Talia, Paolo Trunfio: A Cloud Framework for Parameter Sweeping Data Mining
Applications. CloudCom 2011: 367-374
6. Yingyi Bu, Bill Howe, Magdalena Balazinska, Michael D. Ernst: HaLoop: Efficient Iterative Data Processing on Large
Clusters. PVLDB 3(1): 285-296 (2010)
Some papers
Data analytics within a single
system – some examples
• Message Passing Interface (MPI) + Cluster-based File system
• MapReduce + Google File System
• Hadoop + HDFS
• Dryad+LINQ
• Parallel Database (SQL/NoSQL)
• Yahoo S4
• Workflow
A short, good overview in Chapter 6: Cloud Programming and Software Environments, Book: Distributed and Cloud
Computing – from Parallel Processing to the Internet of Things, Kai Hwang, Geoffrey C. Fox and Jack J Dongarra,
Morgan Kaufmann, 2012
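As a reminder of what the MapReduce programming model looks like, here is a minimal single-process sketch of its three phases (map, shuffle, reduce). It does not use Hadoop or any real framework; the function names and the sensor readings are made up for illustration.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def count_by_sensor_type(record):
    # Mapper: emit (sensor type, 1) for each reading.
    sensor_type, _value = record
    yield sensor_type, 1

def sum_counts(_key, values):
    # Reducer: add up the counts for one sensor type.
    return sum(values)

readings = [("temperature", 21.5), ("humidity", 40.0), ("temperature", 22.1)]
counts = reduce_phase(shuffle(map_phase(readings, count_by_sensor_type)), sum_counts)
print(counts)
```

In a real deployment the map and reduce functions run in parallel on the nodes that hold the data blocks, which is why the pairing with a distributed file system matters.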
Discussion time
WHY SHOULD ANALYTICS UNITS BE "CLOSE" TO DATA UNITS?
WHICH CONCERNS COULD BE IGNORED IN SINGLE-SYSTEM DATA ANALYTICS?
WHICH ISSUES DO WE NEED TO CONSIDER WHEN OUR DATA UNITS ARE IN DIFFERENT SYSTEMS?
Data analytics across multiple
systems – design choice
 Programming models
for data analytics
service
 Data service units
 Supporting middleware
units
Programming model
System Infrastructure
Interface
Data analytics across multiple systems
– programming models (1)
Static data
Figure: Input data (local input data) → Analytics → Output data (results)
Analytics: MapReduce/Hadoop, Workflow, MPI, other solutions
Running on: Servers/Cloud/Cluster
What are our design concerns?
Stockmarket
Social media
M2M
Data analytics across multiple systems
– programming models (2)
Near-realtime data
Figure: Input data → Analytics → Output data (results)
Analytics: Complex event processing, Stream data analysis, other solutions
Running on: Servers/Cloud/Cluster
What are our design concerns?
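For near-realtime data, stream data analysis typically keeps only a bounded window of recent events instead of the full input. The following sketch shows a time-based sliding-window average; it is a generic illustration rather than the API of any stream or CEP engine, and it assumes events arrive with non-decreasing timestamps and a positive window length.

```python
from collections import deque

class SlidingWindowAverage:
    """Average of the values seen within the last `window_s` seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.events = deque()  # (timestamp, value), oldest first
        self.total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Add one event and return the current window average."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        while self.events[0][0] <= timestamp - self.window_s:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)

window = SlidingWindowAverage(window_s=10)
averages = [window.add(t, v) for t, v in [(0, 10.0), (5, 20.0), (12, 30.0)]]
print(averages)
```

Design concerns that show up even in this small example include how late or out-of-order events are handled and how much state each analytics unit must keep per stream.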
Big data (e.g.,
satellite images)
Data analytics across multiple systems
– programming models (3)
Near-realtime data
Figure: Input data → Analytics → Output data (results)
Analytics: MPI, Workflow, other solutions
Running on: Servers/Cloud/Cluster
What are our design concerns?
Data analytics across multiple
systems – data service units
Cluster file systems: Lustre, NFS, Hadoop File System, Google File System
Data Analytics Unit ↔ cluster file system: read/write data
Interface
• Read/write data via direct, low-level read/write via IO
System
• Cluster or cluster of clusters
• Can be very large
Programming model
• Usually parallel processing
Data analytics across multiple
systems – data service units
Storage-as-a-Service: Google Storage Service (REST API), Amazon S3 (SOAP/REST API)
Data Analytics Unit ↔ Storage-as-a-Service: commands, data
Interface
• Direct data transfer via REST/SOAP APIs
System
• Decouple between analytics and storage
Programming model
• May require middleware for data transfer
• Request via SOAP/REST
• Real data transfer done by external middleware
• A rich set of programming models can be used
Data analytics across multiple
systems – data service units
Database-as-a-Service
Technology
• SkySQL, Amazon RDS, Microsoft SQL Azure, Clustrix DBaaS
• MongoDB/MongoLab, Amazon DynamoDB, Amazon SimpleDB, Cloudant Data
Data Analytics Unit ↔ Database-as-a-Service: queries, data
Interface
• REST/SOAP APIs
• Mainly for commands and results
System
• Decouple between analytics unit and database
• Database-as-a-service can be very large
Programming model
• Analytics can be done at both sides
• Analytics units can use any programming models
• Database-as-a-service can perform a lot of analytics
• Parallel database operations
Data analytics across multiple
systems – data service units
DaaS
Technology: Infochimps, Microsoft Azure, Xively, GNIP
Data Analytics Unit ↔ DaaS
Interface
• Data transfer can be uni- or bi-directional
• REST/SOAP APIs
System
• Both systems for DaaS and for analytics units can be very large
Programming model
• Can be any
Middleware service unit for
transferring large data -- GlobusOnline
Source: Bryce Allen, John Bresnahan, Lisa Childers, Ian Foster, Gopi Kandaswamy, Raj Kettimuthu, Jack Kordas, Mike
Link, Stuart Martin, Karl Pickett, and Steven Tuecke. 2012. Software as a service for data scientists. Commun. ACM 55,
2 (February 2012), 81-88. DOI=10.1145/2076450.2076468 http://doi.acm.org/10.1145/2076450.2076468
Middleware service unit for
transferring large data -- ProxyWS
Spiros Koulouzis, Reginald Cushing, K. A. Karasavvas, Adam Belloum, Marian Bubak: Enabling Web Services to
Consume and Produce Large Datasets. IEEE Internet Computing 16(1): 52-60 (2012)
Middleware service units for
messages/queuing
 Advanced Message Queuing Protocol (AMQP)
 Simple (or Streaming) Text Orientated
Messaging Protocol (STOMP)
 Specific protocols/APIs
Examples: Amazon SQS, StormMQ, RabbitMQ
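The role of a messaging/queuing middleware unit is to decouple the unit that produces notifications from the units that consume them. The sketch below imitates that role with an in-memory broker; it does not implement AMQP or STOMP and is not the API of any of the products listed above. The class, method, topic, and queue names are hypothetical.

```python
from collections import defaultdict, deque

class InMemoryBroker:
    """Toy broker: topics are routed to named queues, consumers pull from queues."""

    def __init__(self):
        self._queues = defaultdict(deque)   # queue name -> pending messages
        self._bindings = defaultdict(list)  # topic -> bound queue names

    def bind(self, topic: str, queue_name: str) -> None:
        """Route every message published on `topic` to `queue_name`."""
        self._bindings[topic].append(queue_name)

    def publish(self, topic: str, message: dict) -> None:
        """Push side: the producer does not know who consumes the message."""
        for queue_name in self._bindings[topic]:
            self._queues[queue_name].append(message)

    def pull(self, queue_name: str):
        """Pull side: return the oldest pending message, or None if empty."""
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None

broker = InMemoryBroker()
broker.bind("images.new", "analytics")
broker.bind("images.new", "archive")
broker.publish("images.new", {"image_id": "img-001"})

first = broker.pull("analytics")
second = broker.pull("analytics")
archived = broker.pull("archive")
print(first, second, archived)
```

A real deployment would replace the in-memory broker with an on-premise or cloud-based queuing service, while the producing and consuming units keep the same publish and pull structure.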
SOME EXAMPLES OF
COMPLEX DATA ANALYTICS
SERVICE
The SMAD distributed processing
architecture
Different possibilities: Grids and/or
clouds
 Raw images stored in archival: iRODS, HTTP server, or
Amazon S3
 Notification
 Any queuing system: on-premise or cloud-based
service
 Reference images:
 Local/pre-deployed or deployed on demand
 Computation: set of workstations, cluster, EC2, etc.
 Sentinel-1 images and SSM storage:
 Local files, cloud storage, iRODS, etc.
 Result notification and sharing: to whom? At which
scale?
The choices are also strongly dependent on “collaboration
needs” and money! But how easy is data sharing?
Prototype
PBS on Vienna Scientific Cluster (vsc.ac.at), in total ~4000 cores
Illustrative experiment (1)
Sustainability governance analysis
Figure: service-based data analytics for cities (e.g., including 10000+ buildings and 1000000+ sensors).
Services shown: near-realtime analytics, predictive data analytics, visual analytics, Enterprise Resource Planning, emergency management, tracking/logistics, infrastructure monitoring.
Layers and boundaries shown: Infrastructure/Internet of Things, organization-specific boundary, Internet/public cloud boundary.
DaaS for sustainability governance
 Monitoring data DaaS
 Domain-specific knowledge DaaS
Hong-Linh Truong, Schahram Dustdar, Sustainability Data and Analytics in Cloud-Based M2M Systems, Big Data
and Internet of Things: A Roadmap for Smart Environments, Studies in Computational Intelligence, Volume 546, 2014, pp.
343-365
Platform-as-a-Service for
Sustainability Governance
 Different types of analytics application models, such as batch, workflow and stream applications and intelligent bots
 Different programming models and languages
 For analytics of large-scale data but also bot-as-a-service
Cloud-based Sustainability
governance analysis framework
HOW TO DEAL WITH COST
AND QUALITY OF COMPLEX
SERVICES?
Discussion time
Examples of our complex data
analytics
 Bio-mechanic applications
 Simulate the stiffness of human bones
 Data and computation intensive applications
 Sequential and parallel programs (e.g., parfe and paraview)
 Complex software installation: Parmetis, Trilinos, Parfe, Paraview, and HDF5
 Run under batch and interactive modes
Composable evaluation approach
 We test with „cost“
Part A Part B ...... Part N
Dealing with performance and cost
of complex applications in clouds
 Application complexity
 Elastic high performance applications on multiple clouds: libraries, software
services, virtual machines, etc.
 Cost and performance are needed for determining which parts of the application
should be executed in the clouds and when
 Cost/performance model complexity
 Coarse- and fine-grained cost models of clouds at different layers:
 Too coarse-grained (networks, storages, machines) or too fine-grained (IO
calls)
 Software-, data-, human-specific cost/performance models
 Cost models for individual parts (workflow, MPI, OpenMP, etc.)
Tran Vu Pham, Hong-Linh Truong, Schahram Dustdar "Elastic High Performance Applications - A Composition
Framework", The 2011 Asia-Pacific Services Computing Conference (IEEE APSCC 2011), (c) IEEE Computer Society,
December 12 - 15, 2011, Jeju, Korea
Hong Linh Truong, Schahram Dustdar: Composable cost estimation and monitoring for computational applications in
cloud computing environments. Procedia CS 1(1): 2175-2184 (2010)
Composable cost evaluation
Part A Part B Part C
Cost/performance
model i
Cost/performance
model j
Cost/performance
model k
Runtime:
Elastic
processes
Elastic high performance applications on multiple clouds:
libraries, software services, virtual machines, etc.
Utilize different
performance and
dependencies models
for sequential, parallel,
workflows, etc.
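The idea of attaching a separate cost/performance model to each part and composing them can be sketched as follows. This is an illustrative simplification, not the model from the cited papers: the two cost-model factories, the usage fields, and all prices are assumptions, and the composition rule is a plain sum.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A cost model maps a part's resource usage to a cost in arbitrary cost units.
CostModel = Callable[[Dict[str, float]], float]

def machine_hours_cost(price_per_hour: float) -> CostModel:
    """Hypothetical model for parts billed per machine hour (e.g. an MPI job)."""
    def model(usage: Dict[str, float]) -> float:
        return usage["hours"] * usage.get("machines", 1) * price_per_hour
    return model

def data_transfer_cost(price_per_gb: float) -> CostModel:
    """Hypothetical model for parts billed per transferred gigabyte."""
    def model(usage: Dict[str, float]) -> float:
        return usage["gb"] * price_per_gb
    return model

@dataclass
class Part:
    name: str
    cost_model: CostModel
    usage: Dict[str, float]

    def cost(self) -> float:
        return self.cost_model(self.usage)

def total_cost(parts: List[Part]) -> float:
    """Compose per-part costs; here the composition is a simple sum."""
    return sum(part.cost() for part in parts)

parts = [
    Part("Part A (parallel solver)", machine_hours_cost(0.10), {"hours": 2, "machines": 4}),
    Part("Part B (result transfer)", data_transfer_cost(0.05), {"gb": 10}),
    Part("Part C (visualization)", machine_hours_cost(0.40), {"hours": 1}),
]
estimate = total_cost(parts)
print(f"estimated cost: {estimate:.2f}")
```

Because each part carries its own model, a part can be moved to a different cloud by swapping only its cost model, and the overall estimate is recomputed without touching the other parts.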
Composable cost evaluation:
Estimation and Monitoring
 Leverage our previous knowledge on event representations,
application monitoring, performance analysis, dependability
analysis
 Employ service-oriented approach
 RESTful service, JSON and XML event data
Event Representations and
Instrumentation
 Captured monitoring events based on a well-defined
specification
– Well-known instrumentation techniques can be
reused
 Consider different application execution models (e.g.,
MPI, workflows, etc.)
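Monitoring events in JSON can feed an online cost monitor that accumulates cost per application part while the application runs. The sketch below assumes a made-up event schema (`part`, `resource`, `amount`) and made-up prices; the actual event specification used in the lecture's framework is not reproduced here.

```python
import json

class OnlineCostMonitor:
    """Accumulates cost per application part from a stream of JSON events."""

    def __init__(self, price_per_unit: dict):
        self.price_per_unit = price_per_unit  # resource type -> price per unit
        self.cost_by_part = {}

    def on_event(self, raw_event: str) -> float:
        """Parse one JSON event, add its cost to the owning part, return that cost."""
        event = json.loads(raw_event)
        cost = event["amount"] * self.price_per_unit[event["resource"]]
        part = event["part"]
        self.cost_by_part[part] = self.cost_by_part.get(part, 0.0) + cost
        return cost

    def total(self) -> float:
        return sum(self.cost_by_part.values())

# Hypothetical prices and events, for illustration only.
monitor = OnlineCostMonitor({"cpu_hour": 0.10, "gb_out": 0.05})
events = [
    '{"part": "solver", "resource": "cpu_hour", "amount": 2}',
    '{"part": "solver", "resource": "gb_out", "amount": 4}',
    '{"part": "visualization", "resource": "cpu_hour", "amount": 1}',
]
for raw in events:
    monitor.on_event(raw)

print(monitor.cost_by_part, round(monitor.total(), 2))
```

Exposing `on_event` behind a RESTful endpoint would let instrumented MPI programs or workflow engines report events in the same format regardless of their execution model.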
Composable cost evaluation --
Fine-grained composition cost
models
Composable cost evaluation --
Illustrative experiments
Examples with the Bones application
Simple Cost Estimation - Examples
Online Cost Monitoring - Examples
 One experiment of a
bioinformatic
workflow in EC2
 Support runtime
cost-based
composition and
execution
Exercises
 Read mentioned papers
 Analyze the relationships between programming
models and system infrastructures for data
analytics across multiple domains
 Examine http://cloudcomputingpatterns.org and
see how it supports data analytics patterns
 Develop some patterns for data analytics across
multiple systems
 Work on composable cost evaluation for
complex data analytics
Thanks for
your attention
Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
truong@dsg.tuwien.ac.at
dsg.tuwien.ac.at/staff/truong

Sharing Blockchain Performance Knowledge for Edge Service DevelopmentHong-Linh Truong
 
Measuring, Quantifying, & Predicting the Cost-Accuracy Tradeoff
Measuring, Quantifying, & Predicting the Cost-Accuracy TradeoffMeasuring, Quantifying, & Predicting the Cost-Accuracy Tradeoff
Measuring, Quantifying, & Predicting the Cost-Accuracy TradeoffHong-Linh Truong
 
DevOps for Dynamic Interoperability of IoT, Edge and Cloud Systems
DevOps for Dynamic Interoperability of IoT, Edge and Cloud SystemsDevOps for Dynamic Interoperability of IoT, Edge and Cloud Systems
DevOps for Dynamic Interoperability of IoT, Edge and Cloud SystemsHong-Linh Truong
 
Dynamic IoT data, protocol, and middleware interoperability with resource sli...
Dynamic IoT data, protocol, and middleware interoperability with resource sli...Dynamic IoT data, protocol, and middleware interoperability with resource sli...
Dynamic IoT data, protocol, and middleware interoperability with resource sli...Hong-Linh Truong
 
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...Hong-Linh Truong
 
Modeling and Provisioning IoT Cloud Systems for Testing Uncertainties
Modeling and Provisioning IoT Cloud Systems for Testing UncertaintiesModeling and Provisioning IoT Cloud Systems for Testing Uncertainties
Modeling and Provisioning IoT Cloud Systems for Testing UncertaintiesHong-Linh Truong
 
Characterizing Incidents in Cloud-based IoT Data Analytics
Characterizing Incidents in Cloud-based IoT Data AnalyticsCharacterizing Incidents in Cloud-based IoT Data Analytics
Characterizing Incidents in Cloud-based IoT Data AnalyticsHong-Linh Truong
 
Enabling Edge Analytics of IoT Data: The Case of LoRaWAN
Enabling Edge Analytics of IoT Data: The Case of LoRaWANEnabling Edge Analytics of IoT Data: The Case of LoRaWAN
Enabling Edge Analytics of IoT Data: The Case of LoRaWANHong-Linh Truong
 
Analytics of Performance and Data Quality for Mobile Edge Cloud Applications
Analytics of Performance and Data Quality for Mobile Edge Cloud ApplicationsAnalytics of Performance and Data Quality for Mobile Edge Cloud Applications
Analytics of Performance and Data Quality for Mobile Edge Cloud ApplicationsHong-Linh Truong
 
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...Hong-Linh Truong
 
Deep Context-Awareness: Context Coupling and New Types of Context Information...
Deep Context-Awareness: Context Coupling and New Types of Context Information...Deep Context-Awareness: Context Coupling and New Types of Context Information...
Deep Context-Awareness: Context Coupling and New Types of Context Information...Hong-Linh Truong
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsHong-Linh Truong
 
Towards a Resource Slice Interoperability Hub for IoT
Towards a Resource Slice Interoperability Hub for IoTTowards a Resource Slice Interoperability Hub for IoT
Towards a Resource Slice Interoperability Hub for IoTHong-Linh Truong
 
On Supporting Contract-aware IoT Dataspace Services
On Supporting Contract-aware IoT Dataspace ServicesOn Supporting Contract-aware IoT Dataspace Services
On Supporting Contract-aware IoT Dataspace ServicesHong-Linh Truong
 
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...Hong-Linh Truong
 
On Engineering Analytics of Elastic IoT Cloud Systems
On Engineering Analytics of Elastic IoT Cloud SystemsOn Engineering Analytics of Elastic IoT Cloud Systems
On Engineering Analytics of Elastic IoT Cloud SystemsHong-Linh Truong
 
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...Hong-Linh Truong
 
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...Hong-Linh Truong
 
Governing Elastic IoT Cloud Systems under Uncertainties
Governing Elastic IoT Cloud Systems under UncertaintiesGoverning Elastic IoT Cloud Systems under Uncertainties
Governing Elastic IoT Cloud Systems under UncertaintiesHong-Linh Truong
 

More from Hong-Linh Truong (20)

QoA4ML – A Framework for Supporting Contracts in Machine Learning Services
QoA4ML – A Framework for Supporting Contracts in Machine Learning ServicesQoA4ML – A Framework for Supporting Contracts in Machine Learning Services
QoA4ML – A Framework for Supporting Contracts in Machine Learning Services
 
Sharing Blockchain Performance Knowledge for Edge Service Development
Sharing Blockchain Performance Knowledge for Edge Service DevelopmentSharing Blockchain Performance Knowledge for Edge Service Development
Sharing Blockchain Performance Knowledge for Edge Service Development
 
Measuring, Quantifying, & Predicting the Cost-Accuracy Tradeoff
Measuring, Quantifying, & Predicting the Cost-Accuracy TradeoffMeasuring, Quantifying, & Predicting the Cost-Accuracy Tradeoff
Measuring, Quantifying, & Predicting the Cost-Accuracy Tradeoff
 
DevOps for Dynamic Interoperability of IoT, Edge and Cloud Systems
DevOps for Dynamic Interoperability of IoT, Edge and Cloud SystemsDevOps for Dynamic Interoperability of IoT, Edge and Cloud Systems
DevOps for Dynamic Interoperability of IoT, Edge and Cloud Systems
 
Dynamic IoT data, protocol, and middleware interoperability with resource sli...
Dynamic IoT data, protocol, and middleware interoperability with resource sli...Dynamic IoT data, protocol, and middleware interoperability with resource sli...
Dynamic IoT data, protocol, and middleware interoperability with resource sli...
 
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...
Integrated Analytics for IIoT Predictive Maintenance using IoT Big Data Cloud...
 
Modeling and Provisioning IoT Cloud Systems for Testing Uncertainties
Modeling and Provisioning IoT Cloud Systems for Testing UncertaintiesModeling and Provisioning IoT Cloud Systems for Testing Uncertainties
Modeling and Provisioning IoT Cloud Systems for Testing Uncertainties
 
Characterizing Incidents in Cloud-based IoT Data Analytics
Characterizing Incidents in Cloud-based IoT Data AnalyticsCharacterizing Incidents in Cloud-based IoT Data Analytics
Characterizing Incidents in Cloud-based IoT Data Analytics
 
Enabling Edge Analytics of IoT Data: The Case of LoRaWAN
Enabling Edge Analytics of IoT Data: The Case of LoRaWANEnabling Edge Analytics of IoT Data: The Case of LoRaWAN
Enabling Edge Analytics of IoT Data: The Case of LoRaWAN
 
Analytics of Performance and Data Quality for Mobile Edge Cloud Applications
Analytics of Performance and Data Quality for Mobile Edge Cloud ApplicationsAnalytics of Performance and Data Quality for Mobile Edge Cloud Applications
Analytics of Performance and Data Quality for Mobile Edge Cloud Applications
 
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...
Testing Uncertainty of Cyber-Physical Systems in IoT Cloud Infrastructures: C...
 
Deep Context-Awareness: Context Coupling and New Types of Context Information...
Deep Context-Awareness: Context Coupling and New Types of Context Information...Deep Context-Awareness: Context Coupling and New Types of Context Information...
Deep Context-Awareness: Context Coupling and New Types of Context Information...
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and Clouds
 
Towards a Resource Slice Interoperability Hub for IoT
Towards a Resource Slice Interoperability Hub for IoTTowards a Resource Slice Interoperability Hub for IoT
Towards a Resource Slice Interoperability Hub for IoT
 
On Supporting Contract-aware IoT Dataspace Services
On Supporting Contract-aware IoT Dataspace ServicesOn Supporting Contract-aware IoT Dataspace Services
On Supporting Contract-aware IoT Dataspace Services
 
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...
Towards the Realization of Multi-dimensional Elasticity for Distributed Cloud...
 
On Engineering Analytics of Elastic IoT Cloud Systems
On Engineering Analytics of Elastic IoT Cloud SystemsOn Engineering Analytics of Elastic IoT Cloud Systems
On Engineering Analytics of Elastic IoT Cloud Systems
 
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...
HINC – Harmonizing Diverse Resource Information Across IoT, Network Functions...
 
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...
SINC – An Information-Centric Approach for End-to-End IoT Cloud Resource Prov...
 
Governing Elastic IoT Cloud Systems under Uncertainties
Governing Elastic IoT Cloud Systems under UncertaintiesGoverning Elastic IoT Cloud Systems under Uncertainties
Governing Elastic IoT Cloud Systems under Uncertainties
 

Recently uploaded

Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 

Recently uploaded (20)

Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 

Advanced Analytics: Single System Concepts

  • 1. Advanced service-based data analytics: concepts and designs. Hong-Linh Truong, Distributed Systems Group, Vienna University of Technology, truong@dsg.tuwien.ac.at, dsg.tuwien.ac.at/staff/truong. Advanced Services Engineering, Summer 2014, Lecture 7.
  • 2. Outline: Principles of elasticity for advanced service-based data analytics; Data analytics within a single system; Data analytics across multiple systems; Composable cost evaluation.
  • 3. PRINCIPLES OF ELASTICITY FOR DATA ANALYTICS
  • 4. Advanced service-based data analytics (1). Cities, e.g. including 10000+ buildings and 1000000+ sensors. Figure labels: Infrastructure/Internet of Things; Near realtime analytics; Predictive data analytics; Visual Analytics; Enterprise Resource Planning; Emergency Management; Tracking/Logistics; Infrastructure Monitoring; Organization-specific boundary; Internet/public cloud boundary.
  • 5. Advanced service-based data analytics (2). Soil moisture analysis for Sentinel-1. A lot of input data (L0): ~2.7 TB per day. A lot of results (L1, L2): e.g., L1 has ~140 MB per day for a grid of 1 km x 1 km. Data-as-a-Service and Platform-as-a-Service in clouds. Source: Michael Hornacek, Wolfgang Wagner, Daniel Sabel, Hong-Linh Truong, Paul Snoeij, Thomas Hahmann, Erhard Diedrich, Marcela Doubkova, Potential for High Resolution Systematic Global Surface Soil Moisture Retrieval Via Change Detection Using Sentinel-1, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, April 2012.
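To put the volumes on slide 5 in perspective, the following is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and input that arrives evenly over 24 hours; both are simplifying assumptions, not statements from the lecture. The function names are made up for illustration.

```python
# Back-of-the-envelope figures for the Sentinel-1 soil moisture example.
# Assumptions (not from the slides): decimal units, uniform arrival over 24 h.
TB = 10**12
MB = 10**6
SECONDS_PER_DAY = 24 * 60 * 60

l0_bytes_per_day = 2.7 * TB   # input data (L0), figure taken from the slide
l1_bytes_per_day = 140 * MB   # L1 results for a 1 km x 1 km grid, from the slide


def sustained_rate_mb_per_s(bytes_per_day: float) -> float:
    """Average transfer rate needed to keep up with a daily volume."""
    return bytes_per_day / SECONDS_PER_DAY / MB


def reduction_factor(input_bytes: float, output_bytes: float) -> float:
    """How many times smaller the output is than the input."""
    return input_bytes / output_bytes


if __name__ == "__main__":
    rate = sustained_rate_mb_per_s(l0_bytes_per_day)
    factor = reduction_factor(l0_bytes_per_day, l1_bytes_per_day)
    print(f"L0 ingest rate: {rate:.2f} MB/s")
    print(f"L0 to L1 reduction: roughly {factor:,.0f}x")
```

Under these assumptions the analytics must sustain a few tens of MB/s of input while producing results several orders of magnitude smaller, which is one reason to keep computation near the input data.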
  • 6. Advanced service-based data analytics: fundamental concepts. Figure labels: Applications (Part A, Part B, ..., Part N); System infrastructures (Cluster, Grid, Local Cloud, Public cloud/Sky); Domain 1, Domain 2, ..., Domain n.
  • 7. Design questions: Which system infrastructures are used? What are the functions of units? Which interfaces are suitable for units? Which programming models are used within units? Which fundamental units are to be used? How do different units interact? Which non-functional parameters are important, and how are they measured? Note: Part = a (composite) service unit.
  • 8. Fundamental concepts – system infrastructure unit. System infrastructures: Cloud; Software-based Cloud system; Human-based Cloud system; Cluster; Grid; High Performance Server.
  • 9. Fundamental concepts – unit functions. Function: Front-end/Presentation; Data Analytics Service; Visualization Service; Middleware; Enterprise Service Bus; Publish/Subscription; Messaging/Queuing; Data Transfer.
  • 10. Fundamental concepts – programming model within units. Programming model: MapReduce; MPI; Parallel Database; Workflow; Other solutions.
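As an illustration of the first programming model listed on slide 10, here is a minimal single-process sketch of the MapReduce pattern in plain Python. It only shows the map, group-by-key, and reduce steps; it is not Hadoop's or Google's API, and the names `map_reduce`, `count_words`, and `sum_counts` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, int]


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Pair]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run mapper on every record, group values by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)   # the "shuffle" step, done in memory here
    return {key: reducer(key, values) for key, values in groups.items()}


def count_words(line: str) -> Iterable[Pair]:
    """Mapper: emit (word, 1) for each word in a line."""
    for word in line.lower().split():
        yield word, 1


def sum_counts(key: str, values: List[int]) -> int:
    """Reducer: total occurrences of one word."""
    return sum(values)


if __name__ == "__main__":
    print(map_reduce(["a b a", "b c"], count_words, sum_counts))
```

In a real deployment the mapper and reducer calls would be distributed across nodes located near the data; the sketch keeps everything in one process to show only the programming model.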
  • 11. Fundamental concepts – interfaces between units. Interface: Standard REST, SOAP APIs; Specific APIs; Standard APIs (e.g. OpenStack). Interaction: Pull; Push.
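The two interaction styles named on slide 11 can be contrasted with a small sketch. The class `DataServiceUnit` and its methods are hypothetical names chosen for illustration; a real unit would expose them over REST, SOAP, or a messaging middleware rather than as in-process calls.

```python
from typing import Callable, List, Optional


class DataServiceUnit:
    """Toy data unit offering both pull and push interaction."""

    def __init__(self) -> None:
        self._latest: Optional[float] = None
        self._subscribers: List[Callable[[float], None]] = []

    def publish(self, reading: float) -> None:
        """Store a new reading and push it to every subscriber."""
        self._latest = reading
        for callback in self._subscribers:
            callback(reading)

    def latest(self) -> Optional[float]:
        """Pull: the consumer asks for the current value when it wants it."""
        return self._latest

    def subscribe(self, callback: Callable[[float], None]) -> None:
        """Push: the consumer registers once and is called on each new value."""
        self._subscribers.append(callback)


if __name__ == "__main__":
    unit = DataServiceUnit()
    received: List[float] = []
    unit.subscribe(received.append)
    unit.publish(1.5)
    unit.publish(2.0)
    print("pushed:", received, "pulled:", unit.latest())
```

Pull keeps the consumer in control of timing and load, while push reduces latency for near-realtime analytics at the price of the producer tracking its consumers.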
  • 12. Fundamental concepts – services and data concerns. Data concerns: Quality of data; Pricing; Data Right; ... Service concerns: QoS; Pricing; ...
  • 13. Complex dependencies in (big) data analytics. More data -> more computational resources (e.g. more VMs). More types of data -> more computational models -> more analytics processes. Change quality of results: change quality of data; change response time; change cost; change types of result (form of the data output, e.g. tree, visual, story, etc.). Figure labels: Data; Computational Model; Analytics Process; Analytics Result; Quality of Result. Source: Hong-Linh Truong, Schahram Dustdar, "Principles of Software-defined Elastic Systems for Big Data Analytics", (c) IEEE Computer Society, IEEE International Workshop on Software Defined Systems, 2014 IEEE International Conference on Cloud Engineering (IC2E 2014), Boston, Massachusetts, USA, 10-14 March 2014.
  • 14. Complex dependencies in (big) data analytics. Elasticity principles can be used to support this!
  • 15. Elasticity Principles: Elasticity of data and computational models. Multiple types of objects from different sources with complex dependencies, relevancies, and quality. Different data and computational models for the same analytics subject. New analytics subjects can be defined and analytics goals can be changed. Decide/select/define/compose not only computational models for analytics subjects but also data models based on existing ones. Summary: management and modeling of the elasticity of data and computational models during the analytics.
  • 16. Elasticity Principles: Elasticity of data resources. Data provided, managed and shared by different providers. Data associated with different concerns (cost, quality of data, privacy, contract, etc.). Static data, open data, data-as-a-service, opportunistic data (from sensors and human sensing). Not just centralized big data and total data ownership. Summary: data resources can be taken into account in an elastic manner, similar to VMs, based on their quality, relevancy, pricing, etc.
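One way to read the summary of slide 16 is that data resources are acquired and released much like VMs, ranked by their concerns. The sketch below is a simple greedy heuristic under stated assumptions: quality and relevancy are scores in [0, 1], price is a per-run cost in arbitrary units, and all names (`DataResource`, `select_resources`) are hypothetical. It is not the method from the lecture or the cited paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class DataResource:
    name: str
    quality: float    # assumed score in [0, 1]
    relevancy: float  # assumed score in [0, 1]
    price: float      # assumed cost per analytics run, arbitrary units


def select_resources(
    candidates: Sequence[DataResource], budget: float, min_quality: float
) -> List[DataResource]:
    """Greedily pick resources with the best relevancy per unit price.

    Resources below min_quality are dropped first; free resources rank highest.
    """
    eligible = [r for r in candidates if r.quality >= min_quality]
    eligible.sort(
        key=lambda r: r.relevancy / r.price if r.price > 0 else float("inf"),
        reverse=True,
    )
    chosen: List[DataResource] = []
    spent = 0.0
    for resource in eligible:
        if spent + resource.price <= budget:
            chosen.append(resource)
            spent += resource.price
    return chosen


if __name__ == "__main__":
    pool = [
        DataResource("open_data", quality=0.6, relevancy=0.5, price=0.0),
        DataResource("sensor_feed", quality=0.9, relevancy=0.9, price=5.0),
        DataResource("premium_daas", quality=0.95, relevancy=0.8, price=20.0),
        DataResource("noisy_crowd", quality=0.3, relevancy=0.9, price=1.0),
    ]
    print([r.name for r in select_resources(pool, budget=10.0, min_quality=0.5)])
```

Re-running the selection when the budget or the quality threshold changes gives the scale-out and scale-in behaviour for data resources that the slide compares to VMs.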
  • 17. Elasticity Principles: Elasticity of humans and software as computing units. Human in the loop to solve analytics tasks that software cannot solve. Human-based compute units can be scaled up/down with different cost, availability, and performance models. Human-based compute units + software-based compute units for executing computational models. Elasticity controls can also be done by humans. Summary: provisioning hybrid compute units in an elastic way for computational/data/network tasks as well as for monitoring/control tasks in the analytics process.
  • 18. Elasticity Principles: Elasticity of quality of results. Definition of quality of results. Trade-offs of time, cost, quality of data, forms of output. Using quality of results to select suitable computational models, data resources, computing units. Multi-level control for the elasticity based on quality of results. Summary: able to cope with changes in quality of data, performance, cost and types of results at runtime.
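The idea on slide 18 of using quality of results to select a computational model, and of reacting when data quality changes at runtime, can be sketched as follows. The model options, the numbers, and the rule that accuracy scales linearly with data quality are all illustrative assumptions; the lecture does not define a concrete formula.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str
    expected_accuracy: float  # assumed accuracy on clean input, in [0, 1]
    response_time_s: float
    cost: float


@dataclass(frozen=True)
class QualityOfResult:
    min_accuracy: float
    max_response_time_s: float
    max_cost: float


def choose_model(
    options: Sequence[ModelOption], qor: QualityOfResult, data_quality: float
) -> Optional[ModelOption]:
    """Return the cheapest option that satisfies the quality-of-result bounds.

    Assumption: achieved accuracy = expected_accuracy * data_quality.
    Returns None when no option is feasible.
    """
    feasible = [
        o for o in options
        if o.expected_accuracy * data_quality >= qor.min_accuracy
        and o.response_time_s <= qor.max_response_time_s
        and o.cost <= qor.max_cost
    ]
    return min(feasible, key=lambda o: o.cost, default=None)


if __name__ == "__main__":
    options = [
        ModelOption("fast_sampling", 0.80, 10.0, 1.0),
        ModelOption("full_scan", 0.95, 120.0, 8.0),
        ModelOption("human_review", 0.99, 3600.0, 50.0),
    ]
    qor = QualityOfResult(min_accuracy=0.75, max_response_time_s=600.0, max_cost=20.0)
    for dq in (1.0, 0.85, 0.5):
        picked = choose_model(options, qor, dq)
        print(dq, picked.name if picked else None)
```

Calling `choose_model` again whenever the observed data quality changes is a crude form of the runtime control the slide asks for: as input quality drops, the selection moves to a more expensive model or reports that the requested quality of results cannot be met.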
  • 19. WE NEED TO START FROM DATA ANALYTICS WITHIN A SINGLE SYSTEM
  • 20. Data analytics within a single system. They are complex enough but do not meet all requirements. In a single domain. Tightly coupled computing infrastructures, e.g., in the same cloud. Computation and data are close. Several concerns can be by-passed. Figure labels: Domain A; Data service unit; Data Analytics Unit. Note: not always provisioned under the "Service Unit" model.
  • 21. Data analytics within a single system. Some papers:
    1. Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, and Michael Stonebraker. 2009. A comparison of approaches to large-scale data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09), Carsten Binnig and Benoit Dageville (Eds.). ACM, New York, NY, USA, 165-178. DOI=10.1145/1559845.1559865 http://doi.acm.org/10.1145/1559845.1559865
    2. Leonardo Neumeyer, Bruce Robbins, Anish Nair, Anand Kesari: S4: Distributed Stream Computing Platform. ICDM Workshops 2010: 170-177
    3. Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel, Prabhat, and Rob D. Ryne. 2011. Parallel index and query for large scale data analysis. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11). ACM, New York, NY, USA, Article 30, 11 pages. DOI=10.1145/2063384.2063424 http://doi.acm.org/10.1145/2063384.2063424
    4. Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGregor, Prashant J. Shenoy: A platform for scalable one-pass analytics using MapReduce. SIGMOD Conference 2011: 985-996
    5. Fabrizio Marozzo, Domenico Talia, Paolo Trunfio: A Cloud Framework for Parameter Sweeping Data Mining Applications. CloudCom 2011: 367-374
    6. Yingyi Bu, Bill Howe, Magdalena Balazinska, Michael D. Ernst: HaLoop: Efficient Iterative Data Processing on Large Clusters. PVLDB 3(1): 285-296 (2010)
  • 22. Data analytics within a single system – some examples - Message Passing Interface (MPI) + cluster-based file system - MapReduce + Google File System - Hadoop + HDFS - Dryad + LINQ - Parallel database (SQL/NoSQL) - Yahoo S4 - Workflow. A short, good overview is given in Chapter 6, "Cloud Programming and Software Environments", of the book Distributed and Cloud Computing – from Parallel Processing to the Internet of Things, Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, Morgan Kaufmann, 2012.
  • 23. Discussion time: WHY SHOULD ANALYTICS UNITS BE CLOSE TO DATA UNITS? WHICH CONCERNS CAN BE IGNORED IN SINGLE-SYSTEM DATA ANALYTICS? WHICH ISSUES DO WE NEED TO CONSIDER WHEN OUR DATA UNITS ARE IN DIFFERENT SYSTEMS?
  • 24. Data analytics across multiple systems – design choice - Programming models for data analytics service - Data service units - Supporting middleware units. Figure labels: Programming model, System Infrastructure, Interface.
  • 25. Data analytics across multiple systems – programming models (1): static data. Figure labels: Local input data (input data); Analytics on Servers/Cloud/Cluster using MapReduce/Hadoop, Workflow, MPI, or other solutions; Results (output data). What are our design concerns?
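To make the static-data case concrete, here is a minimal single-process sketch of the MapReduce programming model named on this slide. It is only an illustration in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample input are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (word, 1) pairs for one input record (a line of text)."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce over a static (already stored) data set."""
    intermediate = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


if __name__ == "__main__":
    # Hypothetical local input data standing in for files on a cluster.
    local_input = ["sensor a ok", "sensor b fail", "sensor a ok"]
    print(run_job(local_input))
```

In a real deployment the map and reduce calls would run in parallel on different nodes, which is where the design concerns about data location and transfer come in.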
  • 26. Data analytics across multiple systems – programming models (2): near-realtime data. Figure labels: Stock market, Social media, M2M (input data); Analytics on Servers/Cloud/Cluster using complex event processing, stream data analysis, or other solutions; Results (output data). What are our design concerns?
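For the near-realtime case, a small sketch of stream data analysis may help: a sliding-window average that is updated as each value arrives, instead of being computed over stored data. The class name `SlidingWindowAverage` and the sample readings are assumptions for this example; real stream platforms add distribution, fault tolerance and time-based windows.

```python
from collections import deque


class SlidingWindowAverage:
    """Keep the average of the most recent `size` values of a stream."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("window size must be positive")
        self.window = deque(maxlen=size)
        self.total = 0.0

    def update(self, value):
        """Add one incoming value and return the current window average."""
        if len(self.window) == self.window.maxlen:
            # The oldest value is about to be evicted by append().
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


if __name__ == "__main__":
    # Hypothetical M2M sensor readings arriving one by one.
    average = SlidingWindowAverage(size=3)
    for reading in [20.0, 22.0, 30.0, 28.0]:
        print(reading, average.update(reading))
```

Each update costs constant time and memory is bounded by the window size, which is the property that makes this style suitable for unbounded input.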
  • 27. Data analytics across multiple systems – programming models (3): near-realtime data. Figure labels: Big data, e.g., satellite images (input data); Analytics on Servers/Cloud/Cluster using MPI, Workflow, or other solutions; Results (output data). What are our design concerns?
  • 28. Data analytics across multiple systems – data service units: cluster file systems (Lustre, NFS, Hadoop File System, Google File System). Interface: read/write data via direct, low-level I/O. System: cluster or cluster of clusters; can be very large. Programming model: usually parallel processing. Figure labels: Data Analytics Unit, read/write data.
  • 29. Data analytics across multiple systems – data service units: Storage-as-a-Service (Google Storage Service with REST API, Amazon S3 with SOAP/REST API). Interface: direct data transfer via REST/SOAP APIs. System: decoupling between analytics and storage. Programming model: may require middleware for data transfer (request via SOAP/REST, real data transfer done by external middleware); a rich set of programming models can be used. Figure labels: Data Analytics Unit, commands, data.
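The decoupling between analytics and storage can be sketched with a narrow storage interface. Everything here is hypothetical: `ObjectStore`, `InMemoryStore` and `mean_of_object` are names invented for this example, and the in-memory class only stands in for a remote service that would be reached through REST or SOAP requests.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical minimal interface of a storage service (put/get by key)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> bytes:
        ...


class InMemoryStore(ObjectStore):
    """Stand-in for a remote service; a real one would issue HTTP requests."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def mean_of_object(store: ObjectStore, key: str) -> float:
    """Analytics unit: depends only on the interface, not on the storage system."""
    values = [float(x) for x in store.get(key).decode("utf-8").split(",")]
    return sum(values) / len(values)


if __name__ == "__main__":
    store = InMemoryStore()
    store.put("readings/day1.csv", b"1.0,2.0,3.0")
    print(mean_of_object(store, "readings/day1.csv"))
```

Because the analytics function only sees `put` and `get`, the storage backend can be swapped without touching the analysis, at the price of moving the data over the network.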
  • 30. Data analytics across multiple systems – data service units: Database-as-a-Service. Technology: SkySQL, Amazon RDS, Microsoft SQL Azure, Clustrix DBaaS; MongoDB/MongoLab, Amazon DynamoDB, Amazon SimpleDB, Cloudant Data. Interface: REST/SOAP APIs, mainly for commands and results. System: decoupling between analytics unit and database; Database-as-a-Service can be very large. Programming model: analytics can be done at both sides; analytics units can use any programming model; Database-as-a-Service can perform a lot of analytics; parallel database operations. Figure labels: Data Analytics Unit, queries, data.
  • 31. Data analytics across multiple systems – data service units: DaaS. Technology: Infochimps, Microsoft Azure, Xively, GNIP. Interface: data transfer can be unidirectional or bidirectional; REST/SOAP APIs. System: both the DaaS systems and the analytics units can be very large. Programming model: can be any. Figure labels: Data Analytics Unit.
  • 32. Middleware service unit for transferring large data -- GlobusOnline. Source: Bryce Allen, John Bresnahan, Lisa Childers, Ian Foster, Gopi Kandaswamy, Raj Kettimuthu, Jack Kordas, Mike Link, Stuart Martin, Karl Pickett, and Steven Tuecke. 2012. Software as a service for data scientists. Commun. ACM 55, 2 (February 2012), 81-88. DOI=10.1145/2076450.2076468 http://doi.acm.org/10.1145/2076450.2076468
  • 33. Middleware service unit for transferring large data -- ProxyWS. Source: Spiros Koulouzis, Reginald Cushing, K. A. Karasavvas, Adam Belloum, Marian Bubak: Enabling Web Services to Consume and Produce Large Datasets. IEEE Internet Computing 16(1): 52-60 (2012)
  • 34. Middleware service units for messages/queuing - Advanced Message Queuing Protocol (AMQP) - Simple (or Streaming) Text Orientated Messaging Protocol (STOMP) - Specific protocols/APIs. Examples: Amazon SQS, StormMQ, RabbitMQ.
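As a small illustration of what a text-oriented messaging protocol looks like on the wire, the sketch below builds and parses a STOMP frame: a command line, `key:value` header lines, a blank line, the body, and a terminating NUL byte. The helper names and the queue name `/queue/analytics` are assumptions for this example, and header escaping, `content-length` handling and the network connection itself are left out.

```python
def build_stomp_frame(command, headers, body=""):
    """Serialize one STOMP frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")


def parse_stomp_frame(raw):
    """Parse a frame produced by build_stomp_frame (no header escaping)."""
    text = raw.decode("utf-8").rstrip("\x00")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


if __name__ == "__main__":
    # Hypothetical notification that a new result is available.
    frame = build_stomp_frame(
        "SEND",
        {"destination": "/queue/analytics", "content-type": "text/plain"},
        "result ready",
    )
    print(frame)
    print(parse_stomp_frame(frame))
```

A production system would normally use a client library and a broker such as those listed on this slide rather than hand-built frames.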
  • 35. SOME EXAMPLES OF COMPLEX DATA ANALYTICS SERVICES
  • 36. The SMAD distributed processing architecture
  • 37. Different possibilities: Grids and/or clouds - Raw images stored in an archive: iRODS, HTTP server, or Amazon S3 - Notification: any queuing system, on-premise or cloud-based service - Reference images: local/pre-deployed or deployed on demand - Computation: set of workstations, cluster, EC2, etc. - Sentinel-1 images and SSM storage: local files, cloud storage, iRODS, etc. - Result notification and sharing: to whom? At which scale? The choices are also strongly dependent on "collaboration needs" and money! But how easy is data sharing?
  • 38. Prototype: PBS on the Vienna Scientific Cluster (vsc.ac.at), in total ~4000 cores
  • 40. Sustainability governance analysis. Figure labels: Cities, e.g. including 10000+ buildings and 1000000+ sensors; Near-realtime analytics; Predictive data analytics; Visual Analytics; Enterprise Resource Planning; Emergency Management; Tracking/Logistics; Infrastructure Monitoring; Infrastructure/Internet of Things; Internet/public cloud boundary; Organization-specific boundary.
  • 41. DaaS for sustainability governance - Monitoring data DaaS - Domain-specific knowledge DaaS. Source: Hong-Linh Truong, Schahram Dustdar, Sustainability Data and Analytics in Cloud-Based M2M Systems, Big Data and Internet of Things: A Roadmap for Smart Environments, Studies in Computational Intelligence Volume 546, 2014, pp 343-365
  • 42. Platform-as-a-Service for Sustainability Governance - Different types of analytics application models, such as batch, workflow and stream applications and intelligent bots - Different programming models and languages - For analytics of large-scale data but also bot-as-a-service
  • 43. Cloud-based Sustainability governance analysis framework
  • 44. Cloud-based Sustainability governance analysis framework
  • 45. Discussion time: HOW TO DEAL WITH COST AND QUALITY OF COMPLEX SERVICES?
  • 46. Examples of our complex data analytics - Bio-mechanical applications: simulate the stiffness of human bones - Data- and computation-intensive applications - Sequential and parallel programs (e.g., parfe and paraview) - Complex software installation: Parmetis, Trilinos, Parfe, Paraview, and HDF5 - Run under batch and interactive modes
  • 47. Composable evaluation approach - We test with "cost". Figure labels: Part A, Part B, ..., Part N.
  • 48. Dealing with performance and cost of complex applications in clouds - Application complexity: elastic high performance applications on multiple clouds (libraries, software services, virtual machines, etc.); cost and performance are needed for determining which parts of the application should be executed in the clouds and when - Cost/performance model complexity: coarse- and fine-grained cost models of clouds at different layers, either too coarse-grained (networks, storage, machines) or too fine-grained (I/O calls); software-, data-, and human-specific cost/performance models; cost models for individual parts (workflow, MPI, OpenMP, etc.). Sources: Tran Vu Pham, Hong-Linh Truong, Schahram Dustdar, "Elastic High Performance Applications - A Composition Framework", The 2011 Asia-Pacific Services Computing Conference (IEEE APSCC 2011), (c) IEEE Computer Society, December 12 - 15, 2011, Jeju, Korea. Hong-Linh Truong, Schahram Dustdar: Composable cost estimation and monitoring for computational applications in cloud computing environments. Procedia CS 1(1): 2175-2184 (2010)
  • 49. Composable cost evaluation - Elastic high performance applications on multiple clouds: libraries, software services, virtual machines, etc. - Utilize different performance and dependency models for sequential, parallel, workflow, etc. applications. Figure labels: Part A, Part B, Part C; Cost/performance model i, Cost/performance model j, Cost/performance model k; Runtime: elastic processes.
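The idea of composing per-part cost models can be sketched in a few lines. This is only an illustration under invented assumptions, not the cost models from the cited papers: the `Part` class, the two model functions and all prices and usage figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Part:
    """One part of an application with its own cost model and usage figures."""
    name: str
    cost_model: Callable[[Dict[str, float]], float]
    usage: Dict[str, float]


def machine_cost(usage):
    """Hypothetical model: pay per machine hour."""
    return usage["hours"] * usage["price_per_hour"]


def transfer_cost(usage):
    """Hypothetical model: pay per gigabyte transferred."""
    return usage["gigabytes"] * usage["price_per_gb"]


def estimate(parts: List[Part]) -> Tuple[Dict[str, float], float]:
    """Evaluate each part with its own model, then compose by summation."""
    per_part = {part.name: part.cost_model(part.usage) for part in parts}
    return per_part, sum(per_part.values())


if __name__ == "__main__":
    application = [
        Part("Part A (MPI solver)", machine_cost,
             {"hours": 10, "price_per_hour": 0.5}),
        Part("Part B (result upload)", transfer_cost,
             {"gigabytes": 20, "price_per_gb": 0.25}),
    ]
    print(estimate(application))
```

Summation is the simplest composition rule; parts that depend on each other or overlap in time would need a composition that takes those dependencies into account.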
  • 50. Composable cost evaluation: Estimation and Monitoring - Leverage our previous knowledge on event representations, application monitoring, performance analysis, dependability analysis - Employ a service-oriented approach: RESTful service, JSON and XML event data
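A rough sketch of monitoring with JSON event data: each event reports resource usage of one application part, and a monitor accumulates the running cost. The event fields (`part`, `resource`, `quantity`, `unit_price`) and the `CostMonitor` class are assumptions for this example and do not reproduce the event specification used in the lecture; a RESTful service would receive such events over HTTP.

```python
import json


def make_event(part, resource, quantity, unit_price):
    """Serialize one hypothetical monitoring event as JSON."""
    return json.dumps({
        "part": part,
        "resource": resource,
        "quantity": quantity,
        "unit_price": unit_price,
    })


class CostMonitor:
    """Accumulate cost per application part from incoming JSON events."""

    def __init__(self):
        self.cost_by_part = {}

    def consume(self, event_json):
        """Process one event and return the total cost observed so far."""
        event = json.loads(event_json)
        cost = event["quantity"] * event["unit_price"]
        part = event["part"]
        self.cost_by_part[part] = self.cost_by_part.get(part, 0.0) + cost
        return sum(self.cost_by_part.values())


if __name__ == "__main__":
    monitor = CostMonitor()
    print(monitor.consume(make_event("Part A", "cpu_hour", 2, 0.5)))
    print(monitor.consume(make_event("Part B", "gb_transferred", 4, 0.25)))
    print(monitor.cost_by_part)
```

Keeping the cost per part makes it possible to compare observed cost against the estimate of each part while the application is still running.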
  • 51. Event Representations and Instrumentation - Captured monitoring events are based on a well-defined specification; well-known instrumentation techniques can be reused - Consider different application execution models (e.g., MPI, workflows, etc.)
  • 52. Composable cost evaluation -- Fine-grained composition cost models
  • 53. Composable cost evaluation -- Illustrative experiments: examples with the Bones application
  • 54. Simple Cost Estimation - Examples
  • 55. Online Cost Monitoring - Examples - One experiment of a bioinformatics workflow in EC2 - Support runtime cost-based composition and execution
  • 56. Exercises - Read the mentioned papers - Analyze the relationships between programming models and system infrastructures for data analytics across multiple domains - Examine http://cloudcomputingpatterns.org and see how it supports data analytics patterns - Develop some patterns for data analytics across multiple systems - Work on composable cost evaluation for complex data analytics
  • 57. Thanks for your attention. Hong-Linh Truong, Distributed Systems Group, Vienna University of Technology, truong@dsg.tuwien.ac.at, dsg.tuwien.ac.at/staff/truong