Data as a Service – Concepts, Design & Implementation, and Ecosystems
Hong-Linh Truong
Distributed Systems Group,
Vienna University of Technology
truong@dsg.tuwien.ac.at
http://dsg.tuwien.ac.at/staff/truong
Advanced Services Engineering (ASE), Summer 2014
Outline
• Data provisioning and data service units
• Data-as-a-Service concepts
• DaaS design and implementation
• DaaS ecosystems
Data versus data assets
[Figure labels: Data; Data Assets; Data collection, assessment and enrichment; Data management and provisioning; Data concerns]
Data provisioning activities and issues
Collect
• Data sources
• Ownership
• License
• Quality assessment and enrichment
Store
• Query and backup capabilities
• Local versus cloud, distributed versus centralized storage
Access
• Interface
• Public versus private access
• Access granularity
• Pricing and licensing model
Utilize
• Alone or in combination with other data sources
• Redistribution
• Updates
Non-exhaustive list! Add your own issues!
Provisioning Models
Stakeholders in data provisioning
[Figure: the following stakeholders are arranged around "Data"]
Data Provider
• People (individual/crowds/organization)
• Software, Things
Service Provider
• Software and people
Data Consumer
• People, Software, Things
Data Aggregator/Integrator
• Software
• People + software
Data Assessment
• Software and people
Stakeholder classes can be further divided!
Domain-specific versus domain-independent functions
Recall – Service Unit
[Figure labels: Service model; Unit Concept; Service unit]
• "Basic component"/"basic function" modeling and description
• Consumption, ownership, provisioning, price, etc.
What about service units providing data?
Data service unit
[Figure labels: Service model; Unit Concept; Data service unit; Data]
• Can be for private or public use
• Can be elastic or not
What about the granularity of the unit?
Data service units in clouds/internet
• Provide data capabilities rather than computation or software capabilities
• Providing data in clouds/internet is an increasing trend
• In both business and e-science environments
• Bio data, weather data, company balance sheets, etc., via Web services
• Now often in a combination of data + analytics atop the data
• Reasons: economic benefits, performance, service ecosystems
Data service units in clouds/internet
[Figure labels: Internet/Cloud; Data service unit (three instances); People; Things; data]
SO, IS A DATA SERVICE UNIT BIG OR SMALL? DOES IT PROVIDE REALTIME OR STATIC DATA?
Discussion time
NIST Cloud definitions
"This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models."
Source: NIST Definition of Cloud Computing v15, http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc
Data as a Service – characteristics
• On-demand self-service
  • Capabilities to provision data at different granularities
• Resource pooling
  • Multiple types of data: big, static or near-realtime, raw data and high-level information
• Broad network access
  • Can be accessed from anywhere
• Rapid elasticity
  • Easy to add/remove data sources
• Measured service
  • Measuring, monitoring and publishing data concerns and usage
Built atop NIST's definition
Data as a Service – service models and deployment models
Data-as-a-Service – service models
• Storage-as-a-Service (basic storage functions)
• Database-as-a-Service (structured/non-structured querying systems)
• Data publish/subscription middleware as a service
• Sensor-as-a-Service
Deploy: Private/Public/Hybrid/Community Clouds
Examples of DaaS
Xively Cloud Services™
https://xively.com/
WHAT ELSE DO YOU THINK CAN BE INCLUDED IN DAAS MODELS?
Discussion time
DaaS design & implementation – APIs
• Read-only DaaS versus CRUD DaaS APIs
• Service APIs versus Data APIs
  • They are not the same with respect to data/service concerns
• SOAP versus REST
• Streaming data API
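The read-only versus CRUD distinction can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store; the class and method names (`ReadOnlyDaaS`, `CrudDaaS`, `read`, `create`, ...) are illustrative only and do not come from any real DaaS product.

```python
from typing import Any, Dict, List, Optional


class ReadOnlyDaaS:
    """Hypothetical read-only data API: consumers can only retrieve data."""

    def __init__(self, items: Dict[str, Any]) -> None:
        self._items = dict(items)  # private copy of the provided data

    def read(self, key: str) -> Optional[Any]:
        """Return the item stored under key, or None if it is absent."""
        return self._items.get(key)

    def list_keys(self) -> List[str]:
        """Return all item keys in sorted order."""
        return sorted(self._items)


class CrudDaaS(ReadOnlyDaaS):
    """Hypothetical CRUD data API: adds create, update and delete."""

    def create(self, key: str, value: Any) -> None:
        if key in self._items:
            raise KeyError(f"{key!r} already exists")
        self._items[key] = value

    def update(self, key: str, value: Any) -> None:
        if key not in self._items:
            raise KeyError(f"{key!r} does not exist")
        self._items[key] = value

    def delete(self, key: str) -> None:
        del self._items[key]
```

A read-only DaaS exposes just the first class to consumers; a CRUD DaaS also has to decide who may create, update and delete, which is where data concerns such as ownership come in.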
DaaS design & implementation – service provider vs data provider
• The DaaS provider is separated from the data provider
[Figure labels: Consumer; DaaS; Sensor; DaaS provider; Data provider]
Example: DaaS provider ≠ data provider
DaaS design & implementation – structures
• DaaS and data providers have the right to publish the data
Three levels
DaaS
• Service APIs
• Data APIs for the whole resource
Data Resource
• Data APIs for particular resources
• Data APIs for data items
Data Items
• Data APIs for data items
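The three levels (DaaS, data resource, data item) can be sketched as a path resolver. This is only an illustration under assumptions: the path layout (`/`, `/<resource>`, `/<resource>/<item>`), the sample data and the function name `resolve` are hypothetical.

```python
from typing import Any, Dict

# Hypothetical sample data: resource name -> item id -> item value.
DATA: Dict[str, Dict[str, Any]] = {
    "weather": {"vienna": {"temp_c": 21.5}, "graz": {"temp_c": 19.0}},
    "traffic": {"a1": {"cars_per_min": 42}},
}


def resolve(path: str) -> Any:
    """Resolve a path to the DaaS, data-resource, or data-item level."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 0:
        # DaaS level: list all data resources.
        return sorted(DATA)
    if len(parts) == 1:
        # Data resource level: list the items of one resource.
        return sorted(DATA[parts[0]])
    if len(parts) == 2:
        # Data item level: return a single item.
        return DATA[parts[0]][parts[1]]
    raise ValueError(f"unsupported path: {path!r}")
```

For example, `resolve("/")` lists the resources, `resolve("/weather")` lists the items of one resource, and `resolve("/weather/vienna")` returns one data item.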
DaaS design & implementation – structures (2)
[Figure labels: Consumer (two instances); DaaS; Data assets; Data resource (several instances); Data items (several instances)]
DaaS design & implementation – patterns for "turning data to DaaS" (1)
data → Build Data Service APIs → Deploy Data Service → DaaS
Examples: using WSO2 data service
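The "build data service APIs, then deploy" pattern can be sketched with Python's standard `wsgiref` module. This is a stand-in for illustration, not WSO2: the data set `RECORDS`, the `/records` path and the function names are assumptions.

```python
import json
from wsgiref.simple_server import make_server

# Hypothetical data set to be exposed as a service.
RECORDS = [{"id": 1, "city": "Vienna"}, {"id": 2, "city": "Graz"}]


def data_service(environ, start_response):
    """Build step: a WSGI app where GET /records returns the data as JSON."""
    is_get = environ.get("REQUEST_METHOD") == "GET"
    if is_get and environ.get("PATH_INFO") == "/records":
        status = "200 OK"
        body = json.dumps(RECORDS).encode("utf-8")
    else:
        status = "404 Not Found"
        body = json.dumps({"error": "not found"}).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def deploy(port: int = 8000) -> None:
    """Deploy step: serve the API locally (blocks until interrupted)."""
    with make_server("", port, data_service) as server:
        server.serve_forever()
```

Calling `deploy()` starts a local server; a consumer could then fetch `http://localhost:8000/records`. A production DaaS would add authentication, metering and data-concern metadata on top of this.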
DaaS design & implementation – patterns for "turning data to DaaS" (2)
data → Storage/Database-as-a-Service → DaaS
Examples: using Amazon S3
DaaS design & implementation – patterns for "turning data to DaaS" (3)
Things (One Thing → 10000... Things) → data → Storage/Database/Middleware → DaaS
Examples: using crowdsourcing with Pachube (the predecessor of Xively)
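When many Things push data through middleware, a publish/subscribe component is one possible building block. The following is a minimal in-memory sketch under assumptions; the `Broker` class and the topic names are hypothetical and unrelated to the Pachube or Xively APIs.

```python
from typing import Any, Callable, Dict, List

Callback = Callable[[str, Any], None]


class Broker:
    """Hypothetical in-memory publish/subscribe middleware."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a consumer callback for one topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, value: Any) -> int:
        """Deliver value to every subscriber of topic; return delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, value)
        return len(callbacks)
```

Each Thing would call `publish` with its own topic (for example `"sensor/1"`), while consumers call `subscribe`; going from one Thing to thousands only adds topics, not new APIs.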
DaaS design & implementation – patterns for "turning data to DaaS" (4)
People → data → Storage/Database/Middleware → DaaS
Examples: using Twitter
DaaS design & implementation – not just "functional" aspects (1)
[Figure labels: data; DaaS; data assets]
Data concerns
• Quality of data
• Ownership
• Price
• License
• ...
Data Assessment/Improvement
• Enrichment
• Cleansing
• Profiling
• Integration
• ...
APIs, Querying, Data Management, etc.
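Data concerns can be made explicit as metadata attached to a data asset. The sketch below assumes a hypothetical `DataConcerns` record and uses completeness (the fraction of expected values that are present) as one simple example of a quality-of-data metric; neither the field names nor the metric choice come from the lecture.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class DataConcerns:
    """Hypothetical non-functional description attached to a data asset."""
    owner: str
    license_name: str
    price_per_request: float
    completeness: float  # quality-of-data metric in [0, 1]


def measure_completeness(
    records: List[Dict[str, Optional[Any]]], fields: List[str]
) -> float:
    """Return the fraction of expected field values that are not None."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0  # nothing expected, so nothing is missing
    present = sum(
        1 for record in records for field in fields
        if record.get(field) is not None
    )
    return present / total
```

A data assessment service could compute such metrics and publish them in a `DataConcerns` record next to the data, so that consumers can compare offerings on more than function alone.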
DaaS design & implementation – not just "functional" aspects (2)
• Understand the DaaS ecosystem
• Specifying, Evaluating and Provisioning Data concerns and Data Contract
In follow-up lectures
WHAT ARE OTHER PATTERNS IN "TURNING DATA TO DAAS"?
Discussion time
DaaS ecosystems
Data Assessment and Enrichment
Source: Marco Comerio, Hong Linh Truong, Carlo Batini, Schahram Dustdar: Service-oriented data quality engineering and data publishing in the cloud. SOCA 2010: 1-6
Examples of service units in DaaS ecosystems
Platforms/services                 Capabilities
Strikeiron                         clean, verify and validate data
Jigsaw                             clean, verify and validate business contacts
PostcodeAnywhere                   capture, clean, validate and enrich business data
Trillium Software Quality          clean and standardize data
Uniserv Data Quality Solution X    profile and clean data
Adeptia Integration Solution       integrate data
Source: Marco Comerio, Hong Linh Truong, Carlo Batini, Schahram Dustdar: Service-oriented data quality engineering and data publishing in the cloud. SOCA 2010: 1-6
DaaS ecosystem – profiling/enriching example
http://www.strikeiron.com/
Cloud-based conceptual architecture for data quality and enrichment
Source: Marco Comerio, Hong Linh Truong, Carlo Batini, Schahram Dustdar: Service-oriented data quality engineering and data publishing in the cloud. SOCA 2010: 1-6
Data Enrichment using Web data
Source: Gomadam, K.; Yeh, P.Z.; Verma, K.; Miller, J.A., "Data Enrichment Using Web APIs," Services Economics (SE), 2012 IEEE First International Conference on, pp. 46-53, 24-29 June 2012
WHY DO YOU NEED TO STUDY DAAS CONCEPTS, DESIGN AND IMPLEMENTATION, AND ECOSYSTEMS?
Discussion time
Some conceptual questions
• What are the relationships between "data service unit" and DaaS?
• "Data service unit" versus DaaS versus Data Marketplace?
• The unit concept supports "composability"
  • What does "composability" of data service units mean? Multiple data service units or multiple data resources?
With the current trend in API management, where service providers focus on managing their API metadata and lifecycle, is the concept of "service unit" relevant to API management? What are the relationships between service units and APIs?
Exercises
• Read the mentioned papers
• Check the characteristics, service models and deployment models of the mentioned DaaS (and find out more)
• Identify services in the ecosystem of some DaaS
• Write small programs to test public DaaS, such as Xively, Microsoft Azure and Infochimps
• Turn some data into DaaS using existing tools
Thanks for your attention
Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
truong@dsg.tuwien.ac.at
http://dsg.tuwien.ac.at/staff/truong
