Vortex is a platform that provides seamless, ubiquitous, efficient and timely data sharing across mobile, embedded, desktop, cloud and web applications. Today Vortex is the enabling technology at the core of the most innovative Internet of Things and Industrial Internet applications, such as Smart Cities, Smart Grids, and Smart Traffic.
This two-part tutorial (1) introduces the key concepts of Vortex, (2) gets you started with using Vortex to efficiently exchange data across mobile, embedded, desktop, cloud and web applications, and (3) provides a series of best practices, patterns and idioms to get the best out of Vortex.
The only prerequisite to fully exploit this tutorial is a basic understanding of Java, C++ and JavaScript. Some knowledge of Scala and CoffeeScript is a plus.
3. Copyright PrismTech, 2014
The Vortex Platform
Vortex enables seamless, ubiquitous, efficient and timely data sharing across mobile, embedded, desktop, cloud and web applications. Vortex is based on the OMG DDS standard.
[Diagram: the Vortex platform, spanning Vortex Device and Vortex Cloud, with Tools, Integration and MaaS]
ChirpIt Requirements
To explore the various features provided by the Vortex platform we will be designing and implementing a micro-blogging platform called ChirpIt. Specifically, we want to support the following features:
- ChirpIt users should be able to “chirp”, “re-chirp”, “like” and “dislike” chirps, as well as get detailed statistics
- The ChirpIt platform should provide information on trending topics — identified by hashtags — as well as trending users
- Third-party services should be able to flexibly access slices of the produced chirps to perform their own trend analysis
- ChirpIt should scale to millions of users
- ChirpIt should be based on a Lambda Architecture
Calculating Trendy #hashtags
As an example of how VORTEX can be used to compute analytics, we’ll see how to compute trending topics, as identified by their #hashtag.
QoS Policies
VORTEX provides a rich set of QoS Policies to control local as well as end-to-end properties of data sharing. Some QoS Policies are matched based on a Request vs. Offered (RxO) model.
- RxO QoS: DURABILITY, DEADLINE, LATENCY BUDGET, LIVELINESS, RELIABILITY, DESTINATION ORDER, OWNERSHIP
- Local QoS: HISTORY, LIFESPAN, TRANSPORT PRIORITY, TIME-BASED FILTER, RESOURCE LIMITS, USER DATA, TOPIC DATA, GROUP DATA, OWNERSHIP STRENGTH, DW LIFECYCLE, DR LIFECYCLE, ENTITY FACTORY, PARTITION, PRESENTATION
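The RxO model can be illustrated with a minimal sketch: a match succeeds only when, for every requested policy, the writer's offered "strength" meets or exceeds what the reader requests. The class name, the string keys, and the numeric ordering below are illustrative simplifications, not the DDS API.

```java
import java.util.Map;

// Illustrative sketch of the Request-vs-Offered (RxO) matching rule.
// Higher value = stronger guarantee, e.g. for DURABILITY:
// VOLATILE(0) < TRANSIENT_LOCAL(1) < TRANSIENT(2) < PERSISTENT(3).
public class RxOSketch {
    public static boolean compatible(Map<String, Integer> offered,
                                     Map<String, Integer> requested) {
        for (Map.Entry<String, Integer> req : requested.entrySet()) {
            int off = offered.getOrDefault(req.getKey(), 0);
            if (off < req.getValue()) return false; // offered must meet or exceed requested
        }
        return true;
    }
}
```

For instance, a writer offering TRANSIENT durability matches a reader requesting TRANSIENT_LOCAL, while a BEST_EFFORT writer cannot satisfy a RELIABLE reader.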
Data Delivery
Data Delivery QoS Policies provide control over:
- who delivers data
- where data is delivered, and
- how data is delivered
[Diagram: Data Delivery policies, namely Reliability, Ownership, Ownership Strength, Presentation, Destination Order, Partition]
Data Availability
Data Availability QoS Policies provide control over data availability with respect to:
- Temporal Decoupling (late joiners)
- Temporal Validity
[Diagram: Data Availability policies, namely Durability, History, Lifespan]
Temporal Properties
Several policies provide control over temporal properties, specifically:
- Outbound Throughput
- Inbound Throughput
- Latency
[Diagram: TimeBasedFilter controls inbound throughput, Deadline relates to outbound throughput, TransportPriority and LatencyBudget control latency]
QoS Model
For data to flow from a DataWriter (DW) to one or many DataReaders (DR), a few conditions have to apply:
- The DR and DW domain participants have to be in the same domain
- The partition expression of the DR’s Subscriber and the DW’s Publisher should match (in terms of regular expression matching)
- The QoS Policies offered by the DW should exceed or match those requested by the DR
[Diagram: a Publisher/DataWriter and a Subscriber/DataReader join a Domain (by Domain Id) via their Domain Participants, write and read the same Topic, and match offered vs. requested RxO QoS Policies: DURABILITY, DEST. ORDER, RELIABILITY, LATENCY BUDGET, DEADLINE, OWNERSHIP, LIVELINESS, plus the PARTITION expression]
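The partition-matching condition above can be sketched as follows. DDS partition expressions use filesystem-style wildcards ('*', '?'); the sketch translates them into a Java regex and tests both directions, since either side may hold the wildcard. Helper names are illustrative, not the DDS API.

```java
import java.util.regex.Pattern;

// Sketch of partition matching between a Publisher and a Subscriber.
public class PartitionMatch {
    // Translate a '*' / '?' wildcard expression into a Java regex
    static Pattern globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            switch (c) {
                case '*': sb.append(".*"); break;
                case '?': sb.append('.'); break;
                default:  sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // A DW and a DR match if either side's expression matches the other's name
    public static boolean matches(String pubPartition, String subPartition) {
        return globToRegex(pubPartition).matcher(subPartition).matches()
            || globToRegex(subPartition).matcher(pubPartition).matches();
    }
}
```

With the geo-encoded partitions used later in this tutorial, a subscriber on "EU*:chirp:*" would match a writer in "EU:IT:Siracusa:chirp:archimede" but not one in "NA:CA:Pasadena:chirp:joe".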
History QoS Policy
The DataWriter HISTORY QoS Policy controls the amount of data that can be made available to late-joining DataReaders under TRANSIENT_LOCAL Durability.
The DataReader HISTORY QoS Policy controls how many samples will be kept in the reader cache:
- Keep Last. DDS will keep the most recent “depth” samples of each instance of data identified by its key
- Keep All. DDS keeps all the samples of each instance of data identified by its key, up to reaching some configurable resource limits
[Figure: a Pressure data stream over time as retained under KeepLast(1), KeepLast(3), and KeepAll]
Exploiting the History Policy in ChirpIt
The HISTORY QoS can be leveraged to automatically provide Chirps to late joiners. In other words, depending on application settings, VORTEX can be leveraged to automatically provide an application with the last n chirps produced.
Notice that HISTORY can be “DURABLE”, thus making it possible to completely decouple in time the availability of history.
Durability QoS Policy
The DURABILITY QoS controls the data availability w.r.t. late joiners; specifically, DDS provides the following variants:
- Volatile. No need to keep data instances for late-joining data readers
- Transient Local. Data instance availability for late-joining data readers is tied to the data writer availability
- Transient. Data instance availability outlives the data writer
- Persistent. Data instance availability outlives system restarts
Volatile Durability
[Diagram: a DataWriter publishing TopicA to a DataReader; late-joining readers do not receive earlier samples]
• No Time Decoupling
• Readers get only data produced after they joined the Global Data Space
Transient Local Durability
[Diagram: a DataWriter publishing TopicA; a late-joining DataReader receives earlier samples from the DataWriter’s history]
• Some Time Decoupling
• Data availability is tied to the availability of the data writer and the history settings
Transient Durability
[Diagram: a DataWriter publishing TopicA; samples are also stored by the durability service, from which late-joining DataReaders receive them even after the writer has left]
• Time Decoupling
• Data availability is tied to the availability of the durability service
Durability Service Configuration
Besides the service-specific configuration — which we won’t discuss here — it is important to understand that the amount of data that the durability service will maintain for a given topic is configured using the DurabilityService Policy.
The DurabilityService Policy, defined for topics, can be used to store:
- The last n samples for each topic instance
- All samples ever produced for a given Topic (across all its instances)
Resource constraints can also be specified to limit the maximum amount of data taken by a topic.
NOTE: beware that when you dispose an instance, its data is removed from the Durability Service.
Getting Data From the Durability Service
Data can be retrieved from the Durability Service in two ways:
Automatic Retrieval: Non-VOLATILE DataReaders automatically receive a set of historical data. How much data is received depends on the DR HISTORY setting and the Durability Service (or DW, for TRANSIENT_LOCAL) historical settings.
Query-Based Retrieval: Any application can create a “special” data reader to query the Durability Service. Queries can predicate on content as well as time.
- Get all Chirps made by @wolverine in the last 2 days:
  dr.getHistoricalData("uid = '@wolverine'", now() - 2 days, now())
- Get all Chirps containing the #hashtag “#xmen” in the last day:
  dr.getHistoricalData("msg like '*#xmen*'", now() - 1 day, now())
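The getHistoricalData calls above are pseudo-code; the same idea, filtering a stored set of chirps by a content predicate and a time window, can be sketched in plain Java. The Chirp shape and helper names are illustrative assumptions, not the Durability Service API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Sketch of query-based historical retrieval over an in-memory chirp store.
public class HistoricalQuery {
    public static class Chirp {
        public final String uid; public final String msg; public final long ts;
        public Chirp(String uid, String msg, long ts) {
            this.uid = uid; this.msg = msg; this.ts = ts;
        }
    }

    // Return chirps in [startTs, endTs] that satisfy the content predicate
    public static List<Chirp> getHistoricalData(List<Chirp> store,
                                                Predicate<Chirp> filter,
                                                long startTs, long endTs) {
        List<Chirp> result = new ArrayList<>();
        for (Chirp c : store) {
            if (c.ts >= startTs && c.ts <= endTs && filter.test(c)) result.add(c);
        }
        return result;
    }
}
```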
Exploiting VORTEX Durability in ChirpIt
VORTEX’s Durability can be leveraged to address several different use cases in ChirpIt:
Batch Layer: VORTEX durability can be used to persist all the chirps ever received by our application. Scalability can be easily achieved by partitioning (more later).
Speed Layer: Views on the data-set can be efficiently created using the Durability Service Query API.
Historical Data: Any analytics application, as well as any end-user application, can access historical data through either the Automatic or the Query-based delivery.
ChirpIt Architecture
[Diagram: chirpit apps and 3rd-party svcs exchange the chirp, stats and emotions topics over Cloud Messaging; in the data centre, the Durability Service backs the batch layer (Chirps, Historical Data), the speed layer (Chirps, trend analytics), and the serving layer]
Batch Layer Observations
Besides the Durability Service, you may implement the batch layer via:
- VORTEX Record and Replay (RnR) Service
- 3rd Party Big Data Store
Depending on the specific system requirements, one solution may be more appropriate than another. Notice, however, that the back-ends for both VORTEX Durability and RnR are pluggable, thus a big-data store back-end could easily be plugged in.
Big-Data Store Integration
It is trivial to push Chirps into a big data store such as HBase by using VORTEX Gateway. Just define the following route:
val chirpURI = "ddsi:ChirpAction:0/com.chirpIt.ChirpAction"
val hbaseURI = "hbase:chirpit?mappingStrategyName=body&operation=CamelHBasePut"
// ...
// Put incoming chirps into an HBase Table
chirpURI unmarshal(cdrData) process { e2d (_, "chirp") } to(hbaseURI)
Be Stateless
A good approach to deal with failures is to ensure that applications are stateless. The application state is maintained externally (in our case, on VORTEX Durability).
#hashtag ranking function
ChirpIt’s #hashtag ranking function will be stateless. The latest rankings will be maintained by the VORTEX durability service. This makes it possible to restore the state after a failure, as well as to easily do aggregations (more later).
[Diagram: the #hashtag ranking function consumes real-time data from the Chirps topic and the last ranking from the Ranking topic, and produces the latest ranking on the Ranking topic]
As such, the ranking function will consume the latest ranking along with live chirps and produce the new ranking for the basic interval (say, the shortest interval that will define our granularity and from which aggregations will be created).
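The step above can be sketched as a pure function: given the last ranking (restored from durability) and the chirps of the current interval, produce a new score map. The hashtag regex, the decay factor, and all names here are illustrative assumptions; how scores are actually aged and combined is an application design choice.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the stateless #hashtag ranking step.
public class HashtagRanker {
    static final Pattern HASHTAG = Pattern.compile("#\\w+");

    public static Map<String, Double> rank(Map<String, Double> lastRanking,
                                           List<String> chirps,
                                           double decay) {
        Map<String, Double> scores = new HashMap<>();
        // Age the previous interval's scores (illustrative decay choice)
        lastRanking.forEach((tag, s) -> scores.put(tag, s * decay));
        // Count fresh hashtag mentions in this interval's chirps
        for (String chirp : chirps) {
            Matcher m = HASHTAG.matcher(chirp);
            while (m.find()) scores.merge(m.group(), 1.0, Double::sum);
        }
        return scores;
    }
}
```

Because the function takes its whole state as input and returns it as output, a restarted instance only needs to re-read the last ranking from the durability service to resume.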
Trendy #hashtags
[Diagram: the ChirpIt architecture with the #hashtag ranking function in the speed layer, consuming real-time Chirps and the last Ranking from the Durability Service and publishing the latest Ranking towards the serving layer]
Scaling out the #hashtag ranking
Wait a moment… Do we have a single instance of the ranking function? How do we scale out? How can we partition the load?
Divide et Impera
It would be a good idea to add more structure to our partition to include some geographical information. We take advantage of information concerning the continent, country, region, and city of the registrant.
[Diagram: each user’s ChirpAction writer publishes into a per-user partition, e.g. chirp:wolverine for @wolverine, chirp:magneto for @magneto]
Divide et Impera
By encoding origins in the partitions associated with users we can easily and efficiently do regional aggregation. Thus all the chirps in the EU would be in EU*:chirp:*, while all chirps in Siracusa would be in EU:IT:Siracusa:chirp:*.
[Diagram: each user’s ChirpAction writer publishes into a geo-encoded partition, e.g. EU:IT:Siracusa:chirp:archimede for @archimede, EU:UK:London:chirp:william for @william, NA:CA:Pasadena:chirp:joe for @joe, SA:AR:Rosario:chirp:antonio for @antonio]
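Regional aggregation over these geo-encoded partition names can be sketched with a prefix count. The partition strings follow the slide's CONTINENT:COUNTRY:CITY:chirp:<user> scheme; the counting helper is an illustrative assumption, not a Vortex API.

```java
import java.util.Map;

// Sketch of regional aggregation over geo-encoded partition names.
public class RegionalAggregation {
    // Count chirps whose partition starts with the given region prefix, e.g. "EU"
    public static long countInRegion(Map<String, Integer> chirpsPerPartition,
                                     String prefix) {
        long total = 0;
        for (Map.Entry<String, Integer> e : chirpsPerPartition.entrySet()) {
            if (e.getKey().startsWith(prefix)) total += e.getValue();
        }
        return total;
    }
}
```

A ranking instance responsible for the EU would subscribe with prefix "EU", one responsible for Siracusa with "EU:IT:Siracusa", and so on, which is what makes hierarchical aggregation straightforward.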
Scaling out the #hashtag ranking
With the new partition organisation it is now trivial to scale out the ranking function. In addition, the application could easily support on-line reconfiguration, since we may want to consolidate or further distribute the load as the system evolves.
Some Observations
The processing pipeline can be easily reconfigured at runtime by simply changing the partition expressions on which every process operates.
Every single stage in the processing pipeline is stateless; its state is maintained on the Durability Service.
Hierarchical aggregation can be introduced easily too.
Ranking Topic
Notice that the Ranking topic is keyless. The durability service will be configured with KeepAll.
struct HashtagScore {
  string hashtag;
  float score;
};
typedef sequence<HashtagScore> HashtagScores;
struct HashtagRanking {
  long startTS;
  long endTS;
  HashtagScores htscores;
};
#pragma keylist HashtagRanking
Summary
In this presentation we have seen how the Vortex platform can be used to implement not only data sharing and distribution but also analytics. We have seen how VORTEX provides solutions for both the batch as well as the speed layer, thus making it quite easy to implement Lambda architectures.