© 2016 IBM Corporation
Non-Blocking Checkpointing for
Consistent Regions in Streams 4.2
IBM Streams Version 4.2
Fang Zheng
Streams Development
zhengfan@us.ibm.com
Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL
PURPOSES ONLY.
WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE
INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY
OF ANY KIND, EXPRESS OR IMPLIED.
IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY,
WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.
IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR
OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR
THEIR SUPPLIERS AND/OR LICENSORS); OR
• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT
GOVERNING THE USE OF IBM SOFTWARE.
IBM’s statements regarding its plans, directions, and intent are subject to change or
withdrawal without notice at IBM’s sole discretion. Information regarding potential
future products is intended to outline our general product direction and it should not
be relied on in making a purchasing decision. The information mentioned regarding
potential future products is not a commitment, promise, or legal obligation to deliver
any material, code or functionality. Information about potential future products may
not be incorporated into any contract. The development, release, and timing of any
future features or functionality described for our products remains at our sole
discretion.
Consistent Region and Guaranteed Tuple Processing
Since 4.0, Streams supports at-least-once tuple processing for a sub-graph of
operators via the Consistent Region feature
• Specify a consistent region via @consistent annotation in SPL source code
• Stateful operators in a consistent region are checkpointed in a coordinated manner
to form a consistent state of the region
• Upon failure, operators restore state from last checkpoints, and tuples are replayed
Documentation: https://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.dev.doc/doc/consistentregions.html
Streamsdev article: https://developer.ibm.com/streamsdev/2015/02/20/processing-tuples-least-infosphere-streams-consistent-regions
() as JCP = JobControlPlane() {}
@consistent(trigger=periodic, period=1.0)
stream<rstring ric> op1 = MySource() {
  ….
}
stream<float32 price> op2 = Aggregate(op1) {
  ….
}
stream<rstring ric> op3 = MyOp(op1) {
  ….
}
Consistent Cut Protocol
The consistent cut protocol is a distributed snapshot protocol based on the
Chandy-Lamport algorithm
• A consistent state comprises a collection of persisted operator states that are
consistent with having processed all tuples up to a certain logical point
• A consistent state is established or restored via propagating punctuations (called
markers) through the region
• Drain markers are used to suspend tuple processing and establish a
consistent state
• Resume markers are used to resume tuple processing
• Reset markers are used to restore consistent state
• For each consistent region, a central controller manages the establishing and
restoring of consistent state
Chandy-Lamport algorithm: https://en.wikipedia.org/wiki/Snapshot_algorithm
[Figure: tuples and markers flowing from op1 to op2, coordinated by the controller]
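To make the marker idea concrete, here is a minimal single-process sketch of a drain marker travelling through a two-operator pipeline. It is not the Streams runtime: `Item`, `SumOperator`, and `runPipeline` are hypothetical names invented for illustration, and the "persisted" snapshots are just values kept in memory.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stream item: either a data tuple or a drain marker.
struct Item {
    bool isMarker;
    int64_t value;  // payload for tuples; unused for markers
};

// Toy stateful operator (not a Streams class): keeps a running sum.
struct SumOperator {
    int64_t sum = 0;
    std::vector<int64_t> snapshots;  // stand-in for persisted checkpoints

    // Consume the input in order and return what is sent downstream.
    std::deque<Item> run(const std::deque<Item>& input) {
        std::deque<Item> output;
        for (const Item& item : input) {
            if (item.isMarker) {
                // The state here reflects exactly the tuples seen before the marker.
                snapshots.push_back(sum);
                output.push_back(item);  // forward the marker downstream
            } else {
                sum += item.value;
                output.push_back(Item{false, sum});  // emit the running sum
            }
        }
        return output;
    }
};

// Run op1 -> op2 over one input sequence and return each operator's first snapshot.
inline std::vector<int64_t> runPipeline(const std::deque<Item>& input) {
    SumOperator op1, op2;
    op2.run(op1.run(input));
    return {op1.snapshots.at(0), op2.snapshots.at(0)};
}
```

For the input 1, 2, marker, 3, both snapshots exclude the tuple that follows the marker, so together they describe one logical point in the stream.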
Enhancements in Streams 4.2
Increased level of concurrency in consistent cut protocol
• Markers are forwarded downstream before an operator finishes checkpointing itself
• Each PE has a background thread pool dedicated to checkpointing and
restoring operators
Non-blocking checkpointing API to allow operator state to be checkpointed in
the background, asynchronously with tuple processing
• Tuple flow can be resumed without waiting for the completion of checkpointing
Together, these enhancements reduce the time tuple flow has to block waiting
for the consistent cut protocol to complete
StateHandler Interface
Stateful operators that participate in consistent regions must
implement the StateHandler interface
• The StateHandler interface is available for both C++ and Java primitive operators
• The operator class derives from (C++) or implements (Java) the StateHandler
interface and provides implementations of its callback functions
StateHandler Interface documentation:
https://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.dev.doc/doc/consistentstatefuloperators.html
// Public header file in $STREAMS_INSTALL/include/SPL/Runtime/Operator/State/StateHandler.h
class StateHandler
{
public:
  virtual void drain() {} // drain() callback is invoked when drain marker reaches the operator
  virtual void checkpoint(Checkpoint & ckpt) {} // the callback to checkpoint operator state
  virtual void reset(Checkpoint & ckpt) {} // the callback to reset operator state from checkpoint
  virtual void resetToInitialState() {} // the callback to reset operator to initial state
  virtual void retireCheckpoint(int64_t id) {} // invoked when checkpoint of the given seq. ID is retired
  // new API introduced in version 4.2 for non-blocking checkpointing
  virtual void prepareForNonBlockingCheckpoint(int64_t seqId) {} // prepare operator for non-blocking checkpoint.
  virtual void regionCheckpointed(int64_t seqId) {} // invoked when the whole region is fully checkpointed.
};
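As a rough illustration of how an operator might fill in these callbacks, the sketch below uses simplified stand-ins rather than the real SPL headers so that it compiles on its own. The `Checkpoint` and `StateHandler` classes here are hypothetical mocks (the mock `Checkpoint` only stores 64-bit values and assumes stream-style `<<` and `>>` access), and `CountingOperator` is an invented example operator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SPL's Checkpoint: a FIFO of 64-bit values.
class Checkpoint {
public:
    Checkpoint& operator<<(int64_t v) { data_.push_back(v); return *this; }
    Checkpoint& operator>>(int64_t& v) { v = data_.at(readPos_++); return *this; }
private:
    std::vector<int64_t> data_;
    std::size_t readPos_ = 0;
};

// Hypothetical mock of the interface, reduced to the callbacks used below.
class StateHandler {
public:
    virtual ~StateHandler() {}
    virtual void drain() {}
    virtual void checkpoint(Checkpoint&) {}
    virtual void reset(Checkpoint&) {}
    virtual void resetToInitialState() {}
};

// Invented example operator whose only state is a tuple counter.
class CountingOperator : public StateHandler {
public:
    void processTuple() { ++count_; }
    int64_t count() const { return count_; }

    // Persist the counter into the checkpoint.
    void checkpoint(Checkpoint& ckpt) override { ckpt << count_; }
    // Restore the counter from a previously written checkpoint.
    void reset(Checkpoint& ckpt) override { ckpt >> count_; }
    // No checkpoint exists yet: return to the state at startup.
    void resetToInitialState() override { count_ = 0; }

private:
    int64_t count_ = 0;
};
```

In this sketch, checkpointing after two tuples, processing a third, and then calling reset() brings the counter back to two; resetToInitialState() brings it back to zero.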
Non-Blocking Checkpointing API
Operator state is checkpointed in two phases
• The operator first “prepares” its state, which is then persisted in checkpoint()
• Mental Model:
virtual void prepareForNonBlockingCheckpoint(int64_t id) { }
virtual void checkpoint(Checkpoint & ckpt) {}
[Figure: timeline of operator state: drain(), then prepare, then tuple processing, with checkpoint() running alongside tuple processing]
1. prepareForNonBlockingCheckpoint() is called after drain()
2. prepareForNonBlockingCheckpoint() has exclusive access to operator state; it should prepare the state for later checkpointing
3. operator state can be read and updated when new tuples are processed
4. checkpoint() must read the version of state when prepare() was called
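The two phases can be sketched with the simplest possible preparation step: copying the state. The code below is an illustration under stated assumptions, not the Streams API. `PriceTable`, `State`, and `BackendStore` are hypothetical names, the backend store is an in-memory map, and the calls run sequentially here, whereas in a real operator checkpoint() would run on a background thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using State = std::map<std::string, int64_t>;
// Hypothetical stand-in for the backend store: sequence ID -> persisted state.
using BackendStore = std::map<int64_t, State>;

class PriceTable {
public:
    // Tuple processing only ever touches the live state.
    void processTuple(const std::string& ric, int64_t price) { live_[ric] = price; }

    // Phase 1: runs with exclusive access, so a plain copy is safe.
    void prepareForNonBlockingCheckpoint(int64_t seqId) {
        snapshotSeqId_ = seqId;
        snapshot_ = live_;
    }

    // Phase 2: reads only the prepared copy, never the live state.
    void checkpoint(BackendStore& store) const { store[snapshotSeqId_] = snapshot_; }

    const State& live() const { return live_; }

private:
    State live_;
    State snapshot_;
    int64_t snapshotSeqId_ = -1;
};
```

Tuples processed between the prepare call and checkpoint() change only the live state, so the persisted version is the one that existed when prepare was called.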
How to Use Non-Blocking Checkpointing API
Using a lock to protect operator state won’t work, because tuple processing would
block for as long as checkpoint() holds the lock; instead, some form of copy-on-write
is needed
• Approach 1: if the operator state is small, prepareForNonBlockingCheckpoint() can
copy the state; checkpoint() then serializes the copy and persists it to the
backend store
• Approach 2: if the operator state is small, prepareForNonBlockingCheckpoint() can
serialize the state into a byte buffer (e.g., SPL::NativeByteBuffer); checkpoint()
then reads the serialized data from the byte buffer and persists it to the backend store
• Approach 3: use copy-on-write data structures to allow concurrent access to
operator state data; please check out the sample application on Streams github:
https://github.com/IBMStreams/samples/tree/master/ConsistentRegions/NonBlockingCheckpoint
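One way to picture the copy-on-write idea is a table held through std::shared_ptr: preparing shares the current table in constant time, and the first write afterwards clones it. This is a coarse, single-threaded sketch and not the linked sample's implementation. `CowTable` and its methods are hypothetical, and a real operator would need proper synchronization between the tuple-processing and checkpointing threads, and probably finer-grained copying than a whole-table clone.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

using Table = std::map<std::string, int64_t>;

class CowTable {
public:
    CowTable() : live_(std::make_shared<Table>()) {}

    // Tuple-processing path: clone the table only if a snapshot still shares it.
    void put(const std::string& key, int64_t value) {
        if (live_.use_count() > 1) {
            live_ = std::make_shared<Table>(*live_);  // copy on first write
            ++copies_;
        }
        (*live_)[key] = value;
    }

    // Prepare phase: O(1), the snapshot simply shares the current table.
    void prepareForNonBlockingCheckpoint() { snapshot_ = live_; }

    // Checkpoint phase: hand out the frozen version and drop our own reference.
    std::shared_ptr<const Table> takeSnapshot() {
        std::shared_ptr<const Table> s = snapshot_;
        snapshot_.reset();
        return s;
    }

    int copies() const { return copies_; }
    const Table& live() const { return *live_; }

private:
    std::shared_ptr<Table> live_;
    std::shared_ptr<const Table> snapshot_;
    int copies_ = 0;  // number of clones made, for illustration only
};
```

The trade-off in this sketch is that preparation is cheap regardless of state size, while the copy cost moves to the first update after each prepare call.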
How to Use Non-Blocking Checkpointing API
• Non-blocking checkpointing must be explicitly enabled by calling the following
function in the operator’s constructor:
virtual void ConsistentRegionContext::enableNonBlockingCheckpoint();
• A consistent region can contain both operators with non-blocking checkpointing
enabled and operators which do not support non-blocking checkpointing
• In this case, the tuple flow is resumed when all non-blocking operators have
finished preparation, and all blocking operators have finished checkpoint()
• When all operators are checkpointed, the start operator(s) of the consistent region
are notified via the regionCheckpointed() callback
Example of Consistent Region with Mixed Operators
• Suppose the consistent region contains three operators:
• Operator C has non-blocking checkpointing enabled, while A and B don’t.
• A is in its own PE, B and C are in another PE.
[Figure: timeline for operators A, B, and C and the controller. Legend: tuple processing (tp), stall, drain(), prepareForNonBlockingCheckpoint(), checkpoint()]
1. Controller triggers drain
2. Call A’s drain()
3. Forward Drain marker
4. Call B’s drain()
5. Forward Drain marker
6. Call C’s drain() and prepareForNonBlockingCheckpoint()
7. A’s checkpoint completed
8. B’s checkpoint and C’s prepare() completed
9. Resume submission
10. Forward Resume marker
11. Forward Resume marker
12. C’s checkpoint completed
State Transition of Consistent Region: processing; draining started at step 1; checkpoint pending from step 8 until step 12
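The resume rule for a mixed region can be written down as a small bookkeeping class. This is a sketch of the rule as stated on these slides, not the controller's actual code. `RegionTracker` and its method names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

class RegionTracker {
public:
    void addOperator(const std::string& name, bool nonBlocking) {
        ops_[name] = Op{nonBlocking, false, false};
    }
    void prepared(const std::string& name) { ops_.at(name).prepared = true; }
    void checkpointed(const std::string& name) { ops_.at(name).checkpointed = true; }

    // Tuple flow may resume once every non-blocking operator has prepared
    // and every blocking operator has finished checkpoint().
    bool canResume() const {
        for (const auto& kv : ops_) {
            const Op& op = kv.second;
            const bool ready = op.nonBlocking ? (op.prepared || op.checkpointed)
                                              : op.checkpointed;
            if (!ready) return false;
        }
        return true;
    }

    // The region is fully checkpointed once every operator has finished checkpoint().
    bool regionCheckpointed() const {
        for (const auto& kv : ops_) {
            if (!kv.second.checkpointed) return false;
        }
        return true;
    }

private:
    struct Op {
        bool nonBlocking;
        bool prepared;
        bool checkpointed;
    };
    std::map<std::string, Op> ops_;
};
```

Replaying the A, B, C example: after A's checkpoint alone the flow stays blocked; once B's checkpoint and C's prepare complete the flow can resume while the region is still not fully checkpointed; C's checkpoint completion is what would trigger regionCheckpointed().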
Summary
Non-blocking checkpointing, together with other enhancements for concurrency,
reduces the tuple flow blocking time when establishing a consistent state of a
consistent region
Useful links
• Streams documentation on non-blocking checkpointing API
• Streamsdev article on non-blocking checkpointing API (coming soon)
• Sample application on Streams github
