Realizing Parallelism and Transparency in Applications through Idempotence

Karthik Sankar
Undergraduate Student (Final Year),
Department of CSE,
National Institute of Technology, Tiruchirappalli
February 2010

© 2009 Unisys Corporation. All rights reserved.
Agenda
• Applications on the cloud – The Stakeholders
• The problem at hand
• Idempotence
• Referential Transparency
• Where does it all fit in
• Need for Idempotence
• Implementation Methodologies
• Future Scope
WHO WHAT WHERE WHY HOW

Applications on the cloud
THE STAKEHOLDERS
• Applications that can tap the potential of parallel computing
• Applications with many concurrent users
• HPC applications
– Financial Computing [ trading systems ]
– Data Mining Applications
WHY ARE THESE APPLICATIONS THE STAKEHOLDERS?
• Resilience
• Composability
• Graceful Recovery
• Geo-distribution

The Problem at Hand
DIFFICULTY IN …
• “Quick” rollback of system state in case of error
• Fault tolerance with lower latency
• Faster avoidance of imbalanced and inconsistent states
PROPOSED SOLUTION
• Maintaining Idempotence in Algorithms - micro-level
• Idempotence in Application model - macro-level
• Referential Transparency

Idempotence
DEFINITION
In mathematical terms, a function f: D -> D is idempotent if
f(f(x)) = f(x), for all x in D
• Importance of Idempotence
• Inherently Idempotent Functions
HOW TO ACHIEVE IDEMPOTENCE
1 At-least-once execution instead of exactly-once execution
2 No modification of any global variable
3 Make the temporary objects in use permanent
4 Maintain state by using a distributed and synchronized cache
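The definition f(f(x)) = f(x) can be checked directly on sample inputs. The following is a minimal Python sketch; the function names `normalize`, `append_bang` and `is_idempotent_on` are illustrative assumptions and do not come from the slides.

```python
def normalize(text: str) -> str:
    """Idempotent: trimming and lower-casing a second time changes nothing."""
    return text.strip().lower()


def append_bang(text: str) -> str:
    """Not idempotent: every application changes the result again."""
    return text + "!"


def is_idempotent_on(f, samples) -> bool:
    """Check f(f(x)) == f(x) for the given sample inputs only."""
    return all(f(f(x)) == f(x) for x in samples)


samples = ["  Hello ", "WORLD", ""]
print(is_idempotent_on(normalize, samples))    # True
print(is_idempotent_on(append_bang, samples))  # False
```

A check on samples can only refute idempotence, not prove it for the whole domain D.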

Referential Transparency
DEFINITION
An expression is said to be referentially transparent if it can be replaced with its value without
changing the behaviour of the program
IMPLICATIONS
1 Ability to do all kinds of things to existing programs automatically and be assured that
you did not break them or change their behaviour.
2 The program could be automatically transformed into a parallel version, or from parallel
to serial, or optimized, or changed in some other way and still behave the same way as
intended.
3 A compiler can refactor the program automatically
4 Automated Memoization

Where does it all fit in ?
IMPERATIVE (HOW) vs. DECLARATIVE (WHAT): LOGIC, CONSTRAINT, FUNCTIONAL
• The way of writing code must be changed
• Refactoring can be done in the code optimization phase of the compiler
MICRO / MACRO
STEP 1 Identify potential points of failure in the application model
STEP 2 Check if the state reached at this point can be inconsistent
STEP 3 Try to make the function stateless, idempotent and referentially transparent
STEP 4 In case of failure, flag the process and re-execute the transaction with the same input
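Step 4 can be sketched as a small retry loop. This is a minimal Python sketch under stated assumptions: `run_with_retry`, `set_balance`, the `ConnectionError` failure type and the in-memory `accounts` dictionary are all hypothetical stand-ins, not part of the original slides.

```python
def run_with_retry(operation, request, max_attempts=3):
    """Re-execute an idempotent operation with the same input after a failure."""
    failures = []  # flagged failed attempts
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(request), failures
        except ConnectionError as exc:
            failures.append((attempt, str(exc)))
    raise RuntimeError(f"gave up after {max_attempts} attempts: {failures}")


accounts = {}            # hypothetical application state
calls = {"count": 0}     # used only to simulate failures in this demo


def set_balance(request):
    """Idempotent: assigning the same balance again leaves the same state."""
    calls["count"] += 1
    accounts[request["account"]] = request["balance"]
    if calls["count"] < 3:
        # Simulate a lost acknowledgement after the write already happened.
        raise ConnectionError("ack lost")
    return "ok"


result, failures = run_with_retry(set_balance, {"account": "A1", "balance": 100})
print(result, len(failures), accounts)  # ok 2 {'A1': 100}
```

Because the operation assigns a value rather than incrementing one, executing it three times leaves the same state as executing it once, which is what makes the blind retry safe.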

Need For Idempotence
• A service invocation may fail for various reasons, including network connectivity
problems and resource failures (such as a database failure), and the client may then retry.
• Idempotent applications inherently and independently provide fault tolerance.
• Referential transparency can help in further optimization wherever possible, by
means of memoization, common subexpression elimination or parallelization.
• Reliability and availability also improve.
• Idempotent applications scale easily.
• Applications can be easily executed in distributed environments and in
multi-threaded, multi-core designs.

Implementation Methodologies
Two-step process to maintain idempotence:
1. Detecting duplicate requests
2. Caching of response messages
CLIENT SIDE PROCESSING:
• Creates a unique request identifier and passes it along with the message.
• For a retry, uses the same request identifier.
• There are various options for implementing a unique identifier:
– Business identifier [e.g., purchase order number]
– Hash of the input message
SERVER SIDE PROCESSING:
• Must inspect the request identifier and check whether the request has
already been processed; if so, return the stored response.
• If not, process the message and store the response in a persistent log
or database along with the request identifier.
Application: Idempotent Sub Layer, Non Idempotent Core
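The client-side and server-side steps above can be sketched as follows. This is a minimal Python sketch, not a definitive implementation: `IdempotentLayer`, `request_id_for` and `place_order` are hypothetical names, and an in-memory dictionary stands in for the persistent log or database.

```python
import hashlib
import json


class IdempotentLayer:
    """Idempotent sub layer placed in front of a non-idempotent core."""

    def __init__(self, core):
        self._core = core
        self._responses = {}  # stand-in for a persistent log or database

    def handle(self, request_id, message):
        # Step 1: detect a duplicate request by its identifier.
        if request_id in self._responses:
            return self._responses[request_id]
        # Step 2: process once, then cache the response under the identifier.
        response = self._core(message)
        self._responses[request_id] = response
        return response


def request_id_for(message: dict) -> str:
    """Client side: derive the identifier from a hash of the input message."""
    canonical = json.dumps(message, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


ledger = []  # hypothetical side effect of the non-idempotent core


def place_order(message):
    ledger.append(message)
    return {"status": "accepted", "order_no": len(ledger)}


layer = IdempotentLayer(place_order)
msg = {"symbol": "XYZ", "qty": 10}
rid = request_id_for(msg)
first = layer.handle(rid, msg)
retry = layer.handle(rid, msg)   # same identifier: cached response is returned
print(first == retry, len(ledger))  # True 1
```

A hash of the message treats two intentionally identical orders as one request; a business identifier such as a purchase order number avoids that. The sketch is also not safe for concurrent callers, which a real store would have to handle.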

Conclusions and Future Scope
• By implementing idempotence in applications, we can achieve parallelism,
transparency and atomicity.
• Increased fault tolerance and transparency features make idempotent applications a
favourite choice for deployment on the cloud.
• Idempotence and Referential Transparency are recent topics in computing and have
great potential for further research.
• For easy deployment in the cloud, applications need not be developed from scratch.
Instead, a common framework can be devised to standardize the process of
refactoring existing applications to become cloud-compatible, transparent and
idempotent.

Questions ?

THANK YOU

ANNEXURE

ACID Properties
Atomicity
Consistency
Isolation
Durability
• Valid mainly for single transactions => “Exactly Once” execution
• Difficult to guarantee ACID in a distributed environment
• The 2PC protocol for distributed environments has disadvantages

2-Phase Commit Protocol
• Distributed Atomic Transaction
• Not resilient to all possible failures
• Not idempotent
• Blocking Protocol

Declarative Programming
FUNCTIONAL PROGRAMMING
• Treats computation as the evaluation of mathematical functions
• Avoids state and mutable data
LOGIC PROGRAMMING
• Use of mathematical logic for computer programming
CONSTRAINT PROGRAMMING
• Relations between variables are stated in the form of constraints
• They do not specify a step or sequence of steps to execute, but rather the properties of a
solution to be found

Imperative vs. Declarative
THE X = X + 1 PROBLEM
HOW (imperative):
LET SUM = 0
FOR I = 1 TO 100
    LET SUM = SUM + I
NEXT I
WHAT (declarative):
SUM(1): RETURN 1
SUM(I): NEW_I = I - 1; RETURN I + SUM(NEW_I)
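The same contrast can be written in Python. This is a minimal sketch; the function names are illustrative assumptions. The first version reassigns a variable step by step (the X = X + 1 style), while the second only states what the sum is in terms of a smaller sum.

```python
def sum_imperative(n: int) -> int:
    """HOW: mutate an accumulator in a fixed sequence of steps."""
    total = 0
    for i in range(1, n + 1):
        total = total + i
    return total


def sum_declarative(n: int) -> int:
    """WHAT: SUM(1) is 1, and SUM(n) is n + SUM(n - 1). Assumes n >= 1."""
    return 1 if n == 1 else n + sum_declarative(n - 1)


print(sum_imperative(100), sum_declarative(100))  # 5050 5050
```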

Referential Transparency - 1
IMPORTANCE
• Free of side effects
• Pure functions
• Used to reason about program behaviour
NO DEFINITE SEQUENCE POINT IN A PROGRAM
ENFORCING REFERENTIAL TRANSPARENCY
Languages that enforce referential transparency incorporate mechanisms that make sequential
operations easier to express while retaining the purely functional quality of the
language, such as definite clause grammars and monads.

Referential Transparency - 2
EXAMPLE
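globalValue = 0;
integer function rq(integer x)
begin
    globalValue = globalValue + 1;
    return x + globalValue;
end
integer function rt(integer x)
begin
    return x + 1;
end
integer p = rq(x) + rq(y) * (rq(x) - rq(x));
NOT REFERENTIALLY TRANSPARENT: rq(x) - rq(x) cannot be simplified to 0, because each
call to rq changes globalValue and therefore returns a different value.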
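A direct Python transcription of the example makes the difference observable. This is a sketch; only the variable names `diff_rq` and `diff_rt` are additions.

```python
global_value = 0


def rq(x: int) -> int:
    """Not referentially transparent: reads and modifies a global variable."""
    global global_value
    global_value = global_value + 1
    return x + global_value


def rt(x: int) -> int:
    """Referentially transparent: the result depends only on x."""
    return x + 1


x = 5
diff_rq = rq(x) - rq(x)  # (5 + 1) - (5 + 2): the two calls disagree
diff_rt = rt(x) - rt(x)  # (5 + 1) - (5 + 1): safe to simplify to 0
print(diff_rq, diff_rt)  # -1 0
```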

Automatic Memoization
MEMOIZATION
In computing, memoization is an optimization technique used primarily to speed up computer
programs by having function calls avoid repeating the calculation of results for
previously processed inputs
• Referentially transparent functions may be automatically memoized externally
• Used in:
– Artificial Intelligence
– Term Rewriting
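External memoization of a referentially transparent function can be sketched with Python's standard `functools.lru_cache` decorator. The function `slow_square` is a hypothetical example; the function body itself is not changed to add caching.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Pure function: the same input always gives the same result."""
    return n * n


results = [slow_square(4), slow_square(4), slow_square(5)]
info = slow_square.cache_info()
print(results)                 # [16, 16, 25]
print(info.hits, info.misses)  # 1 2
```

The second call with 4 is answered from the cache. This is only safe because the function has no side effects and does not depend on external state.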

Implementation Methodologies
The compiler optimization process can be modified to identify the parts of the program that
can be made idempotent.
• Use local variables instead of global variables wherever necessary
• Write back temporary objects to permanent memory
• Isolate code that can act as pure functions into separate methods
The methods that have been made idempotent can be separated out into a different layer of
code, which can exist independently and uniquely for a given set of inputs.
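The first and third points can be illustrated with a small manual refactoring. This is a hedged Python sketch of the idea, not the compiler-driven process the slide proposes; the fee rule and all names (`charge_fee_impure`, `fee_for`, `charge_fee_pure`) are hypothetical.

```python
from typing import Tuple

# Before: the function modifies a global variable, so re-executing it
# after a failure changes the program state a second time.
total_fees = 0


def charge_fee_impure(amount_cents: int) -> int:
    global total_fees
    fee = amount_cents // 100
    total_fees += fee
    return amount_cents + fee


# After: the calculation is isolated into a pure function that uses only
# local variables, and the caller decides what to do with the fee.
def fee_for(amount_cents: int) -> int:
    return amount_cents // 100


def charge_fee_pure(amount_cents: int) -> Tuple[int, int]:
    fee = fee_for(amount_cents)
    return amount_cents + fee, fee


charge_fee_impure(1000)
charge_fee_impure(1000)       # a retry: the global total is counted twice
print(total_fees)             # 20
print(charge_fee_pure(1000))  # (1010, 10)
print(charge_fee_pure(1000))  # (1010, 10), a retry yields the same result
```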

Example – Online Trade Execution
[Figure: trade execution flow in six numbered steps among User Application, Trade Execution,
Savings Account and Trading Account. Messages: REQUEST, ORDER, DEBIT, CREDIT, ACK - 1,
ACK - 2, ACK - 3. Legend: "ACID Transaction: Safe [assumption]" and "Need to be made idempotent".]
