1. Realizing Parallelism and Transparency in
Applications through Idempotence
Karthik Sankar
Undergraduate Student (Final Year),
Department of CSE,
National Institute of Technology, Tiruchirappalli
February 2010
2. Page 2© 2009 Unisys Corporation. All rights reserved.
Agenda
• Applications on the cloud – The Stakeholders
• The problem at hand
• Idempotence
• Referential Transparency
• Where does it all fit in
• Need for Idempotence
• Implementation Methodologies
• Future Scope
WHO WHAT WHERE WHY HOW
3. Applications on the cloud
THE STAKEHOLDERS
• Applications that can tap the potential of parallel computing
• Applications with many concurrent users
• HPC applications
– Financial Computing [ trading systems ]
– Data Mining Applications
WHY ARE THESE APPLICATIONS THE STAKEHOLDERS?
• Resilience
• Composability
• Graceful Recovery
• Geo-distribution
4. The Problem at Hand
DIFFICULTY IN …
• “Quick” rollback of system state in case of error
• Fault tolerance with low latency
• Faster avoidance of imbalanced and inconsistent states
PROPOSED SOLUTION
• Maintaining Idempotence in Algorithms - micro-level
• Idempotence in Application model - macro-level
• Referential Transparency
5. Idempotence
DEFINITION
In mathematical terms, a function f: D -> D is idempotent if
f (f (x)) = f (x), for all x in D
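As a rough illustration of the definition, the sketch below checks f(f(x)) = f(x) on a few sample values. The function names `normalize`, `increment` and `is_idempotent_on` are made up for this example and are not part of the original talk; a check on samples can only refute idempotence, not prove it for all x in D.

```python
def normalize(text: str) -> str:
    """Idempotent: trimming and lower-casing a second time changes nothing."""
    return text.strip().lower()


def increment(n: int) -> int:
    """Not idempotent: every further application changes the result again."""
    return n + 1


def is_idempotent_on(f, samples) -> bool:
    """Return True if f(f(x)) == f(x) holds for every sample value x."""
    return all(f(f(x)) == f(x) for x in samples)


print(is_idempotent_on(normalize, ["  Hello ", "WORLD", ""]))  # True
print(is_idempotent_on(abs, [-3, 0, 7]))                       # True
print(is_idempotent_on(increment, [0, 1, 2]))                  # False
```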
Importance of Idempotence
Inherently Idempotent Functions
HOW TO ACHIEVE IDEMPOTENCE
1. At-least-once execution instead of exactly-once execution
2. No modification to any global variable
3. Persist temporary objects that are used, making them permanent objects
4. Maintain state by using a distributed and synchronized cache
6. Referential Transparency
DEFINITION
An expression is said to be referentially transparent if it can be replaced with its value without
changing the behaviour of the program
IMPLICATIONS
1. Existing programs can be transformed automatically, with the assurance that the
transformation did not break them or change their behaviour.
2. The program could be automatically transformed into a parallel version, or from parallel
to serial, or optimized, or changed in some other way and still behave as intended.
3. A compiler can refactor the program automatically.
4. Automated memoization.
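Implication 2 can be sketched as follows, assuming a pure function (here a hypothetical `square`): because its result depends only on its argument, the serial loop and the thread-pool version are interchangeable. The names and the pool size are illustrative choices, not taken from the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n: int) -> int:
    """Pure function: the result depends only on the argument."""
    return n * n


inputs = list(range(10))

# Serial evaluation.
serial = [square(n) for n in inputs]

# Parallel evaluation; Executor.map yields results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, inputs))

print(serial == parallel)  # True

# Replacing an expression by its value does not change the result either.
print(square(3) + square(3) == 2 * 9)  # True
```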
7. Where does it all fit in?
IMPERATIVE (HOW)
DECLARATIVE (WHAT): LOGIC, CONSTRAINT, FUNCTIONAL
The way of writing code must be changed:
Refactoring can be done in the code optimization phase of the compiler
MICRO
STEP 1: Identify potential points of failure in the application model
STEP 2: Check if the state reached at this point can be inconsistent
STEP 3: Try to make the function stateless, idempotent and referentially transparent
STEP 4: In case of failure, flag the process, re-execute the transaction with same input
MACRO
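The last step (flag the failure, then re-execute with the same input) might look like the sketch below. Everything here is hypothetical: `run_with_retry`, the `flagged` list, the in-memory `balances` store and the simulated lost acknowledgement exist only for illustration. The point is that the re-execution is harmless because `set_balance` is idempotent.

```python
flagged = []   # record of executions that failed at least once
balances = {}  # stand-in for application state
calls = {"count": 0}


def run_with_retry(operation, request, max_attempts=3):
    """Re-execute `operation` with the same input until it succeeds."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(request)
        except ConnectionError as error:
            # Flag the failed attempt, then fall through to the next attempt.
            flagged.append((request, attempt, str(error)))
    raise RuntimeError(f"request {request!r} failed after {max_attempts} attempts")


def set_balance(request):
    """Idempotent: assigning the same value twice leaves the same state."""
    calls["count"] += 1
    account, amount = request
    balances[account] = amount
    if calls["count"] == 1:
        # Simulated failure after the write: the acknowledgement is lost.
        raise ConnectionError("acknowledgement lost")
    return balances[account]


result = run_with_retry(set_balance, ("acct-1", 100))
print(result, balances, len(flagged))
```

Had the operation been "add 100 to the balance" instead, the same retry would have applied the change twice.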
8. Need For Idempotence
• A service invocation may fail for various reasons, including network connectivity
problems and resource failures (such as a database failure), and the client may retry
• Idempotent applications inherently and independently provide fault tolerance
• Referential transparency can help in further optimizing wherever possible, by
means of memoization, common subexpression elimination or parallelization
• Reliability and availability also improve
• Easily scalable
• Applications can be easily executed in distributed environments and in
multi-threaded, multi-core designs
9. Implementation Methodologies
Two-step process to maintain idempotence:
1. Detecting duplicate requests
2. Caching of response messages
CLIENT SIDE PROCESSING:
Creates a unique request identifier and passes it along with the message.
On retry, reuses the same request identifier.
There are various options for implementing a unique identifier:
• Business identifier [e.g. purchase order number]
• Hash of the input message
SERVER SIDE PROCESSING:
Must inspect the request identifier and check whether the request has
already been processed; if so, return the stored response.
If not, process the message and store the response in a persistent log
or database along with the request identifier.
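A minimal sketch of the two-step process, under stated assumptions: the identifier is a SHA-256 hash of the JSON-encoded message, and an in-memory dictionary stands in for the persistent log or database. `request_id_for`, `IdempotentServer` and `place_order` are hypothetical names introduced for this example.

```python
import hashlib
import json


def request_id_for(message: dict) -> str:
    """Client side: derive the request identifier from a hash of the message."""
    canonical = json.dumps(message, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class IdempotentServer:
    """Server side: detect duplicate requests and replay the cached response."""

    def __init__(self, handler):
        self._handler = handler
        self._responses = {}  # assumption: stands in for a persistent store

    def handle(self, request_id: str, message: dict) -> dict:
        if request_id in self._responses:
            # Duplicate request: return the stored response, do not reprocess.
            return self._responses[request_id]
        response = self._handler(message)
        self._responses[request_id] = response
        return response


orders_placed = []


def place_order(message: dict) -> dict:
    """Non-idempotent core: every call places a new order."""
    orders_placed.append(message["item"])
    return {"status": "accepted", "order_no": len(orders_placed)}


server = IdempotentServer(place_order)
message = {"item": "book", "quantity": 1}
request_id = request_id_for(message)

first = server.handle(request_id, message)
retry = server.handle(request_id, message)  # same identifier on retry
print(first == retry, len(orders_placed))   # True 1
```

Note that a hash-based identifier treats two intentionally identical orders as one request, which is one reason a business identifier may be preferable. The sketch also ignores concurrency and a crash between processing and storing the response.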
[Diagram: Application, consisting of an Idempotent Sub Layer and a Non Idempotent Core]
10. Conclusions and Future Scope
• By implementing idempotence in applications, we can achieve parallelism,
transparency and atomicity.
• Increased fault tolerance and transparency features make idempotent applications a
favourite choice for deployment on the cloud.
• Applying Idempotence and Referential Transparency to cloud applications is a recent
topic with great potential for further research.
• For easy deployment in cloud, applications need not be developed from scratch.
Instead, a common framework can be devised to standardize the process of
refactoring existing applications to become cloud-compatible, transparent and
idempotent.
11. Questions?
12. THANK YOU
13. ANNEXURE
14. ACID Properties
Atomicity
Consistency
Isolation
Durability
Valid mainly for single transactions => “Exactly Once” Execution
Difficult to guarantee ACID in a distributed environment
The 2PC protocol for distributed environments has disadvantages
15. 2-Phase Commit Protocol
• Distributed Atomic Transaction
• Not resilient to all possible failures
• Not idempotent
• Blocking Protocol
16. Declarative Programming
FUNCTIONAL PROGRAMMING
• Treats computation as the evaluation of mathematical functions
• Avoids state and mutable data
LOGIC PROGRAMMING
• Use of mathematical logic for computer programming
CONSTRAINT PROGRAMMING
• Relations between variables are stated in the form of constraints
• They do not specify a step or sequence of steps to execute, but rather the properties of a
solution to be found
17. Imperative vs. Declarative
THE X = X + 1 PROBLEM
HOW (imperative):
LET SUM = 0
FOR I = 1 TO 100
LET SUM = SUM + I
NEXT I
WHAT (declarative):
SUM(1) THEN RETURN 1
SUM(I) THEN NEW_I = I - 1; RETURN {I + SUM(NEW_I)}
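The two summation styles on this slide can be sketched in Python as follows. The function names are illustrative: the first version rebinds `total` on every iteration (the "X = X + 1" pattern), while the second states what the sum is in terms of a smaller sum and never reassigns a variable.

```python
def sum_imperative(n: int) -> int:
    """HOW: step through 1..n, updating a running total in place."""
    total = 0
    for i in range(1, n + 1):
        total = total + i  # the same name is rebound on every step
    return total


def sum_declarative(i: int) -> int:
    """WHAT: SUM(1) is 1, and SUM(I) is I + SUM(I - 1)."""
    if i == 1:
        return 1
    return i + sum_declarative(i - 1)


print(sum_imperative(100))   # 5050
print(sum_declarative(100))  # 5050
```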
18. Referential Transparency - 1
IMPORTANCE
• Free of side effects
• Pure functions
• Used to reason about program behaviour
NO DEFINITE SEQUENCE POINT IN A PROGRAM
ENFORCING REFERENTIAL TRANSPARENCY
Incorporate mechanisms that make it easier to handle state while retaining the purely
functional quality of the language, such as definite clause grammars and monads.
19. Referential Transparency - 2
EXAMPLE
globalValue = 0;
integer function rq(integer x)
begin
    globalValue = globalValue + 1;
    return x + globalValue;
end
integer function rt(integer x)
begin
    return x + 1;
end
integer p = rq(x) + rq(y) * (rq(x) - rq(x));
NOT REFERENTIALLY TRANSPARENT
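A direct Python transcription of the example above makes the difference observable. Since `rq` changes `global_value` on every call, two calls with the same argument return different values, so `rq(x) - rq(x)` cannot be simplified to 0, whereas `rt(x) - rt(x)` can. The value chosen for `x` is arbitrary.

```python
global_value = 0


def rq(x: int) -> int:
    """Not referentially transparent: reads and modifies global state."""
    global global_value
    global_value = global_value + 1
    return x + global_value


def rt(x: int) -> int:
    """Referentially transparent: the result depends only on x."""
    return x + 1


x = 5
print(rt(x) - rt(x))  # 0
print(rq(x) - rq(x))  # -1, because the second call sees a larger global_value
```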
20. Automatic Memoization
MEMOIZATION
In computing, memoization is an optimization technique used primarily to speed up computer
programs by having function calls avoid repeating the calculation of results for previously
processed inputs
• Referentially transparent functions may be automatically memoized externally
• Used in:
– Artificial Intelligence
– Term Rewriting
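External memoization can be sketched with Python's standard `functools.lru_cache` decorator, which wraps a function without changing its body. The Fibonacci function and the `call_count` counter are illustrative only; the counter is instrumentation added to show how often the body runs and is not part of the function's result.

```python
from functools import lru_cache

call_count = 0  # instrumentation only: counts executions of the function body


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Result depends only on n, so cached results can be reused safely."""
    global call_count
    call_count += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))     # 832040
print(call_count)  # 31: the body runs once for each n from 0 to 30
```

Without the decorator the same call would execute the body over a million times, since each result would be recomputed for every occurrence.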
21. Implementation Methodologies
The compiler optimization process can be modified to identify the parts of the program that
can be made idempotent.
• Using local variables instead of global variables wherever possible
• Writing temporary objects back to permanent memory
• Isolating code that can act as pure functions into separate methods
The methods that have been made idempotent can be separated out into a different layer of
code, which can exist independently and uniquely for a given set of inputs.
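As a hedged sketch of the first and third points, the snippet below contrasts a function that depends on a global variable with a refactored version that receives the state as an argument and returns the new state. The names `running_total`, `add_to_total` and `add` are invented for this example; how a compiler would perform such a refactoring automatically is not shown.

```python
# Before: the result depends on, and modifies, a global variable.
running_total = 0


def add_to_total(amount: int) -> int:
    global running_total
    running_total += amount
    return running_total


# After: the state is passed in and the new state is returned.
# The function is pure, so the same inputs always give the same output.
def add(total: int, amount: int) -> int:
    return total + amount


print(add_to_total(5), add_to_total(5))  # 5 10: same input, different results
print(add(10, 5), add(10, 5))            # 15 15: same input, same result
```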
22. Example – Online Trade Execution
[Diagram: components User Application, Trade Execution, Trading Account and Savings Account; messages REQUEST, ORDER, DEBIT, CREDIT, ACK - 1, ACK - 2 and ACK - 3, in six numbered steps]
Legend: ACID Transaction Safe [assumption]; Need to be made idempotent