The document describes a secure checkpointing approach for mobile environments. It proposes using elliptic curve cryptography combined with checkpointing to provide a low overhead, secure, fault tolerant system. Key points:
- Checkpointing is used to save system states to allow recovery from failures. Elliptic curve cryptography provides security by encrypting communication and generating digital signatures.
- The approach shifts cryptographic calculations to base stations to reduce mobile node overhead. Checkpoints and recovery information are stored at base stations.
- Mobile nodes save checkpoints and transfer them to the base station they are currently connected to. A recovery algorithm allows processes to roll back and resume from the last saved checkpoint if a failure occurs.
information. After getting the information, the initiator calculates the recovery line and broadcasts a rollback
request message along with the recovery line. On receiving it, a process whose current state belongs to the
recovery line simply resumes execution; otherwise it rolls back to the earlier checkpoint indicated by the recovery line.
Some important standards for data communication and their security associations are introduced in [3]. The
network provider identifies the user by checking whether the user knows the IMSI and the user key Ki. The
user often sends the TMSI (temporary IMSI) to the base station to hide the identity of the MS. Because data
is transferred over the air, an attacker with a radio receiver has a high chance of intercepting it, so the data
must be encrypted to keep it secure during communication. In [5], communication security is defined by the
confidentiality, integrity, authentication and non-repudiation of transmitted data. Authentication protocols
are discussed in [6]. A vulnerability can be defined as a flaw or weakness in system security procedures,
design, implementation, or internal controls [4]. In [4] Bharat Bhargava et al. discuss four mechanisms to
reduce the vulnerabilities and threats of a system. To recover from security attacks during communication
we concentrate on cryptographic mechanisms [12]. ECC, one of the public key cryptosystems, is a low
overhead cryptographic technique.
The works presented in [8], [9] and [10] report that ECC performs better than other public key
cryptosystems. In [8], software implementations of the DSA and RSA digital signature schemes are
compared with ECDSA. Experiments were performed on both PCs and mobile devices. A signature
operation includes key generation, signing of the document, and verification of the signature. Wendy Chou
concludes in [8] that RSA verification is faster than ECDSA verification, but that the ECDSA signature
operation is faster than those of RSA and DSA. For DSA, the verification process alone takes much longer
than the total processing time of ECDSA. ECDSA therefore performs better in a mobile environment than
the other public key schemes.
Our main attention is on secure checkpointing. Two models of secure checkpointing have been discussed:
distributed checkpointing [2] and probabilistic checkpointing [7]. The model described in [7] combines a
key agreement model with a secure probabilistic checkpointing scheme. There are two types of nodes,
checkpoint nodes and recovery nodes. The number of recovery nodes is small, and they are used for
recovery when checkpoint nodes fail. The DES algorithm is used there for authentication and encryption.
III. PROBLEM IDENTIFICATION
From the related work discussed so far, the existing secure checkpointing algorithm [7] implements secret key
cryptography. Secret key cryptography ensures authentication and data confidentiality, but threats to data
integrity and non-repudiation remain a big concern. We address all these issues in our work. As a solution
we combine elliptic curve cryptography with checkpointing. We chose ECC because it is a low overhead
public key cryptography algorithm that is suitable for resource-constrained mobile computing systems.
A. Design Issues
The design of a low overhead, secure, fault tolerant mobile computing system includes the following design issues:
- To reduce the computation and storage overhead of mobile nodes: calculations related to cryptography are shifted to base stations, and recovery information and checkpoint logs are saved in base stations.
- To secure communication: data is always encrypted when transferred through wired and wireless links.
- To ensure trusted computing: trust = fault tolerance (checkpointing) + security (cryptography).
- Mobile nodes save checkpoints and transfer them to their current base stations.
IV. THE PROPOSED SECURE CHECKPOINTING ALGORITHM
A. System Model And Assumptions
In the system model (Fig. 1) there are a large number of mobile nodes and a few base stations. The mobile
nodes communicate through a wireless network. There is no shared memory; the mobile nodes communicate
with each other by message passing, and all messages are sent and received through the base stations. The
base stations are more secure than the mobile nodes. A mobile node is represented by the process running on it.
The assumptions of the system model are:
- Failures can be link failures, node failures, etc. Here a failure is a node failure unless stated otherwise.
- Mobile nodes connected to the same base station are local to each other, and mobile nodes connected to different base stations are remote to each other.
- Fault tolerance is considered here at the operating system level only.
B. Data Structure And Notation
Sn: sender; Rc: receiver; pb: public key; pr: private key; pbsn: public key of sender; prsn: private key of
sender; pbrc: public key of receiver; prrc: private key of receiver; pr1, pr2, pr3, …, pri: i processes;
BSq: qth base station; MNoq: oth mobile node of the qth base station; BS[u]: array of u elements
maintained by a base station (BS[1] holds the base station number; BS[2, 3, 4, 5, …, u] keep track of the
mobile nodes attached to it); MNpri[v]: array of v elements maintained by each mobile node (MNpri[1]
holds its own base station number; MNpri[2, 3, …, v] hold the numbers of the base stations it has already
traversed); IDMNoq: identification number of the oth mobile node of the qth base station; PWDMNoq:
password of the oth mobile node of the qth base station; RP(pri): set of restart points.

T = (p, a, b, G, n, h) or (m, f(x), a, b, G, n, h): domain parameters of ECC over a prime field or a binary
field; p: number of elements of the finite field; m: defines the finite field F(2^m); f(x): irreducible
polynomial; a and b: two elements specifying the elliptic curve; G: generator point (xg, yg), a point on the
elliptic curve used for cryptographic operations; n: order of the generator point G; h: cofactor,
h = #E(Fp)/n, where #E(Fp) is the number of points on the elliptic curve; F(p): the finite field; Fp: prime
field; F2^m: binary field; P: point of the elliptic curve; O: point at infinity of the elliptic curve.

Msg: message (contains: sender, MNpri[v] of the sender, receiver, MNpri[v] of the receiver, message body);
Cpt: ciphertext; E_msg: encrypted form of Msg; MAC: chosen MAC scheme [SHA-1-160 or SHA-1-80];
En_schm: symmetric encryption scheme such as AES or DES; Mc_schm: message authentication code
algorithm such as HMAC; Dn_schm: symmetric decryption scheme; enky: encryption key; dcky: decryption
key; mcky: MAC key; tg: tag; Hash: chosen hash function SHA-1.

CP: checkpoint; TCPprijk: jth temporary checkpoint of process pri of base station k; PCPprijk: jth
permanent checkpoint of process pri of base station k; CPprijk-1-CPprijk: checkpoint interval; Ip_I_Tprik:
input information table of process pri of base station k; tlpri: local clock; Thrl: threshold value of the local
clock; Thrc: a constant threshold value of each process; snd_sq_noprik: send sequence number;
rcv_sq_noprik: receive sequence number; Mlog_T: table that maintains the send sequence numbers along
with the receive sequence numbers.

Figure 1. System Model
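To make the domain parameters T = (p, a, b, G, n, h) concrete, the following minimal sketch implements point addition and scalar multiplication over a prime field. The curve y^2 = x^3 + 2x + 2 over F_17 with G = (5, 1), n = 19 and h = 1 is a small textbook curve chosen here purely for illustration; it is an assumption, not a parameter set from this paper, and it is far too small to be secure. The function names `add`, `mul` and `on_curve` are hypothetical.

```python
# Toy domain parameters T = (p, a, b, G, n, h); illustrative only, not secure.
p, a, b = 17, 2, 2
G = (5, 1)
n, h = 19, 1
O = None  # point at infinity


def on_curve(P):
    """Check that P satisfies y^2 = x^3 + a*x + b over F_p."""
    if P is O:
        return True
    x, y = P
    return (y * y - (x * x * x + a * x + b)) % p == 0


def add(P, Q):
    """Add two curve points using the chord-and-tangent rule."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O  # P + (-P) = O
    if P == Q:
        s = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p  # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p       # chord slope
    x3 = (s * s - x1 - x2) % p
    return (x3, (s * (x1 - x3) - y1) % p)


def mul(k, P):
    """Compute k.P by double-and-add."""
    R = O
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R
```

With these definitions a private key is an integer pr in [1, n-1] and the matching public key is pb = mul(pr, G); mul(n, G) returns the point at infinity because n is the order of G.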
C. Secure Checkpointing Algorithm
Secure_checkpointing_algorithm ()
Step1: The processes are authenticated using Authenticate().
Step2: After authentication the processes start communicating by sending messages; each process logs its
messages in volatile memory and maintains snd_sq_noprik.
Step3: Check:
If MNpri[1] of the sender is equal to MNpri[1] of the receiver
Then the process encrypts the message using Encrypt_msg() and sends it;
Else the process generates a signature for the message using Signature_gn() and sends it;
Step4: When a process receives a message it returns an acknowledgement along with rcv_sq_noprik.
Step5: If MNpri[1] of the receiver is equal to MNpri[1] of the sender
Then the process decrypts the message using Decrypt_msg() and saves it to local stable storage;
Else the base station verifies the signature of the message using Verify_sig() and then sends the
encrypted message to the process; the process decrypts the message using Decrypt_msg() and
saves it in stable storage;
Step6: The processes take checkpoints using Checkpoint().
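The branching in Step3 and Step5 can be sketched as follows. This is only an illustration of the control flow: the `Process` class and the functions `send_action` and `receive_actions` are hypothetical names, the `base_station` field stands in for MNpri[1], and the returned strings merely name the procedures that would be invoked.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    base_station: int  # plays the role of MNpri[1]


def send_action(sender: Process, receiver: Process) -> str:
    """Step3: pick the protection applied by the sender."""
    if sender.base_station == receiver.base_station:
        return "Encrypt_msg"   # local peers: encrypt only
    return "Signature_gn"      # remote peers: sign the message


def receive_actions(sender: Process, receiver: Process) -> list:
    """Step5: list the operations performed on the receiving side."""
    if sender.base_station == receiver.base_station:
        return ["Decrypt_msg", "save"]
    # Remote case: the receiver's base station verifies the signature first.
    return ["Verify_sig", "Decrypt_msg", "save"]
```

For example, two processes attached to base station 1 take the encryption path, while a process on base station 1 sending to a process on base station 3 takes the signature path.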
Authenticate ()
Note: the passwords are saved in the base station’s database in message digest form.
Step1: Before starting communication, MNoq sends its IDMNoq and PWDMNoq to the attached base station
BSq.
Step2: BSq checks its own database for the PWDMNoq sent by MNoq.
If PWDMNoq is present in its own database
Then BSq sends an authentication successful message to MNoq;
Else BSq broadcasts that PWDMNoq to the other base stations;
Step3: If PWDMNoq is present in the database of another base station
Then that base station sends a positive message to the sender base station, and BSq sends an
authentication successful message to MNoq;
Else the other base stations send a negative message to the sender base station, and BSq sends an
authentication unsuccessful message to MNoq;
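The lookup in Authenticate() can be sketched as below. The sketch makes several assumptions that the paper does not state: the digest is computed with SHA-256 from Python's `hashlib`, the database is keyed by the node identifier, and the broadcast to other base stations is modelled as a direct query of a `peers` list. The class and method names are hypothetical.

```python
import hashlib


def digest(password: str) -> str:
    """Message digest form of a password (SHA-256 is an assumption)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


class BaseStation:
    def __init__(self, number):
        self.number = number
        self.db = {}      # node id -> password digest
        self.peers = []   # other base stations reachable by broadcast

    def register(self, node_id, password):
        self.db[node_id] = digest(password)

    def knows(self, node_id, pwd_digest):
        """True if this base station stores the given digest for node_id."""
        return self.db.get(node_id) == pwd_digest

    def authenticate(self, node_id, password):
        pwd_digest = digest(password)
        # Step2: check the local database first.
        if self.knows(node_id, pwd_digest):
            return True
        # Step3: otherwise ask the other base stations.
        return any(peer.knows(node_id, pwd_digest) for peer in self.peers)
```

A mobile node registered at one base station can therefore authenticate through another base station, as long as the two base stations can reach each other.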
Encrypt_msg ()
INPUT: Domain parameters T = (p, a, b, G, n, h), receiver’s public key pbrc, plaintext Msg.
OUTPUT: Encrypted message E_msg.
Here (prsn, pbsn) is a one-time key pair generated by the sender for this message.
Step1: Select prsn from [1, n-1].
Step2: Compute pbsn = prsn.G and Z = h.prsn.pbrc. If Z = O then go to Step1.
Step3: (enky, mcky) ← KDF(xZ, pbsn), where xZ is the x-coordinate of Z.
Step4: Compute Cpt = En_schm_enky(Msg) and tg = Mc_schm_mcky(Cpt).
Step5: Return E_msg = (pbsn, Cpt, tg).
Decrypt_msg ()
INPUT: Domain parameters T = (p, a, b, G, n, h), private key prrc, encrypted message E_msg = (pbsn, Cpt, tg).
OUTPUT: Plaintext Msg or rejection of the ciphertext.
Step1: Perform an embedded public key validation of pbsn. If the validation fails then return (“Reject the
ciphertext”).
Step2: Compute Z = h.prrc.pbsn. If Z = O then return (“Reject the ciphertext”).
Step3: (enky, mcky) ← KDF(xZ, pbsn), where xZ is the x-coordinate of Z.
Step4: Compute tg' = Mc_schm_mcky(Cpt). If tg' ≠ tg then return (“Reject the ciphertext”).
Step5: Compute Msg = Dn_schm_enky(Cpt).
Step6: Return (Msg).
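The encryption and decryption steps can be exercised end to end with the following sketch. Everything concrete in it is an assumption for illustration: the toy curve y^2 = x^3 + 2x + 2 over F_17 (G = (5, 1), n = 19, h = 1) is not secure, the key derivation is a single SHA-256 call, the symmetric scheme is a hash-based XOR keystream standing in for AES or DES, and the MAC is HMAC-SHA-256 from the standard library. The sender encrypts under the receiver's public key so that both sides derive the same point Z.

```python
import hashlib
import hmac
import secrets

p, a, b, G, n, h = 17, 2, 2, (5, 1), 19, 1  # toy parameters, not secure
O = None                                     # point at infinity


def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        s = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        s = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (s * s - x1 - x2) % p
    return (x3, (s * (x1 - x3) - y1) % p)


def mul(k, P):
    R = O
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R


def valid_point(P):
    """Embedded validation: not O, coordinates in range, on the curve."""
    if P is O:
        return False
    x, y = P
    return 0 <= x < p and 0 <= y < p and (y * y - x * x * x - a * x - b) % p == 0


def kdf(xZ, R):
    """Derive (enky, mcky) from xZ and the one-time public key (assumed KDF)."""
    out = hashlib.sha256(f"{xZ}|{R[0]}|{R[1]}".encode()).digest()
    return out[:16], out[16:]


def xor_stream(key, data):
    """Stand-in for En_schm/Dn_schm: XOR with a SHA-256 keystream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))


def encrypt_msg(pbrc, msg):
    while True:
        prsn = secrets.randbelow(n - 1) + 1   # Step1: prsn in [1, n-1]
        pbsn = mul(prsn, G)                   # Step2
        Z = mul(h * prsn, pbrc)
        if Z is not O:
            break
    enky, mcky = kdf(Z[0], pbsn)              # Step3
    cpt = xor_stream(enky, msg)               # Step4
    tg = hmac.new(mcky, cpt, hashlib.sha256).digest()
    return pbsn, cpt, tg                      # Step5


def decrypt_msg(prrc, e_msg):
    pbsn, cpt, tg = e_msg
    if not valid_point(pbsn):                 # Step1
        raise ValueError("Reject the ciphertext")
    Z = mul(h * prrc, pbsn)                   # Step2
    if Z is O:
        raise ValueError("Reject the ciphertext")
    enky, mcky = kdf(Z[0], pbsn)              # Step3
    expected = hmac.new(mcky, cpt, hashlib.sha256).digest()
    if not hmac.compare_digest(tg, expected): # Step4
        raise ValueError("Reject the ciphertext")
    return xor_stream(enky, cpt)              # Step5 and Step6
```

A receiver with private key prrc = 7 publishes pbrc = mul(7, G); decrypt_msg(7, encrypt_msg(pbrc, data)) then returns data, and any change to the ciphertext or tag causes rejection.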
Signature_gn ()
INPUT: Domain parameters T = (p, a, b, G, n, h), public key pb and private key pr, the message Msg.
OUTPUT: The signature along with the message.
Step1: Select a random integer k from [1, n – 1].
Step2: Compute the point kG = (x, y) and r = x mod n; if r = 0 then go to Step1.
Step3: Compute t = k^–1 mod n.
Step4: Compute e = SHA-1(Msg), where SHA-1 denotes the 160-bit hash function.
Step5: Compute s = t(e + pr*r) mod n; if s = 0 then go to Step1.
Step6: The signature of message Msg is the pair (r, s).
Step7: Send (Msg, r, s).
Verify _sig()
INPUT: The received (Msg, r, s); the receiver knows the domain parameters T and the sender’s public key pb.
OUTPUT: Accept the message if the signature is valid.
Step1: Verify that r and s are integers in the range [1, n – 1].
Step2: Compute e = SHA-1(Msg).
Step3: Compute w = s^–1 mod n.
Step4: Compute u1 = e.w mod n and u2 = r.w mod n.
Step5: Compute the point X = (x1, y1) = u1.G + u2.pb.
Step6: If X = O then reject the signature, else compute v = x1 mod n.
Step7: Accept the signature iff v = r.
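The signature generation and verification steps can be checked against each other with the short sketch below. It again assumes the toy curve y^2 = x^3 + 2x + 2 over F_17 with G = (5, 1) and n = 19, which is far too small for real use, and it uses the full SHA-1 output as the integer e without truncation. The function names mirror Signature_gn() and Verify_sig() but are otherwise hypothetical.

```python
import hashlib
import secrets

p, a, b, G, n = 17, 2, 2, (5, 1), 19  # toy parameters, not secure
O = None                               # point at infinity


def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        s = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        s = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (s * s - x1 - x2) % p
    return (x3, (s * (x1 - x3) - y1) % p)


def mul(k, P):
    R = O
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R


def sha1_int(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha1(msg).digest(), "big")


def signature_gn(pr, msg):
    e = sha1_int(msg)                         # Step4
    while True:
        k = secrets.randbelow(n - 1) + 1      # Step1: k in [1, n-1]
        x, _ = mul(k, G)                      # Step2
        r = x % n
        if r == 0:
            continue
        t = pow(k, -1, n)                     # Step3
        s = t * (e + pr * r) % n              # Step5
        if s != 0:
            return r, s                       # Step6


def verify_sig(pb, msg, sig):
    r, s = sig
    if not (1 <= r < n and 1 <= s < n):       # Step1
        return False
    e = sha1_int(msg)                         # Step2
    w = pow(s, -1, n)                         # Step3
    u1, u2 = e * w % n, r * w % n             # Step4
    X = add(mul(u1, G), mul(u2, pb))          # Step5
    if X is O:                                # Step6
        return False
    return X[0] % n == r                      # Step7
```

Verification succeeds because X = u1.G + u2.pb = w(e + r.pr).G = k.G, so x1 mod n reproduces r. On a curve this small a forged signature can pass by chance, which is one reason real deployments use standardized curves with very large n.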
Checkpoint()
{
    For each process pr1, pr2, pr3, …, pri
    {
        Initiate local clock tlpri;
        Execute normal execution;
        If (tlpri == Thrl)
        {
            Take checkpoint TCPprijk;
            Maintain Ip_I_Tprik;
        }
    }
    Choose an initiator from the base stations, say pr1;
    For each TCPpr1jk = Thrc
    {
        Broadcast TCPpr1jk as tentative checkpoint to all other processes;
        If all other processes send a reply
        {
            Then pr1 broadcasts a commit message and takes TCPpr1jk as PCPpr1jk;
            Refresh Ip_I_Tprik;
            Discard all previous temporary checkpoints;
        }
        Else continue normal execution;
    }
}
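The tentative-then-permanent commit in Checkpoint() can be simulated with the sketch below. It is a simplification under stated assumptions: process state is a single counter, the local clock is reset after each tentative checkpoint, a process replies only if it is alive and holds a tentative checkpoint, and the input information table is not modelled. The names `Proc` and `coordinate` are hypothetical.

```python
class Proc:
    def __init__(self, name, thr_l):
        self.name = name
        self.thr_l = thr_l      # Thrl: threshold of the local clock
        self.clock = 0          # tlpri: local clock
        self.state = 0          # stand-in for the process state
        self.tentative = []     # temporary checkpoints (TCP)
        self.permanent = None   # latest permanent checkpoint (PCP)
        self.alive = True

    def tick(self):
        """One step of normal execution; checkpoint when the clock hits Thrl."""
        self.state += 1
        self.clock += 1
        if self.clock == self.thr_l:
            self.tentative.append(self.state)
            self.clock = 0      # assumption: the clock restarts after a checkpoint

    def reply(self):
        """Reply to the initiator's tentative-checkpoint broadcast."""
        return self.alive and bool(self.tentative)


def coordinate(initiator, others):
    """Commit tentative checkpoints only if every other process replies."""
    if not initiator.tentative or not all(q.reply() for q in others):
        return False            # continue normal execution
    for q in [initiator] + others:
        q.permanent = q.tentative[-1]   # TCP becomes PCP
        q.tentative.clear()             # discard previous temporary checkpoints
    return True
```

If any process fails to reply, no checkpoint is promoted, so the set of permanent checkpoints always comes from a round in which every process took part.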
D. Recovery Algorithm
If a process fails, recovery is needed to resume computation from the last saved state. If process pri fails before
taking TCPprijk, it restores the checkpoint TCPpri(j-1)k and includes it in RP(pri), which is maintained by its
base station BSq. The base station sends the failure message to the communicating processes through their
base stations. Those processes stop execution and send their Ip_I_Tprik to their respective base stations, which
forward all these tables to the BSq of the failed process. BSq draws a dependency graph from the information
in the Ip_I_Tprik tables, calculates the restart points and includes them in RP(pri). BSq then sends the rollback
message, along with RP(pri), to the communicating processes. These processes roll back and resume
execution.
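The paper does not spell out how the restart points are derived from the dependency graph. One common way to do it, shown here as an assumption rather than as the authors' method, is to roll back any process that has recorded the receipt of a message whose sending is undone by another process's rollback, and to repeat until nothing changes. In this sketch each message is a hypothetical tuple (sender, send interval, receiver, receive interval), where interval c is the execution between checkpoint c and checkpoint c+1. Lost messages are not modelled.

```python
def restart_points(latest, messages, failed):
    """Return {process: checkpoint index to restart from} for rolled-back processes.

    latest   -- {process: index of its latest checkpoint}
    messages -- iterable of (sender, send_interval, receiver, recv_interval)
    failed   -- set of failed processes
    """
    # A failed process restarts from its latest checkpoint; a surviving
    # process starts at its current state, encoded as index latest + 1.
    line = {q: (c if q in failed else c + 1) for q, c in latest.items()}
    changed = True
    while changed:
        changed = False
        for sender, s_int, receiver, r_int in messages:
            send_undone = s_int >= line[sender]
            receipt_kept = r_int < line[receiver]
            if send_undone and receipt_kept:
                # The receiver must forget the message: roll it back to the
                # checkpoint that precedes the receipt.
                line[receiver] = r_int
                changed = True
    return {q: c for q, c in line.items() if c <= latest[q]}
```

In a hypothetical run where pr1 fails after its 4th checkpoint and had sent a message in that interval to pr3, which received it after its own 4th checkpoint, the function returns checkpoint 4 for both pr1 and pr3 and leaves the other processes untouched.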
E. Working Example
In Fig. 2 there are four processes. Each process sends messages, logs messages, takes tentative
checkpoints and makes tentative checkpoints permanent.
Each of the processes pr1, pr2, pr3, pr4 first sends its identification number and password to its base
station. After successful authentication the processes start execution, and the local clock tlpri starts for each
process. Process pr1 generates snd_sq_noprik and logs message Msg1 to its volatile log. Msg1 consists of (pr1,
MNpr1[v], pr3, MNpr3[v], message body). Then the sender pr1 checks whether MNpr1[1] = MNpr3[1]; if they are
equal, it encrypts the message and sends it to pr3.
If they are not equal, pr1 generates the signature, encrypts the message, and sends the encrypted message
along with the signature to pr3. After receiving Msg1, process pr3 sends rcv_sq_noprik to pr1. Then the receiver
pr3 checks whether MNpr3[1] = MNpr1[1]; if they are equal, the receiver decrypts the message and saves it in
its stable storage. If they are not equal, the base station of pr3 verifies the signature and sends the encrypted
message to pr3. Then pr3 decrypts the message and saves it in its stable storage. In this way all the messages
are sent and received. After receiving a message, the receiver updates its input information table. The columns
of the table are the processes and the rows contain the checkpoints of each process. An example Ip_I_Tprik
for process pr1 is given in Table I. Meanwhile, when the local clock value of a process reaches the threshold
value of the local clock, the process takes a tentative checkpoint TCPprijk. Process pr1 takes the temporary
checkpoint TCPpr111, which denotes the 1st temporary checkpoint of process 1 of base station 1. After taking
TCPpr121, process pr1 sends a request to all other processes to make that tentative checkpoint permanent.
When all other processes send a reply message to pr1, TCPpr121 becomes PCPpr121, TCPpr221 becomes PCPpr221,
and so on; the input information table is refreshed and the previous temporary checkpoints are discarded.
If pr1 does not get a reply message from all other processes, the temporary checkpoint is not changed into a
permanent checkpoint.
Figure 2. Checkpointing Process
Figure 3. Calculation of restart point
Calculation of the restart point is shown in Fig. 3. Process pr1 fails after TCPpr114 and sends the failure
message to its base station BS1. After receiving the failure message from pr1, BS1 calculates a dependency
graph based on the Input Information Table of pr1 (Table I). Using this information, it rolls the process back
to its previous temporary checkpoint. In Fig. 3 process pr1 fails after TCPpr114, so all the information up to
that point is available. The process is rolled back to this checkpoint, which is treated as its rollback point.
Similarly, pr3 lost Msg8 because of the failure of pr1, so it treats TCPpr334 as its rollback point. Hence
TCPpr114 and TCPpr334 together are treated as the restart point.
We have shown a single failure here. Our system can also recover from multiple failures, as described in the
following.
TABLE I. EXAMPLE OF INPUT INFORMATION TABLE OF pr1
Processes   Checkpoints
Pr1         CPpr101, CPpr111, CPpr121, CPpr131, CPpr141
Pr2         CPpr231
Pr3
Pr4
Figure 4. Checkpointing Process
Figure 5. Calculation of restart point (multiple failure)
In Fig. 4 there are four processes; each process sends messages, logs messages, takes tentative checkpoints
and makes tentative checkpoints permanent as described earlier. Fig. 5 illustrates multiple failures:
processes pr1 and pr3 fail, denoted F1 and F2. Process pr1 fails after TCPpr114 and pr3 fails after
TCPpr334. Both send a failure message to their respective base stations. On receiving this information,
each base station calculates the dependency graph based on the Input Information Tables. The Input
Information Tables of the processes are shown below: Table II, Table III, Table IV and Table V give the input
information for processes pr1, pr2, pr3 and pr4 respectively. Fig. 5 shows that process pr1 fails after TCPpr114, so
all the information up to this temporary checkpoint is still available. The process is rolled back to this point,
which is treated as the rollback point for this process. Similarly, for process pr3 the temporary checkpoint
TCPpr334 is treated as the rollback point. Fig. 5 also shows that process pr4 sends a message Msg5 to
pr3, but it is lost because of the failure of pr3, so pr4 also loses the message sent after TCPpr444. Hence pr4
rolls back to TCPpr444, and this temporary checkpoint is treated as the rollback point for this process. These
three temporary checkpoints, TCPpr114, TCPpr334 and TCPpr444, together are called the restart point for this
multiple failure.
TABLE II. EXAMPLE OF INPUT INFORMATION TABLE OF pr1
TABLE III. EXAMPLE OF INPUT INFORMATION TABLE OF pr2
Columns (Processes): Pr1, Pr2, Pr3, Pr4
Checkpoint entries in Tables II and III: CPpr101, CPpr111, CPpr121, CPpr131, CPpr141, CPpr231, CPpr202, CPpr212, CPpr222, CPpr242, CPpr111, CPpr313
TABLE IV. EXAMPLE OF INPUT INFORMATION TABLE OF pr3
TABLE V. EXAMPLE OF INPUT INFORMATION TABLE OF pr4
Columns (Processes): Pr1, Pr2, Pr3, Pr4
Checkpoint entries in Tables IV and V: CPpr303, CPpr313, CPpr323, CPpr333, CPpr343, CPpr222, CPpr232, CPpr404, CPpr414, CPpr424, CPpr434, CPpr444
V. CONCLUSIONS
The use of mobile devices is increasing in applications such as e-commerce, banking and stock trading,
so providing both proper functionality and security is a pressing issue. Checkpointing is used to make a
system fault tolerant, and here we concentrate on secure checkpointing methods. We propose a secure
checkpointing algorithm that mainly concentrates on communication security. Our algorithm uses a more
consistent checkpointing process and a low overhead public key cryptosystem. We do not compare our
algorithm with existing work because, to our knowledge, there is no existing work on secure checkpointing
using a public key cryptosystem. Our algorithm is based on coordinated checkpointing combined with the
low overhead public key cryptosystem ECC. We therefore conclude that secure checkpointing with a public
key cryptosystem is possible in a mobile environment.
REFERENCES
[1] E. N. (Mootaz) Elnozahy, L. Alvisi, Y.-M. Wang and D. B. Johnson, “A Survey of Rollback-Recovery Protocols in Message-Passing Systems,” ACM Comput. Surv., Vol. 34, No. 3, pp. 375-408, September 2002.
[2] S. Zang and T. Yuan, “Secure Fault Tolerance in Wireless Sensor Network,” Proc. IEEE 8th International Conference on Computer and Information Technology Workshops, IEEE Computer Society, Washington, DC, USA, 2008, pp. 477-482, doi 10.1109/CIT.2008.Workshop.26.
[3] J. Pelzl and T. Wollinger, “Security Aspect of Mobile Communication Systems,” 2005, pp. 168-185.
[4] B. Bhargava and L. Lilien, “Vulnerabilities and Threats in Distributed Systems,” Distributed Computing and Internet Technology, First International Conference, ICDCIT 2004, Bhubaneswar, India, LNCS 3347, pp. 146-157, 2004.
[5] A. Josang and G. Sanderud, “Security in Mobile Communication: Challenges and Opportunities,” Proc. of the Australasian Information Security Workshop Conference on ACSW Frontiers 2003, Volume 21.
[6] H. Lin, L. Harn and V. Kumar, “Authentication Protocols in Wireless Communications,” 1995.
[7] H. Nam, J. Kim, S. J. Hong and S. Lee, “Secure Checkpointing,” Journal of Systems Architecture, pp. 237-254, 2003.
[8] W. Chou, “Elliptic Curve Cryptography and Its Application to Mobile Devices,” Federal Information Processing Standards Publications, Prentice Hall, 2003.
[9] N. Gura, A. Patel, A. Wander, H. Eberle and S. C. Shantz, “Comparing Elliptic Curve Cryptography and RSA on 8-bit CPUs,” Proc. CHES, pp. 119-132, 2004.
[10] A. Wander, N. Gura, H. Eberle, V. Gupta and S. C. Shantz, “Energy Analysis for Public Key Cryptography for Wireless Sensor Network,” Proc. IEEE 3rd International Conference on Pervasive Computing and Communication, pp. 324-328, March 2005.
[11] J. Schiller, Mobile Communications, Second Edition, Pearson Education, 2003.
[12] A. Kahate, Cryptography and Network Security, Second Edition, Tata McGraw-Hill Publishing Company Limited, 2007.
[13] D. Hankerson, A. Menezes and S. Vanstone, Guide to Elliptic Curve Cryptography, Springer, 2004.