What does it mean?
An organization should secure the entire life cycle of its records,
so that records are created, kept accessible for an appropriate
period of time, and then deleted, without tampering by
organizational insiders or outsiders.
Why do we need this concept?
Most traditional security techniques are of little help in
ensuring trustworthy retention of records, because these
techniques focus on outsiders as the source of threats.
With organizational fraud, the threats come from inside the
organization, often from highly placed employees.
What’s the goal of Trustworthy
Records Retention?
The goal is to provide long-term retention and eventual
disposal of organizational records in such a manner that no user
can delete, hide, or tamper with a record during its retention
period,
* nor recreate a record’s content once it has been deleted.
Regulatory Legislation
Trustworthy records retention has become mandatory with the
passing of regulatory legislation all around the world.
Although each regulation is designed for a particular application
area, a number of assurance criteria are common to many of the
directives.
Common assurance criteria:
1- Guaranteed retention.
2- Long-term retention.
3- Efficient access to data.
4- Data confidentiality.
5- Data integrity.
Common assurance criteria:
6- Litigation holds.
7- Guaranteed deletion.
8- Auditing.
9- High penalties for non-compliance
10- Insider adversaries.
Some examples of laws & regulations:
(to ensure trustworthy retention)
* The Sarbanes-Oxley Act of 2002 requires public companies to
provide disclosure and accountability of their financial reporting,
subject to independent audits.
* The Food and Drug Administration (FDA) places controls over
records of trials of potential medicines.
Some examples of laws & regulations:
(to ensure trustworthy retention)
* The Health Insurance Portability and Accountability Act
(HIPAA) requires trustworthy storage of medical records.
* The Federal Information Security Management Act (FISMA)
requires yearly audits, risk assessments, certifications, and
continuous monitoring of such systems.
Some examples of laws & regulations:
(to ensure trustworthy retention)
* The Family Educational Rights and Privacy Act (FERPA) requires
long-term trustworthy storage of student records from
elementary school through the university level.
* The Markets in Financial Instruments Directive (MiFID)
regulates financial markets across Europe and introduces
strict requirements on electronic record keeping.
Threat model:
* The main focus in trustworthy records retention is on preventing
malicious insiders from tampering with or destroying records.
* The second factor in the threat model is that, in the context of
litigation, the visible alteration or destruction of records is
tantamount to an admission of guilt.
Some implications of the threat model:
* Trustworthy retention
When an adversary attempts to modify, delete, or hide a record,
we must make sure that the regulatory authority can detect such
attacks and prevent them.
Some implications of the threat model:
* Trustworthy access and migration
When the organization needs to migrate its records to a new storage
server, the regulatory authority must be able to detect whether any
modifications or omissions occurred during migration.
Some implications of the threat model:
* Trustworthy deletion
When the mandatory retention period of a record is over, the
organization removes the record. The regulatory authority needs to
prevent an adversary from gaining any information about that
deleted record.
Storage architecture
The key requirement for trustworthy retention of records is to
prevent deletion and modification of the records.
Existing signature-based approaches, techniques for outsourcing,
and traditional access control are powerless to guarantee this
requirement.
We therefore need a new kind of storage architecture to thwart
these attacks.
Storage architecture
The new storage architecture should have the following
properties:
1- The component for enforcing the storage security properties
should be as small as possible.
2- The cost of any effective attack against the component must be
high, and its results must be conspicuous.
3- The resulting system must provide end-to-end security
guarantees.
4- The price per byte of storage must be modest.
Storage architecture
The storage industry has developed a variety of compliance
storage products, which are often referred to as
WORM (write once, read many) devices.
Three types of storage will be discussed:
1- Tape-based products.
2- Optical-disk products.
3- Hard disk products.
Storage architecture
1- Tape-based products
Quantum's DLTSage predictive, preventative, and diagnostic
tools for tape storage provide compliance storage.
Disadvantage:
* The WORM assurances are provided under the assumption
that only Quantum tape readers are deployed, which is
impractical.
Storage architecture
1- Tape-based products
Disadvantages (cont.):
* Given the nature of magnetic tape, an attacker can easily
dismantle the plastic tape enclosure and access the underlying
data on different readers, thus compromising its integrity.
* Tape also cannot provide secure deletion.
Storage architecture
2- Optical-disk products
Optical WORM-disk solutions rely on irreversible physical write
effects to ensure that existing content cannot be altered.
Disadvantages:
* It is challenging to deploy a scalable optical-only solution
that handles an increasing amount of information at constant low
latency.
* Inability to fine-tune WORM and secure deletion granularity.
* Poor results in price-performance measurements.
Storage architecture
3- Hard disk-products
Magnetic disk recording offers better overall cost and
performance than optical or tape storage.
There are many soft-WORM products that offer immutability for
records on hard disk storage devices.
Storage architecture
3- Hard disk-products
Examples of soft-WORM that can be applied on hard
disks:
1- EMC Centera
* Each data record has two components: the content and its
associated content descriptor file (CDF), which contains metadata
attributes (creation date, time, format) and the object’s content
address.
Storage architecture
Hard disk-products
Examples of soft-WORM that can be applied on hard
disks:
1- EMC Centera (cont.)
* The CDF is used for access to and management of the records.
* Centera permits deletion of a pointer to a record upon
expiration of the retention period.
* Given their software-only nature, these mechanisms are
vulnerable to simple software-based attacks and physical
attacks.
Storage architecture
Hard disk-products
Examples of soft-WORM that can be applied on hard
disks:
2- Hitachi Message Archive for Compliance
* The system allows customers to lock down archived data,
making it non-erasable and non-rewritable for a prescribed period.
* Given their software-only nature, these mechanisms are
vulnerable to simple software-based attacks and physical
attacks, as with Centera.
Storage architecture
Hard disk-products
Examples of soft-WORM that can be applied on hard
disks:
3- IBM System Storage Archive Manager
* The system makes the deletion of data before its scheduled
expiration extremely difficult.
Storage architecture
Hard disk-products
Examples of soft-WORM that can be applied on hard
disks:
4- Sun StorageTek Compliance Archiving Software
* The system offers WORM assurances through its StorageTek
software, which provides compliance-enabling features for
authenticity, integrity, ready access, and security.
Storage architecture
Strong WORM
Today’s compliance storage products do not really satisfy the
criteria for trustworthy record retention.
Storage architecture
Strong WORM
For a sound design, the following properties are required of a
strong WORM:
* To prevent physical attack, strong tamper-resistant and reactive
hardware is required to ensure data integrity.
* For efficient access, large volumes of records will need to be
searched using indexes. These indexes cannot be kept on
traditional storage, as a super user could hide a record by
tampering with its index entries.
Storage architecture
Strong WORM
For a sound design, the following properties are required of a
strong WORM:
* Current products don’t ensure that a record is trustworthy
throughout its entire life cycle, from creation, through migration
to newer storage servers, to eventual deletion.
* Current compliance storage products aim to address the
problem of document retention; no product supports
structured data.
Resistance to physical attack!
PROM circuitry can be placed in the arm electronics of a hard disk
drive, or in processor-accessible memory, to prevent further
writes to the disk surface or to a range of logical block
addresses (LBAs). This does not provide strong WORM guarantees,
because an insider can open the storage medium enclosure to gain
physical access to the underlying data.
Resistance to physical attack !
By adding a trusted SCPU (Secure CPU) inside the storage
server, we can guarantee the trustworthiness of records.
Resistance to physical attack !
To achieve high throughput rates, the SCPU is involved in
document insertions and deletions but NOT in reads, thus
minimizing the overhead if the workload is dominated by read
queries.
Resistance to physical attack !
Clients who perform reads get an SCPU-certified guarantee that:
1- If the read succeeds, the block was not tampered with.
If the read fails, then either:
2- The block was deleted according to its retention policy, or
3- The block never existed on this storage server.
Resistance to physical attack !
Another trick to increase throughput during periods of high load
is to temporarily replace expensive SCPU signature operations
with less expensive short-term secure variants.
The system can strengthen these weaker constructs when load
slackens, but within their security lifetime.
Resistance to physical attack !
To authenticate the contents of the records on the storage server,
one option is to keep a Merkle tree whose entries are signed by
the SCPU.
However, the resulting O(log n) cost to insert or delete a record,
where n is the number of documents, will reduce the throughput
of the system.
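The O(log n) cost can be seen in a minimal Python sketch of a Merkle tree. The function names are illustrative, SHA-256 is an assumed hash, and the SCPU signature over the root is left out. The point is only that authenticating one record touches one sibling hash per tree level.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Return every tree level, leaves first; the last level holds the root.

    Assumes `records` is a non-empty list of byte strings.
    """
    level = [_h(r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # pad odd levels with their last hash
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof(levels, index):
    """Collect one sibling hash per level: O(log n) hashes for n records."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append((level[index ^ 1], index % 2 == 0))
        index //= 2
    return path


def verify(record, path, root):
    node = _h(record)
    for sibling, node_is_left in path:
        node = _h(node + sibling) if node_is_left else _h(sibling + node)
    return node == root
```

Inserting or deleting a record changes every hash on its path to the root, which is why each update costs O(log n) hash operations before the root can be signed again.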
Resistance to physical attack !
To address this problem, one can instead label data blocks with
monotonically increasing consecutive serial numbers and then
introduce a concept of sliding "windows" that are authenticated
at O(1) cost by signing only the window boundaries.
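A rough Python sketch of the window idea follows. It is not the authors' design: an HMAC with a secret key stands in for the SCPU's signature (a real system would use public-key signatures so that clients can verify without the key), and the class and method names are hypothetical. Each insertion or expiration re-signs only the two boundary numbers.

```python
import hashlib
import hmac


class WindowAuthenticator:
    """Stand-in for the SCPU: signs a constant amount of data per update."""

    def __init__(self, key: bytes):
        self._key = key
        self.low = 1     # smallest serial number still retained
        self.high = 0    # largest serial number issued so far
        self.boundary_tag = self._mac(b"window", self.low, self.high)

    def _mac(self, label: bytes, *numbers: int, data: bytes = b"") -> bytes:
        fields = b",".join(str(n).encode() for n in numbers)
        return hmac.new(self._key, label + b"|" + fields + b"|" + data,
                        hashlib.sha256).digest()

    def insert(self, data: bytes):
        """Issue the next serial number and re-sign the window boundaries."""
        self.high += 1
        self.boundary_tag = self._mac(b"window", self.low, self.high)
        return self.high, self._mac(b"block", self.high, data=data)

    def expire_oldest(self):
        """Slide the lower boundary forward once the oldest block is deleted."""
        self.low += 1
        self.boundary_tag = self._mac(b"window", self.low, self.high)

    def check(self, serial: int, data: bytes = b"", tag: bytes = b"") -> str:
        if serial < self.low:
            return "deleted"
        if serial > self.high:
            return "never existed"
        expected = self._mac(b"block", serial, data=data)
        return "intact" if hmac.compare_digest(expected, tag) else "tampered"
```

The three outcomes other than "tampered" correspond to the read guarantees listed earlier: the block is intact, was deleted under its retention policy, or never existed on this server.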
Trustworthy Indexing:
Indexing ensures that a target record can be quickly extracted from
terabytes of data.
Trustworthy Indexing:
An indexing approach for trustworthy records retention must have
the following properties:
1- The search path to an index entry must be immutable for the
lifetime of the record that it indexes.
2- The indexing code should reside outside the storage server to
keep the trusted computing base small.
Trustworthy Indexing:
An indexing approach for trustworthy records retention must have
the following properties:
3- The insertion and indexing of a record must be performed
atomically.
4- All traces of a record must be removed from the index when the
record is deleted.
Trustworthy Indexing:
The first step in ensuring trustworthy indexing is to store the index
on WORM.
However, the use of WORM alone is insufficient to make the index
trustworthy, because of the following problems in B-tree and
hash-based structures.
Trustworthy Indexing:
Use B-tree to store the index:
The problem arises when a node overflows and is split into two
nodes.
[Figure: Write-Once B-Tree. Root entry 33; leaves (21, 23, 33) and (39, 43, 47, 51, 63).]
Trustworthy Indexing:
Use B-tree to store the index:
Two pointers are added to the end of its parent node, superseding
the earlier pointer to the old node.
[Figure: Write-Once B-Tree after "Insert 45". Root entries 33, 33, 47; original leaves (21, 23, 33) and (39, 43, 47, 51, 63); new leaf entries 39, 43, 45, 47, 51, 63.]
Trustworthy Indexing:
Use B-tree to store the index:
So, an adversary can effectively modify any record he wishes by
creating a new version of the appropriate nodes during the copy
operation.
[Figure: Tampered Write-Once B-Tree (omit 51) after "Insert 45". Root entries 33, 33, 47; original leaves (21, 23, 33) and (39, 43, 47, 51, 63); new leaf entries 39, 43, 45, 47, 63.]
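The attack can be illustrated with a small Python sketch. It models only one append-only parent node, and the key ranges attached to each pointer (including the split point 45) are assumptions made for the illustration, not details from the slides. Because later pointers supersede earlier ones, whoever performs the split decides what the new leaves contain.

```python
INF = float("inf")


class WriteOnceParent:
    """Parent node on WORM: entries may be appended but never changed."""

    def __init__(self):
        self._entries = []  # (low, high, leaf) tuples in append order

    def append(self, low, high, leaf):
        self._entries.append((low, high, tuple(leaf)))

    def contains(self, key):
        # The newest entry whose range low < key <= high supersedes older ones.
        for low, high, leaf in reversed(self._entries):
            if low < key <= high:
                return key in leaf
        return False


def build_tree(new_leaves):
    parent = WriteOnceParent()
    parent.append(-INF, 33, [21, 23, 33])
    parent.append(33, INF, [39, 43, 47, 51, 63])  # overflows on "Insert 45"
    for low, high, leaf in new_leaves:
        parent.append(low, high, leaf)
    return parent


# Honest split: every old key is copied into one of the two new leaves.
honest = build_tree([(33, 45, [39, 43, 45]), (45, INF, [47, 51, 63])])
# Tampered split: key 51 is silently dropped during the copy.
tampered = build_tree([(33, 45, [39, 43, 45]), (45, INF, [47, 63])])

print(honest.contains(51), tampered.contains(51))
```

The old leaf holding 51 is still physically present on WORM in the tampered tree, but no lookup ever reaches it again.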
Trustworthy Indexing:
Use hash-based structure to store the index:
The problem arises when the number of records in a hash table
exceeds a high-water mark.
A new hash table with larger size is allocated and all the records
are rehashed and moved into the new table.
Trustworthy Indexing:
Use hash-based structure to store the index:
The ability to relocate records, however, provides an opportunity
for an adversary to alter the record during the copying step.
Trustworthy Indexing:
Hash-based structure and B-tree
All these approaches are vulnerable because the search path to
a particular record is not term-immutable.
So, researchers have proposed trustworthy versions of hashing and
inverted indexes, both of which guarantee term-immutable search
paths.
Trustworthy Indexing:
Generalized hash tree
GHT is a balanced tree-based data structure that does not require
periodic rebalancing.
In a GHT, predefined hashes of the record key determine all
possible lookup or insertion locations.
Trustworthy Indexing:
Generalized hash tree
The locations where a record can be inserted or looked up are
therefore immutable.
[Figure: GHT. A tree of hash-table nodes with slot positions numbered 0-3 and 0-7, holding keys such as 3, 39, 43, 47, 51, and 54.]
Trustworthy Indexing:
Generalized hash tree
To insert or look up a record in a GHT, the record key is hashed to
obtain a position within the root node. If the corresponding node
position at the root node is empty, the record is inserted there.
Trustworthy Indexing:
Generalized hash tree
If there is a collision, the key is rehashed (using a different hash
function) and an attempt is made to insert the key in the appropriate
subtree of the root node.
Trustworthy Indexing:
Generalized hash tree
This process is repeated until an empty node position is found.
If the record cannot be inserted, a new leaf node is added.
[Figure: GHT After Insertion. The same tree after inserting a key k with h0(k) = 1, h1(k) = 0, h2(k) = 7, h3(k) = 2.]
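The insertion and lookup procedure can be sketched in Python as follows. This is a simplified model under stated assumptions: each tree level is treated as one flat write-once table rather than as separate nodes, the per-level hash functions are derived from SHA-256, and the table sizes (`root_slots`, `growth`) are arbitrary illustrative parameters.

```python
import hashlib


class GeneralizedHashTree:
    """Simplified GHT: one write-once slot table per level."""

    def __init__(self, root_slots=4, growth=2):
        self.root_slots = root_slots
        self.growth = growth
        self.levels = []  # levels[i] maps slot position -> (key, value)

    def _position(self, key, level):
        # h_level(key): a fixed hash per level, so candidate slots never change.
        digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
        size = self.root_slots * self.growth ** level
        return int.from_bytes(digest[:8], "big") % size

    def insert(self, key, value):
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append({})        # grow the tree by one level
            slots = self.levels[level]
            pos = self._position(key, level)
            if pos not in slots:
                slots[pos] = (key, value)     # written once, never moved
                return level, pos
            if slots[pos][0] == key:
                raise KeyError("key already present; slots are write-once")
            level += 1                        # collision: rehash one level down

    def lookup(self, key):
        for level, slots in enumerate(self.levels):
            entry = slots.get(self._position(key, level))
            if entry is None:
                return None                   # an empty slot ends the search path
            if entry[0] == key:
                return entry[1]
        return None
```

Since no entry is ever relocated, the sequence of slots probed for a given key is the same at insertion time and at every later lookup.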
Trustworthy Indexing:
Inverted indexes
Keyword search is the most convenient way to query unstructured
records such as email bodies and reports.
Search engines typically use inverted indexes for this purpose.
Trustworthy Indexing:
Inverted indexes
An inverted index comprises a dictionary of terms plus a posting
list for each term, containing the identifiers of all records that
contain that term, along with additional metadata.
Trustworthy Indexing:
Inverted indexes
Queries are answered by scanning the posting lists of terms in the
query.
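[Figure: Ordinary Inverted Index. The terms Query, Data, Base, Worm, and Index each have their own posting list of document IDs: (1, 3, 9, 17, 36), (3, 9, 31), (3, 19), (7, 36), and (3).]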
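As a concrete illustration, a minimal inverted index can be built in a few lines of Python. The tokenization (lower-casing and splitting on whitespace) and the function names are assumptions chosen for brevity.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Map each term to the sorted list of record IDs that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(records):
        for term in sorted(set(records[doc_id].lower().split())):
            index[term].append(doc_id)
    return index


def search(index, query):
    """Return the IDs of records containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


records = {
    1: "query",
    3: "query data base index",
    9: "query data",
    19: "base",
}
index = build_inverted_index(records)
print(search(index, "data base"))
```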
Trustworthy Indexing:
Inverted indexes
For a trustworthy version of inverted indexes, each posting list can
be stored in a separate append-only file on WORM storage, but this
approach is too slow to support real-time insertion of typical
business documents.
Trustworthy Indexing:
Inverted indexes
The performance can be improved vastly by merging the posting
lists for different terms until the tails of all posting lists fit into
the storage server cache.
Trustworthy Indexing:
Inverted indexes
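As long as the posting lists of "popular" terms are not merged
together, query performance is little affected by merging.
[Figure: Inverted Index After Merging. The posting lists of Data and Base are merged into one list whose entries are tagged with their term: 3#Data, 3#Base, 9#Data, 19#Base, 31#Data. The other posting lists, (1, 3, 9, 17, 36), (7, 36), and (3), are unchanged.]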
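A small Python sketch of merging, using the Data and Base posting lists from the figure. How terms are grouped for merging is an assumption here (the caller passes the groups explicitly), and the function names are hypothetical.

```python
def merge_posting_lists(index, groups):
    """Merge the posting lists of each term group into one tagged list."""
    merged = {}
    for group in groups:
        entries = [(doc_id, term)
                   for term in group
                   for doc_id in index.get(term, [])]
        merged[tuple(group)] = sorted(entries)
    return merged


def lookup(merged, term):
    """Scan the merged list holding `term` and keep only its own entries."""
    for group, entries in merged.items():
        if term in group:
            return [doc_id for doc_id, tag in entries if tag == term]
    return []


index = {"data": [3, 9, 31], "base": [3, 19]}
merged = merge_posting_lists(index, [["data", "base"]])
print(merged[("data", "base")])
print(lookup(merged, "data"))
```

A lookup now has to skip over entries belonging to the other merged terms, which is why merging two frequently queried terms together would hurt query performance.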
Trustworthy Indexing:
B+ tree
Multi-keyword conjunctive queries can be answered by intersecting
the posting lists of the query terms.
To make the intersection fast, an additional index such as a B+ tree
is usually kept for each posting list, and a zigzag join is used to
perform the intersection.
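A zigzag join can be sketched in Python as below. Binary search over a sorted list (`bisect_left`) stands in for the per-list B+ tree lookup; that substitution is an assumption made to keep the example self-contained.

```python
from bisect import bisect_left


def zigzag_intersect(a, b):
    """Intersect two sorted posting lists, skipping ahead via index lookups."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Jump in `a` to the first entry >= the current entry of `b`.
            i = bisect_left(a, b[j], i + 1)
        else:
            # Jump in `b` to the first entry >= the current entry of `a`.
            j = bisect_left(b, a[i], j + 1)
    return result


print(zigzag_intersect([1, 3, 9, 17, 36], [3, 9, 31]))
```

Each list alternately "leads", and the other list jumps directly to the first candidate match instead of scanning entry by entry.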
Trustworthy Indexing:
B+ tree
A B+ tree can be created for an increasing sequence of document
IDs without any node splits or merges, by building the tree from
the bottom up.
[Figure: B+ tree in WORM. Root 23; internal nodes (7, 13) and (31); leaves (2, 4), (7, 11), (13, 19), (23, 29), (31, 33).]
Trustworthy Indexing:
B+ tree
Such an index structure is also not trustworthy, even when kept
on WORM storage, because the path to each entry is not immutable.
Trustworthy Indexing:
B+ tree
The adversary can hide some entries by creating a separate
subtree that does not contain a specific entry and adding an entry
at the root to lead to the new subtree.
[Figure: B+ tree manipulated to hide 31 by adding 25 to the root. The root becomes (23, 25); the original subtrees are unchanged, and the new root entry leads to a separate subtree with node 32 and leaf entries 25, 26, 32, which omits 31.]
Trustworthy Indexing:
B+ tree
To address this problem with B+ trees, researchers proposed the
jump index technique.
Trustworthy Indexing:
Jump Indexes
A jump index can be used to index monotonic sequences, such as
document IDs in a posting list, as a replacement for
non-trustworthy B+ trees.
Jump index lookup performance is within a factor of 1.4 of the
performance of an equivalent B+ tree.
Trustworthy Indexing:
Jump Indexes
In a jump index, to reach a particular number $k < N$, we can jump
from 0 to $k$ in powers of two.
For example, let $b_1, b_2, \ldots, b_p$ be the binary representation of $k$.
We can reach $k$ in $p$ steps by starting at zero, then jumping forward
by $b_1 \cdot 2^{p-1}$ integers, then jumping forward by
$b_2 \cdot 2^{p-2}$ integers, and so on, until finally a
$b_p \cdot 2^0$ jump brings us to the number $k$.
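The jumps can be traced with a short Python function. The name `jump_path` is hypothetical; the function simply follows the description above, reading the bits of k from most significant to least significant.

```python
def jump_path(k, p):
    """Return the positions visited when jumping from 0 to k in powers of two.

    k is read as a p-bit number b_1 ... b_p, most significant bit first.
    Steps whose bit is 0 jump by zero and add no new position.
    """
    if not 0 <= k < 2 ** p:
        raise ValueError("k must satisfy 0 <= k < 2**p")
    position = 0
    path = [0]
    for j in range(1, p + 1):
        bit = (k >> (p - j)) & 1          # b_j
        position += bit * 2 ** (p - j)    # jump forward by b_j * 2^(p-j)
        if bit:
            path.append(position)
    return path


print(jump_path(13, 4))
```

For k = 13 with p = 4 the bits are 1101, so the jumps are 8, 4, 0, and 1, visiting 0, 8, 12, and 13.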
Trustworthy Indexing:
Jump Indexes
The $i$th jump pointer stored with jump index entry (node) $L$ will
point to the smallest jump index entry (node) $L'$ such that:
$L + 2^i \le L' \le L + 2^{i+1}$
Lookups can be done in $O(\log_2 N)$ time, where $N = 2^p$.
[Figure: Binary Jump Index. Entries 1, 2, 5, 7, 10, and 15, each with jump pointers 0 through 4.]
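A binary jump index can be sketched in Python as follows. The sketch assumes half-open pointer ranges, so that the ith pointer of an entry L covers values from L + 2^i up to but not including L + 2^(i+1) and every larger value falls in exactly one range. Class and method names are hypothetical, and values are assumed to be non-negative integers below 2^bits.

```python
class JumpNode:
    def __init__(self, value, num_pointers):
        self.value = value
        self.pointers = [None] * num_pointers  # each slot is set at most once


class JumpIndex:
    """Append-only index over a strictly increasing sequence of integers."""

    def __init__(self, bits=32):
        self.bits = bits
        self.head = None
        self.last = None

    def insert(self, value):
        if not 0 <= value < 2 ** self.bits:
            raise ValueError("value out of range")
        if self.last is not None and value <= self.last:
            raise ValueError("values must be strictly increasing")
        node = JumpNode(value, self.bits)
        if self.head is None:
            self.head = node
        else:
            cur = self.head
            while True:
                # i = floor(log2(value - cur.value)) selects the pointer range.
                i = (value - cur.value).bit_length() - 1
                if cur.pointers[i] is None:
                    cur.pointers[i] = node     # first (smallest) entry in range
                    break
                cur = cur.pointers[i]
        self.last = value

    def contains(self, value):
        cur = self.head
        while cur is not None and cur.value < value:
            i = (value - cur.value).bit_length() - 1
            cur = cur.pointers[i]
        return cur is not None and cur.value == value


index = JumpIndex()
for doc_id in [1, 2, 5, 7, 10, 15]:
    index.insert(doc_id)
print(index.contains(7), index.contains(8))
```

Pointers are only ever set from empty to a value, never overwritten, so the path to an existing entry stays the same no matter what is appended later.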
Trustworthy Indexing:
Jump Indexes
By using a block-structured jump index we can gain better space and
time efficiency.
In a block-structured jump index, p posting entries are stored
together in blocks of size L.
Trustworthy Indexing:
Jump Indexes
Pointers are associated with blocks, rather than with every entry.
Jump pointers are calculated using powers of B rather than two,
where $p \ge B$.
Each pointer is uniquely identified by a pair $(i, j)$, where
$0 \le i < \log_B N$ and $1 \le j < B$.
[Figure: Block-Structured Jump Index. Block 0 holds entries 1, 2, 5, 7; Block 1 holds 8, 10, 15, 19; Block 2 holds 21, 22, 25. Each block carries jump pointers labelled by (i, j) pairs: (0,1), (0,2), (1,1), (2,1), (2,2), and so on.]
Trustworthy Indexing:
In the trustworthy indexing approaches described above, an adversary
can still insert malicious entries into the index.
Trustworthy Indexing:
Malicious entries fall into two categories:
1- Those that cause subsequent legitimate insertions to fail.
2- Those that will only be noticed when a lookup operation finds a
dangling pointer or returns a record that does not match the query.
Both events will draw immediate unwanted attention to the attack.
Trustworthy Indexing:
If an adversary gains physical access to storage, he may tamper
with the index contents.
If we have trusted hardware, such as an SCPU, that can periodically
sign portions of the index, then any discrepancy between the
signature and the current index contents can be detected.
Trustworthy migration:
It’s impractical to store records on a single server for decades, as
the server will become obsolete and too expensive to maintain.
When the records must be moved, the migration process needs to
be trustworthy even if a super user adversary performed the
migration.
Trustworthy migration:
Researchers have developed two schemes for trustworthy migration
of records between compliance storage servers.
Trustworthy migration:
First migration scheme
1- Migration is initiated by the system operator retrieving a
migration certificate (MC) from the regulatory authority (RA).
The MC is a signature on a message containing the timestamped
identities of SCPU1 and SCPU2.
Trustworthy migration:
First migration scheme
2- Upon migration, the MC is presented to SCPU1 and SCPU2, which
authenticate the signature of the RA.
3- If this succeeds, SCPU1 is ready to mutually authenticate and
perform a key exchange with SCPU2 using their internally stored
key pairs and certificates.
Trustworthy migration:
First migration scheme
4- If step 3 succeeds, SCPU1 will be ready and willing to transfer a
description of the state of the compliance records and index
contents over a secure channel protected by an agreed-upon
symmetric key.
Trustworthy migration:
First migration scheme
After the state information has been migrated, the actual records
and index contents can be transferred by the main CPU, without
SCPU involvement.
Trustworthy migration:
Second migration scheme
This approach relies on the existence of a trusted third party such
as a storage system vendor.
Trustworthy migration:
Second migration scheme
The migration process is divided into three phases:
Phase1:
The party in charge of migration prepares a plan for the migration.
The log of this plan includes the policies governing the migration
and, in compact form, a representation of the list of files and
directories to be migrated.
Trustworthy migration:
Second migration scheme
Phase2:
The current storage server generates certificates that attest to the
current state of the directory and file contents and adds them to the
log.
Trustworthy migration:
Second migration scheme
Phase3:
Finally, the party in charge of migration moves the files to be
migrated and copies the log to the new server.
The public keys used by the organization’s series of storage
servers, together with validation routines, can then be used to
check whether the migration took place appropriately.
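The validation step might look like the following Python sketch. It is an illustration under simplifying assumptions: the "certificates" are plain SHA-256 digests of file contents, the signature that a real storage server would place on them is omitted, and the function names are hypothetical.

```python
import hashlib


def attest(files):
    """Phase 2 stand-in: record a digest for every file to be migrated."""
    return {path: hashlib.sha256(content).hexdigest()
            for path, content in sorted(files.items())}


def validate_migration(log, migrated):
    """Compare the migrated files against the log; return a list of problems."""
    problems = []
    for path, digest in log.items():
        if path not in migrated:
            problems.append(f"missing: {path}")
        elif hashlib.sha256(migrated[path]).hexdigest() != digest:
            problems.append(f"modified: {path}")
    for path in migrated:
        if path not in log:
            problems.append(f"unexpected: {path}")
    return problems


old_server = {"/mail/1": b"quarterly report", "/mail/2": b"audit notes"}
log = attest(old_server)

new_server = {"/mail/1": b"quarterly report (edited)"}
print(validate_migration(log, new_server))
```

Both the modification of `/mail/1` and the omission of `/mail/2` are reported, which is the kind of check the validation routines need to perform after the copy.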
Trustworthy deletion :
The primary purpose of WORM devices is to prevent data deletion.
However, simple erasure is not enough for trustworthy deletion, as
an erased record can be recreated by reverse-engineering an index.
Trustworthy deletion :
For the deletion of document d to be strongly secure, the presence
or absence of any word w in any reconstruction of d should not
convey any information about its presence in the original
document.
Trustworthy deletion :
The trustworthy indexing schemes described earlier don’t support
strongly secure deletion.
* Generalized hash trees (GHT) offer weakly secure deletion.
* Trustworthy inverted indexes and jump indexes are even more
problematic with respect to deletion.
Trustworthy deletion :
Consider an inverted index, for example.
When a record is deleted, it may be possible to exactly recreate the
record by looking at its index entries.
Therefore, the index entries must also be removed to ensure
non-reproducibility of deleted records.
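A few lines of Python show how much a leftover index reveals. The index contents and the function name are made up for the illustration.

```python
def reconstruct(index, doc_id):
    """Recover the set of terms of a record from its surviving index entries."""
    return sorted(term for term, postings in index.items() if doc_id in postings)


# Record 7 has been erased from storage, but its postings are still indexed.
index = {
    "merger": [7],
    "secret": [7, 9],
    "lunch": [9],
}
print(reconstruct(index, 7))
```

If the index also stores term positions as metadata, the original word order could be recovered as well.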
Trustworthy deletion :
Again taking an inverted index as an example,
the structural properties of the index may also allow an adversary
to infer that the deleted index entries existed.
Trustworthy deletion :
To address this problem, two options have been proposed:
1- Divide expiration times into epochs, and keep a separate set
of indexes for records expiring in each epoch.
One could then delete the entire epoch of indexes once the epoch
is over.
However, this option is impractical because litigation holds
may require a document to be retained even after its mandatory
retention period is over.
Trustworthy deletion :
To address this problem two options are proposed:
2 - Rebuild the index in a trustworthy manner when records are
deleted.
However, the record arrival rate of today will be the required
record deletion rate in the future. Thus this option is too expensive
to be practical.
Trustworthy deletion :
Using encryption:
Encrypting the document identifiers before they are stored in the
index is inadequate, because one can still perform a join on the
encrypted document identifiers to recover the document content.
Trustworthy deletion :
Using Inverted index with encryption:
An alternative for trustworthy inverted index is to merge posting
lists together as usual, then encrypt the term encoding associated
with each posting element and store it in the merged posting list
entries.
Trustworthy deletion :
Using Inverted index with encryption:
One possible encryption technique is to replace the keyword
encoding E in the posting element with its XOR with a random
secret, which can be stored with the record and deleted upon its
expiration.
[Figure: Supporting deletion from a trustworthy inverted index. A posting element in the merged list for terms T1 ... Tk, Tk+1 ... Tl holds an encoding and a document ID d; the figure shows the 5-bit values 00101 and 01100, with the encoding combined by XOR (⊕) with a random sequence r.]
Trustworthy deletion :
Using Inverted index with encryption:
The adversary will not be able to determine which of the q merged
keywords corresponds to the posting element after the secret is
discarded.
The scheme does not achieve strongly secure deletion, though it is
immune to a variety of possible attacks.
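The XOR masking can be illustrated in Python. This is a sketch, not the published scheme: the 5-bit encoding width is an assumption, the function names are hypothetical, and the secret is drawn with the standard `secrets` module.

```python
import secrets

ENCODING_BITS = 5  # assumed width of a term encoding


def new_secret() -> int:
    """Random per-record secret, stored with the record until it expires."""
    return secrets.randbits(ENCODING_BITS)


def mask(term_code: int, secret: int) -> int:
    """Value written into the posting element instead of the term encoding."""
    return term_code ^ secret


def unmask(stored: int, secret: int) -> int:
    """While the record (and its secret) exists, the encoding is recoverable."""
    return stored ^ secret


def candidate_terms(stored: int) -> set:
    """Once the secret is gone, every encoding is an equally valid explanation."""
    return {stored ^ s for s in range(2 ** ENCODING_BITS)}


secret = new_secret()
stored = mask(0b01001, secret)
print(unmask(stored, secret) == 0b01001)
print(len(candidate_terms(stored)))
```

After the secret is deleted along with the record, the stored value alone is consistent with all 32 possible encodings, so it no longer identifies which merged keyword the posting element belonged to.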
Open Problems:
The biggest open issues and challenges in trustworthy record
retention
1- Corrections
Current models for records retention don’t support correction to
record content. An elegant, cost-effective approach is needed for
supporting corrections.
Open Problems:
The biggest open issues and challenges in trustworthy record
retention
2- Deletions
No entirely satisfactory scheme exists for trustworthy deletion of
records. Traces of record metadata may remain in indexes or
migration logs, allowing an adversary to infer the contents of a
deleted record.
Open Problems:
The biggest open issues and challenges in trustworthy record
retention
3- Structured Information
Database records need a similar level of protection, but no work to
date has addressed this problem.
Open Problems:
The biggest open issues and challenges in trustworthy record
retention
4- Exploiting trusted hardware
It’s important to explore how to deploy trusted-hardware WORM to
achieve increased security and efficiency in the upper layers (e.g.
indexing).
The End!

More Related Content

What's hot

CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bank
pkaviya
 
Network management
Network managementNetwork management
Network management
Mohd Arif
 
Mobile Cloud Computing Challenges and Security
Mobile Cloud Computing Challenges and SecurityMobile Cloud Computing Challenges and Security
Mobile Cloud Computing Challenges and Security
John Paul Prassanna
 

What's hot (20)

Requirements engineering by elizabeth hull, ken jackson, jeremy dick (auth.) ...
Requirements engineering by elizabeth hull, ken jackson, jeremy dick (auth.) ...Requirements engineering by elizabeth hull, ken jackson, jeremy dick (auth.) ...
Requirements engineering by elizabeth hull, ken jackson, jeremy dick (auth.) ...
 
Network security - OSI Security Architecture
Network security - OSI Security ArchitectureNetwork security - OSI Security Architecture
Network security - OSI Security Architecture
 
Mobile Computing-Unit-V-Mobile Platforms and Applications
Mobile Computing-Unit-V-Mobile Platforms and ApplicationsMobile Computing-Unit-V-Mobile Platforms and Applications
Mobile Computing-Unit-V-Mobile Platforms and Applications
 
Trends in distributed systems
Trends in distributed systemsTrends in distributed systems
Trends in distributed systems
 
Common Standards in Cloud Computing
Common Standards in Cloud ComputingCommon Standards in Cloud Computing
Common Standards in Cloud Computing
 
Issues in cloud computing
Issues in cloud computingIssues in cloud computing
Issues in cloud computing
 
Firewalls
FirewallsFirewalls
Firewalls
 
CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bank
 
Intruders
IntrudersIntruders
Intruders
 
Io t system management with
Io t system management withIo t system management with
Io t system management with
 
Mobile computing
Mobile computing Mobile computing
Mobile computing
 
Network management
Network managementNetwork management
Network management
 
Mobile Cloud Computing Challenges and Security
Mobile Cloud Computing Challenges and SecurityMobile Cloud Computing Challenges and Security
Mobile Cloud Computing Challenges and Security
 
MAC Address – All you Need to Know About it
MAC Address – All you Need to Know About itMAC Address – All you Need to Know About it
MAC Address – All you Need to Know About it
 
Screen based controls in HCI
Screen based controls in HCIScreen based controls in HCI
Screen based controls in HCI
 
Eucalyptus cloud computing
Eucalyptus cloud computingEucalyptus cloud computing
Eucalyptus cloud computing
 
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data mining
 
Operating system support in distributed system
Operating system support in distributed systemOperating system support in distributed system
Operating system support in distributed system
 
IT8602 Mobile Communication - Unit I Introduction
IT8602 Mobile Communication - Unit I IntroductionIT8602 Mobile Communication - Unit I Introduction
IT8602 Mobile Communication - Unit I Introduction
 

Similar to Trustworthy Records Retention

documentation for identity based secure distrbuted data storage schemes
documentation for identity based secure distrbuted data storage schemesdocumentation for identity based secure distrbuted data storage schemes
documentation for identity based secure distrbuted data storage schemes
Sahithi Naraparaju
 
Beware the Firewall My Son: The Jaws That Bite, The Claws That Catch!
Beware the Firewall My Son: The Jaws That Bite, The Claws That Catch!Beware the Firewall My Son: The Jaws That Bite, The Claws That Catch!
Beware the Firewall My Son: The Jaws That Bite, The Claws That Catch!
Michele Chubirka
 
Iaetsd secure data storage against attacks in cloud
Iaetsd secure data storage against attacks in cloudIaetsd secure data storage against attacks in cloud
Iaetsd secure data storage against attacks in cloud
Iaetsd Iaetsd
 
Dr. Eric Cole - 30 Things Every Manager Should Know
Dr. Eric Cole - 30 Things Every Manager Should KnowDr. Eric Cole - 30 Things Every Manager Should Know
Dr. Eric Cole - 30 Things Every Manager Should Know
Nuuko, Inc.
 
iaetsd Using encryption to increase the security of network storage
iaetsd Using encryption to increase the security of network storageiaetsd Using encryption to increase the security of network storage
iaetsd Using encryption to increase the security of network storage
Iaetsd Iaetsd
 
Advantages And Disadvantages Of Nc
Advantages And Disadvantages Of NcAdvantages And Disadvantages Of Nc
Advantages And Disadvantages Of Nc
Kristen Wilson
 

Trustworthy Records Retention

  • 1.
  • 2. What does it mean? Organizations should secure the entire life cycle of their records, so that records are created, kept accessible for an appropriate period of time, and then deleted, without tampering by organizational insiders or outsiders.
  • 3. Why do we need this concept? Most traditional security techniques are of little help in ensuring trustworthy retention of records, because they focus on outsiders as the source of threats. With organizational fraud, the threats come from inside the organization, often from highly placed employees.
  • 4. What is the goal of Trustworthy Records Retention? The goal is to provide long-term retention and eventual disposal of organizational records in such a manner that no user can delete, hide, or tamper with a record during its retention period, nor recreate a record’s content once it has been deleted.
  • 5. Regulatory Legislation Trustworthy records retention has become mandatory with the passing of regulatory legislation all around the world. Although each regulation is designed for a particular application area, a number of assurance criteria are common to many of the directives.
  • 6. Common assurance criteria: 1- Guaranteed retention. 2- Long-term retention. 3- Efficient access to data. 4- Data confidentiality. 5- Data integrity.
  • 7. Common assurance criteria: 6- Litigation holds. 7- Guaranteed deletion. 8- Auditing. 9- High penalties for non-compliance. 10- Insider adversaries.
  • 8. Some examples of laws & regulations: ( to ensure trustworthy retention ) * The Sarbanes-Oxley Act of 2002 requires public companies to provide disclosure and accountability of their financial reporting, subject to independent audits. * The Food and Drug Administration places controls over records of trials of potential medicines.
  • 9. Some examples of laws & regulations: ( to ensure trustworthy retention ) * The Health Insurance Portability and Accountability Act (HIPAA) requires trustworthy storage of medical records. * The Federal Information Security Management Act requires yearly audits, risk assessments, certifications, and continuous monitoring of federal information systems.
  • 10. Some examples of laws & regulations: ( to ensure trustworthy retention ) * The Family Educational Rights and Privacy Act requires long-term trustworthy storage of student records from elementary school through the university level. * The Markets in Financial Instruments Directive (MiFID) regulates financial markets across Europe and introduces strict requirements on electronic record keeping.
  • 11. Threat model: * The main focus in trustworthy records retention is on preventing malicious insiders from tampering with or destroying records. * The second factor in the threat model is that, in the context of litigation, the visible alteration or destruction of records is tantamount to an admission of guilt.
  • 12. Some implications of the threat model: * Trustworthy retention: When an adversary attempts to modify, delete, or hide a record, we must make sure that the regulatory authority can detect such attacks and prevent them.
  • 13. Some implications of the threat model: * Trustworthy access and migration: When the organization needs to migrate its records to a new storage server, the regulatory authority must be able to detect whether any modifications or omissions occurred during migration.
  • 14. Some implications of the threat model: * Trustworthy deletion: When the mandatory retention period of a record is over, the organization removes the record. The regulatory authority needs to prevent an adversary from gaining any information about the deleted record.
  • 15. Storage architecture The key requirement for trustworthy retention of records is to prevent deletion and modification of the records. Existing signature-based approaches, outsourcing techniques, and traditional access control cannot guarantee this requirement, so we need a new kind of storage architecture to thwart these attacks.
  • 16. Storage architecture The new storage architecture should have the following properties: 1- The component that enforces the storage security properties should be as small as possible. 2- The cost of any effective attack against the component must be high, and its results must be conspicuous. 3- The resulting system must provide end-to-end security guarantees. 4- The price per byte of storage must be modest.
  • 17. Storage architecture The storage industry has developed a variety of compliance storage products, often referred to as WORM (write once, read many) devices. Three types of storage will be discussed: 1- Tape-based products. 2- Optical-disk products. 3- Hard-disk products.
  • 18. Storage architecture 1- Tape-based products The Quantum DLTSage predictive, preventative, and diagnostic tools for tape storage provide compliance storage. Disadvantage: * The WORM assurances are provided under the assumption that only Quantum tape readers are deployed, which is impractical.
  • 19. Storage architecture 1- Tape-based products Disadvantages (cont.): * Given the nature of magnetic tape, an attacker can easily dismantle the plastic tape enclosure and access the underlying data on different readers, thus compromising its integrity. * Tape also cannot support secure deletion.
  • 20. Storage architecture 2- Optical-disk products Optical WORM-disk solutions rely on irreversible physical write effects to ensure that existing content cannot be altered. Disadvantages: * It is challenging to deploy a scalable optical-only solution that handles an increasing amount of information at constant low latency. * WORM and secure-deletion granularity cannot be fine-tuned. * They perform poorly in price-performance measurements.
  • 21. Storage architecture 3- Hard-disk products Magnetic disk recording offers better overall cost and performance than optical or tape storage. Many soft-WORM products offer immutability for records on hard-disk storage devices.
  • 22. Storage architecture 3- Hard-disk products Examples of soft-WORM that can be applied on hard disks: 1- EMC Centera * Each data record has two components: the content and its associated content descriptor file (CDF), which contains metadata attributes (creation date, time, format) and the object’s content address.
  • 23. Storage architecture Hard-disk products Examples of soft-WORM that can be applied on hard disks: 1- EMC Centera (cont.) * The CDF is used for access to and management of the records. * Centera permits deletion of a pointer to a record upon expiration of the retention period. * Given their software-only nature, these mechanisms are vulnerable to simple software-based attacks and to physical attacks.
  • 24. Storage architecture Hard-disk products Examples of soft-WORM that can be applied on hard disks: 2- Hitachi Message Archive for Compliance * The system allows customers to lock down archived data, making it non-erasable and non-rewritable for a prescribed period. * As with Centera, these software-only mechanisms are vulnerable to simple software-based attacks and to physical attacks.
  • 25. Storage architecture Hard-disk products Examples of soft-WORM that can be applied on hard disks: 3- IBM System Storage Archive Manager * The system makes the deletion of data before its scheduled expiration extremely difficult.
  • 26. Storage architecture Hard-disk products Examples of soft-WORM that can be applied on hard disks: 4- Sun StorageTek Compliance Archiving Software * The system offers WORM assurances through its StorageTek software, which provides compliance-enabling features for authenticity, integrity, ready access, and security.
  • 27. Storage architecture Strong WORM Today’s compliance storage products do not really satisfy the criteria for trustworthy record retention.
  • 28. Storage architecture Strong WORM For a sound design, the following properties are required of strong WORM: * To prevent physical attacks, strong tamper-resistant and reactive hardware is required to ensure data integrity. * To provide efficient access, large volumes of records need to be searched using indexes. These indexes cannot be kept on traditional storage, as a super user could hide a record by tampering with the index.
  • 29. Storage architecture Strong WORM For a sound design, the following properties are required of strong WORM: * Current products don’t ensure that a record is trustworthy throughout its entire life cycle, from creation, through migration to newer storage servers, to eventual deletion. * Current compliance storage products aim to address the problem of document retention; no product supports structured data.
  • 30. Resistance to physical attack ! Placing PROM circuitry in the arm electronics of the hard disk drive, or in processor-accessible memory, to prevent further writes to the disk surface or to a range of logical block addresses (LBAs) does not provide strong WORM guarantees: an insider can open the storage medium enclosure to gain physical access to the underlying data.
  • 31. Resistance to physical attack ! By adding a trusted SCPU (Secure CPU) inside the storage server, we can guarantee the trustworthiness of records.
  • 32. Resistance to physical attack ! To achieve high throughput rates, the SCPU is involved in document insertions and deletions but NOT in reads, thus minimizing the overhead if the workload is dominated by read queries.
  • 33. Resistance to physical attack ! Clients who perform reads get an SCPU-certified guarantee. If the read succeeds: 1- The block was not tampered with. If the read fails, then either: 2- The block was deleted according to its retention policy, or 3- The block never existed on this storage server.
  • 34. Resistance to physical attack ! Another trick to increase throughput during periods of high load is to temporarily replace expensive SCPU signature operations with less expensive short-term secure variants. The system can strengthen these weaker constructs when load slackens, but within their security lifetime.
  • 35. Resistance to physical attack ! To authenticate the contents of the records on the storage server, one option is to keep a Merkle tree whose entries are signed by the SCPU. However, the resulting O(log n) cost to insert or delete a record, where n is the number of documents, will reduce the throughput of the system.
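As a rough illustration of where the O(log n) cost comes from, the following Python sketch builds a Merkle tree over a list of records and checks one record against the root. The function names are illustrative and not taken from the slides, and the SCPU signature on the root is left out; only the hashing is shown. Verifying (or updating) one record touches one sibling hash per tree level, which is the logarithmic factor the slide refers to.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(records):
    """Return every level of the tree, leaf hashes first, root level last."""
    level = [_h(b"\x00" + r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd level: pair the last node with itself
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect one sibling hash per level on the way from a leaf to the root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(record, path, root):
    """Recompute the root from a record and its sibling path."""
    node = _h(b"\x00" + record)
    for sibling, sibling_is_left in path:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root
```

With eight records the proof holds three sibling hashes; with a million records it would hold about twenty, and an insert or delete has to recompute a path of the same length before the root can be re-signed.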
  • 36. Resistance to physical attack ! To address this problem, one can instead label data blocks with monotonically increasing consecutive serial numbers and then introduce sliding “windows” that are authenticated at O(1) cost by signing only the window boundaries.
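The serial-number window can be sketched in Python as follows. This is a simplified model under stated assumptions, not the design from the slides: `ToySCPU` is a hypothetical class, HMAC stands in for the SCPU's public-key signatures (so verification happens inside the same object instead of at the client), and records are assumed to expire strictly in serial order. It shows how two signed boundaries are enough to tell "deleted" apart from "never existed" without any per-record tree.

```python
import hashlib
import hmac


class ToySCPU:
    """Hypothetical stand-in for the SCPU; HMAC replaces its signatures."""

    def __init__(self, key: bytes):
        self._key = key
        self.low = 0    # smallest serial number still under retention
        self.high = 0   # next serial number to hand out

    def _sign(self, message: bytes) -> bytes:
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def _record_message(self, serial: int, data: bytes) -> bytes:
        return b"rec|%d|" % serial + hashlib.sha256(data).digest()

    def insert(self, data: bytes):
        """Assign the next serial number and sign (serial, content hash)."""
        serial = self.high
        self.high += 1
        return serial, self._sign(self._record_message(serial, data))

    def expire_oldest(self):
        """Slide the lower window boundary past the oldest record."""
        if self.low < self.high:
            self.low += 1

    def signed_window(self):
        """Sign only the two boundaries: O(1) work however many records exist."""
        return self.low, self.high, self._sign(b"win|%d|%d" % (self.low, self.high))

    def check_read(self, serial, stored, window):
        """Classify a read result against a signed window."""
        low, high, win_sig = window
        if not hmac.compare_digest(win_sig, self._sign(b"win|%d|%d" % (low, high))):
            return "window forged"
        if serial < low:
            return "deleted"
        if serial >= high:
            return "never existed"
        if stored is None:
            return "tampered"          # inside the window, so it must be present
        data, sig = stored
        expected = self._sign(self._record_message(serial, data))
        return "intact" if hmac.compare_digest(sig, expected) else "tampered"
```

A read for a serial number inside the window must come back with a valid signature; anything below the lower boundary was deleted, and anything at or above the upper boundary was never written.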
  • 37. Trustworthy Indexing: Indexing ensures that a target record can be quickly extracted from terabytes of data.
  • 38. Trustworthy Indexing: An indexing approach for trustworthy records retention must have the following properties: 1- The search path to an index entry must be immutable for the lifetime of the record that it indexes. 2- The indexing code should reside outside the storage server to keep the trusted computing base small.
  • 39. Trustworthy Indexing: An indexing approach for trustworthy records retention must have the following properties: 3- The insertion and indexing of a record must be performed atomically. 4- All traces of a record must be removed from the index when the record is deleted.
  • 40. Trustworthy Indexing: The first step in ensuring trustworthy indexing is to store the index on WORM. However, use of WORM alone is insufficient to make an index trustworthy, because of the following problems in B-tree and hash-based structures.
  • 41. Trustworthy Indexing: Using a B-tree to store the index: The problem arises when a node overflows and is split into two nodes. [Figure: Write-Once B-Tree]
  • 42. Trustworthy Indexing: Using a B-tree to store the index: Two pointers are added to the end of its parent node, superseding the earlier pointer to the old node. [Figure: Write-Once B-Tree after inserting 45]
  • 43. Trustworthy Indexing: Using a B-tree to store the index: So an adversary can effectively modify any record he wishes by creating a new version of the appropriate nodes during the copy operation. [Figure: Tampered Write-Once B-Tree, with 51 omitted during the insertion of 45]
  • 44. Trustworthy Indexing: Using a hash-based structure to store the index: The problem arises when the number of records in a hash table exceeds a high-water mark. A new, larger hash table is allocated, and all the records are rehashed and moved into the new table.
  • 45. Trustworthy Indexing: Use hash-based structure to store the index: The ability to relocate records, however, provides an opportunity for an adversary to alter the record during the copying step.
  • 46. Trustworthy Indexing: Hash-based structures and B-trees All these approaches are vulnerable because the search path to a particular record is not term-immutable. Researchers have therefore proposed trustworthy versions of hashing and inverted indexes, both of which guarantee term-immutable search paths.
  • 47. Trustworthy Indexing: Generalized hash tree A GHT is a balanced tree-based data structure that does not require periodic rebalancing. In a GHT, predefined hashes of the record key determine all possible lookup or insertion locations.
  • 48. Trustworthy Indexing: Generalized hash tree The locations where a record can be inserted or looked up are therefore immutable. [Figure: GHT]
  • 49. Trustworthy Indexing: Generalized hash tree To insert or look up a record in a GHT, the record key is hashed to obtain a position within the root node. If the corresponding node position at the root node is empty, the record is inserted there.
  • 50. Trustworthy Indexing: Generalized hash tree If there is a collision, the key is rehashed (using a different hash function) and an attempt is made to insert the key in the appropriate subtree of the root node.
  • 51. Trustworthy Indexing: Generalized hash tree This process is repeated until an empty node position is found. If the record cannot be inserted, a new leaf node is added. [Figure: GHT after insertion of a key k with h0(k)=1, h1(k)=0, h2(k)=7, h3(k)=2]
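The insertion and lookup procedure described on the GHT slides can be sketched in Python as below. Several details are assumptions made for the sketch and are not stated in the slides: every node has the same fixed size, the per-level hash functions are derived from SHA-256 salted with the level number, and the subtree to descend into is chosen by the slot where the collision occurred. The names `ToyGHT` and `GHTNode` are hypothetical.

```python
import hashlib


class GHTNode:
    def __init__(self, size):
        self.slots = [None] * size   # each slot is written at most once
        self.children = {}           # slot position -> child node


class ToyGHT:
    """Simplified generalized hash tree with write-once slots."""

    def __init__(self, node_size=4):
        self.node_size = node_size
        self.root = GHTNode(node_size)

    def _position(self, key: bytes, level: int) -> int:
        # A different hash function per level, derived by salting SHA-256.
        digest = hashlib.sha256(b"%d|" % level + key).digest()
        return int.from_bytes(digest[:8], "big") % self.node_size

    def insert(self, key: bytes, value):
        """Place the key in the first empty slot on its fixed search path."""
        node, level = self.root, 0
        while True:
            pos = self._position(key, level)
            if node.slots[pos] is None:
                node.slots[pos] = (key, value)
                return level
            if node.slots[pos][0] == key:
                raise KeyError("key already indexed; slots are never rewritten")
            if pos not in node.children:
                node.children[pos] = GHTNode(self.node_size)   # new leaf node
            node = node.children[pos]
            level += 1

    def lookup(self, key: bytes):
        """Follow the same hash-determined path that insert followed."""
        node, level = self.root, 0
        while node is not None:
            pos = self._position(key, level)
            entry = node.slots[pos]
            if entry is None:
                return None
            if entry[0] == key:
                return entry[1]
            node = node.children.get(pos)
            level += 1
        return None
```

Because the path depends only on the key and the fixed hash functions, and occupied slots are never moved or rewritten, a later insertion cannot relocate an existing entry the way a B-tree split or a hash-table resize can.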
  • 52. Trustworthy Indexing: Inverted indexes Keyword search is the most convenient way to query unstructured records such as email bodies and reports. Search engines typically use inverted indexes for this purpose.
  • 53. Trustworthy Indexing: Inverted indexes An inverted index comprises a dictionary of terms plus a posting list for each term containing the identifiers of all records containing that term with additional metadata.
  • 54. Trustworthy Indexing: Inverted indexes Queries are answered by scanning the posting lists of the terms in the query. [Figure: Ordinary inverted index, with posting lists for the terms Query, Data, Base, Worm, and Index]
  • 55. Trustworthy Indexing: Inverted indexes For a trustworthy version of inverted indexes, each posting list can be stored in a separate append-only file on WORM storage, but this approach is too slow to support real-time insertion of typical business documents.
  • 56. Trustworthy Indexing: Inverted indexes The performance can be improved vastly by merging the posting lists for different terms until the tails of all posting lists fit into the storage server cache.
  • 57. Trustworthy Indexing: Inverted indexes Provided that the “popular” terms are not merged together, query performance is little affected by merging. [Figure: Inverted index after merging, with each posting tagged by its term, e.g. 3#Data, 3#Base, 9#Data, 19#Base]
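A minimal Python sketch of merged posting lists follows. It assumes, for illustration only, that terms are assigned to a fixed number of merged lists by hashing; the slides instead suggest choosing the merge so that popular terms do not share a list. The class name and parameters are hypothetical, and Python lists that are only ever appended to stand in for append-only files on WORM storage.

```python
import hashlib


class MergedInvertedIndex:
    """Posting lists for many terms share a small number of append-only lists."""

    def __init__(self, num_lists: int):
        self.num_lists = num_lists
        self.lists = [[] for _ in range(num_lists)]

    def _list_for(self, term: str) -> int:
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.num_lists

    def add_document(self, doc_id: int, text: str):
        # One posting per distinct term, tagged with the term it belongs to.
        for term in sorted(set(text.lower().split())):
            self.lists[self._list_for(term)].append((doc_id, term))

    def lookup(self, term: str):
        # Scan the merged list and keep only the postings for this term.
        term = term.lower()
        return [doc_id for doc_id, tag in self.lists[self._list_for(term)]
                if tag == term]
```

With fewer lists, the tail blocks of all lists are more likely to stay in cache during insertion, at the price of scanning postings for unrelated terms at query time.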
  • 58. Trustworthy Indexing: B+ tree Multi-keyword conjunctive queries can be answered by intersecting the posting lists of the query terms. To make the intersection fast, an additional index such as a B+ tree is usually kept for each posting list, and zigzag join is used to perform the intersection.
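The zigzag join mentioned above can be sketched in Python as follows. Binary search via the standard `bisect` module stands in for the per-list B+ tree (an assumption of the sketch); both inputs are assumed to be sorted lists of document IDs.

```python
from bisect import bisect_left


def zigzag_intersect(a, b):
    """Intersect two sorted posting lists, skipping ahead with an index lookup."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Jump in a to the first entry >= b[j] instead of stepping one by one.
            i = bisect_left(a, b[j], i + 1)
        else:
            j = bisect_left(b, a[i], j + 1)
    return out
```

Each side alternately uses the other side's current document ID as a search key, so long runs of non-matching IDs are skipped rather than scanned.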
  • 59. Trustworthy Indexing: B+ tree A B+ tree can be created for an increasing sequence of document IDs without any node splits or merges, by building the tree from the bottom up. [Figure: B+ tree on WORM]
  • 60. Trustworthy Indexing: B+ tree Such an index structure is still not trustworthy, even when kept on WORM storage, because the path to each entry is not immutable.
  • 61. Trustworthy Indexing: B+ tree The adversary can hide an entry by creating a separate subtree that does not contain it and adding an entry at the root that leads to the new subtree. [Figure: B+ tree manipulated to hide 31 by adding 25 to the root]
  • 62. Trustworthy Indexing: B+ tree To address this problem with B+ trees, researchers proposed the jump index technique.
  • 63. Trustworthy Indexing: Jump Indexes A jump index can be used to index monotonic sequences, such as document IDs in a posting list, as a replacement for non-trustworthy B+ trees. Jump index lookup performance is within a factor of 1.4 of the performance of an equivalent B+ tree.
  • 64. Trustworthy Indexing: Jump Indexes In a jump index, to reach a particular number $k < N$, we can jump from 0 to $k$ in powers of two. For example, let $b_1 b_2 \dots b_p$ be the binary representation of $k$. We can reach $k$ in $p$ steps by starting at zero, then jumping forward by $b_1 \cdot 2^{p-1}$ integers, then jumping forward by $b_2 \cdot 2^{p-2}$ integers, and so on, until finally a $b_p \cdot 2^0$ jump brings us to the number $k$.
  • 65. Trustworthy Indexing: Jump Indexes The $i$th jump pointer stored with jump index entry (node) $L$ points to the smallest jump index entry (node) $L'$ such that $L + 2^i \le L' < L + 2^{i+1}$. Lookups can be done in $O(\log_2 N)$ time, where $N = 2^p$. [Figure: Binary jump index with entries 1, 2, 5, 7, 10, and 15, each carrying jump pointers 0 to 4]
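A binary jump index can be sketched in Python as below. The class and method names are hypothetical, a dictionary stands in for entries stored on WORM, and IDs are assumed to arrive in strictly increasing order, as they do in a posting list. Each jump pointer is written once, by the first entry routed to it, and is never changed afterwards; the jump range is taken as the half-open interval from L + 2^i up to (but not including) L + 2^(i+1).

```python
class JumpIndex:
    """Binary jump index over a strictly increasing sequence of integer IDs."""

    def __init__(self):
        self.pointers = {}   # entry -> {i: target entry}, each pointer set once
        self.root = None
        self.last = None

    def insert(self, k: int):
        if self.last is not None and k <= self.last:
            raise ValueError("entries must be inserted in increasing order")
        self.pointers[k] = {}
        if self.root is None:
            self.root = k
        else:
            node = self.root
            while True:
                # i = floor(log2(k - node)): the only pointer whose range holds k
                i = (k - node).bit_length() - 1
                target = self.pointers[node].get(i)
                if target is None:
                    self.pointers[node][i] = k   # write-once pointer
                    break
                node = target
        self.last = k

    def contains(self, k: int) -> bool:
        node = self.root
        while node is not None and node < k:
            i = (k - node).bit_length() - 1
            node = self.pointers[node].get(i)
        return node == k
```

Inserting the IDs 1, 2, 5, 7, 10, 15 from the slide's figure, a lookup for 15 follows 1 → 10 → 15, while a lookup for 6 follows 1 → 5 and stops at an unset pointer. Since a lookup retraces exactly the path the insertion took, and pointers are never rewritten, an entry cannot be hidden by adding later entries.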
  • 66. Trustworthy Indexing: Jump Indexes By using a block-structured jump index we can gain better space and time efficiency. In a block-structured jump index, p posting entries are stored together in blocks of size L.
  • 67. Trustworthy Indexing: Jump Indexes Pointers are associated with blocks rather than with every entry. Jump pointers are calculated using powers of B rather than two, where p >= B. Each pointer is uniquely identified by a pair (i, j), where $0 \le i < \log_B N$ and $1 \le j < B$. [Figure: Block-structured jump index with Block 0 (1, 2, 5, 7), Block 1 (8, 10, 15, 19), and Block 2 (21, 22, 25), each block carrying jump pointers labeled (i, j)]
  • 68. Trustworthy Indexing: In the trustworthy indexing approaches described above, an adversary can still insert malicious entries into the index.
  • 69. Trustworthy Indexing: Malicious entries fall into two categories: 1- Those that cause subsequent legitimate insertions to fail. 2- Those that will only be noticed when a lookup operation finds a dangling pointer or returns a record that does not match the query. Both events will draw immediate unwanted attention to the attack.
  • 70. Trustworthy Indexing: If an adversary gains physical access to storage, he may tamper with the index contents. If we have trusted hardware, such as an SCPU, that can periodically sign portions of the index, then any discrepancy between the signature and the current index contents can be detected.
  • 71. Trustworthy migration: It is impractical to store records on a single server for decades, as the server will become obsolete and too expensive to maintain. When the records must be moved, the migration process needs to be trustworthy even if a super-user adversary performs the migration.
  • 72. Trustworthy migration: Researchers have developed two schemes for trustworthy migration of records between compliance storage servers.
  • 73. Trustworthy migration: First migration scheme 1- Migration is initiated by the system operator retrieving a migration certificate (MC) from the regulatory authority (RA). The MC is a signature on a message containing the timestamped identities of SCPU1 and SCPU2.
  • 74. Trustworthy migration: First migration scheme 2- Upon migration, the MC is presented to SCPU1 and SCPU2, which authenticate the signature of the RA. 3- If this succeeds, SCPU1 mutually authenticates and performs a key exchange with SCPU2, using their internally stored key pairs and certificates.
  • 75. Trustworthy migration: First migration scheme 4- If step 3 succeeds, SCPU1 transfers a description of the state of the compliance records and index contents over a secure channel protected by the agreed-upon symmetric key.
  • 76. Trustworthy migration: First migration scheme After the state information has been migrated, the actual records and index contents can be transferred by the main CPU, without SCPU involvement.
  • 77. Trustworthy migration: Second migration scheme This approach relies on the existence of a trusted third party, such as a storage system vendor.
  • 78. Trustworthy migration: Second migration scheme The migration process is divided into three phases: Phase 1: The party in charge of migration prepares a plan for the migration. The log of this plan includes the policies governing the migration and, in compact form, a representation of the list of files and directories to be migrated.
  • 79. Trustworthy migration: Second migration scheme Phase 2: The current storage server generates certificates that attest to the current state of the directory and file contents, and adds them to the log.
  • 80. Trustworthy migration: Second migration scheme Phase 3: Finally, the party in charge of migration moves the files to be migrated and copies the log to the new server. The public key used by the organization’s series of storage servers, together with validation routines, is then used to check whether the migration took place appropriately.
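The three phases of the second scheme can be sketched in Python as a manifest that is attested before the move and validated after it. This is only an illustration under stated assumptions: the function names and log layout are hypothetical, files are modeled as an in-memory dictionary of path to bytes, and HMAC with a shared key stands in for the public-key certificates that the storage servers would actually issue.

```python
import hashlib
import hmac
import json


def build_manifest(files):
    """Phase 1: compact representation of what is to be migrated."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def attest(manifest, server_key: bytes) -> str:
    """Phase 2: the current server attests to the manifest (HMAC as a stand-in)."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hmac.new(server_key, payload, hashlib.sha256).hexdigest()


def validate_migration(log, migrated_files, server_key: bytes):
    """Phase 3: check the files on the new server against the attested log."""
    manifest = log["manifest"]
    if not hmac.compare_digest(log["attestation"], attest(manifest, server_key)):
        return ["log attestation does not verify"]
    problems = []
    for path, digest in manifest.items():
        if path not in migrated_files:
            problems.append("missing: " + path)
        elif hashlib.sha256(migrated_files[path]).hexdigest() != digest:
            problems.append("modified: " + path)
    for path in migrated_files:
        if path not in manifest:
            problems.append("unexpected: " + path)
    return problems
```

An empty result means every file listed in the attested log arrived unchanged and nothing extra was added; any omission or modification made during the copy shows up as a named problem.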
  • 81. Trustworthy deletion : The primary purpose of WORM devices is to prevent data deletion. However, simple erasure is not enough for trustworthy deletion, as an erased record can be recreated by reverse-engineering an index.
  • 82. Trustworthy deletion : For the deletion of document d to be strongly secure, the presence or absence of any word w in any reconstruction of d should not convey any information about the presence of w in the original document.
  • 83. Trustworthy deletion : The trustworthy indexing schemes described earlier don’t support strongly secure deletion. * Generalized hash trees (GHT) offer weakly secure deletion. * The trustworthy inverted index and jump index are even more problematic with respect to deletion.
  • 84. Trustworthy deletion : With an inverted index, for example, when a record is deleted it may be possible to recreate the record exactly by looking at its index entries. Therefore, the index entries must also be removed to ensure that deleted records cannot be reproduced.
  • 85. Trustworthy deletion : In addition, structural properties of the index may allow an adversary to infer that the deleted index entries existed.
  • 86. Trustworthy deletion : To address this problem two options are proposed: 1- Divide expiration times into epochs, and keep a separate set of indexes for the records expiring in each epoch. One could then delete the entire set of indexes for an epoch once the epoch is over. However, this option is impractical because litigation holds may require a document to be retained even after its mandatory retention period is over.
  • 87. Trustworthy deletion : To address this problem two options are proposed: 2- Rebuild the index in a trustworthy manner when records are deleted. However, today’s record arrival rate will be the required record deletion rate in the future, so this option is too expensive to be practical.
  • 88. Trustworthy deletion : Using encryption: Encrypting the document identifiers before they are stored in the index is inadequate, because one can still perform a join on the encrypted document identifiers to recover the document content.
  • 89. Trustworthy deletion : Using an inverted index with encryption: An alternative for a trustworthy inverted index is to merge posting lists together as usual, then encrypt the term encoding associated with each posting element and store it in the merged posting list entries.
  • 90. Trustworthy deletion : Using an inverted index with encryption: One possible encryption technique is to replace the keyword encoding E in the posting element with its XOR with a random secret, which can be stored with the record and deleted upon its expiration. [Figure: Supporting deletion from a trustworthy inverted index]
  • 91. Trustworthy deletion : Using an inverted index with encryption: After the secret is discarded, the adversary will not be able to determine which of the q merged keywords corresponds to the posting element. The scheme does not achieve strongly secure deletion, though it is immune to a variety of possible attacks.
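The XOR idea can be sketched in Python as below. The names, the 5-bit encoding width, and the choice of a single secret per record are assumptions made for the sketch; the slides only say that the secret is stored with the record and deleted on expiration.

```python
import secrets

ENCODING_BITS = 5   # assumed width of a term encoding within a merged list


class Record:
    """A retained record together with its per-record random secret."""

    def __init__(self, doc_id: int):
        self.doc_id = doc_id
        self.secret = secrets.randbits(ENCODING_BITS)


def make_posting(record: Record, term_encoding: int):
    """Store the term encoding XORed with the record's secret."""
    return (record.doc_id, term_encoding ^ record.secret)


def recover_encoding(record: Record, posting):
    """Recover the term encoding, or None once the secret has been discarded."""
    if record.secret is None:
        return None
    return posting[1] ^ record.secret


def expire(record: Record):
    """Discard the secret; the posting element itself stays on WORM."""
    record.secret = None
```

While the secret exists, a query can still tell which of the merged terms a posting belongs to. In this simplified form one secret covers all postings of a record, so the XOR of two of its stored values still equals the XOR of the two term encodings, which is one way to see why a sketch like this falls short of strongly secure deletion.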
  • 92. Open Problems: The biggest open issues and challenges in trustworthy record retention 1- Corrections Current models for records retention don’t support corrections to record content. An elegant, cost-effective approach is needed for supporting corrections.
  • 93. Open Problems: The biggest open issues and challenges in trustworthy record retention 2- Deletions No entirely satisfactory scheme exists for trustworthy deletion of records. Traces of record metadata may remain in indexes or migration logs, allowing an adversary to infer the contents of a deleted record.
  • 94. Open Problems: The biggest open issues and challenges in trustworthy record retention 3- Structured Information Database records need a similar level of protection, but no work to date has addressed this problem.
  • 95. Open Problems: The biggest open issues and challenges in trustworthy record retention 4- Exploiting trusted hardware It is important to explore how to deploy trusted hardware with WORM storage to achieve increased security and efficiency in the upper layers (e.g. indexing).