Data Intensive Grid Service Model

CS6703 - GRID AND CLOUD COMPUTING
2.6. Data Intensive Grid Service Models
By
M.Gomathy Nayagam, AP(SG)/CSE
Ramco Institute of Technology, Rajapalayam

DATA INTENSIVE GRID SERVICE MODEL
 Grid applications fall into two groups:
 Computation intensive
 Data intensive
 Data-intensive applications have to deal with
massive amounts of data.
 Example: the data produced by the Large Hadron
Collider may exceed several petabytes every year.
 A data-intensive grid system is designed to discover,
transfer, and manipulate these massive data sets.
 Transferring massive data sets is time-consuming.
 Let us discuss some mechanisms for solving data
movement problems.

DATA REPLICATION AND UNIFIED NAMESPACE.
 Data replication, as a data access method, is also
known as caching.
 Caching enhances data access efficiency.
 Replication stores copies of the same data blocks and
scatters them across multiple regions of the grid.
 Hence users can access the same data with locality of
reference.
 Key data is not lost in case of a failure.
 However, replication increases storage requirements
and network bandwidth consumption.
 Replication strategies determine when and where to
create a replica of the data.
 The factors to consider include data demand, network
conditions, and transfer cost.
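
As a minimal sketch of how these factors might be weighed, the code below scores candidate sites by expected demand per second of estimated transfer time and skips sites without enough free storage. The `Site` class, the `choose_replica_site` function, the scoring rule, and all numbers are assumptions made for illustration; they are not taken from any grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    demand: int             # expected requests per day near this site (assumed metric)
    bandwidth_mbps: float   # available bandwidth from the source to this site
    free_storage_gb: float  # storage still available at this site


def transfer_seconds(size_gb: float, bandwidth_mbps: float) -> float:
    """Estimated time to copy size_gb to a site (1 GB taken as 8000 megabits)."""
    return size_gb * 8000 / bandwidth_mbps


def choose_replica_site(sites, size_gb):
    """Pick the site with the highest demand per second of transfer cost."""
    candidates = [s for s in sites if s.free_storage_gb >= size_gb]
    if not candidates:
        return None  # no site can hold the replica
    return max(
        candidates,
        key=lambda s: s.demand / transfer_seconds(size_gb, s.bandwidth_mbps),
    )


sites = [
    Site("A", demand=100, bandwidth_mbps=100, free_storage_gb=500),
    Site("B", demand=300, bandwidth_mbps=1000, free_storage_gb=500),
    Site("C", demand=900, bandwidth_mbps=1000, free_storage_gb=10),
]
best = choose_replica_site(sites, size_gb=50)
print(best.name)
```

Site C has the highest demand but is excluded because it cannot store a 50 GB replica; a real strategy would likely also account for current replica locations and changing network conditions.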

DATA REPLICATION AND UNIFIED NAMESPACE.
 Two types of replication strategies are:
 Static replication
 The locations and number of replicas are determined in
advance and will not be modified.
 It cannot adapt to changes in demand,
bandwidth, and storage availability.
 Dynamic replication adjusts the locations and number of
data replicas according to changes in conditions.
 Frequent data-moving operations can result in
much more overhead than in static strategies.
 The replication strategy must be optimized with
respect to the status of data replicas.
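
The difference between the two strategies can be sketched in a few lines. In the hypothetical `DynamicReplicator` below, a site receives its own replica once it has made a given number of remote accesses; a static strategy would simply keep the initial replica set unchanged. The class name, the threshold rule, and the site names are assumptions chosen for illustration.

```python
from collections import Counter


class DynamicReplicator:
    """Creates a replica at a site after repeated remote accesses (assumed policy)."""

    def __init__(self, home_site, create_threshold=3):
        self.replicas = {home_site}        # sites that currently hold a copy
        self.create_threshold = create_threshold
        self.remote_hits = Counter()       # remote accesses counted per site
        self.transfers = 0                 # data-moving operations performed

    def access(self, site):
        if site in self.replicas:
            return "local"
        self.remote_hits[site] += 1
        if self.remote_hits[site] >= self.create_threshold:
            # Demand is high enough: pay one transfer to place a replica here.
            self.replicas.add(site)
            self.transfers += 1
            del self.remote_hits[site]
        return "remote"


replicator = DynamicReplicator("site-1")
results = [replicator.access("site-2") for _ in range(4)]
print(results, sorted(replicator.replicas), replicator.transfers)
```

Each replica created this way costs one transfer, which is the overhead that frequent data-moving operations add compared with a static strategy.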

GRID DATA ACCESS MODEL
 Multiple participants may want to share the same data
collection.
 To retrieve any piece of data, a grid with a unique global
namespace is needed.
 Similarly, file names must be unique.
 So, we need to resolve inconsistencies among multiple data
objects bearing the same name.
 Data needs to be protected to avoid leakage and damage.
 Users who want to access data have to be authenticated first
and then authorized for access.
 There are four data access models:
 Monadic Model
 Hierarchical Model
 Federation Model
 Hybrid Model

Monadic model:
 Centralized data repository
model
 Data is saved in a central data
repository.
 Users have to submit requests
directly to the central
repository for accessing data.
 No data is replicated for the
sake of data locality.
 It is a simple model.
 Data replication is permitted in
this model only when fault
tolerance is demanded.

Hierarchical model:
 It is suitable for building a large
data grid
 The data may be transferred
from the source to a second-
level center.
 Then some data in the regional
center is transferred to the
third-level center.
 After being forwarded several
times, specific data objects are
accessed directly by users.
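
The forwarding described above can be sketched as a chain of centers, each of which pulls data from the level above on a miss and keeps a copy. The `Center` class, the level names, and the key `"run42"` are hypothetical and only illustrate the idea.

```python
class Center:
    """One level in a hierarchical data grid (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # next higher level, or None for the source
        self.store = {}        # data objects held at this level

    def fetch(self, key):
        """Return (data, path), where path lists the centers the data passed through."""
        if key in self.store:
            return self.store[key], [self.name]
        if self.parent is None:
            raise KeyError(key)
        data, path = self.parent.fetch(key)
        self.store[key] = data  # keep a copy at this level for later requests
        return data, path + [self.name]


source = Center("source")
regional = Center("second-level", parent=source)
local = Center("third-level", parent=regional)
source.store["run42"] = b"events"

first_data, first_path = local.fetch("run42")
second_data, second_path = local.fetch("run42")
print(first_path, second_path)
```

The first request travels through all three levels; the second is served directly from the third-level center.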

Federation Model:
 It is suited for designing a data
grid with multiple data sources.
 This model is also known as a
mesh model.
 The data sources are distributed
to many different locations.
 Although the data is shared, the
data items are still owned and
controlled by their original owners.
 According to predefined access
policies, only authenticated users
are authorized to request data
from any data source.
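
A minimal sketch of this ownership rule follows: each data source keeps its own items and its own access policy, and the federation only routes requests. The `DataSource` and `Federation` classes, the boolean `authenticated` flag, and the user and owner names are assumptions for illustration; a real grid would rely on its security infrastructure for authentication.

```python
class DataSource:
    """A data source that stays under the control of its owner."""

    def __init__(self, owner, allowed_users):
        self.owner = owner
        self.allowed_users = set(allowed_users)  # predefined access policy
        self.items = {}

    def request(self, user, authenticated, key):
        if not authenticated:
            raise PermissionError("user is not authenticated")
        if user not in self.allowed_users:
            raise PermissionError(f"{user} is not authorized by {self.owner}")
        return self.items[key]


class Federation:
    """Routes a request to the owning source; it never takes over the data."""

    def __init__(self, sources):
        self.sources = {s.owner: s for s in sources}

    def request(self, user, authenticated, owner, key):
        return self.sources[owner].request(user, authenticated, key)


lab_a = DataSource("lab-a", allowed_users=["alice"])
lab_a.items["survey"] = "lab-a survey data"
lab_b = DataSource("lab-b", allowed_users=["alice", "bob"])
lab_b.items["survey"] = "lab-b survey data"
fed = Federation([lab_a, lab_b])

print(fed.request("alice", True, "lab-a", "survey"))
print(fed.request("bob", True, "lab-b", "survey"))
```

Here `bob` can read from `lab-b` but a request to `lab-a` would raise `PermissionError`, because each owner applies its own policy.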

Hybrid Model
 The model combines the best
features of the hierarchical
and mesh models.
 Traditional data transfer
technology, such as FTP,
is suited to networks with
lower bandwidth.

PARALLEL VERSUS STRIPED DATA
TRANSFERS
 Parallel data transfer opens multiple data streams
for passing subdivided segments of a file
simultaneously.
 Although the speed of each stream is the same as
in sequential streaming, the total time to move data
in all streams can be significantly reduced
compared to FTP transfer.
 In striped data transfer, a data object is partitioned
into a number of sections, and each section is
placed in an individual site in a data grid.
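
The idea of moving subdivided segments at the same time can be sketched with a thread pool, where each worker stands in for one data stream. The function names (`split`, `send_segment`, `parallel_transfer`) and the in-memory "transfer" are assumptions for illustration; no real network protocol is involved. A striped transfer would follow the same pattern, except that each segment would be read from the site where it is stored.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data: bytes, n_streams: int):
    """Divide data into at most n_streams contiguous (offset, chunk) segments."""
    if not data:
        return []
    size = -(-len(data) // n_streams)  # ceiling division
    return [(off, data[off:off + size]) for off in range(0, len(data), size)]


def send_segment(segment):
    """Stand-in for one data stream; a real tool would send the chunk over a socket."""
    offset, chunk = segment
    return offset, bytes(chunk)


def parallel_transfer(data: bytes, n_streams: int = 4) -> bytes:
    segments = split(data, n_streams)
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        received = list(pool.map(send_segment, segments))
    # pool.map already returns results in submission order; sorting by offset
    # just makes the reassembly rule explicit.
    return b"".join(chunk for _, chunk in sorted(received))


payload = bytes(range(256)) * 10
copy = parallel_transfer(payload, n_streams=4)
print(len(split(payload, 4)), copy == payload)
```

Every segment carries its offset, so the receiver can rebuild the file regardless of the order in which the streams finish.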
