SlideShare a Scribd company logo
1 of 9
Download to read offline
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
DOI: 10.5121/ijgca.2014.5101 1
METHOD FOR CONDUCTING A COMBINED
ANALYSIS OF GRID ENVIRONMENT’S FTA AND
GWA THROUGH SESSION BASED MAPPING OF
TRACE VARIABLES
Ramandeep Singh1
and R.K.Bawa2
1
Lovely Professional University, Jalandhar, India
2
Department of Computer Science, Punjabi University Patiala, India
ABSTRACT
Grid computing environment due to its scale and heterogeneous nature is more vulnerable to faults. To
store and analyze fault and workload information, FTA Fault Trace Archive and GWA Grid Workload
Archive are used. Previously researchers have analyzed FTA and GWA as separate research problems but
in this research paper we have proposed a method for conducting a combined analysis of FTA and GWA
based on session based mapping of trace file variables. This is the first attempt to conduct a combined
analysis of these two trace files. Along with the step by step process of combining trace files we have also
included do’s and don’ts while conducting this analysis .Through this combined analysis we have
established a correlation based relationship among number of node failures, number of failed jobs, failure
duration and number of nodes. We have found that these variables are positively correlated with different
correlation coefficients.
Keywords: Grid Computing, FTA Fault Trace Archive, GWA Grid Workload Archive,
Correlation
1. Introduction
Grid computing is a concept which involves combining computing and data storage resources
from different sources into virtual organizations (VO). These resources are of heterogeneous
nature and are connected through dedicated network links or via internet. Due to its large size and
lack of central control Grid resources are vulnerable to different types of faults. These faults
affect the quality of service of Grid by either failing the submitted jobs or by delaying execution
of these jobs. So researchers and developers need to analyze workload and fault information to
study the Grid environment and to improve the overall Grid performance along with reliability.
Trace files are used to collect data about the events taking place inside the Grid. These events
include faults which are occurring on the Grid resources and events related to the jobs which are
submitted for execution on Grid. Basically two types of trace files are used in Grid for collecting
resource availability and job data which are FTA and GWA.
FTA stands for Fault Trace Archive [1] and GWA stands for Grid Workload Archive [2]. FTA
collects information about the faults which are taking place on the Grid nodes and GWA collects
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
2
information about the workload which is being submitted on the Grid. FTA contain information
about nodes, platforms, sites, hardware configuration, availability and unavailability events, time
of occurrence of these events and the corresponding step which was taken to deal with the failure
situation. In the same way GWA contain information about job id, submission time, waiting time,
processor requirement, estimated execution time, status and other information fields related to
queues, users and groups. Status of a job can have different values e.g. completed, failed,
cancelled or any other trace file specific value. Trace files are available in different formats e.g.
raw, tabbed or mysql.FTA is a collection of different files which can be joined with each other
with the help of the common fields, like we perform join operation among SQL tables. Due to
space limitation we cannot include the whole format of trace files but details about the format and
design of FTA can be found from FTA web page [3] and the detailed format information of GWA
can be found from GWA web page [4][9].
These trace files can be analyzed using different tools e.g. Matlab, GridSim or mysql. GridSim is
a good tool in case you want to run simulations on these trace files according to the constraints of
a Grid environment. These trace files i.e. FTA and GWA have been individually analyzed by
many researchers for studying Grid environment for different purposes, but no one have
considered combining these trace files to study the relationship among two. In this research paper
we are going to propose and implement a technique using which we can combine FTA and GWA.
Both these trace files are collected from the same Grid platform simultaneously i.e. in parallel.
Using this technique we can study the influence of the events of one trace file on the events of
another trace file.
2. Related Work
Trace files are a good way of collecting data about the events taking place inside a system either
distributed or centralized. Analysis of these trace files can reveal many interesting facts related to
the system. Artur Andrzejak et. al [5]have worked on SETI@home host trace files, which
consisted of 48000 hosts, to analyze host availability patterns. They proposed a model which can
ensure that a certain number of hosts will be available for a certain amount of time either with
replication or with over provisioning of resources. Similarly Bahman Javadi, et al., [6] have
collected and analyzed SETI @home trace files for discovering subset of hosts, which share
similar kind of statistical availability patterns. Out of 230000 hosts they have discovered that
availability of around 34% of hosts is a truly random process but rest of these hosts can often be
modeled in the form of different groups with few distinctions from one another. User oriented
analysis reveals the facts about the workload patterns which are submitted by different users
[11][12].
Bianca [7] have used trace file to analyze and predict the average life span of hard drives being
used at a high performance cluster. Data sheets of these hard drives show that MTTF (Mean Time
to Failure) is around 1,000,000 to 15,000,000 hours, suggesting an annual failure rate around
0.88%. But analysis to the actual field data show that the minimum disk replacement rate is 1%
and usually it is up to 3-4%, and in some cases up to 13% replacement rate is there. Analysis
show that actual data gathered from field may differ from the conceptual or ideal data. Nezih
Yigitbasi et.al. have analyzed availability and unavailability of Grid nodes and have identified
that there exist a predictable pattern in these events and this behavior can be modeled. They have
also established a correlation based model for node failures [8].
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
3
GWA trace files are used by researchers to study what kind of jobs are submitted on the Grid and
how does the success and failure of a job is dependent on different parameters of job. Grid
environments are either application specific or job specific. Application specific Grid can execute
or process only a specific application related tasks while on the other hand job specific Grid can
execute different types of jobs which may belong to different applications. Although individual
analysis of trace files is very useful but in this research paper we are going to propose a method
of combining these two trace files for a combined analysis.
3. Combining FTA and GWA
Following are the operations which need to be performed for conducting combined analysis.
3.1 Acquiring FTA and GWA Trace Files
For conducting a combined analysis of FTA and GWA very first condition is that both trace files
should belong to the same Grid environment and should be collected at the same time i.e. events
should be logged in parallel. Reason for this is that if these trace files belong to different
platforms or are collected at different time intervals then there will be no benefit of conducting
the combined analysis because then these events cannot be related to each other and the analysis
will make no sense. In the absence of mysql format of trace files, we can use SPSS for importing
data and converting it in required formats. From the excel format we can insert data in SQL or
mysql table.
3.2 Trimming Trace Files
After acquiring both trace files we need to check event start time and event end time of both trace
files. Reason for this is that collection process of these trace files may start and end at different
times so we need to trim one or both the trace files at a common point of event start time and
event end time. Let’s consider an example where FTA trace file was started at 10:00:00 am, 10
Jan 2013 and it ends on 12:00:00 pm, 15 Jan 2014. Similarly GWA was started at 09:00:00 am,
31 Dec 2012 and it ended on 11:00:00 am, 1 March 2014. Although both these trace files are
collected from the same environment and individually there is nothing wrong with these trace
files but for combined analysis we need to trim either one or both from either beginning or end to
synchronize the event start and event end time. So in case of above considered example we will
trim GWA so that first event start time is 10:00:00 am, 10 Jan 2013 and last event end time will
be 12:00:00 pm, 15 Jan 2014. In order to remove noise we will also have to remove the events
which start before the trimming point but end after it, because such events may cause deviations
in calculations. Same will be applicable to events which start before the trimming point but end
after the trimming point. Although we can say that few such events will not make a difference
when we are talking about millions of such events, but to minimize the errors this is the one
precaution which should be considered in the beginning. So at the end of this step we will have
two trace files which start and end at the same time.
3.3 Slicing and Dicing
Now that we have two trace files of equal time duration we can start with our analysis. Slicing
means that we will divide trace files into sub parts. But the one thing to keep in mind is that these
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
4
slices should be of equal size, otherwise it will lead to inconsistent results. Slicing duration can
vary from an hour to day or week and even a month depending on the type of analysis. Because
the event time is epoch time so we would have to convert seconds to days, weeks or months and
then add this value subsequently for slicing and dicing operations.
Figure 1: Mapping and Slicing of FTA and GWA
Figure 1 represents slicing operations in which FTA and GWA trace files of 12 months duration
have been sliced in 6 month and 3 month durations. Vertical lines represent slicing operations. As
we can see that synchronizing is very well considered here as both the trace files have been sliced
at the same locations i.e. time.
3.4 Data Extraction
After slicing down the trace files now we can retrieve data from these trace files for our combined
analysis. Data selection and extraction is based on what kind of analysis we are performing and
which fields of the trace file are required for the analysis. As we know that these two trace files
do not share any common field except the event time variable so we cannot combine FTA and
GWA directly. We have used a different approach which is based on aggregation functions.
a) FTA Data Extraction
Fault Trace Archive contains information about Grid nodes. If NF represents number of failures
and NF(i)(s) represents number of failures for node i where }...,.........4,3,2,1{ nJi ∈ .J represents
collection of n nodes which are part of Grid environment and S represents time slot i.e. session. D
represents total failure duration of all the nodes on the Grid whereas D(i) represents total failure
duration for node i. F(i)T represents failure frequency of node I for duration T.R(i) represents
average resume time after failure for node i.
In the absence of common variable, we have used the method of aggregation functions (basically
summation and average) to collect and retrieve data for analysis. NF(i), D(i) and F(i) can be
directly calculated if we are considering the whole trace file as our possible data set. But we can
also calculate these values for a specific duration S which is our sampling duration. Now the
value of NF(i), D(i) and F(i) will be calculated from this duration S. S can be generated randomly
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
5
so that data should be retrieved form the whole trace file rather than one particular section of the
trace file. This helps in getting more accurate results.
b) GWA Data Extraction
Similar to FTA now we need to extract data from GWA and map it with FTA variables. Once
again we will use aggregation function and sample duration for retrieving data. Jid represents one
job form the GWA with unique identifier id where },........3,2,1{ kWid ∈ here W represents set of
jobs which were submitted on the Grid for execution. C id represents number of processors which
are requested by job Jid for execution. WTid represents waiting time for job Jid and Rid represents
runtime requested by Jid..Uid represents user id of the user who submitted job Jid. Sid represents
status of the job Jid. Status as already discussed can be completed, failed, cancelled or any other
trace file specific reason. Ncompleted(s), Nfailed(s), Ncancelled(s) represents number of completed, failed
and canceled jobs respectively in duration S.
4. Combining Data
After the data extraction, now data of different variables should be mapped according to the time
slots. In case the time slot shuffling is taking place in both the data sets simultaneously, then it
will not lead to any inconsistencies, but if one data set’s time slot have been shifted but not the
other one, then it will lead to incorrect results. Graphical representation of this concept is shown
in figure 2. In this figure S1, S2 and S3 represents the time slots for which aggregation functions
have been used to retrieve data. In the figure Fvar and Gvar represents variables of FTA and
GWA. Variable can be any variable which the analyst wants to include in the analysis and which
have some relationship with the other variables of the trace file. This relation can be direct or
indirect. For example we may consider that there exist a relationship between number of node
failures and number of failed jobs. So on one side i.e. FTA we can have data about number of
node failures and other relevant variables such as failure duration, resume time etc. and on the
other side i.e. GWA we can have data related to number of job failures and the other relevant
variables which are required for this analysis such as waiting time, number of processors
requested by job, run time etc. It is not mandatory that the number of variables from FTA and
GWA which are being used in analysis needs to be same in number. There can be a case where
we may have only one or two variables from FTA but from GWA side there may be more than
two. This is totally dependent on the analysis model.
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
6
Figure 2: Combining FTA and GWA with Mapping Time Slots
5. Conducting Analysis
Based on the above proposed technique we have conducted a correlation based analysis by
combining FTA and GWA. Through this analysis we have identified relationship among the
variables of FTA and GWA. Variables which we have considered in our analysis are NF(s)
(Number of node Failures in duration S), D(s) (Failure Duration in time S), NN (Number of
Nodes), Nfailed(s) (Number of Failed Jobs in duration S).
Serial No. Variable Variable Description Variable Name for Plotting
1 NF(s) Number of node Failures in duration S NumNodeFailures
2 D(s) Failure Duration in time S FailureDuration
3 NN Number of Nodes NumNodes
4 Nfailed(s) Number of Failed Jobs in duration S NJobFailed
Correlation is a statistical measure that indicates the extent to which two or more variable values
fluctuate together. A positive correlation indicates the extent to which two or more than two
variables increase in parallel and a negative correlation indicates the extent to which one variable
increases as the other decreases. The strength of the linear association among two variables is
quantified by the correlation coefficient whose value can vary from -1 to +1. We have used
Spearman correlation for this analysis because of the absence of linear variation in data from both
trace files. Reason being number of jobs submitted or failed and number of node failures are
varying at a large scale from time to time .This technique uses a ranking based approach for
calculating correlation coefficients. Following equation represents Spearman’s correlation.
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
7
2
2
6
1
( 1)
S
d
r
n n
×
= −
−
∑
Here rs represents Spearman’s correlation coefficient, d represents the difference in the ranks of
two variables whose correlation we are calculating and n represents the number of values which
are being used for conducting this correlation. From the randomly collected data set we
conducted a correlation based analysis of different variables. Result of this analysis is shown in
the following table 1 and the corresponding plots are shown in figure 3
We can make the following conclusions from these results.
I. There exists a positive correlation between number of node failures and number of failed
jobs. So we can say that with the increase in the number of node failures number of failed
jobs is also increasing and this explains the very basic behavior of the Grid environment.
So number of node failures has a direct effect on the quality of service of the
environment. Scheduling policy can use this information in order to make better
scheduling decisions in a failure critical situation.
II. If we look at the correlation coefficient of number of node failures and failure duration
then we can see that there exists a considerable correlation between two equal to 0.660.
So based on the correlation with increase in number of node failures the failure duration
can be predicted
III. Correlation coefficient between failure duration and number of failed jobs is also positive
equal to 0.498. Although it is not a strong correlation value but it supports out hypothesis
that longer the nodes stay unavailable more will be the failed or cancelled jobs.
Table 1 : Correlation Analysis Results
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
8
Figure 3: Correlation Analysis Plots
6. Conclusion and Future Work
In this research paper we have proposed first technique of combining Fault Trace Archive and
Grid Workload Archive as a single research problem. We have discussed the step by step
approach of combining FTA and GWA. We have also identified and discussed that what type of
mistakes can be made while conducting the analysis and what kind of impact these mistakes can
have on the results. Finally with the help of correlation based combined analyses we have found
out that there are positive correlations among FTA and GWA variables and have also identified
the coefficient of these correlations. In future we can establish a regression based model of
different variables and can predict system behavior in response to variation of events.
International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014
9
References
[1] Fault Trace Archieve. Fault Trace Archieve. [Online]. http://fta.scem.uws.edu.au
[2] Hui Li, Mathieu jan, Shanny Anoep Alexandru Iosup, "The Grid Workload Archieve,".
[3] Alexandru Iosup,Matthieu Gallet,Emmanuel Jeannot,Derrick Kondo,Bahman Javadi,Artur Andrzejak
Dick Epema. (2009) Fault Trace Archive: For improving the reliability of distributed systems.
[Online]. HYPERLINK http://fta.scem.uws.edu.au/index.php?n=Main.FTAFormat
[4] Catalin Dumitrescu, Dick Epema, Alexandru Iosup, Mathieu Jan, Hui Li, Lex Wolters Shanny Anoep.
(2007, January) The Grid Workloads Archive. [Online]. HYPERLINK http://gwa.ewi.tudelft.nl
[5] Derrick Kondo, David P.Anderson Attur Andrzejak, "Ensuring Collective Availability in Volatile
Resource Pools via Forecasting,".
[6] Derrick Kondo, Jean-Marc Vincent, David P.Anderson Bahman Javedi, "Mining for Statistical
Models of Availability in Large-Scale Distributed Systems: An Emperical Study of SETI@home,".
[7] Garth A,Gibson Bianca Schroeder, "Disk Failures in Real World: What does and MTTF of 1,000,000
hours mean to you?," in 5th USENIX Conference on File and Storage Technologies, San Hose, CA,
Feb 14-16,2007.
[8] Matthieu Gallet, Derrick Kondo, Alexandru Iosup, Dick Epema Nezih Yigitbasi, "Analysis and
Modeling of Time Correlated Failures in Large Scale Distributed Systems," 8th IEEE/ACM
International Conference on Grid Computing, PDS-2010-004, 2010.
[9] Bahman Javadi, Alexandru Iosup, Dick Epema Derrick Kondo, "The Failure Trace Archive :
Enabling Comparative Analysis of Failures,".
[10] Dick Epema Alexandru Iosup, "Grid Computing Workloads," IEEE Transactions on Internet
Computng, vol. 15, no. 2, pp. 19-26, 2011.
[11] Dick Epema Alexandru Iosup, "Grid Computing Workloads: Bags of Tasks, Workflows, Pilots, and
Others," , Netherlands, 2011.
[12] Carsten Ememann, and Ramin Yahyapour Baiyi Song, "User Group-based Workload Analysis and
Modelling," in 2005 IEEE International Symposium on Cluster Computing and the Grid, 2005, pp.
953-961.

More Related Content

What's hot

Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Anubhav Jain
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataeXascale Infolab
 
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...IRJET Journal
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsTanu Malik
 
Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudAM Publications
 
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...AM Publications
 
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...Editor IJMTER
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignAnubhav Jain
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsAnubhav Jain
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...Anubhav Jain
 
Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Globus
 
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Anubhav Jain
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsAnubhav Jain
 

What's hot (18)

Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
 
Iugonet 20100706
Iugonet 20100706Iugonet 20100706
Iugonet 20100706
 
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
 
Project Report (Summer 2016)
Project Report (Summer 2016)Project Report (Summer 2016)
Project Report (Summer 2016)
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
 
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...
Privacy Preserving Multi-keyword Top-K Search based on Cosine Similarity Clus...
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
 
Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud
 
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
 
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials Design
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data sets
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...
 
Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects
 
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and Analytics
 
Dynamic integrations of crop data and corresponding meteorological data based...
Dynamic integrations of crop data and corresponding meteorological data based...Dynamic integrations of crop data and corresponding meteorological data based...
Dynamic integrations of crop data and corresponding meteorological data based...
 
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
 

Viewers also liked

Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...ijgca
 
International Journal of Grid Computing & Applications (IJGCA)
International Journal of Grid Computing & Applications (IJGCA)International Journal of Grid Computing & Applications (IJGCA)
International Journal of Grid Computing & Applications (IJGCA)ijgca
 
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...GreenBulletSolutions
 
Unit 3 music and media
Unit 3 music and mediaUnit 3 music and media
Unit 3 music and mediablaneg79
 
Case study: Performance management i en multinationell organisation
Case study: Performance management i en multinationell organisationCase study: Performance management i en multinationell organisation
Case study: Performance management i en multinationell organisationGreenBulletSolutions
 
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...ijgca
 
Magic through heart emotional honesty
Magic through heart   emotional honestyMagic through heart   emotional honesty
Magic through heart emotional honestyMagic Through Heart
 
A comparative study in dynamic job scheduling approaches in grid computing en...
A comparative study in dynamic job scheduling approaches in grid computing en...A comparative study in dynamic job scheduling approaches in grid computing en...
A comparative study in dynamic job scheduling approaches in grid computing en...ijgca
 

Viewers also liked (9)

Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...
 
International Journal of Grid Computing & Applications (IJGCA)
International Journal of Grid Computing & Applications (IJGCA)International Journal of Grid Computing & Applications (IJGCA)
International Journal of Grid Computing & Applications (IJGCA)
 
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...
Att ta fram beslutsunderlag för investeringar i Talent och Performance Manage...
 
Unit 3 music and media
Unit 3 music and mediaUnit 3 music and media
Unit 3 music and media
 
Case study: Performance management i en multinationell organisation
Case study: Performance management i en multinationell organisationCase study: Performance management i en multinationell organisation
Case study: Performance management i en multinationell organisation
 
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...
PERFORMANCE FACTORS OF CLOUD COMPUTING DATA CENTERS USING [(M/G/1) : (∞/GDM O...
 
Magic through heart emotional honesty
Magic through heart   emotional honestyMagic through heart   emotional honesty
Magic through heart emotional honesty
 
Agile Performance Management
Agile Performance ManagementAgile Performance Management
Agile Performance Management
 
A comparative study in dynamic job scheduling approaches in grid computing en...
A comparative study in dynamic job scheduling approaches in grid computing en...A comparative study in dynamic job scheduling approaches in grid computing en...
A comparative study in dynamic job scheduling approaches in grid computing en...
 

Similar to Method for conducting a combined analysis of grid environment’s fta and gwa through session based mapping of trace variables

TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...ijdpsjournal
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGcsandit
 
Data repository for sensor network a data mining approach
Data repository for sensor network  a data mining approachData repository for sensor network  a data mining approach
Data repository for sensor network a data mining approachijdms
 
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MINING
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MININGSTORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MINING
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MININGcsandit
 
Efficient Record De-Duplication Identifying Using Febrl Framework
Efficient Record De-Duplication Identifying Using Febrl FrameworkEfficient Record De-Duplication Identifying Using Febrl Framework
Efficient Record De-Duplication Identifying Using Febrl FrameworkIOSR Journals
 
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...eSAT Journals
 
Data collection in multi application sharing wireless sensor networks
Data collection in multi application sharing wireless sensor networksData collection in multi application sharing wireless sensor networks
Data collection in multi application sharing wireless sensor networksPvrtechnologies Nellore
 
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...graphhoc
 
Sharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow ApplicationsSharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow Applicationsijcsit
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGcscpconf
 
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...acijjournal
 
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKS
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKSEFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKS
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKSijwmn
 
T AXONOMY OF O PTIMIZATION A PPROACHES OF R ESOURCE B ROKERS IN D ATA G RIDS
T AXONOMY OF  O PTIMIZATION  A PPROACHES OF R ESOURCE B ROKERS IN  D ATA  G RIDST AXONOMY OF  O PTIMIZATION  A PPROACHES OF R ESOURCE B ROKERS IN  D ATA  G RIDS
T AXONOMY OF O PTIMIZATION A PPROACHES OF R ESOURCE B ROKERS IN D ATA G RIDSijcsit
 
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce ijujournal
 
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCEHMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCEijujournal
 
Enhancement techniques for data warehouse staging area
Enhancement techniques for data warehouse staging areaEnhancement techniques for data warehouse staging area
Enhancement techniques for data warehouse staging areaIJDKP
 

Similar to Method for conducting a combined analysis of grid environment’s fta and gwa through session based mapping of trace variables (20)

TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
 
Data repository for sensor network a data mining approach
Data repository for sensor network  a data mining approachData repository for sensor network  a data mining approach
Data repository for sensor network a data mining approach
 
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MINING
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MININGSTORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MINING
STORAGE GROWING FORECAST WITH BACULA BACKUP SOFTWARE CATALOG DATA MINING
 
Efficient Record De-Duplication Identifying Using Febrl Framework
Efficient Record De-Duplication Identifying Using Febrl FrameworkEfficient Record De-Duplication Identifying Using Febrl Framework
Efficient Record De-Duplication Identifying Using Febrl Framework
 
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
 
Data collection in multi application sharing wireless sensor networks
Data collection in multi application sharing wireless sensor networksData collection in multi application sharing wireless sensor networks
Data collection in multi application sharing wireless sensor networks
 
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...
The Impact of Data Replication on Job Scheduling Performance in Hierarchical ...
 
50120130406035
5012013040603550120130406035
50120130406035
 
Sharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow ApplicationsSharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow Applications
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
 
Ax34298305
Ax34298305Ax34298305
Ax34298305
 
50120130405014 2-3
50120130405014 2-350120130405014 2-3
50120130405014 2-3
 
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
 
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
 
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKS
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKSEFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKS
EFFICIENT MIXED MODE SUMMARY FOR MOBILE NETWORKS
 
T AXONOMY OF O PTIMIZATION A PPROACHES OF R ESOURCE B ROKERS IN D ATA G RIDS
T AXONOMY OF  O PTIMIZATION  A PPROACHES OF R ESOURCE B ROKERS IN  D ATA  G RIDST AXONOMY OF  O PTIMIZATION  A PPROACHES OF R ESOURCE B ROKERS IN  D ATA  G RIDS
T AXONOMY OF O PTIMIZATION A PPROACHES OF R ESOURCE B ROKERS IN D ATA G RIDS
 
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce
 
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCEHMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE
 
Enhancement techniques for data warehouse staging area
Enhancement techniques for data warehouse staging areaEnhancement techniques for data warehouse staging area
Enhancement techniques for data warehouse staging area
 

More from ijgca

11th International Conference on Computer Science, Engineering and Informati...
11th International Conference on Computer Science, Engineering and  Informati...11th International Conference on Computer Science, Engineering and  Informati...
11th International Conference on Computer Science, Engineering and Informati...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
11th International Conference on Computer Science, Engineering and Informatio...
11th International Conference on Computer Science, Engineering and Informatio...11th International Conference on Computer Science, Engineering and Informatio...
11th International Conference on Computer Science, Engineering and Informatio...ijgca
 
Topology Aware Load Balancing for Grids.
Topology Aware Load Balancing for Grids.Topology Aware Load Balancing for Grids.
Topology Aware Load Balancing for Grids.ijgca
 
11th International Conference on Computer Science and Information Technology ...
11th International Conference on Computer Science and Information Technology ...11th International Conference on Computer Science and Information Technology ...
11th International Conference on Computer Science and Information Technology ...ijgca
 
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...ijgca
 
13th International Conference on Data Mining & Knowledge Management Process (...
13th International Conference on Data Mining & Knowledge Management Process (...13th International Conference on Data Mining & Knowledge Management Process (...
13th International Conference on Data Mining & Knowledge Management Process (...ijgca
 
Call for Papers - 15th International Conference on Wireless & Mobile Networks...
Call for Papers - 15th International Conference on Wireless & Mobile Networks...Call for Papers - 15th International Conference on Wireless & Mobile Networks...
Call for Papers - 15th International Conference on Wireless & Mobile Networks...ijgca
 
Call for Papers - 4th International Conference on Big Data (CBDA 2023)
Call for Papers - 4th International Conference on Big Data (CBDA 2023)Call for Papers - 4th International Conference on Big Data (CBDA 2023)
Call for Papers - 4th International Conference on Big Data (CBDA 2023)ijgca
 
Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...ijgca
 
Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...ijgca
 
Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...ijgca
 
Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...ijgca
 
Call for Papers - 4th International Conference on Machine learning and Cloud ...
Call for Papers - 4th International Conference on Machine learning and Cloud ...Call for Papers - 4th International Conference on Machine learning and Cloud ...
Call for Papers - 4th International Conference on Machine learning and Cloud ...ijgca
 
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...ijgca
 
Call for Papers - 4th International Conference on Blockchain and Internet of ...
Call for Papers - 4th International Conference on Blockchain and Internet of ...Call for Papers - 4th International Conference on Blockchain and Internet of ...
Call for Papers - 4th International Conference on Blockchain and Internet of ...ijgca
 
Call for Papers - International Conference IOT, Blockchain and Cryptography (...
Call for Papers - International Conference IOT, Blockchain and Cryptography (...Call for Papers - International Conference IOT, Blockchain and Cryptography (...
Call for Papers - International Conference IOT, Blockchain and Cryptography (...ijgca
 
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...ijgca
 
Call for Papers - International Journal of Database Management Systems (IJDMS)
Call for Papers - International Journal of Database Management Systems (IJDMS)Call for Papers - International Journal of Database Management Systems (IJDMS)
Call for Papers - International Journal of Database Management Systems (IJDMS)ijgca
 

More from ijgca (20)

11th International Conference on Computer Science, Engineering and Informati...
11th International Conference on Computer Science, Engineering and  Informati...11th International Conference on Computer Science, Engineering and  Informati...
11th International Conference on Computer Science, Engineering and Informati...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
11th International Conference on Computer Science, Engineering and Informatio...
11th International Conference on Computer Science, Engineering and Informatio...11th International Conference on Computer Science, Engineering and Informatio...
11th International Conference on Computer Science, Engineering and Informatio...
 
Topology Aware Load Balancing for Grids.
Topology Aware Load Balancing for Grids.Topology Aware Load Balancing for Grids.
Topology Aware Load Balancing for Grids.
 
11th International Conference on Computer Science and Information Technology ...
11th International Conference on Computer Science and Information Technology ...11th International Conference on Computer Science and Information Technology ...
11th International Conference on Computer Science and Information Technology ...
 
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...
AN INTELLIGENT SYSTEM FOR THE ENHANCEMENT OF VISUALLY IMPAIRED NAVIGATION AND...
 
13th International Conference on Data Mining & Knowledge Management Process (...
13th International Conference on Data Mining & Knowledge Management Process (...13th International Conference on Data Mining & Knowledge Management Process (...
13th International Conference on Data Mining & Knowledge Management Process (...
 
Call for Papers - 15th International Conference on Wireless & Mobile Networks...
Call for Papers - 15th International Conference on Wireless & Mobile Networks...Call for Papers - 15th International Conference on Wireless & Mobile Networks...
Call for Papers - 15th International Conference on Wireless & Mobile Networks...
 
Call for Papers - 4th International Conference on Big Data (CBDA 2023)
Call for Papers - 4th International Conference on Big Data (CBDA 2023)Call for Papers - 4th International Conference on Big Data (CBDA 2023)
Call for Papers - 4th International Conference on Big Data (CBDA 2023)
 
Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...
 
Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...Call for Papers - 15th International Conference on Computer Networks & Commun...
Call for Papers - 15th International Conference on Computer Networks & Commun...
 
Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...
 
Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...Call for Papers - 9th International Conference on Cryptography and Informatio...
Call for Papers - 9th International Conference on Cryptography and Informatio...
 
Call for Papers - 4th International Conference on Machine learning and Cloud ...
Call for Papers - 4th International Conference on Machine learning and Cloud ...Call for Papers - 4th International Conference on Machine learning and Cloud ...
Call for Papers - 4th International Conference on Machine learning and Cloud ...
 
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...
Call for Papers - 11th International Conference on Data Mining & Knowledge Ma...
 
Call for Papers - 4th International Conference on Blockchain and Internet of ...
Call for Papers - 4th International Conference on Blockchain and Internet of ...Call for Papers - 4th International Conference on Blockchain and Internet of ...
Call for Papers - 4th International Conference on Blockchain and Internet of ...
 
Call for Papers - International Conference IOT, Blockchain and Cryptography (...
Call for Papers - International Conference IOT, Blockchain and Cryptography (...Call for Papers - International Conference IOT, Blockchain and Cryptography (...
Call for Papers - International Conference IOT, Blockchain and Cryptography (...
 
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...
Call for Paper - 4th International Conference on Cloud, Big Data and Web Serv...
 
Call for Papers - International Journal of Database Management Systems (IJDMS)
Call for Papers - International Journal of Database Management Systems (IJDMS)Call for Papers - International Journal of Database Management Systems (IJDMS)
Call for Papers - International Journal of Database Management Systems (IJDMS)
 

Recently uploaded

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Recently uploaded (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Method for conducting a combined analysis of grid environment’s fta and gwa through session based mapping of trace variables

  • 1. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 DOI: 10.5121/ijgca.2014.5101 1 METHOD FOR CONDUCTING A COMBINED ANALYSIS OF GRID ENVIRONMENT’S FTA AND GWA THROUGH SESSION BASED MAPPING OF TRACE VARIABLES Ramandeep Singh1 and R.K.Bawa2 1 Lovely Professional University, Jalandhar, India 2 Department of Computer Science, Punjabi University Patiala, India ABSTRACT Grid computing environment due to its scale and heterogeneous nature is more vulnerable to faults. To store and analyze fault and workload information, FTA Fault Trace Archive and GWA Grid Workload Archive are used. Previously researchers have analyzed FTA and GWA as separate research problems but in this research paper we have proposed a method for conducting a combined analysis of FTA and GWA based on session based mapping of trace file variables. This is the first attempt to conduct a combined analysis of these two trace files. Along with the step by step process of combining trace files we have also included do’s and don’ts while conducting this analysis .Through this combined analysis we have established a correlation based relationship among number of node failures, number of failed jobs, failure duration and number of nodes. We have found that these variables are positively correlated with different correlation coefficients. Keywords: Grid Computing, FTA Fault Trace Archive, GWA Grid Workload Archive, Correlation 1. Introduction Grid computing is a concept which involves combining computing and data storage resources from different sources into virtual organizations (VO). These resources are of heterogeneous nature and are connected through dedicated network links or via internet. Due to its large size and lack of central control Grid resources are vulnerable to different types of faults. These faults affect the quality of service of Grid by either failing the submitted jobs or by delaying execution of these jobs. So researchers and developers need to analyze workload and fault information to study the Grid environment and to improve the overall Grid performance along with reliability. Trace files are used to collect data about the events taking place inside the Grid. These events include faults which are occurring on the Grid resources and events related to the jobs which are submitted for execution on Grid. Basically two types of trace files are used in Grid for collecting resource availability and job data which are FTA and GWA. FTA stands for Fault Trace Archive [1] and GWA stands for Grid Workload Archive [2]. FTA collects information about the faults which are taking place on the Grid nodes and GWA collects
  • 2. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 2 information about the workload which is being submitted on the Grid. FTA contain information about nodes, platforms, sites, hardware configuration, availability and unavailability events, time of occurrence of these events and the corresponding step which was taken to deal with the failure situation. In the same way GWA contain information about job id, submission time, waiting time, processor requirement, estimated execution time, status and other information fields related to queues, users and groups. Status of a job can have different values e.g. completed, failed, cancelled or any other trace file specific value. Trace files are available in different formats e.g. raw, tabbed or mysql.FTA is a collection of different files which can be joined with each other with the help of the common fields, like we perform join operation among SQL tables. Due to space limitation we cannot include the whole format of trace files but details about the format and design of FTA can be found from FTA web page [3] and the detailed format information of GWA can be found from GWA web page [4][9]. These trace files can be analyzed using different tools e.g. Matlab, GridSim or mysql. GridSim is a good tool in case you want to run simulations on these trace files according to the constraints of a Grid environment. These trace files i.e. FTA and GWA have been individually analyzed by many researchers for studying Grid environment for different purposes, but no one have considered combining these trace files to study the relationship among two. In this research paper we are going to propose and implement a technique using which we can combine FTA and GWA. Both these trace files are collected from the same Grid platform simultaneously i.e. in parallel. Using this technique we can study the influence of the events of one trace file on the events of another trace file. 2. Related Work Trace files are a good way of collecting data about the events taking place inside a system either distributed or centralized. Analysis of these trace files can reveal many interesting facts related to the system. Artur Andrzejak et. al [5]have worked on SETI@home host trace files, which consisted of 48000 hosts, to analyze host availability patterns. They proposed a model which can ensure that a certain number of hosts will be available for a certain amount of time either with replication or with over provisioning of resources. Similarly Bahman Javadi, et al., [6] have collected and analyzed SETI @home trace files for discovering subset of hosts, which share similar kind of statistical availability patterns. Out of 230000 hosts they have discovered that availability of around 34% of hosts is a truly random process but rest of these hosts can often be modeled in the form of different groups with few distinctions from one another. User oriented analysis reveals the facts about the workload patterns which are submitted by different users [11][12]. Bianca [7] have used trace file to analyze and predict the average life span of hard drives being used at a high performance cluster. Data sheets of these hard drives show that MTTF (Mean Time to Failure) is around 1,000,000 to 15,000,000 hours, suggesting an annual failure rate around 0.88%. But analysis to the actual field data show that the minimum disk replacement rate is 1% and usually it is up to 3-4%, and in some cases up to 13% replacement rate is there. Analysis show that actual data gathered from field may differ from the conceptual or ideal data. Nezih Yigitbasi et.al. have analyzed availability and unavailability of Grid nodes and have identified that there exist a predictable pattern in these events and this behavior can be modeled. They have also established a correlation based model for node failures [8].
  • 3. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 3 GWA trace files are used by researchers to study what kind of jobs are submitted on the Grid and how does the success and failure of a job is dependent on different parameters of job. Grid environments are either application specific or job specific. Application specific Grid can execute or process only a specific application related tasks while on the other hand job specific Grid can execute different types of jobs which may belong to different applications. Although individual analysis of trace files is very useful but in this research paper we are going to propose a method of combining these two trace files for a combined analysis. 3. Combining FTA and GWA Following are the operations which need to be performed for conducting combined analysis. 3.1 Acquiring FTA and GWA Trace Files For conducting a combined analysis of FTA and GWA very first condition is that both trace files should belong to the same Grid environment and should be collected at the same time i.e. events should be logged in parallel. Reason for this is that if these trace files belong to different platforms or are collected at different time intervals then there will be no benefit of conducting the combined analysis because then these events cannot be related to each other and the analysis will make no sense. In the absence of mysql format of trace files, we can use SPSS for importing data and converting it in required formats. From the excel format we can insert data in SQL or mysql table. 3.2 Trimming Trace Files After acquiring both trace files we need to check event start time and event end time of both trace files. Reason for this is that collection process of these trace files may start and end at different times so we need to trim one or both the trace files at a common point of event start time and event end time. Let’s consider an example where FTA trace file was started at 10:00:00 am, 10 Jan 2013 and it ends on 12:00:00 pm, 15 Jan 2014. Similarly GWA was started at 09:00:00 am, 31 Dec 2012 and it ended on 11:00:00 am, 1 March 2014. Although both these trace files are collected from the same environment and individually there is nothing wrong with these trace files but for combined analysis we need to trim either one or both from either beginning or end to synchronize the event start and event end time. So in case of above considered example we will trim GWA so that first event start time is 10:00:00 am, 10 Jan 2013 and last event end time will be 12:00:00 pm, 15 Jan 2014. In order to remove noise we will also have to remove the events which start before the trimming point but end after it, because such events may cause deviations in calculations. Same will be applicable to events which start before the trimming point but end after the trimming point. Although we can say that few such events will not make a difference when we are talking about millions of such events, but to minimize the errors this is the one precaution which should be considered in the beginning. So at the end of this step we will have two trace files which start and end at the same time. 3.3 Slicing and Dicing Now that we have two trace files of equal time duration we can start with our analysis. Slicing means that we will divide trace files into sub parts. But the one thing to keep in mind is that these
  • 4. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 4 slices should be of equal size, otherwise it will lead to inconsistent results. Slicing duration can vary from an hour to day or week and even a month depending on the type of analysis. Because the event time is epoch time so we would have to convert seconds to days, weeks or months and then add this value subsequently for slicing and dicing operations. Figure 1: Mapping and Slicing of FTA and GWA Figure 1 represents slicing operations in which FTA and GWA trace files of 12 months duration have been sliced in 6 month and 3 month durations. Vertical lines represent slicing operations. As we can see that synchronizing is very well considered here as both the trace files have been sliced at the same locations i.e. time. 3.4 Data Extraction After slicing down the trace files now we can retrieve data from these trace files for our combined analysis. Data selection and extraction is based on what kind of analysis we are performing and which fields of the trace file are required for the analysis. As we know that these two trace files do not share any common field except the event time variable so we cannot combine FTA and GWA directly. We have used a different approach which is based on aggregation functions. a) FTA Data Extraction Fault Trace Archive contains information about Grid nodes. If NF represents number of failures and NF(i)(s) represents number of failures for node i where }...,.........4,3,2,1{ nJi ∈ .J represents collection of n nodes which are part of Grid environment and S represents time slot i.e. session. D represents total failure duration of all the nodes on the Grid whereas D(i) represents total failure duration for node i. F(i)T represents failure frequency of node I for duration T.R(i) represents average resume time after failure for node i. In the absence of common variable, we have used the method of aggregation functions (basically summation and average) to collect and retrieve data for analysis. NF(i), D(i) and F(i) can be directly calculated if we are considering the whole trace file as our possible data set. But we can also calculate these values for a specific duration S which is our sampling duration. Now the value of NF(i), D(i) and F(i) will be calculated from this duration S. S can be generated randomly
  • 5. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 5 so that data should be retrieved form the whole trace file rather than one particular section of the trace file. This helps in getting more accurate results. b) GWA Data Extraction Similar to FTA now we need to extract data from GWA and map it with FTA variables. Once again we will use aggregation function and sample duration for retrieving data. Jid represents one job form the GWA with unique identifier id where },........3,2,1{ kWid ∈ here W represents set of jobs which were submitted on the Grid for execution. C id represents number of processors which are requested by job Jid for execution. WTid represents waiting time for job Jid and Rid represents runtime requested by Jid..Uid represents user id of the user who submitted job Jid. Sid represents status of the job Jid. Status as already discussed can be completed, failed, cancelled or any other trace file specific reason. Ncompleted(s), Nfailed(s), Ncancelled(s) represents number of completed, failed and canceled jobs respectively in duration S. 4. Combining Data After the data extraction, now data of different variables should be mapped according to the time slots. In case the time slot shuffling is taking place in both the data sets simultaneously, then it will not lead to any inconsistencies, but if one data set’s time slot have been shifted but not the other one, then it will lead to incorrect results. Graphical representation of this concept is shown in figure 2. In this figure S1, S2 and S3 represents the time slots for which aggregation functions have been used to retrieve data. In the figure Fvar and Gvar represents variables of FTA and GWA. Variable can be any variable which the analyst wants to include in the analysis and which have some relationship with the other variables of the trace file. This relation can be direct or indirect. For example we may consider that there exist a relationship between number of node failures and number of failed jobs. So on one side i.e. FTA we can have data about number of node failures and other relevant variables such as failure duration, resume time etc. and on the other side i.e. GWA we can have data related to number of job failures and the other relevant variables which are required for this analysis such as waiting time, number of processors requested by job, run time etc. It is not mandatory that the number of variables from FTA and GWA which are being used in analysis needs to be same in number. There can be a case where we may have only one or two variables from FTA but from GWA side there may be more than two. This is totally dependent on the analysis model.
  • 6. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 6 Figure 2: Combining FTA and GWA with Mapping Time Slots 5. Conducting Analysis Based on the above proposed technique we have conducted a correlation based analysis by combining FTA and GWA. Through this analysis we have identified relationship among the variables of FTA and GWA. Variables which we have considered in our analysis are NF(s) (Number of node Failures in duration S), D(s) (Failure Duration in time S), NN (Number of Nodes), Nfailed(s) (Number of Failed Jobs in duration S). Serial No. Variable Variable Description Variable Name for Plotting 1 NF(s) Number of node Failures in duration S NumNodeFailures 2 D(s) Failure Duration in time S FailureDuration 3 NN Number of Nodes NumNodes 4 Nfailed(s) Number of Failed Jobs in duration S NJobFailed Correlation is a statistical measure that indicates the extent to which two or more variable values fluctuate together. A positive correlation indicates the extent to which two or more than two variables increase in parallel and a negative correlation indicates the extent to which one variable increases as the other decreases. The strength of the linear association among two variables is quantified by the correlation coefficient whose value can vary from -1 to +1. We have used Spearman correlation for this analysis because of the absence of linear variation in data from both trace files. Reason being number of jobs submitted or failed and number of node failures are varying at a large scale from time to time .This technique uses a ranking based approach for calculating correlation coefficients. Following equation represents Spearman’s correlation.
  • 7. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 7 2 2 6 1 ( 1) S d r n n × = − − ∑ Here rs represents Spearman’s correlation coefficient, d represents the difference in the ranks of two variables whose correlation we are calculating and n represents the number of values which are being used for conducting this correlation. From the randomly collected data set we conducted a correlation based analysis of different variables. Result of this analysis is shown in the following table 1 and the corresponding plots are shown in figure 3 We can make the following conclusions from these results. I. There exists a positive correlation between number of node failures and number of failed jobs. So we can say that with the increase in the number of node failures number of failed jobs is also increasing and this explains the very basic behavior of the Grid environment. So number of node failures has a direct effect on the quality of service of the environment. Scheduling policy can use this information in order to make better scheduling decisions in a failure critical situation. II. If we look at the correlation coefficient of number of node failures and failure duration then we can see that there exists a considerable correlation between two equal to 0.660. So based on the correlation with increase in number of node failures the failure duration can be predicted III. Correlation coefficient between failure duration and number of failed jobs is also positive equal to 0.498. Although it is not a strong correlation value but it supports out hypothesis that longer the nodes stay unavailable more will be the failed or cancelled jobs. Table 1 : Correlation Analysis Results
  • 8. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 8 Figure 3: Correlation Analysis Plots 6. Conclusion and Future Work In this research paper we have proposed first technique of combining Fault Trace Archive and Grid Workload Archive as a single research problem. We have discussed the step by step approach of combining FTA and GWA. We have also identified and discussed that what type of mistakes can be made while conducting the analysis and what kind of impact these mistakes can have on the results. Finally with the help of correlation based combined analyses we have found out that there are positive correlations among FTA and GWA variables and have also identified the coefficient of these correlations. In future we can establish a regression based model of different variables and can predict system behavior in response to variation of events.
  • 9. International Journal of Grid Computing & Applications (IJGCA) Vol.5, No.1, March 2014 9 References [1] Fault Trace Archieve. Fault Trace Archieve. [Online]. http://fta.scem.uws.edu.au [2] Hui Li, Mathieu jan, Shanny Anoep Alexandru Iosup, "The Grid Workload Archieve,". [3] Alexandru Iosup,Matthieu Gallet,Emmanuel Jeannot,Derrick Kondo,Bahman Javadi,Artur Andrzejak Dick Epema. (2009) Fault Trace Archive: For improving the reliability of distributed systems. [Online]. HYPERLINK http://fta.scem.uws.edu.au/index.php?n=Main.FTAFormat [4] Catalin Dumitrescu, Dick Epema, Alexandru Iosup, Mathieu Jan, Hui Li, Lex Wolters Shanny Anoep. (2007, January) The Grid Workloads Archive. [Online]. HYPERLINK http://gwa.ewi.tudelft.nl [5] Derrick Kondo, David P.Anderson Attur Andrzejak, "Ensuring Collective Availability in Volatile Resource Pools via Forecasting,". [6] Derrick Kondo, Jean-Marc Vincent, David P.Anderson Bahman Javedi, "Mining for Statistical Models of Availability in Large-Scale Distributed Systems: An Emperical Study of SETI@home,". [7] Garth A,Gibson Bianca Schroeder, "Disk Failures in Real World: What does and MTTF of 1,000,000 hours mean to you?," in 5th USENIX Conference on File and Storage Technologies, San Hose, CA, Feb 14-16,2007. [8] Matthieu Gallet, Derrick Kondo, Alexandru Iosup, Dick Epema Nezih Yigitbasi, "Analysis and Modeling of Time Correlated Failures in Large Scale Distributed Systems," 8th IEEE/ACM International Conference on Grid Computing, PDS-2010-004, 2010. [9] Bahman Javadi, Alexandru Iosup, Dick Epema Derrick Kondo, "The Failure Trace Archive : Enabling Comparative Analysis of Failures,". [10] Dick Epema Alexandru Iosup, "Grid Computing Workloads," IEEE Transactions on Internet Computng, vol. 15, no. 2, pp. 19-26, 2011. [11] Dick Epema Alexandru Iosup, "Grid Computing Workloads: Bags of Tasks, Workflows, Pilots, and Others," , Netherlands, 2011. [12] Carsten Ememann, and Ramin Yahyapour Baiyi Song, "User Group-based Workload Analysis and Modelling," in 2005 IEEE International Symposium on Cluster Computing and the Grid, 2005, pp. 953-961.