SlideShare a Scribd company logo
1 of 4
Download to read offline
INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056
VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1619
Big Data Processes and Analysis using Hadoop Framework
Nida Mahveen1, Radha B.K2
1P.G.Student, Department of Computer Science and Engineering, PDA, Karnataka ( India)
2Asst. Professor, Department of Computer Science and Engineering, PDA, Karnataka (India)
------------------------------------------------------------------------***-------------------------------------------------------------------------
Abstract: During investigation the issue of sub-dataset investigation over conveyed record frameworks, e.g, the Hadoop
document framework. The trials demonstrate with the intention of the sub-datasets appropriation more than HDFS squares,
which be covered up with HDFS, can frequently make comparing examinations experience the bad effects of a genuinely
imbalanced or wasteful parallel execution. In particular, the substance distribution of sub-datasets brings about some
computational hubs completing significantly more remaining task at hand than others; besides, it prompts wasteful testing of
sub-datasets, as examination projects will regularly peruse a lot of superfluous information. The direct an extensive
examination on how imbalanced figuring examples and wasteful inspecting happen and at that point propose a capacity
dispersion mindful strategy to streamline the sub-dataset investigation over disseminated stockpiling frameworks alluded to
as Data-Net. Right off the bat, an effective calculation to acquire the metadata of sub-dataset disseminations is projected. Also,
this plan a versatile stockpiling structure called Elastic-Map dependent on the Map-Reduce and Bloom-Filter systems towards
accumulate the meta-information. We utilize dispersion mindful calculations used for sub-dataset application to accomplish
adjusted plus productive similar implementation. The projected technique preserve profit diverse sub-dataset examinations
through different computational prerequisites. Analyses be directed scheduled PRObEs-Marmot 128-computer distributed
plus the outcomes demonstrate the presentation advantages of Data-Net.
Keywords: Hadoop Distributed File System (HDFS), Map-Reduce, Data-Node, Data-Net
I. Introduction
ADVANCES in recognizing, frameworks organization and
limit developments have incited the age and assembling of
data at incredibly high rates and volumes. Gigantic
organizations, for instance, Google, Amazon and Facebook
produce and assemble terabytes of data concerning snap
stream or event signs in only a few hours. In order to
ensure system security and option business learning these
data commonly ought to be furthermore gathered or picked
for individual assessment. For instance, in proposition
systems and redid web benefits, the examination on the
website page snaps streams needs to perform customer
sessionization assessment so as to give better help of each
customer. Similarly, in framework traffic structures, stream
improvement subject to framework traffic pursues should
isolate different sorts of framework traffic and direct
examination in like way. In like manner, in business trade
examination, data with unequivocal features are commonly
picked for coercion area and peril appraisal. In this paper, a
get-together of data related to explicit events or features is
implied as a sub-dataset. At present, the Hadoop record
structure is the acknowledged open-source accumulating
system in gigantic data examination. It might be really sent
on the plates of pack centre points for adjacent data get to.
When securing a dataset, HDFS will segment the dataset
into smaller square records and subjectively circle them
with a couple of undefined copies (for unflinching quality).
For all intents and purposes, an immense dataset may
contain millions or billions of sub-datasets, for instance,
business snaps or event based log data.
II. Literature Survey
Composing outline is the tremendous development in
programming improvement process. Before improving the
tools it is mandatory to pick the economy quality, time
factor. At the point when the designer's make the structure
tools as programming architect need a huge amount of
outside assistance, this kind of assistance should be
conceivable by senior programming engineers, from locales
or from books.
[1] Logothetis,. Trezzo, K. C. Webb, K. Yocum, "In-situ
MapReduce for log handling”
Log examination are a bedrock part of running a
considerable lot of the present Internet destinations.
Application and snap logs structure the reason for
following and breaking down client practices and
inclinations, and they structure the essential contributions
to promotion focusing on calculations. Logs are additionally
basic for execution and security observing, investigating,
and enhancing the huge process frameworks that make up
the register "cloud", a large number of machines spreading
over various server farms. With current log age rates on the
request of 1–10 MB/s per machine, a solitary server farm
can make several TBs of log information daily. While mass
information handling has demonstrated to be a basic device
for log preparing, current practice moves all logs to a
brought together process group. This not just devours a lot
of system and circle transmission capacity, yet additionally
postpones the fruition of time-touchy examination. We
present an in-situ MapReduce design that mines
information "on area", bypassing the expense and hold up
time of this store-first-inquiry later methodology. In
contrast to current methodologies, our design
unequivocally supports diminished information constancy,
enabling clients to explain inquiries with inactivity and
loyalty prerequisites
INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056
VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1620
[2]. Chen Z, Wu D. A sprout channel based methodology
designed for productive MapReduce
The MapReduce handling structure is uninformed of the
property of basic datasets. For requested datasets (e.g.,
time-arrangement information), in which records have
been now arranged, MapReduce still performs pointless
arranging tasks during its execution. It straightforwardly
brings about a noteworthy increment of execution time, as
arranging a huge volume of information is tedious. In this
paper, we propose a blossom channel based way to deal
with improve the presentation of MapReduce when
handling requested datasets. In our methodology, all
records are put away in a lot of sprout channels after the
Mapping stage and information inquiries can be
proficiently prepared by checking the blossom channels.
Because of the high questioning proficiency of sprout
channels, we can accomplish huge execution gain in the
Reducing stage. We direct a progression of tests to assess
the adequacy of our proposed blossom channel based
methodology. Our exploratory outcomes demonstrate that
our methodology can accomplish 2x speedup as far as
question handling execution, and lessen the CPU/memory
usage in the in the mean time. In addition, we likewise
assess the adaptability of our proposed methodology when
preparing different questions, and see that the speedup can
be additionally improved with the expanding number of
inquiries.
[3]. E. Coppa, I. Finocchi, "On information skewness
stragglers
Author handle the issue of anticipating the presentation of
MapReduce applications planning precise advancement
markers, which keep software engineers educated on the
level of finished calculation time during the execution of a
vocation. This is particularly significant in pay-as-you-go
cloud situations, where moderate employments can be
prematurely ended so as to keep away from unnecessary
expenses. Execution forecasts can likewise fill in as a
structure hinder for a few profile-guided advancements. By
accepting that the running time depends directly on the
info measure, cutting edge procedures can be genuinely
hurt by information skewness, load unbalancing, and
straggling assignments. We in this way structure a novel
profile-guided advancement pointer, called NearestFit, that
works without the straight theory supposition in a
completely online manner (i.e., without falling back on
profile information gathered from past executions).
NearestFit misuses a cautious blend of closest neighbor
relapse and factual bend fitting procedures. Fine-grained
profiles required by our hypothetical advancement model
are approximated through existence productive
information gushing calculations. We executed NearestFit
over Hadoop 2.6.0. A broad experimental appraisal over
the Amazon EC2 stage on an assortment of benchmarks
demonstrates that its precision is generally excellent,
notwithstanding when contenders bring about non-
immaterial blunders and wide expectation changes.
[4]. L. Yu, Y. Shao, B. Cui, "Abusing grid reliance for effective
appropriated network calculation
Conveyed grid calculation is a well-known methodology for
some huge scale information examination and AI
undertakings. Anyway existing disseminated network
calculation frameworks for the most part cause
overwhelming correspondence cost during the runtime,
which corrupts the general execution. Network calculation
framework, named DMac, which adventures the lattice
conditions in grid programs for productive grid calculation
in the conveyed condition. We deteriorate every
framework program into an arrangement of activities, and
uncover the lattice conditions between tasks in the
program. We next structure a reliance arranged cost model
to choose an ideal execution methodology for every
activity, and create a correspondence proficient execution
plan for the network calculation program. To encourage the
grid calculation in circulated frameworks, we further gap
the execution plan into numerous un-interleaved stages
which can keep running in an appropriated group with
effective nearby execution system on every laborer. The
DMac framework has been actualized on a prominent
broadly useful information preparing system, Spark. The
test results show that our strategies can essentially
improve the exhibition of a wide scope of network
programs.
[5].. Wang Y, Jiao, WeikuanY, Coo-MR: Cross-task
synchronization used for productive information the
executives in MapReduce programs
Hadoop is a generally embraced open source execution of
MapReduce programming model for enormous information
preparing. It speaks to framework assets as accessible
guide and diminish openings and appoints them to
different undertakings. This execution model gives little
respect to the need of cross-task coordination on the
utilization of shared framework assets on a register hub,
which results in assignment obstruction. Likewise, the
current Hadoop blend calculation can cause over the top
I/O. In this investigation, we attempt a push to address the
two issues. As needs be, we have structured a cross-task
coordination system called CooMR for proficient
information the board in MapReduce programs. CooMR
comprises of three segment plans including cross job crafty
recollection contribution plus log-organized input output
combination, which be intended towards encourage job
organization, and the key-situated in-situ consolidate
(KISM) calculation which is intended to empower the
arranging/converging of Hadoop halfway information
without really moving the <key, value> sets. Our
assessment shows that CooMR can build task coordination,
improve framework asset usage, and altogether accelerate
the execution time of MapReduce programs.
[6]. Viles C. L, FrenchJ. C.Content region inside dispersed
computerized libraries
In this paper we present the idea of substance territory in
conveyed report accumulations. Content territory is how
INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056
VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1621
much substance comparative records are colocated in a
dispersed accumulation. We propose two measurements
for estimation of substance area, one dependent on point
marks and the other dependent on accumulation insights.
We give deductions and investigation of the two
measurements and use them to quantify the substance
territory in two sorts of record accumulations, the notable
TREC quantity. We likewise demonstrate so as to substance
region be able to be consideration of transiently just since
spatially plus give proof of its reality inside transiently
requested report accumulations like news sources.
III. Activity Diagram
Figure1: Sequence Diagram
In above diagram shows the overall process of working
with the proposed system, where all the data is spread
across cluster of commodity hardwares and then using the
map and reduce process it is analysed parallel so that the
analysis of the information is done fast and accurate.
IV. Result and Discussion
.
Figure 2. Screen-Shot-1
In the above screen shot, one can see that the processing of
the data is carried out and information regarding how
much of time taken for the processing and what
information is processed is showing.
Figure 3.Screen-shot-2
In the above screen shot one can see the data which is
processed how it is distributed across the commodity of
hardware and output is given in the form of the file name
par-0000.
V. Conclusion and Future Scope
We look into the issues of imbalanced sub-dataset
examination and inefficient testing on sub-datasets over a
Hadoop gathering. On account of the missing information of
sub-datasets' region, the substance gathering trademark in
most sub-datasets keeps applications from capably setting
them up. Through a theoretical examination, we assume
that an uneven sub-dataset spread frequently prompts a
INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056
VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1622
lower-execution in parallel data assessment. To address
this issue, we propose DataNet to help sub-dataset
apportionment careful figuring. DataNet uses an adaptable
structure, called Elastic-Map, to store the sub-dataset
spreads. In like manner, an overarching sub-dataset
segment computation is proposed to help the advancement
of ElasticMap. With the usage of DataNet, sub-dataset
assessments can without quite a bit of a stretch equality
their remarkable job that needs to be done among
computational center points. Additionally, for sub-dataset
testing, we can without quite a bit of a stretch find the best
data obstructs from the whole dataset as information. We
direct total tests for different sub-dataset applications with
the use of DataNet and the preliminary outcomes show the
promising introduction of DataNet.
Acknowledgment
The creators might want to thank an incredible help.
References
[1] Flume: Open source log collection system.
[2] Vignesh Prajapati, Big Data Analytics with R and
Hadoop, Birmingham:Packt Publishing Ltd, 2013.
[3] om White, Hadoop- The Definitive Guide, United State of
America:O’Reilly, 2012.
[3]M. I. Jordan, T. M. Mitchel, "Machine Learning: Trends
Perspectives and Prospects", American Association for the
Advancement of Science, 2015.

More Related Content

What's hot

IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET Journal
 
sumerian_datacenter-consolidation-white_paper
sumerian_datacenter-consolidation-white_papersumerian_datacenter-consolidation-white_paper
sumerian_datacenter-consolidation-white_paperKelly Moscrop
 
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various ParametersDifferentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various Parametersiosrjce
 
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame Work
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame WorkA Big-Data Process Consigned Geographically by Employing Mapreduce Frame Work
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame WorkIRJET Journal
 
Exploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s LawExploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s LawTom Donoghue
 
Distributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisDistributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisIRJET Journal
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET Journal
 
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and RSvm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and RIRJET Journal
 
Paper id 25201498
Paper id 25201498Paper id 25201498
Paper id 25201498IJRAT
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time AnalyticsMohsin Hakim
 
Efficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data ProcessingEfficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data ProcessingIRJET Journal
 
GPU accelerated Large Scale Analytics
GPU accelerated Large Scale AnalyticsGPU accelerated Large Scale Analytics
GPU accelerated Large Scale AnalyticsSuleiman Shehu
 
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management IJECEIAES
 
Survey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization MethodsSurvey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization Methodspaperpublications3
 
The Underutilization of GIS technologies - Q&A with Shane Barrett
The Underutilization of GIS technologies - Q&A with Shane BarrettThe Underutilization of GIS technologies - Q&A with Shane Barrett
The Underutilization of GIS technologies - Q&A with Shane BarrettIQPC Australia
 
A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...IJECEIAES
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcsandit
 
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock ExchangeIRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock ExchangeIRJET Journal
 

What's hot (20)

IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
 
sumerian_datacenter-consolidation-white_paper
sumerian_datacenter-consolidation-white_papersumerian_datacenter-consolidation-white_paper
sumerian_datacenter-consolidation-white_paper
 
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various ParametersDifferentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
 
Distributed Web System Performance Improving Forecasting Accuracy
Distributed Web System Performance Improving Forecasting  AccuracyDistributed Web System Performance Improving Forecasting  Accuracy
Distributed Web System Performance Improving Forecasting Accuracy
 
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame Work
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame WorkA Big-Data Process Consigned Geographically by Employing Mapreduce Frame Work
A Big-Data Process Consigned Geographically by Employing Mapreduce Frame Work
 
Exploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s LawExploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s Law
 
Distributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisDistributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data Analysis
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
 
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and RSvm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
 
Paper id 25201498
Paper id 25201498Paper id 25201498
Paper id 25201498
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
 
Efficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data ProcessingEfficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data Processing
 
GPU accelerated Large Scale Analytics
GPU accelerated Large Scale AnalyticsGPU accelerated Large Scale Analytics
GPU accelerated Large Scale Analytics
 
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
 
Survey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization MethodsSurvey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization Methods
 
The Underutilization of GIS technologies - Q&A with Shane Barrett
The Underutilization of GIS technologies - Q&A with Shane BarrettThe Underutilization of GIS technologies - Q&A with Shane Barrett
The Underutilization of GIS technologies - Q&A with Shane Barrett
 
Sugi-82-26 Dolkart
Sugi-82-26 DolkartSugi-82-26 Dolkart
Sugi-82-26 Dolkart
 
A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
 
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock ExchangeIRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
 

Similar to IRJET- Big Data Processes and Analysis using Hadoop Framework

IRJET- Recommendation System based on Graph Database Techniques
IRJET- Recommendation System based on Graph Database TechniquesIRJET- Recommendation System based on Graph Database Techniques
IRJET- Recommendation System based on Graph Database TechniquesIRJET Journal
 
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...IRJET Journal
 
IRJET- Placement Portal
IRJET- Placement PortalIRJET- Placement Portal
IRJET- Placement PortalIRJET Journal
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisIRJET Journal
 
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic AlgorithmCloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic AlgorithmIRJET Journal
 
Web usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency GraphWeb usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency GraphIRJET Journal
 
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...IRJET Journal
 
Effective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaSEffective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaSIRJET Journal
 
Energy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentEnergy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentIRJET Journal
 
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...Implementing Proof of Retriavaibility for Multiple Replica of Data File using...
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...IRJET Journal
 
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...IRJET Journal
 
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...IRJET Journal
 
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...IJECEIAES
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET Journal
 
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTIONSECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTIONIRJET Journal
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkMahantesh Angadi
 
High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...IRJET Journal
 
IRJET- Survey of Big Data with Hadoop
IRJET-  	  Survey of Big Data with HadoopIRJET-  	  Survey of Big Data with Hadoop
IRJET- Survey of Big Data with HadoopIRJET Journal
 
IRJET- A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...
IRJET-  	  A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...IRJET-  	  A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...
IRJET- A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...IRJET Journal
 

Similar to IRJET- Big Data Processes and Analysis using Hadoop Framework (20)

IRJET- Recommendation System based on Graph Database Techniques
IRJET- Recommendation System based on Graph Database TechniquesIRJET- Recommendation System based on Graph Database Techniques
IRJET- Recommendation System based on Graph Database Techniques
 
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
 
IRJET- Placement Portal
IRJET- Placement PortalIRJET- Placement Portal
IRJET- Placement Portal
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data Analysis
 
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic AlgorithmCloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
 
Web usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency GraphWeb usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency Graph
 
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...
IRJET - A Prognosis Approach for Stock Market Prediction based on Term Streak...
 
Effective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaSEffective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaS
 
A Survey and Comparison of SDN Based Traffic Management Techniques
A Survey and Comparison of SDN Based Traffic Management TechniquesA Survey and Comparison of SDN Based Traffic Management Techniques
A Survey and Comparison of SDN Based Traffic Management Techniques
 
Energy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentEnergy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud Environment
 
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...Implementing Proof of Retriavaibility for Multiple Replica of Data File using...
Implementing Proof of Retriavaibility for Multiple Replica of Data File using...
 
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...
 
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
 
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
 
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTIONSECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
 
High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...
 
IRJET- Survey of Big Data with Hadoop
IRJET-  	  Survey of Big Data with HadoopIRJET-  	  Survey of Big Data with Hadoop
IRJET- Survey of Big Data with Hadoop
 
IRJET- A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...
IRJET-  	  A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...IRJET-  	  A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...
IRJET- A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 

Recently uploaded (20)

Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 

IRJET- Big Data Processes and Analysis using Hadoop Framework

  • 1. INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056 VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1619 Big Data Processes and Analysis using Hadoop Framework Nida Mahveen1, Radha B.K2 1P.G.Student, Department of Computer Science and Engineering, PDA, Karnataka ( India) 2Asst. Professor, Department of Computer Science and Engineering, PDA, Karnataka (India) ------------------------------------------------------------------------***------------------------------------------------------------------------- Abstract: During investigation the issue of sub-dataset investigation over conveyed record frameworks, e.g, the Hadoop document framework. The trials demonstrate with the intention of the sub-datasets appropriation more than HDFS squares, which be covered up with HDFS, can frequently make comparing examinations experience the bad effects of a genuinely imbalanced or wasteful parallel execution. In particular, the substance distribution of sub-datasets brings about some computational hubs completing significantly more remaining task at hand than others; besides, it prompts wasteful testing of sub-datasets, as examination projects will regularly peruse a lot of superfluous information. The direct an extensive examination on how imbalanced figuring examples and wasteful inspecting happen and at that point propose a capacity dispersion mindful strategy to streamline the sub-dataset investigation over disseminated stockpiling frameworks alluded to as Data-Net. Right off the bat, an effective calculation to acquire the metadata of sub-dataset disseminations is projected. Also, this plan a versatile stockpiling structure called Elastic-Map dependent on the Map-Reduce and Bloom-Filter systems towards accumulate the meta-information. We utilize dispersion mindful calculations used for sub-dataset application to accomplish adjusted plus productive similar implementation. The projected technique preserve profit diverse sub-dataset examinations through different computational prerequisites. Analyses be directed scheduled PRObEs-Marmot 128-computer distributed plus the outcomes demonstrate the presentation advantages of Data-Net. Keywords: Hadoop Distributed File System (HDFS), Map-Reduce, Data-Node, Data-Net I. Introduction ADVANCES in recognizing, frameworks organization and limit developments have incited the age and assembling of data at incredibly high rates and volumes. Gigantic organizations, for instance, Google, Amazon and Facebook produce and assemble terabytes of data concerning snap stream or event signs in only a few hours. In order to ensure system security and option business learning these data commonly ought to be furthermore gathered or picked for individual assessment. For instance, in proposition systems and redid web benefits, the examination on the website page snaps streams needs to perform customer sessionization assessment so as to give better help of each customer. Similarly, in framework traffic structures, stream improvement subject to framework traffic pursues should isolate different sorts of framework traffic and direct examination in like way. In like manner, in business trade examination, data with unequivocal features are commonly picked for coercion area and peril appraisal. In this paper, a get-together of data related to explicit events or features is implied as a sub-dataset. At present, the Hadoop record structure is the acknowledged open-source accumulating system in gigantic data examination. It might be really sent on the plates of pack centre points for adjacent data get to. When securing a dataset, HDFS will segment the dataset into smaller square records and subjectively circle them with a couple of undefined copies (for unflinching quality). For all intents and purposes, an immense dataset may contain millions or billions of sub-datasets, for instance, business snaps or event based log data. II. Literature Survey Composing outline is the tremendous development in programming improvement process. Before improving the tools it is mandatory to pick the economy quality, time factor. At the point when the designer's make the structure tools as programming architect need a huge amount of outside assistance, this kind of assistance should be conceivable by senior programming engineers, from locales or from books. [1] Logothetis,. Trezzo, K. C. Webb, K. Yocum, "In-situ MapReduce for log handling” Log examination are a bedrock part of running a considerable lot of the present Internet destinations. Application and snap logs structure the reason for following and breaking down client practices and inclinations, and they structure the essential contributions to promotion focusing on calculations. Logs are additionally basic for execution and security observing, investigating, and enhancing the huge process frameworks that make up the register "cloud", a large number of machines spreading over various server farms. With current log age rates on the request of 1–10 MB/s per machine, a solitary server farm can make several TBs of log information daily. While mass information handling has demonstrated to be a basic device for log preparing, current practice moves all logs to a brought together process group. This not just devours a lot of system and circle transmission capacity, yet additionally postpones the fruition of time-touchy examination. We present an in-situ MapReduce design that mines information "on area", bypassing the expense and hold up time of this store-first-inquiry later methodology. In contrast to current methodologies, our design unequivocally supports diminished information constancy, enabling clients to explain inquiries with inactivity and loyalty prerequisites
  • 2. INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056 VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1620 [2]. Chen Z, Wu D. A sprout channel based methodology designed for productive MapReduce The MapReduce handling structure is uninformed of the property of basic datasets. For requested datasets (e.g., time-arrangement information), in which records have been now arranged, MapReduce still performs pointless arranging tasks during its execution. It straightforwardly brings about a noteworthy increment of execution time, as arranging a huge volume of information is tedious. In this paper, we propose a blossom channel based way to deal with improve the presentation of MapReduce when handling requested datasets. In our methodology, all records are put away in a lot of sprout channels after the Mapping stage and information inquiries can be proficiently prepared by checking the blossom channels. Because of the high questioning proficiency of sprout channels, we can accomplish huge execution gain in the Reducing stage. We direct a progression of tests to assess the adequacy of our proposed blossom channel based methodology. Our exploratory outcomes demonstrate that our methodology can accomplish 2x speedup as far as question handling execution, and lessen the CPU/memory usage in the in the mean time. In addition, we likewise assess the adaptability of our proposed methodology when preparing different questions, and see that the speedup can be additionally improved with the expanding number of inquiries. [3]. E. Coppa, I. Finocchi, "On information skewness stragglers Author handle the issue of anticipating the presentation of MapReduce applications planning precise advancement markers, which keep software engineers educated on the level of finished calculation time during the execution of a vocation. This is particularly significant in pay-as-you-go cloud situations, where moderate employments can be prematurely ended so as to keep away from unnecessary expenses. Execution forecasts can likewise fill in as a structure hinder for a few profile-guided advancements. By accepting that the running time depends directly on the info measure, cutting edge procedures can be genuinely hurt by information skewness, load unbalancing, and straggling assignments. We in this way structure a novel profile-guided advancement pointer, called NearestFit, that works without the straight theory supposition in a completely online manner (i.e., without falling back on profile information gathered from past executions). NearestFit misuses a cautious blend of closest neighbor relapse and factual bend fitting procedures. Fine-grained profiles required by our hypothetical advancement model are approximated through existence productive information gushing calculations. We executed NearestFit over Hadoop 2.6.0. A broad experimental appraisal over the Amazon EC2 stage on an assortment of benchmarks demonstrates that its precision is generally excellent, notwithstanding when contenders bring about non- immaterial blunders and wide expectation changes. [4]. L. Yu, Y. Shao, B. Cui, "Abusing grid reliance for effective appropriated network calculation Conveyed grid calculation is a well-known methodology for some huge scale information examination and AI undertakings. Anyway existing disseminated network calculation frameworks for the most part cause overwhelming correspondence cost during the runtime, which corrupts the general execution. Network calculation framework, named DMac, which adventures the lattice conditions in grid programs for productive grid calculation in the conveyed condition. We deteriorate every framework program into an arrangement of activities, and uncover the lattice conditions between tasks in the program. We next structure a reliance arranged cost model to choose an ideal execution methodology for every activity, and create a correspondence proficient execution plan for the network calculation program. To encourage the grid calculation in circulated frameworks, we further gap the execution plan into numerous un-interleaved stages which can keep running in an appropriated group with effective nearby execution system on every laborer. The DMac framework has been actualized on a prominent broadly useful information preparing system, Spark. The test results show that our strategies can essentially improve the exhibition of a wide scope of network programs. [5].. Wang Y, Jiao, WeikuanY, Coo-MR: Cross-task synchronization used for productive information the executives in MapReduce programs Hadoop is a generally embraced open source execution of MapReduce programming model for enormous information preparing. It speaks to framework assets as accessible guide and diminish openings and appoints them to different undertakings. This execution model gives little respect to the need of cross-task coordination on the utilization of shared framework assets on a register hub, which results in assignment obstruction. Likewise, the current Hadoop blend calculation can cause over the top I/O. In this investigation, we attempt a push to address the two issues. As needs be, we have structured a cross-task coordination system called CooMR for proficient information the board in MapReduce programs. CooMR comprises of three segment plans including cross job crafty recollection contribution plus log-organized input output combination, which be intended towards encourage job organization, and the key-situated in-situ consolidate (KISM) calculation which is intended to empower the arranging/converging of Hadoop halfway information without really moving the <key, value> sets. Our assessment shows that CooMR can build task coordination, improve framework asset usage, and altogether accelerate the execution time of MapReduce programs. [6]. Viles C. L, FrenchJ. C.Content region inside dispersed computerized libraries In this paper we present the idea of substance territory in conveyed report accumulations. Content territory is how
  • 3. INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056 VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1621 much substance comparative records are colocated in a dispersed accumulation. We propose two measurements for estimation of substance area, one dependent on point marks and the other dependent on accumulation insights. We give deductions and investigation of the two measurements and use them to quantify the substance territory in two sorts of record accumulations, the notable TREC quantity. We likewise demonstrate so as to substance region be able to be consideration of transiently just since spatially plus give proof of its reality inside transiently requested report accumulations like news sources. III. Activity Diagram Figure1: Sequence Diagram In above diagram shows the overall process of working with the proposed system, where all the data is spread across cluster of commodity hardwares and then using the map and reduce process it is analysed parallel so that the analysis of the information is done fast and accurate. IV. Result and Discussion . Figure 2. Screen-Shot-1 In the above screen shot, one can see that the processing of the data is carried out and information regarding how much of time taken for the processing and what information is processed is showing. Figure 3.Screen-shot-2 In the above screen shot one can see the data which is processed how it is distributed across the commodity of hardware and output is given in the form of the file name par-0000. V. Conclusion and Future Scope We look into the issues of imbalanced sub-dataset examination and inefficient testing on sub-datasets over a Hadoop gathering. On account of the missing information of sub-datasets' region, the substance gathering trademark in most sub-datasets keeps applications from capably setting them up. Through a theoretical examination, we assume that an uneven sub-dataset spread frequently prompts a
  • 4. INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) E-ISSN: 2395-0056 VOLUME: 06 ISSUE: 07 | JULY 2019 WWW.IRJET.NET P-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1622 lower-execution in parallel data assessment. To address this issue, we propose DataNet to help sub-dataset apportionment careful figuring. DataNet uses an adaptable structure, called Elastic-Map, to store the sub-dataset spreads. In like manner, an overarching sub-dataset segment computation is proposed to help the advancement of ElasticMap. With the usage of DataNet, sub-dataset assessments can without quite a bit of a stretch equality their remarkable job that needs to be done among computational center points. Additionally, for sub-dataset testing, we can without quite a bit of a stretch find the best data obstructs from the whole dataset as information. We direct total tests for different sub-dataset applications with the use of DataNet and the preliminary outcomes show the promising introduction of DataNet. Acknowledgment The creators might want to thank an incredible help. References [1] Flume: Open source log collection system. [2] Vignesh Prajapati, Big Data Analytics with R and Hadoop, Birmingham:Packt Publishing Ltd, 2013. [3] om White, Hadoop- The Definitive Guide, United State of America:O’Reilly, 2012. [3]M. I. Jordan, T. M. Mitchel, "Machine Learning: Trends Perspectives and Prospects", American Association for the Advancement of Science, 2015.