SlideShare a Scribd company logo
1 of 7
Download to read offline
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [116]
AN ANALYTICAL REVIEW STUDY ON BIG DATA ANALYSIS USING R
STUDIO
Anita Kumari 1, Neeraj Verma 2
1
MTech Scholar, Dept of Computer Science & Engineering PPIMT, Hisar (Haryana), India
2
Assistant Professor, Dept of Computer Science & Engineering PPIMT, Hisar (Haryana), India
Abstract:
A larger amount of data gives a better output but also working with it can become a challenge
due to processing limitations. Nowadays companies are starting to realize the importance of
using more data in order to support decision for their strategies. It was said and proved
through study cases that “More data usually beats better algorithms”. With this statement
companies started to realize that they can chose to invest more in processing larger sets of data
rather than investing in expensive algorithms. During the last decade, large statistics
evaluation has seen an exponential boom and will absolutely retain to witness outstanding
tendencies due to the emergence of new interactive multimedia packages and extraordinarily
incorporated systems driven via the speedy growth in statistics services and microelectronic
gadgets. Up to now, maximum of the modern mobile structures are especially centered to voice
communications with low transmission fees.
Keywords: Huge Statistics; Big Data Analysis; R Studio.
Cite This Article: Anita Kumari, and Neeraj Verma. (2019). “AN ANALYTICAL REVIEW
STUDY ON BIG DATA ANALYSIS USING R STUDIO.” International Journal of
Engineering Technologies and Management Research, 6(6), 116-122. DOI:
10.5281/zenodo.3266146.
1. Introduction
We can associate the importance of Big Data and Big Data Analysis with the society that we live
in. Today we are living in an Informational Society and we are moving towards a Knowledge
Based Society. In order to extract better knowledge we need a bigger amount of data. The
Society of Information is a society where information plays a major role in the economic,
cultural and political stage. [2]
In the Knowledge society the competitive advantage is gained through understanding the
information and predicting the evolution of facts based on data. The same happens with Big
Data. Every organization needs to collect a large set of data in order to support its decision and
extract correlations through data analysis as a basis for decisions. [3]
In this article we will define the concept of Big Data, its importance and different perspectives
on its use. In addition we will stress the importance of Big Data Analysis and show how the
analysis of Big Data will improve decisions in the future.
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [117]
2. Big Data Concept
The term “Big Data” was first introduced to the computing world by Roger Magoulas from
O’Reilly media in 2005 in order to define a great amount of data that traditional data
management techniques cannot manage and process due to the complexity and size of this data.
[7]
A study on the Evolution of Big Data as a Research and Scientific Topic shows that the term
“Big Data” was present in research starting with 1970s but has been The emerging large records
science time period, displaying its broader effect on our society and in our enterprise lifestyles
cycle, has insightful transformed our society and could hold to draw numerous attentions from
technical specialists and in addition to public in standard [1] [2]. It is apparent that we're residing
in huge records era, proven with the aid of the sheer quantity of data from a spread of sources
and its growing rate of era. as an example, an IDC file predicts that, from 2005 to 2020, the
global statistics dimensions will grow with the aid of a issue of three hundred, from a hundred
thirty Exabyte’s to 40,000 Exabyte’s, representing a double increase each two years. this is
specializes in on hand big-records structures that include a set of equipment and technique to
load, extract, and enhance numerous information while leveraging the immensely parallel
processing energy to carry out complicated transformations and evaluation. “large-information”
machine faces a chain of technical challenges, along with:
First, because of the large variety of different records sources and the massive extent, its miles
too tough to acquire, combine and evaluation of “huge statistics” with scalability from scattered
places.
Second “huge facts” structures want to manage, shop and integrate the amassed massive and
varied verity of datasets, whilst provide function and performance warranty [1], in terms of rapid
retrieval, scalability and secrecy safety.
Third “large information” analytics have to efficaciously excavation large datasets at one of kind
degrees in actual time or close to real time - which includes modeling, visualization [2],
prediction and optimization - such that inherent potentials can be discovered to improve choice
making and collect similarly advantages. To address these challenges, the researcher IT industry
and community has given various solutions for “Big Data” science systems in an ad-hoc manner.
Cloud computing can be called as the substructure layer for “Big Data” systems to meet certain
substructure requirements, such as cost-effectiveness, resistance [2], and the ability to scale up or
down. Distributed file systems and No SQL databases are suitable for persistent storage and the
management of massive scheme free datasets [1]. Map Reduce, R is a programming framework,
has achieved great success in processing “Big Data” group-aggregation tasks, such as website
ranking [10].
RStudio integrates data storage, data processing, system management, and other modules to form
a powerful system-level solution, which is becoming the mainstay in handling “Big Data”
challenges. We can build various “Big Data” application system based on these innovative
technologies and platforms. In light of the of big-data technologies, a systematic frame work
should be in order to capture the fast evolution of big-data research.
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [118]
2.1. Big Data Problem and Challenges
However, considering variety of data sets in “Big Data” problems, it is still a big challenge for us
to purpose efficient representation, access, and analysis of shapeless or semi-structured data in
the further researches [12]. How can the data be preprocessed in order to improve the quality of
data and analysis results before we begin data analysis [1] [2]? As the sizes of dataset are often
very large, sometimes several gigabytes or more, and their origin from varied sources, current
real-world databases are pitilessly susceptible to inconsistent, incomplete, and noisy data.
Therefore, a number of data preprocessing techniques, including data cleaning [11], data
integration, data transformation and date reduction, can be applied to remove noise and correct
irregularities. Different challenges arise in each sub-process when it comes to data-driven
applications.
2.2. Principles for designing Big Data System
In designing “Big Data” analytics systems, we summarize seven necessary principles to guide
the development of this kind of burning issues [3]. “Big Data” analytics in a highly distributed
system cannot be achievable without the following principles [13]:
1) Good architectures and frameworks are necessary and on the top priority.
2) Support a variety of analytical methods
3) No size fits all
4) Bring the analysis to data
5) Processing must be distributable for in-memory computation.
6) Data storage must be distributable for in-memory storage.
7) Coordination is needed between processing and data units.
2.3. Big Data Opportunities
The bonds between “Big Data” and knowledge hidden in it are highly crucial in all areas of
national priority. This initiative will also lay the groundwork for complementary “Big Data”
activities, such as “Big Data” substructure projects, platforms development, and techniques in
settling complex, data-driven problems in sciences and engineering. Researchers, policy and
decision makers have to recognize the potential of harnessing “Big Data” to uncover the next
wave of growth in their fields. There are many advantages in business section that can be
obtained through harnessing “Big Data” increasing operational efficiency, informing strategic
direction, developing better customer service, identifying and developing new products and
services, identifying new customers and markets, etc. [10]
2.4. Big Data Analysis
The last and most important stage of the “Big Data” value chain is data analysis, the goal of
which is to get useful values, suggest best conclusions and support decision-making system of an
organization to stay in competition market. [1]
Descriptive Analytics: exploits historical data to describe what occurred in past. For instance, a
regression technique may be used to find simple trends in the datasets, visualization presents data
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [119]
in a meaningful fashion, and data modeling is used to collect, store and cut the data in an
efficient way. Descriptive analytics is typically associated with business intelligence or visibility
systems [2].
Predictive Analytics: focuses on predicting future probabilities and trends. For example,
predictive modeling uses statistical techniques [6] such as linear and logistic regression to
understand trends and predict future out-comes, and data mining extracts patterns to provide
insight and forecasts [4].
Prescriptive Analytics: addresses decision making and efficiency. For example, simulation is
used to analyze complex systems to gain insight into system performance and identify issues and
optimization techniques are used to find best solutions under given constraints.
3. Literature Review
Shweta Pandey et al. in 2014 shows that Big Data is having challenges related to volume,
velocity and variety. Big Data has 3Vs Volume means large amount of data, Velocity means data
arrives at high speed, Variety means data comes from heterogeneous resources. In Big Data
definition, Big means a dataset which makes data concept to grow so much that it becomes
difficult to manage it by using existing data management concepts and tools. Map Reduce is
playing a very significant role in processing of Big Data. The main objective of this paper is
purposed a tool like Map Reduce is elastic scalable, efficient and fault tolerant for analyzing a
large set of data, highlights the features of Map Reduce in comparison of other design model
which makes it popular tool for processing large scale data.
HAN HU and YONGGANG WEN in 2014 shows that Recent technological advancements have
led to a deluge of data from distinctive domains e.g., health care and scientific sensors, user-
generated data, Internet and financial companies, and supply chain systems. Over the past two
decades. The term big data was coined to capture the meaning of this emerging trend. In addition
to its sheer volume, big data also exhibits other unique characteristics as compared with
traditional data. For instance, big data is commonly unstructured and require more real-time
analysis. This development calls for new system architectures for data acquisition, transmission,
storage, and large-scale data processing mechanisms. In this paper, we present a literature survey
and system tutorial for big data analytics platforms, aiming to provide an overall picture for non-
expert readers and instill a do-it-yourself spirit for advanced audiences to customize their own
big-data solutions.
Katarina et al. in 2014 discussed challenges for Map Reduce in Big Data. In the Big Data
community, Map Reduce has been seen as one of the key enabling approaches for meeting the
continuously increasing demands on computing resources imposed by massive data sets. At the
same time, Map Reduce faces a number of obstacles when dealing with Big Data including the
lack of a high-level language such as SQL, challenges in implementing iterative algorithms,
support for iterative ad-hoc data exploration, and stream processing. The identified Map Reduce
challenges are grouped into four main categories corresponding to Big Data tasks types: data
storage, analytics, online processing, security and privacy. The main objective of this paper is
identifies Map Reduce issues and challenges in handling Big Data with the objective of
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [120]
providing an overview of the field, facilitating better planning and management of Big Data
projects, and identifying opportunities for future research in this field.
Zhen Jia1, Jianfeng Zhan, Lei Wang, Rui Han, Sally A. McKee, Qiang Yang, Chunjie Luo, and
Jingwei Li in 2014 discussed Characterizing and Subsetting Big Data Workloads. A large
number of benchmarks pose great challenges, since our usual simulation-based research methods
become prohibitively expensive. They use hardware performance counters to analyze micro
architectural behaviors of those scale out workloads. They compare the scale-out workloads and
traditional benchmarks to identify the key contributors to the micro architecture inefficiency on
modern processors. They conclude that mismatches exist between the needs of scale-out
workloads and the capabilities of modern processors. Much work focuses on comparing the
performance of different data management systems. For OLTP or database systems evaluation,
TPC-C is often used to evaluate transaction-processing system performance in terms of
transactions per minute. Cooper defines a core set of benchmarks and report throughput and
latency results for five widely used data management systems.
3.1. Big Data Tools: Techniques and Technologies
To capture the value from “Big Data”, we need to develop new techniques and technologies for
analyzing it. Until now, scientists have developed a wide variety of techniques and technologies
to capture, curate, analyze and visualize Big Data.
We need tools (platforms) to make sense of “Big Data”. Current tools concentrate on three
classes, namely, batch processing tools, stream processing tools, and interactive analysis tools.
Most batch processing tools are based on the Apache RStudio infrastructure, such as Map reduce
[4], R Programming and Dryad. The interactive analysis processes the data in an interactive
environment, allowing users to undertake their own analysis of information.
3.2. R Programming
The R language is well mounted as the language for doing data, facts evaluation, information-
mining algorithm improvement, stock buying and selling, credit score danger scoring,
marketplace basket evaluation and all [9] way of predictive analytics. However, given the deluge
of information that must be processed and analyzed nowadays, many groups had been reticent
about deploying R past studies into production programs. [16]
3.3. Comparisons of Classification for Big Data Science
To apply different classification technique I have chosen a real dataset about the student’s
knowledge status about the subject of Electrical DC Machines. Distribution of every numeric
variable can be checked with function summary (), which returns the minimum, maximum,
mean, median, and the first (25%) and third (75%) quartiles. For factors (or categorical
variables), it shows the frequency of every level.
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [121]
3.4. The importance of Big Data
The main importance of Big Data consists in the potential to improve efficiency in the context of
use a large volume of data, of different type. If Big Data is defined properly and used
accordingly, organizations can get a better view on their business therefore leading to efficiency
in different areas like sales, improving the manufactured product and so forth. [2]
Big Data can be used effectively in the following areas:
 In information technology in order to improve security and troubleshooting by analyzing
the patterns in the existing logs;
 In customer service by using information from call centers in order to get the customer
pattern and thus enhance customer satisfaction by customizing services;
 In improving services and products through the use of social media content. By knowing
the potential customers preferences the company can modify its product in order to
address a larger area of people;
 In the detection of fraud in the online transactions for any industry;
 In risk assessment by analyzing information from the transactions on the financial market
4. Conclusions
The year 2012 is the year when companies are starting to orient themselves towards the use of
Big Data. That is why this articol presents the Big Data concept and the R technologies
associated in order to understand better the multiple benefices of this new concept ant
technology.
In the future we propose for our research to further investigate the practical advantages that can
be gain through R Studio.
References
[1] C.L. Philip Chen, Chun-Yang Zhang, “Data intensive applications, challenges, techniques and
technologies: A survey on Big Data” Information Science 0020-0255 (2014), PP 341-347,
elsevier
[2] Han hu1At. Al. (Fellow, IEEE),” Toward Scalable Systems for Big Data Analytics: A
Technology Tutorial”, IEEE 2169-3536(2014),PP 652-687
[3] Shweta Pandey, Dr.VrindaTokekar, “Prominence of MapReduce in BIG DATA Processing”,
IEEE (Fourth International Conference on Communication Systems and Network
Technologies)978-1-4799-3070-8/14, PP 555-560
[4] Katarina Grolinger At. Al. “Challenges for MapReduce in Big Data”, IEEE (10th World
Congress on Services)978-1-4799-5069-0/14,PP 182-189
[5] Zhen Jia1 At. Al. “Characterizing and Subsetting Big Data Workloads", IEEE 978-1-4799-6454-
3/14, PP 191-201
[6] AvitaKatal, Mohammad Wazid, R H Goudar, “Big Data: Issues, Challenges, Tools and Good
Practices”, IEEE 978-1-4799-0192-0/13,PP 404-409
[7] Du Zhang,” Inconsistencies in Big Data”, IEEE 978-1-4799-0783-0/13, PP 61-67
[Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907
DOI: 10.5281/zenodo.3266146
Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [122]
[8] ZibinZheng, Jieming Zhu, and Michael R. Lyu, “Service-generated Big Data and Big Data-as-a-
Service: An Overview”, IEEE (International Congress on Big Data) 978-0-7695-5006-0/13, PP
403-410
[9] VigneshPrajapati, Big Data Analytics with R and RStudioPackt Publishing
[10] Lei Wang At. Al., “BigDataBench: aBigData Benchmark Suite from InternetServices”, IEEE
978-1-4799-3097-5/14.
[11] AnirudhKadadi At. Al., “Challenges of Data Integration and Interoperability in Big Data”, IEEE
(International Conference on Big Data)978-1-4799-5666-1/14, PP 38-40
[12] SAS, Five big data challenges and how to overcome them with visual analytics
[13] HajarMousanif At. Al., “From Big Data to Big Projects: a Step-by-step Roadmap”, IEEE
(International Conference on Future Internet of Things and Cloud) 978-1-4799-4357-9/14, PP
373-378
[14] Tianbo Lu At. Al., “Next Big Thing in Big Data: The Security of the ICT Supply Chain”, IEEE
(SocialCom/PASSAT/BigData/EconCom/BioMedCom) 978-0-7695-5137-1/13, PP 1066-1073
[15] Ganapathy Mani, NimaBarit, Duoduo Liao, Simon Berkovich, “Organization of Knowledge
Extraction from Big Data Systems”, IEEE (4 Fifth International Conference on Computing for
Geospatial Research and Application) 978-1-4799-4321-0/14, PP 63-69
[16] Joseph Rickert, “Big Data Analysis with Revolution R Enterprise”, 2011
[17] Carson Kai-Sang Leung, Richard Kyle MacKinnon, Fan Jiang, “Reducing the Search Space for
Big Data Mining for Interesting Patterns from Uncertain Data”, IEEE 2014, PP 315-322
[18] Ajith Abraham1, Swagatam Das2, and Sandip Roy3, “Swarm Intelligence Algorithms for Data
Clustering”, PP 280-313
[19] Swagatam Das, Ajith Abraham, Senior Member, IEEE, and Amit Konar, “Automatic Clustering
Using an Improved Differential Evolution Algorithm”, IEEE 2008, PP 218-237
[20] KarthikKambatla, GiorgosKollias, Vipin Kumar, AnanthGrama, “J. Parallel Distrib. Comput”,
Elsevier 2014, PP 2561-2573
[21] Yanchang Zhao, “R and Data Mining: Examples and Case Studies”,
www.RDataMining.com,2014
[22] H. T. Kahraman, Sagiroglu, S., Colak,“User Knowledge Modeling Data Set”, UCI, vol. 37, pp.
283-295, 2013
[23] Mrigank Mridul, Akashdeep Khajuria, Snehasish Dutta, Kumar N, “Analysis of Bidgata using
Apache RStudio and Map”, Volume 4, Issue 5, May 2014 Reduce, PP. 555-560.
[24] Sonja Pravilovic,” R language in data mining techniques and statistics”, 20130201.12,2013
[25] Vrushali Y Kulkarni,” Random Forest Classifiers: A Survey and Future Research Directions”,
International Journal of Advanced Computing, ISSN: 2051-0845, Vol.36, Issue.1, April 2013
[26] Aditya Krishna Menon,” Large-Scale Support Vector Machines: Algorithms and Theory”.

More Related Content

Similar to Big Data Analysis Using R Studio

An Comprehensive Study of Big Data Environment and its Challenges.
An Comprehensive Study of Big Data Environment and its Challenges.An Comprehensive Study of Big Data Environment and its Challenges.
An Comprehensive Study of Big Data Environment and its Challenges.ijceronline
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxjesssueann
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxgauthierleppington
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)Sonu Gupta
 
Sameer Kumar Das International Conference Paper 53
Sameer Kumar Das International Conference Paper 53Sameer Kumar Das International Conference Paper 53
Sameer Kumar Das International Conference Paper 53Mr.Sameer Kumar Das
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET Journal
 
Big data is a broad term for data sets so large or complex that tr.docx
Big data is a broad term for data sets so large or complex that tr.docxBig data is a broad term for data sets so large or complex that tr.docx
Big data is a broad term for data sets so large or complex that tr.docxhartrobert670
 
Big Data.pdf
Big Data.pdfBig Data.pdf
Big Data.pdfAnilaAbid2
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...IRJET Journal
 
sustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxsustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxdeanmtaylor1545
 
sustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxsustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxmabelf3
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWieijjournal
 
Research in Big Data - An Overview
Research in Big Data - An OverviewResearch in Big Data - An Overview
Research in Big Data - An Overviewieijjournal
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWieijjournal1
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWieijjournal
 
The effect of technology-organization-environment on adoption decision of bi...
The effect of technology-organization-environment on  adoption decision of bi...The effect of technology-organization-environment on  adoption decision of bi...
The effect of technology-organization-environment on adoption decision of bi...IJECEIAES
 

Similar to Big Data Analysis Using R Studio (20)

An Comprehensive Study of Big Data Environment and its Challenges.
An Comprehensive Study of Big Data Environment and its Challenges.An Comprehensive Study of Big Data Environment and its Challenges.
An Comprehensive Study of Big Data Environment and its Challenges.
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)
 
Sameer Kumar Das International Conference Paper 53
Sameer Kumar Das International Conference Paper 53Sameer Kumar Das International Conference Paper 53
Sameer Kumar Das International Conference Paper 53
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial Domain
 
Big data is a broad term for data sets so large or complex that tr.docx
Big data is a broad term for data sets so large or complex that tr.docxBig data is a broad term for data sets so large or complex that tr.docx
Big data is a broad term for data sets so large or complex that tr.docx
 
Big Data.pdf
Big Data.pdfBig Data.pdf
Big Data.pdf
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
 
sustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxsustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docx
 
sustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docxsustainabilityCase ReportIntegrated Understanding of B.docx
sustainabilityCase ReportIntegrated Understanding of B.docx
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEW
 
Research in Big Data - An Overview
Research in Big Data - An OverviewResearch in Big Data - An Overview
Research in Big Data - An Overview
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEW
 
RESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEWRESEARCH IN BIG DATA – AN OVERVIEW
RESEARCH IN BIG DATA – AN OVERVIEW
 
Big Data Challenges faced by Organizations
Big Data Challenges faced by OrganizationsBig Data Challenges faced by Organizations
Big Data Challenges faced by Organizations
 
The effect of technology-organization-environment on adoption decision of bi...
The effect of technology-organization-environment on  adoption decision of bi...The effect of technology-organization-environment on  adoption decision of bi...
The effect of technology-organization-environment on adoption decision of bi...
 

More from Dereck Downing

How To Write Dialogue A Master List Of Grammar Techniques
How To Write Dialogue A Master List Of Grammar TechniquesHow To Write Dialogue A Master List Of Grammar Techniques
How To Write Dialogue A Master List Of Grammar TechniquesDereck Downing
 
Writing Paper Service Educational Blog Secrets To Writing Blog Even
Writing Paper Service Educational Blog Secrets To Writing Blog EvenWriting Paper Service Educational Blog Secrets To Writing Blog Even
Writing Paper Service Educational Blog Secrets To Writing Blog EvenDereck Downing
 
Scholarship Essay Essays Terrorism
Scholarship Essay Essays TerrorismScholarship Essay Essays Terrorism
Scholarship Essay Essays TerrorismDereck Downing
 
Best Nursing Essay Writing Services - Essay Help Online
Best Nursing Essay Writing Services - Essay Help OnlineBest Nursing Essay Writing Services - Essay Help Online
Best Nursing Essay Writing Services - Essay Help OnlineDereck Downing
 
The Federalist Papers By Alexander Hamilton Over
The Federalist Papers By Alexander Hamilton OverThe Federalist Papers By Alexander Hamilton Over
The Federalist Papers By Alexander Hamilton OverDereck Downing
 
Example Of Acknowledgment
Example Of AcknowledgmentExample Of Acknowledgment
Example Of AcknowledgmentDereck Downing
 
Writing Paper Or Write My Paper -
Writing Paper Or Write My Paper -Writing Paper Or Write My Paper -
Writing Paper Or Write My Paper -Dereck Downing
 
Observation Report-1
Observation Report-1Observation Report-1
Observation Report-1Dereck Downing
 
Third Person Narrative Essay - First, Second, And Third-Person Points
Third Person Narrative Essay - First, Second, And Third-Person PointsThird Person Narrative Essay - First, Second, And Third-Person Points
Third Person Narrative Essay - First, Second, And Third-Person PointsDereck Downing
 
007 Apa Essay Format Example Thatsnotus
007 Apa Essay Format Example Thatsnotus007 Apa Essay Format Example Thatsnotus
007 Apa Essay Format Example ThatsnotusDereck Downing
 
Writing Paper - Etsy
Writing Paper - EtsyWriting Paper - Etsy
Writing Paper - EtsyDereck Downing
 
Research Paper For Cheap, Papers Online Essay
Research Paper For Cheap, Papers Online EssayResearch Paper For Cheap, Papers Online Essay
Research Paper For Cheap, Papers Online EssayDereck Downing
 
Law Essay Example CustomEssayMeister.Com
Law Essay Example CustomEssayMeister.ComLaw Essay Example CustomEssayMeister.Com
Law Essay Example CustomEssayMeister.ComDereck Downing
 
Good Introductions For Research Papers. How To
Good Introductions For Research Papers. How ToGood Introductions For Research Papers. How To
Good Introductions For Research Papers. How ToDereck Downing
 
How To Write Better Essays - Ebooksz
How To Write Better Essays - EbookszHow To Write Better Essays - Ebooksz
How To Write Better Essays - EbookszDereck Downing
 
Handwriting Without Tears Worksheets Free Printable Fr
Handwriting Without Tears Worksheets Free Printable FrHandwriting Without Tears Worksheets Free Printable Fr
Handwriting Without Tears Worksheets Free Printable FrDereck Downing
 
Inspirational Quotes For Writers. QuotesGram
Inspirational Quotes For Writers. QuotesGramInspirational Quotes For Writers. QuotesGram
Inspirational Quotes For Writers. QuotesGramDereck Downing
 
What Is Abstract In Research. H
What Is Abstract In Research. HWhat Is Abstract In Research. H
What Is Abstract In Research. HDereck Downing
 
Anecdotes Are Commonly Use
Anecdotes Are Commonly UseAnecdotes Are Commonly Use
Anecdotes Are Commonly UseDereck Downing
 
Synthesis Journal Example. Synthesis Examples. 2022-1
Synthesis Journal Example. Synthesis Examples. 2022-1Synthesis Journal Example. Synthesis Examples. 2022-1
Synthesis Journal Example. Synthesis Examples. 2022-1Dereck Downing
 

More from Dereck Downing (20)

How To Write Dialogue A Master List Of Grammar Techniques
How To Write Dialogue A Master List Of Grammar TechniquesHow To Write Dialogue A Master List Of Grammar Techniques
How To Write Dialogue A Master List Of Grammar Techniques
 
Writing Paper Service Educational Blog Secrets To Writing Blog Even
Writing Paper Service Educational Blog Secrets To Writing Blog EvenWriting Paper Service Educational Blog Secrets To Writing Blog Even
Writing Paper Service Educational Blog Secrets To Writing Blog Even
 
Scholarship Essay Essays Terrorism
Scholarship Essay Essays TerrorismScholarship Essay Essays Terrorism
Scholarship Essay Essays Terrorism
 
Best Nursing Essay Writing Services - Essay Help Online
Best Nursing Essay Writing Services - Essay Help OnlineBest Nursing Essay Writing Services - Essay Help Online
Best Nursing Essay Writing Services - Essay Help Online
 
The Federalist Papers By Alexander Hamilton Over
The Federalist Papers By Alexander Hamilton OverThe Federalist Papers By Alexander Hamilton Over
The Federalist Papers By Alexander Hamilton Over
 
Example Of Acknowledgment
Example Of AcknowledgmentExample Of Acknowledgment
Example Of Acknowledgment
 
Writing Paper Or Write My Paper -
Writing Paper Or Write My Paper -Writing Paper Or Write My Paper -
Writing Paper Or Write My Paper -
 
Observation Report-1
Observation Report-1Observation Report-1
Observation Report-1
 
Third Person Narrative Essay - First, Second, And Third-Person Points
Third Person Narrative Essay - First, Second, And Third-Person PointsThird Person Narrative Essay - First, Second, And Third-Person Points
Third Person Narrative Essay - First, Second, And Third-Person Points
 
007 Apa Essay Format Example Thatsnotus
007 Apa Essay Format Example Thatsnotus007 Apa Essay Format Example Thatsnotus
007 Apa Essay Format Example Thatsnotus
 
Writing Paper - Etsy
Writing Paper - EtsyWriting Paper - Etsy
Writing Paper - Etsy
 
Research Paper For Cheap, Papers Online Essay
Research Paper For Cheap, Papers Online EssayResearch Paper For Cheap, Papers Online Essay
Research Paper For Cheap, Papers Online Essay
 
Law Essay Example CustomEssayMeister.Com
Law Essay Example CustomEssayMeister.ComLaw Essay Example CustomEssayMeister.Com
Law Essay Example CustomEssayMeister.Com
 
Good Introductions For Research Papers. How To
Good Introductions For Research Papers. How ToGood Introductions For Research Papers. How To
Good Introductions For Research Papers. How To
 
How To Write Better Essays - Ebooksz
How To Write Better Essays - EbookszHow To Write Better Essays - Ebooksz
How To Write Better Essays - Ebooksz
 
Handwriting Without Tears Worksheets Free Printable Fr
Handwriting Without Tears Worksheets Free Printable FrHandwriting Without Tears Worksheets Free Printable Fr
Handwriting Without Tears Worksheets Free Printable Fr
 
Inspirational Quotes For Writers. QuotesGram
Inspirational Quotes For Writers. QuotesGramInspirational Quotes For Writers. QuotesGram
Inspirational Quotes For Writers. QuotesGram
 
What Is Abstract In Research. H
What Is Abstract In Research. HWhat Is Abstract In Research. H
What Is Abstract In Research. H
 
Anecdotes Are Commonly Use
Anecdotes Are Commonly UseAnecdotes Are Commonly Use
Anecdotes Are Commonly Use
 
Synthesis Journal Example. Synthesis Examples. 2022-1
Synthesis Journal Example. Synthesis Examples. 2022-1Synthesis Journal Example. Synthesis Examples. 2022-1
Synthesis Journal Example. Synthesis Examples. 2022-1
 

Recently uploaded

Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)Dr. Mazin Mohamed alkathiri
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 

Recently uploaded (20)

Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAĐĄY_INDEX-DM_23-1-final-eng.pdf
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 

Big Data Analysis Using R Studio

  • 1. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [116] AN ANALYTICAL REVIEW STUDY ON BIG DATA ANALYSIS USING R STUDIO Anita Kumari 1, Neeraj Verma 2 1 MTech Scholar, Dept of Computer Science & Engineering PPIMT, Hisar (Haryana), India 2 Assistant Professor, Dept of Computer Science & Engineering PPIMT, Hisar (Haryana), India Abstract: A larger amount of data gives a better output but also working with it can become a challenge due to processing limitations. Nowadays companies are starting to realize the importance of using more data in order to support decision for their strategies. It was said and proved through study cases that “More data usually beats better algorithms”. With this statement companies started to realize that they can chose to invest more in processing larger sets of data rather than investing in expensive algorithms. During the last decade, large statistics evaluation has seen an exponential boom and will absolutely retain to witness outstanding tendencies due to the emergence of new interactive multimedia packages and extraordinarily incorporated systems driven via the speedy growth in statistics services and microelectronic gadgets. Up to now, maximum of the modern mobile structures are especially centered to voice communications with low transmission fees. Keywords: Huge Statistics; Big Data Analysis; R Studio. Cite This Article: Anita Kumari, and Neeraj Verma. (2019). “AN ANALYTICAL REVIEW STUDY ON BIG DATA ANALYSIS USING R STUDIO.” International Journal of Engineering Technologies and Management Research, 6(6), 116-122. DOI: 10.5281/zenodo.3266146. 1. Introduction We can associate the importance of Big Data and Big Data Analysis with the society that we live in. Today we are living in an Informational Society and we are moving towards a Knowledge Based Society. In order to extract better knowledge we need a bigger amount of data. The Society of Information is a society where information plays a major role in the economic, cultural and political stage. [2] In the Knowledge society the competitive advantage is gained through understanding the information and predicting the evolution of facts based on data. The same happens with Big Data. Every organization needs to collect a large set of data in order to support its decision and extract correlations through data analysis as a basis for decisions. [3] In this article we will define the concept of Big Data, its importance and different perspectives on its use. In addition we will stress the importance of Big Data Analysis and show how the analysis of Big Data will improve decisions in the future.
  • 2. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [117] 2. Big Data Concept The term “Big Data” was first introduced to the computing world by Roger Magoulas from O’Reilly media in 2005 in order to define a great amount of data that traditional data management techniques cannot manage and process due to the complexity and size of this data. [7] A study on the Evolution of Big Data as a Research and Scientific Topic shows that the term “Big Data” was present in research starting with 1970s but has been The emerging large records science time period, displaying its broader effect on our society and in our enterprise lifestyles cycle, has insightful transformed our society and could hold to draw numerous attentions from technical specialists and in addition to public in standard [1] [2]. It is apparent that we're residing in huge records era, proven with the aid of the sheer quantity of data from a spread of sources and its growing rate of era. as an example, an IDC file predicts that, from 2005 to 2020, the global statistics dimensions will grow with the aid of a issue of three hundred, from a hundred thirty Exabyte’s to 40,000 Exabyte’s, representing a double increase each two years. this is specializes in on hand big-records structures that include a set of equipment and technique to load, extract, and enhance numerous information while leveraging the immensely parallel processing energy to carry out complicated transformations and evaluation. “large-information” machine faces a chain of technical challenges, along with: First, because of the large variety of different records sources and the massive extent, its miles too tough to acquire, combine and evaluation of “huge statistics” with scalability from scattered places. Second “huge facts” structures want to manage, shop and integrate the amassed massive and varied verity of datasets, whilst provide function and performance warranty [1], in terms of rapid retrieval, scalability and secrecy safety. Third “large information” analytics have to efficaciously excavation large datasets at one of kind degrees in actual time or close to real time - which includes modeling, visualization [2], prediction and optimization - such that inherent potentials can be discovered to improve choice making and collect similarly advantages. To address these challenges, the researcher IT industry and community has given various solutions for “Big Data” science systems in an ad-hoc manner. Cloud computing can be called as the substructure layer for “Big Data” systems to meet certain substructure requirements, such as cost-effectiveness, resistance [2], and the ability to scale up or down. Distributed file systems and No SQL databases are suitable for persistent storage and the management of massive scheme free datasets [1]. Map Reduce, R is a programming framework, has achieved great success in processing “Big Data” group-aggregation tasks, such as website ranking [10]. RStudio integrates data storage, data processing, system management, and other modules to form a powerful system-level solution, which is becoming the mainstay in handling “Big Data” challenges. We can build various “Big Data” application system based on these innovative technologies and platforms. In light of the of big-data technologies, a systematic frame work should be in order to capture the fast evolution of big-data research.
  • 3. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [118] 2.1. Big Data Problem and Challenges However, considering variety of data sets in “Big Data” problems, it is still a big challenge for us to purpose efficient representation, access, and analysis of shapeless or semi-structured data in the further researches [12]. How can the data be preprocessed in order to improve the quality of data and analysis results before we begin data analysis [1] [2]? As the sizes of dataset are often very large, sometimes several gigabytes or more, and their origin from varied sources, current real-world databases are pitilessly susceptible to inconsistent, incomplete, and noisy data. Therefore, a number of data preprocessing techniques, including data cleaning [11], data integration, data transformation and date reduction, can be applied to remove noise and correct irregularities. Different challenges arise in each sub-process when it comes to data-driven applications. 2.2. Principles for designing Big Data System In designing “Big Data” analytics systems, we summarize seven necessary principles to guide the development of this kind of burning issues [3]. “Big Data” analytics in a highly distributed system cannot be achievable without the following principles [13]: 1) Good architectures and frameworks are necessary and on the top priority. 2) Support a variety of analytical methods 3) No size fits all 4) Bring the analysis to data 5) Processing must be distributable for in-memory computation. 6) Data storage must be distributable for in-memory storage. 7) Coordination is needed between processing and data units. 2.3. Big Data Opportunities The bonds between “Big Data” and knowledge hidden in it are highly crucial in all areas of national priority. This initiative will also lay the groundwork for complementary “Big Data” activities, such as “Big Data” substructure projects, platforms development, and techniques in settling complex, data-driven problems in sciences and engineering. Researchers, policy and decision makers have to recognize the potential of harnessing “Big Data” to uncover the next wave of growth in their fields. There are many advantages in business section that can be obtained through harnessing “Big Data” increasing operational efficiency, informing strategic direction, developing better customer service, identifying and developing new products and services, identifying new customers and markets, etc. [10] 2.4. Big Data Analysis The last and most important stage of the “Big Data” value chain is data analysis, the goal of which is to get useful values, suggest best conclusions and support decision-making system of an organization to stay in competition market. [1] Descriptive Analytics: exploits historical data to describe what occurred in past. For instance, a regression technique may be used to find simple trends in the datasets, visualization presents data
  • 4. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [119] in a meaningful fashion, and data modeling is used to collect, store and cut the data in an efficient way. Descriptive analytics is typically associated with business intelligence or visibility systems [2]. Predictive Analytics: focuses on predicting future probabilities and trends. For example, predictive modeling uses statistical techniques [6] such as linear and logistic regression to understand trends and predict future out-comes, and data mining extracts patterns to provide insight and forecasts [4]. Prescriptive Analytics: addresses decision making and efficiency. For example, simulation is used to analyze complex systems to gain insight into system performance and identify issues and optimization techniques are used to find best solutions under given constraints. 3. Literature Review Shweta Pandey et al. in 2014 shows that Big Data is having challenges related to volume, velocity and variety. Big Data has 3Vs Volume means large amount of data, Velocity means data arrives at high speed, Variety means data comes from heterogeneous resources. In Big Data definition, Big means a dataset which makes data concept to grow so much that it becomes difficult to manage it by using existing data management concepts and tools. Map Reduce is playing a very significant role in processing of Big Data. The main objective of this paper is purposed a tool like Map Reduce is elastic scalable, efficient and fault tolerant for analyzing a large set of data, highlights the features of Map Reduce in comparison of other design model which makes it popular tool for processing large scale data. HAN HU and YONGGANG WEN in 2014 shows that Recent technological advancements have led to a deluge of data from distinctive domains e.g., health care and scientific sensors, user- generated data, Internet and financial companies, and supply chain systems. Over the past two decades. The term big data was coined to capture the meaning of this emerging trend. In addition to its sheer volume, big data also exhibits other unique characteristics as compared with traditional data. For instance, big data is commonly unstructured and require more real-time analysis. This development calls for new system architectures for data acquisition, transmission, storage, and large-scale data processing mechanisms. In this paper, we present a literature survey and system tutorial for big data analytics platforms, aiming to provide an overall picture for non- expert readers and instill a do-it-yourself spirit for advanced audiences to customize their own big-data solutions. Katarina et al. in 2014 discussed challenges for Map Reduce in Big Data. In the Big Data community, Map Reduce has been seen as one of the key enabling approaches for meeting the continuously increasing demands on computing resources imposed by massive data sets. At the same time, Map Reduce faces a number of obstacles when dealing with Big Data including the lack of a high-level language such as SQL, challenges in implementing iterative algorithms, support for iterative ad-hoc data exploration, and stream processing. The identified Map Reduce challenges are grouped into four main categories corresponding to Big Data tasks types: data storage, analytics, online processing, security and privacy. The main objective of this paper is identifies Map Reduce issues and challenges in handling Big Data with the objective of
  • 5. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [120] providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. Zhen Jia1, Jianfeng Zhan, Lei Wang, Rui Han, Sally A. McKee, Qiang Yang, Chunjie Luo, and Jingwei Li in 2014 discussed Characterizing and Subsetting Big Data Workloads. A large number of benchmarks pose great challenges, since our usual simulation-based research methods become prohibitively expensive. They use hardware performance counters to analyze micro architectural behaviors of those scale out workloads. They compare the scale-out workloads and traditional benchmarks to identify the key contributors to the micro architecture inefficiency on modern processors. They conclude that mismatches exist between the needs of scale-out workloads and the capabilities of modern processors. Much work focuses on comparing the performance of different data management systems. For OLTP or database systems evaluation, TPC-C is often used to evaluate transaction-processing system performance in terms of transactions per minute. Cooper defines a core set of benchmarks and report throughput and latency results for five widely used data management systems. 3.1. Big Data Tools: Techniques and Technologies To capture the value from “Big Data”, we need to develop new techniques and technologies for analyzing it. Until now, scientists have developed a wide variety of techniques and technologies to capture, curate, analyze and visualize Big Data. We need tools (platforms) to make sense of “Big Data”. Current tools concentrate on three classes, namely, batch processing tools, stream processing tools, and interactive analysis tools. Most batch processing tools are based on the Apache RStudio infrastructure, such as Map reduce [4], R Programming and Dryad. The interactive analysis processes the data in an interactive environment, allowing users to undertake their own analysis of information. 3.2. R Programming The R language is well mounted as the language for doing data, facts evaluation, information- mining algorithm improvement, stock buying and selling, credit score danger scoring, marketplace basket evaluation and all [9] way of predictive analytics. However, given the deluge of information that must be processed and analyzed nowadays, many groups had been reticent about deploying R past studies into production programs. [16] 3.3. Comparisons of Classification for Big Data Science To apply different classification technique I have chosen a real dataset about the student’s knowledge status about the subject of Electrical DC Machines. Distribution of every numeric variable can be checked with function summary (), which returns the minimum, maximum, mean, median, and the first (25%) and third (75%) quartiles. For factors (or categorical variables), it shows the frequency of every level.
  • 6. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [121] 3.4. The importance of Big Data The main importance of Big Data consists in the potential to improve efficiency in the context of use a large volume of data, of different type. If Big Data is defined properly and used accordingly, organizations can get a better view on their business therefore leading to efficiency in different areas like sales, improving the manufactured product and so forth. [2] Big Data can be used effectively in the following areas:  In information technology in order to improve security and troubleshooting by analyzing the patterns in the existing logs;  In customer service by using information from call centers in order to get the customer pattern and thus enhance customer satisfaction by customizing services;  In improving services and products through the use of social media content. By knowing the potential customers preferences the company can modify its product in order to address a larger area of people;  In the detection of fraud in the online transactions for any industry;  In risk assessment by analyzing information from the transactions on the financial market 4. Conclusions The year 2012 is the year when companies are starting to orient themselves towards the use of Big Data. That is why this articol presents the Big Data concept and the R technologies associated in order to understand better the multiple benefices of this new concept ant technology. In the future we propose for our research to further investigate the practical advantages that can be gain through R Studio. References [1] C.L. Philip Chen, Chun-Yang Zhang, “Data intensive applications, challenges, techniques and technologies: A survey on Big Data” Information Science 0020-0255 (2014), PP 341-347, elsevier [2] Han hu1At. Al. (Fellow, IEEE),” Toward Scalable Systems for Big Data Analytics: A Technology Tutorial”, IEEE 2169-3536(2014),PP 652-687 [3] Shweta Pandey, Dr.VrindaTokekar, “Prominence of MapReduce in BIG DATA Processing”, IEEE (Fourth International Conference on Communication Systems and Network Technologies)978-1-4799-3070-8/14, PP 555-560 [4] Katarina Grolinger At. Al. “Challenges for MapReduce in Big Data”, IEEE (10th World Congress on Services)978-1-4799-5069-0/14,PP 182-189 [5] Zhen Jia1 At. Al. “Characterizing and Subsetting Big Data Workloads", IEEE 978-1-4799-6454- 3/14, PP 191-201 [6] AvitaKatal, Mohammad Wazid, R H Goudar, “Big Data: Issues, Challenges, Tools and Good Practices”, IEEE 978-1-4799-0192-0/13,PP 404-409 [7] Du Zhang,” Inconsistencies in Big Data”, IEEE 978-1-4799-0783-0/13, PP 61-67
  • 7. [Kumari et. al., Vol.6 (Iss.6): June 2019] ISSN: 2454-1907 DOI: 10.5281/zenodo.3266146 Http://www.ijetmr.com©International Journal of Engineering Technologies and Management Research [122] [8] ZibinZheng, Jieming Zhu, and Michael R. Lyu, “Service-generated Big Data and Big Data-as-a- Service: An Overview”, IEEE (International Congress on Big Data) 978-0-7695-5006-0/13, PP 403-410 [9] VigneshPrajapati, Big Data Analytics with R and RStudioPackt Publishing [10] Lei Wang At. Al., “BigDataBench: aBigData Benchmark Suite from InternetServices”, IEEE 978-1-4799-3097-5/14. [11] AnirudhKadadi At. Al., “Challenges of Data Integration and Interoperability in Big Data”, IEEE (International Conference on Big Data)978-1-4799-5666-1/14, PP 38-40 [12] SAS, Five big data challenges and how to overcome them with visual analytics [13] HajarMousanif At. Al., “From Big Data to Big Projects: a Step-by-step Roadmap”, IEEE (International Conference on Future Internet of Things and Cloud) 978-1-4799-4357-9/14, PP 373-378 [14] Tianbo Lu At. Al., “Next Big Thing in Big Data: The Security of the ICT Supply Chain”, IEEE (SocialCom/PASSAT/BigData/EconCom/BioMedCom) 978-0-7695-5137-1/13, PP 1066-1073 [15] Ganapathy Mani, NimaBarit, Duoduo Liao, Simon Berkovich, “Organization of Knowledge Extraction from Big Data Systems”, IEEE (4 Fifth International Conference on Computing for Geospatial Research and Application) 978-1-4799-4321-0/14, PP 63-69 [16] Joseph Rickert, “Big Data Analysis with Revolution R Enterprise”, 2011 [17] Carson Kai-Sang Leung, Richard Kyle MacKinnon, Fan Jiang, “Reducing the Search Space for Big Data Mining for Interesting Patterns from Uncertain Data”, IEEE 2014, PP 315-322 [18] Ajith Abraham1, Swagatam Das2, and Sandip Roy3, “Swarm Intelligence Algorithms for Data Clustering”, PP 280-313 [19] Swagatam Das, Ajith Abraham, Senior Member, IEEE, and Amit Konar, “Automatic Clustering Using an Improved Differential Evolution Algorithm”, IEEE 2008, PP 218-237 [20] KarthikKambatla, GiorgosKollias, Vipin Kumar, AnanthGrama, “J. Parallel Distrib. Comput”, Elsevier 2014, PP 2561-2573 [21] Yanchang Zhao, “R and Data Mining: Examples and Case Studies”, www.RDataMining.com,2014 [22] H. T. Kahraman, Sagiroglu, S., Colak,“User Knowledge Modeling Data Set”, UCI, vol. 37, pp. 283-295, 2013 [23] Mrigank Mridul, Akashdeep Khajuria, Snehasish Dutta, Kumar N, “Analysis of Bidgata using Apache RStudio and Map”, Volume 4, Issue 5, May 2014 Reduce, PP. 555-560. [24] Sonja Pravilovic,” R language in data mining techniques and statistics”, 20130201.12,2013 [25] Vrushali Y Kulkarni,” Random Forest Classifiers: A Survey and Future Research Directions”, International Journal of Advanced Computing, ISSN: 2051-0845, Vol.36, Issue.1, April 2013 [26] Aditya Krishna Menon,” Large-Scale Support Vector Machines: Algorithms and Theory”.