SlideShare a Scribd company logo
1 of 3
Download to read offline
33
(IJID) International Journal on Informatics for Development
Vol. 7, No. 1, 2018, Pp. 33-35
A REVIEW PAPER ON BIG DATA AND DATA
MINING
Concepts and Techniques
Prasdika F.B.S 1
, Dr. Bambang Sugiantoro, S.Si., M.T 2
Research Scholar 1
, Faculty Advisor 2
Faculty of Science and Technology UIN Sunan Kalijaga
Yogyakarta, Indonesia
Abstract— In the digital era like today the growth of data in the
database is very rapid, all things related to technology have a
large contribution to data growth as well as social media,
financial technology and scientific data. Therefore, topics such as
big data and data mining are topics that are often discussed. Data
mining is a method of extracting information through from big
data to produce an information pattern or data anomaly.
Keywords-component; data, big data, data mining
The main purpose of data mining is classification or
forecast. In classification, there are a lot of cases that can be
simulated such as in a product ad can be sorted out where the
respondents who are interested in the ad and which are not,
from here can be developed about what is behind someone
interested in the ad. this information is expected to be
predicted through data mining
II. DATA ATRIBUTE
I. INTRODUCTION
Since the digital era began and the internet began to be
used extensively in the early 1990s, it has produced
tremendous amounts of data transactions. Even long before the
internet era of things a few decades earlier several studies had
discussed the data of the washouse, big data and data mining.
Analogously, data mining should have been more
appropriately named "knowledge mining from data," which is
unfortunately somewhat long. However, the shorter term,
knowledge mining may not reflect the emphasis on mining
from large amounts of data. Nevertheless, mining is a vivid
term characterizing the process that finds a small set of
precious nuggets from a great deal of raw material.[1]
Data mining is used to analyze and explore large amounts
of data to find a valuable result from the extraction. there is a
lot of information that can be extracted from a collection of
databases that can be used for several needs, for example a
company engaged in the commercial sector can utilize the
transaction data to find optimal sales patterns. Thus, the
income from a company can be increased through the use of
data mining.
According to the type of data can be divided into :
A. Nominal Attributes
Nominal means “relating to names.” The values of a
nominal attribute are symbols or names of things. Each value
represents some kind of category, code, or state, and so
nominal attributes are also referred to as categorical. The
values do not have any meaningful order. In computer science,
the values are also known as enumerations.
B. Binary Attributes
A binary attribute is a nominal attribute with only two
categories or states: 0 or 1, where 0 typically means that the
attribute is absent, and 1 means that it is present. Binary
attributes are referred to as Boolean if the two states
correspond to true and false.
C. Ordinal Attributes
An ordinal attribute is an attribute with possible values that
have a meaningful order or ranking among them, but the
magnitude between successive values is not known.
III. CONCEPTS
34
(IJID) International Journal on Informatics for Development
Vol. 7, No. 1, 2018
A. Big Data
Big data is data that contains greater variety arriving in
increasing volumes and with ever-higher velocity. This is
known as the three Vs.
• Volume
The amount of data matters. With big data, you’ll
have to process high volumes of low-density,
unstructured data. This can be data of unknown value,
such as Twitter data feeds, clickstreams on a webpage
or a mobile app, or sensor-enabled equipment. For
some organizations, this might be tens of terabytes of
data. For others, it may be hundreds of petabytes.
• Velocity
Velocity is the fast rate at which data is received
and (perhaps) acted on. Normally, the highest velocity
of data streams directly into memory versus being
written to disk. Some internet-enabled smart products
operate in real time or near real time and will require
real-time evaluation and action.
• Variety
Variety refers to the many types of data that are
available. Traditional data types were structured and fit
neatly in a relational database. With the rise of big
data, data comes in new unstructured data types.
Unstructured and semistructured data types, such as
text, audio, and video require additional preprocessing
to derive meaning and support metadata.
B. Data Mining
Data mining is the process of finding anomalies, patterns
and correlations within large data sets to predict outcomes.
IV. ALGORITHM
A. Decision Tree algorithm
Decision Tree algorithm belongs to the family of
supervised learning algorithms. Unlike other supervised
learning algorithms, decision tree algorithm can be used for
solving regression and classification problems too. The general
motive of using Decision Tree is to create a training model
which can use to predict class or value of target variables by
learning decision rules inferred from prior data(training data).
The understanding level of Decision Trees algorithm is so easy
compared with other classification algorithms. The decision
tree algorithm tries to solve the problem, by using tree
representation. Each internal node of the tree corresponds to an
attribute, and each leaf node corresponds to a class label.
B. Genetic Algorithms (Complex Adaptive Systems)
A genetic or evolutionary algorithm applies the principles
of evolution found in nature to the problem of finding an
optimal solution to a Solver problem. In a "genetic algorithm,"
the problem is encoded in a series of bit strings that are
manipulated by the algorithm; in an "evolutionary algorithm,"
the decision variables and problem functions are used directly.
Most commercial Solver products are based on evolutionary
algorithms. An evolutionary algorithm for optimization is
different from "classical" optimization methods in several ways
C. Neural Networks
An artificial neural network is made up of a series of
nodes. Nodes are connected in many ways like the neurons and
axons in the human brain. These nodes are primed in a number
of different ways. Some are limited to certain algorithms and
tasks which they perform exclusively. In most cases, however,
nodes are able to process a variety of algorithms. Nodes are
able to absorb input and produce output. They are also
connected to an artificial learning program. The program can
change inputs as well as the weights for different nodes.
Weighting places more focus on different nodes or
different functions of the same group of nodes. This network
also has a function for displaying and evaluating the output.
Evaluation is at the heart of the neural network and what makes
it intelligent. Many types of computer systems run off of
networks that have different nodes which perform different
functions. However, weighting and artificial evaluation
immensely increase the number of possible functions for the
network.
Once it is established, the machine learning algorithm
must be assigned and implemented. These networks need a
35
(IJID) International Journal on Informatics for Development
Vol. 7, No. 1, 2018
basic type of function to be implemented by an operator. This
function can be one of many different forms such as Apriori or
k-means clustering. The artificial neural network then goes to
work. It absorbs input and sends different tasks to different
nodes. These nodes may work independently or they may
frequently communicate with one another. The nodes then
transfer an output to an end series of nodes.
V. CHALLENGES AND ISSUES
A. Challenges
Efficient and effective data mining in large databases
poses numerous requirements and great challenges to
researchers and developers.
The issues involved include data mining methodology,
user interaction, performance and scalability, and the
processing of a large variety of data types.
Other issues include the exploration of data mining
applications and their social impacts.
B. Issues
• Information poorness
▪
The abundance of data, coupled with the need for
powerful data analysis tools, has been described
as a data rich but information poor situation
▪
Data collected in large data repositories become
“data tombs”
▪
data archives that are seldom visited
• Decision Making
• Important decisions are often made based not on the
information-rich data stored in data repositories,
but rather on a decision maker’s intuition
• The decision maker does not have the tools to
extract the valuable knowledge embedded in the
vast amounts of data
• Data Entry
▪
Often systems rely on users or domain experts to
manually input knowledge into knowledge bases.
▪
Unfortunately, this procedure is prone to biases
and errors, and is extremely time-consuming and
costly
• Bad Name
▪
The term is actually a misnomer.
▪
Mining of gold from rocks or sand is referred to as gold
mining rather than rock or sand mining.
▪
Data mining should have been more appropriately
named “knowledge mining from data” which is
unfortunately somewhat long
VI. CONCLUTION
With very rapid data growth, research in the field of big data
and data mining is still growing rapidly too, those who struggling
in big data will always find an increasement of complexity from
big data. Therefore we conclude that the approach of data mining
algorithms can still be improved.. With
the many problems that still exist now and issues that occur the
possibility is still widely open.
REFERENCES
[1] Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining :
Concepts and Techniques”, ISBN 978-0-12-381479-1, 225
Wyman Street, Waltham, MA 02451, US, 2012
[2] Ian H. Witten, Eibe Frank, “Data Mining Practical Machine
Learning Tools and Techniques”, 500 Sansome Street,
Suite 400, San Francisco, CA 9411, 2015
[3] Bart van der Sloot, Dennis Broeders, Erik Schrijvers,
”Exploring the Boundaries of Big Data”, Amsterdam, 2016
[4] Nikita Jain, Vishal Srivastava, “Data Mining
Techniques: A Survey Paper”, eISSN: 2319-1163 | pISSN:
2321-7308, Rajasthan, India, 2013
[5] Nirmal Kaur, Gurpinder Singh,” A Review Paper On Data
Mining And Big Data”, ISSN No. 0976-5697, Jalandhar,
Punjab, India, 2017

More Related Content

Similar to A Review Paper On Big Data And Data Mining Concepts And Techniques

The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
 
Week-1-Introduction to Data Mining.pptx
Week-1-Introduction to Data Mining.pptxWeek-1-Introduction to Data Mining.pptx
Week-1-Introduction to Data Mining.pptxTake1As
 
20211011112936_PPT01-Introduction to Big Data.pptx
20211011112936_PPT01-Introduction to Big Data.pptx20211011112936_PPT01-Introduction to Big Data.pptx
20211011112936_PPT01-Introduction to Big Data.pptxSyauqiAsyhabira1
 
Data science.chapter-1,2,3
Data science.chapter-1,2,3Data science.chapter-1,2,3
Data science.chapter-1,2,3varshakumar21
 
notes_dmdw_chap1.docx
notes_dmdw_chap1.docxnotes_dmdw_chap1.docx
notes_dmdw_chap1.docxAbshar Fatima
 
Real World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining ToolsReal World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining Toolsijsrd.com
 
Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)NikitaRajbhoj
 
Data Mining in the World of BIG Data-A Survey
Data Mining in the World of BIG Data-A SurveyData Mining in the World of BIG Data-A Survey
Data Mining in the World of BIG Data-A SurveyEditor IJCATR
 
IRJET- A Study on Data Mining in Software
IRJET- A Study on Data Mining in SoftwareIRJET- A Study on Data Mining in Software
IRJET- A Study on Data Mining in SoftwareIRJET Journal
 
data mining and data warehousing
data mining and data warehousingdata mining and data warehousing
data mining and data warehousingSunny Gandhi
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big dataHari Priya
 
An Overview Of The Use Of Neural Networks For Data Mining Tasks
An Overview Of The Use Of Neural Networks For Data Mining TasksAn Overview Of The Use Of Neural Networks For Data Mining Tasks
An Overview Of The Use Of Neural Networks For Data Mining TasksKelly Taylor
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysisPoonam Kshirsagar
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data AnalyticsJOSEPH FRANCIS
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its ApplicationsTracy Hill
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data scienceJohnson Ubah
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfDr. Radhey Shyam
 

Similar to A Review Paper On Big Data And Data Mining Concepts And Techniques (20)

The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope
 
Week-1-Introduction to Data Mining.pptx
Week-1-Introduction to Data Mining.pptxWeek-1-Introduction to Data Mining.pptx
Week-1-Introduction to Data Mining.pptx
 
20211011112936_PPT01-Introduction to Big Data.pptx
20211011112936_PPT01-Introduction to Big Data.pptx20211011112936_PPT01-Introduction to Big Data.pptx
20211011112936_PPT01-Introduction to Big Data.pptx
 
Data science.chapter-1,2,3
Data science.chapter-1,2,3Data science.chapter-1,2,3
Data science.chapter-1,2,3
 
notes_dmdw_chap1.docx
notes_dmdw_chap1.docxnotes_dmdw_chap1.docx
notes_dmdw_chap1.docx
 
Real World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining ToolsReal World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining Tools
 
Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)
 
Data Mining in the World of BIG Data-A Survey
Data Mining in the World of BIG Data-A SurveyData Mining in the World of BIG Data-A Survey
Data Mining in the World of BIG Data-A Survey
 
IRJET- A Study on Data Mining in Software
IRJET- A Study on Data Mining in SoftwareIRJET- A Study on Data Mining in Software
IRJET- A Study on Data Mining in Software
 
Abstract
AbstractAbstract
Abstract
 
data mining and data warehousing
data mining and data warehousingdata mining and data warehousing
data mining and data warehousing
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
 
An Overview Of The Use Of Neural Networks For Data Mining Tasks
An Overview Of The Use Of Neural Networks For Data Mining TasksAn Overview Of The Use Of Neural Networks For Data Mining Tasks
An Overview Of The Use Of Neural Networks For Data Mining Tasks
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysis
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
A Deep Dissertion Of Data Science Related Issues And Its Applications
A Deep Dissertion Of Data Science  Related Issues And Its ApplicationsA Deep Dissertion Of Data Science  Related Issues And Its Applications
A Deep Dissertion Of Data Science Related Issues And Its Applications
 
Data mining
Data miningData mining
Data mining
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Challenges of Big Data Research
Challenges of Big Data ResearchChallenges of Big Data Research
Challenges of Big Data Research
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdf
 

More from Amy Cernava

What Should I Write My College Essay About 15
What Should I Write My College Essay About 15What Should I Write My College Essay About 15
What Should I Write My College Essay About 15Amy Cernava
 
A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.Amy Cernava
 
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.Amy Cernava
 
General Water. Online assignment writing service.
General Water. Online assignment writing service.General Water. Online assignment writing service.
General Water. Online assignment writing service.Amy Cernava
 
Essay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private HiEssay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private HiAmy Cernava
 
How To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - CoverletterpediaHow To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - CoverletterpediaAmy Cernava
 
Punctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters QuotationsPunctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters QuotationsAmy Cernava
 
Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.Amy Cernava
 
Writing Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help AWriting Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help AAmy Cernava
 
Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.Amy Cernava
 
Diversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which CaDiversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which CaAmy Cernava
 
Large Notepad - Heart Border Writing Paper Print
Large Notepad - Heart Border  Writing Paper PrintLarge Notepad - Heart Border  Writing Paper Print
Large Notepad - Heart Border Writing Paper PrintAmy Cernava
 
Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.Amy Cernava
 
Buy College Application Essays Dos And Dont
Buy College Application Essays Dos And DontBuy College Application Essays Dos And Dont
Buy College Application Essays Dos And DontAmy Cernava
 
8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - Sa8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - SaAmy Cernava
 
Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.Amy Cernava
 
Types Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types OTypes Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types OAmy Cernava
 
026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce Myself026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce MyselfAmy Cernava
 
Term Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An IntrTerm Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An IntrAmy Cernava
 
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...Amy Cernava
 

More from Amy Cernava (20)

What Should I Write My College Essay About 15
What Should I Write My College Essay About 15What Should I Write My College Essay About 15
What Should I Write My College Essay About 15
 
A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.
 
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
 
General Water. Online assignment writing service.
General Water. Online assignment writing service.General Water. Online assignment writing service.
General Water. Online assignment writing service.
 
Essay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private HiEssay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private Hi
 
How To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - CoverletterpediaHow To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - Coverletterpedia
 
Punctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters QuotationsPunctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters Quotations
 
Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.
 
Writing Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help AWriting Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help A
 
Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.
 
Diversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which CaDiversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which Ca
 
Large Notepad - Heart Border Writing Paper Print
Large Notepad - Heart Border  Writing Paper PrintLarge Notepad - Heart Border  Writing Paper Print
Large Notepad - Heart Border Writing Paper Print
 
Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.
 
Buy College Application Essays Dos And Dont
Buy College Application Essays Dos And DontBuy College Application Essays Dos And Dont
Buy College Application Essays Dos And Dont
 
8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - Sa8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - Sa
 
Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.
 
Types Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types OTypes Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types O
 
026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce Myself026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce Myself
 
Term Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An IntrTerm Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An Intr
 
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...
 

Recently uploaded

Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........LeaCamillePacle
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 

Recently uploaded (20)

Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 

A Review Paper On Big Data And Data Mining Concepts And Techniques

  • 1. 33 (IJID) International Journal on Informatics for Development Vol. 7, No. 1, 2018, Pp. 33-35 A REVIEW PAPER ON BIG DATA AND DATA MINING Concepts and Techniques Prasdika F.B.S 1 , Dr. Bambang Sugiantoro, S.Si., M.T 2 Research Scholar 1 , Faculty Advisor 2 Faculty of Science and Technology UIN Sunan Kalijaga Yogyakarta, Indonesia Abstract— In the digital era like today the growth of data in the database is very rapid, all things related to technology have a large contribution to data growth as well as social media, financial technology and scientific data. Therefore, topics such as big data and data mining are topics that are often discussed. Data mining is a method of extracting information through from big data to produce an information pattern or data anomaly. Keywords-component; data, big data, data mining The main purpose of data mining is classification or forecast. In classification, there are a lot of cases that can be simulated such as in a product ad can be sorted out where the respondents who are interested in the ad and which are not, from here can be developed about what is behind someone interested in the ad. this information is expected to be predicted through data mining II. DATA ATRIBUTE I. INTRODUCTION Since the digital era began and the internet began to be used extensively in the early 1990s, it has produced tremendous amounts of data transactions. Even long before the internet era of things a few decades earlier several studies had discussed the data of the washouse, big data and data mining. Analogously, data mining should have been more appropriately named "knowledge mining from data," which is unfortunately somewhat long. However, the shorter term, knowledge mining may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is a vivid term characterizing the process that finds a small set of precious nuggets from a great deal of raw material.[1] Data mining is used to analyze and explore large amounts of data to find a valuable result from the extraction. there is a lot of information that can be extracted from a collection of databases that can be used for several needs, for example a company engaged in the commercial sector can utilize the transaction data to find optimal sales patterns. Thus, the income from a company can be increased through the use of data mining. According to the type of data can be divided into : A. Nominal Attributes Nominal means “relating to names.” The values of a nominal attribute are symbols or names of things. Each value represents some kind of category, code, or state, and so nominal attributes are also referred to as categorical. The values do not have any meaningful order. In computer science, the values are also known as enumerations. B. Binary Attributes A binary attribute is a nominal attribute with only two categories or states: 0 or 1, where 0 typically means that the attribute is absent, and 1 means that it is present. Binary attributes are referred to as Boolean if the two states correspond to true and false. C. Ordinal Attributes An ordinal attribute is an attribute with possible values that have a meaningful order or ranking among them, but the magnitude between successive values is not known. III. CONCEPTS
  • 2. 34 (IJID) International Journal on Informatics for Development Vol. 7, No. 1, 2018 A. Big Data Big data is data that contains greater variety arriving in increasing volumes and with ever-higher velocity. This is known as the three Vs. • Volume The amount of data matters. With big data, you’ll have to process high volumes of low-density, unstructured data. This can be data of unknown value, such as Twitter data feeds, clickstreams on a webpage or a mobile app, or sensor-enabled equipment. For some organizations, this might be tens of terabytes of data. For others, it may be hundreds of petabytes. • Velocity Velocity is the fast rate at which data is received and (perhaps) acted on. Normally, the highest velocity of data streams directly into memory versus being written to disk. Some internet-enabled smart products operate in real time or near real time and will require real-time evaluation and action. • Variety Variety refers to the many types of data that are available. Traditional data types were structured and fit neatly in a relational database. With the rise of big data, data comes in new unstructured data types. Unstructured and semistructured data types, such as text, audio, and video require additional preprocessing to derive meaning and support metadata. B. Data Mining Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes. IV. ALGORITHM A. Decision Tree algorithm Decision Tree algorithm belongs to the family of supervised learning algorithms. Unlike other supervised learning algorithms, decision tree algorithm can be used for solving regression and classification problems too. The general motive of using Decision Tree is to create a training model which can use to predict class or value of target variables by learning decision rules inferred from prior data(training data). The understanding level of Decision Trees algorithm is so easy compared with other classification algorithms. The decision tree algorithm tries to solve the problem, by using tree representation. Each internal node of the tree corresponds to an attribute, and each leaf node corresponds to a class label. B. Genetic Algorithms (Complex Adaptive Systems) A genetic or evolutionary algorithm applies the principles of evolution found in nature to the problem of finding an optimal solution to a Solver problem. In a "genetic algorithm," the problem is encoded in a series of bit strings that are manipulated by the algorithm; in an "evolutionary algorithm," the decision variables and problem functions are used directly. Most commercial Solver products are based on evolutionary algorithms. An evolutionary algorithm for optimization is different from "classical" optimization methods in several ways C. Neural Networks An artificial neural network is made up of a series of nodes. Nodes are connected in many ways like the neurons and axons in the human brain. These nodes are primed in a number of different ways. Some are limited to certain algorithms and tasks which they perform exclusively. In most cases, however, nodes are able to process a variety of algorithms. Nodes are able to absorb input and produce output. They are also connected to an artificial learning program. The program can change inputs as well as the weights for different nodes. Weighting places more focus on different nodes or different functions of the same group of nodes. This network also has a function for displaying and evaluating the output. Evaluation is at the heart of the neural network and what makes it intelligent. Many types of computer systems run off of networks that have different nodes which perform different functions. However, weighting and artificial evaluation immensely increase the number of possible functions for the network. Once it is established, the machine learning algorithm must be assigned and implemented. These networks need a
  • 3. 35 (IJID) International Journal on Informatics for Development Vol. 7, No. 1, 2018 basic type of function to be implemented by an operator. This function can be one of many different forms such as Apriori or k-means clustering. The artificial neural network then goes to work. It absorbs input and sends different tasks to different nodes. These nodes may work independently or they may frequently communicate with one another. The nodes then transfer an output to an end series of nodes. V. CHALLENGES AND ISSUES A. Challenges Efficient and effective data mining in large databases poses numerous requirements and great challenges to researchers and developers. The issues involved include data mining methodology, user interaction, performance and scalability, and the processing of a large variety of data types. Other issues include the exploration of data mining applications and their social impacts. B. Issues • Information poorness ▪ The abundance of data, coupled with the need for powerful data analysis tools, has been described as a data rich but information poor situation ▪ Data collected in large data repositories become “data tombs” ▪ data archives that are seldom visited • Decision Making • Important decisions are often made based not on the information-rich data stored in data repositories, but rather on a decision maker’s intuition • The decision maker does not have the tools to extract the valuable knowledge embedded in the vast amounts of data • Data Entry ▪ Often systems rely on users or domain experts to manually input knowledge into knowledge bases. ▪ Unfortunately, this procedure is prone to biases and errors, and is extremely time-consuming and costly • Bad Name ▪ The term is actually a misnomer. ▪ Mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. ▪ Data mining should have been more appropriately named “knowledge mining from data” which is unfortunately somewhat long VI. CONCLUTION With very rapid data growth, research in the field of big data and data mining is still growing rapidly too, those who struggling in big data will always find an increasement of complexity from big data. Therefore we conclude that the approach of data mining algorithms can still be improved.. With the many problems that still exist now and issues that occur the possibility is still widely open. REFERENCES [1] Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining : Concepts and Techniques”, ISBN 978-0-12-381479-1, 225 Wyman Street, Waltham, MA 02451, US, 2012 [2] Ian H. Witten, Eibe Frank, “Data Mining Practical Machine Learning Tools and Techniques”, 500 Sansome Street, Suite 400, San Francisco, CA 9411, 2015 [3] Bart van der Sloot, Dennis Broeders, Erik Schrijvers, ”Exploring the Boundaries of Big Data”, Amsterdam, 2016 [4] Nikita Jain, Vishal Srivastava, “Data Mining Techniques: A Survey Paper”, eISSN: 2319-1163 | pISSN: 2321-7308, Rajasthan, India, 2013 [5] Nirmal Kaur, Gurpinder Singh,” A Review Paper On Data Mining And Big Data”, ISSN No. 0976-5697, Jalandhar, Punjab, India, 2017