Data Mining With Big Data
Guide: Prof. Prashant G. Ahire
Presented by :
Miss.Rupa Solapure
Roll no. 259
Agenda
Problem Definition
Objectives
Literature Survey
Architecture/Big Data mining algorithm
Existing System/Mathematical model
Advantages
Disadvantages/Limitations
Characteristics of Big Data
Big Data and its challenges
Big Data mining Tools
Applications of Big Data
References
Problem Definition:
Big Data consists of large-volume, complex, growing data sets with
multiple, autonomous sources. With the fast development of
networking, data storage, and data collection capacity, Big Data are
now rapidly expanding in all science and engineering domains, including
animal, genetic, and biomedical sciences. This paper presents a HACE
theorem that characterizes the features of the Big Data revolution, and
proposes a Big Data processing model from the data mining perspective.
Objective:
Mining Big Data requires carefully designed algorithms to analyze model correlations
between distributed sites and to fuse decisions from multiple sources to obtain the best
model out of the Big Data. Developing a safe and sound information sharing
protocol is a major challenge.
Supporting Big Data mining requires high-performance computing platforms,
which in turn demand systematic designs to unleash the full power of the Big
Data. Big Data is an emerging trend, and the need for Big Data mining is rising in
all science and engineering domains.
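The idea of fusing decisions from multiple sources can be illustrated with a small sketch. The slides do not say which fusion rule is used, so simple majority voting is assumed here; the function name `fuse_decisions` and the example labels are hypothetical.

```python
from collections import Counter


def fuse_decisions(site_predictions):
    """Fuse the class labels predicted by several sites for one record.

    Assumption: plain majority voting. Ties go to the label seen first,
    because Counter.most_common keeps first-encountered order for equal counts.
    """
    if not site_predictions:
        raise ValueError("need at least one site prediction")
    counts = Counter(site_predictions)
    return counts.most_common(1)[0][0]


# Three hypothetical sites vote on the same transaction.
votes = ["fraud", "normal", "fraud"]
print(fuse_decisions(votes))
```

A weighted vote (for example, weighting each site by its validation accuracy) would be a natural refinement, but it needs information that the slides do not provide.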
Literature Survey
Title/Year: “Data Mining With Big Data”, Jan 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a Big Data processing model from the data mining perspective.
Author: Xindong Wu, Fellow, IEEE, Xingquan Zhu, Senior Member, IEEE, Gong-Qing Wu, and Wei Ding

Title/Year: “The Survey of Data Mining Applications And Feature Scope”, June 2012
Keywords: Data mining task, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications.
Concept/Abstract: This paper presents a number of data mining applications and also focuses on the scope of data mining, which will be helpful in further research.
Author: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi

Title/Year: “Review on Data Mining with Big Data”, Dec 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations.
Author: Savita Suryavanshi, Prof. Bharati Kale.

Title/Year: “SURVEY ON BIG DATA MINING PLATFORMS, ALGORITHMS AND CHALLENGES”, Sep 2014
Keywords: big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining.
Concept/Abstract: This paper reviews various big data mining platforms and algorithms and discusses the associated challenges.
Author: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M.
Architecture:
Fig.: Big data Memory evolution
Data Mining Algorithm
 Decision tree induction classification algorithms
 Evolutionary based classification algorithms
 Partitioning based clustering algorithms
 Hierarchical based clustering algorithms
 Model based clustering algorithms
Existing System:
Big Data applications are on the rise: data collection has grown tremendously
and is now beyond the ability of commonly used software tools to capture,
manage, and process within a “tolerable elapsed time.”
The most fundamental challenge for Big Data applications is to explore the large
volumes of data and extract useful information or knowledge for future actions.
In many situations, the knowledge extraction process has to be very efficient and
close to real time because storing all observed data is nearly infeasible.
The unprecedented data volumes require an effective data analysis and prediction
platform to achieve fast response and real-time classification for such Big Data.
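One way to work with a stream when storing every observation is infeasible is to keep a fixed-size uniform sample of it. The sketch below uses reservoir sampling (Algorithm R); this technique is not named in the slides and is offered only as an illustration. The function name `reservoir_sample` and the `seed` parameter are assumptions made for reproducibility.

```python
import random


def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from a stream in one pass.

    Memory use is O(k) regardless of how long the stream is.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Sample 5 readings from a hypothetical stream of 10,000 sensor values.
readings = reservoir_sample(range(10_000), k=5)
print(len(readings))
```

Downstream analysis can then run on the sample, at the cost of some estimation error that shrinks as `k` grows.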
At the model level, each site produces local patterns by mining its local
data.
By sharing these local patterns with other local sites, we can produce a single
global pattern.
At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how
the data sources are correlated with each other, and how to form accurate decisions
based on models built from autonomous sources.
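The step from local patterns to a single global pattern can be sketched as follows. The slides do not specify the pattern type, so frequent single items with a support threshold are assumed; `mine_local_pattern`, `merge_to_global_pattern`, and the sample transactions are hypothetical.

```python
from collections import Counter


def mine_local_pattern(transactions):
    """Run at each site: count how many local transactions contain each item."""
    counts = Counter()
    for items in transactions:
        counts.update(set(items))  # count an item once per transaction
    return counts, len(transactions)


def merge_to_global_pattern(local_results, min_support):
    """Run at a coordinator: combine shared counts into global supports.

    Only the per-site counts are exchanged, never the raw transactions.
    """
    total_counts = Counter()
    total_transactions = 0
    for counts, n in local_results:
        total_counts.update(counts)
        total_transactions += n
    if total_transactions == 0:
        return {}
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }


site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "eggs"], ["milk", "bread"]]
local_results = [mine_local_pattern(site_a), mine_local_pattern(site_b)]
print(merge_to_global_pattern(local_results, min_support=0.5))
```

Sharing counts rather than transactions keeps the transmission cost low and limits how much raw data leaves each site, although counts alone are not a privacy guarantee.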
Big Data
Big Data is a comprehensive term for any collection of data sets so large and complex
that it becomes difficult to process them using conventional data processing applications.
There are two types of Big Data: structured and unstructured.
Structured data
Structured data are numbers and words that can be easily categorized and analyzed.
These data are generated by things like network sensors embedded in electronic
devices, smart phones, and global positioning system (GPS) devices. Structured data
also include things like sales figures, account balances, and transaction data.
Unstructured data
Unstructured data include more complex information, such as customer reviews
from commercial websites, photos and other multimedia, and comments on social
networking sites. These data cannot easily be separated into categories or analyzed
numerically.
Big Data Characteristics (HACE Theorem)
Figure: The blind men and the enormous elephant: the restricted view
of each blind man leads to a biased conclusion.
The HACE theorem suggests that the key characteristics of
Big Data are:
A. Huge with heterogeneous and diverse data sources
B. Autonomous sources with distributed and decentralized control
C. Complex and evolving associations
Applications of Data Mining
Marketing
 Analysis of consumer behaviour
 Advertising campaigns
 Targeted mailings
 Segmentation of customers, stores, or products
Finance
 Creditworthiness of clients
 Performance analysis of finance investments
 Fraud detection
Manufacturing
 Optimization of resources
 Optimization of manufacturing processes
 Product design based on customer requirements
Health Care
 Discovering patterns in X-ray images
 Analyzing side effects of drugs
 Effectiveness of treatments
Big Data Mining Algorithm
Big Data applications gather information from many sources.
 Mining the data centrally would require gathering all distributed data at a
centralized site, but this is prohibitive because of high data transmission costs
and privacy concerns.
Instead, mining is carried out at several levels so that correlations and
patterns can be discovered from the combined variety of sources.
Global data mining is done through a two-step process:
 Model level
 Knowledge level
At the data level, each local site calculates data statistics from its local data
and shares this information with the other sites to obtain the global data
distribution.
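The data-level exchange of statistics can be sketched as below. It is an assumption that each site shares a count, a sum, and a sum of squares for a single numeric attribute; `LocalStats`, `summarize`, and `combine` are hypothetical names. These three values are enough to recover the global mean and variance without moving the raw records.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    count: int
    total: float
    total_sq: float


def summarize(values):
    """Run at each site: reduce local data to three shareable numbers."""
    return LocalStats(len(values), sum(values), sum(v * v for v in values))


def combine(stats_list):
    """Run at a coordinator: derive the global mean and population variance."""
    n = sum(s.count for s in stats_list)
    if n == 0:
        raise ValueError("no observations to combine")
    total = sum(s.total for s in stats_list)
    total_sq = sum(s.total_sq for s in stats_list)
    mean = total / n
    # E[x^2] - mean^2; clamp at zero to absorb floating-point rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, variance


site_a = summarize([1.0, 2.0, 3.0])
site_b = summarize([4.0, 5.0])
mean, variance = combine([site_a, site_b])
print(mean, variance)
```

The sum-of-squares formula is simple but can lose precision when the mean is very large relative to the spread; a pairwise merge of per-site means and squared deviations is the usual remedy in that case.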
Data Mining Challenges With Big Data
Fig. a conceptual view of the Big Data processing framework
DISADVANTAGES OF EXISTING SYSTEM
To explore Big Data, we have analysed several challenges at the
data, model, and system levels.
The challenges at Tier I focus on data accessing and arithmetic
computing procedures. Because Big Data are often stored at
different locations and data volumes may continuously grow, an
effective computing platform will have to take distributed large-
scale data storage into consideration for computing.
PROPOSED SYSTEM
We propose a HACE theorem to model Big Data characteristics. The
characteristics of HACE make it an extreme challenge to
discover useful knowledge from the Big Data.
ADVANTAGES OF PROPOSED SYSTEM
Provides the most relevant and most accurate social sensing feedback to
better understand our society in real time.
Characteristics of Big Data
Fig. Five Vs of BIG DATA
Volume - The quantity of data
Variety - Categorizing the data
Velocity - Speed of generation of data or the speed of processing the data
Variability - Inconsistency
Complexity - Managing the data
BIG Data Mining Tools
Hadoop
Apache S4
Storm
Apache Mahout
MOA
Fig.: Big Data processing
Conclusion:
Because of the increase in the amount of data in fields such as genomics,
meteorology, biology, and environmental research, it becomes difficult to handle
the data, to find associations and patterns, and to analyze the large data sets.
As an organization collects more data at this scale, formalizing the process of
Big Data analysis will become paramount. The paper describes the algorithms
used to handle such large data sets and gives an overview of the
architecture and algorithms used for them.
References
 McKinsey Global Institute, Big Data: The next frontier for
innovation, competition and productivity, May 2011
 Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding,
Data Mining with Big Data, 2013
 Rezwan Ahmed, George Karypis,
Algorithms for mining the evolution of conserved relational states in
dynamic networks, 2012
 IEEE, Data Mining with Big Data, January 2014
 Oracle, Unstructured Data Management with Oracle
Database 12c, June 2013
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 

Data Mining with Big Data Analysis

  • 1. Data Mining With Big Data. Guide: Prof. Prashant G. Ahire. Presented by: Miss Rupa Solapure, Roll no. 259
  • 2. Agenda: Problem Definition; Objectives; Literature Survey; Architecture/Big Data mining algorithm; Existing System/Mathematical model; Advantages; Disadvantages/Limitations; Characteristics of Big Data; Big Data and its challenges; Big Data mining Tools; Applications of Big Data; References
  • 3. Problem Definition: Big Data consists of large-volume, complex, growing data sets with numerous, independent sources. With the fast development of networking, data storage, and data gathering capacity, Big Data are now quickly increasing in all science and engineering domains, as well as animal, genetic and biomedical sciences. This paper elaborates a HACE theorem that states the characteristics of the Big Data revolution, and proposes a Big Data processing model from the data mining view.
  • 4. Objective: Big Data mining requires carefully designed algorithms to analyze model correlations between distributed sites and to fuse decisions from multiple sources into the best possible model. Developing a safe and sound information-sharing protocol is a major challenge. To support Big Data mining, high-performance computing platforms are required, which impose systematic designs to unleash the full power of Big Data. Big Data is an emerging trend, and the need for Big Data mining is rising in all science and engineering domains.
  • 5. Literature Survey (Title/Year; Keywords; Concept/Abstract; Author)
      - "Data Mining With Big Data", Jan 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and a processing model from the data mining view. Authors: Xindong Wu, Fellow, IEEE, Xingquan Zhu, Senior Member, IEEE, Gong-Qing Wu, and Wei Ding.
      - "The Survey of Data Mining Applications And Feature Scope", June 2012. Keywords: data mining task, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications. Concept/Abstract: This paper presents a number of applications of data mining and also focuses on the scope of data mining, which will be helpful in further research. Authors: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi.
      - "Review on Data Mining with Big Data", Dec 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations. Authors: Savita Suryavanshi, Prof. Bharati Kale.
      - "Survey on Big Data Mining Platforms, Algorithms and Challenges", Sep 2014. Keywords: big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining. Concept/Abstract: This paper reviews various big data mining platforms and algorithms and discusses the associated challenges. Authors: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M.
  • 6. Architecture: Fig.: Big data Memory evolution
  • 7. Data Mining Algorithms: decision tree induction classification algorithms; evolutionary-based classification algorithms; partitioning-based clustering algorithms; hierarchical-based clustering algorithms; model-based clustering algorithms
  • 8. Existing System: Big Data applications are on the rise: data collection has grown tremendously and is beyond the ability of commonly used software tools to capture, manage, and process within a “tolerable elapsed time.” The most fundamental challenge for Big Data applications is to explore the large volumes of data and extract useful information or knowledge for future actions. In many situations, the knowledge extraction process has to be very efficient and close to real time because storing all observed data is nearly infeasible. The unprecedented data volumes require an effective data analysis and prediction platform to achieve fast response and real-time classification for such Big Data.
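The point about extracting knowledge close to real time without storing all observed data can be illustrated with a one-pass summary. The following is a minimal sketch, not taken from the slides: it uses Welford's online algorithm to keep a running mean and variance of a stream while retaining only three numbers. The class name `RunningStats` and the sample readings are hypothetical.

```python
class RunningStats:
    """Running count, mean and variance of a stream (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the summary, then discard it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Population variance of everything seen so far."""
        return self.m2 / self.n if self.n else 0.0


# Hypothetical sensor readings arriving one at a time.
stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(reading)

print(stats.n, stats.mean, stats.variance())
```

Memory use stays constant however long the stream runs, which is the property the slide asks of a platform that cannot keep every observation.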
  • 9. At the model level, each site produces local patterns by mining its local data. By sharing these local patterns with other sites, a single global pattern can be produced. At the knowledge level, model correlation analysis investigates the relevance between models generated from various data sources to determine how closely the data sources are correlated with each other, and how to form accurate decisions based on models built from autonomous sources.
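As a rough illustration of producing a global pattern from shared local patterns, the sketch below has each site count co-occurring item pairs in its own transactions and share only those counts. An aggregator then sums them and keeps the pairs whose overall support meets a threshold. The function names, the sample transactions, and the choice of item pairs as the "pattern" are all assumptions made for illustration. Sharing every pair count rather than only locally frequent ones is a simplification that keeps the global counts exact.

```python
from collections import Counter
from itertools import combinations


def mine_local_patterns(transactions, size=2):
    """Count how often each item combination of the given size occurs at one site."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(set(t)), size):
            counts[combo] += 1
    return counts


def merge_patterns(local_results, min_support):
    """Sum the shared counts and keep patterns whose global support meets the threshold."""
    total_counts = Counter()
    total_n = 0
    for counts, n in local_results:
        total_counts.update(counts)
        total_n += n
    return {
        pattern: count / total_n
        for pattern, count in total_counts.items()
        if count / total_n >= min_support
    }


# Hypothetical transaction data held at two autonomous sites.
site_a = [["bread", "milk"], ["bread", "milk", "eggs"], ["milk", "eggs"]]
site_b = [["bread", "milk"], ["bread", "eggs"], ["bread", "milk"]]

# Each site shares only (pattern counts, number of transactions), never raw data.
shared = [(mine_local_patterns(s), len(s)) for s in (site_a, site_b)]
global_patterns = merge_patterns(shared, min_support=0.5)
print(global_patterns)
```

Only the pair ("bread", "milk") survives here, with a global support of 4 out of 6 transactions.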
  • 10. Big Data: Big Data is a comprehensive term for any collection of data sets so large and complex that it becomes difficult to process them using conventional data processing applications. There are two types of Big Data: structured and unstructured. Structured data: Structured data are numbers and words that can be easily categorized and analyzed. These data are generated by things like network sensors embedded in electronic devices, smart phones, and global positioning system (GPS) devices. Structured data also include things like sales figures, account balances, and transaction data. Unstructured data: Unstructured data include more complex information, such as customer reviews from commercial websites, photos and other multimedia, and comments on social networking sites. These data cannot easily be separated into categories or analyzed numerically.
  • 11. Big Data Characteristics (HACE Theorem). Figure: The blind men and the enormous elephant: the restricted view of each blind man leads to a biased conclusion.
  • 12. The HACE theorem suggests that the key characteristics of Big Data are: A. Huge data with heterogeneous and diverse data sources; B. Autonomous sources with distributed and decentralized control; C. Complex and evolving associations
  • 13. Applications of Data Mining
      - Marketing: analysis of consumer behaviour; advertising campaigns; targeted mailings; segmentation of customers, stores, or products
      - Finance: creditworthiness of clients; performance analysis of finance investments; fraud detection
      - Manufacturing: optimization of resources; optimization of manufacturing processes; product design based on customer requirements
      - Health Care: discovering patterns in X-ray images; analyzing side effects of drugs; effectiveness of treatments
  • 14. Big Data Mining Algorithm: Big Data applications gather information from many sources. Mining the data centrally would require gathering all the distributed data at one site, but this is prohibitive because of high data transmission costs and privacy concerns. Instead, correlations and patterns are discovered by combining results from a variety of sources. Global data mining is carried out as a two-step process at the model level and the knowledge level. At the data level, each local site uses its local data to calculate data statistics and shares this information with the other sites to obtain a view of the global data distribution.
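The data-level step, in which each site computes statistics locally and shares them to recover the global data distribution, can be sketched as follows. This is an illustrative assumption rather than the method from the slides: the shared statistic is taken to be a fixed-width histogram, and the function names, bin width, and sample values are hypothetical.

```python
from collections import Counter


def local_histogram(values, bin_width):
    """Summarise one site's values as counts per bin; only this summary is shared."""
    return Counter(int(v // bin_width) * bin_width for v in values)


def global_distribution(histograms):
    """Add up the per-site counts and normalise them to relative frequencies."""
    total = Counter()
    for h in histograms:
        total.update(h)
    n = sum(total.values())
    return {bin_start: count / n for bin_start, count in sorted(total.items())}


# Hypothetical measurements held at two separate sites.
site_1 = [3, 7, 12, 18]
site_2 = [5, 14, 15, 25, 27, 29]

hists = [local_histogram(s, bin_width=10) for s in (site_1, site_2)]
dist = global_distribution(hists)
print(dist)
```

Each site transmits a handful of bin counts instead of its raw records, which addresses both the transmission cost and the privacy concern raised above, at the price of losing detail within each bin.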
  • 15. Data Mining Challenges With Big Data Fig. a conceptual view of the Big Data processing framework
  • 16. DISADVANTAGES OF EXISTING SYSTEM To explore Big Data, we have analysed several challenges at the data, model, and system levels. The challenges at Tier I focus on data accessing and arithmetic computing procedures. Because Big Data are often stored at different locations and data volumes may continuously grow, an effective computing platform will have to take distributed large-scale data storage into consideration for computing.
  • 17. PROPOSED SYSTEM We propose a HACE theorem to model Big Data characteristics. The characteristics of HACE make discovering useful knowledge from Big Data an extreme challenge.
  • 18. ADVANTAGES OF PROPOSED SYSTEM Provides the most relevant and accurate social sensing feedback to better understand our society in real time.
  • 20. Characteristics of Big Data Fig. Five Vs of BIG DATA
  • 21. Volume: the quantity of data. Variety: the categories of data. Velocity: the speed at which data are generated or processed. Variability: inconsistency of the data. Complexity: managing the data.
  • 22. Big Data Mining Tools: Hadoop, Apache S4, Storm, Apache Mahout, MOA
  • 23. Fig.: Big Data processing
  • 24. Conclusion: Because of the increase in the amount of data in fields such as genomics, meteorology, biology, and environmental research, it becomes difficult to handle the data, to find associations and patterns, and to analyze the large data sets. As organizations collect more data at this scale, formalizing the process of Big Data analysis will become paramount. The paper describes different algorithms used to handle such large data sets and gives an overview of the architecture and algorithms used for them.
  • 25. References
      - McKinsey Global Institute, Big Data: The next frontier for innovation, competition and productivity, May 2011
      - Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013, Data Mining with Big Data
      - Rezwan Ahmed, George Karypis, 2012, Algorithms for mining the evolution of conserved relational states in dynamic networks
      - IEEE, Data Mining with Big Data, January 2014
      - Oracle, June 2013, Unstructured Data Management with Oracle Database 12c