Submitted to:
Mahabub Haque(Sohel)
Lecturer,
Dept. of CSE
University of Chittagong
Submitted by:
Aseem Chakrabarthy
THESIS STATEMENT
DATA MINING AND BIG DATA
Preliminary Knowledge about Data Mining & Big Data
Data Mining
1. What is Data Mining?
2. Why Data Mining?
3. Application Of Data Mining
4. Steps Of Data Mining
5. Data Warehousing
6. Data Mining Tools
Big Data
1. What is Big Data?
2. Why Big Data?
3. What Type Of Data is “Big Data”?
4. Problems Associated with “BIG DATA”
5. Characteristics Of Big Data
What is Data Mining?
Data mining (sometimes called data or knowledge discovery) is the
process of analyzing data from different perspectives and
summarizing it into useful information - information that can be used
to increase revenue, cut costs, or both. Data mining software is one
of a number of analytical tools for analyzing data. It allows users to
analyze data from many different dimensions or angles, categorize it,
and summarize the relationships identified. Technically, data mining is
the process of finding correlations or patterns among dozens of fields
in large relational databases.
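As a minimal sketch of "finding correlations among fields", the Python below computes the Pearson correlation for every pair of fields in a small table and reports the strongest one. The records and field names (`age`, `income`, `purchases`) are made up for illustration and are not from the presentation; real data mining tools use far richer techniques than a single correlation measure.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical customer records standing in for rows of a relational table.
records = [
    {"age": 25, "income": 30, "purchases": 8},
    {"age": 35, "income": 42, "purchases": 5},
    {"age": 45, "income": 50, "purchases": 4},
    {"age": 55, "income": 65, "purchases": 1},
]


def strongest_correlation(rows):
    """Return (|r|, field_a, field_b) for the most strongly correlated pair."""
    fields = list(rows[0])
    columns = {f: [row[f] for row in rows] for f in fields}
    scored = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for a, b in combinations(fields, 2)
    ]
    return max(scored)


if __name__ == "__main__":
    strength, field_a, field_b = strongest_correlation(records)
    print(f"{field_a} and {field_b}: |r| = {strength:.3f}")
```

With dozens of fields the number of pairs grows quickly, which is one reason dedicated tools are used instead of checking pairs by hand.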
Why Data Mining?
=>I can’t find the data I need
=>I can’t get the data I need
=>I can’t understand the data I found
=>I can’t use the data I found
=>Data explosion problem:
Advanced data collection tools and database technology lead to
tremendous amounts of data stored in databases.
Application Of Data Mining
Industry              Application
Finance               Credit card analysis
Insurance             Claims, fraud analysis
Telecommunication     Call record analysis
Transport             Logistics management
Consumer goods        Promotion analysis
Scientific Research   Image, video, speech
Utilities             Power usage analysis
Steps Of Data Mining
=>Data integration
=>Data selection
=>Data cleaning
=>Data transformation
=>Data mining
=>Pattern evaluation
=>Knowledge presentation
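The steps above can be sketched as a small pipeline, one function per step, in the same order as the list. This is only an illustration under stated assumptions: the two data sources, the field names, and the "count how often each item is bought" mining step are all hypothetical and are not taken from the presentation.

```python
from collections import Counter

# Hypothetical sources; in practice these would be separate databases or files.
store_sales = [
    {"customer": "a", "item": "Milk ", "amount": 3.0},
    {"customer": "b", "item": "bread", "amount": None},  # missing value
]
web_sales = [
    {"customer": "a", "item": "milk", "amount": 2.5},
    {"customer": "c", "item": "eggs", "amount": 4.0},
    {"customer": "d", "item": "milk", "amount": 3.5},
]


def integrate(*sources):
    """Data integration: combine rows from several sources into one list."""
    return [dict(row) for source in sources for row in source]


def select(rows, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: row.get(f) for f in fields} for row in rows]


def clean(rows):
    """Data cleaning: drop rows that contain missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def transform(rows):
    """Data transformation: normalise item names to a common form."""
    return [{**row, "item": row["item"].strip().lower()} for row in rows]


def mine(rows):
    """Data mining: count how often each item is bought."""
    return Counter(row["item"] for row in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least min_count times."""
    return {item: n for item, n in patterns.items() if n >= min_count}


def present(patterns):
    """Knowledge presentation: turn the patterns into readable lines."""
    return [f"{item}: bought {n} times" for item, n in sorted(patterns.items())]


def run_pipeline():
    rows = integrate(store_sales, web_sales)
    rows = select(rows, ["item", "amount"])
    rows = clean(rows)
    rows = transform(rows)
    return present(evaluate(mine(rows), min_count=2))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Note that the mining step only counts "Milk " and "milk" as the same item because the transformation step ran first, which is why the preparation steps come before the mining itself.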
Data Warehousing:
=>A data warehouse is a subject-oriented, integrated,
time-variant, and non-volatile collection of data in
support of management’s decision-making process.
=>Data warehousing is the process of constructing and
using a data warehouse. A data warehouse is
constructed by integrating data from multiple
heterogeneous sources to support analytical
reporting, structured and/or ad hoc queries, and
decision making. Data warehousing involves data
cleaning, data integration, and data consolidation.
Data Mining Tools:
=>Microsoft SQL Server 2005
=>Microsoft SQL Server 2008
=>Oracle Data Mining
=>DBMiner
What is Big Data?
=>‘Big Data’ is similar to ‘small data’ but bigger in size.
Big data is a term used to represent a collection of data sets that are
so large and complex that they are difficult to process using
traditional tools.
=>Walmart handles more than 1 million customer transactions every hour.
=>Facebook handles 40 billion photos from its user base.
Why Big Data?
=>The growth of Big Data is driven by:
->Increase of storage capacities
->Increase of processing power
->Availability of data (different data types)
=>Every day we create 2.5 quintillion bytes of data.
=>Facebook generates 10 TB of data daily.
=>Twitter generates 7 TB of data daily.
=>IBM claims 90% of today’s stored data was generated in just the last
two years.
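To put these daily volumes in perspective, the short Python sketch below converts them into per-second rates. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes, one quintillion = 10^18) and a data rate spread evenly over a 24-hour day, neither of which is stated in the figures above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Decimal (SI) units; an assumption, since the figures do not say which they use.
MB = 10**6
TB = 10**12
QUINTILLION = 10**18


def bytes_per_second(bytes_per_day):
    """Average rate in bytes per second for a given daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


# Daily volumes quoted in the list above.
daily_volumes = {
    "Worldwide": 25 * QUINTILLION // 10,  # 2.5 quintillion bytes
    "Facebook": 10 * TB,
    "Twitter": 7 * TB,
}

if __name__ == "__main__":
    for name, volume in daily_volumes.items():
        rate = bytes_per_second(volume)
        print(f"{name}: {rate / MB:,.1f} MB per second")
```

Under these assumptions, 10 TB per day works out to roughly 116 MB every second, and 2.5 quintillion bytes per day to roughly 29 TB every second.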
What Type Of Data is “Big Data”?
=>Social networking habits
=>Location of cell phone usage
=>Shopping habits
=>Search habits
=>Online interests
=>Many, many more
Problems Associated with “BIG DATA”
=>Physical problems
=>Processing problems
=>Big business vs. small business
=>Incorrect data
Characteristics Of Big Data:
=>Three characteristics of Big Data (the 3 Vs):
->Volume (Data Quantity)
->Velocity (Data Speed)
->Variety (Data Types)
Thank you
