Big Data
Shankar Radhakrishnan
Topics

• Data Management Today


• New Interests, Expectations, Problems


• Big Data


• New Approach


• Big Data Ecosystem


• Q&A
Data Management Today

• Relational Databases


  • Oracle, MySQL, MS-SQL Server


• Data Warehouse Appliances


  • Teradata, IBM-Netezza


• Legacy Systems


  • Mainframes
New Interests, Expectations

• Collect More, Data-Mine More

• Complex Data Integration

• Advanced Analytics

• Social Data Analysis

• Machine Data Analysis

• Realtime Data Analysis

• Actionable Insights

• Extension of Investments

• Talent Management

• ROI

• TCO

• Business Continuity
How Big is Data?




Facts (as of Oct 2012)

• 90% of the world’s data was created in the last two years

• $214 is the average amount companies have to spend per compromised customer when a data breach occurs

• 2.7bn is the average number of “likes” and “comments” posted on Facebook daily

• 247bn e-mail messages are sent each day… about 80% of them are spam

• It would take 2,000 hours to watch all the YouTube videos uploaded while we’re talking on this panel*

• 500,000+ data centers across the world are large enough to fill 5,955 football fields

*this is 3x more than just 2 short years ago

New Problems

• Unpredictable Volume

• Data Processing Issues

• Data Integration Issues

• Identifying Source-of-Truth

• Store vs. Analyze

• Data Retrieval Requirements

• Computing Limitations

• Information vs. Insights

• Business Requirements

• Regulatory Requirements

• True Value-of-Data

• Price to Performance Dilemma
What is Big Data?


• Volume

  • Very large data sets
  • Sizes from 100 TB to 50 PB
  • Larger than “one machine”
  • Whole data set analysis replaces “sampling”

• Velocity

  • Real-time streaming data
  • High volume / Low latency
    • Write heavy
    • Read heavy
    • Both are common

• Variety

  • Structured data
    • OLTP
    • DW
    • ODS
    • Data marts
  • Unstructured data
    • Text
    • Audio
    • Video
    • Click streams
    • Log files

• Complexity

  • Data acquisition
  • Analysis
  • Deriving insights

Source: Ventana Research
New Approach

• Commodity Hardware


  • Open Compute Project


• Open Source Solutions, Frameworks


  • Value-Added Products – Cloudera, DataStax, 10gen


• Research Oriented Product Development


• Augmented Ecosystem
Big Data : Ecosystem


(Layered diagram, listed top to bottom)

• Advanced Analytics: Predictive & Optimization Modeling, Business Processes Analysis, Functional Analysis

  • R, Splunk, Madlib, Mahout, SAS Big Data Visual Analytics

• Advanced Visualizations: Data Delivery - Dashboards, Scorecard (Strategy Maps), Spatial & Temporal Analysis

  • Tableau, SpotFire, Datameer

• BI / Reporting: Data Engineering - Performance Reporting, Enterprise Metrics; Data Agility - Data Mining, OLAP Modeling etc.

  • Pig, Hive, Lucene, Karmasphere, other BI tools with Hadoop connectors

• Data Storage and Processing: Data Storage, Data Processing

  • Cassandra, Crunch, Pangool, HDFS, HBase, MapReduce

• Data Integration & Management: Data Filtering, Data Consolidation & Warehousing, Data Quality, Metadata Management, Job Scheduling, Data Economics

  • Flume, Scribe, Avro, Sqoop, Chukwa, Zookeeper, Oozie, native Hadoop ETL, traditional ETL with Hadoop connectors

• Distributed Infrastructure

• Side labels: Data Analytics, Data Delivery, Data Visualization, Data Engineering, Data Agility, Data Consolidation, Data Economics, Integration

• Legend: Hadoop components, open source Hadoop platforms, 3rd party Hadoop supporting platforms
What can Big Data do that traditional data warehousing and analytics cannot?

• Traditional DW: Complete records from known transactional systems.
  Big Data: Data from many different internal & external sources with unknown quality and/or utility.

• Traditional DW: Data is structured, and data fields have known (and often complex) interrelationships.
  Big Data: Loosely structured data. Flat schemas with few complex interrelationships; connections between data elements have to be probabilistically inferred.

• Traditional DW: Multiple terabytes of data.
  Big Data: Multiple petabytes of data.

• Traditional DW: Mostly scale-up architecture.
  Big Data: Scale-out architecture.

• Traditional DW: Analytics run on a stable data model.
  Big Data: The analytic models are larger and require very large amounts of hardware resources to process them in a timely manner.

• Traditional DW: Low performance/cost ratio, as most of the software/hardware platforms are proprietary and license based.
  Big Data: High performance/cost ratio, as most of the software/hardware platforms are commodity, free, open source.
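The scale-up versus scale-out contrast above can be illustrated with a minimal sketch. It is not taken from the slides: the function names (`partition`, `local_totals`, `scale_out_totals`) are hypothetical, and threads in one process stand in for separate commodity machines. Records are split across "nodes" by a hash of their key, each node aggregates its own shard, and the partial results are merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def partition(records, num_nodes):
    """Assign each (key, value) record to a shard by hashing its key."""
    shards = [[] for _ in range(num_nodes)]
    for key, value in records:
        # crc32 gives a hash that is stable across runs (unlike built-in hash()).
        shards[crc32(key.encode("utf-8")) % num_nodes].append((key, value))
    return shards


def local_totals(shard):
    """Work one node does on its own shard: sum the values per key."""
    totals = Counter()
    for key, value in shard:
        totals[key] += value
    return totals


def scale_out_totals(records, num_nodes=4):
    """Partition the data, aggregate shards in parallel, merge the partials."""
    shards = partition(records, num_nodes)
    # Assumption: worker threads stand in for independent machines.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_totals, shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


# Illustrative data only.
records = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 1)]
result = scale_out_totals(records, num_nodes=3)
print(result)
```

In this sketch, adding capacity means raising `num_nodes` rather than buying a larger single machine; the merged result does not depend on how many shards were used.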


What can Big Data do that traditional data warehousing and analytics cannot?

• Traditional DW: Aggregate data (structured).
  Big Data: Raw data (structured and unstructured).

• Traditional DW: Aggregate / segment analytics.
  Big Data: Individual-level analytics, micro segmentation, individualized offers to customers.

• Traditional DW: Mainstream analytics (structured analysis, OLAP cubes).
  Big Data: Outlier analytics, pattern discovery, simulation and modeling, machine learning.

• Traditional DW: Sample data is used for identifying patterns.
  Big Data: Entire population of granular data can be leveraged.

• Traditional DW: Reports & dashboards are done on a production basis.
  Big Data: Real-time operational analytics and reporting. Intra-day decision making.

• Traditional DW: Traditional models are good for small amounts of data due to time constraints.
  Big Data: Big models: computationally intensive analyses, simulations, models with many parameters.
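The sampling versus whole-population row, together with the outlier analytics row, can be made concrete with a small sketch. Everything in it is an assumption for illustration: the data is synthetic, the two extreme values are injected by hand, and `find_outliers` with its threshold is a hypothetical helper, not a method from the slides. It runs the same outlier test on a random sample and on the full data set.

```python
import random
import statistics


def find_outliers(values, threshold=8.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) > threshold * stdev]


rng = random.Random(42)

# Synthetic "population": 100,000 ordinary readings around 100.
population = [rng.gauss(100.0, 10.0) for _ in range(100_000)]

# Two rare events injected by hand (assumed values, for illustration only).
population[12_345] = 500.0
population[67_890] = 750.0

# A 1% random sample, as a sampling-based analysis might use.
sample = rng.sample(population, 1_000)

sample_hits = find_outliers(sample)
full_scan_hits = find_outliers(population)

print("outliers found in 1% sample:", len(sample_hits))
print("outliers found in full scan:", len(full_scan_hits))
```

The full scan is guaranteed to see both injected events. A 1% sample includes any given record only about 1% of the time, so rare events are usually missed.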




Q&A




Thank You!

Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
 
Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
 
SWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptxSWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptx
 
skeleton System.pdf (skeleton system wow)
skeleton System.pdf (skeleton system wow)skeleton System.pdf (skeleton system wow)
skeleton System.pdf (skeleton system wow)
 
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptxRESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
 
Nutrition Inc FY 2024, 4 - Hour Training
Nutrition Inc FY 2024, 4 - Hour TrainingNutrition Inc FY 2024, 4 - Hour Training
Nutrition Inc FY 2024, 4 - Hour Training
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
Electric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger HuntElectric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger Hunt
 
Bossa N’ Roll Records by Ismael Vazquez.
Bossa N’ Roll Records by Ismael Vazquez.Bossa N’ Roll Records by Ismael Vazquez.
Bossa N’ Roll Records by Ismael Vazquez.
 
Pharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brubPharmaceutics Pharmaceuticals best of brub
Pharmaceutics Pharmaceuticals best of brub
 
Haunted Houses by H W Longfellow for class 10
Haunted Houses by H W Longfellow for class 10Haunted Houses by H W Longfellow for class 10
Haunted Houses by H W Longfellow for class 10
 
Juneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School DistrictJuneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School District
 

Kurukshetra - Big Data

  • 2. Topics
      • Data Management Today
      • New Interests, Expectations, Problems
      • Big Data
      • New Approach
      • Big Data Ecosystem
      • Q&A
  • 3. Data Management Today
      • Relational Databases
          • Oracle, MySQL, MS-SQL Server
      • Data warehouse Appliances
          • Teradata, IBM-Netezza
      • Legacy Systems
          • Mainframes
  • 4. New Interests, Expectations
      • New interests
          • Collect More, Data-Mine More
          • Complex Data Integration
          • Advanced Analytics
          • Social Data Analysis
          • Machine Data Analysis
          • Realtime Data Analysis
      • Expectations
          • Actionable Insights
          • Extension of Investments
          • Talent Management
          • ROI
          • TCO
          • Business Continuity
  • 5. How Big is Data? BIG Facts (as of Oct 2012)
      • 90% of the world’s data was created in the last two years
      • $214 is the average amount companies have to spend per compromised customer when a data breach occurs
      • 2.7bn: average number of “likes” and “comments” posted on Facebook daily
      • 247bn e-mail messages are sent each day; about 80% of them are spam
      • It would take 2,000 hours to watch all the YouTube videos uploaded while we’re talking on this panel* (*this is 3x more than just 2 short years ago)
      • 500,000+ data centers across the world are large enough to fill 5,955 football fields
  • 6. New Problems
      • Unpredictable Volume
      • Data Processing Issues
      • Data Integration Issues
      • Identifying Source-of-Truth
      • Store vs. Analyze
      • Data Retrieval Requirements
      • Computing Limitations
      • Information vs. Insights
      • Business Requirements
      • Regulatory Requirements
      • True Value-of-Data
      • Price to Performance Dilemma
  • 7. What is Big Data? (Source: Ventana Research)
      • Volume
          • Very large data sets
          • Sizes from 100 TB to 50 PB
          • Larger than “one machine”
          • Whole data set analysis replaces “sampling”
      • Velocity
          • Real-time data streaming
          • High volume / Low latency
          • Write heavy
          • Read heavy
          • Both are common
      • Variety
          • Structured data: OLTP, DW, ODS, Data marts
          • Unstructured data: Text, Audio, Video, Click streams, Log files
      • Complexity
          • Data acquisition
          • Analysis
          • Deriving insights
  • 8. New Approach
      • Commodity Hardware
      • Open Computing Project
      • Open Source Solutions, Frameworks
      • Value Added Products: Cloudera, Datastax, 10gen
      • Research Oriented Product Development
      • Augmented Ecosystem
  • 9. Big Data: Ecosystem (layered diagram; only the labels survive)
      • Layers
          • Advanced Analytics: Predictive & Optimization Modeling, Business Processes Analysis, Functional Analysis
          • Data Delivery: Dashboards, Scorecard (Strategy Maps), Spatial & Temporal Analysis
          • Data Engineering: Performance Reporting, Enterprise Metrics
          • Data Agility: Data Mining, OLAP Modeling etc.
          • Data Storage and Processing: Data Storage, Data Processing
          • Data Integration & Management: Data Filtering, Data Consolidation & Warehousing, Data Quality, Metadata Management, Job Scheduling
          • Distributed Infrastructure
      • Other labels in the diagram: Data Analytics, Visual Analytics, Advanced Visualizations, Data Visualization, BI / Reporting, Data Consolidation, Data Economics, Integration
      • Tools named: R, Splunk, SAS, Madlib, Mahout, Tableau, SpotFire, Datameer, Pig, Hive, Lucene, Karmasphere, Cassandra, Crunch, Pangool, HDFS, HBase, Mapreduce, Flume, Scribe, Avro, Sqoop, Chukwa, Zookeeper, Oozie, Native Hadoop ETL, Traditional ETL with Hadoop connectors, Other BI Tools with Hadoop connectors
      • Legend: Hadoop components; Open source Hadoop platforms; 3rd party Hadoop supporting platforms
  • 10. What can Big Data do that traditional data warehousing and analytics cannot?
      • Traditional DW: Complete records from known transactional systems.
        Big Data: Data from many different internal & external sources with unknown quality and/or utility.
      • Traditional DW: Data is structured, and data fields have known (and often complex) interrelationships.
        Big Data: Loosely structured data. Flat schemas with few complex interrelationships; connections between data elements have to be probabilistically inferred.
      • Traditional DW: Multi Terabytes of Data
        Big Data: Multi Petabytes of Data
      • Traditional DW: Mostly Scale Up Architecture
        Big Data: Scale Out Architecture
      • Traditional DW: Analytics run on a stable data model.
        Big Data: The analytic models are larger and require very large amounts of hardware resources to process them in a timely manner.
      • Traditional DW: Low Performance/Cost ratio, as most of the software/hardware platforms are proprietary and license based.
        Big Data: High Performance/Cost ratio, as most of the software/hardware platforms are commodity, free, open source.
  • 11. What can Big Data do that traditional data warehousing and analytics cannot? (continued)
      • Traditional DW: Aggregate data (structured)
        Big Data: Raw Data (structured and unstructured)
      • Traditional DW: Aggregate / Segment analytics
        Big Data: Individual level analytics, Micro segmentation, individualized offers to customers
      • Traditional DW: Mainstream analytics: structured analysis, OLAP cubes
        Big Data: Outlier analytics, Pattern discovery, Simulation and modeling, Machine learning
      • Traditional DW: Sample data is used for identifying patterns
        Big Data: Entire population of granular data can be leveraged
      • Traditional DW: Reports & Dashboards are done on a production basis
        Big Data: Real-time operational analytics and reporting. Intra-day decision making.
      • Traditional DW: Traditional models good for small amounts of data due to time constraints
        Big Data: Big Models: Computationally intensive analyses, simulations, models with many parameters
  • 12. Q&A. Thank You!