Big Data Cloud Meetup Big Data & Cloud Computing - Help, Educate & Demystify. June 3rd 2011
Kitenga. Mark Davis, CTO. June 3rd 2011 Meetup. Unlocking Big Data through Analytics and Search
Big Data: Enormous transactional data. Enormous unstructured information. Too big for databases. New tools are needed.
Unstructured data explosion. Multimedia Content: Imagery, Audio, Video, Sensor Streams, Biometric data, 3D. Text: Email, Documents, Web pages, Tweets, Posts. <5% Structured. Enterprise Data: Data warehouse, CDRs, Financial records, Access logs.
Big Data: Trillions of user interactions/transactions == Big Data. >100M, <10M, <1M. Open source, MySQL, PHP. Data warehousing, Parallel SQL, Big hardware. NoSQL, Hadoop/MapReduce, HBase/Hive. Emerging technologies. Traditional (DBMS-based) solutions.
The Structured/Unstructured Chasm. SQL, RDBMS, Transactional Data, BI Tools. Search, Documents, Text Classification, Taxonomies, Ontologies.
Unstructured Analytics: Surfacing Metadata
Information Extraction. Machine Learning. Finite State Transducer (three shown). Parts-of-Speech Tagging, Lemmatization, Tokenization.
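The stages named on this slide (tokenization, lemmatization, parts-of-speech tagging) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the slide names finite state transducers, whereas the sketch below swaps in plain dictionary lookups, and the lexicons `LEMMAS` and `POS_TAGS` and the function names are hypothetical, not taken from the talk.

```python
import re

# Words, digit runs, or single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")

# Hypothetical toy lexicons; a real system would use trained models
# or compiled finite state transducers instead of dictionaries.
LEMMAS = {"analysts": "analyst", "feeds": "feed", "searched": "search"}
POS_TAGS = {"analyst": "NOUN", "feed": "NOUN", "search": "VERB"}


def tokenize(text):
    """Split raw text into word, number, and punctuation tokens."""
    return TOKEN_RE.findall(text)


def lemmatize(token):
    """Map a token to its base form, falling back to the lowercased token."""
    lowered = token.lower()
    return LEMMAS.get(lowered, lowered)


def tag(lemma):
    """Assign a coarse part-of-speech tag; 'X' marks unknown items."""
    if lemma.isdigit():
        return "NUM"
    return POS_TAGS.get(lemma, "X")


def analyze(text):
    """Run the three stages in order and return (token, lemma, tag) rows."""
    rows = []
    for token in tokenize(text):
        lemma = lemmatize(token)
        rows.append((token, lemma, tag(lemma)))
    return rows
```

Calling `analyze("Analysts searched 200 feeds.")` yields one row per token; the rows are the kind of metadata that a downstream extraction or machine-learning step could consume.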
Search + Analytics: Resource Integration, Facet Browsing, Facet Charting, Autosuggest, Spellcheck, Query Language, Indexing, Metadata Extraction.
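To make the link between indexing, facet browsing, and autosuggest concrete, here is a minimal in-memory sketch. It assumes whitespace tokenization and AND-only queries; the class `TinyIndex` and its methods are hypothetical names for illustration and do not reflect the API of Solr, Lucene, or any Kitenga product.

```python
from collections import Counter, defaultdict


class TinyIndex:
    """A toy inverted index with per-document metadata for faceting."""

    def __init__(self):
        self.docs = {}                    # doc_id -> metadata dict
        self.postings = defaultdict(set)  # term -> set of doc_ids

    def add(self, doc_id, text, **metadata):
        """Index a document's terms and store its extracted metadata."""
        self.docs[doc_id] = metadata
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.postings.get(term, set())
        return hits

    def facets(self, hits, field):
        """Count the values of one metadata field across a result set."""
        return Counter(
            self.docs[d][field] for d in hits if field in self.docs[d]
        )

    def suggest(self, prefix, limit=5):
        """Offer indexed terms that start with the typed prefix."""
        prefix = prefix.lower()
        return sorted(t for t in self.postings if t.startswith(prefix))[:limit]
```

The facet counts are what a facet browser or facet chart would display next to the hit list, which is one simple way search and analytics end up sharing a single index.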
Defense Intelligence: Analyst support staff needs to convert raw data into actionable intelligence. Named Entity Extraction, Image tagging, Video analytics, Linkage Analysis, Network Visualization, Search. Improve Force Effectiveness. Hadoop/MapReduce, GPUs, HDFS, HBase, Solr. Situation Reports, Geo-tagged Imagery. US Army, Navy, DHS, NSA.
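One common way to approach linkage analysis after named entity extraction is to count how often two entities appear in the same document. The sketch below assumes the entities have already been extracted per report; the function name `link_entities` and the sample names are hypothetical, and the talk does not say which linkage method was actually used.

```python
from collections import Counter
from itertools import combinations


def link_entities(docs_entities):
    """Count co-occurrences of entity pairs across documents.

    docs_entities: iterable of entity lists, one list per document.
    Returns a Counter keyed by alphabetically ordered (a, b) pairs.
    """
    edges = Counter()
    for entities in docs_entities:
        # De-duplicate within a document so each pair counts once per document.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical entities extracted from three reports.
reports = [
    ["Alice", "Depot 7"],
    ["Alice", "Bob", "Depot 7"],
    ["Bob"],
]
edges = link_entities(reports)
```

The weighted pairs in `edges` form the edge list of a graph, which is the input a network visualization would need.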
CASE STUDY: US ARMY. The Solution: >200 data feeds, <0.5s queries, Fast analysis cycles, Machine Learning Analytics, Biometrics, Linkage Analysis, Face recognition, Video tagging, Collaborative systems. Analysis Bottlenecks: 200 data feeds, Unacceptable response time, Analysts avoid complete searches, Basic entity extraction, Slow analysis cycles, Distribution by PowerPoint. Enabling technologies: GPU clouds, Hadoop/MapReduce, Katta, Lucene, NoSQL, HBase. Enabling Technologies: Oracle and custom thick clients.
Pharma Bioinformatics: Increase speed of drug discovery. Biological Named Entity Extraction, Author Name Extraction and Normalization, Linkage Analysis, Timelines, Faceted Search. ZettaVox. Faster Discovery. Hadoop/MapReduce, HDFS, HBase, GPUs, Solr. Patents, Genetic Sequence Data, Journal Articles.
PharmaTreemap
Demo
Summary: Big Data spans unstructured and structured data. Effective tools for managing both depend on understanding their differences and similarities. Bridging the chasm between them means merging search and analytics.
Questions?
Contact Info: mark@kitenga.com, http://www.kitenga.com. Kitenga, Inc., 2953 Bunker Hill Lane, Suite 400, Santa Clara, CA 95054. 1-(408)-462-KITE, 1-(253)-541-6799 (FAX)











Unlocking Big Data through Analytics and Search - Big Data Cloud - June 3 Meetup

  • 1. Big Data Cloud Meetup Big Data & Cloud Computing - Help, Educate & Demystify. June 3rd 2011
  • 2. Kitenga, Mark Davis CTO June 3rd 2011 Meetup Unlocking Big Data through Analytics and Search
  • 3. Big Data Enormous transactional data Enormous unstructured information Too big for databases New tools are needed
  • 4. Unstructured data explosion Multimedia Content Text Imagery Audio Video Sensor Streams Biometric data 3D Text Email Documents Web pages Tweets Posts <5% Structured Enterprise Data Data warehouse CDRs Financial records Access logs
  • 5. Big Data Trillions of user interactions/transactions == Big Data >100M <10M <1M Open source MySQL PHP Data warehousing Parallel SQL Big hardware NoSQL Hadoop/MapReduce Hbase/HIVE Emerging technologies Traditional (DBMS-based) solutions
  • 6. The Structured/Unstructured Chasm SQL RDBMS Transactional Data BI Tools Search Documents Text Classification Taxonomies Ontologies
  • 8. Information Extraction Machine-Learning Finite State Transducer Finite State Transducer Finite State Transducer Parts-of-Speech Tagging Lemmatization Tokenization
  • 9. Search + Analytics Resource Integration Facet Browsing Facet Charting Autosuggest Spellcheck Query Language Indexing Metadata Extraction
  • 10. Defense Intelligence Analyst support staff needs to convert raw data into actionable intelligence Named Entity Extraction Image tagging Video analytics Linkage Analysis Network Visualization Search Improve Force Effectiveness Hadoop/MapReduce, GPUs, HDFS, Hbase, SOLR Situation Reports Geo-tagged Imagery US Army Navy DHS NSA
  • 11. CASE STUDY: US ARMY The Solution >200 data feeds <0.5s queries Fast analysis cycles Machine Learning Analytics Biometrics Linkage Analysis Face recognition Video tagging Collaborative systems Analysis Bottlenecks 200 data feeds Unacceptable response time Analysts avoid complete searches Basic entity extraction Slow analysis cycles Distribution by PowerPoint Enabling technologies: GPU clouds, Hadoop/MapReduce, Katta, Lucene, NoSQL, Hbase Enabling Technologies: Oracle and custom thick clients
  • 12. Pharma Bioinformatics Increase speed of drug discovery Biological Named Entity Extraction Author Name Extraction and Normalization Linkage Analysis Timelines Faceted Search ZettaVox Faster Discovery Hadoop/MapReduce, HDFS, Hbase, GPUs, SOLR Patents Genetic Sequence Data Journal Articles
  • 16. Demo
  • 18. Summary: Big Data spans unstructured and structured data. Effective tools for managing both involve understanding the differences and similarities of both. Bridging the chasm between them means merging search and analytics together.
  • 20. Contact Info: mark@kitenga.com http://www.kitenga.com Kitenga, Inc. 2953 Bunker Hill Lane, Suite 400 Santa Clara, CA 95054 1-(408)-462-KITE 1-(253)-541-6799 (FAX)