SlideShare a Scribd company logo
1 of 37
Download to read offline
Essential Tools For
Your Big Data Arsenal
Matt Asay (@mjasay)
VP, Business Development & Strategy, MongoDB
The Big Data Unknown
Top Big Data Challenges?
Translation?
Most struggle
to know what
Big Data
is, how to
manage it and
who can
manage it

Source: Gartner
3
Understanding Big Data – It’s Not Very “Big”

64% - Ingest
diverse, new data in
real-time
15% - More than 100TB
of data
20% - Less than 100TB
(average of all? <20TB)
from Big Data Executive Summary – 50+ top executives from Government and F500 firms

4
Innovation As Iteration
“I have not failed. I've just found 10,000 ways that won't work.”
― Thomas A. Edison
Back in 1970…Cars Were Great!

7
So Were Computers!

8
Lots of Great Innovations Since 1970

9
Including the Relational Database

10
RDBMS Makes Development Hard

Code

DB Schema

Application

11

XML Config

Object Relational
Mapping

Relational
Database
And Even Harder To Iterate
New
Table

New
Column

New
Table
Name

Pet

Phone

New
Column

3 months later…

12

Email
From Complexity to Simplicity
RDBMS

MongoDB

{

_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: “C",
benefits : [
{

type :

"Health",

plan : "PPO Plus" },
{

type :

"Dental",

plan : "Standard" }
]
}

13
So…Use Open Source

14
Big Data != Big Upfront Payment

15
RDBMS Is Expensive To Scale

“Clients can also opt to run zEC12 without a raised
datacenter floor -- a first for high-end IBM mainframes.”
IBM Press Release 28 Aug, 2012

16
Spoiled for choice
DB-Engines.com Database Ranking
1 Oracle
2 MySQL
3 Microsoft SQL Server
4 PostgreSQL
5 DB2
6 MongoDB
7 Microsoft Access
8 SQLite
9 Sybase
10 Teradata

17

Relational DBMS
1583.84
Relational DBMS
1331.34
Relational DBMS
1207
Relational DBMS
177.01
Relational DBMS
175.83
NoSQL Document Store 149.48
Relational DBMS
142.49
Relational DBMS
77.88
Relational DBMS
73.66
Relational DBMS
54.41

54.23
25.58
-106.78
-5.22
3.58
-2.71
-4.21
-4.9
-1.68
3.32
Remember the Long Tail?

18
It Didn’t Work Out So Well

19
Use Popular, Well-Known Technologies

20

Source: Silicon Angle, 2012
Ask the Right Questions…

“Organizations already have people who know
their own data better than mystical data
scientists….Learning Hadoop [or MongoDB] is
easier than learning the company’s business.”
(Gartner, 2012)

21
Leverage Existing Skills

22
Search as a Sign?

23
When To Use Hadoop, NoSQL
25

Applications
CRM, ERP, Collaboration, Mobile, BI

Data Management
Online Data
RDBMS
RDBMS

Offline Data
Hadoop

Infrastructure
OS & Virtualization, Compute, Storage, Network

EDW

Security & Auditing

Management & Monitoring

Enterprise Big Data Stack
Consideration – Online vs. Offline
Online

• Real-time
• Low-latency
• High availability
26

vs.

Offline

• Long-running
• High-Latency
• Availability is lower priority
Consideration – Online vs. Offline
Online

27

vs.

Offline
Hadoop Is Good for…

Risk Modeling

Recommendation
Engine

Ad Targeting

Transaction
Analysis

Trade
Surveillance

Network Failure
Prediction

28

Churn Analysis

Search Quality

Data Lake
MongoDB/NoSQL Is Good for…

360° View of the
Customer

Fraud Detection

User Data
Management

Content
Management &
Delivery

Reference Data

Product Catalogs

29

Mobile & Social
Apps

Machine to
Machine Apps

Data Hub
How To Use The Two Together?
Finding Waldo

31
Customer example: Online Travel

Travel

Algorithms
MongoDB
Connector for
Hadoop

•
•
•
•

32

Flights, hotels and cars
Real-time offers
User profiles, reviews
User metadata (previous
purchases, clicks, views)

•
•
•
•

User segmentation
Offer recommendation engine
Ad serving engine
Bundling engine
Predictive Analytics

Government

Algorithms

MongoDB
+ Hadoop
• Predictive analytics system
for crime, health issues
• Diverse, unstructured (incl.
geospatial) data from 30+
agencies
• Correlate data in real-time
33

• Long-form trend analysis
• MongoDB data dumped into
Hadoop, analyzed, re-inserted
into MongoDB for better realtime response
Data Hub

Churn
Analysis

Insurance
MongoDB
Connector for
Hadoop

•
•
•
•
•

34

Insurance policies
Demographic data
Customer web data
Call center data
Real-time churn detection

• Customer action analysis
• Churn prediction
algorithms
Machine Learning

Ad-Serving

Algorithms
MongoDB
Connector for
Hadoop

•
•
•
•
•

35

Catalogs and products
User profiles
Clicks
Views
Transactions

• User segmentation
• Recommendation engine
• Prediction engine
MongoDB + Hadoop Connector
• Makes MongoDB a Hadoop-enabled file system
• Read and write to live data, in-place
• Copy data between Hadoop and MongoDB

• Full support for data processing
– Hive
– MapReduce
– Pig
– Streaming
– EMR

36

MongoDB
Connector for
Hadoop
@mjasay

More Related Content

What's hot

IoT ( M2M) - Big Data - Analytics: Emulation and Demonstration
IoT ( M2M) - Big Data - Analytics: Emulation and DemonstrationIoT ( M2M) - Big Data - Analytics: Emulation and Demonstration
IoT ( M2M) - Big Data - Analytics: Emulation and DemonstrationCHAKER ALLAOUI
 
Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Mukul Krishna
 
Key Data Management Requirements for the IoT
Key Data Management Requirements for the IoTKey Data Management Requirements for the IoT
Key Data Management Requirements for the IoTMongoDB
 
Short introduction to Big Data Analytics, the Internet of Things, and their s...
Short introduction to Big Data Analytics, the Internet of Things, and their s...Short introduction to Big Data Analytics, the Internet of Things, and their s...
Short introduction to Big Data Analytics, the Internet of Things, and their s...Andrei Khurshudov
 
Big Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveBig Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveThe_IPA
 
The competitive landscape of the Internet of Things
The competitive landscape of the Internet of ThingsThe competitive landscape of the Internet of Things
The competitive landscape of the Internet of ThingsIoTAnalytics
 
Big data introduction
Big data introductionBig data introduction
Big data introductionvikas samant
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Yaman Hajja, Ph.D.
 
Big data - Key Enablers, Drivers & Challenges
Big data - Key Enablers, Drivers & ChallengesBig data - Key Enablers, Drivers & Challenges
Big data - Key Enablers, Drivers & ChallengesShilpi Sharma
 
Big Data for Development
Big Data for DevelopmentBig Data for Development
Big Data for DevelopmentJoud Khattab
 
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of thingsBig Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of thingsRamakant Gawande
 
Big data (word file)
Big data  (word file)Big data  (word file)
Big data (word file)Shahbaz Anjam
 
Interplay of Big Data and IoT - StampedeCon 2016
Interplay of Big Data and IoT - StampedeCon 2016Interplay of Big Data and IoT - StampedeCon 2016
Interplay of Big Data and IoT - StampedeCon 2016StampedeCon
 
ParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream Inc.
 

What's hot (20)

IoT ( M2M) - Big Data - Analytics: Emulation and Demonstration
IoT ( M2M) - Big Data - Analytics: Emulation and DemonstrationIoT ( M2M) - Big Data - Analytics: Emulation and Demonstration
IoT ( M2M) - Big Data - Analytics: Emulation and Demonstration
 
Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101
 
Iot data analytics
Iot data analyticsIot data analytics
Iot data analytics
 
Key Data Management Requirements for the IoT
Key Data Management Requirements for the IoTKey Data Management Requirements for the IoT
Key Data Management Requirements for the IoT
 
Short introduction to Big Data Analytics, the Internet of Things, and their s...
Short introduction to Big Data Analytics, the Internet of Things, and their s...Short introduction to Big Data Analytics, the Internet of Things, and their s...
Short introduction to Big Data Analytics, the Internet of Things, and their s...
 
Big Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM PerspectiveBig Data and Analytics: The IBM Perspective
Big Data and Analytics: The IBM Perspective
 
The competitive landscape of the Internet of Things
The competitive landscape of the Internet of ThingsThe competitive landscape of the Internet of Things
The competitive landscape of the Internet of Things
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
Big data
Big dataBig data
Big data
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
 
Big data - Key Enablers, Drivers & Challenges
Big data - Key Enablers, Drivers & ChallengesBig data - Key Enablers, Drivers & Challenges
Big data - Key Enablers, Drivers & Challenges
 
Big Data for Development
Big Data for DevelopmentBig Data for Development
Big Data for Development
 
Big data
Big dataBig data
Big data
 
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of thingsBig Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
 
Fraud and Risk in Big Data
Fraud and Risk in Big DataFraud and Risk in Big Data
Fraud and Risk in Big Data
 
Big data (word file)
Big data  (word file)Big data  (word file)
Big data (word file)
 
Interplay of Big Data and IoT - StampedeCon 2016
Interplay of Big Data and IoT - StampedeCon 2016Interplay of Big Data and IoT - StampedeCon 2016
Interplay of Big Data and IoT - StampedeCon 2016
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
IoT Big Data Analytics Insights from Patents
IoT Big Data Analytics Insights from PatentsIoT Big Data Analytics Insights from Patents
IoT Big Data Analytics Insights from Patents
 
ParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream - Big Data for Business Users
ParStream - Big Data for Business Users
 

Viewers also liked

MongoDB for Oracle Experts - OUGF Harmony 2014
MongoDB for Oracle Experts - OUGF Harmony 2014 MongoDB for Oracle Experts - OUGF Harmony 2014
MongoDB for Oracle Experts - OUGF Harmony 2014 Henrik Ingo
 
How We Fixed Our MongoDB Problems
How We Fixed Our MongoDB Problems How We Fixed Our MongoDB Problems
How We Fixed Our MongoDB Problems MongoDB
 
Considerations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectConsiderations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectAkmal Chaudhri
 
Backing Up Data with MMS
Backing Up Data with MMSBacking Up Data with MMS
Backing Up Data with MMSMongoDB
 
GDBinSV_Meetup_DBMS_Trends_10062016
GDBinSV_Meetup_DBMS_Trends_10062016GDBinSV_Meetup_DBMS_Trends_10062016
GDBinSV_Meetup_DBMS_Trends_10062016Joshua Bae
 
Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Rajan Kanitkar
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsChandan Rajah
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysDemi Ben-Ari
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 

Viewers also liked (11)

MongoDB for Oracle Experts - OUGF Harmony 2014
MongoDB for Oracle Experts - OUGF Harmony 2014 MongoDB for Oracle Experts - OUGF Harmony 2014
MongoDB for Oracle Experts - OUGF Harmony 2014
 
How We Fixed Our MongoDB Problems
How We Fixed Our MongoDB Problems How We Fixed Our MongoDB Problems
How We Fixed Our MongoDB Problems
 
Considerations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectConsiderations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT project
 
Backing Up Data with MMS
Backing Up Data with MMSBacking Up Data with MMS
Backing Up Data with MMS
 
GDBinSV_Meetup_DBMS_Trends_10062016
GDBinSV_Meetup_DBMS_Trends_10062016GDBinSV_Meetup_DBMS_Trends_10062016
GDBinSV_Meetup_DBMS_Trends_10062016
 
TOUG Big Data Challenge and Impact
TOUG Big Data Challenge and ImpactTOUG Big Data Challenge and Impact
TOUG Big Data Challenge and Impact
 
Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and Benefits
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
 
QlikView & Big Data
QlikView & Big DataQlikView & Big Data
QlikView & Big Data
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 

Similar to Essential Tools For Your Big Data Arsenal

Big Data for One Big Family
Big Data for One Big FamilyBig Data for One Big Family
Big Data for One Big FamilyMatt Asay
 
The Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesThe Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesBen Siscovick
 
Big data and data mining
Big data and data miningBig data and data mining
Big data and data miningEmran Hossain
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big DataJames Serra
 
Big Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to ForesightBig Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to ForesightSunil Ranka
 
Big data introduction, Hadoop in details
Big data introduction, Hadoop in detailsBig data introduction, Hadoop in details
Big data introduction, Hadoop in detailsMahmoud Yassin
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data PlatformVikas Manoria
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceDavid Feinleib
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big DataSonovate
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Oomph! Recruitment
 
New professional careers in data
New professional careers in dataNew professional careers in data
New professional careers in dataDavid Rostcheck
 
Big Data for the Rest of Us - OpenWest 2014 - Matt Asay
Big Data for the Rest of Us - OpenWest 2014 - Matt AsayBig Data for the Rest of Us - OpenWest 2014 - Matt Asay
Big Data for the Rest of Us - OpenWest 2014 - Matt AsayMatt Asay
 
Building the Cognitive Era : Big Data Strategies
Building the Cognitive Era : Big Data StrategiesBuilding the Cognitive Era : Big Data Strategies
Building the Cognitive Era : Big Data StrategiesKevin Sigliano
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementationSandip Tipayle Patil
 
ch1vsat2k_BDA_Introduction11Jan17-converted.pptx
ch1vsat2k_BDA_Introduction11Jan17-converted.pptxch1vsat2k_BDA_Introduction11Jan17-converted.pptx
ch1vsat2k_BDA_Introduction11Jan17-converted.pptxMrityunjay Emmi
 

Similar to Essential Tools For Your Big Data Arsenal (20)

Big Data for One Big Family
Big Data for One Big FamilyBig Data for One Big Family
Big Data for One Big Family
 
The Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesThe Business of Big Data - IA Ventures
The Business of Big Data - IA Ventures
 
"Big Data Dreams"
"Big Data Dreams""Big Data Dreams"
"Big Data Dreams"
 
A Big Data Concept
A Big Data ConceptA Big Data Concept
A Big Data Concept
 
Big data and data mining
Big data and data miningBig data and data mining
Big data and data mining
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big Data
 
Big Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to ForesightBig Data : From HindSight to Insight to Foresight
Big Data : From HindSight to Insight to Foresight
 
Big data introduction, Hadoop in details
Big data introduction, Hadoop in detailsBig data introduction, Hadoop in details
Big data introduction, Hadoop in details
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data Platform
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 Conference
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big Data
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
 
New professional careers in data
New professional careers in dataNew professional careers in data
New professional careers in data
 
Big Data for the Rest of Us - OpenWest 2014 - Matt Asay
Big Data for the Rest of Us - OpenWest 2014 - Matt AsayBig Data for the Rest of Us - OpenWest 2014 - Matt Asay
Big Data for the Rest of Us - OpenWest 2014 - Matt Asay
 
Building the Cognitive Era : Big Data Strategies
Building the Cognitive Era : Big Data StrategiesBuilding the Cognitive Era : Big Data Strategies
Building the Cognitive Era : Big Data Strategies
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
ch1vsat2k_BDA_Introduction11Jan17-converted.pptx
ch1vsat2k_BDA_Introduction11Jan17-converted.pptxch1vsat2k_BDA_Introduction11Jan17-converted.pptx
ch1vsat2k_BDA_Introduction11Jan17-converted.pptx
 
Big Data
Big DataBig Data
Big Data
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Anchore
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
ServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxshyamraj55
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncObject Automation
 
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHampshireHUG
 
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.Unleashing the power of AI in UiPath Studio with UiPath Autopilot.
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.DianaGray10
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??N.K KooZN
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
CHIPS Alliance_Object Automation Inc_workshop
CHIPS Alliance_Object Automation Inc_workshopCHIPS Alliance_Object Automation Inc_workshop
CHIPS Alliance_Object Automation Inc_workshopObject Automation
 
LLM Threats: Prompt Injections and Jailbreak Attacks
LLM Threats: Prompt Injections and Jailbreak AttacksLLM Threats: Prompt Injections and Jailbreak Attacks
LLM Threats: Prompt Injections and Jailbreak AttacksThien Q. Tran
 
20200723_insight_release_plan
20200723_insight_release_plan20200723_insight_release_plan
20200723_insight_release_planJamie (Taka) Wang
 
Future Research Directions for Augmented Reality
Future Research Directions for Augmented RealityFuture Research Directions for Augmented Reality
Future Research Directions for Augmented RealityMark Billinghurst
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxNoman khan
 
BODYPACK DIGITAL TECHNOLOGY STACK - 2024
BODYPACK DIGITAL TECHNOLOGY STACK - 2024BODYPACK DIGITAL TECHNOLOGY STACK - 2024
BODYPACK DIGITAL TECHNOLOGY STACK - 2024Andri H.
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
IEEE Computer Society 2024 Technology Predictions Update
IEEE Computer Society 2024 Technology Predictions UpdateIEEE Computer Society 2024 Technology Predictions Update
IEEE Computer Society 2024 Technology Predictions UpdateHironori Washizaki
 

Recently uploaded (20)

Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
ServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptx
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation Inc
 
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
 
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.Unleashing the power of AI in UiPath Studio with UiPath Autopilot.
Unleashing the power of AI in UiPath Studio with UiPath Autopilot.
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
CHIPS Alliance_Object Automation Inc_workshop
CHIPS Alliance_Object Automation Inc_workshopCHIPS Alliance_Object Automation Inc_workshop
CHIPS Alliance_Object Automation Inc_workshop
 
LLM Threats: Prompt Injections and Jailbreak Attacks
LLM Threats: Prompt Injections and Jailbreak AttacksLLM Threats: Prompt Injections and Jailbreak Attacks
LLM Threats: Prompt Injections and Jailbreak Attacks
 
20200723_insight_release_plan
20200723_insight_release_plan20200723_insight_release_plan
20200723_insight_release_plan
 
Future Research Directions for Augmented Reality
Future Research Directions for Augmented RealityFuture Research Directions for Augmented Reality
Future Research Directions for Augmented Reality
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docx
 
BODYPACK DIGITAL TECHNOLOGY STACK - 2024
BODYPACK DIGITAL TECHNOLOGY STACK - 2024BODYPACK DIGITAL TECHNOLOGY STACK - 2024
BODYPACK DIGITAL TECHNOLOGY STACK - 2024
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
IEEE Computer Society 2024 Technology Predictions Update
IEEE Computer Society 2024 Technology Predictions UpdateIEEE Computer Society 2024 Technology Predictions Update
IEEE Computer Society 2024 Technology Predictions Update
 

Essential Tools For Your Big Data Arsenal

  • 1. Essential Tools For Your Big Data Arsenal Matt Asay (@mjasay) VP, Business Development & Strategy, MongoDB
  • 2. The Big Data Unknown
  • 3. Top Big Data Challenges? Translation? Most struggle to know what Big Data is, how to manage it and who can manage it Source: Gartner 3
  • 4. Understanding Big Data – It’s Not Very “Big” 64% - Ingest diverse, new data in real-time 15% - More than 100TB of data 20% - Less than 100TB (average of all? <20TB) from Big Data Executive Summary – 50+ top executives from Government and F500 firms 4
  • 6. “I have not failed. I've just found 10,000 ways that won't work.” ― Thomas A. Edison
  • 7. Back in 1970…Cars Were Great! 7
  • 9. Lots of Great Innovations Since 1970 9
  • 11. RDBMS Makes Development Hard Code DB Schema Application 11 XML Config Object Relational Mapping Relational Database
  • 12. And Even Harder To Iterate New Table New Column New Table Name Pet Phone New Column 3 months later… 12 Email
  • 13. From Complexity to Simplicity RDBMS MongoDB { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] } 13
  • 15. Big Data != Big Upfront Payment 15
  • 16. RDBMS Is Expensive To Scale “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” IBM Press Release 28 Aug, 2012 16
  • 17. Spoiled for choice DB-Engines.com Database Ranking 1 Oracle 2 MySQL 3 Microsoft SQL Server 4 PostgreSQL 5 DB2 6 MongoDB 7 Microsoft Access 8 SQLite 9 Sybase 10 Teradata 17 Relational DBMS 1583.84 Relational DBMS 1331.34 Relational DBMS 1207 Relational DBMS 177.01 Relational DBMS 175.83 NoSQL Document Store 149.48 Relational DBMS 142.49 Relational DBMS 77.88 Relational DBMS 73.66 Relational DBMS 54.41 54.23 25.58 -106.78 -5.22 3.58 -2.71 -4.21 -4.9 -1.68 3.32
  • 18. Remember the Long Tail? 18
  • 19. It Didn’t Work Out So Well 19
  • 20. Use Popular, Well-Known Technologies 20 Source: Silicon Angle, 2012
  • 21. Ask the Right Questions… “Organizations already have people who know their own data better than mystical data scientists….Learning Hadoop [or MongoDB] is easier than learning the company’s business.” (Gartner, 2012) 21
  • 23. Search as a Sign? 23
  • 24. When To Use Hadoop, NoSQL
  • 25. 25 Applications CRM, ERP, Collaboration, Mobile, BI Data Management Online Data RDBMS RDBMS Offline Data Hadoop Infrastructure OS & Virtualization, Compute, Storage, Network EDW Security & Auditing Management & Monitoring Enterprise Big Data Stack
  • 26. Consideration – Online vs. Offline Online • Real-time • Low-latency • High availability 26 vs. Offline • Long-running • High-Latency • Availability is lower priority
  • 27. Consideration – Online vs. Offline Online 27 vs. Offline
  • 28. Hadoop Is Good for… Risk Modeling Recommendation Engine Ad Targeting Transaction Analysis Trade Surveillance Network Failure Prediction 28 Churn Analysis Search Quality Data Lake
  • 29. MongoDB/NoSQL Is Good for… 360° View of the Customer Fraud Detection User Data Management Content Management & Delivery Reference Data Product Catalogs 29 Mobile & Social Apps Machine to Machine Apps Data Hub
  • 30. How To Use The Two Together?
  • 32. Customer example: Online Travel Travel Algorithms MongoDB Connector for Hadoop • • • • 32 Flights, hotels and cars Real-time offers User profiles, reviews User metadata (previous purchases, clicks, views) • • • • User segmentation Offer recommendation engine Ad serving engine Bundling engine
  • 33. Predictive Analytics Government Algorithms MongoDB + Hadoop • Predictive analytics system for crime, health issues • Diverse, unstructured (incl. geospatial) data from 30+ agencies • Correlate data in real-time 33 • Long-form trend analysis • MongoDB data dumped into Hadoop, analyzed, re-inserted into MongoDB for better realtime response
  • 34. Data Hub Churn Analysis Insurance MongoDB Connector for Hadoop • • • • • 34 Insurance policies Demographic data Customer web data Call center data Real-time churn detection • Customer action analysis • Churn prediction algorithms
  • 35. Machine Learning Ad-Serving Algorithms MongoDB Connector for Hadoop • • • • • 35 Catalogs and products User profiles Clicks Views Transactions • User segmentation • Recommendation engine • Prediction engine
  • 36. MongoDB + Hadoop Connector • Makes MongoDB a Hadoop-enabled file system • Read and write to live data, in-place • Copy data between Hadoop and MongoDB • Full support for data processing – Hive – MapReduce – Pig – Streaming – EMR 36 MongoDB Connector for Hadoop

Editor's Notes

  1. Big Data is new, and you’re likely going to fail as you start. But it’s almost guaranteed, as well, that you won’t know which data to capture, or how to leverage it, without trial and error. As such, if you were to “design for failure,” what key things would you need? You need to reduce the cost of failure, both in terms of time and money. You’d need to build on data infrastructure that supports your iterations toward success and then rewards you by making it easy and cost effective to scale.
  2. IBM designed IMS with Rockwell and Caterpillar starting in 1966 for the Apollo program. IMS&apos;s challenge was to inventory the very large bill of materials (BOM) for the Saturn V moon rocket and Apollo space vehicle.
  3. Loading a paper tape reader on the KDF9 computer.
  4. IBM designed IMS with Rockwell and Caterpillar starting in 1966 for the Apollo program. IMS&apos;s challenge was to inventory the very large bill of materials (BOM) for the Saturn V moon rocket and Apollo space vehicle.
  5. This is helpful because as much as 95% of enterprise information is unstructured, and doesn’t fit neatly into tidy rows and columns. NoSQL and Hadoop allow for dynamic schema.
  6. The industry is talking about Hadoop and MongoDB for Big Data. So should you
  7. Why not Hbase? MongoDB dramatically more popularMuch easier to useWorks from small scale to large scaleFar closer to the functionality available in RDBMS, including geospatial, secondary indexes, text search40+ languages mean you can work in your preferred programming language
  8. The industry is not betting on RDBMS for Big Data. Neither should you
  9. This is where MongoDB fits into the existing enterprise IT stackMongoDB is an operational data store used for online data, in the same way that Oracle is an operational data store. It supports applications that ingest, store, manage and even analyze data in real-time. (Compared to Hadoop and data warehouses, which are used for offline, batch analytical workloads.)
  10. OrSo not everyone would agree with the term offline big data
  11. What each of these has in common is that they’re retrospective: they’re about looking at the past to help predict the future. The learnings from these Hadoop applications end up being applied by a different technology. This is where MongoDB comes in.
  12. Marketing has been breaking people down into segments (Hadoop - user base as a whole) for a long time, while new marketing needs to focus on individuals (user base as a user). CriteoIf you&apos;re only optimizing on the aggregate data, you&apos;re missing out on the personalization. but i you only do the individual, you&apos;re missing out on patterns across your entire user baseYou want to do both but the systems needed to do this (on a single data set) tend to be architecturally at odds with each other.One dataset. That system lives in two different databases which are taking care of the different processing needs on the dataIn order for Hadoop/column-level processing to be useful, you want to have lots of columns. You need your real-time data store to be very rich, or you&apos;ll lose information. You don&apos;t want to be simplifying that data from the start by putting it into an RDBMS or key-value store. as a specific example, people talk about Hadoop used for log file analysis. Those log files lack a lot of context coming out of your web server for example (IP address, time stamp, etc. but not any real context as to what the log files mean - this is the watch movie button). You can actually have a much richer version of that interaction by keeping that data in a doc db that describes what is actually happening in that log