Hadoop & MongoDB
Understanding your Big Data
2
MongoDB World
3
Speakers
Jnan Dash
Senior Advisor
jnan.dash@mongodb.com
Kelly Stirman
Director of Products
kelly.stirman@mongodb.com
4
• Last 12 years (2002-Now) - Executive Consultant, on the board
and advisory board of several new software companies
including Big Data players such as MongoDB
• 10 Years (1992-2002) – Oracle, Group Vice President, Systems
Architecture and Technology, responsible for the server product
planning and rollout
• 16 years (1975-1992) – IBM, planner, architect, and
development manager for the DB2 product line at Silicon Valley
Lab and Austin Lab. Head of IBM's database
architecture, strategy, and technology
Jnan Dash
5
• Finally, some real innovation in DBMS
• MongoDB momentum is unprecedented!
• The changing landscape needs MongoDB
– “Internet scale” distributed operations + highly flexible
data model for agile development + open source
• Perfect fit for cloud, mobility, and big data
Why am I excited about MongoDB?
6
• Big Data - Observations
• Evolution of Database Technology
• Hadoop+MongoDB
• Customer Examples
• Roadmap
• Summary
Agenda
7
1. A thousand years ago – Experimental Science
Description of natural phenomena
2. Last few hundred years – Theoretical Science
Newton's Laws, Maxwell's Equations, ...
3. Last few decades – Computational Science
Simulation of complex phenomena
4. Today – Data-intensive Science
Scientists overwhelmed with data deluge
Unify theory, experiment & simulation
The Fourth Paradigm
8
Internet Scale Commercial Supercomputing
• Originated with companies operating at Internet scale (to process
ever-increasing numbers of users and volumes of data)
– Yahoo in the 1990s, then Google, Facebook, Twitter
– They needed to do it quickly and affordably at scale
• Hadoop is the first commercial supercomputing software platform
– Works at scale, affordable at scale
• HPC was used for scientific supercomputing in meteorology and
engineering. Big data is the commercial equivalent of HPC
– Less about equations, more about discovery, patterns
• Many technologies have been around for decades
• Clustering
• Parallel processing
• Distributed file systems
9
Big Data: 3V’s
10
Some Make it 4V’s
11
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- Closer to real-time
12
Big Data – the full spectrum
Transaction Processing
Analytical Processing
Data Mining, Visualization, and Integration Tools
RDBMS, OLAP/DW, DW Appliance
Hadoop, Impala, ...
NoSQL, NewSQL, In-Memory, Stream...
Online/Realtime, Offline/Batch
13
Hadoop Ecosystem
Programming
Languages
Computation
Object Storage
Zookeeper
(Coordination)
Core Apache Hadoop Related Apache Projects
HDFS
(Hadoop Distributed File System)
MapReduce
(Distributed Programming Framework)
Hive
(SQL)
Pig
(Data Flow)
HBase
(Wide Column Storage)
HCatalog
(Meta Data)
HMS
(Management)
Table Storage
Database Technology Evolution
15
Data Management over the years
1960’s
File
Systems
1970’s
1st Generation
DBMS
Data as
Shared Resource
1980’s
Relational
Technology
Ease of Query
1990’s
New data types
OLAP/DW
Web Support
Unstructured Data
2005+
Big Data
Post-PC, Data
Deluge, 3Vs,
NoSQL
16
Operational vs. Analytics
2010
RDBMS
Key-Value/
Wide-column
OLAP/DW
Hadoop
2000
RDBMS
OLAP/DW
1990
RDBMS
Operational
Database
Data warehouse
Document DB
NoSQL
17
MongoDB Features
• JSON Document Model
with Dynamic Schemas
• Auto-Sharding for
Horizontal Scalability
• Text Search
• Aggregation Framework
and MapReduce
• Full, Flexible Index Support
and Rich Queries
• Native Replication for High
Availability
• Advanced Security
• Large Media Storage with
GridFS
18
Documents are Rich Data Structures
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: '+447557505611',
  city: 'London',
  location: [45.123, 47.232],
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Fields can contain an
array of sub-documents
Fields
Typed field
values
Fields can
contain
arrays
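As a rough illustration of the callouts above, the same document can be written as a plain Python dictionary, which is close to what an application would work with. This is only a sketch: the elided fields (…) are left out, and the variable names `person`, `car_models`, and `oldest_car` are hypothetical, not part of the slides.

```python
# Sketch: the slide's example document as a plain Python dict.
# Field names come from the slide; elided fields ("…") are simply omitted.
person = {
    "first_name": "Paul",
    "surname": "Miller",
    "cell": "+447557505611",
    "city": "London",
    "location": [45.123, 47.232],                     # field containing an array
    "Profession": ["banking", "finance", "trader"],   # array of strings
    "cars": [                                         # array of sub-documents
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

# Typed field values: strings stay strings, numbers stay numbers,
# so nested data can be used directly without parsing.
car_models = [car["model"] for car in person["cars"]]
oldest_car = min(person["cars"], key=lambda car: car["year"])

print(car_models)
print(oldest_car["model"])
```

Because the cars live inside the owner's document, reading one person returns the whole structure in a single lookup, with no join against a separate cars table.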
19
Machine Generated Data
20
• Hundreds of thousands of records per second
• Fast response required
• Sometimes all data kept, sometimes just
summary
• Horizontal scalability required
Fast Moving Data
21
• A machine generates a specific kind of data
• The data model is unlikely to change
• But there are so many different machines…
• Queryability across all types
Data is Structured, but Varied…
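One way to picture "structured, but varied" data is a single collection in which every machine type shares a few common fields and adds its own. The sketch below uses plain Python lists and dicts rather than a database; the machine types, field names, and the `find` helper are all hypothetical and chosen only for illustration.

```python
# Sketch: readings from different (hypothetical) machine types share a few
# common fields but each carries its own type-specific fields.
readings = [
    {"machine_type": "turbine", "machine_id": "t-1", "status": "ok", "rpm": 1500},
    {"machine_type": "pump", "machine_id": "p-7", "status": "fault", "pressure_kpa": 310.5},
    {"machine_type": "meter", "machine_id": "m-3", "status": "ok", "kwh": 12.4},
    {"machine_type": "pump", "machine_id": "p-9", "status": "ok", "pressure_kpa": 295.0},
]


def find(docs, **criteria):
    """Return the docs in which every given field is present and equal to the given value."""
    return [
        doc for doc in docs
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Query across all machine types on a shared field...
faulty = find(readings, status="fault")
# ...or narrow to one type, or to a field only some types have.
pumps = find(readings, machine_type="pump")
fast_turbines = find(readings, rpm=1500)

print([doc["machine_id"] for doc in faulty])
```

Each machine's own model stays fixed, yet adding a new machine type needs no schema change: its documents simply carry different fields.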
22
• Event data written multiple times per second,
minute, or hour
• Tracking progression of metrics over time
Time Series Data
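A common way to handle events that arrive many times per second is to group them into one document per source per time bucket, keeping both the raw samples and a running summary. The following is a minimal sketch of that idea in plain Python, assuming per-minute buckets; the sensor name, timestamps, and field names are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw events: (sensor id, UTC timestamp, metric value).
events = [
    ("sensor-1", datetime(2024, 1, 1, 10, 0, 5, tzinfo=timezone.utc), 20.0),
    ("sensor-1", datetime(2024, 1, 1, 10, 0, 35, tzinfo=timezone.utc), 22.0),
    ("sensor-1", datetime(2024, 1, 1, 10, 1, 10, tzinfo=timezone.utc), 21.0),
]


def bucket_by_minute(raw_events):
    """Group events into one summary document per sensor per minute."""
    buckets = defaultdict(lambda: {"count": 0, "total": 0.0, "samples": {}})
    for sensor_id, ts, value in raw_events:
        minute = ts.replace(second=0, microsecond=0)
        bucket = buckets[(sensor_id, minute)]
        bucket["count"] += 1
        bucket["total"] += value
        bucket["samples"][str(ts.second)] = value  # keyed by second within the minute
    return [
        {"sensor_id": sensor_id, "minute": minute, **bucket,
         "avg": bucket["total"] / bucket["count"]}
        for (sensor_id, minute), bucket in sorted(buckets.items(), key=lambda kv: kv[0])
    ]


docs = bucket_by_minute(events)
for doc in docs:
    print(doc["minute"].isoformat(), doc["count"], doc["avg"])
```

Dropping the `samples` field gives the "just summary" variant; keeping it preserves every reading while still allowing metrics to be tracked per minute.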
23
Do More With Your Data
MongoDB
Rich Queries
• Find Paul’s cars
• Find everybody in London with a car
built between 1970 and 1980
Geospatial
• Find all of the car owners within 5km of
Trafalgar Sq.
Text Search
• Find all the cars described as having
leather seats
Aggregation
• Calculate the average value of Paul’s
car collection
Map Reduce
• What is the ownership pattern of colors
by geography over time? (is purple
trending up in China?)
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [51.524, -0.087],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
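To make three of the questions above concrete, here is a hedged sketch that evaluates them in plain Python over an in-memory list, without a database. The second owner, the Trafalgar Square coordinates, and all function names are assumptions added for illustration. The `london_70s_filter` dict shows how the rich query might be phrased with MongoDB's `$elemMatch`, `$gte`, and `$lte` operators, but it is included as data only and is not executed here. The slide's `location` is read as [latitude, longitude]; MongoDB's own geospatial indexes expect longitude first.

```python
import math

people = [
    {"first_name": "Paul", "surname": "Miller", "city": "London",
     "location": [51.524, -0.087],
     "cars": [{"model": "Bentley", "year": 1973, "value": 100000},
              {"model": "Rolls Royce", "year": 1965, "value": 330000}]},
    # Hypothetical second owner, added only so the filters have something to exclude.
    {"first_name": "Anna", "surname": "Jones", "city": "Leeds",
     "location": [53.801, -1.549],
     "cars": [{"model": "Mini", "year": 1999, "value": 4000}]},
]

# MongoDB-style filter for the rich query, shown for comparison only (not executed).
london_70s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}


def london_owners_1970s(docs):
    """Rich query: everybody in London with a car built between 1970 and 1980."""
    return [d for d in docs
            if d["city"] == "London"
            and any(1970 <= car["year"] <= 1980 for car in d["cars"])]


def haversine_km(a, b):
    """Great-circle distance in km between two [lat, lon] points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


TRAFALGAR_SQ = [51.508, -0.128]  # approximate [lat, lon], an assumption


def owners_within(docs, centre, km):
    """Geospatial: car owners within `km` kilometres of `centre`."""
    return [d for d in docs if haversine_km(d["location"], centre) <= km]


def average_car_value(doc):
    """Aggregation: average value of one owner's car collection."""
    return sum(car["value"] for car in doc["cars"]) / len(doc["cars"])


print([d["first_name"] for d in london_owners_1970s(people)])
print([d["first_name"] for d in owners_within(people, TRAFALGAR_SQ, 5)])
print(average_car_value(people[0]))
```

In a real deployment the database would answer these with indexes instead of scanning every document; the sketch only shows what each question means against the document shape on the slide.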
Hadoop & MongoDB
25
Enterprise Big Data Stack
EDW, Hadoop
Management & Monitoring
Security & Auditing
RDBMS
CRM, ERP, Collaboration, Mobile, BI
OS & Virtualization, Compute, Storage, Network
RDBMS
Applications
Infrastructure
Data Management
Online Data Offline Data
26
MongoDB & Hadoop
• Multi-source analytics
• Interactive & Batch
• Data lake
• Online, Real-time
• High concurrency & HA
• Live analytics
Operational Analytical
MongoDB
Connector for
Hadoop
27
Hadoop Is Good for…
Risk Modeling Churn Analysis
Recommendation
Modeling
Ad Targeting
Transaction
Analysis
Trade
Surveillance
Network Failure
Prediction
Search Quality Data Lake
28
MongoDB Is Good for…
Single View Mobile Apps Fraud Detection
Customer Data
Management
Content
Management &
Delivery
Database-as-a-
Service
Product & Asset
Catalogs
Internet of Things
Social &
Collaboration
Customer Examples
30
Many more examples
Big Data Product & Asset
Catalogs
Security &
Fraud
Internet of
Things
Database-as-a-
Service
Mobile
Apps
Customer Data
Management
Single
View
Social &
Collaboration
Content
Management
Intelligence Agencies
Top Investment and
Retail Banks
Top US Retailer
Top Global Shipping
Company
Top Industrial Equipment
Manufacturer
Top Media Company
Top Investment and
Retail Banks
31
MongoDB Enterprise Value
32
• Makes MongoDB a Hadoop-enabled file system
• Full use of MongoDB's indexes
• Read and write to live data, in-place
• Copy data between Hadoop and MongoDB
• Full support for data processing
– Hive
– MapReduce
– Pig
– Streaming
– EMR
MongoDB+Hadoop Connector
MongoDB
Connector for
Hadoop
33
Customer Example – MetLife
Customer
Service
• Insurance policies
• Demographic data
• Customer web data
• Call center data
• Real-time churn detection
• Customer action analysis
• Churn prediction
algorithms
Churn Analysis
MongoDB
Connector for
Hadoop
34
Customer Example - eCommerce
Travel
• Flights, hotels and cars
• Real-time offers
• User profiles, reviews
• User metadata (previous
purchases, clicks, views)
• User segmentation
• Offer recommendation engine
• Ad serving engine
• Bundling engine
Algorithms
MongoDB
Connector for
Hadoop
35
Roadmap
Capability          Today                  Soon
Connectivity        Custom                 Centralized Administration
MongoDB → Hadoop    Dynamic reads          Automated Snapshots
BSON Support        MapReduce, Hive, Pig   Impala, Tez, Spark
Hadoop → MongoDB    Dynamic writes         Bulk Loader
36
• Big Data covers a wide spectrum
– Volume, Velocity, Variety
– Hence the equation “Big Data = Hadoop” is a myth
• Enterprises are more concerned about Variety
– MongoDB provides the best platform
• Hadoop and MongoDB are complementary
– MongoDB for operational workloads
– Hadoop for analytical workloads
Summary
MongoDB & Hadoop - Understanding Your Big Data

More Related Content

What's hot

Big Data Processing with Spark and Scala
Big Data Processing with Spark and Scala Big Data Processing with Spark and Scala
Big Data Processing with Spark and Scala Edureka!
 
Understanding your Data - Data Analytics Lifecycle and Machine Learning
Understanding your Data - Data Analytics Lifecycle and Machine LearningUnderstanding your Data - Data Analytics Lifecycle and Machine Learning
Understanding your Data - Data Analytics Lifecycle and Machine LearningAbzetdin Adamov
 
Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop TechnologyManish Borkar
 
Chapter 1 semantic web
Chapter 1 semantic webChapter 1 semantic web
Chapter 1 semantic webR A Akerkar
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databasesArangoDB Database
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittDatabricks
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
Cloud Computing and Service oriented Architecture
Cloud Computing and Service oriented Architecture Cloud Computing and Service oriented Architecture
Cloud Computing and Service oriented Architecture Ravindra Dastikop
 
Cloud computing: cost reduction
Cloud computing: cost reductionCloud computing: cost reduction
Cloud computing: cost reductionHesham Shabana
 
Apache Hadoop
Apache HadoopApache Hadoop
Apache HadoopAjit Koti
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real worldukc4
 
Big data-analytics-cpe8035
Big data-analytics-cpe8035Big data-analytics-cpe8035
Big data-analytics-cpe8035Neelam Rawat
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
No sql distilled-distilled
No sql distilled-distilledNo sql distilled-distilled
No sql distilled-distilledrICh morrow
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Cloud computing risk & challenges
Cloud computing risk & challengesCloud computing risk & challenges
Cloud computing risk & challengesParag Deodhar
 

What's hot (20)

Big Data Processing with Spark and Scala
Big Data Processing with Spark and Scala Big Data Processing with Spark and Scala
Big Data Processing with Spark and Scala
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Understanding your Data - Data Analytics Lifecycle and Machine Learning
Understanding your Data - Data Analytics Lifecycle and Machine LearningUnderstanding your Data - Data Analytics Lifecycle and Machine Learning
Understanding your Data - Data Analytics Lifecycle and Machine Learning
 
Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop Technology
 
Chapter 1 semantic web
Chapter 1 semantic webChapter 1 semantic web
Chapter 1 semantic web
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databases
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Cloud Computing and Service oriented Architecture
Cloud Computing and Service oriented Architecture Cloud Computing and Service oriented Architecture
Cloud Computing and Service oriented Architecture
 
Cloud computing: cost reduction
Cloud computing: cost reductionCloud computing: cost reduction
Cloud computing: cost reduction
 
Apache hive
Apache hiveApache hive
Apache hive
 
Apache Hadoop
Apache HadoopApache Hadoop
Apache Hadoop
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real world
 
Big data-analytics-cpe8035
Big data-analytics-cpe8035Big data-analytics-cpe8035
Big data-analytics-cpe8035
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Big data storage
Big data storageBig data storage
Big data storage
 
No sql distilled-distilled
No sql distilled-distilledNo sql distilled-distilled
No sql distilled-distilled
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Cloud computing risk & challenges
Cloud computing risk & challengesCloud computing risk & challenges
Cloud computing risk & challenges
 

Similar to MongoDB & Hadoop - Understanding Your Big Data

Mongo DB: Operational Big Data Database
Mongo DB: Operational Big Data DatabaseMongo DB: Operational Big Data Database
Mongo DB: Operational Big Data DatabaseXpand IT
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresDATAVERSITY
 
Enterprise Reporting with MongoDB and JasperSoft
Enterprise Reporting with MongoDB and JasperSoftEnterprise Reporting with MongoDB and JasperSoft
Enterprise Reporting with MongoDB and JasperSoftMongoDB
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxAIMLSEMINARS
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliData Driven Innovation
 
How Government Agencies are Using MongoDB to Build Data as a Service Solutions
How Government Agencies are Using MongoDB to Build Data as a Service SolutionsHow Government Agencies are Using MongoDB to Build Data as a Service Solutions
How Government Agencies are Using MongoDB to Build Data as a Service SolutionsMongoDB
 
Advanced applications with MongoDB
Advanced applications with MongoDBAdvanced applications with MongoDB
Advanced applications with MongoDBNorberto Leite
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Tugdual Grall
 
Neo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsAbhishekKumarAgrahar2
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBMongoDB
 
Webinar: Expanding Retail Frontiers with MongoDB
 Webinar: Expanding Retail Frontiers with MongoDB Webinar: Expanding Retail Frontiers with MongoDB
Webinar: Expanding Retail Frontiers with MongoDBMongoDB
 
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB OPENEXPO Madrid 2015 - Advanced Applications with MongoDB
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB MongoDB
 
Expanding Retail Frontiers with MongoDB
Expanding Retail Frontiers with MongoDBExpanding Retail Frontiers with MongoDB
Expanding Retail Frontiers with MongoDBNorberto Leite
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Webinar: When to Use MongoDB
Webinar: When to Use MongoDBWebinar: When to Use MongoDB
Webinar: When to Use MongoDBMongoDB
 
Data Treatment MongoDB
Data Treatment MongoDBData Treatment MongoDB
Data Treatment MongoDBNorberto Leite
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise ArchitectsNeo4j
 
Neo4j GraphTalk Frankfurt - Einführung
Neo4j GraphTalk Frankfurt - EinführungNeo4j GraphTalk Frankfurt - Einführung
Neo4j GraphTalk Frankfurt - EinführungNeo4j
 
Webinar: An Enterprise Architect’s View of MongoDB
Webinar: An Enterprise Architect’s View of MongoDBWebinar: An Enterprise Architect’s View of MongoDB
Webinar: An Enterprise Architect’s View of MongoDBMongoDB
 

Similar to MongoDB & Hadoop - Understanding Your Big Data (20)

Mongo DB: Operational Big Data Database
Mongo DB: Operational Big Data DatabaseMongo DB: Operational Big Data Database
Mongo DB: Operational Big Data Database
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
 
Enterprise Reporting with MongoDB and JasperSoft
Enterprise Reporting with MongoDB and JasperSoftEnterprise Reporting with MongoDB and JasperSoft
Enterprise Reporting with MongoDB and JasperSoft
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptx
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
 
How Government Agencies are Using MongoDB to Build Data as a Service Solutions
How Government Agencies are Using MongoDB to Build Data as a Service SolutionsHow Government Agencies are Using MongoDB to Build Data as a Service Solutions
How Government Agencies are Using MongoDB to Build Data as a Service Solutions
 
Advanced applications with MongoDB
Advanced applications with MongoDBAdvanced applications with MongoDB
Advanced applications with MongoDB
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Neo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in GraphdatenbankenNeo4j GraphTalks - Einführung in Graphdatenbanken
Neo4j GraphTalks - Einführung in Graphdatenbanken
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in details
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
Webinar: Expanding Retail Frontiers with MongoDB
 Webinar: Expanding Retail Frontiers with MongoDB Webinar: Expanding Retail Frontiers with MongoDB
Webinar: Expanding Retail Frontiers with MongoDB
 
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB OPENEXPO Madrid 2015 - Advanced Applications with MongoDB
OPENEXPO Madrid 2015 - Advanced Applications with MongoDB
 
Expanding Retail Frontiers with MongoDB
Expanding Retail Frontiers with MongoDBExpanding Retail Frontiers with MongoDB
Expanding Retail Frontiers with MongoDB
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Webinar: When to Use MongoDB
Webinar: When to Use MongoDBWebinar: When to Use MongoDB
Webinar: When to Use MongoDB
 
Data Treatment MongoDB
Data Treatment MongoDBData Treatment MongoDB
Data Treatment MongoDB
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise Architects
 
Neo4j GraphTalk Frankfurt - Einführung
Neo4j GraphTalk Frankfurt - EinführungNeo4j GraphTalk Frankfurt - Einführung
Neo4j GraphTalk Frankfurt - Einführung
 
Webinar: An Enterprise Architect’s View of MongoDB
Webinar: An Enterprise Architect’s View of MongoDBWebinar: An Enterprise Architect’s View of MongoDB
Webinar: An Enterprise Architect’s View of MongoDB
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 



MongoDB & Hadoop - Understanding Your Big Data

  • 3. 3 Speakers Jnan Dash Senior Advisor jnan.dash@mongodb.com Kelly Stirman Director of Products kelly.stirman@mongodb.com
  • 4. 4 • Last 12 years (2002-Now) - Executive Consultant, on the board and advisory board of several new software companies including Big Data players such as MongoDB • 10 Years (1992-2002) – Oracle, Group Vice President, Systems Architecture and Technology, responsible for the server product planning and rollout • 16 years (1975-1992) – IBM, Planner, architect, and development manager for DB2 product line at Silicon Valley Lab and Austin Lab. Head of IBM's Database architecture, strategy, and technology Jnan Dash
  • 5. 5 • Finally, some real innovation in DBMS • MongoDB momentum is unprecedented! • The changing landscape needs MongoDB – “Internet scale” distributed operations + highly flexible data model for agile development + open source • Perfect fit for cloud, mobility, and big data Why am I excited about MongoDB?
  • 6. 6 • Big Data - Observations • Evolution of Database Technology • Hadoop+MongoDB • Customer Examples • Roadmap • Summary Agenda
  • 7. 7 1. Thousand years ago – Experimental Science Description of natural phenomena 2. Last few hundred years – Theoretical Science Newton's Laws, Maxwell's Equations, ... 3. Last few decades – Computational Science Simulation of complex phenomena 4. Today – Data-intensive Science Scientists overwhelmed with data deluge Unify theory, experiment & simulation The Fourth Paradigm
  • 8. 8 Internet Scale Commercial Supercomputing • Originated with companies operating at Internet scale (to process ever increasing #users and data) – Yahoo in the 1990s, then Google, Facebook, Twitter – They needed to do it quickly, economically, and affordably at scale • Hadoop is the first commercial supercomputing software platform – Works at scale, affordable at scale • HPC was used for meteorology and engineering scientific super computing. Big data is commercial equivalent of HPC – Less about equations, more about discovery, patterns • Many technologies have been around for decades • Clustering • Parallel processing • Distributed file systems
  • 10. 10 Some Make it 4V’s
  • 11. 11 What’s driving Big Data - Ad-hoc querying and reporting - Data mining techniques - Structured data, typical sources - Small to mid-size datasets - Optimizations and predictive analytics - Complex statistical analysis - All types of data, and many sources - Very large datasets - More of a real-time
  • 12. 12 Big Data – the full spectrum Transaction Processing Analytical Processing Data Mining, Visualization, and Integration Tools RDBMS OLAP/DW DW Appliance Hadoop, Impala,.. NoSQL NewSQL, In-Memory, Stream... Online/Realtime Offline/Batch
  • 13. 13 Hadoop Ecosystem Programming Languages Computation Object Storage Zookeeper (Coordination) Core Apache Hadoop Related Apache Projects HDFS (Hadoop Distributed File System) MapReduce (Distributed Programming Framework) Hive (SQL) Pig (Data Flow) HBase (Wide Column Storage) HCatalog (Meta Data) HMS (Management) Table Storage
  • 15. 15 Data Management over the years 1960’s File Systems 1970’s 1st Generation DBMS Data as Shared Resource 1980’s Relational Technology Ease of Query 1990’s New data types OLAP/DW Web Support Unstructured Data 2005+ Big Data Post-PC, Data Deluge, 3Vs, NoSQL
  • 17. 17 MongoDB Features • JSON Document Model with Dynamic Schemas • Auto-Sharding for Horizontal Scalability • Text Search • Aggregation Framework and MapReduce • Full, Flexible Index Support and Rich Queries • Native Replication for High Availability • Advanced Security • Large Media Storage with GridFS
  • 18. 18 Documents are Rich Data Structures { first_name: 'Paul', surname: 'Miller', cell: '+447557505611', city: 'London', location: [45.123,47.232], Profession: ['banking', 'finance', 'trader'], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] } Fields can contain an array of sub-documents Fields Typed field values Fields can contain arrays
  • 20. 20 • Hundreds of thousands of records per second • Fast response required • Sometimes all data kept, sometimes just summary • Horizontal scalability required Fast Moving Data
  • 21. 21 • A machine generates a specific kind of data • The data model is unlikely to change • But there are so many different machines… • Queryability across all types Data is Structured, but Varied…
  • 22. 22 • Event data written multiple times per second, minute, or hour • Tracking progression of metrics over time Time Series Data
  • 23. 23 Do More With Your Data MongoDB Rich Queries • Find Paul’s cars • Find everybody in London with a car built between 1970 and 1980 Geospatial • Find all of the car owners within 5km of Trafalgar Sq. Text Search • Find all the cars described as having leather seats Aggregation • Calculate the average value of Paul’s car collection Map Reduce • What is the ownership pattern of colors by geography over time? (is purple trending up in China?) { first_name: 'Paul', surname: 'Miller', city: 'London', location: [51.524,-0.087], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] }
  • 25. 25 Enterprise Big Data Stack EDW Hadoop Management & Monitoring Security & Auditing RDBMS CRM, ERP, Collaboration, Mobile, BI OS & Virtualization, Compute, Storage, Network RDBMS Applications Infrastructure Data Management Online Data Offline Data
  • 26. 26 MongoDB & Hadoop • Multi-source analytics • Interactive & Batch • Data lake • Online, Real-time • High concurrency & HA • Live analytics Operational Analytical MongoDB Connector for Hadoop
  • 27. 27 Hadoop Is Good for… Risk Modeling Churn Analysis Recommendation Modeling Ad Targeting Transaction Analysis Trade Surveillance Network Failure Prediction Search Quality Data Lake
  • 28. 28 MongoDB Is Good for… Single View Mobile Apps Fraud Detection Customer Data Management Content Management & Delivery Database-as-a-Service Product & Asset Catalogs Internet of Things Social & Collaboration
  • 30. 30 Many more examples Big Data Product & Asset Catalogs Security & Fraud Internet of Things Database-as-a-Service Mobile Apps Customer Data Management Single View Social & Collaboration Content Management Intelligence Agencies Top Investment and Retail Banks Top US Retailer Top Global Shipping Company Top Industrial Equipment Manufacturer Top Media Company Top Investment and Retail Banks
  • 32. 32 • Makes MongoDB a Hadoop-enabled file system • Full use of MongoDB's indexes • Read and write to live data, in-place • Copy data between Hadoop and MongoDB • Full support for data processing – Hive – MapReduce – Pig – Streaming – EMR MongoDB+Hadoop Connector MongoDB Connector for Hadoop
  • 33. 33 Customer Example – MetLife Customer Service • Insurance policies • Demographic data • Customer web data • Call center data • Real-time churn detection • Customer action analysis • Churn prediction algorithms Churn Analysis MongoDB Connector for Hadoop
  • 34. 34 Customer Example - eCommerce Travel • Flights, hotels and cars • Real-time offers • User profiles, reviews • User metadata (previous purchases, clicks, views) • User segmentation • Offer recommendation engine • Ad serving engine • Bundling engine Algorithms MongoDB Connector for Hadoop
  • 35. 35 Roadmap Capability Today Soon Connectivity Custom Centralized Administration MongoDB → Hadoop Dynamic reads Automated Snapshots BSON Support MapReduce, Hive, Pig Impala, Tez, Spark Hadoop → MongoDB Dynamic writes Bulk Loader
  • 36. 36 • Big Data covers a wide spectrum – Volume, Velocity, Variety – Hence the mythical equation Big Data = Hadoop • Enterprises are more concerned about Variety – MongoDB provides the best platform • Hadoop and MongoDB are complementary – MongoDB for operational workloads – Hadoop for analytical workloads Summary
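
Two of the slide 23 examples ("everybody in London with a car built between 1970 and 1980" and "the average value of Paul's car collection") can be sketched as MongoDB query documents. The block below is a minimal illustration, not code from the talk. The variable and function names are hypothetical, and the elided fields (…) of the sample document are left out. The plain-Python helpers mirror what the filter and the pipeline would compute for this one document, so the sketch runs without a database.

```python
# Sample document from slide 23 (elided "…" fields omitted).
paul = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": [51.524, -0.087],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

# Rich query: everybody in London with a car built between 1970 and 1980.
# $elemMatch requires a single array element to satisfy both bounds.
london_70s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# Aggregation: average value of Paul's car collection.
avg_value_pipeline = [
    {"$match": {"first_name": "Paul", "surname": "Miller"}},
    {"$unwind": "$cars"},  # one output document per car
    {"$group": {"_id": "$_id", "avg_value": {"$avg": "$cars.value"}}},
]


def matches_london_70s(doc):
    """Plain-Python equivalent of london_70s_filter for one document."""
    return doc.get("city") == "London" and any(
        1970 <= car["year"] <= 1980 for car in doc.get("cars", [])
    )


def average_car_value(doc):
    """Plain-Python equivalent of the $unwind / $group / $avg stages."""
    values = [car["value"] for car in doc["cars"]]
    return sum(values) / len(values)


print(matches_london_70s(paul))  # the 1973 Bentley falls inside the range
print(average_car_value(paul))
```

With a driver such as PyMongo, the two dictionaries would typically be passed to `collection.find(london_70s_filter)` and `collection.aggregate(avg_value_pipeline)`. The collection itself is an assumption here.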

Editor's Notes

  1. MongoDB provides agility, scalability, and performance without sacrificing the functionality of relational databases, like full index support and rich queries. Indexes: secondary, compound, text search, geospatial, and more.
  2. We have all these fantastic machines… they give the same metrics they used to, but now they transmit the data. We have metrics about metrics, and we need a place to store the data. We need a place to understand what the data means.
  3. This is where MongoDB fits into the existing enterprise IT stack. MongoDB is an operational data store used for online data, in the same way that Oracle is an operational data store. It supports applications that ingest, store, manage and even analyze data in real-time. (Compared to Hadoop and data warehouses, which are used for offline, batch analytical workloads.)
  4. Makes MongoDB a Hadoop-enabled file system. Read and write to live data, in-place. Copy data between Hadoop and MongoDB. Uses MongoDB indexes to filter data. Full support for data processing: Hive, MapReduce, Pig, Streaming.
  5. What each of these has in common is that they’re retrospective: they’re about looking at the past to help predict the future. The learnings from these Hadoop applications end up being applied by a different technology. This is where MongoDB comes in.
  6. Customer Data Management (e.g., Customer Relationship Management, Biometrics, User Profile Management); Product and Asset Catalogs (e.g., eCommerce, Inventory Management); Social and Collaboration Apps (e.g., Social Networks and Feeds, Document and Project Collaboration Tools); Mobile Apps (e.g., for Smartphones and Tablets); Content Management (e.g., Web CMS, Document Management, Digital Asset and Metadata Management); Internet of Things / Machine to Machine (e.g., mHealth, Connected Home, Smart Meters); Security and Fraud Apps (e.g., Fraud Detection, Cyberthreat Analysis); DBaaS (Cloud Database-as-a-Service); Data Hub (Aggregating Data from Multiple Sources for Operational or Analytical Purposes); Big Data (e.g., Genomics, Clickstream Analysis, Customer Sentiment Analysis)
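
Note 4 says the connector reads live data and uses MongoDB indexes to filter it. As a rough sketch of how a job might be pointed at MongoDB, the block below builds a configuration dictionary with the connector's `mongo.input.uri`, `mongo.output.uri`, and `mongo.input.query` properties. The helper name, the URIs, the database and collection names, and the filter are all hypothetical, and the exact property set should be checked against the connector version in use.

```python
import json


def connector_job_conf(input_uri, output_uri, query=None):
    """Build a Hadoop job configuration for the MongoDB Connector for Hadoop.

    Hypothetical helper: it only assembles key/value pairs and does not
    talk to Hadoop or MongoDB.
    """
    conf = {
        "mongo.input.uri": input_uri,    # collection the job reads from
        "mongo.output.uri": output_uri,  # collection the job writes results to
    }
    if query is not None:
        # The filter is passed as a JSON string and evaluated by MongoDB,
        # so an index on the filtered field can limit what the job reads.
        conf["mongo.input.query"] = json.dumps(query)
    return conf


# Assumed local deployment and made-up namespace, for illustration only.
conf = connector_job_conf(
    "mongodb://localhost:27017/demo.customers",
    "mongodb://localhost:27017/demo.churn_scores",
    query={"city": "London"},
)
print(conf)
```

A dictionary like this would be handed to whichever processing layer runs the job (MapReduce, Hive, Pig, or Streaming, as listed in note 4). Pushing the filter down to MongoDB means only matching documents are shipped to Hadoop.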