MongoDB Inc. Proprietary and Confidential
3 Ways Modern Databases
Drive Revenue
Matt Asay, VP of Marketing
2
The Last 40 Years Of Data:
Neat and Tidy
3
Your Industry Has Changed
Business: UPFRONT → SUBSCRIBE
Applications: YEARS → MONTHS
Customers: PC → MOBILE
Engagement: ADS → SOCIAL
Infrastructure: SERVERS → CLOUD
4
Your Data Has Changed
• 90% of data created in the
last 2 years
• 80% of enterprise data is
unstructured
• Unstructured data growing
2x faster than structured
Sources: IBM, Gartner 2012
What Hasn’t Changed?
6
How Do You Manage Big Data?
* From Big Data Executive Summary of 50+ execs from F100, gov orgs
Top Big Data Issues
“Of Gartner's "3Vs" of big data (volume, velocity, variety), the
variety of data sources is seen by our clients as both the greatest
challenge and the greatest opportunity.”
Forrester, 2014
Data Variety (68%): Diverse, streaming or new data types
Data Volume (15%): Greater than 100TB
Other Data (17%): Less than 100TB
NoSQL ≠ NoSQL
8
How a Modern Database Can
Change Your Business
Scale
Unlock
Adapt
Scale
10
• Horizontal scale – on commodity
hardware or cloud – is mandatory
• Most apps require TBs of data, but you
want PBs of headroom
Your Database Must Not Throttle Success
Ambitious startup scaled to 1M+ users in
weeks; 100s of millions of emails/month
Global media company scaled MongoDB to
4.5 PBs across public cloud infrastructure
Automated failover and ability to add nodes
means scale without downtime; “Blown
away by MongoDB’s performance”
11
Powerful predictive analytics system that started
on Chief Data Officer’s laptop
Iterate…
Problem
• Diverse data from 30+ different government agencies
• Limited budget – had to prove the system to justify budget
• Had to be able to integrate geospatial data with other highly unstructured data
Why MongoDB
• Scales from single node to many, many servers
• Easy-to-manage dynamic data model enables limitless growth
• Support for ad hoc queries, geospatial
Results
• Award-winning government project
• Cost effective while delivering exceptional performance
• Easily extended to incorporate new data sources
Adapt
13
Innovation Requires Iteration
New
Table
New
Table
New
Column
Name Pet Phone Email
New
Column
3 months later…
14
RDBMS
From Complexity to Simplicity
MongoDB
{
  _id : ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department : "Marketing",
  title : "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits : [
    { type : "Health",
      plan : "PPO Plus" },
    { type : "Dental",
      plan : "Standard" }
  ]
}
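To make the "simplicity" point concrete, here is a minimal sketch that models the employee document above as a plain Python dictionary, with no database driver involved. The `add_benefit` and `plans_by_type` helpers, the "Vision"/"Basic" benefit, and the `pet` field are hypothetical names introduced only for illustration. The idea it tries to show is that embedded data and new fields can be added to one record without touching the shape of any other record.

```python
from copy import deepcopy

# The employee document from the slide, as a plain dict (the _id is omitted here).
employee = {
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def add_benefit(doc, benefit_type, plan):
    """Return a copy of doc with one more embedded benefit (hypothetical helper)."""
    updated = deepcopy(doc)
    updated["benefits"].append({"type": benefit_type, "plan": plan})
    return updated


def plans_by_type(doc):
    """Map each benefit type to its plan, reading the embedded array directly."""
    return {benefit["type"]: benefit["plan"] for benefit in doc["benefits"]}


# Hypothetical change request: a new benefit and a brand-new field.
# Neither needs a new table or column; the original document is left as it was.
with_vision = add_benefit(employee, "Vision", "Basic")
with_vision["pet"] = "Dog"

print(plans_by_type(with_vision))
```

In a relational design the benefits would typically live in a second table joined on an employee key, and the new field would require a schema change. Here both are local edits to one document.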
15
• Break free of schema servitude: focus
on your app, not object-relational
mapping and rigid schema design
Your Database Must Make It Simple to
Add New Data Sources and Types
Struggled for years with RDBMS: schema
customization too difficult. MongoDB
“added flexibility and easy scalability”
Shaved years off projects to <4 months;
lowered TCO; no security compromise.
“Devs build apps w/out becoming DBAs”
Dramatically decreased drug development
time by making adding new data types
easy; integrates seamlessly with RDBMS
16
Single view of customer data
(Virtually impossible with RDBMS)
Diverse Data…
Problem
• 70+ disparate data sources (mainframe, RDBMS)
• RDBMS could not support centralized data mgt and federation of information services
Why MongoDB
• Document model allows easy integration of diverse data sources
• Fast, easy scalability
• Full query language
• Delivers high scalability, fast performance, and easy maintenance, while keeping support costs low
Results
• Successful POC in 3 weeks; in production within 90 days
• Single view of the customer (improved customer experience, improved sales)
• 71% less expensive
Unlock
19
• Storing data for fast access isn’t
enough. Questions matter most
• Database must support rich queries,
indexing, aggregation and search
across multi-structured, rapidly
changing data sets in real time
Your Database Must Enable Rich
Querying of Your Data
Transformed cumbersome data storage to
high-performance data analytics.
MongoDB-based Internet of Things platform
that takes advantage of ever-changing
sensor data and analytics against this data.
Runs unified data store serving hundreds of
diverse web properties on MongoDB
20
50% increase in paid subscribers due to 95%
performance improvement over RDBMS
Multi-attribute Queries
Problem
• RDBMS couldn’t handle high-volume, bi-directional searches
• Couldn’t persist a billion-plus matches
• RDBMS was difficult to manage in production (schema changes were painful; hard to scale)
Why MongoDB
• Ease of management: auto-scaling, auto-sharding, no downtime
• Complex queries across 250+ different attributes
• Exceptional performance
• Ability to dynamically update schema without complex schema redesign
Results
• 95% performance improvement: 3 billion matches daily using 60 million complex queries across 250+ attributes
• Big increase in customer satisfaction, paid subscribers
• Significantly less expensive
“I have not failed. I've just found 10,000 ways that won't work.”
― Thomas A. Edison
22
Optimize for (Developer) Iteration
1985 2013
Infrastructure Cost
Engineer Cost
23
8,000,000+
MongoDB Downloads
Build on NoSQL’s Largest Ecosystem
1,000+
Customers Across All Industries; Hundreds of
Thousands of Users
600+
Technology and Services Partners
35,000+
MongoDB Management Service (MMS) Users
35,000+
MongoDB User Group Members
200,000+
Online Education Registrants
24
MongoDB Can Help

Editor's Notes

  • #2 Good to start by asking who is: a first-timer at the event, a user, etc.
  • #3 Talk about the relational database and how incredible it was for our transactional systems of record.
  • #6 One thing hasn’t changed: data means money. Your business depends on data more than ever before. Therefore, finding ways to optimize productivity with your data is crucial.
  • #8 There is no such thing as NoSQL. Not as we tend to think of it, anyway. While NoSQL was born as a movement away from rigid relational data models so web giants could embrace Big Data with scale-out architectures, the term has come to categorize a set of databases that are more different than they are the same. This broad categorization doesn’t work. It’s not helpful. While we at MongoDB still sometimes refer to NoSQL, we try to do it sparingly, given its propensity to confuse rather than enlighten.
    Deconstructing NoSQL
    Today the NoSQL category includes a cacophony of over 100 document, key-value, wide-column and graph databases. Each of these database types comes with its own strengths and limits. Each differs markedly from the others, with disparate models and capabilities relative to data storage, querying, consistency, scalability and high availability.
    Comparing a document database to a key-value store, for example, is like comparing a smartphone to a beeper. A beeper is exceptionally useful for getting a simple message from Point A to Point B. It’s fast. It’s reliable. But it’s nowhere near as functional as a smartphone, which can quickly and reliably transmit messages, but can also do so much more. Both are useful, but the smartphone fits a far broader range of applications than the more limited beeper.
    As such, organizations searching for a database to tackle Gartner’s three V’s of Big Data -- volume, velocity and variety -- won’t find an immediate answer in “NoSQL.” Instead, they need to probe deeper for a modern database that can handle all of their Big Data application requirements.
  • #9 One of these requirements is, of course, the ability to handle large volumes of data, the original impetus behind the NoSQL movement. But the ability to handle volume, or scale, is something all databases categorized as “NoSQL” share. MongoDB, for example, counts among its users those who regularly store petabytes of data, perform over 1,000,000 operations per second and run clusters that exceed 1,000 nodes.
    A modern database, however, must do more than scale. Scalability is table stakes. It also must enable agility to accelerate development and time to market. It must allow organizations to iterate as they embrace new business requirements. And a modern database must, above all, enable enterprises to take advantage of rapidly growing data variety. Indeed the “greatest challenge and opportunity” for enterprises, as Forrester notes, is managing a “variety of data sources,” including data types and sources that may not even exist today.
    In general, all so-called NoSQL databases are much more helpful than relational databases at storing a wide variety of data types and sources, including mobile device, geospatial, social and sensor data. But the hallmark of a modern database is its ability to allow organizations to do useful things with their data.
  • #21 eHarmony: Started with a simple architecture running Oracle. As their data volumes ballooned, they found they couldn’t perform high-volume, bi-directional searches. The second problem was that they could no longer persist a billion-plus potential matches at scale.
    They turned to Postgres running on a bunch of high-end, expensive servers. Each of eHarmony’s compatibility matching platform applications was co-located with a local Postgres database server that stored a complete copy of all searchable data, so that it could perform queries locally, reducing the load on the central database. This worked until the data grew larger and the data model became more complex. Compounding the problem, every time they needed to make a schema change, such as adding a new attribute to the data model, it was a complete nightmare for both their engineering and ops teams. They would spend several hours extracting the data dump from Postgres, massaging the data, copying it to multiple servers, and reloading it back into Postgres, which translated into high operational cost to maintain the solution. It was a lot worse if that particular attribute needed to be part of an index.
    They decided they needed something different. They didn’t want to repeat the same mistake, that is, a decentralized SQL solution based on Postgres. The new database had to support auto-scaling. They also wanted a solution that didn’t require them to spend a lot of time maintaining the database: adding a new shard, a new cluster, a new server to the cluster, and so forth. They needed auto-sharding. As their big data got bigger, they wanted to be able to split the data into multiple shards, across multiple physical servers, to maintain high throughput without any server upgrade. They also needed the database to auto-balance data to ensure even distribution across multiple shards seamlessly.
    In addition, the new database had to support fast, complex, multi-attribute queries with high throughput. So eHarmony chose MongoDB.
    Result? eHarmony is now able to generate over 3 billion potential matches each day, which depends on over 60 million complex queries across 250+ attributes each day. Their systems store and manage roughly 200 million photos and another 4B+ relationship questionnaires, comprising many tens of terabytes of data. Whereas eHarmony’s RDBMS solution took two weeks to reprocess all of the people in its database, with MongoDB eHarmony has cut that by more than 95% to under 12 hours, analyzing 3 billion-plus potential matches every single day. As a result, eHarmony now sees a 30% increase in two-way communication, a 50% increase in paid subscribers, and a 60%-plus increase in traffic growth, in terms of unique visitors and visits.
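As a rough sanity check on the figures quoted in this note, the sketch below redoes the arithmetic. It assumes that "two weeks" means 14 full days, that "under 12 hours" is taken as exactly 12 hours, and that queries are spread evenly over a 24-hour day. The variable and function names are illustrative only.

```python
HOURS_PER_DAY = 24
SECONDS_PER_DAY = 24 * 60 * 60


def reduction(before_hours, after_hours):
    """Fractional reduction in elapsed time, e.g. 0.5 means half as long."""
    return 1 - after_hours / before_hours


# Reprocessing time: two weeks on the RDBMS vs. (at most) 12 hours on MongoDB.
rdbms_hours = 14 * HOURS_PER_DAY  # assumption: two full weeks
mongodb_hours = 12                # assumption: upper bound of "under 12 hours"
reprocess_reduction = reduction(rdbms_hours, mongodb_hours)

# Daily load quoted in the note, averaged over the day (assumption: even spread).
queries_per_day = 60_000_000
matches_per_day = 3_000_000_000
avg_queries_per_second = queries_per_day / SECONDS_PER_DAY
matches_per_query = matches_per_day / queries_per_day

print(f"Reprocessing time reduced by {reprocess_reduction:.1%}")
print(f"Average load: {avg_queries_per_second:.0f} queries/second")
print(f"Average yield: {matches_per_query:.0f} matches per query")
```

Under these assumptions the reduction comes out at roughly 96%, consistent with the "more than 95%" claim, and the daily totals work out to about 50 potential matches per query.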
  • #22 Big Data is new, and you’re likely going to fail as you start. It’s almost guaranteed, as well, that you won’t know which data to capture, or how to leverage it, without trial and error. As such, if you were to “design for failure,” what key things would you need? You’d need to reduce the cost of failure, in terms of both time and money. You’d need to build on data infrastructure that supports your iterations toward success and then rewards you by making it easy and cost effective to scale.
  • #23 In 1985, storage was the key expense: $100,000 per GB; developer salary: $28,000 per year. So relational databases were built to optimize for storage. In 2013, storage is cheap: $0.05 per GB. Developers are expensive: $90,000 per year. So MongoDB was built to optimize for developer productivity. This is what the ratio of those expenses looks like, in 1985 and today. Assumptions: 3-year TCO; 1985: 2 developers and 5 GB; 2013: 2 developers and 5 TB. Developer costs comprise the lion’s share relative to storage today. So optimize for developer productivity.
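The ratio described in this note can be worked through with the stated assumptions. The sketch below additionally assumes that storage is a one-time purchase over the 3-year period, that salaries are flat, and that 5 TB is counted as 5,000 GB. None of these details are stated in the note, and the function names are illustrative.

```python
YEARS = 3        # 3-year TCO, as stated in the note
DEVELOPERS = 2   # same team size in both years, as stated in the note


def three_year_tco(storage_gb, price_per_gb, salary_per_year):
    """Return (storage_cost, developer_cost) in dollars over the TCO period."""
    storage_cost = storage_gb * price_per_gb  # assumption: bought once
    developer_cost = DEVELOPERS * salary_per_year * YEARS
    return storage_cost, developer_cost


def developer_share(storage_cost, developer_cost):
    """Fraction of the combined cost that goes to developers."""
    return developer_cost / (storage_cost + developer_cost)


# 1985: 5 GB at $100,000 per GB, developers at $28,000 per year.
storage_1985, devs_1985 = three_year_tco(5, 100_000, 28_000)
# 2013: 5 TB (taken as 5,000 GB) at $0.05 per GB, developers at $90,000 per year.
storage_2013, devs_2013 = three_year_tco(5_000, 0.05, 90_000)

for year, storage, devs in (
    (1985, storage_1985, devs_1985),
    (2013, storage_2013, devs_2013),
):
    share = developer_share(storage, devs)
    print(f"{year}: storage ${storage:,.0f}, developers ${devs:,.0f}, "
          f"developer share {share:.1%}")
```

With these inputs, 1985 comes to $500,000 of storage against $168,000 of developer time, while 2013 comes to about $250 of storage against $540,000 of developer time. That inversion is the point of the slide.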