3 Ways Modern Databases Drive Revenue

Published in: Technology, Business
  • Good to start by asking who is:

    -first time to the event
    -a user
    -etc
  • Talk about the relational database and how incredible it was for our transactional systems of record.
  • One thing hasn’t changed: data means money. Your business depends on data more than ever before. Therefore, finding ways to optimize productivity with your data is crucial.
  • There is no such thing as NoSQL. Not as we tend to think of it, anyway. While NoSQL was born as a movement away from rigid relational data models so web giants could embrace Big Data with scale-out architectures, the term has come to categorize a set of databases that are more different than they are the same.

    This broad categorization doesn’t work. It’s not helpful.

    While we at MongoDB still sometimes refer to NoSQL, we try to do it sparingly, given its propensity to confuse rather than enlighten.

    Deconstructing NoSQL

    Today the NoSQL category includes a cacophony of over 100 document, key-value, wide-column and graph databases. Each of these database types comes with its own strengths and limits. Each differs markedly from the others, with disparate models and capabilities relative to data storage, querying, consistency, scalability and high availability.

    Comparing a document database to a key-value store, for example, is like comparing a smartphone to a beeper. A beeper is exceptionally useful for getting a simple message from Point A to Point B. It’s fast. It’s reliable. But it’s nowhere near as functional as a smartphone, which can quickly and reliably transmit messages, but can also do so much more.

    Both are useful, but the smartphone fits a far broader range of applications than the more limited beeper.

    As such, organizations searching for a database to tackle Gartner’s three V’s of Big Data -- volume, velocity and variety -- won’t find an immediate answer in “NoSQL.” Instead, they need to probe deeper for a modern database that can handle all of their Big Data application requirements.
  • One of these requirements is, of course, the ability to handle large volumes of data, the original impetus behind the NoSQL movement. But the ability to handle volume, or scale, is something all databases categorized as “NoSQL” share. MongoDB, for example, counts among its users those who regularly store petabytes of data, perform over 1,000,000 operations per second and run clusters that exceed 1,000 nodes.

    A modern database, however, must do more than scale. Scalability is table stakes. It also must enable agility to accelerate development and time to market. It must allow organizations to iterate as they embrace new business requirements.

    And a modern database must, above all, enable enterprises to take advantage of rapidly growing data variety. Indeed the “greatest challenge and opportunity” for enterprises, as Forrester notes, is managing a “variety of data sources,” including data types and sources that may not even exist today.

    In general, all so-called NoSQL databases are much more helpful than relational databases at storing a wide variety of data types and sources, including mobile device, geospatial, social and sensor data. But the hallmark of a modern database is its ability to allow organizations to do useful things with their data.
  • eHarmony:
    Started with a simple architecture running Oracle. As their data volumes ballooned, they found they couldn’t perform high volume, bi-directional searches. And the second problem was that they could no longer persist a billion-plus potential matches at scale.

    They turned to Postgres running on a bunch of high-end, expensive servers. Each one of eHarmony’s compatibility matching platform applications was co-located with a local Postgres database server that stored a complete copy of all searchable data, so that it could perform queries locally, hence reducing the load on the central database.

    This worked until the data size became bigger, and the data model became more complex.

    Compounding the problem was that every single time they needed to make any schema changes, such as adding a new attribute to the data model, it was a complete nightmare for both their engineering and ops teams. They would spend several hours first extracting the data dump from Postgres, massaging the data, copying it to multiple servers and multiple machines, and reloading the data back to Postgres, which translated to a high operational cost to maintain this solution. And it was a lot worse if that particular attribute needed to be part of an index.

    They decided they needed something different.

    They didn’t want to repeat the same mistake, that is, a decentralized SQL solution based on Postgres. It had to support auto-scaling.

    They also wanted a solution that didn’t require that they spend a lot of time maintaining the database, like adding a new shard, a new cluster, a new server to the cluster, and so forth. They needed auto-sharding.

    As their big data got bigger, they wanted to be able to spread the data over multiple shards, across multiple physical servers, to maintain high throughput without any server upgrade. They also needed the database to auto-balance data to ensure even distribution across multiple shards seamlessly. In addition, the new database had to support fast, complex, multi-attribute queries with high throughput.

    So eHarmony chose MongoDB. Result?

    eHarmony is now able to generate over 3 billion potential matches each day, which depends on over 60 million complex queries across 250+ attributes each day. Their systems store and manage roughly 200 million photos and another 4B+ relationship questionnaires, comprising many tens of terabytes of data.

    Whereas eHarmony’s RDBMS solution took two weeks to reprocess all of the people in its database, with MongoDB eHarmony has cut that by more than 95% to under 12 hours, analyzing 3 billion-plus potential matches every single day. As a result, eHarmony now sees a 30% increase in two-way communication, a 50% increase in paid subscribers, and a 60%-plus increase in traffic, in terms of unique visitors and visits.
  • Big Data is new, and you’re likely going to fail as you start. It’s also almost guaranteed that you won’t know which data to capture, or how to leverage it, without trial and error. As such, if you were to “design for failure,” what key things would you need? You’d need to reduce the cost of failure, both in terms of time and money. You’d need to build on data infrastructure that supports your iterations toward success and then rewards you by making it easy and cost effective to scale.
  • In 1985, storage was the key expense: $100,000 per GB; developer salary: $28,000 per year
    So relational databases were built to optimize for storage

    In 2013, storage is cheap: $0.05 per GB. Developers are expensive: $90,000 per year
    So MongoDB was built to optimize for developer productivity

    This is what the ratio of those expenses looks like, in 1985 and today
    Assumptions:
    3-year TCO
    1985: 2 developers and 5 GB
    2013: 2 developers and 5 TB

    Developer costs comprise the lion’s share relative to storage today. So optimize for developer productivity
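
The ratio described in this note can be reproduced with a few lines of arithmetic. The sketch below uses only the figures listed above; treating storage as a one-time purchase over the 3-year period and reading 5 TB as 5,000 GB are assumptions, and the function name `tco_breakdown` is made up for illustration.

```python
# 3-year TCO split between storage and developers, using the figures
# from the notes. Assumptions (not in the original): storage is bought
# once, salaries are paid every year, and 5 TB is counted as 5,000 GB.
YEARS = 3
DEVELOPERS = 2


def tco_breakdown(storage_gb, storage_cost_per_gb, developer_salary):
    """Return storage cost, developer cost, and the developer share of the total."""
    storage = storage_gb * storage_cost_per_gb
    developers = DEVELOPERS * developer_salary * YEARS
    total = storage + developers
    return {
        "storage": storage,
        "developers": developers,
        "developer_share": developers / total,
    }


tco_1985 = tco_breakdown(5, 100_000, 28_000)    # 5 GB at $100,000 per GB
tco_2013 = tco_breakdown(5_000, 0.05, 90_000)   # 5 TB at $0.05 per GB

for year, tco in (("1985", tco_1985), ("2013", tco_2013)):
    print(
        f"{year}: storage ${tco['storage']:,.0f}, "
        f"developers ${tco['developers']:,.0f}, "
        f"developer share {tco['developer_share']:.2%}"
    )
```

Under these assumptions, developers account for roughly a quarter of the 1985 total and well over 99% of the 2013 total, which is the shift the note points to.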
  • 3 Ways Modern Databases Drive Revenue

    1. MongoDB Inc. Proprietary and Confidential 3 Ways Modern Databases Drive Revenue Matt Asay, VP of Marketing
    2. The Last 40 Years Of Data: Neat and Tidy
    3. Your Industry Has Changed • Business: UPFRONT to SUBSCRIBE • Applications: YEARS to MONTHS • Customers: PC to MOBILE • Engagement: ADS to SOCIAL • Infrastructure: SERVERS to CLOUD
    4. Your Data Has Changed • 90% of data created in the last 2 years • 80% of enterprise data is unstructured • Unstructured data growing 2x faster than structured Sources: IBM, Gartner 2012
    5. What Hasn’t Changed?
    6. How Do You Manage Big Data? * From Big Data Executive Summary of 50+ execs from F100, gov orgs Top Big Data Issues “Of Gartner's "3Vs" of big data (volume, velocity, variety), the variety of data sources is seen by our clients as both the greatest challenge and the greatest opportunity.” Forrester, 2014 Data Variety (68%) Data Volume (15%) Other Data (17%) Diverse, streaming or new data types Greater than 100TB Less than 100TB
    7. NoSQL ≠ NoSQL
    8. How a Modern Database Can Change Your Business Scale Unlock Adapt
    9. Scale
    10. • Horizontal scale – on commodity hardware or cloud – is mandatory • Most apps require TBs of data, but you want PBs of headroom Your Database Must Not Throttle Success Ambitious startup scaled to 1M+ users in weeks; 100s of millions of emails/month Global media company scaled MongoDB to 4.5 PBs across public cloud infrastructure Automated failover and ability to add nodes means scale without downtime; “Blown away by MongoDB’s performance”
    11. Powerful predictive analytics system that started on Chief Data Officer’s laptop Iterate… Problem Results • Diverse data from 30+ different government agencies • Limited budget – had to prove the system to justify budget • Had to be able to integrate geospatial data with other highly unstructured data • Scales from single node to many, many servers • Easy-to-manage dynamic data model enables limitless growth • Support for ad hoc queries, geospatial • Award-winning government project • Cost effective while delivering exceptional performance • Easily extended to incorporate new data sources Why MongoDB
    12. Adapt
    13. Innovation Requires Iteration New Table New Table New Column Name Pet Phone Email New Column 3 months later…
    14. RDBMS From Complexity to Simplicity MongoDB { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: "C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] }
    15. • Break free of schema servitude: focus on your app, not object-relational mapping and rigid schema design Your Database Must Make It Simple to Add New Data Sources and Types Struggled for years with RDBMS: schema customization too difficult. MongoDB “added flexibility and easy scalability” Shaved years off projects to <4 months; lowered TCO; no security compromise. “Devs build apps w/out becoming DBAs” Dramatically decreased drug development time by making adding new data types easy; integrates seamlessly with RDBMS
    16. Single view of customer data (Virtually impossible with RDBMS) Diverse Data… Problem Why MongoDB Results • 70+ disparate data sources (mainframe, RDBMS) • RDBMS could not support centralized data mgt and federation of information services • Document model allows easy integration of diverse data sources • Fast, easy scalability • Full query language • Delivers high scalability, fast performance, and easy maintenance, while keeping support costs low • Successful POC in 3 weeks; in production within 90 days • Single view of the customer (improved customer experience, improved sales) • 71% less expensive
    17. Unlock
    18. • Storing data for fast access isn’t enough. Questions matter most • Database must support rich queries, indexing, aggregation and search across multi-structured, rapidly changing data sets in real time Your Database Must Enable Rich Querying of Your Data Transformed cumbersome data storage to high-performance data analytics. MongoDB-based Internet of Things platform that takes advantage of ever-changing sensor data and analytics against this data. Runs unified data store serving hundreds of diverse web properties on MongoDB
    19. 50% increase in paid subscribers due to 95% performance improvement over RDBMS Multi-attribute Queries Problem Why MongoDB Results • RDBMS couldn’t handle high-volume, bi-directional searches • Couldn’t persist a billion-plus matches • RDBMS was difficult to manage in production (schema changes were painful; hard to scale) • Ease of management: auto-scaling, auto-sharding, no downtime • Complex queries across 250+ different attributes • Exceptional performance • Ability to dynamically update schema without complex schema redesign • 95% performance improvement: 3 billion matches daily using 60 million complex queries across 250+ attributes • Big increase in customer satisfaction, paid subscribers • Significantly less expensive
    20. “I have not failed. I've just found 10,000 ways that won't work.” ― Thomas A. Edison
    21. Optimize for (Developer) Iteration 1985 2013 Infrastructure Cost Engineer Cost
    22. 8,000,000+ MongoDB Downloads Build on NoSQL’s Largest Ecosystem 1,000+ Customers Across All Industries; Hundreds of Thousands of Users 600+ Technology and Services Partners 35,000+ MongoDB Management Service (MMS) Users 35,000+ MongoDB User Group Members 200,000+ Online Education Registrants
    23. MongoDB Can Help
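
The employee document on slide 14 and the "new column" problem on slide 13 can be illustrated without any database at all. The sketch below is plain Python, not MongoDB driver code: the `_id` is kept as a plain string rather than an ObjectId, and the `pet` field and the "Vision" benefit are hypothetical additions made up for the example.

```python
import json

# The document from slide 14, written as a Python dict.
# Assumption: _id is shown as a plain string, not a real ObjectId.
employee = {
    "_id": "4c4ba5e5e8aabf3",
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}

# Hypothetical new requirement "3 months later": record a pet and a
# third benefit. Only this one document changes; there is no new table
# or column to create first.
employee["pet"] = "dog"
employee["benefits"].append({"type": "Vision", "plan": "Basic"})

# The nested benefits are read directly from the document, with no join.
plans = [benefit["plan"] for benefit in employee["benefits"]]

print(json.dumps(employee, indent=2))
print(plans)
```

In a relational layout the benefits would typically live in a second table and the new attribute would need a schema change; here both are edits to a single nested structure, which is the contrast slide 14 draws.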