Webinar: Realizing the Promise of Machine to Machine (M2M) with MongoDB

From sensors to location-tracking, machines generate an enormous amount of information. Despite the potential opportunities for monetization, organizations have struggled to realize the promise of M2M and the Internet of Things. In this webinar, we'll explore how MongoDB can ingest, store, manage and analyze vast amounts and types of data, enabling new M2M applications that were previously not possible. We'll discuss example applications based on real-world use cases, including schema design, example queries, and aggregation.

Published in: Technology, Business

Speaker notes:
  • Once upon a time, machine-readable data got created at the rate at which cards could be punched.
  • But nowadays, all kinds of systems can be generating data continuously. (Image from money.cnn.com)
  • This is, in some sense, a new space.
  • I presume you all know what open source means at this point, and I’m going to leave high performance unexamined for the moment.
  • In many respects, MongoDB’s feature set is relatively analogous to the kinds of features that traditional databases have offered: dynamic queries, built-in aggregation, specialized indexing types and search facilities (geospatial and text search, for instance). Only we change the data model we use, partly for programmer productivity and also because the changed data model permits a powerful scaling approach, which, indeed, is critical to the performance requirements of M2M systems.
  • So our data model is based on the idea of nestable documents, rather than flat rows. These model data structures in programs considerably more conveniently than flat relational rows do, leading to gains in programmer productivity, among other things.
  • But MongoDB also offers a horizontal scaling architecture, whereby the capacity of a MongoDB deployment to perform reads and/or writes grows linearly with the quantity of hardware deployed. In practice, this means that growing a MongoDB cluster’s capacity should cost you linearly in the amount you want to grow (whereas traditional vertical scaling approaches tend to evince highly nonlinear marginal scaling costs). (In practice, however, one tends not to deploy clusters of Apple iMacs for a database. At least, I’ve never seen it.)
  • And within a cluster, it’s customary for each of the units of horizontal scaling, the shards, to be a set of nodes that replicate your data. This gives you data redundancy for high-availability and disaster-recovery purposes, but it is also, increasingly, useful for working with large volumes of machine-generated data, as we will revisit later in this webinar.
  • A comment about use cases: all the use cases described here are real; only the names and details have been elided to protect confidentiality. (In many cases, users consider MongoDB to be a “secret weapon” that gives them competitive advantage, which prevents us from talking publicly about names and details. If you happen to be using MongoDB in any capacity and are interested in letting others know about it, we’d love to hear from you.)
  • Note that some of these data are potentially high-volume if you record them frequently enough, or if your buildings/campuses are very large
  • Can optimize for most-common use cases
  • So the nice thing is that you can do different work on different boxes, possibly in different DCs.
  • Now, individuals don’t generate data that fast (owing to their numbers and the laws of physics), but the same ideas apply here as for more numerous or frequent sensor readings.
  • But the interesting thing about customers is that they change, and business goals change, and so what you record about your customer will need to adapt as you go.
  • So, for example, maybe at the outset you record some basic info about your customers, and as you develop more information about them you have an increasingly refined idea of what each customer is about. Maybe you acquire ideas about their habits/purchases/etc.; or maybe next year’s smartphones will be able to measure people’s heart rates or body temperature. (And who knows what Google Glass will let us do?) That information is easy to merge into your existing data about your customer, just by leveraging MongoDB’s flexible schemas. And all this works with the deep features we’ve already discussed: geo, aggregation, dynamic query, etc.

    1. Machine to Machine: Managing Mechanically Generated Data with MongoDB (Richard Kreuter, Consulting Manager, 10gen Inc)
    2. Agenda • What’s Machine to Machine Data? • What’s MongoDB? • Some use cases • Next steps
    3. What’s Machine to Machine About?
    4. Why this is game-changing • Resource tracking • Process optimization • Market analysis • Real-time decision making • etc. – (… including things we haven’t thought of yet…)
    5. Why this is technically challenging • Massive volumes of data • High data ingestion rates • Complex data analysis requirements • Evolving data modeling needs
    6. What’s MongoDB About?
    7. MongoDB is a ___________ database • Open source • High performance • Full featured • Document-oriented • Horizontally scalable
    8. Full Featured • Dynamic (ad-hoc) queries • Built-in online aggregation • Rich query capabilities • Traditionally consistent • Many advanced features • Support for many programming languages
    9. Document-Oriented Database • A document is a nestable associative array • Document schemas are flexible • Documents can contain various data types (numbers, text, timestamps, blobs, etc.)
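Slide 9’s “nestable associative array” can be made concrete with a minimal sketch of a document for the trucking use case that follows, written as the Python dict you would pass to a driver such as pymongo. All field names here are illustrative assumptions, not taken from the slides:

```python
# A hypothetical truck-telemetry reading as a MongoDB document.
# Nested sub-documents ("location", "sensors") model the program's
# data structures directly, rather than flattening them into rows.
reading = {
    "truck_id": "TRK-0042",            # text
    "ts": "2013-05-01T12:00:00Z",      # timestamp (a datetime in practice)
    "location": {                      # GeoJSON point: [longitude, latitude]
        "type": "Point",
        "coordinates": [-73.97, 40.77],
    },
    "sensors": {                       # mixed numeric sensor values
        "fuel_liters": 312.5,
        "cargo_kg": 8200,
    },
}
# Usage against a live deployment: db.trucks.insert_one(reading)
```

Because schemas are flexible, a later reading could add or omit fields (say, a new `tire_pressure` sensor) without any migration step.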
    10. Horizontally Scalable (Sharding)
    11. Replication within a shard
    12. Use Case #1: Keep on Trucking
    13. Suppose you’ve got a lot of trucks • You probably care about where they are • You probably care if they’re on schedule • You might also care what they’re carrying (cargo and/or fuel)
    14. What are these data? • Vehicle tracking data is positional info available via GPS • Cargo/fuel might be continuous sensor data (e.g., volume or weight) or inventory (e.g., RFID)
    15. What MongoDB can do here • For GPS-type data, MongoDB has long had powerful geospatial query facilities: – “Find all trucks within a region (within a certain time range, perhaps)” – “Find all trucks within 100 km of the warehouse/customer”
    16. Recent additions to GeoSpatial • MongoDB version 2.4 introduces support for indexing and querying on various GeoJSON data types (polygons, line strings) – “Find trucking routes that intersect” – “Find routes that pass near a customer”
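The “trucks within 100 km of the warehouse” query from slide 15 can be sketched as the filter document a driver would send. This assumes a 2dsphere index on a hypothetical `location` field holding GeoJSON points, and the warehouse coordinates are made up:

```python
# Filter document for a $near geospatial query, as you would pass to
# pymongo's find(). $maxDistance is in metres for GeoJSON geometries.
near_warehouse = {
    "location": {
        "$near": {
            "$geometry": {
                "type": "Point",
                "coordinates": [-73.97, 40.77],  # [longitude, latitude]
            },
            "$maxDistance": 100_000,  # 100 km
        }
    }
}
# Usage against a live deployment:
#   db.trucks.create_index([("location", "2dsphere")])
#   db.trucks.find(near_warehouse)
```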
    17. What about the cargo/fuel? • Sensor data is an easy fit for MongoDB • Cf. MMS, the MongoDB Monitoring Service – Cloud service hosted by 10gen since 2011 – On the order of 1M measurements written per second (i.e., 10B writes/day) – Continuous data rendering/alerting/analysis of ingested data, 24x7
    18. Use Case #2: Keeping Cool
    19. Suppose you’ve got a building • You probably keep it climate controlled… • … and lit… • … and perhaps secured with entry cards.
    20. So you measure things • Temperature readings and/or HVAC utilization reports • Lights on/off • Swipe-ins/swipe-outs through secured doors
    21. This is “just” sensor data • Straightforward to store in MongoDB documents • With strategic document design, a single server can save hundreds of thousands of sensor reads per second
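One common example of the “strategic document design” slide 21 alludes to is time bucketing: rather than one document per reading, keep one document per sensor per hour and fill in readings as in-place updates. The layout and field names below are an illustrative sketch, not the deck’s specific design:

```python
# One bucket document per sensor per hour; each arriving reading
# becomes a $set update of a single field inside that document,
# which is much cheaper at high write rates than one insert each.
bucket = {
    "_id": "hvac-3F:2013-05-01T12",  # hypothetical key: sensor id + hour
    "sensor": "hvac-3F",
    "hour": "2013-05-01T12",
    "readings": {},                  # filled in minute by minute
}

def record(minute: int, value: float) -> dict:
    """Build the update document that writes one reading into the bucket."""
    return {"$set": {f"readings.{minute}": value}}

# Usage against a live deployment (upsert creates the bucket on first write):
#   db.hvac.update_one({"_id": bucket["_id"]}, record(17, 21.5), upsert=True)
```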
    22. But how do you use this data? • MongoDB has a built-in Aggregation Framework that supports ad-hoc analysis tasks over data sets – “What rooms had the highest average Air Conditioning utilization, bracketed daily?” – “Which secured doors have the most ‘pass-back’ problems?”
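The first question on slide 22 might be expressed as an aggregation pipeline like the following, i.e., the list of stage documents pymongo’s `aggregate()` expects. The field names (`room`, `ts`, `ac_utilization`) and the assumption that `ts` is stored as an ISO-8601 string are illustrative:

```python
# Average A/C utilization per room per day, highest first.
pipeline = [
    {"$group": {
        # Bracket daily by grouping on room plus the date prefix of ts.
        "_id": {"room": "$room", "day": {"$substr": ["$ts", 0, 10]}},
        "avg_util": {"$avg": "$ac_utilization"},
    }},
    {"$sort": {"avg_util": -1}},  # highest average first
    {"$limit": 10},
]
# Usage against a live deployment: db.readings.aggregate(pipeline)
```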
    23. Replication supports analytics • Queries can be parceled out to different replicas • In different DCs, say • Or to segregate competing workloads
    24. Use Case #3: Who are the clients in your neighborhood?
    25. Suppose you’ve got mobile customers • You might know where they are during their day • You might want to pitch them offers while they’re there – Or perhaps notify partners in real-time • And you might evolve your model progressively
    26. Tracking locations is straightforward • We’ve already discussed GPS data – Recording positional information – Geospatial queries for proximity, etc. • Though mobile customers might have more interesting locational data than just GPS – e.g., user U is in retail store S now – or, user U1 is near to U2 now
    27. Evolving your customer model • One interesting thing about people is that they differ • And you might choose to pay attention to changing things as your business evolves • MongoDB makes this easy
    28. Flexible Schema supports Agility • As you become interested in new data, you include that data in your documents – No laborious data migration processes are necessary – just adapt the application models to record new information – Queries and indexes work over polymorphic data models, too
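Slide 28’s point can be sketched concretely: when a new attribute becomes interesting (say, heart rate from next year’s smartphones, per the speaker notes), you simply start writing it into existing customer documents. The field names below are hypothetical:

```python
# An update that enriches an existing customer document with fields
# the schema never anticipated; no migration step is required, and
# older documents without these fields remain valid and queryable.
enrich = {
    "$set": {
        "vitals.heart_rate_bpm": 72,      # new nested attribute
        "segments": ["frequent_buyer"],   # new array-valued attribute
    }
}
# Usage against a live deployment:
#   db.customers.update_one({"_id": customer_id}, enrich)
```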
    29. What’s next?
    30. Going further • Where do you build your next warehouse/place your next office? • How should you staff your retail spaces/factory floors? • What will next year’s smart home products enable?
    31. To recap • MongoDB is widely used for machine-generated sensor, GPS, inventory, and other kinds of data • Dynamic schema, rich query facilities, built-in analytic features, replication and horizontal scaling simplify M2M architectures
    32. Questions? (Richard Kreuter, Consulting Manager, 10gen)