Introducing DocumentDB

DocumentDB is a powerful NoSQL solution. It provides elastic scale, high performance, global distribution, a flexible data model, and is fully managed. If you are looking for a scaled OLTP solution that is too much for SQL Server to handle (i.e. millions of transactions per second) and/or will be using JSON documents, DocumentDB is the answer.

Published in: Technology

  1. 1. Microsoft Azure DocumentDB Overview presentation James Serra Big Data Evangelist Microsoft JamesSerra3@gmail.com
  2. 2. About Me • Microsoft, Big Data Evangelist • In IT for 30 years, worked on many BI and DW projects • Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer • Been perm employee, contractor, consultant, business owner • Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference • Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions • Blog at JamesSerra.com • Former SQL Server MVP • Author of book “Reporting with Microsoft SQL Server 2012”
  3. 3. Agenda • NoSQL Overview • DocumentDB Overview • Today’s application environment • Pricing • DocumentDB basics • Service summary • Development scenarios • Resources and tools
  4. 4. What is NoSQL? A database solution designed to compensate for the technical limitations of SQL. Choose the store that best fits your needs.
  5. 5. Traditional approach: relational stores Data is stored in tables that comprise: • Schemas • Columns • Rows Chappell & Associates. “Understanding NoSQL on Microsoft Azure.” 2014. http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf. Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
  6. 6. NoSQL approach: various types of stores. NoSQL databases fall into four categories of stores: key-value, wide-column, graph, and document. Azure DocumentDB covers all but the graph category: it is a document store that includes some key-value and column-store capabilities. PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
  7. 7. Key-value stores Key-value stores offer high speed through the least-complicated data model—anything can be stored as a value, as long as each value is associated with a key or name. Key Value Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
  8. 8. Wide-column stores Wide-column stores are fast and can be almost as simple as key-value stores. They include a primary key, an optional secondary key, and anything stored as a value. Keys and values can be sparse or numerous. [Figure labels: primary key, secondary key, values.] Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
  9. 9. Graph databases Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf Title: Forgotten Bridges Title: Mythical Bridges Purchased Date: 03/02/2011 Purchased Date: 09/09/2011 Purchased Date: 05/07/2011 Name: Ian Name: Alan
  10. 10. Document stores Document stores contain data objects that are inherently hierarchical, tree-like structures (most notably JavaScript Object Notation [JSON] or Extensible Markup Language [XML]). Note that these are not Microsoft Word documents! Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
  11. 11. NewSQL: another variation Relational NewSQL stores are designed for web-scale applications, but they still require up-front schemas, joins, and table management that can be labor intensive. Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015. http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf.
  12. 12. Why NoSQL evolved Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015 http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf Drivers
  13. 13. SQL and NoSQL: each has its place • Fully featured RDBMS • Transactional processing • Rich query • Managed as a service • Elastic scale • Internet-accessible HTTP/REST • Schema-free data model • Arbitrary data formats
  14. 14. Azure DocumentDB Perfect for cloud architects and developers who need an enterprise-ready NoSQL document database JSON { "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014" } { "name": "Eva", "country": "Germany", "age": 25 } { "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013" } { "docCount": 3, "last": "May 1, 2014" } DOCUMENT1 DOCUMENT2 DOCUMENT3 DOCUMENT4 A NoSQL document database-as-a-service, fully managed by Azure
  15. 15. { "name": "SmugMug", "permalink": "smugmug", "homepage_url": "http://www.smugmug.com", "blog_url": "http://blogs.smugmug.com/", "category_code": "photo_video", "products": [ { "name": "SmugMug", "permalink": "smugmug" } ], "offices": [ { "description": "", "address1": "67 E. Evelyn Ave", "address2": "", "zip_code": "94041", "city": "Mountain View", "state_code": "CA", "country_code": "USA", "latitude": 37.390056, "longitude": -122.067692 } ] } Perfect for: schema-agnostic JSON store for hierarchical and denormalized data at scale What documents? Not Word documents
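To make the "hierarchical and denormalized" point concrete, here is a minimal Python sketch that parses an abridged copy of the SmugMug document from this slide and walks its nested arrays. The helper name `office_cities` is made up for illustration, and nothing here calls a DocumentDB API; it only shows how a single document carries data that a relational design would typically spread across several tables.

```python
import json

# Abridged copy of the sample document shown on the slide.
raw = """
{
  "name": "SmugMug",
  "category_code": "photo_video",
  "products": [{"name": "SmugMug", "permalink": "smugmug"}],
  "offices": [
    {"city": "Mountain View", "state_code": "CA", "country_code": "USA",
     "latitude": 37.390056, "longitude": -122.067692}
  ]
}
"""

company = json.loads(raw)


def office_cities(doc):
    """Return the city of every office nested inside one company document."""
    # "offices" is optional: a schema-free store may hold documents without it.
    return [office["city"] for office in doc.get("offices", [])]


cities = office_cities(company)
print(cities)
```

Because the offices and products are embedded in the company document, reading them requires no join; the trade-off is that shared data may be duplicated across documents.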
  16. 16. Azure DocumentDB details Native support for JavaScript, SQL query, and transactions over JSON documents. Ideal for apps designed for the cloud when the following are high priorities: Reliable and predictable performance • Tunable consistency • Elastic scale Rapid development • Build with familiar tools: REST, JSON, JavaScript Rich query and transactions over JSON data • Query JSON data with no secondary indices
  17. 17. Top Features Auto-scaling/sharding • Improved scalability and reliability due to distribution of large data sets across multiple machines Automatic indexing • All document properties are available for queries • Frees you from relying on schemas or secondary indexes SQL query language • Make use of SQL experience and .NET LINQ Managed service • Spin up on demand with no setup • Availability guarantee of 99.95% • Linear price curve without virtual-machine step functions • Integration with Azure HDInsight and Azure Search
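One way to picture automatic indexing is an inverted index keyed on every property path in every document. The Python sketch below is an illustrative assumption, not a description of how DocumentDB implements its index; the path notation (`/country`, `/tags/[]`) and the function names are invented for the example.

```python
from collections import defaultdict


def flatten(value, path=""):
    """Yield (path, scalar) pairs for every leaf in a nested JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}/{key}")
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, f"{path}/[]")
    else:
        yield path, value


def build_index(documents):
    """Map each (path, value) pair to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for path, value in flatten(doc):
            index[(path, value)].add(doc_id)
    return index


# Documents with different shapes; no schema is declared anywhere.
documents = {
    "doc1": {"name": "John", "country": "Canada", "age": 43},
    "doc2": {"name": "Eva", "country": "Germany", "age": 25},
    "doc3": {"name": "Lou", "country": "Australia", "age": 51,
             "tags": ["early", "mobile"]},
}

index = build_index(documents)
canada_ids = index[("/country", "Canada")]
mobile_ids = index[("/tags/[]", "mobile")]
```

Since every path is indexed as documents arrive, a lookup on any property works without the application declaring a schema or a secondary index first.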
  18. 18. Top Features Greater consistency control • Four consistency levels provide more options for consistency, availability, and performance requirements Atomicity, Consistency, Isolation, and Durability (ACID) transaction control • Simpler programming model (compared to state variables) • Use JavaScript for insert, update, and delete actions Standards-based open API with RESTful HTTP • Uses JSON standard—no mapping of Binary JSON (BSON) to JSON needed Granular access rights • Allows access to all documents and attachments within collections
  19. 19. Monitor an account • View performance metrics for a DocumentDB account • Customize performance metric views for a DocumentDB account • Create side-by-side performance metric charts • View usage metrics for a DocumentDB account • Set up performance metric alerts for a DocumentDB account
  20. 20. Today’s modern apps • Produce and consume data at a staggering rate • Require instantaneous response times to match user expectations • Are developed iteratively with many versions supported concurrently • Are developed with continuously evolving data models • Are increasingly complex • Experience unpredictable and explosive growth
  21. 21. Well-suited for web and mobile apps • Catalog data • Preferences and state • Event store • User-generated content • Data exchange
  22. 22. Azure DocumentDB at Microsoft: user data store • More than 450 million unique users • Store 20 TB of JSON document data • Under 15 millisecond (ms) writes and single-digit ms reads • Store for 40+ app/device combinations • Available globally to serve all markets
  23. 23. Standard pricing tier with hourly billing
  24. 24. Azure DocumentDB basics Resource model • Entities addressable by logical Uniform Resource Identifier (URI) • Partitioned for scale out • Replicated for high availability • Entities represented as JSON • Accounts scale out by moving a slider Interaction model • RESTful interaction over HTTPS • HTTPS and TCP connectivity • Standard HTTP verbs and semantics Development • .NET, Node.js, Python, Java, and JavaScript clients • SQL for query expression, .NET LINQ • JavaScript for server-side app logic [Diagram: an Azure DocumentDB account contains databases; a database contains users (with permissions) and collections; a collection contains documents (with attachments), stored procedures, triggers, and user-defined functions written in JavaScript.]
  25. 25. Azure DocumentDB collections • Collections != tables • Unit of partitioning • Transaction boundary • No enforced schema, flexible • Documents that are queried or updated together stay in one collection • Elasticity to 10 GB • Request units (RUs) evenly distributed across partitions [Diagram: a collection contains documents (with attachments), stored procedures, triggers, and user-defined functions written in JavaScript.]
  26. 26. … Elastic collections • Collection != single partition • Partition count dynamic • Each partition (key) is 10 GB • Online splits and merges with full availability • RUs evenly distributed across partitions
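The slides do not say how documents are assigned to partitions, so the following Python sketch simply assumes a hash of the partition key modulo the partition count. The function names and the `userId` field are hypothetical; the point is only that all documents sharing a key value land in the same partition.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Pick a partition by hashing the key (assumed scheme, for illustration)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(documents, key_field, partition_count):
    """Group documents into partitions according to their partition key."""
    partitions = [[] for _ in range(partition_count)]
    for doc in documents:
        partitions[partition_for(doc[key_field], partition_count)].append(doc)
    return partitions


events = [
    {"userId": "ian", "action": "login"},
    {"userId": "alan", "action": "purchase"},
    {"userId": "ian", "action": "logout"},
    {"userId": "eva", "action": "login"},
]

partitions = distribute(events, "userId", 4)
ian_partition = partition_for("ian", 4)
```

A plain modulo scheme like this would reshuffle most keys whenever the partition count changes; the online splits and merges mentioned on the slide imply something more sophisticated, which this sketch does not attempt to model.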
  27. 27. Rich query over JSON data Native JavaScript transactional processing Familiar SQL-based query language Query on JSON data without specifying secondary indices or constructing views Build modern, scalable apps with robust transactional querying and data processing on JSON documents
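As a rough illustration of querying schema-free JSON, the Python sketch below filters and projects the four sample documents from slide 14 in memory. The comment shows a SQL-style query of the kind the service accepts; the Python `query` helper is an invented stand-in, not a DocumentDB client call.

```python
# The four sample documents from slide 14; note they do not share one shape.
documents = [
    {"name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"},
    {"name": "Eva", "country": "Germany", "age": 25},
    {"name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"},
    {"docCount": 3, "last": "May 1, 2014"},
]


def query(docs, predicate, projection):
    """Return projection(doc) for every document that satisfies predicate."""
    return [projection(doc) for doc in docs if predicate(doc)]


# Roughly equivalent in spirit to: SELECT c.name FROM c WHERE c.age > 30
names = query(
    documents,
    predicate=lambda d: isinstance(d.get("age"), int) and d["age"] > 30,
    projection=lambda d: d["name"],
)
print(names)
```

The fourth document has no `age` property, so it simply fails the predicate instead of raising an error, which is the behavior a schema-free query needs.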
  28. 28. JavaScript transactions Transactionally process multiple documents with application-defined stored procedures and triggers • JavaScript as the procedural language • Language integrated • Execution wrapped in an implicit transaction • Preregistered and scoped to a collection • Performed with ACID guarantees • Triggers invoked as pre- or post-operations Stored procedures JS Triggers
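The "implicit transaction" idea can be sketched as follows: the procedure runs against a working copy, and its changes become visible only if it finishes without raising. In DocumentDB the procedures are written in JavaScript and run inside the service; this Python model, with its hypothetical `Collection` class and `transfer` procedure, only illustrates the all-or-nothing behavior.

```python
import copy


class Collection:
    """Toy in-memory collection with all-or-nothing procedure execution."""

    def __init__(self, documents=None):
        self.documents = documents or {}

    def execute(self, procedure, *args):
        """Run procedure on a working copy; commit only if it returns normally."""
        working = copy.deepcopy(self.documents)
        result = procedure(working, *args)  # an exception skips the commit below
        self.documents = working
        return result


def transfer(docs, source_id, target_id, amount):
    """Move an amount between two documents, failing if funds are insufficient."""
    docs[source_id]["balance"] -= amount
    docs[target_id]["balance"] += amount
    if docs[source_id]["balance"] < 0:
        raise ValueError("insufficient funds")
    return docs[source_id]["balance"]


accounts = Collection({"a": {"balance": 100}, "b": {"balance": 0}})
accounts.execute(transfer, "a", "b", 30)      # commits: a=70, b=30

try:
    accounts.execute(transfer, "a", "b", 500)  # raises, so nothing is committed
except ValueError:
    pass
```

The second call modifies both documents before it fails, yet neither change survives, which is the guarantee that lets a stored procedure update multiple documents safely.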
  29. 29. Reliable and predictable performance • Tunable consistency • Elastic scale • Fast, predictable performance Defined throughput levels that scale linearly with application needs. Azure DocumentDB is born in the cloud to achieve fast, predictable performance with reserved resources to deliver on throughput needs. Delivers reliable, tunable consistency to increase performance based on application needs.
  30. 30. Four consistency levels: Strong, Session, Bounded Staleness, Eventual. Lower the consistency level on read operations: Document myDoc = await client.ReadDocumentAsync(documentLink, new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual });
  31. 31. Consistency levels enable guarantees Choose your consistency level and make predictable trade-offs between consistency, availability, and performance. Choose your level: • Strong: data consistency • Session: monotonic reads (on explicit read requests) and writes • Bounded Staleness: total order of propagation of writes • Eventual: lowest latency for reads and writes
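The trade-off between these levels can be illustrated with a toy model: one write log plus a replica that lags behind it. This Python sketch is an assumption-laden simplification (single value, single replica, hypothetical class and method names) and is not how the service implements consistency; Session consistency is left out because it needs per-client state.

```python
class ReplicatedValue:
    """One value with a primary write log and a single lagging replica."""

    def __init__(self, initial):
        self.log = [initial]      # every committed version, oldest first
        self.replica_version = 0  # index of the last version the replica applied

    def write(self, value):
        self.log.append(value)

    def replicate(self, steps=1):
        """Advance the replica by up to `steps` versions."""
        self.replica_version = min(self.replica_version + steps, len(self.log) - 1)

    def read_strong(self):
        """Always return the latest committed version."""
        return self.log[-1]

    def read_eventual(self):
        """Return whatever the replica has, however stale."""
        return self.log[self.replica_version]

    def read_bounded(self, max_lag):
        """Return a version at most `max_lag` writes behind the latest."""
        lag = len(self.log) - 1 - self.replica_version
        if lag > max_lag:
            self.replicate(lag - max_lag)  # catch up just enough to meet the bound
        return self.log[self.replica_version]


value = ReplicatedValue("v0")
for version in ("v1", "v2", "v3"):
    value.write(version)

strong = value.read_strong()
eventual = value.read_eventual()
bounded = value.read_bounded(max_lag=1)
```

In this model the eventual read is the cheapest because it never waits for the replica, while the strong read always pays for the latest version; bounded staleness sits between the two by capping how far behind a read may be.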
  32. 32. Security model Azure DocumentDB is designed to be secure with: • Master key • Access control on resources • User operations • Permission operations • Code execution
  33. 33. Rapid development • Easy to start and fully managed • Enterprise-grade Azure platform • Build with familiar tools: REST, JSON, and JavaScript Reduce development friction and complexity when building new business-class applications by using familiar tools and industry-standard platforms. Combine Azure DocumentDB with a portfolio of complementary cloud services on the Azure platform, such as the Azure HDInsight Connector and Azure Search Indexer.
  34. 34. Tools https://azure.microsoft.com/en-us/blog/exploring-azure-documentdb-in-visual-studio/ https://azure.microsoft.com/en-us/documentation/articles/documentdb-import-data/ http://portal.azure.com
  35. 35. Azure DocumentDB service summary Unique among NoSQL stores: • Developed for the cloud and for delivery as a service • Truly queryable JSON store • Transactional processing through language-integrated JavaScript • Predictable performance and tunable consistency
  36. 36. Development scenarios Consider Azure DocumentDB when you need: • To build new web and mobile cloud-based applications • Rapid development and high-scalability requirements • Query and processing of user- and device-generated data • More query and processing support for your key-value stores • To run a document store in virtual machines • A managed service model
  37. 37. Build your first Azure DocumentDB app today Get support Schedule a 1:1 chat directly with the Azure DocumentDB engineering team at askdocdb.com Give feedback Ask questions through the forum at http://aka.ms/docdbforum Suggest an idea and vote to support other ideas for Azure DocumentDB at http://aka.ms/docdbideas On Twitter @documentdb Get started Sign up for Azure DocumentDB at http://aka.ms/docdbstart Access and configure your account at http://portal.azure.com Download an SDK from http://aka.ms/docdbsdks, and then build a sample at http://aka.ms/docdbsample
  38. 38. Using DocumentDB Query Playground Go to http://www.documentdb.com/sql/demo and test out sample queries or write your own against the dataset.
  39. 39. Learn more David Chappell NoSQL overview paper on Infopedia http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement [book] http://www.pdfiles.com/pdf/files/English/Databases/Seven_Databases_In_Seven_Weeks.pdf Replicated Data Consistency Explained Through Baseball [paper] http://research.microsoft.com/apps/pubs/default.aspx?id=206913
  40. 40. Q & A ? James Serra, Big Data Evangelist Email me at: JamesSerra3@gmail.com Follow me at: @JamesSerra Link to me at: www.linkedin.com/in/JamesSerra Visit my blog at: JamesSerra.com (where this slide deck will be posted)
