Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Visual Models

You can watch the replay for this Geek Sync webcast, Data in the Cloud: Understanding Amazon Database Services with Visual Models, in the IDERA Resource Center, http://ow.ly/QYVj50A4qkv.

As a data professional, you know that a data model is primarily used for designing databases. But as more databases move to the cloud, data modeling can also serve as a visual approach to capturing the concepts and relationships of database services such as Amazon RDS, Aurora, and Redshift. Data models can demystify the perceived complexities of managing and modeling cloud databases. Henry Nirsberger will show you how conceptual data models for Amazon database services can clear up confusion and accelerate your understanding of these complex offerings.

Speaker: Henry Nirsberger (CDMP, CBIP) is the author of “A Conceptual Data Model for Amazon EC2” and CEO of HMN Consulting LLC, providing IT consulting services specializing in Data Management, Enterprise Architecture, Cloud, Facilitation, and IT Leadership. As a trained facilitator, he has facilitated over 600 IT design and planning sessions for data modeling, process modeling, database design, project planning, process improvement, requirements consensus, strategy planning, issues management, and team building. He continues to be an unremitting student of data modeling, cloud computing, enterprise architecture, and all aspects of data management. His certifications include CDMP (DAMA), CBIP (TDWI), CDP-DM (ICCP), CFPIM (APICS 1984–2003) and TOGAF 9.

Published in: Technology

  1. 1. Data in the Cloud: Understanding Amazon Database Services with Visual Models Henry M. Nirsberger, CEO HMN Consulting, LLC Info@HMNconsulting.com www.HMNconsulting.com A Mind Map for Cloud Database Services!
  2. 2. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial … AWS EC2 Basics 4. Visual Model  A Conceptual Data Model …  Amazon EC2 Basics  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A 6. IDERA … ER/Studio Demo ER Studio
  3. 3. Intro/Objective …. Why? • Many enterprises … Migrating existing apps to Amazon Web Services (AWS) … The Amazon Cloud • Early stages of Cloud Migration … Web/App servers … holding back on Database Servers? • Many IT Data PROs  Little or no direct experience with AWS and Amazon Database Services … Challenge  Quickly learning AWS cloud computing concepts and Database Services "The Cloud" • Relational  Oracle, SQL Server, MySQL, etc. • Non-Relational … NoSQL  Graph, Document, Ledger databases for new classes of apps (e.g., Recommender Engines)
  4. 4. Intro/Objective …. Why? AWS Database and EC2 documentation • Amazon Relational Database Service, API Reference (API Version 2014-10-31) • Amazon Aurora, User Guide for Aurora (API Version 2014-10-31) • Amazon Neptune, User Guide (API Version 2017-11-29) • Amazon DocumentDB, Developer Guide (API version: 2014-10-31) • Amazon Redshift, API Reference (API Version 2012-12-01) • Amazon DynamoDB, Developer Guide (API Version 2012-08-10) • Amazon ElastiCache, API Reference (API Version 2015-02-02) • Amazon ElastiCache for Redis, ElastiCache for Redis User Guide (API Version 2015-02-02) • Amazon ElastiCache, ElastiCache for Memcached User Guide (API Version 2015-02-02) • Amazon Elastic Compute Cloud User Guide for Linux Instances (2016) • Amazon Elastic Compute Cloud, API Reference (API Version 2016-11-15) • Amazon Quantum Ledger Database (Amazon QLDB): Developer Guide (API version: 2019-01-02, Latest documentation update: September 10, 2019) • Thousands of pages … Cliffs Notes? Rosetta Stone? • Sometimes, the best way to understand a complex subject area? Study its data model!
  5. 5. Intro/Objective…Why? Cont’d “A Conceptual Data Model for Amazon EC2” (Kindle eBook) “Data In The Cloud: A Conceptual Data Model for Amazon Database Services” (Kindle eBook and Paperback)
  6. 6. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Amazon EC2 Basics  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A
  7. 7. Amazon Database Services? • Amazon Web Services (AWS) foundation  Amazon EC2 (Elastic Compute Cloud) • EC2 … Virtual Machines, Network, Storage in the cloud • Infrastructure as a Service (IaaS). • Foundation for Platform as a Service (PaaS). • Amazon Database Services are Platform Services built on top of EC2 Reducing CAPEX & OPEX  Substantial Paradigm Shift vs. Provisioning IT infrastructure in private data centers. "The Amazon Cloud" Relational Database Service (RDS), Aurora, Neptune, DocumentDB, Redshift, DynamoDB, ElastiCache, Quantum Ledger Database (QLDB) “Managed Services”  fewer worries … for provisioning servers, backups, scaling resources, HA, etc.  Fewer DBAs??
  8. 8. Amazon Database Services? Relational … SQL • Relational Database Service (RDS)  Database Instances: Oracle DB, MS SQL Server, MySQL, MariaDB, and PostgreSQL • Aurora  Clusters of Database Instances … open source DB engines (MySQL and PostgreSQL) • Redshift  Clusters, columnar … PostgreSQL … very large data sets (e.g., BIDW) Non-Relational … NoSQL • Neptune  Graph DB engines (Gremlin & SPARQL GQLs) • DocumentDB  Clusters of document DB servers (MongoDB) • DynamoDB  Serverless … structured & semi-structured data (JSON files) .. Cache Clusters for global internet scale apps • ElastiCache  Cache Clusters … in-memory (Memcached & Redis) • Quantum Ledger Database (QLDB)  Ledger databases … blends relational, document, and blockchain concepts. Much More to Data “Life” than Relational Stuff!
  9. 9. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial … AWS EC2 Basics 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • IE • Information Engineering Notation
  10. 10. Data Modeling Tutorial .. Intro AWS EC2 Basics Regions  global, geo locations … e.g. US East Region (N. Virginia) Availability Zones are isolated data centers … for High Availability. Account  For billing AWS resources & usage. Amazon Machine Images (AMI)  templates for creating virtual machines (“instances”), E.g. AMI for launching a Linux/Apache Web Server.
  11. 11. Amazon EC2 Basics, Cont’d • An Image (AMI) can be used to launch many instances (virtual machines) … a 1 to Many relationship. • An instance can be used to create 1 or more AMIs. Each instance has an instance type … indicating the size of the instance in terms of vCPUs, RAM & Storage.
  12. 12. Amazon EC2 Basics, Cont’d •Classless Inter-Domain Routing (CIDR)  IP Address Range. •An IP address is part of a CIDR Block, e.g. 192.168.0.0/16. •Each account can have 1 or more Virtual Private Clouds (VPC) -- a virtual network for logically separating AWS resources. •E.g. for different orgs, or for development vs. production apps, etc. •Each VPC is composed of 1 or more subnets, e.g. for Web, app or DB servers. •Each subnet  within a single availability zone •A VPC can traverse > 1 availability zone  HA.
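The CIDR, VPC, and subnet relationships on the EC2 basics slides can be checked with Python's standard `ipaddress` module. This is a minimal sketch, not AWS code: the /16 block comes from the slide, while the /24 subnet size, the tier names, and the sample IP address are assumptions chosen for illustration.

```python
import ipaddress

# VPC address range, taken from the slide's example CIDR block.
vpc_cidr = ipaddress.ip_network("192.168.0.0/16")

# Assumption: carve the VPC into /24 subnets, one per server tier.
subnet_iter = vpc_cidr.subnets(new_prefix=24)
tiers = {name: next(subnet_iter) for name in ("web", "app", "db")}

# An IP address is part of a CIDR block if it falls inside the block's range.
addr = ipaddress.ip_address("192.168.1.25")
in_vpc = addr in vpc_cidr
owner = [name for name, net in tiers.items() if addr in net]

print(in_vpc, owner, tiers["db"])
```

Each subnet here is a slice of the VPC's range, which mirrors the "VPC is composed of 1 or more subnets" relationship in the model.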
  13. 13. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial … AWS EC2 Basics 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A Managed Service for Migrating or Creating Relational Databases
  14. 14. Relational Database Service (RDS) • Migrate/Create Relational Databases  Oracle DB, MS SQL Server, MySQL, MariaDB, and PostgreSQL Key Concepts • Database Instances • Option Groups/Options (e.g., Oracle OEM, JVM) • Parameter Groups/Parameters (e.g., Oracle SGA, PGA) • Event Notification (e.g., backup, low storage) • Reserved Database Instances … reduced pricing • Database Backups … automated & manual • Database Logs/Log Types … monitor activity (e.g., error logs) How model these concepts??
  15. 15. RDS • DB Instance is a subtype of EC2 Instance • Inheritance relationship A DB Instance can have many read-only DB Instances … Read Replicas
  16. 16. RDS Option Groups with Options such as Oracle OEM, JVM Options have Option Settings and Allowed Values
  17. 17. RDS Event Notification Events fall into different categories (e.g., Backups)
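The RDS slides describe a DB Instance as a subtype of EC2 Instance that can have many read-only replicas. One way to picture that inheritance and the one-to-many replica relationship is with Python dataclasses. This is a sketch of the conceptual model only: the class names, field names, and sample identifiers are assumptions, and nothing here calls the RDS API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class EC2Instance:
    instance_id: str
    instance_type: str  # size in terms of vCPUs, RAM and storage


@dataclass(eq=False)
class DBInstance(EC2Instance):
    # Subtype: inherits instance_id and instance_type from EC2Instance.
    engine: str = "mysql"
    source: Optional["DBInstance"] = None  # set only on a read replica
    read_replicas: List["DBInstance"] = field(default_factory=list)

    def add_read_replica(self, instance_id: str) -> "DBInstance":
        """Create a read-only replica that points back to this instance."""
        replica = DBInstance(instance_id, self.instance_type, self.engine, source=self)
        self.read_replicas.append(replica)
        return replica


# Hypothetical identifiers for illustration.
primary = DBInstance("db-1", "db.m5.large")
replica = primary.add_read_replica("db-1-ro")
print(isinstance(replica, EC2Instance), len(primary.read_replicas))
```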
  18. 18. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial … AWS EC2 Basics 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • Database Clusters  Many DB instances • Open Source … MySQL & PostgreSQL
  19. 19. Aurora • Database Clusters  Many DB instances • Open Source  MySQL and PostgreSQL • Cluster  security groups, subnet group, parameter group, engine version, a source for event notifications. • Aurora Concepts: o Primary DB instance o Read Replica DB instances o Read Replica DB clusters o Virtual cluster volumes … SSD … replicated across AZs o Backtracking … Change Records … rewind/undo o Serverless DB clusters … warm pools of DB instances How model these concepts?? Both Reads & Writes Read-only, Performance, HA, Updates Auto Synchronized Cross-Region Clusters… remote customers … MySQL
  20. 20. Aurora 2 Types of Clusters: • DB Clusters • Cache Clusters … in-memory data 4 Types of DB Clusters: • Aurora • Neptune • DocumentDB • Redshift Clusters … Overview Cluster Taxonomy
  21. 21. Aurora DB Clusters inherit Cluster relationships, e.g. Security Groups (~ Firewalls) • Aurora Clusters inherit DB Cluster relationships, e.g. Snapshots (backups) • Aurora Clusters  Many DB Instances • Aurora … Special relationships, e.g. for MySQL cross-region replicas, backtracks. Aurora Cluster … Overview Firewalls
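The cluster taxonomy and its inherited relationships can be expressed as a small lookup that walks from a cluster type up to the root supertype. This is an illustrative sketch: the type names and relationships are taken from the slides, but the dictionary layout and the `relationships` helper are assumptions, not part of any AWS API.

```python
# Parent of each cluster type (None marks the root supertype).
PARENT = {
    "Cluster": None,
    "DB Cluster": "Cluster",
    "Cache Cluster": "Cluster",
    "Aurora Cluster": "DB Cluster",
    "Neptune Cluster": "DB Cluster",
    "DocumentDB Cluster": "DB Cluster",
    "Redshift Cluster": "DB Cluster",
}

# Relationships declared at each level, as named on the slides.
OWN = {
    "Cluster": ["security groups"],
    "DB Cluster": ["snapshots"],
    "Aurora Cluster": ["DB instances", "cross-region replicas", "backtracks"],
}


def relationships(cluster_type):
    """Collect relationships from the root supertype down to cluster_type."""
    chain = []
    while cluster_type is not None:
        chain.append(cluster_type)
        cluster_type = PARENT[cluster_type]
    result = []
    for level in reversed(chain):
        result.extend(OWN.get(level, []))
    return result


aurora = relationships("Aurora Cluster")
cache = relationships("Cache Cluster")
print(aurora, cache)
```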
  22. 22. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • Apps  Complex M/M Relationships, e.g. Recommender engines • Graph Databases Why?
  23. 23. Neptune • Database Clusters • NoSQL  “Non SQL” or "Not Only SQL" … Non-Relational • SQL … GQLs … Graph Query Languages: Gremlin, SPARQL • Labelled Property Graphs (LPGs) • Graph Data Structures  Vertices/Nodes (~ Rows)  Edges (~ Relationships)  Properties (~ Columns) Person 1 Person 3 Person 2 Edge e.g. “Friend” Edge e.g. “Connected” Label = PERSON PERSON PERSON Vertex/Node • Graph Databases • Complex M/M relationships, e.g. Recommender engines NoSQL Examples??
  24. 24. Neptune • Gremlin  Graph Query Language … GQL  ADDV (Add Vertex) … SQL INSERT  ADDE (Add Edge) … linkage from 1 Vertex to another … analogous to a Foreign Key  PROPERTY (Add Property) … Column Value ... Schemaless  HAS … Filtering … analogous to a SQL SELECT with a WHERE clause  DROP: The drop step … analogous to SQL DELETE Examples g.addV('person').property(id, 'PER-0001').property('name','Random A. Person').property('dob', '03/03/1995') g.addE('friend').from(g.V('PER-0001')).to(g.V('PER-0002')) SQL Select, Insert, Update, Delete How model Neptune platform concepts?? NoSQL
  25. 25. Neptune • Neptune Clusters inherit DB Cluster relationships, e.g. Snapshots (backups) • Neptune Clusters  Many DB Instances … graph DB instances • Primary & Read Replicas • Unlike Aurora Clusters … • No cross-region replicas • No backtracks Neptune Clusters similar to Aurora Clusters … Differences in RED
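The labelled property graph structures from the Neptune slides (vertices, edges, properties) can be mimicked with a toy in-memory graph. This is a sketch for intuition only, not Gremlin and not a Neptune client: the `Graph` class and its method names are hypothetical, loosely mirroring the `addV`, `addE`, and `has` steps, and the second person's name is an invented sample value.

```python
class Graph:
    """Toy labelled property graph: vertices carry a label and properties."""

    def __init__(self):
        self.vertices = {}  # vertex id -> {"label": ..., "props": {...}}
        self.edges = []     # (edge label, from vertex id, to vertex id)

    def add_v(self, label, vid, **props):
        # Schemaless: any property names may be supplied per vertex.
        self.vertices[vid] = {"label": label, "props": props}

    def add_e(self, label, from_id, to_id):
        # An edge links one existing vertex to another, like a foreign key.
        if from_id not in self.vertices or to_id not in self.vertices:
            raise KeyError("both endpoints must exist")
        self.edges.append((label, from_id, to_id))

    def has(self, key, value):
        # Filter vertices by a property value.
        return [vid for vid, v in self.vertices.items()
                if v["props"].get(key) == value]

    def out(self, vid, label):
        # Follow outgoing edges with the given label.
        return [t for (lbl, f, t) in self.edges if lbl == label and f == vid]


g = Graph()
g.add_v("person", "PER-0001", name="Random A. Person", dob="03/03/1995")
g.add_v("person", "PER-0002", name="Another Person")
g.add_e("friend", "PER-0001", "PER-0002")
print(g.has("name", "Random A. Person"), g.out("PER-0001", "friend"))
```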
  26. 26. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • Agile developers …“more intuitive” • Document DBs  MongoDB • JSON files … Schemaless Why?
  27. 27. DocumentDB • Database Clusters • NoSQL … Differences with RDS, Aurora, and Neptune Key Concepts • MongoDB • Collections … like Tables • Documents … like Rows • Field … a Key-Value pair … like a column of a row • Embedded Documents … Nested Data • Document Databases • MongoDB … JSON files Semi-structured Data JSON  Key-value Pairs NoSQL Examples?? 1/Many Relationships within a Document
  28. 28. DocumentDB • insertOne (~ SQL INSERT) inserts a document into a collection. • insertMany: Inserts multiple documents into a collection. • find: (SQL SELECT) retrieves documents from a collection. • updateOne: (SQL UPDATE) updates a document in a collection • updateMany: updates all documents that satisfy search criteria for a specified collection. • deleteOne: (SQL DELETE) removes a document from a collection based on search criteria. • deleteMany: This method removes all documents that satisfy specified search criteria from a specified collection. How model DocumentDB platform concepts? { "SSN": "123-45-6789", "EmployeeID": "PER-0001", "Name": "Random A. Person", "DOB": "1990-01-01", "Jobtitle": "sales person", "Street": "1000 Any Street", "City": "Any Town", "State-Province": "NY", "Country": "USA" } Document (~ Row) SQL Select, Insert, Update, Delete Employee Collection NoSQL
  29. 29. DocumentDB • DocumentDB Clusters inherit DB Cluster relationships, e.g. Snapshots (backups) • DocumentDB Clusters  Many DB Instances • Primary & Read Replicas • Unlike Aurora Clusters … • No cross-region replicas • No backtracks DocumentDB Clusters similar to Aurora & Neptune Clusters … Differences in RED
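The collection operations listed on the DocumentDB slides can be imitated with a list of dictionaries. The sketch below is a toy in-memory stand-in, not the MongoDB or DocumentDB driver: the `Collection` class is hypothetical, its method names are snake_case versions of the slide's operations, matching is simple field equality, and the second employee document is invented sample data.

```python
class Collection:
    """Toy document collection: each document is a dict of key-value pairs."""

    def __init__(self):
        self.docs = []

    @staticmethod
    def _matches(doc, criteria):
        return all(doc.get(k) == v for k, v in criteria.items())

    def insert_one(self, doc):           # ~ SQL INSERT
        self.docs.append(dict(doc))

    def find(self, criteria):            # ~ SQL SELECT
        return [d for d in self.docs if self._matches(d, criteria)]

    def update_one(self, criteria, changes):  # ~ SQL UPDATE (first match only)
        for d in self.docs:
            if self._matches(d, criteria):
                d.update(changes)
                return 1
        return 0

    def delete_many(self, criteria):     # ~ SQL DELETE (all matches)
        before = len(self.docs)
        self.docs = [d for d in self.docs if not self._matches(d, criteria)]
        return before - len(self.docs)


employees = Collection()
employees.insert_one({"EmployeeID": "PER-0001", "Name": "Random A. Person", "City": "Any Town"})
# Schemaless: a second document may carry different fields.
employees.insert_one({"EmployeeID": "PER-0002", "Name": "Another Person", "Skills": ["SQL"]})
employees.update_one({"EmployeeID": "PER-0001"}, {"Jobtitle": "sales person"})
found = employees.find({"City": "Any Town"})
removed = employees.delete_many({"Name": "Another Person"})
print(found, removed)
```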
  30. 30. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • Analytics … OLAP • Large data sets … Fast Response  Columnar Database  Massively Parallel Processing Why?
  31. 31. Redshift • Database Clusters • For OLAP & BIDW  Large data sets … Few columns accessed • Comfort Zone … Relational … PostgreSQL • New Vocabulary: o Leader Node  Many Compute Nodes o Columnar Data … Single column values for many rows stored in each data block How model Redshift platform concepts?? Both Reads & Writes Read-only, Performance, HA, Updates Auto Synchronized Many “Nodes” … not “DB instances” Star Schema Dimension Tables & Fact Tables  Partitioned Data Sets … Distributed across Nodes  Massively Parallel Processing (MPP) Fast!
  32. 32. Redshift • Redshift Clusters inherit DB Cluster relationships, e.g. Snapshots (backups) • Unlike Aurora Clusters … • No cross-region replicas • No backtracks • Other Differences in RED • Redshift Clusters  Many “Nodes” • Leader & Compute Nodes o Partitioned Data Sets o Massively Parallel Processing • Table Restore Requests
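The two Redshift ideas on these slides, columnar storage and massively parallel processing, can be sketched in a few lines of plain Python. This is a conceptual illustration only: the sample rows, the two-node count, and the modulo distribution rule are assumptions, and Redshift's real storage and distribution mechanics are considerably more involved.

```python
# Sample fact rows (invented values).
rows = [
    {"order_id": 1, "region": "East", "amount": 10.0},
    {"order_id": 2, "region": "West", "amount": 20.0},
    {"order_id": 3, "region": "East", "amount": 5.0},
]

# Columnar layout: the values of one column for many rows are stored together.
columns = {name: [r[name] for r in rows] for name in rows[0]}

# An aggregate over one column only has to touch that column's values.
total = sum(columns["amount"])

# Assumption: rows are distributed across 2 compute nodes by order_id.
# Each node sums its own partition; the leader node combines the partials.
NODES = 2
partials = [0.0] * NODES
for r in rows:
    partials[r["order_id"] % NODES] += r["amount"]
leader_total = sum(partials)

print(total, partials, leader_total)
```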
  33. 33. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • Multimaster Database • Cache Clusters • Globally Distributed, Internet Scale Apps • Thousands of concurrent users Why?
  34. 34. DynamoDB  • NoSQL … structured & semi-structured … key-value pairs … JSON Documents • New Vocabulary … o Tables  Items (~ Rows)  Attributes (key-value pairs, ~ Columns) o Global Tables … replicated across regions … updates synchronized o Throughput Settings … Serverless … No provisioning of DB servers  Read Capacity Units (RCUs) … anticipated # of table reads/sec  Write Capacity Units (WCUs) … anticipated # of table writes/sec  Auto Scaling Policies o Cache Clusters  Item Cache, Query Cache, eventually and strongly consistent reads … DynamoDB Accelerator … DAX Clusters Multimaster Database Performance, World-wide Access, Disaster Recovery, HA • Serverless  Based on Table Reads/Writes • Servers automatically allocated from a “warm pool” of servers Globally Distributed, Internet Scale Applications NoSQL Examples??
  35. 35. DynamoDB DynamoDB vs. SQL • PutItem  Adds an item to a table (~ SQL INSERT) • GetItem  Retrieves a single item by its primary key (~ SQL SELECT) • Query  Retrieves multiple items based on query filters (~ SQL SELECT) • UpdateItem  Updates a single item (~ SQL UPDATE) • DeleteItem  Deletes one item (~ SQL DELETE) NoSQL SQL Select, Insert, Update, Delete How model DynamoDB platform concepts??
  36. 36. DynamoDB Global Table … replicated across Regions • Auto Scaling Policies • Serverless … • # of Reads on each Table • # of Writes on each Table Schemaless … Attributes Not Predefined ~ Rows Table Indexes What about Cache Clusters … DAX Clusters??
  37. 37. DynamoDB 2 Types of Clusters: • DB Clusters • Cache Clusters … in-memory data 2 Types of Cache Clusters: • DAX Clusters • ElastiCache Clusters • DAX  DynamoDB Accelerator … response times ~ Microseconds • In-memory … Pareto Principle • Primary Node … Read Replica Nodes • Item Cache  Items accessed using Keys • Query Cache  Result sets accessed using Parameter Values
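The throughput settings mentioned on the DynamoDB slides (RCUs and WCUs) come down to simple arithmetic. The sketch below assumes the provisioned-capacity sizing rules as commonly documented: one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. The function names are hypothetical, and the figures should be verified against the current DynamoDB documentation before being used for planning.

```python
import math


def read_capacity_units(reads_per_sec, item_kb, strongly_consistent=True):
    """Estimate RCUs for a table, assuming 4 KB per read unit."""
    units_per_read = math.ceil(item_kb / 4)
    rcus = reads_per_sec * units_per_read
    # Assumption: eventually consistent reads cost half as much.
    return rcus if strongly_consistent else math.ceil(rcus / 2)


def write_capacity_units(writes_per_sec, item_kb):
    """Estimate WCUs for a table, assuming 1 KB per write unit."""
    return writes_per_sec * math.ceil(item_kb)


# Invented workload: 100 reads/sec of 6 KB items, 50 writes/sec of 2.5 KB items.
strong = read_capacity_units(100, 6)
eventual = read_capacity_units(100, 6, strongly_consistent=False)
writes = write_capacity_units(50, 2.5)
print(strong, eventual, writes)
```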
  38. 38. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • In-memory storage of data • Rapid Response • No back-end database servers? Why?
  39. 39. ElastiCache • Cache Server Clusters  NoSQL, Key-values, In-memory • Possible to persist and recover data using … o Backups o Change logs • New Vocabulary:  Memcached  Redis  Lazy Loading Caching  Write Through Caching  Replication Groups For Even Faster Response Times • Open-source, partitioning data across multiple Cache Servers …. called “nodes” • High Availability  Multiple AZs • Data Structure Server ... beyond Key-Value Pairs • Abstract Data Types: e.g., Lists, Sorted Sets, Hashes (~ Rows … of Key-Value Pairs) Redis • App updates DB & Cache • Cache always current • Cache Miss  App accesses DB directly • App refreshes cache data • Each Partition  Group of Nodes • Primary node & Read Replica nodes Redis Possibly No back-end database servers??
  40. 40. ElastiCache Redis … Data Structure Server • Strings ~ Blob • Hashes ~ Row in an RDBMS … Row of Key-Value pairs • Lists … Ordered sequence of string values • Sets … Unordered collection of unique string values • Publish/Subscribe … Message subscriptions Memcached … Key-value store • Strings  hash table • Key String Value  Another String Value NoSQL … API Examples • LPUSH • RPUSH • LRANGE • HMSET • HMGET • HEXISTS • Set Data • Add Data • Replace Data • Append Data • Prepend Data • Get Data • Delete Key SQL Select, Insert, Update, Delete NoSQL
  41. 41. ElastiCache 2 Types of Clusters: • DB Clusters • Cache Clusters … in-memory data 2 Types of Cache Clusters: • DAX Clusters • ElastiCache Clusters • In-memory … Pareto Principle • ElastiCache Nodes
  42. 42. ElastiCache Super Fast Response Times Replication Group = A Type of ElastiCache Cluster Redis A Replication Group has many Node Groups A Node Group for Each Partition  A Primary node & Read Replica nodes
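The two caching strategies named on the ElastiCache slides, lazy loading and write-through, differ in when the cache gets populated. The sketch below shows both with a plain dictionary as the cache and a tiny stand-in for the database; the `Database` class and function names are hypothetical and no Redis or Memcached client is involved.

```python
class Database:
    """Stand-in for a back-end database that counts how often it is read."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


def lazy_get(cache, db, key):
    """Lazy loading: on a cache miss the app reads the DB, then refreshes the cache."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    cache[key] = value
    return value


def write_through_put(cache, db, key, value):
    """Write-through: the app updates the DB and the cache together."""
    db.put(key, value)
    cache[key] = value


db = Database()
cache = {}
db.put("k1", "v1")                      # written to the DB only
first = lazy_get(cache, db, "k1")       # miss: goes to the DB
second = lazy_get(cache, db, "k1")      # hit: served from the cache
write_through_put(cache, db, "k2", "v2")
third = lazy_get(cache, db, "k2")       # hit: cache is already current
print(first, second, third, db.reads)
```

Lazy loading only caches what is actually requested, at the cost of a slower first read; write-through keeps the cache current but also caches data that may never be read.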
  43. 43. Agenda 1. Intro/Objective …. Why? 2. What? Amazon Database Services? 3. Data Modeling Tutorial/Refresher 4. Visual Model  A Conceptual Data Model …  Relational Database Service (RDS)  Aurora  Neptune  DocumentDB  Redshift  DynamoDB  ElastiCache  Quantum Ledger Database (QLDB) 5. Q&A • CyberSecurity Threats? Data Integrity? • Ledger Databases … System of Record • Immutable … Append Only • Blockchain Concepts Why?
  44. 44. Quantum Ledger Database (QLDB) • Ledger database for System of Record (SOR) apps  Complete transaction history (e.g., eCommerce order tracking & fulfillment). • Append only Journal of entries … Built-in change history Smorgasbord of Concepts … Relational … Document … Blockchain Concepts Cyber-security threats to data integrity? • Tables • SQL Like Avoids … Triggers, Stored Procedures, Partitioned Tables, Audit Logs, etc. No Updates to existing data • Merkle Trees • Merkle Audit Proofs Documents  Key-Value Pairs, like Rows in a Table • Documents  in “Blocks” linked by cryptography … SHA-256 Hash Codes • Immutable and Verifiable
  45. 45. Quantum Ledger Database (QLDB) • PartiQL  Open Source … ~ SQL  INSERT, SELECT, UPDATE, DELETE • Extensions to SQL  Access to documents Dot Notation and Aliasing of nested data. INSERT INTO PurchaseOrder { 'POId' : 'PO123456789', 'CustomerId' : 'Any Random Customer', 'OrderDate' : `2019-12-25T`, 'POItems' : [ { 'ItemId' : 'Random Widget A' , 'Qty' : 1, 'UnitPrice': 1.75}, { 'ItemId' : 'Random Widget B' , 'Qty' : 2, 'UnitPrice': 2.75}, { 'ItemId' : 'Random Widget C' , 'Qty' : 3, 'UnitPrice': 3.75} ] } SELECT po.POId, po.OrderDate, poi.ItemId, poi.Qty FROM PurchaseOrder AS po, @po.POItems AS poi WHERE po.CustomerId = 'Any Random Customer' • Alias for nested data • Simplifies access • Avoids Table Join
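The effect of the slide's SELECT, where the `poi` alias ranges over the nested `POItems` of each purchase order, can be reproduced in plain Python with a nested comprehension. This is an illustration of what the query returns, not PartiQL and not the QLDB driver; the order date is simplified to a string.

```python
# The purchase order document from the slide, as a Python dict.
purchase_orders = [
    {
        "POId": "PO123456789",
        "CustomerId": "Any Random Customer",
        "OrderDate": "2019-12-25",
        "POItems": [
            {"ItemId": "Random Widget A", "Qty": 1, "UnitPrice": 1.75},
            {"ItemId": "Random Widget B", "Qty": 2, "UnitPrice": 2.75},
            {"ItemId": "Random Widget C", "Qty": 3, "UnitPrice": 3.75},
        ],
    }
]

# One output row per nested item, with no join against a second table.
result = [
    (po["POId"], po["OrderDate"], poi["ItemId"], poi["Qty"])
    for po in purchase_orders
    if po["CustomerId"] == "Any Random Customer"
    for poi in po["POItems"]
]

for row in result:
    print(row)
```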
  46. 46. QLDB Relational Concepts Document Concepts SHA-256 hash codes for verifying immutability Blockchain Concepts Blocks linked by SHA 256 Hash Codes
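The idea of blocks linked by SHA-256 hash codes can be demonstrated with a simplified append-only journal built on Python's `hashlib`. This is a sketch of the general hash-chaining concept under stated assumptions: the block layout, the all-zero starting hash, and the function names are invented for illustration. It is not QLDB's actual journal format, and it uses a linear chain rather than the Merkle trees mentioned on the earlier slide.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first block


def block_hash(document, previous_hash):
    """Hash a document together with the previous block's hash."""
    payload = json.dumps(document, sort_keys=True) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(journal, document):
    """Append-only: a new block is linked to the hash of the block before it."""
    previous_hash = journal[-1]["hash"] if journal else GENESIS
    journal.append({
        "doc": document,
        "prev": previous_hash,
        "hash": block_hash(document, previous_hash),
    })


def verify(journal):
    """Recompute every hash; any edit to an earlier document breaks the chain."""
    previous_hash = GENESIS
    for block in journal:
        if block["prev"] != previous_hash:
            return False
        if block["hash"] != block_hash(block["doc"], previous_hash):
            return False
        previous_hash = block["hash"]
    return True


journal = []
append(journal, {"POId": "PO123456789", "Qty": 1})
append(journal, {"POId": "PO123456789", "Qty": 2})
intact = verify(journal)

journal[0]["doc"]["Qty"] = 99  # tamper with existing data
tampered_ok = verify(journal)
print(intact, tampered_ok)
```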
  47. 47. The End! “A Conceptual Data Model is worth a thousand tweets.”
