Amazon DocumentDB - Architecture and Best Practices (Level 200) - Presenter: Donghoon Jang (장동훈), Sr. Database SA, WWSO, AWS ::: AWS Data Roadshow 2023
- 1. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Confidential and Trademark.
Amazon DocumentDB
For Modern Applications
Donghoon Jang
Database Solution Architect
WWSO
- 2. Modern Database?
- 3. Why rethink our approach to data?
Transformative changes in how data is accessed
• Explosion of data: data grows 10x every 5 years
• Microservices change data and analytics requirements: purpose-built databases provide optimized performance and cost savings
• Rapid rate of change (DevOps): the transition from IT to DevOps increases the rate of change
- 4. What do modern applications look like?
Users: 1M+
Data volume: Terabytes–petabytes
Locality: Global
Performance: Microsecond latency
Request rate: Millions per second
Access: Mobile, IoT, devices
Scale: Virtually unlimited
Economics: Pay-as-you-go
Developer access: Instant API access
Development: Apps and storage are decoupled
Examples: online gaming, social media, media streaming, e-commerce, shared economy
- 5. What data infrastructure suits modern applications?
Self managed: you handle every task listed below.
Fully managed: the service handles the operational tasks; you handle schema design, query construction, and query optimization.
Operational tasks
• Built-in best practices
• Routine maintenance
• Automated patching
• Industry compliance
• Isolation and security
• Backup and recovery
• Push-button scaling
• Advanced monitoring
• Automatic fail-over
Application tasks
• Schema design
• Query construction
• Query optimization
- 6. What data store suits modern applications?
Moving to Open Database + Commercial-grade performance and reliability
- 7. Amazon Web Services Managed Database Services
DOCUMENT: Amazon DocumentDB
CACHING: Amazon ElastiCache
KEY-VALUE: Amazon DynamoDB
GRAPH: Amazon Neptune
LEDGER: Amazon QLDB
TIME-SERIES: Amazon Timestream
WIDE COLUMN: Amazon Keyspaces
MEMORY: Amazon MemoryDB
RELATIONAL: Amazon RDS, Amazon Aurora
Document database characteristics
• Native JSON data storage, querying, and indexing
• Flexible indexing
• Flexible schema
• Ad hoc query capability
- 8.
Requirements
• MongoDB compatibility
• Scale microservices independently
• Better performance for complex ranking queries
• Seamless migration to a fully managed database service
Solution
• Scaled microservices with 55% fewer instances (cluster per LoB)
• 16x lower latency for ranking queries by using read replicas
• Minimal code changes: 60% savings thanks to MongoDB compatibility
Results
• Outages: 0
• Latency: 500 ms -> 80 ms
• Operational overhead: 50%
• 100+ migrations in 3 months
- 9. Amazon DocumentDB Architecture
- 10. Amazon DocumentDB: Cloud Native Architecture
Modern, cloud-native database architecture
• Compatibility
• Replication
• Separation of storage and compute
• Durability
• Backup
- 11. Amazon DocumentDB: Cloud Native Architecture
[Diagram] Compute (2-96 cores, 4-768 GB RAM): one primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3. Storage: a distributed storage volume spanning the three AZs. Backup: Amazon S3.
Topics: compatibility, replication, separation of storage and compute, durability, backup
- 12. Architecture: Compatibility
Amazon DocumentDB emulates the MongoDB API
db.foo.find({}) {"x":1}
[Diagram] Primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 13. Demo code
- 14. Architecture: Separation of storage and compute
Cloud-native database architecture
[Diagram] Compute: primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3. Storage: distributed storage volume.
- 15. Architecture: Separation of storage and compute
Traditional database architecture: monolithic, shared disk architecture
[Diagram] One stack: API, Query processor, Caching, Logging, Storage
- 16. Architecture: Separation of storage and compute
To scale, copy the entire stack
[Diagram] Three copies of the stack: API, Query processor, Caching, Logging, Storage
- 17. Architecture: Separation of storage and compute
[Diagram] Stack: API, Query processor, Caching, Logging, Storage, with log writes going to Storage
- 18. Architecture: Separation of storage and compute
[Diagram] Compute layer: API, Query processor, Caching, Logging. Storage layer: Storage. Log writes go from the compute layer to the storage layer.
- 19. Architecture: Separation of storage and compute
Decouple compute and storage
• Scale compute
• Scale storage
[Diagram] Compute layer: API, Query processor, Caching, Logging. Storage layer: Storage. Log writes go from the compute layer to the storage layer.
- 20. Architecture: Separation of storage and compute
[Diagram] Compute: primary instance (reads, writes; r6g.large) and two replica instances (reads; r6g.large each) across AZ 1, AZ 2, and AZ 3. Storage: distributed storage volume.
- 21. Architecture: Replication
db.foo.insert({'x':1}) -> ACK
[Diagram] The write db.foo.insert({'x':1}) goes to the primary instance and on to the distributed storage volume, and an ACK is returned. Reads from the two replica instances are eventually consistent.
- 22. Architecture: Replication
db.foo.find({}) {'x':1}
[Diagram] A read from a replica instance returns the inserted document. Reads from the two replica instances are eventually consistent.
- 23. Demo code
- 24. Architecture: Durability
Mean time to recovery (MTTR) is a function of the time needed to replicate 10 GB
[Diagram] Primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 25. Demo code
- 26. Architecture: Backup
Continuous streaming to Amazon S3
[Diagram] The distributed storage volume across AZ 1, AZ 2, and AZ 3 is backed up to Amazon S3.
- 27. Cost Optimization: Pricing
1. Instances: size/hr * count (db.t4g.medium: $0.075/hr)
2. IOPS: count ($0.20/million)
3. Storage: GiB/month ($0.10/GiB)
4. Backup: GiB/month (free up to 100% of cluster storage, then $0.021/GiB)
[Diagram] Compute: primary instance (reads, writes) and two replica instances (reads). Storage: distributed storage volume. Backup: Amazon S3.
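The four pricing dimensions above can be combined into a rough monthly estimate. The following is a minimal sketch that uses the example rates shown on the slide; the 730 hours per month figure, the function name, and the sample workload are assumptions for illustration, and actual rates vary by region and over time.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_cost(instance_count, instance_hourly, io_millions, storage_gib, backup_gib,
                 io_per_million=0.20, storage_per_gib=0.10, backup_per_gib=0.021):
    """Estimate a monthly bill from the four dimensions on the pricing slide."""
    instances = instance_count * instance_hourly * HOURS_PER_MONTH
    io = io_millions * io_per_million
    storage = storage_gib * storage_per_gib
    # Assumption taken from the slide: backup is free up to 100% of cluster storage.
    billable_backup_gib = max(0.0, backup_gib - storage_gib)
    backup = billable_backup_gib * backup_per_gib
    return {
        "instances": instances,
        "io": io,
        "storage": storage,
        "backup": backup,
        "total": instances + io + storage + backup,
    }


# Hypothetical workload: 3 x db.t4g.medium, 200 million I/Os,
# 100 GiB of data, and 150 GiB of backups.
estimate = monthly_cost(3, 0.075, 200, 100, 150)
print({name: round(value, 2) for name, value in estimate.items()})
```

In this sketch the instances dominate the bill, which is why right-sizing instances and stopping idle clusters tend to matter more than backup retention for small clusters.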
- 28. Best Practices
- 29. Cluster Sizing: Availability
Number of instances determines availability target
Availability Target | Total Instances | Replicas | Availability Zones | Recovery Time
99% | 1 | 0 | 1 | 8-10 min
99.9% | 2 | 1 | 2 | <30 sec
99.99% | 3 | 2 | 3 | <30 sec
99.99% | 4 | 3 | 3 | <30 sec
Best Practice: Use at least 2 replicas in different AZs for production deployments
- 30. Cluster Sizing: Instance Performance
Instance Size = Processing Power + Cache
Class | vCPU | Memory (GiB) | Estimated Cache Size (~2/3 of RAM)
t4g.medium | 2 | 4 | ~2.5 GB
r6g.large | 2 | 16 | ~10.5 GB
r6g.xlarge | 4 | 32 | ~21 GB
r6g.2xlarge | 8 | 64 | ~42.5 GB
r6g.4xlarge | 16 | 128 | ~85 GB
r6g.8xlarge | 32 | 256 | ~171 GB
r6g.12xlarge | 48 | 384 | ~256 GB
r6g.16xlarge | 64 | 512 | ~341 GB
r5.24xlarge | 96 | 768 | ~512 GB
Best Practice: Ensure indices and working set fit in cache
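As a rough illustration of the sizing rule above, the sketch below picks the smallest instance class whose estimated cache (about 2/3 of RAM, per the slide) holds a given working set. The memory figures come from the table; the function names and the sample working-set sizes are assumptions, and real cache sizes should be confirmed with the cache hit ratio metrics.

```python
# Memory (GiB) per instance class, copied from the sizing table.
INSTANCE_RAM_GIB = {
    "t4g.medium": 4,
    "r6g.large": 16,
    "r6g.xlarge": 32,
    "r6g.2xlarge": 64,
    "r6g.4xlarge": 128,
    "r6g.8xlarge": 256,
    "r6g.12xlarge": 384,
    "r6g.16xlarge": 512,
    "r5.24xlarge": 768,
}


def estimated_cache_gib(ram_gib):
    """Rule of thumb from the slide: roughly two thirds of RAM is cache."""
    return ram_gib * 2 / 3


def smallest_class_that_fits(working_set_gib):
    """Return the smallest class whose estimated cache holds the working set, or None."""
    for name, ram in sorted(INSTANCE_RAM_GIB.items(), key=lambda kv: kv[1]):
        if estimated_cache_gib(ram) >= working_set_gib:
            return name
    return None


# Hypothetical working set (data plus indexes) of 30 GiB.
print(smallest_class_that_fits(30))
```

A result of None means the working set exceeds the largest estimated cache in the table, which is a hint to trim indexes or data, or to consider sharding.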
- 31. Cluster Sizing: Backups
Recover to any time from 5 minutes ago until the Backup Retention Period
Best practice: set retention based on your Recovery Point Objective
- 32. Connecting: Endpoints
[Diagram] The application connects to a primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
"members":[
{
"_id":1,
"stateStr":"PRIMARY",
...
},
{
"_id":2,
"stateStr":"SECONDARY",
...
},
{
"_id":3,
"stateStr":"SECONDARY",
...
}
]
- 33. Connecting: Replica Set Emulation
Best practice: Use cluster endpoint and connect as a replica set
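To make the best practice concrete, here is a small sketch that assembles a replica-set style connection URI for the cluster endpoint. The host name, user, password, and CA file name are hypothetical placeholders; the replica set name rs0 matches the shell prompts shown elsewhere in this deck, and the remaining options are assumptions to check against your driver's documentation.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user, password, cluster_endpoint, port=27017,
                    read_preference="secondaryPreferred",
                    ca_file="global-bundle.pem"):
    """Build a MongoDB-style URI that connects as a replica set over TLS."""
    options = {
        "tls": "true",
        "tlsCAFile": ca_file,          # assumption: CA bundle stored next to the app
        "replicaSet": "rs0",           # connect as a replica set, not to one instance
        "readPreference": read_preference,
        "retryWrites": "false",        # assumption: retryable writes are disabled
    }
    # Credentials must be percent-encoded so characters like '@' or '/' are safe.
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{cluster_endpoint}:{port}/?{urlencode(options)}")


# Hypothetical cluster endpoint and credentials.
uri = build_docdb_uri(
    "app_user", "p@ss/word",
    "example-cluster.cluster-abc123.us-east-1.docdb.amazonaws.com",
)
print(uri)
```

Connecting through the cluster endpoint in replica set mode lets the driver discover the primary and replicas itself, so the application follows a failover without a configuration change.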
- 34. Connecting: Failover
Primary fails
[Diagram] Primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 35. Connecting: Failover
Replica promoted to new primary
[Diagram] One replica instance (reads) and the new primary instance (reads, writes) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 36. Connection Limits
[Diagram] Containers connect to the primary instance (reads, writes) and two replica instances (reads) across AZ1, AZ2, and AZ3. Each instance accepts up to 30000 connections.
- 37. Cursor Limits
[Diagram] Containers connect to the primary instance (reads, writes) and two replica instances (reads) across AZ1, AZ2, and AZ3. Each instance allows up to 4560 cursors.
- 38. Scaling: Dynamic Read Preference
• Default read preference (readPreference: secondaryPreferred)
• Override on each call (readPreference: primary)
[Diagram] The application connects to a primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 39. Scaling: Write Traffic
[Diagram] The cluster's instances are shown as db.r6g.large and as db.r6g.4xlarge (primary: reads, writes; replicas: reads) across AZ 1, AZ 2, and AZ 3, on a distributed storage volume.
- 40. Scaling: Storage and I/O
The distributed storage volume grows automatically from 10 GiB to 128 TiB
[Diagram] Compute: primary instance (reads, writes) and two replica instances (reads) across AZ 1, AZ 2, and AZ 3. Storage: distributed storage volume.
- 41. Monitoring: Billing
• Create billing alarms
▪ 50% spend
▪ 75% spend
• Cost Allocation Tags
- 42. Monitoring: Instance Metrics
• BufferCacheHitRatio
• IndexBufferCacheHitRatio
• DatabaseConnections
• DatabaseCursors
• FreeableMemory
• CPUUtilization
- 43. Monitoring: Cluster Metrics
• DBClusterReplicaLagMaximum
• DatabaseCursorsTimedOut
• VolumeWriteIOPs
• VolumeReadIOPs
• Opscounters
- 44. Monitoring: Auditing and Profiling
Auditing
• DDL events
• Auth events
• Role Grants
• Create alarms
Profiling
• Slow queries
- 45. Monitoring: Performance Insights
Measure database load to review past system activity
• Average Active Sessions
• Wait States
• Operation level granularity
Complementary to profiling
- 46. Indexing
• Indexes come with a cost
• Constrain indexes to those necessary for common queries
• Rule of thumb: 5 per collection max
• 1% selectivity goal
rs0:PRIMARY> db.collName.getIndexes()
[
{
"v":2,
"key":{
"_id":1
},
"name":"_id_",
"ns":"tournament.results"
},
{
"v":2,
"key":{
"user_id":1
},
"name":"user_id_1",
"ns":"tournament.results"
}
]
- 47. Indexing: Caching
• Verify indexes fit in memory
• Monitor IndexBufferCacheHitRatio
rs0:PRIMARY> db.collName.stats()
{
"ns":"tournament.results",
"count":39549,
"size":7000173,
"avgObjSize":177.303,
"storageSize":8609792,
"capped":false,
"nindexes":2,
"totalIndexSize":5472256,
"indexSizes":{
"_id_":2760704,
"user_id_1":2711552
},
"ok":1
}
- 48. Indexing: Unused Indexes
Drop unused indexes where possible
rs0:PRIMARY> db.collName.aggregate([{$indexStats:{}}]).pretty()
{
"name":"user_id_1",
"key":{
"user_id":1
},
"host":"docdb2019.us-east-2.docdb.amazonaws.com:27017",
"accesses":{
"ops":NumberLong(0),
"since":ISODate("2020-01-15T06:57:38Z")
}
}
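The $indexStats output above can be filtered in application code to list drop candidates. This is a minimal sketch over plain dictionaries shaped like that output; the sample values and the function name are assumptions, and in practice the statistics would need to be gathered from every instance that serves queries before any index is dropped.

```python
def unused_indexes(index_stats):
    """Return names of indexes with zero recorded accesses, skipping the mandatory _id_ index."""
    return [
        stat["name"]
        for stat in index_stats
        if stat["name"] != "_id_" and stat["accesses"]["ops"] == 0
    ]


# Hypothetical $indexStats results, reduced to the fields the check needs.
sample_stats = [
    {"name": "_id_", "accesses": {"ops": 1200}},
    {"name": "user_id_1", "accesses": {"ops": 0}},
    {"name": "score_-1", "accesses": {"ops": 87}},
]

print(unused_indexes(sample_stats))
```

The "since" timestamp in the real output matters too: an index with zero operations since a recent restart has not necessarily been idle for long.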
- 49. Cost Optimization: Start/Stop Cluster
• Stops for up to 7 days; cluster then restarts automatically
• While stopped:
▪ No instance costs
▪ Storage costs continue
▪ Backup costs do not increase
- 50. Cost Optimization: I/O
• Choose the right instance size
▪ Working set and indices should fit in cache
▪ Monitor metrics to ensure cache is appropriately sized
– BufferCacheHitRatio and IndexBufferCacheHitRatio
– Should be >90%
• Special case: TTL workloads
▪ TTL indexes incur I/O to delete data
▪ Instead use a collection per day
– Query all collections for the data of interest
– Drop entire collection when the data “expires” (No I/O cost to drop a collection)
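The collection-per-day alternative to TTL indexes can be sketched as a naming scheme plus two helpers: one lists the collections a query must cover, the other lists collections that have aged out and can be dropped. The "events" prefix, the date format, and the function names are assumptions for illustration.

```python
from datetime import date, timedelta


def collection_for(day, prefix="events"):
    """Name of the collection holding one day's documents, e.g. events_20230310."""
    return f"{prefix}_{day:%Y%m%d}"


def collections_to_query(today, days, prefix="events"):
    """Collections covering the last `days` days, newest first."""
    return [collection_for(today - timedelta(days=i), prefix) for i in range(days)]


def expired_collections(existing, today, retention_days, prefix="events"):
    """Daily collections outside the retention window; these can be dropped whole."""
    keep = set(collections_to_query(today, retention_days, prefix))
    return sorted(name for name in existing
                  if name.startswith(prefix + "_") and name not in keep)


today = date(2023, 3, 10)
existing = ["events_20230307", "events_20230308", "events_20230309",
            "events_20230310", "users"]
print(collections_to_query(today, 3))
print(expired_collections(existing, today, 3))
```

The trade-off is on the read side: a query over a date range has to fan out across several collections and merge the results in the application.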
- 51. Cost Optimization: Storage and Backup
• Store only the data you need
▪ Identify unused indexes
▪ Identify unnecessary data
– Unnecessary fields within documents
– Unnecessary documents
• Keep only the backups you need
▪ Watch your snapshots
– Remove those you no longer need
▪ Review your recovery point objective
– Adjust the backup retention period accordingly
- 52. Security Groups
[Diagram] Inside a VPC, the application is in security group A and the DocumentDB cluster is in security group B.
Security group A:
• Outbound (min): TCP (27017)
Security group B:
• Inbound (min): TCP (27017)
- 53. RBAC – Built-in Roles
Role Scope | Role Name | Description | Actions
Database | read | Read any collection in a DB | collStats, dbStats, find, listCollections, …
Database | readWrite | Read and write any collection in a DB | createCollection, createIndex, insert, remove, update, …
Cluster | readAnyDatabase | Read any collection in any DB | listChangeStreams, listDatabases, [actions in read]
Cluster | readWriteAnyDatabase | Read or write any collection in any DB | listChangeStreams, listDatabases, [actions in readWrite]
Cluster | clusterMonitor | Read access for monitoring tools | listSessions, serverStatus, top, dbStats, …
- 54. RBAC – User Defined Roles
• Roles determine the actions a user can perform on DB resources
• User-defined roles give you the flexibility to customize RBAC roles to your organization's requirements
• You can create users with fine-grained access control (also known as least-privilege access)
• Create roles that restrict access to specific actions/APIs
• Create roles that restrict access to specific collections
• You can add built-in roles, or access to actions, to existing user-defined roles
- 55. Encryption
• Use TLS in-transit
• KMS-backed at-rest encryption
- 56. Integration with AWS Secrets Manager
[Diagram] The application retrieves credentials from AWS Secrets Manager and logs in to Amazon DocumentDB with them. AWS Secrets Manager triggers a Lambda rotation function, which updates the credentials.
- 57. Data Modeling
- 58. Data modeling concepts - Entities
What is it?
An abstraction of an object that can be uniquely identified, physically or logically
[Figure] Entities for an e-commerce application
- 59. Data modeling concepts - Relationships
➢ One to one
➢ One to many
➢ Many to many
- 60. Data modeling concepts – Normalized or Denormalized
Embed model ("access" and "contact" are embedded sub-docs)
{
"username": "john",
"userId": 1234,
"access": {
"level": 2,
"group": "dba"
},
"contact":{
"phone": "123-24212",
"email": "john@domain.tld"
}
}
Reference model
User document
{
"username": "john",
"userId": 1234
}
Contact document
{
"userId": 1234,
"phone": "123-24212",
"email": "john@domain.tld"
}
Access document
{
"userId": 1234,
"group": "dba",
"level": 2
}
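The two models hold the same information, which a short sketch can make explicit by folding the three referenced documents into one embedded document. The function name is an assumption; the sample values are the ones from the slide.

```python
def embed_user(user, contact, access):
    """Merge user, contact, and access documents (joined on userId) into one embedded document."""
    doc = dict(user)
    # The shared key is only needed for the join, so it is dropped from the sub-documents.
    doc["access"] = {k: v for k, v in access.items() if k != "userId"}
    doc["contact"] = {k: v for k, v in contact.items() if k != "userId"}
    return doc


user = {"username": "john", "userId": 1234}
contact = {"userId": 1234, "phone": "123-24212", "email": "john@domain.tld"}
access = {"userId": 1234, "group": "dba", "level": 2}

print(embed_user(user, contact, access))
```

With the embedded form a single read returns everything about the user; with the reference form each part can be read and updated on its own, at the cost of extra round trips.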
- 61. Data modeling concepts – Design pattern
What is it?
Data modeling techniques that can be applied and reused across different use cases
• Optimize the access patterns
• Better document structure
• Simplify queries
• Fewer indexes
- 62. Data modeling methodology
- 63. Methodology - Phases
1. Describe application requirements
2. Identify entities and relationships
3. Apply design patterns
Result: Schema model
- 64. PHASE 1: Identify the workload
- 65. Methodology – Identify workload
INPUT
• Requirements documents
• Scenarios
• Metrics and logs
• Migration or refactoring of an existing NoSQL model
• Assumptions
Describe application requirements
• Define the application requirements and how the data is used
• Identify access patterns and reads vs. writes
• Identify the most important queries
• Estimate data size
• Identify consistency requirements and the tolerance for stale data
- 66. Methodology – Identify operations
Describe application requirements
Scenario example:
On an online blog website, authors can post or read articles, and comments on any article. Each article can have tags and can belong to one or more categories.
CRUD Operation | Type | Frequency (peak) | Avg doc size | Max Latency
New blog added/updated | Write | 100/month | 500 KB | < 500 ms
New comment added | Write | 5000/month | 32 KB | < 150 ms
Author added | Write | 30/month | 2 KB | < 100 ms
Blog views | Read | 10000/day/blog | - | 5 ms
Author logs in | Read | 200/day | - | 10 ms
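The operations table feeds the "estimate data size" step. As a back-of-the-envelope sketch, the write rows can be multiplied out to get the raw volume written per month. The calculation treats every write as a new document of average size, which is an assumption (the first row also counts updates), and it ignores index and storage overhead.

```python
# (operation, writes per month at peak, average document size in KB), from the table.
WRITE_OPERATIONS = [
    ("New blog added/updated", 100, 500),
    ("New comment added", 5000, 32),
    ("Author added", 30, 2),
]


def monthly_write_kb(operations):
    """Total KB written per month if every write stores one average-sized document."""
    return sum(count * size_kb for _, count, size_kb in operations)


total_kb = monthly_write_kb(WRITE_OPERATIONS)
print(total_kb, "KB per month, about", round(total_kb / 1024, 1), "MiB")
```

Comments account for most of the volume even though each one is small, which is one reason the later slides move them into their own collection.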
- 67. PHASE 2: Identify entities and relationships
- 68. Methodology – Identify relationships
Identify entities and relationships
Embedding or Referencing?
- 69. Methodology – Model by access pattern
Two models are shown, one for "Query by blogs" and one for "Query by authors", built from the same entities:
Authors: id, name, email
Blogs: id, title, date, text
Categories: id, name
Tags: id, name
Comments: id, date, text
Drawback: duplication of authors data
- 70. Methodology – Model by access pattern
Query by blogs and authors
Avoids duplicating authors data
Authors: id, name, email
Blogs: id, title, date, text, author_id
Categories: id, name
Tags: id, name
Comments: id, date, text
- 71. Methodology – Model by relationship type 1:M
Blogs -> Comments – one to many unbounded
Split comments into a separate collection and keep the reference on the "many" side.
Before:
Blogs: id, title, date, text, author_id
Categories: id, name
Tags: id, name
Comments: id, date, text
After:
Blogs: id, title, date, text, author_id
Categories: id, name
Tags: id, name
Comments: id, date, text, blog_id
- 72. Methodology – Model by relationship type N:M
Many to many bounded: Courses -> Instructors (two-way embedding)
Courses
{
"course_id": <objectId>,
"name": <string>,
"instructors": [
List[inst_id]
]
}
Instructors
{
"inst_id": <objectId>,
"name": <string>,
"courses": [
List[course_id]
]
}
Many to many unbounded: Courses -> Students
Courses
{
"course_id": <objectId>,
"name": <string>
}
Students
{
"student_id": <objectId>,
"name": <string>,
"courses": [
List[course_id]
]
}
- 73. Embedding vs. Referencing - How to choose?
Access pattern or relationship type | Recommended model
Need all related data in one query | Embedded
One to one relationship | Embedded
One to many, where many is bounded | Embedded
A portion of data is rarely accessed | Reference
Data that is frequently updated and growing | Reference
One to many potentially unbounded | Reference
Many to many | Combination of both
- 74. PHASE 3: Identify and apply design patterns
- 75. Methodology – Apply design pattern
Attribute pattern
Challenge:
• Many similar fields
• Fields present only in a subset of documents
Use cases:
• Catalogs, Inventory
// Item document, one field per attribute
{
"name": <string>,
"price": <int>,
"size": <string>,
"weight": <string>,
"colour": <string>,
"material": <string>
}
4 indexes needed:
➢ {"size": 1}
➢ {"weight": 1}
➢ {"colour": 1}
➢ {"material": 1}
// Item document, attribute pattern
{
"name": <string>,
"price": <int>,
"specs": [
{"k": "size", "v": <string>},
{"k": "weight", "v": <string>},
{"k": "colour", "v": <string>},
{"k": "material", "v": <string>}
]
}
Only one index needed:
➢ { "specs.k": 1, "specs.v": 1 }
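The attribute pattern is a mechanical reshaping of the document, which the sketch below illustrates: the named attribute fields move into a "specs" array of key/value pairs, and attributes a document lacks are simply omitted. The function name and the sample item are assumptions for illustration.

```python
def to_attribute_pattern(item, attribute_fields):
    """Move the given top-level fields into a 'specs' list of {"k": ..., "v": ...} pairs."""
    doc = {key: value for key, value in item.items() if key not in attribute_fields}
    doc["specs"] = [
        {"k": field, "v": item[field]}
        for field in attribute_fields
        if field in item  # fields present only in some documents are skipped
    ]
    return doc


# Hypothetical catalog item that has no "material" attribute.
item = {"name": "mug", "price": 7, "size": "M", "weight": "300g", "colour": "blue"}
print(to_attribute_pattern(item, ["size", "weight", "colour", "material"]))
```

Because every attribute now lives under the same two paths, one compound index on specs.k and specs.v serves lookups on any attribute, including ones added later.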
- 76. Methodology – Apply design pattern
Subset pattern
Challenge:
• Documents too large
• Working set doesn’t fit in RAM
Use cases:
• Whenever a significant part of a document's data is rarely needed
// Blogs collection
{
"id": <objectId>,
"title": <string>,
"date": <date>,
"text": <string>,
"author_id": <objectId>,
"last_comments": [
List[Last 20 comments]
]
}
// Comments collection
{
"id": <objectId>,
"date": <date>,
"text": <string>,
"blog_id": <objectId>
}
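A sketch of the write path for the subset pattern: every comment goes to the full comments collection, while the blog document keeps only the most recent 20. Plain lists and dictionaries stand in for the two collections here; the function name and the in-memory representation are assumptions. Against a real database the same effect is usually achieved with an array update that trims the list, so check which update modifiers your engine version supports.

```python
def add_comment(blog, comments, comment, keep=20):
    """Store the comment in the full history and in the blog's bounded 'last_comments' list."""
    comments.append(comment)  # full history lives in the separate comments collection
    recent = blog.get("last_comments", []) + [comment]
    blog["last_comments"] = recent[-keep:]  # keep only the newest `keep` entries


blog = {"id": 1, "title": "Hello", "last_comments": []}
comments = []
for n in range(25):
    add_comment(blog, comments, {"id": n, "blog_id": 1, "text": f"comment {n}"})

print(len(comments), len(blog["last_comments"]), blog["last_comments"][0]["id"])
```

The blog document stays small and bounded no matter how many comments accumulate, so the hot working set fits in cache; older comments are fetched from the comments collection only when a reader asks for them.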
- 77. Methodology – Apply design pattern
Computed pattern
Challenge:
• Repeated calculations
• Read intensive workload
Use cases:
• Catalogs, IoT, Mobile, Real time analysis
// Items
{
"id": <objectId>,
"name": <string>,
"description": <string>,
"specs": [
{"k": <string>,
"v": <string>
}
],
"category": [
List[categories]
]
}
Counting the items in each category requires reading all items and grouping them by category.
Instead, "cache" the count when a new item is inserted:
// items_category_count
db.items_category_count.update(
{_id: "books"},
{$inc: {count: 1}}
)
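The computed pattern trades a little extra work on each write for cheap reads. The sketch below maintains the per-category counter at insert time and compares it with the expensive recount; lists and a Counter stand in for the items and items_category_count collections, and all names are assumptions for illustration.

```python
from collections import Counter


def insert_item(items, category_count, item):
    """Insert the item and bump the cached count of every category it belongs to."""
    items.append(item)
    for category in item["category"]:
        category_count[category] += 1  # same idea as {$inc: {count: 1}}


def recount(items):
    """The expensive alternative: scan every item and group by category."""
    return Counter(category for item in items for category in item["category"])


items, category_count = [], Counter()
insert_item(items, category_count, {"id": 1, "name": "novel", "category": ["books"]})
insert_item(items, category_count, {"id": 2, "name": "atlas", "category": ["books", "maps"]})
insert_item(items, category_count, {"id": 3, "name": "globe", "category": ["maps"]})

print(dict(category_count))
```

The cached counter has to be adjusted on deletes and category changes as well, otherwise it drifts from the recount.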
- 78. Methodology – Apply design pattern
Extended reference
Challenge:
• Too many roundtrips to database or joins
Use cases:
• Catalog, Mobile apps, E-commerce
// Invoices
{
"invoice_id": <objectId>,
"customer_id": <objectId>,
"customer_info": {
"fullName": <string>,
"street": <string>,
"city": <string>,
"zipcode": <string>
}
}
// Customers
{
"customer_id": <objectId>,
"fullName": <string>,
"street": <string>,
"city": <string>,
"zipcode": <string>,
"email": <string>,
"phone": <string>
}
You need to manage duplication:
• Duplicate only the fields you need, and only those that do not change often
- 79. Methodology – Apply design pattern
Bucket pattern
Challenge:
• Large documents or too many documents
Use cases:
• IoT
• Historical data
• Lots of data associated with one entity
// SensorData, one document per reading
{
"device_id": 4523,
"ts": ISODate("2023-03-10T10:00"),
"temp": 20
},
{
"device_id": 4523,
"ts": ISODate("2023-03-10T10:01"),
"temp": 20
},
{
"device_id": 4523,
"ts": ISODate("2023-03-10T10:02"),
"temp": 21
}
// SensorData, one bucket per hour, combined with the computed pattern
{
"device_id": 4523,
"date": ISODate("2023-03-10T10"),
"temp": [
{"ts": ISODate("2023-03-10T10:00"), "temp": 20},
{"ts": ISODate("2023-03-10T10:01"), "temp": 20},
{"ts": ISODate("2023-03-10T10:02"), "temp": 21}
],
"temp_count": 3,
"temp_sum": 61
}
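A sketch of how individual readings could be grouped into hourly buckets with a precomputed count and sum, mirroring the bucket document on the slide. Python datetime values stand in for ISODate, and the function name is an assumption for illustration.

```python
from datetime import datetime


def bucket_by_hour(readings):
    """Group readings per device and hour, keeping a running count and sum per bucket."""
    buckets = {}
    for reading in readings:
        hour = reading["ts"].replace(minute=0, second=0, microsecond=0)
        key = (reading["device_id"], hour)
        bucket = buckets.setdefault(key, {
            "device_id": reading["device_id"],
            "date": hour,
            "temp": [],
            "temp_count": 0,
            "temp_sum": 0,
        })
        bucket["temp"].append({"ts": reading["ts"], "temp": reading["temp"]})
        bucket["temp_count"] += 1             # computed pattern: maintained on write
        bucket["temp_sum"] += reading["temp"]
    return list(buckets.values())


readings = [
    {"device_id": 4523, "ts": datetime(2023, 3, 10, 10, 0), "temp": 20},
    {"device_id": 4523, "ts": datetime(2023, 3, 10, 10, 1), "temp": 20},
    {"device_id": 4523, "ts": datetime(2023, 3, 10, 10, 2), "temp": 21},
]
for bucket in bucket_by_hour(readings):
    print(bucket["date"], bucket["temp_count"], bucket["temp_sum"])
```

With the count and sum stored in the bucket, the hourly average is a division on one document rather than an aggregation over many.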
- 80. Current & Next
- 81. A few highlights from recent releases
[Timeline] 2019 to 2023, items listed in extraction order:
• Regions: Frankfurt, Sydney, London, Canada, Tokyo, Seoul, Mumbai, Paris, Singapore, Milan
• Launch
• Secrets Manager
• DDL auditing
• Aggregation operators (repeated across several releases)
• t3 instances
• Cross-region snapshot copy
• RBAC user-defined roles
• JDBC driver
• Geospatial
• Performance Insights (preview)
• Elastic Clusters
• Slow query logger
• RBAC
• Free trial
• MongoDB 4.0
• ACID transactions
• Fast database cloning
• Per-second billing
• Change streams
• Glue ETL
• Global clusters
• DML auditing
• Decimal128 support
• Start/stop cluster
• Deletion protection
• Increase cursor & connection limits
• Graviton2
• AWS Backup
• Dynamic volume resizing
• MongoDB 5.0
• Lambda ESM
• Client-side field level encryption
- 82. Global Clusters
• Global replication: Up to 5 secondary regions
• Low replica lag: Typically < 1 sec
• Fast recovery: Typically < 1 min downtime
• Compatibility: Version 4.0 and later
• Global reader instances: Up to 90
[Diagram] db.foo.insertOne({"x":1}) is written in the primary region (Ohio; reads, writes). A replication service copies it to the secondary regions (Oregon and Tokyo; reads), where db.foo.findOne() returns {"x":1}.
- 83. Elastic Cluster architecture
[Diagram] Requests pass through the Elastic Cluster request router and service to the shards (Shard-1, Shard-2, ..., Shard-n). Each shard has its own compute capacity (writes, reads) and its own distributed storage volume.
db.foo.find({ order_id: 1 }) returns { "order_id": 1, "name": "Amazon" }
db.foo.insert({ order_id: 2 }) returns {"inserted_id":2}
- 84. What’s Next?
“Amazon DocumentDB resources”
https://aws.amazon.com/documentdb/resources/
“Amazon DocumentDB immersion day workshop”
https://documentdb-immersionday.workshop.aws/
- 85. Q&A
- 86. Thank you!