SlideShare a Scribd company logo
1 of 15
Download to read offline
NoSQL at Scale:
Architecture and Common Pitfalls
Arthur Gimpel
DataZone
About Me
• Working with databases for the past 8 years
• 5 years as SQL Server DBA & .NET
• 3 years with NoSQL, Python, Node.js
• 2015 - DataZone
DataZone
Data is our business
• High end consultancy & projects
• Private and public training
• Multi vendor MSP (Managed Service Provider)
• Part of Matrix, CloudZone Family
• Proud partners of:
NoSQL
Why not?
All the wrong reasons..
• Web scale! web scale!
• Cost reduction
• Free school for developers
Common Pitfalls & Best
Practices
Normalizing in the low cost storage era
Expectations gone wild
• Stored procedures
• Jobs
• Triggers
• Mailing Servers
• Shell Commands
ODM != ORM
ORM
• Close the modelling
gaps
• Simple Standard
CRUD
ODM
• Typing the untyped
• Retries
• Transactions
• Events
Best Practices
• DynamoDB
• Streams: Change Data Capture
• Cross region replication & synchronisation
• Event driven development: Send your new user a
welcome email
• Polyglot Persistence: Initiate an ETL, update your
Elasticsearch index
Best Practices
• Elasticsearch
• AWS Plugin:
• Allow service discovery based on security groups
• Run snapshots & restore directly to/from s3
• Storage:
• In small clusters, use EBS.
• In large clusters, use instance store, if you have replication enabled.
• Leverage Watcher & AWS Lambda for anything!
Best Practices
• Couchbase
• XDCR:
• Make sure you use cross datacenter replication for multi
region deployments
• Use encrypted communication, especially on XDCR
• Sizing:
• All members must be identical
• Prefer Memory & IO Optimized for most deplyments
When Family Can Be Chosen
• Every engine has its flavours - know your tradeoffs
• Splitting drives is good for on prem. EBS is different
• Scaling up & down is always X2
• Prefer linux AMI by AWS- always check for official
support
Eventually…
• Don’t work for technology. Make technology for
you. Chose your engines wisely
• Beware of your cluster’s distribution model - don’t
over-estimate elasticity
• Experimenting is easy: Change, Benchmark,
Analyse. Repeat.
• Run engine backups - not EBS snapshots
Thank You!
Arthur Gimpel
arthurgi@datazone.io

More Related Content

Viewers also liked

Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
Alaa Elhadba
 

Viewers also liked (17)

Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
 
Amazon QuickSight
Amazon QuickSightAmazon QuickSight
Amazon QuickSight
 
You know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900msYou know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900ms
 
AWS Customer Presentation: Coca Cola Turkey migrates SAP ERP to AWS-SAPPHIRE ...
AWS Customer Presentation: Coca Cola Turkey migrates SAP ERP to AWS-SAPPHIRE ...AWS Customer Presentation: Coca Cola Turkey migrates SAP ERP to AWS-SAPPHIRE ...
AWS Customer Presentation: Coca Cola Turkey migrates SAP ERP to AWS-SAPPHIRE ...
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
 
TCS: Leveraging AWS for SAP on Oracle implementations
TCS: Leveraging AWS for SAP on Oracle implementationsTCS: Leveraging AWS for SAP on Oracle implementations
TCS: Leveraging AWS for SAP on Oracle implementations
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
 
AWS May 2016 Webinar Series - AWS Services Overview
AWS May 2016 Webinar Series - AWS Services OverviewAWS May 2016 Webinar Series - AWS Services Overview
AWS May 2016 Webinar Series - AWS Services Overview
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
1. 利用微服務架構建立雲端影音平台 (Building Media Platform by Microservices Architecture)
1.	利用微服務架構建立雲端影音平台 (Building Media Platform by Microservices Architecture)1.	利用微服務架構建立雲端影音平台 (Building Media Platform by Microservices Architecture)
1. 利用微服務架構建立雲端影音平台 (Building Media Platform by Microservices Architecture)
 
AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...
AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...
AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...
 
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 

More from Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 

Recently uploaded (20)

Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 

NoSQL on AWS - Pop-up Loft Tel Aviv

  • 1. NoSQL at Scale: Architecture and Common Pitfalls Arthur Gimpel DataZone
  • 2. About Me • Working with databases for the past 8 years • 5 years as SQL Server DBA & .NET • 3 years with NoSQL, Python, Node.js • 2015 - DataZone
  • 3. DataZone Data is our business • High end consultancy & projects • Private and public training • Multi vendor MSP (Managed Service Provider) • Part of Matrix, CloudZone Family • Proud partners of:
  • 5. All the wrong reasons.. • Web scale! web scale! • Cost reduction • Free school for developers
  • 6. Common Pitfalls & Best Practices
  • 7. Normalizing in the low cost storage era
  • 8. Expectations gone wild • Stored procedures • Jobs • Triggers • Mailing Servers • Shell Commands
  • 9. ODM != ORM ORM • Close the modelling gaps • Simple Standard CRUD ODM • Typing the untyped • Retries • Transactions • Events
  • 10. Best Practices • DynamoDB • Streams: Change Data Capture • Cross region replication & synchronisation • Event driven development: Send your new user a welcome email • Polyglot Persistence: Initiate an ETL, update your Elasticsearch index
  • 11. Best Practices • Elasticsearch • AWS Plugin: • Allow service discovery based on security groups • Run snapshots & restore directly to/from s3 • Storage: • In small clusters, use EBS. • In large clusters, use instance store, if you have replication enabled. • Leverage Watcher & AWS Lambda for anything!
  • 12. Best Practices • Couchbase • XDCR: • Make sure you use cross datacenter replication for multi region deployments • Use encrypted communication, especially on XDCR • Sizing: • All members must be identical • Prefer Memory & IO Optimized for most deplyments
  • 13. When Family Can Be Chosen • Every engine has its flavours - know your tradeoffs • Splitting drives is good for on prem. EBS is different • Scaling up & down is always X2 • Prefer linux AMI by AWS- always check for official support
  • 14. Eventually… • Don’t work for technology. Make technology for you. Chose your engines wisely • Beware of your cluster’s distribution model - don’t over-estimate elasticity • Experimenting is easy: Change, Benchmark, Analyse. Repeat. • Run engine backups - not EBS snapshots