This document outlines Mitch Pirtle's presentation on setting up Elasticsearch in the cloud. The presentation discusses Mitch's early experiences with Elasticsearch and how he initially wanted a simple web crawler but found Elasticsearch difficult to set up locally. It then covers various approaches to deploying Elasticsearch in the cloud, including single instance deployments, clustered services, and using Amazon Elasticsearch Service. The presentation addresses the pros and cons of each approach and concludes by taking questions from the audience.
Cloudy with a Chance of Scale: Running Elasticsearch in Production
1. CLOUDY WITH A CHANCE OF SCALE
MITCH PIRTLE
ELASTICSEARCH MEETUP
CAPITAL ONE LABS, ARLINGTON VA
DECEMBER 2016
2. ABOUT ME
• FOSS founder/contributor
• Startupper
• Technology Fellow, Capital One
• Skate punk
• Musician
• Football coach
3. ABOUT YOU
• Just a user
• In production by hand
• Fully automated in production
• I should be giving this talk
4. ABOUT THIS SESSION
• ES from an Ops perspective
• The good, bad and truncated
• What to expect
6. – ME, THE EARLY-ON CLUELESS VERSION
“All I want is a web crawler, this can’t be too hard to set up.”
8. FESS UP!
• Fess: http://fess.codelibs.org/
• FOSS
• Simple web UI for setup
• Multiple sources
• Multiple types
• Supports threads, throttling
• ES as persistence store
10. LOCAL SETUP
• Fess comes as a self-contained package
• All you need is a Java runtime and you’re good
11. SCREW LOCALHOST, I’M THINKING BIGGER THAN THAT.
13. WHY PUBLIC CLOUD
• Horizontal scale
• Access to integrated services - storage, load balancing, etc
• Opportunity to automate. All. The. Things.
14. SINGLE INSTANCE.
• Quick to set up, uses embedded Elasticsearch + plugins
• Quick to duplicate
• Easy to maintain
15. SINGLE INSTANCE: ISSUES
• Single point of failure
• Zero scale opportunity
• Fully manual effort
• Good golly filesystem access is SLOW
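The single point of failure can be made concrete with a little shard arithmetic. The sketch below assumes Elasticsearch's rule that a replica is never allocated on the same node as its primary, and ignores every other allocation constraint. The function names and the example numbers (5 primaries, 1 replica) are illustrative assumptions, not figures from the talk.

```python
def unassigned_replicas(nodes: int, primaries: int, replicas: int) -> int:
    """Estimate how many replica shards cannot be allocated.

    Assumption: the only constraint is that copies of the same shard
    must live on different nodes (disk space and allocation filters
    are ignored).
    """
    if nodes < 1 or primaries < 1 or replicas < 0:
        raise ValueError("need nodes >= 1, primaries >= 1, replicas >= 0")
    copies_wanted = 1 + replicas               # the primary plus its replicas
    copies_placed = min(copies_wanted, nodes)  # at most one copy per node
    return primaries * (copies_wanted - copies_placed)


def expected_health(nodes: int, primaries: int, replicas: int) -> str:
    """Return 'green' if every copy fits, else 'yellow' (primaries only)."""
    missing = unassigned_replicas(nodes, primaries, replicas)
    return "green" if missing == 0 else "yellow"
```

Under these assumptions, `expected_health(1, 5, 1)` reports a yellow cluster: one node can hold all five primaries but none of their replicas, so losing that node loses the index.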
16. CLUSTERED SERVICE.
• Horizontal scale
• Easier to expand
17. CLUSTERED SERVICE: ISSUES
• Needs separate instance of Elasticsearch, requires plugin installation (version specific)
• Even harder to set up: Shards or replicas? Master or data or both, or neither?
• Even harder to automate
• Requires additional tooling for operations (logs, events)
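The "master or data or both, or neither?" question maps onto two boolean node settings in Elasticsearch releases of that era, node.master and node.data. The sketch below names the four combinations and computes the usual majority quorum for master-eligible nodes, the value typically given to discovery.zen.minimum_master_nodes in pre-7.x releases. The function names are hypothetical helpers, and the ingest role added in 5.x is ignored for simplicity.

```python
def node_role(master: bool, data: bool) -> str:
    """Name the role implied by the node.master / node.data settings."""
    if master and data:
        return "master-eligible data node (the default)"
    if master:
        return "dedicated master-eligible node"
    if data:
        return "data-only node"
    return "coordinating-only (client) node"


def master_quorum(master_eligible_nodes: int) -> int:
    """Majority of master-eligible nodes, used to avoid split brain."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1
```

For example, a cluster with three dedicated master-eligible nodes gives `master_quorum(3) == 2`, so the cluster can lose one of them and still elect a master.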
18. HOW DO I DO THIS?
• ElasticSearch Cluster: Configuration & Best Practices (http://www.xmsxmx.com/elasticsearch-cluster-configuration-best-practices/)
19. AMAZON ELASTICSEARCH SERVICE
• No setup
• Simple scale
• Fully automated
• Advanced configuration by default
20. AMAZON ELASTICSEARCH SERVICE: ISSUES
• No custom plugins (no Fess!)
• Service limits (number of nodes per cluster, etc.)
• Lack of customization options
21. THE LINKS
• https://aws.amazon.com/elasticsearch-service/
• http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-limits.html
23. STUMP THE SPEAKER
MY FAVORITE GAME
24. THANK YOU VERY
GRAZIE.
• @mitchitized
• mitch.pirtle@capitalone.com
• github.com/spacemonkey
• about.me/mitchitized
• www.slideshare.net/spacemonkeylabs