CLOUDY WITH A CHANCE OF SCALE
MITCH PIRTLE
ELASTICSEARCH MEETUP
CAPITAL ONE LABS, ARLINGTON VA
DECEMBER 2016
ABOUT ME
• FOSS founder/contributor
• Startupper
• Technology Fellow, Capital One
• Skate punk
• Musician
• Football coach
ABOUT YOU
• Just a user
• In production by hand
• Fully automated in production
• I should be giving this talk
ABOUT THIS SESSION
• ES from an Ops perspective
• The good, bad and truncated
• What to expect
HOW IT ALL STARTED
– ME, THE EARLY-ON CLUELESS VERSION
“All I want is a web crawler, this can’t be too hard to set up.”
FESS UP!
• Fess: http://fess.codelibs.org/
• FOSS
• Simple web UI for setup
• Multiple sources
• Multiple types
• Supports threads, throttling
• ES as persistence store
SCENARIOS.
LOCAL SETUP
• Fess comes as a self-contained package
• All you need is a Java runtime and you’re good
I’M THINKING BIGGER THAN THAT.
SCREW LOCALHOST,
WHY PUBLIC CLOUD
• Horizontal scale
• Access to integrated services: storage, load balancing, etc.
• Opportunity to automate. All. The. Things.
SINGLE INSTANCE.
• Quick to set up, uses embedded Elasticsearch + plugins
• Quick to duplicate
• Easy to maintain
SINGLE INSTANCE: ISSUES
• Single point of failure
• Zero scale opportunity
• Fully manual effort
• Good golly filesystem access is SLOW
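
One way to see why a single instance is a single point of failure: Elasticsearch does not place a replica on the same node as its primary, so with one node every replica stays unassigned. The sketch below is an illustration only. The function name `replica_allocation` is hypothetical, and it assumes every node holds data and that nothing else (disk watermarks, allocation filters) blocks allocation.

```python
def replica_allocation(nodes: int, primaries: int, replicas: int) -> dict:
    """Estimate how many replica copies can be placed on a cluster.

    Assumption: copies of the same shard never share a node, so each
    primary can have at most (nodes - 1) replicas assigned.
    """
    if nodes < 1 or primaries < 1 or replicas < 0:
        raise ValueError("need nodes >= 1, primaries >= 1, replicas >= 0")

    placeable_per_primary = min(replicas, nodes - 1)
    assigned = primaries * placeable_per_primary
    unassigned = primaries * (replicas - placeable_per_primary)

    # All primaries assigned but some replicas missing is reported as "yellow".
    health = "green" if unassigned == 0 else "yellow"
    return {"assigned": assigned, "unassigned": unassigned, "health": health}


if __name__ == "__main__":
    # Single instance: no replica can be placed anywhere.
    print(replica_allocation(nodes=1, primaries=5, replicas=1))
    # Three nodes: every replica finds a home on another node.
    print(replica_allocation(nodes=3, primaries=5, replicas=1))
```

On a single instance the replica count buys no redundancy; it only starts to matter once a second node joins.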
CLUSTERED SERVICE.
• Horizontal scale
• Easier to expand
CLUSTERED SERVICE: ISSUES
• Needs separate instance of Elasticsearch, requires plugin installation (version specific)
• Even harder to set up: Shards or replicas? Master or data or both, or neither?
• Even harder to automate
• Requires additional tooling for operations (logs, events)
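
The "shards or replicas, master or data" questions above can be sketched in a few lines. This is a rough illustration, not a recommended configuration: the helper names (`node_role`, `index_settings`, `total_shard_copies`) are hypothetical, and the role labels assume the `node.master` / `node.data` boolean settings used by Elasticsearch releases of this era (later versions configure roles differently). The index settings keys `number_of_shards` and `number_of_replicas` are the standard ones; the primary shard count is fixed when the index is created, while the replica count can be changed afterwards.

```python
import json


def node_role(master: bool, data: bool) -> str:
    """Name the node type implied by the node.master / node.data flags."""
    if master and data:
        return "master-eligible data node (the default)"
    if master:
        return "dedicated master-eligible node"
    if data:
        return "data-only node"
    return "client (coordinating-only) node"


def index_settings(shards: int, replicas: int) -> dict:
    """Build the request body for creating an index with explicit sizing."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,      # fixed at index creation
                "number_of_replicas": replicas,  # adjustable later
            }
        }
    }


def total_shard_copies(shards: int, replicas: int) -> int:
    """Each primary shard plus its replicas must live somewhere in the cluster."""
    return shards * (1 + replicas)


if __name__ == "__main__":
    for master in (True, False):
        for data in (True, False):
            print(f"node.master={master}, node.data={data}: {node_role(master, data)}")
    print(json.dumps(index_settings(5, 1), indent=2))
    print("shard copies to place:", total_shard_copies(5, 1))
```

Generating these values from code rather than editing them by hand is one small step toward the automation the slide calls "even harder".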
HOW DO I DO THIS?
• ElasticSearch Cluster: Configuration & Best Practices (http://www.xmsxmx.com/elasticsearch-cluster-configuration-best-practices/)
AMAZON ELASTICSEARCH SERVICE
• No setup
• Simple scale
• Fully automated
• Advanced configuration by default
AMAZON ELASTICSEARCH SERVICE: ISSUES
• No custom plugins (no Fess!)
• Service limits (number of nodes per cluster, etc.)
• Lack of customization options
THE LINKS
• https://aws.amazon.com/elasticsearch-service/
• http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-limits.html
STUMP THE SPEAKER
MY FAVORITE GAME
THANK YOU VERY GRAZIE.
• @mitchitized
• mitch.pirtle@capitalone.com
• github.com/spacemonkey
• about.me/mitchitized
• www.slideshare.net/spacemonkeylabs