To infinity and beyond

Elastic::Model is a new framework to store your Moose objects, which uses ElasticSearch as a NoSQL document store and flexible search engine.

It is designed to make small beginnings simple, but to scale easily to Big Data requirements without needing to rearchitect your application. No job too big or small!

This talk will introduce Elastic::Model, demonstrate how to develop a simple application, introduce some more advanced techniques, and discuss how it uses ElasticSearch to scale.

https://github.com/clintongormley/Elastic-Model

  1. To infinity and beyond! A practical guide for Mooseherds (and other carers of livestock) @clintongormley #elasticsearch YAPC::EU 2012
  2. I have an idea for a killer app!
  3. Quick! Let's...
  4. Design our objects
  5. Flatten them into tables
  6. Normalize data
  7. Add indexes
  8. Add tables for many-to-one
  9. More indexes
  10. Need full text search?
  11. Copy data to search engine
  12. Keep the two in sync
  13. Get search results, pull objects from DB
  14. Success!
  15. Need to scale
  16. Buy a bigger box
  17. Tune indexes
  18. Add caching
  19. Fix caching bugs
  20. Master - Slave replication
  21. Buy SSDs
  22. Denormalize data
  23. Buy bigger boxes
  24. Shard your data (i.e. rewrite your application)
  25. Do you really need a relational DB?
  26. Do you really need a relational DB? faster horse?
  27. NoSQL advantages
  28. Document oriented
  29. ...just store your object
  30. Fast reads and writes
  31. Scale horizontally
  32. Recover from failure
  33. But...
  34. Different from RDBMS
  35. No transactions
  36. No joins
  37. Denormalized data
  38. Still need to add: indexes
  39. Still need to add: full text search
  40. elasticsearch
  41. Real time document store
  42. Powerful full text search (near real time: < 1 second)
  43. Filters, geolocation...
  44. Distributed by design
  45. Fault tolerant
  46. Easy sharding
  47. Start small, scale massively
  48. Why keep two datastores in sync?
  49. Just use elasticsearch
  50. with Elastic::Model
  51. Store and query Moose objects
  52. Exposes full power of elasticsearch
  53. and takes care of the housekeeping
  54. How?
  55. package MyApp::Post;
      use Moose;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
  56. package MyApp::Post;
      use Moose;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
      package MyApp::User;
      use Moose;
      has name  => ( is => 'rw', isa => 'Str' );
      has email => ( is => 'rw', isa => 'Str', required => 1 );
  57. package MyApp::Post;
      use Moose;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
      has user    => ( is => 'ro', isa => 'MyApp::User' );
      package MyApp::User;
      use Moose;
      has name  => ( is => 'rw', isa => 'Str' );
      has email => ( is => 'rw', isa => 'Str', required => 1 );
  58. package MyApp::Post;
      use Moose;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
      has user    => ( is => 'ro', isa => 'MyApp::User' );
      package MyApp::User;
      use Moose;
      has name  => ( is => 'rw', isa => 'Str' );
      has email => ( is => 'rw', isa => 'Str', required => 1 );
  59. package MyApp::Post;
      use Moose;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
      has user    => ( is => 'ro', isa => 'MyApp::User' );
      package MyApp::User;
      use Moose;
      has name  => ( is => 'rw', isa => 'Str' );
      has email => ( is => 'rw', isa => 'Str', required => 1 );
  60. package MyApp::Post;
      use Elastic::Doc;
      has title   => ( is => 'rw', isa => 'Str' );
      has content => ( is => 'rw', isa => 'Str' );
      has created => ( is => 'rw', isa => 'DateTime', default => sub { DateTime->now } );
      has user    => ( is => 'ro', isa => 'MyApp::User' );
      package MyApp::User;
      use Elastic::Doc;
      has name  => ( is => 'rw', isa => 'Str' );
      has email => ( is => 'rw', isa => 'Str', required => 1 );
  61. Some definitions...
      elasticsearch:
      * index: like a database
      * type: like a table
      * doc: like a row in a table
      * alias: like a symbolic link, points to one or more indices
      Elastic::Model:
      * domain: an index or an alias, used for CRUD
      * namespace: maps type <=> class for all associated domains
      * model: connects your app to elasticsearch
  62. We need a Model
  63. package MyApp;
      use Elastic::Model;
  64. package MyApp;
      use Elastic::Model;
      has_namespace 'myapp' => {};
  65. package MyApp;
      use Elastic::Model;
      has_namespace 'myapp' => {
          user => 'MyApp::User',
          post => 'MyApp::Post',
      };
  66. package MyApp;
      use Elastic::Model;
      has_namespace 'myapp' => {
          user => 'MyApp::User',
          post => 'MyApp::Post',
      };
      # like table <=> class
  67. Using our Model
  68. use MyApp;
  69. use MyApp;
      my $model = MyApp->new;
  70. use MyApp;
      my $model = MyApp->new;
      # To do anything useful, we need:
      my $namespace = $model->namespace('myapp');    # For index and alias management
      my $domain    = $model->domain('myapp');       # For document CRUD
      my $view      = $model->view;                  # For searching
  71. Namespace: Create an index
      my $namespace = $model->namespace('myapp');
      $namespace->index->create;
      * create index myapp
      * namespace:myapp => index:myapp
  72. Namespace: Delete an index
      my $namespace = $model->namespace('myapp');
      $namespace->index->delete;
  73. Namespace: Create an alias
      my $namespace = $model->namespace('myapp');
      $namespace->index('myapp_v1')->create;
      $namespace->alias->to('myapp_v1');
      * alias:myapp => index:myapp_v1
      * namespace:myapp => alias:myapp => index:myapp_v1
  74. Domain: Create a user
      my $domain = $model->domain('myapp');
      my $user = $domain->new_doc(
          user => {
              name  => 'Clinton',
              email => 'clint@foo.com',
          }
      );
      $user->save;
  75. Domain: Create a user
      my $domain = $model->domain('myapp');
      my $user = $domain->create(
          user => {
              name  => 'Clinton',
              email => 'clint@foo.com',
          }
      );
      $user->save;
  76. Domain: Create a user
      my $domain = $model->domain('myapp');
      my $user = $domain->create(
          user => {
              name  => 'Clinton',
              email => 'clint@foo.com',
              id    => 1,
          }
      );
      say $user->id;      # 1
      say $user->type;    # user
  77. Domain: Create a post
      my $domain = $model->domain('myapp');
      my $post = $domain->create(
          post => {
              id      => 2,
              title   => 'To infinity and beyond',
              content => 'Elastic::Model persists Moose ' . 'objects in elasticsearch',
              user    => $user,
          }
      );
  78. Domain: Retrieve a doc
      my $domain = $model->domain('myapp');
      my $post = $domain->get( post => 2 );
      my $user = $post->user;    # stub object
      say $user->id;             # still stub
      # 1
      say $user->name;           # full object
      # Clinton
  79. Domain: Update a doc
      my $domain = $model->domain('myapp');
      $post->title('Awesome blog post');
      say $post->has_changed;             # 1
      say $post->has_changed('title');    # 1
      say $post->old_value('title');      # To infinity and beyond
      $post->save;
  80. optimistic version control
  81. $version++ on every change
  82. 1: $post = $domain->get( post => 2 );
      2: $post = $domain->get( post => 2 );
      1: $post->title('Awesome blog post');
      2: $post->title('Brilliant blog post');
      1: $post->save;
      2: $post->save;    *** CONFLICT ERROR ***
  83. Dealing with conflicts
  84. Ignore them
      $post->overwrite;
  85. on_conflict handler
  86. $post->save(
          on_conflict => sub {
              my ( $old, $new ) = @_;
              # do something
              # to resolve conflict
          }
      );
  87. $post->save(
          on_conflict => sub {
              my ( $old, $new ) = @_;
              my %changed = $old->old_values;
              $new->$_( $changed{$_} ) for keys %changed;
              $new->save;
              $post = $new;
          }
      );
  88. Query docs: View
      $results = $model->view->search;
  89. Views are reusable
      $posts    = $model->view( type => 'post' );
      $featured = $posts->filterb( featured => 1 );
  90. Single domain
      $view = $domain->view;
  91. Multi domain
      $view = $model->view;
  92. Multi domain
      $view = $model->view;
      $view = $model->view->domain( 'foo', 'bar' );
  93. Multi type
      $view = $model->view;
      $view = $model->view->type( 'user', 'post' );
  94. my $view = $domain
          ->view
          ->type('post')
          ->filterb(
              created => { gte => '2012-08-01' },
              user    => $user,
          )
          ->queryb( title => 'awesome' )
          ->sort('timestamp')
          ->size(20)
          ->highlight('content')
          ->explain(1);
      See "Terms of Endearment" on speakerdeck.com
  95. First result
      $results = $view->first
  96. $size results
      $results = $view->search;
  97. Unbounded results
      $results = $view->scroll
      $results = $view->scan
  98. Results are iterators
      $result = $results->next
      $result = $results->prev
      $result = $results->first
      $result = $results->last
      $result = $results->shift
  99. Result is: metadata + object
      say $result->object->title
  100. my $results = $view->search;
       say "Total hits: " . $results->total;
       say "Took: " . $results->took . "ms";
       while ( my $result = $results->next ) {
           say "Title:" . $result->object->title;
           say "Snippets:" . join "\n", $result->highlight('content');
           say "Score:" . $result->score;
           say "Debug:" . $result->explain;
       }
  101. Just the object
       $object = $results->next_object
  102. Just objects
       $results->as_objects;
       $object = $results->next;
  103. Enough dull API!
  104. Not just a doc store
  105. *** POWERFUL *** search engine
  106. BUT...
  107. You can only get out what you put in
  108. Prepare your data
  109. Tell elasticsearch:
       * what fields you have
       * what data they contain
       * how to index them
  110. "Mapping" (like a database schema)
  111. Moose gives us introspection (takes the pain away)
  112. Examples: analyzed full text
       has name => (
           is  => 'rw',
           isa => 'Str',
       );
       Mapping:
       name: { type: "string" }
  113. Examples: analyze and stem text
       has name => (
           is       => 'rw',
           isa      => 'Str',
           analyzer => 'english',
       );
       Mapping:
       name: { type: "string", analyzer: "english" }
  114. Examples: analyze and stem text
       has name => (
           is       => 'rw',
           isa      => 'Str',
           analyzer => 'norwegian',
       );
       Mapping:
       name: { type: "string", analyzer: "norwegian" }
  115. Examples: store the exact value
       has tag => (
           is    => 'rw',
           isa   => 'Str',
           index => 'not_analyzed',
       );
       Mapping:
       tag: { type: "string", index: "not_analyzed" }
  116. Examples: complex data
       use MooseX::Types::Moose qw(Str);
       use MooseX::Types::Structured qw(Dict Optional);
       has name => (
           is  => 'rw',
           isa => Dict[
               first  => Str,
               last   => Str,
               middle => Optional[Str],
           ],
       );
       Mapping:
       name: {
           type: "object",
           properties: {
               first:  { type: "string" },
               last:   { type: "string" },
               middle: { type: "string" }
           }
       }
  117. Examples: Elastic::Doc classes
       has user => (
           is  => 'rw',
           isa => 'MyApp::User',
       );
       Mapping:
       user: {
           type: "object",
           properties: {
               name:  { type: "string" },
               email: { type: "string" },
               uid:   { type: "object",
                        properties: { index: {...}, type: {...}, id: {...}, routing: {...} } }
           }
       }
  118. Examples: Elastic::Doc classes (Denormalised data!)
       has user => (
           is  => 'rw',
           isa => 'MyApp::User',
       );
       Mapping:
       user: {
           type: "object",
           properties: {
               name:  { type: "string" },
               email: { type: "string" },
               uid:   { type: "object",
                        properties: { index: {...}, type: {...}, id: {...}, routing: {...} } }
           }
       }
  119. Examples: Elastic::Doc classes
       has user => (
           is            => 'rw',
           isa           => 'MyApp::User',
           exclude_attrs => ['email'],
       );
       Mapping:
       user: {
           type: "object",
           properties: {
               name:  { type: "string" },
               email: { type: "string" },
               uid:   { type: "object",
                        properties: { index: {...}, type: {...}, id: {...}, routing: {...} } }
           }
       }
  120. Examples: Elastic::Doc classes
       has user => (
           is            => 'rw',
           isa           => 'MyApp::User',
           include_attrs => ['email'],
       );
       Mapping:
       user: {
           type: "object",
           properties: {
               name:  { type: "string" },
               email: { type: "string" },
               uid:   { type: "object",
                        properties: { index: {...}, type: {...}, id: {...}, routing: {...} } }
           }
       }
  121. Examples: Elastic::Doc classes
       has user => (
           is            => 'rw',
           isa           => 'MyApp::User',
           include_attrs => [],
       );
       Mapping:
       user: {
           type: "object",
           properties: {
               name:  { type: "string" },
               email: { type: "string" },
               uid:   { type: "object",
                        properties: { index: {...}, type: {...}, id: {...}, routing: {...} } }
           }
       }
  122. Same data. Different purpose
       has title => (
           is  => 'rw',
           isa => 'Str',
       );
       Mapping:
       title: { type: "string" }
       Indexed:
       title => 'An AMAZING talk!'
       title: ['amazing', 'talk']
       What do you sort on? 'amazing' or 'talk'?
  123. Multi-fields: index the same data in different ways
  124. Same data. Different purpose
       has title => (
           is  => 'rw',
           isa => 'Str',
       );
  125. Same data. Different purpose
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched => { index => 'not_analyzed' },
           },
       );
  126. Same data. Different purpose
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched => { index => 'not_analyzed' },
           },
       );
       Indexed:
       title => 'An AMAZING talk!'
       title: {
           title:     ['amazing', 'talk'],
           untouched: "An AMAZING talk!"
       }
  127. Let's TWEAK stuff!
  128. How about AUTO-COMPLETE?
  129. Don't use wildcards: slow & inefficient
  130. Prepare your data: "Analysis"
  131. With edge-ngrams
  132. Analysis process
       "Édith Piaf"
       -> standard tokenizer ->
       ["Édith", "Piaf"]
       -> lowercase token filter ->
       ["édith", "piaf"]
       -> ascii-folding token filter ->
       ["edith", "piaf"]
       -> edge-ngrams token filter ->
       ["e", "ed", "edi", "edit", "edith", "p", "pi", "pia", "piaf"]
       Perfect for partial matching!
  133. Add a custom analyzer to our Model
       package MyApp;
       use Elastic::Model;
       has_namespace 'myapp' => {
           user => 'MyApp::User',
           post => 'MyApp::Post',
       };
  134. Add a custom analyzer to our Model
       package MyApp;
       use Elastic::Model;
       has_namespace 'myapp' => {
           user => 'MyApp::User',
           post => 'MyApp::Post',
       };
  135. Add a custom analyzer to our Model
       package MyApp;
       use Elastic::Model;
       has_namespace 'myapp' => {
           user => 'MyApp::User',
           post => 'MyApp::Post',
       };
       has_filter 'my_edge_ngrams' => {
           type     => 'edge_ngram',
           min_gram => 1,
           max_gram => 15,
       };
  136. Add a custom analyzer to our Model
       package MyApp;
       use Elastic::Model;
       has_namespace 'myapp' => {
           user => 'MyApp::User',
           post => 'MyApp::Post',
       };
       has_filter 'my_edge_ngrams' => {
           type     => 'edge_ngram',
           min_gram => 1,
           max_gram => 15,
       };
       has_analyzer 'autocomplete' => {
           tokenizer => 'standard',
           filter    => [ 'lowercase', 'asciifolding', 'my_edge_ngrams' ],
       };
  137. Add analyzer to our Doc class
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched => { index => 'not_analyzed' },
           },
       );
  138. Add analyzer to our Doc class
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched    => { index    => 'not_analyzed' },
               autocomplete => { analyzer => 'autocomplete' },
           },
       );
  139. Add analyzer to our Doc class
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched    => { index    => 'not_analyzed' },
               autocomplete => { analyzer => 'autocomplete' },
           },
       );
       Indexed:
       title => 'An AMAZING talk!'
       title: {
           title:     ['amazing', 'talk'],
           untouched: "An AMAZING talk!"
       }
  140. Add analyzer to our Doc class
       has title => (
           is    => 'rw',
           isa   => 'Str',
           multi => {
               untouched    => { index    => 'not_analyzed' },
               autocomplete => { analyzer => 'autocomplete' },
           },
       );
       Indexed:
       title => 'An AMAZING talk!'
       title: {
           title:        ['amazing', 'talk'],
           untouched:    "An AMAZING talk!",
           autocomplete: ['a', 'am', 'ama', 'amaz', 'amazi', 'amazin', 'amazing',
                          't', 'ta', 'tal', 'talk']
       }
  141. Apply your changes
  142. Update the mapping AND the data
  143. Reindex
  144. $new = $namespace->index('myapp_v2');
       $new->reindex('myapp');
       $namespace->alias->to('myapp_v2');
       $namespace->index('myapp_v1')->delete;
  145. Autocomplete query
  146. $view = $domain->view->queryb();
  147. $view = $domain->view->queryb(
           "title.autocomplete" => "amazing ta",
       );
  148. $view = $domain->view->queryb(
           "title.autocomplete" => "amazing ta",
       );
       Matches anything starting with 'a' or 't'. BOOH!
  149. $view = $domain->view->queryb(
           "title.autocomplete" => {
               -text => { query => "amazing ta" }
           }
       );
  150. $view = $domain->view->queryb(
           "title.autocomplete" => {
               -text => { query => "amazing ta", operator => "or" }
           }
       );
       "a OR am OR ama OR amaz OR ... OR t OR ta"
  151. $view = $domain->view->queryb(
           "title.autocomplete" => {
               -text => { query => "amazing ta", operator => "and" }
           }
       );
  152. $view = $domain->view->queryb(
           "title.autocomplete" => {
               -text => { query => "amazing ta", operator => "and" }
           }
       );
       Complete words should be more relevant
  153. $view = $domain->view->queryb(
           "title.autocomplete" => {
               -text => { query => "amazing ta", operator => "and" }
           },
           "title" => "amazing ta",
       );
  154. $view = $domain->view->queryb([
           "title.autocomplete" => {
               -text => { query => "amazing ta", operator => "and" }
           },
           "title" => "amazing ta",
       ]);
  155. Done!
  156. Scaling
  157. To infinity and beyond!
  158. Basic unit of scale: the shard
  159. An index has 1-or-more primary shards
  160. Each primary has 0-or-more replica shards
  161. Primaries scale total data
  162. Replicas are for failover and to scale queries
  163. Default: 5 primary shards with 1 replica each
  164. 5 * (1 + 1) = 10 shards
  165. 10 shards = 1 .. 10 servers
  166. Can change number of replicas
  167. CANNOT change number of primaries
  168. So how do we scale?
  169. Kagillion shards!
  170. Umm, No.
  171. Be a grower not a shower
  172. At query time:
  173. 1 index x 10 shards == 10 indices x 1 shard
  174. Two patterns:
  175. Time based indices / Index-per-user
  176. Time based indices / Index-per-user
  177. * one index per month
       * write to alias: logs_current
       * query alias: logs
  178. $ns = $model->namespace('logs');
       $ns->index('logs_2012_08')->create;
       $ns->alias('logs_current')->to('logs_2012_08');
       $ns->alias->to('logs_2012_08');
       $model->domain('logs_current')->create( log => \%data );
       $model->domain('logs')->view->search;
  179. New month, new index
       $ns->index('logs_2012_09')->create;
       $ns->alias('logs_current')->to('logs_2012_09');
       $ns->alias->add('logs_2012_09');
  180. Add alias for 2012
       $ns->alias('logs_2012')->to(
           'logs_2012_08',
           'logs_2012_09',
           ...
       );
  181. Time based indices / Index-per-user
  182. Users have their own data
  183. Most searches are per-user
  184. Ideal: Index-per-user
  185. Expensive
  186. Most users have little data
  187. Some have LOTS!
  188. Start with one index for all users
  189. Use aliases to pretend
  190. ...aliases with... filters and routing
  191. $ns->alias('bloggs_plumbers')->to(
           myapp_v1 => {
               filterb => { client_id => 'bloggs_plumbers' },
               routing => 'bloggs_plumbers',
           }
       );
  192. Routing determines: which shard stores your data
  193. Routing == 'bloggs_plumbers'
       All user's data on same shard
  194. CRUD -> hit one shard
       Queries -> hit one shard
  195. SUPER efficient!
  196. New client joins...
  197. ...called "Twitter"
  198. 6 months later...
  199. $new = $ns->index('twitter_v1');
       $new->reindex('twitter');
       $ns->alias('twitter')->to('twitter_v1');
       $ns->alias->add('twitter_v1');
  200. What more do you need?
  201. Go forth and HERD!
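
The analysis chain on the "Analysis process" slide and the "or" versus "and" behaviour of the autocomplete query can be tried out without a running cluster. What follows is a rough sketch in plain Perl, not how elasticsearch or Elastic::Model implement it. The helpers `tokens`, `edge_ngrams` and `matches` are hypothetical names invented here, and the tokenizer is a crude approximation of the standard tokenizer followed by lowercasing and ASCII folding.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: approximate tokenizer -> lowercase -> ascii-folding.
sub tokens {
    my ($text) = @_;
    my $folded = lc NFD($text);      # decompose accents, then lowercase
    $folded =~ s/\p{Mn}//g;          # drop combining marks, so an accented e becomes "e"
    return grep { length } split /[^\p{L}\p{N}]+/, $folded;
}

# Hypothetical helper: edge n-grams of every token, from $min to $max characters.
sub edge_ngrams {
    my ( $min, $max, @tokens ) = @_;
    my @grams;
    for my $token (@tokens) {
        my $limit = length($token) < $max ? length($token) : $max;
        push @grams, map { substr $token, 0, $_ } $min .. $limit;
    }
    return @grams;
}

# Hypothetical helper: does an indexed title match the analyzed query?
# "or" needs any query gram to be present, "and" needs all of them.
sub matches {
    my ( $operator, $title, $query ) = @_;
    my %indexed = map { $_ => 1 } edge_ngrams( 1, 15, tokens($title) );
    my @terms   = edge_ngrams( 1, 15, tokens($query) );
    my $hits    = grep { $indexed{$_} } @terms;
    return $operator eq 'and' ? $hits == @terms : $hits > 0;
}

print join( ' ', edge_ngrams( 1, 15, tokens('Édith Piaf') ) ), "\n";

for my $title ( 'An AMAZING talk!', 'A terrible tale' ) {
    for my $operator ( 'or', 'and' ) {
        my $verdict = matches( $operator, $title, 'amazing ta' ) ? 'match' : 'no match';
        print "$operator: $title => $verdict\n";
    }
}
```

With "or", both titles match because each contains a token starting with "a" or "t". With "and", only the title containing every prefix of "amazing" and "ta" survives, which is the behaviour the later query slides rely on.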

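
The optimistic version control slides ("$version++ on every change", the conflict between writers 1 and 2, and the on_conflict handler) describe a check-then-write protocol. Below is a minimal in-memory sketch of that protocol in plain Perl. It is an illustration under stated assumptions, not the Elastic::Model implementation: `%store`, `get_doc` and `save_doc` are hypothetical stand-ins for the document store, `$domain->get` and `$doc->save`.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the document store: id => { version, doc }.
my %store = ( 2 => { version => 1, doc => { title => 'To infinity and beyond' } } );

# Fetch a private copy of the document plus the version it was read at.
sub get_doc {
    my ($id) = @_;
    my $stored = $store{$id};
    return { id => $id, version => $stored->{version}, doc => { %{ $stored->{doc} } } };
}

# Save only if nobody else has saved since we read; otherwise call on_conflict
# with the stale copy and a freshly fetched one, or die if no handler was given.
sub save_doc {
    my ( $copy, %args ) = @_;
    my $stored = $store{ $copy->{id} };
    if ( $stored->{version} != $copy->{version} ) {
        die "Conflict: version $copy->{version} is stale\n" unless $args{on_conflict};
        return $args{on_conflict}->( $copy, get_doc( $copy->{id} ) );
    }
    $stored->{doc}     = { %{ $copy->{doc} } };
    $stored->{version} = ++$copy->{version};    # bump the version on every change
    return $copy;
}

# Two writers read the same version of post 2 and both change the title.
my $first  = get_doc(2);
my $second = get_doc(2);
$first->{doc}{title}  = 'Awesome blog post';
$second->{doc}{title} = 'Brilliant blog post';

save_doc($first);    # succeeds: the stored version moves on

# The second writer is now stale: re-apply its change to the fresh copy and retry.
my $saved = save_doc(
    $second,
    on_conflict => sub {
        my ( $old, $new ) = @_;
        $new->{doc}{title} = $old->{doc}{title};
        return save_doc($new);
    },
);

print "$saved->{doc}{title} (version $saved->{version})\n";
```

Calling `save_doc($second)` without a handler dies with a conflict error instead, which corresponds to the "*** CONFLICT ERROR ***" case in the transcript.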