Prototyping applications with Heroku and Elasticsearch

How we successfully create prototypes using Heroku, Terraform and Elasticsearch, with real-life examples.

Published in: Internet

  1. 1. Prototyping Apps with Elasticsearch and Heroku October 29th 2015
  2. 2. Protofy Martin and Mike: Prototyping with Heroku and Elasticsearch
  3. 3. Protofy builds Prototypes. Salad-Delivery-Service: From idea to first shipped salad -> 4 days.
 - Validation of the concept in a prototype 
 - Live-Launch in March 2015
 - Continuous prototyping while quickly growing key KPIs
 - Seed-Financing in October, November
  4. 4. Protofy builds Prototypes. Voicefile transcoding and indexing for callcenters client-a.com:
 - Validation of the concept in a prototype 
 - Extensive use of Elasticsearch as the main database
 
 => THIS is the FIRST project we will look at in more depth.
  5. 5. Protofy builds Prototypes. Automatic news aggregation by given list of keywords and synonyms. client-b.com:
 - Validation of the concept in a prototype 
 - Extensive use of Elasticsearch to filter feed items and merge them
 
 => THIS is the SECOND project we will look at in more depth.
  6. 6. And big infrastructure. Education Community Framework with Log-Everything strategy. PokerStrategy.com:
 - 7 million members (2007-2013)
 - up to 1 Billion pageviews/year
 - sold in mid 2013
 
 After that, two companies were founded: DECK36 and Feelgood. Both merged in early 2015 to form PROTOFY.
  7. 7. Prototyping with Heroku.
 Focus on the app.
  8. 8. Heroku: Platform as a service. Prebuilt VMs for different programming languages
 - deployment via git
 - customizable with build-packs and add-ons
 - easily scalable
 - full logging of each part of the app and process
 - releases: Easy rollback on errors
 - heroku toolbelt to support local execution
  9. 9. Heroku: Prepare your app
 - Apps using other infrastructure services like MongoDB or Redis need to be aware of environment variables.
 For example, the Elasticsearch service BONSAI provides:
 BONSAI_URL=https://user:pw@host.bonsai.io
 - Use environment variables for everything that depends on the environment.
 - In general, rely on the (Attention. Buzzword.) http://12factor.net/ methodology.
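A minimal Python sketch of reading such a variable, assuming the app receives BONSAI_URL in the form shown above. The function name `elasticsearch_settings` and the keys of the returned dictionary are illustrative and not part of Heroku or Bonsai.

```python
import os
from urllib.parse import urlsplit


def elasticsearch_settings(env=os.environ):
    """Read the Elasticsearch connection details from the environment."""
    url = env.get("BONSAI_URL")
    if not url:
        raise RuntimeError("BONSAI_URL is not set")
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Assumption: fall back to the scheme's default port if none is given.
        "port": parts.port or (443 if parts.scheme == "https" else 80),
        "user": parts.username,
        "password": parts.password,
    }


if __name__ == "__main__":
    # Placeholder value from the slide; a real one is set by the add-on.
    demo_env = {"BONSAI_URL": "https://user:pw@host.bonsai.io"}
    print(elasticsearch_settings(demo_env)["host"])
```

Passing the environment in as a parameter keeps the function easy to test locally without touching the real process environment.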
  10. 10. Buzzwording: 12 Factors
 I. Codebase: One codebase tracked in revision control, many deploys
 II. Dependencies: Explicitly declare and isolate dependencies
 III. Config: Store config in the environment
 IV. Backing Services: Treat backing services as attached resources
 V. Build, release, run: Strictly separate build and run stages
 VI. Processes: Execute the app as one or more stateless processes
 VII. Port binding: Export services via port binding
 VIII. Concurrency: Scale out via the process model
 IX. Disposability: Maximize robustness with fast startup and graceful shutdown
 X. Dev/prod parity: Keep development, staging, and production as similar as possible
 XI. Logs: Treat logs as event streams
 XII. Admin processes: Run admin/management tasks as one-off processes
  11. 11. Heroku: add-ons
 - Logentries
 - NewRelic
 - Bonsai-Elasticsearch
 - MongoLabs
 - Scheduler
 - SSL
 TIP: Take care of backups yourself, even if the provider promises to do them.
  12. 12. Heroku: Test before deploy
 CONTINUOUS DEPLOYMENT using codeship.io (or others)
 git -> bitbucket -> codeship (test) -> heroku
  13. 13. Terraforming.
 Infrastructure as code.
  14. 14. Heroku: Infrastructure as a service.
 - deployment via git
 - a lot of add-ons
 - individual scaling of parts of the app
 - process isolation
 - full logging of each part of the app and process
 - easy-to-use command line tools
 - supports several languages (NodeJS, PHP, Rails, etc.)
 - releases: Easy rollback on errors.
 - heroku toolbelt with Foreman to support local execution as it would run on Heroku
 TERRAFORM: Build, Combine, and Launch Infrastructure
  15. 15. from automatic provisioning of servers ... Configuration as Code
  16. 16. Infrastructure as Code … to automatic provisioning of services.
  17. 17. Why do we need that?
 As with Configuration Management:
 - Replace “click-paths” with source code
 - Reproducible Environment
 - Versioning in SCM
 - Specification and Documentation
  18. 18. What does it do?
 Configuration Language for Services
 Actions:
 - Plan
 - Apply
 - Refresh
 - Destroy
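The idea behind Plan and Apply can be illustrated with a toy model in Python: compare the desired resources with the current state and derive the actions needed. This is a deliberately simplified sketch, not Terraform's implementation; the function names and the dictionary-based state are assumptions made for illustration only.

```python
def plan(desired, current):
    """Compare desired and current resources and list the actions needed."""
    actions = []
    for name, attrs in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != attrs:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("destroy", name))
    return actions


def apply(desired, current):
    """Carry out the plan on an in-memory state and return the new state."""
    state = dict(current)
    for action, name in plan(desired, current):
        if action == "destroy":
            del state[name]
        else:
            state[name] = desired[name]
    return state


if __name__ == "__main__":
    desired = {"heroku_app.importer": {"name": "demo-importer"}}
    for step in plan(desired, current={}):
        print(step)
```

Running `plan` again after `apply` yields an empty list, which is the property that makes such descriptions reproducible.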
  19. 19. What does it manage?
 Providers:
 - Google Cloud
 - AWS
 - Azure
 - Heroku
 - DNSMadeEasy
 - …
 Resources:
 - aws_instance
 - aws_vpc
 - azure_instance
 - heroku_app
 - …
 Provisioners:
 - chef
 - file
 - exec
  20. 20. Example (part 1)
 ### AWS Setup
 provider "aws" {
   access_key = "${var.aws_access_key}"
   secret_key = "${var.aws_secret_key}"
   region     = "${var.aws_region}"
 }
 # Queue between importer and analyzer
 resource "aws_sqs_queue" "importqueue" {
   name = "${var.app_name}-${var.app_env}-import-queue"
 }
 resource "aws_s3_bucket" "importdisk" {
   bucket = "${var.app_name}-${var.app_env}-app-importer"
   acl    = "private"
 }
  21. 21. Example (part 2)
 ### Heroku Setup
 provider "heroku" {...}
 # App EntityImporter
 resource "heroku_app" "importer" {
   name = "${var.app_name}-${var.app_env}-importer"
   config_vars {
     SQS_REGION    = "${var.aws_region}"
     SQS_QUEUE_URL = "${aws_sqs_queue.importqueue.id}"
     S3_BUCKET     = "${aws_s3_bucket.importdisk.id}"
     NODE_ENV      = "${var.app_env}"
   }
 }
 resource "heroku_addon" "mongolab" {
   app  = "${heroku_app.importer.name}"
   plan = "mongolab:sandbox"
 }
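On the application side, the config vars set by Terraform arrive as plain environment variables. A small Python sketch of how an app might check them at startup, assuming the four variable names from the example above; `load_config` and `REQUIRED_VARS` are hypothetical names, and the actual importer app is not shown in the slides.

```python
import os

# Variable names taken from the config_vars block in the example.
REQUIRED_VARS = ("SQS_REGION", "SQS_QUEUE_URL", "S3_BUCKET", "NODE_ENV")


def load_config(env=os.environ):
    """Return the required settings or fail fast if any of them is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing config vars: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup makes a broken or incomplete `terraform apply` visible immediately instead of at the first queue or bucket access.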
  22. 22. Graph
  23. 23. Live-Demo: Launch application
 terraform plan
 terraform apply
 terraform show
 terraform destroy
  24. 24. Comparable Software
 - AWS CloudFormation
 - HEAT, OpenStack orchestration
 - boto, Python AWS library
 - fog, Ruby cloud abstraction library
  25. 25. Problems
 - Version 0.6
 - Still a few bugs
 - Provider coverage
 - Modules too simple
 - Lacking syntactic sugar
  26. 26. Software as a service. Elasticsearch.
  27. 27. Elasticsearch Service
 Let others do the dirty work.
 - The relatively complex setup with shards and replicas is maintained by specialists.
 - Backups and version upgrades are done by these specialists, too.
 - But 1: If version upgrades are announced, YOU have to take action.
 - But 2: Backups SHOULD be done by the specialists. In some cases they cannot provide consistent backups, and that can lead to data loss. => Take care of them yourself.
 - But 3: If you need plugins: in the non-dedicated plans you cannot install them.
 Decide carefully whether to use a service or run it yourself.
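One possible way to take care of backups yourself is to export documents into the line format of the bulk API, so they can be re-indexed later. The Python sketch below only covers the serialization step and assumes hits shaped like those in a search response of that Elasticsearch generation (`_index`, `_type`, `_id`, `_source`). Fetching the hits is left out, and `hits_to_bulk_lines` is a hypothetical helper name.

```python
import json


def hits_to_bulk_lines(hits):
    """Turn search hits into bulk lines: one action line, one source line."""
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    return lines


if __name__ == "__main__":
    # Made-up sample hit for illustration.
    sample = [{"_index": "calls", "_type": "call", "_id": "1",
               "_source": {"agent": "Agent 007"}}]
    print("\n".join(hits_to_bulk_lines(sample)))
```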
  28. 28. The projects. Short overview.
  29. 29. client-a.com Voicefile transcoding and indexing for callcenters 
 - Make telephone calls searchable
 - AccessManagement per Callcenter and Customer
 - Fast responses and results
 - Mobile
 - Be able to white label
  30. 30. Callcenter
  31. 31. client-b.com Automatic content aggregation based on editor’s given input. 
 - Have up to 250,000 news items/day related to a topic from blogs, Twitter, Facebook, Instagram and other configurable sources.
 - Have automatic sorting and merging of similar items into stories.
 - Be nearly realtime
 - Make editing of main stories possible
 - Mobile first
  32. 32. Elasticsearch. Some magic for the app.
  33. 33. Elasticsearch: General
 Search server based on Lucene, providing a RESTful web interface for JSON documents.
 - Near real-time search.
 - Sophisticated mapping configuration options. => Where the magic comes from.
 - Highly scalable and available.
 - Conflict management with optimistic version control to avoid data loss during concurrent write operations.
 - Supports plugins for different areas (like filters, queries, analyzers, etc.)
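The principle of optimistic version control can be shown with a toy in-memory store in Python: a write states which version it expects, and a stale write is rejected instead of silently overwriting newer data. This models the idea only and is not how Elasticsearch is implemented; `VersionedStore` and `VersionConflict` are names invented for this sketch.

```python
class VersionConflict(Exception):
    """Raised when a write is based on an outdated document version."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, body, expected_version=None):
        """Write a document; reject the write if expected_version is stale."""
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                "expected version %s, found %s" % (expected_version, current_version)
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, body)
        return new_version
```

A writer that gets a conflict re-reads the document, reapplies its change and retries, so no concurrent update is lost.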
  34. 34. Elasticsearch: client-a.com
 Elasticsearch as main database
 - Provide several states of a document based on the state of processing. Always findable and restricted by ACLs.
 How do we achieve that?
  35. 35. Elasticsearch: client-a.com
 Restrict access by ACLs for "normal" search
 1. Check whether the user is allowed to access the groups they are requesting documents for.
 2. If yes: Build a query with a filter restricting results to customers and callcenters based on the ACL.
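The two steps can be sketched in Python as follows. The sketch handles callcenters only, uses the `filtered` query form and the field names that appear in the slides, and leaves out the date ranges. `build_acl_query` and `AccessDenied` are hypothetical names, and how the allowed callcenters are looked up is not shown.

```python
class AccessDenied(Exception):
    """Raised when a user requests groups outside of their ACL."""


def build_acl_query(query_text, requested_callcenters, allowed_callcenters):
    requested = set(requested_callcenters)
    if not requested:
        raise ValueError("at least one callcenter must be requested")
    # Step 1: refuse the request if it goes beyond the user's ACL.
    denied = requested - set(allowed_callcenters)
    if denied:
        raise AccessDenied("no access to: " + ", ".join(sorted(denied)))
    # Step 2: restrict the results to the requested callcenters.
    acl_filter = {
        "bool": {
            "should": [
                {"term": {"source.callcenter": name}} for name in sorted(requested)
            ]
        }
    }
    return {
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "fields": ["tags^2", "transscript"],
                        "query": query_text,
                    }
                },
                "filter": {"bool": {"must": [acl_filter]}},
            }
        }
    }
```

Because the filter is built on the server from the ACL, a client cannot widen its own result set by editing the query text.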
  36. 36. Find documents related to callcenter1 and callcenter2
 {
   "query": {
     "filtered": {
       "query": {
         "query_string": {
           "default_operator": "AND",
           "minimum_should_match": "55%",
           "auto_generate_phrase_queries": true,
           "phrase_slop": 3,
           "fields": ["tags^2", "transscript"],
           "query": "*"
         }
       },
       "filter": {
         "bool": {
           "must": [
             {
               "range": {
                 "lastUpdated": {
                   "gte": "now-24h",
                   "lte": "2015-10-25T17:34:24+00:00"
                 }
               }
             },
             {
               "range": {
                 "lastUpdated": {
                   "gte": "2015-08-30T21:04:08+00:00"
                 }
               }
             },
             {
               "bool": {
                 "should": [
                   { "term": { "source.callcenter": "callcenter1" } },
                   { "term": { "source.callcenter": "callcenter2" } }
                 ]
               }
             }
           ]
         }
       }
     }
   }
 }
 {
   "query": {
     "filtered": {
       "query": {
         "query_string": {
           "default_operator": "AND",
           "minimum_should_match": "55%",
           "auto_generate_phrase_queries": true,
           "phrase_slop": 3,
           "fields": ["tags^2", "transscript.texts.contents"],
           "query": "*"
         }
       },
       "filter": {
         "bool": {
           "must": [
             {
               "range": {
                 "lastUpdated": {
                   "gte": "now-24h",
                   "lte": "2015-10-25T17:34:24+00:00"
                 }
               }
             },
             {
               "range": {
                 "lastUpdated": {
                   "gte": "2015-08-30T21:04:08+00:00"
                 }
               }
             },
             {
               "bool": {
                 "should": [
                   { "term": { "source.callcenter": "callcenter1" } },
                   { "term": { "source.callcenter": "callcenter2" } }
                 ]
               }
             }
           ]
         }
       }
     }
   }
 }
  37. 37. Elasticsearch: client-a.com
 Restrict access for suggests
 1. Completion suggesters are a special mechanism for really fast autocompletion:
 https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
 2. How to make suggestions context (ACL) aware?
  38. 38. Find suggestions related to context
 Query:
 {
   "body": {
     "suggest": {
       "text": "Agent 007",
       "completion": {
         "field": "agent.suggest",
         "size": 20,
         "fuzzy": false,
         "context": {
           "customer": ["customer1", "customer4"],
           "callcenter": ["callcenter1", "callcenter2"]
         }
       }
     }
   }
 }
 Mapping:
 {
   "agent": {
     "type": "multi_field",
     "fields": {
       "agent": {
         "type": "string",
         "copy_to": "autocompletion"
       },
       "autocompletion": {
         "type": "string",
         "index_analyzer": "edgeNGram_analyzer_suggest"
       },
       "suggest": {
         "type": "completion",
         "index_analyzer": "nGram_analyzer_suggest2",
         "search_analyzer": "whitespace_analyzer",
         "max_input_length": 20,
         "context": {
           "customer": {
             "type": "category",
             "path": "source.customer_lowercase"
           },
           "callcenter": {
             "type": "category",
             "path": "source.callcenter_lowercase"
           }
         }
       }
     },
     "include_in_all": false
   }
 }
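A Python sketch of how the suggest request could be assembled from a user's ACL, so the context always reflects what the user may see. It mirrors the inner structure of the query on the slide. Lowercasing the context values is an assumption derived from the `_lowercase` path names in the mapping, and `build_suggest_body` is a hypothetical helper.

```python
def build_suggest_body(text, customers, callcenters, size=20):
    """Build a completion-suggest request limited to the given ACL context."""
    return {
        "suggest": {
            "text": text,
            "completion": {
                "field": "agent.suggest",
                "size": size,
                "fuzzy": False,
                "context": {
                    # Assumption: context values are indexed in lowercase.
                    "customer": sorted(c.lower() for c in customers),
                    "callcenter": sorted(c.lower() for c in callcenters),
                },
            },
        }
    }
```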
  39. 39. Elasticsearch: client-b.com
 Elasticsearch to find similar articles and match them to stories
 - Index stories and automatically find entities within the article text
 - Match similar articles to at least one story (based on entities) and context
 How to do that?
  40. 40. Elasticsearch: client-b.com
 Entity matching by list of keep words and aliases
 1. Create a list of synonyms and keep words to be used in filters.
 https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keep-words-tokenfilter.html
 https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html
 2. Index the document a first time to find entities based on keep words and synonyms.
 3. Take the document enriched with entities and build a query from it to match against the set of documents and find similar articles.
 4. Combine them into a story.
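Step 3 could look roughly like the following Python sketch, which turns the entities found in one article into a query for other articles sharing them. The field name `entities`, the helper name and the `min_shared` threshold are assumptions for illustration; the slides do not show the query that was actually used.

```python
def build_similarity_query(entities, min_shared=1):
    """Build a bool query matching documents that share entities."""
    unique = sorted(set(entities))
    if not unique:
        raise ValueError("the document has no entities to match on")
    return {
        "query": {
            "bool": {
                # One optional clause per entity found in the source article.
                "should": [{"term": {"entities": name}} for name in unique],
                # Require at least this many shared entities.
                "minimum_should_match": min_shared,
            }
        }
    }
```

Raising `min_shared` makes stories stricter: articles must share more entities before they are merged.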
  41. 41. Setting for matching entities
 "settings": {
   "analysis": {
     "filter": {},
     "analyzer": {
       "entity_analyzer": {
         "tokenizer": "whitespace",
         "filter": [
           "german_stop",
           "shingle",
           "entity_synonym",
           "shingle",
           "entity_keepwords"
         ]
       }
     }
   }
 },
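To make the filter chain easier to follow, here is a toy simulation in plain Python: whitespace tokenizing, stop word removal, shingles of one and two words, synonym replacement, and finally the keep-words filter. It leaves out the second shingle step and token positions and does not reproduce Lucene's exact behaviour. All function names and the sample word lists are made up for illustration.

```python
def remove_stopwords(tokens, stopwords):
    return [t for t in tokens if t.lower() not in stopwords]


def shingle(tokens, max_size=2):
    """Emit every run of 1..max_size neighbouring tokens as one token."""
    out = []
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


def apply_synonyms(tokens, synonyms):
    """Replace aliases with their canonical entity name."""
    return [synonyms.get(t, t) for t in tokens]


def find_entities(text, stopwords, synonyms, keep_words):
    tokens = text.split()  # whitespace tokenizer
    tokens = remove_stopwords(tokens, stopwords)
    tokens = shingle(tokens)
    tokens = apply_synonyms(tokens, synonyms)
    # Keep-words filter: drop everything that is not a known entity.
    return sorted({t for t in tokens if t in keep_words})


if __name__ == "__main__":
    print(find_entities(
        "Die Kanzlerin besucht Hamburg",
        stopwords={"die"},
        synonyms={"Kanzlerin": "Angela Merkel"},
        keep_words={"Angela Merkel", "Hamburg"},
    ))
```

The shingle step is what allows multi-word entities to be matched even though the tokenizer splits on whitespace.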
  42. 42. Live-Demo: Check how entities are matched in a text
 1. ./load_entities_list
 2. curl -XGET "localhost:9200/talk/_analyze?analyzer=entity_analyzer&pretty=true" -d "Text"
 => The document is indexed with the entities found at indexing time. The analysis process works like operating on a stream.
  43. 43. Martin Schütte and Mike Lohmann Protofy
 Kaiser-Wilhelm-Straße 85
 20355 Hamburg 
 martin@protofy.com
 mike@protofy.com
