Introducing CloudSearch✦ Powered by the “A9” search engine✦ Same search technology used by Amazon.com✦ Similar to Apache Solr✦ Managed service that auto-scales based on usage and storage✦ Searches full text and metadata✦ Customizable schema
What is CloudSearch?✦ Search Domains✦ Full-text indexing of documents and metadata✦ Simple Document API✦ Rich API to search - no AWS credentials required✦ “Search Facets”✦ “Result Fields”
Search Domains✦ Single set of endpoints✦ Completely isolated✦ Cannot search across domains✦ Set of instances✦ Set of permissions✦ Specific schema
Indexing✦ Key -> value (fields can be multi-valued)✦ Specify a schema! Limit of 100 values per item✦ Supports different types: text (default), uint, literal (not tokenized, matched exactly)✦ Options on each index: Search, Facet, Result (Facet and Result can’t both be used on the same field!)
Advanced Indexing Settings✦ Rank expressions: control how matching results are ranked✦ Stopwords: words to remove and not index, such as “the”, “a”, “an”, “and”✦ Stemming: reduce a given word to its “root form”, e.g. “learning” -> “learn”✦ Synonyms: transform one word into another, e.g. “google” -> “search”
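To make the three text options concrete, here is a minimal Python sketch of what they do to a piece of text. It is not CloudSearch's implementation: the `analyze` function, the lookup tables, and the order in which the steps run are all assumptions chosen for illustration, using the slide's own examples.

```python
import re

# Illustrative tables only, taken from the slide's examples.
# A real domain would configure its own lists.
STOPWORDS = {"the", "a", "an", "and"}
STEMS = {"learning": "learn"}      # word -> root form
SYNONYMS = {"google": "search"}    # word -> replacement word


def analyze(text):
    """Lowercase and tokenize, then apply synonyms, stopwords and stemming.

    The order of the three steps is an assumption made for this sketch.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    terms = []
    for token in tokens:
        token = SYNONYMS.get(token, token)
        if token in STOPWORDS:
            continue  # stopwords are dropped and never indexed
        terms.append(STEMS.get(token, token))
    return terms


print(analyze("Learning the Google API"))  # ['learn', 'search', 'api']
```

The same transformation has to be applied to both the indexed text and the query, otherwise a search for “learning” would not match a document indexed under “learn”.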
Document API✦ REST-style API✦ Requests are not signed✦ Permissions by IP✦ Can also upload via the Console✦ Add via SDF (Search Document Format)✦ Batch operations: add and delete✦ Each document has an ID and a version
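A batch of add and delete operations might be assembled like this in Python. This is a sketch only: the key names (`type`, `id`, `version`, `lang`, `fields`) follow the JSON form of SDF as best recalled from the 2011-02-01 API and should be checked against the current documentation. The helper names, document IDs and field names are hypothetical, and nothing is actually sent to a domain.

```python
import json


def add_op(doc_id, version, fields, lang="en"):
    """Build one 'add' operation; `fields` maps a field name to a value or list of values."""
    return {"type": "add", "id": doc_id, "version": version,
            "lang": lang, "fields": fields}


def delete_op(doc_id, version):
    """Build one 'delete' operation for an existing document ID."""
    return {"type": "delete", "id": doc_id, "version": version}


# One batch can mix adds and deletes.
batch = [
    add_op("doc1", 1, {"title": "Intro to CloudSearch", "tags": ["aws", "search"]}),
    delete_op("doc2", 2),
]

body = json.dumps(batch)  # request body for the domain's document endpoint
print(body)
```

Because every operation carries a version, re-sending an older version of a document should not overwrite a newer one; a timestamp is one common choice for the version number.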
Search API ✦ Authorized by IP address (or CIDR range) ✦ Supports “simple” and “boolean” query searches ✦ Search across all indexed fields, or specific fields, or both ✦ Returns simple JSON or XML output ✦ Also allows returning of Facets.
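Because search requests are plain unsigned HTTP GETs, building one is mostly string assembly. The Python sketch below shows the idea. The parameter names (`q`, `start`, `size`, `return-fields`, `facet`) and the `2011-02-01` path segment are assumptions based on the API version current when these slides were written, and the endpoint host is a made-up placeholder.

```python
from urllib.parse import urlencode


def search_url(endpoint, query, return_fields=(), facets=(), start=0, size=10):
    """Return a search URL for a simple text query.

    `endpoint` is the domain's search host; the value used below is hypothetical.
    """
    params = [("q", query), ("start", start), ("size", size)]
    if return_fields:
        params.append(("return-fields", ",".join(return_fields)))
    if facets:
        params.append(("facet", ",".join(facets)))
    return f"http://{endpoint}/2011-02-01/search?" + urlencode(params)


url = search_url(
    "search-example.invalid",
    "star wars",
    return_fields=["title", "year"],
    facets=["genre"],
)
print(url)
```

Since access is granted by IP address rather than by credentials, a URL like this works from any permitted host, including a browser.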
Search Facets✦ Special “filtering” fields, suited to fields that do not have many unique values✦ Each search request can return per-value hit counts for these fields✦ Can be used to narrow further searches by adding a boolean query✦ A facet field cannot also be returned in results
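The drill-down pattern can be sketched in Python: take the facet counts from one search, pick a value, and build a boolean query that restricts the next search to it. The `(and '...' field:'...')` syntax is recalled from the 2011-02-01 boolean query format and should be treated as an assumption. The shape of `facet_counts`, the field names and the helper names are hypothetical.

```python
def quote(value):
    """Wrap a value in single quotes, escaping backslashes and quotes inside it."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def narrow_by_facet(text_query, facet_field, facet_value):
    """Build a boolean query that keeps the text match and adds a facet filter."""
    return f"(and {quote(text_query)} {facet_field}:{quote(facet_value)})"


# Hypothetical counts returned alongside a first search for "star wars".
facet_counts = {"genre": {"Sci-Fi": 12, "Comedy": 3}}

# Drill into the most common genre.
top_genre = max(facet_counts["genre"], key=facet_counts["genre"].get)
print(narrow_by_facet("star wars", "genre", top_genre))
```

In a UI this is the familiar sidebar of categories with counts: clicking one re-runs the search with the extra clause.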
Result Fields✦ Special fields that are returned with each hit✦ Each field is an array✦ Responses also return the total number found and the “start” index
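Since every result field comes back as an array, client code usually has to unwrap it. The Python sketch below parses a hand-written sample response. The JSON layout (`hits.found`, `hits.start`, `hits.hit[].data`) is an assumption modelled on the 2011-02-01 response format, and the sample documents and the `summarize` helper are invented for illustration.

```python
import json

# Hand-written sample, not real service output.
SAMPLE = """
{"hits": {"found": 2, "start": 0, "hit": [
  {"id": "doc1", "data": {"title": ["Intro to CloudSearch"], "tags": ["aws", "search"]}},
  {"id": "doc2", "data": {"title": ["DynamoDB basics"], "tags": []}}
]}}
"""


def summarize(response_text):
    """Return (found, start, rows); each row is (id, {field: first value or None})."""
    hits = json.loads(response_text)["hits"]
    rows = []
    for hit in hits["hit"]:
        data = hit.get("data", {})
        first_values = {name: (values[0] if values else None)
                        for name, values in data.items()}
        rows.append((hit["id"], first_values))
    return hits["found"], hits["start"], rows


found, start, rows = summarize(SAMPLE)
print(found, start)
for doc_id, fields in rows:
    print(doc_id, fields)
```

Taking only the first value is fine for single-valued fields such as a title; for multi-valued fields like tags the full array should be kept.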
How does this help with DynamoDB?✦ DynamoDB is non-indexed✦ Stores Metadata only✦ Can be used to store full metadata for objects that are indexed in CloudSearch✦ Both are exceptionally fast and scalable
It's not cheap✦ Priced per instance and instance type✦ You do not control scaling, Amazon does✦ At minimum, approximately $100 per “domain”
Pricing ✦ $0.12/hour - 1 million documents - $87/month ✦ $0.48/hour - 4 million documents - $346/month ✦ $0.68/hour - 8 million documents - $490/month ✦ $0.10 per 1,000 Batch Put requests
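The monthly figures follow from the hourly rates. As a worked check in Python, assuming a 30-day month (720 hours) and rounding up to the next dollar, which happens to reproduce the numbers on the slide; the function names are made up for this sketch.

```python
import math

HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month


def monthly_instance_cost(hourly_rate):
    """Hourly rate times 720 hours, rounded up to a whole dollar."""
    return math.ceil(hourly_rate * HOURS_PER_MONTH)


def batch_put_cost(requests, price_per_1000=0.10):
    """Cost of document batch uploads at $0.10 per 1,000 requests."""
    return requests / 1000 * price_per_1000


for rate in (0.12, 0.48, 0.68):
    print(f"${rate:.2f}/hour -> ${monthly_instance_cost(rate)}/month")

# Example: 50,000 batch uploads in a month adds about $5.
print(f"${batch_put_cost(50_000):.2f}")
```

Batch uploads are a small part of the bill compared with the always-on instance, which is why the per-domain minimum dominates for small indexes.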
What to take away from this✦ CloudSearch is expensive, but saves development time✦ CloudSearch provides powerful features that would take time to implement yourself✦ Just like everything else Amazon releases, the price will decrease eventually.