Build a Scalable Search Engine With the
New Amazon CloudSearch
Agenda
• What Search Engines Do
• Amazon CloudSearch Introduction
• Building With CloudSearch
What Search Engines Do
Search Engines Connect Us To Data
Documents
Representation of a Document
Field Value
id tt0371746
title Iron Man
description When wealthy industrialist Tony Stark is forced to build
an armored suit after a life-threatening incident, he
ultimately decides to use its technology to fight against
evil.
director Jon Favreau
actors Robert Downey Jr., Gwyneth Paltrow, Terrence Howard
...
rating 7.9
release_date 2008-05-02T00:00:00Z
Data Types
Doubles
Dates
Signed Integers
Text
Literal
Geo
• Latlon data type
• Region search
• Distance sort
• Supports mobile
Text Processing (Normalization)
• Tokenization (parsing)
• Downcasing
• Stemming
• Stopword removal
• Synonym Addition
When wealthy industrialist Tony Stark is forced to
build an armored suit after a life-threatening
incident, he ultimately decides to use its
technology to fight against evil.
when wealth industrial tony stark force build
armor suit after life threaten incident ultimate
decide use technology fight against evil
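The normalization steps above can be sketched in a few lines of Python. This is a toy illustration, not CloudSearch's actual analysis chain: the stopword list and suffix rules are hypothetical and deliberately tiny, the stemmer is far cruder than a real one (it yields "forc" where the slide shows "force"), and synonym addition is left out.

```python
import re

# Hypothetical, deliberately small stopword list and suffix rules (illustration only).
STOPWORDS = {"is", "to", "an", "a", "he", "its", "the"}
SUFFIXES = ("ing", "ed", "es", "s")


def stem(token: str) -> str:
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text: str) -> list[str]:
    """Tokenize, downcase, drop stopwords, then stem each remaining token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


terms = normalize("Tony Stark is forced to build an armored suit")
print(terms)
```

The same function is applied to both documents and queries, so that "Armored" in a query and "armored" in a plot reduce to the same indexed term.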
Indexing
Term Documents (Posting List)
Iron The Man in the Iron Mask
Iron Man 2
Iron Man
The Iron Giant
The Iron Lady
...
Man Rain Man
The Man in the Moon
Iron Man 2
The Lawnmower Man
The Third Man
Iron Man
...
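A posting list like the one above can be built with a dictionary that maps each term to the documents containing it. The sketch below is a minimal in-memory version, assuming whitespace tokenization and lowercase terms; the title list is taken from the slide, and the function name is made up for illustration.

```python
from collections import defaultdict

titles = [
    "The Man in the Iron Mask",
    "Iron Man 2",
    "Iron Man",
    "The Iron Giant",
    "The Iron Lady",
    "Rain Man",
    "The Man in the Moon",
    "The Lawnmower Man",
    "The Third Man",
]


def build_index(docs):
    """Map each lowercase term to the list of document positions that contain it."""
    index = defaultdict(list)
    for doc_id, title in enumerate(docs):
        for term in set(title.lower().split()):  # set(): record each doc once per term
            index[term].append(doc_id)
    return index


index = build_index(titles)
iron_titles = [titles[i] for i in index["iron"]]
print(iron_titles)
```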
Matching
The Man in the Iron Mask
Iron Man 2
Iron Man
The Iron Giant
The Iron Lady
Rain Man
The Man in the Moon
Iron Man 2
The Lawnmower Man
The Third Man
Iron Man
Iron Man 2
Iron Man
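Matching a multi-term query amounts to intersecting the posting lists of its terms. A minimal sketch, using the two lists exactly as they appear on the slide (the helper name is hypothetical):

```python
iron = ["The Man in the Iron Mask", "Iron Man 2", "Iron Man",
        "The Iron Giant", "The Iron Lady"]
man = ["Rain Man", "The Man in the Moon", "Iron Man 2",
       "The Lawnmower Man", "The Third Man", "Iron Man"]


def match_all(*postings):
    """Return documents present in every posting list, in the order of the first list."""
    common = set(postings[0]).intersection(*postings[1:])
    return [doc for doc in postings[0] if doc in common]


matches = match_all(iron, man)
print(matches)
```

Production engines usually keep posting lists sorted by document ID and merge them in a single pass rather than building sets, but the result is the same.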
Ranking and Relevance
• The meat of the search engine
• TF-IDF – uniqueness and presence
• Additional Criteria
– Measures of document value (e.g. rating)
– Observed user behavior
– Freshness
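To make the TF-IDF bullet concrete, here is a small sketch that scores titles for the query "iron man". It assumes one common variant (term frequency divided by document length, natural-log inverse document frequency); CloudSearch's internal formula is not given in the deck and may differ.

```python
import math

titles = ["Iron Man", "Iron Man 2", "Rain Man", "The Iron Giant"]


def tokens(title):
    return title.lower().split()


def tf_idf(term, title, corpus):
    """Term frequency in this title times the rarity of the term across the corpus."""
    doc = tokens(title)
    tf = doc.count(term) / len(doc)
    df = sum(1 for other in corpus if term in tokens(other))
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf


def rank(query, corpus):
    """Order titles by the summed TF-IDF of the query terms, best first."""
    terms = tokens(query)
    return sorted(corpus, key=lambda t: -sum(tf_idf(q, t, corpus) for q in terms))


ranked = rank("iron man", titles)
print(ranked)
```

The additional criteria on the slide (rating, user behavior, freshness) would typically be blended into this text score rather than replace it.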
Summary
• Search makes data accessible
• Search documents gather information about one search target
• Inverted (reverse) indices provide the basis of text matching
• Relevance brings the best matches
Amazon CloudSearch
Building a Search service
• Build your own
– Extend datastores and build custom relevance engine
• Open Source
– Apache Solr, Elasticsearch
• Legacy Enterprise Search
– FAST, Autonomy, Endeca
Challenges with building a Search service
• COMPLEX: Requires extensive search expertise
• COSTLY: High upfront expenditure
• SLOW: Long time to market. Slows innovation
• UNDIFFERENTIATED: Operational overhead that doesn’t add value to
core product
Where CloudSearch fits in the picture
Amazon CloudSearch is a fully managed search service in the cloud that
makes it easy to set up, operate, and scale a search solution for your
website or application
Similar benefits as other AWS Managed Services
• Easy to set up and operate (Console, SDK, CLT)
• Pay as you go
• No need to guess capacity
• Experiment fast with low risk
• Go Global in minutes
Building With CloudSearch
Create a Domain
Upload Data
Document Upload
http(s)://< document service endpoint >/2013-01-01/documents/batch
Accept: application/json
Content-Length: 1176
Content-Type: application/json
Host: doc.imdb-movies-rr2f34ofg56xneuemujamut52i.us-east-1.cloudsearch.amazonaws.com
[ { "type" : "add", "id" : "tt0371746", "fields" : {
    "directors" : [ "Jon Favreau" ],
    "release_date" : "2008-04-14T00:00:00Z",
    "rating" : 7.9,
    "genres" : [ "Action", "Adventure", "Sci-Fi" ],
    "image_url" : "http://ia.media-imdb.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX400_.jpg",
    "plot" : "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.",
    "title" : "Iron Man",
    "rank" : 171,
    "running_time_secs" : 7560,
    "actors" : [ "Robert Downey Jr.", "Gwyneth Paltrow", "Terrence Howard" ],
    "year" : 2008 } },
  { "type" : "delete", "id" : "tt0434409" } ]
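A document batch can be assembled with the standard json module before it is posted to the document service endpoint. The sketch below assumes the 2013-01-01 batch format, in which each operation carries a type, an id, and (for adds) a fields object; the helper names are hypothetical, and the HTTP POST itself is left out.

```python
import json


def add_op(doc_id, fields):
    """Operation that adds or replaces one document."""
    return {"type": "add", "id": doc_id, "fields": fields}


def delete_op(doc_id):
    """Operation that removes one document by ID."""
    return {"type": "delete", "id": doc_id}


batch = [
    add_op("tt0371746", {
        "title": "Iron Man",
        "directors": ["Jon Favreau"],
        "actors": ["Robert Downey Jr.", "Gwyneth Paltrow", "Terrence Howard"],
        "rating": 7.9,
        "year": 2008,
    }),
    delete_op("tt0434409"),
]

body = json.dumps(batch).encode("utf-8")
headers = {
    "Content-Type": "application/json",
    "Content-Length": str(len(body)),  # byte length of the encoded body
}
print(headers)
```

Sending several operations in one batch rather than one request per document keeps the number of upload calls down.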
Simple Queries
Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
http(s)://<search endpoint>/2013-01-01/search?q=iron+man
{"id": "tt0371746",
"highlights": {
"plot": "When wealthy industrialist Tony Stark is
forced to build an armored suit after a life-threatening
incident, he ultimately decides to use its technology to
fight against evil.",
"title": "Iron Man"} },
{"id": "tt1866249",
"highlights": {
"plot": "A man in an iron lung who wishes to lose his
virginity contacts a professional sex surrogate with the
help of his therapist and priest.",
"title": "The Sessions" } },
Complex Queries
/search?q=(and 'iron' genres:'Sci-Fi/Fantasy' actors:'downey'
year:[2008,2010] category:'Movies')&q.parser=structured&
q.options={fields:['title^2','plot^0.5']}
{"id": "tt0371746",
"fields": {
"title": "Iron Man",
"year": "2008"
}},
{"id": "tt1228705",
"fields": {
"title": "Iron Man 2",
"year": "2010"
}}
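Structured queries like the one above are easier to get right when they are composed programmatically and URL-encoded in one place. A minimal sketch follows; the endpoint host and helper names are hypothetical, and values containing single quotes are not handled.

```python
from urllib.parse import urlencode, parse_qs

SEARCH_ENDPOINT = "search-example.us-east-1.cloudsearch.amazonaws.com"  # hypothetical host


def term(value, field=None):
    """Quote a value, optionally scoped to a field: genres:'Sci-Fi/Fantasy'."""
    quoted = f"'{value}'"
    return f"{field}:{quoted}" if field else quoted


def int_range(field, low, high):
    """Inclusive integer range: year:[2008,2010]."""
    return f"{field}:[{low},{high}]"


def and_query(*clauses):
    return "(and " + " ".join(clauses) + ")"


q = and_query(
    term("iron"),
    term("Sci-Fi/Fantasy", "genres"),
    term("downey", "actors"),
    int_range("year", 2008, 2010),
    term("Movies", "category"),
)
query_string = urlencode({
    "q": q,
    "q.parser": "structured",
    "q.options": "{fields:['title^2','plot^0.5']}",
})
url = f"https://{SEARCH_ENDPOINT}/2013-01-01/search?{query_string}"
print(url)
```

Letting urlencode escape spaces, quotes, and brackets avoids the hand-encoding mistakes that are easy to make when pasting such queries into a URL.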
Faceting
Feature Detail: Faceting
/search?q=iron man&facet.genres={}
{"status": {...},"hits": {...},
"facets": {"genres": {
"buckets": [
{"value": "Action", "count": 62},
{"value": "Sci-Fi/Fantasy", "count": 25},
{"value": "Comedy", "count": 2},
{"value": "History", "count": 1},...
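On the client side, the facet buckets are usually turned into a value-to-count mapping that drives the filter sidebar. A small sketch, using a complete JSON sample modeled on the truncated response above (the function name is hypothetical):

```python
import json

response_text = """
{"facets": {"genres": {"buckets": [
    {"value": "Action", "count": 62},
    {"value": "Sci-Fi/Fantasy", "count": 25},
    {"value": "Comedy", "count": 2},
    {"value": "History", "count": 1}
]}}}
"""


def facet_counts(response, field):
    """Return {value: count} for one facet field, or an empty dict if it is absent."""
    buckets = response.get("facets", {}).get(field, {}).get("buckets", [])
    return {bucket["value"]: bucket["count"] for bucket in buckets}


counts = facet_counts(json.loads(response_text), "genres")
print(counts)
```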
Adjustable Ranking
Expressions
• Baseline TF-IDF function provides textual relevance
• Expressions use field sources or other expressions
• Allows customization per-user or per-query
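As a rough illustration of a per-query expression, the sketch below builds request parameters that define an expression and sort by it. It assumes the 2013-01-01 search API's expr.NAME and sort parameters and the built-in _score value; the expression name "blended" and the formula are made up for this example.

```python
from urllib.parse import urlencode


def ranked_search_params(query, expr_name, expression):
    """Define a named expression for this request and sort results by it, best first."""
    return {
        "q": query,
        f"expr.{expr_name}": expression,
        "sort": f"{expr_name} desc",
    }


# Blend textual relevance with the document's rating field (hypothetical formula).
params = ranked_search_params("iron man", "blended", "_score*rating")
query_string = urlencode(params)
print(query_string)
```

Because the expression travels with the request, different users or query types can be ranked differently without reindexing.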
Highlighting
Feature Detail: Highlighting
/search?q=iron+man&highlight.plot={"format":"text"}
{"status": {"rid": "8Pq/88woCwrstGQ=","time-ms": 48},
"hits": {"found": 9,"start": 0,
"hit": [{
"id": "tt1228705",
"fields": {
"title": "Iron Man 2"
},
"highlights": {
"plot": "With the world now aware of his identity as
*Iron* *Man*, Tony Stark must contend..."
} }, . . .
Feature Detail: Suggestions
http://<endpoint>/2013-01-01/suggest?q=ir&suggester=title_sug
{"status": {"rid": "t7mti80oAQrstGQ=","time-ms": 3},
"suggest": {"query": "ir", "found": 5,
"suggestions": [
{"suggestion":"Iron Man Three","score": 0,
"id": "tt0371746"},
{ "suggestion": "Iron Man", "score": 0,
"id": "tt1228705"},
Feature Detail: Availability Options
Feature Detail: Scaling Options
Feature Detail: IAM Integration
Configuration API Only
{
"Version":"2012-10-17",
"Statement": [
{ "Effect": "Allow",
"Action": ["cloudsearch:*"],
"Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" },
{ "Effect": "Deny",
"Action": ["cloudsearch:DeleteDomain"],
"Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" }
]
}
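The policy above allows every CloudSearch action on the domain except DeleteDomain, because an explicit Deny in IAM overrides any Allow. The sketch below is a deliberately simplified model of that one rule, not a reimplementation of IAM policy evaluation: it matches resources exactly and actions with case-sensitive wildcards.

```python
from fnmatch import fnmatchcase

DOMAIN_ARN = "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["cloudsearch:*"], "Resource": DOMAIN_ARN},
        {"Effect": "Deny", "Action": ["cloudsearch:DeleteDomain"], "Resource": DOMAIN_ARN},
    ],
}


def is_allowed(policy, action, resource):
    """Simplified check: any matching Deny wins; otherwise a matching Allow is required."""
    allowed = False
    for statement in policy["Statement"]:
        if statement["Resource"] != resource:
            continue
        if not any(fnmatchcase(action, pattern) for pattern in statement["Action"]):
            continue
        if statement["Effect"] == "Deny":
            return False  # explicit deny overrides any allow
        allowed = True
    return allowed  # no matching statement means implicit deny


print(is_allowed(policy, "cloudsearch:DescribeDomains", DOMAIN_ARN))
print(is_allowed(policy, "cloudsearch:DeleteDomain", DOMAIN_ARN))
```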
Closing Thoughts
• Content Discovery goes hand in hand with Content. Search is
everywhere!
• CloudSearch is a fully managed, easy-to-use, cost-effective search
service
• Get the powerful search features found in open source engines
(Apache Solr) combined with value-added AWS features (easy setup,
on-demand pricing, auto scaling, Multi-AZ, global availability)
Questions?
Jon Handler (handler@amazon.com)
Pravin Muthukumar (pravinm@amazon.com)

More Related Content

Viewers also liked

Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery   Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery Amazon Web Services
 
Expectation for cloudSearch
Expectation for cloudSearchExpectation for cloudSearch
Expectation for cloudSearchMinoru Osuka
 
Japanese CloudSearch Use-Cases and Tech Deep Dive
Japanese CloudSearch Use-Cases and Tech Deep DiveJapanese CloudSearch Use-Cases and Tech Deep Dive
Japanese CloudSearch Use-Cases and Tech Deep DiveEiji Shinohara
 
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...Amazon Web Services
 
Amazon CloudSearch
Amazon CloudSearchAmazon CloudSearch
Amazon CloudSearchdavtchev
 
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...Amazon Web Services
 
AWS Enterprise Day | Securing your Web Applications in the Cloud
AWS Enterprise Day | Securing your Web Applications in the CloudAWS Enterprise Day | Securing your Web Applications in the Cloud
AWS Enterprise Day | Securing your Web Applications in the CloudAmazon Web Services
 
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDBAmazon Web Services
 
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...Amazon Web Services
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
 
How to Tune Search Requests using Amazon CloudSearch
How to Tune Search Requests using Amazon CloudSearchHow to Tune Search Requests using Amazon CloudSearch
How to Tune Search Requests using Amazon CloudSearchAmazon Web Services
 
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...Amazon Web Services
 
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...Amazon Web Services
 
Meetup - Using CloudSearch with DynamoDB
Meetup - Using CloudSearch with DynamoDBMeetup - Using CloudSearch with DynamoDB
Meetup - Using CloudSearch with DynamoDBAmazon Web Services
 
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...Amazon Web Services
 
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMRAmazon Web Services
 
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)Amazon Web Services
 
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013Amazon Web Services
 

Viewers also liked (20)

Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery   Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery
 
Expectation for cloudSearch
Expectation for cloudSearchExpectation for cloudSearch
Expectation for cloudSearch
 
Japanese CloudSearch Use-Cases and Tech Deep Dive
Japanese CloudSearch Use-Cases and Tech Deep DiveJapanese CloudSearch Use-Cases and Tech Deep Dive
Japanese CloudSearch Use-Cases and Tech Deep Dive
 
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...
Storage TCO using AWS Storage Gateway, Amazon S3 and Amazon Glacier (STG202) ...
 
AWS Search Services
AWS Search ServicesAWS Search Services
AWS Search Services
 
Amazon CloudSearch
Amazon CloudSearchAmazon CloudSearch
Amazon CloudSearch
 
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...
Auto-Scaling Web Application Security in Amazon Web Services (SEC308) | AWS r...
 
AWS Enterprise Day | Securing your Web Applications in the Cloud
AWS Enterprise Day | Securing your Web Applications in the CloudAWS Enterprise Day | Securing your Web Applications in the Cloud
AWS Enterprise Day | Securing your Web Applications in the Cloud
 
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
 
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...
ARC205 Building Web-scale Applications Architectures with AWS - AWS re: Inven...
 
Amazon cloud search comparison report
Amazon cloud search comparison reportAmazon cloud search comparison report
Amazon cloud search comparison report
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
How to Tune Search Requests using Amazon CloudSearch
How to Tune Search Requests using Amazon CloudSearchHow to Tune Search Requests using Amazon CloudSearch
How to Tune Search Requests using Amazon CloudSearch
 
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...
Access Control for the Cloud: AWS Identity and Access Management (IAM) (SEC20...
 
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...
Enrich Search User Experience Using Amazon CloudSearch (SVC302) | AWS re:Inve...
 
Meetup - Using CloudSearch with DynamoDB
Meetup - Using CloudSearch with DynamoDBMeetup - Using CloudSearch with DynamoDB
Meetup - Using CloudSearch with DynamoDB
 
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...
AWS re:Invent 2016: Automating and Scaling Infrastructure Administration with...
 
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
 
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
 
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013
Intrusion Detection in the Cloud (SEC402) | AWS re:Invent 2013
 

Similar to AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch

Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Michael Bohlig
 
What to expect when you are visualizing (v.2)
What to expect when you are visualizing (v.2)What to expect when you are visualizing (v.2)
What to expect when you are visualizing (v.2)Krist Wongsuphasawat
 
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...Amazon Web Services
 
Cross-Platform Mobile Apps & Drupal Web Services
Cross-Platform Mobile Apps & Drupal Web ServicesCross-Platform Mobile Apps & Drupal Web Services
Cross-Platform Mobile Apps & Drupal Web ServicesBob Sims
 
20181027 deep learningcommunity_aws
20181027 deep learningcommunity_aws20181027 deep learningcommunity_aws
20181027 deep learningcommunity_awsHirokuni Uchida
 
kumogata-template の紹介
kumogata-template の紹介kumogata-template の紹介
kumogata-template の紹介Naoya Nakazawa
 
Best Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudBest Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudAmazon Web Services
 
Best Practices of IoT in the Cloud
Best Practices of IoT in the CloudBest Practices of IoT in the Cloud
Best Practices of IoT in the CloudAmazon Web Services
 
Best Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudBest Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudAmazon Web Services
 
A search engine in a world of events and microservices - SF Pot @Meetic
A search engine in a world of events and microservices - SF Pot @MeeticA search engine in a world of events and microservices - SF Pot @Meetic
A search engine in a world of events and microservices - SF Pot @MeeticmeeticTech
 
How to Make OpenStack Heat Better based on Our One Year Production Journey
How to Make OpenStack Heat Better based on Our One Year Production JourneyHow to Make OpenStack Heat Better based on Our One Year Production Journey
How to Make OpenStack Heat Better based on Our One Year Production JourneyKaz Shinohara
 
Going from A to C, a Practical Approach to Semantic Search
Going from A to C, a Practical Approach to Semantic SearchGoing from A to C, a Practical Approach to Semantic Search
Going from A to C, a Practical Approach to Semantic SearchPawel Kowaluk
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSematext Group, Inc.
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDBDenny Lee
 
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020Amazon Web Services Korea
 
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...apidays
 
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...apidays
 
Data Science - The Most Profitable Movie Characteristic
Data Science -  The Most Profitable Movie CharacteristicData Science -  The Most Profitable Movie Characteristic
Data Science - The Most Profitable Movie CharacteristicCheah Eng Soon
 
170119 metadata representation_v2s
170119 metadata representation_v2s170119 metadata representation_v2s
170119 metadata representation_v2sWoontack Woo
 

Similar to AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch (20)

Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013
 
What to expect when you are visualizing (v.2)
What to expect when you are visualizing (v.2)What to expect when you are visualizing (v.2)
What to expect when you are visualizing (v.2)
 
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...
ABD338_MirrorWeb - Powering Large-scale, Full-text Search for the UK Governme...
 
Cross-Platform Mobile Apps & Drupal Web Services
Cross-Platform Mobile Apps & Drupal Web ServicesCross-Platform Mobile Apps & Drupal Web Services
Cross-Platform Mobile Apps & Drupal Web Services
 
20181027 deep learningcommunity_aws
20181027 deep learningcommunity_aws20181027 deep learningcommunity_aws
20181027 deep learningcommunity_aws
 
kumogata-template の紹介
kumogata-template の紹介kumogata-template の紹介
kumogata-template の紹介
 
Best Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudBest Practices for IoT Security in the Cloud
Best Practices for IoT Security in the Cloud
 
Best Practices of IoT in the Cloud
Best Practices of IoT in the CloudBest Practices of IoT in the Cloud
Best Practices of IoT in the Cloud
 
Best Practices for IoT Security in the Cloud
Best Practices for IoT Security in the CloudBest Practices for IoT Security in the Cloud
Best Practices for IoT Security in the Cloud
 
A search engine in a world of events and microservices - SF Pot @Meetic
A search engine in a world of events and microservices - SF Pot @MeeticA search engine in a world of events and microservices - SF Pot @Meetic
A search engine in a world of events and microservices - SF Pot @Meetic
 
921創業小聚
921創業小聚921創業小聚
921創業小聚
 
How to Make OpenStack Heat Better based on Our One Year Production Journey
How to Make OpenStack Heat Better based on Our One Year Production JourneyHow to Make OpenStack Heat Better based on Our One Year Production Journey
How to Make OpenStack Heat Better based on Our One Year Production Journey
 
Going from A to C, a Practical Approach to Semantic Search
Going from A to C, a Practical Approach to Semantic SearchGoing from A to C, a Practical Approach to Semantic Search
Going from A to C, a Practical Approach to Semantic Search
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for You
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDB
 
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
 
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...
apidays LIVE London 2021 - API Horror Stories from an Unnamed Coworking Compa...
 
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...
apidays LIVE Australia 2021 - API Horror Stories from an Unnamed Coworking Co...
 
Data Science - The Most Profitable Movie Characteristic
Data Science -  The Most Profitable Movie CharacteristicData Science -  The Most Profitable Movie Characteristic
Data Science - The Most Profitable Movie Characteristic
 
170119 metadata representation_v2s
170119 metadata representation_v2s170119 metadata representation_v2s
170119 metadata representation_v2s
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch

  • 1. Build a Scalable Search Engine With the New Amazon CloudSearch
  • 2. Agenda • What Search Engines Do • Amazon CloudSearch Introduction • Building With CloudSearch
  • 6. Representation of a Document Field Value id tt0371746 title Iron Man description When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil. director Jon Favreau actors Robert Downey Jr., Gwyneth Paltrow, Terrence Howard ... rating 7.9 release_date 2008-05-02T00:00:00Z
  • 8. Geo • Latlon data type • Region search • Distance sort • Supports mobile
  • 9. Text Processing (Normalization) • Tokenization (parsing) • Downcasing • Stemming • Stopword removal • Synonym Addition When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil. when wealth industrial tony stark force build armor suit after life threaten incident ultimate decide use technology fight against evil
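The normalization steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the analyzer CloudSearch uses: the stopword set and the suffix-stripping rules are assumptions made up for the example, synonym addition is left out, and a naive stemmer like this one produces `forc` where the slide shows `force`.

```python
import re

# Illustrative stopword list and suffix rules (assumptions for this sketch);
# real analyzers use much larger, language-specific resources.
STOPWORDS = {"a", "an", "the", "is", "to", "he", "its"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (tokenization + downcasing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text):
    """Tokenize, downcase, drop stopwords, then stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOPWORDS]


print(normalize("Tony Stark is forced to build an armored suit"))
```

The same `normalize` function would be applied to both documents and queries, so that a query for "armored" and a document containing "armor" end up on the same term.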
  • 10. Indexing Term Documents (Posting List) Iron The Man in the Iron Mask Iron Man 2 Iron Man The Iron Giant The Iron Lady ... Man Rain Man The Man in the Moon Iron Man 2 The Lawnmower Man The Third Man Iron Man ...
  • 11. Matching The Man in the Iron Mask Iron Man 2 Iron Man The Iron Giant The Iron Lady Rain Man The Man in the Moon Iron Man 2 The Lawnmower Man The Third Man Iron Man Iron Man 2 Iron Man
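Matching a two-term query amounts to intersecting the posting lists of its terms. The sketch below copies the two posting lists exactly as they appear on the Indexing slide (which elides further entries) and intersects them; the function name `match_all` is an assumption for illustration.

```python
# Posting lists copied from the Indexing slide; the slide elides further entries.
POSTINGS = {
    "iron": [
        "The Man in the Iron Mask",
        "Iron Man 2",
        "Iron Man",
        "The Iron Giant",
        "The Iron Lady",
    ],
    "man": [
        "Rain Man",
        "The Man in the Moon",
        "Iron Man 2",
        "The Lawnmower Man",
        "The Third Man",
        "Iron Man",
    ],
}


def match_all(postings, query):
    """Return documents that appear in the posting list of every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    lists = [postings.get(term, []) for term in terms]
    common = set(lists[0]).intersection(*lists[1:])
    # Keep the order of the first term's posting list for stable output.
    return [doc for doc in lists[0] if doc in common]


print(match_all(POSTINGS, "iron man"))
```

With the slide's lists, only the two titles shown at the end of the Matching slide survive the intersection.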
  • 12. Ranking and Relevance • The meat of the search engine • TF-IDF – uniqueness and presence • Additional Criteria – Measures of document value (e.g. rating) – Observed user behavior – Freshness
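As a rough illustration of TF-IDF, the sketch below scores documents by how often a query term occurs in them (presence) weighted by how rare the term is across the collection (uniqueness). It uses one common textbook variant, term frequency times `log(N / df)`; the exact formula a given engine applies may differ, and the document ids are hypothetical.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, docs):
    """Score each document (id -> list of tokens) by summed TF-IDF of the query terms."""
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        if not tokens:
            scores[doc_id] = 0.0
            continue
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if df[term]:
                idf = math.log(n / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores[doc_id] = score
    return scores


# Hypothetical, already-normalized documents.
docs = {
    "doc_a": ["iron", "man"],
    "doc_b": ["rain", "man"],
    "doc_c": ["iron", "giant"],
}
print(tf_idf_scores(["iron", "man"], docs))
```

Here `doc_a` contains both query terms and therefore outranks the two documents that contain only one of them.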
  • 13. Summary • Search makes data accessible • Search documents gather information about one search target • Reverse indices provide the basis of text-text matching • Relevance brings the best matches
  • 15. Building a Search service • Build your own – Extend datastores and build custom relevance engine • Open Source – Apache Solr, ElasticSearch • Legacy Enterprise Search – FAST, Autonomy, Endeca
  • 16. Challenges with building a Search service • COMPLEX: Requires extensive search expertise • COSTLY: High upfront expenditure • SLOW: Long time to market. Slows innovation • UNDIFFERENTIATED: Operational overhead that doesn’t add value to core product
  • 17. Where CloudSearch fits in the picture Amazon CloudSearch is a fully managed search service in the cloud that makes it easy to set up, operate, and scale a search solution for your website or application. Similar benefits as other AWS Managed Services • Easy to set up and operate (Console, SDK, CLI) • Pay as you go • No need to guess capacity • Experiment fast with low risk • Go Global in minutes
  • 21. Document Upload http(s)://< document service endpoint >/2013-01-01/documents/batch Accept: application/json Content-Length: 1176 Content-Type: application/json Host: doc.imdb-movies-rr2f34ofg56xneuemujamut52i.us-east-1.cloudsearch.amazonaws.com [ { "type" : "add", "id" : "tt0371746", "fields" : { "directors" : [ "Jon Favreau" ], "release_date" : "2008-04-14T00:00:00Z", "rating" : 7.9, "genres" : [ "Action", "Adventure", "Sci-Fi" ], "image_url" : "http://ia.media-imdb.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX400_.jpg", "plot" : "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.", "title" : "Iron Man", "rank" : 171, "running_time_secs" : 7560, "actors" : [ "Robert Downey Jr.", "Gwyneth Paltrow", "Terrence Howard" ], "year" : 2008 }}, { "type" : "delete", "id" : "tt0434409"} ]
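A batch body like the one on this slide can be assembled with the standard `json` module. The sketch assumes the 2013-01-01 batch format, in which each operation carries a `type` (`add` or `delete`), an `id`, and, for adds, a `fields` object; the helper names are hypothetical, and the actual HTTP POST to the document endpoint is left out.

```python
import json


def add_op(doc_id, fields):
    """Build an 'add' operation for one document (assumed batch format)."""
    return {"type": "add", "id": doc_id, "fields": fields}


def delete_op(doc_id):
    """Build a 'delete' operation for one document id."""
    return {"type": "delete", "id": doc_id}


def build_batch(operations):
    """Serialize a list of operations into a UTF-8 JSON request body."""
    return json.dumps(operations).encode("utf-8")


body = build_batch([
    add_op("tt0371746", {
        "title": "Iron Man",
        "directors": ["Jon Favreau"],
        "rating": 7.9,
        "year": 2008,
    }),
    delete_op("tt0434409"),
])
print(len(body), "bytes to send with Content-Type: application/json")
```

The length of `body` is what would go into the `Content-Length` header shown on the slide.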
  • 22. Simple Queries Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 23. Simple Queries http(s)://<search endpoint>/2013-01-01/search?q=iron+man {"id": "tt0371746", "highlights": { "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.", "title": "Iron Man"} }, {"id": "tt1866249", "highlights": { "plot": "A man in an iron lung who wishes to lose his virginity contacts a professional sex surrogate with the help of his therapist and priest.", "title": "The Sessions" } },
  • 24. Complex Queries Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 25. Complex Queries /search?q=(and 'iron' genres:'Sci-Fi/Fantasy' actors:'downey' year:[2008,2010] category:'Movies')&q.parser=structured&q.options={fields:['title^2','plot^0.5']} {"id": "tt0371746", "fields": { "title": "Iron Man", "year": "2008" }}, {"id": "tt1228705", "fields": { "title": "Iron Man 2", "year": "2010" }}
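The structured query on this slide contains quotes, brackets, and spaces, so it needs URL encoding before it is sent. A minimal sketch using only the standard library follows; the endpoint host is a hypothetical placeholder, and `build_search_url` is an illustrative helper, not part of any SDK.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(endpoint, query, parser="structured", options=None):
    """Return a search URL with the query parameters percent-encoded."""
    params = {"q": query, "q.parser": parser}
    if options is not None:
        params["q.options"] = options
    return f"https://{endpoint}/2013-01-01/search?{urlencode(params)}"


# Hypothetical endpoint; the query and options are taken from the slide.
ENDPOINT = "search-example.us-east-1.cloudsearch.amazonaws.com"
QUERY = (
    "(and 'iron' genres:'Sci-Fi/Fantasy' actors:'downey' "
    "year:[2008,2010] category:'Movies')"
)
OPTIONS = "{fields:['title^2','plot^0.5']}"

url = build_search_url(ENDPOINT, QUERY, options=OPTIONS)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0] == QUERY)
```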
  • 26. Faceting Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 27. Feature Detail: Faceting /search?q=iron man&facet.genres={} {"status": {...},"hits": {...}, "facets": {"genres": { "buckets": [ {"value": "Action", "count": 62}, {"value": "Sci-Fi/Fantasy", "count": 25}, {"value": "Comedy", "count": 2}, {"value": "History", "count": 1},...
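Conceptually, a facet is a count of matching documents per field value. The sketch below computes buckets in the same `value`/`count` shape as the response on this slide, over a small hypothetical list of matching documents; it illustrates the idea and says nothing about how the service computes facets internally.

```python
from collections import Counter


def facet_buckets(docs, field):
    """Count how many matching documents carry each value of a multi-valued field."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.get(field, []))
    # most_common() orders buckets from the highest count to the lowest.
    return [{"value": value, "count": count} for value, count in counts.most_common()]


# Hypothetical matches for a query.
matches = [
    {"title": "Iron Man", "genres": ["Action", "Sci-Fi/Fantasy"]},
    {"title": "Iron Man 2", "genres": ["Action", "Sci-Fi/Fantasy"]},
    {"title": "The Iron Lady", "genres": ["History"]},
    {"title": "The Iron Giant", "genres": ["Action"]},
]
print(facet_buckets(matches, "genres"))
```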
  • 28. Adjustable Ranking Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 29. Expressions • Baseline TF-IDF function provides textual relevance • Expressions use field sources or other expressions • Allows customization per-user or per-query
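The idea of an expression can be sketched as a function that takes a hit's baseline text score plus its field values and returns the number to sort by. The hits, field names, and the `_score` key below are illustrative assumptions; in the service itself, expressions are configured or passed with the query rather than written as Python lambdas.

```python
def rank_hits(hits, expression):
    """Sort hits from highest to lowest value of the given ranking expression."""
    return sorted(hits, key=expression, reverse=True)


# Hypothetical hits: '_score' stands in for the baseline text relevance.
hits = [
    {"id": "doc_a", "_score": 2.0, "rating": 5.0},
    {"id": "doc_b", "_score": 1.5, "rating": 8.0},
]

# Baseline: textual relevance only.
by_text = rank_hits(hits, lambda hit: hit["_score"])

# Custom expression: blend textual relevance with a document-value field.
by_blend = rank_hits(hits, lambda hit: hit["_score"] * hit["rating"])

print([hit["id"] for hit in by_text])
print([hit["id"] for hit in by_blend])
```

Because the expression is chosen per call, a different blend could be supplied for each user or each query.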
  • 30. Highlighting Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 31. Feature Detail: Highlighting /search?q=iron+man&highlight.plot={"format":"text"} {"status": {"rid": "8Pq/88woCwrstGQ=","time-ms": 48}, "hits": {"found": 9,"start": 0, "hit": [{ "id": "tt1228705", "fields": { "title": "Iron Man 2" }, "highlights": { "plot": "With the world now aware of his identity as *Iron* *Man*, Tony Stark must contend..." } }, . . .
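The text-format highlight in this response wraps each matched query term in asterisks. A minimal sketch of that behavior, using whole-word, case-insensitive matching, could look like the following; the `highlight` function and its `pre`/`post` parameters are hypothetical names for this example, and a real engine would match on normalized terms rather than raw words.

```python
import re


def highlight(text, query, pre="*", post="*"):
    """Wrap every whole-word, case-insensitive match of a query term in markers."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)


plot = "With the world now aware of his identity as Iron Man, Tony Stark must contend"
print(highlight(plot, "iron man"))
```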
  • 32. Movies > Sci-Fi/Fantasy > 2008 to 2010 > Downey > "Iron"
  • 33. Feature Detail: Suggestions http://<endpoint>/2013-01-01/suggest?q=ir&suggester=title_sug {"status": {"rid": "t7mti80oAQrstGQ=","time-ms": 3}, "suggest": {"query": "ir", "found": 5, "suggestions": [ {"suggestion":"Iron Man Three","score": 0, "id": "tt0371746"}, { "suggestion": "Iron Man", "score": 0, "id": "tt1228705"},
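Suggestions of this kind are, at their simplest, a prefix lookup over a field's values. The sketch below returns the titles that start with the typed prefix; the title list is hypothetical, and scoring and fuzzy matching are deliberately left out.

```python
def suggest(titles, prefix, size=10):
    """Return up to `size` titles that start with the prefix, ignoring case."""
    needle = prefix.lower()
    matches = [title for title in titles if title.lower().startswith(needle)]
    return sorted(matches)[:size]


# Hypothetical values of the field the suggester is built on.
titles = ["Iron Man", "Iron Man 2", "Rain Man", "The Iron Giant"]
print(suggest(titles, "ir"))
```

Note that "The Iron Giant" is not returned, because the match is on the start of the whole title rather than on any word within it.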
  • 36. Feature Detail: IAM Integration Configuration API Only { "Version":"2012-10-17", "Statement": [ { "Effect": "Allow", "Action": ["cloudsearch:*"], "Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" }, { "Effect": "Deny", "Action": ["cloudsearch:DeleteDomain"], "Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" } ] }
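The policy on this slide relies on the IAM rule that an explicit Deny overrides any Allow: every `cloudsearch:` action on the domain is permitted except `DeleteDomain`. The sketch below is a heavily simplified evaluator written only to illustrate that rule; real IAM evaluation also handles resource wildcards, conditions, principals, and multiple policies, none of which are modeled here.

```python
from fnmatch import fnmatchcase

DOMAIN_ARN = "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies"

# The policy document from the slide.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["cloudsearch:*"], "Resource": DOMAIN_ARN},
        {"Effect": "Deny", "Action": ["cloudsearch:DeleteDomain"], "Resource": DOMAIN_ARN},
    ],
}


def is_allowed(policy, action, resource):
    """Simplified check: default deny, and any matching Deny beats every Allow."""
    allowed = False
    for statement in policy["Statement"]:
        if statement["Resource"] != resource:
            continue
        if not any(fnmatchcase(action, pattern) for pattern in statement["Action"]):
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


print(is_allowed(POLICY, "cloudsearch:DescribeDomains", DOMAIN_ARN))
print(is_allowed(POLICY, "cloudsearch:DeleteDomain", DOMAIN_ARN))
```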
  • 37. Closing Thoughts • Content Discovery goes hand in hand with Content. Search is everywhere! • CloudSearch is a fully managed, easy-to-use, cost-effective search service • Get the powerful search features found in open source engines (Apache Solr) combined with value-add AWS features (easy setup, on-demand pricing, auto scaling, Multi-AZ, global availability)
  • 38. Questions? Jon Handler (handler@amazon.com) Pravin Muthukumar (pravinm@amazon.com)