[2B1] Paradigm Shift in Search Engines

NAVER D2
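Paradigm Shift in Search Engines
- Convergence of Big Data Analytics and Search -
Department of Computer Science, College of Informatics, Korea University
Jaewoo Kang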
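Research Background
Changing information needs of users
The Web 2.0 era of participation, sharing, and openness has shifted information production and consumption toward a user-centered structure
The volume of personal opinions and subjective information on the Web and SNS has surged
Demand is growing for subjective information such as “a good Korean restaurant in Bundang for a formal family meeting”, “a thriller with a good twist”, and “trending handbags”
•Demand for factual search (e.g., ‘action movie’) is flat or irregular, whereas subjective queries such as ‘best action movie’ and ‘best SUV’ keep rising
Google search trend graph for the queries ‘action movie’ and ‘best action movie’
(Google Trends, http://www.google.com/trends/)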
Aardvark: Large-Scale Social Search Engine 
(Horowitz and Kamvar, WWW2010) 
“64% of queries contain subjective element in Aardvark” 
(e.g., “Do you know of any great delis in Baltimore, MD?” 
“What are the things/crafts/toys your children have made that made them really proud of themselves?”) 
Acquired by Google in 2010 for USD 50,000,000 (about KRW 53 billion)
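Factual Search vs. Consensus Search
Growing demand for consensus search
Search Engine vs. Consensus Engine
Limits of conventional document-based search engines
Objective information (e.g., ‘action movies’ or ‘handbag prices’) can be found with today's search engines, but subjective queries (‘fun action movies’, ‘handbags in fashion these days’) cannot be answered adequately
A fundamental paradigm shift is required toward a new search technology that finds the objects described within documents, treats them as the units to index, and aggregates, analyzes, and ranks, per object, the user opinions scattered across many documents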
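Simple product sorting by
•lowest price
•highest price
•registration date
•most reviews
User reviews listed one after another
•hard to grasp the content
•hard to aggregate the information
Complicated option selection
Result list with no useful information beyond the TV's size in inches and its price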
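LG Electronics 47LM6200
Purchase review | 2013.04.12
I hesitated a lot about buying an expensive electronics product online. Now that I have seen the product installed, I am very satisfied. The screen is big, the picture is good, and I feel good about having bought it cheaply.
A TV with powerful performance for the price. | 2013.04.01
The product itself is an entry-level model at a low price. It has powerful features such as Internet and 3D, and after reading reviews here and there and seeing that everyone was satisfied, I bought it with confidence. I think I got a good product at a reasonable price. Thank you.
An excellent choice... LG Smart TV 47LM6200... | 2012.09.10
In particular, the remote control's functions and the 3D glasses are much easier to use than those of company S. The 3D glasses are also far more comfortable than other companies' battery-powered 3D glasses, and the clip-on type, which is especially convenient for people who wear glasses, is a standout idea.
Clean picture quality and good wall-mount installation. Delivery delayed due to product supply | 2012.07.02
The picture is clean, and above all it was installed very well as a wall mount, so I am satisfied.
Not bad. | 2013.04.19
For the price, this seems decent.
However, the mouse remote is a bit of a mixed blessing. A smart TV definitely needs one, but its sensitivity makes it quite uncomfortable to use. The remote is also extremely minimal.. so simple that it is awkward to operate.. Apart from the remote system, it is not bad.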
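Search: TV with good value for money
LG 47LM: The product itself is an entry-level model at a low price
LG 47LM: It was a very good choice for the price.
LG 47LM: I think it is a 3D smart LED TV with excellent performance for the price.
LG 47LM: The screen is big, the picture is good, and I feel good about having bought it cheaply.
Samsung UN50: Above all, I want to say this is the best product for the price
Samsung UN50: I am satisfied to have bought it at a very good price
Samsung UN50: Good size and picture quality for the price.
Samsung UN50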
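Samsung Electronics UN50ES6800F
Truly the best product & service. | 2013.07.31
I ordered yesterday and had no idea it would arrive this quickly!!! The delivery technician also installed it just the way I wanted. Above all, I want to say this is the best product for the price. Satisfied with everything!!
Satisfied with the friendly price. | 2012.12.18
I am satisfied to have bought it at a very good price. As a Samsung Smart TV, its performance and appearance are no different from what you see in department stores, and I am satisfied. I have been using it for about two weeks now, and I am satisfied with both the features and the look.
The best-value model for the price | 2013.03.21
I ordered in the evening and it was delivered the next morning!!! I bought the wall-mount version; it is big and should be great for watching movies. Good picture, good size, and lightning-fast delivery!!
Aspects matched to the query terms
bought it cheaply
best for the price
low price
excellent performance for the price
good size and picture quality for the price
very good price
decent for the price
very good choice for the price
Segment scores
0.5, 0.8, 0.9, 0.7, 0.5, 0.8, 0.7, 0.6
Samsung total: 2.9
LG total: 2.6
Final search ranking
1. Samsung UN50ES6800F
2. LG 47LM6200
Click!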
Consensus Search 
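Users now actively rely on Internet search when making decisions about purchases and cultural activities
They consult other users' reviews and write-ups before attending a performance or buying a product
Each review is written from its author's “subjective opinion”
Reading as many reviews as possible helps the decision
What is a consensus engine?
It analyzes in advance the large body of reviews that other users have already written
A search system that analyzes and aggregates other users' reviews from the viewpoint (query) the user wants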
Consensus Engine 
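Today's search engines are not enough!
The top few documents may contain the desired information
But each document is only its author's opinion
It cannot represent the consensus of the public
Yet the answer already exists on the Web!
Many users post their opinions online in various forms (SNS, blogs, reviews)
Treat these online opinions as “latent votes”
Consensus search becomes possible by collecting and analyzing already-expressed online opinions at query time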
Uhm.. Yeah.. It is noisy, but… 
Online Consumer Posts: 2nd most trusted form of advertising (The Nielsen Company, Q3 2011)
Is consensus search ever possible…? 
“Best Action Movies in 2013” 
Not immediately answerable with conventional search engines 
Because the answer should be based on consensus, which cannot be found in any single one of the “top-10” documents
However, the answers are already on the Web 
Numerous implicit votes from people on the Web and Social Networks 
Only if we can process them …. 
… ONLINE! 
CONSENTO Overview 
The Key Ideas (I) 
Subdocument-level Indexing 
Capture semantics from user opinion more precisely 
The indexing unit is no longer a page but:
•a review within a page, if more than one review exists on the page,
•or a sentence within a review,
•or even a clause or phrase within a sentence that discusses one aspect of the target entity
Maximal Coherent Semantic Unit (MCSU)
•the finest-grained indexing unit used in CONSENTO indexing
•a maximal subsequence of words within a sentence that carries a single coherent meaning
Indexing MCSUs instead of documents enables semantic analysis to be performed at indexing time
•facilitating the online processing of consensus search at query time
The Key Ideas (II) 
ConsensusRank: A Unique Ranking Method based on Public Sentiment 
Virtually all existing ranking methods rank target objects (either documents or entities) directly, based on their relevance to the query terms
In contrast, ConsensusRank ranks the entities indirectly by aggregating the scores of the referring segments (e.g., MCSUs) that match the query context
It can be viewed as a voting process where each reviewer casts a weighted vote on an entity with respect to a query by expressing positive or negative opinions about that entity 
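The voting view can be sketched in a few lines of Python. This is an illustration only: the polarity encoding (+1 / -1), the vote weights, the entity names, and the function name `consensus_score` are all assumptions, not the actual ConsensusRank formula, which this transcript does not spell out.

```python
from typing import Iterable, Tuple


def consensus_score(votes: Iterable[Tuple[int, float]]) -> float:
    """Sum signed, weighted votes for one entity.

    Each vote is (polarity, weight): polarity is +1 for a positive
    opinion and -1 for a negative one (assumed encoding); weight says
    how much that reviewer's vote counts (also assumed).
    """
    return sum(polarity * weight for polarity, weight in votes)


# Hypothetical votes cast by reviews that match one query.
votes_by_entity = {
    "Movie A": [(+1, 0.8), (+1, 0.5), (-1, 0.4)],
    "Movie B": [(+1, 0.6), (-1, 0.7)],
}

# Entities with the highest net weighted vote come first.
ranking = sorted(
    votes_by_entity,
    key=lambda name: consensus_score(votes_by_entity[name]),
    reverse=True,
)
print(ranking)
```

Summing signed weights is only one plausible way to combine votes; a real system could just as well normalize by the number of reviews or cap the influence of a single reviewer.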
CONSENTO Architecture
(A) Indexing Subsystem
1. Parsing & Preprocessing: DOM-tree Parsing, Contents Extraction (Web Documents → Review Contents)
2. Contents Segmentation: Sentence Splitter, MCSU Extraction (Review Contents → MCSUs)
3. Indexing: Inverted Entry Construction & Indexing (MCSUs → Entity Search Index)
(B) Searching Subsystem
4. Query Parsing: Query Preprocessing & Expansion (User Query → Expanded Query)
5. Retrieval: Matching MCSU Retrieval (Expanded Query → MCSU Posting List)
6. Ranking: Segment Grouping, Score Aggregation (MCSU Posting List → Entity List)
I: Parsing & Preprocessing
The current working prototype of CONSENTO is built for the movie domain
CONSENTO crawled review pages from popular movie review sites such as IMDb and Metacritic
Review contents are extracted using DOM-tree parsing and XPath queries
Extracted information includes:
entity name (i.e., movie name)
review text
date and time
review quality (e.g., “20 out of 30 people found the review helpful”)
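The extraction step can be sketched with Python's standard library. The page markup, class names, and field names below are hypothetical, and the snippet assumes well-formed markup so that `xml.etree.ElementTree` and its limited XPath support suffice; real review pages would need an HTML parser and site-specific XPath queries. Turning the "helpful" counts into a ratio is also an assumption about how review quality might be derived.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed review page; real sites use different markup.
PAGE = """<html><body>
  <h1 class="title">Avatar</h1>
  <div class="review">
    <span class="date">2010-01-15 21:30</span>
    <span class="helpful">20 out of 30 people found the review helpful</span>
    <p class="text">The world of Pandora is stunning</p>
  </div>
</body></html>"""


def extract_reviews(page: str) -> list:
    """Pull entity name, review text, date, and a quality ratio from a page."""
    root = ET.fromstring(page)
    entity = root.findtext(".//h1[@class='title']")
    records = []
    for node in root.findall(".//div[@class='review']"):
        helpful = node.findtext("span[@class='helpful']", default="")
        counts = [int(tok) for tok in helpful.split() if tok.isdigit()]
        # Assumed definition: helpful votes divided by total votes.
        quality = counts[0] / counts[1] if len(counts) == 2 and counts[1] else None
        records.append({
            "entity": entity,
            "text": node.findtext("p[@class='text']"),
            "date": node.findtext("span[@class='date']"),
            "quality": quality,
        })
    return records


reviews = extract_reviews(PAGE)
print(reviews[0]["entity"], reviews[0]["quality"])
```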
II: Contents Segmentation
Split the review contents into MCSUs
e.g., “The storyline is ridiculous, the acting is laughable, and the camera work is terrible.”
s1) “The storyline is ridiculous”
s2) “the acting is laughable”
s3) “the camera work is terrible”
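A deliberately naive way to reproduce the split in this example is to cut a sentence at commas and at the conjunction "but". This is only a sketch: the function name `split_into_mcsus` and the boundary rule are assumptions, and the transcript does not describe how CONSENTO actually detects MCSU boundaries.

```python
import re

# Assumed boundary rule: a comma (optionally followed by "and"/"but"),
# or a standalone "but" between clauses.
_BOUNDARY = re.compile(r"\s*,\s*(?:(?:and|but)\s+)?|\s+but\s+")


def split_into_mcsus(sentence: str) -> list:
    """Split one sentence into clause-like segments."""
    body = sentence.strip().rstrip(".!?")
    return [part.strip() for part in _BOUNDARY.split(body) if part.strip()]


segments = split_into_mcsus(
    "The storyline is ridiculous, the acting is laughable, "
    "and the camera work is terrible."
)
for number, segment in enumerate(segments, start=1):
    print(f"s{number}) {segment}")
```

A rule this simple would mis-split lists such as "visual effects, sound, and music"; it is meant only to make the input and output of the segmentation step concrete.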
III: Indexing
CONSENTO indexes MCSUs in a conventional inverted index of the kind used in most modern search engines.
Only the mapping needs to be redefined logically, from (terms → documents) to (terms → MCSUs)
* Conventional Indexing Method Example
Document #1 (Entity Name: Transformer 3): “excellent visual effects, but plot was hard to follow”
Feature 1: visual effects (sentiment: excellent)
Feature 2: plot (sentiment: hard to follow)
Bag of words for Doc#1: excellent, visual, effects, plot, hard, follow
Traditional inverted index (Term → Doc)
excellent → #1
hard → #1
follow → #1
plot → #1
visual → #1
effects → #1
Query: “excellent plot”. The system returns this document, even though “excellent” describes the visual effects rather than the plot
* Subdocument-level Indexing Example
“excellent visual effects, but plot was hard to follow”
Segment 1: excellent visual effects
Segment 2: plot was hard to follow
SegmentID | ObjectName | Feature | Sentiment
Segment 1 | Transformer 3 | visual effects | excellent
Segment 2 | Transformer 3 | plot | hard to follow
Sub-document level indexing
Term | SegmentID | ObjectName | Feature | Sentiment
excellent | SID1 | Transformer 3 | visual effects | excellent
visual | SID1 | Transformer 3 | visual effects | excellent
effect | SID1 | Transformer 3 | visual effects | excellent
plot | SID2 | Transformer 3 | plot | hard
hard | SID2 | Transformer 3 | plot | hard
follow | SID2 | Transformer 3 | plot | hard
Query: “excellent plot” does not match any segment
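The contrast between the two indexing examples can be reproduced with a toy inverted index. In this sketch the stop-word list, the whitespace tokenizer, and the all-terms-must-match rule are simplifying assumptions; only the sentence, the segment split, and the query come from the slides.

```python
from collections import defaultdict

STOP_WORDS = {"but", "was", "to"}  # assumed, minimal stop-word list


def build_index(units: dict) -> dict:
    """Map each term to the set of unit ids (documents or segments) containing it."""
    index = defaultdict(set)
    for unit_id, text in units.items():
        for term in text.lower().replace(",", " ").split():
            if term not in STOP_WORDS:
                index[term].add(unit_id)
    return index


def search_all_terms(index: dict, query: str) -> set:
    """Return the units that contain every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


# Same text, indexed once as a whole document and once as two segments.
doc_index = build_index(
    {"Doc#1": "excellent visual effects, but plot was hard to follow"}
)
segment_index = build_index(
    {"SID1": "excellent visual effects", "SID2": "plot was hard to follow"}
)

print(search_all_terms(doc_index, "excellent plot"))      # the whole document matches
print(search_all_terms(segment_index, "excellent plot"))  # no single segment matches
```

The data structure is identical in both cases; only the unit being indexed changes, which is the point of the (terms → MCSUs) remapping.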
An MCSU is simply treated as a document
Additional information is stored in each posting for use in the ranking stage
MCSU posting structure
Reviews
rid | ts | rq
r1 | ts1 | 0.8
r2 | ts2 | 0.4
r3 | ts3 | 0.6
r4 | ts4 | 0.9
r5 | ts5 | 0.4
r6 | ts6 | 0.5
r7 | ts7 | 0.7
r8 | ts8 | 0.6
r9 | ts9 | 0.8
Sources
Site Name | Source ID
IMDb | w1
Flixster | w2
Metacritic | w3
Yahoo! | w4
Features
Feature | id
music | a1
soundtrack | a2
story | a3
plot | a4
performance | a5
acting | a6
Sentiwords
Sentiword | id
great | m1
excellent | m2
superb | m3
tragic | m4
Entities
Entity | id
Titanic | e1
Brokeback Mountain | e2
Dark Knight | e3
Avatar | e4
Index
Term | Postings
Cameron | <s19, e4, [−], [m3], r7, w3>
Pandora | <s16, e4, [a2], [−], r6, w3>, <s18, e4, [−], [−], r6, w3>
tragic | <s7, e2, [a3], [m4], r3, w1>
performance | <s5, e1, [a6], [m6], r2, w1>, <s9, e2, [a6], [m3], r3, w1>, <s11, e2, [a6], [m1], r4, w1>, <s13, e3, [a6], [−], r5, w2>, <s15, e4, [a6], [−], r5, w3>, <s20, e3, [a6], [−], r8, w4>, <s21, e3, [a6], [m6], r9, w4>
soundtrack | <s4, e1, [a2], [−], r2, w1>, <s10, e2, [a2], [m2], r4, w1>, <s16, e4, [a2], [−], r6, w2>, <s22, e3, [a2], [m1], r9, w4>
plot | <s14, e3, [a4], [−], r5, w2>
acting | <s13, e4, [a6], [−], r9, w4>
music | <s2, e1, [a1], [m1], r1, w1>, <s8, e2, [a1], [m1], r3, w1>
Yeston | <s2, e1, [a1], [−], r1, w1>
story | <s1, e1, [a3], [m1], r1, w1>, <s7, e2, [a3], [−], r3, w1>, <s12, e2, [a3], [m2], r4, w1>, <s17, e4, [a3], [−], r6, w3>
Example reviews by Review ID (segment boundaries marked with //)
Titanic
r1: (s1) the greatest love stories of all // (s2) and beautiful music from Yeston. // (s3) Everything about this movie was excellent...
r2: (s4) touching soundtrack, // (s5) and perfect handling of the known tragedy with nice performance. // (s6) This has the best love scene I have ever seen…
Brokeback Mountain
r3: (s7) beautiful tragic love story, // (s8) with great music. // (s9) superb performances in movies ever!
r4: (s10) The soundtrack is also excellent, // (s11) great performance, // (s12) excellent presentation of a love story…
The Dark Knight
r5: (s13) The performance by Heath Ledger was outstanding // (s14) and plot is amazing too…
r8: (s20) Joker shows phonemically awesome performance!…
r9: (s21) nice performance // (s22) and backed up with great soundtrack. // (s23) excellent casting!
Avatar
r6: (s15) Navi looks very real, good performance, // (s16) beautiful soundtrack that emphasize the vastness of the Pandora, // (s17) with love story. // (s18) The world of Pandora is stunning
r7: (s19) James Cameron deserves high praise for this creation…
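One way to hold such postings in memory is a small record type. The field names below are assumptions read off the lookup tables (s = segment, e = entity, a = feature, m = sentiword, r = review, w = source); the slides give only the tuple layout, not an implementation. The sample values are copied from the 'music' and 'tragic' entries and the rid/rq table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Posting:
    """One MCSU posting: <segment, entity, [features], [sentiwords], review, source>."""
    segment_id: str
    entity_id: str
    feature_ids: Tuple[str, ...]    # empty tuple stands for [-]
    sentiword_ids: Tuple[str, ...]  # empty tuple stands for [-]
    review_id: str
    source_id: str


# Two entries of the term -> postings index shown in the example.
INDEX = {
    "music": [
        Posting("s2", "e1", ("a1",), ("m1",), "r1", "w1"),
        Posting("s8", "e2", ("a1",), ("m1",), "r3", "w1"),
    ],
    "tragic": [
        Posting("s7", "e2", ("a3",), ("m4",), "r3", "w1"),
    ],
}

# Review quality (rq) for the reviews referenced above.
REVIEW_QUALITY = {"r1": 0.8, "r3": 0.6}

for posting in INDEX["music"]:
    print(posting.segment_id, posting.entity_id, REVIEW_QUALITY[posting.review_id])
```

Keeping the review and source ids inside each posting lets the ranking stage look up review quality or a site weight without going back to the original page.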
IV: Query Parsing 
CONSENTO preprocesses the query and performs query expansion
stop-word removal
polarity-only word removal
feature expansion
stemming
Polarity-only word removal
"good action movie" and "great action movie" should be treated as the same query
Feature words expanded for better recall 
‘plot’ → {plot, story} 
‘music’ → {music, soundtrack}
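The preprocessing steps above can be sketched as a small pipeline. Everything in it is a stand-in: the word lists are tiny hypothetical samples, the stemmer only strips a plural "s", and the function names are invented for this illustration. The two synonym groups are the ones given in the slide.

```python
STOP_WORDS = {"a", "an", "the", "of", "with", "in"}   # assumed sample
POLARITY_WORDS = {"good", "great", "best", "nice"}    # assumed sample
FEATURE_SYNONYMS = {                                   # groups from the slide
    "plot": {"plot", "story"},
    "story": {"plot", "story"},
    "music": {"music", "soundtrack"},
    "soundtrack": {"music", "soundtrack"},
}


def stem(word: str) -> str:
    """Very naive stemmer: drop a trailing plural 's' (assumption)."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def parse_query(query: str) -> list:
    """Return one set of acceptable terms per remaining query word."""
    groups = []
    for token in query.lower().split():
        if token in STOP_WORDS or token in POLARITY_WORDS:
            continue  # stop-word and polarity-only word removal
        token = stem(token)
        groups.append(FEATURE_SYNONYMS.get(token, {token}))  # feature expansion
    return groups


print(parse_query("good action movie"))
print(parse_query("great action movie"))
print(parse_query("movies with great plot"))
```

Returning one set per query word keeps the expansion explicit: a segment can satisfy the "plot" slot by mentioning either "plot" or "story".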
V: Retrieval 
Retrieve the MCSU segments that match the query terms
This works the same way conventional systems retrieve document posting lists
VI: Ranking 
Group MCSU postings by entity and aggregate the scores of the postings to compute the score of the corresponding entity
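Grouping and aggregation can be illustrated with postings and review-quality values taken from the example index (the 'music' and 'soundtrack' entries). Scoring each matching segment by the quality of its review and summing per entity is an assumption made for this sketch; the transcript does not give the actual scoring function.

```python
from collections import defaultdict

# (segment id, entity id, review id) for two terms of the example index.
POSTINGS = {
    "music": [("s2", "e1", "r1"), ("s8", "e2", "r3")],
    "soundtrack": [
        ("s4", "e1", "r2"), ("s10", "e2", "r4"),
        ("s16", "e4", "r6"), ("s22", "e3", "r9"),
    ],
}
REVIEW_QUALITY = {"r1": 0.8, "r2": 0.4, "r3": 0.6, "r4": 0.9, "r6": 0.5, "r9": 0.8}
ENTITY_NAMES = {
    "e1": "Titanic", "e2": "Brokeback Mountain",
    "e3": "Dark Knight", "e4": "Avatar",
}


def rank_entities(expanded_terms: list) -> list:
    """Group matching segments by entity and sum their (assumed) scores."""
    seen_segments = set()
    totals = defaultdict(float)
    for term in expanded_terms:
        for segment_id, entity_id, review_id in POSTINGS.get(term, []):
            if segment_id in seen_segments:
                continue  # count each segment once, even if several terms hit it
            seen_segments.add(segment_id)
            totals[entity_id] += REVIEW_QUALITY[review_id]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(ENTITY_NAMES[entity_id], score) for entity_id, score in ranked]


# The query 'music' expanded to {music, soundtrack}.
for name, score in rank_entities(["music", "soundtrack"]):
    print(f"{name}: {score:.1f}")
```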
Experimental Setup: Data Set
Movie data sets
Source
•Amazon, IMDb, Metacritic, Flixster, Rotten Tomatoes, and Yahoo Movies
Period
•2008 ~ 2010
More than 740 movies and 30K reviews
Hotel data sets
Hotel data set from Ganesan and Zhai
Reviews for hotels in 10 major cities from TripAdvisor
The authors provided us with the corrected judgment set for our test
Experiment 
Methods 
Ganesan and Zhai’s OE and QAM methods
•OE: opinion expansion words
•QAM: query aspect model
Baseline
1) BM25
•b = 0.75
•k1 = 2
2) VSMBM (Lucene default)
•Vector space model + Boolean model 
3) ConsensusRank
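For reference, the BM25 baseline above is usually written in the following commonly cited Okapi form, shown here with the listed parameter values. The exact variant used in the experiments (for example, the IDF definition) is not stated in the slides, so this is an assumption.

```latex
\mathrm{score}(D, Q) \;=\; \sum_{i=1}^{n} \mathrm{IDF}(q_i)\,
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1\left(1 - b + b\,\dfrac{|D|}{\mathrm{avgdl}}\right)},
\qquad k_1 = 2,\quad b = 0.75
```

Here $f(q_i, D)$ is the frequency of query term $q_i$ in document $D$, $|D|$ is the document length, and $\mathrm{avgdl}$ is the average document length in the collection.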
Experimental Result - Movie
Experimental Result - Hotel
Travel destination example: Hawaii, Cebu, Gold Coast
Aspect labels shown: Honeymoon, Snorkeling, Whale Watching, Active Volcano
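1. Analyze and index in advance the diverse information on the Web and social networks
2. Produce real-time results for ad-hoc decision-making queries
Example queries:
A thriller movie?
A thriller movie with a twist?
A backpack for college students?
A trustworthy used-car dealer?
A trustworthy daycare center nearby
A hair salon for job-interview makeup
A good study spot near the academy
A Korean restaurant in Gangnam for a formal family meeting
Accommodation for backpacking trips
A trainer in my neighborhood who is good at PT?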
best thriller with plot twist
The Artist vs. Jack and Jill 
good pizza restaurant
Click!
CONSENTO Local service example
‘Napk-In’ service example
‘슝’ service example
The latent consensus search market
Factual search
Consensus search
ENGINEERING KNOWLEDGE
SEARCHING WISDOM
CONSENTO
THANK YOU
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue119 views
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue by ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue96 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue88 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software344 views
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue91 views
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P... by ShapeBlue
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
ShapeBlue82 views

[2B1] A Paradigm Shift in Search Engines

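  • 2. Research background: changing user information needs. The Web 2.0 era of participation, sharing, and openness has arrived, shifting information production and consumption toward a user-centered structure. The volume of personal opinions and subjective information on the web and social networks has exploded. Demand is growing for subjective information such as “a Korean restaurant in Bundang suited to a formal meeting between two families before a wedding”, “a thriller with a good plot twist”, and “handbags in fashion”. •Demand for factual search (e.g., ‘action movie’) is flat or irregular, whereas subjective queries such as ‘best action movie’ and ‘best SUV’ are rising steadily. Figure: Google search trend graph for the queries ‘action movie’ and ‘best action movie’ (Google Trends, http://www.google.com/trends/)
  • 3. Aardvark: Large-Scale Social Search Engine (Horowitz and Kamvar, WWW2010). “64% of queries contain subjective element in Aardvark” (e.g., “Do you know of any great delis in Baltimore, MD?” “What are the things/crafts/toys your children have made that made them really proud of themselves?”). Acquired by Google in 2010 for $50,000,000 USD (about 53 billion KRW). Factual search vs. consensus search: demand for consensus search is growing.
  • 4. Search engine vs. consensus engine. Limits of existing document-based search engines: objective information (e.g., ‘action movie’ or ‘handbag price’) can be found with current search engines, but subjective queries (‘fun action movie’, ‘handbags in fashion these days’) cannot be handled adequately. This calls for a fundamental paradigm shift to a new search technology that finds the objects being described within documents, treats them as the units of indexing, and ranks them by aggregating and analyzing, per object, the user opinions scattered across many documents.
  • 5. Simple product sorting (•lowest price •highest price •registration date •most product reviews). User reviews that are merely listed: •hard to grasp the content •hard to aggregate the information. Complicated option selection. A result list with no useful information beyond the TV's screen size in inches and its price.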
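  • 6. Example: consensus search over TV reviews.
    LG Electronics 47LM6200, user reviews:
    Purchase review | 2013.04.12: I hesitated a lot about buying an expensive electronic product online, but now that it is installed I am very satisfied. The screen is big, the picture is good, and I feel good about having bought it cheaply.
    A TV with powerful performance for the price. | 2013.04.01: The product is an entry-level model at a low price. It has powerful features such as internet and 3D, and the product reviews I read here and there were all satisfied, so I bought it with confidence. I think I got a good product at a reasonable price. Thank you.
    An excellent choice... LG Smart TV 47LM6200... | 2012.09.10: In particular, the remote control functions and the 3D glasses are much more convenient to use than those of company S. The 3D glasses are far more comfortable than other companies' battery-powered 3D glasses, and the clip-on type, convenient for people who wear glasses, is a standout idea.
    Clean picture quality and wall-mount installation: good. Delivery delayed due to product supply. | 2012.07.02: The picture is clean, and above all I am satisfied that it was installed very well as a wall mount.
    Not bad. | 2013.04.19: For the price, this seems fine. However, the mouse remote is something of a mixed blessing. A smart TV clearly needs one, but its sensitivity makes it quite awkward to use. The regular remote is ultra-simple, too simple to operate comfortably. Apart from the remote control system, it is not bad.
    Samsung Electronics UN50ES6800F, user reviews:
    Truly the best product and service. | 2013.07.31: I ordered yesterday and never expected delivery this fast!!! The delivery technician also installed it just the way I wanted. Above all, I want to say it is the best product for the price. Satisfied with everything!!
    Satisfied with the fair price. | 2012.12.18: I am satisfied to have bought it at a very good price. It is a Samsung smart TV, and its performance and appearance are not much different from what I had seen in department stores. I have been using it for about two weeks and am satisfied with both its features and its appearance.
    The best-value model for the price | 2013.03.21: Ordered in the evening, delivered the next morning!!! I bought the wall-mount version; it is large and should be great for watching movies. Good picture quality, good size, and lightning-fast delivery!!
    Search: “TV with good price-performance ratio”
    Matched review snippets, LG 47LM: “The product is an entry-level model at a low price”; “It was a very good choice for the price.”; “I think it is a 3D smart LED TV with excellent performance for the price.”; “The screen is big, the picture is good, and I feel good about having bought it cheaply.”
    Matched review snippets, Samsung UN50: “Above all, I want to say it is the best product for the price”; “I am satisfied to have bought it at a very good price”; “Size and picture quality are good for the price.”
    Aspect segments matched to the query terms: “bought cheaply”, “best for the price”, “low price”, “excellent performance for the price”, “size and picture quality are good for the price”, “very good price”, “fine for the price”, “very good choice for the price”. Segment scores: 0.5, 0.8, 0.9, 0.7, 0.5, 0.8, 0.7, 0.6.
    Samsung total: 2.9. LG total: 2.6.
    Final search ranking: 1. Samsung UN50ES6800F 2. LG 47LM6200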
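  • 7. Consensus Search. Users now actively rely on internet search to make decisions about purchases and cultural activities. Before attending a performance or buying a product, they consult other users' reviews and comments. Each review reflects its author's “subjective opinion”, so a user has to read as many reviews as possible for them to help with a decision. What is a consensus engine? A search system that analyzes in advance the many reviews other users have already written, then analyzes and aggregates those reviews from the viewpoint (query) the user wants.
  • 8. Consensus Engine. Current search engines are not enough! The top few documents may contain the desired information, but each document is only its author's opinion and cannot represent the consensus of the public. Yet the answer already exists on the Web! Many users post their opinions online in various forms (SNS, blogs, reviews). These online opinions can be treated as “latent votes”: collecting and analyzing already-expressed online opinions at query time makes consensus search possible.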
  • 9. Uhm.. Yeah.. It is noisy, but… Online consumer posts: the 2nd most trusted form of advertising (The Nielsen Company, Q3 2011)
  • 10. Is consensus search ever possible…? “Best Action Movies in 2013”: not immediately answerable with conventional search engines, because the answer should be based on consensus, which cannot be found in any one of the “top-10” documents. However, the answers are already on the Web: numerous implicit votes from people on the Web and social networks. Only if we can process them … ONLINE!
  • 13. The Key Ideas (I): Subdocument-level Indexing. Capture the semantics of user opinions more precisely. The indexing unit is no longer a page but: •a review within a page, if more than one review exists on the page, •or a sentence within a review, •or even a clause or phrase within a sentence that discusses one aspect of the target entity. Maximal Coherent Semantic Unit (MCSU): •the finest-grained indexing unit used in CONSENTO indexing, •the maximal subsequence of words within a sentence that carries a single coherent meaning. Indexing MCSUs instead of documents enables semantic analysis to be performed at indexing time, •facilitating the online processing of consensus search at query time.
  • 14. The Key Ideas (II): ConsensusRank, a Unique Ranking Method Based on Public Sentiment. Virtually all existing ranking methods rank target objects (either documents or entities) directly, based on their relevance to the query terms. In contrast, ConsensusRank ranks the entities indirectly, by aggregating the scores of the referring segments (e.g., MCSUs) that match the query context. It can be viewed as a voting process in which each reviewer casts a weighted vote on an entity with respect to a query by expressing positive or negative opinions about that entity.
  • 15. CONSENTO Architecture. (A) Indexing Subsystem: (1) Parsing & Preprocessing (DOM-tree parsing, contents extraction) turns web documents into review contents; (2) Contents Segmentation (sentence splitter, MCSU extraction) turns review contents into MCSUs; (3) Indexing (inverted entry construction & indexing) writes them to the entity search index. (B) Searching Subsystem: (4) Query Parsing (query preprocessing & expansion) turns the user query into an expanded query; (5) Retrieval (matching MCSU retrieval) returns an MCSU posting list; (6) Ranking (segment grouping, score aggregation) returns the entity list.
  • 16. I: Parsing & Preprocessing. The current working prototype of CONSENTO is built for the movie domain. CONSENTO crawled review pages from popular movie review sites such as IMDb, Metacritic, etc. Review contents are extracted using DOM-tree parsing and XPath queries. Extracted information includes: entity name (i.e., movie name); review text, date, and time; review quality (e.g., “20 out of 30 people found the review helpful”).
  • 17. II: Contents Segmentation. Split the review contents into MCSUs, e.g., “The storyline is ridiculous, the acting is laughable, and the camera work is terrible.” becomes s1) “The storyline is ridiculous” s2) “the acting is laughable” s3) “the camera work is terrible”
  • 19. III: Indexing. CONSENTO indexes MCSUs in a conventional inverted index of the kind used in most modern search engines. Only the mapping needs to be redefined logically, from (terms → documents) to (terms → MCSUs).
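The segmentation step on this slide can be sketched roughly as follows. The slides do not say how CONSENTO actually detects MCSU boundaries, so the rule used here (split on commas and semicolons, then drop a leading "and"/"but") is only an assumed stand-in, and the name `split_into_segments` is hypothetical.

```python
import re

# Assumed boundary rule, NOT the CONSENTO method: clause breaks at commas/semicolons.
_BOUNDARY = re.compile(r"\s*[,;]\s*")
# A conjunction left at the start of a clause is dropped.
_LEADING_CONJ = re.compile(r"^(?:and|but)\s+", re.IGNORECASE)


def split_into_segments(sentence: str) -> list[str]:
    """Return clause-like segments of one sentence (a crude MCSU approximation)."""
    text = sentence.strip().rstrip(".!?")
    segments = []
    for part in _BOUNDARY.split(text):
        cleaned = _LEADING_CONJ.sub("", part.strip())
        if cleaned:
            segments.append(cleaned)
    return segments


segments = split_into_segments(
    "The storyline is ridiculous, the acting is laughable, and the camera work is terrible."
)
for number, segment in enumerate(segments, start=1):
    print(f"s{number}) {segment}")
```

On the slide's example sentence this reproduces the three segments s1 to s3. A real extractor would need syntactic or semantic cues, since commas also separate items that belong to a single opinion.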
  • 20. III: Indexing. Conventional indexing method example. Entity name: Transformer 3. Document #1: “excellent visual effects, but plot was hard to follow” (feature 1 and its sentiment: excellent visual effects; feature 2 and its sentiment: plot, hard to follow). Bag of words for Doc#1: excellent, visual, effects, plot, hard, follow. Traditional inverted index (term → doc): excellent → #1, hard → #1, follow → #1, plot → #1, visual → #1, effects → #1. Query: “excellent plot”. The system returns this document.
  • 21. III: Indexing. Subdocument-level indexing example. “excellent visual effects, but plot was hard to follow” is split into Segment 1 (excellent visual effects) and Segment 2 (plot was hard to follow).
    Segments (SegmentID, ObjectName, Feature, Sentiment): Segment 1, Transformer 3, visual effects, excellent; Segment 2, Transformer 3, plot, hard to follow.
    Sub-document level index (Term → SegmentID, ObjectName, Feature, Sentiment): excellent → SID1, Transformer 3, visual effects, excellent; visual → SID1, Transformer 3, visual effects, excellent; effect → SID1, Transformer 3, visual effects, excellent; plot → SID2, Transformer 3, plot, hard; hard → SID2, Transformer 3, plot, hard; follow → SID2, Transformer 3, plot, hard.
    Query: “excellent plot” does not match any segment.
  • 22. III: Indexing. An MCSU is simply treated as a document. Additional information is stored in each posting for use in the ranking stage (the MCSU posting structure).
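The contrast between the two indexing examples can be reproduced with a small sketch. It assumes conjunctive (all terms must match) retrieval and a tiny stop-word list chosen only so the bag of words matches the slide; `tokenize`, `build_index`, and `match_all` are hypothetical helper names, not CONSENTO functions.

```python
from collections import defaultdict


def tokenize(text):
    # Lowercase, strip punctuation, and drop a tiny illustrative stop-word list.
    stop_words = {"but", "was", "to"}
    words = [word.strip(",.!?").lower() for word in text.split()]
    return [word for word in words if word and word not in stop_words]


def build_index(units):
    """Map each term to the set of unit ids (documents or segments) containing it."""
    index = defaultdict(set)
    for unit_id, text in units.items():
        for term in tokenize(text):
            index[term].add(unit_id)
    return index


def match_all(index, query):
    """Return the ids of units that contain every query term."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


documents = {"Doc#1": "excellent visual effects, but plot was hard to follow"}
segments = {"SID1": "excellent visual effects", "SID2": "plot was hard to follow"}

doc_hits = match_all(build_index(documents), "excellent plot")
seg_hits = match_all(build_index(segments), "excellent plot")
print(doc_hits, seg_hits)
```

With the document as the indexing unit, “excellent plot” matches Doc#1 even though the review calls the plot hard to follow; with segments as the unit, no single segment contains both terms, so the false match disappears.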
  • 23. MCSU posting example.
    Reviews (rid, ts, rq): r1, ts1, 0.8; r2, ts2, 0.4; r3, ts3, 0.6; r4, ts4, 0.9; r5, ts5, 0.4; r6, ts6, 0.5; r7, ts7, 0.7; r8, ts8, 0.6; r9, ts9, 0.8.
    Sites (Site Name, Source ID): IMDb, w1; Flixster, w2; Metacritic, w3; Yahoo!, w4.
    Features (Feature, id): music, a1; soundtrack, a2; story, a3; plot, a4; performance, a5; acting, a6.
    Sentiment words (Sentiword, id): great, m1; excellent, m2; superb, m3; tragic, m4.
    Entities (Entity, id): Titanic, e1; Brokeback Mountain, e2; Dark Knight, e3; Avatar, e4.
    Postings (Term → Postings):
    Cameron → <s19, e4, [−], [m3], r7, w3>
    Pandora → <s16, e4, [a2], [−], r6, w3>, <s18, e4, [−], [−], r6, w3>
    tragic → <s7, e2, [a3], [m4], r3, w1>
    performance → <s5, e1, [a6], [m6], r2, w1>, <s9, e2, [a6], [m3], r3, w1>, <s11, e2, [a6], [m1], r4, w1>, <s13, e3, [a6], [−], r5, w2>, <s15, e4, [a6], [−], r5, w3>, <s20, e3, [a6], [−], r8, w4>, <s21, e3, [a6], [m6], r9, w4>
    soundtrack → <s4, e1, [a2], [−], r2, w1>, <s10, e2, [a2], [m2], r4, w1>, <s16, e4, [a2], [−], r6, w2>, <s22, e3, [a2], [m1], r9, w4>
    plot → <s14, e3, [a4], [−], r5, w2>
    acting → <s13, e4, [a6], [−], r9, w4>
    music → <s2, e1, [a1], [m1], r1, w1>, <s8, e2, [a1], [m1], r3, w1>
    Yeston → <s2, e1, [a1], [−], r1, w1>
    story → <s1, e1, [a3], [m1], r1, w1>, <s7, e2, [a3], [−], r3, w1>, <s12, e2, [a3], [m2], r4, w1>, <s17, e4, [a3], [−], r6, w3>
    Source reviews, segmented into MCSUs (segments separated by //):
    Titanic, r1: (s1) the greatest love stories of all // (s2) and beautiful music from Yeston. // (s3) Everything about this movie was excellent...
    Titanic, r2: (s4) touching soundtrack, // (s5) and perfect handling of the known tragedy with nice performance. // (s6) This has the best love scene I have ever seen…
    Brokeback Mountain, r3: (s7) beautiful tragic love story, // (s8) with great music. // (s9) superb performances in movies ever!
    Brokeback Mountain, r4: (s10) The soundtrack is also excellent, // (s11) great performance, // (s12) excellent presentation of a love story…
    The Dark Knight, r5: (s13) The performance by Heath Ledger was outstanding // (s14) and plot is amazing too…
    Avatar, r6: (s15) Navi looks very real, good performance, // (s16) beautiful soundtrack that emphasize the vastness of the Pandora, // (s17) with love story. // (s18) The world of Pandora is stunning
    Avatar, r7: (s19) James Cameron deserves high praise for this creation…
    The Dark Knight, r8: (s20) Joker shows phonemically awesome performance!…
    The Dark Knight, r9: (s21) nice performance // (s22) and backed up with great soundtrack. // (s23) excellent casting!
  • 24. IV: Query Parsing. CONSENTO preprocesses the query and performs query expansion: stop-word removal, polarity-only word removal, feature expansion, and stemming. Polarity-only word removal: "good action movie" and "great action movie" should be treated as the same query. Feature words are expanded for better recall: ‘plot’ → {plot, story}, ‘music’ → {music, soundtrack}
  • 25. V: Retrieval. Retrieve the MCSU segments that match the query terms, in the same way that conventional systems retrieve document posting lists.
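A minimal sketch of these query-parsing steps is shown below. The word lists are illustrative assumptions (only the two feature expansions come from the slide), `naive_stem` is a placeholder for a real stemmer, and stemming is applied before feature expansion here purely for convenience.

```python
STOP_WORDS = {"a", "an", "the", "of", "with", "in"}        # illustrative only
POLARITY_ONLY = {"good", "great", "best", "nice"}           # illustrative only
FEATURE_EXPANSION = {                                       # the two examples from the slide
    "plot": {"plot", "story"},
    "music": {"music", "soundtrack"},
}


def naive_stem(word):
    # Placeholder for a real stemmer: strip a plural "s" from longer words.
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def parse_query(query):
    """Return a list of term groups; each group is a set of interchangeable terms."""
    groups = []
    for raw in query.lower().split():
        if raw in STOP_WORDS or raw in POLARITY_ONLY:
            continue  # stop words and polarity-only words carry no retrieval signal
        term = naive_stem(raw)
        groups.append(FEATURE_EXPANSION.get(term, {term}))
    return groups


print(parse_query("good action movie"))
print(parse_query("great action movie"))
print(parse_query("movies with great plot"))
```

Because polarity-only words are dropped, the first two queries reduce to the same term groups, which is the behaviour the slide asks for; the third shows ‘plot’ being widened to {plot, story}.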
  • 26. VI: Ranking Group MCSU postings by entity and aggregate the scores of the postings to compute the score of the corresponding entity
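The grouping and aggregation step might look like the sketch below. The per-segment scoring formula is not given in this part of the deck, so each posting simply carries an assumed precomputed score, and aggregation by plain summation is an assumption modelled on the per-product totals in the TV example. The `Posting` type and the score values are made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Posting(NamedTuple):
    segment_id: str
    entity: str
    score: float  # hypothetical per-segment score; the real formula is not shown here


def rank_entities(postings):
    """Group matched postings by entity, sum their scores, and sort descending."""
    totals = defaultdict(float)
    for posting in postings:
        totals[posting.entity] += posting.score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for segments matching a query about "performance".
matched = [
    Posting("s5", "Titanic", 0.4),
    Posting("s9", "Brokeback Mountain", 0.6),
    Posting("s11", "Brokeback Mountain", 0.9),
    Posting("s13", "The Dark Knight", 0.4),
    Posting("s21", "The Dark Knight", 0.8),
]
ranking = rank_entities(matched)
for entity, total in ranking:
    print(f"{entity}: {total:.1f}")
```

Each matching segment acts as one weighted vote for its entity, so an entity praised in several reviews can outrank one with a single strong match.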
  • 31. Experimental Setup: Data Set. Movie data sets: •Source: Amazon, IMDb, Metacritic, Flixster, Rotten Tomatoes, and Yahoo Movies •Period: 2008 ~ 2010. More than 740 movies and 30K reviews. Hotel data sets: the hotel data set from Ganesan and Zhai, with reviews from TripAdvisor for hotels in 10 major cities. The authors provided us with the corrected judgment set for our test.
  • 32. Experiment Methods. Ganesan and Zhai’s OE and QAM methods: •Opinion expansion word •Query aspect model. Baseline: 1) BM25 •b = 0.75 •k1 = 2; 2) VSMBM (Lucene default) •Vector space model + Boolean model; 3) ConsensusRank
  • 36. [Figure: “Hawaii!” surrounded by the repeated keywords Honeymoon, Snorkeling, Whale Watching, and Active Volcano]
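  • 37. 1. Analyze and index in advance the diverse information on the web and social networks. 2. Produce real-time results for ad-hoc decision-making queries. Example queries: thriller movie? thriller movie with a plot twist? backpack for college students? trustworthy used-car dealer? trustworthy daycare center nearby? hair salon for interview makeup; a good study spot near the academy; Korean restaurant in Gangnam for a formal family meeting before a wedding; backpacking accommodation; a trainer in my neighborhood who is good at PT?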
  • 38. best thriller with plot twist
  • 39. The Artist vs. Jack and Jill
  • 40. good pizza restaurant