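About This Session
• Social Web (and Social Web search) is a great thing, but…
– The experiences of a great many people can be searched very easily
• Movies, books, travel destinations, music, …
– Still, much more remains on the Social Web (unsearched).
– Beyond individual experiences: when a large number of diverse “experiences” are gathered
• Trends, hidden relationships, new knowledge, …
• Social Web + Semantic Web Technology
– A prototype “Experience Search” system
– New kinds of information needs
• Which MP3 players do female early adopters like?
• List the books that young people keep reading steadily, regardless of the times.
• Show me the books recently read by people who like Paul Auster's books, and their related postings.
• They say men read thrillers and women read romances; is that really so?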
Overview
• A Semantic Search System on Social Web Content
• Social Web + Semantic Web
– Social Web Content
• Blog postings
• Experiences of Web users
– Semantic Web Technology
• Publishing portable-data
• Accessing web-based open knowledge
Overview
• By the term “Semantic Search”…
– Not by “text matching”
– But by satisfying the “conditions” given in the query.
• By “Experience Search”…
– On the “topics” of Bloggers
– Example queries
• “Which MP3 players are favored by people in their 20s?”
• “Which books are Paul Auster fans reading these days?”
• “Which new electronic devices are Apple enthusiasts talking about these days?”
• “They say men read thrillers and women read romances; is that really true in the blogosphere?”
Overview
• Challenges!
– 1) Blog postings are free-text.
• No semantics
• No explicit/machine-readable topics
– 2) Database/Ontology does not have such information.
• For example, our book ontology does not know whether a book is favored by some group or not.
• How can we derive such “previously unknown”, “not recorded in the DB” knowledge?
The Idea
• Answer for Challenge 1 : Semantic Blogs
– A little semantics from blog postings.
– Topic: what is the topic of this posting?
• Semantic Blog with Semantic Tags
– Converting conventional blogs to semantic blogs
– Blogger: who is the blogger?
• Basic information about each blogger
– Age group, gender, job
– Published in FOAF (Friend-of-a-Friend)
– Manually published + predicted by machine learning
The Idea
• Answer for Challenge 2 : Emergent Knowledge
– Connections make new information
– Some blog postings are about specific topic-items.
• They draw a new connection between the author (blogger) and the topic item (book, IT device, movie, etc.)
• New tendencies/relationships can be found from these connections,
• if a large number of such connections are available.
Emerging Information from
Connections
• Sci-fi Fan Example
– Personal Info (FOAF): age, gender, address
• Example blogger: 22, Female, Daegu
– Blog Postings (SemTag): Topic (->), Date/Time, Blogger (->)
• Example posting: 2010.03., linked by URI to its blogger and to its topic item
– Book Ontology: Book Title, ISBN, Book Author, Genre, Publisher
• Example book: The Vor Game, 9788989571506, Lois Bujold, Sci-fi, Baen Books
Emerging Information from
Connections
• Sci-fi Fan Example
– Personal Info (FOAF): age, gender, address
– Blog Postings (SemTag): Topic (->), Date/Time, Blogger (->)
– Book Ontology: Book Title, ISBN, Book Author, Genre, Publisher
– Following the connections Blogger -> topic -> genre (Sci-Fi) identifies the blogger as a “Sci-Fi fan”
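The Sci-fi fan example can be read as a walk over three linked data sets. The following is a minimal Python sketch of that walk, not the system's implementation: the URIs, the dictionary layout, and the function name `genres_of_blogger` are hypothetical, and only the field values (The Vor Game, Daegu, and so on) come from the slide.

```python
# Hypothetical in-memory stand-ins for the three data sources in the slide.
# All URIs below are made up for illustration.
PEOPLE = {
    "urn:blogger:1": {"age": 22, "gender": "Female", "address": "Daegu"},
}
BOOKS = {
    "urn:book:9788989571506": {
        "title": "The Vor Game",
        "isbn": "9788989571506",
        "author": "Lois Bujold",
        "genre": "Sci-fi",
        "publisher": "Baen Books",
    },
}
# A posting only stores links (URIs) to its blogger and its topic item.
POSTINGS = [
    {"blogger": "urn:blogger:1", "topic": "urn:book:9788989571506", "date": "2010.03."},
]


def genres_of_blogger(blogger_uri):
    """Follow blogger -> posting -> topic -> genre and collect the genres."""
    genres = []
    for posting in POSTINGS:
        if posting["blogger"] != blogger_uri:
            continue
        book = BOOKS.get(posting["topic"])
        if book is not None:
            genres.append(book["genre"])
    return genres


print(genres_of_blogger("urn:blogger:1"))
```

Neither the FOAF profile nor the book record says anything about the other. The genre only becomes attached to the blogger through the posting that links the two.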
Emerging Information from
Connections
• Examples: Emerging information from connections
– Devices favored by the 20s age group
– Bloggers who have read the book “The Lord of the Rings”
– Top 50 bestsellers of this year
– Bloggers who have read many books by the author Paul Auster
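An example such as "favored by the 20s age group" comes from aggregating many connections rather than from any single record. A minimal sketch under stated assumptions: the sample pairs and item names are invented, and "favored" is approximated here by the number of postings, which the slides do not confirm as the system's measure.

```python
from collections import Counter

# Hypothetical sample data: (blogger age, topic item) pairs taken from postings.
POSTING_TOPICS = [
    (24, "player-A"),
    (27, "player-A"),
    (22, "player-B"),
    (35, "player-B"),
    (41, "player-C"),
    (29, "player-A"),
]


def age_group(age):
    """Map an age to its decade, e.g. 24 -> 20."""
    return (age // 10) * 10


def favored_by(group, pairs, top_n=2):
    """Rank topic items by how many postings bloggers in `group` wrote about them."""
    counts = Counter(item for age, item in pairs if age_group(age) == group)
    return counts.most_common(top_n)


print(favored_by(20, POSTING_TOPICS))
```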
Implementation: Semantic Blog
• A Semantic Blog example
Implementation: Blog postings as an
Event
• Postings as Ontology Instances
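Treating a posting as an ontology instance means the posting becomes a small set of statements linking a blogger to a topic item at a point in time. The sketch below illustrates that idea only. The `EX` namespace and the property names `blogger`, `topic`, and `date` are hypothetical, since the slides do not give the event ontology's actual vocabulary.

```python
from dataclasses import dataclass

# Hypothetical namespace; the slides do not give the event ontology's real URIs.
EX = "http://example.org/semblog#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class PostingEvent:
    uri: str      # URI of the posting itself
    blogger: str  # URI of the blogger's FOAF profile
    topic: str    # URI of the topic item in a domain ontology
    date: str     # posting date as text


def to_triples(event):
    """Express one posting as (subject, predicate, object) triples."""
    return [
        (event.uri, RDF_TYPE, EX + "PostingEvent"),
        (event.uri, EX + "blogger", event.blogger),
        (event.uri, EX + "topic", event.topic),
        (event.uri, EX + "date", event.date),
    ]


example = PostingEvent(
    uri="http://example.org/post/1",
    blogger="http://example.org/people/1",
    topic="http://example.org/book/9788989571506",
    date="2010-03",
)
for triple in to_triples(example):
    print(triple)
```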
Implementation: Converting
Conventional postings to SemBlogs
• Problem
– To acquire “emergent knowledge”, we need a lot of postings
with semantic tags.
– There aren’t many semantic blogs, yet.
• Answer
– There are a large number of “topic-known” blog postings.
– Let’s convert such postings to semantic blog postings
Implementation: Converting
Conventional postings to SemBlogs
• DB-links in conventional blogs
– DB-links: the ability to explicitly mark the topic by linking to a database item of a portal service.
• Naver (DB-attachment), Daum (DB-link): movies, books
• Yes24/Aladin blogs: books, IT devices
– In essence, they are “semantic tags” in a limited domain
• Postings
– Collected nearly 100,000 Blog postings with DB-link
– Converted into Semantic Blog postings (event instances)
– Postings about “movies”, “books”, “IT devices”, “travel
locations”.
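The conversion step can be pictured as scanning each conventional posting for DB-links and emitting one semantic posting per link. This is a rough sketch only: the URL pattern `db.example.com` is a placeholder, because the real link formats of the portal services are not given in the slides, and the function name `convert_posting` is hypothetical.

```python
import re

# Hypothetical DB-link pattern. Real portal services use their own URL schemes,
# which are not given in the slides.
DB_LINK = re.compile(r'href="http://db\.example\.com/(movie|book|device|place)/(\d+)"')


def convert_posting(posting_url, blogger_url, html):
    """Return one semantic posting per DB-link found in the posting's HTML."""
    events = []
    for domain, item_id in DB_LINK.findall(html):
        events.append({
            "posting": posting_url,
            "blogger": blogger_url,
            "topic": f"http://db.example.com/{domain}/{item_id}",
            "domain": domain,
        })
    return events


sample_html = '<p>Just finished it. <a href="http://db.example.com/book/42">book info</a></p>'
print(convert_posting("http://blog.example.com/p/1", "http://blog.example.com/u/7", sample_html))
```

Postings without a DB-link produce no events, which mirrors the slide's point that only "topic-known" postings are converted.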
Implementation: Converting
Conventional postings to SemBlogs
• Blogger information
– Among the authors of the collected postings, 2000 bloggers have been selected:
• those who posted more than 20 topic-known postings.
– Manually tagged FOAF info for 2000 bloggers
• Age, Gender, Home location (city level), Occupation.
– Their blog texts then became the training data
– Classification methods were applied to the other bloggers
– In total, 5000+ bloggers were collected as search data
• The data
– 5000+ bloggers, 100,000+ postings, over 3 years.
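The slides do not say which classification methods were used to predict blogger attributes. As one possible illustration, the sketch below uses a tiny multinomial naive Bayes classifier with add-one smoothing, a technique swapped in here as an assumption. The training texts and the labels "20s" and "40s" are invented stand-ins for the manually tagged bloggers.

```python
import math
from collections import Counter, defaultdict


def train(labeled_texts):
    """labeled_texts: list of (text, label). Returns word counts per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_texts:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts = model
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data standing in for the manually tagged bloggers.
TAGGED = [
    ("exam lecture campus dorm", "20s"),
    ("campus club exam", "20s"),
    ("mortgage kids school commute", "40s"),
    ("kids commute office", "40s"),
]
model = train(TAGGED)
print(predict(model, "exam campus"))
```

Labels predicted this way would be less reliable than the manually tagged ones, so a real system would likely want to keep track of which FOAF values were predicted.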
Implementation: Selecting Domain
Ontologies
• Domain ontologies are needed
– DBpedia could provide a good topic vocabulary…
• However, DBpedia does not contain enough Korean books and locations.
– Domain ontologies are prepared separately for the search system
– Travel locations
• GeoNames ontology (geonames.org)
– Books
• Book ontology (Bizer et al.)
– IT devices
• IT ontology (KAIST CoreOnto Ontology)
Implementation: The Main Idea (again),
and Semantic Labels
• “Simple and large (instances)” is better than “rich and few”
• Simple semantics from texts/blog postings
– Relatively easy to achieve in large numbers
• From a large number of instances
– A large number of “connections” can be found
– Knowledge that is not described in the ontology can be found from the connections
• How can ordinary users explicitly use/find such connections?
– Name the patterns: Semantic Labels
Implementation: Semantic Labels
• Semantic Labels
– Connect human concepts to graph-patterns
– Graph patterns are described in SPARQL
• SPARQL is a query language, which can also be used as a rule language
– With the addition of aggregation functions, etc.
– Name the “Findings”
• In the implementation, new findings are attached to
instances as a label
• This label can be used in the semantic search.
• Rule-based findings of meaningful patterns
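A semantic label can be thought of as a named rule: a graph pattern plus an aggregate condition, whose matches get the label attached. The system describes such rules in SPARQL with additional functions. The Python below is only a stand-in to show the logic, with invented data, a hypothetical threshold of three postings, and hypothetical function names.

```python
from collections import Counter

# Hypothetical (blogger, genre) pairs, i.e. the result of matching the graph
# pattern blogger -> posting -> topic -> genre over the store.
MATCHES = [
    ("blogger-1", "Sci-fi"),
    ("blogger-1", "Sci-fi"),
    ("blogger-1", "Romance"),
    ("blogger-2", "Sci-fi"),
    ("blogger-1", "Sci-fi"),
]


def apply_label(matches, genre, label, min_count):
    """Attach `label` to every blogger with at least `min_count` matches for `genre`."""
    counts = Counter(blogger for blogger, g in matches if g == genre)
    return {blogger: label for blogger, n in counts.items() if n >= min_count}


def search_by_label(labels, label):
    """Semantic search step: return the instances that carry `label`."""
    return sorted(b for b, attached in labels.items() if attached == label)


labels = apply_label(MATCHES, "Sci-fi", "Sci-Fi fan", min_count=3)
print(search_by_label(labels, "Sci-Fi fan"))
```

Once the label is stored on the instance, a later query for "Sci-Fi fan" does not need to repeat the aggregation.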
Semantic Label Examples
• Antecedents are described in a rule language (SPARQL + additional functions)
Search System Architecture
• Users: keyword queries through the User Interface
• Advanced users: rule authoring of Semantic Label Definitions
• Modules: Query Process Module, Rule Process Module (inference and modification of SPARQL queries), Search Module, Analysis Module
• RDF Store: People (FOAF instances), Event (Event Ontology), Domain Ontologies
• Data flows: keyword queries, keyword search, analysis requests, search results in XML, analysis results in XML
Semantic Search Demonstrations
Conclusion
• Emergent Knowledge found in the blogosphere
– Blog postings as “Connections”
– Discovery of new knowledge becomes possible
• “Simple semantics goes a long way”
– Simple semantics (data), diverse instances
• Social Web + Semantic Web
Additional Information
about the system
• Detailed information about the system, and its evaluation can
be found in the paper, doi:10.1016/j.websem.2010.05.001
TG Noh et al., Learning the emergent knowledge from annotated blog postings,
Web Semantics: Science, Services and Agents on the World Wide Web, 2010
• You can access the paper, the data, the prototype demo, and its video at
– http://nweb.knu.ac.kr/

Learning Emergent Knowledge from Blog Postings
