Lessons Learned from LOD (Linked Open Data) Failure and Big Data: The Future Trend
Youngwhan Lee, Ph. D.

I discuss the failure of LOD and the reasons for it. Based on the lessons learned, LOD2 was launched more than four years ago and is about to be completed. What can we say about the future trend of Big Data from these lessons?

Published in: Technology

  1. Lessons Learned from LOD (Linked Open Data) Failure and Big Data: The Future Trend
     Youngwhan Lee, Ph.D.
     Phone: 010-7997-0345
     Email: nicklee@konkuk.ac.kr
     Facebook: Youngwhan Nick Lee
     Twitter: nicklee002
  2. Web Evolution and Big Data
  3. Internet Today
     2010:
     • Estimated 10^11 Web pages in the world
     2012:
     • Social media: Facebook (1 billion monthly active users)
     • From the invention of writing through 2003: 5 exabytes → as of 2012: 7 exabytes of data generated every day
     • Is "big data" a big pile of garbage?
  4. Web Explosion and Big Data
     • Number of Web users (Mar. 2012): 2.3 billion
     • 10^11 Web pages in the world (est. 2010)
       – About 7,000 days (roughly 20 years) have passed since the inception of the Web. This means humans have created over 10 million pages a day.
     • Digital information created in 2010: 1 zettabyte (10^21 bytes)
       – "There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing." (Eric Schmidt, 2010)
       – In 2012, almost 7 exabytes are created every day.
     • We call it "Big Data." What does this mean?
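The estimates on this slide can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted on the slide (10^11 pages, 7,000 days, 5 exabytes through 2003, 7 exabytes per day in 2012); the variable names are illustrative and not part of the original deck.

```python
# Figures quoted on the slide; the names are illustrative.
WEB_PAGES = 10**11        # estimated Web pages in the world (2010)
DAYS_SINCE_WEB = 7000     # roughly 20 years since the Web's inception

# Average number of pages created per day over the Web's lifetime
pages_per_day = WEB_PAGES / DAYS_SINCE_WEB

EB_UNTIL_2003 = 5         # exabytes created through 2003
EB_PER_DAY_2012 = 7       # exabytes created per day in 2012

# How many "everything through 2003" volumes are produced each day in 2012
history_multiples_per_day = EB_PER_DAY_2012 / EB_UNTIL_2003

print(f"{pages_per_day:,.0f} pages per day")            # 14,285,714 pages per day
print(f"{history_multiples_per_day:.1f}x per day")      # 1.4x per day
```

The result, about 14 million pages a day, is consistent with the slide's "over 10 million pages a day."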
  5. [Figure. Labels: Aggregation; Understanding; data analysis, knowledge structuring, curation; RIF, SPARQL, OWL, RDF, LOD; NoSQL, MapReduce, RDBMS]
     Modified, based on Gene Bellinger, Durval Castro, Anthony Mills: http://www.systems-thinking.org/dikw/dikw.htm , http://yjhyjh.egloos.com/39721
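  6. Extracting Information and Knowledge from Big Data and the Web
     • Information retrieval
       – SEO (Search Engine Optimization), PageRank, EdgeRank
     • Data mining: programs can extract information (knowledge) (data science)
       – Statistical analysis, rule-based analysis, neural network analysis
       – Visualization
     • Using knowledge engineering (knowledge engineering)
       – Accumulate and link ontologies using RDF/OWL
       – Link raw data and open it so that it can be analyzed (Linked Open Data; LOD)
       – Programs can extract knowledge that supports logical analysis
         • SPARQL
         • RIF (Rule Interchange Format)
     • Using human power: curation
       – Filter and synthesize information using human eyes and knowledge
         • Examples: pinterest.com, videocooki.com, storify.com, scoop.it, curated.by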
  7. Pareto's Law
     [Figure. Labels: long tail, big head]
  8. Long Tail Phenomena
     "The Long Tail" by Chris Anderson (Wired, Oct. '04), adapted to information domains
     [Figure: popularity curve with big head applications and long tail applications]
     Long tail applications:
     • Mobile apps: iPhone apps, Android apps
     • SNS apps: Facebook apps, Twitter apps
     • LOD and others: medical apps, apps using public information, …
     Big head applications: …
  9. Approaches from Knowledge Engineering
     • Ontology construction
       – Cyc
       – WolframAlpha
       – Siri
     • Web of Data
       – LOD → LOD2
  10. Old "Layercake" of the Semantic Web
      [Figure. Label: information exchange]
  11. RDF
  12. OWL2
  13. OWL2
  14. Linked Open Data (LOD) Principles
      Linking Open Data (LOD) means connecting data and opening it to the public.
      A little history of the LOD project:
      • Tim Berners-Lee proposed the LOD (Linking Open Data) project (2006)
      • Since the proposal, numerous countries and organizations have participated, causing LOD to explode in the amount of data
      • Wikipedia → DBpedia (www.dbpedia.org)
      • The Bio2RDF project opened data in 27 fields of biology, genetics, and medicine, with data sets totaling about 2.3 billion (Bio2RDF.org) (2008.10)
      • The BBC announced its participation in the LOD project (www.bbc.org) and is now one of the institutions actively using the data
      • US Data.gov released 5 billion data triples
      • The US Library of Congress announced that it would join the LOD project (http://id.loc.gov/authorities/sh85042531#concept)
      • The NY Times (data.nytimes.com) released data from 150 years of publication (2009.10)
      • The US White House released a plan to open data in RDF (2009.11)
      4 Principles of LOD:
      1. Use URIs as names for things
      2. Use HTTP URIs
      3. When someone looks up a URI, provide useful information
      4. Include links to other URIs
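The four principles can be illustrated with a minimal sketch that stores statements as (subject, predicate, object) triples. Everything under example.org and other.example.com is hypothetical data invented for illustration, the helper names are assumptions rather than part of any LOD tool, and principle 3 is only simulated locally (no HTTP request is made).

```python
from urllib.parse import urlparse

# Hypothetical data: the example.org and other.example.com URIs are made up.
ALICE = "http://example.org/person/alice"
BOB = "http://example.org/person/bob"
ALICE_ELSEWHERE = "http://other.example.com/people/alice"

TRIPLES = [
    (ALICE, "http://xmlns.com/foaf/0.1/name", "Alice"),
    (ALICE, "http://xmlns.com/foaf/0.1/knows", BOB),
    (ALICE, "http://www.w3.org/2002/07/owl#sameAs", ALICE_ELSEWHERE),
]


def is_http_uri(value):
    """Principles 1 and 2: a thing is named by a URI, and the URI uses HTTP(S)."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def describe(uri, triples):
    """Principle 3, simulated locally: return the statements known about a URI."""
    return [(p, o) for s, p, o in triples if s == uri]


def outgoing_links(uri, triples):
    """Principle 4: objects that are themselves HTTP URIs link to other things."""
    return [o for _, o in describe(uri, triples) if is_http_uri(o)]


print(is_http_uri(ALICE))              # True
print(is_http_uri("Alice"))            # False: a plain literal, not a name for a thing
print(outgoing_links(ALICE, TRIPLES))  # the two URIs that Alice's description links to
```

A real deployment would serve the output of something like `describe` over HTTP when the URI is dereferenced; the local lookup here only stands in for that step.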
  15. Advantages of LOD
      • Elegant
      • Expandable
      • Flexible
      • Powerful
      • Decentralized
      • Participatory
      • Inclusive
      • "Free" to use
  16. Linked Open Data (LOD) Principles
  17. Change of Web Structure
      [Figure comparing two Web structures. Labels: user interface; linking Web pages for humans; Web page linking bus; mashup; linking Web data for computers; Web data linking bus]
  18. [Figures dated May 2007, Mar. 2008, Sep. 2008, and July 2009]
  19. SPARQL
  20. SPARQL (SPARQL Protocol and RDF Query Language)
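The core idea behind a SPARQL query, matching triple patterns that share variables against a set of RDF triples, can be sketched in plain Python. This is a toy matcher and not a SPARQL engine: the data is hypothetical, the prefixed names such as `ex:alice` are plain strings rather than expanded URIs, and `match_pattern` and `query` are illustrative names.

```python
# Toy triple store; all data is hypothetical.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:carol", "foaf:name", "Carol"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):              # a variable, written as in SPARQL
            if new_binding.setdefault(term, value) != value:
                return None                   # already bound to a different value
        elif term != value:                   # a constant that does not match
            return None
    return new_binding


def query(patterns, triples):
    """Evaluate a basic graph pattern: all patterns must match under one binding."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Roughly: SELECT ?name WHERE { ex:alice foaf:knows ?friend . ?friend foaf:name ?name }
results = query(
    [("ex:alice", "foaf:knows", "?friend"), ("?friend", "foaf:name", "?name")],
    TRIPLES,
)
print([b["?name"] for b in results])  # ['Bob']
```

The shared variable `?friend` is what joins the two patterns, which is how a query can follow links across data sets published under the LOD principles.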
  21. Web 3.0: Merging the Two Perspectives
      [Figure. Labels: WWW Proposal (1989); Semantic Web; Technology Innovation Perspective; LOD Proposal (2006); "GGG" Proposal (2007); Knowledge-based Semantics; Next Generation Web; Data-based Semantics; Market Behavior Perspective; Web 1.0; Web 2.0; Web 3.0; "WEB2" Proposal (2009); Technical Proposal Phase; Practical Use Phase]
  22. But No Champagne…
      • Definition unclear
        – Berners-Lee's 4 principles are ambiguous
      • Interpretation difficult
      • Inconsistent
      • Difficult both to learn and to use
      • Difficult to build browsers and reasoners
      • "Free" to use
        – Full of incomplete and inconsistent RDF data, with no way to make it evolve
      In short, "garbage in, garbage out" was experienced.
  23. Solution to LOD Problems: LOD2
      • LOD2 Stack: a technical approach
        – Linked data management
        – Enrichment and quality improvement
        – Various tools to use
          • Storage and querying
          • Revision and authoring
          • Interlinking and fusing
          • Classification and enrichment
          • …
  24. Q: Is this technical approach to LOD good enough?
      A: A business approach is definitely needed.
  25. Big Data
      What did we do with big data in 2013?
      What would we do with big data in 2014?
  26. Big Data and Data Supremacy
      "The End of Theory" by Chris Anderson
  27. Implication
      • Issue: the haves and the have-nots are separated
        – E.g., in marketing
          • 4Ps
            – Price, product, place, promotion
          • STP
            – Segmentation, targeting, and positioning
  28. Implication
      • Is a technical approach needed?
  29. Business Approach
      • Data markets
        – Azure Data Marketplace
        – Data.com
        – Infochimps.com
        – DataMarket.com
        – Kaggle.com
  30. Data Market: Azure Data Marketplace
  31. Data Market: Data.com
  32. Data Market: Infochimps.com
  33. Data Market: DataMarket.com
  34. Data Market: Kaggle.com
  35. Conclusion
      • Positioning for Korea
        – Where are we?
        – Where are we heading?
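  36. References
      • Web 3.0 Is Changing the World (웹3.0 세상을 바꾸고 있다)
        – Youngwhan Lee
      • A Semantic Web Primer (Cooperative Information Systems series)
        – Grigoris Antoniou, Frank van Harmelen
      • Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL
        – Dean Allemang, James Hendler
      • Ontology: The Key to the Evolution of the Internet (온톨로지: 인터넷 진화의 열쇠)
        – Sangkyu Rho, Jinsoo Park
      • World Wide Web (월드와이드웹)
        – Tim Berners-Lee
      • Curation (큐레이션)
        – Steven Rosenbaum, translated by Si-eun Lee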
  37. Web Sites
      • Problems of Linked Data
        – http://milicicvuk.com/blog/2011/07/26/problems-of-linked-data14-identity/
      • LOD2
        – http://lod2.eu/Welcome.html
        – http://stack.lod2.eu/blog/
      • How to Define Web 3.0
        – http://howtosplitanatom.com/news/how-to-define-web-30-2/
      • SPARQL by Example
        – http://www.cambridgesemantics.com/semantic-university/sparqlby-example#(1)
      • Practical P-P-P-Problems with Linked Data
        – http://www.mkbergman.com/917/practical-p-p-p-problems-withlinked-data/
      • Linked-Data-Api
        – https://code.google.com/p/linked-data-api/