The Web of Data as a Massively Scalable NoSQL Database


The document summarizes using the Web as a NoSQL database by leveraging linked data principles. It discusses representing entities and their attributes as global URIs, using standard data formats, making data accessible on the Web by dereferencing URIs, and linking to other related data to gain network effects. The approach simplifies data integration while benefiting from the scalability, availability and innovation of the existing Web infrastructure.


The Web of Data as a NoSQL Database

Sam Tunnicliffe
@beobal

Talis Systems Ltd

http://talis.com
http://github.com/talis

NoSQL Now! 2011

entity retrieval
using xDBC & ORM
or custom SQL

version 1.0

entity retrieval
using store specific
protocols and
clients

schema-last

schema knowledge
resides in application
or access layer

sharding strategy
may be encapsulated
by clients/servers or
may require the
application to handle
routing/addressing
as well as managing
store specific
protocols and
clients

sharded, polyglot storage
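
As an illustration of what "the application handles routing/addressing" can look like, here is a minimal sketch of client-side shard selection. The shard names and the `shard_for` helper are hypothetical, and hashing the key is only one of several possible sharding strategies; the talk does not prescribe one.

```python
import hashlib

# Hypothetical shard names; a real deployment would list store endpoints here.
SHARDS = ["store-a", "store-b", "store-c"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard for a key using a stable hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same key always routes to the same shard.
print(shard_for("1969-059A"))
```

Every client has to carry this routing knowledge, which is the coupling the rest of the talk sets out to avoid.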

What if you could use the Web as a database?

loose coupling

http://www.flickr.com/photos/11950mike/4707805552

http://www.flickr.com/photos/juniorvelo/2861770108

outsource data
acquisition costs

proven, extreme scalability

http://www.flickr.com/photos/krayker/2268587409

http://www.flickr.com/photos/ranjithsiji/4897513366

leverage existing infrastructure

http://www.flickr.com/photos/mandy_pantz/2512569926

more and more diverse data

serendipity

http://www.flickr.com/photos/sylvar/3291628571

http://www.flickr.com/photos/zivkovic/5850008238

high latency

giving away control

http://www.flickr.com/photos/kecko/4052526123

variable availability

http://www.flickr.com/photos/numberstumper/3057162582

1969-059A
spacecraft/1969-059A

global names

1969-059A
spacecraft/1969-059A
nasa.dataincubator.org/spacecraft/1969-059A

global names

1969-059A
spacecraft/1969-059A
nasa.dataincubator.org/spacecraft/1969-059A
http://nasa.dataincubator.org/spacecraft/1969-059A

URIs for entity names
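
The progression from a local identifier to a full URI can be sketched in a few lines. The `entity_uri` function and its parameter names are assumptions made for this example, not something shown in the talk.

```python
from urllib.parse import urlsplit


def entity_uri(host: str, collection: str, local_id: str) -> str:
    """Build a global name from a host, a collection and a local identifier."""
    return f"http://{host}/{collection}/{local_id}"


uri = entity_uri("nasa.dataincubator.org", "spacecraft", "1969-059A")
parts = urlsplit(uri)

print(uri)
print(parts.netloc)  # the authority, resolved through DNS
print(parts.path)    # the part of the name scoped to that authority
```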

mass 28801.1
name “Apollo 11 CSM”

things have attributes

mass 28801.1
name “Apollo 11 CSM”
launch launch/1969-059

things have attributes

http://purl.org/net/schemas/space/mass 28801.1
http://xmlns.com/foaf/0.1/name “Apollo 11 CSM”
http://purl.org/net/schemas/space/launch launch/1969-059

URIs for attribute names
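
A rough sketch of the same record with URIs as attribute names follows. Resolving the relative value `launch/1969-059` against `http://nasa.dataincubator.org/` is an assumption made here for illustration; the slide only shows the relative form.

```python
# Vocabulary prefixes taken from the slide.
SPACE = "http://purl.org/net/schemas/space/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Assumed base for resolving the relative names on the slide.
BASE = "http://nasa.dataincubator.org/"

spacecraft = BASE + "spacecraft/1969-059A"

# Attribute names are global URIs, so two datasets using the same
# URI are talking about the same attribute.
attributes = {
    SPACE + "mass": 28801.1,
    FOAF + "name": "Apollo 11 CSM",
    SPACE + "launch": BASE + "launch/1969-059",
}

for name, value in attributes.items():
    print(spacecraft, name, value)
```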

http://www.flickr.com/photos/juniorvelo/457197656

links

dereference to get data

DNS is your routing component

http://www.flickr.com/photos/cjschmit/4623783487
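
Looking up an entity by dereferencing its URI could look roughly like this with the Python standard library. The helper names are hypothetical, and whether the host still answers, or honours an `Accept: text/turtle` header, is an assumption; the example only builds the request and does not send it.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def build_lookup(uri: str, accept: str = "text/turtle") -> Request:
    """Prepare an HTTP GET for an entity URI, asking for an RDF format."""
    return Request(uri, headers={"Accept": accept})


def dereference(uri: str, timeout: float = 10.0) -> bytes:
    """Fetch whatever the authoritative host returns for the URI."""
    with urlopen(build_lookup(uri), timeout=timeout) as response:
        return response.read()


request = build_lookup("http://nasa.dataincubator.org/spacecraft/1969-059A")
print(request.get_method(), request.full_url)
# The host name is all the client needs for routing; DNS does the rest.
print(urlsplit(request.full_url).hostname)
```

No store-specific client or routing table is involved: the name of the entity is also its address.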

subject

predicate

object

RDF and linked data

1969-059A

launch

launch/1969-059

RDF and linked data

1969-059A

mass: 28801.1
name: Apollo 11 CSM

launch

launch/1969-059

launch date: 16 July 1969
launch vehicle: Saturn V
weather: clear, dry

RDF and linked data
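
The subject, predicate, object shape can be sketched with plain tuples and a small N-Triples writer. This is a simplified illustration, not a full RDF implementation: the `Triple`, `Literal` and `to_ntriples` names are hypothetical, only quotes and backslashes are escaped, and the absolute URIs assume the `http://nasa.dataincubator.org/` base.

```python
from typing import NamedTuple, Union

BASE = "http://nasa.dataincubator.org/"  # assumed base URI
SPACE = "http://purl.org/net/schemas/space/"
FOAF = "http://xmlns.com/foaf/0.1/"


class Literal(NamedTuple):
    """A plain literal value, as opposed to a link to another entity."""
    value: str


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, Literal]


def to_ntriples(triple: Triple) -> str:
    """Serialise one triple as a line of N-Triples."""
    if isinstance(triple.obj, Literal):
        escaped = triple.obj.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{triple.obj}>"  # a URI object is a link to follow
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


craft = BASE + "spacecraft/1969-059A"
triples = [
    Triple(craft, SPACE + "mass", Literal("28801.1")),
    Triple(craft, FOAF + "name", Literal("Apollo 11 CSM")),
    Triple(craft, SPACE + "launch", BASE + "launch/1969-059"),
]

for triple in triples:
    print(to_ntriples(triple))
```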

nasa.gov

1969-059A
Apollo 11
launch/1969-059

launch date: 16 July 1969
launch vehicle: Saturn V
weather: clear, dry

geonames.org

Cape Canaveral
Washington D.C.
Mexico
Canada
United States

alternate name: Stati Uniti
alternate name: Estados Unidos
alternate name: アメリカ合衆国
population: 311,874,000

RDF and linked data

routes between
linked entities are
explicit in data

DNS does the
hard work

entity lookups
come from
authoritative sources

web enabled data

realtime discovery
of additional
data sources

web enabled data
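
Discovery of new sources by following links can be sketched as a breadth-first walk. The `LINKS` table is a hypothetical in-memory stand-in for data that would really be fetched by dereferencing each URI, and the `geonames.example` URIs are placeholders rather than real identifiers.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical link data: each URI maps to the URIs its description links to.
LINKS = {
    "http://nasa.dataincubator.org/spacecraft/1969-059A": [
        "http://nasa.dataincubator.org/launch/1969-059",
    ],
    "http://nasa.dataincubator.org/launch/1969-059": [
        "http://geonames.example/cape-canaveral",
    ],
    "http://geonames.example/cape-canaveral": [
        "http://geonames.example/united-states",
    ],
    "http://geonames.example/united-states": [],
}


def discover_hosts(start: str, links: dict) -> list:
    """Follow links breadth-first and list hosts in the order first seen."""
    seen = {start}
    hosts = []
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        host = urlsplit(uri).hostname
        if host not in hosts:
            hosts.append(host)
        for target in links.get(uri, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return hosts


print(discover_hosts("http://nasa.dataincubator.org/spacecraft/1969-059A", LINKS))
```

The client starts with one URI and no list of data sources; the second host is learned only from the data itself.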

expanded
data universe

simplified access
protocol

but some things
are now outside
of your control

web enabled data

http://www.flickr.com/photos/vhanes/3722327096

local caches
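
One way to soften high latency and variable availability is a local cache in front of remote lookups. The sketch below is an assumption about how that might be done, not the speaker's design: `CachingFetcher` is a hypothetical name, the time-to-live policy is arbitrary, and `fake_fetch` stands in for a real HTTP fetch.

```python
import time
from typing import Callable, Dict, Tuple


class CachingFetcher:
    """Wrap a fetch function with a simple time-to-live cache."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, uri: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no remote call
        body = self._fetch(uri)
        self._entries[uri] = (now, body)
        return body


calls = []


def fake_fetch(uri: str) -> bytes:
    """Stand-in for a remote lookup; records each call it receives."""
    calls.append(uri)
    return b"<data for " + uri.encode("utf-8") + b">"


cache = CachingFetcher(fake_fetch, ttl_seconds=60.0)
first = cache.get("http://nasa.dataincubator.org/spacecraft/1969-059A")
second = cache.get("http://nasa.dataincubator.org/spacecraft/1969-059A")
print(len(calls))  # the second lookup is served locally
```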

outcomes

http://www.flickr.com/photos/carbonnyc/293733099

shared effort

http://www.flickr.com/photos/toffehoff/244870160/

simpler data integration

http://www.flickr.com/photos/thedailyenglishshow/3947409618/

more linked data

http://www.flickr.com/photos/ninjanoodles/114033269

http://www.flickr.com/photos/asurroca/66225176

network effects

use the web as a database by...

● using global names
    ● for entities
    ● for attributes
● using standard formats
● making data dereferenceable
● linking to other data

http://www.flickr.com/photos/ryanwick/3461847552


