Linked Data: some social challenges

Presentation given at "Global Interoperability and Linked Data in Libraries", University of Florence, 18 Jun 2012

Linked Data: some social challenges

  1. Linked Data: some tech & social challenges. Michele Barbera <barbera@netseven.it>, @barbz79it
  2. 1 TECH CHALLENGES, 2 SOCIAL CHALLENGES, 3 LINKED DATA ECONOMY
  10. smart data now! Unità Web of Data spaziodati.eu netseven.it fbk.eu
  11. Is the Semantic Web real?
  16. no* (* I’m provocative)
  19. we aimed at this:
  23. and failed* (* But produced ~170k research papers in 11 years, not bad!)
  35. Pizza ontology?!
  37. well, not really failed...
  41. we’re still working on it
  46. less pizza, more engineering
  50. A little semantics goes a long way... (Semantic Web, Linked Data)
  58. Semantic Web, Linked Data, Web of Data
  63. it’s not just technology
  67. it’s definitely not AI
  71. it’s just about linking things together
  77. your web site DATA IS LESS VALUABLE WHEN SILOED
  78. because value is in context
  83. content is king
  86. content is king (crossed out)
  89. linking. Some tech issues
  91. 1 SCALABILITY
  93. is it all about size? Flexibility, dynamicity, scalability (by Giovanni Tummarello)
  100. dataspaces by Giovanni Tummarello
  101. Large Scale RDF summaries. 12M relationships; Class Level. http://test01.sindice.net/szydan/dataset-view/dataset/default/www.bbc.co.uk (by Giovanni Tummarello)
  106. Large Scale RDF summaries. 12M relationships; Class Level; 10B relationships. http://test01.sindice.net/szydan/dataset-view/dataset/default/www.bbc.co.uk (by Giovanni Tummarello)
  112. 2 - streaming linked data
  117. 3 - versioning: moved, deleted. SPOT THE DIFFERENCE
  122. self_promotion
  123. SIREn
       Data Collection: 500M web data documents (RDF, RDFa, Microformat, etc.); 200K datasets; 50B triples
       Settings: cluster of 4 nodes; 2 nodes for indexing; 2 nodes for querying; replication
       Indexing Performance: full index construction takes approx. 24 hours; 436K triples / second
       Services: keyword and structured queries; dataset search; 99% uptime
       spaziodati.3scale.net
  130. /self_promotion
  131. some social issues
  133. 1 THINKING IN THE GRAPH
  138. 1 - thinking in tables
  183. thinking in tables

       u_id  f_id
       1     2
       1     3
       3     4
       4     3

       id  name     age  affiliation
       1   Michele  33   net7
       2   Mario    32   unipi
       3   Silvia   28   unifi
       4   Irene    27   unitn

       Institution  City
       net7         pisa
       unipi        pisa
       unifi        firenze
       unitn        trento
  186. thinking in graphs? [Graph figure: the people michele (33), mario (32), silvia (28) and irene (27) are connected by "friend" edges, linked by "works" edges to net7, unipi, unifi and unitn, which are in turn linked by "place" edges to pisa, Firenze and Trento]
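
The move from the three tables to the graph on the previous two slides can be sketched in a few lines of Python. This is only an illustrative sketch, not something shown in the talk: the rows are copied from the tables slide, while the function names (`tables_to_triples`, `objects`, `cities_of_friends`) and the choice of plain tuples as triples are assumptions made for the example. The edge labels `friend`, `works` and `place` are the ones visible in the graph slide.

```python
# Rows copied from the "thinking in tables" slide.
people = [
    (1, "Michele", 33, "net7"),
    (2, "Mario", 32, "unipi"),
    (3, "Silvia", 28, "unifi"),
    (4, "Irene", 27, "unitn"),
]
friends = [(1, 2), (1, 3), (3, 4), (4, 3)]  # (u_id, f_id)
institutions = {"net7": "pisa", "unipi": "pisa", "unifi": "firenze", "unitn": "trento"}


def tables_to_triples(people, friends, institutions):
    """Flatten the three tables into (subject, predicate, object) triples."""
    names = {pid: name.lower() for pid, name, _age, _aff in people}
    triples = set()
    for pid, _name, age, affiliation in people:
        triples.add((names[pid], "age", age))
        triples.add((names[pid], "works", affiliation))
    for u_id, f_id in friends:
        # The foreign-key join table becomes a direct edge between two people.
        triples.add((names[u_id], "friend", names[f_id]))
    for institution, city in institutions.items():
        triples.add((institution, "place", city))
    return triples


def objects(triples, subject, predicate):
    """All objects reachable from `subject` over one `predicate` edge."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def cities_of_friends(triples, person):
    """Follow friend -> works -> place, i.e. a three-hop walk on the graph."""
    cities = set()
    for friend in objects(triples, person, "friend"):
        for organisation in objects(triples, friend, "works"):
            cities |= objects(triples, organisation, "place")
    return cities


triples = tables_to_triples(people, friends, institutions)
print(sorted(cities_of_friends(triples, "michele")))
```

In the tabular form the same question needs two joins across three tables; in the graph form it is a walk along named edges, which is the shift in thinking the slides point at.
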
  193. e.g. social graphs [Figure: an address book page listing contacts (Mario, Giovanni, Anna, Mamma, Antonio) with phone numbers, a date and street addresses in Bologna, Trento and Pisa]
  217. 2 - A.A.A.* (* “you don’t know what you’re talking about”)
  227. AAA: library, wikidb, scholarly community
  229. TBL: “The less inviting side of sharing is losing some control. Indeed, at each layer --- Net, Web, or Graph --- we have ceded some control for greater benefits.” “It is about getting excited about connections, rather than nervous.”
  266. 3) info vs. non-info
  269. http-range-14
       303 redirection? http://example.com/resource/CNR, http://example.com/page/CNR, http://example.com/data/CNR
       hash URI? http://www.cnr.it/homepage#CNR, http://www.cnr.it/homepage
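
The two options named on the http-range-14 slide can be illustrated with a small Python sketch. It is a sketch under assumptions, not a description of any real server: the URIs are the ones on the slide, but the rule that `/page/` serves HTML and `/data/` serves RDF, the `Accept` values checked, and the function names `resolve_hash_uri` and `redirect_303` are all hypothetical choices made for the example.

```python
from urllib.parse import urldefrag


def resolve_hash_uri(uri):
    """Hash URI option: the client drops the fragment before fetching,
    so the thing (#CNR) and the document that describes it stay distinct."""
    document, _fragment = urldefrag(uri)
    return document


def redirect_303(uri, accept="text/html"):
    """303 option: a request for the thing itself is answered with
    '303 See Other' pointing at a document about it.

    Assumption: /page/ holds the HTML view and /data/ the RDF view.
    """
    prefix = "http://example.com/resource/"
    if not uri.startswith(prefix):
        return None  # not a resource URI handled by this sketch
    local_name = uri[len(prefix):]
    wants_rdf = "application/rdf+xml" in accept or "text/turtle" in accept
    target = "data" if wants_rdf else "page"
    return 303, f"http://example.com/{target}/{local_name}"


print(resolve_hash_uri("http://www.cnr.it/homepage#CNR"))
print(redirect_303("http://example.com/resource/CNR"))
print(redirect_303("http://example.com/resource/CNR", accept="text/turtle"))
```

Either way, the identifier of the organisation is kept separate from the identifiers of the documents about it, which is the "info vs. non-info" distinction of the preceding slide.
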
  272. caution!
       http://universities.org/italy#cnr
           ns:president a_person
           ns:department some_department
           ns:department some_department
           owl:sameAs http://www.example.com/cnr
       http://www.example.com/cnr
           ns:creator jonnhy web developer
           ns:date 12 Jun 2011
           ns:name “The Website”
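
Why the slide says "caution!" can be shown with a short Python sketch. It assumes a deliberately naive reading of `owl:sameAs`, where every statement about one URI is copied to the other in a single pass (no transitive closure); the function name `merge_same_as` and the tuple representation are hypothetical. The statements are the ones on the slide; the repeated `ns:department some_department` placeholder collapses to one entry because the data is held in a set.

```python
SAME_AS = "owl:sameAs"

# Statements from the "caution!" slide: an organisation and a web site.
triples = {
    ("http://universities.org/italy#cnr", "ns:president", "a_person"),
    ("http://universities.org/italy#cnr", "ns:department", "some_department"),
    ("http://universities.org/italy#cnr", SAME_AS, "http://www.example.com/cnr"),
    ("http://www.example.com/cnr", "ns:creator", "jonnhy web developer"),
    ("http://www.example.com/cnr", "ns:date", "12 Jun 2011"),
    ("http://www.example.com/cnr", "ns:name", "The Website"),
}


def merge_same_as(triples):
    """Copy every statement across each owl:sameAs link, in both directions."""
    merged = set(triples)
    links = [(s, o) for s, p, o in triples if p == SAME_AS]
    for left, right in links:
        for s, p, o in triples:
            if p == SAME_AS:
                continue
            if s == left:
                merged.add((right, p, o))
            elif s == right:
                merged.add((left, p, o))
    return merged


merged = merge_same_as(triples)
organisation = "http://universities.org/italy#cnr"
for s, p, o in sorted(merged):
    if s == organisation and p != SAME_AS:
        print(p, "->", o)
```

After the merge the organisation appears to have a creator, a creation date and the name “The Website”, while the web site appears to have a president and departments: the link has conflated the thing with a page about it.
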
  278. 4) Open World Assumption
  281. Kbase: Seat 14 is reserved; Seat 27 is reserved. Is seat 28 reserved? OWA: UNKNOWN. CWA: NO
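
The seat example above can be written out as a minimal Python sketch. The knowledge base is the one on the slide (seats 14 and 27 are reserved); the function names and the string answers are assumptions chosen to mirror the slide's wording.

```python
# Knowledge base from the slide: the only facts we hold.
reserved_seats = {14, 27}


def is_reserved_cwa(kbase, seat):
    """Closed World Assumption: whatever is not stated is false."""
    return "YES" if seat in kbase else "NO"


def is_reserved_owa(kbase, seat):
    """Open World Assumption: whatever is not stated is simply not known."""
    return "YES" if seat in kbase else "UNKNOWN"


for seat in (14, 28):
    print(seat, "CWA:", is_reserved_cwa(reserved_seats, seat),
          "OWA:", is_reserved_owa(reserved_seats, seat))
```

Both functions agree on stated facts and differ only on missing ones, which is why tools built around closed-world reasoning, such as a typical booking database, can give a different answer from an open-world reasoner over the same data.
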
  291. - OWA is not difficult to understand
       - OWA is good for dealing with inconsistencies and universal systems
       - We’re more familiar with CW reasoning
       - many existing tools are CW
  318. a Linked Data economy?
  320. Linked Data
       - ~300 datasets
       - not frequently updated
       - 0.1% of the Web of Data
  335. Web of Data
  338. <h1 id="name"><span class="fn n"><span class="given-name">Michele</span> <span class="family-name">Barbera</span></span></h1>
  339. www.rottentomatoes.com
  340. Schema.org http://www.linkedopendata.it/schema-org-e-le-responsabilita-dei-monopolisti
  341. Google Knowledge Graph
  344. Freebase + Geonames + DBpedia + schema.org + search statistics? opaque/hidden identifiers = not reusable
  358. BIG DATA AND INFO OVERLOAD
       - 5 billion mobile phones in use in 2010
       - 30 billion pieces of content shared on Facebook every month
       - 40% projected growth in global data generated per year vs 5%
       - 235 terabytes of data collected by the US Library of Congress in April 2011
       - 15 out of 17 sectors in the US have more data stored per company than the US Library of Congress
       - $300 billion potential annual value to US health care (more than double the total annual health care spending in Spain)
       - $600 billion potential annual consumer surplus from using personal location data globally
       - $250 billion potential annual value to Europe’s public sector administration, more than the GDP of Greece
       - 60% potential increase in retailers’ operating margins possible with big data
       - 140,000-190,000 more deep analytical talent positions and 1.5 million more data-savvy managers needed to take full advantage of big data in the United States alone
  359. “The real value of the GKG may be in what gets deleted instead of what gets added.” Paul Houle, http://lists.w3.org/Archives/Public/public-lod/2012Jun/0038.html
  378. Open Data (and digital public goods) represents an unprecedented opportunity to build a (local? vertical?) data economy and to preserve our cultural diversity
  401. “The gist of the matter is to turn large streams of data into added value for the public and private sector [...] Clearly, research, engineering, policy making for the Data Economy and the exploitation of the unprecedented wealth of data have become keys to the Future of Europe.”
  448. WE CAN DO IT!!!
  449. Thank you. @barbz79it
