Slides for the second part of the Linked Data for Development (LD4D) Tutorial, held at WSSF2013 in Montreal, Canada.
In this presentation I talk about downscaling the Semantic Web, taking into account issues around (1) infrastructure and hardware, (2) interfaces, and (3) relevant data.
1. Linked Data for Development:
Part 2: Downscaling Linked Data
Victor de Boer
With significant input from
Christophe Guéret, Martin Murillo, Stephane Boyera, Stefan Schlobach, Bernie Innocenti, Walter Bender, Claudia Urrea, Anna Bon, Hans Akkermans, Nana Gyan, Amadou Tangara, Mary Allen, …
3. Outline
• Part 2:
– Why Linked Data for Development
– Bringing the Semantic Web and Linked Data to the Base of the Pyramid
• Relevancy
• Infrastructure and connectivity
• Interfaces
– IATI as Linked Data
– Voice-based access to market data in the Sahel
– Distributed data sharing: OLPC and ERS
• Part 3: Hands-on session!
7. ICT4D
• Technology is a development tool
– Education
– Healthcare
– Livelihood
– etc.
• Leveraging communication independently of physical/geographical barriers
• Improving transparency, accountability, efficiency of governments
• Developing nations can leapfrog directly into the information age, jumping many phases of immature technologies
Based on Sbc4d.com
8. Information sharing needs
• Agriculture
– Market prices
– Business opportunities
– Support
– Sharing indigenous knowledge
– etc.
• Health
– Prevention
– Access to healthcare
– Detection of disease outbreak
– etc.
• Education
• etc.
Based on Sbc4d.com
9. Web Alliance for Regreening in Africa
Washington, 13-15 May 2013
W4RA: Information exchange and knowledge sharing in rural Africa
10. World Wide Web
as Instrument of Empowerment
"Our success will be measured by how well we foster the
creativity of our children. Whether future scientists have
the tools to cure diseases.
Whether people, in developed and developing economies
alike, can distinguish reliable information from propaganda
or commercial chaff.
Whether the next generation will build systems that support
democracy and accountable debate.
I hope that you will join this global effort to advance the
Web to empower people."
Sir Tim Berners-Lee, inventor of the Web
11. Why the Semantic Web?
• Information (from NGOs) is kept in silos
– Specific products
– Specific communities
• A lot of knowledge is lost due to lack of publication
→ Sharing (heterogeneous) knowledge is essential
• Linked Data (LD) is well-suited because of:
– Language-agnostic
– Interface-agnostic
– De-centralised authoring
• Slicing
– Re-usability
• Local
• Global
Img: flickr/elcovs
16. Barriers to the Internet
1. Technology: The lack of connectivity
and electricity, cost of devices and
cost of connection are limiting the
adoption and usage of new
technologies;
2. Capacity: Lack of time and resources
limits the participation in data
sharing processes. There are also
issues related to low education levels,
low capacity to interpret data, and
illiteracy;
3. Relevance: Power balance, culture,
apathy, lack of incentives, lack of
interest and dis-empowerment are
also all threats to having citizens
engage in data sharing.
Stephane Boyera (SBC4D.com)
17. Sem.tech/Linked Data should be made
1. usable on small, affordable,
hardware deployed in various
connectivity contexts;
2. accessible to individuals with
varied cultural backgrounds /
literacy levels;
3. relevant and directly useful to
the target public they aim to
empower.
Infrastructure
Interface Relevancy
20. Relevancy
• No local content
• No local ownership
• Power balance, culture, apathy, lack of incentives, dis-empowerment
Subsecretario de transparencia, Alcaldes y la gente
http://www.youtube.com/watch?v=q0S3juRQXR0 Max Rodriguez
25. With the mainstream
• Developing countries can leapfrog directly into the information age,
– jumping many phases of immature technologies
• Linked Data is mainstream computer science research.
– Let's worry about the 4.5 billion unconnected prosumers now!
Img: flickr/n3v3rv0id
27. • Integrate local community radios and mobile ICT for knowledge sharing
• Better support and integrate local languages in voice-based services
– Development of appropriate speech elements (text-to-speech and speech recognition)
• Develop a free and open source toolbox for local developers.
– Investigate self-sustainability
– Develop appropriate business models
– In collaboration with local communities.
28. Bottom-up
• Involvement of local communities
– Trust and ownership
– Co-creation
• Bottom-up: field visits, workshops, demos, roadshows, etc.
• Local communities: innovation co-creation, "Living Labs" socio-technical approach
– Use case gathering
– Observation and prototyping
– Test, adapt
29. From 20 use cases to 3 voice systems
Market Information
Citizen Journalism
Event Organiser
1 m-Milk ordering and delivery service of Tominian Milk producers and NGO
2 m-Tree protection alert service Sahel Eco Farmers and NGO
3 mobile-web Event organizer for vaccination of herds Farmers
4 m-Farmer-expert directory service Farmer organization
5 NGO info-line about legal issues in several languages Sahel Eco
6 Leave announcement or select your favourite song Radio
7 Shea butter and honey trading service Radio and Sahel Eco
8 Access radio programs and announcements on your phone Radio
9 Gourcy seed producers seed certification service Farmer organization
10 Radio questions and answers about agricultural issues Radio
11 m-collective purchase organizing service Local buyers
12 m-GIS regreening service Sahel Eco
13 m-Farmer social network Sahel Eco
14 mobile-web regional market system Farmer organization
15 Sahel Eco portal to Regreening and access to m-services Sahel Eco
16 m-event organizer for re-greening events Sahel Eco, farmers
31. "Slot and Filler" Text-to-Speech
English: 15 liters of [honey] offered by Zakari Diarra
Bambara: 15_ba.wav L_ba.wav Of_ba.wav
Spoken Language Elements Repository
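The slot-and-filler idea can be sketched in a few lines of Python: a sentence template mixes fixed fragments with slots, and each resulting token maps to a pre-recorded clip in the target language. This is only an illustration. The template, the slot names, and the `<token>_<lang>.wav` naming scheme are assumptions modeled on the file names shown on the slide, and a real deployment would need a separate template per language, since word order differs.

```python
# Hypothetical sentence template: fixed fragments plus named slots in braces.
TEMPLATE = ["{quantity}", "{unit}", "of", "{product}", "offered_by", "{contact}"]


def clip_name(token, lang):
    """Map one token to the file name of its pre-recorded clip (assumed scheme)."""
    return f"{token}_{lang}.wav"


def render_playlist(template, slots, lang):
    """Fill the slots and return the ordered list of audio clips to play."""
    playlist = []
    for part in template:
        if part.startswith("{") and part.endswith("}"):
            token = slots[part[1:-1]]  # slot: look up the filler value
        else:
            token = part  # fixed fragment of the sentence
        playlist.append(clip_name(str(token), lang))
    return playlist


offer = {"quantity": 15, "unit": "L", "product": "honey", "contact": "zakari_diarra"}
print(render_playlist(TEMPLATE, offer, "ba"))
```

Because only the clips are language-specific, supporting another language means recording a new set of clips and supplying a template in that language's word order; no speech synthesizer is needed.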
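32. VoiceXML
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en">
<form>
<field>
<!-- Welcome message; bargein="false" means the caller cannot interrupt it -->
<prompt bargein="false">
Welcome to RadioMarche!
<audio src="audio/communique_1_bambara.wav"/>
</prompt>
<!-- Each option is selected by pressing the matching phone key (DTMF) -->
<option dtmf="1" value="1">Press one for X</option>
<option dtmf="2" value="2">Press two for Y</option>
...
</field>
</form>
</vxml>
DTMF = Dual-tone multi-frequency signaling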
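A menu like this one could also be generated from data rather than written by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `build_menu` and the placeholder option labels are illustrative assumptions, not part of the RadioMarché code base.

```python
import xml.etree.ElementTree as ET


def build_menu(welcome_text, audio_src, options):
    """Return a VoiceXML 2.0 document (as a string) with one DTMF menu field."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": "http://www.w3.org/2001/vxml"})
    form = ET.SubElement(vxml, "form")
    field = ET.SubElement(form, "field")
    # The prompt plays the spoken welcome text followed by a recorded clip.
    prompt = ET.SubElement(field, "prompt", {"bargein": "false"})
    prompt.text = welcome_text
    ET.SubElement(prompt, "audio", {"src": audio_src})
    # One <option> per phone key the caller may press.
    for key, label in options:
        option = ET.SubElement(field, "option", {"dtmf": key, "value": key})
        option.text = label
    return ET.tostring(vxml, encoding="unicode")


doc = build_menu(
    "Welcome to RadioMarche!",
    "audio/communique_1_bambara.wav",
    [("1", "Press one for X"), ("2", "Press two for Y")],
)
print(doc)
```

Generating the document from a list of options makes it straightforward to build the same menu for several languages or products from one data source.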
38. Web for ALL.
Using voice technologies and available tools…
… we make the benefits of the Web available to people who use simple mobile phones.
39. Results
• RadioMarché -- Increased market for farmers.
– Political, social, economic, ecological factors play a great role
– Too successful: not the entire value chain is served
• Foroba Blon -- Facilitating rural citizen journalism.
– Privacy and security
– New business models
Voice platform with reusable components for different use cases.
42. Linked Market Data
• 1,952 RDF triples
– 90 offerings
– 19 contacts
• Links to
– Data
• DBpedia
• GeoNames
• AGROVOC
– Vocabularies
• FOAF
• GoodRelations
Local market data
Data / communique layer
Farmers
(producers)
Buyers
(consumers)
Email GSM/Voice
Web SMS
Interface handler layer
Local
radio
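To make the idea of linked market data concrete, here is a rough sketch of how one offering and its contact might be written as RDF triples in N-Triples syntax, using FOAF and GoodRelations terms and a link to DBpedia. The `http://example.org/market/` namespace, the identifiers, and the choice of `rdfs:seeAlso` for the external link are assumptions made for illustration; the actual RadioMarché modeling may differ.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
FOAF = "http://xmlns.com/foaf/0.1/"
GR = "http://purl.org/goodrelations/v1#"
BASE = "http://example.org/market/"  # hypothetical namespace, not the project's real one


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def offering_triples(offer_id, contact_id, contact_name, product_label, product_link):
    """Describe one offering, the contact who offers it, and an external link."""
    offer = iri(BASE + "offering/" + offer_id)
    contact = iri(BASE + "contact/" + contact_id)
    return [
        (contact, iri(RDF_TYPE), iri(FOAF + "Person")),
        (contact, iri(FOAF + "name"), literal(contact_name)),
        (contact, iri(GR + "offers"), offer),
        (offer, iri(RDF_TYPE), iri(GR + "Offering")),
        (offer, iri(GR + "name"), literal(product_label)),
        (offer, iri(RDFS_SEE_ALSO), iri(product_link)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = offering_triples(
    "1", "zakari-diarra", "Zakari Diarra", "15 liters of honey",
    "http://dbpedia.org/resource/Honey",
)
print(to_ntriples(triples))
```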
43. Sharing across regions/NGOs
Local market data
RadioMarché market information system
Farmers
(producers)
Buyers
(consumers)
Email GSM/Voice
Web SMS
Data / communique platform
Local radio
RadioMarché in second region
Local market data
Data / communique layer
Farmers
(producers)
Buyers
(consumers)
Email GSM/Voice
Web SMS
Interface handler layer
Local radio
49. Voice browser Tel: +31208080855
Skype: +990009369996162208
Welcome
Choose application and
language
dtmf
About which product
(EN)
About which product
(NL)
List all products (EN)
dtmf
List product offerings
dtmf
List product offerings
1
2
3
1..n
1..n
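The dialogue flow of the voice browser can be read as a small state machine driven by key presses. The sketch below is one possible reading of the diagram, not the deployed system: the key-to-branch mapping (1 and 2 for the product menu in English and Dutch, 3 for listing all products), the prompt wording, and the product names are assumptions.

```python
# Assumed mapping of the first key press to (action, language).
MENU = {
    "1": ("product_menu", "en"),
    "2": ("product_menu", "nl"),
    "3": ("list_all", "en"),
}


def run_dialogue(keys, products):
    """Follow a sequence of DTMF key presses and return the prompts played."""
    prompts = ["Welcome. Choose application and language."]
    if not keys:
        return prompts
    choice = MENU.get(keys[0])
    if choice is None:
        prompts.append("Invalid choice.")
        return prompts
    action, lang = choice
    if action == "list_all":
        prompts.append(f"[{lang}] All products: " + ", ".join(products))
        return prompts
    prompts.append(f"[{lang}] About which product? Press 1 to {len(products)}.")
    # The second key (1..n) selects one product and lists its offerings.
    if len(keys) > 1 and keys[1].isdigit() and 1 <= int(keys[1]) <= len(products):
        prompts.append(f"[{lang}] Offerings for {products[int(keys[1]) - 1]}")
    return prompts


for line in run_dialogue(["1", "2"], ["honey", "shea butter"]):
    print(line)
```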
50. Current status
• Linked Market Data
– Locally created
– Linked Data makes re-use possible (NGO, others)
– LD voice labels
• Can be (re)used to develop voice applications with this data
• To go beyond proof-of-concept
– More localization needed
– Local hardware/services (Emerginov / OfficeRoute)
– User testing
– More sophisticated translations (VoiceSPARQL)
53. Icon-based interaction
NCR ATM interface for illiterate users; 'grammar' - ISOTYPE by Otto Neurath, available at
http://imaginarymuseum.org/MHV/PZImhv/NeurathPictureLanguage.html
55. One Laptop Per Child (OLPC), Sugar
and the Entity Registry System
Bernie Innocenti, Walter Bender, Christophe Guéret, Claudia Urrea
56. OLPC mission and vision
• Develop (and deploy) a low-cost laptop in order to revolutionize how we educate the world's children
• What motivates learning is not carrots or sticks, but rather:
– autonomy,
– mastery, and
– a sense of purpose.
• A laptop makes learning more flexible: children learn by teaching and actively helping each other; the teacher is free to focus expertise where it is needed
57. How is learning with the XO different?
OLPC
Computer for learning
Student-centric
Teacher as mentor
Voice, text
Learning to learn
Critical thinking
60. The numbers (2012)
• 2,000,000+ children with XOs
• 1,000,000,000 children without laptops
• 150+ language projects
• 40+ countries
• 500+ Sugar activities
75. Introduction - IATI
"IATI is a voluntary, multi-stakeholder initiative that seeks to improve the transparency of aid in order to increase its effectiveness in tackling poverty."
As of 2013, over 150 donors, NGOs and governments have registered with IATIregistry.org by publishing their aid activities in this XML standard.
Now: 180+
76. Introduction - IATI users
• Funders
o Where is the money of my organisation spent?
o Where do other organisations spend their money?
• Governments
o How much money is spent in my country?
o What are the budgets or planned disbursements for my country?
• Locals
o What organisations are working in my area?
o What projects are currently going on in my area?
• Public
o Where is my tax money going?
o What are the organisations doing with my donations?
78. Introduction - Why IATI Linked Data?
1. Reusable vocabularies
o Extract information automatically from the IATI data by making
use of applications which are able to interpret standard
vocabularies
2. Enrich IATI data
o Link IATI data to external datasets in order to enrich the IATI
data with additional information or metadata.
3. Donors can use their own Linked Data specification.
o @Linked-data-uri attribute already exists in the IATI
model.
79. Model and links based on requirements elicited from experts
Iterative Requirements Engineering Process Model by Loucopoulos and Karakostas
80. Linked Data model - Example
iati:activity/GB-CHC-285776-CHA024
iati:activity-transaction
iati:activity/GB-CHC-285776-CHA024/transaction/42737 .
iati:activity/GB-CHC-285776-CHA024/transaction/42737
iati:transaction-tied-status
iati:codelist/TiedStatus/5 .
81. Linked Data model - Provenance
• On file level
o Not on activity level
• A named graph per file, e.g.: iati:graph/dataset/Worldbank
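File-level provenance with named graphs can be illustrated by writing the two example triples from the Linked Data model as N-Quads, where the fourth term names the graph of the source file. This is a sketch under two assumptions: that the `iati:` prefix expands to `http://purl.org/collections/iati/` (inferred from the dereferenceable codelist URI elsewhere in these slides), and a made-up graph name `example-file` standing in for the real file the activity came from.

```python
IATI = "http://purl.org/collections/iati/"  # assumed expansion of the iati: prefix


def expand(term):
    """Expand an iati:-prefixed name into a full IRI in angle brackets."""
    prefix, sep, local = term.partition(":")
    if prefix != "iati" or not sep:
        raise ValueError(f"unsupported term: {term}")
    return f"<{IATI}{local}>"


def to_nquad(subject, predicate, obj, graph):
    """Serialize one statement plus its named graph as an N-Quads line."""
    return " ".join(expand(t) for t in (subject, predicate, obj, graph)) + " ."


ACTIVITY = "iati:activity/GB-CHC-285776-CHA024"
TRANSACTION = ACTIVITY + "/transaction/42737"
GRAPH = "iati:graph/dataset/example-file"  # hypothetical graph name for the source file

quads = [
    to_nquad(ACTIVITY, "iati:activity-transaction", TRANSACTION, GRAPH),
    to_nquad(TRANSACTION, "iati:transaction-tied-status", "iati:codelist/TiedStatus/5", GRAPH),
]
print("\n".join(quads))
```

Keeping one graph per source file means a whole file can be reloaded or removed without touching statements that came from other publishers.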
83. Linked Data model - Triple store
• Triples loaded into a ClioPatria triple store:
o http://semanticweb.cs.vu.nl/iati/
o SPARQL endpoint
– Dereferenceable URIs (http://purl.org/collections/iati/codelist/Sector/11420)
• Total number of triples: 36,629,017
• Total number of named graphs: 4,790
o Largest activities graph is UNOPS, containing 1,231,896 triples
• Loading all data into the triple store takes approximately 30 minutes.
RDFLib
Python RDF/Turtle
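As an illustration of how such a store could be queried, the sketch below builds a SPARQL Protocol GET request that asks for the ten largest named graphs. It only constructs the URL and sends nothing. The `/sparql/` path appended to the store address is an assumption, and the endpoint may no longer be online.

```python
from urllib.parse import urlencode

# Assumed endpoint path under the store address given on the slide; may be offline.
ENDPOINT = "http://semanticweb.cs.vu.nl/iati/sparql/"

# Count the triples in each named graph and list the ten largest graphs.
QUERY = """
SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples)
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could be fetched with any HTTP client, typically with an `Accept` header requesting a SPARQL results format such as JSON.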
84. Linking datasets - Approach
1. In total, how much does a given country receive in aid?
2. A comparative index of aid versus the Human Development Index.
3. What is the geographic location of a project? How much aid went to a given
province, constituency or village?
o Is the aid spent in places where the need is highest? Is it well distributed
across the country?
o Can we attribute sub-national breakdowns for aid so we can see how much
goes to different parts of recipient countries?
4. How does violent conflict in recipient countries affect aid activities?
5. How does aid spending as registered in the IATI standard compare to World Bank
indicators?
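The first question, total aid received per country, comes down to grouping transactions by recipient and summing the amounts. The sketch below shows that aggregation on made-up records; the country codes and amounts are invented purely for illustration and are not IATI figures.

```python
from collections import defaultdict

# Made-up example records: (recipient country code, transaction amount).
transactions = [("ML", 1200.0), ("BF", 800.0), ("ML", 300.0)]


def total_aid_by_country(records):
    """Sum transaction amounts per recipient country."""
    totals = defaultdict(float)
    for country, amount in records:
        totals[country] += amount
    return dict(totals)


totals = total_aid_by_country(transactions)
for country in sorted(totals):
    print(country, totals[country])
```

Against a triple store the same grouping would normally be expressed as a SPARQL aggregate query; the Python version only shows the logic.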
86. Linking Data applications - Approach
1. In total, how much does a given country receive in aid?
2. A comparative index of aid versus the Human Development Index.
3. What is the geographic location of a project? How much aid went to a given
province, constituency or village?
o Is the aid spent in places where the need is highest? Is it well distributed
across the country?
o Can we attribute sub-national breakdowns for aid so we can see how much
goes to different parts of recipient countries?
4. How does violent conflict in recipient countries affect aid activities?
5. How does aid spending as registered in the IATI standard compare to World Bank
indicators?
88. 2. A comparative index of aid versus the Human Development Index.
http://iati2lod.appspot.com/
89. http://iati2lod.appspot.com/
4. How does violent conflict in recipient countries affect aid activities?
5. How does aid spending as registered in the IATI standard compare to World
Bank indicators?
92. Links to DBpedia
IDS: document 0001 Theme: "Food Security"
DBpedia: "Food Security"
Analysis of approaches to understanding and addressing food security issues; examination of the structural causes of food insecurity and different policy responses
Theme: "Food aid emergencies"
Person: "David Pimentel"
Organisation: "FAO"
"Voedselzekerheid"@NL
93. Links to IATI
IDS: document 0003 Theme: "Higher education"
IATI Sector: "Higher Education"
Theme: Education
Organisation: UN Habitat
Activity: Multi donor fund to support civil society in democracy related issues
Degree and diploma programmes at universities, colleges and polytechnics; scholarships.
94. Linked Data for Landportal.info
[M.Sc. thesis by Alan Chavoshe]
• The Land Portal is an easy-access, easy-to-use platform to share land-related information, to monitor trends, and to identify information gaps to promote effective and sustainable land governance.
98. Take home
• Knowledge sharing is a tool for development
• Linked Data is well-suited because of
– Language- and interface-agnostic characteristics
– Decentralizability
– Reusability outside of original context
• Downscaling
– Interface
– Infrastructure
– Relevancy
Img: flickr/TomJByrne
99. What we need from you?
• Data
• Cases
– Transparency, Governance, Democracy
– Economic development, Healthcare
• Reflection
– Ethics of ICT4D
• Open Data
• Linked Data
Img: flickr/wetwebwork
- More affiliations and who I am
Choice point
Martin Murillo
We all love the Web, and all of us appreciate the influence it has had on our social, political and economic lives. This empowerment of businesses, people and societies through the sharing of knowledge is reflected in the rapid growth of the Internet and the World Wide Web.
4.5 billion people are currently not connected to the Web.
Too many pics
Knowledge sharing. Information exchange
Spice this up
Relate this to w4ra
Information sharing needs
Arrange icons properly
Actually built and running
using voice technologies we make the benefits of the Web available to those with simple mobile phones.
Spice this up
Linked data is a good fit
Multi-stakeholder # IATI history # Gathered by IATI registry # XML
XML per element # Conclusions # Codelists
Why IATI as Linked Data? # 1. Reusing standard vocabularies # 2. Additional data from external datasets # Show added value; 3 parts
Requirements engineering approach # Experts and users from the field # Two main requirements
IATI was on the Web of Data, but no longer. However… they do have an API, so…
Sahel. Governments in the African Sahel region have recorded thousands of pages of data about rainfall and crop harvests in their region. This data is very useful when aggregated over multiple regions for analysis supporting decisions in re-greening initiatives. For this goal, low-resolution digital scans have been made of handwritten documents containing tabular as well as textual data. The data in these documents is to be converted into structured digital data. Automated digitization techniques are error-prone and require a lot of configuration. Although crowdsourcing has been used for digitizing handwritten documents (e.g. [14]), this specific task is fairly complex in the sense that (i) the semantics of the tables is often not easily understood; and (ii) decoding the handwriting requires specific language and domain knowledge (e.g., familiarity with the regional geography and villages). We expect that the level of quality that the faceless crowd can provide is not sufficient for the goals of the regreening initiative and that this task is therefore very well suited for nichesourcing. The niche being targeted is the so-called African Diaspora: African expatriates who now reside in better connected parts of the world. Members of the diaspora are strongly attached to local issues in their region of origin. This intrinsic motivation can be harnessed through nichesourcing. Existing Web communities set up by members of the Diaspora (e.g., on Facebook) can be addressed. The network connections can be used to distribute human computation tasks as well as reinforce motivation through reputation. Furthermore, the domain knowledge of the niche members (including the local language and names of villages) may yield a higher level of quality than could be obtained from a general crowd.
Also Applications!
Emphasis that SE guy records in multiple languages
That there is a user profile, which includes language
That farmers get the info in their language
That they can respond and that it is stored in a db