13. Web Standards
25/05/16 Women in Digital
https://www.w3.org/TR/tr-date-stds.html
• Web Design and Applications: HTML, CSS, SVG, Ajax, and other technologies for Web Applications
• Web Architecture: URIs and HTTP
• Semantic Web: RDF, SPARQL, OWL, and SKOS
• XML Technology: XML, XML Namespaces, XML Schema, XSLT, Efficient XML Interchange (EXI), and other related standards
• Web of Services: HTTP, XML, SOAP, WSDL, SPARQL, and others
• Browsers and Authoring Tools
14. Web Generations
1996: 250 000 sites, 45 million users
2006: 80 million sites, 1 billion users
2016: 800 million sites, 3 billion users
2026: 3.9 billion users (source: http://www.quantumrun.com/future-timeline/2026)
Timeline labels: Files, Documents; Keyword Search; Social Networks; User Generated Content; Semantic Search; Streaming; Ubiquity; Natural Language Search; IoT; Personalized Content
15. Rows of servers inside a Facebook data center in North Carolina. Photo by Rich Miller
19. Big Data in Use
• Customer experience
• Brand perception
• Target segment identification
• Demand Forecast
• Supply Chain
• Product Design
• Risk management
• Fraud detection
• Research
• Real time data
• Health Care
• Diagnosis
Icons made by Freepik from www.flaticon.com licensed by Creative Commons BY 3.0
20. Open Data
Open means anyone can freely access, use,
modify, and share for any purpose (subject,
at most, to requirements that preserve
provenance and openness)
Source: http://opendefinition.org/od/2.1/en/
21. Open Government Data
• Transparency
• Public service improvement
• Economic and Social Value
• Open data != Free data
Source: "Open Data: The Next Phase in the Technology Revolution", Casey Coleman, August 27, 2013
23. Where is the data?
• DBpedia
• Government portals
o UK, US
o https://opendata.swiss launched last year (2015)
o Open Data Barometer 3rd edition (2016)
• The World Bank
• European Data Portal
• Google Public Data Directory
• Data Portals search, DataHub by Open Knowledge Foundation
• CKAN instances: http://ckan.org/instances/#
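Portals built on CKAN expose a JSON action API, so their catalogues can be searched programmatically. The following is a minimal sketch, assuming a portal that serves the standard `package_search` action under `/api/3/action/`. The portal URL `https://portal.example.org`, the helper names, and the sample response are hypothetical, and no network request is made.

```python
import json
from urllib.parse import urlencode


def build_search_url(portal_base, query, rows=5):
    """Build a package_search URL for a portal assumed to run CKAN."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [d.get("title", d.get("name", "")) for d in payload["result"]["results"]]


# Hypothetical portal; replace with a real CKAN instance before fetching.
url = build_search_url("https://portal.example.org/", "air quality")

# Hand-written sample in the general shape of a package_search response.
sample_response = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "air-quality-2015", "title": "Air quality 2015"},
            {"name": "air-stations", "title": "Monitoring stations"},
        ],
    },
})
titles = dataset_titles(sample_response)
print(url)
print(titles)
```

To query a live portal, the URL could be fetched with `urllib.request.urlopen` and the response body passed to `dataset_titles`.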
28. Linked [Open] Data
The Semantic Web isn't just about putting
data on the web. It is about making links, so
that a person or machine can explore the
web of data.
Tim Berners-Lee
https://www.w3.org/DesignIssues/LinkedData.html
30. Querying the web
SPARQL-LD endpoint:
http://users.ics.forth.gr/~fafalios/
Recipe:
• SPARQL endpoint
o Federated query
• Annotated Website
Ref:
P. Fafalios and Y. Tzitzikas, SPARQL-LD: A SPARQL Extension for Fetching and
Querying Linked Data,14th International Semantic Web Conference (demo paper),
ISWC 2015, Bethlehem, Pennsylvania, USA, October 11-15, 2015.
32. SPARQL services
# Federated query: each SERVICE block is evaluated against a different source
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?authorName
       (count(DISTINCT ?paper) AS ?numOfPapers)
       (count(DISTINCT ?series) AS ?numOfDiffConfs)
WHERE {
  # Author URIs listed as creators at the first source
  SERVICE <http://users.ics.forth.gr/~fafalios> {
    SELECT DISTINCT ?authorURI WHERE {
      ?p <http://purl.org/dc/terms/creator> ?authorURI } }
  # Papers by those authors and their conference series, from the DBLP endpoint
  SERVICE <http://dblp.l3s.de/d2r/sparql> {
    ?p2 <http://purl.org/dc/elements/1.1/creator> ?authorURI .
    ?p2 <http://swrc.ontoware.org/ontology#series> ?series }
  # The author URI itself is used as the source for names and papers
  SERVICE ?authorURI {
    ?author foaf:name ?authorName .
    ?paper <http://purl.org/dc/elements/1.1/creator> ?authorURI }
} GROUP BY ?authorName ORDER BY DESC(?numOfPapers)
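A query like this is normally sent to an endpoint over HTTP, with the query text in a `query` parameter and the results returned as JSON. The following is a minimal sketch, assuming an endpoint that follows the SPARQL 1.1 Protocol. The endpoint URL `http://example.org/sparql`, the helper names, the short query, and the sample result are hypothetical. The slide's query additionally needs an endpoint that implements the SPARQL-LD extension. No request is sent here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Prepare (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(results_text):
    """Flatten a SPARQL JSON result set into a list of plain dicts."""
    data = json.loads(results_text)
    variables = data["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


# Hypothetical endpoint and a deliberately small query.
query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
request = build_sparql_request("http://example.org/sparql", query)

# Hand-written sample in the SPARQL JSON results format.
sample_results = json.dumps({
    "head": {"vars": ["authorName", "numOfPapers"]},
    "results": {"bindings": [
        {"authorName": {"type": "literal", "value": "A. Author"},
         "numOfPapers": {"type": "literal", "value": "3"}},
    ]},
})
rows = rows_from_results(sample_results)
print(request.full_url)
print(rows)
```

Passing `request` to `urllib.request.urlopen` would execute the query against a real endpoint.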
Editor's Notes
Say something about me.
Vint Cerf co-founded the Internet Society (ISOC). ISOC now provides leadership in Internet-related standards, education, and policy.
It is dedicated to ensuring the open development, evolution, and use of the Internet for the benefit of people throughout the world. ISOC continues to serve as the organizational home of the Internet Engineering Task Force (IETF).
Tim Berners-Lee founded the W3C together with MIT. W3C primarily pursues its mission through the creation of Web standards and guidelines designed to ensure long-term growth for the Web.
User-generated content: blogs, tags.
How big is the data on the Internet?
Internet traffic is the flow of data across the Internet. Because of the distributed nature of the Internet, there is no single point of measurement for total Internet traffic.
Big data:
Digitalization of services
Customer activity, analysis
Text -> Google correction
Gestures
Analysis of images and video
Likes, dislikes
Internet of Things
Geospatial data
Social media
Big data concerns Volume, Velocity, and Variety; Veracity and Value are also mentioned.
One-time processing is probably not a big data problem…
All this data cannot be stored on a single computer, nor can it be processed and analyzed there.
How to deal with big data = be able to add resources (computers) on the fly -> scale (up vs. out)
Distributed Data
Hadoop distributed storage (HDFS)
NoSQL
Distributed computing
* MapReduce, Spark, Kafka
Each node applies the computation (map) to its own share of the data, and the algorithm then combines (reduces) the partial results into a summary.
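The map and reduce steps can be sketched on a single machine. This is a simplified illustration, assuming a word-count task and three hypothetical text chunks that stand in for data held on separate nodes. It omits the shuffle step and the parallelism of a real MapReduce framework.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words in one chunk, as a single node would."""
    return Counter(chunk.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# Hypothetical chunks; in a cluster each would live on a different node.
chunks = ["big data big", "data on the web", "web of data"]

# The map calls are independent, so a framework could run them in parallel.
partial_counts = [map_chunk(chunk) for chunk in chunks]

# The reduce step folds the partial results into one summary.
total = reduce(reduce_counts, partial_counts, Counter())
print(total.most_common(3))
```

Because each map call touches only its own chunk, adding more nodes (scaling out) lets more chunks be processed at the same time.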
Costs and benefits of relying on big data
US health care: $300 billion USD/year from increased efficiency and service quality (McKinsey)
Europe: $149 billion USD in government administration costs
A lot of investment in Big data projects
Technology
User experience
Internet services (Google)-
Smart Cities
Jobs generation
Computer-Science related jobs
Innovation: being able to deal with big data has opened new doors
Data-based services
Mobile apps
Data Science
Industry
Increasingly connected to the Internet in order to open up new dimensions in production efficiency.
Industry 4.0 is used to refer to the fourth industrial revolution, following those of mechanization, industrialization, and automation.
Health
Evidence based diagnosis
Genome research
Success stories:
Real-time production
Uber real time rate calculation
Forecast:
Shortage of big data talent?
New positions such as chief data officers who help to lead
Machine learning momentum: new tools to use algorithms with big data (computing power, deep learning)
Spark: consuming streams, machine learning
Data as a service business models
The push to open data began in the 1950s with the concept of open scientific data, when the World Data Center system was formed to share astronomical and geophysical data.
The International Council of Scientific Unions (now the International Council for Science) established several World Data Centers to minimize the risk of data loss and to maximize data accessibility, further recommending in 1955 that data be made available in machine-readable form.
Other movements emerged: open source, open hardware, open content, and open access.
Wikipedia launched (as wikipedia.com) on Monday 15 January 2001.
In 2001, Lawrence Lessig founded Creative Commons.
Tim Berners-Lee (TED 2009): “We want raw data, now!”
Lessig was a candidate for the Democratic Party's nomination for President of the United States in the 2016 U.S. presidential election, but withdrew before the primaries.
Benefits
Government spending
Public service improvement (mobility, education, health)
Economic and Social Value
Open Data is not necessarily free -> business opportunities from opening the data, e.g. charging users for service maintenance and quality
Evidence of the quantitative impact of re-use of Open Data is measured by means of key indicators:
Direct benefits are monetized benefits that are realized in market transactions in the form of revenues and Gross Value Added (GVA), the number of jobs involved in producing a service or product, and cost savings.
Indirect economic benefits are, for example, new goods and services, time savings for users of applications using Open Data, knowledge economy growth, increased efficiency in public services, and growth of related markets.
The European Commission, within the context of the launch of the […], wished to obtain further evidence of the quantitative impact of re-use of Public Data Resources. A study was carried out to collect, assess, and aggregate all economic evidence in order to forecast the benefits of the re-use of Open Data for all 28 European Member States and the European Free Trade Association (EFTA) countries, further referred to as EU 28+, for the period 2016-2020.
Privacy
New York City Taxi and Limousine Commission dataset: it contains details about every taxi ride (yellow cabs) in New York in 2013, including the pickup and drop-off times, locations, fare and tip amounts, as well as anonymized (hashed) versions of the taxi's license and medallion numbers.
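Hashing an identifier does not by itself anonymize it when the set of possible identifiers is small, because every candidate can be hashed and compared. The following is a minimal sketch of that risk. The ID format (digit, letter, digit, digit), the example ID `7B42`, and the use of unsalted SHA-256 are all assumptions made for illustration, not details taken from the dataset.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def hash_id(identifier):
    """Unsalted hash of an identifier (hash function chosen for illustration)."""
    return hashlib.sha256(identifier.encode("ascii")).hexdigest()


def build_lookup():
    """Hash every possible ID of the hypothetical form digit-letter-digit-digit."""
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        candidate = f"{d1}{letter}{d2}{d3}"
        table[hash_id(candidate)] = candidate
    return table


# 10 * 26 * 10 * 10 = 26,000 candidates: small enough to enumerate quickly.
lookup = build_lookup()

# What a "hashed" column would contain for the hypothetical ID 7B42.
published_value = hash_id("7B42")

# Looking the published hash up in the table recovers the original ID.
recovered = lookup[published_value]
print(len(lookup), recovered)
```

A keyed hash with a secret key, or random replacement identifiers, would avoid this particular lookup attack.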
Open Licenses
Public domain license: no restrictions at all (technically, the rights owner has waived their rights to the content or data)
CC0, PDDL
Attribution license: you must give attribution to the publisher
CC-BY, ODC-BY
Attribution & share-alike license: you must give attribution and share any derived content or data under the same licence
CC-BY-SA, ODbL
Importance of standards
Schema.org
owl:sameAs or other kinds of links
DBpedia: Towards a Public Data Infrastructure for a Large, Multilingual, Semantic Knowledge Graph
LOD cloud diagram by contributors to the Linking Open Data community project and other individuals and organisations. It is based on metadata collected and curated by contributors to the Data Hub as well as on metadata extracted from a crawl of the Linked Data web conducted in April 2014.