4. AI for Big Data
The unprecedented availability of massive volumes of data in
many different domains has both revolutionized AI and
turned it into one of the most important technologies in today’s
landscape, leading to the development of highly successful
techniques that can impose structure on, and extract value from,
very large collections of data.
6. Research in Scholarly Analytics
The SKM3 team produces innovative approaches leveraging large-scale data mining, semantic
technologies, machine learning, and visual analytics both to extract understanding and value from
large collections of scholarly data and also to provide services to a variety of stakeholders.
http://skm.kmi.open.ac.uk
8. ACM and other similar classifications
• Expensive, long-drawn process
• 14 years between 1998 and 2012 releases
• Becomes obsolete very quickly, unable to cover latest trends
• Validation is an issue
• It is a totally manual process and necessarily the result reflects individual biases and
viewpoints – no ground truth
• Mostly too high-level
• Does not cover fine-grained topics, which is where the action tends to be
• e.g., only 84 topics under AI, while our analysis has identified about 1800 distinct
research areas in the AI field
• Choice of topics and relations between topics are debatable
• Semantic Web is not included (but “SW Languages” is!)
• The area of Ontologies is under Information Retrieval
9. Automatic generation of taxonomies of research areas
[Figure: pipeline diagram. Labels, in order: Very Large Publication Corpus, drawn as a network of Keywords (K), Authors (A), Organizations (O) and Venues (V); Linked Data Cloud; Statistical Topic Identification; Candidate Topics; Topic Validation; Validated Topics; Statistics/ML; SubTopicOf Relations; Equivalence Relations (between keywords K1 and K2).]
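One common way to derive SubTopicOf relations from a publication corpus is a co-occurrence subsumption heuristic: keyword a is treated as narrower than keyword b when most papers tagged with a are also tagged with b, but not the reverse. The sketch below illustrates that idea only; it is not the team's actual algorithm, and the function name, the 0.8 threshold and the toy papers are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def infer_subtopic_relations(papers, threshold=0.8):
    """Return a set of (narrower, broader) keyword pairs.

    papers: iterable of keyword sets, one set per paper (toy input format).
    threshold: hypothetical cut-off on P(broader | narrower).
    """
    freq = Counter()   # number of papers per keyword
    cooc = Counter()   # number of papers per unordered keyword pair
    for keywords in papers:
        kws = set(keywords)
        freq.update(kws)
        for a, b in combinations(sorted(kws), 2):
            cooc[(a, b)] += 1

    relations = set()
    for (a, b), n in cooc.items():
        p_b_given_a = n / freq[a]
        p_a_given_b = n / freq[b]
        # a is subsumed by b if b almost always accompanies a, not vice versa
        if p_b_given_a >= threshold and p_a_given_b < p_b_given_a:
            relations.add((a, b))
        elif p_a_given_b >= threshold and p_b_given_a < p_a_given_b:
            relations.add((b, a))
    return relations


# Invented toy corpus, for illustration only
papers = [
    {"semantic web", "ontologies"},
    {"semantic web", "linked data"},
    {"semantic web", "linked data", "ontologies"},
    {"semantic web"},
    {"machine learning"},
]
relations = infer_subtopic_relations(papers)
```

On this toy corpus both "ontologies" and "linked data" come out as narrower than "semantic web", while no relation is inferred between "ontologies" and "linked data" because they co-occur in only half of their papers.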
10. The Computer Science Ontology
The Computer Science Ontology (CSO) is a large-scale, automatically generated ontology of
research areas. It provides the largest research taxonomy in the field of Computer Science,
including about 14K topics and 163K semantic relationships.
http://cso.kmi.open.ac.uk/
11. Automatic Classification of Publications
• The CSO Classifier is an unsupervised approach for automatically classifying documents according
to the Computer Science Ontology. It is currently being used to annotate the publications of
Springer Nature and Dimensions.
Salatino et al. (2019) The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles.
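To make the idea of ontology-driven, unsupervised classification concrete, here is a deliberately simplified sketch: it matches word n-grams in a text against topic labels and then adds each matched topic's broader topics. This is not the CSO Classifier itself (see the paper above for the actual method); the tiny ontology, its hierarchy and the function names are hypothetical.

```python
import re


def ngrams(tokens, max_n=3):
    """Yield all word n-grams of length 1..max_n as space-joined strings."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def classify(text, ontology, max_n=3):
    """Return topics whose label occurs in the text, plus their ancestors.

    ontology: dict mapping a topic label to its broader topic (or None).
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found = {g for g in ngrams(tokens, max_n) if g in ontology}
    result = set(found)
    for topic in found:
        parent = ontology.get(topic)
        # walk up the hierarchy until the root or an already-added topic
        while parent is not None and parent not in result:
            result.add(parent)
            parent = ontology.get(parent)
    return result


# Hypothetical toy ontology: topic -> broader topic
toy_ontology = {
    "computer science": None,
    "artificial intelligence": "computer science",
    "machine learning": "artificial intelligence",
    "semantic web": "artificial intelligence",
    "ontologies": "semantic web",
}
topics = classify("We apply machine learning to Semantic Web data.", toy_ontology)
```

No training data is needed: the ontology supplies both the vocabulary to match and the hierarchy used to enrich the result.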
13. Business Value
About 9M additional downloads thanks to STM.
[Figure: Average number of yearly downloads for books in SpringerLink, 2009 to 2018 (y-axis 0 to 20,000). Series: downloads (CS Proceedings); expected downloads (CS Proceedings); downloads (CS Proceedings) with STM; downloads (other books in CS); downloads (overall).]
15.
«[…] transition from one paradigm
to another via revolution is the usual
developmental pattern of mature
science»
Thomas Kuhn
The Structure of Scientific Revolutions
17. Approach
• Analysis and discovery of patterns which may
indicate the emergence of new topics
• For example, before the Semantic Web emerged explicitly
as a research area, we could identify interesting new
dynamics involving authors from different research areas
such as knowledge representation, agent systems,
hypertext and databases.
• Recognizing dynamics that appear to match the
generic patterns to identify emerging trends
18. Focus on collaboration between research communities
[Figure: topics T1, T2 and T3 shown in Year n and again in Year n+1.]
19. • The creation of novel topics is anticipated by a significant increase in
the pace of collaboration in the areas that are associated with the
generation of the new topic and therefore in the density of that
portion of the topological space
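The notion of increasing density can be illustrated with a small sketch: for a fixed set of topics, compute the fraction of topic pairs that are linked by collaboration in each year, then look at the year-on-year change. This is only an illustration under assumed inputs, not the method behind these slides; the function names, the edge-list format and the toy numbers are invented.

```python
def density(topics, edges):
    """Fraction of possible topic pairs that are linked by at least one edge."""
    topics = set(topics)
    possible = len(topics) * (len(topics) - 1) / 2
    if possible == 0:
        return 0.0
    # keep only edges between two distinct topics of interest, ignoring order
    linked = {
        frozenset(e) for e in edges
        if len(set(e)) == 2 and set(e) <= topics
    }
    return len(linked) / possible


def density_growth(topics, edges_by_year):
    """Map each year (except the first) to its change in density."""
    years = sorted(edges_by_year)
    values = [density(topics, edges_by_year[y]) for y in years]
    return {years[i]: values[i] - values[i - 1] for i in range(1, len(years))}


# Invented toy data: years are abstract indices, edges are collaborations
topics = {"knowledge representation", "agent systems", "hypertext", "databases"}
edges_by_year = {
    1: [("knowledge representation", "databases")],
    2: [("knowledge representation", "databases"),
        ("knowledge representation", "agent systems"),
        ("hypertext", "databases")],
    3: [("knowledge representation", "databases"),
        ("knowledge representation", "agent systems"),
        ("hypertext", "databases"),
        ("hypertext", "agent systems"),
        ("agent systems", "databases")],
}
growth = density_growth(topics, edges_by_year)
```

A sustained positive growth over consecutive years in one portion of the topic network is the kind of signal the slide describes; how to threshold it is left open here.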
20. Output: topics, papers, authors
Influential Authors
W. Bruce Croft,
Dieter Fensel,
Dan Suciu,
William W. Cohen,
Berthier Ribeiro-Neto,
Clement T. Yu,
James Allan,
Justin Zobel,
Dragomir R. Radev,
Victor Vianu
Influential Papers
- A Sheth et al. "Managing semantic content for the Web" (2002)
- RWP Luk et al. "A survey in indexing and searching XML documents" (2002)
- J Kahan et al. "Annotea: An open RDF infrastructure for shared Web
annotations" (2002)
- R Manmatha et al. "Modeling score distributions for combining the outputs of
search engines" (2001)
- S Dagtas et al. "Models for motion-based video indexing and retrieval" (2000)
Evolutionary network in 2002, reflecting the
emergence of Semantic Search the following year
22. MK:Smart
• A large collaborative project (19 partners - £17.2M budget) partly
funded by the HEFCE’s Catalyst Fund
• Aim of the fund is to enhance higher education’s contribution to economic
growth
• “we are seeking to support developments that stimulate the capabilities of HE teaching and
research to deliver sustainable economic impact across the nation”
• Main objective of the project:
To put in place an integrated innovation and support programme, which
will leverage large-scale city data to provide solutions to the key demand
problems and will also provide a sustainable technological infrastructure
to accelerate innovation and economic growth.
23.
24. • A multi-award-winning infrastructure
supporting the acquisition and
management of both static (e.g., DBs,
files) and dynamic (e.g., sensor feeds)
data sources
• A data eco-system, where private, open
and commercial data sources co-exist in
the same infrastructure
• A platform for Open Innovation,
providing developers with APIs and tools
to facilitate the engineering of
data-intensive applications
25. Infrastructure Layers
[Figure: layered architecture diagram. Labels recovered from the figure:]
• APPLICATIONS: Smart Parking, Driver Assist, Waste Management, Tracing Assets: BT Trace, Smart Street Lighting
• DATA HUB
• CONNECTIVITY: LoRa, MESH, UNB
• SENSORS: Light Sensor, Bin Usage, Parking Sensor, Vehicle Telemetry, RFID Trace, Soil Moisture
• Other labels: Analytics, Dev Environment, IT Services, Information Spine
26. Data cataloguing and governance
Licenses are described as machine-readable policies
{
"global:homepage": [
"https://datahub.mksmart.org/policy/open-governme
"https://datahub.beta.mksmart.org/policy/open-gove
],
"global:landingPage": [
"http://data.mksmart.org/entity/thing/www:uri/http
government-license/",
"http://data.mksmart.org/entity/thing/www:uri/https:/
n-government-license/"
],
"global:api": [
"https://datahub.beta.mksmart.org/data-catalogue-a
government-license",
"https://datahub.mksmart.org/data-catalogue-api/?a
license"
],
"global:name": ["open-government-license"],
"global:description": [""],
"global:title": ["Open Government License"],
"global:permission": [
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
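Because a license is described as structured data, an application can check a permission programmatically before using a dataset. The sketch below shows one way this might look. It reuses the key names visible on the slide ("global:name", "global:title", "global:permission"), but the permission URIs are placeholders (the real ones are truncated on the slide) and the helper functions are hypothetical, not part of the Data Hub API.

```python
import json

# Placeholder policy: key names follow the slide, URIs are invented
POLICY_JSON = """
{
  "global:name": ["open-government-license"],
  "global:title": ["Open Government License"],
  "global:permission": [
    "http://example.org/permission/copy",
    "http://example.org/permission/distribute"
  ]
}
"""


def load_policy(text):
    """Parse a policy document from its JSON text."""
    return json.loads(text)


def is_permitted(policy, action_uri):
    """Return True if the policy lists the given permission URI."""
    return action_uri in policy.get("global:permission", [])


policy = load_policy(POLICY_JSON)
can_copy = is_permitted(policy, "http://example.org/permission/copy")
can_sell = is_permitted(policy, "http://example.org/permission/sell")
```

Here `can_copy` is true and `can_sell` is false, since only the first URI appears in the policy's permission list.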
27.
28. • 6 workshops and 20 roadshows.
• OurMK platform and MK Citizen Lab
• Citizen Ideas Competition
• 11 projects already funded
• Rated one of the “Top 5 crowdsourcing initiatives in
government: better engagement with citizens”
• Franzi’s Food passport
• Michael’s Domestic solar
• Lindsey’s Pop-up shop
• Ros’s Breastfeeding app
• Zi’s Parent Computing Literacy
• Les’s Allotment Borehole Feasibility
• Padma’s Centre MK Beacon Navigation
• Paul’s Redways route Recordings
• Eric’s Redways Reporting App
• MKPAA’s Beat the Redways game
• MK Academy’s Water Awareness Week
30. • £2m two-year project
• Promoting research and innovation in
the digital economy
• MK now 2nd highest economy
outside London for tech and
digital SMEs.
• Addresses data science skill gap in
SMEs
• Focus on South East Midlands LEP
region
• Leverages and strengthens SME
innovation network created in
MK:Smart
• Innovative approach focused on
customised and integrated
business/tech support
• Advisory Board includes
MK Council, NatWest, SEMLEP
Target: 50 new propositions/prototypes
[Diagram labels: Grants; MK Data Hub; Lean skills training; Tech Design & Prototype; Evaluation; Business & innovation networks; Our MK – citizen innovation platform]
31. Robots in a smart city
• Current developments in smart cities focus
primarily on sensor deployment and data collection
and analysis to optimize services.
• No integration of robots in smart city infrastructure,
even though autonomous robots already operate in
urban scenarios
• Advantages from integrating robots in a smart city
infrastructure:
• Robots can make use of data coming from a
variety of sources (hence becoming smarter)
• Robots can act as mobile sensors (hence
reducing cost of massive sensor deployment)
• Robots can be opportunistically deployed to
deal with exceptional events – e.g.,
emergencies
32.
33. Hans, the Health and Safety Inspector
• Hans is aware of the Health and Safety regulations at the OU
• It is expected to detect H&S violations autonomously
• It is also expected to fulfil additional lab supervision tasks, e.g., checking occupancy of meeting
rooms
• This requires integration with KMi’s room booking system
• Hans needs object recognition ability, integration with KMi Systems, specialized task knowledge,
and integration with external knowledge bases (e.g., ConceptNet, WordNet, Visual Genome)
34.
35. 1st International Competition on Robots in Smart Cities
SciRoc is an EU-funded project
whose aim is to bring robotic
tournaments into the context of
smart cities.
The first international
competition took place in
Milton Keynes, on 18-21
September 2019.
The challenge comprises 5
episodes, testing Human-Robot
Interaction, Navigation,
Manipulation, Autonomous
Flying, Humanoid Robotics
and Interaction with smart city
infrastructure.