Research Directions in Intelligent Systems and Data Science
Prof Enrico Motta
Knowledge Media Institute
The Open University
Making Sense of Scholarly Data
AI for Big Data
The unprecedented availability of massive volumes of data in
many different domains has both revolutionized AI and
turned it into one of the most important technologies in today’s
landscape, leading to the development of highly successful
techniques that can impose structure and extract value from
very large collections of data.
Big data in the scholarly domain
Research in Scholarly Analytics
The SKM3 team produces innovative approaches leveraging large-scale data mining, semantic
technologies, machine learning, and visual analytics both to extract understanding and value from
large collections of scholarly data and also to provide services to a variety of stakeholders.
http://skm.kmi.open.ac.uk
Mapping the Space of Research in Computer Science
ACM and other similar classifications
• Expensive, long-drawn-out process
  • 14 years between the 1998 and 2012 releases
• Becomes obsolete very quickly, unable to cover latest trends
• Validation is an issue
  • It is a totally manual process, and the result necessarily reflects individual biases and
    viewpoints – no ground truth
• Mostly too high-level
  • Does not cover fine-grained topics, which is where the action tends to be
  • e.g., only 84 topics under AI, while our analysis has identified about 1800 distinct
    research areas in the AI field
• Choice of topics and relations between topics are debatable
  • Semantic Web is not included (but “SW Languages” is!)
  • The area of Ontologies is under Information Retrieval
[Figure: taxonomy generation pipeline. Inputs: Very Large Publication Corpus (Keywords, Authors, Organizations, Venues) and Linked Data Cloud. Steps: Statistical Topic Identification, Candidate Topics, Topic Validation, Validated Topics, Statistics/ML. Outputs: SubTopicOf Relations, Equivalence Relations.]
Automatic generation of taxonomies of research areas
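The slides do not say how the SubTopicOf relations are computed. As a rough illustration only, the sketch below uses a generic co-occurrence subsumption heuristic: keyword a is proposed as a sub-topic of b when most papers mentioning a also mention b, but not the reverse. The function name, the 0.8 threshold and the toy corpus are assumptions made for this example, not the team's actual method, and equivalence relations are not covered.

```python
from collections import defaultdict
from itertools import combinations

def infer_subtopic_relations(papers, min_cond=0.8):
    """Return sorted (child, parent) pairs from keyword co-occurrence.

    `papers` is a list of keyword sets, one per paper.
    `min_cond` is an assumed threshold on conditional frequency.
    """
    count = defaultdict(int)     # number of papers per keyword
    together = defaultdict(int)  # number of papers per keyword pair
    for keywords in papers:
        for k in keywords:
            count[k] += 1
        for a, b in combinations(sorted(keywords), 2):
            together[(a, b)] += 1

    relations = []
    for (a, b), n in together.items():
        p_b_given_a = n / count[a]  # share of a's papers that mention b
        p_a_given_b = n / count[b]  # share of b's papers that mention a
        if p_b_given_a >= min_cond and p_a_given_b < min_cond:
            relations.append((a, b))  # a looks narrower than b
        elif p_a_given_b >= min_cond and p_b_given_a < min_cond:
            relations.append((b, a))  # b looks narrower than a
    return sorted(relations)

# Toy corpus, invented for illustration only.
papers = [
    {"semantic web", "ontologies"},
    {"semantic web", "linked data"},
    {"semantic web", "ontologies", "linked data"},
    {"semantic web"},
    {"semantic web", "rdf"},
]
relations = infer_subtopic_relations(papers)
print(relations)
```

On a real corpus, a heuristic like this would only produce candidate relations, which would still need the kind of validation step named in the pipeline.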
The Computer Science Ontology
The Computer Science Ontology (CSO) is a large-scale, automatically generated ontology of
research areas. It provides the largest research taxonomy in the field of Computer Science,
including about 14K topics and 163K semantic relationships.
http://cso.kmi.open.ac.uk/
Automatic Classification of Publications
• The CSO Classifier is an unsupervised approach for automatically classifying documents according
to the Computer Science Ontology. It is currently being used to annotate the publications of
Springer Nature and Dimensions.
Salatino et al. (2019) The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles.
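To give a feel for what classifying documents against an ontology without training data can look like, here is a minimal sketch. It matches word n-grams in the text against topic labels and then adds the broader topics from the taxonomy. The toy ontology, the function names and the exact-match rule are assumptions for illustration; this is not the CSO Classifier's implementation or API.

```python
import re

# Tiny stand-in for an ontology: topic label -> broader topic (or None).
# These entries are illustrative and are not taken from CSO itself.
TOY_ONTOLOGY = {
    "computer science": None,
    "artificial intelligence": "computer science",
    "machine learning": "artificial intelligence",
    "semantic web": "computer science",
    "ontology": "semantic web",
}

def ngrams(tokens, max_n=3):
    """Yield every word n-gram of length 1..max_n as a string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def classify(text, ontology=TOY_ONTOLOGY):
    """Return topics whose labels occur in the text, plus their broader topics."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = {g for g in ngrams(tokens) if g in ontology}
    enriched = set(found)
    for topic in found:
        parent = ontology[topic]
        while parent is not None:  # walk up the taxonomy
            enriched.add(parent)
            parent = ontology[parent]
    return sorted(enriched)

abstract = "We apply machine learning to build an ontology of research areas."
topics = classify(abstract)
print(topics)
```

No labelled examples are needed because the ontology itself supplies the vocabulary. Exact string matching is the weakest part of the sketch, since it misses plurals and paraphrases.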
Automatic Classification of CS Proceedings at Springer Nature
Business Value
About 9M additional downloads thanks to STM.
[Chart: Average number of yearly downloads for books in SpringerLink, 2009 to 2018. Series: downloads (CS Proceedings); expected downloads (CS Proceedings); downloads (CS Proceedings) with STM; downloads (other books in CS); downloads (overall).]
Augur: Predicting the emergence of new research areas
«[…] transition from one paradigm
to another via revolution is the usual
developmental pattern of mature
science»
Thomas Kuhn
The Structure of Scientific Revolutions
The data…
Approach
• Analysis and discovery of patterns which may
indicate the emergence of new topics
• For example, before the Semantic Web emerged explicitly
as a research area, we could identify new interesting
dynamics involving authors from different research areas
such as knowledge representation, agent systems,
hypertext and databases.
• Recognizing dynamics that appear to match the
generic patterns to identify emerging trends
[Figure: topic networks in Year n and Year n+1, showing topics T1, T2 and T3]
Focus on collaboration between research communities
• The creation of a novel topic is preceded by a significant increase in
the pace of collaboration between the areas associated with the
generation of the new topic, and therefore in the density of that
portion of the topological space
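One simple way to make "density of a portion of the topological space" concrete is graph density over a set of research areas. In the sketch below, two areas count as linked when at least one paper is tagged with both, and density is the share of possible pairs that are linked. The function name, the use of paper tags as a proxy for collaboration, and the toy data (including the years) are assumptions for illustration, not the measure Augur actually uses.

```python
from itertools import combinations

def collaboration_density(papers, topics):
    """Density of the co-occurrence graph restricted to `topics`.

    `papers` is a list of topic sets, one per paper. Two topics are
    linked when at least one paper is tagged with both of them.
    """
    topic_set = set(topics)
    possible = len(topic_set) * (len(topic_set) - 1) // 2
    if possible == 0:
        return 0.0
    linked = {
        pair
        for paper in papers
        for pair in combinations(sorted(paper & topic_set), 2)
    }
    return len(linked) / possible

# Invented toy data: papers tagged with research areas, grouped by year.
area = {"knowledge representation", "agent systems", "hypertext", "databases"}
papers_by_year = {
    1998: [{"knowledge representation", "agent systems"}, {"databases"}],
    1999: [
        {"knowledge representation", "agent systems"},
        {"hypertext", "databases"},
        {"knowledge representation", "databases"},
        {"agent systems", "hypertext"},
    ],
}
density = {
    year: collaboration_density(papers, area)
    for year, papers in papers_by_year.items()
}
print(density)
```

A sharp rise in this value from one year to the next is the kind of signal the slide describes, although a real system would also weight links by the number of collaborations.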
Output: topics, papers, authors
Influential Authors
W. Bruce Croft,
Dieter Fensel,
Dan Suciu,
William W. Cohen,
Berthier Ribeiro-Neto,
Clement T. Yu,
James Allan,
Justin Zobel,
Dragomir R. Radev,
Victor Vianu
Influential Papers
- A Sheth et al. "Managing semantic content for the Web" (2002)
- RWP Luk et al. "A survey in indexing and searching XML documents" (2002)
- J Kahan et al. "Annotea: An open RDF infrastructure for shared Web
annotations" (2002)
- R Manmatha et al. "Modeling score distributions for combining the outputs of
search engines" (2001)
- S Dagtas et al. "Models for motion-based video indexing and retrieval" (2000)
Evolutionary network in 2002, reflecting the
emergence of Semantic Search the following year
Smart Cities and Robotics
MK:Smart
• A large collaborative project (19 partners - £17.2M budget) partly
funded by the HEFCE’s Catalyst Fund
• Aim of the fund is to enhance higher education’s contribution to economic
growth
• “we are seeking to support developments that stimulate the capabilities of HE teaching and
research to deliver sustainable economic impact across the nation”
• Main objective of the project:
To put in place an integrated innovation and support programme, which
will leverage large-scale city data to provide solutions to the key demand
problems and will also provide a sustainable technological infrastructure
to accelerate innovation and economic growth.
• A multi-award-winning infrastructure supporting the acquisition and management of both static (e.g., DBs, files) and dynamic (e.g., sensor feeds) data sources
• A data eco-system, where private, open and commercial data sources co-exist in the same infrastructure
• A platform for Open Innovation, providing developers with APIs and tools to facilitate the engineering of data-intensive applications
Infrastructure Layers
[Figure: infrastructure layers around the DATA HUB]
• APPLICATIONS: Smart Parking, Driver Assist, Waste Management, Tracing Assets: BT Trace, Smart Street Lighting
• CONNECTIVITY: LoRa, MESH, UNB
• SENSORS: Light Sensor, Bin Usage, Parking Sensor, Vehicle Telemetry, RFID Trace, Soil Moisture
• Other labels: Analytics, Dev Environment, IT Services, Information Spine
Data cataloguing and governance
Licenses are described as machine-readable policies:
{
"global:homepage": [
"https://datahub.mksmart.org/policy/open-governme
"https://datahub.beta.mksmart.org/policy/open-gove
],
"global:landingPage": [
"http://data.mksmart.org/entity/thing/www:uri/http
government-license/",
"http://data.mksmart.org/entity/thing/www:uri/https:/
n-government-license/"
],
"global:api": [
"https://datahub.beta.mksmart.org/data-catalogue-a
government-license",
"https://datahub.mksmart.org/data-catalogue-api/?a
license"
],
"global:name": ["open-government-license"],
"global:description": [""],
"global:title": ["Open Government License"],
"global:permission": [
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
"http://data.mksmart.org/entity/thing/www:uri/perm
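The policy record above is cut off on the slide, so the following is only a sketch of how an application might consume such a record. The key names (`global:name`, `global:title`, `global:permission`) follow the slide. The permission values and the `is_permitted` helper are hypothetical placeholders, not real Data Hub URIs or API calls.

```python
import json

# Hypothetical, shortened policy record. The keys mirror the slide;
# the permission values are placeholders, not real URIs.
policy_json = """
{
  "global:name": ["open-government-license"],
  "global:title": ["Open Government License"],
  "global:permission": ["example:copy", "example:distribute", "example:adapt"]
}
"""

def is_permitted(policy, action):
    """Return True when `action` is listed under the policy's permissions."""
    return action in policy.get("global:permission", [])

policy = json.loads(policy_json)
title = policy["global:title"][0]
print(title, is_permitted(policy, "example:distribute"))
print(title, is_permitted(policy, "example:sell"))
```

Because the licence is data rather than prose, a check like this can run automatically before a dataset is released to an application.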
• 6 workshops and 20 roadshows.
• OurMK platform and MK Citizen Lab
• Citizen Ideas Competition
• 11 projects already funded
• Rated one of the “Top 5 crowdsourcing initiatives in
government: better engagement with citizens”
• Franzi’s Food passport
• Michael’s Domestic solar
• Lindsey’s Pop-up shop
• Ros’s Breastfeeding app
• Zi’s Parent Computing Literacy
• Les’s Allotment Borehole Feasibility
• Padma’s Centre MK Beacon Navigation
• Paul’s Redways route Recordings
• Eric’s Redways Reporting App
• MKPAA’s Beat the Redways game
• MK Academy’s Water Awareness Week
Ground Resistance
A collaboration between
MK:Smart and artists
Wesley Goatley and
Georgina Voss
• £2m two-year project
• Promoting research and innovation in
the digital economy
• MK now 2nd highest economy
outside London for tech and
digital SMEs.
• Addresses data science skill gap in
SMEs
• Focus on South East Midlands LEP
region
• Leverages and strengthens SME
innovation network created in
MK:Smart
• Innovative approach focused on
customised and integrated
business/tech support
• Advisory Board includes
MK Council, NatWest, SEMLEP
Target: 50 new propositions/prototypes
[Figure: programme elements: Grants; MK Data Hub; Lean skills training; Tech Design & Prototype; Evaluation; Business & innovation networks; Our MK – citizen innovation platform]
Robots in a smart city
• Current developments in smart cities focus
primarily on sensor deployment and on data collection
and analysis to optimize services.
• No integration of robots in smart city infrastructure,
even though autonomous robots already operate in
urban scenarios
• Advantages from integrating robots in a smart city
infrastructure:
• Robots can make use of data coming from a
variety of sources (hence becoming smarter)
• Robots can act as mobile sensors (hence
reducing cost of massive sensor deployment)
• Robots can be opportunistically deployed to
deal with exceptional events – e.g.,
emergencies
Hans, the Health and Safety Inspector
• Hans is aware of the Health and Safety regulations at the OU
• It is expected to detect H&S violations autonomously
• It is also expected to fulfil additional lab supervision tasks, e.g., checking occupancy of meeting
rooms
• This requires integration with KMi’s room booking system
• Hans needs object recognition ability, integration with KMi Systems, specialized task knowledge,
and integration with external knowledge bases (e.g., ConceptNet, WordNet, Visual Genome)
1st International Competition on Robots in Smart Cities
SciRoc is an EU-funded project
whose aim is to bring robotic
tournaments into the context of
smart cities.
The first international
competition took place in
Milton Keynes, on 18-21
September 2019.
The challenge comprises 5 episodes, testing Human-Robot Interaction,
Navigation, Manipulation, Autonomous Flying, Humanoid Robotics
and Interaction with smart city infrastructure.
Research in Intelligent Systems and Data Science at the Knowledge Media Institute
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 

Research in Intelligent Systems and Data Science at the Knowledge Media Institute

  • 1. Research Directions in Intelligent Systems and Data Science Prof Enrico Motta Knowledge Media Institute The Open University
  • 2.
  • 3. Making Sense of Scholarly Data
  • 4. AI for Big Data The unprecedented availability of massive volumes of data in many different domains has both revolutionized AI and turned it into one of the most important technologies in today’s landscape, leading to the development of highly successful techniques that can impose structure on, and extract value from, very large collections of data
  • 5. Big data in the scholarly domain
  • 6. Research in Scholarly Analytics The SKM3 team produces innovative approaches leveraging large-scale data mining, semantic technologies, machine learning, and visual analytics both to extract understanding and value from large collections of scholarly data and also to provide services to a variety of stakeholders. http://skm.kmi.open.ac.uk
  • 7. Mapping the Space of Research in Computer Science
  • 8. ACM and other similar classifications • Expensive, long-drawn process • 14 years between 1998 and 2012 releases • Becomes obsolete very quickly, unable to cover latest trends • Validation is an issue • It is a totally manual process and necessarily the result reflects individual biases and viewpoints – no ground truth • Mostly too high-level • Does not cover fine-grained topics, which is where the action tends to be • e.g., only 84 topics under AI, while our analysis has identified about 1800 distinct research areas in the AI field • Choice of topics and relations between topics are debatable • Semantic Web is not included (but “SW Languages” is!) • The area of Ontologies is under Information Retrieval
  • 9. Automatic generation of taxonomies of research areas [Diagram labels: Venues, Authors, Organizations, Keywords; Linked Data Cloud; Very Large Publication Corpus; Statistical Topic Identification; Candidate Topics; Topic Validation; Validated Topics; Statistics/ML; SubTopicOf Relations; Equivalence Relations]
  • 10. The Computer Science Ontology The Computer Science Ontology (CSO) is a large-scale, automatically generated ontology of research areas. It provides the largest research taxonomy in the field of Computer Science, including about 14K topics and 163K semantic relationships. http://cso.kmi.open.ac.uk/
  • 11. Automatic Classification of Publications • The CSO Classifier is an unsupervised approach for automatically classifying documents according to the Computer Science Ontology. It is currently being used to annotate the publications of Springer Nature and Dimensions. Salatino et al. (2019) The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles.
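The idea of classifying a document against an ontology of research topics can be illustrated with a minimal sketch. This is not the CSO Classifier itself: the toy ontology, the function names, and the exact-match strategy below are all assumptions made for illustration. The sketch matches word n-grams from a text against topic labels, then adds every super-topic of each match.

```python
import re

# Hypothetical toy ontology (not CSO data): topic -> direct super-topics.
TOY_ONTOLOGY = {
    "artificial intelligence": set(),
    "semantic web": {"artificial intelligence"},
    "ontologies": {"semantic web"},
    "linked data": {"semantic web"},
}


def ngrams(tokens, max_n=3):
    """Yield every word n-gram of length 1..max_n as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def classify(text, ontology=TOY_ONTOLOGY):
    """Return the matched topics plus all their ancestors, sorted by label."""
    tokens = re.findall(r"[a-z]+", text.lower())
    matched = {gram for gram in ngrams(tokens) if gram in ontology}

    # Walk up the hierarchy so broader areas are also assigned.
    topics = set(matched)
    frontier = list(matched)
    while frontier:
        topic = frontier.pop()
        for parent in ontology.get(topic, ()):
            if parent not in topics:
                topics.add(parent)
                frontier.append(parent)
    return sorted(topics)


print(classify("We publish linked data using ontologies."))
```

Exact label matching is the simplest possible choice here; a realistic classifier would also need to handle spelling variants and terms that are related to a topic without naming it.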
  • 12. Automatic Classification of CS Proceedings at Springer Nature
  • 13. Business Value About 9M additional downloads thanks to STM. [Chart: Average number of yearly downloads for books in SpringerLink, 2009 to 2018; series: downloads (CS Proceedings), expected downloads (CS Proceedings), downloads (CS Proceedings) with STM, downloads (other books in CS), downloads (overall)]
  • 14. Augur: Predicting the emergence of new research areas
  • 15. «[…] transition from one paradigm to another via revolution is the usual developmental pattern of mature science» Thomas Kuhn, The Structure of Scientific Revolutions
  • 17. Approach • Analysis and discovery of patterns which may indicate the emergence of new topics • For example, before the Semantic Web emerged explicitly as a research area we could identify new interesting dynamics involving authors from different research areas such as knowledge representation, agent systems, hypertext and databases. • Recognizing dynamics that appear to match the generic patterns to identify emerging trends
  • 18. Focus on collaboration between research communities [Diagram: topics T1, T2 and T3 shown in Year n and in Year n+1]
  • 19. • The creation of novel topics is anticipated by a significant increase in the pace of collaboration in the areas that are associated with the generation of the new topic and therefore in the density of that portion of the topological space
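One way to make the notion of increasing density concrete is the standard graph density of an undirected network, 2E / (N(N-1)), computed year by year over a fixed set of topics. The sketch below is an illustration under stated assumptions, not the Augur algorithm: the input format (a mapping from year to the topic sets of that year's papers), the rule that two topics are linked when they co-occur in a paper, and all names are hypothetical.

```python
from itertools import combinations


def density(node_count, edge_count):
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    if node_count < 2:
        return 0.0
    return 2 * edge_count / (node_count * (node_count - 1))


def yearly_density(papers_by_year, topics):
    """Map each year to the density of the topic co-occurrence graph.

    papers_by_year: {year: [set of topic labels for each paper]} (assumed format)
    topics: the fixed group of topics whose interaction is being tracked
    """
    topic_set = set(topics)
    result = {}
    for year in sorted(papers_by_year):
        edges = set()
        for paper_topics in papers_by_year[year]:
            shared = sorted(topic_set & set(paper_topics))
            # Every pair of tracked topics on the same paper forms one edge.
            edges.update(combinations(shared, 2))
        result[year] = density(len(topic_set), len(edges))
    return result


topics = ["knowledge representation", "agent systems", "hypertext", "databases"]
papers = {
    2000: [{"knowledge representation", "agent systems"}],
    2001: [
        {"knowledge representation", "agent systems"},
        {"hypertext", "databases"},
        {"knowledge representation", "databases"},
    ],
}
print(yearly_density(papers, topics))
```

A rising value from one year to the next indicates that the tracked topics are being combined more often, which is the kind of signal the slide describes.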
  • 20. Output: topics, papers, authors Influential Authors W. Bruce Croft, Dieter Fensel, Dan Suciu, William W. Cohen, Berthier Ribeiro-Neto, Clement T. Yu, James Allan, Justin Zobel, Dragomir R. Radev, Victor Vianu Influential Papers - A Sheth et al. "Managing semantic content for the Web" (2002) - RWP Luk et al. "A survey in indexing and searching XML documents" (2002) - J Kahan et al. "Annotea: An open RDF infrastructure for shared Web annotations" (2002) - R Manmatha et al. "Modeling score distributions for combining the outputs of search engines" (2001) - S Dagtas et al. "Models for motion-based video indexing and retrieval" (2000) Evolutionary network in 2002, reflecting the emergence of Semantic Search the following year
  • 21. Smart Cities and Robotics
  • 22. MK:Smart • A large collaborative project (19 partners - £17.2M budget) partly funded by the HEFCE’s Catalyst Fund • Aim of the fund is to enhance higher education’s contribution to economic growth • “we are seeking to support developments that stimulate the capabilities of HE teaching and research to deliver sustainable economic impact across the nation” • Main objective of the project: To put in place an integrated innovation and support programme, which will leverage large-scale city data to provide solutions to the key demand problems and will also provide a sustainable technological infrastructure to accelerate innovation and economic growth.
  • 23.
  • 24. • A multi-award-winning infrastructure supporting the acquisition and management of both static (i.e., DBs, files) and dynamic (i.e., sensor feeds) data sources • A data eco-system, where private, open and commercial data sources co-exist in the same infrastructure • A platform for Open Innovation, providing developers with APIs and tools to facilitate the engineering of data-intensive applications
  • 25. Infrastructure Layers DATA HUB Smart Parking Driver Assist Waste Management Tracing Assets: BT Trace Smart Street Lighting APPLICATIONS LoRa MESH CONNECTIVITY UNB SENSORS Light Sensor Bin Usage Parking Sensor Vehicle Telemetry RFID Trace Soil Moisture Analytics Dev Environment IT Services Information Spine
  • 26. Data cataloguing and governance Licenses are described as machine readable policies { "global:homepage": [ "https://datahub.mksmart.org/policy/open-governme "https://datahub.beta.mksmart.org/policy/open-gove ], "global:landingPage": [ "http://data.mksmart.org/entity/thing/www:uri/http government-license/", "http://data.mksmart.org/entity/thing/www:uri/https:/ n-government-license/" ], "global:api": [ "https://datahub.beta.mksmart.org/data-catalogue-a government-license", "https://datahub.mksmart.org/data-catalogue-api/?a license" ], "global:name": ["open-government-license"], "global:description": [""], "global:title": ["Open Government License"], "global:permission": [ "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm "http://data.mksmart.org/entity/thing/www:uri/perm
  • 27.
  • 28. • 6 workshops and 20 roadshows. • OurMK platform and MK Citizen Lab • Citizen Ideas Competition • 11 projects already funded • Rated one of the “Top 5 crowdsourcing initiatives in government: better engagement with citizens” • Franzi’s Food passport • Michael’s Domestic solar • Lindsey’s Pop-up shop • Ros’s Breastfeeding app • Zi’s Parent Computing Literacy • Les’s Allotment Borehole Feasibility • Padma’s Centre MK Beacon Navigation • Paul’s Redways route Recordings • Eric’s Redways Reporting App • MKPAA’s Beat the Redways game • MK Academy’s Water Awareness Week
  • 29. Ground Resistance A collaboration between MK:Smart and artists Wesley Goatley and Georgina Voss
  • 30. • £2m two-year project • Promoting research and innovation in the digital economy • MK now 2nd highest economy outside London for tech and digital SMEs. • Addresses data science skill gap in SMEs • Focus on South East Midlands LEP region • Leverages and strengthens SME innovation network created in MK:Smart • Innovative approach focused on customised and integrated business/tech support • Advisory Board includes MK Council, NatWest, SEMLEP Target 50 new propositions/prototypes Grants MK Data Hub Lean skills training Tech Design & Prototype Evaluation Business & innovation networks Our MK – citizen innovation platform
  • 31. Robots in a smart city • Current developments in smart cities focus primarily on sensor deployment and on data collection and analysis to optimize services. • No integration of robots in smart city infrastructure, even though autonomous robots already operate in urban scenarios • Advantages from integrating robots in a smart city infrastructure: • Robots can make use of data coming from a variety of sources (hence becoming smarter) • Robots can act as mobile sensors (hence reducing cost of massive sensor deployment) • Robots can be opportunistically deployed to deal with exceptional events – e.g., emergencies
  • 32.
  • 33. Hans, the Health and Safety Inspector • Hans is aware of the Health and Safety regulations at the OU • It is expected to detect H&S violations autonomously • It is also expected to fulfil additional lab supervision tasks, e.g., checking occupancy of meeting rooms • This requires integration with KMi’s room booking system • Hans needs object recognition ability, integration with KMi Systems, specialized task knowledge, and integration with external knowledge bases (e.g., ConceptNet, WordNet, Visual Genome)
  • 34.
  • 35. 1st International Competition on Robots in Smart Cities SciRoc is an EU-funded project whose aim is to bring robotic tournaments into the context of smart cities. The first international competition took place in Milton Keynes, on 18-21 September 2019. The challenge comprises 5 episodes, testing Human-Robot Interaction, Navigation, Manipulation, Autonomous Flying, Humanoid Robotics and Interaction with smart city infrastructure.