Introduction to Data and Text Mining
Catherine Grout
https://en.wikipedia.org/wiki/Text_mining
07/03/2016 Digifest 2016 - Sustainable and efficient solutions for shared research data management – Business case and costing
What is data and text mining?
» Text mining, also referred to as text data mining, roughly equivalent to text analytics, refers to the process of
deriving high-quality information from text.
» High-quality information is typically derived through the devising of patterns and trends through means such as
statistical pattern learning.
» Text mining usually involves the process of structuring the input text … deriving patterns within the structured data,
and finally evaluation and interpretation of the output.
» 'High quality' in text mining usually refers to some combination of relevance, novelty, and interestingness.
» Data mining is an imprecise term that can mean anything from:
› Large-scale data analysis within science, e.g. outputs of the Hubble telescope or CERN's Large Hadron Collider
› Analysing census data for socio-economic trends (medium scale, finite amount of data)
› Mining connected small objects/collections of research data to find new insight, e.g.
bringing together various versions of the Mona Lisa and using data mining to analyse their underlying structure.
Ref : https://en.wikipedia.org/wiki/Text_mining
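The three stages named above (structuring the input text, deriving patterns, then evaluating and interpreting the output) can be sketched in a few lines of Python. This is a minimal illustration only, not a method from the talk: the three-sentence corpus is invented, the function names are hypothetical, and TF-IDF weighting stands in for "statistical pattern learning" as one simple, common choice among many.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; the sentences are invented for illustration only.
DOCS = {
    "a": "Text mining derives patterns from text.",
    "b": "Data mining finds trends in census data.",
    "c": "Repositories offer open text for mining.",
}


def tokenize(text):
    """Structure raw text: lowercase it and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs):
    """Turn each document into a bag of words (term -> count)."""
    return {name: Counter(tokenize(body)) for name, body in docs.items()}


def tf_idf(counts):
    """Weight each term by its count in a document and its rarity across documents."""
    n_docs = len(counts)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    doc_freq = Counter(term for bag in counts.values() for term in bag)
    return {
        name: {t: c * math.log(n_docs / doc_freq[t]) for t, c in bag.items()}
        for name, bag in counts.items()
    }


def top_terms(weights, k=3):
    """Interpretation step: list the k highest-weighted terms per document."""
    return {
        name: sorted(w, key=lambda t: (-w[t], t))[:k]
        for name, w in weights.items()
    }


counts = term_counts(DOCS)    # structured data
weights = tf_idf(counts)      # derived pattern: which terms distinguish a document
summary = top_terms(weights)  # output for a human to evaluate
```

A term that appears in every document, such as "mining" here, receives a weight of zero, which is one crude way of separating what is distinctive (a rough proxy for novelty) from what is merely frequent.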
What is its value for research and education?
» 2012: Jisc published a key report,
“Value and benefits of text mining”
» https://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining
» Took a case study approach and also
undertook an economic analysis of
the benefits (…biomedicine)
» Wider at-scale benefits were harder
to come by owing to legal and
technical limitations inhibiting
systematic use
» Since then new benefits have
emerged
What types of benefits?
» Finding research insights that were not
possible through other techniques
» Bringing together texts/data across
different disciplines and finding new
insights
» “Text mining offers a way of helping
researchers to make sense of and
leverage value from the vast sea of
electronic resources, which is continually
expanding.”
» “…potential to increase the research base
available to business and society and to
enable business and others to use the
research base more effectively”
Health benefits of outdoor education
https://en.wikipedia.org/wiki/Outdoor_education
Innovative Research in Humanities & Social Sciences
» Digging into Data Challenge
» http://diggingintodata.org
» International Initiative now in its 4th
funding round e.g.:
» Trees and Tweets -
https://sites.google.com/site/jackgrieveaston/treesandtweets
» DiLiPaD – http://dilipad.history.ac.uk/
Mining Repositories: CORE
» CORE is an aggregation of Open Access repositories and offers itself as a
platform for TDM (25 million articles)
› Can use an API (of interest if you want to build value-added services on top)
› Or - download the whole aggregation as an open dataset here:
https://core.ac.uk/intro/data_dump
› Jisc and the Open University are running CORE in partnership, with the back-end aggregation hosted
by the OU and the front-end services hosted by Jisc. (Further services by Jisc could be developed
on top of this.)
Universities and Industry
» NCUB (National Centre for Universities and Business) is developing a tool called
an “Intelligent Broker”
› To assist with making better links between universities and industry
› Could potentially harvest and mine data from key sources like the Research
Councils’ Gateway to Research, equipment.data (national equipment portal)
and potentially other services, like CORE.
› This would, for example, give SMEs more intelligence about research-intensive
activity in particular areas
Content Mine
Grew out of a Jisc project initially
And Finally…
» Open Citation Experiment (using text mining techniques; see Digifest session
and demo on this!)
» Jisc are commissioning a study to examine the text mining landscape and future
contributions to this space, to review:
› The current landscape - primarily in UK HE but also looking internationally, and within other
relevant sectors to provide a broad view.
› The market – what are the value chains and where might Jisc contribute?
› The legal position and other inhibitors
› Researcher practice, the issues they encounter, their current and future needs, considering
subjects that use text mining and those that don’t
› Existing platforms, services and tools, and potential for use by Jisc or its customers
› Recommendations on possible future areas of work or services for Jisc to explore
jisc.ac.uk
For more information
Catherine Grout
Head of Change - Research
catherine.grout@jisc.ac.uk

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 

Recently uploaded (20)

Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Micromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersMicromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of Powders
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 

Introducing Data and Text Mining at DigiFest

  • 1. Introduction to Data and Text Mining, Catherine Grout
  • 2. https://en.wikipedia.org/wiki/Text_mining 07/03/2016 Digifest 2016 - Sustainable and efficient solutions for shared research data management – Business case and costing
  • 3. What is data and text mining?
      » Text mining, also referred to as text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text.
      » High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning.
      » Text mining usually involves structuring the input text … deriving patterns within the structured data, and finally evaluating and interpreting the output.
      » 'High quality' in text mining usually refers to some combination of relevance, novelty, and interestingness.
      » Data mining is an imprecise term but can mean anything from:
          › Large-scale data analysis within science, e.g. outputs of the Hubble telescope or the CERN Large Hadron Collider
          › Analysing census data for socio-economic trends (medium scale, a finite amount of data)
          › Mining connected small objects or collections of research data to find new insight, e.g. bringing together various versions of the Mona Lisa and using data mining to analyse their underlying structure
      Ref: https://en.wikipedia.org/wiki/Text_mining
  • 5. What is its value for research and education?
      » 2012: Jisc published a key report, “Value and benefits of text mining”
      » https://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining
      » The report took a case study approach and also undertook an economic analysis of the benefits (…biomedicine)
      » Wider benefits at scale were harder to come by, owing to legal and technical limitations inhibiting systematic use
      » Since then new benefits have emerged
  • 6. What types of benefits?
      » Finding research insights that were not possible through other techniques
      » Bringing together texts and data across different disciplines and finding new insights
      » “Text mining offers a way of helping researchers to make sense of and leverage value from the vast sea of electronic resources, which is continually expanding.”
      » “…potential to increase the research base available to business and society and to enable business and others to use the research base more effectively”
      Image: Health benefits of outdoor education, https://en.wikipedia.org/wiki/Outdoor_education
  • 7. Innovative Research in Humanities & Social Sciences
      » Digging into Data Challenge
      » http://diggingintodata.org
      » International initiative now in its 4th funding round, e.g.:
      » Trees and Tweets - https://sites.google.com/site/jackgrieveaston/treesandtweets
      » DiLiPaD - http://dilipad.history.ac.uk/
  • 10. Mining Repositories: CORE
      » CORE is an aggregation of Open Access repositories and offers itself as a platform for TDM (25 million articles)
          › Can be used through an API (of interest if you want to build value-added services on top)
          › Or download the whole aggregation as an open dataset here: https://core.ac.uk/intro/data_dump
          › Jisc and the Open University run CORE in partnership, with the back-end aggregation hosted by the OU and the front-end services hosted by Jisc. (Further services by Jisc could be developed on top of this.)
  • 11. Universities and Industry
      » NCUB (National Centre for Universities and Business) is developing a tool called an “Intelligent Broker”
          › To assist with making better links between universities and industry
          › Could potentially harvest and mine data from key sources such as the Research Councils’ Gateway to Research, equipment.data (the national equipment portal) and potentially other services, such as CORE
          › This would give SMEs more intelligence about research-intensive activity in particular areas, for example
  • 12. Content Mine: grew out of a Jisc project initially
  • 13. And Finally…
      » Open Citation Experiment (using text mining techniques; see the Digifest session and demo on this!)
      » Jisc is commissioning a study to examine the text mining landscape and future contributions to this space, to review:
          › The current landscape, primarily in UK HE but also looking internationally and within other relevant sectors to provide a broad view
          › The market: what are the value chains and where might Jisc contribute?
          › The legal position and other inhibitors
          › Researcher practice, the issues researchers encounter, and their current and future needs, considering subjects that use text mining and those that don’t
          › Existing platforms, services and tools, and their potential for use by Jisc or its customers
          › Recommendations on possible future areas of work or services for Jisc to explore
  • 14. For more information: Catherine Grout, Head of Change - Research, catherine.grout@jisc.ac.uk, jisc.ac.uk
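
The process outlined on slide 3 (structuring the input text, deriving patterns within the structured data, then interpreting the output) can be illustrated with a very small sketch. This is not taken from the presentation: the function names, the stop-word list and the three sample sentences are all assumptions made for illustration, and real text mining would use far richer statistical methods than counting shared terms.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "is", "for", "from"}


def structure(text):
    """Step 1: structure raw text into lowercase word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def derive_patterns(documents):
    """Step 2: count how many documents each term appears in."""
    doc_freq = Counter()
    for doc in documents:
        # A set ensures each term is counted once per document.
        doc_freq.update(set(structure(doc)))
    return doc_freq


def interpret(doc_freq, min_docs=2):
    """Step 3: keep the terms shared by at least min_docs documents."""
    return sorted(term for term, n in doc_freq.items() if n >= min_docs)


# Made-up sample documents (assumption, not from the slides).
documents = [
    "Text mining derives information from text.",
    "Data mining finds trends in census data.",
    "Researchers use mining to find new insight.",
]

shared_terms = interpret(derive_patterns(documents))
print(shared_terms)
```

Here the only term common to at least two of the sample documents is "mining"; note that "finds" and "find" are treated as different terms because the sketch does no stemming.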