Commercially empowered Linked Open Data
Ecosystems in Research
           Towards unfolding today's and tomorrow's
           scientific treasures

           Michael Granitzer
           University of Passau




                                          FP7 STREP No. 296150
nani gigantum humeris insidentes
   Standing on the shoulders of giants
     – Research builds on the past
     – We pass on knowledge to create
       new knowledge

     Root of (Western) Society
Lying under a pile of text documents
   .. with varying quality
   .. with contradicting facts
   .. with missing data
   .. labour-intensive to compare results
   Some examples
    – “Improvements that don’t add up”
       Armstrong et al., 2009

    – “Why most published research findings are false”
       Ioannidis, 2005




       Can we do better?


Yes, we (think) we can...
   Make Facts and Figures explicit, discoverable and comparable

   Given textually enCODED scientific knowledge, we can
    –   Extract facts from research papers
    –   Integrate those facts with existing knowledge
    –   Make it available for (visual) analysis
    –   Crowdsource


   Focus on
    – Empirical observations/facts
    – Linked Open Data
    – Computer Science and Biomedical Domain



That's nice, but how?

   [Figure: four-stage pipeline]
     – Extract & Integrate: Text, Linked Data, Experiments
     – Aggregate: Linked Scientific Fact Data Warehouse
     – Analyse & Organise: Visual Analytics & Collaborative mind-mapping
     – Share & Commercialise: Crowdsourcing & Marketplace

   [Illustrations: Dependency and Frequency Analysis; a dependency graph
    linking Algorithm, Machine Learning, CRF, SVM, Biomedical and Data Set 1;
    a pivot chart over Algorithms, Domain and Experiment with the entries
    CRF, SVM, DataSet1, DataSet2, Biomedical, (blank) and Grand total,
    on an axis from 0 to 20]
Extract & Integrate: Approach and Challenges
   Extracting Structural Elements
     – Tables
     – Figures
     – Sections and sub-sections
   Extracting Facts from Structural Elements
     – Entity extraction (e.g. algorithms, data sets, genes, significance levels etc.)
     – Fact extraction – <Entity, Relation, Measure>
     – Table Triplification
   Crowdsourcing Extraction
     – Extraction quality and domain knowledge remain key issues
         • Empower users to maintain their own extraction models
         • Allow users to semantically annotate research papers (e.g. entities, facts)


   Result: Semantically annotated scientific data as LOD Endpoint
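
   A minimal sketch of the table triplification step, assuming the first
   column of a results table names the entity and every other column is a
   relation with a numeric measure. The function name triplify, the Fact
   type and the sample values are hypothetical and only illustrate the
   <Entity, Relation, Measure> shape; this is not the project's extractor.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    entity: str     # row label, e.g. an algorithm name
    relation: str   # column header, e.g. a metric on a data set
    measure: float  # numeric cell value


def triplify(header, rows):
    """Turn a table (header + rows) into <Entity, Relation, Measure> facts.

    Assumption: column 0 holds the entity; non-numeric cells are skipped.
    """
    facts = []
    for row in rows:
        entity = row[0]
        for relation, cell in zip(header[1:], row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # e.g. "n/a": no measure to extract
            facts.append(Fact(entity, relation, value))
    return facts


# Made-up sample table, for illustration only.
header = ["Algorithm", "F1 on DataSet1", "F1 on DataSet2"]
rows = [["CRF", "0.81", "0.77"], ["SVM", "0.79", "n/a"]]
facts = triplify(header, rows)
```

   Skipping cells that do not parse as numbers keeps the sketch simple; a
   real extractor would more likely flag them for crowdsourced correction.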


Extract & Integrate: Example
   [Figure: annotated example with callouts for Numerical Facts,
    Dimension/Entity, In-Document Context and Ranking Facts]
Extract & Integrate: Current Status
   TeamBeam: PDF Structure Extraction
     – Structural elements
     – Focusing now on tables
   Entity Extraction in progress
   First Prototypes for Table2RDFDataCube

   Reference: "TeamBeam - Meta-Data Extraction from Scientific Literature"
   by Roman Kern, Graz University of Technology; Kris Jack and Maya Hristakeva,
   Mendeley Ltd.; Michael Granitzer, University of Passau
Aggregate: Approach and Challenges
   Representation and Storage
     – Representation using the RDF Data Cube Vocabulary
         • Dimensions (e.g. Algorithms, Genes)
         • Measures (e.g. 0.3, 37) and Attributes (e.g. %, °)
     – Challenge 1: Ensure independence of dimensions
     – Challenge 2: Decentralized querying and aggregation
     – Vocabulary reference: http://www.w3.org/TR/vocab-data-cube/#ref_qb_measureType
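
   As a rough illustration of the representation, the sketch below writes
   one observation in Turtle using qb:Observation and qb:dataSet from the
   RDF Data Cube Vocabulary. The ex: namespace, the property names
   (ex:algorithm, ex:domain, ex:score, ex:unit) and the values are
   assumptions made for this example, not the project's data model.

```python
QB = "http://purl.org/linked-data/cube#"
EX = "http://example.org/code/"  # hypothetical namespace for this sketch


def observation_turtle(obs_id, dataset, dimensions, measure, value, unit):
    """Serialise one data cube observation as a Turtle string.

    dimensions maps hypothetical dimension properties to entity names;
    measure carries the number, unit is attached as a plain attribute.
    """
    lines = [
        f"@prefix qb: <{QB}> .",
        f"@prefix ex: <{EX}> .",
        "",
        f"ex:{obs_id} a qb:Observation ;",
        f"    qb:dataSet ex:{dataset} ;",
    ]
    for prop, entity in dimensions.items():
        lines.append(f"    ex:{prop} ex:{entity} ;")   # dimension
    lines.append(f"    ex:{measure} {value} ;")         # measure
    lines.append(f'    ex:unit "{unit}" .')             # attribute
    return "\n".join(lines)


# Made-up observation, for illustration only.
ttl = observation_turtle(
    "obs1", "benchmark1",
    {"algorithm": "CRF", "domain": "Biomedical"},
    "score", 0.3, "%",
)
```

   Keeping each dimension as its own property is what lets an aggregation
   query group by one dimension without touching the others.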




   SPARQL Data Warehousing Wizard
     – Provide simple and intuitive Wizard for creating aggregation queries
         • Google-like starting point
         • Pivot table creation similar to spreadsheets
     – Store using RDF Data Cube Vocabulary

 Linked Scientific Fact Data Warehouse for non-IT Experts
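
   The pivot-table idea behind the wizard can be sketched in plain Python.
   This is an in-memory stand-in for an aggregation query, not SPARQL and
   not the wizard itself; the function pivot, the field names and the
   sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pivot(observations, row_dim, col_dim, measure):
    """Average a measure for every (row dimension, column dimension) pair."""
    cells = defaultdict(list)
    for obs in observations:
        cells[(obs[row_dim], obs[col_dim])].append(obs[measure])
    table = defaultdict(dict)
    for (row, col), values in cells.items():
        table[row][col] = mean(values)
    return {row: dict(cols) for row, cols in table.items()}


# Made-up observations, for illustration only.
observations = [
    {"algorithm": "CRF", "dataset": "DataSet1", "score": 0.5},
    {"algorithm": "CRF", "dataset": "DataSet1", "score": 1.0},
    {"algorithm": "SVM", "dataset": "DataSet1", "score": 0.25},
    {"algorithm": "SVM", "dataset": "DataSet2", "score": 0.5},
]
table = pivot(observations, "algorithm", "dataset", "score")
```

   Choosing the row dimension, column dimension and measure by name mirrors
   what a spreadsheet user does when dragging fields into a pivot table.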


Aggregate: Current Status
   Representation and Storage
     – Data Model implemented
     – Triplification of Benchmarking Data (e.g. CLEF, TPC-H)
     We are looking for data

   SPARQL Data Warehousing Wizard




Analyse: Approach and Challenges
   Visual Analytics for Linked Scientific Facts
     – RDF based description of visualisations
         • Glue between data and single visualisations
         • Make visualisation state explicit
         • Share visualisation state

     – HTML 5 based visualisations and visualisation wizard
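
   One way to picture an explicit, shareable visualisation state is a
   small record that round-trips through subject-predicate-object triples.
   The VisState fields and the vis: predicate names below are assumptions
   made for this sketch; the slides do not define the actual vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class VisState:
    chart_type: str
    dataset: str
    x_dimension: str
    y_measure: str
    filters: dict = field(default_factory=dict)


def to_triples(state_id, state):
    """Describe a visualisation state as (subject, predicate, object) triples."""
    triples = [
        (state_id, "vis:chartType", state.chart_type),
        (state_id, "vis:dataset", state.dataset),
        (state_id, "vis:xDimension", state.x_dimension),
        (state_id, "vis:yMeasure", state.y_measure),
    ]
    for dim, val in sorted(state.filters.items()):
        triples.append((state_id, "vis:filter", f"{dim}={val}"))
    return triples


def from_triples(triples):
    """Rebuild the state from triples, e.g. after a colleague shared them."""
    single = {p: o for _, p, o in triples if p != "vis:filter"}
    filters = dict(o.split("=", 1) for _, p, o in triples if p == "vis:filter")
    return VisState(single["vis:chartType"], single["vis:dataset"],
                    single["vis:xDimension"], single["vis:yMeasure"], filters)


state = VisState("bar", "benchmark1", "algorithm", "score",
                 {"domain": "Biomedical"})
shared = to_triples("ex:view1", state)
restored = from_triples(shared)
```

   Because the state is data rather than pixels, the receiver sees which
   filters and dimensions produced the chart.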




Share: Approach and Challenges
   Provenance
     – Who published data?
     – Who modified data?
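
   The two provenance questions can be answered by a record as small as
   the one sketched here. The Provenance class, its field names and the
   user names are hypothetical and only show what would need to be stored.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Provenance:
    published_by: str
    modifications: list = field(default_factory=list)  # (who, when, note)

    def record_change(self, who, note, when=None):
        """Append one modification; defaults to the current UTC time."""
        when = when or datetime.now(timezone.utc)
        self.modifications.append((who, when, note))

    def modifiers(self):
        """Everyone who modified the data, in order of first change."""
        seen = []
        for who, _, _ in self.modifications:
            if who not in seen:
                seen.append(who)
        return seen


prov = Provenance(published_by="alice")
prov.record_change("bob", "corrected unit of measure")
prov.record_change("carol", "merged duplicate entity")
prov.record_change("bob", "added missing data set label")
```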

   Share aggregated data sets and annotation models
     – Build on insights created by others
     – Re-use text annotation models

   Share visual analytics applications
     – Simple visualisations might be misleading
     – Sharing whole states of a visual analysis will reveal
       more details on certain decisions




Why should YOU do it?




Marketplace concept for research data
 Users (= researchers) will be able to “sell” their analysis results
  (or give them away for free)
 Several concepts to be investigated: revenue chains, roles, models
  (donations, paid subscriptions for data feeds, purchases etc.)
 Increased opportunities for researchers and research data
 [Figure: diagram labelled integrate, crowdsource, extract & visualise, organise]

 Find us, join us, ask us, help us
   http://code-research.eu/
   http://www.facebook.com/CODEresearchEU
   #CODEresearchEU

I-Know presentation: CODE - Commercially empowered Linked Open Data Ecosystems in Research
