BIG DATA EUROPE
HTTP://WWW.BIG-DATA-EUROPE.EU/
Integrating Big Data, Software & Communities for Addressing
Europe’s Societal Challenges
European Data Economy Workshop,
Focus: Data Value Chain & Big Data & Open Data
15 September 2015, University of Economics Vienna
Semantic Web Company
(SWC)
SWC was founded in 2001, headquartered in Vienna
25 experts in linked data technologies
Product: PoolParty Semantic Suite (launched 2009)
Serving customers from all over the world
EU- & US-based consulting services
Some of our Customers
● Credit Suisse
● Boehringer Ingelheim
● Roche
● Wolters Kluwer
● BMJ Publishing Group
● Red Bull Media House
● Canadian Broadcasting Corporation (CBC)
● Pearson
● Council of the EU
● DG Environment, EC
● Healthdirect Australia
● Ministry of Finance (Austria)
● World Bank Group
● Inter-American Development Bank (IADB)
● International Atomic Energy Agency (IAEA)
● Buildings Performance Institute Europe
(BPIE)
● Renewable Energy & Energy Efficiency Partnership
(REEEP)
● Global Buildings Performance Network
(GBPN)
● American Physical Society
Finance / Automotive / Publisher / Health Care / Public Administration / Energy /
Education
Selected Partners
● EBCONT
● EPAM Systems
● iQuest
● PwC
● Tenforce
● OpenLink Software
● Ontotext
● MarkLogic
● Gravity Zero
● Altotech
● Wolters Kluwer
● Taxonomy Strategies
● Digirati
● Fraunhofer (IAIS)
● University of Leipzig
(INFAI)
We all have one goal in mind: make machines smart enough that
they can help us find the needles in the haystack that are
really relevant to us.
The Motivation – Big Data
Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the
world today has been created in the last two years alone.
This data comes from everywhere: sensors used to gather climate information, posts to
social media sites, digital pictures and videos, purchase transaction records, and cell phone
GPS signals to name a few.
This data is big data. Source:
Big Data Dimensions
Volume
Velocity
Variety
Veracity!
Big Data in Europe: Challenges,
Opportunities
Health
Climate
Energy
Transport
Food
Societies
Security
Regional Data Repositories
#2: Interlink, Centralise Access, Explore
Data Element
#3: Analyse, Discover, Visualize
#4: Mashup, Cross-domain Exploitation
Journalists Authorities
Big Data in Europe:
Obstacles
30-sept.-15
#1 Big Data “Variety” problem
● Multiple Data Sources
● Required: Integration, Harmonisation
#2 Opening-up Data concerns
● Loss of control, lack of tracking
● Reservations about large corporations
#3 Limited Skills, Training, Technology
● Lack of Data Scientists
● Lack of Generic Architectures, components
Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis
Health, Transport, Security, Food, Societies, Climate, Energy
Data Repositories
Linked Open Data Cloud
Stage 1
Stage 2
Stage 3
BDE Partners
Rationale
● Show societal value of Big Data
● Lower barrier for using big data technologies
o Required effort and resources
o Limited data science skills
● Help establish cross-lingual/organizational/domain Data Value Chains
CSA Measures
COORDINATION: Stakeholder Engagement (Requirements Elicitation)
SUPPORT: Design, Realise, Evaluate Big Data Aggregator Platform
Results
Create and Manage Societal Big Data Interest Groups
Cloud-deployment ready Big Data Aggregator Platform
Summary
Two clearly defined coordination and support measures:
● Coordination: Engaging with a diverse range of stakeholder groups, particularly those representing the Horizon 2020 societal challenges Health, Food & Agriculture, Energy, Transport, Climate, Social Sciences and Security; collecting requirements for the ICT infrastructure needed by data-intensive science practitioners tackling a wide range of societal challenges; covering all aspects of publishing and consuming semantically interoperable, large-scale data and knowledge assets;
● Support: Designing, realizing and evaluating a Big Data Aggregator platform infrastructure that meets requirements, minimises disruption to current workflows, and maximises the opportunities to take advantage of the latest European RTD developments (incl. multilingual data harvesting, data analytics & visualisation).
BigDataEurope will implement and apply two main instruments to successfully realize these measures:
● Build Societal Big Data Interest/Community Groups in the W3C interest group scheme, involving a large number of stakeholders from the Horizon 2020 societal challenges as well as technical Big Data experts;
● Design, integrate and deploy a cloud-deployment-ready Big Data aggregator platform
Orthogonal Dimensions of Big Data
Ecosystems
Generic Big Data Enabling Technologies
Data Value Chain
Data Generation & Acquisition
Data Analysis & Processing
Data Storage & Curation
Data Visualization & Usage
Data-driven Services
Societal Challenges
Domain Specific Data Assets & Technology
Healthcare
Food Security
Energy
Intelligent Transport
Climate & Environment
Inclusive & Reflective Societies
Secure Societies
BDE Stakeholder Engagement Approach &
Activities
BDE Community Tools – JOIN IN NOW !
• Website: news, events, community, …
• 7 x BDE W3C Community Groups
• 7+1 x Mailing Lists
• 7 x SC Workshops/Year = 21 Workshops
• Full set of communication tools…
Future Outlook
• BDE Aggregator Platform
• For download / internal use
• Cloud Version
• Big Data Technology Support Tools
Domains, Focus Areas & Data
Assets
Societal Domain: Life Sciences & Health
Preliminary Big Data Focus area: Heterogeneous data linking & integration; Biomedical Semantic Indexing & QA
Selected Key Data assets: ACD Labs / ChemSpider, ChEBI, ChEMBL, ConceptWiki, DrugBank, ENZYME, Gene Ontology, GO Annotation, SwissProt, UniProt, WikiPathways, PubMed, MeSH, Disease Ontology (DO), Joint Chemical Dictionary (Jochem), BioASQ datasets

Societal Domain: Food & Agriculture
Preliminary Big Data Focus area: Large-scale distributed data integration
Selected Key Data assets: INFOODS, AQUASTAT, Green Learning Network (GLN), Agricultural Bibliography Network (ABN), AGRIS, AquaMaps, Fishbase

Societal Domain: Energy
Preliminary Big Data Focus area: Real-time monitoring, stream processing, data analytics, and decision support
Selected Key Data assets: European Energy Exchange Data, smart meter measurement data, gas/fuels/energy market/price data, consumption statistics, equipment condition monitoring data

Societal Domain: Transport
Preliminary Big Data Focus area: Streaming sensor network & geo-spatial data integration
Selected Key Data assets: GTFS data, OSM / LinkedGeoData, MobilityMaps, Transport sensor data, ROSATTE Road safety attributes, European Road Data Infrastructure - EuroRoadS

Societal Domain: Climate
Preliminary Big Data Focus area: Real-time monitoring, stream processing, and data analytics
Selected Key Data assets: European Grid Infrastructure (EGI), databases hosting atmospheric data, several software frameworks for simulation, calibration and reconstruction

Societal Domain: Social Sciences
Preliminary Big Data Focus area: Statistical and research data linking & integration
Selected Key Data assets: Federated social sciences data catalogs, statistical data from public data portals and statistical offices (e.g. Eurostat, UNESCO, World Bank)

Societal Domain: Security
Preliminary Big Data Focus area: Real-time monitoring, stream processing, and data analytics; image data analysis
Selected Key Data assets: Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired from commercial providers and governmental systems) and collateral data for supporting CFSP/CSDP missions and operations, databases hosting atmospheric data, experimental and simulation data concerning dispersion
Work Packages & Implementation
Phases
Community
Building
M1-M12 M13-M24 M25-M36
Enabling
Technologies
Component
Integration
Uptake
Integrator
Deployment
Community
Assessment
WP3 – Big Data Generic Enabling Technologies & Architecture
WP5 – Big Data Integrator Instances
WP7 – Dissemination & Communication
WP2 – Community Building & Requirements
WP4 – Big Data Integrator Platform
WP6 – Real-life Deployment & User Evaluation
Blueprint of the Data Aggregator
Platform
Batch Layer
Speed Layer
Data Storage
Real-time data & Transactions …
Batch View
Real-time View
message passing
Applications & Showcases
Real-time dashboards
Domain-specific BDE apps
Big Data Analytics
In-stream Mining
BDE Platform & Intelligence
Input data
Stream
Spatial
Social
Statistical
Temporal
Transactional
Imagery
+ Semantic Layer (Retaining Semantics using LD)
Lambda Architecture
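
The blueprint above names the building blocks of a Lambda Architecture: data storage, a batch layer producing a batch view, and a speed layer producing a real-time view. The following is a minimal sketch of how those pieces could interact, not the BDE platform's actual implementation. The class `LambdaPipeline`, its method names, and the choice of "count events per key" as the workload are all hypothetical and chosen only for illustration; a real deployment would typically place distributed batch and stream processing frameworks behind each layer.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LambdaPipeline:
    """Toy Lambda Architecture that counts events per key (hypothetical example)."""

    # Data storage: append-only master dataset of all events seen so far.
    master: list = field(default_factory=list)
    # Batch view: counts precomputed from the full master dataset.
    batch_view: Counter = field(default_factory=Counter)
    # Real-time view: counts for events not yet covered by the batch view.
    realtime_view: Counter = field(default_factory=Counter)

    def ingest(self, key: str) -> None:
        # Incoming data is sent to both the data storage and the speed layer.
        self.master.append(key)
        self.realtime_view[key] += 1

    def run_batch(self) -> None:
        # Batch layer: recompute the view from scratch over all stored data.
        self.batch_view = Counter(self.master)
        # The batch view now covers everything, so the speed layer starts over.
        self.realtime_view = Counter()

    def query(self, key: str) -> int:
        # Serving: merge the batch view with the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


if __name__ == "__main__":
    pipeline = LambdaPipeline()
    for event in ["transport", "energy", "transport"]:
        pipeline.ingest(event)
    pipeline.run_batch()           # batch view catches up
    pipeline.ingest("transport")   # arrives after the batch run
    print(pipeline.query("transport"))
```

The design point this sketch tries to show is that queries stay correct between batch runs: the slow, complete batch view is topped up by the fast, incremental real-time view, and the latter is discarded whenever a batch run catches up.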
Announcements….
Workshop SC2 (Agriculture & Food): 22.9.2015, Paris, INFOS
Workshop SC7 (Secure Societies): 30.9.2015, Brussels, INFO
Workshop SC4 (Transport): .10.2015, Bordeaux, INFO
Workshop SC6 (Social Science): 18.11.2015, Luxembourg, INFO
Martin Kaltenböck, m.kaltenboeck@semantic-web.at
Semantic Web Company GmbH
Mariahilfer Strasse 70/8, A-1070 Vienna
+43-1-4021235
http://www.semantic-web.at
http://www.poolparty-software.com
http://slideshare.net/semwebcompany
http://youtube.com/semwebcompany
Your Questions please….
www.big-data-europe.eu
#BigDataEurope

BigDataEurope - Empowering Communities with Data Technologies
