Key Technology Trends for Big Data in Europe

In this presentation we discuss some of the results of the BIG project, including an analysis of foundational Big Data research technologies, technology and strategy roadmaps that enable businesses to understand the potential of Big Data technologies across different sectors, and the collaboration and dissemination infrastructure needed to link technology suppliers, integrators and leading user organizations.

Edward Curry is leading the Technical Working Group of the BIG Project with over 30 committed experts along the big data value chain (Acquisition, Analysis, Curation, Storage, Usage). With the help of the other technical leads, he will elaborate on the key technology trends identified in the BIG Project and how they bring data-driven value to industrial sectors.

Published in: Data & Analytics

  1. KEY TECHNOLOGY TRENDS FOR BIG DATA IN EUROPE
     BIG Final Event Workshop - September 30, 2014 - Heidelberg. BIG: Big Data Public Private Forum
     Edward Curry, Insight @ NUI Galway
     Tilman Becker, Andre Freitas, John Domingue, Helen Lippell, Felicia Lobillo, Ricard Munné, Axel Ngonga, Denise Paradowski, Sebnem Rusitschka, Holger Ziekow, Martin Strohbach, Sonja Zillner, and all the many, many contributors to the Technical Working Groups and Sectorial Forums
  2. OVERVIEW: Business Context; Methodology; Value-Driven Use Case; Technology Trends
  3. BUSINESS CONTEXT
  4. BIG DATA IN EUROPE
     “This is a revolution: and I want the EU to be right at the front of it.” Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda, March 2013
     “Possibly one of the few last chances for Europe’s software industry to take a true leadership.” K-H Streibich, CEO
  5. INCREASED OPENNESS: Open Innovation; Open Data; Ecosystems Approaches; Community-based Tools and Data
  6. BIG METHODOLOGY
  7. SECTORIAL FORUMS AND TECHNICAL WORKING GROUPS
     Industry Driven Sectorial Forums (Needs): Health; Public Sector; Finance & Insurance; Telco, Media & Entertainment; Manufacturing, Retail, Energy, Transport
     Technical Working Groups (Offerings), organised along the Big Data Value Chain:
     • Data Acquisition: Structured data; Unstructured data; Event processing; Sensor networks; Protocols; Real-time; Data streams; Multimodality
     • Data Analysis: Stream mining; Semantic analysis; Machine learning; Information extraction; Linked Data; Data discovery; ‘Whole world’ semantics; Ecosystems; Community data analysis; Cross-sectorial data analysis
     • Data Curation: Data Quality; Trust / Provenance; Annotation; Data validation; Human-Data Interaction; Top-down/Bottom-up; Community / Crowd; Human Computation; Curation at scale; Incentivisation; Automation; Interoperability
     • Data Storage: In-Memory DBs; NoSQL DBs; NewSQL DBs; Cloud storage; Query Interfaces; Scalability and Performance; Data Models; Consistency, Availability, Partition-tolerance; Security and Privacy; Standardization
     • Data Usage: Decision support; Prediction; In-use analytics; Simulation; Exploration; Visualisation; Modeling; Control; Domain-specific usage
  8. SECTORIAL ANALYSIS METHODOLOGY
  9. TECHNICAL WORKGROUP APPROACH
     Methodology: 1. Literature & Technical Survey; 2. Subject Matter Expert Interviews; 3. Stakeholder Workshops; 4. Online Questionnaire (with NESSI)
     Target Interviewee: Early adopters; Business enablement; Technical maturity; Key Opinion Leaders
     Interviewee Breakdown (charts): Position in Organisation (Senior Academic, Senior Management, Middle Management, Middle Researcher); Types of Organisations (University, MNC, SME, Other)
  10. SUBJECT MATTER EXPERT INTERVIEWS
  11. WORKING GROUP RESULTS
      Expert Interviews
      Technical Whitepapers:
      ▶ Executive Overview
      ▶ Key Insights
      ▶ Social & Economic Impact
      ▶ Concise State of the Art
      ▶ Future Requirements & Emerging Trends
      ▶ Sector-specific Case Studies
      Interviews, Technical White Papers, sector requisites and Roadmaps available on: http://www.big-project.eu
  12. VALUE-DRIVEN USE CASE
  13. VALUE-DRIVEN USE CASES
      Industry Driven Sectorial Forums: Health; Public Sector; Finance & Insurance; Telco, Media & Entertainment; Manufacturing, Retail, Energy, Transport
      Use cases: Public Service Integration with Open Data; Retail; Industry 4.0; Increasing Productivity of Wind Farms; Data Markets; Data-Driven Therapy Guidance
  14. THE DATA LANDSCAPE (1/2): Technology Evolution, Process Revolution
      ▶ Much of Big Data technology is developing in an evolutionary way
      ▶ Old technologies applied in a new context
      ▶ Volume, Variety, Velocity, Value …
      ▶ Business process change must be revolutionary to enable new opportunities
      ▶ Industry 4.0 (industrial internet)
      ▶ Predictive maintenance
      ▶ Opportunities for data-driven improvements
      ▶ Integration with customer and supplier data
      ▶ Moving from infrastructure services (IaaS) to software (SaaS) to business processes (BPaaS) to knowledge (KaaS)
  15. THE DATA LANDSCAPE (2/2): Variety and Reuse
      ▶ The long tail of data variety is a major shift in the data landscape
      ▶ Coping with data variety and verifiability are central challenges and opportunities for Big Data
      ▶ Cross-sectorial uses of Big Data will open up new business opportunities
      ▶ Need for scalable approaches to cope with data under different format and semantic assumptions
  16. REUSE OF HEALTH DATA: Secondary Usage of Health Data
      ▶ Aggregation, analysis and presentation of clinical, financial, administrative and other related data
      ▶ Goal is to discover new valuable knowledge
      ▶ Identify trends, predict outcomes or influence patient care, drug development, or therapy choices
      ▶ Patient recruiting & profiling for conducting clinical studies
  17. DATA POOLS IN HEALTHCARE: MAIN IMPACT BY INTEGRATING VARIOUS AND HETEROGENEOUS DATA SOURCES
      Pharmaceutical & R&D Data
      § Owned by the pharmaceutical companies, research labs/academia, government
      § Encompass clinical trials, clinical studies, population and disease data, etc.
      Clinical Data
      § Owned by providers (such as hospitals, care centers, physicians, etc.)
      § Encompass any information stored within the classical hospital information systems or EHR, such as medical records, medical images, lab results, genetic data, etc.
      Claims, Cost & Administrative Data
      § Owned by providers and payors
      § Encompass any data sets relevant for reimbursement issues, such as utilization of care, cost estimates, claims, etc.
      Patient Behaviour & Sentiment Data
      § Owned by consumers or monitoring device producers
      § Encompass any information related to patient behaviours and preferences
      Health data on the web
      § Mainly open source
      § Examples are websites such as PatientsLikeMe, Linked Open Data, etc.
      Highest impact on integrated data sets
  18. PEER ENERGY CLOUD. Dr. Martin Strohbach, Senior Researcher
  19. PEER ENERGY CLOUD
      Smart grid pilot in Saarlouis; 100 households; Berlin Innovation award
      Engage consumers to optimally use local solar energy
      § Understand consumption and save
      § Trade solar energy in the neighborhood to balance the grid
  20. DEVICE LEVEL ENERGY MONITORING
      Germany aims at 30% clean/renewable energy by 2020, seeking to build a smart grid
      Figure labels: Monitored/controlled grid today; Monitored/controlled grid tomorrow; Sensors today; Sensors tomorrow (consumer level): Energy Consumption, Temperature, Movement, ...
  21. GETTING READY FOR DATA VOLUMES IN FUTURE GRIDS
      The PeerEnergyCloud pilot allows us to get ready for future data volumes today. How much data is really needed for what?
      • Today: 1 value per year
      • Smart metering: 35,040 values per year (every 15 minutes)
      • PeerEnergyCloud: 540 million values per year (7 devices per household, every 2 seconds, 4-5 measurements per device)
      • Future possibilities: ? billion values per year. Optimum?
      What the pilot data supports:
      • Real-time analytics on mass data (grouped aggregation)
      • Scalable statistics over hundreds of millions of measurements
      • Automatic detection of load anomalies (spotting inefficiencies and defects)
      • Household activity state inference and prediction
  22. IDENTIFIED NEEDS FOR DEVICE LEVEL MONITORING
      • Managing Large Data: RDBMSs did not support our data volumes as easily as Hadoop did
      • Real-time Insights: e.g. forecasting of energy demand and anomaly detection are required to make efficient decisions
      • Data Security and Privacy: privacy- and confidentiality-preserving data analytics are required so that the service provider can retrieve knowledge without violating the agreed-upon granularity; in PEC this was realized by dynamic configurability of data access (which data, what purpose, what granularity, …)
      • Ease of Use: simplifying the application of machine learning techniques to Big Data sets would help speed up development, e.g. unified batch/stream abstractions, standardized data integration, visualization tools
  23. KEY TECHNOLOGY TRENDS
  24. THE DATA VALUE CHAIN
      Big Data Value Chain:
      • Data Acquisition: Structured data; Unstructured data; Event processing; Sensor networks; Protocols; Real-time; Data streams; Multimodality
      • Data Analysis: Stream mining; Semantic analysis; Machine learning; Information extraction; Linked Data; Data discovery; ‘Whole world’ semantics; Ecosystems; Community data analysis; Cross-sectorial data analysis
      • Data Curation: Data Quality; Trust / Provenance; Annotation; Data validation; Human-Data Interaction; Top-down/Bottom-up; Community / Crowd; Human Computation; Curation at scale; Incentivisation; Automation; Interoperability
      • Data Storage: In-Memory DBs; NoSQL DBs; NewSQL DBs; Cloud storage; Query Interfaces; Scalability and Performance; Data Models; Consistency, Availability, Partition-tolerance; Security and Privacy; Standardization
      • Data Usage: Decision support; Predictions; In-use analytics; Simulation; Exploration; Modeling; Control; Domain-specific usage
      Technical working groups examine the state of the art and future developments in big data across the whole big data value chain.
      Working groups publish technical white papers that result from desktop research and in-depth interviews with leading experts.
  25. IMPROVING USABILITY
      Usability
      ▶ Lowering the usability barrier for data tools: users should be able to directly manipulate the data
      ▶ Improvement of Human-Data interaction: enabling experts & casual users to query, explore, transform, & curate data
      ▶ Interactive exploration: Big Data generates insights beyond existing models; new analysis interfaces must support browsing and modeling (visual analytics)
      ▶ Convergence within analytical frameworks: analytical databases for better performance and lower development complexity (Mahout, Spark, Hadoop/R, rasdaman, SciDB)
  26. BLENDING HUMAN AND ALGORITHM
      Blended Approaches
      ▶ Blended human and algorithmic data processing approaches for coping with data acquisition, transformation, curation, access, and analysis challenges for Big Data
      Diagram labels:
      • Analytics & Algorithms: Entity Linking; Data Fusion; Relation Extraction
      • Human Computation: Relevance Judgment; Data Verification; Disambiguation
      • Internal Community: Domain Knowledge; High Quality Responses; Trustable
      • External Crowd: High Availability; Large Scale; Expertise Variety
      • Inputs and participants: Web Data; Databases; Sensor Data; Programmers; Managers
      • Outcome: Better Data
  27. A CROSS-SECTOR TREND… Telco, Media, & Entertainment; Manufacturing, Retail, Energy & Transport; Public Sector; Life Sciences
  28. COMMUNITY AND ECOSYSTEMS
      Community
      ▶ Solutions based on large communities (crowd-based approaches) and ecosystems are emerging as a trend to cope with Big Data challenges
      Ecosystems are Important
      ▶ Community provided data (crowd-based collection, data quality, analysis and usage)
      ▶ Community tools which are interoperable and usable
      ▶ Support from large communities or large companies
      Emerging Economic Model for Open Data
      ▶ Pre-competitive collaboration efforts
      ▶ Pistoia Alliance (pharmaceutical data)
      ▶ Share costs, risks and technical challenges
      ▶ Benefit from collective wisdom and network effect for curated dataset
  29. COMMUNITY DATA
      Community Analysis and Collection
      § Number of data collection points can be dramatically increased
      § Communities are creating bespoke tools for the particular situation and to handle any problems in data collection (Developer Ecosystem)
      § Citizen engagement is increased significantly
      Examples shown: Real-time City Noise Levels; radiation monitoring
  30. STANDARDS
      Standardization & interoperability
      ▶ Principled semantic and standardized data representation models are central to cope with data heterogeneity
      ▶ Minimum information models needed
      ▶ Significant increase in the use of new data models (i.e. graph-based) (expressivity and flexibility)
      ▶ Better integration between data tools
      ▶ Standardization of Query Interfaces
      Figure (source: TU Berlin, FG DIMA 2013), labels: Open; Technology Stacks
      Open Challenges
      • Unclear adoption paths for non-IT based sectors
      • Lack of standards and best practices is a major barrier for adoption
      • Privacy and security are lagging behind
  31. END-TO-END ARCHITECTURES
      Architectures
      ▶ Design end-to-end architectures for full data lifecycle
      ▶ Support for both “Data-at-Rest” and “Data-in-Motion”
      ▶ Data Hubs and Markets: Hadoop-based solutions tend to become central integration point for all enterprise data
  32. BIGGEST BLOCKERS (Key Technical Requirements)
      ▶ Lack of Business-driven Big Data strategies
      ▶ Undiscovered and unclaimed potential business values
      ▶ Data Sharing & Exchange
      ▶ Need for format and data storage technology standards
      ▶ Data Privacy and Security
      ▶ Regulations & markets for data access
      ▶ Legal frameworks for data sharing & communication are needed
      ▶ Human resources
      ▶ Lack of skilled data scientists and data engineers
  33. KEY INSIGHTS
      The Data Landscape
      ▶ Much of (Big Data) technology is developing in an evolutionary way
      ▶ But business process change must be revolutionary
      ▶ Data variety and verifiability are key opportunities
      ▶ Long tail of data variety is a major shift in the data landscape
      Biggest Blockers
      ▶ Lack of Business-driven Big Data strategies
      ▶ Need for format and data storage technology standards
      ▶ Data exchange between companies, institutions, individuals, etc.
      ▶ Regulations & markets for data access
      ▶ Human resources: lack of skilled data scientists and data engineers
      Key Trends
      ▶ Lower usability barrier for data tools
      ▶ Blended human and algorithmic data processing for coping with data quality
      ▶ Leveraging large communities (crowds)
      ▶ Need for semantic standardized data representation
      ▶ Significant increase in use of new data models (i.e. graph) (expressivity and flexibility)
  34. Thank you
      Dr. Edward Curry, Research Fellow, Insight @ NUI Galway. ed.curry@insight-centre.org
      Interviews, Technical White Papers, sector requisites and Roadmaps available on: http://www.big-project.eu
      http://www.bigdatavalue.eu
      Tilman Becker (DFKI, Data Usage), Andre Freitas (NUI Galway, Data Curation), John Domingue (STI, Data Analysis), Helen Lippell (Press Association, Media), Felicia Lobillo (ATOS, Retail), Ricard Munné (ATOS, Public Sector), Axel Ngonga (InfAI, Data Acquisition), Denise Paradowski (DFKI, Retail), Sebnem Rusitschka (Siemens, Energy and Transport), Holger Ziekow (AGT, PEC), Martin Strohbach (AGT, Data Storage), Sonja Zillner (Siemens, Health), and all the many, many contributors to the Technical Working Groups and Sectorial Forums
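
The data-volume figures on slide 21 can be checked with a little arithmetic. The sketch below is not from the presentation: the function name is hypothetical, and it assumes a non-leap year, a fixed sampling interval, and that the PeerEnergyCloud figure refers to a single household. Under those assumptions, one reading every 15 minutes gives exactly the quoted smart-metering figure, and 7 devices sampled every 2 seconds with 4 to 5 measurements each bracket the quoted 540 million.

```python
# Back-of-the-envelope check of the sampling figures quoted on slide 21.
# Assumptions (not stated on the slide): non-leap year, fixed sampling
# interval, and the PeerEnergyCloud figure is for one household.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def values_per_year(interval_seconds, devices=1, measurements_per_device=1):
    """Number of values produced per year at a fixed sampling interval."""
    samples = SECONDS_PER_YEAR // interval_seconds
    return samples * devices * measurements_per_device


# Smart metering: one value every 15 minutes.
smart_meter = values_per_year(15 * 60)

# PeerEnergyCloud: 7 devices, sampled every 2 seconds, 4 or 5 measurements each.
pec_low = values_per_year(2, devices=7, measurements_per_device=4)
pec_high = values_per_year(2, devices=7, measurements_per_device=5)

print(f"smart metering: {smart_meter:,} values per year")
print(f"PeerEnergyCloud: {pec_low:,} to {pec_high:,} values per year")
```

The lower and upper bounds come out at roughly 441 million and 552 million values per year, so the slide's 540 million is consistent with this per-household reading when most devices report five measurements.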
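
Slide 22 mentions that data access in PEC was configurable by data, purpose and granularity. The presentation does not describe the mechanism, so the following is only an illustrative sketch of one way such a rule could work: raw readings are averaged into time buckets no finer than the granularity agreed for a given purpose. The names `POLICY`, `coarsen` and `release`, the purposes, and the bucket sizes are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical policy: purpose -> finest time granularity (seconds) that the
# household has agreed to share. Not taken from the PeerEnergyCloud project.
POLICY = {"billing": 900, "forecasting": 60}


def coarsen(readings, granularity_seconds):
    """Average (timestamp_seconds, watts) readings into fixed-width buckets."""
    buckets = defaultdict(list)
    for timestamp, watts in readings:
        bucket_start = (timestamp // granularity_seconds) * granularity_seconds
        buckets[bucket_start].append(watts)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def release(readings, purpose):
    """Return readings only at the granularity agreed for this purpose."""
    if purpose not in POLICY:
        raise PermissionError(f"no data access agreed for purpose {purpose!r}")
    return coarsen(readings, POLICY[purpose])


raw = [(0, 100), (2, 200), (60, 300), (900, 400)]
billing_view = release(raw, "billing")          # 15-minute averages
forecasting_view = release(raw, "forecasting")  # 1-minute averages
```

The design choice here is that the service provider never sees the raw 2-second readings; it only receives aggregates at the resolution tied to a declared purpose, and an undeclared purpose is refused outright.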
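
The blended approach on slide 26 pairs algorithms (for example entity linking) with human computation (for example data verification). One common way to combine the two, shown here purely as an assumption and not as the BIG project's method, is to accept algorithmic results above a confidence threshold and route the rest to a human reviewer. The function `blend`, the threshold, and the sample data are hypothetical.

```python
def blend(candidates, threshold, ask_human):
    """Accept confident algorithmic labels; send the rest to a human reviewer.

    candidates: iterable of (item, proposed_label, confidence) tuples.
    ask_human:  callable (item, proposed_label) -> verified label.
    Returns (accepted, reviewed) lists of (item, label) pairs.
    """
    accepted, reviewed = [], []
    for item, label, confidence in candidates:
        if confidence >= threshold:
            accepted.append((item, label))
        else:
            reviewed.append((item, ask_human(item, label)))
    return accepted, reviewed


# Illustrative entity-linking output: mention, proposed entity, confidence.
links = [
    ("Galway", "Galway (city)", 0.95),
    ("Insight", "Insight (magazine)", 0.40),
]


def reviewer(item, proposed):
    """Stand-in for a human judgment; a real system would queue a task."""
    corrections = {"Insight": "Insight Centre for Data Analytics"}
    return corrections.get(item, proposed)


accepted, reviewed = blend(links, threshold=0.8, ask_human=reviewer)
```

Raising the threshold sends more items to people, which trades throughput for quality; the internal community or external crowd named on the slide would sit behind the `ask_human` callable.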