
Big Data: Beyond the hype, Delivering value

"Big Data: Beyond the hype, Delivering value" explains to members of the IDEAL-IST project what Big Data technology is and how it is transforming industry and society.

IDEAL-IST is an international ICT (Information and Communication Technologies) network with more than 65 national ICT partners from EU and non-EU countries. It assists ICT companies and research organizations worldwide that wish to find project partners for participation in the Horizon 2020 programme of the European Commission.

Published in: Data & Analytics


  1. Big Data: Beyond the hype, Delivering value. Edward Curry, Insight @ NUI Galway, ed.curry@insight-centre.org, www.edwardcurry.org
  2. About Me: Vice President
  3. New Horizons for a Data-Driven Economy: A Roadmap for Usage and Exploitation of Big Data in Europe. Co-Editors: Jose Maria Cavanillas (Atos), Prof. Wolfgang Wahlster (DFKI). Open Access PDF: http://tiny.cc/NewHorizons. Overview: •  Provides big picture on how to exploit big data, including technological, economic, political and societal issues •  Details complete lifecycle of big data value chain, ranging from data acquisition, analysis, curation and storage, to data usage and exploitation •  Illustrates potential of big data value within different sectors, including industry, healthcare, finance, energy, media and public services •  Summarizes more than two years of research with wide stakeholder consultation. Many of the slides today are based on the work of the chapter authors.
  4. Overview. Agenda: •  Part I: What is “Big Data”? •  Part II: Data Driven Innovation: Big Data is Transforming Sectors by Breaking Silos and Driving Ecosystems •  Part III: The Data Value Chain: Tools and Techniques •  Part IV: The Next Wave of Big Data Research and Innovation •  Part V: Data Science and Skills. Learning Objectives: •  Understanding of what Big Data is and how it is used •  High-level overview of key technologies -  No formulas or complex examples -  Lots of keywords (Sorry!) •  Feel for the key trends and issues
  5. PART I: WHAT IS “BIG DATA”?
  6. Definitions of Big Data
  7. The “V’s” of Big Data (09/02/16, www.bdva.eu). Volume: Data at Rest. Terabytes to exabytes of existing data to process. Velocity: Data in Motion. Streaming data, requiring milliseconds to respond. Variety: Data in Many Forms. Structured, unstructured, text, multimedia,… Veracity: Data in Doubt. Uncertainty due to data inconsistency & incompleteness, ambiguities, latency, deception. Value: Data into Money. Business models can be associated with the data. Adapted from a post by Michael Walker on 28 November 2012.
  8. Isn’t Big Data Just Hype?
  9. Big Data
  10. Emerging Technologies Hype Cycle 2015: “I would not consider big data to be an emerging technology…” - Betsy Burton, Gartner
  11. PART II: DATA DRIVEN INNOVATION: BIG DATA IS TRANSFORMING SECTORS BY BREAKING SILOS AND DRIVING ECOSYSTEMS
  12. Big Data is transforming Business models
  13. Key Enablers: Internet of Things; Availability of Data
  14. Key Enablers: Ecosystems Approaches; Open Innovation
  15. Stakeholders in a Big Data Value Ecosystem (diagram labels): Technology Providers; Data Value Chain; Core Value Chain; Extended Value Chain; Big Data Ecosystem; Suppliers of Complementary Data Products and Services; End-Users of my End-Users; Direct Data End-Users; Direct Data Suppliers; Data Value Distribution Channels; Suppliers of my Data Suppliers; Co-opetitors (Competitors and cooperation); Other Stakeholders and Peripheral Actors; Government Organisations; Regulators; Investors, Venture Capitalists & Incubators; Industry Associations; Data Marketplace; Standardisation Bodies; Start-ups and Entrepreneurs; Researchers & Academics
  16. The Dimensions of a Big Data Value Ecosystem [adapted from Cavanillas et al. (2014)]. Dimensions: Legal; Social; Economic; Technology; Application; Data & Skills. Topics (diagram labels): Ownership; Copyright; Liability; Insolvency; Privacy; User Behaviour; Societal Impact; Collaboration; Business Models; Benchmarking; Open Source; Deployment Models; Information Pricing; Data-Driven Decision Making; Risk Management; Competitive Intelligence; Digital Humanities; Internet of Things; Verticals; Industry 4.0; Scalable Data Processing; Real-Time; Statistics/ML; Linguistics; HCI/Visualisation
  17. HEALTH AND WELLBEING
  18. Macro trends driving healthcare needs: •  Increase in life expectancy: +18 years between 1950 and 2050, globally •  100% increase in the number of people >60: from 11% in 2000 to 22% in 2050 •  1 out of 2 German hospitals makes a loss •  Shortage of 90,000+ physicians in the US by 2020 •  0.2 doctors and 2 healthcare workers per 1,000 people in sub-Saharan Africa •  246 million people with diabetes, increasing to 380 m in 2025 •  Private healthcare growing fastest in emerging markets
  19. Macro trends driving healthcare needs: •  Increase in life expectancy: +18 years between 1950 and 2050, globally •  100% increase in the number of people >60: from 11% in 2000 to 22% in 2050 •  1 out of 2 German hospitals makes a loss •  Shortage of 90,000+ physicians in the US by 2020 •  0.2 doctors and 2 healthcare workers per 1,000 people in sub-Saharan Africa •  246 million people with diabetes, increasing to 380 m in 2025 •  Private healthcare growing fastest in emerging markets. Big data needed to optimize the triangle of healthcare: Cost, Quality, Access
  20. Making a difference across the health continuum (August, 2014, Philips Research): •  1,000,000 patients monitored in their homes every day •  18 petabytes of imaging study data managed for healthcare providers •  250 million appliances sold each year making homes healthier •  Hundreds of thousands of people tracking their health with ActiveLink® •  Last year 6.5 million people improved their oral health with our oral healthcare products •  101 million lives improved globally through access to diagnostic X-Ray •  275 million patients tracked with our patient monitors in 2014. Health continuum: Healthy living, Prevention, Diagnosis, Treatment, Home care
  21. Philips Big Health Data: •  6,000,000 patients monitored in their homes every day •  18 petabytes of imaging study data managed for healthcare providers •  250 million appliances sold each year making homes healthier •  Hundreds of thousands of people tracking their health with ActiveLink® •  Last year 6.5 million people improved their oral health with our oral healthcare products •  101 million lives improved globally through access to diagnostic X-Ray •  275 million patients tracked with our patient monitors in 2014
  22. Philips Big Health Data: •  6,000,000 patients monitored in their homes every day •  18 petabytes of imaging study data managed for healthcare providers •  250 million appliances sold each year making homes healthier •  Hundreds of thousands of people tracking their health with ActiveLink® •  Last year 6.5 million people improved their oral health with our oral healthcare products •  101 million lives improved globally through access to diagnostic X-Ray •  275 million patients tracked with our patient monitors in 2014
  23. Philips Big Health Data: •  6,000,000 patients monitored in their homes every day •  18 petabytes of imaging study data managed for healthcare providers •  250 million appliances sold each year making homes healthier •  Hundreds of thousands of people tracking their health with ActiveLink® •  Last year 6.5 million people improved their oral health with our oral healthcare products •  101 million lives improved globally through access to diagnostic X-Ray •  275 million patients tracked with our patient monitors in 2014
  24. DATA POOLS IN HEALTHCARE (BIG: Big Data Public Private Forum). MAIN IMPACT BY INTEGRATING VARIOUS AND HETEROGENEOUS DATA SOURCES. Clinical Data: §  Owned by providers (such as hospitals, care centers, physicians, etc.) §  Encompasses any information stored within the classical hospital information systems or EHR, such as medical records, medical images, lab results, genetic data, etc. Claims, Cost & Administrative Data: §  Owned by providers and payors §  Encompasses any data sets relevant for reimbursement issues, such as utilization of care, cost estimates, claims, etc. Pharmaceutical & R&D Data: §  Owned by the pharmaceutical companies, research labs/academia, government §  Encompasses clinical trials, clinical studies, population and disease data, etc. Patient Behaviour & Sentiment Data: §  Owned by consumers or monitoring device producers §  Encompasses any information related to patient behaviours and preferences. Health data on the web: §  Mainly open source §  Examples are websites such as PatientsLikeMe, Linked Open Data, etc. Highest impact on integrated data sets.
  25. Big Data is Impacting All Sectors: Economy, Energy, Environment, Education, Health & Wellbeing, Tourism, Mobility, Governance
  26. Citizen Sensors: “…humans as citizens on the ubiquitous Web, acting as sensors and sharing their observations and views…” -  Sheth, A. (2009). Citizen sensing, social signals, and enriching human experience. Internet Computing, IEEE, 13(4), 87-92. Air Pollution
  27. Crisis Response
  28. PART III: THE DATA VALUE CHAIN: TOOLS AND TECHNIQUES
  29. The Big Data Landscape is Complex
  30. THE DATA VALUE CHAIN (BIG 318062, Big Data Public Private Forum). Big Data Value Chain: Data Acquisition, Data Analysis, Data Curation, Data Storage, Data Usage. Data Acquisition: •  Structured data •  Unstructured data •  Event processing •  Sensor networks •  Protocols •  Real-time •  Data streams •  Multimodality. Data Analysis: •  Stream mining •  Semantic analysis •  Machine learning •  Information extraction •  Linked Data •  Data discovery •  ‘Whole world’ semantics •  Ecosystems •  Community data analysis •  Cross-sectorial data analysis. Data Curation: •  Data Quality •  Trust / Provenance •  Annotation •  Data validation •  Human-Data Interaction •  Top-down/Bottom-up •  Community / Crowd •  Human Computation •  Curation at scale •  Incentivisation •  Automation •  Interoperability. Data Storage: •  In-Memory DBs •  NoSQL DBs •  NewSQL DBs •  Cloud storage •  Query Interfaces •  Scalability and Performance •  Data Models •  Consistency, Availability, Partition-tolerance •  Security and Privacy •  Standardization. Data Usage: •  Decision support •  Predictions •  In-use analytics •  Simulation •  Exploration •  Modeling •  Control •  Domain-specific usage
  31. DATA ACQUISITION OVERVIEW. Definition: ▶  Process of gathering, filtering and cleaning data before the data is put in a data warehouse or any other storage solution on which data analysis can be carried out. Scope: ▶  Mainly driven by 4 of 9 Vs •  Volume •  Velocity •  Variety •  Value. Key Technology: ▶  Most data acquisition scenarios assume high-volume, high-velocity, high-variety but low-value data
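The definition on the data acquisition slide (gathering, filtering and cleaning data before it reaches storage) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not from the slides: the `sensor_id,value` line format, the function names, and the choice of duplicate removal as the cleaning step.

```python
from typing import Iterable, Iterator


def gather(raw_lines: Iterable[str]) -> Iterator[dict]:
    """Gather: parse hypothetical 'sensor_id,value' lines, skipping malformed ones."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # not a two-field record
        yield {"sensor": parts[0].strip(), "value": parts[1].strip()}


def filter_records(records: Iterable[dict]) -> Iterator[dict]:
    """Filter: keep only records whose value parses as a number."""
    for record in records:
        try:
            value = float(record["value"])
        except ValueError:
            continue
        yield {"sensor": record["sensor"], "value": value}


def clean(records: Iterable[dict]) -> Iterator[dict]:
    """Clean: normalise sensor ids to lower case and drop exact duplicates."""
    seen = set()
    for record in records:
        key = (record["sensor"].lower(), record["value"])
        if key in seen:
            continue
        seen.add(key)
        yield {"sensor": key[0], "value": key[1]}


def acquire(raw_lines: Iterable[str]) -> list:
    """Run gather -> filter -> clean; the result is what would go to storage."""
    return list(clean(filter_records(gather(raw_lines))))


if __name__ == "__main__":
    print(acquire(["A1, 20.5", "a1,20.5", "B2,n/a", "garbage", "B2, 3"]))
```

Because each stage is a generator, records flow through one at a time, which suits the high-volume, high-velocity setting the slide describes.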
  32. END-TO-END ARCHITECTURES. Architectures: ▶ Design end-to-end architectures for full data lifecycle ▶ Support for both “Data-at-Rest” and “Data-in-Motion” ▶ Data Hubs and Markets: Hadoop-based solutions tend to become central integration point for all enterprise data
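To make the “Data-at-Rest” versus “Data-in-Motion” distinction concrete, the following Python sketch computes the same statistic two ways: once over a stored collection, and once incrementally as values arrive. The class and function names are hypothetical, and the mean is only a stand-in for whatever analytic a real architecture would serve.

```python
class RunningMean:
    """Data in motion: update the mean one value at a time, storing no history."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        # Incremental mean: move the estimate towards the new value by 1/count.
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values: list) -> float:
    """Data at rest: the full data set is available, so one pass over it suffices."""
    return sum(values) / len(values)


if __name__ == "__main__":
    readings = [2.0, 4.0, 6.0, 8.0]
    stream = RunningMean()
    for reading in readings:
        stream.update(reading)
    print(batch_mean(readings), stream.mean)
```

The streaming version can answer after every event in constant memory, while the batch version needs the whole data set but is simpler to reason about; an end-to-end architecture has to support both styles.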
  33. DATA ANALYSIS OVERVIEW. Big Data Analysis: ▶  Big Data Analysis is concerned with making raw data which has been acquired amenable to use ▶  Supports decision making as well as domain-specific usage. Core Techniques: The techniques associated with Big Data Analysis encompass those related to data mining and machine learning, to information extraction, and to new forms of data processing and reasoning, including, for example, stream data processing and large-scale reasoning. State of the art areas: ▶  Entity summarisation ▶  Data abstraction based on ontologies and communication workflow patterns ▶  Recommendations and personal data ▶  Stream data processing ▶  Large-scale reasoning & large-scale machine learning
  34. THE ROLE OF COMMUNITY IN ANALYSIS. Community Analysis and Collection: §  Number of data collection points can be dramatically increased §  Communities are creating bespoke tools for the particular situation and to handle any problems in data collection (Developer Ecosystem) §  Citizen engagement is increased significantly. Real-time radiation monitoring; City Noise Levels
  35. DATA CURATION OVERVIEW. Definition: ▶  Digital Curation: “Selection, preservation, maintenance, collection, and archiving of digital assets” ▶  Data Curation: “Active management of data over its life-cycle”. Who? ▶  Individual Curators ▶  Curation Departments ▶  Community-based (Emerging trend). How? ▶  (Semi-)Automated ▶  Crowdsourced Data Management. Why? ▶  Accessible ▶  Authenticity ▶  Collaboration ▶  Discoverability ▶  Fitness for Use ▶  Integrity ▶  Reusability ▶  Security ▶  Sustainability ▶  Trustworthy
  36. BLENDING HUMAN AND ALGORITHM. Blended Approaches: ▶ Blended human and algorithmic data processing approaches for coping with data acquisition, transformation, curation, access, and analysis challenges for Big Data. Analytics & Algorithms: Entity Linking, Data Fusion, Relation Extraction. Human Computation: Relevance Judgment, Data Verification, Disambiguation. Internal Community: - Domain Knowledge - High Quality Responses - Trustable. External Crowd: - High Availability - Large Scale - Expertise Variety. Other diagram labels: Better Data; Web Data; Databases; Sensor Data; Programmers; Managers
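One common way to blend human and algorithmic processing is to accept an algorithm's output when it is confident and send the rest to people for verification. The Python sketch below assumes this confidence-threshold design; the slide does not prescribe it, and the function name, tuple layout and threshold value are all hypothetical.

```python
def route(predictions: list, threshold: float = 0.9) -> tuple:
    """Split (item, label, confidence) triples into auto-accepted and human-review queues.

    Items at or above the threshold are accepted automatically; the rest are
    queued for human computation (for example an internal community or an
    external crowd).
    """
    auto_accepted, needs_review = [], []
    for item, label, confidence in predictions:
        target = auto_accepted if confidence >= threshold else needs_review
        target.append((item, label))
    return auto_accepted, needs_review


if __name__ == "__main__":
    # Hypothetical entity-linking output: mention, proposed entity, confidence.
    linked = [
        ("Galway", "city:Galway", 0.97),
        ("Insight", "org:Insight_Centre", 0.62),
        ("Atos", "org:Atos", 0.91),
    ]
    accepted, review = route(linked)
    print(accepted)
    print(review)
```

Raising the threshold sends more items to people and improves quality at higher cost; lowering it does the opposite, which is the trade-off a blended approach has to tune.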
  37. RECAPTCHA: •  OCR -  ~ 1% error rate -  20%-30% for 18th and 19th century books
  38. A CROSS-SECTOR TREND… Telco, Media, & Entertainment; Manufacturing, Retail, Energy & Transport; Public Sector; Life Sciences
  39. DATA STORAGE OVERVIEW. Definition: ▶ Is responsible for analysing different aspects of storing, organizing and manipulating information on electronic data storage devices. Key Topics: ▶ Data organization and modelling ▶ Basic data manipulations (Create, Read, Update, Delete - CRUD) ▶ Data compression ▶ Data recovery, concurrency, consistency, integrity and security ▶ Database systems architecture, availability and partition tolerance
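The four basic data manipulations named on the storage slide (Create, Read, Update, Delete) can be shown with Python's built-in `sqlite3` module. This is only a small illustration: the `readings` table and its columns are hypothetical, and a relational in-memory database stands in for whichever storage technology is actually used.

```python
import sqlite3


def demo_crud() -> tuple:
    """Run one Create, Read, Update and Delete against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
    )

    # Create: insert a row and remember its generated primary key.
    cursor = conn.execute(
        "INSERT INTO readings (sensor, value) VALUES (?, ?)", ("a1", 20.5)
    )
    row_id = cursor.lastrowid

    # Read: fetch the row back.
    created = conn.execute(
        "SELECT sensor, value FROM readings WHERE id = ?", (row_id,)
    ).fetchone()

    # Update: change the stored value.
    conn.execute("UPDATE readings SET value = ? WHERE id = ?", (21.0, row_id))
    updated = conn.execute(
        "SELECT value FROM readings WHERE id = ?", (row_id,)
    ).fetchone()[0]

    # Delete: remove the row and count what is left.
    conn.execute("DELETE FROM readings WHERE id = ?", (row_id,))
    remaining = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]

    conn.close()
    return created, updated, remaining


if __name__ == "__main__":
    print(demo_crud())
```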
  40. BIG DATA STORAGE AS A COMMODITY
  41. ANALYSIS OF BIG DATA VOLUMES: Towards Integrated Analytics. Integrated Systems: •  Single data model •  Potentially higher performance •  Lower development complexity. Separate Systems: •  Different data models •  May negatively impact performance •  Higher development complexity. Diagram labels: DBMS; Data Management; Analytics; Analytical Databases; Mathworks; Rasdaman; SciDB; Revolution Analytics; Cloudera; RDBMS
  42. TRADEOFF: SIZE VS. COMPLEXITY
  43. DATA USAGE OVERVIEW (panels: Definition, Decision Making). ▶  Key task of Data Usage is to support business decisions ▶  Lookup, Learn, Investigate ▶  Exploratory browsing ▶  Search ▶  Analytics ▶  Closely related to Business Intelligence and Data Mining technologies, but extending them ▶  Off-line vs. real-time support ▶  Automated decisions. §  Decision support §  Descriptive §  Predictive §  Prescriptive analysis §  Data exploration §  Extends Visualisation to Visual Analytics §  Key areas include: §  Industry 4.0 (industrial internet) §  Predictive maintenance §  Smart data and service integration
  44. IMPROVING USABILITY. Usability: ▶ Lowering the usability barrier for data tools: Users should be able to directly manipulate the data ▶ Improvement of Human-Data interaction: Enabling experts & casual users to query, explore, transform, & curate data ▶ Interactive exploration: Big Data generates insights beyond existing models; new analysis interfaces must support browsing and modeling (visual analytics) ▶ Convergence within analytical frameworks: Analytical databases for better performance and lower development complexity (Mahout, Spark, Hadoop/R, rasdaman, SciDB)
  45. PART IV: THE NEXT WAVE OF BIG DATA RESEARCH AND INNOVATION
  46. What is the SRIA? (Strategic Research and Innovation Agenda) •  The Big Data Value Strategic Research and Innovation Agenda (BDV SRIA) defines the overall goals, main technical and non-technical priorities, and a research and innovation roadmap for the European contractual Public Private Partnership (cPPP) on Big Data Value. Latest Version: •  Version 1.0 was published by BDVA in January 2015 •  Version 2.0 due this month. SRIA is based on strong community involvement: •  Built upon inputs and analysis from SMEs and large enterprises, public organisations, and research and academic institutions •  Multiple workshops and consultations took place to ensure the widest representation of views and positions •  Approximately 200 organisations and other relevant stakeholders physically participating and contributing
  47. BDV SRIA Technical Priorities. Data Management: Engineering the management of data. Data Processing Architectures: Optimized architectures for analytics of both data at rest and data in motion, with low latency, delivering real-time analytics. Deep Analytics: Deep analytics to improve data understanding, deep learning, meaningfulness of data. Data Protection and Preservation Mechanism: To make data owners comfortable about sharing data in an experimental setting. Data Visualization and User Experience: Enable intelligent visualization of complex information relying on enhanced user experience and usability
  48. Data Management Challenges: •  How to semantically annotate unstructured and semi-structured data without imposing extra effort on data producers? •  How to unlock data silos by creating interoperability standards and technologies for storing and exchanging data? •  How to improve and assess the data quality from the various domains? •  How to ensure consistent data provenance along the data value chain? •  How to handle the sheer unbounded size of data as well as enforce consistent quality as the data scales in volume, velocity and variability? •  How to integrate analytics results from two different worlds: the data and the business processes? •  How to bundle and provision data, software and data analytics results to ensure reuse of intermediate results?
  49. Data Processing Architectures Challenges: •  How to integrate the processing of data in motion and data at rest, e.g. -  Real-time Analytics & Stream Processing -  New Big Data-specific parallelization techniques •  How to parallelize and distribute analytics tasks in order to cope efficiently with data in motion? The challenge is to develop complex analytics techniques at scale and for data in motion in order to extract knowledge out of the data and develop decision support applications •  How to analyze data generated by IoT applications? I.e. how to develop algorithms for IoT dataflow analytics •  How to ensure performance and scalability of the algorithms? I.e. the performance has to scale by orders of magnitude while reducing energy consumption, with best-effort integration between hardware and software
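The question of how to parallelize and distribute analytics tasks is often answered with a partition, process, merge pattern: split the data, compute partial results independently, then combine them. The Python sketch below shows that structure with a word count. It is illustrative only; the function names are hypothetical, and a thread pool is used purely to keep the example self-contained (in CPython, threads show the structure but do not speed up CPU-bound work the way a distributed engine would).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_partition(lines: list) -> Counter:
    """Process step: count words within one partition, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines: list, partitions: int = 4) -> Counter:
    """Partition the input, count each partition concurrently, then merge the partials."""
    chunks = [lines[i::partitions] for i in range(partitions)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        partials = list(pool.map(count_partition, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


if __name__ == "__main__":
    print(parallel_word_count(["big data", "data in motion", "data at rest"]))
```

The merge step works because counting is associative: partial counts can be combined in any order, which is the property that lets such tasks scale out across machines.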
  50. Deep Analytics Challenges: •  How to produce predictive and prescriptive analytics results? I.e. by deep learning techniques and graph mining techniques applied on extremely large graphs. Contextualization that combines heterogeneous data and data streams via graphs to improve the quality of mining processes, classifiers, and event discovery •  How to foster the semantic analysis of data? I.e. how to improve data analysis to provide a near-real-time interpretation of the data •  How to validate content? I.e. how to implement veracity models for validating content •  How to develop new and open analytics frameworks? •  How to improve the scalability and processing speed for the aforementioned algorithms? •  How to develop advanced business analytics and intelligence techniques?
  51. Data Protection and Anonymisation Challenges: •  How to ensure privacy and data anonymisation as key requirements for data sharing and exchange? •  How to foster differential privacy, private information retrieval, homomorphic encryption? •  How to provide technical means that allow data owners to control the access and usage of their data? •  How to ensure irreversibility of the anonymisation? •  How to develop scalable solutions? •  How to preserve data anonymity while ensuring high data quality?
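Of the techniques named on the data protection slide, differential privacy is the easiest to sketch briefly. A standard construction is the Laplace mechanism: answer a counting query with the true count plus Laplace noise of scale 1/epsilon, since adding or removing one person changes a count by at most 1. The Python below is a minimal sketch of that mechanism, not a production implementation; the function names and the patient records are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of matching records under the Laplace mechanism.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    Smaller epsilon means stronger privacy and a noisier answer.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical records: (patient_id, has_diabetes)
    patients = [(i, i % 4 == 0) for i in range(100)]
    print(private_count(patients, lambda p: p[1], epsilon=0.5))
```

This also illustrates the last challenge on the slide: the privacy parameter directly trades anonymity against data quality, because every released answer is deliberately inexact.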
  52. Data Visualization Challenges: •  How to present data analytics reports that encompass complex documents containing a variety of data sources? •  How to address the various design challenges in representing complex information? -  Interfaces need to be humane -  Just-in-time delivery of relevant information -  Filtering versus hiding of information •  How to enable advanced data visualisation incorporating data variety? •  How to align the user-driven vs. data-driven data access paradigm? •  How to develop intuitive interfaces while exploiting the advanced discovery aspects of Big Data analytics?
  53. Non-Technical Challenges: •  Skills development •  Business Models and Ecosystems •  Policy, Regulation and Standardization •  Social perceptions and societal implications
  54. PART V: Data Science and Skills
  55. The Skills GAP
  56. CONCLUSION
  57. The Data Landscape. Technology Evolution: ▶ Much of Big Data technology is evolutionary ▶ Old technologies applied in a new context ▶ Volume, Variety, Velocity, Value … Process Revolution: ▶ Business process change must be revolutionary to enable new opportunities ▶ Industry 4.0 (Smart Manufacturing) ▶ Predictive maintenance ▶ Opportunities for data-driven improvements ▶ Integration with customer and supplier data ▶ Moving from infrastructure services (IaaS) to software (SaaS) to business processes (BPaaS) to knowledge (KaaS)
  58. The Data Landscape. Variety and Reuse: ▶ The long tail of data variety is a major shift in the data landscape ▶ Coping with data variety and verifiability are central challenges and opportunities for Big Data ▶ Cross-sectorial uses of Big Data will open up new business opportunities ▶ Need for scalable approaches to cope with data under different format and semantic assumptions
  59. Resources on Big-Data
  60. Questions?
  61. Credits: •  Members of the BIG Project, in particular the leaders and members of the Technical Working Groups and Sectorial Forums •  Reused images are credited on each slide •  SRIA Group from the BDVA
