Taming the Big Data Beast - Together

Kennisalliantie New Year's Reception, 31 January 2013:
Prof. dr. Jacob de Vlieg: “Taming the Big Data Beast Together”
CEO and scientific director of the Netherlands eScience Center (NLeSC)


  1. 1. Netherlands eScience Center, ICT Synergy Hub, Amsterdam. Taming the Big Data Beast - Together. Kennisalliantie New Year's Meeting, Delft, 31 January 2013. Prof. dr. Jacob de Vlieg ¹ ². ¹ CEO & Scientific Director of Netherlands eScience Center, NWO-SURF. ² Head Computational Design & Discovery, CMBI, Radboud University Medical Center, Nijmegen, Netherlands
  2. 2. Agenda: • Big Data in Science: Challenges & Opportunities – Top Sector ICT Roadmap theme: “Data, Data, Data” • Netherlands eScience Center (NLeSC) – Expert centre for Big Data Research • Joint NWO-NLeSC “Big Data” project call – Public-private partnerships
  3. 3. Data are the lifeblood of modern science and the digital economy
  4. 4. Data are the lifeblood of modern science and the digital economy. Managing, analyzing, linking & re-using data to create business value and/or scientific breakthroughs, e.g. – Social media data to influence consumer choices – Sensor network data, e.g. sensor-enabled smart dikes – Imaging & biobanking data in health care, e.g. diagnostics, medicine – And many more opportunities
  5. 5. Data are the lifeblood of modern science and the digital economy. Managing, analyzing, linking & re-using data to create business value and/or scientific breakthroughs, e.g. – Social media data to influence consumer choices – Sensor network data, e.g. sensor-enabled smart dikes – Imaging & biobanking data in health care, e.g. diagnostics, medicine – And many more opportunities. Big Data: a complex concept – 4Vs: Volume, Variety, Velocity, Verification
  6. 6. Data are the lifeblood of modern science and the digital economy. Managing, analyzing, linking & re-using data to create business value and/or scientific breakthroughs, e.g. – Social media data to influence consumer choices – Sensor network data, e.g. sensor-enabled smart dikes – Imaging & biobanking data in health care, e.g. diagnostics, medicine – And many more opportunities. Big Data: a complex concept – 4Vs: Volume, Variety, Velocity, Verification. Big Data is inextricably connected to eScience/HPC. ICT top sector roadmap: e-Science is about intelligent infrastructure to model and/or to access big data
  7. 7. Key eScience challenges in Big Data research: – Cross-type data integration – Data-driven & multi-model simulations – Visualization & analytics – High performance computing: connected computers & fast networks
  8. 8. Key eScience challenges in Big Data research: – Cross-type data integration – Data-driven & multi-model simulations – Visualization & analytics – High performance computing: connected computers & fast networks – Stimulate a culture of knowledge sharing: no silos; data stewardship – Rationalization of ICT landscapes; interoperability & industry data standards – Training & education
  9. 9. Science itself is changing … We need to change with it … (Neelie Kroes in “Giving Europe’s Scientists the Tools to Deliver”). Two key words: multidisciplinary research & data-driven discovery
  10. 10. eScience and the mystery of the empty labs
  11. 11. eScience and the mystery of the empty labs
  12. 12. eScience and the mystery of the empty labs • Much more data per experiment (miniaturization and/or automation) • External data sources & outsourcing • Experimental design, data management & analytics (eScience)
  13. 13. Quantified Self Movement -> Big Data. Use apps and wearable sensors to monitor daily life, e.g. hours of sleep, food consumed, exercise taken, etc. Quantified Self = Big Data + Mobile + Sensors + Visualization + Gamification.
  14. 14. eScience Hero: Andy Grove (ex-CEO Intel). Fights for medical innovation; Parkinson’s disease • Big Data • Pattern recognition • Machine learning • Social Media
  15. 15. Voice algorithms spot Parkinson’s disease: data-driven diagnostics • Machine learning algorithms that analyse voice recordings to detect Parkinson’s symptoms early on (Little et al. @ Media Lab, MIT) • Social Media: looking for volunteers to contribute to the database to improve pattern recognition
  16. 16. Voice algorithms spot Parkinson’s disease: data-driven diagnostics • Machine learning algorithms that analyse voice recordings to detect Parkinson’s symptoms early on (Little et al. @ Media Lab, MIT) • Social Media: looking for volunteers to contribute to the database to improve pattern recognition. Social networking health sites: patient-driven data collection • 23andMe • PatientsLikeMe.com • And so on. Big Data V = Verification: privacy, compliance, etc.
  17. 17. “Data Scientist is now the hottest job title in Silicon Valley …” Tim O’Reilly, founder of O’Reilly Media, supporter of the free software and open source movements. McKinsey projected that the US needs 140,000 to 190,000 more workers with “deep analytical expertise”
  18. 18. Netherlands eScience Center. Netherlands organization for scientific research. SURF: principal Dutch body for ICT innovation for research. NL-eSC: Science Park, Amsterdam; SARA, EGI. Networked innovation model. Bridge: • Science & advanced ICT • Industry & academic research • Training & education. New ways to do research made possible because of Big Data/eScience
  19. 19. NLeSC portfolio divided in themes: • Sustainability & Environment: Climate, Water management, Energy, Ecology • Life Sciences: Green Genetics, Translational Research IT, Foods, Cognition/Neuroscience • Chemistry & Materials: Chemistry • eScience Methodology & ‘Big Data’: eScience Methodology, Astronomy • Humanities & Social Sciences: Humanities, Social Sciences
  20. 20. Can scientists from digital humanities help food researchers? Food Research: Food Specific Ontologies for Food Focused Text Mining. Project Leader: Wynand Alkema. Addressing the absence of domain-specific structured vocabularies, which limits the use of data mining & knowledge management methods in food research. Digital Humanities: BiographyNED. Project Leader: Guus Schreiber. Will improve the current version of the Biography Portal by incorporating analytical tools to show interconnections, trends, geographical maps and time lines.
  21. 21. eScience & Big Data: providing leads for new food applications
  22. 22. NLeSC eScience engineers: scientists bridging research and advanced ICT. Deliver sustainable solutions for data-driven research. Work both at center and on site.
  23. 23. Collaborative Innovation Network: Taming the Data Beast Together (SMEs, etc.). NLeSC eScience Engineers work both at center and on site: • Exchange of eScience expertise • Re-use of proven eScience (technology hopping) • Career development & training
  24. 24. Grand scientific challenges lead to innovative eScience & Big Data research. eSalsa NLeSC project: data-driven simulations & advanced visualization to understand climate change. Prof. Henk Dijkstra, Univ. of Utrecht, NLeSC Integrator Climate. Dr. Jason Maassen, eScience Engineer, NLeSC. • eScience to allow an unprecedented level of detail (large-scale distributed computing) • State-of-the-art visualization techniques to analyze hundreds of terabytes of output • Re-use of proven eScience concepts in new areas (e.g. the water sector)
  25. 25. The number of data-driven start-ups is growing, particularly when it comes to social media. Taming the Big Data Beast
  26. 26. Development of a high performance Twitter analysis platform. Hadoop – MapReduce architecture @ a large SARA computer cluster. Smart search & analysis software. Prof. Antal van den Bosch, NLeSC Integrator Humanities, Radboud University Nijmegen. Dr. Erik Tjong Kim Sang, eScience Engineer, NLeSC. Goal is to ask “Big Data” research questions, e.g. • Ability to analyze microblogging data produced over years • Time-dependent • Real-time sentiment analysis • And so on…
  27. 27. Cyber-common: a facility for 21st century data-driven research and multidisciplinary team work. The key to scientific questions yet unasked! To link minds and eScience. SURF-SARA-NLeSC
  28. 28. Cyber-common: a facility for 21st century data-driven research and multidisciplinary team work. The key to scientific questions yet unasked! To link minds and eScience. SURF-SARA-NLeSC
  29. 29. Joint NWO-NLeSC “data sciences” call • Focus on stimulating public-private partnerships • Three instruments: – Industrial Partnership Programme (IPP) – Technology Areas (TA) – Knowledge Innovation Mapping SMEs (KIEM MKB). Contacts: Rosemarie van der Veen-Oei (NLeSC), r.vanderveen@nwo.nl, T 070 3440 851; Mark Kas (NWO), m.kas@nwo.nl, T 070 3440 811, M 06 205 93 207. Netherlands eScience Center, www.nlesc.nl
  30. 30. Thank you www.esciencecenter.nl
