Taming the Big Data Beast - Together

Kennisalliantie New Year's Reception, 31 January 2013:
Prof. dr. Jacob de Vlieg: “Taming the Big Data Beast Together”
CEO and scientific director of the Netherlands eScience Center (NLeSC)

Published in Technology

Transcript

  • 1. Netherlands eScience Center, ICT Synergy Hub, Amsterdam
      Taming the Big Data Beast - Together
      Kennisalliantie New Year's meeting, Delft, 31 January 2013
      Prof. dr. Jacob de Vlieg ¹ ²
      1. CEO & Scientific Director of Netherlands eScience Center, NWO-SURF
      2. Head Computational Design & Discovery, CMBI, Radboud University Medical Center, Nijmegen, Netherlands
  • 2. Agenda
      • Big Data in Science: Challenges & Opportunities
          – Top Sector ICT Roadmap theme: “Data, Data, Data”
      • Netherlands eScience Center (NLeSC)
          – Expert centre for Big Data Research
      • Joint NWO-NLeSC “Big Data” project call
          – Public-private partnerships
  • 6. Data are the lifeblood of modern science and the digital economy
      Managing, analyzing, linking & re-using data to create business value and/or scientific breakthroughs, e.g.
          – Social media data to influence consumer choices
          – Sensor network data, e.g. sensor-enabled smart dikes
          – Imaging & biobanking data in health care, e.g. diagnostics, medicine
          – And many more opportunities
      Big Data: a complex concept
          – 4Vs: Volume, Variety, Velocity, Verification
      Big Data inextricably connected to eScience/HPC
      ICT top sector roadmap: e-Science is about intelligent infrastructure to model and/or to access big data
  • 8. Key eScience challenges in Big Data research
      – Cross-type data integration
      – Data-driven & multi-model simulations
      – Visualization & analytics
      – High performance computing: connected computers & fast networks
      – Stimulate culture of knowledge sharing: no silos; data stewardship
      – Rationalization of ICT landscapes; interoperability & industry data standards
      – Training & education
  • 9. “Science itself is changing … We need to change with it …” Neelie Kroes in “Giving Europe’s Scientists the Tools to Deliver”
      Two key words: multidisciplinary research & data-driven discovery
  • 12. eScience and the mystery of the empty labs
      • Much more data per experiment (miniaturization and/or automation)
      • External data sources & outsourcing
      • Experimental design, data management & analytics (eScience)
  • 13. Quantified Self Movement -> Big Data
      Use apps and wearable sensors to monitor daily life, e.g. hours of sleep, food consumed, exercise taken, etc.
      Quantified Self = Big Data + Mobile + Sensors + Visualization + Gamification
  • 14. eScience Hero
      Fights for medical innovation; Parkinson’s disease
      • Big Data
      • Pattern recognition
      • Machine learning
      • Social Media
      Andy Grove (ex-CEO Intel)
  • 16. Voice algorithms spot Parkinson’s disease: data-driven diagnostics
      • Machine learning algorithms that analyse voice recordings to detect Parkinson’s symptoms early on (Little et al. @ Media Lab, MIT)
      • Social Media: Looking for volunteers to contribute to the database to improve pattern recognition
      Social networking health sites: patient-driven data collection
      • 23andMe
      • PatientsLikeMe.com
      • And so on
      Big Data V = Verification: privacy, compliance, etc.
  • 17. “Data Scientist is now the hottest job title in Silicon Valley…” Tim O’Reilly, founder of O’Reilly Media, supporter of the free software and open source movements
      McKinsey projected that the US needs 140,000 to 190,000 more workers with “deep analytical expertise”
  • 18. Netherlands eScience Center (NL-eSC)
      Netherlands organization for scientific research: NWO
      Principal Dutch body for ICT innovation for research: SURF
      Science park, Amsterdam; SARA, EGI
      Networked innovation model
      Bridge:
      • Science & advanced ICT
      • Industry & Academic Research
      • Training & Education
      New ways to do research made possible because of Big Data/eScience
  • 19. NLeSC portfolio divided in themes
      • Sustainability & Environment: Climate; Water management; Energy; Ecology
      • Life Sciences: Green Genetics; Translational Research IT; Foods; Cognition/Neuroscience
      • Chemistry & Materials: Chemistry
      • eScience Methodology & ‘Big Data’: eScience Methodology; Astronomy
      • Humanities & Social Sciences: Humanities; Social Sciences
  • 20. Can scientists from digital humanities help food researchers?
      Food Research: Food Specific Ontologies for Food Focused Text Mining
          Project Leader: Wynand Alkema
          Addressing absence of domain specific structured vocabularies, which limits the use of data mining & knowledge management methods in food research.
      Digital Humanities: BiographyNED
          Project Leader: Guus Schreiber
          Will improve current version of the Biography Portal by incorporating analytical tools to show interconnections, trends, geographical maps and time lines.
  • 21. eScience & Big Data: providing leads for new food applications
  • 22. NLeSC eScience engineers: scientists bridging research and advanced ICT
      Deliver sustainable solutions for data-driven research
      Work both at center and on site
  • 23. Collaborative Innovation Network: Taming the Data Beast Together
      SMEs, etc.
      NLeSC eScience Engineers work both at center and on site:
      • Exchange of eScience expertise
      • Re-use of proven eScience (technology hopping)
      • Career development & training
  • 24. Grand scientific challenges lead to innovative eScience & Big Data Research
      eSalsa NLeSC project: data-driven simulations & advanced visualization to understand Climate Change
      Prof. Henk Dijkstra, Univ. of Utrecht, NLeSC Integrator Climate
      Dr. Jason Maassen, eScience Engineer NLeSC
      • eScience to allow unprecedented level of detail (large scale distributed computing)
      • State-of-the-art visualization techniques to analyze hundreds of Terabytes of output
      • Re-use of proven eScience concepts in new areas (e.g. sector water)
  • 25. The number of data-driven start-ups is growing—particularly when it comes to social media. Taming the Big Data Beast
  • 26. Development of a high performance Twitter analysis platform
      Hadoop – MapReduce architecture @ a large SARA computer cluster
      Smart search & analysis software
      Goal is to ask “Big Data” research questions, e.g.
      • Ability to analyze microblogging data produced over years
      • Time dependent
      • Real time sentiment analysis
      • And so on…
      Prof. Antal van den Bosch, NLeSC Integrator Humanities, Radboud University Nijmegen
      Dr. Erik Tjong Kim Sang, eScience Engineer, NLeSC
  • 28. Cyber-common: a facility for 21st century data-driven research and multidisciplinary team work
      The key to scientific questions yet unasked!
      To link minds and eScience
      SURF-SARA-NLeSC
  • 29. Joint NWO-NLeSC “data sciences” call
      • Focus on stimulating public-private partnerships
      • Three instruments:
          – Industrial Partnership Programme (IPP)
          – Technology Areas (TA)
          – Knowledge Innovation Mapping SMEs (KIEM MKB)
      Rosemarie van der Veen-Oei (NLeSC), r.vanderveen@nwo.nl, T 070 3440 851
      Mark Kas (NWO), m.kas@nwo.nl, T 070 3440 811, M 06 205 93 207
      Netherlands eScience Center, www.nlesc.nl
  • 30. Thank you www.esciencecenter.nl
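
Slide 26 mentions a Hadoop MapReduce architecture for time-dependent analysis of tweets. The slides give no implementation details, so the following is only a minimal in-memory sketch of the map and reduce steps in plain Python, not the NLeSC platform. The sample tweets, the word lists and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records: (ISO date, tweet text). Not real data.
TWEETS = [
    ("2013-01-30", "big data is great"),
    ("2013-01-30", "traffic is bad today"),
    ("2013-01-31", "great talk in Delft"),
]

# Toy sentiment word lists, assumed for illustration only.
POSITIVE = {"great", "good"}
NEGATIVE = {"bad", "awful"}


def map_tweet(record):
    """Map step: emit (day, (tweet_count, sentiment_score)) for one tweet."""
    day, text = record
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return day, (1, score)


def reduce_pairs(acc, pair):
    """Reduce step: add one mapped pair to the running per-day totals."""
    day, (count, score) = pair
    total_count, total_score = acc[day]
    acc[day] = (total_count + count, total_score + score)
    return acc


def daily_sentiment(tweets):
    """Run map then reduce, returning {day: (tweet_count, sentiment_score)}."""
    mapped = map(map_tweet, tweets)
    return dict(reduce(reduce_pairs, mapped, defaultdict(lambda: (0, 0))))


if __name__ == "__main__":
    for day, (count, score) in sorted(daily_sentiment(TWEETS).items()):
        print(day, count, score)
```

On a real cluster the map and reduce functions would run in parallel over partitions of the data; here they run sequentially in one process, which is enough to show how per-day counts and sentiment totals are combined.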