Taking some of the hype out of Big Data again - MedTech Pharma, Nürnberg, July 2014

I was invited to redo the talk about Big Data that I gave in Berlin earlier this year - slides also here.
The slides are similar but updated to reflect my new company, and some slides are new.
Enjoy

Published in: Technology

  1. 1. MedTech Pharma Nürnberg 2014 Taking (some of) the mystery out of Big Data
  2. 2. www.gritsystems.dk
  3. 3. Contact Claus Stie Kallesøe Founder, CEO claus@gritsystems.dk +45 30 14 15 36
  4. 4. Introduction
  5. 5. Big Data – Either VERY large datasets AND/OR other complexities Characteristics of big data Source: IBM methodology
  6. 6. A couple of words about scale • 100s of Megabytes • This should not be a problem. Can be handled with Matlab, R, Ruby • 100/500 Gigabytes – 1 Terabyte • 2 Terabyte hard drives can be bought in the local shop for €100 • Connect one to your laptop and install PostgreSQL or a NoSQL database on it • > 5 Terabytes • Now you might have a size issue Inspired by: http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
  7. 7. Big Data - “Definition” "Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
  8. 8. Cool, but remember where we are! Gartner Hype Cycle 2013
  9. 9. Big Data in Pharma R&D
  10. 10. What is Big Data in Pharma R&D? • Many ideas/possibilities across Pharma R&D and market access • But many of them are likely NOT “real” Big Data problems! • Are they relevant and can they bring insights? • Yes, very much so • Should we then find a way to handle them? • Absolutely
  11. 11. Disclaimer • I am a (web) tech geek • I have nothing against new technologies • Like many other geeks, I like them • But do try to use the right tool for the right job
  12. 12. http://blog.mongohq.com/you-dont-have-big-data/
  13. 13. Another great tool - for some Q: “Could you help me get to Nürnberg, pls?” A: “Yes, absolutely. Not a problem” Q: “Ok, btw I want to try the Endeavour” A: “...ahh why?” Q: “Because I have read it’s great” A: “Yes, but the ICE….”
  14. 14. MapReduce explained in 41 words Goal: Count the number of books in the library. Map: You count up shelf #1, I count up shelf #2. (The more people we get, the faster this part goes. ) Reduce: We all get together and add up our individual counts. http://www.chrisstucchio.com/blog/2011/mapreduce_explained.html
  15. 15. What is it then? Linked data?
  16. 16. Does it matter what it is? No! It’s data - and potential analytics (business) opportunities. Size and complexity should drive the technology
  17. 17. Technologies Can we do anything on our own
  18. 18. For many people/companies ”Big data technology” is a black box ”A lot of stuff” And then the vendors go: If { box = magic or money} then { box = expensive}
  19. 19. Working within a community A lot of tools available From: http://people10.com/blog/ruby-on-rails-the-popular-platform-for-web-development/
  20. 20. New visualisations – easy and free http://philogb.github.io/jit/demos.html
  21. 21. Automated calculations - can bring you far Job submitted to async calculation server
  22. 22. https://circleci.com/ Also a lot of great tools to handle data
  23. 23. Elasticsearch text indexes • Indexed research assay metadata => Google like search to find the relevant assay • Indexed sharepoint project workspaces => Enable easy, fast cross project queries to find trends
  24. 24. Conclusion – Big data in Pharma R&D • Many opportunities across R&D and market access • More data linking and data analytics than Big Data • You can use freely available tools on ”normal” hardware • No magic ”Under the hood” – it’s just data
  25. 25. BUT you still need to define the questions you want to answer – before diving into technology!
  26. 26. www.gritsystems.dk Ask….
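
The library analogy on slide 14 (MapReduce explained in 41 words) can be sketched in a few lines of Python. This is only an illustration on a single machine, not how Hadoop or any other framework implements it: the shelf contents, the function names count_shelf, add_counts and count_books, and the use of a thread pool as the "people" are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical library: each shelf is simply a list of book titles.
shelves = [
    ["Moby-Dick", "Dracula", "Emma"],
    ["Ulysses", "Hamlet"],
    ["Walden", "Beloved", "Dune", "Ivanhoe"],
]


def count_shelf(shelf):
    """Map step: one worker counts the books on a single shelf."""
    return len(shelf)


def add_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def count_books(all_shelves, workers=3):
    """Count every book by mapping over shelves, then reducing the results."""
    # Map: each shelf is counted independently, so more workers can share the job.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_shelf, all_shelves))
    # Reduce: add up the individual counts (0 is the result for an empty library).
    return reduce(add_counts, partial_counts, 0)


total = count_books(shelves)
print(total)
```

At this size a plain sum(len(s) for s in shelves) gives the same answer with less machinery, which is the talk's point: the map/reduce split only pays off once the shelves no longer fit on one machine.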
