Taken some of the hype out of Big Data again - Medtech Pharma, Nürnberg July 2014

I was invited to repeat the talk about Big Data that I gave in Berlin earlier this year; those slides are also here.

The slides are similar but updated to reflect my new company, and some slides are new.
Enjoy

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • MedTech Pharma Nürnberg 2014: Taking (some of) the mystery out of Big Data
  • www.gritsystems.dk
  • Contact: Claus Stie Kallesøe, Founder, CEO, claus@gritsystems.dk, +45 30 14 15 36
  • Introduction
  • Big Data: either VERY large datasets AND/OR other complexities
    Characteristics of big data. Source: IBM methodology
  • A couple of words about scale
    • 100s of megabytes
      • This should not be a problem. Can be handled with Matlab, R, Ruby
    • 100/500 gigabytes to 1 terabyte
      • 2-terabyte hard drives can be bought in the local shop for €100
      • Connect one to your laptop and install PostgreSQL or a NoSQL database on it
    • > 5 terabytes
      • Now you might have a size issue
    Inspired by: http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
  • Big Data - “Definition” "Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
  • Cool, but remember where we are! Gartner Hype Cycle 2013
  • Big Data in Pharma R&D
  • What is Big Data in Pharma R&D?
    • Many ideas/possibilities across Pharma R&D and market access
    • But many of them are likely NOT “real” Big Data problems!
    • Are they relevant and can they bring insights?
      • Yes, very much so
    • Should we then find a way to handle them?
      • Absolutely
  • Disclaimer
    • I am a (web) tech geek
    • I have nothing against new technologies
    • Like many other geeks, I like them
    • But do try to use the right tool for the right job
  • http://blog.mongohq.com/you-dont-have-big-data/
  • Another great tool - for some
    Q: “Could you help me get to Nürnberg, pls?”
    A: “Yes, absolutely. Not a problem”
    Q: “Ok, btw I want to try the Endeavour”
    A: “...ahh why?”
    Q: “Because I have read it’s great”
    A: “Yes, but the ICE….”
  • MapReduce explained in 41 words
    Goal: Count the number of books in the library.
    Map: You count up shelf #1, I count up shelf #2. (The more people we get, the faster this part goes.)
    Reduce: We all get together and add up our individual counts.
    http://www.chrisstucchio.com/blog/2011/mapreduce_explained.html
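
The library example from the MapReduce slide can be sketched in a few lines of Python. This is only an illustration of the map and reduce steps on a single machine, not a distributed implementation; the `shelves` data and the function names are made up for the example.

```python
from functools import reduce

# Hypothetical library: each shelf is a list of book titles.
shelves = [
    ["Book A", "Book B", "Book C"],
    ["Book D"],
    ["Book E", "Book F"],
]


def count_shelf(shelf):
    """Map step: one person counts the books on one shelf."""
    return len(shelf)


def add_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def count_books(all_shelves):
    # Each shelf is counted independently, so in a real MapReduce system
    # this step could be spread over many workers.
    partial_counts = map(count_shelf, all_shelves)
    # The partial counts are then added up; 0 covers an empty library.
    return reduce(add_counts, partial_counts, 0)


total = count_books(shelves)
print(total)
```

Because the shelf counts do not depend on each other, adding more "people" (workers) only speeds up the map step; the reduce step stays a simple sum.
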
  • What is it then? Linked data?
  • Does it matter what it is? No! It’s data - and potential analytics (business) opportunities. Size and complexity should drive the technology
  • Technologies: Can we do anything on our own?
  • For many people/companies “Big data technology” is a black box: “A lot of stuff”
    And then the vendors go: If { box = magic or money } then { box = expensive }
  • Working within a community: a lot of tools available
    From: http://people10.com/blog/ruby-on-rails-the-popular-platform-for-web-development/
  • New visualisations – easy and free http://philogb.github.io/jit/demos.html
  • Automated calculations - can bring you far
    Job submitted to async calculation server
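
As a rough idea of what "job submitted to async calculation server" can look like, here is a minimal sketch that uses Python's standard `concurrent.futures` thread pool as a stand-in for the calculation server. The slide does not say which system was used; the `mean_response` calculation and the sample values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_response(values):
    """Hypothetical calculation: the average of a list of measurements."""
    return sum(values) / len(values)


def submit_job(executor, values):
    """Hand the calculation to a background worker and return immediately."""
    return executor.submit(mean_response, values)


# The thread pool plays the role of the calculation server in this sketch.
with ThreadPoolExecutor(max_workers=2) as executor:
    future = submit_job(executor, [1.0, 2.0, 3.0, 6.0])
    # The caller is free to do other work here while the job runs.
    result = future.result()  # blocks until the calculation has finished

print(result)
```

The point is the pattern rather than the tool: the user submits a job, gets a handle back at once, and collects the result when it is ready.
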
  • https://circleci.com/ Also a lot of great tools to handle data
  • Elasticsearch text indexes
    • Indexed research assay metadata => Google-like search to find the relevant assay
    • Indexed SharePoint project workspaces => Enable easy, fast cross-project queries to find trends
  • Conclusion - Big data in Pharma R&D
    • Many opportunities across R&D and market access
    • More data linking and data analytics than Big Data
    • You can use freely available tools on “normal” hardware
    • No magic “under the hood” - it’s just data
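
To show there is no magic behind a text index, the sketch below builds a toy inverted index in plain Python, the same basic idea that search engines such as Elasticsearch build on. It does not use the Elasticsearch API, and the assay ids and descriptions are invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical assay metadata; ids and descriptions are made up.
assays = {
    "assay-001": "Kinase inhibition assay in human liver cells",
    "assay-002": "Receptor binding assay, rat brain tissue",
    "assay-003": "Human kinase selectivity panel",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = build_index(assays)
print(sorted(search(index, "human kinase")))
```

A real search engine adds relevance ranking, stemming and scaling across machines, but the lookup from word to document is the core of the "Google-like" search.
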
  • BUT you still need to define the questions you want to answer – before diving into technology!
  • www.gritsystems.dk Ask….