The Good, The Bad, and The Ugly

High Performance Computing and Big Data Conference

Published in: Technology
  • Has hidden Intel video linked to .wmv file in same directory
  • The $1000 genome sequencer is finally in sight: a fantastic achievement, with genome sequencing now outpacing Moore's Law in development. But processing the output runs at Moore's Law (plus human expert) speed. Both the cost and the time to deliver value (improved diagnosis and, ultimately, better medical outcomes for rare diseases and cancers) depend on applying compute power to data with smart analytics. Figures derived from Glenn K. Lockwood. Comparable demands will come from plant and animal genetics and food safety, such as the work at
  • Transcript

    • 1. Andrew Carr, CEO, Bull UK & Ireland © Bull, 2014
    • 2. High Performance Computing and Big Data Conference. Data: the Good, the Bad, and the Ugly
    • 4. [Video]
    • 5. The IT market is at an inflection point: its main driver is transitioning from TECHNOLOGY to USAGE. Timeline: TECHNOLOGY (1970 to 2010), USAGE (2010 to 2020). Diagram labels: Centralised IT; Distributed IT; IT as-a-Service; Information as-a-Service
    • 6. The IT market is at an inflection point. Diagram labels: Information as-a-Service; TRANSPARENT PLATFORMS; BIG DATA enabling M2M; VALUE FROM DATA; IT as-a-Service; CLOUD; SECURITY; Distributed IT; Centralised IT; HIGH PERFORMANCE COMPUTING; COMPLEX INTEGRATION; IT INFRASTRUCTURE
    • 7. Time to results: speed has value greater than size. Think Fast Data more than Big Data
    • 8. A real Big Data problem…but Fast Results? 14 Jan 2014: Illumina announces the thousand-dollar genome
      • $800 for reagents, $60 for sample preparation, $137 for ‘hardware’ over lifetime
      • Assuming you can afford 10 HiSeq X machines at $1 million each
      • You will be able to process 5 whole genomes/day, or 18,000 a year for X10
      • So just 30 systems non-stop 24/7 to meet the Genomics England 100K 2017 goal!
    • 9. A real Big Data problem…but Fast Results? But when you’ve done that, how do you process the results?
      • You now have 30-50 terabytes of raw data per machine per week
      • A HiSeq X10 cluster will require ~175,000 CPU core hours just to align results, and even more to perform variant analysis to detect cancer anomalies
      • Delivering 250,000 core hours/week 24/7 and storing the results is not trivial
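The figures on slides 8 and 9 can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that the slides do not state outright: that "5 whole genomes/day" is a per-machine rate, and that "30 systems" means 30 individual HiSeq X machines. All function and constant names are hypothetical.

```python
import math

HOURS_PER_WEEK = 7 * 24  # 168 hours of non-stop (24/7) operation

# Figures quoted on the slides.
COST_REAGENTS_USD = 800
COST_SAMPLE_PREP_USD = 60
COST_HARDWARE_USD = 137
GENOMES_PER_DAY = 5                 # ASSUMPTION: rate is per machine
RAW_TB_PER_MACHINE_WEEK = (30, 50)  # low and high estimates
TARGET_GENOMES = 100_000            # Genomics England goal
CORE_HOURS_PER_WEEK = 250_000


def cost_per_genome():
    """Sum of the three per-genome cost components, in USD."""
    return COST_REAGENTS_USD + COST_SAMPLE_PREP_USD + COST_HARDWARE_USD


def genomes_per_year(machines, days=365):
    """Genomes sequenced per year if every machine runs every day."""
    return machines * GENOMES_PER_DAY * days


def years_to_target(machines, target=TARGET_GENOMES):
    """Years of non-stop sequencing needed to reach the target."""
    return target / genomes_per_year(machines)


def sustained_cores(core_hours_per_week=CORE_HOURS_PER_WEEK):
    """Cores that must be busy 24/7 to deliver the weekly core hours."""
    return math.ceil(core_hours_per_week / HOURS_PER_WEEK)


def raw_tb_per_week(machines):
    """Low and high estimates of raw output per week, in terabytes."""
    low, high = RAW_TB_PER_MACHINE_WEEK
    return machines * low, machines * high


if __name__ == "__main__":
    print("Cost per genome (USD):", cost_per_genome())
    print("Genomes/year, 10 machines:", genomes_per_year(10))
    print("Years to 100K, 30 machines:", round(years_to_target(30), 2))
    print("Cores busy 24/7:", sustained_cores())
    print("Raw TB/week, 10 machines:", raw_tb_per_week(10))
```

Under these assumptions the cost components sum to $997, ten machines yield 18,250 genomes a year (which the slide rounds to 18,000), and 250,000 core hours a week corresponds to roughly 1,490 cores kept busy around the clock.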
    • 10. Why is data important?
    • 11. [Video]
    • 12. Turning Fans into Customers…
    • 13. Smart Stadiums
      • Smart Stadiums value:
        • 90% increase in RESPECT services & ‘Report an incident’
        • 12% new revenue: £1 per bet, ‘Man of the match’ / first sub betting
        • 85% increase in social media usage
        • 35% increase in stadium-sponsored betting
        • 8%-15% increase in club merchandising
        • Discounts on food & beverage to remove wastage
        • Twitter wall for live interactions (advertorials)
        • Real-time non-contentious replays
        • Access to secure club content (premium)
      • Become aware:
        • Traffic management
        • Security challenges
        • Weather
        • Crowd control
        • Foot-fall management
    • 14. Professor Stephen Jarvis, Director for Computing Research, University of Warwick
    • 15. Telecoms, Forensic science, Smart Cities, Government, Retail, Police, Opinion polls, Healthcare, Interpol
    • 16.
      • Performance tuning and debugging tools: used on the world’s largest supercomputers
      • Biometric solutions: FBI certified
      • Fingerprint analysis: used in UNHCR camps
      • Source camera identification: used by Interpol to classify and group explicit images
    • 17. Let’s investigate some case studies …
      • 1. Characteristics of the problem domain
        • Volume: terabytes to exabytes of existing data to process
        • Velocity: streaming data, milliseconds to seconds response time
        • Variety: structured, unstructured, text, multimedia
        • Veracity: uncertainty due to incompleteness or ambiguities
      • 2. Characteristics of the solution
        • Processing: should data processing be done sequentially or in parallel?
        • Storage: should this increase your data storage requirements?
        • Speed: where should you minimise latency: memory, network, both?
    • 18. Case study 1: You like pink milk
    • 19. Case study 1: You like pink milk
      • In 1993, Tesco’s CEO was looking to replace Green Shield trading stamps
      • DunnHumby, a small London start-up, introduced the notion of a clubcard
      • “You know more about my customers after three months than I know after 30 years” (Lord MacLaurin, Tesco Chairman)
    • 20. Case study 1: You like pink milk
      • Single most significant factor in the success of the company
      • 43M clubcard holders worldwide
      • Allows Tesco to stock unpopular brands for big-spending customers
      • 6M transactions per day present a significant volume
      • Wide application: calorie counting with Diabetes UK
    • 21. Case study 1: You like pink milk. BIG DATA
      • Characteristics: terabytes to exabytes of existing data is processed
      • Processing: batch and in parallel
      • Storage: very large volumes of data stored
      • Speed: access of data from disk; transfer of data to/from memory; delivery of results potentially slow
    • 22. Case study 2: Take heart
    • 23. Case study 2: Take heart
      • Some problems are less about volume than velocity: you want to analyse data in motion
      • Non-relational data, such as email, text, voice, video, data from instruments
    • 24. Case study 2: Take heart
      • Monitoring needs to be real-time and continuous
      • Not so much a question of storage as of spotting outliers
    • 25. Case study 2: Take heart
      • Streaming analytic solutions are being deployed into intensive care and mobile continuous health monitoring
      • Text analysis of social media for flu
    • 26. Case study 2: Take heart
      • Health analytics market estimated to be worth $21.3B by 2020
      • Compound annual growth rate of 25%
    • 27. Case study 2: Take heart. BIG DATA
      • Characteristics: streaming data; could be from heterogeneous sources from multiple sites
      • Processing: real-time and in parallel; may trigger further batch processing
      • Storage: minimal storage requirements
      • Speed: transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
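The pattern described in case study 2 (continuous monitoring, minimal storage, results delivered as alerts) can be illustrated with a rolling z-score detector. This is a minimal sketch of one common technique, not the method used in the deployments the slides mention; the function name, window size and threshold are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def stream_outliers(readings, window=30, threshold=3.0, min_samples=10):
    """Yield (index, value) for readings far from the recent baseline.

    Only the last `window` accepted readings are kept in memory, so
    storage stays constant however long the stream runs.
    """
    recent = deque(maxlen=window)  # bounded buffer of baseline readings
    for index, value in enumerate(readings):
        if len(recent) >= min_samples:
            baseline = mean(recent)
            spread = pstdev(recent)
            # Flag readings more than `threshold` standard deviations
            # from the baseline; skip the check if the baseline is flat.
            if spread > 0 and abs(value - baseline) > threshold * spread:
                yield index, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


if __name__ == "__main__":
    # Hypothetical heart-rate samples (beats per minute) with one spike.
    heart_rate = [70, 71, 69, 70, 72, 71, 70, 69, 71, 70, 140, 70, 71]
    for index, value in stream_outliers(heart_rate):
        print(f"alert: reading {index} = {value}")
```

Because flagged readings are excluded from the baseline, a sustained shift in level keeps raising alerts rather than being absorbed. Whether that is desirable depends on the application; a production monitor would also need to handle sensor drop-outs and clinically defined limits.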
    • 28. Case study 3: We built this city
    • 29. Case study 3: We built this city
      • Annual global market for Smart Cities solutions is £200B
      • Over 1,000 cities in the world with populations >500,000
      • Smart Cities research shows us the variety of data:
        • Transport cards (Oyster)
        • Sensors (traffic, pollution, weather)
        • Camera data (security, traffic)
        • GIS (people, vehicles)
        • Buildings (temperature, occupation)
    • 30. Case study 3: We built this city [Video]
    • 31. Case study 3: We built this city. What 100 million calls to NYC 311 reveal
    • 32. Case study 3: We built this city. BIG DATA
      • Characteristics: streaming and/or batch analytics; from heterogeneous sources from multiple sites
      • Processing: real-time and in parallel; may trigger further batch processing
      • Storage: minimal storage requirements
      • Speed: transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
    • 33. Case study 4: The Blackberry Riots
    • 34. Case study 4: The Blackberry Riots
      • Between 6 and 10 August 2011, thousands of people took to the streets in London
      • The disturbances began after a police shooting on 4 August in Tottenham
      • The resulting chaos required mass police deployment
      • The rioting soon spread to Birmingham, Bristol, Liverpool and Manchester
      • “Everyone watching these horrific actions will be struck by how they were organised with social media” (David Cameron, Prime Minister)
    • 35. Case study 4: The Blackberry Riots
      • Professor Rob Procter and a team from LSE and The Guardian set about investigating this claim
      • One of the largest studies of social media analytics
      • What can we learn from use of social media during times of crisis?
      • What does this tell us about veracity of data?
    • 36. Case study 4: The Blackberry Riots [Video]
      • 9pm on 8th August: @Twiggy_Garcia circulates unconfirmed reports that rioters are releasing animals at London Zoo
      • Re-tweeted by influential users with many followers
      • Rumours spread in a viral-like way over a non-hierarchical network
      • Opposition seeds within 13 minutes
      • Pictures are identified as fake
    • 37. Case study 4: The Blackberry Riots. BIG DATA
      • Characteristics: uncertainty and incompleteness exist in all data; streaming has the advantage of ‘in-flight correction’
      • Processing: real-time and in parallel, including background analysis
      • Storage: minimal additional storage requirements
      • Speed: inevitably impacts speed
    • 38.
      • Working with experts, formulate technology (hardware/software) needs
      • Identifying characteristics of problem domain
      • ‘Big data’ solutions are commonplace; ‘Fast data’ solutions are not
    • 39. Conclusion
    • 40. Discussion: Andrew.Carr@bull.co.uk, Stephen.Jarvis@warwick.ac.uk, Robert.J.Maskell@intel.com
    • 41. 0870 240 0040, www.bull.co.uk, Hemel Hempstead HP2 7DZ, information@bull.co.uk, @Bull_UK, Bull-Information-Systems
