Andrew Carr
CEO, Bull UK & Ireland
© Bull, 2014

High Performance Computing
and Big Data Conference
Data:
the Good, the Bad, and the Ugly

  • Has hidden Intel video linked to .wmv file in same directory
  • The $1000 genome sequencer is finally in sight: a fantastic achievement, with the sequencing of genomes now outpacing Moore's Law in development. BUT processing the output runs at Moore's Law (+ human expert) speed. Both the cost and the time to deliver value (improved diagnosis and ultimately better medical outcomes for rare diseases and cancers) depend on applying compute power to data with smart analytics. Figures derived from Glenn K. Lockwood. Comparable demands will come from plant and animal genetics and food safety, such as the work at …
  • The Good, The Bad, and The Ugly

    1. Andrew Carr, CEO, Bull UK & Ireland
    2. High Performance Computing and Big Data Conference. Data: the Good, the Bad, and the Ugly
    3. [No text on slide]
    4. [Video]
    5. The IT market is at an inflection point, its main driver transitioning from TECHNOLOGY to USAGE. [Diagram labels: Information as-a-Service; IT as-a-Service; Distributed IT; Centralised IT. Timeline: 1970, TECHNOLOGY, 2010, USAGE, 2020]
    6. The IT market is at an inflection point. [Diagram labels: Information as-a-Service; TRANSPARENT PLATFORMS; BIG DATA; enabling M2M; VALUE FROM DATA; IT as-a-Service; CLOUD SECURITY; Distributed IT; Centralised IT; HIGH PERFORMANCE COMPUTING; COMPLEX INTEGRATION; IT INFRASTRUCTURE]
    7. Time to results… Speed has value greater than size. Think Fast Data more than Big Data.
    8. A real Big Data problem… but Fast Results?
       14 Jan 2014: Illumina announces the Thousand Dollar Genome
       • $800 for reagents, $60 for sample preparation, $137 for ‘hardware’ over lifetime
       • Assuming you can afford 10 HiSeq X machines at $1 million each
       • You will be able to process 5 whole genomes/day: 18,000 a year for the X10
       • So just 30 systems running non-stop 24/7 to meet the Genomics England 100K goal for 2017!
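
The figures on slide 8 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the "5 whole genomes/day" figure is per machine and that an X10 runs 365 days a year, neither of which the slide states explicitly. The function and constant names are made up for this example.

```python
# Figures quoted on slide 8.
REAGENTS_USD = 800
SAMPLE_PREP_USD = 60
HARDWARE_USD = 137

MACHINES_PER_X10 = 10
GENOMES_PER_MACHINE_PER_DAY = 5  # assumption: the slide's figure is per machine
DAYS_PER_YEAR = 365              # assumption: non-stop operation


def cost_per_genome() -> int:
    """Sum of the three cost components quoted on the slide."""
    return REAGENTS_USD + SAMPLE_PREP_USD + HARDWARE_USD


def genomes_per_year(machines: int = MACHINES_PER_X10) -> int:
    """Yearly throughput of a cluster running every day of the year."""
    return machines * GENOMES_PER_MACHINE_PER_DAY * DAYS_PER_YEAR


if __name__ == "__main__":
    print(cost_per_genome())   # 997, i.e. just under the $1,000 headline
    print(genomes_per_year())  # 18250, which the slide rounds to 18,000
```

Under these assumptions the three cost items add up to $997, and ten machines at five genomes a day give roughly the 18,000 genomes a year quoted for the X10.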
    9. A real Big Data problem… but Fast Results?
       But when you’ve done that, how to process the results?
       • You now have 30-50 terabytes of raw data per machine per week
       • A HiSeq X10 cluster will require ~175,000 CPU core hours just to align results, and even more to perform variant analysis to detect cancer anomalies
       • Delivering 250,000 core hours/week 24/7 and storing results is not trivial
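
To give a feel for what 250,000 core hours a week means in hardware terms, the sketch below converts a weekly core-hour budget into the number of cores that must be busy around the clock. The `utilisation` parameter is a hypothetical addition for illustration, not a figure from the talk.

```python
import math

HOURS_PER_WEEK = 7 * 24  # 168


def cores_needed(core_hours_per_week: float, utilisation: float = 1.0) -> int:
    """Cores required to deliver a weekly core-hour budget.

    `utilisation` is the assumed fraction of each week a core spends on
    useful work (1.0 means every core is busy 24/7).
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return math.ceil(core_hours_per_week / (HOURS_PER_WEEK * utilisation))


if __name__ == "__main__":
    print(cores_needed(250_000))                   # 1489
    print(cores_needed(250_000, utilisation=0.8))  # 1861
```

Even with perfect utilisation, the workload on the slide corresponds to roughly 1,500 cores dedicated to a single sequencing cluster, before any storage is accounted for.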
    10. Why is data important?
    11. [Video]
    12. Turning Fans into Customers…
    13. Smart Stadiums…
        • 90% increase in RESPECT services & ‘Report an incident’
        • 12% new revenue, £1 per bet: ‘Man of the match’ / First Sub betting
        • 85% increase in social media usage
        • 35% increase in stadium-sponsored betting
        • 8%-15% increase in club merchandising
        • Discounts on food & beverage to remove wastage
        • Twitter wall for live interactions (advertorials)
        • Real-time non-contentious replays
        • Access to secure club content (premium)
        Smart Stadiums Value. Become aware: traffic management, security challenges, weather, crowd control, foot-fall management
    14. Professor Stephen Jarvis, Director for Computing Research, University of Warwick
    15. Telecoms, Forensic science, Smart Cities, Government, Retail, Police, Opinion polls, Healthcare, Interpol
    16. Performance tuning and debugging tools; Biometric solutions; Fingerprint analysis; Source camera identification. [Captions: Used on the world’s largest supercomputers; FBI certified; Used in UNHCR camps; Used by Interpol to classify and group explicit images]
    17. Let’s investigate some case studies…
        1. Characteristics of the problem domain
           • Volume: terabytes to exabytes of existing data to process
           • Velocity: streaming data, milliseconds to seconds response time
           • Variety: structured, unstructured, text, multimedia
           • Veracity: uncertainty due to incompleteness or ambiguities
        2. Characteristics of the solution
           • Processing: should data processing be done sequentially or in parallel?
           • Storage: should this increase your data storage requirements?
           • Speed: where should you minimise latency: memory, network, both?
    18. Case study 1: You like pink milk
    19. Case study 1: You like pink milk
        • 1993: Tesco’s CEO was looking to replace Green Shield trading stamps
        • DunnHumby, a small London start-up, introduced the notion of a clubcard
        “You know more about my customers after three months than I know after 30 years.” Lord MacLaurin, Tesco Chairman
    20. Case study 1: You like pink milk
        • Single most significant factor in the success of the company
        • 43M clubcard holders worldwide
        • Allows Tesco to stock unpopular brands for big-spending customers
        • 6M transactions per day presents significant volume
        • Wide application: calorie counting with Diabetes UK
    21. Case study 1: You like pink milk
        BIG DATA
        • Characteristics: terabytes to exabytes of existing data is processed
        • Processing: batch and in parallel
        • Storage: very large volumes of data stored
        • Speed: access of data from disk; transfer of data to/from memory; delivery of results potentially slow
    22. Case study 2: Take heart
    23. Case study 2: Take heart
        • Some problems are not so much volume as velocity, as you want to analyse data in motion
        • Non-relational data, such as email, text, voice, video, data from instruments
    24. Case study 2: Take heart
        • Monitoring needs to be real-time and continuous
        • Not so much a question of storage, as of spotting outliers
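
The point that continuous monitoring is about spotting outliers rather than storing data can be illustrated with a small streaming detector. This is a minimal sketch, not the method used in the deployments the talk refers to: it keeps only a running mean and variance (Welford's online algorithm) and flags any reading more than a chosen number of standard deviations from the mean. The class name, threshold, warm-up length and sample readings are all hypothetical.

```python
import math


class StreamingOutlierDetector:
    """Flags readings far from the running mean using constant memory."""

    def __init__(self, threshold: float = 3.0, warmup: int = 10) -> None:
        self.threshold = threshold  # hypothetical z-score cut-off
        self.warmup = warmup        # readings to observe before alerting
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value: float) -> bool:
        """Return True if `value` is an outlier, then fold it into the stats."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's online update: no past readings are stored.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_outlier


def alerts(stream):
    """Yield (index, value) for each reading flagged as an outlier."""
    detector = StreamingOutlierDetector()
    for index, value in enumerate(stream):
        if detector.update(value):
            yield index, value


if __name__ == "__main__":
    # Made-up heart-rate readings with one abnormal spike.
    readings = [72, 74, 71, 73, 75, 72, 70, 74, 73, 72, 71, 140, 73]
    print(list(alerts(readings)))  # [(11, 140)]
```

The detector's state is three numbers regardless of how long the stream runs, which matches the "minimal storage" profile of this case study. One design choice to note: flagged readings are still folded into the running statistics, so a sustained shift eventually becomes the new normal; a real monitor might handle that differently.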
    25. Case study 2: Take heart
        • Streaming analytic solutions being deployed into intensive care and mobile continuous health monitoring
        • Text analysis of social media for flu
    26. Case study 2: Take heart
        • Health analytics market estimated to be worth $21.3B by 2020
        • Compound annual growth rate of 25%
    27. Case study 2: Take heart
        BIG DATA
        • Characteristics: streaming data; could be from heterogeneous sources from multiple sites
        • Processing: real-time and in parallel; may alert further batch
        • Storage: minimal storage requirements
        • Speed: transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
    28. Case study 3: We built this city
    29. Case study 3: We built this city
        • Annual global market for Smart Cities solutions is £200B
        • Over 1,000 cities in the world with populations >500,000
        • Smart Cities research shows us the variety of data
           • Transport cards (Oyster)
           • Sensors (traffic, pollution, weather)
           • Camera data (security, traffic)
           • GIS (people, vehicles)
           • Buildings (temperature, occupation)
    30. Case study 3: We built this city [Video]
    31. Case study 3: We built this city. What 100 million calls to NYC 311 reveal
    32. Case study 3: We built this city
        BIG DATA
        • Characteristics: streaming and/or batch analytics; from heterogeneous sources from multiple sites
        • Processing: real-time and in parallel; may alert further batch
        • Storage: minimal storage requirements
        • Speed: transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
    33. Case study 4: The Blackberry Riots
    34. Case study 4: The Blackberry Riots
        • Between 6 and 10 August 2011, thousands of people took to the streets in London
        • The disturbances began after a police shooting on 4 August in Tottenham
        • The resulting chaos required mass police deployment
        • The rioting soon spread to Birmingham, Bristol, Liverpool and Manchester
        • “Everyone watching these horrific actions will be struck by how they were organised with social media” David Cameron, Prime Minister
    35. Case study 4: The Blackberry Riots
        • Professor Rob Procter and a team from LSE and The Guardian set about investigating this claim
        • One of the largest studies of social media analytics
        • What can we learn from use of social media during times of crisis?
        • What does this tell us about veracity of data?
    36. Case study 4: The Blackberry Riots
        • 9pm on 8th August: @Twiggy_Garcia circulates unconfirmed reports that rioters releasing animals at London Zoo
        • Re-tweeted by influential users with many followers
        • Rumours spread in viral-like way over non-hierarchical network
        • Opposition seeds within 13 minutes
        • Pictures are identified as fake
        [Video]
    37. Case study 4: The Blackberry Riots
        BIG DATA
        • Characteristics: uncertainty and incompleteness exists in all data; streaming has the advantage of ‘in-flight correction’
        • Processing: real-time and in parallel; inc. background analysis
        • Storage: minimal additional storage requirements
        • Speed: inevitably impacts speed
    38. • Identifying characteristics of problem domain
        • Working with experts, formulate technology (hardware/software) needs
        • ‘Big data’ solutions are commonplace; ‘Fast data’ solutions are not
    39. Conclusion… (® Copyright 2011 Gigaspaces Ltd. All Rights Reserved)
    40. Discussion: Andrew.Carr@bull.co.uk, Stephen.Jarvis@warwick.ac.uk, Robert.J.Maskell@intel.com
    41. 0870 240 0040, www.bull.co.uk, Hemel Hempstead HP2 7DZ, information@bull.co.uk, @Bull_UK, Bull-Information-Systems
