Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data

SCIENCE FOUNDATION IRELAND DIGITAL CONTENT WORKSHOP

Monday, July 25th 2011, Guinness Storehouse, Dublin
Session 4 - Data Analytics, Mining and Visualisation

Dr Eoin Brazil, Senior Software Developer and Tech Transfer Manager, Irish Centre for High End Computing (NUIG)

Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data.


  1. Consultancy – Pragmatic Analytics
     Irish Centre for High End Computing
     Dr. Eoin Brazil
     www.ichec.ie/consultancy
  2. Technology Transfer @ ICHEC
     • Started just over eighteen months ago
     • Core competencies include:
       – Performance Optimization
       – Data Mining/Analytics (e.g. Computational Finance)
     • Consultancy
     • Training (e.g. R - & TSA / & AC, CUDA, HPC, etc.)
     SFI Enterprise Workshop - 25th July 2011
  4. Visual Exploration
  5. Example – Wine Vintage
     • Hot, dry summers give higher prices in mature wines
     • Château Pétrus 2000 ~$60,000 (liv-ex.com)
     • Bordeaux Equation
     • Wine quality = 12.145 + 0.00117 × Winter Rainfall + 0.0614 × Average Growing Season Temperature – 0.00386 × Harvest Rainfall
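As a rough illustration, the Bordeaux Equation on this slide can be evaluated directly. The sketch below uses only the coefficients shown on the slide; the function name, the units (millimetres and degrees Celsius) and the input values are assumptions made for the example, not data from the talk.

```python
def bordeaux_quality(winter_rain_mm: float,
                     growing_temp_c: float,
                     harvest_rain_mm: float) -> float:
    """Evaluate the Bordeaux Equation using the slide's coefficients.

    Units (mm of rain, degrees Celsius) are assumed; the slide does not state them.
    """
    return (12.145
            + 0.00117 * winter_rain_mm
            + 0.0614 * growing_temp_c
            - 0.00386 * harvest_rain_mm)


# Hypothetical inputs, chosen only to show the effect of a wetter harvest.
wet_harvest = bordeaux_quality(600.0, 17.0, 200.0)
dry_harvest = bordeaux_quality(600.0, 17.0, 100.0)

# The harvest-rainfall coefficient is negative, so the drier harvest scores higher.
print(f"wet harvest: {wet_harvest:.4f}")
print(f"dry harvest: {dry_harvest:.4f}")
```

With everything else held equal, each extra 100 units of harvest rainfall lowers the score by 0.386, which matches the slide's point that dry conditions at harvest favour the vintage.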
  6. Financial services – Computational Finance
  7. Real-World Constraints
     • My application / workflow:
       – Deal with 2B+ transactions per day per site
       – Less than 50ms for end-to-end processing
       – Need real-time detection of fraud
       – Multiple coupled models in ensemble
       – Production platform is X
       – Cannot incorrectly classify good client as fraudster
       – Data size is too large for my infrastructure
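To get a feel for the first two constraints on this slide, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, assuming transactions arrive evenly over 24 hours (real traffic will peak well above the average) and that Little's law applies; the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(transactions_per_day: float) -> float:
    """Average transactions per second, assuming an even spread over the day."""
    return transactions_per_day / SECONDS_PER_DAY


def in_flight(rate_per_second: float, latency_seconds: float) -> float:
    """Little's law: mean items in the system = arrival rate x time in system."""
    return rate_per_second * latency_seconds


rate = required_rate(2_000_000_000)   # 2B transactions per day per site
concurrent = in_flight(rate, 0.050)   # 50 ms end-to-end budget

print(f"average rate: {rate:,.0f} transactions/s")
print(f"in flight at 50 ms: {concurrent:,.0f}")
```

Under these assumptions a site needs to sustain roughly 23,000 transactions per second on average, with over a thousand in flight at any moment if each one uses its full 50 ms budget.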
  8. Are you ready for Big Data?
     • Hadoop is x50+ slower on relational data, can be x1000+ slower on graph data
     • Make sure you hone the tool first:
       – MCMC x53 faster using Rcpp versus R
       – Linear Regression x8 using Eigen via R
       – x15 BLAS/LAPACK with ICC flags and hardware in R
       – Rmpi / multicore / MKL / pnmath / MR / gputools
  9. What are GPGPUs?
     • Disruptive Innovation in Parallel Computing
       – HPC from desktop to supercomputers (10 Gen leap)
  12. Typical Business Results (Domain: Result)
     • Computational Finance: 1 or 8 Cards (x121/x950) = Do in 1 second what used to take 2/16 minutes, 10 generations of processor
     • Oil and Gas: Data processing = x2 – x6 (profiling at this stage), e.g. if volume took 44 mins could be done in 22 – 7 ½ mins
     • Life Sciences: Patient analytics, initial prototype for cardio-vascular disease detection (~72% accuracy), ongoing work.
     • Telecomms: Fraud detection prototype for subscription fraud, Detection (~99% accuracy), avoided predicting good clients as fraudster*
     • Electronic Commerce: Demand forecasting & customer segmentation = Using historic data to predict future demand (~90% accuracy) & identified valuable clients (~80% accuracy)
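The speedup figures in the Computational Finance and Oil and Gas rows can be checked with simple arithmetic. The sketch below is only a sanity check on the numbers quoted on the slide; the function names are hypothetical, and the slide's own figures (2/16 minutes, 7 ½ mins) appear to be rounded.

```python
def baseline_minutes_for_one_second(speedup: float) -> float:
    """Original runtime, in minutes, of a job that takes 1 second after the speedup."""
    return speedup * 1.0 / 60.0


def accelerated_minutes(baseline_minutes: float, speedup: float) -> float:
    """Runtime after applying a speedup factor to a baseline runtime."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


# Computational Finance: x121 on 1 card, x950 on 8 cards.
one_card = baseline_minutes_for_one_second(121)
eight_cards = baseline_minutes_for_one_second(950)

# Oil and Gas: a 44-minute volume at x2 and at x6.
oil_gas_slow = accelerated_minutes(44, 2)
oil_gas_fast = accelerated_minutes(44, 6)

print(f"x121 -> 1 s replaces about {one_card:.1f} min")
print(f"x950 -> 1 s replaces about {eight_cards:.1f} min")
print(f"44 min at x2 -> {oil_gas_slow:.1f} min, at x6 -> {oil_gas_fast:.1f} min")
```

The results land close to the slide's figures: about 2 and 16 minutes for the two GPU configurations, and 22 minutes down to a little over 7 minutes for the 44-minute volume.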
  13. Acknowledgements
     Supported by Science Foundation Ireland under grant 08/HEC/I1450 and by HEA’s PRTLI-C4.
