Successfully reported this slideshow.
Your SlideShare is downloading. ×

Machine learning

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 44 Ad

Machine learning

Download to read offline

In this deck from the HPC User Forum in Tucson, Prabhat from NERSC presents: Machine Learning.

"Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high performance computing and scientific visualization. He is also interested in applied statistics, machine learning, computer graphics and computer vision."

Watch the video presentation:
http://wp.me/p3RLEV-3R5

Watch the video presentation: http://wp.me/p3RLHQ-fbt
Learn more: http://hpcuserforum.com

Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter

In this deck from the HPC User Forum in Tucson, Prabhat from NERSC presents: Machine Learning.

"Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high performance computing and scientific visualization. He is also interested in applied statistics, machine learning, computer graphics and computer vision."

Watch the video presentation:
http://wp.me/p3RLEV-3R5

Watch the video presentation: http://wp.me/p3RLHQ-fbt
Learn more: http://hpcuserforum.com

Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter

Advertisement
Advertisement

More Related Content

Viewers also liked (20)

Similar to Machine learning (20)

Advertisement

More from inside-BigData.com (20)

Recently uploaded (20)

Advertisement

Machine learning

  1. 1. Machine Learning - 1 - Prabhat HPC User Forum April 12, 2016
  2. 2. Slide Courtesy of Nervana Systems
  3. 3. (2012) This is all great, but… •  Is Machine Learning relevant to science? •  Why should HPC facili;es care about Machine Learning, Deep Learning, Sta;s;cs? - 4 -
  4. 4. (2012) This is all great, but… •  Is Machine Learning relevant to science? –  Success stories are for images and audio, but how about scien=fic data? •  Why should HPC facili;es care about Machine Learning, Deep Learning, Sta;s;cs? –  Our applied mathema=cians are content with formula=ng and solving PDEs –  The NNSA folks care about Uncertainty Quan=fica=on –  Our data ‘analy=cs’ folks are happy dealing with meshes, computa=onal geometry, topology - 5 -
  5. 5. (2016) The writing is on the wall •  O(B) $ worth of investment by industry •  Machine Learning and Sta;s;cs are established as key disciplines for this decade –  Deep Learning has taken off as the most promising ML technique - 12 -
  6. 6. (2012) Revisited.. •  Is Machine Learning relevant to science? •  Why should HPC facili;es care about Machine Learning, Deep Learning, Sta;s;cs? - 13 -
  7. 7. Astronomy Physics Light Sources Genomics Climate (2010-2016): The Rise of Data-Intensive Science
  8. 8. - 15 -
  9. 9. - 17 -
  10. 10. - 18 -
  11. 11. - 19 -
  12. 12. - 20 -
  13. 13. - 21 -
  14. 14. - 22 -
  15. 15. - 23 -
  16. 16. 4 V’s of Scientific Big Data - 24 - Science Domain Variety Volume Velocity Veracity Astronomy Mul=ple Telescopes, mul=-band/spectra O(100) TB 100 GB/night – 10 TB/night Noisy, acquisi=on artefacts Light Sources Mul=ple imaging modali=es O(100) GB 1 Gb/s-1 Tb/s Noisy, sample prepara=on/acquisi=on artefacts Genomics Sequencers, Mass- spec, proteomics O(1-10) TB TB/week Missing data, errors High Energy Physics Mul=ple detectors O(100) TB – O(10) PB 1-10 PB/s reduced to GB/s Noisy, artefacts, spa=o- temporal Climate Simula=ons Mul=-variate, spa=o-temporal O(10) TB 100 GB/s ‘Clean’, need to account for mul=ple sources of uncertainty
  17. 17. Does Machine Learning matter? •  Is Machine Learning relevant to science? –  Yes! •  Why should HPC facili;es care about Machine Learning, Deep Learning, Sta;s;cs? –  Analy=cs is the key step for gaining scien=fic insights –  The nature of ques=ons in data-intensive science are inferen=al –  Sta=s=cs and Machine Learning deal with inference in presence of noise and errors - 25 -
  18. 18. Creating a catalog of all objects " in the Universe 1 Top 10 Data Analytics Problems
  19. 19. Fundamental Constants of Cosmology 2 Top 10 Data Analytics Problems
  20. 20. Characterizing Extreme Weather in a Changing Climate 3 Top 10 Data Analytics Problems
  21. 21. Knowledge Extraction from " Scientific Literature 4 Top 10 Data Analytics Problems
  22. 22. Understanding Speech Production 5 Top 10 Data Analytics Problems
  23. 23. Quantitative and Predictive Biology 6 Top 10 Data Analytics Problems
  24. 24. Understanding the Genetic Code 7 Top 10 Data Analytics Problems
  25. 25. 8 Top 10 Data Analytics Problems Personalized Toxicology
  26. 26. Designer Materials 9 Top 10 Data Analytics Problems
  27. 27. Fundamental Constituents of Matter 10 Top 10 Data Analytics Problems
  28. 28. Towards Synthesis (and maybe Convergence) •  What is the landscape of Machine Learning problems in science? –  Bewildering array of taxonomy and domain-specific terminology •  What are the key computa;onal mo;fs? –  Need to have a produc=ve conversa=on with HPC soaware, hardware vendors - 36 -
  29. 29. ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ Classifica;on Regression Clustering Dimensionality Reduc;on Inference Model Es;ma;on Design of Experiments Seman;c Analysis Feature Learning Anomaly Detec;on Astronomy Cosmology Climate Systems Biology Neuroscience EM/X-Ray Imaging Mass-spec Imaging Personalized Toxicology Materials Par;cle Physics
  30. 30. ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ Classifica;on Regression Clustering Dimensionality Reduc;on Inference Model Es;ma;on Design of Experiments Seman;c Analysis Feature Learning Anomaly Detec;on Astronomy Cosmology Climate Systems Biology Neuroscience EM/X-Ray Imaging Mass-spec Imaging Personalized Toxicology Materials Par;cle Physics
  31. 31. Hardware Op=mized Libraries Big Data Mo=f Scalable Algorithms Scien=fic Analysis Science Applica=ons Astronomy, Cosmology, Climate, BRAIN, BioImaging, HEP Pafern/ Anomaly Discovery Large Scale Inference Clustering, Dimensionality Reduc=on Data Fusion Genome Assembly Dense/Sparse Linear Algebra Op=miza=on (Stochas=c) Randomized Linear Algebra Many-Core Chipset, Deep Memory Hierarchy, Reducing Data Movement, Power Efficiency Graph Methods (BFS, DFS,…) MapReduce ScaLAPACK, BLAS PCL-DNN TensorFlow SpearMint RandLA TECA GraphLab Deep Learning Stochas=c Varia=onal Inference Distributed MCMC Direct Graph Kernel Computa=on Sparse Coding DBSCAN CUR/CX Machine Learning Research Strategy
  32. 32. Hardware Op=mized Libraries Big Data Mo=f Scalable Algorithms Scien=fic Analysis Science Applica=ons Astronomy, Cosmology, Climate, BRAIN, BioImaging, HEP Pafern/ Anomaly Discovery Large Scale Inference Clustering, Dimensionality Reduc=on Data Fusion Genome Assembly Dense/Sparse Linear Algebra Op=miza=on (Stochas=c) Randomized Linear Algebra Many-Core Chipset, Deep Memory Hierarchy, Reducing Data Movement, Power Efficiency Graph Methods (BFS, DFS,…) MapReduce ScaLAPACK, BLAS PCL-DNN TensorFlow SpearMint RandLA TECA GraphLab Deep Learning Stochas=c Varia=onal Inference Distributed MCMC Direct Graph Kernel Computa=on Sparse Coding DBSCAN CUR/CX Machine Learning Research Strategy
  33. 33. Machine Learning: Challenges •  Cultural –  ML doesn’t cleanly ‘fit’ within Computer Science or Applied Math –  Sta=s=cs, CS (Machine Learning, HPC) taxonomy –  Mindshare •  Afrac=ng the best academic and industry talent is hard •  Technical –  Big Data ecosystem has evolved independently of HPC –  Aspira=ons of Convergence (Soaware, Hardware) •  HPC ins=tu=ons need to do a befer job of characterizing their Data Analy=cs requirements - 41 -
  34. 34. Machine Learning: Opportunities •  HPC community is uniquely posi;oned –  Storage and Compute Hardware –  Meaningful scien=fic problems •  SoWware (Research and Produc;on) is wide open •  Most exci;ng discoveries happen at the intersec;on of domain sciences and methods –  We don’t know the limits of Deep Learning methods - 42 -
  35. 35. - 43 -
  36. 36. Thanks! - 44 - We are hiring: Big Data Architects Big Data Engineers Data Scien;sts Post-docs, interns Contact: prabhat@lbl.gov

×