Apache SystemML 2016 Summer class primer by Berthold Reinwald

This deck covers the logistics of the class and gives a very high-level overview of Apache SystemML.

Published in: Education

  1. Apache SystemML Class: “I predict what you will do next summer.” Summer 2016
  2. Class Description
     • Goal
        • Teach scalable machine learning with Apache SystemML
        • Attract potential contributors
     • Audience
        • Initially summer interns, with the goal of developing the class into, or folding it into, a university class
     • Duration: ~16 hours
     • Content
        • Development of scalable machine learning algorithms
        • SystemML usage and hands-on exercises
        • Advanced SystemML internals
     • Office hours
        • At Adlab: Thursday, 4-5 pm (may be expanded on demand)
  3. Outline
     1. SystemML Primer
     2. Machine Learning Algorithms
     3. Advanced SystemML Internals
  4. SystemML Primer
     • Goal
        • Teach enough DML, SystemML usage, and Spark for people to write and run SystemML algorithms on Spark and to understand how they execute.
     • Content
        • DML syntax
        • SystemML usage
        • Some Spark
  5. Machine Learning Algorithms
     • Descriptive Statistics, Data Preparation, and Train/Test/Cross-Validation
     • Regression
     • Classification
     • Clustering & Matrix Factorization
     Each session / chosen algorithm follows a similar structure:
     • Possible Applications
     • Math / Alternatives / Discussion
     • DML formulation
     • Data generation
     • Hands-on exercises
     • Performance
     • Accuracy
  6. Advanced SystemML Internals
     • Architecture
     • Compiler
        • Rewrites
        • Optimizer
     • Runtime
        • Buffer pool
        • Storage
     • Advanced Operators
     • Spark Backend
     • Performance debugging
  7. S# / Date | Category | Title | Content | Instructor
     S1, 6/21: 9-12 am, R: G1-404 | SystemML Primer | Scalable Machine Learning with Apache SystemML | Intro ML; DML; SystemML usage; Architecture | Berthold Reinwald, Nakul Jindal
     S2, 6/27: 4-6 pm, R: | ML Algs | Data Prep, Descriptive Statistics, and Train/Test/Cross-validation | Math; DML; Data-gen; Hands-on; Perf & Accuracy | Faraz Makari Manshadi
     S3, 7/5: 4-6, R: | ML Algs | Regression | Linear, log., GLM, Cox, Time series; CG method; DML; Data-gen; Hands-on; Perf & Accuracy | Alexandre Evfimievski
     S4, 7/11: 4-6, R: | ML Algs | Classification | NaïveBayes, SVM, decTree, RF; DML; Data-gen; Hands-on; Perf & Accuracy | Prithvi Sen
     S5, 7/18 4-6, R: | ML Algs | Clustering & Matrix Factorization | (kMeans, mf, ALS, PCA, …); DML; Data gen; Hands-on; Perf & Accuracy | Alexandre Evfimievski, Prithvi Sen
     S6, 7/25 4-6 pm, R: | SystemML Internals | Apache SystemML Architecture | Architecture; Hops/Lops; CP/Cluster | Berthold Reinwald, Niketan Pansare
     S7, 8/1 4-6 pm, R: | SystemML Internals | Apache SystemML Optimizer | Rewrites; Optimizer; Cost model | Matthias Boehm, Arvind Surve
     S8, 8/8 4-6 pm, R: | SystemML Internals | Apache SystemML Runtime | Buffer pool; Storage; Spark backend; Matrix block lib; Performance debugging | Matthias Boehm, Arvind Surve
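
The regression session lists linear regression, the CG method, data generation, and an accuracy check. As a rough illustration of what such an exercise might look like, here is a minimal sketch in plain Python. It is not DML and not SystemML's implementation; the function names (`generate_data`, `linreg_cg`) and all parameters are hypothetical, chosen only for this example. It generates synthetic data from known weights, solves the normal equations X^T X w = X^T y with conjugate gradient, and compares the recovered weights with the true ones.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in A]


def generate_data(n_rows, true_weights, noise=0.01, seed=0):
    """Synthetic regression data: y = X w_true + Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in true_weights] for _ in range(n_rows)]
    y = [dot(row, true_weights) + rng.gauss(0.0, noise) for row in X]
    return X, y


def linreg_cg(X, y, max_iter=100, tol=1e-12):
    """Solve the normal equations (X^T X) w = X^T y with conjugate gradient."""
    Xt = [list(col) for col in zip(*X)]   # transpose of X
    w = [0.0] * len(Xt)                   # start from w = 0
    r = matvec(Xt, y)                     # residual X^T y - X^T X w at w = 0
    p = r[:]                              # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:                  # squared residual norm small enough
            break
        q = matvec(Xt, matvec(X, p))      # (X^T X) p without forming X^T X
        alpha = rs_old / dot(p, q)        # step length along p
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return w


# Usage: recover known weights from generated data and report the largest error.
true_w = [2.0, -1.0, 0.5]
X, y = generate_data(200, true_w, noise=0.01)
w = linreg_cg(X, y)
print("estimated weights:", w)
print("max abs error:", max(abs(a - b) for a, b in zip(w, true_w)))
```

Computing (X^T X) p as two matrix-vector products avoids materializing X^T X, which is the usual reason to prefer CG over a direct solve when the feature count is large.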
