A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines with Brooke Wenig and Jules Damji

  1. A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines • Brooke Wenig • Jules S. Damji • Spark + AI Summit, SF, 6/5/2018
  2. About Us • Brooke Wenig: Machine Learning Instructor; Data Science Solution Consultant @ Databricks; Software Engineering @ Splunk & MyFitnessPal; MS Machine Learning (UCLA); Fluent in Chinese; https://www.linkedin.com/in/brookewenig/ • Jules S. Damji: Apache Spark Developer & Community Advocate @ Databricks; Program Chair, Spark + AI Summit; Software engineering @ Sun Microsystems, Netscape, @Home, VeriSign, Scalix, Centrify, LoudCloud/Opsware, ProQuest; https://www.linkedin.com/in/dmatrix; @2twitme
  3. Agenda for Today’s Talk • Impact of Big Data • Why Apache Spark? • Short Survey of 3 DL Frameworks • TensorFlow • Keras • Deep Learning Pipelines • Demo • Q&A
  4. What has Big Data Done to Us? Permeated our lives • Source: MIT
  5. Hardest Part of AI isn’t AI, it’s Data • ML Code • Configuration • Data Collection • Data Verification • Feature Extraction • Machine Resource Management • Analysis Tools • Process Management Tools • Serving Infrastructure • Monitoring • “Hidden Technical Debt in Machine Learning Systems,” Google, NIPS 2015 • Figure 1: Only a small fraction of real-world ML systems is composed of the ML code. The required surrounding infrastructure is vast and complex.
  6. What’s Apache Spark & Why
  7. Apache Spark: The First Unified Analytics Engine • Runtime • Delta • Spark Core Engine • Big Data Processing: ETL + SQL + Streaming • Machine Learning: MLlib + SparkR • Uniquely combines Data & AI technologies
  8. Survey of Three Deep Learning Frameworks
  9. What’s TensorFlow? • Open source from Google, 2015 • Current v1.8 API • Fast: Backend C/C++ • Data flow graphs • Nodes are functions/operators • Edges are input or data (tensors) • Lazy execution • Eager execution (1.7)
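The graph-based, lazy model described on this slide can be illustrated with a toy sketch in plain Python. This is not TensorFlow's API: the `Node` class and the `constant`, `add`, and `multiply` helpers are hypothetical names invented for illustration. Nodes hold operators, edges carry values between them, and nothing is computed until the graph is run; the eager line at the end computes the same value immediately.

```python
# Toy data flow graph in plain Python (illustrative only; not TensorFlow's API).
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Node:
    """A graph node: an operator plus the nodes whose outputs feed it."""
    op: Callable[..., float]
    inputs: Tuple["Node", ...] = ()

    def run(self) -> float:
        # Lazy execution: values are computed only when run() is called.
        return self.op(*(node.run() for node in self.inputs))


def constant(value: float) -> Node:
    """A source node with no inputs that always yields the same value."""
    return Node(op=lambda: value)


def add(a: Node, b: Node) -> Node:
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def multiply(a: Node, b: Node) -> Node:
    return Node(op=lambda x, y: x * y, inputs=(a, b))


# Building the graph performs no arithmetic yet.
graph = multiply(add(constant(2.0), constant(3.0)), constant(4.0))

# Running the graph walks the edges and evaluates each operator.
lazy_result = graph.run()

# Eager style: each operation returns its value as soon as it is written.
eager_result = (2.0 + 3.0) * 4.0
```

Separating graph construction from execution is what lets a framework inspect or optimize the whole computation before running it, at the cost of a less direct debugging experience than the eager style.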
  10. TensorFlow Programming Stack • CPU • GPU • Android • iOS • … • TPU • Use canned estimators • Build models • Keras Models
  11. Why TensorFlow: Community • 100K+ stars! • 11M downloads • Popular open-source code • TensorFlow Hub & Blog ○ Code Examples & Tutorials! ○ Learn + share from others
  12. Why TensorFlow: Tools • Deploy + Serve Models • TensorBoard • Visualize Tensors flow
  13. TensorFlow: We Get it … So What? • Steep learning curve, but powerful!! • Low-level APIs, but offers control!! • Expert in Machine Learning, just learn!! • Yet, high-level Estimators help, you bet!! • Better, Keras integration helps, indeed!!
  14. What’s Keras? • Open source Python Library APIs for Deep Learning • Current v2.1.6 APIs • François Chollet (Google) • API spec: TensorFlow, CNTK and Theano • Easy to Use High-Level Declarative APIs! • Build layers – Great for Neural Network Applications • Fast Experimentation, Modular & Extensible!
  15. Keras Programming Stack • CPU • GPU • Android • iOS • … • TPU • Use canned estimators • Specific Impl models • Keras API Specification • TF-Keras • Theano-Keras • CNTK • TensorFlow Workflow
  16. Why Keras? • Focuses on Developer Experience • Popular & Broader Community • Supports multiple backends • Modularity • Sequential Layers • Multi-layer input networks • Example: from keras.models import Sequential; from keras.layers import Dense, Activation; model = Sequential(); model.add(Dense(32, input_dim=784)); model.add(Activation('relu')); model.add(Dense(32, activation='softmax')) ...
  17. Transfer Learning & Deep Learning Pipelines
  18. What’s Transfer Learning? • Training from scratch requires: • Enormous amounts of data • A lot of compute resources & time • IDEA: Intermediate representations learned for one task may be useful for other related tasks
  19. Trained Model • SoftMax • GIANT PANDA 0.9 • RACCOON 0.05 • RED PANDA 0.01 • …
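The softmax stage on this slide turns a model's raw class scores into probabilities that sum to 1. A minimal sketch in plain Python follows; the raw scores are hypothetical, since the slide shows only the resulting probabilities, and the `softmax` helper is written here for illustration rather than taken from any framework.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score first for numerical stability.
    peak = max(logits.values())
    exps = {label: math.exp(score - peak) for label, score in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Hypothetical raw scores from the last layer of a trained classifier.
scores = {"giant panda": 6.0, "raccoon": 3.1, "red panda": 1.5}

probabilities = softmax(scores)

# The predicted label is the one with the highest probability.
prediction = max(probabilities, key=probabilities.get)
```

Because softmax preserves the ordering of the scores, the highest-scoring class is always the predicted label; the probabilities only express how strongly the model favors it.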
  20. Transfer Learning as a Pipeline • Classifier • Dog/Cat?
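The pipeline idea can be sketched as two stages: a frozen, pretrained featurizer followed by a small classifier that is the only part trained on the new task. The toy version below uses plain Python and stands in for the real thing: `pretrained_featurizer` is a hypothetical fixed function rather than an actual network, the "images" are synthetic pixel lists, and labels 0 and 1 stand in for the two classes.

```python
import math
import random
from typing import List, Sequence, Tuple


def pretrained_featurizer(pixels: Sequence[float]) -> List[float]:
    """Stand-in for a frozen pretrained network; it is never retrained here."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(
    features: List[List[float]], labels: List[int], epochs: int = 200, lr: float = 0.5
) -> Tuple[List[float], float]:
    """Fit logistic regression on the extracted features with gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            error = sigmoid(z) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def fake_image(low: float, high: float, size: int = 16) -> List[float]:
    """Synthetic 'image': a flat list of pixel intensities (an assumption)."""
    return [random.uniform(low, high) for _ in range(size)]


random.seed(0)
training_set = [(fake_image(0.0, 0.4), 0) for _ in range(20)]
training_set += [(fake_image(0.6, 1.0), 1) for _ in range(20)]
random.shuffle(training_set)

# Stage 1: run every image through the frozen featurizer.
features = [pretrained_featurizer(pixels) for pixels, _ in training_set]
labels = [label for _, label in training_set]

# Stage 2: train only the small classifier on top of those features.
weights, bias = train_classifier(features, labels)


def predict(pixels: Sequence[float]) -> int:
    """Full pipeline: featurize, then classify."""
    x = pretrained_featurizer(pixels)
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0
```

Only the classifier's few parameters are learned, which is why this arrangement needs far less data and compute than training the whole network from scratch.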
  21. When to use Transfer Learning? • Dataset is small & similar • Dataset is large & similar • Dataset is small but different • Dataset is large and different • Source: Andrej Karpathy’s Transfer Learning
  22. What & Why Deep Learning Pipelines (DLP)? • Open source from Databricks, 2017 • Current v1.0 APIs w/ Apache Spark 2.3 • Primarily in Python • Ease of Use & Integration • Spark MLlib Pipelines & DataFrames • TensorFlow & Keras • SQL – Deploying & Evaluating • Distributed Hyperparameter Tuning • Easy for Transfer Learning
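The slide lists the four cases without the matching recommendations. The lookup below is a paraphrase of the guidance commonly given in the cited transfer learning notes, not text from the slide; the `STRATEGIES` table and `recommend` function are hypothetical names used only to make the four cases concrete.

```python
from typing import Dict, Tuple

# Keys are (dataset size, similarity to the original training data).
# Values paraphrase commonly cited guidance; they are not from the slide.
STRATEGIES: Dict[Tuple[str, str], str] = {
    ("small", "similar"): (
        "Keep the pretrained network frozen and train a linear classifier "
        "on its top-layer features."
    ),
    ("large", "similar"): (
        "Start from the pretrained weights and fine-tune through the full network."
    ),
    ("small", "different"): (
        "Train a linear classifier on activations from an earlier layer, "
        "where features are more generic."
    ),
    ("large", "different"): (
        "Training from scratch is affordable, though initializing from "
        "pretrained weights and fine-tuning everything often still helps."
    ),
}


def recommend(size: str, similarity: str) -> str:
    """Return the suggested strategy for one of the four cases."""
    return STRATEGIES[(size, similarity)]


suggestion = recommend("small", "similar")
```

The two axes trade off against each other: a small dataset risks overfitting if too many layers are retrained, while a dataset unlike the original makes the later, more task-specific layers less reusable.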
  23. DEMO https://dbricks.co/dlf_sai_2018
  24. Takeaways: Which One & What Language?
  25. Takeaways: When to Use TF, Keras or DLP • TensorFlow: Low-level APIs & Control; Visualize with TensorBoard; Train Models or Transfer Learning; Model Serving • Keras: High-level APIs; TensorFlow Backend; Love Python; Train models or transfer learning • Deep Learning Pipelines: Integration with Spark MLlib Pipelines & DataFrames; Integrated with TF & Keras; Transfer Learning
  26. Resources • Blog posts, talks & webinars (http://databricks.com/blog) • Deep Learning Pipelines • GPU acceleration in Databricks • Deep Learning and Apache Spark • Build Scalable Deep Learning Pipelines • Deep Learning course: fast.ai • TensorFlow Tutorials • TensorFlow Dev Summit • Keras/TensorFlow Tutorials • MLFlow.org • Docs for Deep Learning on Databricks (http://docs.databricks.com) • Deep Learning Pipelines Example • Apache Spark integration
  27. Thank You! Questions? brooke@databricks.com jules@databricks.com (@2twitme)