Training at AI Frontiers 2018 - LaiOffer Data Session: How Spark Speedup AI

Topic: How to use big data to enhance AI
Outline:
1. Spark ETL
   Spark SQL
   Spark Streaming
2. Spark ML
   Spark ML pipeline
   Distributed model tuning
   Spark ML model and data lineage management

3. Spark XGBoost
   XGBoost introduction
   XGBoost with Spark
   XGBoost with GPU

4. Spark Deep Learning pipeline
   Transfer learning
   Build Spark ML pipeline with TensorFlow
   Model selection on distributed TF model

  1. How Spark Speedup AI
     Mike Tang
     Learn more about CS job hunting: scan the QR code to follow us on WeChat (www.laioffer.com, laiofferhelper2)
  2. Outline
     ● Spark ecosystem
     ● Spark ML and XGBoost
     ● Spark Deep Learning pipeline
  3. Big data
  4. What is machine learning or AI?
     ⬢ Database, big data, machine learning, AI?
     ⬢ “Using algorithms to understand the patterns in data”
     Prediction, insight
  5. History of big data
     ⬢ Application driven
       ○ Billions of web pages
       ○ New system requirements
         ■ Cheap
         ■ Robust
         ■ Efficient
       ○ 2004 Google
       ○ 2007 Yahoo
       ○ Hadoop ecosystem
         ■ HDFS
         ■ MAPR
         ■ YARN
       ○ 2012 Hortonworks
  6. Applications driven for big data
     ⬢ Ecosystem of Hadoop
       ○ How does Facebook use Hadoop?
         ■ Hive for OLAP query processing
         ■ HBase for tracking the activity of billions of users
       ○ How does Twitter use Hadoop?
         ■ Storm: stream processing for Twitter stream data
       ○ How does LinkedIn use Hadoop?
         ■ Kafka to subscribe to users' streaming data
       ○ When the Hadoop components come together
         ■ Ambari: node management and deployment of the different components
  7. The leading data science platform for big data
     (Diagram labels: Apache Spark, Hadoop, Interactive, Streaming, Batch, NoSQL, TensorFlow)
     ⬢ Apache Spark
       ○ Driven by machine learning applications
       ○ The leading computation engine for big data processing
       ○ Data pipeline across different data sources and other computation engines
       ○ Uniform data processing objects: RDD and DataFrame
       ○ Memory based
  8. Data pipeline for machine learning
     (Diagram: a Resilient Distributed Dataset spread across four servers)
     (Pipeline stages: ETL, Exploration, Machine learning)
     (Stage labels: Structural data; RAW data processing; Interactive, OLAP, Spark SQL; Feature engineering; Model training; Data Product; Visualization)
  9. ML is only a small part of a real-world ML system
  10. Bring Data Science to Big Data
      (Diagram labels: Retraining, History data, Feedback data, Data scientist, Continuous updating, Deploying, Operational data, ML Model, Feature engineering, Model selection, Model tuning, ML Pipeline, Scoring)
  11. Outline
      ● Spark ecosystem
      ● Spark XGBoost
      ● Spark Deep Learning pipeline
  12. Motivation
      ⬢ Machine learning for big data
      ⬢ Application lists
        ○ House price prediction
        ○ CTR prediction
        ○ …
        ○ Product recommendation
      ⬢ ML job categories
        ○ Regression
        ○ Classification
        ○ Clustering
        ○ Etc.
      ⬢ XGBoost is good at
        ○ Regression and classification
  13. Motivation
      ⬢ XGBoost is the state-of-the-art approach on Kaggle for structured data
        ○ 80% of winning teams build on XGBoost
        ○ A tree-based model
        ○ Excellent at classification and regression
        ○ Ref: http://xgboost.readthedocs.io/en/latest/model.html
  14. Motivation
      ⬢ Ensembles and boosting make model training time consuming
        ○ Ensemble
        ○ An ensemble is a combination of prediction models that outputs a single final result
  15. Motivation
      ⬢ Ensembles and boosting make model training time consuming
        ○ Gradient Boosting
        ○ Multiple rounds (1…M) of iterations, each correcting the errors of the previous round
        ○ Ref: https://www.slideshare.net/LonghowLam/machine-learning-overview
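The round-by-round correction described on this slide can be sketched in plain Scala. This is a minimal illustration, not XGBoost's implementation: it assumes squared-error loss, a single numeric feature with at least two distinct values, and one-split regression stumps as weak learners. The names `BoostingSketch`, `Stump`, `fitStump` and `boost` are hypothetical.

```scala
object BoostingSketch {
  // A one-split regression stump on a single numeric feature.
  final case class Stump(threshold: Double, left: Double, right: Double) {
    def predict(x: Double): Double = if (x <= threshold) left else right
  }

  private def mean(v: Seq[Double]): Double = if (v.isEmpty) 0.0 else v.sum / v.size

  // Fit the stump that minimises squared error against the residuals.
  // Assumes xs holds at least two distinct values.
  def fitStump(xs: Seq[Double], residuals: Seq[Double]): Stump = {
    val pairs = xs.zip(residuals)
    // Dropping the largest value keeps both sides of every threshold non-empty.
    val candidates = xs.distinct.sorted.dropRight(1)
    val stumps = candidates.map { t =>
      val (l, r) = pairs.partition(_._1 <= t)
      Stump(t, mean(l.map(_._2)), mean(r.map(_._2)))
    }
    stumps.minBy(s => pairs.map { case (x, r) => math.pow(r - s.predict(x), 2) }.sum)
  }

  // The ensemble prediction is the shrunken sum of all stumps.
  def predict(model: Seq[Stump], x: Double, eta: Double): Double =
    model.map(s => eta * s.predict(x)).sum

  // Rounds 1..M: each round fits a stump to what the current ensemble still gets wrong.
  def boost(xs: Seq[Double], ys: Seq[Double], rounds: Int, eta: Double): Seq[Stump] =
    (1 to rounds).foldLeft(Vector.empty[Stump]) { (model, _) =>
      val residuals = xs.zip(ys).map { case (x, y) => y - predict(model, x, eta) }
      model :+ fitStump(xs, residuals)
    }

  def mse(model: Seq[Stump], xs: Seq[Double], ys: Seq[Double], eta: Double): Double =
    mean(xs.zip(ys).map { case (x, y) => math.pow(y - predict(model, x, eta), 2) })
}
```

Each round depends on the residuals left by the previous one, so the rounds themselves cannot run in parallel. That is why the later slides look for speedups inside a single round (split finding) rather than across rounds.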
  16. Motivation
      ⬢ Training XGBoost is time consuming
      (Diagram labels: Training data, XGBoost Model, Testing data, Model Evaluation, A: Training algorithm, B: Model Evaluation, C: Model tuning)
  17. Motivation
      ⬢ What should we do?
      (Diagram labels: Training data, XGBoost Model, Testing data, Model Evaluation, A: Speed up training (1. Parallel and GPU), B: Model Evaluation with Spark ML, C: AUTO Model Tuning with Spark ML)
  18. Motivation
      ⬢ From single machine to parallel computation
        ○ Distributed training
        ○ GPU support
        ○ Works with the big data ecosystem
      ⬢ How to provide an end-to-end solution for data scientists?
        ○ Front end
          ■ Easy and efficient way to run parallel XGBoost computation
          ■ Notebook front end for model visualization
        ○ Back end
          ■ YARN to allocate resources to applications (CPU, memory, GPU)
          ■ Docker support
  19. How Spark enhances XGBoost
      ⬢ Efficient distributed training and Spark ML pipeline
        ○ DataFrame and RDD support for efficient data preprocessing
      ⬢ Ref: http://dmlc.ml/2016/10/26/a-full-integration-of-xgboost-and-spark.html
  20. How Spark enhances XGBoost
      ⬢ XGBoost nodes need Rabit to communicate with each other
        ○ Efficient but not easy to manage
      (Diagram: Rabit connects XGBoost workers 1 to 4; training data partitions 1 to 4; statistic sync: optimal split value)
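The statistic sync in this diagram can be illustrated without Rabit or Spark. The sketch below assumes each worker summarises its own partition as a fixed-width histogram of gradient sums over feature values in [0, 1), and that the sync step is an element-wise sum, which is the effect of an allreduce. The names `AllReduceSketch`, `Stat`, `localHistogram` and `allReduce` are hypothetical and are not Rabit's API.

```scala
object AllReduceSketch {
  // Per-bucket statistics a worker accumulates over its own partition.
  final case class Stat(gradSum: Double, count: Int) {
    def +(o: Stat): Stat = Stat(gradSum + o.gradSum, count + o.count)
  }

  // Rows are (feature value, gradient); buckets are fixed-width bins over [0, 1).
  def localHistogram(rows: Seq[(Double, Double)], buckets: Int): Vector[Stat] =
    rows.foldLeft(Vector.fill(buckets)(Stat(0.0, 0))) { case (hist, (x, g)) =>
      val b = math.min((x * buckets).toInt, buckets - 1)
      hist.updated(b, hist(b) + Stat(g, 1))
    }

  // Stand-in for an allreduce: an element-wise sum of the local histograms.
  // In a real cluster every worker would end up holding this same global result.
  def allReduce(locals: Seq[Vector[Stat]]): Vector[Stat] =
    locals.reduce((a, b) => a.zip(b).map { case (s, t) => s + t })
}
```

Only the small histograms cross the network, not the training rows. Because every worker sees the same merged statistics, all of them pick the same split without a central coordinator.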
  21. XGBoost on Spark ML pipeline
      ⬢ Distributed XGBoost inside the Spark ML pipeline
      ⬢ XGBoost estimator
        ○ Extends the Spark ML estimator
      ⬢ XGBoost model
        ○ Extends the Spark ML pipelineModel
        ○ Works naturally inside a Spark ML Pipeline for model materialization
      ⬢ XGBoost parameter
        ○ Extends the Spark ML parameter
        ○ Enables automatic parameter tuning
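Exposing hyperparameters as ordinary pipeline parameters is what makes automatic tuning possible: a tuner can enumerate combinations and score each one. The sketch below shows that idea as a plain grid search in standard Scala. It is not Spark ML's tuning API. `GridSearchSketch`, `grid` and `search` are hypothetical names, and the `loss` function stands in for training a model and measuring its validation error.

```scala
object GridSearchSketch {
  type Params = Map[String, Double]

  // Cartesian product of the candidate values for every parameter.
  def grid(candidates: Seq[(String, Seq[Double])]): Seq[Params] =
    candidates.foldLeft(Seq(Map.empty[String, Double])) { case (acc, (name, values)) =>
      for (m <- acc; v <- values) yield m + (name -> v)
    }

  // Evaluate every combination and keep the one with the lowest validation loss.
  // Assumes at least one combination exists.
  def search(candidates: Seq[(String, Seq[Double])], loss: Params => Double): (Params, Double) =
    grid(candidates).map(p => (p, loss(p))).minBy(_._2)
}
```

The combinations are independent of each other, so a distributed engine can evaluate them in parallel. That is the sense in which the deck speaks of distributed model tuning.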
  22. XGBoost on Spark ML pipeline
      ⬢ Distributed XGBoost
        ○ Imports (assumed, for the xgboost4j-spark 0.7x API used on this slide)
            import ml.dmlc.xgboost4j.scala.spark.{XGBoost, XGBoostEstimator}
            import org.apache.spark.ml.Pipeline
        ○ Parameters
            val paramMap = List(
              "eta" -> 0.1f,
              "max_depth" -> 2,
              "objective" -> "binary:logistic").toMap
        ○ Training
            // trainRDD and trainDF are prepared elsewhere.
            // The positional arguments 1 and 4 are the number of rounds and the number of workers.
            val xgboostModelRDD = XGBoost.train(trainRDD, paramMap, 1, 4, useExternalMemory = true)
            val xgboostModelDF = XGBoost.trainWithDataFrame(trainDF, paramMap, 1, 4, useExternalMemory = true)
        ○ Prediction
            val xgboostPredictionRDD = xgboostModelRDD.predict(trainRDD.map { x => x.features })
        ○ XGBoost inside ML pipeline
            // assembler and dataset are defined elsewhere in the notebook.
            val xgboostEstimator = new XGBoostEstimator(
              Map[String, Any](
                "num_round" -> 30, "nworkers" -> 10, "objective" -> "reg:linear",
                "eta" -> 0.3, "max_depth" -> 6, "early_stopping_rounds" -> 10))
            val pipeline = new Pipeline().setStages(Array(assembler, xgboostEstimator))
            val pipelineData = dataset.withColumnRenamed("PE", "label")
            val pipelineModel = pipeline.fit(pipelineData)
  23. GPU speedup XGBoost
      ⬢ Where to improve the tree building procedure?
      ⬢ Procedure to build a tree
        ○ for each feature of input data
          ■ for each leaf of current tree
            ● find the best split
          ■ split the leaf node
      (Diagram labels: A, Y, N)
  24. GPU speedup XGBoost
      ⬢ GPU speedup XGBoost in the single machine
        ○ Processing all nodes in the same level concurrently
        ○ Optimizing splitting point selection
        ○ Optimize memory usage for data sparsity
      ⬢ Algorithm to speedup XGBoost via GPU
        ○ Phase 1: Find splits
        ○ Phase 2: Update node positions
        ○ Phase 3: Sort node buckets
        ○ Ref: https://peerj.com/articles/cs-127/
      Instance ID     1    4    3    2
      Feature value   0.1  0.2  0.4  0.5
      Gradient        0.3  0.5  0.3  0.3
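Phase 1 (find splits) operates on instances sorted by feature value, as in the table on this slide. A prefix sum over the gradients gives the left-hand gradient total for every candidate split in one pass, and prefix sums are a primitive that GPUs parallelise well. The sketch below runs that scan sequentially in plain Scala. It assumes squared-error loss (each instance has unit Hessian), at least two instances, and a simplified gain of `G_L^2/(n_L + lambda) + G_R^2/(n_R + lambda) - G^2/(n + lambda)`. `SplitScanSketch` and its members are hypothetical names.

```scala
object SplitScanSketch {
  final case class Split(threshold: Double, gain: Double)

  // values must be sorted ascending; grads(i) is the gradient of the instance holding values(i).
  // Assumes at least two instances.
  def bestSplit(values: Vector[Double], grads: Vector[Double], lambda: Double): Split = {
    val n = values.size
    val total = grads.sum
    // leftSums(i) is the sum of the first i gradients (exclusive prefix sum).
    val leftSums = grads.scanLeft(0.0)(_ + _)
    def score(g: Double, count: Int): Double = g * g / (count + lambda)
    val parent = score(total, n)
    (1 until n).map { i =>
      // Split between position i - 1 and i: the left side holds i instances.
      val gain = score(leftSums(i), i) + score(total - leftSums(i), n - i) - parent
      Split((values(i - 1) + values(i)) / 2, gain)
    }.maxBy(_.gain)
  }
}
```

With the slide's table this would be called as `SplitScanSketch.bestSplit(Vector(0.1, 0.2, 0.4, 0.5), Vector(0.3, 0.5, 0.3, 0.3), 1.0)`. The right-hand total is never scanned separately, since it is the overall total minus the prefix sum.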
  25. GPU speedup XGBoost
      ⬢ XGBoost with GPU achieves a 4.x speedup over the CPU-based version
      ⬢ Ref: https://devblogs.nvidia.com/gradient-boosting-decision-trees-xgboost-cuda/
  26. GPU speedup XGBoost
      ⬢ GPUs are good, but managing a GPU cluster is not easy
        ○ Different driver versions for different GPUs
        ○ Users have to build XGBoost with GPU support themselves
        ○ GPU resources are hard to manage
        ○ GPU resources cannot be shared
      ⬢ An ideal environment includes everything
        ○ Spark is an efficient distributed engine for data processing
        ○ Spark ML pipeline for model tuning
        ○ GPUs speed up XGBoost training
        ○ YARN manages the cluster's resources
        ○ A notebook serves end users
  27. What you can learn from this notebook
      ⬢ Combine Spark and XGBoost
        ○ Train and deploy XGBoost models in a unified data platform
        ○ Automatically tune the XGBoost model with the Spark ML pipeline
        ○ Speed up XGBoost training with distributed computation and GPUs
        ○ Multiple users can share the same cluster with GPU and Spark
      ⬢ Benefits
        ○ End-to-end solution for ML pipelines with XGBoost support
        ○ No need to care about GPU management
        ○ Train XGBoost with Spark ML APIs
        ○ Visualize the prediction results in a notebook
  28. Spark and XGBoost for Fintech
      ⬢ Lending Club data
      ⬢ Spark DataFrame for ETL
      ⬢ Spark SQL for OLAP
      ⬢ Spark ML for automatic model tuning and model serving
      ⬢ Notebook links (use Databricks Community Edition):
        ○ Part 1: (https://bit.ly/2QuLQ9b) https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4999972933037924/27242371102049/8135547933712821/latest.html
        ○ Part 2: (https://bit.ly/2AZJI3Z) https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4999972933037924/27242371102070/8135547933712821/latest.html
      ⬢ Acknowledgment: https://databricks.com/blog/2018/08/09/loan-risk-analysis-with-xgboost-and-databricks-runtime-for-machine-learning.html
  29. Outline
      ● Spark ecosystem
      ● Spark XGBoost
      ● Spark Deep Learning pipeline
  30. Why Deep Learning
      Data explosion
      Computation explosion
      An AI-driven world
  31. Why Deep Learning
      Data explosion
      Computation explosion
      An AI-driven world
  32. What is deep learning
      ⬢ A set of machine learning techniques that can learn useful representations of features directly from images, text and sound
      ⬢ Achievements
        ○ ImageNet
        ○ Google Neural Machine Translation
        ○ AlphaGo/AlphaZero
      ⬢ Benefits from big data and GPUs
  34. A typical Deep Learning workflow
      Load data; select neural network architecture; optimize the parameters
  35. Build your own deep learning model
      Model         Images(#)   Classes(#)
      ImageNet      14M         20K
      Skin cancer   129,450     757
  39. Transfer Learning Pipeline
      (Diagram labels: Load data as DataFrame, Pre-trained CNN model, Softmax classification (trainable parameters))
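In this pipeline only the softmax classifier is trained; the pre-trained CNN acts as a fixed feature extractor. The sketch below shows that trainable part alone in plain Scala. It assumes the CNN has already turned an image into a feature vector `x`, and it uses one stochastic gradient step on cross-entropy loss, where the gradient for class `k` is `(p_k - 1[k == label]) * x`. `SoftmaxHeadSketch` and its members are hypothetical names, not part of any deep learning library.

```scala
object SoftmaxHeadSketch {
  // Numerically stable softmax over a vector of logits.
  def softmax(logits: Vector[Double]): Vector[Double] = {
    val m = logits.max
    val exps = logits.map(z => math.exp(z - m))
    val s = exps.sum
    exps.map(_ / s)
  }

  // weights(k) is the weight vector of class k; x is the feature vector from the frozen CNN.
  def logits(weights: Vector[Vector[Double]], x: Vector[Double]): Vector[Double] =
    weights.map(w => w.zip(x).map { case (a, b) => a * b }.sum)

  // One SGD step on cross-entropy loss for a single labelled example.
  def sgdStep(weights: Vector[Vector[Double]], x: Vector[Double],
              label: Int, lr: Double): Vector[Vector[Double]] = {
    val p = softmax(logits(weights, x))
    weights.zipWithIndex.map { case (w, k) =>
      val err = p(k) - (if (k == label) 1.0 else 0.0)
      w.zip(x).map { case (wi, xi) => wi - lr * err * xi }
    }
  }
}
```

Because the CNN weights never change, the feature vectors can be computed once and reused across training steps, which keeps this stage cheap compared with training a full network.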
  40. Deep Learning in Spark MLlib Pipeline
      ⬢ Spark MLlib pipeline
        ○ Sequence of Transformers and Estimators
        ○ Simple, concise API and ease of use
      ⬢ Integrates with Spark APIs
        ○ Spark is great at scaling out computations
        ○ Image representation and reader in Spark DataFrame/Dataset (new in Spark 2.3)
      ⬢ Spark Deep Learning Pipelines (github.com/databricks/spark-deep-learning)
        ○ Plug in your own TensorFlow Graph or Keras Model as Transformers
        ○ Open source under Apache 2.0 license
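The "sequence of Transformers and Estimators" idea can be shown in a few lines of plain Scala. The sketch below mirrors the concept only and is not Spark's API: a Transformer maps data to data, an Estimator learns from data and returns a Transformer, and fitting a pipeline feeds each stage the output of the stages before it. All names (`PipelineSketch`, `AddColumn`, `MeanCenter`, `fitPipeline`, `run`) are hypothetical, and rows are simplified to `Map[String, Double]`.

```scala
object PipelineSketch {
  type Row = Map[String, Double]
  type Data = Seq[Row]

  trait Transformer { def transform(data: Data): Data }
  trait Estimator { def fit(data: Data): Transformer }

  // Stateless stage: adds a derived column to every row.
  final class AddColumn(name: String, f: Row => Double) extends Transformer {
    def transform(data: Data): Data = data.map(r => r + (name -> f(r)))
  }

  // Learning stage: computes the column mean, then returns a transformer that centers the column.
  final class MeanCenter(col: String) extends Estimator {
    def fit(data: Data): Transformer = {
      val mean = data.map(_(col)).sum / data.size
      new AddColumn(col + "_centered", r => r(col) - mean)
    }
  }

  // Fit stages in order, feeding each stage the output of the previous ones.
  def fitPipeline(stages: Seq[Either[Transformer, Estimator]], data: Data): Seq[Transformer] = {
    val (fitted, _) = stages.foldLeft((Vector.empty[Transformer], data)) {
      case ((acc, current), stage) =>
        val t = stage match {
          case Left(transformer) => transformer
          case Right(estimator)  => estimator.fit(current)
        }
        (acc :+ t, t.transform(current))
    }
    fitted
  }

  // Apply the fitted stages to new data, in order.
  def run(fitted: Seq[Transformer], data: Data): Data =
    fitted.foldLeft(data)((d, t) => t.transform(d))
}
```

Under this shape, a deep learning model wrapped as a Transformer is just another stage, which is the integration point the slide describes for TensorFlow graphs and Keras models.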
  41. Auto ML in Spark ML pipeline
      ⬢ Spark to prepare the data
        ○ Spark Streaming
        ○ Spark SQL
      ⬢ Spark for model parameter tuning
        ○ Hyperparameters
        ○ Save memory usage
      ⬢ TensorFlow automatic network structure tuning
        ○ Reinforcement learning
        ○ Transfer learning
      ⬢ Deploy the model as a service
  42. Case study
      ⬢ Car damage estimation
      ⬢ Intelligence agent
      ⬢ X-Ray Image analysis
      ⬢ Anti-Terrorism
  43. What you can learn from this section
      ⬢ How to combine deep learning and Spark
      ⬢ Use DL as an operator in the Spark ML pipeline
      ⬢ Transfer learning with a DL model
      ⬢ DL model parameter tuning
      ⬢ Apply a DL model in Spark SQL
      ⬢ Notebook: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4999972933037924/4324977500035919/8135547933712821/latest.html
      ⬢ Acknowledgment: https://docs.databricks.com/applications/deep-learning/deep-learning-pipelines.html
  45. Resources: https://drive.google.com/drive/folders/1wGKNGq7w75YKYazMZ7ytgaAtfTCgvsED?usp=sharing
