Machine Learning / Deep Learning Compendium (5): Tools

#azurejp
https://www.facebook.com/dahatake/
https://twitter.com/dahatake/
https://github.com/dahatake/
https://daiyuhatakeyama.wordpress.com/
https://www.slideshare.net/dahatake/

It keeps getting harder to see the big picture.
Which algorithm should I use, and how?
Integrating a trained model into a system takes a lot of work...

Azure Machine Learning (Azure ML)

Existing packages (zip and upload)
Execute R Script module
Write an R script

Existing packages (zip and upload)
Execute Python Script module
Write a Python script
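
As a rough illustration of what goes into the Python module: Azure ML Studio's Execute Python Script module calls an entry-point function named `azureml_main` and expects a sequence of pandas DataFrames back. The transformation below (adding a `row_sum` column) is an assumption chosen only to show the shape of such a script.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    """Entry point that the Execute Python Script module looks for."""
    # dataframe1 and dataframe2 arrive from the module's two optional input ports.
    result = dataframe1.copy()
    # Illustrative transformation (an assumption): sum the numeric columns per row.
    result["row_sum"] = result.select_dtypes(include="number").sum(axis=1)
    # The module expects a sequence of DataFrames, so return a one-element tuple.
    return (result,)
```

Outside the module, the function can be exercised by passing any `pandas.DataFrame` as `dataframe1`.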

The screenshot shows an example that loads and uses the RHmm package. Its dependencies, MASS and nlme, are included as well.

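Data preparation → Model building and training → Deployment → Inference
This part is published as papers by researchers around the world and, as a rule, released on GitHub. Make use of it.
Which data should you prepare? A company's strategy for using its internal data is the source of its competitiveness.
Identify the areas where there is business impact, and apply machine learning to them.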

Azure Machine Learning
Experimentation Service
Model Management Service
Inference

Drag-and-drop GUI / Code first

Spark
SQL Server
GPU instances
Container services
Azure Machine Learning Workbench / AI Tools for VS
SQL Server
Machine Learning Server
On-premises
Edge computing: Azure IoT Edge
Experimentation and model management
Azure Machine Learning Service
Training and deployment
Azure

Manage project dependencies
Run training jobs locally, scaled up, or scaled out
Git-based checkpoints and version control
Service-side capture of run metrics, output logs, and models
Use your favorite IDE and any framework
USE THE MOST POPULAR INNOVATIONS
USE ANY TOOL
USE ANY FRAMEWORK OR LIBRARY

Training options in Azure Machine Learning
AMLWB
1
2 Train model
Training and testing (model management):
• Vienna Python (3.5.2) environment on local machine
• A conda Python environment inside of a Docker container on local/remote VM
• A Spark cluster, such as Spark on Azure HDInsight
ML model
pickle.dump(myModel, open('./model.pkl', 'wb'))
Local or remote VM
Apache Spark for Azure HDI
Azure Batch AI Training (BAIT)
- managed Azure Batch
- parallel and distributed computing
- clusters of GPU compute nodes
Data from AOI machines
Deployment
3

Training options in Azure Machine Learning
AMLWB
1
2 Train model
1. Pipeline development (CPU VM)
2. Training (GPU VM)
3. Meta-optimization (cluster of GPU VMs)
ML model
pickle.dump(myModel, open('./model.pkl', 'wb'))
Local or remote VM
Apache Spark for Azure HDI
Azure Batch AI Training (BAIT)
Data from AOI machines
Deployment
3
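
The `pickle.dump` call on these slides is the hand-off between training and deployment. A minimal sketch of that round trip follows, assuming a plain dictionary as a stand-in for the trained model and a temporary directory instead of `./model.pkl`; the `with` blocks make sure the file handle is closed, which the one-line form on the slide leaves to the interpreter.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model (an assumption): any picklable object works.
my_model = {"weights": [0.5, -1.2], "bias": 0.1}

model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Serialize the model; the with-block closes the file handle deterministically.
with open(model_path, "wb") as f:
    pickle.dump(my_model, f)

# Reload it, as a deployment step would.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```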

When should you use which?
Which engine should you use?
Deployment target
Which experience do you prefer?
Build your own, or use a pretrained model?
Microsoft ML and AI products
• Build your own: Azure Machine Learning
  - Code first, on-premises: ML Server (on-premises Hadoop, SQL Server)
  - Code first, cloud: AML (preview) (SQL Server, Spark, Hadoop, Azure Batch, DSVM, Azure Container Service)
  - Visual tools, cloud: AML Studio
• Use pretrained: Cognitive Services, bots

Azure Blob Storage
Azure Machine Learning Model Management Service
GPU Data Science Virtual Machine
Machine learning model
Java ETL
Azure Container Registry
Predictive web application
Redefining image classification with learning algorithms such as transfer learning, convolutional neural networks (CNN), and gradient boosted decision trees (GBDT)
Azure Cluster Service

Azure Container Service
Azure Machine Learning
Azure GPU Data Science Virtual Machine
Web app (Jupyter Notebook)
Workbench
Experimentation Service
Microsoft SQL Server
Operationalization cluster
Deep learning and natural language processing improve search effectiveness and tagging accuracy
SQL

• AppSource
Cortana Intelligence Gallery
• Deploy in minutes
• Visual Studio Code integration
• GitHub Samples
• Change settings and configuration
• Partner Solution

Inference

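Sample, understand, and prepare data faster
Intelligent data preparation by example, using technologies such as the PROSE SDK
Extend and customize transformations and featurization with Python
Generate Python and PySpark for execution at scale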

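Ingest
Understand the value of the data
Prepare for use
Read diverse data
Sample
Normalize
Cleanse
Enterprise data pipeline
• Schedule
• Deploy
• Scale Up/Out
• Secure
• Monitor
• Diagnose
With Azure Machine Learning
Profile
Infer types and entities
Clean up
Transform, extract, join
Enrich
Reshape
Aggregate
Featurize
Compare
Validate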

Monday-Friday: 7:00 am-6:00pm, Saturday: 9:00 am-5:00 pm, Sunday: Closed
Timings_1 | Timings_2 | Timings_3 | Timings_4 | Timings_5 | Timings_6 | Timings_7 | Timings_8 | Timings_9
Monday | Friday | 7:00 am | 6:00 pm | Saturday | 9:00 am | 5:00 pm | Sunday | Closed
192.128.138.20 - [16 Oct 2016 16:22:33 -0200] "GET /images/picture.gif HTTP/1.1" 234 343 www.yahoo.com "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 4)" "-"
logtext_1 | logtext_2 | logtext_3 | logtext_4 | logtext_5 | logtext_6 | logtext_7 | logtext_8
192.128.138.20 | 16 Oct 2016 | 16:22:33 | -0200 | GET | images/picture.gif | HTTP | 1.1
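
For comparison, here is roughly what the opening-hours split looks like when written by hand. This is a minimal sketch using regular expressions, not what the by-example feature generates; the function name `split_timings` and both patterns are assumptions for illustration.

```python
import re

# A time such as "7:00 am" or "6:00pm", or a plain word such as "Monday".
TOKEN = re.compile(r"\d{1,2}:\d{2}\s*(?:am|pm)|[A-Za-z]+", re.IGNORECASE)
TIME = re.compile(r"(\d{1,2}:\d{2})\s*(am|pm)", re.IGNORECASE)


def split_timings(text):
    """Split an opening-hours string into numbered columns."""
    columns = {}
    for i, token in enumerate(TOKEN.findall(text), start=1):
        match = TIME.fullmatch(token)
        if match:
            # Normalize "6:00pm" and "6:00 PM" to "6:00 pm".
            token = f"{match.group(1)} {match.group(2).lower()}"
        columns[f"Timings_{i}"] = token
    return columns


row = split_timings(
    "Monday-Friday: 7:00 am-6:00pm, Saturday: 9:00 am-5:00 pm, Sunday: Closed"
)
```

Every new input format needs another hand-written pattern like these, which is the effort that preparation by example is meant to remove.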

• Column statistics: numeric
• Histogram
• Value counts
• Box plot
• Scatter plot
• Time series
• Map

VSTS account / AML experimentation account
VSTS project / Work
Git repository ×3
AML project ×3
Users need access permissions in both places

Model Training Execution Environments
Remote Training
• Spark in HDInsight (HDInsight)
• Docker container in Linux VM (Azure VM)
Local Training
• Local Python 3.5.2 Environment
• Local Docker container
Azure ML
• Experiment Service
• Model Management Service
Development Environment
• Workbench
• Jupyter Notebooks
• Visual Studio Code

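Manage jobs for experimentation locally and in the cloud
Support for Spark + Python + R (roadmap)
Run jobs locally, on a remote VM (scale up), on a Spark cluster (scale out), or on SQL on-premises
Track experiments with Git across code, configuration, parameters, and data
Search and compare runs using detailed history metadata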

#
# Pattern to invoke Azure ML Logger to record metrics.
#
# Import the Azure ML Logger library
from azureml.logging import get_azureml_logger
# Create a new instance of the logger
run_logger = get_azureml_logger()
# Log a value (associated with a given experiment and project); `value` comes from your script
run_logger.log("key", value)
# Log an array of values (associated with a given run): every 100th element of `testlabel`
run_logger.log("Actual",
               [testlabel[i] for i in range(len(testlabel))[0::100]])

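Visual Studio Code extension (support for more IDEs and notebooks is planned)
Start building in the IDE of your choice, with no additional tools required
Integrated, feature-rich authoring for machine learning and deep learning
Call Azure Machine Learning services directly from your IDE or notebook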

Training pipeline steps:
1. Provision VMs
use portal or CLI
(https://github.com/Azure/DataScienceVM/tree/master/Scripts/CreateDSVM/Ubuntu)

1. Provision VMs
2. Provision and attach data disk
Attach a disk to the DSVM:
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/add-disk?toc=%2fazure%2fvirtual-machines%2flinux%2ftoc.json
az vm disk attach -g someRSG --vm-name someVM --disk aDisk --new --size-gb 200
Or, if you do not need the data after you destroy the VM (the preferred scenario when working with external clients' data), simply resize the DSVM disks in the portal while the DSVM is turned off.

1. Provision VMs
3. Get Data
azcopy --source {dataBlob} --destination {DataBaseInputDir} --source-key {sourceKey} --recursive

1. Provision VMs
3. Get Data
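Data is accessible both inside the Docker container and on the host:
import os
try:
    # Shared directory exposed through this environment variable
    amlWBSharedDir = os.environ['AZUREML_NATIVE_SHARE_DIRECTORY']
except KeyError:
    # Fall back to a fixed host path when the variable is not set
    amlWBSharedDir = '/datadrive01/somesharedDir/expAccnt/amlws/projectName'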

1. Provision VMs
3. Get Data
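4. Featurize images using a pretrained Deep Learning model
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing import image

# Assumption: drop the classifier head and average-pool, giving 2048 features per image
model = ResNet50(weights='imagenet', include_top=False, pooling='avg')
img = image.load_img(fname, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add the batch dimension
x = preprocess_input(x)
features = model.predict_on_batch(x)
features.shape
(1, 2048)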

1. Provision VMs
3. Get Data
4. Featurize images using a pretrained Deep Learning model
5. Train lightGBM model
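import lightgbm as lgb
from os import path

# X_train, y_train, X_test, y_test, and params are prepared earlier (not shown)
lgb_train = lgb.Dataset(X_train, y_train, free_raw_data=False)
lgb_test = lgb.Dataset(X_test, y_test, reference=lgb_train, free_raw_data=False)
clf = lgb.train(params, lgb_train, num_boost_round=500)
y_pred_proba = clf.predict(X_test)
# binarize_prediction is a helper that is not shown on the slide
y_pred = binarize_prediction(y_pred_proba)
clf.save_model(path.join(JABIL_ROOT, MODEL_NAME))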
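
The slide does not show `binarize_prediction`. A hypothetical stand-in, assuming it simply thresholds the predicted probabilities (the 0.5 default is an assumption, not a value from the deck), could look like this:

```python
def binarize_prediction(probabilities, threshold=0.5):
    """Map predicted probabilities to 0/1 class labels."""
    # Hypothetical stand-in: the original helper and its threshold are not shown.
    return [1 if p >= threshold else 0 for p in probabilities]


labels = binarize_prediction([0.1, 0.5, 0.93])
```

In practice the threshold would be tuned on validation data, since 0.5 is rarely the best cut-off for imbalanced classes.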

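Deploy and manage models as HTTP services
Container-based hosting for real-time and batch processing
Management and monitoring through Azure (for example, AppInsights)
First-class support for SparkML, Python, CNTK, TF, TLC, and R, extensible to support others (Caffe, MXNet)
Service authoring in Python and .NET Core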
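
A scoring script for such an HTTP service is commonly split into an `init()` that loads the model once and a `run()` that handles each request. The sketch below follows that shape under stated assumptions: the JSON payload layout, the linear stand-in model, and the temporary model path are all illustrative and not taken from the deck.

```python
import json
import os
import pickle
import tempfile

MODEL_PATH = os.path.join(tempfile.mkdtemp(), "model.pkl")
model = None


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    with open(MODEL_PATH, "rb") as f:
        model = pickle.load(f)


def run(raw_json):
    """Called per request: score one JSON payload and return JSON."""
    features = json.loads(raw_json)["features"]
    score = sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]
    return json.dumps({"score": score, "label": int(score > 0)})


# Stand-in linear model (an assumption), pickled the way a training step would.
with open(MODEL_PATH, "wb") as f:
    pickle.dump({"weights": [2.0, -1.0], "bias": 0.5}, f)

init()
response = run(json.dumps({"features": [1.0, 3.0]}))
```

Keeping the load in `init()` means the model file is read once per container start rather than once per request.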

Environment and technology stack
Azure services
• ACS w/ Kubernetes: scale out/in as needed
• ACR: access registered containers
• App Insights: get request logs
• Storage: data
OS support
• Windows
• Linux
• Mac
ML frameworks
• Spark ML
• Python
• CNTK
• R

Kubernetes
Cluster Manager
Model Instances Model Instances

Easily build and deploy models on Microsoft AML platform
What are Azure ML Packages?

Accurate
Packages support state-of-the-art algorithms and include optimized parameters to provide highly accurate models out of the box.
Flexible
Powerful and composable Python API. Optimize the best algorithm for your application, with control over every aspect available to the power user.
Scalable
Train models at petabyte scale with multiple GPUs.
Time to Solution
Automates many tedious tasks (data alignment, cleaning, and preparation), and simplifies model experimentation and deployment to production.
Value Proposition
Best-in-class packages in quality, flexibility, scale, and time to solution.

Sample Code (less than 20 lines)

Easily build and deploy highly accurate computer vision models on the Microsoft AI Platform
Azure Machine Learning Package for Computer Vision

Computer Vision Scenario Examples

Image Classification
Prepare Dataset
Model Definition & Training
Evaluate the model
The evaluation module provides functionality to evaluate the performance of the trained model. Evaluation metrics include accuracy, precision and recall, and the confusion matrix.
Score the model
Deploy the model
Deploy the model as an Azure web service, as a Docker container, or on IoT Edge.

Object Detection
Train the model
Score an image

Computer Vision Simplified in Python
High Accuracy out of the box
Includes optimized parameters and pretrained weights to provide highly accurate models out of the box.
Time To Solution
Get from a data set to a highly accurate deployed model quickly.

AML Package for Text Analytics

Easily build and deploy text models on Microsoft AI platform
Azure ML Package For Text Analytics

Pipelines
• TLC Text Classification
• Sklearn Text Classification
• DNN Text Classification
• CRF Entity Extraction
• DNN Entity Extraction
• Active Learning
• Incremental Learning
Preprocessors
• Tokenization
• PDF reader
• Image reader
• Stemming
• Lemmatization
• Sentence Detection
• Dictionary Lookup
• Regular Expression
• Phrase Detection
Learners
• TLC
• Sklearn
• CRF
• Word2Vec
• FastText
• Pre-defined CNN
• User-defined CNN
• Pre-defined RNN
• User-defined RNN
Vectorizers
• Word N-grams
• Char N-grams
• Word2Vec
• FastText
• Dictionary Lookup
• Pre-trained models
• User-defined functions
• Embedding Cluster Id
Pipelines and Transformers









Text Analytics Use Cases

http://medicalentitydetector.azurewebsites.net/
Biomedical Demo

Period | % Improvement in Accuracy of ML Forecast vs Human Forecast
Year 1 Q2 | 86%
Year 1 Q3 | 3%
Year 1 Q4 | 33%
Year 2 Q1 | 96%
Year 2 Q2 | 40%
Year 2 Q3 | 20%
Average | 44%

Easily build and deploy highly accurate forecasting models on the Microsoft AI Platform
Azure Machine Learning Package for Forecasting

TS Data Prep
- File/pandas
- AML Data Prep
- SQL Server
- TSImputer
- TS Merge (joins)
- UDF in pipeline
Featurization
- LagOperator
- TimeFeaturizer
- RollingWindow
- Forecasts as Features
Modeling
- Seasonal Naive
- ARIMA
- ETS
- RecursiveForecaster
- RegressionForecaster
- SklearnEstimator
Time Series AutoML
- RollingOriginValidator
- TSGridSearch (models, params)
Model Evaluation
- ForecastDataFrame.calc_error(type='MAPE')
Model Deployment
- SetupWizard
- WebServiceFactory.deploy()
Consumption
- ForecastWebService.score(method='parallel')
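
To make two of the names in this list concrete, here is a minimal plain-Python sketch of a seasonal naive forecast and of MAPE. It does not use the forecasting package; the function names `seasonal_naive` and `mape` and the sample numbers are assumptions for illustration.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast by repeating the last observed season."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


history = [10, 20, 30, 40, 12, 22, 32, 42]  # two seasons of length 4
forecast = seasonal_naive(history, season_length=4, horizon=4)
error = mape([12, 20, 32, 40], forecast)
```

Seasonal naive is a useful baseline: a more elaborate model is only worth deploying if its error is clearly lower.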
Forecasting Simplified in Python

Forecasting Simplified in Python
RAPID TIME TO SOLUTION
~3-10X fewer lines of code than scikit-learn
Simple forecaster in 5 lines of code
Deployment in 1 line of code
FLEXIBLE
Time series specific transforms compatible with scikit-learn
Incorporate transforms and models from scikit-learn
Build your own transforms and run your own models
ENTERPRISE SCALE
Forecast experimentation and model deployment at scale
Scale complex time series cross-validation and hyperparameter sweeps
Run forecasters as web services
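
Time series cross-validation differs from ordinary k-fold in that the training window must always precede the test window. A minimal sketch of rolling-origin splits with an expanding training window follows; `rolling_origin_splits` and its parameters are hypothetical names, not the package's API.

```python
def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_indices) with an expanding training window."""
    origin = initial_train
    while origin + horizon <= n_obs:
        # Train on everything before the origin, test on the next `horizon` points.
        yield list(range(origin)), list(range(origin, origin + horizon))
        origin += step


splits = list(rolling_origin_splits(n_obs=8, initial_train=4, horizon=2, step=2))
```

Each split can be evaluated independently, which is what makes this kind of validation straightforward to spread across a cluster.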

https://aka.ms/azureml-docs
https://aka.ms/azureml-wb-msi
https://aka.ms/azureml-wb-dmg
https://aka.ms/mmlspark
https://aka.ms/vscodetool-ai

Understand trends
Build a learning model
Deploy
Operate

Measure data
Collect data
Organize data
Understand trends

Edge SDK V1 Machine
Broker
BLE IoT Hub

• Understand cross-correlations in discrete data
• Discover previously hidden trends

Edge SDK V2 Machine
Edge Runtime
Edge SDK V1 Machine
Broker
Modbus Bridge (made by KES)
IoT Hub
FA Equips ×3
MQTT
Certificate-based connection
ASA on Edge ×2
Local Alert

Pretrained library
Customization for Edge
Docker container
https://github.com/liupeirong/Azure
→DarknetYoloIoTEdge
https://docs.microsoft.com/ja-jp/azure/iot-edge/tutorial-deploy-machine-learning

Video
Camera
Camera AI Display
Azure IoT Edge device
I/O HTTP
IoT Hub
Azure IoT Edge Runtime
messages

Cloud: Azure | High-end Edge | Lightweight Edge
Overview
Cloud: An Azure host that spans from CPU to GPU and FPGA VMs
High-end Edge: A server with slots to insert CPUs, GPUs, and FPGAs, or an x64 or ARM system that needs to be plugged in to work
Lightweight Edge: A sensor with a SoC (ARM CPU, DSPs) and memory that can operate on batteries
CPU
CPU, GPU, or Arria 10 FPGA
Arria 10 FPGA
NVIDIA GPU | x64 CPU | ARM CPU
HW accelerated DSP, CPU, GPU
Model package
Native to Windows and container elsewhere
Windows Native
- Linux container
- Windows ML
- Linux container
- Windows ML
- Linux container
- (Ideally) container
- Android Native
- iOS Native
- RT OS

Trained model
AI library for the training environment
Trained model (tuned)
AI library for the target hardware
Edge Runtime
Input/output logic

IoT Hub
Elasticsearch
Operator
Kibana
Functions
IoT Edge / Edge Agent
Plat'Home OpenBlocks IoT VX2
Parse Module (C# Custom Module)
Filter Module (ASA Module)
Stream Analytics (Edge)
PD Handler (parse process)
PD Repeater (transfer process)
Container Registry: deploy via IoT Hub
BLE
json (containing base64)
json (containing base64)
Fujitsu Component: acceleration sensor
Omron: environment sensor
IoT sensors and devices
Action
Patlite: network monitoring signal light
Example of an edge computing IoT system built with Azure IoT Edge and OpenBlocks IoT VX2
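
The `json (containing base64)` messages in this configuration are what the parse step has to unpack. The module on the slide is a C# custom module; the Python sketch below only illustrates the decoding idea, and the field names `deviceId` and `payload` are hypothetical.

```python
import base64
import json


def parse_message(raw):
    """Decode a JSON message whose payload field is base64-encoded."""
    message = json.loads(raw)
    # "deviceId" and "payload" are hypothetical field names.
    payload = base64.b64decode(message["payload"])
    return {"deviceId": message["deviceId"], "bytes": payload}


sample = json.dumps({
    "deviceId": "sensor-01",
    "payload": base64.b64encode(bytes([0x01, 0x02, 0xFF])).decode("ascii"),
})
parsed = parse_message(sample)
```

The decoded bytes would then be interpreted according to each sensor's own binary format before filtering.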

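CLOUDIAN AI Box Outdoor Model
PoE connection, up to 30 W power supply
Equipped with NVIDIA Jetson TX2
Rugged, dustproof and waterproof, with lightning protection
Also supports sensor data streams
• Real-time image recognition on the edge
• Recognition results sent to the cloud
• Remote deployment and update of AI logic
NVIDIA JetPack
Microsoft Azure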
