Speed-Layer Analytics Architecture on Azure

Tsuyoshi Matsuzaki (松崎 剛), Microsoft Japan Co., Ltd.

Published in: Technology

  1. Speed-Layer Analytics Architecture on Azure
  2. Streaming Use Case
       • Real-time Pricing Optimization / CONSUMER ENGAGEMENT: Demand-Elasticity; Personal Pricing Schemes; Promotion events; Multi-channel engagement
       • Risk and Fraud, Threat Detection / RISK AND REVENUE MANAGEMENT: Real-time anomaly detection; Card Monitoring and Fraud Detection; Risk Aggregation
       • Industrial IoT / GRID OPS, ASSET OPTIMIZATION: Preventive Maintenance; Smart Grids and Microgrids; Asset performance as a Service; UAV image analysis
       • Security Intelligence / ACTIONABLE THREAT INTELLIGENCE: Real-time firewall, network, and auth log correlation; Anomaly detection; Security context, enrichment; Security Orchestration
       • IoT DEVICE ANALYTICS / SENSOR DATA: Aggregation of streaming events; Predictive Maintenance; Anomaly Detection
       • Next Best and Personalized Offers / RECOMMENDATION ENGINE: Right product, promotion, at right time; Real time Ad bidding platform; Personalized Ad Targeting
       • Sentiment Analysis / CONSUMER ENGAGEMENT ANALYSIS: Demand-Elasticity; Social Network Analysis; Promotion events; Multi-channel Attribution
  3. How people stream today? Event Hubs
  4. Apache Kafka
  5. [Diagram: Kafka partition replication] partition1 through partition8 placed across broker1 to broker4; each partition appears three times, as a primary plus secondary (replicated) copies.
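
The placement in the diagram above can be sketched in a few lines of plain Scala. This is a simplified round-robin illustration, not Kafka's actual assignment algorithm, and the names `ReplicaAssignmentSketch` and `assign` are hypothetical. It assumes 8 partitions, 4 brokers, and 3 copies per partition, which is what the diagram appears to show.

```scala
object ReplicaAssignmentSketch {
  /** Simplified round-robin placement (an assumption, not Kafka's real algorithm):
    * the primary of partition p goes to broker (p mod brokers), and each further
    * copy goes to the next broker in the ring. Indices are 0-based. */
  def assign(partitions: Int, brokers: Int, replicationFactor: Int): Map[Int, Seq[Int]] = {
    require(partitions > 0 && brokers > 0, "need at least one partition and one broker")
    require(replicationFactor >= 1 && replicationFactor <= brokers,
      "replication factor must be between 1 and the number of brokers")
    (0 until partitions).map { p =>
      // First entry is the primary; the remaining entries are the secondaries.
      p -> (0 until replicationFactor).map(r => (p + r) % brokers)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Print with 1-based names to match the labels in the diagram.
    assign(8, 4, 3).toSeq.sortBy(_._1).foreach { case (p, replicas) =>
      val secondaries = replicas.tail.map(b => s"broker${b + 1}").mkString(", ")
      println(s"partition${p + 1}: primary=broker${replicas.head + 1}, secondaries=$secondaries")
    }
  }
}
```

Because every copy of a partition lands on a different broker, losing one broker still leaves two copies of each partition it hosted.
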
  6. Kafka on Azure: IaaS or PaaS? [Diagram: split between Customer Responsibility and Microsoft Responsibility across On-Premises, IaaS, PaaS, and SaaS]
  7. [Diagram: Virtual Network (VNet); Kafka Cluster; Application (Producers, Consumers, Connectors, etc.)]
  8. Connectivity & Eco-system: a variety of Source Connectors, Sink Connectors, MirrorMaker, Flink, NiFi, ... See Connectors: https://www.confluent.io/hub/
  9. Connectivity & Eco-system: Azure HDInsight [Diagram: one spout and three bolts] …
  10. Connectivity & Eco-system: Azure Databricks (discussed later)
  11. How people stream today? Event Hubs
  12. Apache Kafka
  13. Apache Kafka and Azure Event Hub [Diagram labels: HTTP, AMQP, Kafka]
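
Since the slide lists Kafka as one of the protocols Event Hub accepts, an existing Kafka client mainly needs different connection settings. The following is a minimal sketch of those settings, built with `java.util.Properties` only so it has no Kafka dependency. The object and method names are hypothetical, and the namespace and connection string are placeholders to be supplied by the reader.

```scala
import java.util.Properties

object EventHubsKafkaConfigSketch {
  /** Builds client properties for the Kafka endpoint of an Event Hubs namespace.
    * `namespace` and `connectionString` are caller-supplied placeholders. */
  def clientProperties(namespace: String, connectionString: String): Properties = {
    val props = new Properties()
    // The Kafka endpoint is exposed on port 9093 of the namespace host.
    props.put("bootstrap.servers", s"$namespace.servicebus.windows.net:9093")
    // Connections use SASL over TLS with the PLAIN mechanism.
    props.put("security.protocol", "SASL_SSL")
    props.put("sasl.mechanism", "PLAIN")
    // The user name is the literal text $ConnectionString;
    // the password is the Event Hubs connection string itself.
    props.put("sasl.jaas.config",
      "org.apache.kafka.common.security.plain.PlainLoginModule required " +
        "username=\"$ConnectionString\" password=\"" + connectionString + "\";")
    props
  }
}
```

These properties would be passed to a Kafka producer or consumer in place of the settings used for a self-hosted cluster; serializer settings and the topic name (the event hub name) are left to the application.
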
  14. Advantages of Event Hub (Compared with Kafka)
       • Fully Managed
       • One Click to Scale / Auto-Inflate
       • Geo Disaster Recovery / Zone Redundancy
       • Dedicated Capacity (100,000 broker connections)
       • Connectivity with surrounding services on Azure: Azure Function, Azure Event Grid, Azure Stream Analytics, Azure Databricks, etc.
  15. Features not supported by Event Hub
       • Idempotent producer
       • Transaction
       • Compression
       • Size-based retention
       • Log compaction
       • Adding partitions to an existing topic
       • HTTP Kafka API support
       • Kafka Streams
      See https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview
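
Before pointing an existing Kafka producer at Event Hub, it can help to check its settings against the list above. The sketch below covers only the three items that map directly onto standard Kafka producer settings (`enable.idempotence`, `transactional.id`, `compression.type`). The object and method names are hypothetical, and the list reflects the slide, so it may not match the service's current behavior.

```scala
object EventHubsCompatCheckSketch {
  /** Returns the features from the slide's unsupported list that the given
    * producer settings would rely on. Only three of the listed features are
    * checked; the rest are not expressed as producer settings. */
  def unsupportedFeatures(producerConfig: Map[String, String]): Seq[String] = {
    val checks: Seq[(String, Boolean)] = Seq(
      // Idempotent producer is switched on explicitly with enable.idempotence=true.
      "Idempotent producer" ->
        producerConfig.get("enable.idempotence").exists(_.equalsIgnoreCase("true")),
      // Any transactional.id means the producer intends to use transactions.
      "Transaction" -> producerConfig.contains("transactional.id"),
      // compression.type=none means no compression; anything else enables it.
      "Compression" ->
        producerConfig.get("compression.type").exists(v => !v.equalsIgnoreCase("none"))
    )
    checks.collect { case (feature, true) => feature }
  }
}
```

For example, a configuration with `enable.idempotence=true` and `compression.type=gzip` would be reported as relying on "Idempotent producer" and "Compression".
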
  16. Azure Stream Analytics
      You can output into:
       • Data Lake Store / Blob Storage
       • SQL Database
       • Event Hub
       • Power BI
       • Table storage
       • Service Bus (Queues, Topics)
       • Cosmos DB
       • Functions
      [Diagram: device types] Point of Service Devices, Self Checkout Stations, Kiosks, Smart Phones, Slates/Tablets, PCs/Laptops, Servers, Digital Signs, Diagnostic Equipment, Remote Medical Monitors, Logic Controllers, Specialized Devices, Thin Clients, Handhelds, Security, POS Terminals, Automation Devices, Vending Machines, Kinect, ATM
  17. Pricing Reference (VM / HDInsight)
      Category         Size    vCPU  RAM      Regular Virtual Machine   HDInsight (Kafka)
      General Purpose  D1 v2   1     3.5 GiB  11.43/h (8,339/m)         13.28/h (9,688/m)
      General Purpose  D2 v2   2     7 GiB    22.96/h (16,760/m)        26.66/h (19,458/m)
      General Purpose  D3 v2   4     14 GiB   45.81/h (33,439/m)        53.20/h (38,836/m)
      General Purpose  D4 v2   8     28 GiB   91.62/h (66,879/m)        106.40/h (77,672/m)
      General Purpose  D5 v2   16    56 GiB   183.24/h (133,759/m)      212.80/h (155,344/m)
      High Memory      D11 v2  2     14 GiB   25.65/h (18,723/m)        29.90/h (21,821/m)
      High Memory      D12 v2  4     28 GiB   51.41/h (37,527/m)        59.67/h (43,553/m)
      High Memory      D13 v2  8     56 GiB   102.82/h (75,055/m)       119.33/h (87,107/m)
      High Memory      D14 v2  16    112 GiB  205.52/h (150,029/m)      238.54/h (174,132/m)
      Note: /h = per hour, /m = per month. Compared as of March 2019 for a single node in the Japan East region, Pay-As-You-Go (no discounts).
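
The hourly figures in the table above make it easy to quantify what the managed HDInsight (Kafka) node costs relative to a plain virtual machine of the same size. The sketch below copies the hourly prices from the slide and computes the relative premium per size. The names `PricingPremiumSketch`, `Price`, and `premiums` are hypothetical.

```scala
object PricingPremiumSketch {
  /** One table row: hourly price of a regular VM and of an HDInsight (Kafka) node. */
  final case class Price(size: String, vmPerHour: Double, hdiPerHour: Double)

  // Hourly prices copied from the slide (March 2019, Japan East, Pay-As-You-Go).
  val prices: Seq[Price] = Seq(
    Price("D1 v2", 11.43, 13.28),
    Price("D2 v2", 22.96, 26.66),
    Price("D3 v2", 45.81, 53.20),
    Price("D4 v2", 91.62, 106.40),
    Price("D5 v2", 183.24, 212.80),
    Price("D11 v2", 25.65, 29.90),
    Price("D12 v2", 51.41, 59.67),
    Price("D13 v2", 102.82, 119.33),
    Price("D14 v2", 205.52, 238.54)
  )

  /** Relative premium of HDInsight over a regular VM, e.g. 0.16 means 16% more. */
  def premium(p: Price): Double = p.hdiPerHour / p.vmPerHour - 1.0

  /** Premium per size, in table order. */
  val premiums: Seq[(String, Double)] = prices.map(p => p.size -> premium(p))

  def main(args: Array[String]): Unit =
    premiums.foreach { case (size, pr) => println(f"$size%-7s ${pr * 100}%.1f%%") }
}
```

For every size in the table the premium falls between 16% and 17%, so the HDInsight surcharge scales with the node size rather than being a flat fee.
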
  18. Pricing Reference (Event Hub)
  19. Pricing Reference (Event Hub): ¥559,974.24/m
  20. Azure Databricks
      [Diagram: ANALYTICS LAYER options arranged from CONTROL to EASE OF USE, with Reduced Administration along the way]
       • IaaS Clusters: Azure Virtual Machine (VMSS, VNet, etc). Install-based, fully customized infrastructure
       • Managed Clusters: Azure HDInsight. Workload optimized, managed clusters
       • Azure Databricks: Frictionless & Optimized Spark clusters
      STORAGE LAYER: Azure Data Lake Store, Azure Storage
  21. Structured Streaming
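
A common operation in stream processing engines such as Structured Streaming is counting events per fixed time window. The sketch below illustrates only that windowing idea using plain Scala collections on a finite list of events. It is not the Spark API, and the names `TumblingWindowSketch`, `Event`, and `countByWindow` are hypothetical. It assumes non-negative timestamps in milliseconds.

```scala
object TumblingWindowSketch {
  /** A timestamped event; timestampMs is assumed to be non-negative. */
  final case class Event(timestampMs: Long, key: String)

  /** Counts events per (window start, key). Windows are fixed-size and
    * non-overlapping (tumbling): an event at time t belongs to the window
    * starting at (t / windowMs) * windowMs. */
  def countByWindow(events: Seq[Event], windowMs: Long): Map[(Long, String), Int] = {
    require(windowMs > 0, "window size must be positive")
    events
      .groupBy(e => ((e.timestampMs / windowMs) * windowMs, e.key))
      .map { case (windowAndKey, grouped) => windowAndKey -> grouped.size }
  }
}
```

A streaming engine applies the same grouping incrementally as events arrive and additionally has to decide how long to wait for late events, which this batch-style sketch leaves out.
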
  22. Realtime Analytics
      [Architecture diagram]
       • Sources: Sensors and IoT (unstructured); Logs (unstructured); Media (unstructured); Files (unstructured); Business/custom apps (structured)
       • Stages: Ingest; Store; Prep & train; Model & serve
       • Services: Azure Event Hubs; Azure IoT Hub; Apache Kafka; Azure Data Factory; Azure Blob Storage; Azure Databricks; Cosmos DB; Azure SQL Data Warehouse; Azure Analysis Services; Power BI; Apps
      See https://azure.microsoft.com/en-us/solutions/architecture/
  23. Realtime Analytics (same architecture diagram as slide 22), with code for writing to Cosmos DB:
        import org.apache.spark.sql.SaveMode
        import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark
        import com.microsoft.azure.cosmosdb.spark.config.Config
        import com.microsoft.azure.cosmosdb.spark.schema._  // needed for the .cosmosDB writer method

        // The slide lists only the key names; supply a real value for each placeholder.
        val writeConfig = Config(Map(
          "Endpoint" -> "<Endpoint>",
          "MasterKey" -> "<MasterKey>",
          "Database" -> "<Database>",
          "PreferredRegions" -> "<PreferredRegions>",
          "Collection" -> "<Collection>",
          "WritingBatchSize" -> "<WritingBatchSize>"))

        // sentimentdata is a DataFrame produced earlier in the notebook.
        sentimentdata.write.mode(SaveMode.Overwrite).cosmosDB(writeConfig)
  24. Realtime Analytics (same architecture diagram as slide 22), with the callouts "Also used PolyBase", "Low Cost", and "Outperformed", and code for writing to SQL Database:
        import com.microsoft.azure.sqldb.spark.config.Config
        import com.microsoft.azure.sqldb.spark.connect._
        import org.apache.spark.sql.SaveMode

        // Only dbTable has a value on the slide; the other values are placeholders.
        val config = Config(Map(
          "url" -> "<url>",
          "databaseName" -> "<databaseName>",
          "dbTable" -> "dbo.Clients",
          "user" -> "<user>",
          "password" -> "<password>"))

        // collection is a DataFrame produced earlier in the notebook.
        collection.write.mode(SaveMode.Append).sqlDB(config)
  25. Realtime Analytics (same architecture diagram as slide 22), with the label Databricks Delta
  26. Realtime Analytics. See https://azure.microsoft.com/en-us/solutions/architecture/
  27. Functions
  28. KEDA: Kubernetes-based Event-Driven Autoscaling
      https://github.com/kedacore/keda
      [Diagram: Kubernetes cluster]
       • Components: Function pods; Horizontal pod autoscaler; Kubernetes store; KEDA (Metrics adapter, Scaler, Controller); CLI; External trigger source
       • Arrow labels: "1 -> n or n -> 1"; "0 -> 1 or 1 -> 0"; "Any events?"; "Register + trigger and scaling definition"
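
The two scaling paths on this slide (0 to 1 and back, and 1 to n and back) can be illustrated with a small function that turns a backlog of pending events into a desired pod count. This is a simplified sketch under stated assumptions, not KEDA's or the Horizontal Pod Autoscaler's actual implementation. The names and the `targetPerReplica` and `maxReplicas` parameters are hypothetical, and real deployments also apply polling intervals and cooldown periods.

```scala
object KedaScalingSketch {
  /** Desired number of function pods for a given backlog.
    *
    * @param pendingEvents    events waiting at the external trigger source
    * @param targetPerReplica backlog one pod is expected to handle (assumed setting)
    * @param maxReplicas      upper bound on the pod count (assumed setting)
    */
  def desiredReplicas(pendingEvents: Long, targetPerReplica: Long, maxReplicas: Int): Int = {
    require(pendingEvents >= 0, "pendingEvents must not be negative")
    require(targetPerReplica > 0, "targetPerReplica must be positive")
    require(maxReplicas >= 1, "maxReplicas must be at least 1")
    if (pendingEvents == 0) {
      0 // "Any events?" answered with no: scale to zero (the 1 -> 0 path)
    } else {
      // Ceiling division: enough pods so that each handles at most targetPerReplica
      // events (the 0 -> 1 and 1 -> n paths), capped at maxReplicas.
      val needed = (pendingEvents + targetPerReplica - 1) / targetPerReplica
      math.min(needed, maxReplicas.toLong).toInt
    }
  }
}
```

With a target of 10 events per pod and a cap of 5 pods, a backlog of 25 events yields 3 pods, while an empty source yields 0.
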
  29. FPGA-Enabled Inference on Azure: A Scalable FPGA-Powered DNN Serving Platform
       • Fast: Ultra-low latency, high-throughput serving of DNN models at low batch sizes
       • Flexible: Future proof, adaptable to fast-moving AI space and evolving model types
       • Friendly: Turnkey deployment of TensorFlow/CNTK/Caffe/etc.
      [Diagram: Neural FU; network switches (L0, L1); FPGAs (F)]
      Supported models: ResNet 50, ResNet 152, VGG-16, SSD-VGG, DenseNet-121
      Support for image classification and recognition scenarios
