UTAD - Jornadas de Informática - Potential of Big Data


Short presentation given at the Universidade de Trás-os-Montes e Alto Douro (UTAD) IT event for students and faculty members. The talk is an overview of Big Data, how Microsoft technologies tackle the subject, and how students could leverage these tools in their projects and in the future.

Published in: Software
UTAD - Jornadas de Informática - Potential of Big Data

  1. 1. • Analysis
  2. 2. • Analysis • Transportation
  3. 3. • Analysis • Transportation • Access Control
  4. 4. • Analysis • Transportation • Access Control • Replication
  5. 5. • Analysis • Transportation • Access Control • Replication • Storage
  6. 6. Data at Rest • Data in Motion • Data in Many Forms • Data in Doubt • Data into Money
  7. 7. map (in_key, in_value) -> list(out_key, intermediate_value)
        reduce (out_key, list(intermediate_value)) -> list(out_value)
  8. 8. Map Reduce • Read lines from file • Convert line to Key-Value Pair(s) • Filter (by key/value) • Combine Values with similar Keys • Shuffle data across nodes for reduces by Key • Sort by Key • Aggregate (reduce) • Filter (based on aggregated value) • Write results to file
  9. 9. Deer Bear River Car Car River Deer Car Bear
        Deer Bear River | Car Car River | Deer Car Bear
        Deer, 1 Bear, 1 River, 1 | Car, 1 Car, 1 River, 1 | Deer, 1 Car, 1 Bear, 1
        Bear, 1 Bear, 1 | Car, 1 Car, 1 Car, 1 | Deer, 1 Deer, 1 | River, 1 River, 1
        Bear, 2 | Car, 3 | Deer, 2 | River, 2
        Bear, 2 Car, 3 Deer, 2 River, 2
  10. 10. Windows Azure Blob Storage (WABS) • Distributed File System • Applications (by cluster type): Spark: Spark, Spark Streaming, Spark MLlib; Storm: Storm, Kafka; HBase: HBase, Zookeeper; ….; Hadoop: HDFS APIs, MapReduce, Sqoop, Pig, Hive (Tez), Mahout, Oozie • Yet Another Resource Negotiator (YARN) • Acquisition: Azure Data Factory • Stream Processing: Stream Analytics, Event Hub • Machine Learning: Azure Machine Learning • NoSQL: Table Storage, DocumentDB
  11. 11. DATA: Business apps, Custom apps, Sensors and devices • INTELLIGENCE • ACTION: People, Automated Systems
  13. 13. Process real-time data in Azure using a simple SQL language • Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications • Performs time-sensitive analysis using SQL-like language against multiple real-time streams and reference data • Outputs to persistent stores, dashboards or back to devices • Devices: Point of Service Devices, Self Checkout Stations, Kiosks, Smart Phones, Slates/Tablets, PCs/Laptops, Servers, Digital Signs, Diagnostic Equipment, Remote Medical Monitors, Logic Controllers, Specialized Devices, Thin Clients, Handhelds, Security, POS Terminals, Automation Devices, Vending Machines, Kinect, ATM
  14. 14. Fully managed service to support orchestration of data movement and processing • Connect to relational or non-relational data that is on-premises or in the cloud • Single pane of glass to monitor and manage data processing pipelines • Publish to Power BI • Compose and orchestrate data services at scale • C# • MapReduce • Trusted data • BI & analytics • Hive • Pig • Stored Procedures • Azure Machine Learning
  15. 15. Azure Machine Learning: Powerful predictive analytics in Azure • Help organizations eliminate undifferentiated heavy lifting • ML Algorithms are best of breed and embrace OSS: MS + R + Python + BYOA • ML Studio for productive development: faster experiments result in faster improvements; Visual Workflows & ML Experiments • ML Operationalization to remove deployment friction: build entire ML Apps & deploy as Cloud APIs • ML Gallery: provide ML applications like apps in an ‘app store’; publish/consume APIs in a 2-sided market
  16. 16. Enable enterprise-wide self-service data source registration and discovery • A metadata repository that allows users to register, enrich, understand, discover, and consume data sources • Delivers differentiated value through: ‒ Data source discovery, rather than data discovery ‒ Support for data from any source: structured and unstructured, on premises and in the cloud ‒ Publishing, discovery and consumption through any tool ‒ Annotation crowdsourcing: empowering any user to capture and share their knowledge, while allowing IT to maintain control and oversight
  17. 17. Power Map with custom maps allows deeper geospatial explorations and storytelling • Power Query brings modern data discovery, connectivity, shaping and publishing to Excel • Analysis Services connectivity for Power View allows users to leverage existing IT investments • Support for more sophisticated data models in Power Pivot: date and calc tables, many-to-many relationships, etc. • Power Map w/ Custom Maps • Power Query
  18. 18. Power BI dashboards and KPIs for monitoring the health of your business • New data visualizations and touch-optimized exploration in HTML5 • Power BI mobile apps across devices including iPad and iPhone • Support for new data sources including Dynamics CRM Online and SQL Server Analysis Services • Dashboard • Tree Map
  19. 19. A hyper-scale repository for big data analytic workloads • Hadoop File System compatible with HDFS™ • Integrated with HDInsight, Revolution R, Hortonworks, Cloudera • Based on YARN • Petabyte-sized files • No size limits to data in single account • Massive throughput to increase performance • AAD-based access control • Data management • Devices
  20. 20. Azure Data Lake Analytics Service • A new distributed analytics service • Built on Apache YARN • Scales dynamically with the turn of a dial • Pay by the query • Supports Azure AD for access control, roles, and integration with on-prem identity systems • Built with U-SQL to unify the benefits of SQL with the power of C# • Processes data across Azure
  21. 21. Example overall data flow and Architecture • Event & data producers • Web logs • IoT, Mobile Devices etc. • Social Data • Ingest • Event Hubs • Transform • Stream Analytics • HDInsight • Azure Data Factory • DW / Long-term storage • Azure SQL DB • Azure Data Lake • Azure SQL DW • Predictive analytics • Azure Machine Learning (Fraud detection etc.) • Present & decide • Power BI • Web dashboards • Mobile devices
  22. 22. From zero to finished, analytical apps and scenarios • Pre-Configured • Allows customers to accelerate the process of building analytical apps • Go from zero to sample app in minutes, from sample app to finished solution in a week
  23. 23. The Internet of Things – Manufacturing • GLOBAL OPERATIONS (Management, R&D, Field Service): I can see my production line status and recommend adjustments to better manage operational cost. I know when to deploy the right resources for predictive maintenance to minimize equipment failures and reduce service cost. I gain insight into usage patterns from multiple customers and track equipment deterioration, enabling me to reengineer products for better performance. • MANUFACTURING PLANT: Aggregate product data, customer sentiment, and other third-party syndicated data to identify and correct quality issues. Manage equipment remotely, using temperature limits and other settings to conserve energy and reduce costs. Monitor production flow in near-real time to eliminate waste and unnecessary work-in-process inventory. • GLOBAL FACILITY INSIGHT: Implement condition-based maintenance alerts to eliminate machine down-time and increase throughput. • THIRD-PARTY LOGISTICS: Provide cross-channel visibility into inventories to optimize supply and reduce shared costs in the value chain. • CUSTOMER SITE: Transmits operational information to the partner (e.g. OEM) and to field service engineers for remote process automation and optimization.
  24. 24. The Internet of Things – Retail • IoT DATA FUELS CUSTOMER AND PRODUCT INSIGHTS • RIGHT OFFER, RIGHT TIME, RIGHT PLACE • INSPIRATION, DISCOVERY, PRE-SHOPPING • MOBILE EXPERIENCE • IN-STORE SHOPPING • CHECKOUT • REFLECTION • STORE PURCHASE HISTORY: Dog food • Purchase History • Online Behavior • Shopping Route • Weather Data • BEST DEAL • Marketing • Merchandizing • Retail • 200ft • Have you seen these! • We’re ready for the rain! • #ShoppingSuccess
  25. 25. The Internet of Things – Hospitality & Travel • Save money with more accurate arrival time predictions • Provide a seamless traveler experience from the curb to the gate, and enable context-sensitive notifications • Provide guests with a connected tablet to control room settings, request services, and provide feedback, and save their preferences • Centrally manage critical station assets, everything from communication and security networks to escalators and HVAC control systems • Send reports and sensor data to maintenance crews for faster turnaround • Configure notifications on employee devices of restaurant equipment maintenance needs • Manage inventory in near real time, and monitor food storage temperatures and expirations • NEW GATE: B7 • 25% off • ON TIME
  27. 27. Thank You
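
The map and reduce signatures on slide 7 and the word count walked through on slide 9 can be sketched in a few lines of plain Python. This is a minimal single-process illustration, not Hadoop or HDInsight code: the function names `map_fn`, `shuffle`, `reduce_fn` and `word_count` are hypothetical, and the shuffle step is simulated with an in-memory dictionary rather than data moving across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(in_key: int, in_value: str) -> Iterator[Tuple[str, int]]:
    """map(in_key, in_value) -> list(out_key, intermediate_value).

    in_key is the line number (unused here); in_value is the line text.
    """
    for word in in_value.split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key and sort by key (in-memory stand-in)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_fn(out_key: str, values: List[int]) -> List[int]:
    """reduce(out_key, list(intermediate_value)) -> list(out_value)."""
    return [sum(values)]


def word_count(lines: List[str]) -> Dict[str, int]:
    """Run map, shuffle/sort, then reduce over the input lines."""
    mapped = [pair for i, line in enumerate(lines) for pair in map_fn(i, line)]
    grouped = shuffle(mapped)
    return {key: reduce_fn(key, values)[0] for key, values in grouped.items()}


# The three input lines from the word-count slide.
lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
counts = word_count(lines)
print(counts)  # {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

In a real cluster the map calls run in parallel on separate splits of the file and the shuffle moves each key's values to the node that reduces it; the sketch only preserves the order of the stages.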