Azure Days 2019: Business Intelligence on Azure (Marco Amhof & Yves Mauron)

In this session we present a project in which we built a comprehensive BI system for and in the Azure cloud using Azure Blob Storage, Azure SQL, Azure Logic Apps, and Azure Analysis Services. We report on the challenges we faced, how we solved them, and which learnings and best practices we took away.

Published in: Technology

  1. 1. Data Warehouse – (high) added value in Azure Cloud. Speakers: yves.mauron@trivadis.com, marco.amhof@trivadis.com. news.trivadis.com/blog, @trivadis
  2. 2. Agenda
     • The modern data warehouse pattern
     • Azure Cloud architecture pillars
     • Ingest data (Azure Data Factory, ADF SSIS Runtime Integration, others)
     • Store data (Azure SQL, Azure Data Lake, Azure SQL DWH)
     • Process data (Azure Databricks)
     • Azure SSAS: advantages and possibilities of a BI semantic layer
     • Power BI
     • Trivadis customer cases
     • Case 1 (lift and shift of an on-premises DWH to Azure Cloud)
     • Case 2 (BI solution in Azure Cloud with Data Lake and Databricks technology)
     • Case 3 (greenfield BI solution built from scratch in Azure Cloud)
  3. 3. Modern data warehousing pattern (© Microsoft Corporation). Sources: Social, LOB, Graph, IoT, Image, CRM. Stages: INGEST (data orchestration and monitoring), STORE (big data store), PREP (transform & clean), MODEL & SERVE (& store) (data warehouse). Outputs: AI, BI + Reporting, Advanced Analytics.
  4. 4. The Microsoft offering: SQL Server, Hybrid, Azure Data Services. AI built-in | Most secure | Lowest TCO. Headings on the slide: Flexibility of choice; Security and performance; Reason over any data, anywhere. Workloads: data warehouses, data lakes, operational databases. Data sources: Social, LOB, Graph, IoT, Image, CRM. Claims on the slide: industry leader 4 years in a row; #1 TPC-H performance; T-SQL query over any data; 70% faster; 2x the global reach; 99.9% SLA; easiest lift and shift with no code changes.
  5. 5. The Azure data landscape. Stages: INGEST, STORE, PREP, MODEL & SERVE. Services shown: Azure Data Factory, Azure Import/Export service, Azure SDK, Azure CLI, Cognitive Services, Bot Service, Azure Search, Azure Data Catalog, Azure ExpressRoute, Azure network security groups, Azure Functions, Visual Studio, Operations Management Suite, Azure Active Directory, Azure key management service, Azure Blob Storage, Azure Data Lake Store, Azure IoT Hub, Azure Event Hubs, Kafka on Azure HDInsight, Azure SQL Data Warehouse, Azure SQL DB, Azure Cosmos DB, Azure Analysis Services, Power BI, Azure HDInsight, Azure Databricks, Azure Stream Analytics, Azure ML, ML Server.
  6. 6. Ingest
  7. 7. ETL vs. ELT vs.
  8. 8. Azure Data Factory: data integration service (serverless, scalable, hybrid)
     • Hybrid pipeline model: seamlessly span on-prem, Azure, other clouds & SaaS; run on demand, on a schedule, on data availability, or on an event
     • Data movement @ scale: cloud & hybrid with 80+ connectors provided; up to 1 GB/s
     • SSIS package execution: lift existing SQL Server ETL to Azure; use existing tools (SSMS, SSDT)
     • Author & monitor: programmability with a multi-language SDK; visual tools
  9. 9. Pipeline, foreach (…), Trigger, Linked Service
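The building blocks named on this slide (pipeline, ForEach, trigger, linked service) fit together roughly as sketched below. The sketch builds JSON-shaped definitions as plain Python dictionaries. It only approximates the Data Factory JSON format and is not taken from the talk: every name in it (`SourceBlobStorage`, `CopyAllTables`, `tableList`, `SourceDataset`, `SinkDataset`, `DailyTrigger`, the table names) is a hypothetical placeholder.

```python
import json

# Hypothetical linked service: holds the connection information for a data store.
linked_service = {
    "name": "SourceBlobStorage",
    "properties": {
        "type": "AzureBlobStorage",
        # Placeholder only; a real definition would reference a secret store.
        "typeProperties": {"connectionString": "<connection-string>"},
    },
}


def build_foreach_pipeline(name, list_parameter, inner_activity):
    """Return a pipeline definition that runs inner_activity once per list item."""
    return {
        "name": name,
        "properties": {
            "parameters": {list_parameter: {"type": "Array"}},
            "activities": [
                {
                    "name": "ForEachItem",
                    "type": "ForEach",
                    "typeProperties": {
                        # Expression that resolves to the list passed in at run time.
                        "items": {
                            "value": f"@pipeline().parameters.{list_parameter}",
                            "type": "Expression",
                        },
                        "activities": [inner_activity],
                    },
                }
            ],
        },
    }


# Hypothetical copy step executed for each item of the list.
copy_activity = {
    "name": "CopyOneTable",
    "type": "Copy",
    "inputs": [{"referenceName": "SourceDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SinkDataset", "type": "DatasetReference"}],
}

pipeline = build_foreach_pipeline("CopyAllTables", "tableList", copy_activity)

# Hypothetical schedule trigger that starts the pipeline once a day.
trigger = {
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {"recurrence": {"frequency": "Day", "interval": 1}},
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": pipeline["name"],
                    "type": "PipelineReference",
                },
                "parameters": {"tableList": ["dbo.Customer", "dbo.Sales"]},
            }
        ],
    },
}

print(json.dumps(pipeline, indent=2))
```

The point of the split is that the pipeline only describes what to do per item, the trigger decides when it runs and with which list, and the linked service keeps connection details out of both.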
  10. 10. Mapping Data Flow: no-code data transformation @ scale (© Microsoft Corporation). Data cleansing, transformation, aggregation, conversion, etc. Cloud scale via Spark execution. Easily build resilient data flows.
  11. 11. Store
  12. 12. Azure SQL Database resource types
     • Single: database-scoped deployment option with predictable workload performance. Best for apps that require a resource guarantee at database level.
     • Elastic Pool: shared resource model optimized for greater efficiency of multi-tenant applications. Best for SaaS apps with multiple databases that can share resources at database level, achieving better cost efficiency.
     • Managed Instance: instance-scoped deployment option with high compatibility with SQL Server and full PaaS benefits. Best for modernization at scale with low friction and effort.
  13. 13. Azure SQL Data Warehouse
     • Best-in-class price-performance: up to 14x faster and 94% less expensive than cloud competitors
     • Industry-leading security: defense-in-depth security and a 99.9% financially backed availability SLA
     • Intelligent workload management: prioritize resources for the most valuable workloads
     • Separation of compute and storage
     • Developer productivity
     • Data flexibility
  14. 14. Azure SQL Data Warehouse MPP Architecture
  15. 15. Data ingestion using external data sources: PolyBase (© Microsoft Corporation)
      Overview: PolyBase supports querying files (Parquet, delimited text) stored in a Hadoop File System (HDFS), Azure Blob storage, or Azure Data Lake Store. To query files, users create three objects: an external data source, an external file format, and an external table. Starting with SQL Server 2019, PolyBase can also access external data in SQL Server, Oracle, Teradata, and MongoDB.
      -- Create the Azure Data Lake Gen2 storage reference (abfss uses the dfs endpoint)
      CREATE EXTERNAL DATA SOURCE AzureStorage
      WITH (
          TYPE = HADOOP,
          LOCATION = 'abfss://<container>@<storageaccnt>.dfs.core.windows.net',
          CREDENTIAL = AzureStorageCredential -- not required if using managed identity
      );
      -- Type of format in Hadoop (CSV, RCFILE, ORC, PARQUET)
      CREATE EXTERNAL FILE FORMAT TextFileFormat
      WITH (
          FORMAT_TYPE = DELIMITEDTEXT,
          FORMAT_OPTIONS (FIELD_TERMINATOR = '|', USE_TYPE_DEFAULT = TRUE)
      );
      -- LOCATION: path to the file or directory that contains the data
      CREATE EXTERNAL TABLE [dbo].[CarSensor_Data] (
          [SensorKey] int NOT NULL,
          [Speed] float NOT NULL,
          [YearMeasured] int NOT NULL
      )
      WITH (
          LOCATION = '/Demo/',
          DATA_SOURCE = AzureStorage,
          FILE_FORMAT = TextFileFormat
      );
  16. 16. DWUs (CPU, memory, and IO) in $$$. Azure SQL Data Warehouse (a preview)
  17. 17. What is a Data Lake? (© Microsoft Corporation) It is a central storage repository that holds data coming from many sources in a raw, granular format. It can store structured, semi-structured, or unstructured data, which means data can be ingested quickly and kept in a more flexible format for future use cases.
     • Characteristics: schema-on-read (ELT); a collection of data, not a platform; a perfect place for evolving data
     • Benefits: quickly ingest high volumes of diverse data structures; enable advanced analytics and data exploration; scalability and storage cost reduction
     • Best practices: data governance is needed to avoid a data swamp; security considerations; design your data lake; metadata management
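Schema-on-read, as listed under the characteristics above, can be illustrated with a minimal Python sketch: the raw records are stored unchanged, and each reader applies its own schema when loading them. The field names are borrowed from the `CarSensor_Data` table on the PolyBase slide; the sample values, the `firmware` field, and the helper `read_with_schema` are invented for illustration.

```python
import json

# Raw data is kept exactly as it arrived (one JSON document per line).
raw_lines = [
    '{"SensorKey": "7", "Speed": "88.5", "YearMeasured": "2019"}',
    '{"SensorKey": "8", "Speed": "61", "YearMeasured": "2019", "firmware": "1.2"}',
]

# The schema is chosen by the reader, not enforced when the data was written.
sensor_schema = {"SensorKey": int, "Speed": float, "YearMeasured": int}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and cast only the fields named in schema.

    Fields missing from a record become None; extra fields are ignored.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


rows = list(read_with_schema(raw_lines, sensor_schema))

# A later use case can read the same raw data with a different schema.
firmware_rows = list(read_with_schema(raw_lines, {"SensorKey": int, "firmware": str}))
```

Because nothing was discarded at ingestion time, the second reader can pick up the `firmware` field that the first schema never mentioned.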
  18. 18. Data Lake Design Considerations (© Microsoft Corporation)
     Data Lake Zones
     • Transient Landing Zone: temporary storage of data to meet regulatory and quality control requirements. Limited access. May not be required depending on requirements.
     • Raw Zone: original source of data ready for consumption. Metadata publicly available, but access to data still limited.
     • Trusted Zone: standardized and enriched datasets ready for consumption by those with appropriate role-based access. Metadata available to all.
     • Curated/Refined Zone: data transformed from the Trusted Zone to meet specific business requirements.
     • Sandbox Zone: playground for data scientists for ad hoc exploratory use cases.
     Data Governance Considerations
     • Security and compliance: access control, encryption, row-level security
     • Metadata management
     • Data quality
     • Lifecycle management
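One way to make the zones above concrete is a fixed folder convention. The following Python sketch assumes a layout of `<zone>/<source>/<dataset>/<yyyy>/<mm>/<dd>`; that layout, the lowercase zone folder names, and the helpers `zone_path` and `promote` are assumptions for illustration and do not come from the slides.

```python
from datetime import date
from enum import Enum
from pathlib import PurePosixPath


class Zone(Enum):
    """Zones from the slide, mapped to hypothetical top-level folder names."""
    TRANSIENT = "transient"
    RAW = "raw"
    TRUSTED = "trusted"
    CURATED = "curated"
    SANDBOX = "sandbox"


def zone_path(zone, source, dataset, load_date):
    """Build a folder path of the form <zone>/<source>/<dataset>/<yyyy>/<mm>/<dd>."""
    return PurePosixPath(
        zone.value,
        source,
        dataset,
        f"{load_date:%Y}",
        f"{load_date:%m}",
        f"{load_date:%d}",
    )


def promote(path, target):
    """Return the location of the same dataset and load date in another zone."""
    return PurePosixPath(target.value, *path.parts[1:])


raw = zone_path(Zone.RAW, "crm", "customers", date(2019, 6, 1))
trusted = promote(raw, Zone.TRUSTED)
print(raw)
print(trusted)
```

Keeping the zone as the first path segment means access control can be granted per zone, which matches the differing access rules described for each zone.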
  19. 19. Transform & Clean
  20. 20. Azure Databricks: a fast, easy and collaborative Apache® Spark™ based analytics platform optimized for Azure (© Microsoft Corporation). Best of Databricks, best of Microsoft:
     • Designed in collaboration with the founders of Apache Spark
     • One-click setup; streamlined workflows
     • Interactive workspace that enables collaboration between data scientists, data engineers, and business analysts
     • Native integration with Azure services (Power BI, SQL DW, Cosmos DB, ADLS, Azure Storage, Azure Data Factory, Azure AD, Event Hub, IoT Hub, HDInsight Kafka, SQL DB)
     • Enterprise-grade Azure security (Active Directory integration, compliance, enterprise-grade SLAs)
  21. 21. Azure Databricks (architecture diagram). Optimized Databricks Runtime Engine: Databricks I/O, Apache Spark, Serverless. Collaborative Workspace: data engineer, data scientist, business analyst. Deploy Production Jobs & Workflows: multi-stage pipelines, job scheduler, notification & logs. Inputs: cloud storage, data warehouses, Hadoop storage, IoT / streaming data. Outputs: REST APIs, machine learning models, BI tools, data exports, data warehouses. Taglines: enhance productivity; build on secure & trusted cloud; scale without limits.
  22. 22. Azure Databricks Deployment (diagram). Entry points: Azure Portal, Azure Resource Manager APIs. Azure Databricks Workspace: notebooks, clusters, jobs; interact using the UI or the Azure Databricks REST API. Run on a managed resource group: workspace VNET, workspace NSG rules, cluster node(s), attached Azure BLOB (DBFS). Integrate with other Azure services: Azure BLOBs, Data Lake, Event Hub, IoT Hub, Kafka, Cosmos DB, SQL DW, Data Factory.
  23. 23. Model & Serve
  24. 24. Why do I also need a cube if I have a data warehouse?
     • Semantic layer
     • Handle many concurrent users
     • Implement complex business logic (DAX)
     • Aggregating data for performance
     • Multidimensional analysis
     • No joins or relationships
     • Hierarchies, KPIs
     • Row-level security
     • Advanced time calculations
     • Slowly Changing Dimensions (SCD)
     • Required for some reporting tools
  25. 25. What is Azure Analysis Services?
     • Azure Analysis Services is a fully managed platform as a service (PaaS) that provides enterprise-grade data models in the cloud.
     • Use advanced mashup and modeling features to combine data from multiple data sources, define metrics, and secure your data in a single, trusted tabular semantic data model.
     • The data model provides an easier and faster way for users to browse massive amounts of data for ad hoc data analysis.
  26. 26. Modern Data Analytics Landscape (diagram). Azure Data Factory orchestrates data pipeline activity, workflow & scheduling. Stages: Ingest, Store, Prep & Train, Model & Serve, Intelligence. Sources (on-prem, cloud apps & data): business / custom apps (structured); logs, files and media (unstructured). Components shown: Data Factory, Azure Storage, Azure Databricks (Spark), PolyBase, Azure SQL Data Warehouse, Azure Analysis Services, analytical dashboards (Power BI).
  27. 27. BI & Reporting
  28. 28. Power BI; Power BI Report Server (on-premises)
  29. 29. Customer success stories
  30. 30. Case 1 – «on-premise» to Azure Cloud Customer
  31. 31. Case 1 - Approach
  32. 32. Case 1 – Timeline
  33. 33. Case 2 – Modern Data Warehouse – (Delta) Lake
  34. 34. Case 3 – Management Reporting (diagram). Stages: Ingest, Store, Prep / Train, Model & Serve. Components shown: Logic App, Function App, BLOB Storage, SQL Database, Analysis Services, Azure AD, Messaging, Structures. Other labels: Sub1, Sub2, Sub n; Power BI, Excel, …
  35. 35. DEMO
  36. 36. Questions?
