Data lake analytics for the admin

In this session we will take a look at Azure Data Lake from an administrator's perspective.

Do you know who has what access where? How much data is in your data lake? And is access to the data lake running normally?

In this session we will show you what the portal offers for keeping an eye on your Azure Data Lake, along with further scripts and tools for the corresponding tasks.

Dive with us into the depths of your Data Lake.


  1. Data Lake Analytics for the Admin. Tillmann Eitelberg & Oliver Engels. DBCC International – Friday 23.10.2020
  2. WE CREATE SMART AND CASUAL DATADESIGN. EVERY DAY. © ppt by OH22
  3. ABOUT US: Tillmann Eitelberg | Oliver Engels
  4. About Us
     + Tillmann Eitelberg, CEO, oh22information services GmbH, t.eitelberg@oh22.net, Kellerstr. 3, 53772 Königswinter, Germany, www.oh22.is
     + Oliver Engels, CEO, oh22data AG, o.engels@oh22.net, Otto-Hahn-Str. 22, 65520 Bad Camberg, Germany, www.oh22.net
  5. WHAT IS THE SESSION ABOUT?
  6. We moved our Analytics workload to the cloud… Now we have a Data Lake! But hey, how do I control that beast?
  7. -- Quality ++
  8. What is it about?
     + In a Modern Data Warehouse, Lake House or Enterprise Data Lake there are clear structures for how data is stored
     + Data is moved through different zones and / or sections
     + Folder structures result from the respective processing (partitioning) of the data (e.g. yyyy/mm/dd/hh/)
     + Data is made available to other users in the company at the end of a process chain
     + Data is stored in different access tiers to make data management cost-effective
     + Most of the settings are done by Data Engineers (or Data Engineers show the Admins where to click)
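The time-partitioned folder layout named on the slide above (yyyy/mm/dd/hh/) can be sketched as a small path builder. This is a minimal illustration, not the presenters' code: the function name `partition_path` and the `zone` and `dataset` arguments are hypothetical, and normalizing timestamps to UTC is an assumption made here so that every writer lands in the same folder for the same hour.

```python
from datetime import datetime, timedelta, timezone


def partition_path(zone: str, dataset: str, ts: datetime) -> str:
    """Return a zone/dataset/yyyy/mm/dd/hh/ folder path for one timestamp."""
    # Assumption of this sketch: partitions are keyed by UTC time.
    ts_utc = ts.astimezone(timezone.utc)
    return f"{zone}/{dataset}/{ts_utc:%Y/%m/%d/%H}/"


# A timestamp that is already in UTC.
utc_example = partition_path(
    "raw", "sales", datetime(2020, 10, 23, 9, 30, tzinfo=timezone.utc)
)

# A timestamp at UTC+2 shortly after midnight falls into the previous UTC day.
berlin = timezone(timedelta(hours=2))
shifted_example = partition_path(
    "raw", "sales", datetime(2020, 10, 23, 1, 0, tzinfo=berlin)
)

print(utc_example)
print(shifted_example)
```

A predictable layout like this is what lets an administrator estimate data volume per day or hour simply by listing one folder prefix.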
  9. I want Metadata! I need Performance Monitoring! I have to create rules! I must be alerted! I rely on Logs!
  10. How can administrators get an overview?
      + There are many different Azure services and features that can give administrators back control
      + Additionally, Azure Monitor provides a complete overview of events on the storage account (and of course other services)
      + PowerShell offers additional possibilities to get information about individual services
  11. METADATA: The data about the data for the admin
  12. Metadata (© by SQLChick: https://www.sqlchick.com/entries/tag/Azure+Data+Lake)
  13. Metadata
  14. Metadata: Blob Custom Properties + Azure Search Blob Index Service with Tags
  15. 15. DATA LAKE ANALYTICS FOR THE ADMIN 15© ppt by OH22
  16. DEMO: Metadata with PowerShell
  17. ACTIVITY LOG
  18. Activity Log
      + General log, which provides insights into subscription-level events
      + Access directly via the portal in Storage Account / Data Lake
      + Data can be sent to different endpoints, such as Log Analytics, Event Hub or Storage Account
      + The configuration of the activity logs is stored in log profiles, which can be created automatically via PowerShell or CLI
  19. Activity Log

      # Settings needed for the new log profile
      $logProfileName = "default"
      $locations = (Get-AzLocation).Location
      $locations += "global"
      $subscriptionId = "<your Azure subscription Id>"
      $resourceGroupName = "<resource group name your event hub belongs to>"
      $eventHubNamespace = "<event hub namespace>"
      # Not set on the original slide, but required by the storage account Id below
      $storageAccountName = "<storage account name>"

      # Build the service bus rule Id from the settings above
      $serviceBusRuleId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.EventHub/namespaces/$eventHubNamespace/authorizationrules/RootManageSharedAccessKey"

      # Build the storage account Id from the settings above
      $storageAccountId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Storage/storageAccounts/$storageAccountName"

      # Create the log profile that routes the activity log to both endpoints
      Add-AzLogProfile -Name $logProfileName -Location $locations -StorageAccountId $storageAccountId -ServiceBusRuleId $serviceBusRuleId
  20. Activity Log
      + For some events a change history can also be displayed (e.g. virtual machines)
  21. AZURE POLICY
  22. Azure Policy
      + A global service, not assigned to a subscription, resource group or resource
      + Azure Policy helps to enforce organizational standards and to assess compliance at-scale
      + Complete control of the response to an evaluation
          + Deny the resource change
          + Log the change to the resource
          + Alter the resource before the change
          + Alter the resource after the change
          + Deploy related compliant resources
      + Uses a JSON format to form the logic to determine if a resource is compliant or not
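To make the "JSON format to form the logic" point concrete, here is a hedged sketch of a policy rule in the if/then shape that Azure Policy definitions use, together with a toy evaluator. The evaluator is purely illustrative and is not how the Azure Policy engine works internally; it handles only `allOf`, `equals` and `notEquals`, the flat resource dictionaries are a simplification, and the property alias for secure transfer is an assumption to verify against the current built-in definitions.

```python
import json

# Policy rule in the if/then shape. The alias for the "secure transfer"
# setting is an assumption of this sketch; check it before real use.
policy_rule = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            {
                "field": "Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly",
                "notEquals": "true",
            },
        ]
    },
    "then": {"effect": "audit"},
}


def matches(condition: dict, resource: dict) -> bool:
    """Toy matcher: supports allOf, equals and notEquals only."""
    if "allOf" in condition:
        return all(matches(c, resource) for c in condition["allOf"])
    # Values are compared as lower-cased strings in this sketch.
    value = str(resource.get(condition["field"], "")).lower()
    if "equals" in condition:
        return value == str(condition["equals"]).lower()
    if "notEquals" in condition:
        return value != str(condition["notEquals"]).lower()
    raise ValueError(f"unsupported condition: {condition}")


def evaluate(rule: dict, resource: dict):
    """Return the effect name when the rule's 'if' block matches, else None."""
    return rule["then"]["effect"] if matches(rule["if"], resource) else None


# Hypothetical, flattened resource descriptions for illustration.
insecure = {
    "type": "Microsoft.Storage/storageAccounts",
    "Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly": False,
}
secure = dict(insecure)
secure["Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly"] = True

print(json.dumps(policy_rule, indent=2))
print(evaluate(policy_rule, insecure), evaluate(policy_rule, secure))
```

Swapping the effect from `audit` to `deny` is what moves a rule like this from logging a non-compliant change to blocking it.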
  23. Azure Policy
      + Unlike RBAC, Azure Policy does not restrict actions / operations; resource properties are checked for compliance without considering who made a change or who has permissions to do that
      + Over 10 built-in Azure Policy definitions for Azure Storage
          + for example “Geo-redundant storage should be enabled for Storage Accounts”
      + With an initiative, several policy definitions can be combined into a group, e.g. to meet a higher-level standard
  24. Azure Policy
      + The evaluation of the policies takes place on the following events:
          + New policy assignment (~30 minutes)
          + Updated assignment of an existing policy (~30 minutes)
          + Deployment of a resource (~15 minutes)
          + Every 24 hours
          + Update of the resource provider
          + On-demand (~3 minutes)
  25. Azure Policy
  26. DEMO: Policies Live
  27. LOG ANALYTICS
  28. Log Analytics
      + Log Analytics is the primary tool in the Azure portal for writing log queries
      + Log queries help you to fully leverage the value of the data collected in Azure Monitor Logs
      + Based on Azure Data Explorer
      + Log queries are written using the Kusto query language (KQL)
      + Different data sources (= resources) write data to different tables
      + KQL can (of course) use multiple tables in one query
  29. Log Analytics
      + Log queries are also used in:
          + Alert Rules
          + Azure Dashboards
          + Views
          + Export
          + PowerShell (Get-AzOperationalInsightsSearchResult)
  30. Log Analytics

      SecurityEvent
      | where TimeGenerated > ago(7d)
      | where EventID == 4625
      | summarize count() by Computer, bin(TimeGenerated, 1h)
      | render timechart
  31. Log Analytics
  32. CLASSIC METRICS: Transition to metrics in Azure Monitor
  33. Classic Metrics
      + On August 31, 2023, Storage Analytics metrics, also referred to as classic metrics, will be retired
      + Classic metrics are sent to and stored in an Azure storage account
          + Collection and aggregation through the Storage Account
          + Storage takes place in table storage on the storage account
      + With Azure Monitor, Azure Storage sends metric data to the Azure Monitor back end
      + Azure Monitor metrics can be sent to multiple locations
      + Classic metrics write 0 values to the log for non-existent metrics; in Azure Monitor these values do not exist
      + Microsoft Docs: Transition to metrics in Azure Monitor https://docs.microsoft.com/de-de/azure/storage/common/storage-metrics-migration?toc=/azure/storage/blobs/toc.json
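The difference in how missing values are reported matters when old and new metric series are compared or charted: classic metrics contain explicit zeros, while an Azure Monitor series simply has no data point for that interval. A minimal sketch of one way to bridge this is to zero-fill the sparse series on the client side. The function name `zero_fill` and the sample values are hypothetical, and the one-minute grain is just an example.

```python
from datetime import datetime, timedelta


def zero_fill(points: dict, start: datetime, end: datetime, step: timedelta) -> list:
    """Return one (timestamp, value) pair per bucket in [start, end).

    Buckets that are absent from `points` are reported as 0, which mimics
    the explicit zero values of classic metrics.
    """
    filled = []
    ts = start
    while ts < end:
        filled.append((ts, points.get(ts, 0)))
        ts += step
    return filled


start = datetime(2020, 10, 23, 9, 0)
# Sparse series: only two of four one-minute buckets have a data point.
sparse = {start: 12, start + timedelta(minutes=2): 5}

dense = zero_fill(sparse, start, start + timedelta(minutes=4), timedelta(minutes=1))
values = [value for _, value in dense]
print(values)
```

Whether a gap really means zero depends on the metric, so a fill like this is best limited to count-style metrics such as transactions.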
  34. ALERTS: When things happen, I need to be aware
  35. Alerts
      + Alerts proactively notify you when issues are found with your data lake
      + They analyze data in Azure Monitor
      + You define:
          + Scope (your data lake)
          + Condition (e.g. Ingest)
          + Action (e.g. email, runbook)
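The three parts of an alert listed above (scope, condition, action) can be modelled in a few lines to show how they fit together. This is a toy model under stated assumptions, not the Azure Monitor API: the `AlertRule` class, the average-over-window condition and the threshold value are all hypothetical, and the "action" is just a callable that stands in for an email or runbook.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class AlertRule:
    scope: str                      # the monitored resource, e.g. a data lake
    threshold: float                # condition: window average above this value
    action: Callable[[str], None]   # stand-in for email, runbook, ...

    def evaluate(self, samples: Sequence[float]) -> bool:
        """Run the action when the window average exceeds the threshold."""
        if not samples:
            return False
        average = sum(samples) / len(samples)
        fired = average > self.threshold
        if fired:
            self.action(f"{self.scope}: average {average:.1f} exceeded {self.threshold}")
        return fired


notifications = []
rule = AlertRule(scope="mydatalake", threshold=100.0, action=notifications.append)

quiet = rule.evaluate([80.0, 90.0])     # below the threshold, nothing is sent
noisy = rule.evaluate([150.0, 170.0])   # above the threshold, one notification
print(notifications)
```

Keeping the action as a separate callable mirrors the way one notification target can be reused across several alert conditions.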
  36. Alerts
  37. Alerts
  38. Alerts
  39. Thank you sponsors!!
