7. Data Warehouse
Main features
• Store data from multiple data sources
• Used for historical and trend analysis reporting
• Central repository for many subject areas
• Contains single version of truth
• Not to be used for OLTP applications
Use cases
• Reduce stress on production systems
• Optimized for read access
• Keep historical records
• Restructure / rename tables, fields, model data
• Protect against source system upgrades
• Use Master Data Management
• Easy to create BI solutions on top of it (e.g. Azure Analysis Services Cubes)
• Reduces security access to multiple production systems
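The points above (one central store fed from several sources, fields renamed on the way in, history kept for trend reporting) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for a warehouse engine; the table name fact_sales, the source names "crm" and "webshop", and all sample rows are hypothetical and not taken from the slides.

```python
import sqlite3


def build_warehouse(conn):
    # One central reporting table shared by all source systems (hypothetical schema).
    conn.execute(
        "CREATE TABLE fact_sales ("
        " source_system TEXT, customer_name TEXT, amount REAL, sale_month TEXT)"
    )


def load(conn, source_system, rows, column_map):
    """Rename source fields to the warehouse column names and append the rows."""
    for row in rows:
        renamed = {column_map[key]: value for key, value in row.items()}
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (
                source_system,
                renamed["customer_name"],
                renamed["amount"],
                renamed["sale_month"],
            ),
        )


def monthly_totals(conn):
    """Trend-style report that reads only from the warehouse, not the sources."""
    cursor = conn.execute(
        "SELECT sale_month, SUM(amount) FROM fact_sales "
        "GROUP BY sale_month ORDER BY sale_month"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)

# Two made-up source systems with different field names for the same concepts.
load(
    conn,
    "crm",
    [{"cust": "Contoso", "amt": 100.0, "month": "2023-01"}],
    {"cust": "customer_name", "amt": "amount", "month": "sale_month"},
)
load(
    conn,
    "webshop",
    [
        {"buyer": "Fabrikam", "total": 50.0, "period": "2023-01"},
        {"buyer": "Contoso", "total": 25.0, "period": "2023-02"},
    ],
    {"buyer": "customer_name", "total": "amount", "period": "sale_month"},
)

print(monthly_totals(conn))
```

Because reports query fact_sales rather than the source systems, a renamed field in a source only requires a change to its column_map, which is one way to read "protect against source system upgrades".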
12. Azure Data Lake Storage
Main features
• Petabyte scale storage
• Hierarchical namespace
• Hadoop compatible access with ABFS driver
ADLS best practices
• Use Service Principals
• Use Security Groups over individual users
• Enable Gen 2 firewall with Azure services access
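As a small illustration of Hadoop-compatible access, the ABFS driver addresses data with URIs of the form abfs[s]://<container>@<account>.dfs.core.windows.net/<path>. The helper below is only a sketch that assembles such a URI; the function name abfs_uri and the account and container names are hypothetical.

```python
def abfs_uri(account: str, container: str, path: str = "", secure: bool = True) -> str:
    """Build an ABFS URI for a path in an ADLS Gen2 container.

    "abfss" is the TLS-secured scheme; "abfs" is the unsecured variant.
    """
    scheme = "abfss" if secure else "abfs"
    clean_path = path.lstrip("/")  # avoid a double slash after the host name
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{clean_path}"


# Hypothetical storage account "mydatalake" with a container named "raw".
print(abfs_uri("mydatalake", "raw", "/sales/2023/01/"))
```

A URI like this is what Spark or other Hadoop-compatible engines would typically be pointed at when reading from the lake.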
13. Azure Data Factory
Main features
• Cloud ETL service
• Scale-out serverless data integration & data transformation
• Code-free UI
• Monitoring & Management
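To make the ETL idea concrete, here is a minimal extract-transform-load sketch in plain Python. It is not the Azure Data Factory API (ADF pipelines are normally authored in its UI); the sample CSV, the function names, and the in-memory target list are all assumptions made for illustration.

```python
import csv
import io

# Made-up source extract; the second row carries a bad amount on purpose.
RAW_CSV = """order_id,customer,amount
1, Contoso ,100.50
2,Fabrikam,not-a-number
3,Contoso,20
"""


def extract(text):
    """Read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Trim names, cast types, and drop rows whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that fail the numeric check
        clean.append(
            {
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip(),
                "amount": amount,
            }
        )
    return clean


def load(rows, target):
    """Append the cleaned rows to the target and report how many were loaded."""
    target.extend(rows)
    return len(rows)


warehouse_table = []
loaded = load(transform(extract(RAW_CSV)), warehouse_table)
print(loaded, warehouse_table)
```

In a managed service the same three stages would be configured as connectors and transformation steps rather than hand-written functions.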
21. Azure Synapse – SQL Analytics focus areas
• Best in class price per performance: Up to 94% less expensive than competitors
• Developer productivity: Use preferred tooling for SQL data warehouse development
• Workload aware query execution: Manage heterogenous workloads through workload priorities and isolation
• Data flexibility: Ingest variety of data sources to derive the maximum benefit. Query all data.
• Industry-leading security: Defense-in-depth security and 99.9% financially backed availability SLA
Credits: James Serra
22. Data Lake with Data Warehouse use cases
Data Lake – staging & preparation
• Data scientists/power users
• Batch processing
• Data refinement / cleaning
• ETL workloads
• Store older / backup data
• Sandbox for data exploration
• One-time reports
• Quick access to data
• Unknown questions
Data Warehouse – serving, security & compliance
• Business people
• Low latency
• Complex joins
• Interactive ad-hoc queries
• High number of users
• Additional security
• Large support for tools
• Dashboards
• Self-service BI
• Known questions
23. Data Lakehouse concerns / limitations
Problems
• Reliability – keeping data lake and warehouse consistent
• Data staleness – older data in warehouse
• Limited support for advanced analytics – top ML systems don’t work well on warehouses
• TCO – extra cost for data copies in warehouse
Concerns wrt RDBMS
• Speed – RDBMS faster, especially MPP
• Security – no row-level security (RLS), column-level security, or dynamic data masking
• Complexity – metadata separate from data, file based
• Missing features – referential integrity, TDE, workload management; many features require Spark lock-in
25. References
Data Warehouse
❖ Why you need a data warehouse
❖ James Serra website
Azure
❖ ADLS Gen 2 Storage Account
❖ Azure Data Factory
❖ Azure Databricks
❖ Azure Data Factory Mapping Data Flows
❖ Azure Synapse Analytics
❖ Azure Synapse Link for SQL
Delta Lake & Data Mesh
❖ Databricks Deltalake
❖ Delta
❖ Data Mesh Architecture
❖ ThoughtWorks – Data Mesh
26. Nilesh Gule
ARCHITECT | MICROSOFT MVP
“Code with Passion and
Strive for Excellence”
nileshgule
@nileshgule Nilesh Gule
NileshGule
www.handsonarchitect.com
https://bit.ly/youtube-nileshgule