Intuit Data Ecosystem supports unique consumer and small business assets at scale, and handle petabytes of customer data. We have 8M active small business customers and 16M paid workers that uses Intuit Quick Books and Quick Books Payroll Products. Huge customer base and large volumes of data always challenges the data teams in terms of freshness of data, correctness of data etc. This presentation is intended to cover such problems we faced at Intuit along with the data observability model we follow to cure, detect and prevent data Issues. We would like to provide deep insights into the implementations and the impact of some of the great work done by Intuit in this direction.
Databricks CEO Ali Ghodsi introduces Databricks Delta, a new data management system that combines the scale and cost-efficiency of a data lake, the performance and reliability of a data warehouse, and the low latency of streaming.
Modernizing to a Cloud Data ArchitectureDatabricks
Organizations with on-premises Hadoop infrastructure are bogged down by system complexity, unscalable infrastructure, and the increasing burden on DevOps to manage legacy architectures. Costs and resource utilization continue to go up while innovation has flatlined. In this session, you will learn why, now more than ever, enterprises are looking for cloud alternatives to Hadoop and are migrating off of the architecture in large numbers. You will also learn how elastic compute models’ benefits help one customer scale their analytics and AI workloads and best practices from their experience on a successful migration of their data and workloads to the cloud.
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...DATAVERSITY
With the explosive growth of DataOps to drive faster and more confident business decisions, proactively understanding the quality and health of your data is more important than ever. Data observability is an emerging discipline within data quality used to expose anomalies in data by continuously monitoring and testing data using artificial intelligence and machine learning to trigger alerts when issues are discovered.
Join Julie Skeen and Shalaish Koul from Precisely, to learn how data observability can be used as part of a DataOps strategy to improve data quality and reliability and to prevent data issues from wreaking havoc on your analytics and ensure that your organization can confidently rely on the data used for advanced analytics and business intelligence.
Topics you will hear addressed in this webinar:
Data observability – what is it and how it can complement your data quality strategy
Why now is the time to incorporate data observability into your DataOps strategy
How data observability helps prevent data issues from impacting downstream analytics
Examples of how data observability can be used to prevent real-world issues
Intuit Data Ecosystem supports unique consumer and small business assets at scale, and handle petabytes of customer data. We have 8M active small business customers and 16M paid workers that uses Intuit Quick Books and Quick Books Payroll Products. Huge customer base and large volumes of data always challenges the data teams in terms of freshness of data, correctness of data etc. This presentation is intended to cover such problems we faced at Intuit along with the data observability model we follow to cure, detect and prevent data Issues. We would like to provide deep insights into the implementations and the impact of some of the great work done by Intuit in this direction.
Databricks CEO Ali Ghodsi introduces Databricks Delta, a new data management system that combines the scale and cost-efficiency of a data lake, the performance and reliability of a data warehouse, and the low latency of streaming.
Modernizing to a Cloud Data ArchitectureDatabricks
Organizations with on-premises Hadoop infrastructure are bogged down by system complexity, unscalable infrastructure, and the increasing burden on DevOps to manage legacy architectures. Costs and resource utilization continue to go up while innovation has flatlined. In this session, you will learn why, now more than ever, enterprises are looking for cloud alternatives to Hadoop and are migrating off of the architecture in large numbers. You will also learn how elastic compute models’ benefits help one customer scale their analytics and AI workloads and best practices from their experience on a successful migration of their data and workloads to the cloud.
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...DATAVERSITY
With the explosive growth of DataOps to drive faster and more confident business decisions, proactively understanding the quality and health of your data is more important than ever. Data observability is an emerging discipline within data quality used to expose anomalies in data by continuously monitoring and testing data using artificial intelligence and machine learning to trigger alerts when issues are discovered.
Join Julie Skeen and Shalaish Koul from Precisely, to learn how data observability can be used as part of a DataOps strategy to improve data quality and reliability and to prevent data issues from wreaking havoc on your analytics and ensure that your organization can confidently rely on the data used for advanced analytics and business intelligence.
Topics you will hear addressed in this webinar:
Data observability – what is it and how it can complement your data quality strategy
Why now is the time to incorporate data observability into your DataOps strategy
How data observability helps prevent data issues from impacting downstream analytics
Examples of how data observability can be used to prevent real-world issues
To take a “ready, aim, fire” tactic to implement Data Governance, many organizations assess themselves against industry best practices. The process is not difficult or time-consuming and can directly assure that your activities target your specific needs. Best practices are always a strong place to start.
Join Bob Seiner for this popular RWDG topic, where he will provide the information you need to set your program in the best possible direction. Bob will walk you through the steps of conducting an assessment and share with you a set of typical results from taking this action. You may be surprised at how easy it is to organize the assessment and may hear results that stimulate the actions that you need to take.
In this webinar, Bob will share:
- The value of performing a Data Governance best practice assessment
- A practical list of industry Data Governance best practices
- Criteria to determine if a practice is best practice
- Steps to follow to complete an assessment
- Typical recommendations and actions that result from an assessment
The first step towards understanding data assets’ impact on your organization is understanding what those assets mean for each other. Metadata – literally, data about data – is a practice area required by good systems development, and yet is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices and enable you to combine practices into sophisticated techniques supporting larger and more complex business initiatives. Program learning objectives include:
- Understanding how to leverage metadata practices in support of business strategy
- Discuss foundational metadata concepts
- Guiding principles for and lessons previously learned from metadata and its practical uses applied strategy
Metadata strategies include:
- Metadata is a gerund so don’t try to treat it as a noun
- Metadata is the language of Data Governance
- Treat glossaries/repositories as capabilities, not technology
Introduction to Snowflake Datawarehouse and Architecture for Big data company. Centralized data management. Snowpipe and Copy into a command for data loading. Stream loading and Batch Processing.
A Practical Enterprise Feature Store on Delta LakeDatabricks
The feature store is a data architecture concept used to accelerate data science experimentation and harden production ML deployments. Nate Buesgens and Bryan Christian describe a practical approach to building a feature store on Delta Lake at a large financial organization. This implementation has reduced feature engineering “wrangling” time by 75% and has increased the rate of production model delivery by 15x. The approach described focuses on practicality. It is informed by innovative approaches such as Feast, but our primary goal is evolutionary extensions of existing patterns that can be applied to any Delta Lake architecture.
Key Takeaways:
– Understand the key use cases that motivate the feature store from both a data science and engineering perspective.
– Consider edge cases where there may be opportunities for simplification such as “online” predictions.
– Review a typical logical data model for a feature store and how that can be applied to your business domain.
– Consider options for physical storage of the feature store in the Delta Lake.
– Understand common access patterns including metadata-based feature discovery.
Business Intelligence (BI) and Data Management Basics amorshed
A one-day training course on the Concepts of Data Management and Business Intelligence (BI) in the DX age
A Basic Review of BI and DM
How to Implement BI
A review of BI Tools and 2022 Gartner Quadrant Magic
Basics of Data warehouse (DWH)
An introductions to Power BI
Components of Power BI
Steps for BI Implementation
Data Culture
Intro to ETL and ELT
OLAP files and Architecture
Digital transformation or DX review
A glance at DMBOK2.0 framework
BI Challenges
Data Governance
Data Integration
Data Security and Privacy in DMBOK2.0
Data-Driven Organization
Data and BI Maturity Model
Traditional BI
Self-service BI
who is DMP
who is BI developer
what is Metadata
what is Master data
Data Quality
Data Literacy
Benefits of BI
BI features
How does BI Works?
Modern BI
Data Analytics
BI Architecture
Data Types
Data Lake
Data Mart
Data Silo
Data Visualization
Power BI Architecture and components
DI&A Slides: Data Lake vs. Data WarehouseDATAVERSITY
Modern data analysis is moving beyond the Data Warehouse to the Data Lake where analysts are able to take advantage of emerging technologies to manage complex analytics on large data volumes and diverse data types. Yet, for some business problems, a Data Warehouse may still be the right solution.
If you’re on the fence, join this webinar as we compare and contrast Data Lakes and Data Warehouses, identifying situations where one approach may be better than the other and highlighting how the two can work together.
Get tips, takeaways and best practices about:
- The benefits and problems of a Data Warehouse
- How a Data Lake can solve the problems of a Data Warehouse
- Data Lake Architecture
- How Data Warehouses and Data Lakes can work together
Data Governance Best Practices, Assessments, and RoadmapsDATAVERSITY
When starting or evaluating the present state of your Data Governance program, it is important to focus on best practices such that you don’t take a ready, fire, aim approach. Best practices need to be practical and doable to be selected for your organization, and the program must be at risk if the best practice is not achieved.
Join Bob Seiner for an important webinar focused on industry best practice around standing up formal Data Governance. Learn how to assess your organization against the practices and deliver an effective roadmap based on the results of conducting the assessment.
In this webinar, Bob will focus on:
- Criteria to select the appropriate best practices for your organization
- How to define the best practices for ultimate impact
- Assessing against selected best practices
- Focusing the recommendations on program success
- Delivering a roadmap for your Data Governance program
Describes what Enterprise Data Architecture in a Software Development Organization should cover and does that by listing over 200 data architecture related deliverables an Enterprise Data Architect should remember to evangelize.
Product-thinking is making a big impact in the data world with the rise of Data Products, Data Product Managers, data mesh, and treating “Data as a Product.” But Honest, No-BS: What is a Data Product? And what key questions should we ask ourselves while developing them? Tim Gasper (VP of Product, data.world), will walk through the Data Product ABCs as a way to make treating data as a product way simpler: Accountability, Boundaries, Contracts and Expectations, Downstream Consumers, and Explicit Knowledge.
Data Management, Metadata Management, and Data Governance – Working TogetherDATAVERSITY
The data disciplines listed in the title must work together. The key to success requires understanding the boundaries and overlaps between the disciplines. Wouldn’t it be great to be able to present the relationships between the disciplines in a simple all-in diagram? At the end of this webinar, you will be able to do just that.
This new RWDG webinar with Bob Seiner will outline how Data Management, Metadata Management, and Data Governance can be optimized to work together. Bob will share a diagram that has successfully communicated the relationship between these disciplines to leadership resulting in the disciplines working in harmony and delivering success.
Bob will share the following in this webinar:
- Categories of disciplines focused on managing data as an asset
- A definition of Data Management that embraces numerous data disciplines
- The importance of Metadata -Management to all data disciplines
- Why data and metadata require formal governance
- A graphic that effectively exhibits the relationship between the disciplines
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges with implementation of Data Mesh systems and focus on the role of open-source projects for it. Projects like Apache Spark can play a key part in standardized infrastructure platform implementation of Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to ensure Data Mesh is more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted for architects, decision-makers, data-engineers, and system designers.
Big data in transport an international transport forum overview oct 2013OpenSkyData
Comprehensive Guide on the use of Big Data in Transportation Services from the International Transport Forum. OpenSky loves making big data work for organisations large and small.
http://www.openskydata.com/our-sectors/transport.html
To take a “ready, aim, fire” tactic to implement Data Governance, many organizations assess themselves against industry best practices. The process is not difficult or time-consuming and can directly assure that your activities target your specific needs. Best practices are always a strong place to start.
Join Bob Seiner for this popular RWDG topic, where he will provide the information you need to set your program in the best possible direction. Bob will walk you through the steps of conducting an assessment and share with you a set of typical results from taking this action. You may be surprised at how easy it is to organize the assessment and may hear results that stimulate the actions that you need to take.
In this webinar, Bob will share:
- The value of performing a Data Governance best practice assessment
- A practical list of industry Data Governance best practices
- Criteria to determine if a practice is best practice
- Steps to follow to complete an assessment
- Typical recommendations and actions that result from an assessment
The first step towards understanding data assets’ impact on your organization is understanding what those assets mean for each other. Metadata – literally, data about data – is a practice area required by good systems development, and yet is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices and enable you to combine practices into sophisticated techniques supporting larger and more complex business initiatives. Program learning objectives include:
- Understanding how to leverage metadata practices in support of business strategy
- Discuss foundational metadata concepts
- Guiding principles for and lessons previously learned from metadata and its practical uses applied strategy
Metadata strategies include:
- Metadata is a gerund so don’t try to treat it as a noun
- Metadata is the language of Data Governance
- Treat glossaries/repositories as capabilities, not technology
Introduction to Snowflake Datawarehouse and Architecture for Big data company. Centralized data management. Snowpipe and Copy into a command for data loading. Stream loading and Batch Processing.
A Practical Enterprise Feature Store on Delta LakeDatabricks
The feature store is a data architecture concept used to accelerate data science experimentation and harden production ML deployments. Nate Buesgens and Bryan Christian describe a practical approach to building a feature store on Delta Lake at a large financial organization. This implementation has reduced feature engineering “wrangling” time by 75% and has increased the rate of production model delivery by 15x. The approach described focuses on practicality. It is informed by innovative approaches such as Feast, but our primary goal is evolutionary extensions of existing patterns that can be applied to any Delta Lake architecture.
Key Takeaways:
– Understand the key use cases that motivate the feature store from both a data science and engineering perspective.
– Consider edge cases where there may be opportunities for simplification such as “online” predictions.
– Review a typical logical data model for a feature store and how that can be applied to your business domain.
– Consider options for physical storage of the feature store in the Delta Lake.
– Understand common access patterns including metadata-based feature discovery.
Business Intelligence (BI) and Data Management Basics amorshed
A one-day training course on the Concepts of Data Management and Business Intelligence (BI) in the DX age
A Basic Review of BI and DM
How to Implement BI
A review of BI Tools and 2022 Gartner Quadrant Magic
Basics of Data warehouse (DWH)
An introductions to Power BI
Components of Power BI
Steps for BI Implementation
Data Culture
Intro to ETL and ELT
OLAP files and Architecture
Digital transformation or DX review
A glance at DMBOK2.0 framework
BI Challenges
Data Governance
Data Integration
Data Security and Privacy in DMBOK2.0
Data-Driven Organization
Data and BI Maturity Model
Traditional BI
Self-service BI
who is DMP
who is BI developer
what is Metadata
what is Master data
Data Quality
Data Literacy
Benefits of BI
BI features
How does BI Works?
Modern BI
Data Analytics
BI Architecture
Data Types
Data Lake
Data Mart
Data Silo
Data Visualization
Power BI Architecture and components
DI&A Slides: Data Lake vs. Data WarehouseDATAVERSITY
Modern data analysis is moving beyond the Data Warehouse to the Data Lake where analysts are able to take advantage of emerging technologies to manage complex analytics on large data volumes and diverse data types. Yet, for some business problems, a Data Warehouse may still be the right solution.
If you’re on the fence, join this webinar as we compare and contrast Data Lakes and Data Warehouses, identifying situations where one approach may be better than the other and highlighting how the two can work together.
Get tips, takeaways and best practices about:
- The benefits and problems of a Data Warehouse
- How a Data Lake can solve the problems of a Data Warehouse
- Data Lake Architecture
- How Data Warehouses and Data Lakes can work together
Data Governance Best Practices, Assessments, and RoadmapsDATAVERSITY
When starting or evaluating the present state of your Data Governance program, it is important to focus on best practices such that you don’t take a ready, fire, aim approach. Best practices need to be practical and doable to be selected for your organization, and the program must be at risk if the best practice is not achieved.
Join Bob Seiner for an important webinar focused on industry best practice around standing up formal Data Governance. Learn how to assess your organization against the practices and deliver an effective roadmap based on the results of conducting the assessment.
In this webinar, Bob will focus on:
- Criteria to select the appropriate best practices for your organization
- How to define the best practices for ultimate impact
- Assessing against selected best practices
- Focusing the recommendations on program success
- Delivering a roadmap for your Data Governance program
Describes what Enterprise Data Architecture in a Software Development Organization should cover and does that by listing over 200 data architecture related deliverables an Enterprise Data Architect should remember to evangelize.
Product-thinking is making a big impact in the data world with the rise of Data Products, Data Product Managers, data mesh, and treating “Data as a Product.” But Honest, No-BS: What is a Data Product? And what key questions should we ask ourselves while developing them? Tim Gasper (VP of Product, data.world), will walk through the Data Product ABCs as a way to make treating data as a product way simpler: Accountability, Boundaries, Contracts and Expectations, Downstream Consumers, and Explicit Knowledge.
Data Management, Metadata Management, and Data Governance – Working TogetherDATAVERSITY
The data disciplines listed in the title must work together. The key to success requires understanding the boundaries and overlaps between the disciplines. Wouldn’t it be great to be able to present the relationships between the disciplines in a simple all-in diagram? At the end of this webinar, you will be able to do just that.
This new RWDG webinar with Bob Seiner will outline how Data Management, Metadata Management, and Data Governance can be optimized to work together. Bob will share a diagram that has successfully communicated the relationship between these disciplines to leadership resulting in the disciplines working in harmony and delivering success.
Bob will share the following in this webinar:
- Categories of disciplines focused on managing data as an asset
- A definition of Data Management that embraces numerous data disciplines
- The importance of Metadata -Management to all data disciplines
- Why data and metadata require formal governance
- A graphic that effectively exhibits the relationship between the disciplines
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges with implementation of Data Mesh systems and focus on the role of open-source projects for it. Projects like Apache Spark can play a key part in standardized infrastructure platform implementation of Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to ensure Data Mesh is more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted for architects, decision-makers, data-engineers, and system designers.
Big data in transport an international transport forum overview oct 2013OpenSkyData
Comprehensive Guide on the use of Big Data in Transportation Services from the International Transport Forum. OpenSky loves making big data work for organisations large and small.
http://www.openskydata.com/our-sectors/transport.html
Is Growth Important? Yes. But Retention Is KingTheFamily
Is Growth Important? Yes. But Retention Is King. Oussama, partner at TheFamily underlines the crucial importance of Retention in the AARRR funnel.
Get the video over here: http://buff.ly/1xhBXfn
How to Become a Data Scientist
SF Data Science Meetup, June 30, 2014
Video of this talk is available here: https://www.youtube.com/watch?v=c52IOlnPw08
More information at: http://www.zipfianacademy.com
Zipfian Academy @ Crowdflower
4 Best Practices for Analyzing Healthcare DataHealth Catalyst
Meaningful healthcare analytics today generally need data from multiple source systems to help address the triple aim cost, quality, and patient satisfaction. Once appropriate data has been captured, pulled into a single place, and tied together, then data analysis can begin. In this article I share 4 ways to enable your analyst including providing them with
1) a data warehouse
2) a sandbox
3) a set of discovery tools
4) the right kind of direction.
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
What 'kind of things' does a data scientist do? What are the foundations and principles of data science? What is a Data Product? What does the data science process looks like? Learning from data: Data Modeling or Algorithmic Modeling? - talk by Carlos Somohano @ds_ldn at The Cloud and Big Data: HDInsight on Azure London 25/01/13
Revolution in Business Analytics-Zika Virus ExampleBardess Group
Even from the “man in the street” perspective, there is a sense that we are living in an increasingly algorithmic world. Self-driving cars, pizza delivery by drone, and smart houses are commonplace. The technologies enabling this revolution are both simultaneously mature and evolving rapidly.
In this session, we’ll took a look at a real world problem, the recent global outbreak of the ZIka virus, and used data analytics technologies to gain valuable insights that can assist authorities and the general public to understand and potentially prevent the spread of this disease.
Bardess Group, a sponsor of the event and business analytics consulting firm, will demonstrate how huge, extremely jagged data from a variety of sources can be collected and prepared and rapidly made available for analysis. Advanced machine learning and predictive analysis further enhance the value of those insights.
Finally, Bardess will make the case that using a systematic approach to conceptually visualize the strategic journey to insightful business analytics, the analytics value chain, can assist any organization prepare for this revolution in analytics.
Also see http://cloudera.qlik.com for the demos.
Every now and then something comes along that has the potential to change the face of business as we know it. Currently big data is being touted as that thing, and CIOs everywhere need to get a grip on what it is and how it can benefit their company.
Big Data World Forum (BDWF http://www.bigdatawf.com/) is specially designed for data-driven decision makers, managers, and data practitioners, who are shaping the future of the big data.
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...Cloudera, Inc.
You will also learn how to understand key challenges when deploying a Hadoop cluster in production, manage the entire Hadoop lifecycle using a single management console, deliver integrated management of the entire cluster to maximize IT and business agility.
A Strategic View of Enterprise Reporting and Analytics: The Data FunnelInside Analysis
The Briefing Room with Colin White and Jaspersoft
Slides from the Live Webcast on June 12, 2012
As the corporate appetite for analytics and reporting grows, companies must find a way to secure a strategic view of their information architecture. End users with varying degrees of expertise need a wide range of data and reports delivered in a timely fashion. As the audience for analytics expands, that puts pressure on IT infrastructure and staff. And now with the promise of Hadoop and MapReduce, the organization's desire for business insight becomes even more significant.
In this episode of The Briefing Room, veteran Analyst Colin White of BI Research will explain the value of being strategic with enterprise reporting. White will be briefed by Karl Van den Bergh of Jaspersoft, who will tout his company's “data funnel” concept, which is designed to strategically manage an organization's information architecture. By aligning information assets along this funnel, IT can effectively address the spectrum of analytical needs – from simple reporting to complex, ad hoc analysis – without over-taxing personnel and system resources.
Big Data, NoSQL, NewSQL & The Future of Data ManagementTony Bain
It is an exciting and interesting time to be involved in data. More change of influence has occurred in the database management in the last 18 months than has occurred in the last 18 years. New technologies such as NoSQL & Hadoop and radical redesigns of existing technologies, like NewSQL , will change dramatically how we manage data moving forward.
These technologies bring with them possibilities both in terms of the scale of data retained but also in how this data can be utilized as an information asset. The ability to leverage Big Data to drive deep insights will become a key competitive advantage for many organisations in the future.
Join Tony Bain as he takes us through both the high level drivers for the changes in technology, how these are relevant to the enterprise and an overview of the possibilities a Big Data strategy can start to unlock.