CEDAR Data Estate
Washington State Department of Health

[Diagram: CEDAR Data Estate. Labels recovered from the figure, in extraction order.]
Sources (each marked RAW): CDC, CHARS, WAIIS/IMMS, CREST, REDCap
Connectors: AZURE PIPELINE, AZURE SHARE
CEDAR DATA LAKE folder and unit labels:
01_RAW (RAW), 02_USABLE_PREP (COOKED), 03_USABLE, 04_USEABLE_OUTPUT, 05_OUTPUT, 06_SANDBOX; Data Sciences Support Unit
CEDAR (SERVED), LAUREL, MADRONA; Data Analysis Unit
PARQUET_SRC (COOKED), DEV, TEST, STAGE, PROD; Data Sciences Unit
RAW (RAW), AUDIT, RPT_OUT; Data Audit Unit

LEGEND
RAW = Identical to source
COOKED = Cleaned and conformed in common parquet files
SERVED = Business rule driven under governance teams

CEDAR Data Estate
Data sources on the left side of the diagram are typically drawn from delimited text or relational
databases and are accessed via API or direct network connection using Azure Data Factory pipelines.
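The legend defines RAW as "identical to source". A minimal sketch of what that guarantee could look like for a delimited text extract is shown here, assuming a plain file copy followed by a checksum comparison. The function names and the use of a local file system are illustrative assumptions; the slides state that the actual loads run in Azure Data Factory pipelines.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def land_raw(source_file: Path, raw_dir: Path) -> Path:
    """Copy a source extract into a raw folder without altering its bytes.

    Hypothetical helper: raises if the landed copy differs from the source,
    so that "RAW = identical to source" is checked rather than assumed.
    """
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / source_file.name
    shutil.copyfile(source_file, target)
    if sha256_of(source_file) != sha256_of(target):
        raise IOError(f"landed copy of {source_file.name} differs from source")
    return target
```

No parsing, re-encoding, or line-ending normalization happens at this stage; those belong to the later cleaning and conforming step.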
The CEDAR data lake is focused on extract and load activities, with only enough transformation to fulfill
the Kimball standards for cleaning and conforming. Consumers can receive data as
CEDAR is a “hub” data lake and, as such, is optimized for reads from a hierarchical Azure Data Lake
Storage Gen2 data store.
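One way to picture a read-optimized hierarchical store is as a predictable folder convention. The sketch below builds paths of the form zone/source/year/month/day/file and wraps them in the `abfss://` URI scheme used to address Azure Data Lake Storage Gen2. The zone names follow the diagram legend; the folder order, the account name, and the container name are assumptions for illustration, not CEDAR's actual layout.

```python
from datetime import date
from pathlib import PurePosixPath

# Tiers named in the diagram legend; lower-casing them is an assumption.
ZONES = ("raw", "cooked", "served")


def lake_path(zone: str, source: str, load_date: date, filename: str) -> PurePosixPath:
    """Build a hypothetical zone/source/yyyy/mm/dd/filename path."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return PurePosixPath(zone, source.lower(), f"{load_date:%Y/%m/%d}", filename)


def abfss_uri(account: str, container: str, path: PurePosixPath) -> str:
    """Format a path as an ADLS Gen2 abfss:// URI."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"


if __name__ == "__main__":
    # "exampleaccount" and "cedar" are placeholder names.
    path = lake_path("cooked", "CHARS", date(2024, 1, 5), "discharges.parquet")
    print(abfss_uri("exampleaccount", "cedar", path))
```

Keeping the date components as separate folders lets a reader list or prune a single day, month, or source without scanning the rest of the store.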
Each of the client data consumer units receives a read-only “spoke” from CEDAR that is implemented
using Azure Share.
The Data Sciences Support Unit acts as CEDAR’s “first best customer” and serves as a center for
standards and best practices. DSSU maintains a code repository implemented in GitHub for the benefit
of all the CEDAR data consumer units.
Client data consumers vary in requirements and implementation. Each is envisioned (though not
required) to be built as a compute-optimized data lake that adds value by using both local and shared
data to create data products that are composed of data science experiments and machine learning
models, traditional data analytics, healthcare-driven insights, and various public and private
dashboards.
Third-party public data consumers such as local healthcare authorities, hospitals, clinics, and
autonomous indigenous healthcare organizations would access the products our data consumers produce
via secure API and Azure Identity Governance-derived accounts.
Content in the CEDAR data lake’s “served” folders is anticipated to carry the following additional
attributes:
• Data is organized by business fact subject groups like vaccinations, investigations, hospitalizations
and such rather than “siloed” by individual budgetary units.
• Data is cleaned and conformed using Kimball standards and best practices for “Data Mart” units.
• Governance is crucial to the “served” folders and is anticipated to be chaired at the level of the
office of technology innovation with stakeholders from the budgetary units who contributed data
and who thereby assisted in breaking down the budgetary “silos”.
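The cleaning and conforming mentioned above can be sketched as a small row-level transform: trim whitespace, map null markers to a real null, rename source-specific columns to shared names, and write dates in ISO 8601 form. Everything specific in this example is hypothetical, including the column names, the null markers, and the date format; it illustrates the general idea rather than CEDAR's actual rules.

```python
import csv
import io
from datetime import datetime

# Assumed set of strings that sources use to mean "no value".
NULL_MARKERS = {"", "NA", "N/A", "NULL"}


def clean_value(value):
    """Trim whitespace and turn null markers into None."""
    value = (value or "").strip()
    return None if value.upper() in NULL_MARKERS else value


def conform_row(row, column_map, date_fields, date_format):
    """Rename source columns to conformed names and normalize their values."""
    conformed = {}
    for source_name, target_name in column_map.items():
        value = clean_value(row.get(source_name))
        if value is not None and target_name in date_fields:
            value = datetime.strptime(value, date_format).date().isoformat()
        conformed[target_name] = value
    return conformed


if __name__ == "__main__":
    # Hypothetical extract from one source system.
    raw_text = "PatID,VaxDate,County\n 123 ,01/05/2024,King\n456,N/A,  \n"
    column_map = {"PatID": "patient_id", "VaxDate": "event_date", "County": "county"}
    for row in csv.DictReader(io.StringIO(raw_text)):
        print(conform_row(row, column_map, {"event_date"}, "%m/%d/%Y"))
```

Giving each source its own `column_map` while sharing the target names is what would let rows from different budgetary units land in the same subject-group table.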
Description and Notes
