Karen Lopez
Data Evangelist, InfoAdvisors
@DataChick
Who’s Tinkling in Your Data Lake?
5 Questions I Can’t Answer About Me
What Is Your Phone Number?
Where Do You Live?
Who are you?
What is your Date of Birth?
What is Your Name?
About this session
Irreverent, but serious
Feel free to chat/ask questions as we go
Let’s have fun
Let’s learn a few things
Twitter is a great way to spread the word
@datachick
A New V
"The basic commandment of the Water Bureau is to provide clean, cold and constant water to its customers"…
If you want your data model to be simple...
Go out and make the world simple, then come back to me.
Missing Data
“Soloman noticed that when nothing hit the detector, a negative reading was being recorded. But you cannot get negative energy. Thus, he contacted scientists at the US space agency. It turned out that Soloman had noticed something no-one else had, including the NASA experts.”
-1 as a Faux NULL
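To see why -1 as a faux NULL hurts, here is a minimal Python sketch. The readings and the `to_missing` helper are made up for illustration; they are not from the NASA data set in the story. It compares an average taken over the raw values with one taken only over real measurements.

```python
from statistics import mean

# Hypothetical detector readings; -1 stands in for "nothing detected".
readings = [4.2, -1, 3.8, -1, 5.0]

def to_missing(value, sentinel=-1):
    """Map the sentinel to None so that missing stays visibly missing."""
    return None if value == sentinel else value

cleaned = [to_missing(r) for r in readings]
present = [r for r in cleaned if r is not None]

naive_mean = mean(readings)          # the sentinel drags the average down
honest_mean = mean(present)          # computed over real measurements only
missing_count = cleaned.count(None)  # report how much is missing, too

print(naive_mean, honest_mean, missing_count)
```

The point is not the helper itself but that "missing" and "measured as -1" need to be distinguishable before anyone does arithmetic.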
Let me tell you about my Dad…
Love Your Data Lake
Data Urinalysis™ is a good data strategy
http://www.jarnot.com/archives/2013/05/ode-to-a-shipping-label.php
Karen Says: Names
1. One of the more difficult Data Modeling problems
2. Format is different than content
3. Not all name parts are the same
4. Not all name rules are the same
5. More myths about names than names themselves
http://tinyurl.com/namemyths
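One possible way to keep format separate from content, sketched in Python. The `PersonName` class and its fields are hypothetical, not a model from this deck. It stores the name as the person gave it and does not assume a first/last structure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PersonName:
    # Content: the name exactly as the person supplied it.
    full_name: str
    # Optional extras, filled in only if the person tells us.
    preferred: Optional[str] = None   # how they want to be addressed
    sort_key: Optional[str] = None    # how they want to be alphabetized

    def display(self) -> str:
        """Format for greeting: preferred form if given, else the full name."""
        return self.preferred or self.full_name

    def ordering(self) -> str:
        """Format for sorting: never guess a 'last name' by splitting."""
        return self.sort_key or self.full_name

single = PersonName("Björk")
given = PersonName("María José Carreño Quiñones", preferred="María José",
                   sort_key="Carreño Quiñones, María José")
print(single.display(), "|", given.display(), "|", given.ordering())
```

The design choice is simply that nothing is derived from the name by rule; every variant is something the person told us.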
Let me tell you about my friend, Jessica
My friend
Love Your Data… Lake, DBs, Spreadsheets, etc.
Data Urinalysis™ is a good data strategy
Data Hygiene keeps your data contamination from spreading
Set expectations about data quality
Profile your data, often. It’s like a clinic check that you are still healthy
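A rough idea of what a quick profiling check could look like, in plain Python. The `profile_column` function, its list of suspect placeholder values, and the sample column are all assumptions for illustration. A real profile would cover far more than this.

```python
from collections import Counter

def profile_column(values, suspects=(None, "", -1, "N/A")):
    """Summarize one column: size, distinct values, suspect placeholders."""
    counts = Counter(values)
    suspect_total = sum(counts[s] for s in suspects if s in counts)
    return {
        "rows": len(values),
        "distinct": len(counts),
        "suspect": suspect_total,
        "top": counts.most_common(3),
    }

# Hypothetical phone column pulled from a table or file.
phones = ["555-0100", "", None, "N/A", "555-0100", "000-000-0000"]
report = profile_column(phones)
print(report)
```

Run on a schedule, a check like this shows whether the share of suspect values is drifting, which is the "are you still healthy" question.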
WHAT IS YOUR HOME PHONE NUMBER?
YOUR WORK NUMBER?
http://tinyurl.com/LandlineUse
Karen Says: Phone Data
1. There is no longer a concept of Home / Work phone number
2. We can’t use categories to tell us when to call
3. We should ask about times to contact separately
4. Not all phones are phones
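The points above could be modeled along these lines. This is a sketch only; `ContactPoint`, `ContactWindow`, and their fields are hypothetical names. The number carries no Home/Work label, capabilities are recorded explicitly, and contact times are asked for separately.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import List

@dataclass
class ContactWindow:
    start: time
    end: time

@dataclass
class ContactPoint:
    number: str                  # stored as the person gave it
    can_voice: bool = True       # "not all phones are phones"
    can_text: bool = False
    windows: List[ContactWindow] = field(default_factory=list)

    def ok_to_call(self, at: time) -> bool:
        """True only if voice is possible and the person named this time."""
        if not self.can_voice:
            return False
        # No stated window means we do not know, so we do not guess.
        return any(w.start <= at <= w.end for w in self.windows)

mobile = ContactPoint("555-0100", can_text=True,
                      windows=[ContactWindow(time(9), time(17))])
tablet = ContactPoint("555-0101", can_voice=False, can_text=True)
print(mobile.ok_to_call(time(10)), mobile.ok_to_call(time(20)),
      tablet.ok_to_call(time(10)))
```

Whether an unknown window should block a call is a policy decision; the sketch takes the cautious reading.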
What You Should Do:
• Anticipate International Data
• Don’t Assume Anything Based on Other Data
• Offer Data Correction Processes
• Learn about the Outliers
• … What else?
Don’t Ask Me:
• What My Name Is
• What My Address Is
• What I Do
• Where I Live
• What My Phone Number Is
…I don’t know until you tell me.
"The basic commandment of TeamData is to provide clean, consistent and timely data to its customers"…
"The basic commandment of the Water Bureau is to provide clean, cold and constant water to its customers"…
Someone will always
be pissing in your
Data Lake
If you want your data to
be simple, go out and
make the world simple
then come back to me.
Thank You!
https://forms.office.com/r/WZft74CyQq

WhoseTinklingInYourDataLake - DAMA Chicago.pdf
