Practical Data
Strategies in the
Real World of Poor
Data Quality
Andrew Patricio | www.dataeffectiveness.com
Agenda
Foundation
Data Effectiveness
Data Sophistication
Data Prioritization
Consistency, Relevancy, Accuracy
Data Quality Culture
Reporting platform
Managing Requests
Summary
Data Effectiveness
Andrew Patricio www.dataeffectiveness.com EDW2017 2
Foundation
The Foundation
ef·fec·tive·ness
iˈfektivnəs/, noun
the degree to which something is successful in producing the
intended or desired result
The Wrong Question
Not “What do you want?”
Instead, “What problem are you trying to solve?”
Effectiveness is about solving problems, not producing deliverables
What do you want?
• Focused on requirements
• Mid-stream changes = not delivering what was promised
• Encourages business to think transactionally instead of as partners in the
solution
• Overall sense is one of CYA, “We just did what you asked”
What problem are you trying to solve?
• Focused on end goal
• Mid-stream changes = steering to maintain drive towards end goal
• Forces business to think of themselves as part of the team as well as articulate
the problem thereby making sure they understand it themselves
• Overall sense is one of partners on a journey to discover an unknown answer
The Ends (sometimes) Justify the Means
Having a goal of effectiveness instead of quality means a project is
successful to the degree that it achieves the desired result
“What problem are you trying to solve?” is how to define the
desired result
This combination gives you both a structure to make progress and the freedom to follow and steer around obstacles
About Me – Andrew Patricio
President Data Effectiveness Inc
• www.dataeffectiveness.com
• Data Evaluation
• Data Strategy
• Data Infrastructure
Personal background
• Chief Data Officer at DC Public Schools
Nov 2010 to June 2016
• IT & management consulting
• Electrical Engineering
Data Effectiveness
Data Driven Decision Making
All organizations seek to make decisions based on data
Data Reality
But the reality is that the data we have available is often in poor shape
Getting to Data Driven – Reporting vs Analytics
Steve Levitt, Freakonomics Podcast, 26 June 2014
“Yeah, I think the hardest single thing is that even if you have the desire … to be
data driven, that the existing systems…I never would have thought this before I
started working with companies. I never would have imagined that it is an I.T.
problem that you simply cannot get the data you want, and the data are held in
27 different data sets that have different identifiers … the I.T. support and the
complexity in these big firms blows your mind about how hard it is to do the
littlest, simple things.”
Data analysts are NOT necessarily technologists
Data Driven Decision Making
High performance data analytics…
Requires pragmatic data reporting
…in the real world of data
Data Sophistication
Data Sophistication Cycle
Is being results oriented incompatible with being data driven?
• In a results-oriented organization the push is to “get things done” and the
velocity of the need often makes it difficult for data systems to keep up.
• Data quality often suffers and the data driven aspect gets starved of food
Solution is to design data system complexity to slightly lead process sophistication rather than being too far ahead
Data Sophistication Cycle
Data capture system evolves along with process sophistication
Reporting sophistication should keep pace with data quality
Example data entry system | Key data structure
Notepad | Open entry
Excel | Data cells
MS Access | Data records
Student Information System (SIS) | Normalized data model
Reporting system separate from SIS | Reporting data model
[Figure: Process Sophistication, Data Quality, and Reporting Sophistication columns]
Don’t build a formal data warehouse for Excel “data systems”!
Data Prioritization
Capacity vs Demand
Not all data requests are created equal
Need to prioritize given finite capacity, time, and budget
Can’t do everything perfectly but can be consciously imperfect
Effectiveness is defined by achieving desired results, so need to set expectations accordingly about those results
“What problem are you trying to solve?”
but different parts of the organization
have different problems
Data Driven Pipeline
Data Reporting → Effective Data → Data Analytics → Effective Decisions → Programs / Business → Effective Outcomes → Organizational Success
Product of business is Effective Outcomes
Product of analytics is Effective Decisions
Product of reporting is Effective Data
Organizational Goals drive focus of data pipeline
Prioritize Outcomes → Prioritize Analytics → Prioritize Data
Desired organizational success prioritizes which outcomes business should focus on
Desired business outcomes prioritize which decisions analytics should focus on
Desired analytics decisions prioritize which data reporting should focus on
Focus on relevant data
Two considerations:
1. Some organizational goals are foundational even if not necessarily value-adding, eg Regulatory, Human Resources, Financial health, etc
2. Not all interesting questions are relevant
Result is that resources are focused on data that ultimately solves
the main problem of achieving organizational goals
Data Quality
Not all of your data needs to be at the same level of quality.
Sole measure is whether or not it is sufficient to achieve a particular organizational goal
Reporting Infrastructure → Effective Data → Business Streams and various Analytics → Effective Outcomes → Overall Organizational Successes
What is Data Effectiveness?
Data Effectiveness is primary responsibility of reporting
Being effectively data driven starts with Data Effectiveness:
Getting good data, when it is needed, to who needs it
Data Reporting → Effective Data → Data Analytics → Effective Decisions → Programs / Business → Effective Outcomes → Organizational Success
CAR cycle
How does Data go wrong?
Data entry issues
• Fat fingering
• Workarounds, solving problem in front of them
• Transactional system only cares about latest enrollment action not data changes
• Poor understanding of process/policy
• Duplication
Legacy data
• Different definitions year to year (regulatory changes, etc)
• Poor QA processes (definition incorrect)
• System transitions (Poor data transfer strategy from previous vendors)
Data issues
End of year attendance example
Date report run | SY13-14 ADA (example)
July 2014 | 95%
October 2014 | 92%
Initially assumed there was a bug in the second report
The reason behind the nonsensical difference was that schools were changing the enrollment date from Aug 14 to Aug 15 instead of entering a new enrollment for the year
Registrars were just solving the immediate problem in front of them
Students who were present in SY14-15 data in June were missing in October
Data issues
School Dashboard vs
Weekly reports
Idea was to get more
regularly updated data
to schools
Inconsistencies
reduced trust in data
Two different queries implemented the same metric, and data quality issues meant slightly different answers
• School on student table used for dashboard queries
• Didn’t always match school based on enrollment history used in reports
Fixing Data Quality
How do we make our data more effective given these challenges?
Improve Data Quality long term?
Make data driven decisions today?
Consistency, Accuracy, Relevancy cycle
The problem is how to build a train as it’s moving down the track. Even when data quality is poor you still have to provide reports and make decisions; you cannot wait until everything is perfect because that’s a moving target
Good enough is good enough, but what is good enough?
Consistency → Accuracy → Relevancy
Consistency, Accuracy, Relevancy cycle
Goal is to have accurate metrics aligned with business goal
• Cannot talk about accuracy if there isn’t agreement on the value being reported
• Once the value is consistent, you can talk about if it’s accurate
• Once it’s accurate you can talk about whether it’s relevant to business goal
Starting point: Metric A: Report 1: 90, Report 2: 81, Report 3: 87
Consistent (DATA): Metric A: Report 1: 87, Report 2: 87, Report 3: 87
Accurate (INFORMATION): Metric A: Report 1: 85, Report 2: 85, Report 3: 85
Relevant (KNOWLEDGE): Metric aligned with goal
Not Relevant: Determine proposed change and go through cycle again
Consistency – DATA
“What is the value measure of this metric?”
Driven by reporting
Consistency means literally just that: a metric has the same value for the same
parameters no matter who pulls it
Factors
• Traceability – same metric in different reports must be traced back to same source
• Same parameters – need to be careful because different metrics could be referred to by
the same common name
• Time factor – legitimate changes can be made after report is run
Total absences | Truant absences | Pulled | Reason behind difference
100 | 90 | Oct | First pull
88 | 88 | Nov | Data corrected
80 | 85 | Dec | Some unexcused absences corrected to suspensions
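The consistency check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the row layout and the `find_inconsistencies` helper are hypothetical, and the values reuse the Metric A example from the cycle slide. The parameters are part of the key because different metrics can share a common name, and all rows are assumed to come from pulls made at the same time.

```python
from collections import defaultdict

# Hypothetical report extracts: (report name, metric, parameters, value).
rows = [
    ("Report 1", "Metric A", ("SY13-14", "School 123"), 90),
    ("Report 2", "Metric A", ("SY13-14", "School 123"), 81),
    ("Report 3", "Metric A", ("SY13-14", "School 123"), 87),
    ("Report 1", "Metric B", ("SY13-14", "School 123"), 40),
    ("Report 2", "Metric B", ("SY13-14", "School 123"), 40),
]


def find_inconsistencies(rows):
    """Return {(metric, params): {report: value}} for metrics whose reports disagree."""
    by_key = defaultdict(dict)
    for report, metric, params, value in rows:
        by_key[(metric, params)][report] = value
    return {
        key: values
        for key, values in by_key.items()
        if len(set(values.values())) > 1
    }


inconsistent = find_inconsistencies(rows)
for (metric, params), values in inconsistent.items():
    print(metric, params, values)
```

Only Metric A is flagged here, since Metric B has the same value in every report that shows it.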
Accuracy – INFORMATION
“Is the value measure shown for this metric correct?”
Driven by Analytics
Once you have consistency, you can work on accuracy: key is to use only good
data when verifying “accuracy”
Metric could be “inaccurate” because
• Bug in query – fix
• Wrong or inconsistent business rules – nail down definitions. Two different sets of business rules for the same metric could both be appropriate, so decide whether they are really two different metrics or whether one set is the “correct” business rules
• Data quality – identify the source and reason, then work with the data entry team
Relevancy – KNOWLEDGE
“Is this metric helping to meet our goal?”
Driven by business
Once you have accuracy, then you can determine whether that metric is useful.
If not, then either business goal or metric needs to change
• Changing metric
• Use new metric – longer to get consistency, cycle could be just as long or longer
• Refine business rules of existing metric – less effort to get consistency, shorter cycle
• Changing business goal
• Effective data in hand is worth two in the bush
• Tail could be wagging the dog but unmeasurable business goal is just a wish
Example:
Unexcused absences: Suspensions are not considered unexcused absences so this doesn’t truly capture time away from instruction
In Seat Attendance (ISA): Counts all absences except in-school suspension, etc
Cycle
As data becomes information becomes knowledge, the data sophistication of the
process grows which requires more/different metrics
Different metrics could be at different points in the cycle
[Figure: repeated Consistency → Accuracy → Relevancy cycles]
Data Quality Culture
Why is there inconsistency in the first place?
Ongoing issue is data entry problem
• Need to balance flexibility/freedom of entry with validation checks
• Most systems can validate based on patterns or entries but do not have enough flexibility to
differentiate between other valid and invalid entries
Why are there data entry errors?
Often users don’t have the access to make a needed data change so they must
enter a request for the tech team to handle
• strictness of data entry check needs to balance against technical team capacity
Short sighted data entry
Example: Enrollment overlaps
Student Information System is transactional and only tracks current state
• For enrollment it doesn’t care about data values in enrollment history
• Only cares about latest enrollment action (admit or withdrawal) and school
• “enrollment history” in system is merely log of events
• Users can willy-nilly adjust enrollment history with no effect on current status
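Because the transactional system does not police enrollment history, overlaps have to be caught on the reporting side. The sketch below is a minimal illustration, assuming a hypothetical row layout of (student id, school, admit date, withdraw date); the `find_overlaps` helper is not part of any system named in the deck. It flags every enrollment that starts before an earlier enrollment for the same student has ended.

```python
from datetime import date

# Hypothetical enrollment history rows: (student_id, school, admit, withdraw).
enrollments = [
    ("S1", "123", date(2011, 8, 24), date(2012, 6, 24)),
    ("S1", "456", date(2012, 6, 1), date(2012, 10, 10)),  # starts before 123 ends
    ("S2", "789", date(2011, 8, 24), date(2012, 6, 24)),
    ("S2", "789", date(2012, 8, 27), date(2013, 6, 20)),
]


def find_overlaps(enrollments):
    """Return (earlier, later) pairs where the later enrollment starts too soon."""
    by_student = {}
    for row in enrollments:
        by_student.setdefault(row[0], []).append(row)

    overlaps = []
    for rows in by_student.values():
        ordered = sorted(rows, key=lambda r: r[2])  # order by admit date
        latest = ordered[0]  # enrollment with the latest withdraw date so far
        for row in ordered[1:]:
            # Strict comparison: withdrawing and being admitted elsewhere
            # on the same day is treated as valid.
            if row[2] < latest[3]:
                overlaps.append((latest, row))
            if row[3] > latest[3]:
                latest = row
    return overlaps


overlaps = find_overlaps(enrollments)
for earlier, later in overlaps:
    print("Overlap:", earlier, later)
```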
Preventing data entry errors
Business line workers are our "data entry team" rather than our “users”
• Successful data reporting intimately tied to their effectiveness
• Perfect system which users are not comfortable with will still have bad data quality
Taking this point of view automatically fosters more collaboration
• Connecting the dots for end users by tracing the pathway from a specific data entry error to specific
issue on data report
• Data Integrity Management system displays errors to “data entry team”
• includes steps as to how issue can be fixed
• Includes direct link to relevant record in transactional system to minimize context switching
Users → Data Entry Team
Central system to flag data errors to users for them to correct
• Ideally errors reported back to users who entered it
• Provides specific resolution steps
Data Integrity Management system
Data Integrity Management System
Fixing Data
Error Correction Cycle
• Feed back errors to users for them to correct
• Technical team looks for other common data entry errors to either prevent through
front-end validation or add to error checking
Transactional Systems → Error Identification → Error Dashboard → Users (ie “data entry team”) → Fix Data Errors → Transactional Systems
Technical team → Improve Front End Validations, Update Error Patterns
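One way to picture the error identification step is as a list of error patterns run against records from the transactional system, with each hit carrying its resolution steps and a direct link to the record. The sketch below is an assumption-laden illustration, not the system described in the deck: the field names, the two patterns, the `sis.example.org` URL, and all helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical URL template for the record page in the transactional system.
RECORD_URL = "https://sis.example.org/enrollments/{record_id}"


@dataclass
class ErrorPattern:
    code: str
    check: Callable[[dict], bool]  # True when the record has this error
    resolution: str                # steps shown to the data entry team


@dataclass
class DataError:
    code: str
    record_id: str
    entered_by: str
    resolution: str
    record_url: str


# Hypothetical patterns; the technical team would extend this list over time.
PATTERNS = [
    ErrorPattern(
        code="WITHDRAW_BEFORE_ADMIT",
        # ISO date strings compare correctly as plain strings.
        check=lambda r: bool(r["withdraw_date"]) and r["withdraw_date"] < r["admit_date"],
        resolution="Open the enrollment and correct the withdraw date.",
    ),
    ErrorPattern(
        code="MISSING_SCHOOL",
        check=lambda r: not r["school"],
        resolution="Enter the school for this enrollment.",
    ),
]


def identify_errors(records, patterns=PATTERNS):
    """Run every pattern against every record and collect the hits."""
    return [
        DataError(p.code, r["record_id"], r["entered_by"], p.resolution,
                  RECORD_URL.format(record_id=r["record_id"]))
        for r in records
        for p in patterns
        if p.check(r)
    ]


def dashboard_by_user(errors):
    """Group errors by the user who entered the record."""
    grouped = {}
    for error in errors:
        grouped.setdefault(error.entered_by, []).append(error)
    return grouped


records = [
    {"record_id": "E1", "entered_by": "registrar_a", "school": "123",
     "admit_date": "2014-08-25", "withdraw_date": "2014-06-20"},
    {"record_id": "E2", "entered_by": "registrar_b", "school": "",
     "admit_date": "2014-08-25", "withdraw_date": None},
    {"record_id": "E3", "entered_by": "registrar_a", "school": "456",
     "admit_date": "2014-08-25", "withdraw_date": None},
]

dashboard = dashboard_by_user(identify_errors(records))
```

Keeping the patterns in one list mirrors the cycle above: a newly discovered error pattern is either added here or prevented by a new front-end validation.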
Data Integrity Management System
Reporting Platform
Single system for operations and reporting
Many organizations create reports from queries directly off transactional systems
• Makes querying a bear due to complex data model for transactional system
• All reports require technical team capacity, even simple ones
• Highly normalized = simple knowledge is stored in a complex way
• Optimized for inserts not reporting
• Business definitions often exist only in query code
Example: find Residency Verification
select decode(afv.value, null, 'N', 438, 'N', 'Y') "Residency Verification SY13-14"
from students p, adhoc_fields_values afv, adhoc_fields_drop_downs afdd
where p.pupil_number = afv.pupil_number(+) and afv.adhoc_fields_def_ID(+) = 109
and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+)
and afv.value = AFDD.FIELD_KEY_VALUE(+)
Reporting platform - Speed
Data model focused on reporting, not on transactions
• space vs speed tradeoff highly biased towards speed
• Virtually unlimited disk space
• Batch processing not real time
• Complete flexibility to organize data optimally for ease of reporting
• Central store for all siloed data (data-warehouse lite)
Example reporting tables: Student Demographics, Admit_withdraw, Attendance Base, Assessment, Courses_Taken
Reporting platform – ease of use
Really nothing more than a dedicated reporting database, not data warehouse
Data model can be tailored for reporting
• Keeps track of all changes, not just latest data (valid from, valid to)
• Super flat, Highly denormalized
• Redundancy okay so long as we have data traceability
• have multiple copies/formats/structures of same base data for different users/uses
• Fewer joins so can shift technical capacity to more complex business rules
• Can be exposed more directly to data analysts for increased self-service
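The “valid from, valid to” idea can be illustrated with a small history table that closes the current row whenever a value changes instead of overwriting it. This is a sketch under assumptions: the `OPEN_ENDED` sentinel, the dictionary layout, and both helper functions are hypothetical and only show the general technique.

```python
from datetime import date

OPEN_ENDED = date(9999, 12, 31)  # hypothetical sentinel meaning "still current"


def record_change(history, student_id, school, change_date):
    """Close the student's current row, if any, and append the new value."""
    for row in history:
        if row["student_id"] == student_id and row["valid_to"] == OPEN_ENDED:
            row["valid_to"] = change_date
    history.append({
        "student_id": student_id,
        "school": school,
        "valid_from": change_date,
        "valid_to": OPEN_ENDED,
    })


def school_as_of(history, student_id, as_of):
    """Return the school that was recorded for the student on a given date."""
    for row in history:
        if row["student_id"] == student_id and row["valid_from"] <= as_of < row["valid_to"]:
            return row["school"]
    return None


history = []
record_change(history, "S1", "123", date(2013, 8, 26))
record_change(history, "S1", "456", date(2014, 8, 25))

print(school_as_of(history, "S1", date(2014, 6, 1)))
print(school_as_of(history, "S1", date(2014, 10, 1)))
```

Because earlier rows are never overwritten, a report re-run months later can still ask what the data looked like on the original report date.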
Transactional system:
select decode(afv.value, null, 'N', 438, 'N', 'Y') "Residency Verification"
from students p, adhoc_fields_values afv, adhoc_fields_drop_downs afdd
where p.pupil_number = afv.pupil_number(+)
and afv.adhoc_fields_def_ID(+) = 109
and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+)
and afv.value = AFDD.FIELD_KEY_VALUE(+)
Reporting platform:
select [Residency Verification] from student_demographics_snapshot
Reporting platform - Consistency
Common processing
• Common query code centralized
• Batch ETL so can make multiple passes to pre-calculate higher order metrics
Consistent business rules
• can have old and new metrics back-calculated as well (old vs new truancy rules)
• Calculate each metric in one place so one number, right or wrong, is reported
Data Traceability
• Data path from systems of record to reports fully documented
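A minimal sketch of centralized business rules: every report calls the same function, and old and new versions of a rule are calculated side by side so trends can be compared. The thresholds below are placeholders chosen for illustration; the deck does not state the actual old or new truancy rules, and the function and field names are hypothetical.

```python
# Hypothetical rule definitions; the real old and new truancy rules
# are not given in the deck.
def truant_old_rule(student):
    return student["unexcused_absences"] >= 15


def truant_new_rule(student):
    return student["unexcused_absences"] >= 10


# The single place where metric definitions live.
METRICS = {
    "truant_old_rule": truant_old_rule,
    "truant_new_rule": truant_new_rule,
}


def calculate_metrics(students):
    """Calculate every centralized metric once per student."""
    return [
        {"student_id": s["student_id"],
         **{name: rule(s) for name, rule in METRICS.items()}}
        for s in students
    ]


students = [
    {"student_id": "123", "unexcused_absences": 12},
    {"student_id": "456", "unexcused_absences": 3},
]

results = calculate_metrics(students)
for row in results:
    print(row)
```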
Herding Kittens → One Big Powerful Cat
Reporting Platform Example Architecture
SSIS, SQL Server, Perl on Virtual Machine servers
Sources: Accounting data system, HR data system, Assessment data dumps, External imports, Misc Data Files, CRM, Misc Systems, Primary ERP
→ ETL (SQL Server Integration Services, Perl, Manual loads)
→ Reporting Database (MS SQL Server)
→ Data Mart (MS SQL Server)
→ Direct SQL (SQL Server Management Studio)
Reporting Platform – Business Rules Centralized
Based on weekly attendance report
• Updated daily
• Calculates individual student attendance metrics
Metric | Details
Truancy | Calculates truancy based on old rules and new rules so can compare trends
Absence Counts | Period and Daily; Unexcused, Excused, In Seat Attendance, Suspension
Reporting Platform – common processing tasks
Enrollment admit withdraw matching
• SIS stores enrollment as separate admit and withdraw events
• Need to match admits to withdrawals for the same enrollment period and school
Matched enrollment periods:
Admit Date | Withdraw Date | School
24 August 2011 | 24 June 2012 | 123
24 June 2012 | 10 October 2012 | 456
11 October 2012 | 1 January 3030 | 789
Admit and withdraw events:
Date | Type | School
24 August 2011 | Admit | 123
24 June 2012 | Withdrawal | 123
24 June 2012 | Admit | 456
10 October 2012 | Withdrawal | 456
11 October 2012 | Admit | 789
Currently enrolled is stored as a “withdrawal date” in the far future so that there is an actual date and not a null to compare against: currently enrolled is today() < [withdraw date]
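The matching step can be sketched as follows for a single student, using the event rows from the table above. The function names and the handling of unmatched events are assumptions made for illustration; only the far-future date of 1 January 3030 and the `today() < [withdraw date]` test come from the slide.

```python
from datetime import date

FAR_FUTURE = date(3030, 1, 1)  # stands in for "no withdrawal yet"

# Events for one student, as (date, type, school).
events = [
    (date(2011, 8, 24), "Admit", "123"),
    (date(2012, 6, 24), "Withdrawal", "123"),
    (date(2012, 6, 24), "Admit", "456"),
    (date(2012, 10, 10), "Withdrawal", "456"),
    (date(2012, 10, 11), "Admit", "789"),
]


def match_enrollments(events):
    """Pair each admit with the next withdrawal from the same school."""
    open_admits = {}  # school -> admit date still waiting for a withdrawal
    periods = []
    unmatched = []    # events that cannot be paired: data quality exceptions
    # sorted() is stable, so same-day events keep their input order.
    for event_date, event_type, school in sorted(events, key=lambda e: e[0]):
        if event_type == "Admit":
            if school in open_admits:
                unmatched.append((open_admits[school], "Admit", school))
            open_admits[school] = event_date
        elif school in open_admits:
            periods.append((open_admits.pop(school), event_date, school))
        else:
            unmatched.append((event_date, event_type, school))
    for school, admit_date in open_admits.items():
        periods.append((admit_date, FAR_FUTURE, school))
    periods.sort()
    return periods, unmatched


def is_currently_enrolled(period, today):
    """A period is current when today falls before its withdraw date."""
    admit_date, withdraw_date, _school = period
    return admit_date <= today < withdraw_date


periods, unmatched = match_enrollments(events)
for period in periods:
    print(period)
```

Using a real date instead of a null keeps the comparison a plain date test with no special case for missing values.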
Reporting Platform – Optimized for Reporting
Generally two ways we need to analyze assessments
• Single view of all assessments for a student – data in columns
• Each row a single student for a particular school year
• Comparing one run of an assessment with another – data in rows
• Each row a single assessment for a single student for a particular school year
Data in rows:
Student | Assessment | SY | Score
123 | A1 Q1 | SY1415 | 90
123 | A1 Q2 | SY1415 | 80
123 | A1 Q3 | SY1415 | 70
123 | A1 Q4 | SY1415 | 100
456 | A2 Sem 1 | SY1415 | 65
Data in columns:
Student | A1 Q1 | A1 Q2 | A1 Q3 | A1 Q4 | A2 Sem 1 | A2 Sem 2 | SY
123 | 90 | 80 | 70 | 100 | 76 | 87 | SY1415
456 | 60 | 70 | 80 | 90 | 65 | 86 | SY1415
All traceable back to the same original data load, so the potential for different answers is minimized
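Both layouts can be produced from the same load. The sketch below derives the one-row-per-student view from the one-row-per-assessment view; the `to_wide` helper is hypothetical, and the sample rows are a subset of the scores shown in the tables.

```python
# One row per assessment: (student, assessment, school year, score).
long_rows = [
    ("123", "A1 Q1", "SY1415", 90),
    ("123", "A1 Q2", "SY1415", 80),
    ("123", "A1 Q3", "SY1415", 70),
    ("123", "A1 Q4", "SY1415", 100),
    ("456", "A1 Q1", "SY1415", 60),
    ("456", "A2 Sem 1", "SY1415", 65),
]


def to_wide(long_rows):
    """Pivot to one row per student and school year, one column per assessment."""
    columns = []  # assessment names in order of first appearance
    scores_by_key = {}
    for student, assessment, sy, score in long_rows:
        if assessment not in columns:
            columns.append(assessment)
        scores_by_key.setdefault((student, sy), {})[assessment] = score
    return [
        {"Student": student, "SY": sy,
         **{column: scores.get(column) for column in columns}}
        for (student, sy), scores in scores_by_key.items()
    ]


wide_rows = to_wide(long_rows)
for row in wide_rows:
    print(row)
```

Assessments a student has not taken come out as `None` rather than being dropped, so every row has the same columns.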
Reporting Platform Development
How to develop system with poor data quality?
With poor data quality it is hard to determine whether some inconsistent or
inaccurate number is due to a bug in your query or inconsistent data.
Reporting Platform Development
Key is to realize that the reporting platform did not need to be accurate per se; it just needed to not be more inaccurate than the existing reports.
Solution
• Prioritize – Start with recreating standard reports in reporting platform and compare with existing standard reports: CAR cycle
• Compartmentalize – Run reports using only students with no data quality issues so any errors are likely due to bugs that can be nailed down and fixed
DO NO HARM
Reporting Platform Development
1. Create Sample Report and compare to Standard Report (eg attendance weekly)
2. Check for discrepancies
   a. If discrepancy is due to mistake in reporting platform or query, fix it
   b. If discrepancy is due to bad data, store student id in exceptions table
3. Pull Sample Report again, filtering out exception students so that only “Good Data” is included in report
4. Continue until no discrepancies
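The steps above can be sketched as a comparison that ignores students already in the exceptions table. Everything here is illustrative: the report dictionaries, the student ids, and the `compare_reports` helper are hypothetical, and deciding whether a discrepancy is a bug or bad data remains a manual investigation.

```python
def compare_reports(standard, sample, exceptions):
    """Return student ids whose values differ, ignoring known bad-data students."""
    student_ids = (set(standard) | set(sample)) - exceptions
    return sorted(
        sid for sid in student_ids if standard.get(sid) != sample.get(sid)
    )


# Hypothetical absences per student from each report.
standard_report = {"S1": 3, "S2": 5, "S3": 0, "S4": 7}
sample_report = {"S1": 3, "S2": 4, "S3": 0, "S4": 9}
exceptions = set()  # the exceptions table

# Steps 1 and 2: compare and list discrepancies.
discrepancies = compare_reports(standard_report, sample_report, exceptions)
print(discrepancies)

# Assume investigation finds that S2 has bad source data (step 2b)
# and that S4 was a bug in the reporting platform query (step 2a).
exceptions.add("S2")
sample_report["S4"] = 7  # stands in for fixing the query and re-pulling

# Steps 3 and 4: pull again with exception students filtered out.
remaining = compare_reports(standard_report, sample_report, exceptions)
print(remaining)
```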
Reporting Platform Development
Need to ensure that reporting platform is not introducing new errors. How?
Use only known good data to validate:
Reporting Platform → Report query → Sample Report
Compare Sample Report with Standard Report
• No discrepancies → Report validated
• Discrepancies → Why?
  • Bad data students → Filter out students with bad data into exceptions table
  • Good data students → Fix any issues with Reporting platform
Managing requests
Capacity vs Demand
Demand for data is ever increasing, people are hungry for data
Needed to do more with the same size team
Two Tracks
• Increase reporting efficiency
• Reduce demand on reporting team
Increase Efficiency
Users make requests via online “Data Request Tool” (DRT)
• Central point of communication with requestors for clarifications
• Tracks implementation notes and report writer assignments
• Report files attached to request along with query code
• One report can be attached to multiple requests to allow for reuse
• Data snapshot of common data available on front end
• Updated daily with common metrics (absences, GPA, grade level, school, etc)
• User can customize columns/filters to download for themselves
• Example of some columns available:
Student_ID YTD_Unexcused_Absences Total SBT Suspension_Days
School_Name YTD_Excused_Absences Truant - still be truant?
ELL_Status YTD_ISA_Average_Attendance Truant_>=10_days
FARM_Status Membership_days Current_School_Average_Attendance
Student_Race Absences_Towards_Truancy Current_School_Excused_Absences
SPED_Status Suspension_Absences_Days Current_School_ISA_Average_Attendance
Increase Efficiency
“Data Request Tool” (DRT)
Increase Efficiency
Data Librarian is first point of contact for requests to reporting team
• Dedicated FTE position
• Clarifies request requirements
• Is there an already completed report that can fulfill this request?
• Acts as gatekeeper to qualify requests before they hit reporting capacity
Program needs data → Standard Report? Common metric? → Program Enters Data Request → Data Librarian clarifies request → Existing report available? → Report Writer assigned → Report Created → Report Reviewed → Report Delivered
Self Service Reporting
Goal is to provide self-service reporting to analysts while ensuring consistency
• Giving them raw access to reporting platform is too overwhelming
• Analysts are not database developers/DBAs
• Even with SQL skills, analysts would still need joins to get meaningful data
• Creating dedicated pull of custom data would mean another thing to maintain
Solution was first to create regularly disseminated standard report with
commonly requested metrics and standard demographics
Self Service Reporting
Then save weekly snapshot of each report into a dedicated “data mart”
• Simply add “report date” field to existing columns
• Analysts already used to seeing these reports so no learning curve in using data
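A minimal sketch of the snapshot idea: each saved copy of the standard report keeps its existing columns and gains only a report date. The in-memory list standing in for the data mart and both helper functions are assumptions; the column names `Student_ID` and `YTD_Unexcused_Absences` are taken from the data snapshot columns listed earlier.

```python
from datetime import date


def append_snapshot(data_mart, report_rows, report_date):
    """Append a copy of the standard report, tagged with its report date."""
    if any(row["report_date"] == report_date for row in data_mart):
        raise ValueError(f"snapshot for {report_date} already saved")
    data_mart.extend({**row, "report_date": report_date} for row in report_rows)


def trend(data_mart, student_id, column):
    """Return (report date, value) pairs for one student, oldest first."""
    return [
        (row["report_date"], row[column])
        for row in sorted(data_mart, key=lambda r: r["report_date"])
        if row["Student_ID"] == student_id
    ]


data_mart = []
append_snapshot(
    data_mart,
    [{"Student_ID": "123", "YTD_Unexcused_Absences": 2}],
    date(2015, 9, 7),
)
append_snapshot(
    data_mart,
    [{"Student_ID": "123", "YTD_Unexcused_Absences": 4}],
    date(2015, 9, 14),
)

print(trend(data_mart, "123", "YTD_Unexcused_Absences"))
```

Refusing to save the same report date twice keeps a re-run from silently duplicating a week of data.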
[Figure: Standard Report daily feeds → Quickie Data Mart]
“Data Mart” example - Standard Report
Standard Report data flows into data mart. Analysts/Power Users can create
dashboards in tools like PowerBI for staff to use or they can access it directly
[Figure: Standard Report weekly feeds → Standard Report wk 1, wk 2, wk 3, wk 4, … wk 52 → Analytics Power Users]
Report requests hitting report writers
[Chart: Data Requests per Month, Aug through Jul, for SY12-13, SY13-14, SY14-15, SY15-16; vertical axis 0 to 120]
More self-service reporting and standardized reports
• Fewer adhoc requests for standard data
• Reporting capacity can be spent on more complex requests
Summary
Takeaways
“What problem are you trying to solve?”
Effective Data
Data Reporting → Effective Data → Data Analytics → Effective Decisions → Programs / Business → Effective Outcomes → Organizational Success
Takeaways
Don’t overengineer data systems
Focus on data that supports organizational goals
Takeaways
Consistency First, then Accuracy, then Relevancy
Starting point: Metric A: Report 1: 90, Report 2: 81, Report 3: 87
Consistent: Metric A: Report 1: 87, Report 2: 87, Report 3: 87
Accurate: Metric A: Report 1: 85, Report 2: 85, Report 3: 85
Relevant: Metric aligned with goal
School Staff is our “data entry team” rather than our “users”
Users → Data Entry Team
ROI
Meet your data where it is today and build to where you want to be
Questions?
andrew@dataeffectiveness.com
@dataeffectively
dataeff.site
dataeff.blog
dataeff.me

SharePoint 2013 - Why, How and What? - Session #SPCon13
 
Mark Graban SHS 2014: Two Data Points Are Not a Trend: Using SPC to Manage Be...
Mark Graban SHS 2014: Two Data Points Are Not a Trend: Using SPC to Manage Be...Mark Graban SHS 2014: Two Data Points Are Not a Trend: Using SPC to Manage Be...
Mark Graban SHS 2014: Two Data Points Are Not a Trend: Using SPC to Manage Be...
 
What People Analytics Can’t Capture
What People Analytics Can’t CaptureWhat People Analytics Can’t Capture
What People Analytics Can’t Capture
 
Security Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical AnalysisSecurity Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical Analysis
 
Assignment week4 day3 data analytics
Assignment week4 day3 data analyticsAssignment week4 day3 data analytics
Assignment week4 day3 data analytics
 
Employee engagement case study
Employee engagement case studyEmployee engagement case study
Employee engagement case study
 
Real-world state of the BI market: Webinar presentation slides
Real-world state of the BI market: Webinar presentation slidesReal-world state of the BI market: Webinar presentation slides
Real-world state of the BI market: Webinar presentation slides
 
Analysis of “what do you do with all this big data” –ted talk by susan etlinger
Analysis of “what do you do with all this big data” –ted talk by susan etlingerAnalysis of “what do you do with all this big data” –ted talk by susan etlinger
Analysis of “what do you do with all this big data” –ted talk by susan etlinger
 
Data driven decision making
Data driven decision makingData driven decision making
Data driven decision making
 

Practical Data Strategies in the real world of poor Data Quality

  • 1. Practical Data Strategies in the Real World of Poor Data Quality. Andrew Patricio | www.dataeffectiveness.com
  • 2. Agenda: Foundation; Data Effectiveness; Data Sophistication; Data Prioritization; Consistency, Relevancy, Accuracy; Data Quality Culture; Reporting platform; Managing Requests; Summary. (Data Effectiveness, Andrew Patricio, www.dataeffectiveness.com, EDW2017)
  • 4. The Foundation. ef·fec·tive·ness /iˈfektivnəs/, noun: the degree to which something is successful in producing the intended or desired result.
  • 5. The Wrong Question. Not “What do you want?” Instead, “What problem are you trying to solve?”
  • 6. Effectiveness is about solving problems, not deliverables
      What do you want?
      • Focused on requirements
      • Mid-stream changes = not delivering what was promised
      • Encourages business to think transactionally instead of as partners in the solution
      • Overall sense is one of CYA: “We just did what you asked”
      What problem are you trying to solve?
      • Focused on end goal
      • Mid-stream changes = steering to maintain drive towards end goal
      • Forces business to think of themselves as part of the team as well as articulate the problem, thereby making sure they understand it themselves
      • Overall sense is one of partners on a journey to discover an unknown answer
  • 7. The Ends (sometimes) Justify the Means
      Having a goal of effectiveness instead of quality means a project is successful to the degree that it achieves the desired result.
      “What problem are you trying to solve?” is how to define the desired result.
      This combination gives you both a structure to make progress and the freedom to follow and steer around obstacles.
  • 8. About Me – Andrew Patricio
      President, Data Effectiveness Inc
      • www.dataeffectiveness.com
      • Data Evaluation
      • Data Strategy
      • Data Infrastructure
      Personal background
      • Chief Data Officer at DC Public Schools, Nov 2010 to June 2016
      • IT & management consulting
      • Electrical Engineering
  • 10. Data Driven Decision Making. All organizations seek to make decisions based on data.
  • 11. Data Reality. But the reality is that the data we have available is often in poor shape.
  • 12. Getting to Data Driven – Reporting vs Analytics
      Steve Levitt, Freakonomics Podcast, 26 June 2014:
      “Yeah, I think the hardest single thing is that even if you have the desire … to be data driven, that the existing systems…I never would have thought this before I started working with companies. I never would have imagined that it is an I.T. problem that you simply cannot get the data you want, and the data are held in 27 different data sets that have different identifiers … the I.T. support and the complexity in these big firms blows your mind about how hard it is to do the littlest, simple things.”
      Data analysts are NOT necessarily technologists.
  • 13. Data Driven Decision Making. High performance data analytics… requires pragmatic data reporting …in the real world of data.
  • 14. Data Sophistication
  • 15. Data Sophistication Cycle
      Results oriented incompatible with data driven?
      • In a results-oriented organization the push is to “get things done”, and the velocity of the need often makes it difficult for data systems to keep up.
      • Data quality often suffers and the data driven aspect gets starved of food.
      Solution is to design data system complexity to slightly lead process sophistication rather than being too far ahead.
  • 16. Data Sophistication Cycle
      Data capture system evolves along with process sophistication.
      Reporting sophistication should keep pace with data quality.
      Example data entry system | Key data structure
      Notepad | Open entry
      Excel | Data cells
      MS Access | Data records
      Student Information System (SIS) | Normalized data model
      Reporting system separate from SIS | Reporting data model
      (Diagram labels: Process Sophistication, Data Quality, Reporting Sophistication)
      Don’t build a formal data warehouse for Excel “data systems”!
  • 18. Capacity vs Demand
      Not all data requests are created equal.
      Need to prioritize given finite capacity, time, and budget.
      Can’t do everything perfectly but can be consciously imperfect.
      Effectiveness is defined by achieving desired results, so need to set expectations accordingly about those results.
      “What problem are you trying to solve?” but different parts of the organization have different problems.
  • 19. Data Driven Pipeline
      (Diagram: Data Reporting → Effective Data → Data Analytics → Effective Decisions → Programs / Business → Effective Outcomes → Organizational Success)
      • Product of business is Effective Outcomes
      • Product of analytics is Effective Decisions
      • Product of reporting is Effective Data
  • 20. Organizational Goals drive focus of data pipeline
      (Diagram labels: Prioritize Outcomes, Prioritize Analytics, Prioritize Data)
      • Desired organizational success prioritizes which outcomes business should focus on
      • Desired business outcomes prioritize which decisions analytics should focus on
      • Desired analytics decisions prioritize which data reporting should focus on
  • 21. Focus on relevant data
      Two considerations:
      1. Some organizational goals are foundational if not necessarily value adding, e.g. Regulatory, Human Resources, Financial health, etc.
      2. Not all interesting questions are relevant
      Result is that resources are focused on data that ultimately solves the main problem of achieving organizational goals.
  • 22. Data Quality
      Not all of your data needs to be at the same level of quality. Sole measure is whether or not it is sufficient to achieve a particular organizational goal.
      (Diagram labels: Reporting Infrastructure, Effective Data, Business Streams and various Analytics, Effective Outcomes, Overall Organizational Successes)
  • 23. What is Data Effectiveness?
      Data Effectiveness is primary responsibility of reporting.
      Being effectively data driven starts with Data Effectiveness: getting good data, when it is needed, to who needs it.
      (Diagram labels: Data Reporting, Effective Data, Data Analytics, Effective Decisions, Programs / Business, Effective Outcomes, Organizational Success)
  • 24. CAR cycle
  • 25. How does Data go wrong?
      Data entry issues
      • Fat fingering
      • Workarounds, solving problem in front of them
      • Transactional system only cares about latest enrollment action, not data changes
      • Poor understanding of process/policy
      • Duplication
      Legacy data
      • Different definitions year to year (regulatory changes, etc.)
      • Poor QA processes (definition incorrect)
      • System transitions (poor data transfer strategy from previous vendors)
  • 26. Data issues: End of year attendance example
      Date report run | SY13-14 ADA (example)
      July 2014 | 95%
      October 2014 | 92%
      Initially assumed that there was a bug in the second report.
      Reason behind the nonsensical error was that schools were changing the enrollment date from Aug 14 to Aug 15 instead of entering a new enrollment for the year.
      Registrars were just solving the immediate problem in front of them.
      Students who were present in SY14-15 data in June were missing in October.
  • 27. Data issues: School Dashboard vs Weekly reports
      Idea was to get more regularly updated data to schools.
      Inconsistencies reduced trust in data.
      Two different queries implementing the same metric; data quality meant slightly different answers:
      • School on student table used for dashboard queries
      • Didn’t always match school based on enrollment history used in reports
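As a rough illustration of the mismatch on this slide, the Python sketch below computes a per-school student count two ways. The table layouts, field names, and sample rows are hypothetical and are not taken from the actual system; the only point is that "school on the student table" and "school from enrollment history" can disagree when one of them was not kept up to date.

```python
# Hypothetical sample data: the student table holds a "current school" field,
# while the enrollment history holds one row per enrollment action.
students = [
    {"student_id": 1, "school": "A"},
    {"student_id": 2, "school": "A"},
    {"student_id": 3, "school": "B"},  # not updated after a transfer
]
enrollment_history = [
    {"student_id": 1, "school": "A", "entry_date": "2014-08-25"},
    {"student_id": 2, "school": "A", "entry_date": "2014-08-25"},
    {"student_id": 3, "school": "B", "entry_date": "2014-08-25"},
    {"student_id": 3, "school": "A", "entry_date": "2015-01-12"},
]


def count_by_student_table(rows):
    """Dashboard-style query: count students by the school on the student table."""
    counts = {}
    for row in rows:
        counts[row["school"]] = counts.get(row["school"], 0) + 1
    return counts


def count_by_latest_enrollment(rows):
    """Report-style query: count students by their most recent enrollment."""
    latest = {}
    for row in rows:
        sid = row["student_id"]
        # ISO-formatted dates compare correctly as strings.
        if sid not in latest or row["entry_date"] > latest[sid]["entry_date"]:
            latest[sid] = row
    counts = {}
    for row in latest.values():
        counts[row["school"]] = counts.get(row["school"], 0) + 1
    return counts


dashboard_counts = count_by_student_table(students)
report_counts = count_by_latest_enrollment(enrollment_history)

# Schools where the two implementations of the "same" metric disagree.
mismatched = sorted(
    school
    for school in set(dashboard_counts) | set(report_counts)
    if dashboard_counts.get(school, 0) != report_counts.get(school, 0)
)
print(dashboard_counts, report_counts, mismatched)
```

Neither query is wrong on its own terms; the two answers diverge because the underlying data is not kept in sync, which is the trust problem the slide describes.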
  • 28. Fixing Data Quality
      How do we make our data more effective given these challenges?
      • Improve Data Quality long term?
      • Make data driven decisions today?
  • 29. Consistency, Accuracy, Relevancy cycle
      Problem is how to build a train as it’s moving down the track.
      When data quality is not so good you still have to provide reports and make decisions; you cannot wait until everything is perfect because that’s a moving target.
      Good enough is good enough, but what is good enough?
      (Cycle diagram: Consistency, Accuracy, Relevancy)
  • 30. Consistency, Accuracy, Relevancy cycle
      Goal is to have accurate metrics aligned with business goal
      • Cannot talk about accuracy if there isn’t agreement on the value being reported
      • Once the value is consistent, you can talk about if it’s accurate
      • Once it’s accurate you can talk about whether it’s relevant to business goal
      Example (diagram):
      Metric A: Report 1: 90, Report 2: 81, Report 3: 87
      Consistent (DATA): Metric A: Report 1: 87, Report 2: 87, Report 3: 87
      Accurate (INFORMATION): Metric A: Report 1: 85, Report 2: 85, Report 3: 85
      Relevant (KNOWLEDGE): Metric aligned with goal
      Not Relevant: Determine proposed change and go through cycle again
  • 31. Consistency – DATA
      “What is the value measure of this metric?” Driven by reporting.
      Consistency means literally just that: a metric has the same value for the same parameters no matter who pulls it.
      Factors
      • Traceability – same metric in different reports must be traced back to same source
      • Same parameters – need to be careful because different metrics could be referred to by the same common name
      • Time factor – legitimate changes can be made after report is run
      Total absences | Truant absences | Pulled | Reason behind difference
      100 | 90 | Oct | First pull
      88 | 88 | Nov | Data corrected
      80 | 85 | Dec | Some unexcused absences corrected to suspensions
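A consistency check of the kind described here can be sketched in Python. The function names are hypothetical, and the sample values reuse the Metric A figures from slide 30; this is a minimal sketch under those assumptions, not the presenter's tooling.

```python
def is_consistent(report_values):
    """Return True when every report shows the same value for the metric."""
    return len(set(report_values.values())) <= 1


def spread(report_values):
    """Difference between the largest and smallest reported value."""
    values = list(report_values.values())
    return max(values) - min(values)


# Metric A as pulled by three different reports (values from slide 30).
before = {"Report 1": 90, "Report 2": 81, "Report 3": 87}
after = {"Report 1": 87, "Report 2": 87, "Report 3": 87}

print(is_consistent(before), spread(before))  # False 9
print(is_consistent(after), spread(after))    # True 0
```

Passing this check says nothing about accuracy: in the slide 30 example the consistent value is 87 while the accurate value is 85, which is why accuracy is a separate, later step in the cycle.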
  • 32. Accuracy – INFORMATION
      “Is the value measure shown for this metric correct?” Driven by Analytics.
      Once you have consistency, you can work on accuracy: key is to use only good data when verifying “accuracy”.
      Metric could be “inaccurate” because
      • Bug in query – fix
      • Wrong or inconsistent business rules – nail down definitions; two different sets of business rules for the same metric could be appropriate. Two different metrics? Or “correct” business rules
      • Data quality – identify source and reason, data entry team
  • 33. Relevancy – KNOWLEDGE. “Is this metric helping to meet our goal?” Driven by business. Once you have accuracy, you can determine whether the metric is useful. If not, then either the business goal or the metric needs to change. Changing the metric: use a new metric (longer to get consistency, so the cycle could be just as long or longer), or refine the business rules of the existing metric (less effort to get consistency, shorter cycle). Changing the business goal: effective data in hand is worth two in the bush; the tail could be wagging the dog, but an unmeasurable business goal is just a wish. Example: Unexcused absences: suspensions are not considered unexcused absences, so this metric doesn’t truly capture time away from instruction. In Seat Attendance (ISA): counts all absences except in-school suspension, etc.
  • 34. Cycle. As data becomes information becomes knowledge, the data sophistication of the process grows, which requires more/different metrics. Different metrics could be at different points in the cycle. Diagram: repeated Consistency, Accuracy, Relevancy cycles, shown at several scales.
  • 35. Data Quality Culture
  • 36. Why is there inconsistency in the first place? The ongoing issue is a data entry problem. Why are there data entry errors? Flexibility/freedom of entry needs to be balanced with validation checks. Most systems can validate based on patterns or entries but do not have enough flexibility to differentiate between other valid and invalid entries. Often users don’t have the access to make a needed data change, so they must enter a request for the tech team to handle; the strictness of data entry checks therefore needs to be balanced against technical team capacity.
  • 37. Short-sighted data entry. Example: enrollment overlaps. The Student Information System is transactional and only tracks current state. For enrollment, it doesn’t care about data values in the enrollment history; it only cares about the latest enrollment action (admit or withdrawal) and school. The “enrollment history” in the system is merely a log of events, so users can adjust it at will with no effect on current status.
  • 38. Preventing data entry errors. Business line workers are our “data entry team” rather than our “users”. Successful data reporting is intimately tied to their effectiveness: a perfect system which users are not comfortable with will still have bad data quality. Taking this point of view automatically fosters more collaboration. Connect the dots for end users by tracing the pathway from a specific data entry error to a specific issue on a data report. The Data Integrity Management system displays errors to the “data entry team”, includes steps on how each issue can be fixed, and includes a direct link to the relevant record in the transactional system to minimize context switching.
  • 39. Data Integrity Management system. A central system to flag data errors to users for them to correct. Ideally, errors are reported back to the users who entered them. Provides specific resolution steps.
  • 40. Data Integrity Management System: Fixing Data. Error correction cycle: feed errors back to users for them to correct; the technical team looks for other common data entry errors to either prevent through front-end validation or add to error checking. Diagram labels: Transactional Systems, Error Identification, Error Dashboard, Users (i.e. “data entry team”), Fix Data Errors, Technical team, Improve Front End Validations, Update Error Patterns.
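One error pattern such a system could check for is the enrollment overlap mentioned earlier. The sketch below is only an illustration under stated assumptions: it presumes admit and withdraw events have already been paired into (admit date, withdraw date, school) periods, the function name is hypothetical, and the second period's admit date is invented to create an overlap.

```python
from datetime import date

def find_overlaps(periods):
    """Return pairs of enrollment periods for one student that overlap in time.

    periods: list of (admit_date, withdraw_date, school) tuples.
    A withdrawal on the same day as the next admit is not counted as an overlap.
    """
    ordered = sorted(periods)  # sorts by admit date first
    overlaps = []
    for i, period_a in enumerate(ordered):
        withdraw_a = period_a[1]
        for period_b in ordered[i + 1:]:
            admit_b = period_b[0]
            if admit_b >= withdraw_a:
                # Sorted by admit date, so no later period can overlap either.
                break
            overlaps.append((period_a, period_b))
    return overlaps

# Hypothetical history: the second period starts before the first one ends.
history = [
    (date(2011, 8, 24), date(2012, 6, 24), "123"),
    (date(2012, 6, 1), date(2012, 10, 10), "456"),
    (date(2012, 10, 11), date(3030, 1, 1), "789"),
]
overlaps = find_overlaps(history)
```

Each flagged pair could then be shown on the error dashboard with resolution steps and a link to the record, as the slides describe.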
  • 41. Data Integrity Management System
  • 42. Reporting Platform
  • 43. Single system for operations and reporting. Many organizations create reports from queries directly off transactional systems. This makes querying a bear due to the complex data model of the transactional system. All reports require technical team capacity, even simple ones. Highly normalized = simple knowledge is stored in a complex way. Optimized for inserts, not reporting. Business definitions often exist only in query code. Example: find Residency Verification: select decode (afv.value,null,'N',438,'N','Y') "Residency Verification SY13-14" from students p, adhoc_fields_values afv, adhoc_fields_drop_downs afdd where p.pupil_number = afv.pupil_number(+) and afv.adhoc_fields_def_ID(+) = 109 and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+) and afv.value = AFDD.FIELD_KEY_VALUE(+)
  • 44. Reporting platform - Speed. The data model is focused on reporting, not on transactions. The space vs speed tradeoff is highly biased towards speed: virtually unlimited disk space; batch processing, not real time. Complete flexibility to organize data optimally for ease of reporting. Central store for all siloed data (data-warehouse lite). Example tables: Student Demographics, Admit_withdraw, Attendance Base, Assessment, Courses_Taken.
  • 45. Reporting platform – ease of use. Really nothing more than a dedicated reporting database, not a data warehouse. The data model can be tailored for reporting: it keeps track of all changes, not just the latest data (valid from, valid to); it is super flat and highly denormalized; redundancy is okay so long as we have data traceability, so there can be multiple copies/formats/structures of the same base data for different users/uses; fewer joins mean technical capacity can shift to more complex business rules; and it can be exposed more directly to data analysts for increased self-service. Before: select decode (afv.value,null,'N',438,'N','Y') "Residency Verification" from students p, adhoc_fields_values afv, adhoc_fields_drop_downs afdd where p.pupil_number = afv.pupil_number(+) and afv.adhoc_fields_def_ID(+) = 109 and AFV.ADHOC_FIELDS_DEF_ID = AFDD.ADHOC_FIELDS_DEF_ID(+) and afv.value = AFDD.FIELD_KEY_VALUE(+). After: select [Residency Verification] from student_demographics_snapshot
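Keeping "valid from" and "valid to" on every row makes it possible to ask what a value was on any past date. The lookup below is a minimal sketch of that idea; the half-open window convention, the function name, and the sample school history are assumptions for illustration, and the far-future end date borrows the convention the deck uses for current enrollments.

```python
from datetime import date

def value_as_of(history, as_of):
    """Return the value whose [valid_from, valid_to) window contains as_of, else None."""
    for valid_from, valid_to, value in history:
        if valid_from <= as_of < valid_to:
            return value
    return None

# Hypothetical history of one student's school, kept as (valid_from, valid_to, value).
school_history = [
    (date(2014, 8, 25), date(2015, 1, 12), "School 123"),
    (date(2015, 1, 12), date(3030, 1, 1), "School 456"),  # open-ended current row
]

school_in_october = value_as_of(school_history, date(2014, 10, 1))
school_later = value_as_of(school_history, date(2016, 3, 1))
```

Treating valid_to as exclusive means the day a value changes belongs to the new row, so adjacent rows never both match the same date.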
  • 46. Reporting platform - Consistency. Common processing: common query code is centralized; ETL is batch, so multiple passes can be made to pre-calculate higher-order metrics. Consistent business rules: old and new metrics can be back-calculated as well (old vs new truancy rules); each metric is calculated in one place, so one number, right or wrong, is reported. Data traceability: the data path from systems of record to reports is fully documented. Illustration: “Herding Kittens” vs “One Big Powerful Cat”.
  • 47. Reporting Platform Example Architecture: SSIS, SQL Server, Perl on virtual machine servers. Diagram components: source systems (Primary ERP, Accounting data system, HR data system, CRM, miscellaneous systems, assessment data dumps, external imports, miscellaneous data files); ETL (SQL Server Integration Services, Perl, manual loads); Reporting Database (MS SQL Server); Data Mart (MS SQL Server); Direct SQL (SQL Server Management Studio).
  • 48. Reporting Platform – Business Rules Centralized. Based on the weekly attendance report. Updated daily. Calculates individual student attendance metrics. Metric details: Truancy: calculated under both the old rules and the new rules so trends can be compared. Absence Counts: period and daily; unexcused, excused, In Seat Attendance, suspension.
  • 49. Reporting Platform – common processing tasks. Enrollment admit/withdraw matching: the SIS stores enrollment as separate admit and withdraw events, so admits need to be matched to withdrawals for the same enrollment period and school. Source events (Date; Type; School): 24 August 2011; Admit; 123. 24 June 2012; Withdrawal; 123. 24 June 2012; Admit; 456. 10 October 2012; Withdrawal; 456. 11 October 2012; Admit; 789. Matched periods (Admit Date; Withdraw Date; School): 24 August 2011; 24 June 2012; 123. 24 June 2012; 10 October 2012; 456. 11 October 2012; 1 January 3030; 789. A current enrollment is given a “withdrawal date” in the far future so that there is an actual date and not a null to compare against: currently enrolled is today() < [withdraw date].
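The admit/withdraw matching on this slide can be sketched as follows. This is one possible approach, not the deck's actual ETL code: the function names and the rule of processing same-day withdrawals before admits are assumptions, while the events and the 1 January 3030 sentinel come from the slide.

```python
from datetime import date

FAR_FUTURE = date(3030, 1, 1)  # sentinel "withdraw date" for open enrollments, as on the slide

def match_enrollments(events):
    """Pair each admit with the next withdrawal at the same school.

    events: list of (event_date, event_type, school), event_type "Admit" or "Withdrawal".
    Returns a sorted list of (admit_date, withdraw_date, school) rows.
    """
    # Sort by date; on the same day, handle withdrawals before admits so a
    # same-day transfer closes the old school before opening the new one.
    ordered = sorted(events, key=lambda e: (e[0], e[1] != "Withdrawal"))
    open_admits = {}  # school -> admit date still waiting for a withdrawal
    periods = []
    for event_date, event_type, school in ordered:
        if event_type == "Admit":
            open_admits[school] = event_date
        elif school in open_admits:
            periods.append((open_admits.pop(school), event_date, school))
        # A withdrawal with no open admit is skipped in this sketch; a real
        # process would probably flag it as a data error.
    # Any admit left unmatched is a current enrollment.
    periods.extend((admit, FAR_FUTURE, school) for school, admit in open_admits.items())
    return sorted(periods)

def is_currently_enrolled(period, today):
    """Mirror the slide's rule: currently enrolled is today < withdraw date."""
    return today < period[1]

events = [
    (date(2011, 8, 24), "Admit", "123"),
    (date(2012, 6, 24), "Withdrawal", "123"),
    (date(2012, 6, 24), "Admit", "456"),
    (date(2012, 10, 10), "Withdrawal", "456"),
    (date(2012, 10, 11), "Admit", "789"),
]
periods = match_enrollments(events)
```

Using a far-future date instead of a null keeps the "currently enrolled" test a plain date comparison, which is the design choice the slide calls out.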
  • 50. Reporting Platform – Optimized for Reporting. There are generally two ways we need to analyze assessments. (1) A single view of all assessments for a student, with data in columns: each row is a single student for a particular school year. (2) Comparing one run of an assessment with another, with data in rows: each row is a single assessment for a single student for a particular school year. Data in rows (Student; Assessment; SY; Score): 123; A1 Q1; SY1415; 90. 123; A1 Q2; SY1415; 80. 123; A1 Q3; SY1415; 70. 123; A1 Q4; SY1415; 100. 456; A2 Sem 1; SY1415; 65. Data in columns (Student; A1 Q1, A1 Q2, A1 Q3, A1 Q4, A2 Sem 1, A2 Sem 2; SY): 123; 90, 80, 70, 100, 76, 87; SY1415. 456; 60, 70, 80, 90, 65, 86; SY1415. Both are traceable back to the same original data load, so the potential for different answers is minimized.
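Deriving the one-row-per-student view from the one-row-per-assessment view is a simple pivot, which is why both can stay traceable to the same load. The sketch below is illustrative only: the function name is hypothetical, and the rows use student 123's scores from the slide.

```python
def pivot_scores(long_rows):
    """Turn one-row-per-assessment data into one entry per (student, school year).

    long_rows: iterable of (student, assessment, school_year, score) tuples.
    Returns {(student, school_year): {assessment: score}}.
    """
    wide = {}
    for student, assessment, school_year, score in long_rows:
        wide.setdefault((student, school_year), {})[assessment] = score
    return wide

# Student 123's scores as shown on the slide.
long_rows = [
    (123, "A1 Q1", "SY1415", 90),
    (123, "A1 Q2", "SY1415", 80),
    (123, "A1 Q3", "SY1415", 70),
    (123, "A1 Q4", "SY1415", 100),
    (123, "A2 Sem 1", "SY1415", 76),
    (123, "A2 Sem 2", "SY1415", 87),
]
wide = pivot_scores(long_rows)
```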
  • 51. Reporting Platform Development. How do you develop a system with poor data quality? With poor data quality, it is hard to determine whether an inconsistent or inaccurate number is due to a bug in your query or to inconsistent data.
  • 52. Reporting Platform Development. DO NO HARM: the key is to realize that the reporting platform did not need to be accurate per se, it just needed to not be more inaccurate. Solution: (1) Prioritize: start by recreating standard reports in the reporting platform and comparing them with the existing standard reports (the Consistency, Accuracy, Relevancy cycle). (2) Compartmentalize: run reports using only students with no data quality issues, so any errors are likely due to bugs that can be nailed down and fixed.
  • 53. Reporting Platform Development. (1) Create a Sample Report and compare it to the Standard Report (e.g. weekly attendance). (2) Check for discrepancies: (a) if a discrepancy is due to a mistake in the reporting platform or query, fix it; (b) if a discrepancy is due to bad data, store the student id in the exceptions table. (3) Pull the Sample Report again, filtering out exception students so that only “Good Data” is included in the report. (4) Continue until there are no discrepancies.
  • 54. Reporting Platform Development. Why? Need to ensure that the reporting platform is not introducing new errors. How? Use only known good data to validate. Diagram: the report query runs against the Reporting Platform to produce a Sample Report, which is compared with the Standard Report. If there are discrepancies, fix any issues with the reporting platform and filter students with bad data into the exceptions table, so that only good-data students are compared. When there are no discrepancies, the report is validated.
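The validation loop from the last two slides can be sketched as a comparison restricted to students not in the exceptions table. Everything here is hypothetical illustration: the function name, the student ids, and the values are invented, and deciding whether a discrepancy is a query bug or bad data remains a manual investigation that the code only simulates.

```python
def find_discrepancies(sample, standard, exceptions):
    """Compare two reports using only students with known-good data.

    sample, standard: dicts of student_id -> metric value.
    exceptions: set of student ids already recorded as having bad data.
    Returns the sorted student ids whose values still disagree.
    """
    good_students = (set(sample) | set(standard)) - exceptions
    return sorted(s for s in good_students if sample.get(s) != standard.get(s))

# Hypothetical weekly attendance counts per student.
standard_report = {101: 5, 102: 3, 103: 7, 104: 0}
sample_report = {101: 5, 102: 4, 103: 7, 104: 1}
exceptions = set()

first_pass = find_discrepancies(sample_report, standard_report, exceptions)

# Suppose investigation shows student 102 has bad source data, while
# student 104 exposed a query bug that has now been fixed.
exceptions.add(102)
sample_report[104] = 0

second_pass = find_discrepancies(sample_report, standard_report, exceptions)
```

An empty second pass corresponds to the "report validated" outcome: on good data, the platform reproduces the standard report, so it is at least not adding new errors.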
  • 55. Managing requests
  • 56. Capacity vs Demand. Demand for data is ever increasing; people are hungry for data. We needed to do more with the same size team. Two tracks: increase reporting efficiency; reduce demand on the reporting team.
  • 57. Increase Efficiency. Users make requests via an online “Data Request Tool” (DRT). It is the central point of communication with requestors for clarifications. It tracks implementation notes and report writer assignments. Report files are attached to the request along with the query code. One report can be attached to multiple requests to allow for reuse. A data snapshot of common data is available on the front end: it is updated daily with common metrics (absences, GPA, grade level, school, etc), and users can customize columns/filters to download for themselves. Examples of columns available: Student_ID; School_Name; ELL_Status; FARM_Status; Student_Race; SPED_Status; YTD_Unexcused_Absences; YTD_Excused_Absences; YTD_ISA_Average_Attendance; Membership_days; Absences_Towards_Truancy; Suspension_Absences_Days; Total SBT Suspension_Days; Truant - still be truant?; Truant_>=10_days; Current_School_Average_Attendance; Current_School_Excused_Absences; Current_School_ISA_Average_Attendance.
  • 58. Increase Efficiency: “Data Request Tool” (DRT)
  • 59. Increase Efficiency. The Data Librarian is the first point of contact for requests to the reporting team. It is a dedicated FTE position. The librarian clarifies request requirements, checks whether an already completed report can fulfill the request, and acts as gatekeeper to qualify requests before they hit reporting capacity. Flowchart nodes: Program needs data; Standard Report?; Common metric?; Program Enters Data Request; Data Librarian clarifies request; Existing report available?; Report Writer assigned; Report Created; Report Reviewed; Report Delivered.
  • 60. Self Service Reporting. The goal is to provide self-service reporting to analysts while ensuring consistency. Giving them raw access to the reporting platform is too overwhelming: analysts are not database developers/DBAs, and even with SQL skills they would still need joins to get meaningful data. Creating a dedicated pull of custom data would mean another thing to maintain. The solution was first to create a regularly disseminated standard report with commonly requested metrics and standard demographics.
  • 61. Self Service Reporting. Then save a weekly snapshot of each report into a dedicated “data mart”: simply add a “report date” field to the existing columns. Analysts are already used to seeing these reports, so there is no learning curve in using the data.
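The snapshot idea amounts to appending each week's report rows with one extra "report date" column. The sketch below illustrates it with in-memory rows; the function names, dates, and values are hypothetical, and only the column names Student_ID and YTD_Unexcused_Absences are taken from the deck's column list.

```python
from datetime import date

def append_snapshot(data_mart, report_rows, report_date):
    """Append a copy of each report row to the data mart, stamped with report_date."""
    for row in report_rows:
        data_mart.append({**row, "report_date": report_date})
    return data_mart

def trend(data_mart, student_id, column):
    """Return (report_date, value) pairs for one student, oldest first."""
    points = [
        (row["report_date"], row[column])
        for row in data_mart
        if row["Student_ID"] == student_id
    ]
    return sorted(points)

data_mart = []
week_1 = [{"Student_ID": 123, "YTD_Unexcused_Absences": 2},
          {"Student_ID": 456, "YTD_Unexcused_Absences": 0}]
week_2 = [{"Student_ID": 123, "YTD_Unexcused_Absences": 4},
          {"Student_ID": 456, "YTD_Unexcused_Absences": 1}]
append_snapshot(data_mart, week_1, date(2016, 9, 2))
append_snapshot(data_mart, week_2, date(2016, 9, 9))

student_123_trend = trend(data_mart, 123, "YTD_Unexcused_Absences")
```

Because the stored columns are the same ones analysts already see in the standard report, week-over-week questions become a filter on report_date rather than a new data model.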
  • 62. Quickie Data Mart. “Data Mart” example: Standard Report. Standard Report data flows into the data mart. Analysts/power users can create dashboards in tools like PowerBI for staff to use, or they can access it directly. Diagram labels: Standard Report Daily Feeds; Standard Report Weekly Feeds; Standard Report wk 1, wk 2, wk 3, wk 4, … wk 52; Analytics; Power Users.
  • 63. Report requests hitting report writers. Chart: Data Requests per Month, Aug through Jul, for SY12-13, SY13-14, SY14-15 and SY15-16 (axis from 0 to 120). More self-service reporting and standardized reports mean fewer ad hoc requests for standard data, so reporting capacity can be spent on more complex requests.
  • 65. Takeaways. “What problem are you trying to solve?” Diagram labels: Data Reporting; Data Analytics; Effective Data; Programs / Business; Effective Decisions; Effective Outcomes; Organizational Success.
  • 66. Takeaways. Don’t overengineer data systems. Focus on data that supports organizational goals.
  • 67. Takeaways. Consistency first, then Accuracy, then Relevancy: Metric A goes from inconsistent (Report 1: 90, Report 2: 81, Report 3: 87) to consistent (Report 1: 87, Report 2: 87, Report 3: 87) to accurate (Report 1: 85, Report 2: 85, Report 3: 85) to relevant (metric aligned with goal). School staff are our “data entry team” rather than our “users”.
  • 68. ROI. Meet your data where it is today and build to where you want to be.