What is that uniquely horned creature over there? Unicorn? Rhino? Our Data? As the world of “Data” keeps growing, the technologies and tools around it continue to advance and improve. Inevitably, the struggles and problems around it will grow as well. Thankfully, these are usually not completely new; in fact, they tend to be very similar to scenarios we’ve seen and solved in the past. In this session we’ll explore the adoption of agile, lean & craft processes/practices within our data realms to seek solutions to these new and growing concerns.
7. Unicorn or Rhino
@daniel_davis
What’s the Data Realm?
• Data Science
• Data Engineering
• Machine Learning
• Artificial Intelligence
• Analytics
• DataOps
• Data Modeling
• Data Scripts
• Data Configurations
• Algorithms
• Big Data
• Data Lakes
• Data Warehouses
• Etc…
Basically all things data…
For convenience, today we’ll just call it all “data work”.
So how is it different?
Well, IMO it’s actually not very different at all.
Sure the tools might be new or unique, and perhaps they don’t have
all the functionality we might like or expect from them (yet).
However, beyond that there are more parallels between data work
and “traditional computer science” than a lot of data folks like to admit.
Sorry, but your data is Not That Special
It’s trying to be a unicorn,
when it’s really just a Rhino.
Most data does not require
such special treatment.
Or perhaps better stated: most data is not exempt
from the better practices we have found helpful in other spaces.
Getting Better Through Better Practices
Let’s discuss a collection of better practices that may
be helpful…
• Source Control
• Build & Deploy Pipelines
• Modular & Test Protected
• Reference Architecture
• Organized Learning
• Team Focused
Source Control
Logic == Code
When you put logic into things (Schema, Scripts, Configurations,
Tools, Etc.) it is code and therefore (in my opinion) should be
treated as such and managed within source control.
Build & Deploy Pipelines
Automation == Less Effort
Less time, as it’s almost always faster.
Fewer accidental issues due to human error.
Lower mean time to resolution.
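As a rough illustration of the automation point, here is a minimal Python sketch of a pipeline runner that executes the same steps in the same order every time and stops at the first failure. The `run_pipeline` helper and the step names (`check_schema`, `run_tests`, `deploy`) are hypothetical and not taken from the talk or from any specific tool; a real setup would more likely rely on a CI/CD service.

```python
def run_pipeline(steps):
    """Run (name, callable) steps in order and stop at the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Report which step broke so resolution can start right away.
            return {"ok": False, "completed": completed,
                    "failed": name, "error": str(exc)}
        completed.append(name)
    return {"ok": True, "completed": completed, "failed": None, "error": None}


# Hypothetical steps; real ones would call your own tooling.
def check_schema():
    expected_columns = {"id", "amount"}
    actual_columns = {"id", "amount"}  # assumed stand-in for a real lookup
    if expected_columns != actual_columns:
        raise ValueError("schema mismatch")


def run_tests():
    pass  # placeholder for an automated test run


def deploy():
    pass  # placeholder for a deployment command


result = run_pipeline([
    ("check_schema", check_schema),
    ("run_tests", run_tests),
    ("deploy", deploy),
])
print(result["ok"], result["completed"])
```

Because a failing step halts the run and is named in the result, nothing after it (such as `deploy`) executes by accident, which is where the reduction in human error and in time to resolution would come from.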
Modular & Test Protected
Small Logic with Small Tests == Easy Fixes & Changes
Data work, such as algorithms, is generally made up of complicated
mathematical functions.
The smaller each piece of logic the better, with as much test protection
as possible (preferably automated).
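To make "small logic with small tests" concrete, here is a minimal Python sketch. The functions (`mean`, `std_dev`, `z_scores`) and the choice of z-score scaling are illustrative assumptions, not something named in the talk; the point is only that each piece is small enough to be covered by an equally small automated test.

```python
import math
import unittest


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


def std_dev(values):
    """Population standard deviation, built on mean()."""
    mu = mean(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))


def z_scores(values):
    """Rescale values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = std_dev(values)
    if sigma == 0:
        raise ValueError("z_scores() is undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


class ZScoreTests(unittest.TestCase):
    def test_mean(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_std_dev(self):
        self.assertAlmostEqual(std_dev([2, 4, 4, 4, 5, 5, 7, 9]), 2.0)

    def test_z_scores_are_centered(self):
        self.assertAlmostEqual(sum(z_scores([1, 2, 3, 4])), 0.0)

    def test_constant_input_is_rejected(self):
        with self.assertRaises(ValueError):
            z_scores([5, 5, 5])
```

Saved as a module, the tests can be run with `python -m unittest`. If `z_scores` later misbehaves, the failing test points at the one small function that needs fixing.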
Reference Architecture
Data Models <> Class Diagrams
Documentation that is easily understood and up to date is crucial to
maintaining a healthy product and a healthy team.
And by documentation, we are not talking about Word docs or
SharePoint sites, but rather the entire knowledge base of the product,
system or service: code, tests, backlogs, diagrams, logs, as well as
the “normal” docs.
Organized Learning
Discovery == Learning
Exploratory data analysis tasks are largely “spike” type work. We
should treat them as such, with specific learning objectives
defined for each one.
If we haven’t learned what we needed within the time allowed, then
that’s fine; it just means we need a new spike.
Team Focused
Team == Support System
Nobody really likes to be alone on an island. (Well, at least not forever.)
The power of a team is unparalleled in our technical world these days.
Considerations
• Are the issues or pain points you are facing really
something new that requires a new solution?
• By treating “data work” as something separate, what
additional handoffs, dependencies, or silos are we creating?
• How far removed from the “customer’s value” is the data
work and the folks who are doing it?
Re-Seeding The Question
Based on your original thoughts about the biggest
struggle for the Data Realm…
Where do you think you and/or your organization
can make some changes or improvements?
A challenge for you…
When you get back to work, what’s one thing you will
do to make those changes or improvements real?