1. Technology and Operations Management ISTM 619
Pepperdine Graziadio School of Business
Matt Turner, Strategic Advisor mwturner@gmail.com
October 16, 2023
Data In Action:
Business Impact of Databases
2. Agenda
• Importance and value of data
• The key: putting data into action
• Database fundamentals
• Data strategy
• Types of databases
• Data and AI use cases
• Conclusion
*Each person creates 33.67 GB of data a DAY (as of 2022)
3. It all starts
and ends
with the data
problem!
Roland Busch
Deputy CEO, CTO
Siemens AG 2019
2018
5. It’s that easy, right? Just
back the data pump
over your data and you
are all set, right?
6. Bad data is like
manure … it gets
everywhere!
Susan Lauda
Director, Global Advanced Technology
AGCO Corp 2019
Lance Stafford
Enterprise Architect
Chevron
Bring consistency
to the “chaos that
exists in the data
silos”
7. Bad data is like
manure … it gets
everywhere!
Susan Lauda
Director, Global Advanced Technology
AGCO Corp 2019
Lance Stafford
Enterprise Architect
Chevron
We’re working on
opinion, not data
8. The Answer
• Learn about data!
• What data and databases are
• How it’s organized and put into
action
• Fundamental to making good
business and technology decisions
10. database noun
a collection of information that is organized so that it
can be easily accessed, managed and updated
11. Data in Action
[Diagram: kinds of data in action]
Transactional, Analytics, Video, Audio, Documents, Messages, Reference Data, Context*
*everything is data
100s / 1000s 10s / 100s
12. Relational Approach
• Relational Database Management Systems (RDBMS)
• Define everything in rows and columns
• Lists and categories provide context
• Proven, powerful, simple model for most types of data
Title   ProductionDate   Category   AssetType   Length
Film1   3/1/14           Feature    HD Master   2:40
Show1   6/4/13           Series     HD720       0:40
Film2   6/4/05           Feature    Archive     1:55
Category: Feature, Series
Action, Drama, Comedy, Documentary, …
Cable, Broadcast, Drama, Comedy, …
Action, Drama, Family, Documentary, …
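As a rough sketch of the relational approach, the sample table on this slide could be expressed with Python's built-in sqlite3 module. The table names (asset, category), the snake_case column names, and the helper titles_in_category are assumptions made for illustration; only the three sample rows and the Feature/Series category list come from the slide.

```python
import sqlite3

# In-memory database; names below are illustrative, not from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (name TEXT PRIMARY KEY);
CREATE TABLE asset (
    title           TEXT PRIMARY KEY,
    production_date TEXT,   -- kept as the slide's m/d/yy text
    category        TEXT REFERENCES category(name),
    asset_type      TEXT,
    length          TEXT    -- h:mm, as shown on the slide
);
""")

# The category list provides the context for each row.
conn.executemany("INSERT INTO category VALUES (?)", [("Feature",), ("Series",)])
conn.executemany(
    "INSERT INTO asset VALUES (?, ?, ?, ?, ?)",
    [
        ("Film1", "3/1/14", "Feature", "HD Master", "2:40"),
        ("Show1", "6/4/13", "Series", "HD720", "0:40"),
        ("Film2", "6/4/05", "Feature", "Archive", "1:55"),
    ],
)

def titles_in_category(category):
    """Return the titles filed under one category, sorted by title."""
    rows = conn.execute(
        "SELECT title FROM asset WHERE category = ? ORDER BY title", (category,)
    )
    return [title for (title,) in rows]

print(titles_in_category("Feature"))  # ['Film1', 'Film2']
```

Everything has to be declared as rows and columns up front, which is the strength of the model and also the constraint the next slides discuss.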
14. Enter NoSQL
• Flexible schema
• Handle ALL data
• Real world has documents, lists,
relationships … not all fits into SQL
• Pioneered distributed, massive
scale out
• Flexibility to tune for speed or data
consistency
• Example:
• Digital Twin
15. NoSQL Examples
digital twin: digital versions of products
https://www.mongodb.com/blog/post/recap-product-announcements-mongodb-local-london-2023
16. NoSQL Example: Digital Twin
digital twin = digital version of product including all parts and configurations
Both!
Customers
Policies, claims,
demographics,
products, models …
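To make the digital twin idea concrete, here is a minimal sketch of one product stored as a single nested document, the way a document database would hold it. Every field name and value (product_id, ExampleWash 7, the parts list) is hypothetical and chosen only for illustration.

```python
import json

# Hypothetical digital twin: one product with its configuration and nested parts.
washer_twin = {
    "product_id": "W-1000",
    "model": "ExampleWash 7",
    "configuration": {"voltage": 120, "color": "white"},
    "parts": [
        {"part_id": "P-1", "name": "drum", "parts": [
            {"part_id": "P-2", "name": "bearing", "parts": []},
        ]},
        {"part_id": "P-3", "name": "pump", "parts": []},
    ],
}

def part_names(node):
    """Return the names of all parts nested under a node, depth first."""
    names = []
    for part in node.get("parts", []):
        names.append(part["name"])
        names.extend(part_names(part))
    return names

print(part_names(washer_twin))        # ['drum', 'bearing', 'pump']
print(json.dumps(washer_twin)[:40])   # the whole twin serializes as one document
```

The nesting depth can differ from product to product without any schema change, which is the flexibility the NoSQL slides point to.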
17. NoSQL for ALL your data
Type              Notes                                                                      Examples
Key-value         No schema, simple pairs, massive scale                                     Redis, DynamoDB
Columnar          Flexible schema (table-like), massive scale                                HBase, Cassandra, ScyllaDB
Document          Schema flexibility, stores ‘as-is’ rich content                            MongoDB, Couchbase
Graph             Core data and relationships                                                Neo4j, AllegroGraph
Vector            Distance! Data behind Large Language Models (LLMs) and ML (Machine Learning)   Pinecone, Weaviate
Bonus: Hadoop     Not a database! Files for processing                                       Spark
Cloud + Hybrids   Capabilities are merging (document + graph + vector) in cloud systems      Snowflake, Databricks, AWS
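The "Distance!" note for vector databases can be illustrated with a small nearest-neighbor search. This is a sketch in plain Python, not how Pinecone or Weaviate are implemented: the three-dimensional vectors and document names are made up, and real systems use much larger embeddings and approximate indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; the values are invented for illustration.
documents = {
    "washer manual": [0.9, 0.1, 0.0],
    "dryer manual": [0.7, 0.3, 0.1],
    "insurance policy": [0.0, 0.2, 0.9],
}

def nearest(query, k=1):
    """Return the k document names whose vectors are most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )
    return ranked[:k]

print(nearest([1.0, 0.0, 0.0], k=2))  # ['washer manual', 'dryer manual']
```

Cosine similarity is only one possible distance measure; it is used here because it ignores vector length and compares direction alone.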
20. Role of data in AI
• Train Models
• Initial models (ML), training and tuning (LLM)
• Trusted, tested, up-to-date data, NOT garbage in, garbage out
• Get trusted AI responses
• Data sets the context with prompt generation
• Tap enterprise data to get the right answers from your LLM
• Create new data and queries
• Use the answers + interactions to create better data!
• AI interactions for insight
• What are the most common problems that our customers are
seeing?
• What are the protein interactions that might lead to new
combinations?
21. Data and AI Use Cases
Customer Inquiry: “I need help with my washer”
• Look up customer and ALL product info
• Create a prompt with customer and product context
• Interact with customer with that context: “Here is what I can do to help you …”
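The flow on this slide (look up the customer, build a prompt with that context, then interact) can be sketched as follows. The in-memory CUSTOMERS and PRODUCTS dictionaries, their fields, and the function names are hypothetical stand-ins for enterprise data; the sketch stops at the prompt and does not call any particular LLM.

```python
# Hypothetical enterprise data; a real system would query its databases here.
CUSTOMERS = {"c-42": {"name": "Pat", "products": ["w-7"]}}
PRODUCTS = {
    "w-7": {"type": "washer", "model": "ExampleWash 7", "warranty_until": "2025-06-30"},
}

def lookup_context(customer_id):
    """Fetch the customer record and every product that customer owns."""
    customer = CUSTOMERS[customer_id]
    products = [PRODUCTS[pid] for pid in customer["products"]]
    return customer, products

def build_prompt(customer_id, inquiry):
    """Combine the inquiry with customer and product context into one prompt."""
    customer, products = lookup_context(customer_id)
    product_lines = "\n".join(
        f"- {p['type']}, model {p['model']}, warranty until {p['warranty_until']}"
        for p in products
    )
    return (
        f"Customer: {customer['name']}\n"
        f"Products owned:\n{product_lines}\n"
        f"Inquiry: {inquiry}\n"
        "Answer using only the context above."
    )

prompt = build_prompt("c-42", "I need help with my washer")
print(prompt)
```

The resulting text would be sent to whichever model the organization uses, so the answer is grounded in trusted data rather than the model's general knowledge.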
24. Key Takeaways
• Data is still THE most valuable asset
• Your competitive advantage
• Trusted AI needs Trusted data
• Warning: garbage in, garbage out
• Understanding data is key to making
good decisions about business and
technology
26. Resources
• Rich Data, Poor Data, Shelly Palmer: https://www.shellypalmer.com/2016/05/rich-data-poor-data-data-rich-data-poor-data-middle-class-not/
• Lance Stafford, Chevron project talk: https://www.marklogic.com/resources/chevron-harmonizing-facility-and-equipment-data-on-the-marklogic-data-hub-platform/
• TechTarget definition: https://searchsqlserver.techtarget.com/definition/database
• NoSQL Design principles from ScyllaDB: https://www.scylladb.com/glossary/nosql-design-principles/
• Bosch Use Case MongoDB: https://www.mongodb.com/blog/post/recap-product-announcements-mongodb-local-london-2023
• Generative AI use cases in MetaAcademy GenAI course: https://courses.shellypalmer.com/metacademy-generative-ai
• How much data is created each day: https://techjury.net/blog/how-much-data-is-created-every-day/
Editor's Notes
But there were people having the conversation in their industry.
We do a lot of work with publishers, and one of the primary voices for change has been Dr. Sven Fund, then the CEO of De Gruyter, a publisher in Germany.
He wrote what I think of as a battle plan for the modern publisher, called integrating publishing.
It’s a data-driven approach to rethinking every part of the business around data. From planning what content to invest in, to creating and tailoring it, to knowing how it impacts your customers, data can and should play a role … and Sven laid out the plan to get publishers to that point. This was quite a change for an industry still just thinking about content.
Shelly Palmer is another voice that was early with a message about data. He worked mostly within media, but his message was to every organization, highlighting how the game has changed.
He says “Data Rich or Data Poor”: that is the ONLY game. Every company is now competing on the battleground of data. It’s not your revenue, your number of customers or their engagement. It’s the data you gather that actually matters. What’s more, you aren’t competing against what you think of as your competitors. It’s Google, Apple, Facebook … and way above all of them, Amazon.
Shelly says this to bring people’s attention to the importance of data.
And he’s not alone: he is joined by my colleague Michel de Ru. Michel works across a number of industries, and at the MarkLogic 360 event last year he issued a call to arms:
Industrialize your data!
You invest in your processes, your machinery, your people and take care of your capital. And you need to do the same thing with your data.
Think about how you manage it and, just like your machinery and other assets, industrialize how you deal with it.
And they aren’t alone.
Who has heard the phrase “Data is the new oil”?
It’s everywhere … there is even someone saying it’s not the new oil, it’s the new nuclear. I guess because it keeps delivering value forever?
In fact, there is so much about this that if you search for “Data is the new oil infographic” you get 13 million hits!
This is my favorite: see the data in the ground, just pump it out and, presto, you get your value!
Right? It’s that easy, right?
When you take this to the world of data, and in particular the data layer that can run your business,
this is what you get: traditional data structures that just fall short.
You have to define everything up front, all your data and everything your organization does …
And then categorize it. In no way will this work: you will end up stripping off context, sometimes in layers. You can’t share this data across your organization, and so you get what Alan was talking about in terms of the multiple layers of applications and data.
One of our customers talks about the result of all this changing of data as operating on opinion, not data!