Supercharging AI with Data Enrichment

A Fireside Chat moderated by
David Loshin
Affiliate Research Director, TDWI
President, Knowledge Integrity, Inc.
Senior Lecturer and Lead for External Relations, University of Maryland
November 2, 2023

What we will talk about today
• Setting the stage:
– Generative AI as a business imperative
– Data imperatives for AI model quality
• Discussion: The pivotal role of data enrichment in training and fine-tuning Generative AI models

What is Generative AI?
• A subset of artificial intelligence that includes systems designed to generate outputs such as images, music, text, or other forms of media, based on their training data
• Learns from existing data and generates new data that is consistent with the original data set
• Generative AI systems with billions of trained parameters use prediction to create new instances of data in response to provided prompts
• Large Language Models (LLMs) are a type of Generative AI that has been trained on massive amounts of content
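
To make "use prediction to create new instances of data in response to provided prompts" concrete, here is a minimal sketch of prompt-driven generation. It is a toy word-pair (bigram) model, not how production LLMs are built; the corpus, function names, and parameters are all illustrative assumptions.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    follower_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follower_counts[current_word][next_word] += 1
    return follower_counts


def generate(model, prompt, max_new_words=8, seed=0):
    """Extend the prompt one word at a time by sampling observed followers."""
    rng = random.Random(seed)  # fixed seed keeps the toy example repeatable
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        followers = model.get(words[-1])
        if not followers:
            break  # the model has never seen anything follow this word
        candidates = list(followers.keys())
        weights = list(followers.values())
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)


# Hypothetical training text, invented for this sketch.
TOY_CORPUS = [
    "data enrichment improves model accuracy",
    "data quality improves model trust",
    "data enrichment adds context to training data",
]

if __name__ == "__main__":
    model = train_bigram_model(TOY_CORPUS)
    print(generate(model, "data enrichment"))
```

Every word the sketch produces is one it saw following the previous word in training, which is why the quality and coverage of the training data bound the quality of the output.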

Ensuring Trustworthy & Appropriate Results
• Data volumes
• Data quality
• Data access
• Issues include:
– Bias
– Privacy
– Ethical concerns
– Legal concerns
– Hallucinations

Data Enrichment & LLM Training
• Improving the utility of data by appending and integrating relevant content from additional sources
• Enrichment is used for
– Refining contextual nuances
– Improving fidelity of prompt responses
– Improving pattern recognition to reduce the probability of hallucinations
– Improving interpretability of results

The leader in data integrity
Our software, data enrichment products and strategic services deliver accuracy, consistency, and context in your data, powering confident decisions.
• 12,000 customers
• 99 of the Fortune 100
• 100 countries
• 2,500 employees
Brands you trust, trust us
Data leaders partner with us

AI initiatives succeed with trusted data
91% of leading businesses have ongoing investments in artificial intelligence
From Noise to Brilliance: Supercharge AI with Data Enrichment
[Word cloud: Algorithms, Data Modeling, Large Language Models, Deep Learning, Hyperparameter Tuning, Training Data, Retrieval Augmented Generation, Supervised Learning, Natural Language Processing, Bias and Fairness, Artificial Intelligence, Feature Engineering, Neural Networks, Chatbots, Machine Learning, Data Mining]
Source: NewVantage

For trusted data, you need data integrity
Data integrity is data with maximum accuracy, consistency, and context for confident business decision-making

What is data enrichment, exactly?
It’s the process of enhancing your data by appending relevant context from additional sources – improving its overall value, accuracy, and usability.
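
As a minimal sketch of "appending relevant context from additional sources", the snippet below copies selected attributes from a reference dataset onto each record via a shared key. The field names, the `enrich` helper, and all sample values are invented for illustration and do not describe any particular product.

```python
def enrich(records, reference, key, fields):
    """Return copies of records with selected reference fields appended.

    records   -- list of dicts (the data you already have)
    reference -- dict mapping a key value to a dict of extra attributes
    key       -- name of the field in each record used for the lookup
    fields    -- attribute names to copy over from the reference source
    """
    enriched = []
    for record in records:
        match = reference.get(record.get(key), {})
        # Missing matches yield None so gaps stay visible rather than hidden.
        extra = {field: match.get(field) for field in fields}
        enriched.append({**record, **extra})
    return enriched


# Hypothetical first-party data.
customers = [
    {"customer_id": 1, "postal_code": "20742"},
    {"customer_id": 2, "postal_code": "99999"},
]

# Hypothetical third-party context keyed by postal code (made-up values).
demographics = {
    "20742": {"median_age": 24, "households": 5200},
}

if __name__ == "__main__":
    for row in enrich(customers, demographics, "postal_code", ["median_age"]):
        print(row)
```

Returning new dictionaries instead of mutating the inputs keeps the original records intact, which makes it easier to compare data before and after enrichment.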

Trusted third-party data at a global scale
Expertly curated datasets containing thousands of attributes for faster, confident decisions
• Addresses & Property: Verified and validated address and property data for map display and analytics
• Boundaries: Administrative, community, and industry-specific boundaries for data enrichment and territory analysis
• Demographics: Demographic and consumer context data for better understanding people and behavior
• Points of Interest: Detailed business, leisure, and geographic features for location and competitive intelligence
• Streets: Robust street-level data for mapping, analysis, routing, and geocoding
• Risk: Natural hazard boundaries related to flood, fire, earthquakes, and weather

Data enrichment can be easy with the right tools
A unique identifier for every address that doesn’t change, and other methods for appending data
• Categories: Land & Property, Consumer, Environment
• Datasets shown: Purchases & Shopping, Building & Parcel Boundaries, Lifestyles, PreciselyID, School Rankings, Points of Interest, Addresses, Population, Property Attributes, Weather, Natural & Manmade Hazards, Travel Time, Administrative Boundaries
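
The value of a unique identifier for every address that doesn’t change shows up when two sources spell the same address differently. The sketch below compares joining on raw address text with joining on a persistent ID. The `address_id` values, field names, and data are hypothetical and do not reflect the real PreciselyID format; assigning the ID (normally done by an address-matching or geocoding step) is assumed to have happened already.

```python
def join_on(records, reference_rows, key, field):
    """Append `field` from reference_rows to records that share the same `key`."""
    index = {row[key]: row for row in reference_rows}
    joined = []
    for record in records:
        match = index.get(record[key])
        joined.append({**record, field: match[field] if match else None})
    return joined


# Hypothetical internal records; the same location is written two ways.
policies = [
    {"policy": "P-1", "address": "12 Main St.", "address_id": "ID-0001"},
]

# Hypothetical hazard dataset from another source (made-up values).
hazards = [
    {"address": "12 MAIN STREET", "address_id": "ID-0001", "flood_zone": "high"},
]

if __name__ == "__main__":
    by_text = join_on(policies, hazards, key="address", field="flood_zone")
    by_id = join_on(policies, hazards, key="address_id", field="flood_zone")
    print("joined on address text:", by_text[0]["flood_zone"])
    print("joined on stable ID:  ", by_id[0]["flood_zone"])
```

The text join misses because the strings differ, while the ID join succeeds regardless of formatting.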

Addressing AI limitations with enrichment
• Model Accuracy and Performance: Inaccurate training data leads to poor model accuracy and performance, yielding low-quality results
• Reduced Preprocessing Overhead: Clean data reduces the need for extensive data prep, simplifying the overall AI pipeline and improving efficiency
• Faster Model Training: High-integrity data reduces the time and computational resources required for model development
• Effective Feature Engineering: Practitioners can rely on consistent data to extract meaningful features that contribute to model performance
• Enhanced Model Interpretability: Transparent, accurate data aids in the understanding of model decisions, builds trust, and identifies biases
• Reduced Overfitting: Data with integrity avoids introducing noise that contributes to overfitting, resulting in more robust models
• Easier Model Maintenance: Models trained on high-integrity data are easier to maintain, as changes are less likely to cause unexpected issues
• Reliable Model Deployment: When AI models are built on reliable data, they are more likely to perform consistently and dependably

Solve complex, real-world challenges
• Financial crimes and compliance
• Customer insight
• Branch location analytics
• Fraud analytics
• Risk analysis
• Fraud analytics
• Pricing
• Network and coverage planning
• Location-based marketing & advertising
• Asset management
FINANCIAL SERVICES INSURANCE TELECOMMUNICATIONS
• Retail location analysis
• Location-based
• Home search
• Appraisal analysis
• Valuation modeling
RETAIL
• Service optimization and delivery
• Planning
• Compliance and safety
• Emergency response and management
• Economic development
• Site selection
• Market analysis
• Lifestyle modeling
GOVERNMENT REAL ESTATE
• Checkout analytics
• Logistics and delivery
• Location-based
eCOMMERCE

Key takeaways
• What is data enrichment?
– Appending relevant context from additional sources
• How does it improve your AI?
– Accuracy, performance, and utility across various applications
• How does it benefit you?
– Improves business outcomes, saves money, and builds user trust

CONTACT INFORMATION
If you have further questions or comments:
David Loshin, Knowledge Integrity, Inc.
loshin@knowledge-integrity.com
Antonio Cotroneo, Precisely
antonio.cotroneo@precisely.com
