Pan Dhoni - Modernizing Data And Analytics using AI.pdf
1. ABOUT ME
Name: Pan Singh Dhoni
Work: Data & AI expert, independent researcher, judge, speaker, and thought leader
Email ID: ps.dhoni@gmail.com
2. AGENDA
• DATA AND ANALYTICS TERMINOLOGY
• ARCHITECTURE: ENTERPRISE DATA PLATFORM
• GENERATIVE AI AND LARGE LANGUAGE MODEL (LLM) INTRODUCTION
• MODEL SELECTION APPROACH, MODEL FINE-TUNING
• DATA ANALYSIS USING AI: GRAPH, CHART, AND REPORT GENERATION
• DEVELOPER PRODUCTIVITY USING AI
• IMAGE PROCESSING USING GENERATIVE AI
• BUSINESS USE CASES FOR RETAIL
• AI: HANDLING RISKS AND CHALLENGES
Modernizing Data And Analytics using AI
3. DATA AND ANALYTICS TERMINOLOGY
• Ingestion, queue, ETL, data modeling, aggregation, data engineering, data analytics, data quality (DQ), business partners, analyst, platform, etc.
• Business Intelligence: insights from data in the form of reports, dashboards, and trends
• Machine Learning/AI: forecasting, digital marketing, revenue growth
4. ENTERPRISE DATA PLATFORM
[Architecture diagram]
• Sources: Vendors/Apps, Systems of Record 1 to n
• Integration Layer
• Data layers: Bronze, Silver, Gold
• Semantic Layer, Presentation Layer
• Consumers: ML/AI, CDP, business partners, business use cases
• Cross-cutting: Data Quality (validation, rules, prevention), Data Governance (policies, standards), Data Security (identity, access)
5. ARTIFICIAL INTELLIGENCE (AI)
AI is not a novel concept; it has been around for a long time. The major change is on the infrastructure side: compute resources ranging from CPUs and GPUs to TPUs have become widely available, which makes AI accessible to small businesses and the general public. The recent surge of interest in OpenAI illustrates this. The use cases remain the same; what has changed is how they can be automated for industries of all sizes to serve the general public.
7. ENTERPRISE DATA PLATFORM WITH AI CAPABILITIES
[Architecture diagram]
• Sources: Vendors/Apps, Systems of Record 1 to n
• Integration Layer
• Staging Layer, Data Serving Layer
• Consumers: BI, ML/AI, CDP, business partners/decision makers, business use cases
• Cross-cutting: Data Quality (validation, rules, prevention), Data Governance (policies, standards), Data Security (identity, access)
• Generative AI/AI for better productivity and metrics: Smart Ingestion, Smart ETL, Smart DQ, Data Governance & Security, Smart BI, Smart Data Model, Smart Marketing
9. LLM MODELS
Large language models (LLMs) are machine learning models that are very effective at performing language-related tasks, such as:
• Translate text into other languages
• Improve customer experience with chatbots and AI assistants
• Organize and classify customer feedback and route it to the right departments
• Summarize large documents, such as earnings calls and legal documents
• Create new marketing content
• Generate software code from natural language
Examples: the GPT family of models (e.g., ChatGPT), BERT, Llama, MPT, and models from Anthropic.
10. MODEL SELECTION: OPEN SOURCE OR PROPRIETARY
Assess each model on privacy, quality, cost, and latency.
Open-Source Models
• Use off the shelf or fine-tune
• Provide flexibility for customization
• Can be smaller in size to save cost
• License may allow commercial or only non-commercial use
Examples: LLaMA, MosaicML MPT, etc.
Pros: Cost is low, no vendor lock-in
Cons: Development is slow
Proprietary (Paid) Models
• Usually offered as LLMs-as-a-service
• Some can be fine-tuned
• Restrictive licenses for usage and modification
Examples: OpenAI models (e.g., ChatGPT)
Pros: Development is fast, quality is typically better
Cons: Cost is high, vendor lock-in, data privacy/security concerns
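One way to make the privacy, quality, cost, and latency assessment concrete is a weighted score per candidate model. The sketch below is illustrative only: the candidate names, the 0-to-1 ratings, and the weights are hypothetical placeholders, not measurements from this deck.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A model under evaluation. Ratings are 0..1, higher is better."""
    name: str
    privacy: float   # 1.0 = data never leaves your environment
    quality: float   # 1.0 = best answer quality on your own test set
    cost: float      # 1.0 = cheapest to run
    latency: float   # 1.0 = fastest response


def score(candidate: Candidate, weights: dict) -> float:
    """Weighted sum over the four criteria named on the slide."""
    return sum(weights[k] * getattr(candidate, k) for k in weights)


def rank(candidates: list, weights: dict) -> list:
    """Return candidates sorted from best to worst weighted score."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)


# Hypothetical ratings, for illustration only.
candidates = [
    Candidate("open_source_model", privacy=0.9, quality=0.6, cost=0.8, latency=0.6),
    Candidate("proprietary_model", privacy=0.5, quality=0.9, cost=0.4, latency=0.7),
]

# Two hypothetical priority profiles; each set of weights sums to 1.
privacy_first = {"privacy": 0.5, "quality": 0.2, "cost": 0.2, "latency": 0.1}
quality_first = {"privacy": 0.1, "quality": 0.7, "cost": 0.1, "latency": 0.1}

for label, weights in [("privacy first", privacy_first), ("quality first", quality_first)]:
    best = rank(candidates, weights)[0]
    print(f"{label}: {best.name}")
```

The point of the sketch is that the "right" model depends on the weights: the same two candidates swap places when the priority moves from privacy to quality.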
11. DOMAIN SPECIFIC: FINE TUNING
[Diagram: Foundation Model + company data, schema, and NRF data schema -> Retail Domain model]
12. MODEL TRAINING: SCHEMA INFORMATION
[Diagram: Database -> LLM Model]
Example schema: Emp (name str(20), age int(5), …)
Send the schema to the LLM model using a prompt.
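A minimal sketch of sending schema information in a prompt might look like the following. The function name `build_schema_prompt`, the prompt wording, and the mapping of the slide's `str(20)` and `int` to SQL types are assumptions made for illustration; no specific LLM API is implied.

```python
def build_schema_prompt(table: str, columns: list, question: str) -> str:
    """Embed a table schema in a prompt so the model can write SQL against it.

    `columns` is a list of (name, sql_type) pairs. The wording of the prompt
    is a hypothetical example, not a required format.
    """
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        "You are given the following table schema:\n"
        f"{table} ({column_list})\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the SQL."
    )


# The Emp table from the slide, with SQL types assumed for str(20) and int.
prompt = build_schema_prompt(
    "Emp",
    [("name", "VARCHAR(20)"), ("age", "INT")],
    "How many employees are older than 30?",
)
print(prompt)
```

Only the schema is sent, not the rows, so the model learns table and column names without seeing the underlying data.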
13. BUSINESS VALUE: RETRIEVE VALUABLE INFORMATION
English Prompt -> Prompt to LLM -> SQL Query -> SQL Execution -> Result Set
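The flow on this slide (English prompt, SQL query, SQL execution, result set) can be sketched end to end with an in-memory SQLite database. The `fake_llm` function is a stand-in that returns a fixed query; in practice it would be replaced by a call to whichever model is selected. The sample rows and the SELECT-only guard are assumptions added for illustration.

```python
import sqlite3
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same SQL (assumption)."""
    return "SELECT name FROM Emp WHERE age > 30 ORDER BY name"


def answer(question: str, conn: sqlite3.Connection, llm: Callable[[str], str]) -> list:
    """English prompt -> prompt to LLM -> SQL query -> SQL execution -> result set."""
    prompt = f"Schema: Emp(name TEXT, age INTEGER)\nQuestion: {question}\nSQL:"
    sql = llm(prompt).strip()
    # Simple guard: only run read-only statements produced by the model.
    if not sql.lower().startswith("select"):
        raise ValueError(f"Refusing to execute non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


# Hypothetical sample data for the Emp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?)",
    [("Asha", 34), ("Ben", 28), ("Chen", 41)],
)

rows = answer("Which employees are older than 30?", conn, fake_llm)
print(rows)
```

Keeping the model behind a plain callable makes it easy to swap the stub for a real client and to test the SQL execution path without any network access.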
14. DATA ANALYSIS / ANALYST / BI
[Diagram: UI -> English Prompt -> Prompt to LLM (LLM Model) -> SQL Query -> SQL Execution against DB, Delta Lake, S3, etc. -> Result Set]
16. IMAGE PROCESSING USING GEN AI
17.
Parameter/Metric | Non-Local Means (NLM) Denoising | CNN Trained for Image Denoising | Conclusion
Accuracy (MSE) / PSNR | MSE = 15.0 | PSNR = 18 dB (inversely related to MSE) | The generative method (CNN) outperforms the traditional method (NLM) in terms of accuracy.
Processing Time | 25 seconds | Hours (training time; first time only) | The traditional method is significantly faster for inference, while the generative model requires more time for training.
Generalization | Moderate | High | The generative model demonstrates better generalization capabilities.
Interpretability | High | Not as high, due to the inherent complexity of CNNs | Traditional methods tend to be more interpretable.
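The table notes that PSNR is inversely related to MSE. The standard definition is PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The sketch below assumes 8-bit images (MAX = 255) and uses tiny hypothetical pixel lists; it does not reproduce the measurements in the table.

```python
import math


def mse(reference: list, estimate: list) -> float:
    """Mean squared error between two equally sized flat pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr_from_mse(mse_value: float, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; a lower MSE gives a higher PSNR."""
    if mse_value == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse_value)


# Hypothetical 4-pixel example: a clean image and a denoised estimate.
clean = [100, 120, 140, 160]
denoised = [102, 118, 141, 157]

error = mse(clean, denoised)
print(f"MSE = {error:.2f}, PSNR = {psnr_from_mse(error):.2f} dB")
```

Because both metrics are derived from the same pixel differences, comparing two methods is most meaningful when the same metric is reported for each.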
19. A FEW RETAIL USE CASES
• Business partners
• Inventory optimization
• Store employee performance improvement
• Theft protection
• Revenue increase (targeted marketing,
customer satisfaction)
20. ADDRESS ETHICAL ISSUES
Data bias:
➢ Assess data slices -> update data
➢ Regulation
Toxic, discriminatory, or exclusionary model output; misinformation hazard:
➢ Assess data slices -> update data
➢ Curate data for fine-tuning
➢ Regulation
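"Assess data slices" can be read as computing the same quality metric separately for each subgroup of the data and flagging the subgroups that lag behind. The sketch below is one possible interpretation: the `region` slice key, the sample records, and the 0.1 gap threshold are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_slice(records: list, slice_key: str) -> dict:
    """Accuracy of predictions computed separately for each value of slice_key."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[slice_key]
        totals[group] += 1
        correct[group] += int(record["label"] == record["prediction"])
    return {group: correct[group] / totals[group] for group in totals}


def flag_slices(accuracy: dict, max_gap: float = 0.1) -> list:
    """Slices whose accuracy trails the best slice by more than max_gap."""
    best = max(accuracy.values())
    return sorted(g for g, acc in accuracy.items() if best - acc > max_gap)


# Hypothetical evaluation records, for illustration only.
records = (
    [{"region": "north", "label": 1, "prediction": 1}] * 4
    + [{"region": "south", "label": 1, "prediction": 1}] * 2
    + [{"region": "south", "label": 1, "prediction": 0}] * 2
)

per_slice = accuracy_by_slice(records, "region")
print(per_slice, flag_slices(per_slice))
```

A flagged slice is a candidate for the "update data" step: collecting or curating more examples for that subgroup before retraining or fine-tuning.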
21. ADDRESS ETHICAL ISSUES
✓ Ensure proper data anonymization, encryption, and access controls
✓ Implement safeguards against unauthorized access to or disclosure of sensitive data during training, storage, and inference
✓ Establish data and model governance: version control, monitoring, auditing, data usage policy, etc.
Questions to ask vendors:
✓ Will data be shared with third parties?
✓ Do you have data lineage that enables you to delete data from the various places it is stored?
✓ Is history stored, and if so, is it secure?
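As a small illustration of the anonymization point above, sensitive fields can be replaced with keyed hashes before a record is stored or sent to a model. This is a sketch of pseudonymization using HMAC-SHA256 from the Python standard library; the field names and the key are hypothetical, and a keyed hash alone does not amount to full anonymization or replace encryption and access controls.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a value with a keyed hash so it cannot be read directly."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; keep the full digest if collisions matter


def anonymize_record(record: dict, sensitive_fields: set, key: bytes) -> dict:
    """Return a copy of the record with the sensitive fields pseudonymized."""
    return {
        field: pseudonymize(str(value), key) if field in sensitive_fields else value
        for field, value in record.items()
    }


# Hypothetical key and record; in practice the key would come from a secrets manager.
secret_key = b"replace-with-a-managed-secret"
customer = {"email": "jane@example.com", "store_id": 42, "basket_total": 87.5}

masked = anonymize_record(customer, {"email"}, secret_key)
print(masked)
```

The same input and key always produce the same token, so records can still be joined on the masked field without exposing the original value.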
22. REFERENCES
Google Scholar (Pan Singh Dhoni):
https://scholar.google.com/citations?user=PnVDB6MAAAAJ&hl=en
https://figshare.com/authors/Pan_Dhoni/16388907
1. https://docs.databricks.com/en/generative-ai/generative-ai.html
2. Synergizing Generative Artificial Intelligence and Cybersecurity: Roles of Generative Artificial Intelligence
Entities, Companies, Agencies and Government in Enhancing: Journal of Global Research in Computer
Sciences 14 (3), 16, 2023
3. Synergy in Technology How Generative AI Augments the Capabilities of Customer Data Platforms: Journal
of Mathematical Techniques and Computational Mathematics 2 (9), 396, 2023.
4. Synergizing Generative AI and Cybersecurity: Roles of Generative AI Entities, Companies, Agencies, and
Government in Enhancing Cybersecurity: TechRxiv
5. A cost-effective IT approach to rapidly build a data platform and integrate retail applications for small and
mid-size companies: TechRxiv
6. Exploring the Synergy between Generative AI, Data and Analytics in the Modern Age: TechRxiv
7. Enhancing Data Quality through Generative AI: An Empirical Study with Data: TechRxiv
8. Presented the paper "Advantages of Generative AI in Image Processing" at ICAAAIML-2023, Sharda University
9. From Data to Decisions: Enhancing Retail with AI and Machine Learning, International Journal of
Computing and Engineering 5(1), 38-51, 2024.
10. https://arxiv.org/abs/2303.08774
23. THANK YOU
Pan Singh Dhoni
ps.dhoni@gmail.com
https://www.linkedin.com/in/pandhoni/