Cloud Native AI Introduction,
Challenges, and Path Forward
By Husnain Ahmed
husnain2info@gmail.com
https://www.linkedin.com/in/husnainahmed/
Question of the
day 🤔
What is CNAI?
What is Cloud Native?
Cloud Native
Scalable and reliable platform
● Microservices
● Containers
● Container Orchestration
● DevOps
Cloud Native is a modern way of
developing, deploying, and running
applications
● Increase Efficiency through CI/CD
● Ensure Availability
● Reduce Cost
How does Cloud Native benefit organizations?
AI/ML
Artificial Intelligence / Machine
Learning
● Machine Learning
● Deep Learning
● Data Science
● Math and Statistics
AI is the ability of a computer to perform
tasks commonly associated with intelligent
beings.
AI builds on these important fields
Graphical representation of an AI system
Main Branches
of AI
Generative AI/Predictive AI
Predictive AI:
Best suited for tasks whose expected
output is already known, typically
repetitive, well-defined work.
Generative AI:
Built on LLMs and able to produce
new, human-like content.
Challenges of Cloud Native
in AI
Challenges
1. Data Management and
Governance
2. Processing Demands and
Efficiency
3. Architectural Complexity
4. Operational Hurdles
5. Unified System Management
Data Management
and Governance
Addressing data size,
synchronization, governance,
privacy regulations (e.g.,
GDPR, CCPA), ownership,
lineage, and mitigating bias.
Challenge # 1
Processing
Demands and
Efficiency
Managing rising processing
demands while ensuring
cost efficiency, scalability,
and orchestrating
workflows with custom
dependencies.
Challenge # 2
Architectural
Complexity
Navigating microservice
architecture, resource allocation, and
debugging in model serving and user
experience contexts.
Challenge # 3
Operational
Hurdles
Tackling multi-stage AI pipeline
complexities, developer learning
curves, big data handling, multi-tenancy,
and security compliance.
Challenge # 4
Unified System
Management
Handling Cloud Native AI systems
comprehensively, including resource
allocation, cost control, monitoring,
disaster recovery, security
compliance, sustainability, and
educational initiatives.
Challenge # 5
Path Forward - CNAI (Cloud Native Artificial
Intelligence)
● Flexibility in Tooling: Embrace popular tools like REST interfaces and cloud-based resources
to navigate the overwhelming options in AI, ensuring adaptability as new technologies emerge.
● Sustainable AI Practices: Enhance AI workload accountability for environmental impact by
integrating cloud native technologies for optimization, advocating for standardized
environmental assessments, promoting energy-efficient AI models, and emphasizing purposeful
AI usage.
● Customizing Platform Dependencies: Ensure Cloud Native environments support GPU
drivers and acceleration for AI workloads, addressing compatibility challenges with specific
frameworks and libraries, and accommodating diverse vendors and GPU architectures (a
minimal GPU-scheduling sketch follows this list).
● Implementing Reference Models: Consider the value of a Cloud Native, OpenTofu-based
reference implementation, combining various open-source tools like JupyterLab, Kubeflow,
PyTorch, Spark/Ray/Trino, Iceberg, Feast, MLflow, YuniKorn, EKS/GKE, S3/GCS, etc., to
provide a user-friendly and scalable distribution for AI/ML development in the Cloud, fostering
open and responsible AI/ML practices.
● Adopting Unified Terminology: As AI proliferates, terminology evolves to simplify
conversations, encompassing both business-friendly terms like "repurpose" for content reuse
and technical terms like RAG (retrieval-augmented generation), Reason, and Refinement,
facilitating broader adoption and understanding across diverse sectors.
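To make the GPU-dependency point above concrete, here is a minimal sketch (not part of the original slides) that uses the official Kubernetes Python client to request a GPU for a training pod. It assumes a vendor device plugin (NVIDIA's, in this example) exposes the nvidia.com/gpu extended resource on the cluster's nodes; the container image and training command are placeholders.

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config; use load_incluster_config() when running in-cluster.
config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-training-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="pytorch/pytorch:latest",    # placeholder image
                command=["python", "train.py"],    # placeholder training script
                resources=client.V1ResourceRequirements(
                    # Extended resource exposed by the GPU device plugin;
                    # the resource name varies by vendor (e.g. amd.com/gpu).
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The scheduler only places this pod on a node whose device plugin advertises the requested GPU resource, which is why driver and plugin compatibility matters across vendors and GPU architectures.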
Solutions and Opportunities - CNAI (Cloud Native
Artificial Intelligence)
● Orchestration - Kubeflow: Kubeflow streamlines ML Operations (MLOps) on
Kubernetes, enabling efficient adoption of Cloud Native tools for AI/ML/DL. It
implements microservices for each ML lifecycle stage, offering distributed training,
hyperparameter tuning, and model serving (see the pipeline sketch below).
● Vector Databases: Enhance Cloud Native AI by enriching LLM prompts with contextual
embeddings, enabling multi-modal GenAI systems to handle diverse inputs effectively.
Examples include Redis, Milvus, Faiss, and Weaviate, offering tailored indexing schemes
for efficient vector handling.
● OpenLLMetry: Improves Cloud Native AI observability with OpenTelemetry, enabling
comprehensive instrumentation for Generative AI. Developers rely on observability tools
for refining AI usage over time, with data driving evaluations and fine-tuning workflows.
Solutions
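As a concrete companion to the Kubeflow bullet above, the following is a minimal sketch using the Kubeflow Pipelines (kfp) v2 SDK; it is not taken from the original slides, and the component bodies, pipeline name, and file names are placeholder assumptions. A real pipeline would wrap actual data preparation, training, and serving steps.

```python
from kfp import dsl, compiler

@dsl.component(base_image="python:3.11")
def preprocess(raw_text: str) -> str:
    # Placeholder preprocessing step; a real component would clean and featurize data.
    return raw_text.strip().lower()

@dsl.component(base_image="python:3.11")
def train(dataset: str) -> str:
    # Placeholder training step; a real component would fit and export a model.
    return f"model trained on: {dataset}"

@dsl.pipeline(name="cnai-demo-pipeline")
def cnai_demo_pipeline(raw_text: str = "Hello, Cloud Native AI"):
    # Each component runs as its own containerized step on Kubernetes.
    prep_task = preprocess(raw_text=raw_text)
    train(dataset=prep_task.output)

if __name__ == "__main__":
    # Compile to a pipeline spec that a Kubeflow Pipelines backend can execute.
    compiler.Compiler().compile(
        pipeline_func=cnai_demo_pipeline,
        package_path="cnai_demo_pipeline.yaml",
    )
```

Each decorated function becomes an independent containerized task, which is how Kubeflow maps the ML lifecycle stages mentioned above onto Cloud Native microservices.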
● CNCF Project Landscape: Explore collaborative AI projects in Linux Foundation (LF)
groups like CNCF, which offer a hub for engineers.
● ML Tool-to-Task Mind Map: Gain insights from the Cloud Native Landscape and an ML
tool-to-task mind map to aid decision-making.
● CNAI for Kids and Students: Empower youth with AI education through initiatives like
CNCF Kids Day.
● Participation: Access education and collaboration platforms for AI specialists and
generalists.
● Trust and Safety: Prioritize safety in AI and Cloud Native tech for positive online
experiences.
● New Engineering Discipline: Witness the rise of roles like MLDevOps, bridging Data
Science and Development.
Opportunities
Combining AI and Cloud Native technologies opens significant opportunities for
organizations. Cloud Native platforms make it easier to train and serve AI models at
scale. Challenges remain, such as managing resources and keeping models explainable,
but new Cloud Native tools like Kubeflow are steadily closing these gaps. As both
fields mature, organizations that combine them effectively can innovate faster than
their competitors. Success comes down to investing wisely in people, tools, and
technology to deliver meaningful innovation and great customer experiences.
Conclusion
