Serverless Generative AI on AWS, AWS User Groups of Florida

Serverless
Generative AI
AWS User Groups of Florida
Fort Lauderdale, FL, USA
February 27th, 2024
Patrick Hannah
CTO
CloudHesive

AWS User Groups of Florida – Updates
We are back to In-Person Meetups and working towards a monthly cadence
Always open to ideas on how we can improve the content and format!
Collaborate with us after the MeetUp!
Future MeetUps – Presenters? Topics? Formats?
Slideshare – Keep an eye on our MeetUp Page – we will post a link to the Slides
Slack – Keep the conversation going
Today’s MeetUp Format
Feel free to ask questions throughout the session!
Dedicated Q&A at the end

Topic
In this session, I will unravel the complexities of serverless
generative AI, offering insights into its architecture, applications,
and potential impact on businesses across various industries.
Whether you're a seasoned AWS practitioner or just starting your
journey into cloud computing, this presentation promises to
broaden your horizons and spark new ideas.

Inspiration
“I'm wondering if there is a feature request to create something like a saved
query in Athena that can be executed via a CloudWatch Event?”
The AWS Step Functions service integration with Amazon Athena enables you to use
Step Functions to start and stop query execution, and get query results
AWS User Groups of Florida MeetUp - AWS API Architectures - Scott
Hendrickson, Partner Solutions Architect, AWS
Data sources and resolvers are how AWS AppSync translates GraphQL requests and
fetches information from your AWS resources
AWS Well Architected Framework Serverless Application Lens
If your Lambda function is not performing custom logic while integrating with other
AWS services, chances are that it may be unnecessary

Who doesn’t like connecting things together?

Compute’s Transition to Serverless
Compute - EC2 Bare Metal (Intel, AMD, Graviton, M1)
Compute - EC2 Virtual > Bare Metal (Xen, KVM/Nitro)
Containers - Fargate > ContainderD (was DockerD) > EC2
Serverless - Lambda > Firecracker (Micro VM) > EC2

Serverless’ Flavors
High Level Abstractions
SaaS (Connect)
Hybrid Abstractions
PaaS (DynamoDB)
Low Level Abstractions
IaaS (Lambda)

Service Categories
Analytics
Application Integration
AR & VR
AWS Cost Management
Blockchain
Business Applications
Compute
Customer Engagement
Database
Developer Tools
End User Computing
Game Tech
Internet of Things
Machine Learning
Management & Governance
Media Services
Migration & Transfer
Mobile
Networking & Content Delivery
Quantum Technologies
Robotics
Satellite
Security, Identity, & Compliance
Storage

Workload Personas
Migrated
Server Based
Migrated & Optimized
Blends of Server and Service Based
Serverless/Native
Service Based
Orchestrated
ECS, EKS, K8s
Inherited
Wildcard!
Hybrid
Wildcard!

Well Architected Framework
Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization
Sustainability

Cloud Workload Lifecycle Management
Workload
Architecture
Monitoring
Automation
Processes
Integration

Workload + Architecture Drives Service Selection
Containers
Container File
Versioning
Multi-threaded/Single-task
Minutes to Days
Per VM/Per Hour
Virtual Machines
AMI
Patching
Multi-threaded/Multi-task
Hours to Months
Per VM/Per Hour
Functions/Services
Code
Versioning
Single-threaded/Single-task
Microseconds to Seconds
Per Memory/Second/Per Request

Automation + Processes Drives Lifecycle Management Selection
Organizations
Cross-Account Asset Management + Governance
Control Tower
Account vending/default standardization
Service Catalog
Workload platform vending/default standardization
CloudFormation
IaC
Ephemeral Compute + API Managed Data/Control Plane for
Persistence Tiers
Hands off/Lights out

Processes
Patching
Backup/Restore Testing
Failover Testing (AZ)
Credential Rotation/Credential Audit
Event Response Testing
Incident Response Testing
Performance Testing
Performance/Cost Review
Vulnerability/Penetration Testing

AI/ML Options
Generalized
Specialized
“Balanced”

Generative AI in the context of AWS
Amazon Bedrock
Amazon SageMaker, Studio and Canvas (and Redshift Inferences)
NVIDIA GPU-powered Amazon EC2 instances
AWS Tranium
AWS Inferentia
Amazon EC2 UltraClusters
Amazon Q: Business, AWS, QuickSight, Connect, Supply Chain, Code
Catalyst, IDE, Code Transformation, Query Editor (Redshift)
PartyRock
AWS CodeWhisperer
AWS HealthScribe

Generative AI in the context of AWS
Services that accelerate development for AWS
Services that are powered by it – No-code data connectors/Zero
ETL, Instance Selection, Console to Code (and AppComposer),
Natural Language Querying, Code Scanning, Datazone
(Descriptions)
Services that accelerate development for you – Lex
(Conversational FAQ, Slot Resolution, Bot builder, Utterance
Generator), Personalize (Themes), Transcribe (Summarization)
Services improved by it – Alexa

Rationalization
Why Serverless – how does serverless change how we incept,
launch, and iterate product?
Why GenAI – how does Generative AI change how we think
about solving problems with data?

Bedrock Operationalization
Non-functional
Regional Considerations
FM Subscription
Throughput/Quotas
Security
Operational Monitoring
Traffic Flow (Private Link)
Functional
Prompt Engineering
Tokens
Model Parameters
Inference Parameters
Sessions

Databases that can be used to store Vector Embeddings
OpenSearch/Serverless
Redis Enterprise and MemoryDB
Pinecone
Aurora (Postgres)
RDS (Postgres)
MongoDB
DocumentDB
Neptune

Machine Learning
Amazon Augmented AI - Easily implement human review of machine learning predictions
Amazon CodeGuru - Intelligent recommendations for building and running modern applications
Amazon Comprehend - Analyze Unstructured Text
Amazon Comprehend Medical - Amazon Comprehend Medical uses machine learning to extract
insights and relationships from medical text.
AWS DeepComposer - AWS DeepComposer allows developers of all skill levels to get started with
Generative AI.
AWS DeepLens - Deep Learning Enabled Video Camera
AWS DeepRacer - Fully autonomous 1/18th scale race car, driven by machine learning
Amazon DevOps Guru - ML-powered cloud operations service to improve application availability.
Amazon Forecast - Amazon Forecast is a fully-managed service for accurate time-series
forecasting
Amazon Fraud Detector - Detect more online fraud faster using machine learning
Amazon HealthLake - Making sense of health data
Amazon Kendra - Highly accurate enterprise search service powered by machine learning
AWS HealthImaging
Amazon Lex - Build Voice and Text Chatbots
Amazon Lookout for Equipment - Detect abnormal equipment behavior by analyzing sensor data
Amazon Lookout for Metrics - Accurately detect anomalies in your business metrics and quickly
understand why
Amazon Lookout for Vision - Identify defects using computer vision to automate quality inspection.
Amazon Monitron - End-to-end system for equipment monitoring
Amazon Omics - Transform omics data into insights.
AWS Panorama - Enabling computer vision applications at the edge
Amazon Personalize - Amazon Personalize helps you easily add real-time recommendations to
your apps
Amazon Polly - Turn Text into Lifelike Speech
Amazon Rekognition - Search and Analyze Images
Amazon SageMaker - Build, Train, and Deploy Machine Learning Models
Amazon Textract - Easily extract text and data from virtually any document
Amazon Transcribe - Powerful Speech Recognition
Amazon Translate - Powerful Neural Machine Translation
Amazon Bedrock

Primary Services
API Tier
API Gateway – API Management
AppSync – GraphQL API
Application (Execution)/Code Tier
Lambda – Serverless Compute
Data Store Tier
DynamoDB – Key/Value Data Base
Service Tier
Event Bridge/Step Functions – Event Bus, Low Code/No Code Workflow
Athena – Interactive Query Service
S3 – Object Storage
Glue – Data Integration Service

Options for APIs
Client > API Gateway HTTP > Things
Client > API Gateway REST > Things
Client > AppSync GraphQL > Things
Client > Application Load Balancer > Lambda
Client > Lambda Function URLs
Client > CloudFront (Authorizer) > Lambda
Client > AWS IoT

Options to call AWS services w/o Lambda
APIs
API Gateway > AWS Services
AppSync > GraphQL > Resolvers > AWS Services
Event
Step Functions > AWS Services
EventBridge

API Gateway Integrations
AWS
Service
Lambda
AWS Proxy
Service
Lambda
HTTP
HTTP Proxy
Mock

AppSync Resolvers
DynamoDB
RDS
OpenSearch
Lambda
HTTP

Sync versus Async
Can the payload fit in the size/time constraints
What is the impact to the client?

Step Functions Optimized Integrations
Lambda
Batch
DynamoDB
ECS/Fargate
SNS
SQS
Glue, DataBrew
SageMaker
EMR
CodeBuild
Athena
EKS
API Gateway
EventBridge
Step Functions
HTTP Destinations (New) - https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/
Bedrock (New)- https://aws.amazon.com/about-aws/whats-new/2023/11/aws-step-functions-optimized-integration-bedrock/

Options for Event Buses/Messaging/Queuing
DynamoDB > Triggers
CloudWatch Logs > Metrics > Alarms / Lambda
CloudWatch Metrics > Destination
Kinesis > Lambda
Event Bridge (DLQ Support) > Lambda
SQS (DLQ Support) > Lambda
SNS (DLQ Support) > Lambda
(DLQ Support) Lambda

Serverless Data Stores - The Easy Button
S3 Query – Query objects in S3, through S3
Athena (and S3 and Glue) – Query objects in S3, Presto
AppFlow – Data Integration Platform
Profiles
Wisdom
Tasks

Serverless Data Stores
DynamoDB – Key/Value
Timescale – Time Series
Keyspaces – Cassandra
QLDB – Ledger
Aurora – Relational
Prometheus – Prometheus
Grafana – Grafana
MWAA – Airflow

General Considerations
Multi-Region? Single-Region? Which Region(s)?
Which Services?
What will they cost? How are they metered/billed?
How far do we need to scale?
What compliance requirements do we need to meet?
What tools do we have in our reach? (Frameworks, Patterns,
etc.)

API Gateway
Development (Isolation, Stages, SAM)
Client Security (Certificates, API Keys, Authorizers)
Gateway Security (WAF, Throttling)
Endpoint Type (Edge optimized, Regional, Private, API Cache)
Integration (Methods, Proxy, Response Codes)
Operationalization (CloudWatch Logs, CloudWatch Metrics,
Access Logging, X-Ray
Testing (Direct, PostMan)

Lambda
Runtime
Pre-Warming
Sizing/Timeouts
Development (Isolation, Versions, SAM, Cloud9, Parameterization)
Integration (Methods, Response Codes)
Security (KMS, Execution Role)
Operationalization (CloudWatch Logs, CloudWatch Metrics, X-Ray)
Testing (Direct)

“The Rest”
Development (Coding Best Practices, Runtime, RDBMS, DevOps)
Data Stores that are not Serverless (Sizing, CloudWatch, Logs, Events,
Backup/Recovery, Multi-AZ, Database “Stuff”)
Trade-off
VPC (Public Subnets, Private Subnets, Security Groups)
Typical of Legacy Integrations, Non-Serverless Data Stores, etc.
General (What are all of the things we need to think about when we create a
new AWS account?)
“Landing Zone”

Conclusion
AWS continues to increase the breadth and depth of their service offerings
I wish it did that
I didn’t know I needed that
It’s easier to get started today than it was yesterday
Simplicity
Support
Cost
Lessons Learned
Regional Availability
Flexibility of implementation to change FMs (or even support custom FMs) and tune FM specific parameters
Conclusion
Generative AI and API Access to Generative AI services (like Bedrock) can be an easy button
Not an end all – value can be found in context, which takes us back to needing a strong data foundation
Priorities are still priorities – customers don’t care about Generative AI if your customers have needs unfulfilled by the product or by Generative AI
Customers may also need to be led to it – if the customer isn’t asking, pushing it on them won’t help – they need education
Consider sustainability when choosing an approach – Maslow’s Hammer
Don’t forget about team enablement
Limited by your imagination and ability to execute

References
https://docs.aws.amazon.com/wellarchitected/latest/serverless-applications-lens/wellarchitected-
serverless-applications-lens.pdf – Well Architected Serverless Application Lens
https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-aws-proxy.html – API
Gateway Service Proxy Example
https://docs.aws.amazon.com/apigateway/latest/developerguide/websocket-api-chat-app.html – API
Gateway Websocket Example
https://docs.aws.amazon.com/appsync/latest/devguide/tutorials.html – AppSync Tutorials
https://docs.aws.amazon.com/appsync/latest/devguide/tutorial-dynamodb-resolvers.html – AppSync
Tutorial DynamoDB Resolver
https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html – Lambda URLS
https://docs.aws.amazon.com/step-functions/latest/dg/connect-supported-services.html – Step Functions
Supported Services
https://docs.aws.amazon.com/step-functions/latest/dg/sample-athena-query.html – Step Functions
Athena Query

0800-860-2040
sales-latam@cloudhesive.com
cloudhesive.com
Fort Lauderdale
2419 E. Commercial Blvd, Ste. 300
Ft. Lauderdale, Florida
USA
Buenos Aires
Av. Del Libertador 6680, Piso 6
CABA, Ciudad de Buenos Aires
Argentina
Santiago de Chile
Cerro El Plomo 5420 SB1, Oficina 15
Nueva Las Condes, Santiago de Chile
Chile

Serverless Generative AI on AWS, AWS User Groups of Florida

Recommended

Recommended

More Related Content

Similar to Serverless Generative AI on AWS, AWS User Groups of Florida

Similar to Serverless Generative AI on AWS, AWS User Groups of Florida (20)

More from CloudHesive

More from CloudHesive (20)

Recently uploaded

Recently uploaded (20)

Serverless Generative AI on AWS, AWS User Groups of Florida

Editor's Notes