SlideShare a Scribd company logo
1 of 35
Download to read offline
Building an Event Driven
Gen-AI powered video
summarizer API
I am Zied Ben Tahar
AWS Community Builder / Principal Architect @ Aviv Group /
You can find me:
Blog: https:/
/zied-ben-tahar.medium.com/
Linkedin: https:/
/www.linkedin.com/in/zied-ben-tahar-5200913/
Twitter : @wow_sig
Bonjour !
Agenda
01 02
04
03
05
GenAI on AWS - A
quick overview
Let’s build: AI powered
Video summarizer
Video summarizer -
Scaling the
architecture with EDA
Api-fying the solution Demo Time
GenAI on AWS - A quick
overview
01
Amazon Bedrock
A serverless Gen-AI service
Managed FMs
Access to foundation models
from leading partners on AI space
Customizable
Capabilities to adapt and
privately customize FMs to
domain specific problem
Serverless
No infrastructure to manage,
Good integration with AWS
ecosystem
Gen AI
Addressing some LLM challenges
How to solve issues related to knowledge cut-off, hallucinations, blackbox ?
➔ Prompt engineering
◆ Optimizing prompts for specific tasks and guiding the model to yield the
expected answers
➔ Retrieval augmented generation (RAG)
◆ Allow the model to access to external/up to date information to enhance
response.
➔ Model fine-tuning
◆ Additional training of a pre-trained model on task-specific datasets.
Anthropic’s Claude (2.1) 3
A serious alternative to GPT 4
➔ Handles large context windows, 200K tokens (150k
words)
➔ Able to maintain factual consistency when processing
long prompts
➔ Can generate code and structured data
➔ Vision capabilities (new with Sonnet)
Claude 3 benchmarks
Let’s build!
AI powered Video
summarizer
02
Solution overview
Summarization workflow, first attempt
Extract/Get Video
transcript
Summarize & extract
structured data
Generate audio from
summary
Profit !
01 02 03 04
Solution overview
Serverless Summarizer workflow, a first “naïve” approach
Deep dive
Getting video transcripts
Konami’ing your way into video transcripts
🔗 https:/
/github.com/Kakulukian/youtube-transcript
~ Fast, as it retrieves already generated transcripts from youtube
~ …but it’s brittle, based on a non official youtube API, does not work on
some languages
Deep dive
Prompting techniques with Claude
You are a video transcript summarizer.
Here is a video transcription that you must summarize:
<transcript>{{transcript}}</transcript>
Summarize this transcript in a third person point of view in 10 sentences.
Identify the speakers and the main topics of the transcript and add them in
the output as well.
Do not add or invent speaker names if you not able to identify them.
You must output the summary JSON format conforming to this JSON schema:
{
"type": "object",
"properties": {
"speakers": {
"type": "array",
"items": {
"type": "string"
}
},
"topics": {
"type": "string"
},
"summary": {
"type": "array",
"items": {
"type": "string"
}
}
}
}
➔ Clear without ambiguity, no
room for assumptions
➔ Don’t be polite, be
straightforward
➔ Concrete example/ schema
for structured outputs
Deep dive
Putting words in Claude’s mouth
For structured output, we’ll need to give hints to the Assistant by prefilling its response
With Claude’s Completion API
(Legacy)
Message API
Deep dive
Putting words in Claude’s mouth
❌ ✅
Deep dive
Invoking the model - state machine definition
1- Generate model parameters and write to an s3
bucket
2- Direct invoke to claude model
3- Making sure that JSON is well formed
Deep diving
Invoking the model
Can we do better ?
A better solution
Enters Amazon Transcribe
~ A more stable solution that supports many languages and more media types
~ Some important architectural side effects: The system is now event based and
asynchronous
This is nice…but how can I scale
my architecture ?
Scaling the architecture
with EDA
03
Scaling the architecture
Orchestration or Choreography ?
~ Orchestration: Greater control
over complex workflows. Complex
to evolve when collaborating with
external contexts
~ Choreography: Flexible and and
easier to extend but harder to
Observe
Which pattern works best ?
Scaling the architecture
Let’s try with Choreography
~ Extensible: Tasks can be added or removed without impacting the entire system
~ Resilient: If a task fails, it does not impact the entire application
Scaling the architecture
Scatter-gather pattern
~ Aggregator: A central component handling the events from the scatter phase.
Scaling the architecture
Scatter-gather pattern
…Yes, but
Scaling the architecture
Aggregator pattern can be hard to tame
● What if…
○ a task fails
○ or never publishes the expected event
● How to make sure that failure is handled properly ?
● How to monitor the overall progress of a given Job ?
● How to make sure that my system is observed properly
So…Orchestration or Choreography ?
Scaling the architecture
Orchestration or Choreography…Why not both ?
Apify-ing the solution
04
APIs, like diamonds, are forever
Joshua Bloch
Apify-ing the solution
Principles on designing good “public” Event-Driven APIs
● Be conservative on what you send (Postel’s Law)
● Be mindful of the observed behavior of your APIs (Hyrum’s law)
● Make clear distinction between your private and public events
● Adopt a sane versioning strategy
● Adopt standards: CloudEvents & AsyncAPI
Apify-ing the solution
The awesomest GenAI powered video summarizer
Demo
05
Thanks !
Questions ?
Source code👇

More Related Content

Similar to eda-genai-believe-in-serverless-talk.pdf

Spring '20 Developer Release Highlights
Spring '20 Developer Release HighlightsSpring '20 Developer Release Highlights
Spring '20 Developer Release HighlightsPeter Knolle
 
Questions Log: Tips for Intermediate Cognos Report Studio Authors
Questions Log: Tips for Intermediate Cognos Report Studio AuthorsQuestions Log: Tips for Intermediate Cognos Report Studio Authors
Questions Log: Tips for Intermediate Cognos Report Studio AuthorsSenturus
 
The OpenOffice.org specification process demystified
The OpenOffice.org specification process demystifiedThe OpenOffice.org specification process demystified
The OpenOffice.org specification process demystifiedAlexandro Colorado
 
DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.Vlad Fedosov
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsJason Anderson
 
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedJoão Pedro Martins
 
Front-End Modernization for Mortals
Front-End Modernization for MortalsFront-End Modernization for Mortals
Front-End Modernization for Mortalscgack
 
Front end-modernization
Front end-modernizationFront end-modernization
Front end-modernizationdevObjective
 
Lean Engineering: How to make Engineering a full Lean UX partner
Lean Engineering: How to make Engineering a full Lean UX partnerLean Engineering: How to make Engineering a full Lean UX partner
Lean Engineering: How to make Engineering a full Lean UX partnerBill Scott
 
Building a Great AEM Team: Time Warner Cable's Journey
Building a Great AEM Team: Time Warner Cable's JourneyBuilding a Great AEM Team: Time Warner Cable's Journey
Building a Great AEM Team: Time Warner Cable's JourneyiCiDIGITAL
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Comment tirer partie de Visual Studio Online pour vos développements SharePoint
Comment tirer partie de Visual Studio Online pour vos développements SharePointComment tirer partie de Visual Studio Online pour vos développements SharePoint
Comment tirer partie de Visual Studio Online pour vos développements SharePointGilles Pommier
 
Fowa Miami 09 Cloud Computing Workshop
Fowa Miami 09 Cloud Computing WorkshopFowa Miami 09 Cloud Computing Workshop
Fowa Miami 09 Cloud Computing WorkshopMark Masterson
 
Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)Julien SIMON
 
PHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in phpPHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in phpAhmed Abdou
 

Similar to eda-genai-believe-in-serverless-talk.pdf (20)

Spring '20 Developer Release Highlights
Spring '20 Developer Release HighlightsSpring '20 Developer Release Highlights
Spring '20 Developer Release Highlights
 
Questions Log: Tips for Intermediate Cognos Report Studio Authors
Questions Log: Tips for Intermediate Cognos Report Studio AuthorsQuestions Log: Tips for Intermediate Cognos Report Studio Authors
Questions Log: Tips for Intermediate Cognos Report Studio Authors
 
XP Injection
XP InjectionXP Injection
XP Injection
 
XP Injection
XP InjectionXP Injection
XP Injection
 
Feature folders
Feature foldersFeature folders
Feature folders
 
The OpenOffice.org specification process demystified
The OpenOffice.org specification process demystifiedThe OpenOffice.org specification process demystified
The OpenOffice.org specification process demystified
 
DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.
 
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video AnalyticsBreaking Through The Challenges of Scalable Deep Learning for Video Analytics
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics
 
Enabling Lean at Enterprise Scale: Lean Engineering in Action
Enabling Lean at Enterprise Scale: Lean Engineering in ActionEnabling Lean at Enterprise Scale: Lean Engineering in Action
Enabling Lean at Enterprise Scale: Lean Engineering in Action
 
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
 
Front-End Modernization for Mortals
Front-End Modernization for MortalsFront-End Modernization for Mortals
Front-End Modernization for Mortals
 
Front end-modernization
Front end-modernizationFront end-modernization
Front end-modernization
 
Front end-modernization
Front end-modernizationFront end-modernization
Front end-modernization
 
Lean Engineering: How to make Engineering a full Lean UX partner
Lean Engineering: How to make Engineering a full Lean UX partnerLean Engineering: How to make Engineering a full Lean UX partner
Lean Engineering: How to make Engineering a full Lean UX partner
 
Building a Great AEM Team: Time Warner Cable's Journey
Building a Great AEM Team: Time Warner Cable's JourneyBuilding a Great AEM Team: Time Warner Cable's Journey
Building a Great AEM Team: Time Warner Cable's Journey
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Comment tirer partie de Visual Studio Online pour vos développements SharePoint
Comment tirer partie de Visual Studio Online pour vos développements SharePointComment tirer partie de Visual Studio Online pour vos développements SharePoint
Comment tirer partie de Visual Studio Online pour vos développements SharePoint
 
Fowa Miami 09 Cloud Computing Workshop
Fowa Miami 09 Cloud Computing WorkshopFowa Miami 09 Cloud Computing Workshop
Fowa Miami 09 Cloud Computing Workshop
 
Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)Scaling Machine Learning from zero to millions of users (May 2019)
Scaling Machine Learning from zero to millions of users (May 2019)
 
PHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in phpPHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in php
 

Recently uploaded

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Recently uploaded (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

eda-genai-believe-in-serverless-talk.pdf

  • 1. Building an Event Driven Gen-AI powered video summarizer API
  • 2. I am Zied Ben Tahar AWS Community Builder / Principal Architect @ Aviv Group / You can find me: Blog: https:/ /zied-ben-tahar.medium.com/ Linkedin: https:/ /www.linkedin.com/in/zied-ben-tahar-5200913/ Twitter : @wow_sig Bonjour !
  • 3. Agenda 01 02 04 03 05 GenAI on AWS - A quick overview Let’s build: AI powered Video summarizer Video summarizer - Scaling the architecture with EDA Api-fying the solution Demo Time
  • 4. GenAI on AWS - A quick overview 01
  • 5. Amazon Bedrock A serverless Gen-AI service Managed FMs Access to foundation models from leading partners on AI space Customizable Capabilities to adapt and privately customize FMs to domain specific problem Serverless No infrastructure to manage, Good integration with AWS ecosystem
  • 6. Gen AI Addressing some LLM challenges How to solve issues related to knowledge cut-off, hallucinations, blackbox ? ➔ Prompt engineering ◆ Optimizing prompts for specific tasks and guiding the model to yield the expected answers ➔ Retrieval augmented generation (RAG) ◆ Allow the model to access to external/up to date information to enhance response. ➔ Model fine-tuning ◆ Additional training of a pre-trained model on task-specific datasets.
  • 7. Anthropic’s Claude (2.1) 3 A serious alternative to GPT 4 ➔ Handles large context windows, 200K tokens (150k words) ➔ Able to maintain factual consistency when processing long prompts ➔ Can generate code and structured data ➔ Vision capabilities (new with Sonnet) Claude 3 benchmarks
  • 8. Let’s build! AI powered Video summarizer 02
  • 9. Solution overview Summarization workflow, first attempt Extract/Get Video transcript Summarize & extract structured data Generate audio from summary Profit ! 01 02 03 04
  • 10. Solution overview Serverless Summarizer workflow, a first “naïve” approach
  • 11. Deep dive Getting video transcripts Konami’ing your way into video transcripts 🔗 https:/ /github.com/Kakulukian/youtube-transcript ~ Fast, as it retrieves already generated transcripts from youtube ~ …but it’s brittle, based on a non official youtube API, does not work on some languages
  • 12. Deep dive Prompting techniques with Claude You are a video transcript summarizer. Here is a video transcription that you must summarize: <transcript>{{transcript}}</transcript> Summarize this transcript in a third person point of view in 10 sentences. Identify the speakers and the main topics of the transcript and add them in the output as well. Do not add or invent speaker names if you not able to identify them. You must output the summary JSON format conforming to this JSON schema: { "type": "object", "properties": { "speakers": { "type": "array", "items": { "type": "string" } }, "topics": { "type": "string" }, "summary": { "type": "array", "items": { "type": "string" } } } } ➔ Clear without ambiguity, no room for assumptions ➔ Don’t be polite, be straightforward ➔ Concrete example/ schema for structured outputs
  • 13. Deep dive Putting words in Claude’s mouth For structured output, we’ll need to give hints to the Assistant by prefilling its response With Claude’s Completion API (Legacy) Message API
  • 14. Deep dive Putting words in Claude’s mouth ❌ ✅
  • 15. Deep dive Invoking the model - state machine definition 1- Generate model parameters and write to an s3 bucket 2- Direct invoke to claude model 3- Making sure that JSON is well formed
  • 17. Can we do better ?
  • 18. A better solution Enters Amazon Transcribe ~ A more stable solution that supports many languages and more media types ~ Some important architectural side effects: The system is now event based and asynchronous
  • 19. This is nice…but how can I scale my architecture ?
  • 21. Scaling the architecture Orchestration or Choreography ? ~ Orchestration: Greater control over complex workflows. Complex to evolve when collaborating with external contexts ~ Choreography: Flexible and and easier to extend but harder to Observe
  • 23. Scaling the architecture Let’s try with Choreography ~ Extensible: Tasks can be added or removed without impacting the entire system ~ Resilient: If a task fails, it does not impact the entire application
  • 24. Scaling the architecture Scatter-gather pattern ~ Aggregator: A central component handling the events from the scatter phase.
  • 27. Scaling the architecture Aggregator pattern can be hard to tame ● What if… ○ a task fails ○ or never publishes the expected event ● How to make sure that failure is handled properly ? ● How to monitor the overall progress of a given Job ? ● How to make sure that my system is observed properly
  • 29. Scaling the architecture Orchestration or Choreography…Why not both ?
  • 31. APIs, like diamonds, are forever Joshua Bloch
  • 32. Apify-ing the solution Principles on designing good “public” Event-Driven APIs ● Be conservative on what you send (Postel’s Law) ● Be mindful of the observed behavior of your APIs (Hyrum’s law) ● Make clear distinction between your private and public events ● Adopt a sane versioning strategy ● Adopt standards: CloudEvents & AsyncAPI
  • 33. Apify-ing the solution The awesomest GenAI powered video summarizer