Streamline & Secure LLM Traffic Using APISIX AI Gateway
Yilia Lin, 3 July, 2025, API Days Munich
Agenda
01 Apache APISIX Overview
02 APISIX AI Gateway Overview
03 Proxy Multi-LLMs and Token-Based Rate Limiting
04 Q&A
About Speaker
• Apache APISIX Committer
• Technical Writer at API7.ai
• LinkedIn: linkedin.com/in/yilialin/
• GitHub: github.com/Yilialinn
Yilia Lin
Apache APISIX Overview
01
Apache APISIX Overview
Donated to Apache Software Foundation by API7.ai in 2019
Ultra High-Performance: > 23,000 single-core QPS
Low Latency: < 0.6 ms average delay
Lightweight Architecture: Decoupled control plane and data plane
High Scalability: >100 open-source plugins
Open-Source without Vendor Lock-in: Apache License 2.0
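
To illustrate the plugin model, here is a minimal route definition using the built-in `limit-count` plugin, as it would be sent to the APISIX Admin API. The upstream node is a placeholder; check the current route schema before using it.

```json
{
  "uri": "/anything/*",
  "plugins": {
    "limit-count": {
      "count": 100,
      "time_window": 60,
      "key": "remote_addr",
      "rejected_code": 429
    }
  },
  "upstream": {
    "type": "roundrobin",
    "nodes": { "httpbin.org:80": 1 }
  }
}
```

Because every capability is a plugin attached to a route like this, the same mechanism later carries the AI-specific plugins.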
APISIX AI Gateway Overview
02
The Rise of AI and New Challenges
AI Application Characteristics
High-concurrency LLM Services
Token-based Pricing Model
Dynamic Scalability
Content Sensitivity
New Challenges
• Traffic Governance
• Cost Optimization
• Multi-Version Management
• Content Security
APISIX AI Gateway Features
AI plugins
APISIX AI Gateway Architecture
More Resources
• https://apisix.apache.org/docs/apisix/plugins
• https://docs.api7.ai/hub
APISIX AI Gateway Characteristics
Open-Source
Out-of-the-box
High Scalability
High Stability
High Security
Practical Application of APISIX AI Gateway
03
Proxy Multi-LLMs and Implement Token-Based Rate Limiting
Workflow
Configure Multi-LLMs and Implement Token-Based Rate Limiting
• demo: https://app.storylane.io/share/cjpfweudrq1n
• doc: https://docs.api7.ai/hub/ai-proxy-multi#configure-instance-priority-and-rate-limiting
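
Following the linked ai-proxy-multi documentation, a route combining both plugins might look roughly like the sketch below. Field names and values are illustrative and should be verified against the current plugin schemas; the API keys and models are placeholders.

```json
{
  "uri": "/v1/chat/completions",
  "plugins": {
    "ai-proxy-multi": {
      "instances": [
        {
          "name": "openai-instance",
          "provider": "openai",
          "priority": 1,
          "weight": 1,
          "auth": { "header": { "Authorization": "Bearer <OPENAI_API_KEY>" } },
          "options": { "model": "gpt-4o" }
        },
        {
          "name": "deepseek-instance",
          "provider": "deepseek",
          "priority": 0,
          "weight": 1,
          "auth": { "header": { "Authorization": "Bearer <DEEPSEEK_API_KEY>" } },
          "options": { "model": "deepseek-chat" }
        }
      ]
    },
    "ai-rate-limiting": {
      "instances": [
        { "name": "openai-instance", "limit": 10000, "time_window": 60 }
      ],
      "limit_strategy": "total_tokens"
    }
  }
}
```

With this shape, the higher-priority OpenAI instance serves traffic until it consumes its token budget for the window, after which requests fall back to the DeepSeek instance.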
Thank You!
Yilia Lin
LinkedIn: yilialin
GitHub: Yilialinn


Editor's Notes

  • #1 Good day, everyone! I'm excited to talk about how we can streamline and secure LLM traffic using APISIX AI Gateway. Why should you care about this? AI applications are growing explosively. If you're building AI applications, you're probably dealing with multiple LLM providers, worrying about API costs spiraling out of control, and concerned about security. Today, I'll show you how to solve all these challenges with a single, open-source solution.
  • #2 Here's what we'll cover in the next 25 minutes: - Apache APISIX Overview - APISIX AI Gateway Overview - Demo - proxy multiple LLMs and token-based rate limiting - 5 minutes for your questions at the end Let's dive in!
  • #3 First, let me introduce myself. I'm Yilia Lin, Apache APISIX Committer and Technical Writer at API7.ai. I'm not an engineer but a language learner, and I'm working on content marketing. You can find me on LinkedIn and GitHub. I'm always happy to connect and discuss API gateway technologies.
  • #4 Let me start with the foundation - Apache APISIX
  • #5 - Apache APISIX was donated to the Apache Software Foundation by API7.ai in 2019, and it has become one of the fastest-growing API gateway projects in the cloud-native ecosystem. - It has ultra-high performance: over 23,000 single-core QPS with less than 0.6 ms average latency. - It is a lightweight API gateway with a decoupled control plane and data plane, which means you can scale your data processing independently from your configuration management. - APISIX is highly extensible, with over 100 open-source plugins covering authentication, monitoring, traffic management, security, and more. - It is completely open-source under the Apache License 2.0, with no vendor lock-in, so you can customize it for your specific needs without worrying about licensing restrictions. Here are some APISIX users, including Zoom, KFC, McDonald's, SHEIN, and NASA, covering e-commerce, catering, financial services, electronics, and aerospace.
  • #6 Now, let's see how we've extended APISIX specifically for AI use cases.
  • #7 We're witnessing an incredible rise in AI applications, but with it come unique challenges that traditional API gateways were not designed to handle. AI applications share some characteristics: - LLM services must handle high-concurrency requests with highly variable response times. - Pricing is based on tokens consumed, not the number of requests. - Systems must scale up or down quickly, as traffic can spike suddenly. - Content sensitivity: both input prompts and output responses require filtering for content safety. These characteristics bring new challenges: - Traffic governance gets tricky: traditional load balancers just distribute requests randomly, but LLM instances aren't interchangeable. - Cost optimization: probably the biggest pain point. Since LLM APIs charge by tokens, not requests, costs can vary wildly from cheap to sky-high. - Multi-version headaches: OpenAI alone has GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o, and more. How do you smoothly shift traffic between them or A/B test different models? - Content security: how do you ensure prompts aren't malicious, and how do you prevent sensitive data from leaking in responses? Traditional API security doesn't cover prompt injection attacks or content moderation. This is exactly where APISIX AI Gateway comes in.
  • #8 Let me walk you through the features of APISIX AI Gateway: - Multi-LLM support: route between multiple LLM providers to avoid vendor lock-in. - Token-based rate limiting: crucial for preventing API abuse and optimizing cost management. - AI RAG (retrieval-augmented generation): combine an enterprise knowledge base to improve the generated output. - Token usage observability: track token consumption to prevent API abuse and excessive billing. - Retry and fallback to other LLM services, ensuring service stability and quality. - Security: prompt filtering and content moderation to keep AI applications compliant. This is crucial for enterprises, as malicious prompts can lead to reputational harm or data breaches.
  • #9 Now, let's take a look at the architecture of the APISIX AI gateway. APISIX AI Gateway builds on the solid foundation of Apache APISIX. All AI-specific features are implemented as plugins. This architecture can maintain APISIX's modular and extensible features. You can see all the plugin documentation on the official website of APISIX or the API7 plugin hub.
  • #10 In summary, APISIX AI Gateway is - Fully open-source, without vendor lock-in - All the AI features are out-of-the-box, and you can also combine them according to your requirements - High Scalability: Effortlessly scale up or down, ensuring optimal performance even during traffic surges. - High Stability: Robust and reliable, minimizing downtime and ensuring consistent service delivery for your AI applications. - High Security: Equipped with advanced security measures to protect against threats and ensure the safe deployment of AI applications.
  • #11 Now let's see an example. I want to show you how to realize token-based rate limiting using AI plugins.
  • #12 First, let's look at the workflow of this example. In this case, we have two LLM instances: an OpenAI instance and a DeepSeek instance. We want to route traffic between them intelligently and implement token-based rate limiting to control costs. The workflow is: 1. A client sends a request to the APISIX AI Gateway. 2. APISIX passes the request to the ai-proxy-multi plugin to evaluate routing logic. 3. The ai-rate-limiting plugin checks whether the high-priority OpenAI instance's rate limit is exceeded. - If not, the request is sent to the OpenAI instance. - If it is, the request is forwarded to the low-priority DeepSeek instance. 4. The response from either OpenAI or DeepSeek is forwarded back through the ai-proxy-multi plugin and APISIX to the client. This logic ensures that requests are handled efficiently, using the high-priority instance when available and falling back to the low-priority one when rate limits are reached.
  • #13 OK, after understanding the workflow, let's see this demo.
  • #14 OK, that's all for my talk. Thank you for your attention! Feel free to connect with me afterward or reach out on LinkedIn and GitHub. Now, I'd love to hear your questions!
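
The priority-and-fallback routing described in note #12 can be sketched as a small Python simulation. This is illustrative pseudologic only, not APISIX internals; the instance names and token budgets are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """A simulated LLM backend with a per-window token budget."""
    name: str
    priority: int      # higher value = preferred instance
    token_limit: int   # tokens allowed per rate-limit window
    tokens_used: int = 0

    def has_budget(self, tokens: int) -> bool:
        return self.tokens_used + tokens <= self.token_limit


def route(instances, tokens_needed):
    """Pick the highest-priority instance whose token budget
    for the current window is not yet exhausted."""
    for inst in sorted(instances, key=lambda i: i.priority, reverse=True):
        if inst.has_budget(tokens_needed):
            inst.tokens_used += tokens_needed
            return inst.name
    return None  # every instance is rate-limited


openai = Instance("openai-instance", priority=1, token_limit=1000)
deepseek = Instance("deepseek-instance", priority=0, token_limit=5000)

print(route([openai, deepseek], 800))  # OpenAI still has budget
print(route([openai, deepseek], 800))  # OpenAI exhausted, falls back to DeepSeek
```

The first request lands on the OpenAI instance; the second exceeds its 1000-token window and falls back to DeepSeek, mirroring steps 3 and 4 of the workflow.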