James Luan
April 16, 2024
Leverage Zilliz Serverless
Up to 50X Savings on Your Vector Storage Cost
Top Concerns for Vector Database Users
Feedback gathered from 1,000 GenAI developers in the Milvus community, 2024.
The Disparity Between POC and Reality
A Quick RAG Demo
Things to think about in production
Cost per user?
Monitor search quality and latency?
Handle multi-tenancy?
Handle bursty traffic?
Availability: hardware failures and software bugs
Recap - Zilliz Cloud Dedicated Cluster Architecture
Advantages:
• Disaggregated storage and computation for enhanced flexibility.
• Elastic resource pool for batch workloads.
• Isolation via Kubernetes namespace
• Optimized performance through data caching.
Disadvantages:
• High cost to start, potentially exceeding $100 monthly, even with no searches.
• Lack of ability to scale with workloads.
• Performance degradation as data volume increases.
• Elevated storage expenses, particularly when storing
everything in main memory.
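To make the memory-cost point concrete, here is a rough back-of-the-envelope sketch. It assumes float32 vectors (4 bytes per value), which the slides do not state, and it counts only the raw vectors, not index structures or metadata. The 10M x 768-dimension workload matches the one used in the cost evaluation later in the deck; `raw_vector_bytes` is a hypothetical helper name.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of the vectors alone (assumes float32 by default).

    Index structures, metadata, and replicas are NOT included.
    """
    return num_vectors * dim * bytes_per_value


size = raw_vector_bytes(10_000_000, 768)
print(f"{size / 1e9:.2f} GB")  # 30.72 GB
```

Roughly 30 GB of RAM for the vectors alone is the kind of footprint that drives the storage cost of an all-in-memory deployment.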
Zilliz Cloud Serverless
• Pay as you go
• Up to 100X cheaper
• Multi-tenancy support
• Dynamic scaling
• Runs on all clouds
• Monitoring and metrics
Zilliz Cloud Serverless Architecture
• Logical clusters and auto scaling
• Disaggregated streaming and historical data
• Tiered storage
• Multi-tenancy and hot-cold separation
Logical clusters and auto scaling
• Proxy Node:
  • Routes traffic, enforces quotas, and throttles requests; scales according to CPU and network bandwidth.
• Streaming Nodes:
  • Persist streaming data and serve streaming data search; scale based on write queue time and CPU/memory usage.
• Query Nodes:
  • Handle historical data search requests; scale according to search queue time and CPU/memory utilization.
• Resource Pool:
  • Executes batch jobs; scales based on the number of jobs.
• Storage:
  • Stores metadata and logs; scales with data volume and queue time.
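The per-component scaling signals above can be illustrated with a minimal sketch. This is not Zilliz's actual autoscaler: the `NodeMetrics` fields, the target thresholds, and the proportional rule (similar in spirit to the Kubernetes Horizontal Pod Autoscaler formula) are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical metrics for one node group (e.g. query nodes)."""
    queue_time_ms: float  # average time a request waits in the queue
    cpu_util: float       # 0.0 to 1.0
    mem_util: float       # 0.0 to 1.0


def desired_replicas(current: int, m: NodeMetrics,
                     target_queue_ms: float = 50.0,  # assumed target
                     target_util: float = 0.7,       # assumed target
                     max_replicas: int = 64) -> int:
    """Scale proportionally to the most saturated signal."""
    pressure = max(m.queue_time_ms / target_queue_ms,
                   m.cpu_util / target_util,
                   m.mem_util / target_util)
    # Never drop below one replica or exceed the configured ceiling.
    return max(1, min(max_replicas, math.ceil(current * pressure)))


# Queue time is twice the target, so the group doubles.
print(desired_replicas(4, NodeMetrics(100.0, 0.35, 0.35)))  # 8
# Every signal is at or below half its target, so the group halves.
print(desired_replicas(4, NodeMetrics(25.0, 0.1, 0.1)))     # 2
```

Taking the maximum across signals means a node group scales out as soon as any one resource is saturated, and scales in only when all of them are idle.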
Disaggregate streaming and historical data
Tiered Storage
Multi-Tenancy and Hot-Cold Separation
Optimizations:
• Multi-layer caching
• Prune by partitions/partition keys
• Prune by segment centroids
• Prune by filter pushdown
• Pre-warm
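Two of the pruning ideas above, partition keys and segment centroids, can be sketched as follows. This is an illustrative sketch only, not the Zilliz implementation: the `Segment` type, the `candidate_segments` function, and the use of Euclidean distance are assumptions. Note that centroid pruning is approximate, since a true nearest neighbor can live in a segment whose centroid is far from the query.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """Hypothetical segment descriptor kept in metadata."""
    segment_id: str
    partition_key: str          # e.g. a tenant id
    centroid: Sequence[float]   # mean of the vectors in the segment


def candidate_segments(segments: List[Segment], partition_key: str,
                       query: Sequence[float], top_n: int) -> List[str]:
    """Pick the segments worth searching for one query."""
    # Step 1: partition-key pruning drops other tenants' segments entirely.
    same_partition = [s for s in segments if s.partition_key == partition_key]
    # Step 2: centroid pruning keeps only the top_n closest segments.
    ranked = sorted(same_partition, key=lambda s: math.dist(query, s.centroid))
    return [s.segment_id for s in ranked[:top_n]]


segments = [
    Segment("seg-a", "tenant-1", [0.0, 0.0]),
    Segment("seg-b", "tenant-1", [5.0, 5.0]),
    Segment("seg-c", "tenant-1", [1.0, 1.0]),
    Segment("seg-d", "tenant-2", [0.9, 1.2]),
]
print(candidate_segments(segments, "tenant-1", [0.9, 1.2], top_n=2))
# ['seg-c', 'seg-a']
```

Only the surviving segments need to be loaded from cold storage or searched, which is what makes pruning valuable when most data is not in memory.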
Performance and Cost Evaluation
Single Tenant Performance
Storage cost per 10M 768-dim vectors:
• Performance: $915
• Capacity: $228
• Serverless: $16
Up to 50x immediate savings for non-performance-critical applications
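The "up to 50x" figure can be checked against the three storage costs on this slide. The sketch below assumes the figures map, in order, to the Performance, Capacity, and Serverless options for 10M 768-dimension vectors; the slide does not state the billing period, so none is assumed here.

```python
# Storage cost per 10M 768-dim vectors, in USD (pairing inferred from slide order).
costs = {"Performance": 915, "Capacity": 228, "Serverless": 16}


def savings_ratio(tier: str) -> float:
    """How many times cheaper Serverless is than the given tier."""
    return costs[tier] / costs["Serverless"]


for tier in ("Performance", "Capacity"):
    print(f"{tier} vs Serverless: {savings_ratio(tier):.2f}x")
# Performance vs Serverless: 57.19x
# Capacity vs Serverless: 14.25x
```

Under that reading, the headline saving comes from the comparison with the Performance option; against the Capacity option the ratio is closer to 14x.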
Roadmap for Zilliz Cloud Serverless
Zilliz Serverless Beta (2024.5)
• GCP
• 100M data limit
• 10-second cold latency
Zilliz Serverless GA (2024.7)
• AWS, GCP
• 100M data limit per single tenant
• 3-second cold latency
Zilliz Serverless 2.0 (2024.10)
• AWS, GCP, Azure
• No limit on data volume
• 1-second cold latency
You get free credits for the first million data!
We are seeking private beta users. Please contact us if you are interested!
© Copyright 9/27/23 Zilliz
Q&A
