Build Fast, Scale Faster: Milvus vs. Zilliz Cloud
June 2025
© Copyright 2025 Zilliz
Agenda
01 About Zilliz
02 Zilliz Cloud Offerings
03 Milvus vs Zilliz Cloud
04 Q&A
The Forrester Wave™: Vector Database Providers, Q3 2024
Zilliz is the right partner for your Vector Database needs.
Zilliz Offerings
Milvus (self-managed software)
Most widely adopted open-source vector database
Zilliz Cloud (fully managed service)
AI-powered search that is performant and scales
Zilliz Cloud BYOC (bring your own cloud)
For private VPCs
Set up once: common API across all products regardless of architecture
Coming Soon!
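The "common API" point can be illustrated with a minimal sketch. It assumes the `pymilvus` client library and its `MilvusClient(uri=..., token=...)` constructor; the URIs, the token, and the helper names `TARGETS`, `connection_kwargs`, and `connect` are placeholders invented for this illustration, not values from the deck.

```python
from typing import Dict

# Hypothetical deployment targets; the URIs and the token are placeholders.
TARGETS: Dict[str, Dict[str, str]] = {
    "milvus": {"uri": "http://localhost:19530"},
    "zilliz_cloud": {
        "uri": "https://YOUR-CLUSTER-ENDPOINT",
        "token": "YOUR_API_KEY",
    },
}


def connection_kwargs(target: str) -> Dict[str, str]:
    """Return the connection settings for one deployment target."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    return dict(TARGETS[target])


def connect(target: str):
    """Open a client; only the connection settings differ per target."""
    from pymilvus import MilvusClient  # lazy import; requires pymilvus

    return MilvusClient(**connection_kwargs(target))
```

Under this assumption, application code written against the returned client stays the same whether `connect("milvus")` or `connect("zilliz_cloud")` is used.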
Introducing Zilliz Cloud
SUPERIOR AI-POWERED SEARCH
Cardinal Search Engine
3X faster than Milvus, AI-powered AutoIndex, dynamic search strategies, zero manual tuning
HIGH AVAILABILITY & SCALE
Cloud Native Database
Distributed architecture for stability and cost-efficient scalability: multi-AZ, multi-cloud, and cross-cloud availability.
UNCOMPROMISING DATA SECURITY
Enterprise Ready Platform
Battle-tested: delivering reliable performance and enterprise-grade security
Cardinal Search Engine | Built to Outperform
Why Cardinal Delivers Lightning-Fast Vector Search
Smart Index Execution
Cardinal blends graph, IVF, and other indexing strategies, automatically optimized for your data distribution and query patterns.
Data-Aware Optimization
Built on an immutable index architecture, Cardinal analyzes your data, constructs graphs suited to it, and dynamically tunes quantization for peak performance.
Early Termination
Cardinal speeds up simple queries using early termination. For complex datasets, it adapts search strategies to maintain high recall without compromising latency.
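To make the idea of early termination concrete, here is a generic sketch, not Cardinal's implementation (the deck does not describe it). It assumes an IVF-style layout where vectors are grouped into buckets with centroids, and it stops scanning once `patience` consecutive buckets fail to improve the current top-k. The function name, the `patience` parameter, and the data layout are all assumptions for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def l2(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_with_early_termination(
    query: Vector,
    buckets: List[Tuple[Vector, List[Tuple[int, Vector]]]],
    k: int,
    patience: int = 1,
) -> Tuple[List[int], int]:
    """Return (ids of the k nearest vectors found, number of buckets scanned)."""
    # Visit buckets from the nearest centroid to the farthest.
    ordered = sorted(buckets, key=lambda b: l2(query, b[0]))
    heap: List[Tuple[float, int]] = []  # max-heap via negated distance
    stale = 0    # consecutive buckets that did not improve the top-k
    scanned = 0
    for _, members in ordered:
        scanned += 1
        improved = False
        for vec_id, vec in members:
            d = l2(query, vec)
            if len(heap) < k:
                heapq.heappush(heap, (-d, vec_id))
                improved = True
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, vec_id))
                improved = True
        stale = 0 if improved else stale + 1
        # Stop early once the top-k is full and has stopped improving.
        if len(heap) == k and stale >= patience:
            break
    ids = [vec_id for _, vec_id in sorted(heap, reverse=True)]
    return ids, scanned
```

An "easy" query whose neighbours all sit in the nearest bucket stops after a couple of buckets, while a larger `patience` trades latency for recall on harder data.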
State-of-the-Art Compression & Quantization
Cardinal's advanced compression and quantization algorithms store up to 2× more vectors in memory while reducing bandwidth usage, delivering faster performance and better memory efficiency than open-source Milvus.
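As a rough illustration of why quantization saves memory, the sketch below applies plain 8-bit scalar quantization to a single vector. This is a textbook technique chosen for illustration; the deck does not say which algorithms Cardinal uses, and the function names and the resulting compression ratio are not Cardinal's.

```python
from array import array
from typing import List, Tuple


def quantize_8bit(vec: List[float]) -> Tuple[bytes, float, float]:
    """Map each float to one byte using the vector's own min/max range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vec)
    return codes, lo, scale


def dequantize_8bit(codes: bytes, lo: float, scale: float) -> List[float]:
    """Recover approximate floats from the one-byte codes."""
    return [lo + c * scale for c in codes]


vec = [0.0, 0.25, 0.5, 1.0]
codes, lo, scale = quantize_8bit(vec)
restored = dequantize_8bit(codes, lo, scale)

# float32 storage needs 4 bytes per value; the codes need 1 byte per value.
float32_bytes = len(vec) * array("f").itemsize
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for the smaller footprint and lower memory bandwidth.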
Zilliz Cloud offers three CU types
Performance-optimized (Memory Based)
● RAM based, ~$5/GB/month
● Search QPS: 500-1500
● Search latency: sub-10 ms
● Ideal for real-time applications like generative AI, recommendation systems, chatbots, and more.
Capacity-optimized (Disk Based)
● EBS based, ~$0.10/GB/month
● Search QPS: 100-300-1500
● Search latency: tens of ms
● Ideal for large-scale unstructured data search, copyright detection, and identity verification.
Extended-capacity (Smart Tiers)
● S3 based, ~$0.023/GB/month
● Search QPS: 5-20
● Search latency: hundreds of ms
● Ideal for applications that need to store massive volumes of data at a low cost.
Read more: https://docs.zilliz.com/docs/cu-types-explained
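The per-GB figures on this slide can be turned into a back-of-the-envelope comparison. The sketch below only multiplies raw float32 vector size by the quoted storage prices; it ignores index overhead, replicas, and actual CU pricing, so treat the output as an order-of-magnitude illustration. The function names and the 100M-vector example are assumptions.

```python
from typing import Dict

# Approximate per-GB monthly storage prices quoted on the slide (USD).
PRICE_PER_GB_MONTH: Dict[str, float] = {
    "performance-optimized (RAM)": 5.0,
    "capacity-optimized (EBS)": 0.1,
    "extended-capacity (S3)": 0.023,
}


def raw_vector_gb(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of float32 vectors in GB (10**9 bytes), without index overhead."""
    return num_vectors * dim * bytes_per_value / 1e9


def monthly_storage_cost(num_vectors: int, dim: int) -> Dict[str, float]:
    """Storage-medium cost per tier for the raw vectors, rounded to cents."""
    size_gb = raw_vector_gb(num_vectors, dim)
    return {
        tier: round(size_gb * price, 2)
        for tier, price in PRICE_PER_GB_MONTH.items()
    }


# Hypothetical workload: 100 million 768-dimensional vectors (307.2 GB raw).
costs = monthly_storage_cost(100_000_000, 768)
```

For this hypothetical workload the raw vectors cost about $1,536/month in RAM, $30.72/month on EBS, and $7.07/month on S3, which is the gap the three CU types trade against QPS and latency.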
Ultra-fast Metadata Filtering
Enhanced Graph Connectivity
Inspired by the ACORN [1] strategy, Cardinal reinforces cross-cluster connections to eliminate isolated data "islands" during filtered searches, ensuring smoother graph traversal even under tight constraints.
Unified Graph-and-IVF Framework
For high-selectivity queries, tightly clustered buckets act as efficient detours within the graph, accelerating pathfinding and improving recall under complex filtering conditions.
Vectorized Scalar Filtering
Built on columnar storage and techniques from Meta Velox, Cardinal supports vectorized execution for scalar filters, significantly improving query throughput and efficiency even in brute-force search scenarios.
Comprehensive Scalar Indexing
Cardinal supports a full range of scalar indexing options, including Lucene-style inverted indexes, text match, bitmap, NGram, and structured JSON indexes, so diverse filtering requirements can be served quickly.
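A small sketch can show how a bitmap scalar index combines with vector search. It is a generic illustration under stated assumptions, not Cardinal's code: each distinct field value gets a bitset (a Python `int`), filter conditions are combined with bitwise AND, and a brute-force distance scan runs only over the surviving rows. The class and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


class BitmapIndex:
    """One bitmap (a Python int used as a bitset) per distinct field value."""

    def __init__(self, values: Sequence[Hashable]) -> None:
        self.bitmaps: Dict[Hashable, int] = defaultdict(int)
        for row, value in enumerate(values):
            self.bitmaps[value] |= 1 << row

    def equals(self, value: Hashable) -> int:
        """Bitset of rows whose field equals `value` (0 if none)."""
        return self.bitmaps.get(value, 0)


def rows_in(bitmap: int) -> List[int]:
    """Expand a bitset into the sorted list of row numbers it contains."""
    rows, row = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


def filtered_search(query, vectors, allowed: int, k: int) -> List[int]:
    """Brute-force nearest neighbours over only the rows that pass the filter."""

    def dist(row: int) -> float:
        return sum((q - x) ** 2 for q, x in zip(query, vectors[row]))

    return sorted(rows_in(allowed), key=dist)[:k]
```

For example, with a `color` index and a `year` index, `color.equals("blue") & year.equals(2024)` yields the candidate bitset in one bitwise operation before any distance is computed.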
Filter Performance Test
Hosts with a similar list price of about $1,000/month: Zilliz Cloud: 8cu-perf, Pinecone: p2.x8, Qdrant: 16c64g, Elasticsearch: 8c60g, OpenSearch (AWS ARM): 16c128g
Dataset: Cohere, 1M vectors, 768 dimensions
Benchmark: https://github.com/zilliztech/VectorDBBench
[1] Patel, Liana, et al. "ACORN: Performant and predicate-agnostic search over vector embeddings and structured data." Proceedings of the ACM on Management of Data 2.3 (2024): 1-27.
Squeezing Every Bit of Performance from Your Hardware
Leverage SVE2 and ARM Instruction Sets:
Beyond vector calculations, Zilliz Cloud applies vectorization to gather/scatter operations, bitsets, and table lookups, maximizing parallelism and boosting throughput on modern ARM architectures.
Capitalize on Graviton's Bandwidth Advantage:
Graviton processors offer higher memory bandwidth than traditional x86 chips. Zilliz Cloud optimizes algorithms to reduce bandwidth bottlenecks, unlocking better compute efficiency and performance.
GPU Support for Cost-Efficient Scale:
Zilliz works closely with NVIDIA to bring GPU-accelerated indexing to production environments. With the latest CAGRA index and L20 GPUs, our vector search achieves up to 3× better cost efficiency, making it ideal for large-scale, high-throughput workloads.
Graviton Performance Test (chart; y-axis: QPS)
Dataset: Cohere, 1M vectors, 768 dimensions
Benchmark: https://github.com/zilliztech/VectorDBBench
Host: Zilliz Cloud: 8cu-perf
Milvus Challenges
Operational Overhead
Continuous operations, patching, and active monitoring are required to adequately support AI applications.
Security and Compliance Risk
All configurations and data must be securely managed, fully protected, and compliant with industry standards.
Uptime and Reliability
Milvus requires deep expertise and significant effort to maintain enterprise performance and availability.
Managing Milvus | Real World Impact
Increased TCO
Bigger teams cost more. Milvus uses standard open-source indexes (HNSW, IVF, PQ/SQ), which require more hardware resources than optimized solutions.
Delayed Time-to-Value
Slower adoption of the latest innovations hinders the ability to meet or exceed business requirements.
Unplanned Downtime or Performance Challenges
Missed SLAs, inability to deliver on business requirements, lost revenue, reputational damage, and customer frustration.
The Cost of Bigger Teams
Eventually, more FTEs are required for AI infrastructure and vector database management to support business and application requirements.
Two Ways to Migrate
Zilliz Cloud offers two primary migration methods:
Via Endpoint
Connect directly to your self-hosted Milvus instance and migrate one database at a time. This approach allows a controlled, database-by-database migration and is ideal when fine-grained oversight is required.
Via Backup Files
Upload Milvus backup files to Zilliz Cloud and migrate multiple databases in parallel. This is faster and more efficient for large-scale migrations or when downtime is acceptable.
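Whichever method is used, a post-migration sanity check is a natural follow-up. The sketch below is not part of the Zilliz migration tooling; it is a hypothetical helper that compares per-collection row counts between source and target. `row_counts` assumes a `pymilvus` `MilvusClient` whose `list_collections()` and `get_collection_stats()` calls behave as in current releases, returning a `row_count` entry; treat that as an assumption to verify against your client version.

```python
from typing import Dict, List


def count_mismatches(source: Dict[str, int], target: Dict[str, int]) -> List[str]:
    """List collections whose row counts differ or that are missing on the target."""
    problems = []
    for name, rows in sorted(source.items()):
        if name not in target:
            problems.append(f"{name}: missing on target")
        elif target[name] != rows:
            problems.append(f"{name}: source={rows} target={target[name]}")
    return problems


def row_counts(client) -> Dict[str, int]:
    """Collect row counts per collection from a MilvusClient-like object."""
    return {
        name: int(client.get_collection_stats(collection_name=name)["row_count"])
        for name in client.list_collections()
    }
```

A typical use would be `count_mismatches(row_counts(source_client), row_counts(target_client))`, run once per migrated database; an empty list means the counts agree. Reported counts may lag behind recent writes, so a mismatch right after migration is a prompt to re-check, not proof of data loss.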
Thank You!

