More Related Content Similar to Vectors are the new JSON in PostgreSQL (SCaLE 21x) (20) More from Jonathan Katz (14) Vectors are the new JSON in PostgreSQL (SCaLE 21x)1. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Vectors are the new JSON
Jonathan Katz
(he/him/his)
Principal Product Manager – Technical
AWS
2. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
{
"id": 5432,
"name": "PostgreSQL",
"description": "World's most advanced open source
relational database",
"supportedVersions": [16, 15, 14, 13, 12]
}
3. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
{
"id": 5432,
"name":
"PostgreSQL",
"description":
"World's most advanced
open source relational
database",
"supportedVersions":
[16, 15, 14, 13, 12]
}
id 5432
name PostgreSQL
description world's most...
supportedVersions [16,15,14,13,12]
4. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
{
"id": 5432,
"name":
"PostgreSQL",
"description":
"World's most advanced
open source relational
database",
"supportedVersions":
[16, 15, 14, 13, 12]
}
5. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Timeline of JSON storage
• 2000-2001: JSON invented
• 2004: AJAX model emerges in wider deployments
• 2006: RFC 4627 publishes JSON format
• 2006-2009: JSON-specific data stores emerge
• 2012: PostgreSQL adds support for JSON (text)
• 2013: ECMA-404 standardizes JSON
• 2014: PostgreSQL adds support for JSONB (binary)
• 2017: SQL/JSON standard published
• 2019: PostgreSQL adds SQL/JSON path language
• 2023: PostgreSQL adds SQL/JSON constructors and predicates
6. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
7. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
[0.5, 0.5]
8. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Magnitude
|| [0.5, 0.5] || = √ (0.52 + 0.52) = 0.70710
9. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Direction
Magnitude
[0.5, 0.5]
10. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
[0.5, 0.5, 0.5]
11. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
12. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Pre-trained on vast amounts of
unstructured data
Contain large number of parameters that make
them capable of learning complex concepts
Can be applied in a wide range of contexts
Customize FMs using your data for domain
specific tasks
Generative AI is powered
by foundation models
13. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Retrieval Augmented
Generation (RAG)
Configure FM to interact with
your data
A N S W E R
Q U E S T I O N
K N O W L E D G E
B A S E S
F O U N D A T I O N
M O D E L
How much does a blue
elephant vase cost?
Product catalog
Price data
A blue elephant vase
typically costs $19.99
Sorry, I don't know
14. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The role of vectors in RAG
Document
chunks
Embeddings
PDF
document
Database
User
Embeddings Foundational
model
1
4
Question
Question + Context
Response
2 3
5
6
7
15. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Challenges with vectors
• Time to generate embeddings
• Embedding size
• Compression
• Query time
Blue elephant vase
that can hold up to
three plants in it,
hand painted…
0.1234
0.1231
0.1232
0.9005
0.2489
1536 dimensions
4-byte floats
6152B => 6KiB
0.12310
0.24234
0.59405
0.23430
0.23432
0.20551
0.70543
0.20559
0.20559
0.70543
0.23432
0.24234
0.23430
0.12310
0.20551
0.59405
1,000,000 => 5.7GB
16. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Approximate nearest neighbor (ANN)
• Find similar vectors without
searching all of them
• Faster than exact nearest
neighbor
• “Recall” – % of expected results
Recall: 80%
17. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Questions for choosing a vector storage system
• Where does vector storage fit into my workflow?
• How much data am I storing?
• What matters to me: storage, performance, relevancy, cost?
• What are my tradeoffs: indexing, query time, schema design?
18. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
PostgreSQL as a vector store
19. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Why use PostgreSQL for vector searches?
• Existing client libraries work without modification
• Convenient to co-locate app + AI/ML data in same database
• PostgreSQL acts as persistent transactional store while working with
other vector search systems
20. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Native vector support in PostgreSQL
• ARRAY data type
§ Multiple data types (int4, int8, float4,
float8)
§ “Unlimited” dimensions
§ No native distance operations
– Can add using Trusted Language
Extensions + PL/Rust
§ No native indexing
• Cube data type
§ float8 values
§ Euclidean, Manhattan, Chebyshev
distances
§ K-NN GiST index – exact nearest
neighbor search
§ Limited to 100 dimensions
21. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is pgvector?
An open source extension that:
adds support for storage, indexing, searching, metadata with choice of distance
vector data type
Supports IVFFlat/HNSW indexing
Distance operators (<->, <=>, <#>)
Exact nearest neighbor (K-NN)
Approximate nearest neighbor (ANN)
Co-locate with embeddings
github.com/pgvector/pgvector
22. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Understanding pgvector performance
0%
10%
20%
30%
40%
50%
60%
70%
0
5000
10000
15000
20000
25000
30000
35000
20 40 80 200 400 800
Speedup
(%)
Transactions/Second
(TPS)
hnsw.ef_search
1536-dimensional vector HNSW search
db.r6g.16xlarge db.r7g.16xlarge Speedup (%)
23. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
pgvector distance operations
<->
Euclidean/L2
<=>
Cosine distance
<#>
Inner product
24. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How does pgvector index a vector?
0.0234
0.093
-0.9123
0.1055
Valid?
✅ Same dimensions?
✅ Magnitude > 0?
Normalized?
🛠 If not, normalize
0.0253
0.1007
-0.9880
0.1142
25. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Indexing methods: IVFFlat and HNSW
• IVFFlat
§ K-means based
§ Organize vectors into lists
§ Requires prepopulated data
§ Insert time bounded by # lists
• HNSW
§ Graph based
§ Organize vectors into
“neighborhoods”
§ Iterative insertions
§ Insertion time increases as data in
graph increases
26. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Which search method do I choose?
• Exact nearest neighbors: No index
• Fast indexing: IVFFlat*
• Easy to manage: HNSW
• High performance/recall: HNSW
27. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
pgvector strategies and best
practices
28. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Best practices for pgvector
Storage strategies
HNSW strategies
IVFFlat strategies
Filtering
Distributed queries
29. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
pgvector storage strategies
30. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Understanding TOAST in PostgreSQL
• TOAST (The Oversized-Attribute Storage Technique) is a mechanism
for storing data larger than 8KB
• By default, PostgreSQL “TOASTs” values over 2KB
• 510-dim 4-byte float vector
31. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
PostgreSQL column storage types
• PLAIN: Data stored inline with table
• EXTENDED: Data stored/compressed in TOAST table when threshold
exceeded
§ pgvector default before 0.6.0
• EXTERNAL: Data stored in TOAST table when threshold exceeded
§ pgvector default 0.6.0+
• MAIN: Data stored compressed inline with table
32. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Impact of TOAST on pgvector queries
Limit (cost=772135.51..772136.73 rows=10 width=12)
-> Gather Merge (cost=772135.51..1991670.17 rows=10000002 width=12)
Workers Planned: 6
-> Sort (cost=771135.42..775302.08 rows=1666667 width=12)
Sort Key: ((<-> embedding))
-> Parallel Seq Scan on vecs128 (cost=0.00..735119.34 rows=1666667
width=12)
128 dimensions
33. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Impact of TOAST on pgvector queries
Limit (cost=149970.15..149971.34 rows=10 width=12)
-> Gather Merge (cost=149970.15..1347330.44 rows=10000116 width=12)
Workers Planned: 4
-> Sort (cost=148970.09..155220.16 rows=2500029 width=12)
Sort Key: (($1 <-> embedding))
-> Parallel Seq Scan on vecs1536 (cost=0.00..94945.36 rows=2500029
width=12)
1,536 dimensions
34. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Strategies for pgvector and TOAST
• Use PLAIN storage
§ ALTER TABLE … ALTER COLUMN ... SET STORAGE PLAIN
§ Requires table rewrite (VACUUM FULL) if data already exists
§ Limits vector sizes to 2,000 dimensions
• Use min_parallel_table_scan_size to induce more parallel
workers
35. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Impact of TOAST on pgvector queries
Limit (cost=95704.33..95705.58 rows=10 width=12)
-> Gather Merge (cost=95704.33..1352239.13 rows=10000111 width=12)
Workers Planned: 11
-> Sort (cost=94704.11..96976.86 rows=909101 width=12)
Sort Key: (($1 <-> embedding))
-> Parallel Seq Scan on vecs1536 (cost=0.00..75058.77 rows=909101 width=12)
1,536 dimensions
SET min_parallel_table_scan_size TO 1
36. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
HNSW strategies
37. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
HNSW index building parameters
• m
§ Maximum number of bidirectional links between indexed vectors
§ Default: 16
• ef_construction
§ Number of vectors to maintain in “nearest neighbor” list
§ Default: 64
38. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an HNSW index
39. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an HNSW index
Layer 2
40. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an HNSW index
Layer 2
41. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an HNSW index
Layer 1
42. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an HNSW index
Layer 0
43. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
HNSW query parameters
• hnsw.ef_search
§ Number of vectors to maintain in “nearest neighbor” list
§ Must be greater than or equal to LIMIT
44. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 2
45. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 2
46. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 1
47. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 1
48. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 0
49. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an HNSW index
Layer 0
50. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Best practices for building HNSW indexes
• Default values (M=16,ef_construction=64) usually work
• (pgvector 0.5.1) Start with empty index and use concurrent writes to
accelerate builds
§ INSERT or COPY
• pgvector (0.6.0+) use parallel builds on a full table
§ max_parallel_maintenance_workers
51. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Impact of parallelism on HNSW build time
0
100
200
300
400
500
600
700
800
900
1000
1 2 4 8 16 32 64
Time
(s)
Clients / Workers
HNSW index build (1,000,000 128-dim vectors)
Parallel Build Concurrent Inserts
52. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Choosing m and ef_construction (serial)
0.82
0.84
0.86
0.88
0.9
0.92
0.94
0
50
100
150
200
250
32 64 128 256 512
Recall
Index
build
(min)
ef_construction
1.1MM 1536-dim vectors, m=16, ef_search=20
Build Time (min) Recall
53. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Choosing m and ef_construction (parallel)
0.8
0.82
0.84
0.86
0.88
0.9
0.92
0.94
0
1
2
3
4
5
6
7
8
9
10
32 64 128 256 512
Recall
Index
build
(min)
ef_construction
1.1MM 1536-dim vectors, m=16, ef_search=20,
max_maintenance_workers=64
Build time Recall
54. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Choosing m and ef_construction
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0
100
200
300
400
500
600
700
800
16 24 36 48
Recall
Index
build
(min)
m
1MM 960-dim vectors
Build Time (min) Recall
55. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Performance strategies for HNSW queries
• Index building has biggest impact on performance/recall
§ More time spent optimizing build increases likelihood of finding best candidates
in a neighborhood
• Increasing hnsw.ef_search increases recall, decreases performance
• Set shared_buffers to a value that keeps data (index) in memory
56. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
IVFFlat strategies
57. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
IVFFlat index building parameters
• lists
§ Number of “buckets” for organizing vectors
§ Tradeoff between number of vectors in bucket and relevancy
CREATE INDEX ON products
USING ivfflat(embedding) WITH (lists=3);
58. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an IVFFlat index
59. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building an IVFFlat index: Assign lists
60. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an IVFFlat index
SET ivfflat.probes TO 1
SELECT id FROM products ORDER BY $1 <-> embedding LIMIT 3
61. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Querying an IVFFlat index
SET ivfflat.probes TO 2
SELECT id FROM products ORDER BY $1 <-> embedding LIMIT 3
62. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Performance strategies for IVFFlat queries
• Increasing ivfflat.probes increases recall, decreases performance
• Lowering random_page_cost on a per-query basis can induce index
usage
• Set shared_buffers to a value that keeps data (index) in memory
• Increase work_mem on a per-query basis
63. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Best practices for building IVFFlat indexes
• Choose value of lists to maximize recall but minimize effort of search
§ < 1MM vectors: # vectors / 1000
§ > 1MM vectors: √(# vectors)
• May be necessary to rebuild when adding/modifying vectors in index
• Use parallelism to accelerate build times
64. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How parallelism works with pgvector IVFFlat
Vectors in
table
List
List
List
Assign to list
Sequential scan
65. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How parallelism works with pgvector IVFFlat
Vectors in
table
List
List
List
Assign to list
Parallel scan
Assign to list
Parallel scan
Assign to list
Parallel scan
66. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using parallelism to accelerate IVFFlat builds
0
20
40
60
80
100
120
140
Serial Parallel
Time
(s)
1MM 768-dim, lists=1000
67. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
pgvector filtering strategies
68. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is filtering?
SELECT id
FROM products
WHERE products.category_id = 7
ORDER BY :'q' <-> products.embedding
LIMIT 10;
69. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How filtering impacts ANN queries
• PostgreSQL may choose to not use the index
• Uses an index, but does not return enough results
• Filtering occurs after using the index
70. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Do I need an HNSW/IVFFlat index for a filter?
• Does the filter use a B-Tree (or other index) to reduce the data set?
• How many rows does the filter remove?
• Do I want exact results or approximate results?
71. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Filtering strategies
• Partial index
• Partition
CREATE INDEX ON docs
USING hnsw(embedding vector_l2_ops)
WHERE category_id = 7;
---
CREATE TABLE docs_cat7
PARTITION OF docs
FOR VALUES IN (7);
CREATE INDEX ON docs_cat7
USING hnsw(embedding vector_l2_ops);
72. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Distributed pgvector queries
73. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
When does it make sense to distribute vector data?
• Not enough memory for
workload to meet latency target
• Network overhead must be
acceptable
• Can manage complexity of multi-
node system
74. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Setup foreign data wrapper
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER vectors1
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
async_capable 'true', extensions 'vector', dbname 'vectors', host
'<NODE1>'
);
CREATE SERVER vectors2
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
async_capable 'true', extensions 'vector', dbname 'vectors', host
'<NODE2>'
);
75. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Setup foreign tables
CREATE TABLE vectors (
id uuid,
node_id int,
embedding vector(768)
) PARTITION BY LIST(node_id);
CREATE FOREIGN TABLE vectors_node1 PARTITION OF vectors
FOR VALUES IN (1)
SERVER vectors1
OPTIONS (schema_name 'public', table_name 'vectors');
CREATE FOREIGN TABLE vectors_node2 PARTITION OF vectors
FOR VALUES IN (2)
SERVER vectors2
OPTIONS (schema_name 'public', table_name 'vectors');
76. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Example EXPLAIN output
Limit (cost=200.01..206.45 rows=10 width=28) (actual time=18.171..18.182 rows=10
loops=1)
-> Merge Append (cost=200.01..3222700.01 rows=5000000 width=28) (actual
time=18.169..18.179 rows=10 loops=1)
Sort Key: (('$1'::vector <=> vectors.embedding))
-> Foreign Scan on vectors_node1 vectors_1 (cost=100.00..1586350.00
rows=2500000 width=28) (actual time=8.607..8.609 rows=2 loops=1)
-> Foreign Scan on vectors_node2 vectors_2 (cost=100.00..1586350.00
rows=2500000 width=28) (actual time=9.559..9.566 rows=9 loops=1)
Planning Time: 0.298 ms
Execution Time: 19.355 ms
77. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Looking ahead
78. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
pgvector roadmap
• Performance improvements for massively parallel HNSW builds
(completed)
• Enhanced index-based filtering/HQANN (in progress)
• More data types per dimension (float2, uint8) (in progress)
§ Scalar quantization via expression indexes
• Product quantization
• Parallel query
79. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Conclusion
• Like JSON, a vector is just a data type.
• Primary design decision: query performance and recall
• Determine where to invest: storage, compute, indexing strategy
• Plan for today and tomorrow: pgvector is rapidly innovating
80. © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you!
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Please complete the session
survey in the mobile app
Thank you!
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Please complete the session
survey in the mobile app
Jonathan Katz
jkatz@amazon.com
@jkatz05