2. Agenda
1 Vector Databases, About Pinecone
2 Pinecone Organization, Project and Client libraries
3 Index and Collection
4 Data – Insert, Update, Delete
5 Data - Fetch & Search
6 Other
3. What is a Vector Database
• Applications that involve large language models, generative AI, and semantic search rely on vector
embeddings, a type of data that represents semantic information.
• Embeddings are generated by AI models (such as Large Language Models) and have a large number
of attributes or features, making their representation challenging to manage. In the context of AI and
machine learning, these features represent different dimensions of the data that are essential for
understanding patterns, relationships, and underlying structures
• Traditional scalar-based databases can’t keep up with the complexity and scale of such data, making
it difficult to extract insights and perform real-time analysis.
• Vector databases like Pinecone offer optimized storage and querying capabilities for embeddings.
• First, use the embedding model to create vector embeddings for
the content we want to index.
• The vector embedding is inserted into the vector database, with some
reference to the original content the embedding was created from.
• When the application issues a query, we use the same embedding
model to create embeddings for the query, and use those embeddings
to query the database for similar vector embeddings. And as
mentioned before, those similar embeddings are associated with the
original content that was used to create them.
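The flow above can be sketched end to end in a few lines. This is a toy, self-contained model, not Pinecone code: `embed()` stands in for a real embedding model (here it just counts letter frequencies so the example runs deterministically), and the "database" is a plain list searched by cosine similarity.

```python
import math

def embed(text):
    # Stand-in embedder: a 26-dimensional letter-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# 1. Embed the content and insert it, keeping a reference to the original.
documents = ["the cat sat on the mat", "stock prices fell sharply"]
index = [{"id": i, "vector": embed(d), "content": d}
         for i, d in enumerate(documents)]

# 2. Embed the query with the same model and find the most similar vector.
query_vec = embed("a cat on a mat")
best = max(index, key=lambda rec: cosine(rec["vector"], query_vec))
print(best["content"])  # the cat sat on the mat
```

A real system would swap `embed()` for an embedding model and the list scan for a vector database query; the shape of the flow stays the same.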
4. About Pinecone
• Pinecone is a managed, cloud-native vector database (DBaaS).
• A few other examples of vector databases include Qdrant, Milvus, Chroma, and Weaviate.
Other databases like Postgres, Redis and SQLite provide extensions to handle vector data.
• Pinecone is currently one of the most popular vector databases.
• Suitable for storing vector embeddings (also called text embeddings) that provide long-term
memory to large language models. Applications involving large language models, generative AI,
and semantic search rely on vector embeddings.
• Vector embedding or text embeddings are a type of data that represents semantic
information. This information allows AI applications to gain understanding and maintain a
long-term memory that they can draw upon when executing complex tasks
• It helps in storing, querying, filtering and searching vector data.
• Operations are low latency and scale to billions of vectors
• Pinecone is a NoSQL datastore, so it is eventually consistent
• Provides a Mongo-like API
5. Pinecone object hierarchy
• Organization – An organization in Pinecone
is a higher-level construct that contains
multiple projects. So, it is a set of projects
that use the same billing.
• Project – Each organization contains one or
more projects that share the same
organization owners and billing settings.
• Index – An index is similar to a table in
relational databases. It contains records that
store vectors.
• Record – A record consists of an id, vector
data (array of float/ numbers) and metadata
in the form of key value pairs.
• Collection - Used to backup an Index (along
with associated data)
[Diagram: an Organization (billing; organization roles – owners and users) contains Projects 1..N; each Project contains Indexes 1..N and Collections 1..N; each Index contains Records (Id, Vector, Metadata).]
6. Pinecone Client in Python
• Install Pinecone client
pip install pinecone-client
• Import the Pinecone library and initialize the client by passing the Pinecone API key and the name of
the Pinecone environment.
import pinecone
import os
pinecone_api_key = os.environ.get('PINECONE_API_KEY')
env = os.environ.get('PINECONE_ENVIRONMENT')
pinecone.init(api_key=pinecone_api_key,
environment=env)
pinecone.version()
VersionResponse(server='2.0.11', client='2.2.4')
7. Organization in Pinecone
• A Pinecone organization is a set of projects that use the same billing.
• When an account is created in Pinecone, a default project is created. Additional projects can be
created from Settings > Projects.
8. Project in Pinecone
• Each Pinecone project contains a number of indexes and users. Only a user who belongs to the
project can access the indexes in that project. Each project also has at least one project owner.
• Create a project in the Pinecone web console by specifying a name, a cloud provider (GCP, AWS or
Azure), a deployment location, and a pod limit (the maximum number of pods that can be used in total).
• The whoami method can be used to retrieve the project id -
pinecone.whoami()
WhoAmIResponse(username='bd6e4bc', user_label='default', projectname='f638a37')
• The project name can also be retrieved using the Config property; the environment, log
level, API key, etc. can be retrieved the same way -
pinecone.Config.PROJECT_NAME
pinecone.Config.ENVIRONMENT
pinecone.Config.API_KEY
9. Index
• An index is the highest-level organizational
unit of vector data in Pinecone (like a table in
a relational database).
• Pinecone indexes store records – each
record can contain vector data, which is an
array of floating-point numbers.
• It accepts and stores vectors, serves queries
over the vectors it contains, and does other
vector operations over its contents.
• A Pinecone Project can contain many
Indexes. Each Index contains multiple
records.
[Diagram: Organization → Projects 1..N → Indexes 1..N → Records 1..N (Vector).]
10. Record
• A Pinecone Project can
contain many Indexes. Each
Index contains multiple
records.
• A record contains:
• an ID,
• Values – the vector, an
array of floats/
numbers,
• additional Metadata
(optional).
• As Pinecone is a NoSQL
vector database, defining
a schema is not required.
[Diagram: Organization → Projects 1..N → Indexes 1..N, each index holding records:]
ID | Values | Metadata
1 | [1.0, 2.0, …, 10.0] | {"key1": "val1", "key2": "val2"}
2 | [1.0, 2.0, …, 10.0] | {"key1": "val1", "key2": "val2"}
N | [1.0, 2.0, …, 10.0] | {"key1": "val1", "key2": "val2"}
11. Index – creating
• Create an index by passing the index name and the size/dimension of the vectors to be stored in the index.
pinecone.create_index(name="first-index", dimension=5)
• Additional parameters can be passed, such as the distance metric and the number of shards.
pinecone.create_index(name="second-index", dimension=5, metric="cosine", shards=1)
12. Index – creating
• The distance metric can be of three types –
o Euclidean - Used to calculate the straight-line distance between two data points. It is one of the most
commonly used distance metrics.
o Cosine - Often used to find similarities between different documents. This is the default value. The
advantage is that the scores are normalized to the [-1, 1] range.
o Dotproduct - Multiplies two vectors and tells you how similar they are. The more positive the result,
the closer the two vectors are in terms of their directions.
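A minimal sketch of the three metrics in plain Python (not the Pinecone implementation), to make the formulas concrete:

```python
import math

def euclidean(a, b):
    # Straight-line distance between two points: lower means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Unnormalized similarity: more positive means more aligned directions.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product of the normalized vectors, always in the [-1, 1] range.
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) *
                                math.sqrt(dot_product(b, b)))

a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(euclidean(a, b))          # about 3.74
print(dot_product(a, b))        # 28.0
print(cosine_similarity(a, b))  # about 1.0: same direction, different length
```

Note how cosine similarity ignores vector length: `b` is just `a` scaled by 2, so their cosine score is maximal even though their euclidean distance is not zero.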
• The number of pods and the pod type can also be specified –
pinecone.create_index(name="third-index", dimension=10, metric="cosine", shards=3,
pods=5, pod_type="p2")
13. Index – listing index, getting details and deleting
• List Pinecone indexes
pinecone.list_indexes()
['first-index']
• Get details of an index
pinecone.describe_index("second-index")
IndexDescription(name='second-index', metric='cosine', replicas=1, dimension=5.0,
shards=1, pods=1, pod_type='starter', status={'ready': True, 'state': 'Ready'},
metadata_config=None, source_collection='')
• Delete an index
pinecone.delete_index("first-index")
14. Index – scaling and configuring
• An index can be scaled up or down -
pinecone.scale_index(name='second-index', replicas=3)
• An index can be updated or reconfigured to a different pod type and replica count.
pinecone.configure_index(name='second-index', pod_type="s1", replicas=5)
• There are several pod types available –
o s1 – storage-optimized pods provide large storage capacity and lower overall costs, with slightly higher query
latencies than p1 pods. They are ideal for very large indexes with moderate or relaxed latency requirements.
o p1 - performance-optimized pods provide very low query latencies, but hold fewer vectors per pod than s1
pods. They are ideal for applications with low latency requirements (<100ms).
o p2 - the p2 pod type provides greater query throughput with lower latency. For vectors with fewer than 128
dimensions and queries where topK is less than 50, p2 pods support up to 200 QPS per replica and return
queries in less than 10ms. This means that query throughput and latency are better than s1 and p1.
o starter – used in the free plan.
• Get statistics about the index -
index.describe_index_stats()
{'dimension': 5, 'index_fullness': 9e-05, 'namespaces': {'': {'vector_count': 9}},
'total_vector_count': 9}
15. Insert data
• To insert, update, delete or perform any other operation, get a reference to the index created earlier -
index = pinecone.Index("second-index")
• Insert a single record using the upsert method. The record contains an ID, vector embeddings and
optional metadata -
index.upsert([("hello-world", [1.0, 2.234, 3.34, 5.6, 7.8])])
• Insert multiple records using the same operation, passing all the records as an array -
index.upsert(
[
("Bangalore", [1.0, 2.234, 3.34, 5.6, 7.8]),
("Kolkata", [2.0, 1.234, 3.34, 5.6, 7.8]),
("Chennai", [3.0, 5.234, 3.34, 5.6, 7.8]),
("Mumbai", [4.0, 6.234, 3.34, 5.6, 7.8])
])
• Insert multiple records with metadata -
index.upsert(
[
("Delhi", [1.0, 2.234, 3.34, 5.6, 7.8], {"type": "city", "sub-type": "metro"}),
("Pune", [2.0, 1.234, 3.34, 5.6, 7.8], {"type": "city", "sub-type": "non-metro"})
])
16. Update data – partial and full update
• Pinecone supports full and partial updates of records.
• A full update replaces both the vector values and the metadata, while a partial update changes either
the vector values or the metadata.
• To insert, update, delete or perform any other operation, get a reference to the index created earlier -
index = pinecone.Index("second-index")
• To partially update a record, use the update method.
The following code updates the vector values of the record -
index.update(id="hello-world", values=[2.0, 3.1, 6.4, 9.6, 11.8])
The following code updates the metadata of the record -
index.update(id="hello-world", set_metadata={"city1":"Blore", "city2":"Kolkata"})
• To fully update a record, use the same upsert method used earlier to insert the data -
index.upsert([("hello-world", [10.0, 20.1, 31.4, 55.6, 75.8], {"city1":"Blore",
"city2":"Kolkata", "city3":"Delhi"})])
17. Backup data using Collection
• Collections are used to back up an index. A collection is a static copy of your index that only
consumes storage.
• Create a collection by specifying the name of the collection and the name of the source index -
pinecone.create_collection(name="backup-collection", source="first-index")
pinecone.list_collections()
['backup-collection']
pinecone.describe_collection("backup-collection")
18. Collection – listing, get details and deleting
• All existing collections can be easily listed -
pinecone.list_collections()
['backup-collection']
• Get the details of a collection -
pinecone.describe_collection("backup-collection")
• Delete a collection -
pinecone.delete_collection("backup-collection")
19. Dense vector vs Sparse data
• Pinecone supports both dense vectors and sparse vectors.
• So far, we have only worked with dense vectors.
Index 1
ID | Dense Vector | Sparse Vector | Metadata
1 | [1.0, 2.0, …, 10.0] | indices: [1, 2], values: [10.0, 20.5] | {"key1": "val1", "key2": "val2"}
2 | [1.0, 2.0, …, 10.0] | indices: [1, 2], values: [10.0, 20.5] | {"key1": "val1", "key2": "val2"}
N | [1.0, 2.0, …, 10.0] | indices: [1, 2], values: [10.0, 20.5] | {"key1": "val1", "key2": "val2"}
Both a dense vector and a sparse vector can be part of the same record.
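The indices-plus-values form shown in the table can be illustrated with a small helper. This is a toy sketch, not the Pinecone client: it converts a mostly-zero dense vector into the sparse representation that stores only the non-zero positions and their values.

```python
# Toy sketch of the sparse representation: keep only the non-zero
# positions (indices) and their values. This is far more compact when
# most dimensions are zero, as with keyword-style sparse embeddings.
def to_sparse(dense):
    indices = [i for i, v in enumerate(dense) if v != 0.0]
    values = [dense[i] for i in indices]
    return {"indices": indices, "values": values}

dense = [0.0, 10.0, 20.5, 0.0, 0.0]
print(to_sparse(dense))  # {'indices': [1, 2], 'values': [10.0, 20.5]}
```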
20. Inserting sparse data
• Sparse vector values can be upserted alongside dense vector values -
index.upsert(vectors=[{'id': 'id1', 'values': [0.1, 0.2, 0.3, 0.4, 0.5],
'sparse_values':{ 'indices':[1,2], 'values': [10.0, 20.5] } }])
• Note that you cannot upsert a record with sparse vector values without dense vector values
index.upsert(vectors=[{'id': 'id1', 'sparse_values':{ 'indices':[1,2], 'values':
[10.0, 20.5] } }])
ValueError: Vector dictionary is missing required fields: ['values']
21. Fetch data and update data
• Fetch data by passing the ids of the records -
index.fetch(ids=['Bangalore', 'Kolkata'])
{'namespace': '',
'vectors': {'Bangalore': {'id': 'Bangalore', 'metadata': {}, 'values': [1.0, 2.234,
3.34, 5.6, 7.8]},
'Kolkata': {'id': 'Kolkata', 'metadata': {}, 'values': [2.0, 1.234,
3.34, 5.6, 7.8]}}}
• Update a record – both the values (vector) and the metadata can be updated -
index.update(id='Bangalore', set_metadata={"type":"city", "sub-type":"non-metro"},
values=[1.0, 2.0, 3.0, 4.0, 5.0])
index.fetch(ids=['Bangalore'])
{'namespace': '',
'vectors': {'Bangalore': {'id': 'Bangalore', 'metadata': {'sub-type': 'non-metro',
'type': 'city'}, 'values': [1.0, 2.0, 3.0, 4.0, 5.0]}}}
22. Query data
• Query data by vector match
index.query(vector=[1.0, 2.0, 3.0, 5, 7.0], top_k=3, include_values=True)
{'matches': [{'id': 'Bangalore', 'score': 0.999296784, 'values': [1.0, 2.0, 3.0,
5.0, 7.0]},
{'id': 'Delhi', 'score': 0.997676671, 'values': [1.0, 2.234, 3.34,
5.6, 7.8]},
{'id': 'Den Haag', 'score': 0.997676671, 'values': [1.0, 2.234, 3.34,
5.6, 7.8]}], 'namespace': ''}
• Apart from the id and values (vector data), the metadata of each matched record can also be retrieved -
index.query(vector=[1.0, 2.0, 3.0, 5, 7.0], top_k=3, include_values=True,
include_metadata=True)
23. Namespace
• Pinecone allows you to partition the records in an index into namespaces. Queries and other
operations are then limited to one namespace, so different requests can search different subsets of
your index.
• A new namespace can be created by upserting records into an index while specifying the new
namespace -
index.upsert(
vectors = [
("Howrah", [1.0, 2.234, 3.34, 5.6, 7.8], {"type": "city", "sub-type":
"metro"}),
("Siliguri", [2.0, 1.234, 3.34, 5.6, 7.8], {"type": "city", "sub-type":
"non-metro"})
], namespace='my-first-namespace')
• By default, each index contains a single default namespace; records upserted without a namespace go there.
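As a toy illustration (plain Python, not Pinecone code) of how namespaces scope operations: each namespace holds its own set of records, and a fetch only looks inside the namespace it targets, mirroring the namespace argument shown in the upsert above.

```python
# Toy model of an index partitioned into namespaces.
index = {
    # the default namespace ('')
    "": {"Bangalore": [1.0, 2.234, 3.34, 5.6, 7.8]},
    # a separate namespace, created by upserting into it
    "my-first-namespace": {"Howrah": [1.0, 2.234, 3.34, 5.6, 7.8]},
}

def fetch(ids, namespace=""):
    # Only the targeted namespace is searched.
    ns = index.get(namespace, {})
    return {i: ns[i] for i in ids if i in ns}

print(fetch(["Howrah"]))                                  # {} - not in the default namespace
print(fetch(["Howrah"], namespace="my-first-namespace"))  # found in its own namespace
```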