The Rise of Vector Data
Edo Liberty
Founder & CEO, Pinecone
What is vector data?
Translation, understanding, sentiment, question answering, semantic search, ...
Anomaly detection, speech-to-text, music transcription, machinery malfunction, ...
Object recognition, deduplication, scene detection, product search, ...
Object → Vector → Task
Text: BERT, DistilBERT, word2vec, GloVe, ...
Audio: wav2vec, mxnet-audio, ...
Vision: resnet, alexnet, vgg, squeezenet, densenet, inception, googlenet, mobilenet, ...
>>> import torchvision.models as models
>>> model = models.squeezenet1_0(pretrained=True)
What if we save the vectors?
Then we can search by similarity
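A minimal sketch of what "search by similarity" can mean once vectors are saved: score every stored vector against the query and keep the best matches. Cosine similarity is assumed as the measure here (the slides do not name one), and the item names and 3-dimensional vectors are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query, stored, top_k=2):
    """Return the top_k (id, score) pairs most similar to query, best first."""
    scored = [(item_id, cosine_similarity(query, vec))
              for item_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical saved vectors; real embeddings have hundreds of dimensions.
stored = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "truck.jpg": [0.0, 0.1, 0.9],
}
results = search([1.0, 0.0, 0.0], stored)
```

This exhaustive scan compares the query against every stored vector, so its cost grows linearly with the collection size.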
You’ve seen the results
Vectors need a new kind of database
Key-Value, Graph, Document, Vector
A vector index needs complex algorithms
http://ann-benchmarks.com/
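To give a flavor of the algorithms involved, here is a sketch of one simple approximate technique, random-hyperplane locality-sensitive hashing. It is chosen only for illustration; the slides do not say which algorithms a given index uses, and the function names below are hypothetical. Each vector is hashed to a bit signature, and a query is compared only against vectors that share its signature.

```python
import random

def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; the seed is fixed for repeatability."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_planes)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def build_index(vectors, planes):
    """Group item ids into buckets keyed by signature."""
    buckets = {}
    for item_id, vec in vectors.items():
        buckets.setdefault(signature(vec, planes), []).append(item_id)
    return buckets

def candidates(query, buckets, planes):
    """Ids sharing the query's bucket; only these need exact scoring."""
    return buckets.get(signature(query, planes), [])

# Hypothetical data for illustration.
vectors = {"a": [0.9, 0.1, 0.0], "b": [0.0, 0.1, 0.9], "c": [-0.5, 0.4, 0.2]}
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = build_index(vectors, planes)
```

Vectors pointing in similar directions tend to land in the same bucket, but a true neighbor can fall into a different one. That trade of recall for speed is the kind of thing benchmarks such as ann-benchmarks measure.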
A vector index also needs complex infrastructure
Functionality + Scale
● Sharding
● Replication
● Live Updates
● Namespacing
● Filtering
● Pre/Post processing
Production readiness
● High Availability
● Persistence
● Consistency
● Monitoring
● Alerting
● Support
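As a rough illustration of two items from the functionality list, sharding and filtering, the sketch below spreads vectors across in-memory shards by a hash of the id, applies a metadata filter inside each shard, and merges the per-shard top results. The shard count, the dot-product score, and every function name are assumptions made for this example, not a description of any particular service.

```python
import heapq
import zlib

NUM_SHARDS = 3  # assumed shard count, for illustration only

def shard_for(item_id):
    """Stable hash so the same id always lands on the same shard."""
    return zlib.crc32(item_id.encode("utf-8")) % NUM_SHARDS

def upsert(shards, item_id, vector, metadata):
    """Insert or overwrite an item in its shard."""
    shards[shard_for(item_id)][item_id] = (vector, metadata)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def query_shard(shard, query_vec, top_k, required):
    """Score only the items whose metadata matches every required field."""
    hits = [
        (dot(query_vec, vector), item_id)
        for item_id, (vector, metadata) in shard.items()
        if all(metadata.get(k) == v for k, v in required.items())
    ]
    return heapq.nlargest(top_k, hits)

def query(shards, query_vec, top_k, required):
    """Fan out to every shard, then merge the partial top-k lists."""
    partial = []
    for shard in shards:
        partial.extend(query_shard(shard, query_vec, top_k, required))
    return heapq.nlargest(top_k, partial)

shards = [{} for _ in range(NUM_SHARDS)]
upsert(shards, "a", [1.0, 0.0], {"kind": "image"})
upsert(shards, "b", [0.9, 0.1], {"kind": "text"})
upsert(shards, "c", [0.5, 0.5], {"kind": "image"})
top = query(shards, [1.0, 0.0], top_k=2, required={"kind": "image"})
```

Each shard returns at most top_k hits, so the merge step handles at most NUM_SHARDS × top_k items regardless of how many vectors are stored. Replication, persistence, and the other listed items are not covered by this sketch.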
You can leverage vectors through a managed service
Vectors
Similarity search as a service (Pinecone.io)
Application or notebook
Image Search Demo
Thank you!
Pinecone.io — Similarity search as a service
