[DSC Europe 22] Similarity search: Take your search engine to the next level - Francisco Losada

If you have ever heard of observability, you may well have also heard of monitoring, telemetry, and tracing, most probably used interchangeably. Our customers are no exception and often come to us with the wrong idea of which technologies to use for each of them. In this session, we will dive into this soup of words, shed light on each of these terms, and present the breadth of AWS services that best suit each use case.

  1. DSC EUROPE 22 © 2022, Amazon Web Services, Inc. or its affiliates. Similarity Search: Take your search engine to the next level. November 18th, DSC Europe 22. Francisco Losada, AWS Specialist Solutions Architect
  2. Agenda • Text search: Analysis and queries • Image search: Analysis and queries • End-to-end architecture
  3. Text Search: Analysis
  4. Text Search: Analysis. Source: https://codingexplained.com/coding/elasticsearch/understanding-analysis-in-elasticsearch-analyzers
  5. Text Search: Query. GET /my-index-000001/_search { "query": { "match": { "texto": "Europe" } } } 1. DSL: Domain Specific Language 2. Analysis: Europe → [europe] 3. Search
  6. Image Search
  7. Image Search: Analysis • Characteristics: • Dress • White • Bugs Bunny • Round neck • Short sleeve • Black line around neck • Tight
  8. CNN: Feature Extraction. Source: https://neurohive.io/en/popular-networks/vgg16/
  9. [Image-only slide]
  10. Image Search query: KNN algorithm. Source: https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4
  11. Image Search query: KNN algorithm. Source: https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4
  12. Image Search query: KNN algorithm. Source: https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4
  13. Image search pipeline. Source: https://aws.amazon.com/blogs/machine-learning/building-a-visual-search-application-with-amazon-sagemaker-and-amazon-es/
  14. What is OpenSearch? OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. The OpenSearch Project comprises OpenSearch, a search engine daemon, OpenSearch Dashboards for visualization and user interface, and tools and plugins, providing additional functionality. https://opensearch.org/
  15. Amazon OpenSearch Service • Easy integration: open source OpenSearch APIs, managed OpenSearch Dashboards, integration with Logstash • Cost-effective: pay only for resources used, with a choice of on-demand and Reserved Instance compute pricing, and save up to 90% with the UltraWarm low-cost storage tier • Fully managed: deployment in minutes, software installation and patching, failure recovery, backups, and monitoring • Scalable, secure, and compliant: network isolation with Amazon VPC, encryption at rest and in transit, and compliance with HIPAA, PCI DSS, and ISO
  16. AWS Cloud (Region) • VPC • Customer domain • Application Load Balancing (ALB) • Data nodes • Leader nodes • UltraWarm nodes • IAM, Cognito, SAML for Dashboards login • SAML • AWS CloudTrail • Amazon CloudWatch • OpenSearch fine-grained access control • AWS Database Migration Service • Amazon Kinesis Data Firehose • Amazon CloudWatch Logs • Amazon Managed Streaming for Kafka
  17. Amazon SageMaker: Built to make ML more accessible • Pick algorithm • Visualize in notebooks • Label data • Collect and prepare data • Store features • Check data • Train models • Tune parameters • Deploy in production • Manage and monitor • CI/CD
  18. Architecture
  19. Implementation
  20. Thank you! Questions? https://eventbox.dev/survey/IJKGHW4
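
The three steps on slide 5 (DSL query, analysis, search) can be illustrated with a toy example. This is only a sketch under stated assumptions: `analyze`, `build_inverted_index`, and `match` are hypothetical helpers written for illustration, the sample documents are made up, and the analyzer simply splits on non-word characters and lowercases, which is far simpler than a real OpenSearch analyzer.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): split on non-word characters, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_inverted_index(docs):
    """Map every analyzed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def match(index, query_text):
    """Analyze the query text the same way, then return documents that
    contain at least one of the resulting terms."""
    hits = set()
    for term in analyze(query_text):
        hits |= index.get(term, set())
    return sorted(hits)


# Hypothetical sample documents, for illustration only.
docs = {
    1: "Welcome to DSC Europe 22",
    2: "Similarity search with OpenSearch",
    3: "Search engines across EUROPE",
}
index = build_inverted_index(docs)

print(analyze("Europe"))       # the query text becomes the term list
print(match(index, "Europe"))  # ids of documents containing that term
```

Because both the documents and the query pass through the same analysis, "Europe", "europe", and "EUROPE" all resolve to the same term.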

Editor's Notes

  • https://www.youtube.com/shorts/n5t0B3N71vs
  • Text Analysis: the process of converting unstructured text into a structured format that is optimized for search.

    https://youtube.com/shorts/n5t0B3N71vs?feature=share

  • Query DSL (Domain Specific Language)

  • https://www.youtube.com/shorts/n5t0B3N71vs
  • Purpose of the slide: Give a brief background on OpenSearch, the open-source engine that powers Amazon OpenSearch Service.

    On January 21, 2021, Elastic NV announced that they would change their software licensing strategy and not release new versions of Elasticsearch and Kibana under the permissive Apache License, Version 2.0 (ALv2). Instead, new versions of the software will be offered under the Elastic License, with source code available under the Elastic License or SSPL. These are not open source and do not offer users the freedoms of open source. To ensure that the open source community and our customers continue to have a secure, high-quality, fully open source search and analytics suite, AWS introduced OpenSearch, a community-driven, ALv2-licensed fork of open source Elasticsearch and Kibana.

    The ALv2 license gives the open source community and our customers the freedom to use, modify, extend, embed, monetize, resell, and offer OpenSearch as part of their products and services. The announcement of OpenSearch has garnered positive support from the community. Numerous organizations such as SAP, Capital One, Dow Jones, Logz.io, and Red Hat, and individual contributors have expressed interest in joining the project and helping develop OpenSearch.
  • Amazon OpenSearch is a distributed system.

    It has a number of different node types: data nodes, which hold your data, index it, and respond to your queries, and master nodes, which orchestrate the cluster and keep it functioning as a whole.

    UltraWarm nodes provide high-density storage backed by S3 at a much lower cost, for storing long-tail data.

    There are a number of security features: IAM to provide access to the cluster, and the Open Distro for OpenSearch plugin to provide fine-grained access control to your OpenSearch cluster.

    There are integrations with other services. On the input side, Kinesis Firehose can push your data to OpenSearch for log workloads, DMS can deliver database data into OpenSearch, and CloudWatch Logs supports batch delivery to OpenSearch through Lambda.

    On the output side, metrics go to CloudWatch and audit data goes to CloudTrail.

    We call this a domain.
  • Amazon SageMaker is the most complete end-to-end ML service, helping our customers through improved agility, productivity, and cost-effectiveness.
    We built Amazon SageMaker from the ground up to provide every developer and data scientist with the ability to build, train, and deploy ML models quickly and at lower cost by providing the tools required for every step of the ML development lifecycle in one integrated, fully managed service. In fact, we have launched 50+ capabilities in the past year alone, all aimed at making this process easier for our customers.
    And last year we launched Amazon SageMaker Studio to bring this all together in a single pane of glass so that you get access to all your tools in one place.
  • 1. Dataset: images

    2. A convolutional neural network, to extract the feature vector from each image

    3. The model has to be accessible, ideally through an endpoint with an API

    4. All of the dataset's vectors have to be stored in OpenSearch
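
The four steps in the last note can be sketched end to end in plain Python. Everything here is an assumption made for illustration: `embed` is a hypothetical stand-in for the CNN behind an endpoint (it only folds pixel values into a small unit vector), the field name `image_vector` and the toy "images" are made up, and the request bodies are only built as dictionaries in the shape used by the OpenSearch k-NN plugin (`knn_vector` mapping, `knn` query), without being sent to any cluster. The brute-force search shows what an exact k-nearest-neighbour lookup computes. OpenSearch itself typically answers such queries with an approximate index.

```python
import math

DIM = 4  # assumed toy dimension; real CNN feature vectors are much larger


def embed(pixels):
    """Hypothetical stand-in for the CNN endpoint (steps 2 and 3):
    folds pixel values into a DIM-sized vector and normalizes it."""
    vec = [0.0] * DIM
    for i, value in enumerate(pixels):
        vec[i % DIM] += value
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def index_body(dim):
    """Index creation body with a k-NN vector field (step 4)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "image_vector": {"type": "knn_vector", "dimension": dim}
            }
        },
    }


def knn_query(vector, k):
    """Search body asking for the k stored vectors nearest to `vector`."""
    return {
        "size": k,
        "query": {"knn": {"image_vector": {"vector": vector, "k": k}}},
    }


def brute_force_knn(query_vec, stored, k):
    """Exact k-NN by Euclidean distance over every stored vector."""
    ranked = sorted(stored.items(), key=lambda item: math.dist(query_vec, item[1]))
    return [image_id for image_id, _ in ranked[:k]]


# Step 1: a made-up dataset of tiny "images" (four pixel values each).
images = {
    "dress_white": [9, 1, 1, 1],
    "dress_black": [1, 9, 1, 1],
    "shirt_white": [8, 2, 1, 1],
}
stored_vectors = {image_id: embed(px) for image_id, px in images.items()}

query_vector = embed([9, 1, 1, 2])
print(index_body(DIM))
print(knn_query(query_vector, k=2))
print(brute_force_knn(query_vector, stored_vectors, k=2))
```

In a real deployment, the dictionaries returned by `index_body` and `knn_query` would be sent to the domain with an OpenSearch client, and `embed` would be replaced by a call to the deployed model endpoint.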
