OpenSearch
-Abhi Jain
Agenda
● OpenSearch
○ What is it?
○ Benefits/ Uses
○ How to use it
○ Features
● Migrate from Elastic to OpenSearch
● Tools & Plugins
About Me
● Lead Dev
● Located in Florida
● Trainer
● Presenter
● .NET Developer
● Youtuber: Coach4Dev
● Husband/ Father
Amazon Elasticsearch
● Launched in 2015
● Gained popularity for log analytics
● Used open-source Elasticsearch under Apache License v2
● Jan 2021
○ Elastic NV changed its licensing strategy
○ Versions after Elasticsearch 7.10.2 & Kibana 7.10.2
■ No longer released under Apache License v2
■ Released under the Elastic License
OpenSearch
● Sep 2021:
○ Amazon Elasticsearch Service renamed to Amazon OpenSearch Service
● Open-source fork of Elasticsearch 7.10.2 and Kibana 7.10.2
● Highly scalable
● Fast access & response to large volumes of data
● Powered by Apache Lucene Search library
Apache Lucene
● Apache Lucene project develops open-source search software
○ Releases a core search library named Lucene core
● Lucene Core
○ Java Library providing powerful indexing and search features
Apache Solr
● Open source search platform
● Built on Apache Lucene
Solr vs ElasticSearch
● Similar performance mostly.
● ES has better support for scalability
○ due to horizontal scaling
■ Better cloud support too
● ES can support multiple doc types in a single index better
○ More difficult to do this in Solr
● ES supports native DSL (Domain Specific Language)
○ Need to program queries in Solr
● https://mindmajix.com/elasticsearch-vs-solr
Why OpenSearch
● Huge amount of machine generated data these days
○ Growing exponentially
● Getting insights is important
● Interactive log analytics
● Real-time application monitoring
● Website Search, etc.
OpenSearch Features
● Easy to set-up and configure
● In-place upgrades
● Enables data monitoring & setting alerts based on thresholds
● Supports authentication, encryption & compliance requirements
OpenSearch vs ElasticSearch
● OpenSearch was forked from Elasticsearch
○ Now they are separate from each other
● Each is adding features separately
● OpenSearch
○ Inbuilt support from AWS
OpenSearch features not in ES (free version)
● Centralized user accounts / access control
● Cross-cluster replication
● IP filtering
● Configurable retention period
● Anomaly detection
● Tableau connector
● JDBC driver
● ODBC driver
● Machine learning features such as regression and classification
● Link
ElasticSearch Features
● Based on subscription levels
● https://www.elastic.co/subscriptions
OpenSearch & ElasticSearch Version Support
● Amazon OpenSearch Service currently supports the following OpenSearch versions:
○ 1.3, 1.2, 1.1, 1.0
● It also supports the following Elasticsearch versions:
○ 7.10, 7.9, 7.8, 7.7, 7.4, 7.1
○ 6.8, 6.7, 6.5, 6.4, 6.3, 6.2, 6.0
○ 5.6, 5.5, 5.3, 5.1
○ 2.3
○ 1.5
What is Kibana
● Free & open front end application
● Charting tool for Elastic Stack
● Sits on top of Elastic Stack
● Sample Dashboard
OpenSearch Dashboards
● Default visualization tool for data in OpenSearch
● Filter data with queries
● Included with Amazon OpenSearch Service
Terminologies
OpenSearch Cluster
● Synonymous with a domain
● Domains are clusters with
○ settings,
○ instance types,
○ instance counts,
○ and storage resources that you specify.
● Group of nodes
○ With same cluster.name attribute
OpenSearch Node
● Member of a cluster
● A distinct host
● With IP address
Getting Started
● Create a domain
● Size the domain appropriately for your workload
● Control access to your domain using a domain access policy or fine-grained access control
● Index data manually or from other AWS services
● Use OpenSearch Dashboards to search your data and create visualizations
Custom Endpoint
● Use when you want an easier-to-read or custom domain name
● Can use Https
○ Upload SSL certificate
Run OpenSearch locally
● Install docker
● wsl -d docker-desktop
● sysctl -w vm.max_map_count=262144
● Ctrl+C
● docker-compose up
● Visit http://localhost:5601/
● Use admin/admin to login and explore
● Link
Upload Data
● One at a time
● Bulk
Upload Data One At a time
● curl -XPUT -u "master:XXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_doc/1" -H "Content-Type: application/json" -d '{"director": "Burton, Tim", "genre": ["Comedy","Sci-Fi"], "year": 1996, "actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"], "title": "Mars Attacks!"}'
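The same single-document upload can be sketched in Python using only the standard library. This is a minimal sketch, not the presenter's code: the endpoint, credentials, and the helper name build_index_request are placeholders chosen for illustration. The request is built but not sent.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint and credentials; replace with your own domain values.
ENDPOINT = "https://localhost:9200"
USER, PASSWORD = "master", "XXXX"

def build_index_request(index, doc_id, document):
    """Build (but do not send) a PUT request that indexes one document."""
    token = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
    return urllib.request.Request(
        url=f"{ENDPOINT}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

movie = {
    "director": "Burton, Tim",
    "genre": ["Comedy", "Sci-Fi"],
    "year": 1996,
    "title": "Mars Attacks!",
}
request = build_index_request("movies", 1, movie)
# To actually send it: urllib.request.urlopen(request)
```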
Upload Data Bulk
● curl -XPOST -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_bulk" --data-binary @bulk_movies.txt -H "Content-Type: application/json"
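The slides do not show the contents of bulk_movies.txt. The bulk API expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. A minimal sketch that produces such a body, using made-up sample documents and a hypothetical helper name:

```python
import json

def build_bulk_body(index, documents):
    """Return an NDJSON bulk body: one action line, then one source line, per document."""
    lines = []
    for doc_id, doc in documents.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Made-up sample data for illustration only.
movies = {
    "2": {"title": "Example Movie A", "year": 2001},
    "3": {"title": "Example Movie B", "year": 2002},
}
bulk_body = build_bulk_body("movies", movies)
```

Writing bulk_body to a file gives something that could be passed to curl with --data-binary.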
How to Query?
Searching Data
● URI Searches
● Command Line
● OpenSearch Dashboards
Searching Data - URI
● GET Request
● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true
● Searches all fields of the movies index (omit the index name to search all indices)
URI Search Specific fields
● Search movies index and title property
● GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house
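A URI search is just a GET with a q parameter, so the URL can be assembled programmatically. A minimal sketch, assuming a placeholder endpoint and a hypothetical helper uri_search_url; note that urlencode percent-encodes the colon, which the server decodes back to title:house.

```python
from urllib.parse import urlencode

def uri_search_url(endpoint, index, text, field=None, pretty=True):
    """Build a URI search URL; `field` restricts the search to one property."""
    q = f"{field}:{text}" if field else text
    params = {"q": q}
    if pretty:
        params["pretty"] = "true"
    return f"{endpoint}/{index}/_search?{urlencode(params)}"

# Placeholder endpoint; use your own domain endpoint instead.
url = uri_search_url("https://localhost:9200", "movies", "house", field="title")
```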
Get Search Results - Command Line
● curl -XGET -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true"
Query DSL
● For more complex queries
○ OpenSearch Domain Specific Language (DSL)
● POST request with query body
Get Search Results - Dev Tools
● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_dashboards/app/dev_tools#/console
GET _search
{
  "query": {
    "match_all": {}
  }
}
Search on only specific fields
GET _search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "U.S.",
      "fields": ["title", "actor", "director"]
    }
  }
}
Search - Boosting fields
GET _search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "john",
      "fields": ["title^4", "actor", "director^4"]
    }
  }
}
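The ^4 suffix multiplies the score contribution of matches in that field. A minimal sketch that generates the same request body from a field-to-boost mapping; the helper name multi_match_query is an assumption for illustration:

```python
def multi_match_query(text, field_boosts, size=20):
    """Build a multi_match body; a boost of 1 leaves the field name unchanged."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in field_boosts.items()
    ]
    return {
        "size": size,
        "query": {"multi_match": {"query": text, "fields": fields}},
    }

# Matches in title and director count four times as much as matches in actor.
body = multi_match_query("john", {"title": 4, "actor": 1, "director": 4})
```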
Search - Pagination
GET _search
{
  "from": 0,
  "size": 1,
  "query": {
    "multi_match": {
      "query": "Drama",
      "fields": ["genre"]
    }
  }
}
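Here from is the zero-based offset of the first hit and size is the number of hits per page. A minimal sketch that converts a page number into those two parameters; the helper names and the page size of 10 are assumptions:

```python
def page_params(page, page_size):
    """Translate a 1-based page number into the `from`/`size` search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}

def paged_query(text, fields, page, page_size):
    """Build a paginated multi_match search body."""
    body = page_params(page, page_size)
    body["query"] = {"multi_match": {"query": text, "fields": fields}}
    return body

# Third page of ten results each: skips the first 20 hits.
third_page = paged_query("Drama", ["genre"], page=3, page_size=10)
```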
Query - With Highlights in Response
GET _search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "Manchurian",
      "fields": ["title^4", "actor", "director"]
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    },
    "pre_tags": "<strong>",
    "post_tags": "</strong>",
    "fragment_size": 200,
    "boundary_chars": ".,!? "
  }
}
Query - Count
GET movies/_count
{
  "query": {
    "multi_match": {
      "query": "Manchurian",
      "fields": ["title^4", "actor", "director"]
    }
  }
}
Dashboard Query Language
● Use DQL in Dashboards
○ Search for data and visualizations
● Terms Query
○ Search for any text
■ E.g. www.example.com
○ Access object’s nested field
■ E.g. coordinates.lat:43.7102
○ Leading and trailing wildcards
■ host.keyword:*.example.com/*
● Operators
○ AND
○ OR
Dashboard Query Language
● Date and range Queries
○ bytes >= 15 and memory < 15
○ @timestamp > "2020-12-14T09:35:33"
● Nested field query
○ superheroes: {hero-name: Superman}
Dashboard Plugins
Query Workbench
● SQL
○ Run SQL
○ Treat indices as tables
● PPL
○ Piped Processing Language
○ Commands delimited by pipes
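Outside the Query Workbench, both languages can be called over REST at _plugins/_sql and _plugins/_ppl. A minimal sketch of the two request bodies, assuming a movies index with title and year fields; the helper names are hypothetical:

```python
import json

def sql_request(query):
    """Path and body for POST _plugins/_sql."""
    return "_plugins/_sql", json.dumps({"query": query})

def ppl_request(index, *commands):
    """Path and body for POST _plugins/_ppl: a source clause, then piped commands."""
    query = " | ".join([f"source={index}", *commands])
    return "_plugins/_ppl", json.dumps({"query": query})

sql_path, sql_body = sql_request(
    "SELECT title, year FROM movies WHERE year > 1990 LIMIT 10"
)
ppl_path, ppl_body = ppl_request(
    "movies", "where year > 1990", "fields title, year", "head 10"
)
```

Both queries ask for the same thing; the PPL version reads left to right as a pipeline.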
Reporting
● Multiple file formats
● On demand/ Scheduled
● Generate from
○ Dashboard
○ Visualization
○ Discover
Anomaly Detection
● Detect unusual behavior in time series data
● Anomaly Grade
● Confidence Score
Notifications
● Supported
○ Amazon Chime
○ SNS
○ SES
○ SMTP
○ Slack
○ Custom Webhooks
Observability plugin
● Visualize/Query time series data
● Event analytics
● Compare the data the way you like
Index Management
● Create ISM policy
● To manage your indexes
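An ISM policy is a JSON document listing states, the actions taken in each state, and the conditions for moving between states. A minimal sketch of a two-state policy, assuming a logs-* index pattern and retention periods picked purely for illustration:

```python
import json

def ism_policy(index_pattern, rollover_age="1d", delete_after="30d"):
    """A two-state policy: roll over while hot, then delete once old enough."""
    return {
        "policy": {
            "description": "Roll over daily, delete after a retention period",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [{"rollover": {"min_index_age": rollover_age}}],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": delete_after},
                        }
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # Attach the policy automatically to new indices matching the pattern.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }

policy = ism_policy("logs-*")
policy_json = json.dumps(policy, indent=2)
# Send with: PUT _plugins/_ism/policies/<policy_id>
```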
Security plugin
● Set up RBAC
Migrate from ElasticSearch to OpenSearch
Three major approaches
● Snapshot
● Rolling Upgrade
● Cluster Restart
Snapshot Method
● Generate snapshot in ElasticSearch
● Save in shared directory
● Restore in OpenSearch
● Snapshot
○ Backup of entire cluster state
○ Useful for recovery from failure and migration
● Link
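The three steps map onto the snapshot REST API. A minimal sketch of the calls, assuming a shared-filesystem repository; the repository name, snapshot name, and path are placeholders, and the path would need to be listed under path.repo on every node. The repository has to be registered on both clusters.

```python
def register_repository(location):
    """Body for PUT _snapshot/<repo>: a shared-filesystem repository."""
    return {"type": "fs", "settings": {"location": location}}

def restore_request(indices, include_global_state=False):
    """Body for POST _snapshot/<repo>/<snapshot>/_restore."""
    return {
        "indices": ",".join(indices),
        "include_global_state": include_global_state,
    }

# (method, path, body) tuples in the order they would be issued.
steps = [
    # On both clusters: register the shared repository.
    ("PUT", "_snapshot/migration-repo", register_repository("/mnt/snapshots")),
    # On the Elasticsearch cluster: take the snapshot.
    ("PUT", "_snapshot/migration-repo/snapshot-1?wait_for_completion=true", None),
    # On the OpenSearch cluster: restore the selected indices.
    ("POST", "_snapshot/migration-repo/snapshot-1/_restore", restore_request(["movies"])),
]
```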
Snapshot Method
● Check Index compatibility
○ E.g.: You can't restore a 7.6.0 snapshot into a 7.5.0 cluster
● Link
● Fastest
● Easiest
● Most efficient
Rolling Upgrade
● Official way to migrate cluster
● Without interruption
● Rolling upgrades are supported:
○ Between minor versions
○ From 5.6 to 6.8
○ From 6.8 to 7.14.1
○ From any version since 7.14.0 to 7.14.1
Rolling Upgrade
● Shut down one node at a time
○ Minimal disruption
Cluster Restart Upgrades
● Shut down all nodes
● Perform the upgrade
● Restart the cluster
Mapping
OpenSearch Mapping
● Dynamic
○ When you index a document
○ OpenSearch adds fields automatically
○ It deduces their types by itself
● Explicit
○ If you know your data types
○ Preferred way of doing things
OpenSearch Mapping
● If you do not define a mapping ahead of time, OpenSearch dynamically
creates a mapping for you.
● If you do decide to define your own mapping, you can do so at index creation.
● ONE mapping is defined per index. Once the index has been created, we can
only add new fields to a mapping. We CANNOT change the mapping of an
existing field.
● If you must change the type of an existing field, you must create a new index
with the desired mapping, then reindex all documents into the new index.
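Changing a field type therefore takes two requests: create a new index with the desired mapping, then copy the documents across. A minimal sketch of both bodies, assuming a hypothetical case where year should become an integer and a new index named movies-v2:

```python
def explicit_mapping(properties):
    """Body for PUT <index>: create an index with an explicit mapping."""
    return {
        "mappings": {
            "properties": {name: {"type": t} for name, t in properties.items()}
        }
    }

def reindex_body(source, dest):
    """Body for POST _reindex: copy all documents from one index to another."""
    return {"source": {"index": source}, "dest": {"index": dest}}

# Step 1: PUT movies-v2 with the corrected field types.
new_index = explicit_mapping(
    {"title": "text", "director": "keyword", "year": "integer"}
)
# Step 2: POST _reindex to copy documents into the new index.
copy_docs = reindex_body("movies", "movies-v2")
```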
Text vs keyword data types
● Text type
○ Full text searches
● Keyword type
○ Exact searches
○ Aggregations
○ Sorting
Text vs Keyword
● Inverted Index
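The difference shows up in what goes into the inverted index. A toy sketch: a text field is analyzed into lowercase terms before indexing, while a keyword field is stored as one exact term. The regular-expression tokenizer below is a simplified stand-in for a real analyzer, and the sample titles are illustrative data.

```python
import re
from collections import defaultdict

def build_inverted_index(docs, analyzed):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # text: lowercase and split into words; keyword: keep the exact value
        terms = re.findall(r"\w+", value.lower()) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return dict(index)

titles = {1: "Mars Attacks!", 2: "Mars Needs Moms"}
text_index = build_inverted_index(titles, analyzed=True)
keyword_index = build_inverted_index(titles, analyzed=False)
```

Looking up mars in text_index finds both documents, while keyword_index only answers for the full, exact title.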
Aggregations
OpenSearch Aggregations
● Analyze data
○ In real time too
● Extract statistics
● More expensive than queries
○ On CPU and memory
○ In general
Aggregation Query
● Use aggs or aggregations
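A minimal sketch of an aggregation request body, assuming a numeric field named year on the movies index (the field and the aggregation name are assumptions, not taken from the slides). Setting size to 0 skips the hits so only the aggregation result is returned.

```python
def avg_aggregation(field, name=None):
    """Body for GET <index>/_search that returns only the average of `field`."""
    name = name or f"avg_{field}"
    return {
        "size": 0,  # skip the hits; only the aggregation result is needed
        "aggs": {name: {"avg": {"field": field}}},
    }

# Would be sent as: GET movies/_search
body = avg_aggregation("year")
```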
Example
● Get average of
Data Streams
Data Streams in OpenSearch
● Ingesting time series data
○ Logs
○ Events
○ Metrics, etc.
● Number of documents grows rapidly
● Append Only data
● Don't need to update older documents (Very rarely)
Rollover
● If data is growing rapidly
● Write to an index up to a certain threshold
○ Then create a new index
○ And start writing to it
● Optimize the active index for high ingest rates on high-performance hot
nodes.
● Optimize for search performance on warm nodes.
● Shift older, less frequently accessed data to less expensive cold nodes.
● Delete data according to your retention policies by removing entire indices.
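A rollover can also be requested directly against a write alias, with conditions describing the threshold. A minimal sketch of the request body; the alias name and the threshold values are placeholders:

```python
def rollover_conditions(max_age=None, max_docs=None, max_size=None):
    """Body for POST <alias>/_rollover; rolls over when any condition is met."""
    conditions = {"max_age": max_age, "max_docs": max_docs, "max_size": max_size}
    # Drop conditions that were not supplied.
    return {"conditions": {k: v for k, v in conditions.items() if v is not None}}

# Would be sent as: POST my-write-alias/_rollover
body = rollover_conditions(max_age="7d", max_size="5gb")
```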
Index Template
● Data Stream requires an index template
● A name or wildcard (*) pattern for the data stream.
● The data stream’s timestamp field. This field must be mapped as a date or
date_nanos field data type and must be included in every document indexed
to the data stream.
● The mappings and settings applied to each backing index when it’s created.
ILM Policy
● Index Lifecycle Management Policy
● Can be applied to any number of indices
● Usage
○ Allocate
○ Delete
○ Rollover
○ Read Only
○ Wait for snapshot
ILM Policy
● Create a policy:
● Link
Create ILM Policy
Index Template
● Tells ElasticSearch how to configure an index when it is created
● For data streams
○ Configures the stream’s backing indices
○ Configured prior to index creation
Templates Types
● Component Templates
○ Reusable building blocks that configure
■ mappings,
■ settings, and
■ Aliases
○ Not directly applied to indices
● Index Template
○ Collection of component templates
○ Directly applied to indices
○ Some defaults: metrics-*-*, logs-*-*
Create Component Template
● Link
Create Index Template
● Data Stream requires matching index template
● PUT _index_template/{template_name}
Create Index Template
● Link
Create data stream
● Documents must contain timestamp field
● PUT _data_stream/my-data-stream
● Stream’s name must match one of your index template’s index patterns
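A minimal sketch of a matching index template, plus a check that a stream name fits one of its patterns. The data_stream.timestamp_field form shown here is, as far as I know, the OpenSearch way to name the timestamp field; the pattern, priority, and helper names are assumptions, and fnmatch only approximates the server-side wildcard matching.

```python
from fnmatch import fnmatch

def data_stream_template(patterns, timestamp_field="@timestamp", priority=100):
    """Body for PUT _index_template/<name> that enables data streams."""
    return {
        "index_patterns": patterns,
        "data_stream": {"timestamp_field": {"name": timestamp_field}},
        "priority": priority,
        "template": {
            # The timestamp field must be mapped as a date type.
            "mappings": {"properties": {timestamp_field: {"type": "date"}}}
        },
    }

def matches_template(stream_name, template):
    """Check that a stream name matches one of the template's index patterns."""
    return any(fnmatch(stream_name, p) for p in template["index_patterns"])

template = data_stream_template(["my-data-stream*"])
ok = matches_template("my-data-stream", template)
```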
Get Info About Data Stream
● GET _data_stream/my-data-stream
Delete Data Stream
● DELETE _data_stream/my-data-stream
Cross Cluster Replication
Cross Cluster Replication
● Cross Cluster replication plugin
○ Replicates indexes, mapping & metadata from one cluster to another
● Advantages
○ Continue to handle search requests if there is an outage
○ Can help reduce latency in application
■ Replicating data across geographically distant data centers
Replication
● Active passive model
○ Follower index pulls data from leader index
● It can be
○ Started
○ Paused
○ Stopped
○ Resumed
● Can be secured
○ Security plugin
○ Encrypt cross cluster traffic
Exercise
● Create 2 domains in AWS OpenSearch
● Link
Exercise
● Source Domain Connections Tab -> Outbound ->
○ Create Connection to Destination Domain
● Set access policy on destination domain:
● Link
Exercise
● Get Connection status
○ GET _plugins/_replication/connect1/_status
● Start syncing
○ PUT _plugins/_replication/connect1/_start
{
  "leader_alias": "Connect1",
  "leader_index": "movies",
  "use_roles": {
    "leader_cluster_role": "all_access",
    "follower_cluster_role": "all_access"
  }
}
Plugins
OpenSearch Plugins
● Standalone components
○ That add features and capabilities
● Huge number of plugins available
● E.g.
○ Replication Plugin
○ Security plugin
○ Notification plugin
SQL Plugin
● Lets you run SQL queries against OpenSearch indices
● Add data
PUT movies/_doc/1
{ "title": "Spirited Away" }
● Query data
POST _plugins/_sql
{
  "query": "SELECT * FROM movies LIMIT 50"
}
SQL Plugin
● Delete data from an index
● Enable delete via the SQL plugin
PUT _plugins/_query/settings
{
  "transient": {
    "plugins.sql.delete.enabled": "true"
  }
}
SQL Plugin - Delete
● To delete the data
POST _plugins/_sql
{
  "query": "DELETE FROM movies"
}
Asynchronous Search
● Large volumes of data
● Can take longer to search
● Async
○ Run searches in the background
○ Monitor progress of these searches
○ Get back partial results as they become available
Asynchronous Search
● POST _plugins/_asynchronous_search
● Response contents:
○ ID
■ Can be used to track the state of the search
■ Get partial results
○ State
■ Running
■ Completed
■ Persisted
● Link
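A minimal sketch of the two request paths involved: one to submit the search and one to poll it by ID. The parameter values are placeholders, and the helper names are hypothetical. The search body itself is an ordinary query DSL body.

```python
from urllib.parse import urlencode

def async_search_path(wait_timeout="1s", keep_on_completion=True, keep_alive="1h"):
    """Path for POST: submit a search that keeps running in the background."""
    params = {
        # How long to wait for results before returning just the search id.
        "wait_for_completion_timeout": wait_timeout,
        # Whether to store the results once the search finishes.
        "keep_on_completion": str(keep_on_completion).lower(),
        # How long the stored results stay available.
        "keep_alive": keep_alive,
    }
    return f"_plugins/_asynchronous_search?{urlencode(params)}"

def status_path(search_id):
    """Path for GET: returns the state and any partial results for a search id."""
    return f"_plugins/_asynchronous_search/{search_id}"

submit = async_search_path()
# "example-id" stands in for the id returned by the submit call.
poll = status_path("example-id")
```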
OpenSearch Clients
Clients
● OpenSearch Python client
● OpenSearch JavaScript (Node.js) client
● OpenSearch .NET clients
● OpenSearch Go client
● OpenSearch PHP client
Open Search Client for .NET
● OpenSearch.Net
○ Low level client
● OpenSearch.Client
○ High level client
● Sample code: Link
Exercise
● Create a .NET application
● Add a document to OpenSearch using the .NET Application
○ OpenSearch.Client (.NET High level client)
Agents and Ingestion Tools
Beats
● Data shippers
● Agents on servers
● Send data to ES/ Logstash
Grafana
● An open source visualization tool
● Various sources can be used as data source:
○ InfluxDB
○ MySQL
○ ElasticSearch
○ PostgreSQL
● Better suited for metrics visualizations
● Does not allow full text data querying
Logstash
● Free/ Open-Source
● Data processing pipeline
● Ingests data from multitude of sources
● Transforms it
● Sends it to your favorite stash
Logstash - Ingestion
● Data of all shapes/ sizes/ sources
○ Can be ingested
● It can parse/ transform your data
Logstash - Output
● ElasticSearch
● MongoDB
● S3
● Etc.
● Link
AWS OpenSearch Security
● Use multi-factor authentication (MFA) with each account.
● Use SSL/TLS to communicate with AWS resources. We recommend TLS 1.2
or later.
● Set up API and user activity logging with AWS CloudTrail.
● Use AWS encryption solutions, along with all default security controls within
AWS services.
● Use advanced managed security services such as Amazon Macie, which
assists in discovering and securing personal data that is stored in Amazon S3.
● If you require FIPS 140-2 validated cryptographic modules when accessing
AWS through a command line interface or an API, use a FIPS endpoint.
Summary
● OpenSearch
○ Open Source Search solution
● Upcoming and supported by AWS
● Caters to most search use cases
○ Great Query performance
● Powerful tools
● Community Support
Connect with me
● Trainings on various tech topics
● For any questions:
○ https://linkedin.com/in/coach4dev

Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Envertis Software Solutions
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Undress Baby
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 

Recently uploaded (20)

Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
What is Master Data Management by PiLog Group
What is Master Data Management by PiLog GroupWhat is Master Data Management by PiLog Group
What is Master Data Management by PiLog Group
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 

OpenSearch.pdf

  • 2. Agenda ● OpenSearch ○ What is it? ○ Benefits/ Uses ○ How to use it ○ Features ● Migrate from Elastic to OpenSearch ● Tools & Plugins
  • 3. About Me ● Lead Dev ● Located in Florida ● Trainer ● Presenter ● .NET Developer ● Youtuber: Coach4Dev ● Husband/ Father
  • 4. Amazon Elasticsearch ● Launched in 2015 ● Gained popularity for log analytics usage ● Used open-source Elastic under Apache License v2 ● Jan 2021 ○ Elastic NV changed licensing strategy ○ After ElasticSearch 7.10.2 & Kibana 7.10.2 ■ Not released under Apache License v2 ■ Released under Elastic License
  • 5. OpenSearch ● Sep 2021: ○ Amazon Elasticsearch Service renamed to Amazon OpenSearch Service ● Open-source fork of ElasticSearch 7.10.2 and Kibana 7.10.2 ● Highly scalable ● Fast access & response to large volumes of data ● Powered by Apache Lucene Search library
  • 6. Apache Lucene ● Apache Lucene project develops open-source search software ○ Releases a core search library named Lucene core ● Lucene Core ○ Java Library providing powerful indexing and search features
  • 7. Apache Solr ● Open source search platform ● Built on Apache Lucene
  • 8. Solr vs ElasticSearch ● Similar performance mostly. ● ES has better support for scalability ○ due to horizontal scaling ■ Better cloud support too ● ES can support multiple doc types in a single index better ○ More difficult to do this in Solr ● ES supports native DSL (Domain Specific Language) ○ Need to program queries in Solr ● https://mindmajix.com/elasticsearch-vs-solr
  • 9. Why OpenSearch ● Huge amount of machine generated data these days ○ Growing exponentially ● Getting insights is important ● Interactive log analytics ● Real-time application monitoring ● Website Search, etc.
  • 10. OpenSearch Features ● Easy to set-up and configure ● In-place upgrades ● Enables data monitoring & setting alerts based on thresholds ● Supports authentication, encryption & compliance requirements
  • 11. OpenSearch vs ElasticSearch ● OpenSearch was forked from Elastic Search ○ Now they are separate from each other ● Each is adding features separately ● OpenSearch ○ Inbuilt support from AWS
  • 12. OpenSearch features not in ES (free version) ● Centralized user accounts / access control ● Cross-cluster replication ● IP filtering ● Configurable retention period ● Anomaly detection ● Tableau connector ● JDBC driver ● ODBC driver ● Machine learning features such as regression and classification ● Link
  • 13. ElasticSearch Features ● Based on subscription levels ● https://www.elastic.co/subscriptions
  • 14. OpenSearch & ElasticSearch Version Support ● Currently supports the following OpenSearch versions: ○ 1.3, 1.2, 1.1, 1.0 ● And supports the following ElasticSearch versions: ○ 7.10, 7.9, 7.8, 7.7, 7.4, 7.1 ○ 6.8, 6.7, 6.5, 6.4, 6.3, 6.2, 6.0 ○ 5.6, 5.5, 5.3, 5.1 ○ 2.3 ○ 1.5
  • 15. What is Kibana ● Free & open front end application ● Charting tool for Elastic Stack ● Sits on top of Elastic Stack ● Sample Dashboard
  • 16. OpenSearch Dashboards ● Default visualization tool for data in OpenSearch ● Filter data with queries ● Comes with opensearch service
  • 18. OpenSearch Cluster ● Synonymous to domain ● Domains are clusters with ○ settings, ○ instance types, ○ instance counts, ○ and storage resources that you specify. ● Group of nodes ○ With same cluster.name attribute
  • 19. Opensearch Node ● Member of a cluster ● A distinct host ● With IP address
  • 20. Getting Started ● Create a domain ● Size the domain appropriately for your workload ● Control access to your domain using a domain access policy or fine-grained access control ● Index data manually or from other AWS services ● Use OpenSearch Dashboards to search your data and create visualizations
  • 21. Custom Endpoint ● If we want easier to read or custom domain name ● Can use Https ○ Upload SSL certificate
  • 22. Run OpenSearch locally ● Install docker ● wsl -d docker-desktop ● sysctl -w vm.max_map_count=262144 ● Ctrl+C ● docker-compose up ● Visit http://localhost:5601/ ● Use admin/admin to login and explore ● Link
  • 23. Upload Data ● One at a time ● Bulk
  • 24. Upload Data One At a time ● curl -XPUT -u "master:XXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_doc/1" -d '{"director": "Burton, Tim", "genre": ["Comedy","Sci-Fi"], "year": 1996, "actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"], "title": "Mars Attacks!"}' -H "Content-Type: application/json"
  • 25. Upload Data Bulk ● curl -XPOST -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_bulk" --data-binary @bulk_movies.txt -H "Content-Type: application/json"
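The bulk upload above posts a file such as bulk_movies.txt, whose contents the slides do not show. The following is a minimal sketch of how such a body could be generated, assuming the standard `_bulk` layout of one action line followed by one source line per document. The helper name `build_bulk_body` and the second sample document are illustrative, not taken from the deck.

```python
import json


def build_bulk_body(index, docs):
    """Build a newline-delimited body for the _bulk API.

    Each document contributes two lines: an action line naming the
    target index and id, then the document source itself.
    """
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Sample documents; the second one is a hypothetical addition.
movies = {
    "1": {"title": "Mars Attacks!", "director": "Burton, Tim", "year": 1996},
    "2": {"title": "Spirited Away"},
}
bulk_body = build_bulk_body("movies", movies)
```

Writing `bulk_body` to a file would give something usable with `--data-binary @bulk_movies.txt` in the curl command above.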
  • 27. Searching Data ● URI Searches ● Command Line ● OpenSearch Dashboards
  • 28. Searching Data - URI ● GET Request ● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true ● Searches all fields of the movies index
  • 29. URI Search Specific fields ● Search movies index and title property ● GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house
  • 30. Get Search Results - Command Line ● curl -XGET -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true"
  • 31. Query DSL ● For more complex queries ○ OpenSearch Domain Specific Language (DSL) ● POST request with query body
  • 32. Get Search Results - Dev Tools ● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_dashboards/app/dev_tools#/console ● GET _search { "query": { "match_all": {} } }
  • 33. Search on only specific fields GET _search { "size": 20, "query": { "multi_match": { "query": "U.S.", "fields": ["title", "actor", "director"] } } }
  • 34. Search - Boosting fields GET _search { "size": 20, "query": { "multi_match": { "query": "john", "fields": ["title^4", "actor", "director^4"] } } }
  • 35. Search - Pagination GET _search { "from": 0, "size": 1, "query": { "multi_match": { "query": "Drama", "fields": ["genre"] } } }
  • 36. Query -With Highlights In Response GET _search { "size": 20, "query": { "multi_match": { "query": "Manchurian", "fields": ["title^4", "actor", "director"] } }, "highlight": { "fields": { "title": {} }, "pre_tags": "<strong>", "post_tags": "</strong>", "fragment_size": 200, "boundary_chars": ".,!? " } }
  • 37. Query - Count GET movies/_count { "query": { "multi_match": { "query": "Manchurian", "fields": ["title^4", "actor", "director"] } } }
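The query slides above all share one shape: a `multi_match` query with optional field boosts, pagination, and highlighting. As a rough sketch, that shape could be assembled programmatically like this. The function name and its parameters are illustrative only; the resulting dictionary mirrors the JSON bodies shown in the slides.

```python
def multi_match_query(text, fields, boosts=None, start=0, size=20, highlight_field=None):
    """Build a multi_match search body like the ones in the slides.

    boosts maps a field name to a weight, rendered as "field^weight".
    start and size correspond to the "from" and "size" paging keys.
    """
    boosts = boosts or {}
    boosted = [f"{name}^{boosts[name]}" if name in boosts else name for name in fields]
    body = {
        "from": start,
        "size": size,
        "query": {"multi_match": {"query": text, "fields": boosted}},
    }
    if highlight_field:
        # Ask for matching fragments of this field to be returned marked up.
        body["highlight"] = {"fields": {highlight_field: {}}}
    return body


# Reproduces the "Boosting fields" slide: title and director weigh 4x.
body = multi_match_query(
    "john", ["title", "actor", "director"], boosts={"title": 4, "director": 4}
)
```

Serializing `body` with `json.dumps` gives a payload that can be pasted after `GET _search` in Dev Tools.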
  • 38. Dashboard Query Language ● Use DQL in Dashboards ○ Search for data and visualizations ● Terms Query ○ Search for any text ■ E.g. www.example.com ○ Access object’s nested field ■ E.g. coordinates.lat:43.7102 ○ Leading and trailing wildcards ■ host.keyword:*.example.com/* ● Operators ○ AND ○ OR
  • 39. Dashboard Query Language ● Date and range Queries ○ bytes >= 15 and memory < 15 ○ @timestamp > "2020-12-14T09:35:33" ● Nested field query ○ superheroes: {hero-name: Superman}
  • 41. Query Workbench ● SQL ○ Run SQL ○ Treat indices as tables ● PPL ○ Piped Processing Language ○ Commands delimited by pipes
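To make the SQL versus PPL distinction concrete, here is a small sketch that builds one request of each kind. It assumes the `_plugins/_sql` endpoint used later in the deck and the companion `_plugins/_ppl` endpoint. The `year` field comes from the sample movie document; the helper names and the specific filter are illustrative.

```python
def sql_request(query):
    """Return (method, path, body) for a SQL plugin query."""
    return ("POST", "_plugins/_sql", {"query": query})


def ppl_request(*commands):
    """Return (method, path, body) for a PPL query.

    PPL chains commands with pipes, so the commands are joined with " | ".
    """
    return ("POST", "_plugins/_ppl", {"query": " | ".join(commands)})


# SQL treats the movies index as a table.
sql = sql_request("SELECT title, year FROM movies WHERE year > 1990")

# The same question in PPL: start from a source, then filter, then project.
ppl = ppl_request("source=movies", "where year > 1990", "fields title, year")
```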
  • 42. Reporting ● Multiple file formats ● On demand/ Scheduled ● Generate from ○ Dashboard ○ Visualization ○ Discover
  • 43. Anomaly Detection ● Detect unusual behavior in time series data ● Anomaly Grade ● Confidence Score
  • 44. Notifications ● Supported ○ Amazon Chime ○ SNS ○ SES ○ SMTP ○ Slack ○ Custom Webhooks
  • 45. Observability plugin ● Visualize/Query time series data ● Event analytics ● Compare the data the way you like
  • 46. Index Management ● Create ISM policy ● To manage your indexes
  • 47. Security plugin ● Set up RBAC
  • 48. Migrate from ElasticSearch to OpenSearch
  • 49. Three major approaches ● Snapshot ● Rolling Upgrade ● Cluster Restart
  • 50. Snapshot Method ● Generate snapshot in ElasticSearch ● Save in shared directory ● Restore in OpenSearch ● Snapshot ○ Backup of entire cluster state ○ Useful for recovery from failure and migration ● Link
  • 51. Snapshot Method ● Check index compatibility ○ E.g.: Can't restore a 7.6.0 snapshot into a 7.5.0 cluster ● Link ● Fastest ● Easiest ● Most efficient
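The snapshot method boils down to three REST calls: register a shared repository, take a snapshot on the source cluster, and restore it on the target. The sketch below lists those calls, assuming a shared-filesystem (`fs`) repository. The repository name, snapshot name, and path are hypothetical placeholders, and the function only builds the requests rather than sending them.

```python
def snapshot_migration_steps(repo, snapshot, location, indices):
    """List the (method, path, body) calls for a snapshot-based migration.

    Step 1 is run on both clusters so they see the same shared directory,
    step 2 on the source cluster, step 3 on the target cluster.
    """
    return [
        ("PUT", f"_snapshot/{repo}", {"type": "fs", "settings": {"location": location}}),
        ("PUT", f"_snapshot/{repo}/{snapshot}", {"indices": indices}),
        ("POST", f"_snapshot/{repo}/{snapshot}/_restore", {"indices": indices}),
    ]


# Hypothetical names: "migration-repo", "snapshot-1", "/mnt/snapshots".
steps = snapshot_migration_steps("migration-repo", "snapshot-1", "/mnt/snapshots", "movies")
```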
  • 52. Rolling Upgrade ● Official way to migrate cluster ● Without interruption ● Rolling upgrades are supported: ○ Between minor versions ○ From 5.6 to 6.8 ○ From 6.8 to 7.14.1 ○ From any version since 7.14.0 to 7.14.1
  • 53. Rolling Upgrade ● Shut down one node at a time ○ Minimal disruption
  • 54. Cluster Restart Upgrades ● Shut down all nodes ● Perform the upgrade ● Restart the cluster
  • 56. OpenSearch Mapping ● Dynamic ○ When you index a document ○ Opensearch adds fields automatically ○ It deduces their types by itself ● Explicit ○ If you know your data types ○ Preferred way of doing things
  • 57. OpenSearch Mapping ● If you do not define a mapping ahead of time, OpenSearch dynamically creates a mapping for you. ● If you do decide to define your own mapping, you can do so at index creation. ● ONE mapping is defined per index. Once the index has been created, we can only add new fields to a mapping. We CANNOT change the mapping of an existing field. ● If you must change the type of an existing field, you must create a new index with the desired mapping, then reindex all documents into the new index.
  • 58. Text vs keyword data types ● Text type ○ Full text searches ● Keyword type ○ Exact searches ○ Aggregations ○ Sorting
  • 59. Text vs Keyword ● Inverted Index
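The difference between the two types comes from what goes into the inverted index. The toy model below is only an illustration, not how Lucene is implemented: a text field is broken into lowercase terms before indexing, while a keyword field is indexed as a single unmodified value. The second title is a made-up sample.

```python
import re
from collections import defaultdict


def build_inverted_index(docs, analyzed=True):
    """Map each term to the set of document ids containing it.

    analyzed=True mimics a text field (lowercase word tokens);
    analyzed=False mimics a keyword field (the whole value is one term).
    """
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = re.findall(r"\w+", value.lower()) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index


titles = {1: "Mars Attacks!", 2: "Life on Mars"}  # second title is hypothetical
text_index = build_inverted_index(titles, analyzed=True)
keyword_index = build_inverted_index(titles, analyzed=False)
```

Looking up `mars` in `text_index` finds both documents, which is what a full-text search needs. `keyword_index` only has entries for the exact original strings, which is what exact matching, sorting, and aggregations rely on.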
  • 61. OpenSearch Aggregations ● Analyze data ○ In real time too ● Extract statistics ● More expensive than queries ○ On CPU and memory ○ In general
  • 62. Aggregation Query ● Use aggs or aggregations
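As an illustration of the `aggs` key, the sketch below builds a terms aggregation that counts documents per distinct field value. It assumes the movies index from earlier slides and a `genre.keyword` subfield, which dynamic mapping typically creates for string fields. The aggregation name `by_genre` is arbitrary.

```python
def terms_aggregation(field, name, top=10):
    """Build a search body that returns only bucket counts for a field.

    "size": 0 suppresses the normal hits so that only the
    aggregation result comes back.
    """
    return {
        "size": 0,
        "aggs": {name: {"terms": {"field": field, "size": top}}},
    }


# Count movies per genre; genre.keyword is an assumed subfield name.
agg_body = terms_aggregation("genre.keyword", "by_genre")
```

The body would be sent as `GET movies/_search`, and the buckets would appear under `aggregations.by_genre` in the response.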
  • 65. Data Streams in OpenSearch ● Ingesting time series data ○ Logs ○ Events ○ Metrics, etc. ● Number of documents grows rapidly ● Append Only data ● Don't need to update older documents (Very rarely)
  • 66. Rollover ● If data is growing rapidly ● Write to index up to a certain threshold ○ Then create a new index ○ And start writing to it ● Optimize the active index for high ingest rates on high-performance hot nodes. ● Optimize for search performance on warm nodes. ● Shift older, less frequently accessed data to less expensive cold nodes. ● Delete data according to your retention policies by removing entire indices.
  • 67. Index Template ● Data Stream requires an index template ● A name or wildcard (*) pattern for the data stream. ● The data stream’s timestamp field. This field must be mapped as a date or date_nanos field data type and must be included in every document indexed to the data stream. ● The mappings and settings applied to each backing index when it’s created.
  • 68. ILM Policy ● Index Lifecycle Management Policy ● Can be applied to any number of indices ● Usage ○ Allocate ○ Delete ○ Rollover ○ Read Only ○ Wait for snapshot
  • 69. ILM Policy ● Create a policy: ● Link
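ILM is the Elasticsearch name for this feature. In OpenSearch the comparable mechanism is the Index State Management (ISM) plugin mentioned on the Index Management slide. The sketch below builds a minimal ISM policy body with two states, one that rolls the index over at a document-count threshold and one that deletes it after a given age. The thresholds, state names, and helper name are illustrative assumptions.

```python
def ism_rollover_delete_policy(max_docs, delete_after):
    """Build an ISM policy: roll over when large, delete when old."""
    return {
        "policy": {
            "description": "Roll over at a doc-count threshold, delete after a set age",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    # Start a new backing index once this one holds max_docs documents.
                    "actions": [{"rollover": {"min_doc_count": max_docs}}],
                    # Move to the delete state once the index reaches the given age.
                    "transitions": [
                        {"state_name": "delete", "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
        }
    }


# Hypothetical thresholds: one million documents, thirty days of retention.
policy = ism_rollover_delete_policy(1_000_000, "30d")
```

A body like this would be sent with `PUT _plugins/_ism/policies/<policy_id>`, where the policy id is whatever name you choose.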
  • 73. Index Template ● Tells OpenSearch how to configure an index when it is created ● For data streams ○ Configures the stream’s backing indices ○ Configured prior to index creation
  • 74. Templates Types ● Component Templates ○ Reusable building blocks that configure ■ mappings, ■ settings, and ■ Aliases ○ Not directly applied to indices ● Index Template ○ Collection of component templates ○ Directly applied to indices ○ Some defaults: metrics-*-*, logs-*-*
  • 76. Create Index Template ● Data Stream requires matching index template ● PUT _index_template/{template_name}
  • 78. Create data stream ● Documents must contain timestamp field ● PUT _data_stream/my-data-stream ● Stream’s name must match one of your index template’s index patterns
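The two steps above (create a matching index template, then create the stream) can be sketched as a pair of requests. This is a minimal example that assumes the default `@timestamp` field and uses hypothetical names for the template, pattern, and stream. It also checks locally that the stream name matches the template pattern, since creation fails otherwise.

```python
from fnmatch import fnmatch


def data_stream_requests(template_name, pattern, stream_name):
    """Return the (method, path, body) calls that set up a data stream."""
    if not fnmatch(stream_name, pattern):
        raise ValueError(f"{stream_name!r} does not match template pattern {pattern!r}")
    template = {
        "index_patterns": [pattern],
        # An empty object marks the template as a data stream template.
        "data_stream": {},
        "priority": 100,
    }
    return [
        ("PUT", f"_index_template/{template_name}", template),
        ("PUT", f"_data_stream/{stream_name}", None),
    ]


# Hypothetical names: template "logs-template", pattern "logs-*", stream "logs-nginx".
requests_to_send = data_stream_requests("logs-template", "logs-*", "logs-nginx")
```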
  • 79. Get Info About Data Stream ● GET _data_stream/my-data-stream
  • 80. Delete Data Stream ● DELETE _data_stream/my-data-stream
  • 82. Cross Cluster Replication ● Cross Cluster replication plugin ○ Replicates indexes, mapping & metadata from one cluster to another ● Advantages ○ Continue to handle search requests if there is an outage ○ Can help reduce latency in application ■ Replicating data across geographically distant data centers
  • 83. Replication ● Active passive model ○ Follower index pulls data from leader index ● It can be ○ Started ○ Paused ○ Stopped ○ Resumed ● Can be secured ○ Security plugin ○ Encrypt cross cluster traffic
  • 84. Exercise ● Create 2 domains in AWS OpenSearch ● Link
  • 85. Exercise ● Source Domain Connections Tab -> Outbound -> ○ Create Connection to Destination Domain ● Set access policy on destination domain: ● Link
  • 86. Exercise ● Get Connection status ○ GET _plugins/_replication/connect1/_status ● Start syncing ○ PUT _plugins/_replication/connect1/_start ○ { ○ "leader_alias": "Connect1", ○ "leader_index": "movies", ○ "use_roles":{ ○ "leader_cluster_role": "all_access", ○ "follower_cluster_role": "all_access" ○ } ○ }
  • 88. Opensearch plugins ● Standalone components ○ That add features and capabilities ● Huge number of plugins available ● E.g. ○ Replication Plugin ○ Security plugin ○ Notification plugin
  • 89. SQL Plugin ● Lets you run SQL queries on ESDB ● Add data ○ PUT movies/_doc/1 ○ { "title": "Spirited Away" } ● Query data ○ POST _plugins/_sql ○ { "query": "SELECT * FROM movies LIMIT 50" }
  • 90. SQL Plugin ● Delete data from ESDB Index ● Enable Delete via SQL plugin ○ PUT _plugins/_query/settings ○ { "transient": { "plugins.sql.delete.enabled": "true" } }
  • 91. SQL Plugin - Delete ● To delete the data ○ POST _plugins/_sql ○ { "query": "DELETE FROM movies" }
  • 92. Asynchronous Search ● Large volumes of data ● Can take longer to search ● Async ○ Run searches in the background ○ Monitor progress of these searches ○ Get back partial results as they become available
  • 93. Asynchronous Search ● POST _plugins/_asynchronous_search ● Response contents: ○ ID ■ Can be used to track the state of the search ■ Get partial results ○ State ■ Running ■ Completed ■ Persisted ● Link
  • 95. Clients ● OpenSearch Python client ● OpenSearch JavaScript (Node.js) client ● OpenSearch .NET clients ● OpenSearch Go client ● OpenSearch PHP client
  • 96. Open Search Client for .NET ● OpenSearch.Net ○ Low level client ● OpenSearch.Client ○ High level client ● Sample code: Link
  • 97. Exercise ● Create a .NET application ● Add a document to OpenSearch using the .NET Application ○ OpenSearch.Client (.NET High level client)
  • 99. Beats ● Data shippers ● Agents on servers ● Send data to ES/ Logstash
  • 100. Grafana ● An open source visualization tool ● Various sources can be used as data source: ○ InfluxDB ○ MySQL ○ ElasticSearch ○ PostgreSQL ● Better suited for metrics visualizations ● Does not allow full text data querying
  • 101. Logstash ● Free/ Open-Source ● Data processing pipeline ● Ingests data from multitude of sources ● Transforms it ● Sends it to your favorite stash
  • 102. Logstash - Ingestion ● Data of all shapes/ sizes/ sources ○ Can be ingested ● It can parse/ transform your data
  • 103. Logstash - Output ● ElasticSearch ● Mongodb ● S3 ● Etc. ● Link
  • 104. AWS OpenSearch Security ● Use multi-factor authentication (MFA) with each account. ● Use SSL/TLS to communicate with AWS resources. We recommend TLS 1.2 or later. ● Set up API and user activity logging with AWS CloudTrail. ● Use AWS encryption solutions, along with all default security controls within AWS services. ● Use advanced managed security services such as Amazon Macie, which assists in discovering and securing personal data that is stored in Amazon S3. ● If you require FIPS 140-2 validated cryptographic modules when accessing AWS through a command line interface or an API, use a FIPS endpoint.
  • 105. Summary ● Opensearch ○ Open Source Search solution ● Upcoming and supported by AWS ● Caters to most search use cases ○ Great Query performance ● Powerful tools ● Community Support
  • 106. Connect with me ● Trainings on various tech topics ● For any questions: ○ https://linkedin.com/in/coach4dev