Expanding Elastic:
Learn how anyone can leverage heterogeneous compute to extend and accelerate Elasticsearch
The document discusses how Ryft provides solutions to leverage powerful FPGA- and x86-based heterogeneous compute resources to gain immediate insight from data in Elasticsearch without indexing or data preparation latency. Ryft can accelerate workflows, speed search and analysis across unstructured data with no transformation needed, and increase the power of edit distance searches for fuzzy matching beyond a distance of 2. Ryft offers flexible deployment on-premise, in the cloud, or in hybrid environments to extend the capabilities of Elasticsearch.
Leverage heterogeneous compute to gain real-time insights from Elasticsearch
1. Expanding Elastic:
Learn how anyone can leverage heterogeneous compute
to extend and accelerate Elasticsearch
Pat McGarry, Vice President of Engineering
Al Leyva, Director of Product Management for Analytics Integration
Ryft
March 7, 2017
@Ryft
2. Analytics at the Speed of Your Business
Challenges:
Exponential data growth
Traditional text searching is hindered by complex indexing and transformation requirements
Answers are delayed as legacy architectures require hours or days to find insights
Solutions:
Deploy heterogeneous compute on-premise, hybrid or in the cloud
Abstract away the complexities of powerful FPGA-based compute technology
Real-time insights with no indexing or data preparation latency
Fast data growth is creating the largest business threats & opportunities since the Internet
3. Accelerate and Extend Elasticsearch with Ryft
Leverage powerful FPGA- and x86-based heterogeneous compute resources to gain immediate insight without indexing
Reduce need for data indexing and transformation, accelerate searches and extend Elasticsearch capabilities to:
Accelerate workflows with the ability to deploy pre-index and post-index searches
Speed search and analysis across unstructured data and JSON, XML, LOGs, CSV, TSV and other files with no transformation
Increase the power of edit distance with user-selectable changes to large (>2) distance requests for Fuzzy Hamming or Levenshtein searches
Enhance wildcard searches to include leading wildcard characters
4. Flexible Deployment On-Premise, in the Cloud or in Hybrid Environments
The Elastic Stack implements a distributed, JSON-based search and analytics engine:
[Diagram labels: Kibana Layer; Elasticsearch Layer; Lucene Layer (ES to Lucene); Ryft ONE / AWS F1 Instance (ES to Ryft Primitive). ES Plugin mechanism routes requests (fuzziness & metric).]
Elasticsearch on Ryft can be deployed in your environment, on-premise, hybrid or in the cloud:
Deploy via the new Amazon F1 platform.
Deploy on-premise or in hybrid environments with the Ryft ONE accelerator.
5. Speed search and analysis across unstructured data and JSON, XML, LOGs, CSV, TSV files with no ETL
• Using Elasticsearch command
• Un-indexed human genome data
• Match query with Levenshtein search
• Edit distance of 4
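The slide's screenshot of the command is not preserved in this transcript. As a rough sketch of what such a request body could look like, the Python below builds a match query carrying the two fields the deck says the plugin routes on, fuzziness and metric. The field name `sequence`, the search text, the metric value `"levenshtein"`, and the placement of `metric` inside the query are all assumptions for illustration, not confirmed Ryft syntax.

```python
import json

def fuzzy_match_body(field, text, distance, metric):
    """Build a match-query body that carries an edit-distance request.

    'fuzziness' is a standard Elasticsearch match-query option (stock
    Elasticsearch caps it at 2). 'metric' is the field the deck says the
    Ryft plugin reads; its placement and values here are assumptions.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": distance,
                    "metric": metric,  # hypothetical value name
                }
            }
        }
    }

# Hypothetical field name and search text, with the slide's distance of 4.
body = fuzzy_match_body("sequence", "GATTACAGATTACA", 4, "levenshtein")
print(json.dumps(body, indent=2))
```

The body is only constructed and printed here; sending it to a cluster is left out because the endpoint and index would depend on the deployment.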
6. Increase edit distance values beyond 2 using Fuzzy Hamming or Levenshtein searches powered by Ryft
• Using Elasticsearch command
• Match phrase query with Levenshtein search
• With edit distance of 6
• No re-indexing necessary
[Slide example: "Pharmaceutical Research" matched against "Pharmaceutical Science" (r deleted)]
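The two phrases on this slide make a useful check on why a distance of 2 is not enough. A minimal sketch of the standard dynamic-programming Levenshtein computation follows; it is a plain-Python illustration of the metric, not a description of how Ryft computes it on FPGA hardware.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]

print(levenshtein("Pharmaceutical Research", "Pharmaceutical Science"))  # prints 6
```

The result is 6: five substitutions plus the deleted r. A search limited to a distance of 2 would therefore not pair these two phrases.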
7. Enhance wildcard searches to include leading wildcard characters powered by Ryft
• Using Elasticsearch command
• Using XML pcap file
• Looking for IP addresses xx.0.90.xx
• No ETL or indexes necessary
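The pattern xx.0.90.xx is unknown at both ends, which is what makes it a leading-wildcard search. To make the search target concrete, the sketch below scans raw text for such addresses with Python's `re` module. It illustrates only what is being matched, not how Ryft or Elasticsearch executes the query, and the XML sample line is made up.

```python
import re

# Dotted-quad addresses whose two middle octets are 0 and 90.
# The look-arounds stop a match from starting or ending inside a longer
# run of digits and dots. Octet ranges (0-255) are not validated.
PATTERN = re.compile(r"(?<![\d.])\d{1,3}\.0\.90\.\d{1,3}(?![\d.])")

def find_addresses(text):
    """Return every address in text of the form xx.0.90.xx."""
    return PATTERN.findall(text)

# Hypothetical line from an XML export of a packet capture.
sample = '<field name="ip.src" show="10.0.90.17"/><field name="ip.dst" show="10.1.90.17"/>'
print(find_addresses(sample))  # prints ['10.0.90.17']
```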
Threat: Reliance upon legacy network & compute architectures that were never designed to organize, store & process data at the rate required today.
Opportunity: There is an enormous growth market for organizations capable of gaining instant analysis on data in any analytics ecosystem and regardless of data type, format or structure.
Ryft leverages FPGA/x86 heterogeneous compute technology to eliminate indexing and transformation and to provide expanded Elasticsearch functionality. For instance, Elasticsearch supports complex Levenshtein distance searches only up to a distance of two. However, many of our customers require a distance greater than two when the data contains misspellings or abbreviations, many permutations of the same words, or many transcriptions.
By leveraging FPGA and x86-based heterogeneous architectures either on-premise or in the cloud via the Amazon F1 platform, companies can eliminate data indexing and transformation, accelerate searches and extend Elasticsearch capabilities to:
Accelerate workflows with the ability to deploy pre-index and post-index searches
Speed search and analysis across unstructured data and JSON, XML, LOGs, CSV, TSV and other files with no ETL
Increase edit distance values up to 10 or more using Fuzzy Hamming or Levenshtein searches
Enhance wildcard searches to include leading wildcard characters
With Ryft:
Adds both edit distance and Hamming fuzzy search capabilities to Elasticsearch for phrase and word matching
Extends the fuzziness distance beyond 2
Reuses existing ES “metric” to request search type
No restrictions on Elasticsearch operations
No reliance on Elasticsearch indexes
Adds support for non-JSON elements via ES syntax without indexes
Wildcard support (limited in ES), PCRE2-compliant regular expressions, unstructured (non-indexed) file support, indexing performance improvements
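The list above names Hamming search alongside edit distance. The two metrics differ in what they count: Hamming distance compares equal-length strings position by position and counts only substitutions, while edit (Levenshtein) distance also allows insertions and deletions. A minimal sketch of the Hamming side, as a plain-Python illustration:

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))

print(hamming("karolin", "kathrin"))  # prints 3
```

Because a single inserted or deleted character shifts every later position, Hamming search suits fixed-width data, while edit distance is the better fit for misspellings and abbreviations.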
All you see is the Elasticsearch application you know and love. Behind the scenes, Ryft abstracts away the complexities of using FPGA-based heterogeneous technology to execute multiple complex algorithms, and return data in native Elasticsearch JSON formats for visualization through the Elasticsearch application, Kibana, and/or other applications using the Elasticsearch API.
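Since results come back in native Elasticsearch JSON, client code written against the standard search-response layout should not need to change. A small sketch of reading such a response is below; the sample response is hand-written in that layout, and its index name, field name, and values are made up for illustration.

```python
import json

def extract_sources(response_text):
    """Return the _source document of every hit in a search response."""
    response = json.loads(response_text)
    return [hit["_source"] for hit in response["hits"]["hits"]]

# Hand-written sample response; all names and values are hypothetical.
sample = json.dumps({
    "took": 5,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 1.0,
        "hits": [
            {"_index": "papers", "_id": "1", "_score": 1.0,
             "_source": {"title": "Pharmaceutical Science"}}
        ],
    },
})
print(extract_sources(sample))  # prints [{'title': 'Pharmaceutical Science'}]
```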