Steering Away From Bolted-on Analytics
Whitepaper
info@connexica.com www.connexica.com +44(0)1785 246777
Search Powered Data Discovery
Introduction
Since the dawn of the PC, the spreadsheet and the database, developers have been continually
enhancing the same products to eke out more and more life from what are now legacy
technologies.
The cost of ground up development is extremely high so it makes sense to “sweat the asset” and
try to prolong the shelf life of products.
In the context of data storage software, OLTP databases such as DB2 and Oracle were great
inventions. These technologies were designed specifically to allow data to be stored securely
inside highly optimised, normalised databases.
Thousands of systems were developed where a relational database was the heartbeat of the
application and SQL the de facto way of getting data in and out of it.
OLTP databases were not designed for fast data retrieval so we came up with new ways of
storing data, by designing warehouses and duplicating data in a de-normalised form. This meant
fewer SQL joins and consequently faster data retrieval.
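As a minimal sketch of that trade-off, the snippet below uses Python's built-in sqlite3 module with made-up tables (customers, orders and orders_wide are illustrative names, not taken from any real system). It shows the same question answered once with a join over normalised tables and once from a single de-normalised copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised (OLTP-style) tables: each fact is stored once.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme', 'North'), (2, 'Bolt', 'South');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- De-normalised (warehouse-style) copy: customer columns repeat on every order row.
    CREATE TABLE orders_wide AS
        SELECT o.id AS order_id, c.name AS customer, c.region AS region, o.amount AS amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# Reporting off the normalised tables needs a join ...
joined = conn.execute(
    "SELECT c.region, SUM(o.amount) FROM orders o "
    "JOIN customers c ON c.id = o.customer_id GROUP BY c.region ORDER BY c.region"
).fetchall()

# ... whereas the de-normalised copy answers the same question from one table.
flat = conn.execute(
    "SELECT region, SUM(amount) FROM orders_wide GROUP BY region ORDER BY region"
).fetchall()
```

Both queries return the same totals; the price of the second form is the duplicated customer data that has to be kept in step with the source.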
When SQL over a Warehouse became too slow, new ways of pre-aggregating data were devised,
resulting in the rise of OLAP and its multiple storage variants HOLAP, MOLAP, ROLAP -
depending on how you wish to tune performance against having additional storage and
hardware overheads.
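The idea of pre-aggregation can be illustrated with a small Python sketch. The rows and the build_cube helper are hypothetical and far simpler than a real OLAP engine; the point is only that totals are computed ahead of time for the questions chosen up front.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (region, product, amount).
transactions = [
    ("North", "Widget", 100.0),
    ("North", "Gadget", 50.0),
    ("South", "Widget", 75.0),
]

def build_cube(rows):
    """Pre-aggregate totals by region only; that is the one question decided up front."""
    cube = defaultdict(float)
    for region, _product, amount in rows:
        cube[region] += amount
    return dict(cube)

cube = build_cube(transactions)

# Fast lookup for the planned question ...
north_total = cube["North"]

# ... but product detail was discarded during aggregation, so an unplanned
# question has to go back to the transaction-level rows.
widget_total = sum(amount for _region, product, amount in transactions if product == "Widget")
```

The lookup is quick because the work was done at load time, which is also why anything not aggregated in advance is unavailable from the cube.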
Technology Evolution
When OLAP became too restrictive, as users needed access to
data at a more granular, transaction level with reduced loading
times, in-memory technologies gained popularity, taking
advantage of 64-bit architectures and the ability to build
hardware with more and more RAM.
When in-memory technology struggles or the specialised
hardware is deemed too expensive to cope with the sheer
volume of data being generated by the modern day business, or
where aggregated data is insufficient for the ad-hoc information
needs of today’s business worker, we need to take stock and
consider if we may finally have come to a point where we need to
do things differently.
So what’s next?
All of these developments have been incremental add-ons that
have enabled the core data repository – a relational database – to
carry on doing what it’s best at – OLTP – Online Transaction
Processing, aka “storing” data.
All of the reporting technologies have been incremental add-ons
to SQL, to get around the limitations of a technology that was
never designed for fast data retrieval. The technology is there to
plug a gap, yet businesses are gambling their futures and cash on
these systems continuing to work and be fit for purpose, whilst
Big Data, social media and real-time analytics are
becoming a way of life.
90% of the world’s data
was generated over
the last 2 years
Source: www.sciencedaily.com
Search-Based Business Analytics
CXAIR is a combined Search Engine and a Business Analytics tool that is not based on OLTP, OLAP
or in-memory technology. CXAIR instead uses the same principles adopted by Internet Search
Engines such as Google and Bing.
Where traditional enterprise reporting tools either report directly off the source data or off a
pre-aggregated cube or in-memory aggregations, CXAIR users report off a search engine created
from an organisation’s source data in a similar way to how a search engine crawls web sites for
content.
Whilst Google crawls web sites for text, documents, media etc., CXAIR crawls databases and
stores a copy of that data in super-fast, highly scalable indexes. These indexes, when grouped
together, form a search engine that can be “Googled” with sub-second speeds using natural
language search terms.
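To make the idea of an index built from database content more concrete, here is a minimal inverted-index sketch in Python. It is an illustration of the general search-engine technique only, not CXAIR's implementation; the rows, field names and helper functions are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical rows "crawled" from a source database table.
rows = [
    {"id": 1, "name": "Alice Smith", "city": "Stafford"},
    {"id": 2, "name": "Bob Jones", "city": "Stoke"},
    {"id": 3, "name": "Alice Jones", "city": "Stafford"},
]

def tokenize(value):
    """Lower-case a value and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", str(value).lower())

def build_index(documents):
    """Map every term found in any field to the set of document ids containing it."""
    index = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            if field == "id":
                continue
            for term in tokenize(value):
                index[term].add(doc["id"])
    return index

def search(index, query):
    """Return ids of documents that contain every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_index(rows)
```

Calling search(index, "alice stafford") looks up two small sets and intersects them, with no SQL and no scan of the original rows. Because every field is tokenised, any value can be searched without deciding in advance which columns matter.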
Unlike internet search engines, CXAIR can update its indexes in near real time, allowing users to
search and find data seconds after being entered into its originating system.
Unlike OLTP and OLAP databases, CXAIR indexes can be searched without the need to write SQL
or MDX.
CXAIR is both a flexible data store and a powerful search and analytical tool, providing many of
the best features associated with the most popular Business Intelligence tools and Warehouse
solutions - but in a fundamentally different way.
The CXAIR approach is focused very much around the business user and how they gain insight
from their corporate data assets without reliance on IT.
CXAIR uses search technology as the “engine” and “storage mechanism” for storing data from
disparate data sources and, like Google, handles extremely large data volumes and returns
sub-second response times to queries on commodity hardware.
Understanding Search BI
Below is a summary of some of the key differences between Search BI and other technologies
used to store and retrieve data.
What’s different between a search engine and OLTP?
• A search engine is built for fast data retrieval, not for transaction-based data entry.
• The technology itself is relatively new, coming over 30 years after OLTP, and has evolved to
  handle huge data volumes and provide rapid retrieval times.
• Search engines are extremely easy to configure.
• Search engines are queried by typing in natural language search terms, not complex SQL.
• Search engines are not designed for data entry and do not implement a comprehensive
  transaction management system.
• OLTP is designed for secure data entry and does not provide an integrated reporting
  and analysis layer.
• Search engines are designed specifically for the web.
What’s different between a search engine and OLAP?
• In a search engine, data is accessed through natural language searches, rather than MDX.
• Search engine data is held at document / transaction level and is not pre-aggregated.
• A search engine does not require data to be in an organised structure.
• OLAP is designed as a fast aggregation engine that sits on top of one or more OLTP or
  Warehouse systems and does not provide an integrated reporting and analysis layer.
• In a search engine all fields and values are available for searching whereas OLAP requires
  you to decide exactly what information is to be made available to the user by structuring
  cubes, dimensions and measures.
What’s different between a search engine and in-memory analytics?
• A search engine does not pre-load data into memory.
• A search engine is not restricted by memory limits but by disk storage and IO performance.
• A search engine is easily distributed across multiple servers to spread load for large
  numbers of users and high volumes of data.
• A search engine will run on commodity hardware.
• Search engine data is held at document / transaction level and is not pre-aggregated.
• In a search engine all fields and values are available for searching whereas in-memory
  analytics requires you to decide exactly what information is to be made available to the
  user by structuring in-memory aggregations, hierarchies and measures.
Advantages of Search BI
Here are some of the key advantages Search BI has over other technologies used to store and
retrieve data.
• Search BI is easier to use.
• The technology is inherently fast.
• The technology is relatively lightweight and is easy to implement.
• Search engines are designed to handle very large data volumes.
• End users do not require SQL skills.
• Search engines do not need to differentiate between structured and unstructured data.
• Search BI stores its data at document / transaction level and does not pre-aggregate it.
• Search engines do not need to pre-load data into memory.
• The scalability of a search engine is not restricted by memory limits.
• Search BI provides an integrated storage repository and query tool.
• Search engines run on commodity hardware.
Understanding
CXAIR
CXAIR is Search BI that quickly and inexpensively presents
actionable information to all of your business users without the
need for IT.
The product uses search technology to provide a simple, easy to
understand interface for querying and reporting on diverse
information collated from multiple, disparate data sources.
Combined with a natural language search capability, CXAIR provides
a highly visual front-end that allows business users to create and
view high quality charts, dashboards and Infographic style outputs.
Unlike traditional reporting tools, CXAIR can query across millions
of transactions at the speed of Google, providing near real-time
responses to information requests.
CXAIR is able to pull structured and unstructured content from
operational data stores, applications, spreadsheets, document
directories and all manner of different media streams, including
Twitter and RSS web feeds, and present that data back in a
consolidated format for consumption by all levels of the business.
200+ implementations of
CXAIR across a variety
of industries
Source: Connexica
How does it work?
To best understand how CXAIR works, the first thing to understand is the various components
that make up the technology and how you would go about building a search engine, then
analysing the contents through the CXAIR analysis and reporting engine.
CXAIR consists of:
• A data gathering engine that continually mines information from multiple data sources and
  stores a copy of that data as encoded index files.
• A high performance search engine that allows data contained in the index files to be queried
  and analysed using natural language search terms.
• A visualisation engine for transforming search results into graphics.
• An analysis and reporting engine for transforming search results and visualisations into reports
  and dashboards.
• A configuration manager that maintains the metadata and configuration details relating to the
  CXAIR installation.
• A web user interface for end users who wish to search or run pre-created reports and analyses.
• A web interface for full CXAIR users who require access to the entire front-end search, analysis
  and report development capabilities.
• A web interface for administrators to configure and administer the CXAIR instance for access by
  authorised users and 3rd party applications.
Getting Data into CXAIR
In the context of CXAIR, a search engine is a series of indexes which have been logically grouped
together to form a single searchable source of information.
Indexes are stored as a series of segments which simply appear as a group of files held within a
sub-directory on a disk. These files are stored in binary form and are accessed whenever a user
queries a search engine that contains that index via CXAIR or a 3rd party application using a CXAIR
API call.
Index segments contain a series of documents that contain searchable text, dates, numbers and
images that have been extracted and analysed by the index build process and converted into a
proprietary format designed for fast data access.
To create an index, a series of wizards lets you select the type of index you wish to create
(complete refresh, incremental, continuous update, snapshot or archive), the data source you
wish to index (e.g. database, file system, web URL, spreadsheet, email etc.) and then which
filters to apply to restrict what data is returned to the indexing process.
The data gathering engine then crawls the source system and transforms the data into a
searchable index.
This process can be repeated any number of times to create any number of searchable indexes.
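The three wizard choices (index type, data source and filters) can be pictured as a small configuration record. The sketch below is purely hypothetical: IndexDefinition, its fields and the accepts method are invented for illustration and do not reflect CXAIR's actual configuration format or API.

```python
from dataclasses import dataclass, field
from enum import Enum

class IndexType(Enum):
    COMPLETE_REFRESH = "complete refresh"
    INCREMENTAL = "incremental"
    CONTINUOUS_UPDATE = "continuous update"
    SNAPSHOT = "snapshot"
    ARCHIVE = "archive"

@dataclass
class IndexDefinition:
    """Hypothetical record of the three wizard choices; not CXAIR's actual format."""
    name: str
    index_type: IndexType
    source: str                                   # e.g. "database", "file system", "web URL"
    filters: dict = field(default_factory=dict)   # field name -> required value

    def accepts(self, record):
        """True if the record passes every filter and should reach the indexing process."""
        return all(record.get(k) == v for k, v in self.filters.items())

definition = IndexDefinition(
    name="open_orders",
    index_type=IndexType.INCREMENTAL,
    source="database",
    filters={"status": "open"},
)

# Only records that pass the filters are handed to the index build.
records = [{"id": 1, "status": "open"}, {"id": 2, "status": "closed"}]
to_index = [r for r in records if definition.accepts(r)]
```

Repeating the process simply means creating further definitions, one per index.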
Administrators can create multiple search engines that either share common indexes or have their
own indexes and restrict access to those indexes and search engines to specific groups of users.
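One way to picture this grouping is sketched below. The engine names, index names and group names are invented, and the access model is simplified to a per-search-engine check, which is an assumption rather than a description of how CXAIR enforces permissions.

```python
# Hypothetical names throughout; this only illustrates the grouping described above.
search_engines = {
    "finance": {"indexes": {"invoices", "customers"}, "groups": {"finance_team"}},
    "sales": {"indexes": {"orders", "customers"}, "groups": {"sales_team", "finance_team"}},
}

def searchable_indexes(user_groups):
    """Return every index reachable through a search engine the user's groups may access."""
    allowed = set()
    for engine in search_engines.values():
        if engine["groups"] & set(user_groups):
            allowed |= engine["indexes"]
    return allowed
```

Here the customers index is shared by both search engines, while invoices and orders each belong to one.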
Once an index has been built and added to a search engine, a user (subject to access permissions)
can search the index without any further configuration.
Users can perform free text searches, filter data by clicking on search results to narrow down and
refine the search or use range filters such as date pickers, sliders, check boxes and numeric range
controls to perform more sophisticated searching.
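The three styles of interaction (free text, click-to-refine and range filters) amount to successively narrowing a result set. The following Python sketch illustrates that flow with made-up documents and helper functions; it scans a list for simplicity, whereas a real search engine would resolve each step against its indexes.

```python
from datetime import date

# Hypothetical documents already held in an index.
documents = [
    {"id": 1, "text": "elective hip replacement", "ward": "A", "admitted": date(2023, 7, 3)},
    {"id": 2, "text": "emergency hip fracture", "ward": "B", "admitted": date(2023, 8, 14)},
    {"id": 3, "text": "elective knee replacement", "ward": "A", "admitted": date(2023, 11, 2)},
]

def free_text(docs, query):
    """Keep documents whose text contains every word of the query."""
    words = query.lower().split()
    return [d for d in docs if all(w in d["text"].lower().split() for w in words)]

def refine(docs, field, value):
    """Equivalent of clicking a value in the results to narrow the search."""
    return [d for d in docs if d[field] == value]

def date_range(docs, field, start, end):
    """Equivalent of a date picker: keep documents with start <= field <= end."""
    return [d for d in docs if start <= d[field] <= end]

hits = free_text(documents, "replacement")
hits = refine(hits, "ward", "A")
hits = date_range(hits, "admitted", date(2023, 7, 1), date(2023, 9, 30))
```

Each step takes the previous results as input, so the user can keep refining or back out one filter at a time.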
Once the data has been filtered to the records (documents) you are interested in you can then
transform the output into a table, chart, Venn diagram or dashboard without any coding or SQL.
Diagram showing how different data sources can be combined to create multiple search engines.
So why is this different?
Warehouse becomes an option, not a necessity
Warehouses are typically built to provide a unified view of the business within a single database.
Often data in critical OLTP based systems will get archived off into a warehouse due to the need
for the OLTP database to function as quickly as possible. In contrast Warehouses are often used
to store historical data for periodic reporting and trend analysis.
Designing a Warehouse requires a combination of SQL expertise as well as business knowledge.
Designing the layout, structures, dimensions and measures for calculating totals and metrics
requires both technical skills and knowledge of the systems and their data, as well as the
reporting requirements and business processes of the organisation.
Critical to the warehouse and the ability to provide timely and accurate reports is “good data”. If
the data is not good, you can’t report against it as it won’t join together to allow you to produce
meaningful reports. Not having “good data” typically forces you to split the Warehouse into
multiple databases - a landing database, a staging database where you correct and augment the
data and a production database or multiple production data marts which are used as the source
for management reporting.
CXAIR can sit on top of a Warehouse and provide the reporting layer or act as an alternative
reporting layer over the production databases and marts.
Alternatively CXAIR can extend the Data Warehouse by taking in data from other systems that are
not easily accessible to the Warehouse.
A fundamental difference between CXAIR and traditional data warehouse implementations is
that it does not need “good data”.
As CXAIR is powered by a search engine, it is inherently designed for both structured and
unstructured data. From a prototyping and data discovery perspective, using fuzzy matching and
natural language search allows you to navigate around “bad data” and identify errors and
omissions in your operational systems.
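As a rough illustration of how fuzzy matching can surface “bad data”, the sketch below uses Python's standard difflib module. This is a stand-in technique chosen for the example, not a statement of how CXAIR matches terms; the values and the 0.8 cutoff are assumptions.

```python
import difflib

# Hypothetical values as entered in an operational system, including typos.
indexed_values = ["Stafford", "Staford", "Stoke-on-Trent", "Stafordd", "Shrewsbury"]

def fuzzy_find(term, values, cutoff=0.8):
    """Return values whose similarity ratio to the term is at least `cutoff`."""
    return difflib.get_close_matches(term, values, n=len(values), cutoff=cutoff)

matches = fuzzy_find("Stafford", indexed_values)

# Anything that matched without being an exact hit is a candidate data-entry error.
suspect = [v for v in matches if v != "Stafford"]
```

A search for the correct spelling still finds the misspelt records, and the near-misses themselves become a list of values worth correcting at source.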
Another alternative application of CXAIR is for it to be the Data Warehouse.
Where it is different to a data warehouse is the way it stores the data in indexes which are joined
together to create a unified search engine that spans all of your critical business data.
Search engines are inherently designed to store and retrieve huge data volumes so holding
historic records which might otherwise need to be archived off is standard functionality. In
addition as CXAIR is both the reporting / analysis engine and the storage mechanism, there is no
need to have separate reporting and analysis tools.
Reporting without SQL
A key differentiator between CXAIR and traditional reporting tools is that under the hood it is not
using SQL, OLAP or in-memory analytics. Behind every action is a “search” against an index which
returns matches in sub-second time even over data volumes of millions of records.
Because of the raw speed of search engine technology, the approach to querying and report
writing can be turned on its head.
In normal reporting the primary skill is to know how to get at the underlying data. This would be
achieved either via the creation of complex SQL for a relational database, MDX for an OLAP cube
or proprietary scripting as part of the load and aggregation process for in-memory analytics.
For CXAIR the end user is able to get to the underlying data themselves by simply clicking and
selecting data values. What’s more, this can be done in real time and iteratively to follow the
user’s train of thought.
From there, transforming that data into a table or chart and iteratively refining the layout is
simple and fast: because the search engine returns data so quickly, the user can continually
review what the report looks like as they go along.
To highlight the benefits the speed of a search engine brings for analysis and report building,
CXAIR has an in-built Venn diagram function that allows users to create interactive Venn
diagrams over indexes and search results.
The Venn functionality allows you to identify patterns, relationships and clusters in your data. In
this example we are looking at Health data where we can see the total number of patients that
were admitted electively, the total number of patients admitted in Q3 and all of the patients
admitted that were referred by Dr R. Jones.
Combining the 3 sets shows which of those patients, referred by Dr R. Jones, have also been
admitted electively in Q3.
This functionality is possible because of the speed of search and would be impractical over
traditional database technology, due to the vast number of complex joins that would need to be
done to calculate the various Venn segments over potentially huge data sets.
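Once each search has returned a set of matching documents, the Venn segments reduce to set arithmetic. The sketch below uses Python sets with made-up patient ids to show the seven regions of a three-set Venn diagram; the ids and the venn_segments helper are illustrative only.

```python
# Hypothetical patient ids returned by three separate searches.
elective = {101, 102, 103, 104}
admitted_q3 = {102, 103, 105, 106}
referred_by_jones = {103, 104, 106, 107}

def venn_segments(a, b, c):
    """Return the seven regions of a three-set Venn diagram as sets."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b": (a & b) - c,
        "a_c": (a & c) - b,
        "b_c": (b & c) - a,
        "a_b_c": a & b & c,
    }

segments = venn_segments(elective, admitted_q3, referred_by_jones)

# Patients admitted electively, in Q3, and referred by Dr R. Jones.
all_three = segments["a_b_c"]
```

The central segment corresponds to the combined question in the example above, and every other segment comes from the same three result sets without issuing further queries.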
Summary
Search BI is a new way of analysing and reporting across today’s ever-increasing and diverse
information sources through natural language searching.
It was developed around the need to join up a network of systems and data and provide a way of
locating that information extremely quickly through the use of simple search terms.
Search BI and CXAIR are not the next generation of OLTP or an aggregated layer on top of OLTP to
provide improved query responses, but a new application of a search engine.
Whilst CXAIR end user functionality converges with traditional business intelligence tools to
cater for what has become a standard set of requirements for standard reporting, CXAIR is also
able to offer something new.
Search BI is an evolution of the Internet search engine, not OLTP. CXAIR is the only BI technology
available today that offers integrated storage and analysis over a search engine, capable of
coping with the diverse demands of modern day information requirements.

More Related Content

What's hot

Teradata Aster Discovery Platform
Teradata Aster Discovery PlatformTeradata Aster Discovery Platform
Teradata Aster Discovery PlatformScott Antony
 
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview PresentationFilling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview PresentationPentaho
 
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...Revolution Analytics
 
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...Amazon Web Services
 
Splunk Business Analytics
Splunk Business AnalyticsSplunk Business Analytics
Splunk Business AnalyticsCleverDATA
 
BigData-Architecture
BigData-ArchitectureBigData-Architecture
BigData-ArchitectureNarayana B
 
Data Warehousing AWS 12345
Data Warehousing AWS 12345Data Warehousing AWS 12345
Data Warehousing AWS 12345AkhilSinghal21
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousingOZ Assignment help
 
Introduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingIntroduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingKamal Acharya
 

What's hot (20)

Teradata Aster Discovery Platform
Teradata Aster Discovery PlatformTeradata Aster Discovery Platform
Teradata Aster Discovery Platform
 
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview PresentationFilling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
Filling the Data Lake - Strata + HadoopWorld San Jose 2016 Preview Presentation
 
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
 
Oracle sql plsql & dw
Oracle sql plsql & dwOracle sql plsql & dw
Oracle sql plsql & dw
 
The Big Metadata
The Big MetadataThe Big Metadata
The Big Metadata
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
AWS Summit Singapore - Accelerate Digital Transformation through AI-powered C...
 
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
 
Splunk Business Analytics
Splunk Business AnalyticsSplunk Business Analytics
Splunk Business Analytics
 
SAP BI/BW
SAP BI/BWSAP BI/BW
SAP BI/BW
 
BigData-Architecture
BigData-ArchitectureBigData-Architecture
BigData-Architecture
 
NoSQL Type, Bigdata, and Analytics
NoSQL Type, Bigdata, and AnalyticsNoSQL Type, Bigdata, and Analytics
NoSQL Type, Bigdata, and Analytics
 
Data Warehousing AWS 12345
Data Warehousing AWS 12345Data Warehousing AWS 12345
Data Warehousing AWS 12345
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousing
 
Enterprise architecture for big data projects
Enterprise architecture for big data projectsEnterprise architecture for big data projects
Enterprise architecture for big data projects
 
Msbi Architecture
Msbi ArchitectureMsbi Architecture
Msbi Architecture
 
SaaSRefArch
SaaSRefArchSaaSRefArch
SaaSRefArch
 
Introduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingIntroduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data Warehousing
 
Intelligent Data Extraction, Turning Content into Data, A Look at Advanced Ca...
Intelligent Data Extraction, Turning Content into Data, A Look at Advanced Ca...Intelligent Data Extraction, Turning Content into Data, A Look at Advanced Ca...
Intelligent Data Extraction, Turning Content into Data, A Look at Advanced Ca...
 
Big data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and HealthcareBig data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and Healthcare
 

Similar to Steering Away from Bolted-On Analytics

Redefining Data Analytics Through Search
Redefining Data Analytics Through SearchRedefining Data Analytics Through Search
Redefining Data Analytics Through SearchConnexica
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsFredReynolds2
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesVasu S
 
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSAWS User Group Kochi
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World DistilledRTTS
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
Modern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | QuboleModern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | QuboleVasu S
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesSpringPeople
 
Alteryx Desktop Designer Overview
Alteryx Desktop Designer OverviewAlteryx Desktop Designer Overview
Alteryx Desktop Designer OverviewTridant
 
CXAIR for Data Migration
CXAIR for Data MigrationCXAIR for Data Migration
CXAIR for Data MigrationConnexica
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to LifeEvolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to LifeSG Analytics
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data LakeMetroStar
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Business Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search EngineBusiness Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search Engineankur881120
 
Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...Sheena Crouch
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 

Similar to Steering Away from Bolted-On Analytics (20)

Redefining Data Analytics Through Search
Redefining Data Analytics Through SearchRedefining Data Analytics Through Search
Redefining Data Analytics Through Search
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data Lakes
 
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
 
BigData Analysis
BigData AnalysisBigData Analysis
BigData Analysis
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
Modern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | QuboleModern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | Qubole
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
 
Alteryx Desktop Designer Overview
Alteryx Desktop Designer OverviewAlteryx Desktop Designer Overview
Alteryx Desktop Designer Overview
 
CXAIR for Data Migration
CXAIR for Data MigrationCXAIR for Data Migration
CXAIR for Data Migration
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to LifeEvolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Business Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search EngineBusiness Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search Engine
 
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
 
Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 

Steering Away from Bolted-On Analytics

  • 1. Steering Away From Bolted-on Analytics. Whitepaper. Search Powered Data Discovery. info@connexica.com, www.connexica.com, +44(0)1785 246777
  • 2. Technology Evolution

    Introduction

    Since the dawn of the PC, the spreadsheet and the database, developers have been continually enhancing the same products to eke out more and more life from what are now legacy technologies. The cost of ground-up development is extremely high, so it makes sense to “sweat the asset” and try to prolong the shelf life of products.

    In the context of data storage software, OLTP databases such as DB2 and Oracle were great inventions. These technologies were designed specifically to allow data to be stored securely inside highly optimised, normalised databases. Thousands of systems were developed where a relational database was the heartbeat of the application and SQL the de facto way of getting data in and out of it.

    OLTP databases were not designed for fast data retrieval, so we came up with new ways of storing data, by designing warehouses and duplicating data in a de-normalised form. This meant fewer SQL joins and consequently faster data retrieval.

    When SQL over a Warehouse became too slow, new ways of pre-aggregating data were devised, resulting in the rise of OLAP and its multiple storage variants (HOLAP, MOLAP, ROLAP), chosen according to how you wish to trade performance against additional storage and hardware overheads.
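The progression described above (normalised tables, a de-normalised warehouse copy, then pre-aggregated totals) can be sketched in a few lines. This is only an illustration using Python's built-in `sqlite3` module; the table names, columns and figures are invented for the example and do not come from any real system.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- Warehouse-style copy: region is duplicated onto every order row.
    CREATE TABLE orders_flat AS
        SELECT o.id, c.region, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;

    -- OLAP-style pre-aggregation: totals are computed once, ahead of query time.
    CREATE TABLE sales_by_region AS
        SELECT region, SUM(amount) AS total FROM orders_flat GROUP BY region;
""")

# Normalised (OLTP-style) query: needs a join at query time.
normalised = conn.execute("""
    SELECT c.region, SUM(o.amount) FROM orders o
    JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

# De-normalised query: same answer, no join.
denormalised = conn.execute(
    "SELECT region, SUM(amount) FROM orders_flat GROUP BY region ORDER BY region"
).fetchall()

# Pre-aggregated query: same answer, no join and no aggregation at query time.
pre_aggregated = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
```

All three queries return the same totals; what changes is how much work is left for query time, and how much extra storage and refresh effort is paid for in advance.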
  • 3. When OLAP became too restrictive, as users needed access to data at a more granular, transaction level with reduced loading times, in-memory technologies gained popularity, taking advantage of 64-bit architectures and the ability to build hardware with more and more RAM.

    When in-memory technology struggles, or the specialised hardware is deemed too expensive to cope with the sheer volume of data being generated by the modern-day business, or where aggregated data is insufficient for the ad-hoc information needs of today’s business worker, we need to take stock and consider whether we have finally come to a point where we need to do things differently.

    So what’s next?

    All of these developments have been incremental add-ons that have enabled the core data repository, a relational database, to carry on doing what it’s best at: OLTP (Online Transaction Processing), aka “storing” data. All of the reporting technologies have been incremental add-ons to SQL, to get around the limitations of a technology that was never designed for fast data retrieval.

    The technology is there to plug a gap, yet businesses are gambling their futures and cash on these systems continuing to work and be fit for purpose, whilst the world of Big Data, Social Media and real-time analytics is becoming a way of life.

    90% of the world’s data was generated over the last 2 years. Source: www.sciencedaily.com
  • 4. Search-Based Business Analytics

    CXAIR is a combined Search Engine and Business Analytics tool that is not based on OLTP, OLAP or in-memory technology. CXAIR instead uses the same principles adopted by Internet Search Engines such as Google and Bing.

    Where traditional enterprise reporting tools report either directly off the source data or off a pre-aggregated cube or in-memory aggregations, CXAIR users report off a search engine created from an organisation’s source data, in a similar way to how a search engine crawls web sites for content. Whilst Google crawls web sites for text, documents, media and so on, CXAIR crawls databases and stores a copy of that data in super-fast, highly scalable indexes. These indexes, when grouped together, form a search engine that can be “Googled” with sub-second speeds using natural language search terms.

    Unlike internet search engines, CXAIR can update its indexes in near real time, allowing users to search and find data seconds after it has been entered into its originating system. Unlike OLTP and OLAP databases, CXAIR indexes can be searched without the need to write SQL or MDX.

    CXAIR is both a flexible data store and a powerful search and analytical tool, providing many of the best features associated with the most popular Business Intelligence tools and Warehouse solutions, but in a fundamentally different way.

    The CXAIR approach is focused very much around the business user and how they gain insight from their corporate data assets without reliance on IT. CXAIR uses search technology as the “engine” and “storage mechanism” for storing data from disparate data sources and, like Google, handles extremely large data volumes and returns sub-second response times to queries on commodity hardware.
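The general idea of crawling database rows into a searchable index can be illustrated with a toy inverted index. This is a minimal sketch of the textbook technique, not a description of CXAIR's proprietary index format; the rows and the `build_index` and `search` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, as if read from a source database table.
rows = [
    {"id": 1, "customer": "Acme Ltd", "status": "paid", "city": "Stafford"},
    {"id": 2, "customer": "Bolt plc", "status": "overdue", "city": "Stafford"},
    {"id": 3, "customer": "Acme Ltd", "status": "overdue", "city": "Leeds"},
]

def build_index(records):
    """Map every lower-cased token in every field to the ids of matching rows."""
    index = defaultdict(set)
    for record in records:
        for field, value in record.items():
            if field == "id":
                continue
            for token in str(value).lower().split():
                index[token].add(record["id"])
    return index

def search(index, query):
    """Return the ids of rows that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_index(rows)
matches = search(index, "acme overdue")
```

Because every token of every field is indexed, the person searching does not need to know which column a value lives in, and no SQL is written at query time.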
  • 5. Understanding Search BI

    Below is a summary of some of the key differences between Search BI and other technologies used to store and retrieve data.

    What’s different between a search engine and OLTP?

      • A search engine is built for fast data retrieval, not for transaction-based data entry.
      • The technology itself is relatively new, coming over 30 years after OLTP, and has evolved to handle huge data volumes and provide rapid retrieval times.
      • Search engines are extremely easy to configure.
      • Search engines are queried by typing in natural language search terms, not complex SQL.
      • Search engines are not designed for data entry and do not implement a comprehensive transaction management system.
      • OLTP is designed for secure data entry and does not provide an integrated reporting and analysis layer.
      • Search engines are designed specifically for the web.
  • 6. What’s different between a search engine and OLAP?

      • In a search engine, data is accessed through natural language searches, rather than MDX.
      • Search engine data is held at document / transaction level and is not pre-aggregated.
      • A search engine does not require data to be in an organised structure.
      • OLAP is designed as a fast aggregation engine that sits on top of one or more OLTP or Warehouse systems and does not provide an integrated reporting and analysis layer.
      • In a search engine all fields and values are available for searching, whereas OLAP requires you to decide exactly what information is to be made available to the user by structuring cubes, dimensions and measures.

    What’s different between a search engine and in-memory analytics?

      • A search engine does not pre-load data into memory.
      • A search engine is not restricted by memory limits but by disk storage and IO performance.
      • A search engine is easily distributed across multiple servers to spread load for large numbers of users and high volumes of data.
      • A search engine will run on commodity hardware.
      • Search engine data is held at document / transaction level and is not pre-aggregated.
      • In a search engine all fields and values are available for searching, whereas in-memory analytics requires you to decide exactly what information is to be made available to the user by structuring in-memory aggregations, hierarchies and measures.
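The contrast between pre-aggregated structures and document-level data can be made concrete with a small sketch. The transactions, the two-dimension "cube" and the `total_where` helper below are all hypothetical, chosen only to show why a field left out of the cube design cannot be queried later, whereas document-level records keep every field available.

```python
from collections import Counter

# Hypothetical transaction-level records.
transactions = [
    {"region": "North", "quarter": "Q1", "product": "Widget", "amount": 100},
    {"region": "North", "quarter": "Q2", "product": "Gadget", "amount": 40},
    {"region": "South", "quarter": "Q1", "product": "Widget", "amount": 60},
]

# Cube designed up front with two dimensions (region, quarter) and one measure.
# The product field is discarded at load time, so it cannot be queried later.
cube = Counter()
for t in transactions:
    cube[(t["region"], t["quarter"])] += t["amount"]

def total_where(records, **criteria):
    """Sum amounts over document-level records matching every field=value pair."""
    return sum(
        r["amount"] for r in records
        if all(r.get(field) == value for field, value in criteria.items())
    )

north_q1_from_cube = cube[("North", "Q1")]
widgets_total = total_where(transactions, product="Widget")
north_widgets = total_where(transactions, region="North", product="Widget")
```

The cube answers the questions it was designed for very quickly; the document-level records can answer a question about `product` that nobody anticipated when the cube was built.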
  • 7. Advantages of Search BI

    Here are some of the key advantages Search BI has over other technologies used to store and retrieve data.

      • Search BI is easier to use.
      • The technology is inherently fast.
      • The technology is relatively lightweight and is easy to implement.
      • Search engines are designed to handle very large data volumes.
      • End users do not require SQL skills.
      • Search engines do not need to differentiate between structured and unstructured data.
      • Search BI stores its data at document / transaction level and is not pre-aggregated.
      • Search engines do not need to pre-load data into memory.
      • The scalability of a search engine is not restricted by memory limits.
      • Search BI provides an integrated storage repository and query tool.
      • Search engines run on commodity hardware.
  • 8. Understanding CXAIR

    CXAIR is Search BI that quickly and inexpensively presents actionable information to all of your business users without the need for IT. The product uses search technology to provide a simple, easy-to-understand interface for querying and reporting on diverse information collated from multiple, disparate data sources.

    Combined with a natural language search capability, CXAIR provides a highly visual front-end that allows business users to create and view high quality charts, dashboards and Infographic-style outputs. Unlike traditional reporting tools, CXAIR can query across millions of transactions at the speed of Google, providing near real-time responses to information requests.

    CXAIR is able to pull structured and unstructured content from operational data stores, applications, spreadsheets, document directories and all manner of different media streams, including Twitter and RSS web feeds, and present that data back in a consolidated format for consumption by all levels of the business.

    200+ implementations of CXAIR across a variety of industries. Source: Connexica
  • 9. How does it work?

    To best understand how CXAIR works, the first things to understand are the various components that make up the technology and how you would go about building a search engine, then analysing the contents through the CXAIR analysis and reporting engine.

    CXAIR consists of:

      • A data gathering engine that continually mines information from multiple data sources and stores a copy of that data as encoded index files.
      • A high performance search engine that allows data contained in the index files to be queried and analysed using natural language search terms.
      • A visualisation engine for transforming search results into graphics.
      • An analysis and reporting engine for transforming search results and visualisations into reports and dashboards.
      • A configuration manager that maintains the metadata and configuration details relating to the CXAIR installation.
      • A web user interface for end users who wish to search or run pre-created reports and analyses.
      • A web interface for full CXAIR users who require access to the entire front-end search, analysis and report development capabilities.
      • A web interface for administrators to configure and administer the CXAIR instance for access by authorised users and 3rd party applications.
  • 10. Getting Data into CXAIR

    In the context of CXAIR, a search engine is a series of indexes which have been logically grouped together to form a single searchable source of information. Indexes are stored as a series of segments, which simply appear as a group of files held within a sub-directory on a disk. These files are stored in binary form and are accessed whenever a user queries a search engine that contains that index, via CXAIR or a 3rd party application using a CXAIR API call.

    Index segments contain a series of documents that contain searchable text, dates, numbers and images that have been extracted and analysed by the index build process and converted into a proprietary format designed for fast data access.

    To create an index, a series of wizards lets you select the type of index you wish to create (complete refresh, incremental, continuous update, snapshot or archive), the data source you wish to index (e.g. database, file system, web URL, spreadsheet, email) and the filters to apply to restrict what data is returned to the indexing process.
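The whitepaper names the index types but does not describe how they are built. As one plausible reading, the difference between a complete refresh and an incremental build can be sketched as below. The `modified` column, the watermark approach and both function names are assumptions made for illustration, not CXAIR behaviour.

```python
def complete_refresh(source_rows):
    """Rebuild the index from scratch from every source row."""
    return {row["id"]: row for row in source_rows}

def incremental_update(index, source_rows, last_seen):
    """Apply only rows modified after `last_seen`; return the new watermark."""
    newest = last_seen
    for row in source_rows:
        if row["modified"] > last_seen:
            index[row["id"]] = row
            newest = max(newest, row["modified"])
    return newest

# Hypothetical source table with a numeric "modified" stamp on each row.
source = [
    {"id": 1, "modified": 1, "status": "open"},
    {"id": 2, "modified": 2, "status": "open"},
]
index = complete_refresh(source)
watermark = 2

# The source system changes: row 1 is updated and row 3 is added.
source[0] = {"id": 1, "modified": 3, "status": "closed"}
source.append({"id": 3, "modified": 4, "status": "open"})

# Only the two changed rows are applied to the existing index.
watermark = incremental_update(index, source, watermark)
```

A complete refresh is simpler but re-reads everything; an incremental build touches only what changed, which is what makes frequent updates affordable on large sources.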
  • 11. The data gathering engine then crawls the source system and transforms the data into a searchable index. This process can be repeated any number of times to create any number of searchable indexes.

    Administrators can create multiple search engines that either share common indexes or have their own indexes, and restrict access to those indexes and search engines to specific groups of users. Once an index has been built and added to a search engine, a user (subject to access permissions) can search the index without any further configuration.

    Users can perform free text searches, filter data by clicking on search results to narrow down and refine the search, or use range filters such as date pickers, sliders, check boxes and numeric range controls to perform more sophisticated searching. Once the data has been filtered to the records (documents) you are interested in, you can then transform the output into a table, chart, Venn diagram or dashboard without any coding or SQL.

    Diagram showing how different data sources can be combined to create multiple search engines.
  • 12. So why is this different?

    Warehouse becomes an option, not a necessity

    Warehouses are typically built to provide a unified view of the business within a single database. Often data in critical OLTP-based systems will get archived off into a warehouse due to the need for the OLTP database to function as quickly as possible. In contrast, Warehouses are often used to store historical data for periodic reporting and trend analysis.

    Designing a Warehouse requires a combination of SQL expertise and business knowledge. Designing the layout, structures, dimensions and measures for calculating totals and metrics requires both technical skills and knowledge of the systems and their data, as well as the reporting requirements and business processes of the organisation.

    Critical to the warehouse and the ability to provide timely and accurate reports is “good data”. If the data is not good, you can’t report against it, as it won’t join together to allow you to produce meaningful reports. Not having “good data” typically forces you to split the Warehouse into multiple databases: a landing database, a staging database where you correct and augment the data, and a production database or multiple production data marts which are used as the source for management reporting.

    CXAIR can sit on top of a Warehouse and provide the reporting layer, or act as an alternative reporting layer over the production databases and marts.
  • 13. Alternatively, CXAIR can extend the Data Warehouse by taking in data from other systems that are not easily accessible to the Warehouse.

    A fundamental difference between CXAIR and traditional data warehouse implementations is that it does not need “good data”. As CXAIR is powered by a search engine, it is inherently designed for both structured and unstructured data. From a prototyping and data discovery perspective, using fuzzy matching and natural language search allows you to navigate around “bad data” and identify errors and omissions in your operational systems.

    Another alternative application of CXAIR is for it to be the Data Warehouse. Where it differs from a data warehouse is in the way it stores the data: in indexes which are joined together to create a unified search engine that spans all of your critical business data. Search engines are inherently designed to store and retrieve huge data volumes, so holding historic records which might otherwise need to be archived off is standard functionality. In addition, as CXAIR is both the reporting / analysis engine and the storage mechanism, there is no need to have separate reporting and analysis tools.

    Reporting without SQL

    A key differentiator between CXAIR and traditional reporting tools is that under the hood it is not using SQL, OLAP or in-memory analytics. Behind every action is a “search” against an index which returns matches in sub-second time, even over data volumes of millions of records.

    Because of the raw speed of search engine technology, the approach to querying and report writing can be turned on its head. In normal reporting the primary skill is knowing how to get at the underlying data. This would be achieved either via the creation of complex SQL for a relational database, MDX for an OLAP cube, or proprietary scripting as part of the load and aggregation process for in-memory analytics.

    For CXAIR, the end user is able to get to the underlying data themselves by simply clicking and selecting data values. What’s more, this can be done in real time and iteratively to follow the user’s train of thought.
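The idea of using fuzzy matching to navigate around “bad data” can be illustrated with Python's standard `difflib` module. This is a generic sketch of approximate string matching, not CXAIR's matching algorithm; the sample values, the `fuzzy_find` helper and the 0.8 cutoff are assumptions chosen for the example.

```python
import difflib

# Hypothetical values for one field, including a typo and inconsistent casing.
referrers = ["Dr R. Jones", "Dr R Jones", "dr r. jnoes", "Dr A. Smith"]

def fuzzy_find(term, values, cutoff=0.8):
    """Return the values whose lower-cased form is close to the search term."""
    term = term.lower()
    return [
        v for v in values
        if difflib.SequenceMatcher(None, term, v.lower()).ratio() >= cutoff
    ]

found = fuzzy_find("Dr R. Jones", referrers)
```

An exact-match join would treat the misspelt and differently punctuated entries as different people; a fuzzy search surfaces them together, which both answers the question and exposes the data quality issue in the source system.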
  • 14. From there, the process of transforming that data into a table or chart and iteratively refining the layout is simple and fast, because the sheer speed at which the search engine returns data lets the user continually review what the report looks like as they go along.

    To highlight the benefits the speed of a search engine brings for analysis and report building, CXAIR has an in-built Venn diagram function that allows users to create interactive Venn diagrams over indexes and search results. The Venn functionality allows you to identify patterns, relationships and clusters in your data.

    In this example we are looking at Health data, where we can see the total number of patients that were admitted electively, the total number of patients admitted in Q3 and all of the patients admitted that were referred by Dr R. Jones. Combining the 3 sets shows which of those patients, referred by Dr R. Jones, have also been admitted electively in Q3.

    This functionality is possible because of the speed of search and would not be possible over traditional database technology, because of the vast number of complex joins that would need to be done to calculate the various Venn segments over potentially huge data sets.
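The segments of a three-set Venn diagram like the one in the health example can be expressed as plain set operations over the ids returned by each search. The sketch below shows only that underlying arithmetic, not CXAIR's implementation; the patient ids and the `venn3` helper are hypothetical.

```python
# Hypothetical patient ids returned by three separate searches.
elective = {101, 102, 103, 104}
admitted_q3 = {102, 103, 105}
referred_by_jones = {103, 104, 105, 106}

def venn3(a, b, c):
    """Return the seven regions of a three-set Venn diagram."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b": (a & b) - c,
        "a_c": (a & c) - b,
        "b_c": (b & c) - a,
        "a_b_c": a & b & c,
    }

regions = venn3(elective, admitted_q3, referred_by_jones)

# Patients admitted electively, in Q3, and referred by Dr R. Jones.
all_three = regions["a_b_c"]
```

Each region is a combination of intersections and differences, so once each search returns its matching ids quickly, all seven segments can be recomputed every time the user changes a filter.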
  • 15. Summary

    Search BI is a new way of analysing and reporting across today’s ever-increasing and diverse information sources through natural language searching. It was developed around the need to join up a network of systems and data and provide a way of locating that information extremely quickly through the use of simple search terms.

    Search BI and CXAIR are not the next generation of OLTP, or an aggregated layer on top of OLTP to provide improved query responses, but a new application of a search engine. Whilst CXAIR end user functionality converges with traditional business intelligence tools to cater for what has become a standard set of requirements for standard reporting, CXAIR is also able to offer something new.

    Search BI is an evolution of the Internet Search engine, not OLTP. CXAIR is the only BI technology available today that offers integrated storage and analysis over a search engine, capable of coping with the diverse demands of modern-day information requirements.