The explosive growth of machine-generated data creates challenges not only in storage and management but also in turning that data into actionable information.
Our April 16, 2014 webinar, "Introduction to Infobright," explores how Infobright accelerates ad hoc query performance and reduces costs.
3. The Rise of Machine Data
50 billion connected devices
$7.4B mobile advertising billings
190 exabytes of data from:
– Web logs
– Sensor data
– Call data records
– Transaction records
Confidential – Do Not Distribute
4. More than just “Big” Data
•Transactional
•Analytics
•Dynamic
•Static
•Pre-planned
•Ad-hoc
•Rough
•Approximate
•Structured
•Unstructured
•Semi-structured
Dimensions shown: Data, Query, Function, Data Refresh
5. “Real time” Analytics: The New Imperative
Identify security threats & fraud
Troubleshoot networks
Optimize online/mobile ads
Plan capacity scale-out
Competitive positioning
7. Who is Infobright
Global provider of database analytics platforms to over 450 direct and OEM customers in the telecom, digital media and marketing, financial services, solution provider, energy and healthcare markets
8. Key Benefits of the Knowledge Grid Architecture
9. Column vs. Row: What is the best use case?
Row Oriented
– All the columns are needed
– Transactional processing is required
Column Oriented
– Only relevant columns are needed
– Reports are aggregates (sum, count, average, etc.)
10. Column vs. Row: How it Works
Scenario: 50 days’ worth of data at 1 million rows per day (30 columns × 50M rows)
– Disk I/O is the primary limiting factor
– A row-oriented design forces the database to retrieve all column data
– As table size increases, so do the indexes
– Load speed degrades because indexes need to be recreated as data is added; this causes huge sorts (another very slow operation)
11. Column vs. Row: How it works
Query:
– Select Column 11, with a WHERE condition on Column 17, for the 3rd week (day 15 – day 21)
12. Column vs. Row: How it Works
Row-based results
– Eliminate 43 days
– 7 million rows retrieved
– 210 million data elements retrieved
13. Column vs. Row: How it Works
Column-based results
– Eliminate 43 days
– Eliminate 28 of the 30 columns
– 14 million data elements
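The element counts on the row-based and column-based slides follow directly from the scenario figures. As a quick check, the arithmetic can be sketched in Python; the function and variable names here are illustrative only, and the assumption that exactly two columns (Column 11 and Column 17) are read comes from the "eliminate 28 of the 30 columns" bullet.

```python
# Figures taken from the slides: 50 days of data, 1 million rows per day, 30 columns.
ROWS_PER_DAY = 1_000_000
TOTAL_DAYS = 50
TOTAL_COLUMNS = 30


def elements_read(days_scanned: int, columns_read: int) -> int:
    """Number of individual data elements touched by a scan."""
    return days_scanned * ROWS_PER_DAY * columns_read


days_in_query = 21 - 15 + 1                # day 15 through day 21, inclusive
days_eliminated = TOTAL_DAYS - days_in_query

# A row store must bring every column along with each qualifying row.
row_store = elements_read(days_in_query, TOTAL_COLUMNS)
# A column store reads only the selected column and the filter column.
column_store = elements_read(days_in_query, 2)

print(f"days eliminated: {days_eliminated}")
print(f"row-oriented:    {row_store:,} elements")
print(f"column-oriented: {column_store:,} elements")
print(f"reduction:       {row_store // column_store}x")
```

Under these assumptions the column-oriented scan touches one fifteenth of the data elements, before any compression or pack elimination is considered.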
14. Data Loading Process: Data Packs
[Diagram: bulk-load input data is divided, column by column, into Data Packs of 64K each (A1, A2, A3, … A-n; B1, B2, B3, … B-n; C1, C2, C3, … C-n)]
15. Data Loading Process: Compression & Knowledge Grid
[Diagram: the 64K Data Packs are compressed and written to on-disk storage; the Knowledge Grid is kept in memory]
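The loading flow on these two slides can be illustrated with a small Python sketch. This is not Infobright's implementation: the 65,536-value pack size is an assumed reading of "64K", `zlib` stands in for whatever compression the product uses, and `PackSummary`, `load_column` and `read_pack` are hypothetical names. It only shows the general idea of splitting a column into packs, compressing each pack, and keeping small per-pack statistics in memory.

```python
import zlib
from array import array
from dataclasses import dataclass
from typing import List, Sequence, Tuple

PACK_ROWS = 65_536  # assumption: "64K" on the slides means 65,536 values per pack


@dataclass
class PackSummary:
    """Hypothetical in-memory statistics describing one compressed pack."""
    minimum: int
    maximum: int
    count: int


def load_column(values: Sequence[int],
                pack_rows: int = PACK_ROWS) -> Tuple[List[bytes], List[PackSummary]]:
    """Split one integer column into packs, compress each, and summarize it."""
    packs: List[bytes] = []
    summaries: List[PackSummary] = []
    for start in range(0, len(values), pack_rows):
        chunk = values[start:start + pack_rows]
        raw = array("q", chunk).tobytes()      # 8-byte signed integers
        packs.append(zlib.compress(raw))       # stand-in for "on-disk storage"
        summaries.append(PackSummary(min(chunk), max(chunk), len(chunk)))
    return packs, summaries


def read_pack(pack: bytes) -> List[int]:
    """Decompress a single pack back into its values."""
    restored = array("q")
    restored.frombytes(zlib.decompress(pack))
    return restored.tolist()


column = list(range(200_000))
packs, summaries = load_column(column)
print(len(packs), "packs")
print(summaries[0], summaries[-1])
```

The point of the sketch is the separation of concerns: the compressed packs are only touched when a query cannot be answered from the summaries alone.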
16. What Your Data Looks Like Now
Original data: 10 TB
Compressed data: 500 GB
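For reference, the two figures on this slide correspond to a 20:1 ratio. A minimal check in Python, assuming decimal units (1 TB = 1,000 GB), which the slide does not specify:

```python
ORIGINAL_TB = 10
COMPRESSED_GB = 500
GB_PER_TB = 1000  # assumption: decimal units

original_gb = ORIGINAL_TB * GB_PER_TB
ratio = original_gb / COMPRESSED_GB
saved = 1 - COMPRESSED_GB / original_gb

print(f"compression ratio: {ratio:.0f}:1")
print(f"space saved:       {saved:.0%}")
```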
17. The Knowledge Grid: How it works
Knowledge Nodes:
– Answer the query directly, or
– Identify only required Data Packs, minimizing decompression, and
– Predict required data in advance based on workload
All driven by a granular computing engine
18. Queries with the Knowledge Grid: How it Works
Query: How are my sales doing this year?
– Granular engine iterates on Knowledge Grid
– Each pass eliminates Data Packs
– If any Data Packs are needed to resolve query, only those are decompressed
[Diagram: Knowledge Grid, Compressed Data]
19. Queries with the Knowledge Grid: How it Works
SELECT count(*)
FROM employees
WHERE salary > 100000
AND age < 35
AND job = 'DBA'
AND state = 'TX'
[Diagram: Data Packs for the salary, age, job and state columns, each marked No Match, Suspect or All Match]
20. Queries with the Knowledge Grid: How it Works
SELECT count(*)
FROM employees
WHERE salary > 100000
AND age < 35
AND job = 'DBA'
AND state = 'TX'
[Diagram: Data Packs for the salary, age, job and state columns, each marked No Match, Suspect or All Match. Three sets of packs are labeled “All packs ignored”; one is labeled “Only this pack will be decompressed”]
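The No Match / Suspect / All Match classification can be sketched in Python for the query above. This is an illustration under stated assumptions rather than Infobright's engine: it supposes that each numeric pack carries a (min, max) pair and each text pack carries its set of distinct values, and every name (`Verdict`, `PackRow`, `classify`) and every sample value is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Tuple


class Verdict(Enum):
    NO_MATCH = "No Match"    # no row in the pack can satisfy the predicate
    SUSPECT = "Suspect"      # some rows might; the pack must be inspected
    ALL_MATCH = "All Match"  # every row in the pack satisfies the predicate


@dataclass(frozen=True)
class PackRow:
    """Hypothetical metadata for one horizontal slice of the employees table."""
    salary: Tuple[int, int]  # (min, max) of the salary pack
    age: Tuple[int, int]     # (min, max) of the age pack
    job: FrozenSet[str]      # distinct values in the job pack
    state: FrozenSet[str]    # distinct values in the state pack


def greater_than(bounds: Tuple[int, int], threshold: int) -> Verdict:
    lo, hi = bounds
    if hi <= threshold:
        return Verdict.NO_MATCH
    return Verdict.ALL_MATCH if lo > threshold else Verdict.SUSPECT


def less_than(bounds: Tuple[int, int], threshold: int) -> Verdict:
    lo, hi = bounds
    if lo >= threshold:
        return Verdict.NO_MATCH
    return Verdict.ALL_MATCH if hi < threshold else Verdict.SUSPECT


def equals(distinct: FrozenSet[str], value: str) -> Verdict:
    if value not in distinct:
        return Verdict.NO_MATCH
    return Verdict.ALL_MATCH if distinct == {value} else Verdict.SUSPECT


def classify(row: PackRow) -> Verdict:
    """Combine the four predicates of the query with AND semantics."""
    verdicts = [
        greater_than(row.salary, 100000),
        less_than(row.age, 35),
        equals(row.job, "DBA"),
        equals(row.state, "TX"),
    ]
    if Verdict.NO_MATCH in verdicts:
        return Verdict.NO_MATCH      # one failing column rules out the whole slice
    if all(v is Verdict.ALL_MATCH for v in verdicts):
        return Verdict.ALL_MATCH     # countable from metadata alone
    return Verdict.SUSPECT


pack_rows = [
    PackRow((30000, 90000), (20, 64), frozenset({"DBA", "Dev"}), frozenset({"TX"})),
    PackRow((80000, 150000), (40, 60), frozenset({"DBA"}), frozenset({"TX", "CA"})),
    PackRow((110000, 200000), (22, 50), frozenset({"DBA", "Dev"}), frozenset({"CA", "NY"})),
    PackRow((95000, 180000), (25, 34), frozenset({"DBA"}), frozenset({"TX", "OK"})),
]
results = [classify(row) for row in pack_rows]
to_decompress = [i for i, v in enumerate(results) if v is Verdict.SUSPECT]

for i, verdict in enumerate(results):
    print(i, verdict.value)
print("decompress:", to_decompress)
```

With this sample metadata, three slices are ruled out without touching compressed data and only one has to be decompressed, mirroring the outcome shown on the slide. A slice classified as All Match would also need no decompression for a `count(*)`, since its row count could come from the metadata.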
21. Working with Infobright & Hadoop
General-purpose database solutions require:
– Significant administration, ongoing tuning and indexing
– More hardware
– Less flexibility for macroscopic investigative analytics
– Higher total cost of ownership
[Diagram: Hadoop Connector, Infobright Enterprise Edition, BI Tools]
22. Customer Example: JDSU
Requirements
– Low Admin: Do not want to force users to rely on DBAs to keep the solution running
– Load Speeds: Ingestion rates continue to increase, placing a heavy burden on solutions
– High Compression: Want to keep longer histories in less space
– Lower TCO: Resulting in better value for customers, better margins for providers
Results
– Stripped away the “DBA tax” required by previous versions
– Ingesting over 1 TB/hour, with significant headroom beyond that
– Over 3X the retention period and a simultaneous 5X reduction in storage requirement
– Lower TCO for users, higher margins for JDSU
Highlights: Little to No Admin, Fast Load Speeds, 20:1+ Compression, Exceptional Ad Hoc Query Performance, Very Low TCO
23. Customer Example: LiveRail
Requirements
– Low Admin: Reduce the requirements for labor-intensive reporting
– Ad Hoc Query Capabilities: Ability to mine data for investigative analytics
– High Compression: Want to keep longer histories in less space
– Lower TCO: Robust analytics platform without excessive outlay of capital or people
Results
– Eliminated the need for staff to run customized reports using Hive
– Developed a portal where customers can run their own ad hoc reporting
– Minimal resources required to house the Infobright repository for reporting
– Better results for customers, lower costs and higher margins for LiveRail
24. Customer Example: JC Decaux
Requirements
– Low Admin: Reduce the requirements for labor-intensive reporting
– Ad Hoc Query Capabilities: Consolidate and issue timely reports from disparate data sources
– High Compression: Existing Oracle-based system couldn’t handle the volume of data
– Lower TCO: Minimize admin required for managing Oracle and work with Hadoop
Results
– Ability to create essential reports in less than three minutes
– Fast queries: Queries originally taking 15+ minutes using MySQL reduced to seconds
– Fast uploads: Data loads that used to take two hours now happen in 20 minutes
– Fast deployment: System implemented in three months
25. Getting Started with Infobright
Download our trial
Follow us on Twitter
Follow us on LinkedIn
Join our community