Kognitio - an overview
1. The Proven Analytical
Platform for Big Data
September 2013
Michael Hiskey
Vice President
Marketing & Business Development
2. Kognitio is an
in-memory analytical platform
Built from the ground-up to satisfy large and
complex analytics on big data sets
A massively parallel, in-memory analytical
engine that interoperates with your existing
infrastructure
5. Analytical Platform Reference Architecture
[Diagram: layers and their components]
Application & Client Layer: All BI Tools, All OLAP Clients, Excel, Reporting
Analytical Platform Layer: Kognitio
Near-line Storage (optional): Kognitio Storage
Persistence Layer: Hadoop Clusters, Enterprise Data Warehouses, Legacy Systems, Cloud Storage
6. Analytical Platform: Addressable Segments
Acceleration for
Traditional BI
Data Science /
Advanced Analytics
SQL on Hadoop... and everything else
• Improve performance of
existing BI stack 10‐100x
without re‐engineering
• Cost‐saving alternative to
expanding large‐scale
EDWs
• Enable tighter data security
and BI Tool governance
• Plug‐and‐Play with Hadoop
• Analytical “Sandbox” for
rapid Big Data projects
• MPP in‐memory code
execution of standard
languages (R, SAS, Python,
Perl) in line with SQL
• Ability to simply embed Big
Data analytics into existing
BI/Dashboard Tools without
disruption
• Ability to rapidly move
discovery into production
• Tight Hadoop Integration
• In‐memory over disk
• Seamless integration
SQL, ODBC, JDBC, MDX,
ODBO, XML/A etc.
• Fast MPP data transfer
• High‐throughput, high‐
concurrency, low‐latency
interactive analytics
• Core RDBMS architecture
simplifies integration and
brings ACID, DW qualities
• Data Virtualization ‐
Platform for LDW
• Central shared controlled
data models
7. Kognitio Hadoop Integration
• More than just a connector – tight integration*
– Hadoop does what it is good at – storing and filtering data
– Kognitio does what it is good at – complex analytics
*Developed in co-operation with Sears (Metascale)

Request as seen by the Hadoop Cluster (Transaction Data):
Give me prod, store, cust, cost
from "hdfs files"
where date > 1/1/12

View image defined in Kognitio over the transaction data:
create view image shopdata as
  select prod, store, cust, cost
  from "transactions"
  where date > '2012-01-01'

Query run against the view image:
select
  store,
  product_category,
  sum(cost) total_spend,
  customer_category customer_type,
  count(distinct cust) customers
from
  shopdata sd,
  product_info p,
  customers c
where
  sd.prod = p.prod_code
  and c.cust_id = sd.cust
group by
  store,
  product_category,
  customer_type
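
The view and aggregation query on this slide can be tried on toy data. The sketch below is not Kognitio: it uses Python's built-in sqlite3 module, replaces the in-memory "view image" with an ordinary view, and invents a few sample rows purely for illustration. Table and column names follow the slide; everything else is an assumption.

```python
import sqlite3

def run_demo():
    """Build toy tables, define the shopdata view, and run the slide's aggregation."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()
    cur.executescript("""
        create table transactions (prod text, store text, cust integer, cost real, date text);
        create table product_info (prod_code text, product_category text);
        create table customers (cust_id integer, customer_category text);

        -- Hypothetical sample rows; the last transaction predates the cut-off.
        insert into transactions values
            ('p1', 'north', 1, 10.0, '2012-03-01'),
            ('p1', 'north', 2, 12.0, '2012-04-15'),
            ('p2', 'north', 1,  5.0, '2012-05-20'),
            ('p1', 'south', 3,  7.0, '2011-12-31');
        insert into product_info values ('p1', 'food'), ('p2', 'toys');
        insert into customers values (1, 'gold'), (2, 'gold'), (3, 'silver');

        -- Plain view standing in for the Kognitio view image.
        create view shopdata as
            select prod, store, cust, cost
            from transactions
            where date > '2012-01-01';
    """)
    rows = cur.execute("""
        select sd.store,
               p.product_category,
               sum(sd.cost) as total_spend,
               c.customer_category as customer_type,
               count(distinct sd.cust) as customers
        from shopdata sd
        join product_info p on sd.prod = p.prod_code
        join customers c on c.cust_id = sd.cust
        group by sd.store, p.product_category, c.customer_category
        order by sd.store, p.product_category, c.customer_category
    """).fetchall()
    con.close()
    return rows
```

Calling run_demo() returns one row per store, product category and customer type; the 2011 transaction is excluded by the view's date filter before any join happens.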
8. Kognitio Hadoop Connectors
HDFS Connector – fast load of complete files
• Connector defines access to HDFS file system
• External table accesses row-based data in HDFS
• Dynamic access or “pin” data into memory
• HDFS file(s) loaded into memory
• Data filtering relies on data being partitioned into different directories/files within Hadoop
MapReduce Connector – filter from large files
• Connector uploads Kognitio agent to Hadoop nodes
• Query passes selections and relevant predicates to agent
• Data filtering and projection take place locally on each Hadoop node
• Data filtered as it is read from file(s)
• Only data of interest is transferred and loaded into memory via parallel load streams
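
The MapReduce connector's idea of filtering and projecting on each node before transfer can be sketched in a few lines. This is only a simulation under stated assumptions: each "node" is a Python list of dict rows, threads stand in for parallel load streams, and the names agent_scan and load_filtered are hypothetical, not part of any Kognitio or Hadoop API.

```python
from concurrent.futures import ThreadPoolExecutor

def agent_scan(node_rows, predicate, columns):
    """Simulated agent: filter and project rows locally on one node."""
    return [tuple(row[c] for c in columns) for row in node_rows if predicate(row)]

def load_filtered(nodes, predicate, columns):
    """Run one scan per node in parallel and gather only the rows of interest."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # Each stream carries already-filtered, already-projected rows.
        streams = pool.map(lambda rows: agent_scan(rows, predicate, columns), nodes)
        return [row for stream in streams for row in stream]
```

With two simulated nodes and the predicate `lambda r: r["date"] > "2012-01-01"`, only matching rows, reduced to the requested columns, reach the caller. By contrast, the HDFS connector described above loads complete files and relies on directory or file partitioning for filtering.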
9. MPP in-memory code execution
NoSQL external scripting function:
• SQL provides standard data access framework
– Open, adaptable framework; pass data to/from any
executable or interpreter
– Fully flexible MPP execution of R, Python, Java, text
parsing libraries etc.
create interpreter perlinterp
  command '/usr/bin/perl' sends 'csv' receives 'csv';
select top 1000 words, count(*)
from (external script using environment perlinterp
      receives (txt varchar(32000))
      sends (words varchar(100))
      script S'endofperl(
        while(<>)
        {
          chomp();
          s/[,.!_]//g;
          foreach $c (split(/ /))
          { if($c =~ /^[a-zA-Z]+$/) { print "$c\n" } }
        }
      )endofperl'
      from (select comments from customer_enquiry)) dt
group by 1
order by 2 desc;
Example:
This reads the long comments text from the customer_enquiry table. Inline Perl converts each comment into an output stream of words (one word per row), and the query selects the top 1000 words by frequency using standard SQL aggregation.
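
For readers who want to check what the word-splitting step produces, here is a rough Python equivalent of the Perl script plus the SQL aggregation. It is a sketch for local experimentation, not Kognitio code; the function names are hypothetical, and the alphabetical tie-break is an assumption, since the SQL above only orders by count.

```python
import re
from collections import Counter

PUNCTUATION = re.compile(r"[,.!_]")   # same characters the Perl s/[,.!_]//g removes
WORD = re.compile(r"[a-zA-Z]+")       # same test as /^[a-zA-Z]+$/

def words_from_lines(lines):
    """Yield one purely alphabetic word per output row, as the Perl loop does."""
    for line in lines:
        cleaned = PUNCTUATION.sub("", line.rstrip("\n"))
        for token in cleaned.split(" "):
            if WORD.fullmatch(token):
                yield token

def top_words(lines, n=1000):
    """Count words and return the n most frequent as (word, count) pairs."""
    counts = Counter(words_from_lines(lines))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

Tokens containing digits or other symbols are dropped, which mirrors the regular-expression test in the Perl version.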
10. Using R code for ad-hoc external script
create script environment rsint command '/usr/bin/Rscript --vanilla --slave';
grant execute on script environment rsint to power-user;
select *
from (external script using environment rsint
receives ( PRICE SMALLINT )
sends ( PRICE INTEGER )
script S'endofr(
options(error = expression(q("no")))
mydata<-read.csv(file=file("stdin"), header=FALSE)
sink(, type="message")
mydata$V1<-mydata$V1-100
write.table(mydata, row.names = FALSE, col.names = FALSE, sep = "," )
)endofr'
from (select price from ITEM_SALE)) dt ;
MPP Execution of R
• Rows are read into the data frame mydata
• Data frame vectors (columns) are automatically named V1, V2, etc.
• Run a calculation, in this case simply subtracting 100
• Data frame rows are returned to Kognitio
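
The same receive, transform, send pattern can be illustrated outside R. The sketch below assumes, as the script above suggests, that rows arrive as header-less CSV on an input stream and leave as CSV on an output stream; the function name subtract_100 and the offset parameter are hypothetical.

```python
import csv
import io

def subtract_100(stream_in, stream_out, offset=100):
    """Read header-less CSV rows, subtract offset from the first column, write CSV."""
    reader = csv.reader(stream_in)
    writer = csv.writer(stream_out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow([int(row[0]) - offset] + row[1:])

def demo():
    """Run the transform on two sample prices and return the CSV text produced."""
    out = io.StringIO()
    subtract_100(io.StringIO("150\n99\n"), out)
    return out.getvalue()
```

In an external-script setting the two streams would be standard input and standard output, for example `subtract_100(sys.stdin, sys.stdout)` after importing sys.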
11. Kognitio Cloud
PRIVATE CLOUD
• Could be referred to as an “exclusive” hybrid cloud offering
• Heritage from “DaaS” managed services
• Kognitio ‘hosted appliance’: Kognitio & Partner operated; exclusive, ‘bare metal’; monthly pricing; min. 1 year term; min. 256 GB RAM; notice required; multi-node; optimum configuration; limited customisation
PUBLIC CLOUD
• Ready-to-use in-memory analytical platform leveraging Amazon Web Services (AWS) Elastic Compute Cloud (EC2) infrastructure
• Hourly usage per CPU/server and TB of data (min. 7.5 GB RAM)
• Automatic provisioning in minutes with pre-installed servers
• Elastic scalability (up and down) to meet compute demand
• AWS: on-demand ‘hosted appliance’; multi-node; limited customisation
• Marketplace: on-demand ‘hosted server’; single node; not customisable; anonymous
[Diagram labels: Single Node, Scale-out, Console / Services, Multi-node, CloudFormation]
12. Cloud provides an ideal deployment scenario
Cloud model can provide a way to quickly
model, experiment, develop and build
• Deploy to existing reporting tools
• Pass ownership to IT
• Cloud instances can be “temporary”
• Repeatable framework
Users: Business Analyst, Business User, IT Admin, Data Scientist
PRESS HERE… and cool Big Data stuff happens!
13. Innovative client solutions
Orbitz leverages Kognitio Cloud to take large volumes of complex data, ingested in real time from web channels, demographic and psychographic data, customer segmentation and modeling scores, and turn it into actionable intelligence, allowing it to think of new ways of offering the right products and services to its current and prospective client base.
PlaceIQ provides actionable hyper-local Mobile BI location intelligence. It leverages Kognitio to extract intelligence from large amounts of place, social and mobile location-based data to create hyper-local, targetable audience profiles, giving advertisers the power to connect with consumers at the right place, at the right time, with the right message.
TiVo Research & Analytics uses 40 TB of RAM to perform complex media analytics, cross-correlating data from over 22 sources with set-top box data to allow advertisers, networks and agencies to analyze the ROI of creative campaigns while they are still in flight, and enabling self-service reporting for business users.
The VivaKi Nerve Center provides social media and other analytics for campaign monitoring and near real-time advertising effectiveness. This enables agencies in the Publicis Global Network to provide deep-dive analytics into TBs of data in seconds.
AIMIA provides self-service customer loyalty analysis on over 24 billion transactions: full volumes of POS data held live in memory. Retailers, consumer packaged goods companies and other service providers give merchandise managers “train-of-thought” analysis to better target customers.
Deployment models shown on the slide: Public Cloud, Private Cloud, Public Cloud, Software, Appliance
14. Context for media analytics:
• In‐memory analytical database for Big Data
• Correlate everything to everything
• MPP + Linear Scalability
• Predictable and ultra‐fast performance
• > 22 data sources
• Commodity servers/equipment
• Market‐available IT skills
• No solution re‐engineering
Solution Benefits
– Reports allow advertisers, networks and agencies to analyze the
relative strengths and weaknesses of different creative
executions, and how such variables as program environment,
time slots, and pod position impact their ROI
– Enables self‐service reporting for business users
Mars, Inc.:
“By using TRA to improve media plans, creative and
flighting, Mars has achieved a portfolio increase in ROI
versus a year ago of 25% in one category and 35% in a
second category.”
Challenges
– Expanding volumes of data
– Few opportunities for
summarization (demographics,
purchaser targets, etc.)
– Data too large/complex for
traditional database systems
– Need for simple administration
Analytics on tens of billions of events in
tens of seconds with NO DBA
15. Case Study: AIMIA
In-memory analytics enable market basket analysis with blazing speed
Background
Loyalty marketing company that provides marketing and consulting services to retailers, service providers, and consumer packaged goods companies. Their Self-Service application offers “train-of-thought” analysis with near real-time data processing, enabling clients to better target customers.
Challenge
• Offer a near real-time analytical environment where all EPOS transactions, not just sampled data, could be analyzed (improving statistical confidence)
• Enable analysts to write a query and have the DB execute it (no involvement from IT/DBAs)
Solution
AIMIA lands a Kognitio Analytical Appliance that they re-sell to each of their end-user clients, with years of full-volume EPOS transactions plus customer and product data (over 24 billion transactions currently). All transactions are held in memory for complex basket analysis-type queries.
Results
The best-tuned Oracle RAC query ran in 25 minutes; the same query on Kognitio ran in 3 minutes!
That was in the initial implementation, circa 2007.
Today, an average bundle of 12-18 queries runs in 90 seconds!
16. Gartner: Kognitio is “visionary”
Strengths - Commentary
• Consistent leadership with innovative pricing models
• Pioneered data warehouse SaaS
• Kognitio Cloud "on demand" cloud offering key for
growing clients
• Unique ability to switch between Cloud and Platform
• Meets Gartner Logical Data Warehouse concept
• Innovative Hadoop integration
• Great performance
• Consistently satisfied clients with its great
performance
• Makes it easier to use and run ad hoc queries
• Recognized the shift from traditional warehousing
• New features have extended capabilities to manage
external processes and data
19. The Kognitio Analytical Platform
• Why an “analytical platform”?
– In the burgeoning “big data” ecosystem, the volume, velocity and
variety of data require a new approach
• Disaggregation of persistent data storage and analytics
• Variety of BI Tools (MicroStrategy, Tableau, MS Excel, etc.)
• Introduce a new tier to accelerate, govern and increase flexibility
– Complement to Hadoop, EDWs, etc.
• MPP in-memory structure enables fast ad-hoc reporting
• Standard SQL, MDX, etc. to make Hadoop easy, consumable
• Tight integration enables an “information anywhere” approach