2. Kognitio is an in-memory analytical platform
• Built from the ground up to satisfy large and complex analytics on big data sets
• A massively parallel, in-memory analytical engine that interoperates with your existing infrastructure
3. Kognitio
Kognitio is focused on providing the premier high-performance analytical platform to power business insight around the world.
• Privately held
• Dev Labs in the UK
• Leadership in US
• ~100 employees
Core product:
• MPP in-memory analytical platform
• Built from the ground up to satisfy large and complex analytics on big data sets
5. The Kognitio Analytical Platform
• Why an “analytical platform”?
– In the burgeoning “big data” ecosystem, the volume, velocity and
variety of data require a new approach
• Disaggregation of persistent data storage and analytics
• Variety of BI Tools (MicroStrategy, Tableau, MS Excel, etc.)
• Introduce a new tier to accelerate, govern and increase flexibility
– Complement to Hadoop, EDWs, etc.
• MPP in-memory structure enables fast ad-hoc reporting
• Standard SQL, MDX, etc. make Hadoop easy to consume
• Tight integration enables an “information anywhere” approach
7. What is an “In-memory” Analytical Platform?
• A database where all of the data of interest, or specific portions of the data, have been permanently pre-loaded into a computer's random access memory (RAM).
• Not a large cache
– Data is held in structures that take advantage of the properties of
RAM – NOT copies of frequently used disk blocks
– The database's query optimiser knows at all times exactly which data is in memory and which is not
8. Kognitio Analytical Platform
• A high performance in-memory analytical platform that
doesn’t require specialized servers
• Software
– Quick, simple deployment on commodity hardware or in the cloud
• Scalable
– Linear scale-out through best-of-breed parallelism
• Powerful
– Unrivalled MPP analytical performance
– Harnesses all CPU cores made available
• Low TCO
– Linux, commodity hardware, no special hardware needs
– SQL relational core familiar to most DBAs
9. For Analytics, the CPU is King
• The key metric of any analytical platform should be GB/CPU
– It needs to effectively utilize all available cores
– Hyper threads are NOT the equivalent of cores
• Interactive/ad hoc analytics:
– THINK data-to-core ratios ≈ 10GB data per CPU core
• Every cycle is precious – CPU cores need to be used efficiently
– Techniques such as “dynamic machine code generation”
• Careful – performance impact of compression:
– Makes disk-based databases go faster
– Makes in-memory databases go slower
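The ≈ 10GB-per-core rule of thumb above can be turned into a quick sizing estimate. The sketch below is illustrative only: the function names and the 16-cores-per-server figure are assumptions made for the example, not Kognitio sizing guidance.

```python
import math


def cores_needed(data_gb: float, gb_per_core: float = 10.0) -> int:
    """Physical cores needed to keep roughly gb_per_core GB of data per core."""
    if data_gb < 0 or gb_per_core <= 0:
        raise ValueError("data_gb must be >= 0 and gb_per_core must be > 0")
    return math.ceil(data_gb / gb_per_core)


def servers_needed(data_gb: float, cores_per_server: int,
                   gb_per_core: float = 10.0) -> int:
    """Servers needed, counting physical cores only (not hyper-threads)."""
    if cores_per_server <= 0:
        raise ValueError("cores_per_server must be > 0")
    return math.ceil(cores_needed(data_gb, gb_per_core) / cores_per_server)


# Example: 1 TB (1000 GB) of data on hypothetical 16-core servers
print(cores_needed(1000))        # cores at ~10GB per core
print(servers_needed(1000, 16))  # servers at 16 physical cores each
```

A lower GB-per-core target (more cores for the same data) trades hardware cost for faster ad hoc response.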
10. Speed & Scale from “True MPP”
• Memory & CPU on an individual server = NOWHERE near enough for big data
– Moore’s Law – The power of a processor doubles every two years
– Data volumes – Double every year!!
• The only way to keep up is to parallelise or scale-out
• Combine the RAM of many individual servers
• Many CPU cores spread across many CPUs, housed in many individual computers (1 to 1000+)
– Data is split across all the CPU cores
– All database operations are parallelised with no points of serialisation: this is true MPP
• Every CPU core in every server needs to be efficiently involved in every query
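The idea of splitting data across cores and having every core take part in every query can be sketched in a few lines. This is a minimal illustration and not Kognitio's engine: the function names are hypothetical, threads stand in for the CPU cores of many servers, and the final merge is done serially here for brevity, whereas the slide describes an engine with no such serial step.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Round-robin split of rows into n_parts slices, one per worker."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(rows):
    """Each worker aggregates only its own slice: key -> running total."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def parallel_group_sum(rows, n_workers=4):
    """GROUP BY key, SUM(amount) computed as per-partition partials, then merged."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, parts))
    merged = Counter()
    for p in partials:
        merged.update(p)  # Counter.update adds the partial totals together
    return dict(merged)


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(parallel_group_sum(rows, n_workers=3))
```

The point of the pattern is that no worker ever needs another worker's rows to produce its partial result.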
11. Free to use - Get started now
Try it now: http://www.kognitio.com/free
12. Kognitio Cloud
Kognitio Cloud is a ready-to-use analytical platform. A
secure Platform-as-a-Service (PaaS) available as either a
Private or Public Cloud, it leverages the cloud computing
model to make the Kognitio Analytical Platform available
on a subscription basis.
PRIVATE CLOUD
• Could be referred to as an “exclusive” hybrid cloud offering
• Kognitio was the first to offer “Data-warehousing-as-a-Service” (DaaS) in 1993, as a hosted managed-services model
• Designed for clients who require a secure, dedicated environment without the skills requirement and capital overhead associated with traditional, in-house analytical implementations
PUBLIC CLOUD
• Ready-to-use in-memory analytical platform leveraging Amazon Web Services (AWS) Elastic Compute Cloud (EC2) infrastructure
• Based on hourly usage per CPU/server and TB of data
• Suitable for use cases with unpredictable usage patterns
• Automatic provisioning in minutes with pre-installed servers
• Elastic scalability (up and down) to meet compute demand
Cloud model enables multiple advantages
Fast execution / time-to-value
• Attractive to Line-of-Business functions
• No software or hardware to buy, install, maintain or upgrade
• Analysis projects can be brought to life quickly and easily
Flexibility
• PaaS model eliminates setup, maintenance and servicing
• Enabling delivery of complex analytics to business users
• “Sandbox” environment for development and testing
Lower costs
• Avoid CapEx with only OpEx charges based on usage/subscription level
• Support and maintenance amortization across relevant contract periods
13. Analytics from the business user-down
[Diagram labels: Business User, Business Analyst, IT; figure: sample report table]
1. Understand the business problem
2. Define the requirements
• Forecast ROIs and iteration
3. Perform a Kognitio Cloud Assessment
4. Execute a cloud agreement with Kognitio
5. Build the application
6. Test and deploy the solution
7. Ongoing development & improvement
Enables the Business:
• Fast integration and time-to-value
• Iterative “Sandbox” approach
• Reduced risk
14. Deploy with other technologies on AWS
• One click to launch!
• Automatic deployment of Kognitio and BI tools on Amazon Web Services
• Self-service BI: NeutrinoBI at nbi.kognitiocloud.com
• Pre-loaded, ready-to-use sample data in the cloud for use and demonstration
• Multi-node and single-server self-paced demonstrations
• Videos, instructional information
• Kognitio Community forum on LinkedIn
15. Public Cloud multi-node via CloudFormation
• Kognitio configured as a multi-node deployment
• Available as a trial platform on-demand
• kognitio.kognitiocloud.com
• Few steps to deployment
16. New! Kognitio version 8: Enabling and extending the Analytical Platform
General Availability: June 2013
• External Functions
• Not Only SQL
• External Tables
• Kognitio Storage as an External table
• Hadoop Connector
• Other Connectors
17. Kognitio Hadoop Integration
• Developed in co-operation with Sears (Metascale)
• More than just a connector – tight integration
– Hadoop does what it is good at – filtering data
– Kognitio does what it is good at – complex analytics
[Diagram: Kognitio defines: Create view image “name” as select “field1, field2” from “table” where date > 1/1/12. The Hadoop cluster is asked: Give me field1, field2 from “file” where date > 1/1/12, and returns the data. Near-line storage is optional.]
Example query:
  Select
    Merchant_Group,
    to_char(Num_Accounts,'999,999') Num_Accounts,
    to_char(Num_Transactions, '999,999,999') Num_Trans,
    to_char(cast(Total_spend as dec(15,2)), '999,999,999') || ' K' Total_Spend_K
  from
    (select MG.GroupDesc Merchant_Group, count(distinct Account_ID) as Num_Accounts,
            count(*) as Num_Transactions, sum(Transaction_Amount) as Total_Spend
     from demo_fs.V_Fin_CC_Trans T, demo_fs.V_Fin_Merchant M, demo_fs.V_Fin_Merch_Group MG
     where T.Merchant_Category = M.CategoryNo and M.GroupNo=MG.GroupNo and
           upper(Location) in (select distinct upper(Town) from
                               demo_fs.V_Fin_Postcodes where upper(Town) like '%LOW%')
     group by MG.GroupDesc ) SQ1
  order by Num_Accounts desc;
18. Kognitio Hadoop Connectors
HDFS Connector – fast load of complete files
• Connector defines access to HDFS file system
• External table accesses row-based data in HDFS
• Dynamic access or “pin” data into memory
• Complete HDFS file is loaded into memory
• Data filtering requires data to be partitioned into different files within Hadoop
MapReduce Connector – filter from large files
• Connector uploads agent to Hadoop nodes
• Query passes selections and relevant predicates to agent
• Data filtering and projection take place locally on each Hadoop node
• Only data of interest is loaded into memory via parallel load streams
• Data can be filtered within a file
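The difference between the two connectors comes down to where filtering happens. The sketch below contrasts the two load styles on toy data; it is a conceptual illustration only, with hypothetical function names and in-memory lists standing in for files on Hadoop nodes, and it does not use any Kognitio or Hadoop API.

```python
def full_file_load(node_files):
    """HDFS-connector style: every row of every file is transferred."""
    return [row for rows in node_files for row in rows]


def pushdown_load(node_files, predicate, columns):
    """MapReduce-connector style: each node filters and projects locally,
    so only the rows and columns of interest are transferred."""
    loaded = []
    for rows in node_files:  # conceptually runs on each Hadoop node in parallel
        loaded.extend(
            {c: row[c] for c in columns} for row in rows if predicate(row)
        )
    return loaded


# Two "nodes", each holding part of one logical file
node_files = [
    [{"id": 1, "date": "2011-12-31", "amount": 10},
     {"id": 2, "date": "2012-02-01", "amount": 20}],
    [{"id": 3, "date": "2012-03-15", "amount": 30}],
]

everything = full_file_load(node_files)
recent = pushdown_load(node_files,
                       predicate=lambda r: r["date"] > "2012-01-01",
                       columns=["id", "amount"])
print(len(everything), "rows transferred without pushdown")
print(len(recent), "rows transferred with pushdown")
```

With the full-file style, the same filter can only be applied after loading, which is why the slide notes that data must be partitioned into separate files to limit what gets loaded.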
19. Not Only SQL
Kognitio External Scripts
– Run third-party binaries or scripts embedded within SQL
• Flexible framework to pass data to/from any executable or interpreter
• Full MPP execution of Perl, Python, Java, R, SAS, etc.
• Any number of rows in/out, partitioning controls
20. Not Only SQL: any language in-line
Kognitio External Scripts
– Run third-party binaries or scripts embedded within SQL
• Perl, Python, Java, R, SAS, etc.
• One-to-many rows in, zero-to-many rows out, one to one
Example: this reads long comments text from the customer enquiry table; in-line Perl converts the long text into an output stream of words (one word per row); the query selects the top 1000 words by frequency using standard SQL aggregation.
  create interpreter perlinterp
  command '/usr/bin/perl' sends 'csv' receives 'csv' ;
  select top 1000 words, count(*)
  from (external script using environment perlinterp
        receives (txt varchar(32000))
        sends (words varchar(100))
        script S'endofperl(
          while(<>)
          {
            chomp();
            s/[,.!_]//g;
            foreach $c (split(/ /))
            { if($c =~ /^[a-zA-Z]+$/) { print "$c\n"} }
          }
        )endofperl'
        from (select comments from customer_enquiry))dt
  group by 1
  order by 2 desc;
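For readers who do not use Perl, the same logic can be sketched in Python, one of the other languages the slide lists. This is a standalone illustration of what the in-line script and the surrounding SQL aggregation compute together; the function names are hypothetical and it does not show how a Python script would be registered with Kognitio.

```python
import re
from collections import Counter

PUNCTUATION = re.compile(r"[,.!_]")  # same characters the Perl s/[,.!_]//g strips
LETTERS_ONLY = re.compile(r"[a-zA-Z]+")


def words_from_comment(text):
    """Yield one word per output row, mirroring the in-line Perl loop."""
    cleaned = PUNCTUATION.sub("", text.rstrip("\n"))
    for token in cleaned.split(" "):
        if LETTERS_ONLY.fullmatch(token):
            yield token


def top_words(comments, n=1000):
    """Equivalent of: select top n words, count(*) ... group by 1 order by 2 desc."""
    counts = Counter()
    for comment in comments:
        counts.update(words_from_comment(comment))
    return counts.most_common(n)


print(top_words(["Great service, great price!", "great value"], n=3))
```

As in the Perl version, matching is case-sensitive, so "Great" and "great" are counted as different words.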
21. Innovative client solutions
• TiVo Research & Analytics (Software): 40 TB of RAM performs complex media analytics, cross-correlating data from over 22 sources with set-top box data to allow advertisers, networks and agencies to analyze the ROI of creative campaigns while they are still in flight, enabling self-service reporting for business users.
• VivaKi (Public Cloud): The VivaKi Nerve Center provides social media and other analytics for campaign monitoring and near real-time advertising effectiveness. This enables agencies in the Publicis Global Network to provide deep-dive analytics into TBs of data in seconds.
• AIMIA (Appliance): AIMIA provides self-service customer loyalty analysis on over 24 billion transactions, full volumes of POS data held live in memory. Retailers, consumer packaged goods companies and other service providers can give merchandise managers “train-of-thought” analysis to better target customers.
• Orbitz (Private Cloud): Orbitz leverages Kognitio Cloud to take large volumes of complex data, ingested in real time from web channels, demographic and psychographic data, customer segmentation and modeling scores, and turn it into actionable intelligence, allowing it to think of new ways of offering the right products and services to its current and prospective client base.
• PlaceIQ (Public Cloud): PlaceIQ provides actionable hyper-local Mobile BI location intelligence. They leverage Kognitio to extract intelligence from large amounts of place, social and mobile location-based data to create hyper-local, targetable audience profiles, giving advertisers the power to connect with consumers at the right place, at the right time, with the right message.
22. Analytics on tens of billions of events in tens of seconds with NO DBA
Context for media analytics:
• In-memory analytical database for Big Data
• Correlate everything to everything
• MPP + Linear Scalability
• Predictable and ultra-fast performance
• > 22 data sources
• Commodity servers/equipment
• Market-available IT skills
• No solution re-engineering
Challenges
– Expanding volumes of data
– Few opportunities for summarization (demographics, purchaser targets, etc.)
– Data too large/complex for traditional database systems
– Need for simple administration
Solution Benefits
– Reports allow advertisers, networks and agencies to analyze the relative strengths and weaknesses of different creative executions, and how such variables as program environment, time slots, and pod position impact their ROI
– Enables self-service reporting for business users
Mars, Inc.:
“By using TRA to improve media plans, creative and flighting, Mars has achieved a portfolio increase in ROI versus a year ago of 25% in one category and 35% in a second category.”
23. Case Study: AIMIA
In-memory analytics enable market basket analysis with blazing speed
Background
Loyalty marketing company that provides marketing and consulting services to retailers, service providers, and consumer packaged goods companies. Their Self-Service application offers “train-of-thought” analysis with near real-time data processing, enabling clients to better target customers.
Challenge
• Offer a near real-time analytical environment where all EPOS transactions, not just sampled data, could be analyzed (improves statistical confidence)
• Enable analysts to write a query and have the database execute it (no involvement from IT/DBAs)
Solution
AIMIA lands a Kognitio Analytical Appliance that they re-sell to each of their end-user clients, with years of full-volume EPOS transactions + customer + product data (over 24 billion transactions currently). All transactions are held in memory for complex basket analysis-type queries.
Results
• The best-tuned Oracle RAC query ran in 25 minutes; the same query on Kognitio ran in 3 minutes!
• That was in the initial implementation, circa 2007.
• Today, an average bundle of 12-18 queries runs in 90 seconds!
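The figures quoted in the results can be put side by side with some simple arithmetic. The per-query numbers below are plain averages and assume the queries in a bundle run one after another, which the slide does not state; the function names are made up for the example.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """How many times faster the new run is than the baseline."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be > 0")
    return baseline_seconds / new_seconds


def per_query_range(bundle_seconds: float, min_queries: int, max_queries: int):
    """(fastest, slowest) average seconds per query for a bundle of queries."""
    return bundle_seconds / max_queries, bundle_seconds / min_queries


# Initial implementation: 25 minutes on Oracle RAC vs 3 minutes on Kognitio
initial_speedup = speedup(25 * 60, 3 * 60)
print(f"Initial speedup: about {initial_speedup:.1f}x")

# Today: a bundle of 12-18 queries in 90 seconds
fastest, slowest = per_query_range(90, 12, 18)
print(f"Average per query: {fastest:.1f} to {slowest:.1f} seconds")
```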
26. Think differently about business analytics
Business users require:
• True ad-hoc analysis
• Performance “at the glass”
• Less reliance on IT
• Evolution required for Big Data Analytics:
– Lower reliance on OLAP cubes and associated admin.
– Stop building multiple dependent data marts, databases, etc.
– Bring Hadoop into new use cases:
• “Dark Data”: Web, Social, History, etc.
• Enable noSQL interoperability with existing tools