Kognitio - an overview


Overview of Kognitio - the company, our products and where they fit.

Published in: Technology


  1. The Proven Analytical Platform for Big Data. September 2013. Michael Hiskey, Vice President Marketing & Business Development
  2. Kognitio is an in-memory analytical platform, built from the ground up to satisfy large and complex analytics on big data sets: a massively parallel, in-memory analytical engine that interoperates with your existing infrastructure
  3. Kognitio
     • Founded in 1987
     • Privately held
     • Dev Labs in the UK
     • Leadership in US
     • ~100 employees
     Core product:
     • MPP in-memory analytical platform
     • Built from the ground up to satisfy large and complex analytics on big data sets
     Focused on providing the premier high-performance analytical platform to power business insight around the world
     [Chart: price of RAM (log10 scale), 1987 to 2010]
  4. Kognitio clients span the globe (*some clients NDA)
  5. Analytical Platform Reference Architecture
     [Diagram labels: Analytical Platform Layer; Near-line Storage (optional); Application & Client Layer; All BI Tools; All OLAP Clients; Excel; Persistence Layer; Hadoop Clusters; Enterprise Data Warehouses; Legacy Systems; Kognitio Storage; Reporting; Cloud Storage]
  6. Analytical Platform: Addressable Segments
     Segments: Acceleration for Traditional BI; Data Science / Advanced Analytics; SQL on Hadoop... and everything else
     • Improve performance of existing BI stack 10-100x without re-engineering
     • Cost-saving alternative to expanding large-scale EDWs
     • Enable tighter data security and BI Tool governance
     • Plug-and-Play with Hadoop
     • Analytical "Sandbox" for rapid Big Data projects
     • MPP in-memory code execution of standard languages (R, SAS, Python, Perl) in line with SQL
     • Ability to simply embed Big Data analytics into existing BI/Dashboard Tools without disruption
     • Ability to rapidly move discovery into production
     • Tight Hadoop Integration
     • In-memory over disk
     • Seamless integration: SQL, ODBC, JDBC, MDX, ODBO, XML/A etc.
     • Fast MPP data transfer
     • High-throughput, high-concurrency, low-latency interactive analytics
     • Core RDBMS architecture simplifies integration and brings ACID, DW qualities
     • Data Virtualization: Platform for LDW
     • Central shared controlled data models
  7. Kognitio Hadoop Integration
     • More than just a connector: tight integration*
       – Hadoop does what it is good at: storing and filtering data
       – Kognitio does what it is good at: complex analytics
     View definition: create view image shopdata as select prod, store, cust, cost from "transactions" where date > 1/1/12
     Query: select store, product_category, sum(cost) total_spend, customer_category customer_type, count (distinct cust) customers from shopdata sd, product_info p, customers c where sd.prod = p.prod_code and c.cust_id = sd.cust group by store, product_category, customer_type
     Request sent to the Hadoop cluster (transaction data): Give me prod, store, cust, cost from "hdfs files" where date > 1/1/12
     *Developed in co-operation with Sears (Metascale)
  8. Kognitio Hadoop Connectors
     HDFS Connector: fast load of complete files
     • Connector defines access to HDFS file system
     • External table accesses row-based data in HDFS
     • Dynamic access or "pin" data into memory
     • HDFS file(s) loaded into memory
     • Data filtering relies on data being partitioned into different directories/files within Hadoop
     Map Reduce Connector: filter from large files
     • Connector uploads Kognitio agent to Hadoop nodes
     • Query passes selections and relevant predicates to agent
     • Data filtering and projection takes place locally on each Hadoop node
     • Data filtered as it is read from file(s)
     • Only data of interest is transferred and loaded into memory via parallel load streams
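The Map Reduce Connector behaviour described on slide 8 (filter and project on each Hadoop node, transfer only the rows of interest) can be illustrated with a minimal Python sketch. This is not Kognitio's agent code: `scan_node`, `pushdown_query`, the sample rows and the extra `till` column are all hypothetical, and each "node" is simply an in-memory list of dictionaries.

```python
from datetime import date


def scan_node(rows, columns, predicate):
    """Filter and project rows on the node that holds them (hypothetical agent step)."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]


def pushdown_query(nodes, columns, predicate):
    """Collect only the filtered, projected rows from every node."""
    loaded = []
    for rows in nodes:
        # In a real cluster each node would scan in parallel; here we loop.
        loaded.extend(scan_node(rows, columns, predicate))
    return loaded


# Two hypothetical Hadoop nodes holding transaction rows.
nodes = [
    [
        {"prod": "p1", "store": "s1", "cust": "c1", "cost": 5.0,
         "date": date(2011, 12, 30), "till": 3},
        {"prod": "p2", "store": "s1", "cust": "c2", "cost": 7.5,
         "date": date(2012, 2, 1), "till": 1},
    ],
    [
        {"prod": "p3", "store": "s2", "cust": "c1", "cost": 2.0,
         "date": date(2012, 3, 15), "till": 2},
    ],
]

# Equivalent of: prod, store, cust, cost ... where date > 1/1/12
result = pushdown_query(
    nodes,
    ["prod", "store", "cust", "cost"],
    lambda row: row["date"] > date(2012, 1, 1),
)
```

Only two of the three rows, and four of the six columns, leave the nodes. The HDFS Connector, by contrast, loads complete files and relies on directory or file partitioning for any filtering.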
  9. MPP in-memory code execution
     NoSQL external scripting function:
     • SQL provides standard data access framework
       – Open, adaptable framework; pass data to/from any executable or interpreter
       – Fully flexible MPP execution of R, Python, Java, text parsing libraries etc.
     create interpreter perlinterp command '/usr/bin/perl' sends 'csv' receives 'csv' ;
     select top 1000 words, count(*)
     from (external script using environment perlinterp
           receives (txt varchar(32000))
           sends (words varchar(100))
           script S'endofperl(
             while(<>) {
               chomp();
               s/[,.!_]//g;
               foreach $c (split(/ /)) {
                 if($c =~ /^[a-zA-Z]+$/) { print "$c\n" }
               }
             }
           )endofperl'
           from (select comments from customer_enquiry))dt
     group by 1 order by 2 desc;
     Example: This reads long comments text from the customer enquiry table. The in-line Perl converts the long text into an output stream of words (one word per row), and the query selects the top 1000 words by frequency using standard SQL aggregation.
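For readers less familiar with Perl, the word-splitting logic on slide 9 can be sketched in Python. This is an illustration of the same steps (strip `,.!_`, split on single spaces, keep purely alphabetic tokens, then count), not a Kognitio script. The function names and sample comments are hypothetical, and the `Counter` step stands in for the SQL `group by` / `order by` aggregation.

```python
import re
from collections import Counter

PUNCT = re.compile(r"[,.!_]")      # same character class as the Perl s/[,.!_]//g
WORD = re.compile(r"^[a-zA-Z]+$")  # same test as the Perl /^[a-zA-Z]+$/


def words_from_comments(lines):
    """Yield one word per output row, mirroring the Perl loop on the slide."""
    for line in lines:
        cleaned = PUNCT.sub("", line.rstrip("\n"))
        for token in cleaned.split(" "):
            if WORD.match(token):
                yield token


def top_words(lines, limit=1000):
    """Stand-in for: select top N words, count(*) ... group by 1 order by 2 desc."""
    return Counter(words_from_comments(lines)).most_common(limit)


# Hypothetical comment rows, as they might arrive from customer_enquiry.
sample = ["Great service, thanks!\n", "Service was slow. Great staff_\n"]
result = top_words(sample, limit=3)
```

As in the Perl version, matching is case-sensitive, so `Service` and `service` are counted separately.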
  10. Using R code for ad-hoc external script
      create script environment rsint command '/usr/bin/Rscript --vanilla --slave';
      grant execute on script environment rsint to power-user;
      select *
      from (external script using environment rsint
            receives ( PRICE SMALLINT )
            sends ( PRICE INTEGER )
            script S'endofr(
              options(error = expression(q("no")))
              mydata<-read.csv(file=file("stdin"), header=FALSE)
              sink(, type="message")
              mydata$V1<-mydata$V1-100
              write.table(mydata, row.names = FALSE, col.names = FALSE, sep = "," )
            )endofr'
            from (select price from ITEM_SALE)) dt ;
      MPP Execution of R
      • Rows are read into data frame mydata
      • Data frame vectors (columns) automatically named V1, V2 etc.
      • Run math formula: in this case, simply subtract 100
      • Data frame rows returned to Kognitio
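The R example on slide 10 follows a simple pattern: read CSV rows from standard input, transform a column, write CSV rows back out. A minimal Python sketch of that pattern is shown below, assuming a Python script environment would exchange CSV over stdin/stdout the same way. The slides do not show such a definition, so the function name and the in-memory streams are purely illustrative.

```python
import csv
import io


def subtract_100(in_stream, out_stream):
    """Read one PRICE value per CSV row, subtract 100, write the result as CSV."""
    reader = csv.reader(in_stream)
    writer = csv.writer(out_stream, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow([int(row[0]) - 100])


# In-memory streams stand in for the rows sent to and returned from the script.
src = io.StringIO("250\n100\n99\n")
dst = io.StringIO()
subtract_100(src, dst)
output = dst.getvalue()
```

In a real script the call would presumably be `subtract_100(sys.stdin, sys.stdout)`, with each parallel instance processing its own share of the rows.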
  11. Kognitio Cloud
      PRIVATE CLOUD
      • Could be referred to as an "exclusive" hybrid cloud offering
      • Heritage from "DaaS" managed services
      • Kognitio 'hosted appliance'; Kognitio & Partner operated; exclusive 'bare metal'; monthly pricing; min. 1 year term; min. 256 GB RAM; notice required; multi-node; optimum configuration; limited customisation
      PUBLIC CLOUD
      AWS
      • On-demand 'hosted appliance'
      • Multi-node
      • Limited customisation
      Marketplace
      • On-demand 'hosted server'
      • Single node
      • Not customisable
      • Anonymous
      • Ready-to-use in-memory analytical platform leveraging Amazon Web Services (AWS) Elastic Compute Cloud (EC2) infrastructure
      • Hourly usage per CPU/server and TB of data (min. 7.5 GB RAM)
      • Automatic provisioning: minutes with pre-installed servers
      • Elastic scalability (up and down) to meet compute demand
      [Diagram labels: Single Node; Scale-out; Console / Services; Multi-node; CloudFormation]
  12. Cloud provides an ideal deployment scenario
      Cloud model can provide a way to quickly model, experiment, develop and build
      • Deploy to existing reporting tools
      • Pass ownership to IT
      • Cloud instances can be "temporary"
      • Repeatable framework
      [Graphic labels: Business Analyst; Business User; IT Admin; Data Scientist; "PRESS HERE…and cool Big Data stuff happens!"]
  13. Innovative client solutions
      [Deployment labels: Public Cloud; Private Cloud; Public Cloud; Software Appliance]
      • Orbitz leverages Kognitio Cloud to take large volumes of complex data (ingested in real time from web channels, demographic and psychographic data, customer segmentation and modeling scores) and turn it into actionable intelligence, allowing them to think of new ways of offering the right products and services to their current and prospective client base.
      • PlaceIQ provides actionable hyper-local Mobile BI location intelligence. They leverage Kognitio to extract intelligence from large amounts of place, social and mobile location-based data to create hyper-local, targetable audience profiles, giving advertisers the power to connect with consumers at the right place, at the right time, with the right message.
      • TiVo Research & Analytics uses 40 TBs of RAM to perform complex media analytics, cross-correlating data from over 22 sources with set-top box data to allow advertisers, networks and agencies to analyze the ROI of creative campaigns while they are still in flight, enabling self-service reporting for business users.
      • The VivaKi Nerve Center provides social media and other analytics for campaign monitoring and near real-time advertising effectiveness. This enables agencies in the Publicis Global Network to provide deep-dive analytics into TBs of data in seconds.
      • AIMIA provides self-service customer loyalty analysis on over 24 billion transactions: full volumes of POS data held live in memory. Retailers, Consumer Packaged Goods companies and other service providers give merchandise managers "train-of-thought" analysis to better target customers.
  14. Context for media analytics
      Challenges
      – Expanding volumes of data
      – Few opportunities for summarization (demographics, purchaser targets, etc.)
      – Data too large/complex for traditional database systems
      – Need for simple administration
      Solution
      • In-memory analytical database for Big Data
      • Correlate everything to everything
      • MPP + Linear Scalability
      • Predictable and ultra-fast performance
      • > 22 data sources
      • Commodity servers/equipment
      • Market-available IT skills
      • No solution re-engineering
      Benefits
      – Reports allow advertisers, networks and agencies to analyze the relative strengths and weaknesses of different creative executions, and how such variables as program environment, time slots, and pod position impact their ROI
      – Enables self-service reporting for business users
      Mars, Inc.: "By using TRA to improve media plans, creative and flighting, Mars has achieved a portfolio increase in ROI versus a year ago of 25% in one category and 35% in a second category."
      Analytics on tens of billions of events in tens of seconds with NO DBA
  15. Case Study: AIMIA
      In-memory analytics enable market basket analysis with blazing speed
      Background: Loyalty marketing company that provides marketing and consulting services to retailers, service providers, and consumer packaged goods companies. Their Self-Service application offers "train-of-thought" analysis with near real-time data processing, enabling clients to better target customers.
      Challenge:
      • Offer a near-time analytical environment where all EPOS transactions, not just sampled data, could be analyzed (to improve statistical confidence)
      • Enable analysts to write a query and have the database execute it (no involvement from IT/DBAs)
      Solution: AIMIA lands a Kognitio Analytical Appliance, which they re-sell to each of their end-user clients, with years of full-volume EPOS transactions plus customer and product data (over 24 billion transactions currently). All transactions are held in memory for complex basket analysis-type queries.
      Results: The best-tuned Oracle RAC query ran in 25 minutes; the same query on Kognitio ran in 3 minutes! That was in the initial implementation, circa 2007. Today, an average bundle of 12-18 queries runs in 90 seconds!
  16. Gartner: Kognitio is "visionary"
      Strengths - Commentary
      • Consistent leadership with innovative pricing models
      • Pioneered data warehouse SaaS
      • Kognitio Cloud "on demand" cloud offering key for growing clients
      • Unique ability to switch between Cloud and Platform
      • Meets Gartner Logical Data Warehouse concept
      • Innovative Hadoop integration
      • Great performance
      • Consistently satisfied clients with its great performance
      • Makes it easier to use and run ad hoc queries
      • Recognized the shift from traditional warehousing
      • New features have extended capabilities to manage external processes and data
  17. What others say about Kognitio…
  18. connect
      • www.kognitio.com
      • twitter.com/kognitio
      • linkedin.com/companies/kognitio
      • tinyurl.com/kognitio
      • youtube.com/kognitio
      • NA: +1 855 KOGNITIO
      • EMEA: +44 1344 300 770
  19. The Kognitio Analytical Platform
      • Why an "analytical platform"?
        – In the burgeoning "big data" ecosystem, the volume, velocity and variety of data require a new approach
          • Disaggregation of persistent data storage and analytics
          • Variety of BI Tools (MicroStrategy, Tableau, MS Excel, etc.)
          • Introduce a new tier to accelerate, govern and increase flexibility
        – Complement to Hadoop, EDWs, etc.
          • MPP in-memory structure enables fast ad-hoc reporting
          • Standard SQL, MDX, etc. to make Hadoop easy, consumable
          • Tight integration enables an "information anywhere" approach