1
The Future of Big Data and Instantly Relevant Computing
Copyright © Clusterpoint 2015
Gints Ernestsons, founder
XML
JSON
ACID
TEXT
SQL
REST API
NOSQL
Distributed transactions
SEARCH
Big data
Real-time analytics
2
How to find relevant technology among the big data “hype”?
CLOUD
Complexity
PAAS
AAAS
IAAS
Private
Public
Hybrid
Virtualization
Containerization
Big Data
3
Perhaps AI and supercomputers can solve the big data problem?
AI can help, yet today it requires large computing costs and great effort
Baidu’s AI Supercomputer Beats Google at Image Recognition
MIT Technology Review, May 2015
Google's DeepMind Builds Artificial Intelligence That Mimics ... Human Brain
International Business Times, Nov 2014
Facebook Fights Info Overload With AIs That Identify What’s In Videos
TechCrunch, Mar 2015
Microsoft Challenges Google’s Artificial Brain With ‘Project Adam’
Wired, July 2014
4
CAT or DOG?
The method converts vision and voice data into text, using a neural network algorithm that labels images, video or audio with text
Deep learning in AI: the “Holy Grail” of big data computing?
If you believe authority, a cat detector for less than:
$1 M? $10 M? $1 BILLION?
5
“ Blind belief in authority is the greatest enemy of truth. ”
Albert Einstein
6
There are 2 main problems in ordinary databases and big data
Ordinary databases overload and overwhelm users with data
7
We have always been dealing with information overload ...
Prof. Clay Shirky, a new media writer on the social and economic effects of digital technologies, US
8
Relevance ranking is a method to address information overload
Weighting of all relevant human needs to determine the best action
Relevance ranking
Human needs
9
Relevance ranking of needs for your business product
Needs of your product customers
Needs of your business owners
Needs of your product end-users
Community needs
Employee needs
RELEVANCE RANKING (for example, scoring all needs from 0% to 100%)
10
How to select your cloud computing and big data technology?
Rank your most relevant human needs in big data computing!
11
John von Neumann's computing world
Small disk storage
Tiny RAM capacity
Slow CPU speed
Limited network bandwidth
Highly expensive hardware
Complex schemas, software, data
Most relevant need is an inexpensive computing infrastructure
Gordon Moore’s computing world
TBs of cheap HDD/SSD storage
GBs of RAM
Cheap multi-core CPUs
Gbps high-speed networks
Nearly expendable hardware
Web software (HTML, JSON, XML)
SQL
12
What is relevant today is instantly scalable distributed computing
CPU time required to process a request
It takes X seconds on a single server
It takes 100 times less clock time to get the result within the cloud.
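As a rough illustration of the claim on this slide, the sketch below estimates wall-clock time when the CPU time of one request is split evenly across many nodes. The function name `ideal_wall_clock` and the `overhead_seconds` parameter are assumptions made for this example only, and the 50-second request is a made-up figure. Real clusters rarely reach the ideal number, because some work cannot be parallelized.

```python
def ideal_wall_clock(cpu_seconds: float, nodes: int, overhead_seconds: float = 0.0) -> float:
    """Estimate wall-clock time for a request split evenly across `nodes`.

    Assumes perfectly parallel work; `overhead_seconds` is a hypothetical
    fixed cost for scattering the request and gathering the results.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return cpu_seconds / nodes + overhead_seconds


# Hypothetical request that needs 50 seconds of CPU time.
single_server = ideal_wall_clock(cpu_seconds=50.0, nodes=1)
cloud = ideal_wall_clock(cpu_seconds=50.0, nodes=100)
speedup = single_server / cloud
```

With zero overhead the speedup equals the node count; any non-zero `overhead_seconds` lowers it, which is why the slide's figure is best read as an upper bound.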
13
Relevant computing must be available, reliable, ubiquitous
REST API
http
https
tcp/ip
14
Cost-efficient sharing of computing infrastructure is relevant
Pay-per-use model, $
Resources
Time
Conventional provisioning model, $
Save 3x-10x
BIG DATA
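The comparison between the two models can be sketched with simple arithmetic. The example assumes that conventional provisioning pays for peak capacity in every period, while pay-per-use pays only for what is consumed. The function name `provisioning_costs`, the demand series and the unit price are all hypothetical, and the resulting ratio depends entirely on how spiky the demand is.

```python
def provisioning_costs(demand: list[float], unit_price: float) -> tuple[float, float]:
    """Return (conventional, pay_per_use) cost for a demand series.

    conventional: capacity is sized for the peak and paid for in every period.
    pay_per_use:  only the resources actually consumed are paid for.
    """
    conventional = max(demand) * len(demand) * unit_price
    pay_per_use = sum(demand) * unit_price
    return conventional, pay_per_use


# Hypothetical resource demand per period, with one spike.
demand = [2, 3, 2, 10, 3, 2, 2, 4]
conventional, pay_per_use = provisioning_costs(demand, unit_price=1.0)
savings_factor = conventional / pay_per_use
```

The spikier the demand, the larger `savings_factor` becomes; a flat demand curve gives a factor of 1, meaning no saving under these assumptions.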
15
Need to manage structured and unstructured data together
Easily mix / analyze all your data types
From structured data To unstructured data
XML JSON BLOB TEXT SQL NOSQL
16
XML / JSON / BLOB
Relevant for human productivity is flexible schema-free data
Ordinary database Document database
17
Iron-clad security and consistency for big data is very relevant
Secure high speed ACID-transactions
From single-computer security
To safe online transaction processing in big data
SQL
XML
JSON
BLOB
SQL
NoSQL
18
Humans need fast and relevant free text search in big data
Simple web-style search in big data as a norm
Natural language keyword (voice) search for ease of use
Ranking of search results to get rid of information overload
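To make the idea of ranked keyword search concrete, here is a minimal sketch that scores documents by how often the query keywords occur in them and returns only the top results. It is a deliberately simple term-count score, not the ranking used by Clusterpoint or any other product; the function name `rank_documents` and the sample documents are made up for illustration.

```python
from collections import Counter


def rank_documents(query: str, documents: dict[str, str], top_n: int = 3) -> list[tuple[str, int]]:
    """Rank documents by how often the query keywords occur in them."""
    keywords = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[word] for word in keywords)
        if score > 0:  # documents with no matching keyword are left out
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical documents, made up for illustration.
docs = {
    "a": "big data search with relevance ranking",
    "b": "relational database schema design",
    "c": "search search everywhere in big data",
}
results = rank_documents("big data search", docs)
```

Limiting the output to `top_n` is the part that addresses information overload: the user sees the few best matches instead of every document that contains a keyword.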
19
Instant responsiveness in big data search and analytics
PB
GB
TB
MB
Milliseconds for an instant search query
Minutes / hours for a SQL query
Low querying latency across billions of documents is relevant
XML
JSON
BLOB
NoSQL
20
replica 1
replica 2
replica 3
Relevant mission-critical features for 24/7 computing services
LOAD BALANCING
FAULT-TOLERANCE
HIGH-AVAILABILITY
SCALE OUT ABILITY
21
An ordinary database requires tons of effort
Your corporate business data
22
Popular approach recently: one can train a big animal
Your
big
data
23
No custom integration required
Custom “stitching” of all platforms
Big data management in one software platform is very relevant
Secure DB, ACID transactions
ONE API
Cut 80% off your TCO
Drive up to 10x faster
Instant Big data
scale out ability
Online analytics
on rich data
Search software,
full-text indexes
Your application “spaghetti” code
24
Develop your applications to scale for big data from day one
OPEX, TCO
Life-cycle
Save > 80%
Write your web or mobile software only once
Test Year 1 Year 2 Year 3 Year 4 Year N
25
What will happen in the computing industry?
Relational databases
will die in pain
NoSQL will go
extinct as well (!?)
26
Rank relevant needs to select your future big data technology
Cost-efficiency
Low latency
Instant scalability
Iron-clad security
Schema-free simplicity
High availability
Relevant text search
Real-time analytics
Needs of: Enterprise, Community, Web Corp.
[Chart: each need scored 0%, 50% or 100% for Enterprise, Community and Web Corp.]
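The ranking exercise on this slide can be sketched as a weighted sum: each need gets an importance weight from 0% to 100%, each candidate technology gets a fit score per need, and the candidates are ordered by the total. The function name `rank_technologies`, the candidate names, and all weights and scores below are hypothetical and are not taken from the slide.

```python
def rank_technologies(need_weights: dict[str, float],
                      technology_scores: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Order technologies by the weighted sum of how well they meet each need.

    need_weights:      need -> importance from 0.0 (0%) to 1.0 (100%).
    technology_scores: technology -> {need -> fit from 0.0 to 1.0}.
    """
    totals = {}
    for tech, fits in technology_scores.items():
        # A need the technology does not address counts as a fit of 0.
        totals[tech] = sum(weight * fits.get(need, 0.0)
                           for need, weight in need_weights.items())
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and fit scores, for illustration only.
weights = {"Cost-efficiency": 1.0, "Low latency": 0.5, "Iron-clad security": 1.0}
candidates = {
    "technology_a": {"Cost-efficiency": 1.0, "Low latency": 0.5, "Iron-clad security": 0.0},
    "technology_b": {"Cost-efficiency": 0.5, "Low latency": 1.0, "Iron-clad security": 1.0},
}
ranking = rank_technologies(weights, candidates)
```

Different groups, such as an enterprise or a community project, would supply different `weights`, so the same candidates can rank differently for each of them.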
27
The future of big data is instantly relevant computing
28
Who are we?
Clusterpoint is a European tech company, founded in 2006. Our unique database software is used by commercial customers mainly in EU & Nordic markets.
Photo: April 2015, AngelHack, San Francisco, USA
29
Management Team
Gints Ernestsons: Founder, Visionary
Jurgis Orups: CTO, Founder
Janis Sermulins: DB Software Architect
Zigmars Rasscevskis: CEO
Peteris Janovskis: Business Dev Director
15 years as CTO at Lursoft; 8 years as CEO at Clusterpoint; 25 years as a technology entrepreneur & bold innovator
8 years at Google; Technical Lead, Web search infrastructure core software engineering (Zurich, Switzerland)
9 years running the Clusterpoint core software engineering team; expert in C/C++, NoSQL, big data search
5 years at Google (Zurich); MIT alumnus; internship at Intel Research (USA)
12 years at Oracle; Alliance & Channel Director, Central and Eastern Europe
30
Try instantly relevant computing with Clusterpoint database!
Cost-efficient distributed document database with built-in search and analytics
XML
JSON
BLOB
REST API
ALL IN ONE PLATFORM
31
Email: support@clusterpoint.com
Phone USA: +1 (650) 681 9710
Phone Europe: +371 (2) 9243460
Scale your data when your need is the most relevant: instantly
Free 10 GB
• instant scalability
• no s/w deployment
• no h/w provisioning
• 365/24/7 managed
Free sign-up: www.clusterpoint.com
Cloud DBaaS
32
“For the modern customer, big data is all about big relevance.”
Source: Prof. Steven Van Belleghem, a writer, keynote speaker and inspirator, a thought leader in customer-centric marketing. His book was published in 2015.
Thank you for your attention!

Big Data Expo 2015 - Clusterpoint The Future of Big Data
