Big Data LDN 2018: PROGRESS FOR BIG DATA IN KUBERNETES
2. 2 © 2018 MapR Technologies, Inc. // MapR Confidential
Contact Information
Ted Dunning, PhD
Chief Application Architect, MapR Technologies
Board member, Apache Software Foundation
O’Reilly author
Email: tdunning@mapr.com, tdunning@apache.org
Twitter: @ted_dunning
3.
kubernetes is coming!
4.
why?
5.
Source: Shippable.com http://blog.shippable.com/why-the-adoption-of-kubernetes-will-explode-in-2018
kubernetes = major community support
6.
every cloud supports kubernetes
https://www.sinax.be/en/aws/
https://www.westconcomstor.com/za/en/vendors/wc-vendors/microsoft-azure-EN-UK.html
https://www.g2crowd.com/products/google-kubernetes-engine-gke/details
7.
massive customer adoption rate
9.
what is kubernetes?
10.
kubernetes (n.) - Greek word for pilot or helmsman
11.
kubernetes started life as a successor to Google’s Borg project...
https://cloud.google.com/security/encryption-in-transit/
12.
kubernetes is an ecosystem...
Source: Redmonk - http://redmonk.com/sogrady/2017/09/22/cloud-native-license-choices/
13.
container and resource orchestration engine...
14.
Source: Shippable.com http://blog.shippable.com/why-the-adoption-of-kubernetes-will-explode-in-2018
kubernetes won the container orchestration war...
15.
what is kubernetes?
16.
it runs containers
17.
what is a container?
18.
not a vm
19.
vm vs container
- vm: hardware, os, hypervisor, then each vm carries its own os, libs, app
- container: hardware, os, then each container carries only its libs, app
20.
pets vs cattle
https://fwallpapers.com/view/cat-jeans
http://www.clipartpanda.com/clipart_images/free-clip-art-1083418
21.
pets vs cattle
pets:
- long lived
- name them
- care for them
cattle:
- ephemeral
- brand them with #’s
- well... vets are expensive
23.
container = image + isolation
Files → Container Image
chroot
cgroups
● cpu
● memory
● network
● etc.
namespaces
● pids
● mnts
● etc.
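The isolation half of this slide can be inspected from inside any Linux process. The sketch below is only an illustration, not part of the talk: it reads the standard `/proc/<pid>/cgroup` and `/proc/<pid>/ns` entries, and the function names `parse_cgroup_line` and `describe_isolation` are made up for this example.

```python
import os
from pathlib import Path


def parse_cgroup_line(line: str) -> dict:
    """Split one /proc/<pid>/cgroup line: hierarchy-ID:controller-list:cgroup-path."""
    hierarchy_id, controllers, path = line.strip().split(":", 2)
    return {
        "hierarchy_id": int(hierarchy_id),
        # cgroup v2 lines have an empty controller list ("0::/path")
        "controllers": [c for c in controllers.split(",") if c],
        "path": path,
    }


def describe_isolation(pid: str = "self") -> dict:
    """Report the cgroups and namespaces a process belongs to (Linux only)."""
    proc = Path("/proc") / pid
    lines = (proc / "cgroup").read_text().splitlines()
    cgroups = [parse_cgroup_line(line) for line in lines if line]
    # Each entry in /proc/<pid>/ns is a symlink such as "pid:[4026531836]"
    namespaces = {entry.name: os.readlink(entry) for entry in (proc / "ns").iterdir()}
    return {"cgroups": cgroups, "namespaces": namespaces}


if __name__ == "__main__":
    if Path("/proc/self/cgroup").exists():
        print(describe_isolation())
```

Run inside and outside a container, the namespace identifiers and cgroup paths differ, which is the isolation the slide refers to.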
24.
containers are good
26.
containers have a problem
27.
you can never get away from pets unless:
- you handle the problem of container state
- you have an environment that supports cattle
MapR and kubernetes are the solution
28.
Things docker can’t (or won’t) do...
- solve port mapping hell
- monitor running containers
- handle dead containers
- move containers so utilization improves
- autoscale container instances to handle load
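Two of these items, handling dead containers and autoscaling, come down to comparing desired state with observed state. The sketch below is a toy model and not Kubernetes code. `desired_replicas` follows the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler (ceil of current replicas times the metric ratio), leaving out its tolerance and stabilization details. `reconcile` and its action strings are hypothetical.

```python
import math
from typing import Dict, List


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale in proportion to how far the observed metric is from its target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def reconcile(desired: int, alive: Dict[str, bool]) -> List[str]:
    """List the actions that bring the observed containers to the desired count."""
    # Dead containers are replaced rather than repaired (cattle, not pets).
    actions = [f"replace {name}" for name, ok in sorted(alive.items()) if not ok]
    count_after_replace = len(alive)
    if count_after_replace < desired:
        actions += ["start new"] * (desired - count_after_replace)
    elif count_after_replace > desired:
        actions += ["stop one"] * (count_after_replace - desired)
    return actions


if __name__ == "__main__":
    target = desired_replicas(current_replicas=4, current_metric=90, target_metric=60)
    print(target, reconcile(target, {"web-1": True, "web-2": False}))
```

An orchestrator runs this kind of loop continuously, so nobody has to notice a dead container by hand.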
29.
Magical View of Kubernetes
30.
Magical View of Kubernetes
Kubernetes starts application containers “somewhere”
[diagram: App 1 on Kubernetes]
31.
Magical View of Kubernetes
Later containers may be started elsewhere due to “affinities”
[diagram: App 1, App 3 on Kubernetes]
32.
Magical View of Kubernetes
Kubernetes provides super fast naming via DNS so containers can find each other
[diagram: App 1, App 2, App 3 on Kubernetes]
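As a rough illustration of DNS-based naming: a Kubernetes Service is conventionally reachable at `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local`. The helper names and the `app2` and `prod` values below are assumptions made up for this sketch. `resolve` only works from inside a cluster where that Service exists.

```python
import socket
from typing import List


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the conventional in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(service: str, port: int, namespace: str = "default") -> List[str]:
    """Look up the Service address; no machine names are involved."""
    infos = socket.getaddrinfo(service_dns_name(service, namespace), port,
                               proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # App 1 finds App 2 by name only (hypothetical service and namespace).
    print(service_dns_name("app2", "prod"))
```

The caller never learns which node the target container landed on, which is the point of the next slide.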
33.
Note that you don’t think about which machine at all
34.
You don’t think about which machine at all
No more names from The Hobbit
Just cattle
35.
The Impact of Kubernetes
Software engineering can be viewed as freezing bits
Initially, everything is possible, nothing is actual
We freeze the source
Then the binary
Then the package
Then the environment
Ultimately the system
40.
Build: git, cc/ld, java/jar
Package: docker build (resources, config, libraries)
Construct: helm package
Deploy: helm install/scale, load balancer
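The stages on this slide can be written down as the commands a build script might issue. This is a sketch under assumptions: the image tag, chart directory, release, and deployment names are placeholders, and the `helm install <release> <chart>` form is Helm 3 syntax, whereas Helm 2 (current when this talk was given) used `helm install --name`. Scaling is shown with `kubectl scale`, since Helm has no scale subcommand of its own.

```python
import shlex
from typing import List


def pipeline_commands(image: str, chart_dir: str, release: str,
                      deployment: str, replicas: int) -> List[List[str]]:
    """Return one command per freezing step; nothing is executed here."""
    return [
        ["docker", "build", "-t", image, "."],       # freeze the package
        ["helm", "package", chart_dir],               # freeze the environment
        ["helm", "install", release, chart_dir],      # deploy the system
        ["kubectl", "scale", f"deployment/{deployment}", f"--replicas={replicas}"],
    ]


if __name__ == "__main__":
    # All names below are hypothetical.
    for cmd in pipeline_commands("demo:1.0", "./chart", "demo", "demo", 3):
        print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute the step.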
41.
This is glorious
42.
but we still have a problem
43.
state
45.
Not Done Yet
Load balancer
Here’s the problem
46.
Not Really Ready at All
State in containers messes things up
Restarts lose the state
Replicating state makes services complex
Application developers just aren’t systems developers
State life-cycle doesn’t match app life-cycle
Load balancer
Here’s the problem
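A small simulation of "restarts lose the state". Both counter classes and `simulate_restart` are invented for this sketch. The file-backed variant stands in for any storage that outlives the container.

```python
import json
import tempfile
from pathlib import Path


class InMemoryCounter:
    """State lives in the process, so it dies with the container."""

    def __init__(self) -> None:
        self.count = 0

    def hit(self) -> int:
        self.count += 1
        return self.count


class FileBackedCounter:
    """State lives outside the process and survives a restart."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.count = json.loads(path.read_text())["count"] if path.exists() else 0

    def hit(self) -> int:
        self.count += 1
        self.path.write_text(json.dumps({"count": self.count}))
        return self.count


def simulate_restart(make_counter, hits_before: int = 3) -> int:
    """Hit a counter, 'restart' it by building a fresh one, then hit once more."""
    first = make_counter()
    for _ in range(hits_before):
        first.hit()
    second = make_counter()  # the replacement container
    return second.hit()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "state.json"
        print(simulate_restart(InMemoryCounter))                        # starts over
        print(simulate_restart(lambda: FileBackedCounter(state_file)))  # continues
```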
47.
What is a Service Anyway?
Load balancer
RPC in
48.
But … Not Entirely
Synchronous RPC-based services only serve one need
In a synchronous service it’s common to do some work immediately and defer the rest
But deferring work is hard in a synchronous world … we have to give up the return call in some sense
This is the germ of streaming architecture
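A minimal sketch of "do some, defer some", assuming a hypothetical order-handling service. The handler validates and answers the caller synchronously, then places the remaining work on a queue. `queue.Queue` is only an in-process stand-in for a persistent stream.

```python
import queue
from typing import List


def handle_order(order: dict, deferred: "queue.Queue[dict]") -> dict:
    """Do the part the caller must wait for, then defer the rest."""
    if order["quantity"] <= 0:
        raise ValueError("quantity must be positive")
    # The caller gets no return value for this part: it happens later.
    deferred.put({"type": "send_receipt", "order_id": order["id"]})
    return {"order_id": order["id"], "status": "accepted"}


def drain(deferred: "queue.Queue[dict]") -> List[dict]:
    """Collect whatever deferred work is waiting (a worker would process it)."""
    items = []
    while True:
        try:
            items.append(deferred.get_nowait())
        except queue.Empty:
            return items


if __name__ == "__main__":
    backlog: "queue.Queue[dict]" = queue.Queue()
    print(handle_order({"id": 7, "quantity": 2}, backlog))
    print(drain(backlog))
```

The deferred message has no return path to the original caller, which is the sense in which the return call is given up.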
49.
What is a Service Anyway?
Load balancer
RPC in
Deferred
50.
Isolation is The Defining Characteristic
If I can hide details of who and where, I have a service
If I can hide details of deployment, I have a micro-service
If I can hide details of when, I have a streaming micro-service
51.
Temporal and Geo Isolation
Producer
Consumer isn’t even running
52.
Temporal and Geo Isolation
Producer
Consumer
53.
Temporal and Geo Isolation
Producer
Consumer
Consumer could be an ocean away
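Temporal isolation can be sketched with an append-only log and a consumer that tracks its own offset. The producer can write while no consumer exists, and a consumer started later still sees every message. `Stream` and `Consumer` are toy classes for illustration, not the API of any real streaming system.

```python
from typing import List


class Stream:
    """Append-only log: messages stay readable after they are written."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, message: str) -> int:
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read_from(self, offset: int) -> List[str]:
        return self._messages[offset:]


class Consumer:
    """Keeps its own position, so it can start or resume at any time."""

    def __init__(self, stream: Stream) -> None:
        self.stream = stream
        self.offset = 0

    def poll(self) -> List[str]:
        batch = self.stream.read_from(self.offset)
        self.offset += len(batch)
        return batch


if __name__ == "__main__":
    stream = Stream()
    stream.append("a")           # consumer isn't even running yet
    stream.append("b")
    late = Consumer(stream)      # started later, possibly far away
    print(late.poll())
```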
54.
We Need Multiple Forms of Persistence
Files are important
• Config files, image files, archival data
• Legacy applications like machine learning, web
Tables are important
• Critical to have random update for some applications
• Should scale transparently without dedicated cluster
Streams are important
• Should be co-equal form of persistence
55.
App 1 App 2 App 3
56.
App 1 App 2 App 3
stream
LogFile
60.
What Does This Data Platform Need to Have?
Global namespace across entire Kubernetes cluster
• Between clusters as well if possible
All three forms of primitive persistence
• Files, streams, tables
Inherently scalable
• Performance, cardinality, locality
Uniform access and control
• Path names for all objects, identical permission scheme
Oh… got that already. Just need to wire it up to Kubernetes
63.
Normally pods interact directly with node resources
[diagram: kubelet, docker, pod]
64.
We can install a volume plugin (recently introduced)
[diagram: kubelet, docker, pod, plugin]
65.
This allows uniform access to files, tables and streams
[diagram: mapr-fs, kubelet, docker, pod, plugin, fuse]
66.
Where does that take us?
67.
Consequences
Installation of the plugin is a K8S-level operation
• No per-node attention required
Use of the plugin is an overlay operation
• No change needed for any container
• Any Helm chart can use the plugin for conventional file access
Can share storage/compute or isolate or scale independently
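"No change needed for any container" can be pictured as a pod spec in which the storage appears as an ordinary mount. The sketch below builds a plain Kubernetes Pod manifest with a `persistentVolumeClaim` volume. It assumes the claim is bound to storage served by the volume plugin and shows nothing MapR-specific. The pod name, image, claim name, and mount path are hypothetical.

```python
import json


def pod_with_volume(name: str, image: str, claim_name: str, mount_path: str) -> dict:
    """Build a Pod manifest whose container sees shared storage as a directory."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,  # unmodified image: it just reads and writes files
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


if __name__ == "__main__":
    manifest = pod_with_volume("etl", "python:3.11", "shared-data", "/data")
    print(json.dumps(manifest, indent=2))  # kubectl also accepts JSON manifests
```

Only the volume definition knows where the data lives. The container image stays the same.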
68.
More Consequences
State is no longer a dirty word for Kubernetes
HPC can run on K8S
Boring things can run on K8S without storage appliances
Previously crazy ideas can now be valuable
Complexity is largely not visible
70.
Container orchestration is awesome
Data orchestration is, too
71.
Cloud as-is: No unified data access or security concepts
Single cloud vendor strategy:
• Vendor lock in
• No failover in case of global outage
• Limited Edge capabilities
AWS Services:
• Kinesis & Elastic MapReduce
• Redshift & DynamoDB
• S3 & Glacier
[diagram: Application ✓, API Connector, API, Public Cloud]
72.
Cloud as-is: No unified data access or security concepts
Multi cloud strategy:
• Complex data movement between clouds
• On any other cloud:
• Different APIs: application breaks
• Different security concept
AWS Services:
• Kinesis & Elastic MapReduce
• Redshift & DynamoDB
• S3 & Glacier
Azure Services:
• HDInsight
• SQL Server & CosmosDB
• Blob & DataLakeStore
[diagram: Application ✓, API Connector, one API per Public Cloud]
73.
Cloud as-is: No unified data access or security concepts
Multi cloud strategy:
• Complex data movement between clouds
• On any other cloud:
• Different APIs: application breaks
• Different security concept
[diagram: Application ✓, API Connector, a separate API for each of Edge, Private Cloud, On Premise, and three Public Clouds]
74.
How a “Media Company” is Unifying Compute Environments
Uniform computing environment everywhere
• Unified Security Model
• Data access decoupled from physical storage location. Globally.
• No lock-in to proprietary APIs
• Full openness
• Data made portable
[diagram: Application ✓, API Connector, Open APIs, GLOBAL DATA MANAGEMENT across Edge, Private Cloud, On Premise, and three Public Clouds]
75.
How “Manufacturing Company” is Orchestrating Data
Platform level data replication
• Unified Security Model
• Data access decoupled from physical storage location. Globally.
• No lock-in to proprietary APIs
• Full openness
• Data made portable
[diagram: Application ✓, API Connector, Open APIs, GLOBAL DATA MANAGEMENT across Edge, Private Cloud, On Premise, and three Public Clouds]
76.
Tier 1 Bank #1: Creating a Global Filesystem
Global access to local data
/mapr
/mapr/edge1
/mapr/edge2
/mapr/edge3
/mapr/newyork
/mapr/amsterdam
/mapr/azure
/mapr/gcp
/mapr/aws-eu-west
NFS, POSIX, HDFS, REST
Kafka, JSON, HBASE, SQL, S3
HOT, WARM, COLD
Application ✓
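The naming scheme on this slide puts every cluster under one root, so a location is just a path. The helpers below are a sketch of that idea only. The function names and the example file paths are hypothetical. Only the `/mapr/<cluster>` prefix comes from the slide.

```python
from pathlib import PurePosixPath
from typing import Tuple

GLOBAL_ROOT = PurePosixPath("/mapr")


def global_path(cluster: str, relative: str) -> str:
    """Address an object in any cluster through the single global root."""
    return str(GLOBAL_ROOT / cluster / relative.lstrip("/"))


def split_global_path(path: str) -> Tuple[str, str]:
    """Recover (cluster, path within cluster) from a global path."""
    parts = PurePosixPath(path).parts  # e.g. ('/', 'mapr', 'edge1', 'sensors', 'a.json')
    if len(parts) < 3 or parts[:2] != ("/", "mapr"):
        raise ValueError(f"not under {GLOBAL_ROOT}: {path}")
    return parts[2], "/".join(parts[3:])


if __name__ == "__main__":
    # Hypothetical file names; the cluster names appear on the slide.
    print(global_path("newyork", "trades/2018.csv"))
    print(split_global_path("/mapr/edge1/sensors/a.json"))
```

An application in any location can open either path with ordinary file calls.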
77.
Tier 1 Bank #2: Creating a Universal Application Platform
Single pane of glass to control jobs anywhere
[diagram: Application; three Pods, including Image Classification using Tensorflow in a Docker container and Classic ETL; Scheduling & Scaling; MapR Kubernetes Volume Driver; GLOBAL DATA MANAGEMENT across Edge, Private Cloud, On Premise, and three Public Clouds]
78.
Additional Resources
Machine Learning Logistics: Model Management in the Real World
O’Reilly report by Ted Dunning & Ellen Friedman © September 2017
Read free courtesy of MapR:
https://mapr.com/ebook/machine-learning-logistics/
O’Reilly book by Ted Dunning & Ellen Friedman © March 2016
Read free courtesy of MapR:
https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
79.
Additional Resources
O’Reilly book by Ted Dunning & Ellen Friedman © June 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
O’Reilly book by Ellen Friedman & Ted Dunning © February 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning/
80.
AI & Analytics in Production:
How to Make It Work
Free book signing with authors Ted Dunning & Ellen Friedman
MapR stand #145:
Tues 1:00 pm – 1:45 pm
Wed 1:00 pm – 1:45 pm
Download free pdf via MapR:
https://mapr.com/ebook/ai-and-analytics-in-production/