45 Minutes of OpenStack Hate: A reality check
Kristian Köhntopp, Cloud Architect Old Fart
Martin Loschwitz, Cloud Architect Future Integration Representative
SysEleven
https://www.flickr.com/photos/pinksherbet/188842453 Pink Sherbet Photography (CC BY 2.0)
https://uksysadmin.files.wordpress.com/2011/03/openstackwallpaper1.png
Widely supported by many marketing departments
https://www.flickr.com/photos/ahockley/8662640096 Aaron Hockley (cc-by-sa 2.0)
http://www.zdnet.com/article/customers-reporting-interest-in-cloud-containers-linux-and-openstack-for-2015/

http://www.businesscloudnews.com/2015/04/09/red-hat-dell-redouble-openstack-private-cloud-efforts/

http://blogs.wsj.com/cio/2015/02/25/the-morning-download-h-p-deal-with-deutsche-bank-is-step-forward-for-openstack/
What OpenStack vendors promise…
https://www.flickr.com/photos/debarshiray/8237431709/sizes/l debarshiray (CC-BY-SA)
How OpenStack admins imagine their workplace…
(C) Kristian Köhntopp, 2006
What is being delivered…
(C) Hendrik Scholz, 2006
What the admin job actually looks like…
(C) Kristian Köhntopp, 2015
What is it that we want to do?
„Any VM, anywhere.“
„Infrastructure as Code.“
Why would I need that?
• 24 cores (48 with HT), 256 GB RAM, 2× 10 GBit and 12× 3 TB
HDD, including a solid BBU
• 10k EUR

• Applications that actually fill that box are rare. So we are
cutting it up to sell off the parts.
"Any VM, anywhere"
CPU, RAM
Storage, Network
Overlay, Underlay
Storage
"Where the Internet lives", http://www.google.com/about/datacenters/gallery/#/tech/12
Storage requirements
• VM with Ephemeral Storage
• Not a problem, right? Because storage can be an un-RAID-ed local disk.
• But then you get no migration.
• Maintenance sucks w/o migration. For you and for your
customers.
Lesson #1:

Nope, local storage is not a legal
default.
Storage
• VM with Volume: All writes are always remote.
• And redundant.
• Relevant metrics:
• Bandwidth, IOPS and Latency
• MB/sec, multithreaded-fsync()/sec and sequential-fsync()/sec
• Where do the limits come from?
Storage vs. requirements
• Scales easily:
• Bandwidth, MT-IOPS
• Hard to scale:
• Sequential-IOPS (b/c latency, target: 10k)
• Default-Benchmark: "MySQL Slave on a volume"
• "16 KB single-thread random-write on a datafile."
Lesson #2:

A Single-Threaded Random-Write
Benchmark is a fine first evaluation.
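A minimal sketch of such a benchmark, assuming a POSIX system (it relies on os.pwrite and os.fsync): one thread writes 16 KB blocks at random offsets into a data file and waits for fsync() after every write. The function name, file size and write count are illustrative choices, not taken from the talk; a serious run would use a preallocated file larger than any cache, placed on the volume under test.

```python
import os
import random
import tempfile
import time


def sequential_fsync_rate(path, file_size=64 * 1024 * 1024,
                          block_size=16 * 1024, writes=200, seed=0):
    """Return synchronous random writes per second issued from a single thread.

    All parameter defaults are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    block = os.urandom(block_size)
    blocks_in_file = file_size // block_size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, file_size)        # size the data file (usually sparse)
        start = time.perf_counter()
        for _ in range(writes):
            offset = rng.randrange(blocks_in_file) * block_size
            os.pwrite(fd, block, offset)   # one block at a random offset
            os.fsync(fd)                   # wait until the write is durable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / elapsed


def run_in_tempdir(**kwargs):
    """Convenience wrapper: benchmark a scratch file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return sequential_fsync_rate(os.path.join(tmp, "datafile"), **kwargs)
```

Calling sequential_fsync_rate() with a path on the volume in question yields the sequential-fsync()/sec figure that the slides compare against the 10k target.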
Storage vs. actual OpenStack delivery
• OpenStack default is iSCSI and tgtd
• »The problem is left as an exercise to the user.«

• Storage Vendors are totally in love with that approach.
• https://wiki.openstack.org/wiki/CinderSupportMatrix
Storage: Ceph - the good
• Excellent bandwidth
• Good Multithreaded-IOPS
• Very robust, if you have spare memory and time for MTTR
Storage: Ceph - the bad
• CRUSH
• Layout follows topology
• Change topology, move data needlessly
• You do need a background network for storage to
facilitate Rebalancing/Recovery/Cluster-Reorg
Storage: Ceph - the ugly
• Sequential IOPS fatally slow (200 IOPS).
• MySQL slave on our Ceph-Volumes @ 200 Commit/s
• Windows 8.1 boots from a Ceph-Volume in 15 Minutes
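Why sequential IOPS are bound by latency can be shown with a little arithmetic, assuming a single writer that waits for each commit before issuing the next one. The helper name is made up for this sketch; the 200 and 10k figures are the ones from the slides.

```python
def latency_budget_seconds(sequential_iops):
    """Time available per commit if one thread must reach the given rate.

    A single writer that waits for every commit completes 1/latency
    operations per second, so the budget per commit is 1/rate.
    """
    if sequential_iops <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / sequential_iops


observed = latency_budget_seconds(200)      # the measured Ceph volume
target = latency_budget_seconds(10_000)     # the target from the slides

print(f"200 IOPS means about {observed * 1e3:.1f} ms per commit")
print(f"10k IOPS allows about {target * 1e6:.0f} µs per commit")
```

Under that assumption, 200 commits/s corresponds to roughly 5 ms per round trip, while the 10k target leaves roughly 100 µs. Adding more disks or more threads does not shrink that round trip.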
Lesson #3:

Pure Play OpenStack will
not work for production.

Corollary:
OpenStack is not a functioning
Open Source Project.
Storage requirements vs. Multitenancy
• IOPS requirements need SSD to implement
• IOPS quotas require SSD to implement
• SSD complicated - price vs. guaranteed performance
• Caches complicated
"Magento Indexer"
Single-Threaded Database Updates
You can't put quota on what isn't there.
Small IOPS = large incremental steps, large variance.
Target price 1 EUR/GB atm.
Low performance variation more
important than huge peak perf.
Working Set > Cache
means you are working
the raw iron.
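One common way to express a per-tenant IOPS quota is a token bucket. The sketch below is an assumption for illustration, not something the talk prescribes; the class name and parameters are hypothetical. It also shows why quotas need real capacity behind them: the limiter only hands out permission, the backend still has to deliver the operations.

```python
class IopsQuota:
    """Token bucket: `rate` operations per second, bursts of up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # start with a full bucket
        self.last = now

    def allow(self, now):
        """Return True if one I/O may proceed at time `now` (in seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

If the sum of all tenant rates exceeds what the storage actually sustains, every tenant stays within quota and still misses its numbers, which is the point of "You can't put quota on what isn't there."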
Lesson #4:

Caching is as much part of the problem
as it is part of the solution.
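The effect of a working set that outgrows the cache can be estimated with a weighted average, assuming a simple two-level model (cache hit or backend access). The 0.1 ms and 5 ms values below are assumptions chosen to match the 10k target and the 200 IOPS measurement from the storage slides.

```python
def expected_latency_ms(hit_ratio, cache_ms, backend_ms):
    """Average latency when `hit_ratio` of all requests is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms


# Assumed figures: 0.1 ms from cache, 5 ms from the raw backend.
for ratio in (1.0, 0.9, 0.5, 0.0):
    print(ratio, expected_latency_ms(ratio, cache_ms=0.1, backend_ms=5.0))
```

In this model even a modest miss rate pulls the average toward backend latency, so the variance a tenant sees depends on how much of the working set happens to fit, which is hard to guarantee or to bill.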
Network
Mercury Redstone Connector MR-1 (1960) https://www.flickr.com/photos/jurvetson/5691350527 Steve Jurvetson (CC-BY)
Network requirements
• Hosting setup:
• Topology: Wait-Free, Oversubscription-Free
• Redundancy: SPOF-Free
• Multi-Tenancy: Isolation and Quotas
• Also:
• Capable of carrying the storage traffic
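"Oversubscription-free" can be checked per leaf switch by comparing server-facing bandwidth with uplink bandwidth. A minimal sketch; the port counts and speeds in the example are hypothetical and not taken from the talk.

```python
def oversubscription_ratio(server_ports, server_gbit, uplinks, uplink_gbit):
    """Ratio of server-facing to fabric-facing bandwidth on one leaf switch.

    A ratio of 1.0 or less means the leaf is not oversubscribed.
    """
    downstream = server_ports * server_gbit
    upstream = uplinks * uplink_gbit
    if upstream <= 0:
        raise ValueError("a leaf needs at least one uplink")
    return downstream / upstream


# Hypothetical leaf: 48 x 10G to servers, 4 x 40G to the spines.
print(oversubscription_ratio(48, 10, 4, 40))
# Hypothetical leaf: 16 x 10G to servers, 4 x 40G to the spines.
print(oversubscription_ratio(16, 10, 4, 40))
```

With storage traffic on the same fabric, the first layout would push three times more traffic toward the spines than the uplinks can carry under full load.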
–Joe Random Alphauser
„We just rolled out a few dozen VMs
with Hadoop and played Terasort.“
http://bradhedlund.com/2012/01/25/construct-a-leaf-spine-design-with-40g-or-10g-an-observation-in-scaling-the-fabric/
Notwork: OpenStack default offering
• Open vSwitch, GRE-Ball,
Broadcast-Problem,
Chokepoint, SPOF
• Current releases are only
marginally less braindead.
SPOF
Lesson #5:

Ok, networking doesn't work either.
Let's look around for a non-broken solution
OpenContrail: the good, …
• Open Source Project by Juniper
• Uses existing hardware routing infrastructure
• scales, no SPOFs, no Chokepoints
• based on MPLS, BGP, and other well understood
protocols
• actually delivers stuff (as opposed to e.g. OpenDaylight)
OpenContrail: the bad, …
• Juniper bought Contrail, doesn't understand how to
market it
• little and outdated documentation, bad release
management, very bad packaging
• funny support organisation
OpenContrail: the bad, …
• The scons-based OpenContrail build process will
• download the full source of tools such as Bind and Curl
• patch them wildly
• put the resulting binaries into a .deb package
• which will of course conflict with half a dozen Ubuntu
packages
… and the ugly.
https://www.flickr.com/photos/59145750@N03/5559004171 Mara Tr. (CC-BY 2.0)
Technology-Jenga
• OpenContrail’s "Stack"
• vrouter (.ko), if-map, Sandesh (XML over Thrift!)
• C++, Python, node.js, irond (Java)
• redis, Cassandra, Zookeeper
• xmpp, BGP, MPLS
• Up Next in OpenContrail 2.2: Kafka
• Put that into a job profile, try to find a candidate. Is he named
Pedro?
Lesson #6:

Problem not limited to Contrail scope.
Let's look what else is in the box…
https://www.flickr.com/photos/markjsebastian/4217877353/ Mark Sebastian (CC BY-SA 2.0)
General requirements as of 2015
• Distributed Anything:
• NTP, centralized Logging, centralized Monitoring
• functional validation of components before they join, clean
cluster startup
• cluster comms are Kyle-Kingsbury-proof
• CA, encrypted data in flight, optionally encrypted data at rest
• User-Story regarding Live-Upgrades, with Canaries
OpenStack delivers this…
https://www.flickr.com/photos/z287marc/3189567558/ z287marc (CC BY 2.0)
A wild hack session appears…
https://www.flickr.com/photos/nauright/11341288174/ Romana Klee (CC BY-SA 2.0)
Not an isolated problem…
• Start a HEAT stack with a few networks and 20 VMs
• … and the cluster switches off.
• Nova does not communicate with Cinder, but has a
hard timeout.
• "Are we stupid or is OpenStack stupid? Let's test a few
public clouds…"
Result…
• Phone ringing…
• "Whatever you are doing,
could you please do
something else?"
"Rotes Telefon", http://de.wikipedia.org/wiki/Datei:Jimmy_Carter_Library_and_Museum_99.JPG User:Piotrus (CC-BY 3.0)
Not an isolated problem…
Implementation tested on DevStack in VMware Fusion on a MacBook Air in St. Oberholz?
Not an isolated problem…
https://thespeechatimeforchoosing.files.wordpress.com/2012/12/facepalm.jpg
Keystone
Not an isolated problem…
• "Could you have a look, my Instances won't instantiate."
• A longer debug session later it turns out that libvirt on
cloud17 is dysfunctional.
• Compute-Nodes in OpenStack join the cluster simply by
being visible in AMQP (register as a consumer), no
functional validation.
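What "functional validation before join" could look like, as a sketch: a node is admitted only if every check passes. The function and the check names are hypothetical; nothing here is an OpenStack API, and the real checks (hypervisor responds, storage reachable, and so on) would have to be written for the actual deployment.

```python
def admit_node(node, checks):
    """Return (admitted, failures); admit only if every check passes.

    `checks` is a list of (name, callable) pairs. Each callable takes the
    node description and returns True on success.
    """
    failures = []
    for name, check in checks:
        try:
            ok = bool(check(node))
        except Exception as exc:        # a crashing check counts as a failure
            ok = False
            name = f"{name} ({exc})"
        if not ok:
            failures.append(name)
    return (not failures, failures)


# Hypothetical checks against a plain dict describing the node.
example_checks = [
    ("hypervisor responds", lambda n: n.get("hypervisor_ok", False)),
    ("storage reachable", lambda n: n.get("storage_ok", False)),
]

print(admit_node({"hypervisor_ok": False, "storage_ok": True}, example_checks))
```

The design choice is that visibility on the message bus alone never makes a node schedulable; a node with a broken hypervisor stays out until its checks pass.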
Not an isolated problem…
Ceilometer
Not an isolated problem…
• During the Juno release cycle, Horizon got patched to be
„less confusing for users“ (any GNOME devs around?)
• Result: Assigning floating IPs using Horizon became
impossible when using OpenContrail
• Lesson learned: OpenStack wants to be versatile, but in
fact, most development only ever happens for the
„streamline“ setups
What is the real problem?
https://www.flickr.com/photos/jdhancock/8395113234 JD Hancock (CC BY 2.0)
„Any VM, anywhere.“
„As a hoster/enterprise/department,
I want a virtualization platform
which does…“
Problem spec vs. product spec
• Requirements regarding
• Tenant Isolation,
• Billing Model,
• Operations Model,
• Development Model,
• Scalability
Prototypes vs. Product
"Prototype in the round file", https://www.flickr.com/photos/generated/3313311558 Jared Tarbell (CC-BY 2.0)
Stabilize the core with actual engineering

vs.

Moar components
“The Big Tent”
https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)
"The problem with the Big Tent is that it is full of clowns."
https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)
–Maik Zumstrull
„Instead of having a functional
solution I can now choose between 13
differently deficient components.“
Democracy as a software architecture model
• Product definition == target specification
• Who is our customer and what do they need?
• What are the mandatory/desirable/optional
properties of the product to become part of the
solution space instead of the problem space?
Democracy as a software architecture model
• Quality Assurance
• "But OpenStack does have Blueprints, CI and code
review"
• Without a target specification you have no idea about
requirements on scale, workload types, security, …
Who actually votes on your commit…
Democracy as a software architecture model
• Limited, learnable technology stack
• One recommended and tested solution with a
documented best practice for deployment instead of a
plugin architecture.
• Reusable solutions for comparable problems in
neighboring subprojects (“Architecture Refactoring”).
Monasca
How to win at Hipsterbingo…
• »Monasca is an open-source multi-tenant, highly scalable, performant, fault-tolerant monitoring-as-a-service solution that integrates with OpenStack. It uses a REST API for high-speed metrics processing and querying and has a streaming alarm engine and notification engine.«
Bingo!
How to win at Hipsterbingo…
• »Uses a number of underlying technologies: Apache Kafka, Apache Storm, Zookeeper, MySQL, Vagrant, Dropwizard, InfluxDB, Vertica.«
Democracy as a software architecture model
• Useful vs. Hyped
• “Docker is like OpenStack, but from a Dev instead of an Ops perspective. Wouldn't it be great to unify the projects?”
https://twitter.com/martinisoft/status/527191803603468288
And finally…
TL;DR please?
TL;DR
• OpenStack right now: not a product, but a box of nuts and bolts.
Significant assembly required.
• Not all parts are actually bad, but the quality varies. A lot.

• No other project has the momentum.
• Most other projects do not have multi-tenancy, multi-node.
• Many haven't even grasped the actual problem. Including Docker.
TL;DR
• Deep in the Gartner hype cycle phase 2:
• Users and hence real-world
requirements missing.
• No unified view on acceptable
solutions.
• Vendors think they can "win" and
"own" the market or products,
pushing their embrace and extend
agendas.
TL;DR
• Apache-like open development processes are a proven
model for taming open source and commoditisation.
• Compare Hadoop
• Based on methods pioneered by Java
?
3rd OpenStack DACH Tag
June 11, 2015
Berlin
http://openstack-dach.org

OSDC 2015: Martin Gerhard Loschwitz - Kristian Köhntopp | 45 Minutes of OpenStack Hate
