1
Sagi Brody, CTO
@WebairSagi
sagi@webair.com
Why Managed Service Providers Should Embrace
Container Technology
2
What’s a Managed Service Provider Anyways?
• Outsourced DevOps
• Not many folks doing it
• Difficult to scale
3
Why do folks use MSPs/CSPs?
• Small IT teams who’d rather focus on AppDev
• Common in non-tech industries who prefer to outsource
• Cost
• Liability & Responsibility
• Unicorns are hard to find
• MSP != IaaS
• Not just for web
• Ride the wave of new technologies
4
What’s it look like?
• Application: LAMP at scale, redis, memcache, nginx..
• Server: Linux, FreeBSD, Agents/Automation
• Load Balancers: HAProxy + keepalived, nginx, csync
• Firewall: MikroTik, PaloAlto, Juniper
• Network: FlowSpec, SDN hooks
• DDoS: L7 mitigation + automated /24 swings
• CDN/Proxies: Redirects to CDN in App or via HTTP rewrite
• Infrastructure: BareMetal, VM (Hybrid SAN, Flash, Xen/Vmware), or IaaS
5
Full Stack Management - how?
• Need standardized platform, yet customizable as required
• Client/Staff web portals to automate common tasks
• Templated OS deployments w/ management scripts/monitoring
• OSS/BSS + SPOT
• FreeBSD - C daemon w/ SSL transport + custom API
• Linux - Ansible - Py scripts
• SPOT injects into 3rd party services
• Custom layered monitoring
• Docs/Diagrams
• Social Monitoring
• TaaS
[Diagram labels: Portal, Apps, OSS, Monitoring, App router, SSL, SSH, Service, CLI Scripts, Linux, Fbsd]
6
Full Stack Management - how?
usage: manageserver mysql [-h] [-v] [-a <newdbname> <newuser> <newpassword>]
                          [-d <newdbname>] [--remove-db <dbname>] [-l]
                          [-u <dbname> <newuser> <newpassword>]
                          [--remove-user <username>] [--list-user]
                          [-e <dest-file> [<dbname> ...]] [--list-backup]
                          [-b {on,off,status,run,restore}] [-c]
                          [-f <filepath>] [-r <root password>]

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         set verbosity level
  -a <newdbname> <newuser> <newpassword>, --add <newdbname> <newuser> <newpassword>
                        Add a MySQL db and a user with all privileges on that
                        db
  -d <newdbname>, --add-db <newdbname>
                        Add a MySQL db
  --remove-db <dbname>  Delete MySQL db
  -l, --list-db         List all MySQL databases
  -u <dbname> <newuser> <newpassword>, --add-user <dbname> <newuser> <newpassword>
                        Add a new MySQL user with all privileges on an
                        existing db
  --remove-user <username>
                        Delete MySQL user from all hosts
  --list-user           List all MySQL users
  -e <dest-file> [<dbname> ...], --export <dest-file> [<dbname> ...]
                        Dump DB to file. use .gz/.bz2 in file name to
                        compress. blank dbname to dump all DBs.
  --list-backup         List all MySQL backups
  -b {on,off,status,run,restore}, --backup {on,off,status,run,restore}
                        Check status/run/restore and turn on/off Mysql Backups
                        cron job
  -c, --cleanup         Optional argument used with '--backup run' to remove
                        backups older than 5 days
  -f <filepath>, --file <filepath>
                        Argument used with '--backup restore' to pass the
                        .tar.gz backup file path
  -r <root password>, --root <root password>
                        Optional argument used with Mysql related commands to
                        pass the root password
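The usage text above has the shape of Python argparse output. A minimal sketch of how a subset of those flags might be declared follows. The function names `build_parser` and `validate` are hypothetical, and the rule that '--backup restore' requires '--file' is an assumption read from the help text, not a confirmed behavior of the real tool.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Declare a subset of the flags shown in the usage text (sketch only)."""
    parser = argparse.ArgumentParser(prog="manageserver mysql")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="set verbosity level")
    parser.add_argument("-a", "--add", nargs=3,
                        metavar=("<newdbname>", "<newuser>", "<newpassword>"),
                        help="Add a MySQL db and a user with all privileges on that db")
    parser.add_argument("-l", "--list-db", action="store_true",
                        help="List all MySQL databases")
    parser.add_argument("-b", "--backup",
                        choices=["on", "off", "status", "run", "restore"],
                        help="Check status/run/restore and turn on/off backups cron job")
    parser.add_argument("-c", "--cleanup", action="store_true",
                        help="Used with '--backup run' to remove old backups")
    parser.add_argument("-f", "--file", metavar="<filepath>",
                        help="Used with '--backup restore' to pass the backup file path")
    return parser


def validate(args: argparse.Namespace) -> List[str]:
    """Check the flag pairings the help text describes (assumed to be enforced)."""
    errors = []
    if args.cleanup and args.backup != "run":
        errors.append("--cleanup is only meaningful with '--backup run'")
    if args.backup == "restore" and not args.file:
        errors.append("'--backup restore' needs --file <filepath>")
    return errors


if __name__ == "__main__":
    parsed = build_parser().parse_args(["-b", "run", "-c"])
    print(parsed.backup, parsed.cleanup, validate(parsed))
```

Keeping the pairing checks in a separate function makes them easy to unit test without touching a database.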
7
MSP Pain Points
• Guilty until proven innocent
• Not always involved or notified of code updates and
changes
• Not aware of 3rd party integrations
• Legacy apps (“bring us your tired your poor..”)
• Backup/Restore
• Application security
• Not involved in design
• Multi-tenancy systems (vendor nightmare)
• Per client customization
• Keeping OS/Software current
Containers to the rescue?
8
Containerizing Multi-Tenant Shared Infrastructure
Usage: lbmanager <LB_ID> OPTIONS
  <LB_ID>      A friendly name for identification.
  OPTIONS      Consists of one or more of the following switchs.
               Each switch could be repeated as many times as needed.
  -v <IP>      Add the IP to the VIPs (Will not serve untill at least 1 VIP exist)
  -V <IP>      Remove the IP from the VIPs (Will stop serving if there isn't any VIP)
  -r <IP>      Add the IP to the RIPs
  -R <IP>      Remove the IP from the RIPs
  -p <PORT>    Add the PORT to the listening ports (Will not serve until at least 1 port exist)
  -P <PORT>    Remove the PORT from the listening ports (Will stop serving if there isn't any port)
  -u <USER>    User name for the stats section (Default:lbstats)
  -s <PASS>    Password for the stats section ( Default: md5(<LB_ID>-<shared_secret>) )
  -1           Make the LB_ID active on 1st LB and standby on 2nd LB (Default)
  -2           Make the LB_ID active on 2nd LB and standby on 1st LB
  -c           Add comment to a LB_ID (Enclose in "")
  -e           Enable LB_ID. HAProxy will start to serving LB_ID
  -d           Disable LB_ID. HAProxy will stop to serving LB_ID
  -o           Enable SSL offloading with the certificates in the /etc/haproxy/lb_sets/LB_ID/certs/ directory
  -X           Destroy & Backup the LB_ID
  -L           Locks the LB_ID against further modifications. DON'T USE WITH OTHER OPTIONS, they will be ignored
  -O           Disable SSL offloading
  -U           Reset client stats username to default one (Default:lbstats)
  -S           Reset client stats password to default hash with shared secret ( md5(<LB_ID>-<shared_secret>) )
  -b <METHOD>  Balancing method on the backend servers, is one of:
                 leastconn   Sends requests to the server with least connections (Default)
                 roundrobin
                 static-rr
                 first
                 source
                 rdp-cookie
NOTE: When satisfied with all the changes run "lbmanager reload" to make the changes live.
LB Service: Ansible, CentOS, HAProxy, keepalived, csync, git, automation
interface
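As an illustration of how the switches above compose, here is a small sketch that assembles an lbmanager argument list and derives the default stats password. The helper names are hypothetical. The exact string fed to MD5 (LB_ID, a hyphen, then the shared secret) is an assumption based on the `md5(<LB_ID>-<shared_secret>)` notation in the help text.

```python
import hashlib
from typing import List


def default_stats_password(lb_id: str, shared_secret: str) -> str:
    """Assumed reading of md5(<LB_ID>-<shared_secret>): hex digest of 'id-secret'."""
    return hashlib.md5(f"{lb_id}-{shared_secret}".encode("utf-8")).hexdigest()


def lbmanager_args(lb_id: str, vips: List[str], rips: List[str],
                   ports: List[int], method: str = "leastconn") -> List[str]:
    """Build an argv list using only switches listed in the usage text."""
    args = ["lbmanager", lb_id]
    for ip in vips:
        args += ["-v", ip]          # repeatable: one -v per VIP
    for ip in rips:
        args += ["-r", ip]          # repeatable: one -r per RIP
    for port in ports:
        args += ["-p", str(port)]   # repeatable: one -p per listening port
    args += ["-b", method]
    return args


if __name__ == "__main__":
    print(" ".join(lbmanager_args("web01", ["203.0.113.10"],
                                  ["10.0.0.11", "10.0.0.12"], [80, 443])))
    print(default_stats_password("web01", "example-secret"))
```

Per the NOTE in the help text, changes only go live after a separate `lbmanager reload`.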
9
Containerizing Multi-Tenant Shared Infrastructure
LB Service: Ansible, CentOS, HAProxy, keepalived, csync, git, automation
interface
Issues:
• Multiple customer configs merged into master file
• Resources shared across customers
• Limit of customization per customer forces standalones
• Standalones can add up quickly, also need HA
• Manual configuration rollbacks
• Version control of deployed app
Containerize:
• Container per customer deployment
• Customization w/o standing up additional resources
• Finer resource monitoring per customer == potential for new
chargeback methods
• Create ELB like service using stateless containers + autoscale
scheduler (Exhibitor+Zookeeper)
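A rough sketch of the "container per customer" idea: instead of merging every customer into one master file, each customer gets a small standalone HAProxy config that could be mounted into that customer's own container. The `CustomerLB` type, the naming scheme, and the timeout values are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerLB:
    lb_id: str
    vips: List[str]
    rips: List[str]
    ports: List[int]
    balance: str = "leastconn"


def render_haproxy_cfg(lb: CustomerLB) -> str:
    """Render one self-contained config per customer (no shared master file)."""
    lines = [
        "defaults",
        "    mode tcp",
        "    timeout connect 5s",
        "    timeout client 30s",
        "    timeout server 30s",
        "",
        f"frontend {lb.lb_id}_fe",
    ]
    for vip in lb.vips:
        for port in lb.ports:
            lines.append(f"    bind {vip}:{port}")
    lines += [
        f"    default_backend {lb.lb_id}_be",
        "",
        f"backend {lb.lb_id}_be",
        f"    balance {lb.balance}",
    ]
    # No port on the server lines: HAProxy then reuses the port the client hit.
    for n, rip in enumerate(lb.rips, start=1):
        lines.append(f"    server rip{n} {rip}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_haproxy_cfg(CustomerLB("acme", ["203.0.113.10"],
                                        ["10.0.0.11", "10.0.0.12"], [80, 443])))
```

Because each rendered file stands alone, it can be version controlled and rolled back per customer.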
10
Containerizing Multi-Tenant Shared Infrastructure
Shared Hosting Environments
# ls -al
total 10
drwx--x--x 4 clifico-www clifico 512 Aug 8 2008 .
drwx--x--x 88 root wheel 3584 Jul 30 13:09 ..
lrwxr-xr-x 1 root wheel 15 Aug 8 2008 bin -> /www/apache/bin
drwxr-xr-x 3 root wheel 512 Aug 19 00:07 conf
lrwxr-xr-x 1 root wheel 17 Aug 8 2008 icons -> /www/apache/icons
lrwxr-xr-x 1 root wheel 19 Aug 8 2008 libexec -> /www/apache/libexec
drwxr-xr-x 2 clifico-www clifico 512 Aug 19 00:05 logs
lrwxr-xr-x 1 root wheel 19 Aug 8 2008 modules -> /www/apache/modules
zsmgcontrol-www 47227 0.0 0.0 54096 0 ?? IW - 0:00.00 /www/vapache/smgcontrol/bin/httpd -f /www/vapache/smgcontrol/conf/httpd.conf -DPHP5
revsinn 49687 0.0 0.8 82332 34604 ?? I 2:13AM 0:01.13 /www/vapache/revsinn/bin/httpd -f /www/vapache/revsinn/conf/httpd.conf -DPHP5
laphta-www 51439 0.0 0.2 55944 6352 ?? I 2:14AM 0:02.20 /www/vapache/laphta/bin/httpd -f /www/vapache/laphta/conf/httpd.conf -DPHP5
laphta-www 53163 0.0 0.8 79604 33180 ?? I 2:15AM 0:02.53 /www/vapache/laphta/bin/httpd -f /www/vapache/laphta/conf/httpd.conf -DPHP5
msdivamarie 54375 0.0 0.7 75108 29032 ?? I 2:15AM 0:02.63 /www/vapache/msdivamarie/bin/httpd -f /www/vapache/msdivamarie/conf/httpd.conf -DPHP5
woodswe-www 55193 0.0 0.0 75588 0 ?? IW - 0:00.00 /www/vapache/woodswe/bin/httpd -f /www/vapache/woodswe/conf/httpd.conf -DPHP5
frenchelite-www 55280 0.0 1.1 90392 44660 ?? I 2:16AM 0:00.73 /www/vapache/frenchelite/bin/httpd -f /www/vapache/frenchelite/conf/httpd.conf -DPHP5
woodswe-www 55289 0.0 0.0 75588 0 ?? IW - 0:00.00 /www/vapache/woodswe/bin/httpd -f /www/vapache/woodswe/conf/httpd.conf -DPHP5
woodswe-www 55290 0.0 0.0 75588 0 ?? IW - 0:00.00 /www/vapache/woodswe/bin/httpd -f /www/vapache/woodswe/conf/httpd.conf -DPHP5
apachectl..
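# "-" selects every user: one directory per user under /www/vapache
if [ "x$user" = "x-" ]; then
    user=$(/usr/bin/find /www/vapache -maxdepth 1 -mindepth 1 -type d \
        | xargs -n 1 basename)
    # a quarter of the user count (not used in this excerpt)
    split=`expr $(echo $user | wc -w) / 4`
fi
for i in $user ; do
    apacheinit $i
    if [ $? -eq 1 ]; then
        echo "WARNING: No such user $i"
        continue
    fi
    # ... (rest of the loop is not shown on the slide)
done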
• Custom apachectl + supporting scripts
• Separate apache instance per user
• ‘jail like’ separation via perms + ps mod
• Scripts to standup/destroy new customer
environments
11
Containerizing Multi-Tenant Shared Infrastructure
Shared Hosting Environments:
Issues:
• No per-user resource limits, only per-application
• No dependency separation
• 0day local root exploits
• Difficult to move users
Containerize:
• Don’t deploy user accounts, deploy applications
• Cloudapps == deploy small Xen instance + application specific template
• Already being done well in containers, tons of OSS in dockerhub
• Easier to manage hosts
• Potential for better security than user land
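To make "deploy applications, not user accounts" concrete, here is a minimal sketch that builds a `docker run` command with per-application memory and CPU limits, which addresses the missing per-user limits listed under Issues. The function name and the example values are hypothetical. The command is only assembled, not executed.

```python
from typing import List


def docker_run_args(app_name: str, image: str,
                    memory_mb: int, cpus: float) -> List[str]:
    """Assemble a docker run invocation with per-application resource limits."""
    return [
        "docker", "run", "--detach",
        "--name", app_name,
        "--memory", f"{memory_mb}m",   # hard memory cap for this application
        "--cpus", str(cpus),           # CPU share expressed as number of CPUs
        image,
    ]


if __name__ == "__main__":
    print(" ".join(docker_run_args("acme-wordpress", "wordpress:latest", 512, 0.5)))
```

Returning a list rather than a string avoids shell quoting problems if the command is later passed to `subprocess.run`.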
12
OS & Software Updates
• Common issue among MSPs
• We upgrade based on Stability, Security, Feature requests, and
opportunistically
• Feature requests most common (latest ffmpeg..)
• Manual updates for older OS versions
• Who knows if something broke post update?
• Safer for us to stand up new VMs and move workloads, but time
consuming
New functionality:
• CoreOS FastPatch
• Atomic rpm-ostree
• RancherOS - per service container rollbacks
• Why upgrade the OS if you can redeploy the workload onto an
already upgraded host?
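The "redeploy instead of upgrade" idea can be sketched as a simple planning step: find workloads sitting on hosts that are not yet on the target OS version and assign them to hosts that are. The data shapes and the round-robin choice are assumptions. A real scheduler would also weigh capacity.

```python
from typing import Dict


def plan_redeploy(hosts: Dict[str, str], placements: Dict[str, str],
                  target_version: str) -> Dict[str, str]:
    """Map each workload on an outdated host to an already upgraded host.

    hosts:      host name -> OS version currently running
    placements: workload name -> host it runs on today
    """
    upgraded = sorted(h for h, v in hosts.items() if v == target_version)
    if not upgraded:
        raise ValueError("no host is running the target version yet")
    moves: Dict[str, str] = {}
    next_slot = 0
    for workload in sorted(placements):
        if hosts[placements[workload]] != target_version:
            # Spread moved workloads round-robin across upgraded hosts.
            moves[workload] = upgraded[next_slot % len(upgraded)]
            next_slot += 1
    return moves


if __name__ == "__main__":
    hosts = {"h1": "v1", "h2": "v2", "h3": "v2"}
    placements = {"web": "h1", "db": "h1", "cache": "h2"}
    print(plan_redeploy(hosts, placements, "v2"))
```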
13
Platform Customization
Issues:
• Standardized platforms handle 80-90% of common use cases..
• More Flexibility needed - as close to internal DevOps teams as possible (our
heroes)
• Automation exists today, but still deploying VMs to solve problems
• VMs deployed ‘at will’ based on templates or snapshots
• Difficult to keep up w/ various tweaks & changes
• Many multi-purpose VMs exist (web/db/cdn-origin/memcache)
Containerize:
• docker-compose customer’s environment
• Less time to update compose file than alternative work
• Auto-scale
• Already using abstracted LB & shared storage for configs and data
• Unionfs no longer the only storage option
• RDS type services already exist on prem
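A hedged sketch of "docker-compose customer's environment": generate one compose definition per customer from a template function. The service layout (one web and one db service), the image names, and the volume naming are assumptions. The output is JSON, which should load as a compose file because JSON is a subset of YAML.

```python
import json
from typing import Dict


def compose_for_customer(customer: str, web_image: str, db_image: str) -> Dict:
    """Build a per-customer compose definition as a plain dictionary."""
    volume = f"{customer}-dbdata"
    return {
        "version": "2",
        "services": {
            "web": {
                "image": web_image,
                "ports": ["80:80"],
                "depends_on": ["db"],
            },
            "db": {
                "image": db_image,
                "volumes": [f"{volume}:/var/lib/mysql"],
            },
        },
        "volumes": {volume: {}},
    }


if __name__ == "__main__":
    definition = compose_for_customer("acme", "nginx:stable", "mysql:5.7")
    print(json.dumps(definition, indent=2))
```

A per-customer tweak then becomes a small change to one generated file instead of a new multi-purpose VM.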
14
SaaS Platform Segmentation & Scale
• SaaS platforms with X customers, each with their own VH
directive, subdir, or subdomain
• Each customer gets their own container
• App can identify customer based on URL string or HOST
• Allows per customer changes without affecting others
• Fork and customize codebase per customer
• Easier to align multiple platform flavors per
customer/container
• Adding container monitoring (sysdig, prometheus) gives
instant resource utilization per platform customer - quickly
find noisy neighbors and upcharge for resources
• Single customer load spikes may not affect other
customers
• Shard customer across zones/regions
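Two of the points above lend themselves to a short sketch: identifying the customer from the HOST header, and flagging noisy neighbors from per-container metrics. Both functions are hypothetical. The single-label subdomain scheme and the CPU figures are assumptions, and the metrics would come from whatever monitoring is in place.

```python
from typing import Dict, List, Optional


def customer_from_host(host: str, base_domain: str) -> Optional[str]:
    """Return the customer label from '<customer>.<base_domain>', else None.

    Note: the port is stripped with a simple split, so IPv6 literals are not handled.
    """
    name = host.split(":")[0].lower().rstrip(".")
    suffix = "." + base_domain.lower()
    if name.endswith(suffix):
        label = name[: -len(suffix)]
        if label and "." not in label:
            return label
    return None


def noisy_neighbors(cpu_by_customer: Dict[str, float],
                    threshold: float) -> List[str]:
    """List customers whose container CPU usage exceeds the threshold, worst first."""
    over = [c for c, cpu in cpu_by_customer.items() if cpu > threshold]
    return sorted(over, key=lambda c: cpu_by_customer[c], reverse=True)


if __name__ == "__main__":
    print(customer_from_host("Acme.example.com:8080", "example.com"))
    print(noisy_neighbors({"acme": 0.9, "globex": 0.2, "initech": 1.5}, 0.8))
```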
15
Site Segmentation
• Typical to see single servers or clusters with 50-5,000 micro sites
on same Apache/nginx daemons
• Hard to track intermittent slowness or downtime
• Poor security - 1 hacked site has potential to affect entire
cluster/server
• Containerize each site:
• Decoupling app from code makes it easier for the MSP to
break/fix or redeploy
• Per site resource utilization
• Secure
16
Pipeline
• Rollbacks very difficult today on customer + MSP
• Most development happens offsite
• MSP should help in CI/CD pipeline building process and provide
dev resources
• Build new pipelines on-demand to help customers w/ various
workflows
• Free up existing enterprise test resources so multiple devs can
test concurrently and w/o interference
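One way containers can ease the rollback pain noted above is to treat every release as an immutable image tag and keep a history of what was deployed, so that a rollback is just a redeploy of the previous tag. The `ReleaseHistory` class below is a hypothetical sketch of that bookkeeping. It does not talk to any registry or scheduler.

```python
from typing import List, Optional


class ReleaseHistory:
    """Track deployed image tags so a rollback is a redeploy of the previous one."""

    def __init__(self) -> None:
        self._tags: List[str] = []

    def deploy(self, tag: str) -> str:
        """Record a newly deployed tag and return it."""
        self._tags.append(tag)
        return tag

    def current(self) -> Optional[str]:
        """Return the tag currently deployed, or None if nothing was deployed."""
        return self._tags[-1] if self._tags else None

    def rollback(self) -> str:
        """Drop the current tag and return the one to redeploy."""
        if len(self._tags) < 2:
            raise RuntimeError("no earlier release to roll back to")
        self._tags.pop()
        return self._tags[-1]


if __name__ == "__main__":
    history = ReleaseHistory()
    history.deploy("app:1.0")
    history.deploy("app:1.1")
    print(history.rollback())
```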
Pipeline:
17
VPCC & PPCC
• Virtual Private Container Clouds & Physical Private Container
Clouds
• Hybrid - containers, private clouds, bare metal, xconnects
• Proven & trusted enterprise feature sets: high availability,
distributed resource schedulers, multi-tenant segmentation, DRaaS,
redundant networking, enterprise support
• Private repositories
• Isolated for compliance (PCI, HIPAA)
• Existing stable and easy to use workload portability platforms and
hyper-converged infrastructure
• Overlay networks for interconnects
• Not CaaS
18
VPCC & PPCC
CoreOS, Fleet, etcd, cloud-init, weave,
cAdvisor
19
Ecosystem software
• Tectonic (by CoreOS) - Enterprise support - Kubernetes, etcd,
flannel, docker/rkt, CoreOS, beautiful GUI, A/B updates
• Atomic (Red Hat) - Enterprise support, subscription-manager, A/B
updates, use any Fedora based distro, SELinux
• Docker Swarm
• RancherOS - Lightweight, containerizes all system processes, A/B
updates, simple rollback
• Triton - Abstracted single docker host across clustered
compute/network environment
• VMware Photon, Snappy Ubuntu Core, Mesosphere DCOS
• Match Ecosystem to use case, workloads and skills
20
Internal
• Convert to micro-services based architecture for internal use
cases (like us..)
• Development of services spread across teams and infrastructures
• Better customer provisioning and segmentation on shared resource
platforms such as load balancers, mail servers, elasticsearch
• Standup training environments
• “Drink your own Kool-Aid”
• Ability to help customers containerize Apps as a value add
21
Where we started…
• Started w/ BareMetal (wasteful)
• 2007 - VMs in production (20-50 per host)
• Hit Storage Bottleneck
• Storage Innovation
• Hit Network bottleneck
• Network Innovation
• Containers -> Full circle
• Cycle will continue….
22
THANK YOU!
Sagi Brody, CTO
@WebairSagi
sagi@webair.com
