2. Agenda
❏Kenshoo Introduction
❏Challenges over the years
❏Why we chose OpenStack
❏OpenStack flavors - the chosen one
❏OpenStack Services - what, why
❏What is Kloud
❏What has been done already
❏What next
3. Kenshoo’s Mission
is to empower every marketer in the world
with technology to build brands
and generate demand across all media.
4. What do we do?
Manage Digital Campaigns
Track users interaction (impressions / clicks and Conversions).
Optimize bids and budgets to business goals
Visualize and Report on your metrics
6. Kenshoo in Numbers
5.433 Billion Managed Keywords, 1.738 Billion Managed Ads
~10% of Adwords traffic goes through Kenshoo proxies
~ 1.5PB SSD Storage used to power ~500 MySQLs
More than 3K/sec clicks tracked and attributed (10+% of global Search ads traffic)
2PB of raw data persisted on HDFS and Cassandra
7. Kenshoo’s infrastructure challenges
2009: Uptime
2010: Operations
2011: Control and Accountability
2012: Efficiency
2013: Stability
2014: Performance
2015: Rapid Growth
2016: High Scale
2017: Time to market
11. Environment Overview
POC / Play: Controller x 1, Compute x 3, Ceph x 4 (4TB)
Lab / Staging: Controller x 2, Compute x 10, Ceph x 10 (128TB)
Production: Controller x 2, Compute x 19, Ceph x 16 (150TB)
12. What next ?
❏Scale OpenStack environments
❏Install new services (Trove, Tempest, Watcher, etc.)
❏Move the rest of the labs to OpenStack
❏Move Production to OpenStack
13. Thank you for listening!
13/02/17 Paz Tal-Shachar
Kenshoo’s Private Cloud
Editor's Notes
Kenshoo focuses only on the top marketing organizations with the most sophisticated needs. Here you can see our blue chip portfolio. We’re well distributed across verticals and proud to partner with the leaders in each category. What you don’t see here are a lot of smaller brands. And that’s not because we don’t want to show them. It’s because we only work with the elite. The industry norm has been to build tools that can scale from the biggest companies down to the mom-and-pops. That doesn’t work in our space. Different needs. Different toolset. We help clients where digital marketing is a core focus at massive scale.
2008 - 2009 uptime. Kenshoo started installing production servers at hosting companies around the world.
2009 - 2010 operations. As Kenshoo grew in customers and servers, proper infrastructure processes and procedures were put in place.
2010 - 2011 control and accountability. In order to control maintenance, provisioning, and scale-up timelines, we started using our own servers and network equipment.
2011 - 2012 efficiency. To use hardware wisely and reduce infrastructure cost, Kenshoo started using virtualization and central storage.
2012 - 2013 stability. Kenshoo implemented new backup and monitoring systems for production.
2013 - 2014 performance. To handle the massive new customers (bigger reports, bigger bid policies, uploads, downloads...), we moved from iSCSI-based storage to FC-based storage in our own cage.
2014 - 2015 rapid growth. With new customers arriving at a very high pace (leading to lots of new infrastructure provisioning), automation was a must. We built a provisioning system that handles the growth by automating storage, network, and system processes.
2015 - 2016 high scale. An SSD architecture was implemented for our most challenging customers.
2016 - 2017 ???? - Time to market !
To get elasticity, self-service, and infrastructure as code, we started exploring the public cloud and found great capabilities, but the price was very high compared to our data center.
Many companies have created, and sell, their own flavor of OpenStack.
Team players.
Time to market, innovation, rapid growth ---> a new release every 6 months (EOL after 1 year).
Flexibility - new services, certification.
We started with a small POC based on vanilla OpenStack. Once we decided to go ahead with the project, knowing how essential it is for Kenshoo, we explored further and met with several vendors; most of them were pretty good.
Still, we decided to go with vanilla for three main reasons:
Innovation - as Paz mentioned, Kenshoo grows fast; we have to stay at the top of the technology to support that, and we couldn’t rely on a vendor’s certification. A new release is out every 6 months.
Flexibility - we wanted to be able to install whatever we decided on, with no dependency or possible degradation. For example, when we chose to install Designate as our DNS solution, we needed to upgrade our existing environment to Mitaka, and when we chose to install Watcher as a DRS-like solution, we needed to upgrade to Ocata.
Cost efficiency - given the two previous reasons, for us there was no added value in choosing another flavor.
I do want to say one thing: the fact that we chose to do it on our own could also be a downside.
VIO (VMware), Helion (HP), Neutrino (EMC), Mirantis, Canonical, Stratoscale, etc...
Each of them has picked a set of OpenStack services (in addition to the basic ones); some added features of their own (closed source), some added different storage integrations, and all of them created a simple installation flow.
With all of them, we are limited to the services they chose. For example, if we want to use Stratoscale because of its good networking approach, we must use GlusterFS and cannot use Ceph; with the Helion flavor we cannot install Magnum (Containers as a Service).
That limitation made us decide to use the open-source vanilla OpenStack project.
Meaning we have to build the installation and scale-out flows by ourselves. (Don’t worry - already done :-)
We chose the basic services plus extras:
Nova, Keystone, Glance, Neutron (on top of Open vSwitch), Cinder (with cinder-backup for backups), Horizon, and Heat. We are now starting to work on implementing Trove.
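Since Heat is part of the stack, the services above can be exercised together with a short HOT template. This is a minimal sketch, not Kenshoo’s actual templates; the image, flavor, and network names are assumptions:

```yaml
heat_template_version: 2016-10-14

description: Minimal sketch - boots a server with a Cinder volume attached.

parameters:
  image:
    type: string
    default: ubuntu-16.04    # assumed Glance image name
  flavor:
    type: string
    default: m1.small        # assumed Nova flavor

resources:
  app_port:
    type: OS::Neutron::Port
    properties:
      network: private       # assumed tenant network
  app_volume:
    type: OS::Cinder::Volume
    properties:
      size: 10               # GB
  app_server:
    type: OS::Nova::Server
    properties:
      image: { get_param: image }
      flavor: { get_param: flavor }
      networks:
        - port: { get_resource: app_port }
  volume_attach:
    type: OS::Cinder::VolumeAttachment
    properties:
      instance_uuid: { get_resource: app_server }
      volume_id: { get_resource: app_volume }

outputs:
  server_ip:
    value: { get_attr: [app_server, first_address] }
```

`openstack stack create -t stack.yaml app` would then create (and later delete) all four resources as one unit.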
Foreman implemented - we now use it to quickly install new physical servers with Ubuntu / CentOS.
Network API - now supports physical port configuration on the physical switches (called by Foreman).
Ceph - we used the public open-source Ansible playbooks for provisioning, integrated with the OpenStack services (Nova, Glance, and Cinder); tested by the performance team, with ~4.5% performance improvement for the KS.
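On the OpenStack side, that Ceph integration comes down to pointing Glance and Cinder at RBD pools. A sketch of the relevant config; the pool and user names are the common upstream defaults, not necessarily the ones used here:

```ini
# glance-api.conf - store images in Ceph (sketch)
[glance_store]
stores = rbd
default_store = rbd
rbd_store_pool = images        # assumed pool name
rbd_store_user = glance        # assumed cephx user

# cinder.conf - RBD-backed volumes (sketch)
[DEFAULT]
enabled_backends = ceph

[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
rbd_pool = volumes             # assumed pool name
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder              # assumed cephx user
rbd_secret_uuid = <libvirt secret uuid>   # shared with nova-compute
```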
OpenStack - we built a provisioning flow (using Ansible), designed the hardware, wrote basic functional tests for KS, tested live migration and resource increases, and created KS stacks.
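A provisioning flow of that kind can be sketched as a top-level Ansible playbook; the role and inventory group names below are purely illustrative, not the actual flow:

```yaml
# site.yml - illustrative layout only; roles and groups are assumed names
- hosts: controllers
  become: true
  roles:
    - keystone
    - glance
    - nova_api
    - neutron_server
    - heat
    - horizon

- hosts: computes
  become: true
  roles:
    - nova_compute
    - neutron_agent
```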
Restore for lab - flow created using API orchestration on top of the storage, Actifio, and OpenStack REST APIs (also supports a clean KS).
Deployment - the KS deployment flow (Fabric and Liquibase) has been tuned for the new environment.
Cloud plugin - slaves are now created automatically from the job queue on the OpenStack environment.
Several POC tenants created - performance, reporting, dba, tracking, bigdata, ace...
POC - 1 controller, 3 compute nodes, 10TB Ceph
Lab/STG environment - 2 controllers, 9 compute nodes, 30TB Ceph
Snapshots - test the ability to take, delete, and revert snapshots via self-service.
HA - set up and test full HA for network, compute, and management.
Floating IP - integrate Neutron with our public IPs in order to expose services to the world via self-service.
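Once Neutron is wired to the public ranges, that self-service exposure can also be declared in Heat; a fragment attaching a floating IP to an existing server port (the external network and port names are assumptions):

```yaml
resources:
  public_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network: public            # assumed external network name
  public_ip_assoc:
    type: OS::Neutron::FloatingIPAssociation
    properties:
      floatingip_id: { get_resource: public_ip }
      port_id: { get_resource: app_port } # assumed port of the server to expose
```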
Designate - implement Designate in order to expand our DNS (internal and external) management capabilities.
Packer - in the new KS artifact creation flow, we will also need to create a new OpenStack image.
Trove - implement and test Database (MySQL) as a Service.
Magnum - implement and test Containers as a Service, with Kubernetes integration!
Manila / CephFS - implement and test the file storage (NFS) service
Swift / CephRGW - implement and test the object storage (S3-like) service.
Auto scaling - test only
Rolling upgrades