Beginning the DevOps Journey in Real Money Gaming
Kelly Looney, 27.11.2014

DevOps in Real Money Gaming
Context: 600M € company (online sports betting, online casino, poker, other games)
Two primary technologies combined via a business merger (turn-of-the-century architecture)
  • Sports: .NET/SQL Server
  • Poker, Casino, and “Platform”: Java/Oracle
• Datacenters in Gibraltar, Vienna, and other points in Europe, and now in the US
• Over 2000 servers in production
• 200 people in Ops and Infrastructure
• Development centers in Vienna, Ukraine, and Hyderabad
• Over 700 development team members

In 2013…the Challenge
• Monolithic app with many single points of failure
• Different faces/rules for different markets
• Dev teams focused on horizontal components
• Totally separate Ops, Maintenance, and Dev teams
• Clashing cultures from merger/locations/code bases
• Up 24/7 with millions of €/day wagered
• In-house Build, Deploy, Monitoring, …

[Figure: AppDynamics picture of the Beast]

What we have done and are doing…
• Global Agile Transformation: classes, coaches, 96 Scrum teams
  • Craig Larman, Luke Hohmann (Innovation Games)
  • DTO (Damon Edwards, Alex Honor) for DevOps principles
  • Now exploring SAFe
• Several organization changes
  • Components -> Features -> Services
  • Ops -> LeanOps -> Delivery Units

Cultural changes we have encouraged
• Old-style developers
  • Responsibilities: Write code
  • Focus: Know ONE THING really, really well
  • Deep expertise = respect
• What we want now is developers who:
  • Understand our company goals
  • Understand requirements and tests
  • Write, build, integrate, and test code incrementally
  • Can demonstrate and explain working systems
  • Maintain their code in production
  • Understand operations
Deep expertise is great, but varied knowledge is just as important

Wow, you want developers to do everything…
How is that possible?
• First the right attitude…then
• Today's tools and processes:
  1. Agile provides continuous “customer” access
  2. Distributed versioning (typically Git) puts full source control into individual developers' hands
  3. Continuous Integration isolates mistakes
  4. Jenkins-Vagrant-Puppet-Chef-SaltStack pipelines make infrastructure and deployment mostly automatic regardless of complexity
    • Deploy to Test, UAT, Staging, Production
  5. Monitoring lets you see and assess your running service

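A minimal sketch of the staged deployment idea, assuming a hypothetical deploy_and_verify callback that stands in for whatever Jenkins, Puppet, or Chef job actually does the work. Only the four stage names come from the slide; the function and parameter names are illustrative.

```python
from typing import Callable, List, Tuple

# Stage names are taken from the slide; the rest is a hypothetical sketch.
STAGES = ["Test", "UAT", "Staging", "Production"]


def promote(
    build_id: str,
    deploy_and_verify: Callable[[str, str], bool],
) -> Tuple[List[str], bool]:
    """Deploy one build to each stage in order and stop at the first failure.

    deploy_and_verify(build_id, stage) is an assumed callback that returns
    True when the deployment to that stage succeeded and passed its checks.
    Returns the stages that were reached and whether the build went all the way.
    """
    reached: List[str] = []
    for stage in STAGES:
        if not deploy_and_verify(build_id, stage):
            return reached, False  # a failed stage blocks every later stage
        reached.append(stage)
    return reached, True
```

The same build identifier moves through every stage, so what reaches Production is what was verified in Test, UAT, and Staging.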
What we have done and are doing…tech
• Tool changes
  • SVN -> Git; in-house deploy -> Jenkins/TeamCity, Puppet, Chef, Rundeck
  • Bare metal -> VMware -> now headed to Docker/containers
  • Monitoring…AppDynamics, more to come
• Architectural principles
  • Less centralized, fewer failure points
  • Code to create a server is the asset, not the server
  • Throw cheap machines, not faster CPUs or bigger DBs, at scaling problems
  • Use relational databases when needed; otherwise avoid them

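One way to picture "the code to create a server is the asset" is a small declarative spec from which any number of identical machines can be derived. This is an illustrative sketch only: ServerSpec, fingerprint, and build_servers are hypothetical names, and no real provisioning tool is called.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of a server; the spec, not the machine, is kept."""
    role: str
    packages: Tuple[str, ...]
    ports: Tuple[int, ...]

    def fingerprint(self) -> str:
        # Identical specs always hash to the same value, so a rebuilt server
        # can be checked against the definition it was created from.
        payload = json.dumps(
            {
                "role": self.role,
                "packages": sorted(self.packages),
                "ports": sorted(self.ports),
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def build_servers(spec: ServerSpec, count: int) -> List[str]:
    """Scale out by adding more identical machines, not by growing one machine."""
    return [f"{spec.role}-{spec.fingerprint()}-{i}" for i in range(count)]
```

Because every instance is derived from the spec, losing a server costs nothing more than running the definition again.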
Containers are changing hosting
• Virtualization efficiency and cost savings are obvious
• The most interesting issue is the separation of concerns presented
  • “developer-land” vs. infrastructure

What to do about quality?
First Steps: Workshops at each main development site
• We pulled all sorts of people together
  • Ops, Dev, CS, Business, Partners…
  • “What do you think we can do to improve overall system quality?”
• #1 Answer: We need comprehensive monitoring
  • Our system is so complex and so opaque that we can’t really tell what specifically is wrong.
  • Reworking our millions of lines of code to log properly and consistently will never happen…
  • This led us to evaluate many different monitoring approaches and products
  • We settled on AppDynamics. Reasons:
    • Advanced UI, very flexible
    • One application to replace a variety of other solutions
    • Aggregation of data was a huge cost saver
• #2 Quality issue: Testing environment stability and viability
  • Expensive, not really “production-like”, and not highly available
  • Too elaborate for early testing and not close enough to production for late testing
  • Forced to mix tests, which often polluted one another
  • Infrastructure was an incredible blocker: no private or public cloud

The Difference Monitoring has made…
1. Like a giant debugger for production issues
  • Peer into what were previously opaque code bases
  • Where are the stress points? Also surface the really dumb stuff.
  • Identify intermittent issues that were hard to identify before
  • “Working for me…”
2. Better resource planning
  • We had lots of “over-solved” problems before
  • How do things change during spikes in traffic?
3. Rollout actually helped us identify services that needed refactoring
  • If the overhead of mature monitoring breaks your service…
4. Developers are starting to use AppDynamics to assess new designs
  • It has uncovered a few things we were happy we did not deploy!
5. Gets the whole organization in touch with operations
  • A huge DevOps goal realized…

[Figure: posted all around the organization]

Automating Test
• It’s not “How many automated tests do I have?”
  • We could easily have run days' worth of tests whenever we wanted
• It’s “I have the right tests to quickly decide if I can move forward”
• Also, BTW, “We run Jenkins to do a build every night”
  • A nightly build != Continuous Integration…
• Can you create a viable test environment, use it, then throw it away (before it pollutes other tests…)?

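The create, use, throw away cycle can be sketched as a context manager. This is a minimal illustration under a big assumption: a temporary directory stands in for a real test environment (VMs, containers, databases), and ephemeral_environment is a hypothetical name.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def ephemeral_environment(prefix: str = "testenv-") -> Iterator[Path]:
    """Create a fresh environment, hand it to the test, then destroy it.

    A temporary directory is only a stand-in here; a real implementation
    would provision and tear down actual infrastructure.
    """
    root = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield root
    finally:
        # Teardown runs even if the test fails, so leftovers from one run
        # cannot pollute the next.
        shutil.rmtree(root, ignore_errors=True)
```

Each test run gets its own environment, so two runs never share state:

```python
with ephemeral_environment() as env:
    (env / "fixture.txt").write_text("test data")
    # run tests against env here
```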
What DevOps and CD mean for the organization
• The whole idea of holding off changes to retain stability gets turned on its head
  • Change all the time and stay stable!
• Changes get smaller and smaller, but are constantly being deployed
  • With small changes, integration issues become fairly simple
• Environments must proliferate along with associated infrastructure
  • Ideally you need a new test environment to test every change: Create/Destroy
  • Are your environments captured as code?
  • Use cloud services here, even if you don’t want to for production

Gartner Infrastructure and Operations Summit Berlin 2015 - DevOps Journey
