Riccardo Capecchi – Andrea Capriotti
Matteo Turra
2
Cineca - About Us
Cineca is a non-profit consortium made up of 70 Italian
universities*, 5 Italian research institutions and the
Italian Ministry of Education.
Today it is the largest Italian computing centre and one of the
most important worldwide. With around 700 employees, it
operates in the technology transfer sector through high
performance scientific computing, the management and
development of networks and web-based services, and
the development of complex information systems for
processing large amounts of data.
3
Cineca Organization
In 2012:
● 5 big development teams (split into smaller
project teams)
● 1 Infrastructure team (in charge of all services
and split up to support the different dev
teams)
● 1 High Performance Computing team
4
Cineca Organization
5
Ops – How it used to be
● 2012 – Golden Image installation and
management of the configurations via overrides
6
Ops – Infrastructure as code
2013 – A corporate merger with Cilea and Caspur gave
us new goals: manage the new infrastructure and
merge the 3 Ops teams.
After scouting the best and most widely used tools, we chose
Puppet.
The 3 IT-Ops teams started a joint project to write and
configure everything (related to Linux servers) with
Puppet.
Along with Puppet, we also started using Git as the VCS to keep
all Puppet files.
7
Ops – Expanding our code
From 1 January 2014 we started to manage all
our Linux installations with Puppet and to
expand the module base to manage more
services/configurations.
We also started a massive training effort for all Ops staff:
in around 2 months, more than 60 people received
internal training on Puppet.
8
Ops – Automation
In 2015 we continued to expand our Puppet code
and started to integrate it with other tools such as
Nagios, Monit and Collectd, so that services
automatically get everything they need to be
production-ready.
We also expanded our use of Jenkins to easily
manage and automate some common tasks.
In summer 2015 we were given the task of rebuilding the
infrastructure for the Pentaho application.
9
Use Case:
10
Use Case:
New platform release to deliver to 80 customers, on
on-premise and hosted infrastructure!
11
Platform Components
Tomcat
+
a bunch of jars and wars
12
Use Case:
NO PROBLEM!
I'm a Black Belt Tomcat Expert
13
Java WebApp deployment
● Oracle JDK 6/7
● Apache Tomcat
● Apache HTTPD with mod_jk
● Jar(s) & War(s)
● Some shell scripts
14
Mega Package!
Every time a
little update
is needed …
e.g. a Xalan bugfix
15
Solution: Standardize
DEV ENVIRONMENT OPS ENVIRONMENT
DEV vs OPS
16
Interaction between DEV and OPS
17
Interaction between DEV and OPS
Developer: Merge Request
Operation: Accept/Reject Request
18
GitLab Web User Interface
19
Test Driven Puppet:
● How to test Puppet scripts?
● Clean environment
● Reuse of code
● Side effect: Docker images
● From Puppet Apply to Puppet Agent
20
Provisioning: Docker vs Puppet
● Puppet is slow, but its modules are extremely powerful
● A Dockerfile has limited configuration options but is very fast at building a
new image based on an existing one.
● Use the slow Puppet run to build images (run once)
● Note:
– Puppet can be used for container provisioning
– Portability: the same image can be used in test, staging,
production and development, lowering the diversity of
environments
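The "run Puppet once, at image build time" idea can be sketched as a small helper that renders a Dockerfile. This is only an illustrative sketch: the function name, the file paths and the assumption that the base image already ships Puppet are all hypothetical and not taken from the slides; only `puppet apply` and its `--modulepath` option are real.

```python
def render_dockerfile(base_image: str, manifest: str, modules_dir: str = "modules") -> str:
    """Return Dockerfile text that applies a Puppet manifest once, at build time.

    Assumption: base_image already has Puppet installed.
    All paths below are hypothetical placeholders.
    """
    lines = [
        f"FROM {base_image}",
        # Copy the Puppet modules and the manifest into the image.
        f"COPY {modules_dir} /etc/puppet/modules",
        f"COPY {manifest} /tmp/site.pp",
        # The slow step: Puppet runs once here and its result is baked into a layer.
        "RUN puppet apply --modulepath=/etc/puppet/modules /tmp/site.pp",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile("example/base-with-puppet", "site.pp"))
```

Images derived from the resulting one with a plain `FROM` do not pay the Puppet cost again, which is the trade-off the slide describes.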
21
Deploy on Docker
● Now we have Docker images.
● Where do we store them?
● How can I run containers for the test, QA and production
environments?
● How to reach high availability and scalability?
● OPS have the answer!
22
Docker Swarm
● Turns a pool of Docker host machines into a
single virtual host
● Allows us to distribute container workloads
across multiple machines running in a cluster
● Serves the standard Docker API
● Ships with a simple scheduling and discovery
backend
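To give an idea of what "simple scheduling" can mean, here is a minimal sketch of a spread-style placement rule: each new container goes to the node currently running the fewest containers. It is a toy model, not Swarm's code: the `Node` class and `schedule_spread` function are hypothetical names, real schedulers also weigh CPU and memory, and the slides do not say which strategy was configured.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A Docker host in the pool (hypothetical, simplified model)."""
    name: str
    containers: list = field(default_factory=list)


def schedule_spread(nodes, container):
    """Place the container on the node currently running the fewest containers."""
    # min() returns the first node among ties, so placement is deterministic.
    target = min(nodes, key=lambda n: len(n.containers))
    target.containers.append(container)
    return target.name


if __name__ == "__main__":
    pool = [Node("host-a"), Node("host-b")]
    for i in range(4):
        print(f"web-{i}", "->", schedule_spread(pool, f"web-{i}"))
```

From the client's point of view the pool still looks like a single host: the caller asks for a container and the scheduler decides where it lands.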
23
Consul
● Service discovery and configuration
● Failure detection
● Key/Value storage and DNS server
● Distributed and highly available system with
gossip protocol (Serf based)
24
Registrator
● Service registry bridge for Docker
● Monitors the Docker UNIX socket for events
● Dynamically registers and unregisters the
services of Docker containers
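The bridge role described above can be sketched as a function that maps a container event to the matching Consul agent HTTP call. This is a simplified illustration, not Registrator's implementation: the event dictionary shape, the service ID format and the function name are assumptions. The two endpoints, `/v1/agent/service/register` and `/v1/agent/service/deregister/<id>`, belong to Consul's agent API.

```python
import json


def plan_consul_call(event):
    """Map a simplified container event to (method, path, body) for the Consul agent.

    Assumption: `event` is a hypothetical, flattened view of a Docker event.
    Returns None for events the bridge does not care about.
    """
    service_id = f"{event['host']}:{event['container']}:{event['port']}"
    if event["action"] == "start":
        # A container came up: register its service with the local agent.
        body = json.dumps(
            {
                "ID": service_id,
                "Name": event["service"],
                "Address": event["ip"],
                "Port": event["port"],
            },
            sort_keys=True,
        )
        return ("PUT", "/v1/agent/service/register", body)
    if event["action"] == "die":
        # The container stopped: remove its service from the catalog.
        return ("PUT", f"/v1/agent/service/deregister/{service_id}", None)
    return None


if __name__ == "__main__":
    started = {"action": "start", "host": "host-a", "container": "web-1",
               "service": "web", "ip": "10.0.0.5", "port": 8080}
    print(plan_consul_call(started))
```

Once a service is registered this way, other containers can look it up through Consul's DNS or HTTP interface instead of hard-coding addresses.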
25
Docker registry (Distribution)
● Image registry to store and distribute images
● Private registry to host customer applications
● Test and continuous integration images
26
DockerUI
● A web interface for Docker
● Start, stop, kill, pause, restart and commit
containers
● Provides details about running containers
27
Log management: ELK
● Central point to collect, manage and visualize
logs from all containers
Docker apps
28
Docker Swarm, Consul - Registrator
29
DevOps “done”, but we forgot ...
30
What we have now
Puppet
~800 Linux server installations fully managed via Puppet
More than 100 modules
Training done for Ops and some Devs
Devs can open merge requests to apply changes to the infrastructure
Docker
Procedures to build images from Puppet recipes
Tests run in a full Docker environment
Training done for Dev AND Ops.
OpenShift
Orchestration of containers
RBAC management for Dev and Ops
31
Conclusions
DevOps in practice is more about organization than
tools and software.
But tools can help in day-to-day operations: using
GitLab we have been able to give RBAC access to
different Git repositories.
A lot of training is needed on both sides, teaching the
best Dev practices to Ops people and vice versa.
After months of working together to find a solution we
“forgot” to explain it to our CTO, so he
thought we were at a standstill. Lesson learnt.
32
How it ended
33
Questions

devops@cineca
