Attending IT conferences, such as Gartner's, as well as vendor events, can be painful
and stressful for a midmarket IT manager. You see many fine examples of large-scale
implementations of a tool or a framework, and managers from rich companies present
fancy case studies demonstrating that a project of x million returned three times
more in savings because of reduced work hours and increased service quality.
Then you return to your small or midsize company knowing that nobody will ever
fund the implementation of an out-of-the-box monitoring solution and that you will
never have a chance to map your capabilities using the IT4IT framework. These toys
are for the big guys only.
This presentation shows how much one can achieve with a moderate budget and how
you can build mature IT service management without implementing every feature
that vendors offer and present as a must.
This is not about revolution or dramatic transformation through a single project or
initiative.
This case study shows how an evolutionary transformation of company culture can
lead to high service quality.
0
I was born in 1972 and work at Intesa Sanpaolo Card in Croatia as the Head of the
Service Management Department.
After earning a master's degree in computer science at the University of Zagreb, I
have worked for 19 years in infrastructure and operations for the card processing business.
For the last 7 years my focus has been the real-life application of ITSM good practice.
This has contributed to certain improvements, and some of these improvements
are the topic of this presentation.
Positions (within card processing companies): Help Desk staff, DBA, IT Infrastructure
& Operations deputy manager, Head of Service Management
1
PEOPLE
362 employees
100+ IT
Early and quick adopters
Quick acceptance and implementation of agile methodology
High fluidity of communication and people interactions
Corporate culture based on constant self and collective improvement
100% use of English language
INNOVATIVE COMPANY
First company in the region to turn a smartphone into a contactless payment
device:
1st in the world: Pilot and launch of mobile wallet for PBZ Group with
American Express
2nd in Europe: Pilot for VUB with Visa Inspire and one of the first to launch it
commercially
1st in Visa Inc. Region: Pilot for BIB with Visa Inspire
In-house development of a fraud management solution for detecting payment card
fraud
Creating the most innovative and modern disaster recovery solution
Modern and flexible infrastructure with full multi-country, multi-currency,
multi-language capabilities
Support for multiple payment schemes
Implementation of Card Life Cycle Management strategies to increase customer
lifespan and profitability
Acting as compliance guardian for banks
2
At peak time:
- One-minute outage – 5,000+ lost transactions
- Ten-minute outage – 50,000+ lost transactions
- One-hour outage – 300,000+ lost transactions
A large number of people are affected, so smooth operations are important!
3
The company had a number of typical start-up issues, including high
demand for new development, lack of resources, insufficient control of change and
firefighting in incident management.
The challenge was to radically increase service quality while keeping costs under
control during turbulent years for the financial industry in Europe.
Examples:
Customers call: „Application is not working!” IT answers: „All systems are up
and running.” It then took several hours to determine what was happening and to
resolve the incident, which was related to a network issue.
An incident is reported on a service that the Service Desk does not know exists (!).
Later, in the corridor, the IT guys said they had deployed the new service three days earlier.
Reference to Real ITSM:
Service „deathcycle”
Service taming
Do not take this book as a reference guide: it is just „ITSM humor (if there can be
such a thing)”
Question: According to Real ITSM, what is the monitoring device?
4
The telephone was the main monitoring device for ISPC as well.
However, this was not good practice.
The company's position in the midmarket category did not allow the luxury of investing
in a multi-million transformation project. So ISPC launched several focused initiatives
aimed at improving service quality. As Gartner's research stated: “…smaller
businesses lack deep resources to support change and must do everything they can
to assure that their investments in change are focused and successful.” (“Strategic
Benefits Realization for the Midmarket CIO”, Heather Colella, August 2015)
So, what have we done?
5
By knowing which services are critical and which are important
Take inventory of services and critical infrastructure
Define and negotiate the service catalogue and SLAs
Not as easy a task as it may look
Dilemmas regarding the service catalogue:
Technical (measurable) vs. business (chargeable): the answer depends on the
industry and the position of IT
Anyway, you always need the technical view in order to have
technical metrics
Example: transaction processing in the card business
More detailed vs. more general: the answer is general, and yet
measurable
In our case, 3 groups were defined:
- online, main metric is availability
- batch processing, main metric is time of delivery (i.e.
percentage of files delivered on time)
- operational (includes human interaction), different metrics
This is not a quick-win task (OK, now you have a service list, so what?!); nevertheless,
it is the foundation for the rest of the story
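The two measurable metrics named for the online and batch groups can be sketched in a few lines of Python. This is only an illustration, not ISPC's implementation: the function names, the 30-day reporting period and the sample figures are assumptions made for the example.

```python
from datetime import timedelta


def availability_pct(period: timedelta, outages: list) -> float:
    """Online metric: share of the reporting period without an outage."""
    downtime = sum(outages, timedelta())
    return 100.0 * (1 - downtime / period)


def on_time_pct(delivered_on_time: int, delivered_total: int) -> float:
    """Batch metric: percentage of files delivered on time."""
    if delivered_total == 0:
        return 100.0  # nothing was due, so nothing was late
    return 100.0 * delivered_on_time / delivered_total


# Hypothetical month: two outages and 1,200 scheduled file deliveries.
month = timedelta(days=30)
outages = [timedelta(minutes=10), timedelta(minutes=33)]
print(f"online availability: {availability_pct(month, outages):.3f}%")
print(f"batch on-time delivery: {on_time_pct(1188, 1200):.1f}%")
```

Both figures can be derived from incident tickets and delivery logs alone, which is why a general but measurable catalogue is enough to start reporting.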
6
First two processes to mature:
Incident management:
Implement a severity matrix, use incident frequency/severity as the
primary KPI
Think about whether to publish the severity estimate to clients!
The KPI indicates service quality and is cheap to implement
We used the old, good-enough ticketing tool
(Operational) Change management:
Change windows, detailed planning of planned maintenance, service
state checklists
Automated testing and build as one of the cornerstones of DevOps
File Integrity Monitoring (FIM) – easy to deploy, difficult to configure
Scope narrowing for FIM: 5 critical platforms
In the end, unauthorized change was eliminated, not because of
FIM but because of the culture change: the OCM process should not be a „police
task”
Consistent reporting
Availability
Incidents
Culture shift
- From „service taming” to change planning and control
Make friends with audit and use its findings to build momentum
People in audit may have a broader picture and often provide genuinely good
advice
„It was an audit request” may be a valid reason why some control must be in
place. Of course, it should not be the only reason!
7
Simple metric – tickets as the only source
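A ticket-only metric of this kind can be sketched as follows. The slides do not give the actual matrix or weights, so the impact/urgency levels, the severity numbering (1 = most severe) and the weights below are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical severity matrix: (impact, urgency) -> severity, 1 = most severe.
SEVERITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "low"): 2,
    ("low", "high"): 3,
    ("low", "low"): 4,
}

# Hypothetical weights: one severity-1 incident counts as 20 severity-4 ones.
SEVERITY_WEIGHT = {1: 20, 2: 10, 3: 3, 4: 1}


def severity(impact: str, urgency: str) -> int:
    """Look up the severity of an incident in the matrix."""
    return SEVERITY_MATRIX[(impact, urgency)]


def incident_index(tickets: list) -> int:
    """Relative KPI: severity-weighted count of incidents in a period."""
    counts = Counter(severity(t["impact"], t["urgency"]) for t in tickets)
    return sum(SEVERITY_WEIGHT[sev] * n for sev, n in counts.items())


# Hypothetical tickets exported from the ticketing tool for one month.
tickets = [
    {"impact": "high", "urgency": "high"},
    {"impact": "low", "urgency": "high"},
    {"impact": "low", "urgency": "low"},
    {"impact": "low", "urgency": "low"},
]
print(incident_index(tickets))
```

Tracking this single number month over month shows the trend in service quality without any tool beyond the existing ticketing system.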
8
By monitoring and measuring service state
Use the existing ticketing tool as the basic source of data
You already have a (primitive) monitoring system: the telephone!
Implement monitoring of end-to-end service state for all critical services so that, at
least, you know the service is broken before your customers do
Using active monitoring (synthetic transactions)
Using passive monitoring (control over real user transactions)
We discovered many „islands of monitoring” within IT
Use existing infrastructure monitoring components (like Nagios) to know, at least:
Is the server/network running?
Is there some free storage space?
Is any web certificate about to expire?
Target specific pain points
Umbrella monitoring to visualize and consolidate information
Data processing (batch jobs) as a specific area to address
Visualize the data, not only for the 7x24 Service Desk, but also for management
Tool list:
HP BSM – umbrella monitoring
HP BAC – robot scripts for active end-to-end monitoring of applications and WS
MIRTA – internally developed solution for rule-based detection of card authorization
outages – passive end-to-end monitoring
CA Spectrum – network infrastructure
Nagios – server infrastructure AND specific scripts targeting pain points
Automic – enterprise job scheduling tool
Project cost – less than €200,000
Less than half on licenses and the video wall
More than half on internal/external development and integration
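A script targeting one pain point can be very small. The sketch below checks free storage space and follows the Nagios plugin convention of one status line plus an exit code (0 = OK, 1 = WARNING, 2 = CRITICAL). The thresholds and function names are assumptions, and this is not one of the scripts used at ISPC.

```python
import shutil
import sys

# Exit codes of the Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_free_space(free_pct: float, warn_pct: float = 20.0,
                        crit_pct: float = 10.0) -> int:
    """Map the percentage of free space to a plugin state (assumed thresholds)."""
    if free_pct < crit_pct:
        return CRITICAL
    if free_pct < warn_pct:
        return WARNING
    return OK


def check_disk(path: str = "/") -> tuple:
    """Return (state, status line) for the file system that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_pct = 100.0 * usage.free / usage.total
    state = classify_free_space(free_pct)
    return state, f"DISK {LABELS[state]} - {free_pct:.1f}% free on {path}"


def main() -> int:
    """Print the status line and return the state for use as exit code."""
    state, message = check_disk(sys.argv[1] if len(sys.argv) > 1 else "/")
    print(message)
    return state
```

When deployed as a plugin, the script would end with `sys.exit(main())` so that the monitoring server receives the state as the process exit code.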
9
Do not use monitoring events to automatically raise tickets
Avoid the false-alert pitfall – treat false alarms as incidents in order to eliminate them
The alternative – nobody trusts monitoring
Repeated undetected incidents are a problem in themselves
Continuous gradual improvement (2013-2016)
100% is neither reachable nor needed (low-severity incidents)
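One way to follow this improvement is to measure how many real incidents monitoring caught before a customer called. The sketch below assumes that each ticket records who noticed the incident first, its severity (1 = most severe) and whether it was a false alarm; these field names are hypothetical and not taken from the ISPC ticketing tool.

```python
from typing import Optional


def detection_rate(tickets: list, max_severity: int = 2) -> Optional[float]:
    """Percentage of real incidents that monitoring detected first.

    Only incidents with severity <= max_severity are counted, so that
    low-severity incidents do not distort the figure.
    """
    relevant = [
        t for t in tickets
        if not t["false_alarm"] and t["severity"] <= max_severity
    ]
    if not relevant:
        return None  # no qualifying incidents in the period
    detected = sum(1 for t in relevant if t["detected_by"] == "monitoring")
    return 100.0 * detected / len(relevant)


def false_alarm_count(tickets: list) -> int:
    """False alarms are ticketed like incidents so they can be eliminated."""
    return sum(1 for t in tickets if t["false_alarm"])


# Hypothetical tickets for one reporting period.
tickets = [
    {"severity": 1, "detected_by": "monitoring", "false_alarm": False},
    {"severity": 2, "detected_by": "monitoring", "false_alarm": False},
    {"severity": 2, "detected_by": "customer", "false_alarm": False},
    {"severity": 1, "detected_by": "monitoring", "false_alarm": False},
    {"severity": 4, "detected_by": "customer", "false_alarm": False},
    {"severity": 3, "detected_by": "monitoring", "false_alarm": True},
]
print(detection_rate(tickets), false_alarm_count(tickets))
```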
10
Embrace the culture of continuous improvement
Implement Problem management, at least using a spreadsheet
Track problem resolution
Ask for executive sponsorship
Weekly Problem management meetings with the heads of IT, Operations
and Security
Weekly meeting with the CEO and CBO on incidents and problems
Great impact on company culture: service orientation
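Even a spreadsheet-based problem register can feed the weekly meeting. The sketch below assumes the register is exported as CSV with the columns `id`, `opened`, `closed` and `summary`; the column names and the sample rows are invented for the example.

```python
import csv
import io
from datetime import date

# Hypothetical problem register exported from a spreadsheet as CSV.
SAMPLE = """id,opened,closed,summary
P-1,2016-01-11,2016-02-01,Intermittent network drops
P-2,2016-02-15,,Batch file delivered late
P-3,2016-03-07,,False alarms from disk check
"""


def open_problems(csv_text: str, today: date) -> list:
    """Return (id, age in days) of unresolved problems, oldest first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    ages = [
        (row["id"], (today - date.fromisoformat(row["opened"])).days)
        for row in rows
        if not row["closed"]  # an empty "closed" cell means still open
    ]
    return sorted(ages, key=lambda item: item[1], reverse=True)


for problem_id, age in open_problems(SAMPLE, date(2016, 3, 14)):
    print(f"{problem_id}: open for {age} days")
```

Listing the oldest open problems first keeps the meeting focused on resolution tracking rather than on the newest items only.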
11
Empower the role of Service Desk within key processes:
Incident management
Authority to set incident severity and activate any function
7x24
Major incidents trigger an SMS to MB members, including the CEO
Change management
Service Desk as a part of the CAB with a veto option
Monitoring
Self-confidence due to service-state information availability
Feeling of helplessness before
Feeling of confidence today
Example: The SD noticed a large number of declined authorizations on the POS network
of our client, a big bank in one country, and we saw that they all belonged to cards
issued by the biggest bank in that country.
After the SD informed our bank and they asked the other bank what was happening, the
other bank replied: „Thank you for the alert, we looked and now see that we have a
problem on our host”
This is the greatest gain from monitoring: from „telephone monitoring” to monitoring
even for other providers!
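The kind of rule that surfaces such a case can be sketched as a decline rate per issuer over an observation window. The slides do not describe MIRTA's actual rules, so the threshold, the minimum sample size and the data shape below are assumptions for illustration only.

```python
from collections import defaultdict


def issuers_with_high_decline_rate(authorizations, threshold=0.5, min_count=20):
    """Return issuers whose decline rate in the window exceeds the threshold.

    `authorizations` is an iterable of (issuer, approved) pairs observed in
    the window; issuers with fewer than `min_count` transactions are skipped
    because their rate is not yet meaningful.
    """
    totals = defaultdict(int)
    declined = defaultdict(int)
    for issuer, approved in authorizations:
        totals[issuer] += 1
        if not approved:
            declined[issuer] += 1
    return sorted(
        issuer
        for issuer, total in totals.items()
        if total >= min_count and declined[issuer] / total > threshold
    )


# Hypothetical window of real user transactions (passive monitoring).
window = (
    [("BANK-A", False)] * 25 + [("BANK-A", True)] * 5
    + [("BANK-B", False)] * 2 + [("BANK-B", True)] * 28
    + [("BANK-C", False)] * 5
)
print(issuers_with_high_decline_rate(window))
```

Grouping by issuer is what points to a problem at another provider rather than in one's own platform.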
12
Results within Best-In-Class or, at least, Outstanding
Problem with the legacy ATM platform: in resolution
Gartner metrics are defined on a yearly level – if calculated yearly, both main metrics
fall into „Best-In-Class”
The ISPC SLA matches Gartner „Outstanding” for similar services
13
CMDB?
- Holy grail of service management
- Automatic population, including relationships – incredible amount of information
- Implementing tools that use full power of CMDB
- Using CMDB within Incident, Problem and especially Change management process
directly from tool
- Connecting to FIM
- Infrastructure as code?
- Anyway, we are doing reasonably well even without a proper CMDB
IT4IT?
- Appealing concept, allowing the company to map all capabilities in a vendor-agnostic
and even process-agnostic way, with the goal of getting a clear picture of which
capabilities have overlapping solutions and which capabilities are missing
- Never saw a small-scale implementation
Agile/DevOps
- Agile development in place for the majority of platforms
- More automation needed
- More important: further culture shift
- The ITIL vs. DevOps collision is a myth! ITIL embraces DevOps and, in a way, is a
prerequisite for DevOps
- Foundations exist!
TQM
- To institutionalize the improvement culture
Company ownership change to be finalized in December 2016! Future: operations on the open
card processing market.
14
Tickets as a source of metrics, both relative (index) and absolute (availability)
Make your service catalogue and agree on SLAs with stakeholders
Attack critical services with end-to-end monitoring
Get management attention
Implement a problem record, at least on a piece of paper
Empower the Service Desk in order to reinforce customer centricity
Embrace a continuous improvement culture, regardless of flavor (Kaizen, ITIL, Kata,
QM...)
Eighth piece of advice: „Read good books”
Literature:
Real ITSM
Visible Ops and Phoenix Project
Gartner research
15
Questions?
16

How To Build Mature SM - final
