Automating IT Analytics to
Optimize Service Delivery and
Cost at Safeway
David Wagner – TeamQuest Advocate
Chris Lynn - Safeway Capacity
Manager/Performance Analyst
December 11, 2013

TeamQuest and the TeamQuest logo are registered trademarks in the US, EU and
elsewhere. All other trademarks and service marks are the property of their
respective owners.
Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
Agenda
• TeamQuest Perspectives
• Safeway Experiences

Desired State:
Continuous Optimization
• Continuously financially-optimized IT environment
– Always know where and when performance problems will
affect the bottom line
– Identify cost and performance inefficiencies in support of
business processes and eliminate them

• Continuously optimized customer experience
– Understand when, where and why customer experiences fail
– Resolve, predict and prevent customer dissatisfaction issues

Continuous IT Optimization
Results
• Significantly reduce initial CapEx, and ongoing OpEx
– Make, and keep making, more money!

• Optimize resources for systems of customer engagement
• Deploy and refresh new applications faster
– e.g. retailers need to capture their share of mobile commerce as it
grows from $6B to $31B (2016)

• Respond faster to business spikes
• Prevent business-impacting outages and slowdowns

How:
Aligned Business and IT Analytics
[Diagram labels: Big Data Collection; Business Intelligence; Enterprise IT Optimization; Aligned Business and IT Intelligence]
Business areas: Outage Management, Customer Operations, Distribution Automation, Asset Management
Underlying IT infrastructure: Services, Applications, Server/OS, Network, Storage

Enterprise IT Optimization
• Correlate business & IT performance
• Insight into how business process changes impact IT
• Understand and optimize IT costs by business unit/process and technology
• Insight into business performance across technology stack
TeamQuest’s Approach:
Federated IT Analytics
• Federates existing data/information into purpose-designed optimization process
– Technology data (e.g. server, network, storage, etc.)
– Service data (catalog, metrics, tickets, etc.)
– Financial data
– Business data (analytics, KPIs, plans, TXNs, etc.)

• Automates IT analytics across all data sources
– Flexible and adaptive to dynamic environments
– Raw (commodity) data -> actionable information for IT

• Single-pane-of-glass IT Optimization

Result:
Automated Application Financial Optimization
• Continuous Optimization
– Pre-purchase validation
– Re-purposing
– Consolidation

• Fully automated, low cost

• Integrated with Risk and Service Management
• New VMware clusters went from one every 6 weeks to:
– None for 18+ months
– Consolidated 1000s of VMs (saving $M)

Result:
IT Optimized... Future Assured
• Continuous IT Optimization
– Peak IT Performance
– Ideal Resource Capacity
– Optimized Resource Costs

• Automated IT Analytics
– Predictive
– Federated

• Aligned IT and Business Management
– Performance
– Capacity
– Financial

Automating IT Analytics to Optimize Service
Delivery and Cost at Safeway
Chris Lynn - Safeway
December 11, 2013 2:30-3:00
Topics

Background
Server Storage Forecasting and Optimization
Application Capacity Analysis – Dashboards to Details
Business KPI Analytics
VMware High Level Analytics
Background
• Manager of Safeway Capacity and
Performance Team
• ChrisLynn@usa.com
• http://www.linkedin.com/pub/chris-lynn/2/65/309/

• Environment Supported
• ~4000 servers (~1700 physical)
• ~200 significant applications
• Unix, Windows, Mainframe, Teradata, Tandem,
etc.
• Thousands of internal IT Customers, and
millions of shoppers
Server Storage Forecasting and
Optimization
• Optimizing Availability (reducing incidents)
• Optimizing Enterprise Capacity
• Reducing Risk
• Automated replacing Manual
• Embedded expertise
Storage Capacity Incident Avoidance:
Old (on server) Manual Method
$ df
Filesystem                          size  used  avail  capacity  Mounted on
/dev/vx/dsk/rootvol                 3.9G  2.5G  1.4G     65%     /
/dev/vx/dsk/var                     1.9G  832M  1.0G     45%     /var
swap                                9.5G   16K  9.5G      1%     /var/run
swap                                1.0G  2.6M  1021M     1%     /tmp
/dev/vx/dsk/patrol                  1.9G  1.5G  227M     88%     /appl/patrol
/dev/vx/dsk/home                    486M  347M   91M     80%     /export/home
/dev/vx/dsk/openv                   1.4G  583M  717M     45%     /usr/openv
/dev/vx/dsk/performdg/usrlocal      1.9G  542M  1.3G     29%     /usr/local
/dev/vx/dsk/performdg/oracle        3.9G  1.1G  2.8G     28%     /appl/oracle
/dev/vx/dsk/performdg/apache        128M   27M   95M     22%     /appl/apache
/dev/vx/dsk/rootdg/opswarelv        241M  234K  216M      1%     /var/opt/opsware
/dev/vx/dsk/performdg/b1home         12G  432M   11G      4%     /appl/perform/best1home
/dev/vx/dsk/performdg/spool_apache  256M  145M  104M     59%     /appl/spool/apache
/dev/vx/dsk/performdg/manage        180G  122G   54G     70%     /appl/perform/manager
/dev/vx/dsk/performdg/workspace      90G   59G   29G     68%     /appl/perform/workspace
/dev/vx/dsk/performdg/collect       480G  442G   37G     93%     /appl/perform/collect
Automated Storage Forecasting:
File System Exceptions
• Weekly automated prioritized scan
• 4500 servers
• 45000 filesystems
• Focused on meaningful exceptions
• A proactive shift from find to fix
• Was: 50 minutes looking for potential problems, 10 minutes to fix
• Now: 5 minutes looking for potential problems, 55 minutes fixing them
• Impossible to do manually
New Automated (global exception)
File System Forecast Analytic Details
• Complex multi-level thresholds
1. Is file system utilization above 90% AND growing by >0.2% for the interval?
2. Is file system utilization above 75% AND growing by >2% for the interval?
3. Is file system utilization above 15% AND growing by >15% for the interval?
4. Is /appl/patrol above 90% AND growing for the interval?

• Individual exclusions and special cases
• Physical and virtual in same report, but can be treated uniquely
• Sorted by date/time most likely to fill up
• Show all candidates for a single server together (sorted by the highest one) to minimize the time for operations to respond
• Includes the historical trend rather than just a point in time (e.g. df)
• Forecast utilization trend into the future (multiple statistical options)
• Requires more than 24 hours of data, to avoid temporary file systems
• Requires recent data, to avoid shut-down servers
• Final measured number must not be below the threshold
• A final number >99.5% is always flagged, which catches very full file systems that might not be growing
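
The exception rules on this slide could be sketched roughly as follows. This is an illustrative Python sketch, not the TeamQuest implementation. The FsSample fields, the 24-hour staleness limit, the weekly (168-hour) interval, the reading of "growing by >X%" as percentage points of utilization, and the straight-line projection to 100% are all assumptions made for the example. Grouping candidates by server is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FsSample:
    """Summary of one file system over the scan interval (hypothetical shape)."""
    server: str
    mount: str
    util_pct: float          # final measured utilization, in percent
    growth_pct: float        # utilization change over the interval (assumed: percentage points)
    hours_of_data: float     # amount of history available
    hours_since_last: float  # age of the newest sample


# (utilization above, growth above) pairs taken from rules 1-3 on the slide.
GENERAL_RULES = [(90.0, 0.2), (75.0, 2.0), (15.0, 15.0)]


def is_exception(fs: FsSample, max_staleness_hours: float = 24.0) -> bool:
    """Return True if the file system belongs on the exception report."""
    # Need more than 24 hours of data (skips temporary file systems) and
    # recent data (skips shut-down servers). The staleness limit is assumed.
    if fs.hours_of_data <= 24.0 or fs.hours_since_last > max_staleness_hours:
        return False
    # Very full file systems are flagged even if they are not growing.
    if fs.util_pct > 99.5:
        return True
    # Rule 4: /appl/patrol above 90% and growing at all.
    if fs.mount == "/appl/patrol" and fs.util_pct > 90.0 and fs.growth_pct > 0.0:
        return True
    # Rules 1-3: the fuller the file system, the less growth it takes to flag it.
    return any(fs.util_pct > u and fs.growth_pct > g for u, g in GENERAL_RULES)


def hours_until_full(fs: FsSample, interval_hours: float = 168.0) -> Optional[float]:
    """Straight-line projection of hours until 100%; None if not growing."""
    if fs.growth_pct <= 0.0:
        return None
    rate_per_hour = fs.growth_pct / interval_hours
    return (100.0 - fs.util_pct) / rate_per_hour


def exception_report(samples: List[FsSample]) -> List[FsSample]:
    """Flagged file systems, soonest projected fill first."""
    def sort_key(fs: FsSample) -> float:
        hours = hours_until_full(fs)
        return hours if hours is not None else float("inf")

    return sorted((fs for fs in samples if is_exception(fs)), key=sort_key)
```

A real forecast would likely use one of the "multiple statistical options" the slide mentions; the linear projection here is only the simplest stand-in.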
Executive Capacity Dashboards
Capacity Risk Indicators: Under Used, Well Used, Stressed, Highly Stressed
[Dashboard image]
Application Capacity Dashboards
Capacity Risk Indicators: Under Used, Well Used, Stressed, Highly Stressed
[Chart, 0% to 100%, by capacity risk indicator]
Application Capacity Analysis
• Automated Application Triage
• All relevant metrics
• Embedded expertise
• Enterprise perspective of true capacity
[Chart: Capacity Risk Candidates (OS, # of systems), 0% to 100%, by Under Used, Well Used, Stressed, Highly Stressed]
Integration With Business Metrics
System/Platform capacity data:
• Physical servers
• Virtual servers
• Tandem capacity systems
• Teradata capacity systems
• Datacenter facilities
Business perspective:
• Business transaction volumes
• Resource utilization
VMware Aggregate Capacity

Aggregate shows the worst individual status
VMware Aggregate Capacity

Aggregate shows the worst individual metric status
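
The "worst individual metric status" roll-up could be sketched as below. This is a minimal Python illustration, not the product's logic. The severity ordering (Well Used as best, then Under Used, Stressed, Highly Stressed) and the metric names are assumptions; the slides name the four states but do not rank them.

```python
from enum import IntEnum
from typing import Dict


class Status(IntEnum):
    """Capacity risk states from the dashboards; the ranking is assumed."""
    WELL_USED = 0
    UNDER_USED = 1
    STRESSED = 2
    HIGHLY_STRESSED = 3


def aggregate_status(metric_statuses: Dict[str, Status]) -> Status:
    """Return the worst (highest-ranked) status among the individual metrics.

    Raises ValueError if no metrics are supplied.
    """
    if not metric_statuses:
        raise ValueError("at least one metric status is required")
    return max(metric_statuses.values())


# Hypothetical metrics for one cluster.
cluster = {
    "cpu": Status.WELL_USED,
    "memory": Status.STRESSED,
    "datastore": Status.UNDER_USED,
}
overall = aggregate_status(cluster)
```

With this roll-up a single stressed metric is enough to mark the whole aggregate as stressed, which matches the slide's description.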
Lessons Learned / Value Gained
• Reduced service risk
• More proactive, less reactive
• Established a baseline to optimize capacity, and a
mechanism to measure the progress
• Business and IT alignment
• Performance and capacity to the business
• Management and technical personnel
• Launch slowly in phases to not overwhelm the groups
• People really do care about formatting and color
choice, not just content

Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

  • 1. Automating IT Analytics to Optimize Service Delivery and Cost at Safeway David Wagner – TeamQuest Advocate Chris Lynn - Safeway Capacity Manager/Performance Analyst December 11, 2013 TeamQuest and the TeamQuest logo are registered trademarks in the US, EU and elsewhere. All other © 2012 TeamQuest Corporation. the property Copyrighttrademarks and service marks are All Rights of their respective owners. Reserved.
  • 2. Agenda • TeamQuest Perspectives • Safeway Experiences Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 3. Desired State: Continuous Optimization • Continuously financially-optimized IT environment – Always know where and when performance problems will affect the bottom line – Identify cost and performance inefficiencies in support of business processes and eliminate them • Continuously optimized customer experience – Understand when, where and why customer experiences fail – Resolve, predict and prevent customer dissatisfaction issues Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 4. Continuous IT Optimization Results • Significantly reduce initial CapEx, and ongoing OpEx – Make, and keep making, more money! • Optimize resources for systems of customer engagement • Deploy and refresh new applications faster – e.g. Retailers need to capture their share of mobile commerce as it grows from $6 to $31B (2016) • Respond faster to business spikes • Prevent business impacting outages and slowdowns Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 5. How : Aligned Business and IT Analytics Big Data Collection Underlying IT Infrastructure Outage Management Customer Operations Distribution Automation Asset Management Services Enterprise IT Optimization • Correlate business & IT performance • Insight into how business processchanges impact IT Applicatio ns • Understand and optimize IT costs by business unit/process and technology Server/OS Network • Insight into business performance across technology stack Storage Business Intelligence Aligned Business and IT Intelligence Copyright © 2012 TeamQuest Corporation. All Rights Reserved. 5
  • 6. TeamQuest’s Approach: Federated IT Analytics • Federates existing data/information into purposedesigned optimization process – – – – Technology data (e.g. server, network, storage, etc.) Service data (catalog, metrics, tickets, etc.) Financial data Business data (analytics, KPIs, plans, TXNs, etc.) • Automates IT analytics across all data sources – Flexible and adaptive to dynamic environments – Raw (commodity) data -> actionable information for IT • Single-pane-of-glass IT Optimization Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 7. Result: Automated Application Financial Optimization • Continuous Optimization – Pre-purchase validation – Re-purposing – Consolidation • Fully automated, low cost • Integrated with Risk and Service Management • Changed new VMware Clusters from every 6 weeks to: – None for 18+ months… – Consolidated 1000’s of VM’s (Saving $M) Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 8. Result: IT Optimized... Future Assured • Continuous IT Optimization – Peak IT Performance – Ideal Resource Capacity – Optimized Resource Costs • Automated IT Analytics – Predictive – Federated • Aligned IT and Business Management – Performance – Capacity – Financial Copyright © 2012 TeamQuest Corporation. All Rights Reserved.
  • 9. Automating IT Analytics to Optimize Service Delivery and Cost at Safeway Chris Lynn - Safeway December 11, 2013 2:30-3:00
  • 10. Topics Background Server Storage Forecasting and Optimization Application Capacity Analysis – Dashboards to Details Business KPI Analytics Vmware High Level Analytics
  • 11. Background • Manager of Safeway Capacity and Performance Team • ChrisLynn@usa.com • http://www.linkedin.com/pub/chris-lynn/2/65/309/ • Environment Supported • ~4000 servers (~1700 physical) • ~200 significant applications • Unix, Windows, Mainframe, Teradata, Tandem, etc. • Thousands of internal IT Customers, and millions of shoppers
  • 12. Server Storage Forecasting and Optimization • • • • • Optimizing Availability (reducing incidents) Optimizing Enterprise Capacity Reducing Risk Automated replacing Manual Embedded expertise
  • 13. Storage Capacity Incident Avoidance: Old (on server) Manual Method $ df Filesystem size used avail capacity Mounted on /dev/vx/dsk/rootvol 3.9G 2.5G 1.4G 65% / /dev/vx/dsk/var 1.9G 832M 1.0G 45% /var swap 9.5G 16K 9.5G 1% /var/run swap 1.0G 2.6M 1021M 1% /tmp /dev/vx/dsk/patrol 1.9G 1.5G 227M 88% /appl/patrol /dev/vx/dsk/home 486M 347M 91M 80% /export/home /dev/vx/dsk/openv 1.4G 583M 717M 45% /usr/openv /dev/vx/dsk/performdg/usrlocal 1.9G 542M 1.3G 29% /usr/local /dev/vx/dsk/performdg/oracle 3.9G 1.1G 2.8G 28% /appl/oracle /dev/vx/dsk/performdg/apache 128M 27M 95M 22% /appl/apache /dev/vx/dsk/rootdg/opswarelv 241M 234K 216M 1% /var/opt/opsware /dev/vx/dsk/performdg/b1home 12G 432M 11G 4% /appl/perform/best1home /dev/vx/dsk/performdg/spool_apache 256M 145M 104M 59% /appl/spool/apache /dev/vx/dsk/performdg/manage 180G 122G 54G 70% /appl/perform/manager /dev/vx/dsk/performdg/workspace 90G 59G 29G 68% /appl/perform/workspace /dev/vx/dsk/performdg/collect 480G 442G 37G 93% /appl/perform/collect
  • 14. Automated Storage Forecasting: File System Exceptions • Weekly automated prioritized scan • 4500 servers • 45000 filesystems • Focused on meaningful exceptions • A proactive shift from find to fix • Was – 50 minutes looking for potential problems, 10 minutes to fix • Now – 5 minutes looking for potential problems, 55 minutes fixing them • Impossible to do manually
  • 15. New Automated (global exception) File System Forecast Analytic Details • Complex multi-level thresholds 1. Is file system utilization above 90% AND growing by >0.2% for the interval? 2. Is file system utilization above 75% AND growing by >2% for the interval? 3. Is file system utilization above 15% AND growing by >15% for the interval? 4. Is /appl/patrol above 90% AND growing for the interval? • Individual exclusions and special cases • Physical and virtual in same report, but can be treated uniquely • Sorted by date/time most likely to fill up • Show all candidates for a single server together (sorted by highest one) to minimize the time for operations to respond • Includes historical trend, compared to just a point in time (e.g. df) • Forecast utilization trend into the future (multiple statistical options) • Requires more than 24 hours of data, to avoid temporary file systems • Must have recent data, to avoid shut-down servers • Final measured number must not be below the threshold • Flags a final number >99.5%, which catches very full file systems that might not be growing
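The threshold questions on slide 15 can be sketched in a few lines of Python. This is a minimal illustration, not the TeamQuest or Safeway implementation: the `FsSample` fields, the function names, the 24-hour staleness cutoff, the reading of "growing by >X%" as percentage points over the interval, and the straight-line forecast are all assumptions made for the example (the slide mentions multiple statistical options without naming them).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FsSample:
    """One file system over the scan interval (hypothetical field names)."""
    server: str
    mount: str
    start_pct: float                # utilization at the start of the interval, percent
    end_pct: float                  # final measured utilization, percent
    interval_days: float            # length of the interval
    hours_of_data: float            # how much history exists for this file system
    hours_since_last_sample: float  # age of the newest measurement


# (utilization floor in percent, minimum growth in percentage points per interval)
TIERS = [(90.0, 0.2), (75.0, 2.0), (15.0, 15.0)]
MAX_STALE_HOURS = 24.0  # assumed cutoff for "recent data"; the slide gives no number


def is_exception(fs: FsSample) -> bool:
    """Apply the slide's guards and multi-level thresholds to one file system."""
    if fs.hours_of_data <= 24.0:
        return False  # too little history: likely a temporary file system
    if fs.hours_since_last_sample > MAX_STALE_HOURS:
        return False  # no recent data: likely a shut-down server
    if fs.end_pct > 99.5:
        return True   # very full, flagged even if it is not growing
    growth = fs.end_pct - fs.start_pct
    if fs.mount == "/appl/patrol" and fs.end_pct > 90.0 and growth > 0.0:
        return True   # special case 4 on the slide
    return any(fs.end_pct > floor and growth > min_growth
               for floor, min_growth in TIERS)


def days_until_full(fs: FsSample) -> Optional[float]:
    """Straight-line projection to 100%; None when utilization is not growing."""
    growth = fs.end_pct - fs.start_pct
    if growth <= 0.0:
        return None
    return (100.0 - fs.end_pct) / (growth / fs.interval_days)


def prioritize(samples: List[FsSample]) -> List[FsSample]:
    """Keep only exceptions, ordered by how soon each is projected to fill."""
    def urgency(fs: FsSample) -> float:
        days = days_until_full(fs)
        # A flagged file system with no growth is the >99.5% case: most urgent.
        return 0.0 if days is None else days
    return sorted((fs for fs in samples if is_exception(fs)), key=urgency)
```

Using the `/appl/perform/collect` row from slide 13 as the final reading, a sample that grew from 92% to 93% over seven days would be flagged by the first tier. The sketch leaves out the per-server grouping and the individual exclusions that the slide also lists.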
  • 16. Executive Capacity Dashboards [Chart: Capacity Risk Indicators by status: Stressed, Highly Stressed, Under Used, Well Used]
  • 17. Application Capacity Dashboards [Chart: Capacity Risk Indicators, 0% to 100%, by status: Highly Stressed, Stressed, Well Used, Under Used]
  • 18. Application Capacity Analysis • Automated Application Triage • All relevant metrics • Embedded expertise • Enterprise perspective of true capacity [Chart: Capacity Risk Candidates (OS, # of systems), 0% to 100%, by status: Under Used, Well Used, Stressed, Highly Stressed]
  • 19. Integration With Business Metrics System/Platform capacity data: • Physical servers • Virtual servers • Tandem capacity systems • Teradata capacity systems • Datacenter facilities Business perspective: • Business transaction volumes • Resource utilization
  • 20. VMware Aggregate Capacity Aggregate shows the worst individual status
  • 21. VMware Aggregate Capacity Aggregate shows the worst individual metric status
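A "worst individual metric status" roll-up like the one on slides 20 and 21 could look roughly like the sketch below. The `Status` enum, the function names, and the host and metric names are hypothetical. The severity order follows the order of the dashboard legend (Under Used, Well Used, Stressed, Highly Stressed), and ranking Well Used above Under Used is an assumption for the example, since the slides do not say how those two compare.

```python
from enum import IntEnum
from typing import Iterable, Mapping


class Status(IntEnum):
    """Capacity risk statuses; a higher value is treated as worse (assumed order)."""
    UNDER_USED = 0
    WELL_USED = 1
    STRESSED = 2
    HIGHLY_STRESSED = 3


def worst(statuses: Iterable[Status]) -> Status:
    """Return the most severe status in a non-empty collection."""
    items = list(statuses)
    if not items:
        raise ValueError("cannot aggregate an empty set of statuses")
    return max(items)


def aggregate_status(hosts: Mapping[str, Mapping[str, Status]]) -> Status:
    """Roll metric statuses up to each host, then hosts up to one aggregate."""
    return worst(worst(metrics.values()) for metrics in hosts.values())


# Hypothetical example: one stressed memory metric drives the whole aggregate.
example = {
    "host-a": {"cpu": Status.WELL_USED, "memory": Status.STRESSED},
    "host-b": {"cpu": Status.UNDER_USED, "memory": Status.WELL_USED},
}
print(aggregate_status(example).name)
```

Taking the maximum means a single stressed metric is never averaged away by many healthy ones, which fits a dashboard meant to surface risk first.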
  • 22.
  • 23. Lessons Learned / Value Gained • Reduced service risk • More proactive, less reactive • Established a baseline to optimize capacity, and a mechanism to measure the progress • Business and IT alignment • Performance and capacity to the business • Management and technical personnel • Launch slowly, in phases, so as not to overwhelm the groups • People really do care about formatting and color choice, not just content

Editor's Notes

  1. Our way is better, faster, and more cost effective than the alternatives: Build a huge “data mart” (i.e. PMDB): Complexity = (data ETL) x (# sources) x (maint. effort) x (SDDC variability/dynamism) x … + Compliance: Data duplication, privacy, audit, etc… + “Lock in” = Very costly and time-consuming. Or apply general purpose BI analytics to IT challenges they aren’t designed to handle: Answers business questions, but… Not focused on IT resource optimization, performance, capacity. Agility? Core competence?
  2. 4,000 systems (physical and virtual) evaluated in 1 day for 6-17 metrics each against platform specific thresholds for a 30 day history. 16,000 capacity risk indicators from 40,000 metric checks on 4,000 systems. 16,000 capacity risk checks resulting in 3% highly stressed and 6% stressed.
  3. 5 CPU metrics; 7 Memory metrics; 3 IO metrics; 2 Network metrics. Evaluated at 4 levels of criticality per metric group. Unique thresholds per platform. Aggregated to an overall server capacity rating. Only systems with a concern have detailed charts created. Visually obvious which metrics are of concern. 4,000 systems (physical and virtual) evaluated in 1 day for 6-17 metrics each against platform specific thresholds for a 30 day history. 16,000 capacity risk indicators from 40,000 metric checks on 4,000 systems. 16,000 capacity risk checks resulting in 3% highly stressed and 6% stressed.
  4. Phvg06-Prod DMZ; Phvg07-Prod Server Farm; Phvg08-NonProd DMZ; Phvg11-NonProd DMZ; Phvg09-NonProd Server Farm; Phvg12-14 = WISE