
Platform Health Assessment at Department of Homeland Security Citizenship and Immigration Services

SpringOne Platform 2019
Session Title: Platform Health Assessment at Department of Homeland Security Citizenship and Immigration Services
Speakers: Chris Saunders, Platform Architect Manager, Pivotal; Kelly Walsh, Engagement Director, Pivotal; Paul Beccio, Developer, DHS USCIS
YouTube: https://youtu.be/LZsqqSH9VbI

Published in: Software

  1. Platform Health Assessment at Department of Homeland Security Citizenship and Immigration Services. October 7–10, 2019, Austin Convention Center.
  2. Hello!
     ● Matt Dosberg, Chief, DID(it). Loves mission, progressive tech, and working with talented team. Hands full with two small kids.
     ● Paul Beccio, SRE, USCIS
     ● Chris Saunders, Platform Architect Manager for Federal. Successful Customers Matter.
     ● Kelly Walsh, Engagement Director for New England Region
     Unless otherwise indicated, these slides are © 2013-2019 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
  3. USCIS Mission
  4. U.S. Citizenship & Immigration Services
     U.S. Citizenship and Immigration Services administers the nation’s lawful immigration system, safeguarding its integrity and promise by efficiently and fairly adjudicating requests for immigration benefits while protecting Americans, securing the homeland, and honoring our values.
  5. Fiscal Year 2018 Snapshot
     ● 8.7 Million Receipts
     ● 757,000 Naturalizations
     ● 1.1 Million Lawful Permanent Residents
     ● 2.1 Million Employment Authorization Cards
     ● 37 Million New Hires Verified
  6. Asylum Program
     Every year people come to the United States seeking protection because they have suffered persecution or fear that they will suffer persecution due to:
     - Race
     - Religion
     - Nationality
     - Membership in a particular social group
     - Political opinion
     Types of Asylum:
     - Affirmative
     - Defensive
     - Credible Fear
     - Reasonable Fear
  7. Digital Innovation & Development - DID(it)
     Provides cutting-edge application development and associated IT DevSecOps services, applying best-of-breed Agile and Lean Startup practices to support web and mobile application development for existing and new systems built on a mix of legacy and emerging technologies.
  8. Strong Platform Enables Mission Outcomes
     ● Reduced Processing Time
     ● Increased Integrity and Safeguarding
     ● Operational Efficiencies
     ● Team Responsiveness
  9. State of the Platform in Fall 2018
     ● Matrixed Ops Team with Shared Responsibilities
     ● Firefighting!
     ● Behind on Platform Upgrades
     ● Lack of Platform Monitoring
     ● Platform not a Product
  10. Journey Health Markers
  11. Customer Journeys (stages, in order: Start, Ingrain, Scale)
      ● Platform: Launch Platform Capability; Extend the Platform as a Product; Stay on Track with Continuous Improvement
      ● Application (new): Tackle a Meaningful Problem with Custom Software; Enable Practice Leaders; Foster New Culture and Continue Learning
      ● Application (existing): Replatform and Modernize; Construct and Begin Enterprise AppTX Plan; Execute on AppTX Plan
      ● Outcomes: Cost Savings; Faster Releases; Reduce Risk; Stable Software; Get to Scale
  12. Customer Journeys (the diagram from slide 11, repeated)
      Journey Health Markers are how we find out where a client is on the journey (and focus on what they need to do next).
  13. Platform Maturity Matrix Dimensions
      ● Balanced Team
      ● Business Continuity
      ● Platform as a Product
      ● Path to Production
      ● Performance Optimization
      ● Monitoring and Metrics
      ● Capacity Planning
      ● Platform Update Engine
      ● Emergency Response
      ● Self-Service
  14. Monitoring and Metrics
      Establishing desired service behavior, measuring how the service is actually behaving, and correcting discrepancies. Examples: response latency, error or unanswered query rate, peak utilization of resources.
      1 – Chaotic: No service level indicators or objectives (SLI/SLO) defined. No automated monitoring. Users report issues, and application / platform teams are not aware of the issues until they are reported.
      2 – Defined: Clearly defined ownership of monitoring. Some SLI/SLO definition, but monitoring solutions don't provide clear optics to all the right things. Basic platform metrics are being sent to something like ELK, Splunk or Prometheus.
      3 – Managed: Monitoring provides visibility and appropriate alerts are sent when defined thresholds are met.
      4 – Measured: Monitoring and alerting strategies are adjusted as a response to violations of SLOs. They incorporate and adapt to customer feedback on a regular basis.
      5 – Continuous Improvement: The team iterates on new monitoring graphs, and continually and proactively tweaks the alerting strategy to align with SLOs and minimize false alerts.
      © Copyright 2018–19 Pivotal Software, Inc. All Rights Reserved
  15. Facilitate, Introduce, Assess, Recommend
  16. Facilitate, Introduce, Assess, Recommend. Clients.
  17. CIS Health Markers
  18. Health Markers Self-Assessment, 12/13/18 (Platform Journey)
      ● Platform as a Product: 2.1
      ● Balanced Team: 2.4
      ● Path to Production: 3.1
      ● Monitoring and Metrics: 2.3 (Priority 1)
      ● Capacity Planning: 1.3
      ● Change Management: 2.0
      ● Emergency Response: 2.0 (Priority 2)
      ● Self Service: 3.0
      ● Performance Optimization: 1.7
      ● Business Continuity: 1.3 (Priority 3)
      Chart axis: scores from 1 to 5 across the journey stages Start, Ingrain, Scale.
  19. Priority 1: Monitoring and Metrics
      Establishing desired service behavior, measuring how the service is actually behaving, and correcting discrepancies. Examples: response latency, error or unanswered query rate, peak utilization of resources.
      Needs work:
      ● Desired service behavior not tied to specific application needs
      ● Process for monitoring is unclear/confusing
      ● We have a sense there are more tools we could be using
      Working well:
      ● PCF 2.2 upgrade getting us closer
      ● We have had experiences with useful alerts (Redis)
      ● New Relic integration with Slack is promising
      On a 1-5 scale, we give ourselves a: 2.3
      Pivotal’s recommendations (things we can work on together at no cost):
      ● Set up a time for Pivotal to host a “Healthwatch 101” class; get to know functionality for metrics and event alerts
      ● Start a platform team book club to read Google’s SRE book so everyone is on the same page and learning together
      ● Identify most meaningful metrics and event alerts and start monitoring!
  20. Priority 2: Emergency Response
      Noticing and responding effectively to service failures in order to preserve the service's conformance to SLA. Examples: on-call rotations, prober, dip detection, primary/secondary/escalation, playbooks, wheel of misfortune, prod VPN rooms.
      Needs work:
      ● No engineer on call for ops
      ● Unclear who on the team should respond to emergencies
      ● No DR/COOP plan
      Working well:
      ● The team has internal and Pivotal resources they can reach out to if something goes wrong
      ● Pivotal SLAs have been helpful
      ● Wiki docs + an enthusiastic team!
      On a 1-5 scale, we give ourselves a: 2.0
      Pivotal’s recommendations (things we can work on together at no cost):
      ● Pivotal can provide example playbooks and SLAs from other clients who have successfully tackled Emergency Response
      ● Establish priority (T1, T2, etc.) among applications in the DID app portfolio
      ● Create a first-draft Emergency Response document (the Google SRE book is a helpful guide)
      ● Establish regular cadence with developer teams to anticipate ER needs
  21. Priority 3: Business Continuity
      Treat the platform like critical infrastructure with published RPO/RTOs that satisfy business requirements. Proving viability of disaster recovery plans by restoring platforms to a known good state through the use of automation.
      Needs work:
      ● Right now, we have to rebuild manually in order to recover
      ● We probably take it for granted that PCF is always reliable, so there is a lot we haven’t covered on our end
      Working well:
      ● Concourse has been great
      ● PCF automates a lot of the work required to rebuild
      On a 1-5 scale, we give ourselves a: 1.3
      Pivotal’s recommendations (things we can work on together at no cost):
      ● Interview application dev teams to understand DR needs
      ● Consider setting up an active-active architecture
      ● Validate backups, consider exercising DR with sandbox environments
  22. 2nd Health Markers Self-Assessment, 10/2/19 (Platform Journey; chart legend: 12/13/18, 10/02/19)
      Labels in the order they appear on the slide: Priority 1; Platform as a Product: 2.1; Balanced Team: 2.4; Priority 2; Path to Production: 3.1; Monitoring and Metrics: 2.3; Capacity Planning: 1.3; Platform Upgrade Engine: 2.0; Priority 3; Emergency Response: 2.0; Self Service: 3.0; Performance Optimization: 1.7; Business Continuity: 1.3
      Chart axis: scores from 1 to 5 across the journey stages Start, Ingrain, Scale.
  23. The Alliance - Pivotal Platform Value Metrics
      ● Speed: daily releases to production; deployment timelines for new features < 10 minutes; developer self-service (Redis, Push)
      ● Stability, Scalability: 300+ containers across environments; business-critical systems running within PCF (Global); < 2 mins to scale applications
      ● Security: 0 hours spent planning security and patching; 44 CVEs resolved YTD with no downtime; 100% of workloads created from security-approved buildpacks; 200+ VMs replaced and hardened with latest security patches through automated pipelines
      ● Savings: 30+ year old mainframe application decommissioned; mainframe computers turned off, saving $10,000,000/year
      ● Also on the slide: 0 minutes downtime during patching & routine maintenance; 3 foundations; 100% of platform life-cycle driven by automated pipelines and Infrastructure as Code (Q4)
  24. We’re Hiring!
      ● Product Managers
      ● Product Designers
      ● Software Engineers
      Come help solve complex problems using progressive tech alongside a very talented team that supports the USCIS RAIO mission to provide immigration and humanitarian services for people who are fleeing oppression, persecution or torture and facing urgent humanitarian situations.
      Want to learn more? Come up and say hello. Or reach out to Matthew.W.Dosberg@uscis.dhs.gov
  25. Thank you!
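
Slide 14 describes monitoring as establishing desired service behavior, measuring how the service actually behaves, and correcting discrepancies, with alerts sent "when defined thresholds are met" at the Managed level. A minimal sketch of that loop in Python follows. The names `Slo`, `error_rate`, and `check`, and every threshold value, are hypothetical illustrations and do not come from the deck or from any Pivotal tooling.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Slo:
    """A hypothetical service level objective for one indicator."""
    name: str
    threshold: float  # maximum acceptable value of the indicator


def error_rate(failed: int, total: int) -> float:
    """SLI: fraction of requests that failed (0.0 when there was no traffic)."""
    return failed / total if total else 0.0


def check(slos: Dict[str, Slo], measurements: Dict[str, float]) -> List[str]:
    """Return one alert message per indicator that exceeds its objective."""
    alerts = []
    for key, slo in slos.items():
        value = measurements.get(key)
        # Indicators with no measurement are skipped rather than alerted on.
        if value is not None and value > slo.threshold:
            alerts.append(
                f"{slo.name}: measured {value:.3f} exceeds {slo.threshold:.3f}"
            )
    return alerts


# Assumed objectives: at most 1% failed requests, p95 latency under 0.5 s.
slos = {
    "error_rate": Slo("Error rate", 0.01),
    "latency_p95": Slo("p95 latency (s)", 0.5),
}
measurements = {"error_rate": error_rate(5, 100), "latency_p95": 0.2}

for alert in check(slos, measurements):
    print(alert)
```

With the assumed numbers above, only the error-rate indicator breaches its objective. Moving from Managed toward Measured, in the deck's terms, would mean revisiting these thresholds whenever an SLO violation or customer feedback shows they are wrong.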
