Brighttalk high scale low touch and other bedtime stories - final


Published in: Technology, Business

  1. 1. High Scale Low Touch and Other Bedtime Stories
  2. 2. Andrew White, Cloud and Smarter Infrastructure Solution Specialist, IBM Corporation. Mr. White has fifteen years of experience designing and managing the deployment of Systems Monitoring and Event Management software. Prior to joining IBM, Mr. White held various positions, including leader of the Monitoring and Event Management organization of a Fortune 100 company and consultant developing solutions for a wide variety of organizations, including the Mexican Secretaría de Hacienda y Crédito Público, Telmex, Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the US Navy Facilities and Engineering Command.
  3. 3.!
  4. 4. Ground rules for this session… •  If you can’t tell if I am trying to be funny… –  GO AHEAD AND LAUGH! •  Feel free to text, tweet, yammer, or whatever during this talk. I appreciate the real time feedback from you. •  If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.
  5. 5. I am here today to share some of what I have learned about Orchestration and Anti-patterns
  6. 6. What Is a System? It is a set of interconnected actors that change over time when they are influenced by other elements of the system. [Diagram: eight interconnected actors]
  7. 7. Systems are Volatile This change makes it difficult to control the behavior of the system. The good news is that systems are perfect. They always deliver the optimum result given a specific stimulus.
  8. 8. Rare Events “one chance in a million” will undoubtedly occur, with no less and no more than its appropriate frequency, however surprised we may be that it should occur to us. Sir Ronald A. Fisher © Aquire Inc. 2012
  9. 9. The Gaussian Bell Curve Mean (μ). Within ±1σ of the mean: about 68%. Within ±2σ: about 95%. Within ±3σ: about 99.7%.
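
The coverage figures on the bell-curve slide can be checked directly from the normal distribution. The following is a minimal sketch using only the Python standard library; the function name `coverage_within` is illustrative and not from the deck.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normally distributed value lies within k standard
    deviations of the mean, computed from the error function."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints the share of observations expected inside the +/- k sigma band.
    print(f"within +/-{k} sigma: {coverage_within(k):.2%}")
```

The values come out at roughly 68%, 95% and 99.7%, which also means about 3 observations in 1,000 fall outside three standard deviations: rare, but far from impossible at high scale.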
  10. 10. Two Important Properties •  The causal effect between two actors will always impact the entire system •  Correlation != Causation
  11. 11. How Fire Works Fire = Oxygen -AND- Heat -AND- Fuel -AND- Match Strike. Conditions: Oxygen, Heat, Fuel. Action: Match Strike. • Actions are momentary and act as a catalyst to bring about change • Conditions are stable and exist over time
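
The fire slide's rule (an effect needs every condition to hold AND a triggering action) can be written as a small check. This is a sketch under assumptions: the `Effect` class and `occurs` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effect:
    name: str
    conditions: frozenset  # stable states that must already exist
    action: str            # momentary trigger that acts as the catalyst


def occurs(effect: Effect, present_conditions, triggered_action) -> bool:
    """The effect happens only if ALL conditions are present AND the action fires."""
    all_conditions_hold = effect.conditions <= set(present_conditions)
    return all_conditions_hold and triggered_action == effect.action


fire = Effect("Fire", frozenset({"Oxygen", "Heat", "Fuel"}), "Match Strike")

print(occurs(fire, {"Oxygen", "Heat", "Fuel"}, "Match Strike"))  # all present
print(occurs(fire, {"Oxygen", "Fuel"}, "Match Strike"))          # Heat missing
print(occurs(fire, {"Oxygen", "Heat", "Fuel"}, None))            # no action
```

Removing any single condition, or the action, prevents the effect. That is why root cause analysis looks for every contributing cause rather than a single culprit.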
  12. 12. A Real World Example Customers Complaining; Web Server returning 500 errors; The application server was timing out; SQL Server was not processing queries; Transaction log was unable to grow; T: Drive at 0 Bytes free; Logs were not truncated; DBA on honeymoon vacation in Fiji; Logs are truncated manually; Company has only 1 DBA; “Backup” DBA was not aware the logs require truncation; Space allocations are fixed; Lack of Control; Only one database cluster in use; DR SQL Cluster; DR Cluster being used for UAT testing; More Information Needed; Only one application server exists; More Information Needed; Trying to do business on the website; Desired Condition. Causes at each level of the diagram are joined by -AND-.
  13. 13. Hidden Factors Hidden Factor Smoking Lung Cancer
  14. 14. A failure is not always a mistake, it may simply be the best one can do under the circumstances. -- B.F. Skinner, American Psychologist & Professor, Harvard University
  15. 15. Uh oh. Two independent thought alarms in one day. The students are over-stimulated. Willie! Remove all the colored chalk from the classrooms. --B.F. Skinner, Principal, Springfield Elementary School
  16. 16. You can't judge my choices without understanding my reasons. -Unknown
  17. 17. Feedback Loops Unfortunately, feedback has taken on both positive and negative connotations. In reality, positive feedback is not “praise” and negative feedback is not “criticism.” Positive feedback reinforces, while negative feedback balances. Diagram labels: Profits, Productivity, Cost Cutting, Reinforcing, Balancing
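
The difference between reinforcing and balancing feedback is easy to see in a toy simulation. This is a hedged sketch, not a model from the deck: the update rules, the gain values and the function names are assumptions chosen for illustration.

```python
def reinforcing(value: float, gain: float, steps: int) -> list:
    """Positive feedback: each step adds a share of the current value,
    so change amplifies itself."""
    history = [value]
    for _ in range(steps):
        value += gain * value
        history.append(value)
    return history


def balancing(value: float, target: float, gain: float, steps: int) -> list:
    """Negative feedback: each step closes part of the gap to a target,
    so the system settles instead of running away."""
    history = [value]
    for _ in range(steps):
        value += gain * (target - value)
        history.append(value)
    return history


print([round(v, 1) for v in reinforcing(100.0, 0.1, 5)])
print([round(v, 1) for v in balancing(100.0, 50.0, 0.5, 5)])
```

The first series keeps growing; the second converges on the target. Neither is "good" or "bad" in itself, which is the point the slide makes about praise and criticism.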
  18. 18. The Agile Value Proposition Availability Change Frequency Change Size Change Capability Change Risk (-) (+) (+) (-) (-) Adapted From:
  19. 19. Customer Satisfaction Availability Change Frequency Change Size Change Capability Change Risk (-) (+) (+) (-) (-) Business Value Business Demand Change Backlog (+) (+) (+) (+) (-) (+) Adapted From:
  20. 20. Be Careful of Good Intentions Availability Change Frequency Change Size Change Capability Change Risk (-) (+) (+) (-) (-) Business Value Business Demand Change Backlog (+) (+) (+) (+) (-) (+) Change Process Release Process (+) (+) (-) (+) (+) Adapted From:
  21. 21. Be Careful of Good Intentions Availability Change Frequency Change Size Change Capability Change Risk (-) (+) (+) (-) (-) Business Value Business Demand Change Backlog (+) (+) (+) (+) (-) (+) Change Process Release Process (+) (+) (-) (+) (+) Change Automation Adapted From: (+) (-)(-)
  22. 22. The trick is not to spend our time trying to get better at predicting this world, or making it more predictable, for both of these strategies are bound to fail. - Nassim Nicholas Taleb, Author and Philosopher
  23. 23. Woteva you say, man!
  24. 24. A new set of challenges Today’s IT infrastructures are too complex, provide poor scalability, and are slow to keep up with today’s rapid rate of change. Increasing Complexity: Heterogeneous environments; Organizational silos; Skill gaps. Massive Scale: Users, transactions, data; Rapid demand cycles; Unpredictable. Rapid Pace: Evolving ecosystem; Minimize time to value; Accelerating business needs. Workload View: Traditional (Systems of Record), Emerging (Systems of Interaction)
  25. 25. Workloads are volatile Volatile workload characteristics result from changing business requirements. Traditional: Few, stable, and well known workloads; Fixed system hardware, manual scaling; Hardwired workload, minimal configuration. Current: Diverse workload, limited patterns; Homogeneous resource pooling; Expert configuration and mapping of workload. Future: Rapidly changing workloads, dynamic patterns; Dynamic automatic composition of heterogeneous system; Autonomic and proactive management
  26. 26. A new approach to infrastructures Simplified Reduce infrastructure complexity, specialization and support Adaptive Ensure service qualities through policy based, scalable infrastructure automation Responsive Accelerate application lifecycle and workload deployment through rapid change Resource Smart Application Aware Definitions Patterns Analytics Compute Network Storage Software Defined Environments are optimized to deliver the agility, efficiency and performance needed for today’s workloads Software Defined Environment
  27. 27. What are Software Defined Environments Resource Smart Application Aware Definitions Patterns Analytics Compute Network Storage With a Software Defined Environment, business users can describe their requirements of the IT environment in a systematic way that in turn drives automation of the infrastructure Software Defined Environments Abstracted and virtualized IT infrastructure resources managed by software Applications that define infrastructure requirements and configuration IT infrastructure that extends multiple environments to go beyond the data center
  28. 28. Fully virtualized, integrated & programmable infrastructure Elastically scalable resources available on-demand Intelligent resource scheduling Infrastructure that captures workload requirements and deployment best practices Policy-based automation across infrastructure Analytics to optimize the environment in real-time The next-gen infrastructure for HSLT Resource Smart Application Aware Definitions Patterns Analytics Compute Network Storage Software Defined Environment Application Aware that understands the unique workload requirements Resource Smart that dynamically allocates infrastructure based on policies
  29. 29. Leverage best practices with patterns of expertise What the business wants: • Define business needs • Identify service opportunities and requirements • Quickly experiment and test new services What Software Defined Environments provide: • Patterns of Expertise to link solution to infrastructure based on business rules • Automated orchestration of workloads • Analytics-based optimization of workload to maximize outcomes Diagram labels: Solution Definition; Priority, Policy, Service; Continuous Optimization
  30. 30. What’s the best infrastructure for my cloud? How do I manage & secure my hybrid environment? How do I maintain choice and flexibility? How do I rapidly deploy & operate my cloud? Are you building the right cloud?
  31. 31. Anti-Pattern: a commonly used process structure or pattern of action that, despite initially appearing to be an appropriate and effective response to a problem, typically has more bad consequences than beneficial results. -Wikipedia
  32. 32. Benefits Consequences Related Patterns How to recognize anti-patterns Problem Solution Context and Forces Design Pattern Benefits Anti-Pattern Solution Refactored Solution Consequences Symptoms and Consequences Related Patterns Design Anti-Pattern Contextual Causes
  33. 33. Are Today’s Good Practices Tomorrow’s Anti-Patterns?
  34. 34. Architecture by Accident The Humble Start… Meeting Demand… The First Bottleneck… The Second Bottleneck… Becoming Mission Critical… Enabling SOA… The Fun Begins… How Did We Get Here?
  35. 35. Everything fails, all the time. - Werner Vogels, CTO Amazon
  36. 36. Six Types of Socratic Questions Questions that Clarify: • What does that mean? • Can you explain that? Questions that Challenge: • Is that always the case? • How can you verify or disprove? Questions that Examine: • What causes that to happen? • What would be an example? Questions that Expand: • Why is that view best? • What is another way to look at it? Questions that Assess: • What are the consequences? • What are you implying? Questioning the Question: • What was the point of that question? • How does that question apply to us?
  37. 37. Questions we need to ask •  What kind of scenarios do I need to plan for? •  Where are my single-points-of-failure? •  If I use a master/slave architecture, what happens when I lose the master? •  What happens when my load balancer fails? •  How will I replace a node when it fails? •  How will I recognize a failure when it occurs? •  What if the cache grows beyond the memory limit of the instance? •  What if a downstream service times out or returns an exception?
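
One of the questions above, what happens when a downstream service times out or returns an exception, can be answered in code by bounding every call and deciding the fallback up front. The sketch below uses the standard `concurrent.futures` module; `call_with_timeout`, `slow_service` and the fallback value are hypothetical names for illustration.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_s: float, fallback):
    """Run func in a worker thread; return fallback if it is too slow or raises."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return fallback  # downstream service did not answer in time
    except Exception:
        return fallback  # downstream service raised an error
    finally:
        # Do not block the caller waiting for a slow worker to finish.
        pool.shutdown(wait=False)


def slow_service():
    time.sleep(0.3)  # stands in for a hung downstream dependency
    return "fresh data"


print(call_with_timeout(lambda: "fresh data", 1.0, "cached data"))
print(call_with_timeout(slow_service, 0.05, "cached data"))
```

Note that the worker thread is not cancelled; it simply stops being waited on. A real client would also want cancellation or connection-level timeouts.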
  38. 38. Requirements for cloud scalability •  Build process threads that resume on reboot •  Allow for state to re-sync using message queues •  Prepare pre-configured and pre-optimized virtual images for quick launches •  Avoid in-memory data stores •  Have a good DR strategy and then automate it •  Create applications as loosely coupled systems to enable better fault tolerance and larger scale
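
The first two requirements, work that resumes after a reboot and state that re-syncs from a queue, can be sketched with a durable checkpoint. This is a minimal illustration under assumptions: the queue is simulated by a replayable list of `(sequence, payload)` pairs, and `ResumableWorker` is a hypothetical class, not a product feature.

```python
import json
import os
import tempfile


class ResumableWorker:
    """Processes sequenced messages and records progress on disk,
    so a restarted process skips what was already handled."""

    def __init__(self, checkpoint_path: str):
        self.checkpoint_path = checkpoint_path
        self.last_done = self._load()

    def _load(self) -> int:
        try:
            with open(self.checkpoint_path) as fh:
                return json.load(fh)["last_done"]
        except FileNotFoundError:
            return -1  # first start: nothing processed yet

    def _save(self) -> None:
        # Write to a temp file, then rename, so a crash never leaves
        # a half-written checkpoint behind.
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump({"last_done": self.last_done}, fh)
        os.replace(tmp, self.checkpoint_path)

    def run(self, messages, handler, max_messages=None) -> int:
        handled = 0
        for seq, payload in messages:
            if seq <= self.last_done:
                continue  # already processed before the restart
            if max_messages is not None and handled >= max_messages:
                break  # simulates the process stopping part-way through
            handler(payload)
            self.last_done = seq
            self._save()
            handled += 1
        return handled


queue = [(0, "a"), (1, "b"), (2, "c")]  # stands in for a replayable message queue
seen = []
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "checkpoint.json")
    ResumableWorker(path).run(queue, seen.append, max_messages=2)  # "crash" after two
    ResumableWorker(path).run(queue, seen.append)                  # restart, resume
print(seen)
```

Because the checkpoint is saved after the handler runs, a crash between the two can replay one message, so handlers should be idempotent.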
  39. 39. Don’t forget about the cloud-as-a-system view
  40. 40. Where Does Over-Subscription Occur? Common Locations: 1. Hypervisor 2. CPU Cycles 3. Memory 4. Blade Backplane I/O 5. SAN Fabric 6. Network Interfaces 7. Host Bus Adapters 8. Backup Device 9. WAN Circuits 10. Storage Processors Diagram labels: Corporate LANs & VPNs, Load Balancers, Firewall, Switch, VM Server Farm, Database, NAS Appliances, Storage Frame, Web Servers
  41. 41. The scalability of an application is defined by whether it can accommodate changes in traffic without requiring changes in architecture.
  42. 42. Service Orientation Goals of Service Orientation: Abstraction, Loose Coupling, Autonomy, Standard Services, Composability, Reusability
  43. 43. Or·ches·tra·tion [AWR-kuh-strey-shun] 1. The act of arranging a piece of music 2. The planning or execution of events in order to achieve a desired effect 3. The technique of arranging or manipulating, especially by means of clever or thorough planning or maneuvering •  A central process controls everything and coordinates the execution of the different operations involved in the process •  The services do not “know” that they are involved in a composite process •  Only the central coordinator of the orchestration is aware of the desired outcome •  The orchestration leverages explicit process definitions to operate the services in the correct order of invocation
  44. 44. Orchestration Illustration Orchestrator Web Service 1 Web Service 4 Web Service 3 Web Service 2
  45. 45. Cho·re·og·ra·phy [kawr-ee-OG-ruh-fee] 1. The art of composing ballets and other dances 2. The method of representing the various movements in dancing by a system of notation 3. The arrangement or manipulation of actions leading up to an event •  Choreography does not rely on a central coordinator •  Each service knows exactly when to execute and with whom to interact •  Focuses on the exchange of messages and information •  All services need to be aware of the business process, operations to execute, messages to exchange, timing, etc.
  46. 46. Choreography Illustration Web Service 1 Web Service 4 Web Service 3 Web Service 2 Send Receive Invoke Invoke Invoke
  47. 47. Choreography vs. Orchestration •  From the perspective of composing services to execute business processes, orchestration is a more flexible paradigm and has the following advantages over choreography: –  The coordination of component processes is centrally managed by a known coordinator. –  Web services can be incorporated without their being aware that they are taking part in a larger business process. –  Alternative scenarios can be put in place in case faults occur.
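
The contrast between the two styles can be sketched side by side. This is an illustration under assumptions, not the deck's implementation: the three order-handling services, the topic names and the `EventBus` class are all hypothetical.

```python
from collections import defaultdict


# Three independent services; none of them knows about a larger process.
def reserve_stock(order):
    return {**order, "stock": "reserved"}


def charge_card(order):
    return {**order, "payment": "charged"}


def ship(order):
    return {**order, "shipment": "scheduled"}


# Orchestration: one coordinator owns the explicit process definition.
def orchestrate(order, steps, on_fault):
    for step in steps:
        try:
            order = step(order)
        except Exception as exc:
            # Alternative scenario is decided centrally, in one place.
            return on_fault(order, step.__name__, exc)
    return order


# Choreography: services react to each other's events; no coordinator.
class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in list(self.subscribers[topic]):
            handler(message)


def build_choreography(bus, completed):
    # Each service must know which event to listen for and which to emit.
    bus.subscribe("order.placed", lambda o: bus.publish("stock.reserved", reserve_stock(o)))
    bus.subscribe("stock.reserved", lambda o: bus.publish("payment.charged", charge_card(o)))
    bus.subscribe("payment.charged", lambda o: bus.publish("order.shipped", ship(o)))
    bus.subscribe("order.shipped", completed.append)


print(orchestrate({"id": 1}, [reserve_stock, charge_card, ship],
                  on_fault=lambda o, name, exc: {**o, "failed_at": name}))

done = []
bus = EventBus()
build_choreography(bus, done)
bus.publish("order.placed", {"id": 2})
print(done)
```

In the orchestrated version the order of invocation and the fault path live in one function. In the choreographed version the same knowledge is spread across the subscriptions, which is the trade-off the slide describes.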
  48. 48. Orchestration Requirements •  Event-based processing •  Coordinate asynchronously between services •  Correlate messages being exchanged •  Provide for parallel processing •  Allow for transaction roll-back •  Manipulate and transform data between messaging partners •  Be able to manage long running business transactions and activities •  Have a robust mechanism for fault and error handling
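
Of the requirements above, transaction roll-back is the one most often left until too late. A common way to approximate it across services is to pair every step with a compensating action and undo completed steps in reverse order on failure. The sketch below shows that idea; the step names and the `run_with_rollback` helper are assumptions for illustration, not part of any orchestration product.

```python
def run_with_rollback(steps, context):
    """Run (name, do, undo) steps in order. If one fails, undo the steps that
    already completed, newest first, and report the failure."""
    completed = []
    try:
        for name, do, undo in steps:
            do(context)
            completed.append((name, undo))
    except Exception as exc:
        for name, undo in reversed(completed):
            undo(context)
        return False, str(exc)
    return True, None


def log(entry):
    # Helper that builds a step which just records what happened.
    return lambda ctx: ctx["log"].append(entry)


def fail_to_ship(ctx):
    raise RuntimeError("carrier unavailable")


steps = [
    ("reserve", log("reserve stock"), log("release stock")),
    ("charge", log("charge card"), log("refund card")),
    ("ship", fail_to_ship, log("cancel shipment")),
]

context = {"log": []}
print(run_with_rollback(steps, context))
print(context["log"])
```

The failed step itself is not compensated, only the steps that completed before it. Long-running business transactions usually need the completed list persisted as well, so a restarted orchestrator can still roll back.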
  49. 49. The Management Eco-System Capacity Management Compute Storage Network Facilities Event Management (Netcool OMNIbus) CMDB (SmartCloud ControlDesk) Billing (Cost Mgr) Software Tracking (TEM SUA) SmartCloud Monitoring Storage Productivity Center Network Performance Manager Data Center Infrastructure Manager Capacity Management Predictive Insights Capacity Analyzer Automated Reporting Engine (Tivoli Common Reporter) Cloud Automation (SmartCloud Orchestrator) Interface for Capacity Planners Interface for Business Users Policies (EGO) Data Warehouse
  50. 50. One Integrated Environment Distributed Database Mainframe Network Middleware Storage Event Pool Operational Data Warehouse Predictive Enrichment & Correlation Service Desk Paging CMDB Knowledge Asset Mgmt Event Catalog Event API Business Telemetry 3rd Party Providers Presentation Framework
  51. 51. Processing Streams Situational Awareness Engine Adapted from how-to-build-an-event-processing-application-presentation-717795 Real-Time Event Streams Detected and Predicted Situations Patterns from Historical Data Causal Relationship from Past RCAs
  52. 52. Complex Event Processing Event Pipeline Event Queries Time Window Data Events Control Event Other Events Event Filter Scenarios A B C Feedback Loop Event Intelligence Action Events
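
The pipeline on this slide (filter events, evaluate them over a time window, emit an action event when a scenario matches) can be sketched in a few lines. This is a simplified illustration under assumptions: `WindowedDetector`, the HTTP 500 scenario and the thresholds are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class WindowedDetector:
    """Emits an action event when at least `threshold` matching events
    arrive within a sliding window of `window_s` seconds."""

    def __init__(self, predicate, window_s: float, threshold: int):
        self.predicate = predicate    # event filter
        self.window_s = window_s      # time window
        self.threshold = threshold    # scenario: N matches inside the window
        self.times = deque()

    def feed(self, timestamp: float, event: dict):
        if not self.predicate(event):
            return None  # filtered out before any correlation
        self.times.append(timestamp)
        # Drop matches that have aged out of the window.
        while self.times and timestamp - self.times[0] > self.window_s:
            self.times.popleft()
        if len(self.times) >= self.threshold:
            return {"action": "escalate", "count": len(self.times), "at": timestamp}
        return None


detector = WindowedDetector(lambda e: e["status"] == 500, window_s=60, threshold=3)
stream = [(0, 500), (10, 200), (20, 500), (30, 500), (200, 500)]
for ts, status in stream:
    print(ts, detector.feed(ts, {"status": status}))
```

Only the third server error inside the same minute produces an action event; the isolated error at 200 seconds does not, because the earlier ones have left the window.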
  53. 53. Event Processing Automated Action; Notification and Escalation; Business Impact Analysis; Root Cause Analysis; Correlation and Event Suppression; Enrichment; Meta-Data Integration Bus; Distributed Collectors; LOB Managed Monitoring System; Service Provider Monitoring System; Vendor Managed Monitoring System; Element Managers; Other Enterprise Data; Document Sharing; Service Desk; CMDB; Batch Scheduling; Knowledge Database; Online Run Book; PBX/Call Manager; Visualization Framework; Common Event Format; Topology and Relationship Database; Automated Action Tools; Automated Provisioning System; Predictive Analysis; Automated Change Reconciliation; Security Management; Archive and Report; Business Telemetry Data; Service Center and Enterprise Notification Tool
  54. 54. Automating Responses Palette of library assets enables easy workflow composition through drag and drop. Access to rich libraries (toolkits) of reusable automation assets that speed automation creation. Rich set of action types, flow control, and data handling primitives that simplify creation of complex automations. Easy workflow action editing for managing data mapping, error recovery options, implementation details, etc. Graphical editor for composing and connecting workflows. Rich tooling functions to edit, version, debug, and optimize workflows.
  55. 55. Extending Workload Deployment with Custom Automation Diagram labels: Core pattern processing; Pre-process event; Post-process event; pre-provision event operation; post-provision event operation; BPM Process; BPM Human Service; SCO BPM; SCO REST API; Custom Data; Operation context; steps 1 to 5; A, B; Once per registered event operation
  56. 56. Completing the journey Define •  Review the existing architecture •  Review the business outcomes •  Define the end state Prioritize •  Consolidations •  Technologies to virtualize •  Business processes to model and workflows to automate Execute •  Look for early wins •  Evolve incrementally •  Organize the teams effectively
  57. 57. Iterative development You have to be realistic about how fast you can mature. Iterating helps form a culture of continuous improvement.
  58. 58. Let’s keep the conversation going… ReverendDrew @SystemsMgmtZen ReverendDrew 614-306-3434