I Know What You Did THIS Summer


Update to I Know What You Did Last Summer with some new material and some detail moved to backup slides at the end.

Published in: Technology


  1. 1. I Know What You Did This Summer... Martin Packer, IBM Email: martin_packer@uk.ibm.com Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/MartinPacker
  2. 2. ... but do YOU?
  3. 3. What’s The Point Of This Presentation? So, it’s got a “tongue-in-cheek” title but what’s it all about? I think one of the least well appreciated aspects of z/OS and its middleware is the richness of instrumentation it gives you: Here I describe it and just some of the ways you can get value from SMF. While I’m aware MY concerns might not match YOUR concerns EXACTLY there’s much common ground. I’d like to make you smarter - or appear to be. :-)
  4. 4. Agenda • Who wants to know? • Let’s review what we have • What more do we need?
  5. 5. Performance And Capacity ● I’m a little loth to talk about Performance and Capacity ● As we KNOW we can very successfully use instrumentation for this ● But we are blessed with very good Performance and Capacity information ● And there’s an abundance of “folklore” on how to use it ● Even if we have to “stay on our toes” ● We’ll scarcely touch on this in the rest of the presentation
  6. 6. The Value Of Recorded Instrumentation • You really CAN know what happened last summer • Depending on what instrumentation you kept • Depending on how you look at the data* • We can get from the anecdotal to some hard facts * I use the terms “instrumentation”, “data”, “evidence” and “statistics” interchangeably
  7. 7. Architecture and Inventory ● "Architecture" means many different things *: ● I’m interested in how infrastructure fits together. ● I’m not happy to just have a "bucket of parts". ● Or an inventory that’s just a list. ● But we can get a well-structured inventory out of what we have to hand. ● I was taught to use “top down problem decomposition” ● A very good idea but... ● There’s a danger of losing sight of what this thing is actually FOR * See a later slide for more on this
  8. 8. Patterns and Changes ● A static view often isn’t enough ● Particularly as not just workloads but configurations are getting increasingly dynamic ● I’ve the scars to prove it ● It’s important to know how your systems “usually” behave ● The classic “double hump” ● There may be no “usually” ● This lack of “envelope” is in itself important ● “Our rolling 4 hour average peaks between 2AM and 4AM” ● This would probably affect software billing ● Knowing what’s normal allows you to understand changes ● “This isn’t normal” ● “This is slowly getting worse”
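The “rolling 4 hour average” remark on the Patterns and Changes slide can be made concrete with a small sketch. It assumes one MSU sample per hour (real RMF intervals are usually shorter), averages partial windows at the start over whatever samples exist, and uses invented numbers; the function names are hypothetical and this is not how any billing tool is known to compute the figure.

```python
from collections import deque


def rolling_average(samples, window):
    """Yield the mean of the most recent `window` samples (fewer at the start)."""
    recent = deque(maxlen=window)
    for value in samples:
        recent.append(value)
        yield sum(recent) / len(recent)


def peak_interval(samples, window):
    """Return (index, value) of the highest rolling average."""
    averages = list(rolling_average(samples, window))
    peak = max(averages)
    return averages.index(peak), peak


# Invented hourly MSU figures, midnight onwards: the raw peak is at 03:00.
hourly_msu = [50, 60, 200, 220, 80, 70, 60, 50]
index, value = peak_interval(hourly_msu, window=4)
print(f"Rolling 4 hour average peaks in hour {index} at {value} MSU")
```

With this data the rolling average peaks two hours after the raw peak, which is one reason to look at both when asking what is “normal”.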
  9. 9. Licensing ● What are we actually using? ● And are we using it enough to justify it? ● And who IS using it? ● And should we be using multiple versions? ● What are we licensed for? ● And what SHOULD we be licensed for? ● Note: SMF 70-1 and 89 are the basis for some IBM licensing schemes ● And used in at least one third-party Licence Management tool
  10. 10. I’d Like To Make You (Appear) Smarter :-) ● Imagine me meeting you for the first time... ● I’d like not to have to ask stupid questions... ● ... the answers to which I should be able to find out ● I think you’d like to get the answers from data ● Rather than having to trouble HUMANS for them ● Humans might not know ● Or might give you the WRONG answer ● And I think you value being proactive ● Based on evidence ● Most of my conversations about systems begin with FACTS ● The interpretation is the fun bit ● You probably wish many of your conversations started with facts, too
  11. 11. So, What Do We Have?
  12. 12. But First Some Assumptions ● We’re not talking about formatted reports ● I assume you can process data and aren’t entirely reliant on RMF Postprocessor reports ● I’m not entirely limiting this to SMF ● I’ve had conversations with developers where the words “SMF-like” have cropped up ● A WLM policy is “admissible evidence” ● So’s a DB2 Catalog ● Particularly the bits with history ● The point I’m making doesn’t require an EXHAUSTIVE survey of the available data ● I’m NOT talking about Performance
  13. 13. “Physical” Containers
  14. 14. One Layer Down - LPARs
  15. 15. Another Layer Down – WLM Constructs
  16. 16. Address Space Etcetera
  17. 17. Application
  18. 18. Middleware-Specific Instrumentation ● CICS, MQ, WebSphere Application Server and DB2 are particularly prolific ● Subsystem Information ● Tells you a lot about how these are set up and behave ● Use this information in concert with RMF information ● For example DB2 Group Buffer Pool analysis ● Application Information ● Tells you what the subsystem is used for ● And how it’s being driven crazy ● You’d probably recognise a SAP DB2 subsystem ● And you’d certainly recognise one with lots of CICS or Batch use ● DDF leaves even more footprints than usual • Examples: IP Address, Client Application Program ● Domain knowledge is key
  19. 19. Data Set Instrumentation ● Almost unlimited food for thought (or nosiness) ● Dynamic ● SMF 42-6 ● Point in time: ● DB2 Catalog ● DCOLLECT ● SMF 14, 15, 16, 62, 64 ● Other records have hints ● Example: SMF 30 DD-level information ● For the insatiably curious try User F61 GTF
  20. 20. Data Set Instrumentation - Examples ● DB2 data set names give database and space name ● Also partitioning clues • And “hot” partitions ● DB2 Catalog reinforces this • Catalog vs DCOLLECT is an interesting comparison ● Note: SMF 42-6 doesn’t help you understand WHICH DB2 users ● Mnemonic data set and DD names in SMF 14 and 15 ● For example “STEPLIB” or “~.RUNLIB” ● Batch understanding is greatly aided by data set information ● I’ve discussed this at length elsewhere ● CICS VSAM LSR pool use and data sets
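As an illustration of “DB2 data set names give database and space name”, here is a minimal sketch that pulls those clues out of a name. It assumes the commonly documented `catname.DSNDBx.dbname.psname.y0001.znnn` layout, and assumes the final qualifier's letter counts in thousands (A001 is 1, B000 is 1000); check both against your own DB2 level. The function name and the sample data set names are invented.

```python
import re

# Assumed layout: catname.DSNDBx.dbname.psname.y0001.znnn
DB2_DATA_SET = re.compile(
    r"^(?P<catalog>[^.]+)\.DSNDB(?P<kind>[CD])\."
    r"(?P<database>[^.]+)\.(?P<space>[^.]+)\."
    r"(?P<instance>[IJ]\d{4})\.(?P<letter>[A-E])(?P<digits>\d{3})$"
)


def parse_db2_data_set_name(name):
    """Return catalog, database, space and piece number, or None if no match."""
    match = DB2_DATA_SET.match(name.strip().upper())
    if match is None:
        return None
    # Assumption: letter A covers pieces 1-999, B starts at 1000, and so on.
    letter_offset = (ord(match["letter"]) - ord("A")) * 1000
    return {
        "catalog": match["catalog"],
        "database": match["database"],
        "space": match["space"],
        "piece": letter_offset + int(match["digits"]),
    }


print(parse_db2_data_set_name("DB2P.DSNDBD.PAYROLL.TSEMP.I0001.A003"))
```

For a partitioned space the piece number is the partition, so joining this with SMF 42-6 I/O counts per data set is one way to look for “hot” partitions.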
  21. 21. SMF 30 Usage Information ● Useful for licensing discussions but so much more besides: ● Are all my CICS regions up at 4.1.0? ● Which batch jobs access this DB2 Subsystem? ● (Subsystem name only given for DB2 Version 9 and above) ● Is this CICS region a TOR / AOR, FOR, DOR or what? ● How much is the TCB / SRB time when using DB2? ● You can answer these questions just with SMF 30 ● Note: Multiple sections with the same “key” e.g. CICS / MQ ● Need to sum TCB and SRB times ● Speculate that fine structure is of interest
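The note about multiple usage sections sharing a “key” can be sketched like this. It assumes the SMF 30 usage sections have already been decoded into dictionaries; the field names (`name`, `version`, `tcb_seconds`, `srb_seconds`), the function name and the sample values are all hypothetical, not the record's real field names.

```python
from collections import defaultdict


def cpu_by_product(usage_sections):
    """Sum TCB and SRB seconds over all usage sections with the same key."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for section in usage_sections:
        key = (section["name"], section["version"])
        totals[key][0] += section["tcb_seconds"]
        totals[key][1] += section["srb_seconds"]
    return {
        key: {"tcb": tcb, "srb": srb, "cpu": tcb + srb}
        for key, (tcb, srb) in totals.items()
    }


# Invented sections for one address space: two share the CICS key.
sections = [
    {"name": "CICS", "version": "4.1.0", "tcb_seconds": 1.5, "srb_seconds": 0.25},
    {"name": "MQ", "version": "7.0.1", "tcb_seconds": 0.5, "srb_seconds": 0.0},
    {"name": "CICS", "version": "4.1.0", "tcb_seconds": 2.5, "srb_seconds": 0.75},
]
for key, cpu in cpu_by_product(sections).items():
    print(key, cpu)
```

Keeping TCB and SRB separate as well as summed leaves room to look at the “fine structure” later.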
  22. 22. OA39629: NEW FUNCTION TO REPORT THE HIGHEST PERCENT OF CPU TIME USED BY A SINGLE TASK IN AN ADDRESS SPACE ● z/OS Release 12 and 13 ● Provides largest TCB’s % of an engine ● Largest TCB’s program name ● Purpose: ● To help understand which address spaces have single-TCB speed sensitivities ● Speculation: ● Might show QR TCB for CICS without 110s being needed ● Interesting to compare to Product Usage TCB / SRB in Type 30
  23. 23. CFLEVEL 18 CIB / CFP Path Instrumentation Improvements ● Traditionally we’ve had path types to a CF ● And path types for CF-to-CF links ● New instrumentation adds much more: ● Adapter type, Path type, CHPID, PCHID, Adapter ID, Port number ● Helps build better topology picture ● Latency ● Used in report to estimate distance @ 10 microseconds per km ● Path degraded flag ● Note: No traffic information
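The distance estimate on the path instrumentation slide is simple arithmetic: divide the reported latency by 10 microseconds per km. A minimal sketch, assuming the latency has already been extracted in microseconds (the function name is hypothetical, and the slide does not say whether the figure is one-way or round-trip):

```python
MICROSECONDS_PER_KM = 10.0  # the rule of thumb quoted on the slide


def estimated_distance_km(latency_microseconds):
    """Estimate link distance from path latency at 10 microseconds per km."""
    if latency_microseconds < 0:
        raise ValueError("latency cannot be negative")
    return latency_microseconds / MICROSECONDS_PER_KM


print(estimated_distance_km(150))  # 150 microseconds suggests roughly 15 km
```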
  24. 24. Architects Will Recognise This As Incomplete
  25. 25. This Is Not A Complete Architecture As Architects Would Recognise It ● This only documents componentry inside the mainframe ● The names are not necessarily names Applications people or architects would recognise ● For example a machine serial number is probably NOT what an architect would use to name a machine ● If they even WANTED to name a machine ● There’s little commentary ● Interfaces are sparse ● An attempt to portray our understanding as architectural would appear like Officer Crabtree’s * attempts to speak French * http://en.wikipedia.org/wiki/Officer_Crabtree
  26. 26. An Observation On Batch Architecture ● Most installations have little understanding of their batch “architecture” ● Quotes because that term may be too kind :-) ● Numerous recent customer conversations convince me of this ● Knowledge is being lost from organisations ● This understanding is important ● Needed to make tuning, scaling and streamlining effective and safe ● Allows stuff to be reliably run ● Enables training the next generation ● Further observation: “Any technology distinguishable from magic is insufficiently advanced” applies here
  27. 27. So What Do We Still Need?
  28. 28. Some Parting Thoughts
  29. 29. Some Parting Thoughts ● Experiment with data depiction techniques ● Example: Plot “with load” rather than time of day ● Example: Use time as the third dimension ● Maybe someone knows how to make animated GIFs or movies from static graphics ● Think of creative ways to use instrumentation ● Look to other sources of instrumentation than the obvious ● Beware the subtleties of e.g. field meanings ● Which, I guess, means staying “plugged in to the folklore”
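One way to read the “plot with load rather than time of day” suggestion is to re-key each interval by how busy the system was before graphing it. The sketch below assumes each interval has been reduced to a (load, metric) pair, for example CPU busy percent and a response time; the function name, bucket width and data are invented for illustration.

```python
from collections import defaultdict


def average_by_load(samples, bucket_width=10):
    """Group (load, metric) samples into load buckets and average the metric."""
    buckets = defaultdict(list)
    for load, metric in samples:
        bucket_start = int(load // bucket_width) * bucket_width
        buckets[bucket_start].append(metric)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Invented (CPU busy %, response time in ms) pairs in time-of-day order.
samples = [(85, 40), (12, 10), (88, 60), (15, 10), (47, 20)]
for bucket_start, mean_metric in average_by_load(samples):
    print(f"{bucket_start}-{bucket_start + 10}% busy: {mean_metric} ms")
```

The resulting points can go straight onto an x-y chart with load on the horizontal axis.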
  30. 30. Backup Slides
  31. 31. In An Ideal World You’d Like Instrumentation To Be ... ● Timestamped ● Readily parseable ● Of known provenance ● Light weight ● Understood by the community ● Available at various levels of detail ● Fit for purpose ● Persisted ● Have a manageable lifecycle ● Immediately produced ● Standards-based [I’d say ALL instrumentation falls short of at least one of these ideals]
  32. 32. Audit ● Follows on from Change and Inventory ● Do we have what we think we have? ● If not why not? ● Who made that change and when? ● And how did it affect things? ● Maybe “why?” isn’t answerable from the data ● Some changes are “heralded”: ● WLM Policy activation specifically recorded in SMF ● Some aren’t: ● “We seem to have more online disks in this interval than the previous one”
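The “unheralded” kind of change on the Audit slide can often be found by comparing what two successive intervals saw. A minimal sketch, assuming you have already extracted the set of online volume serials for each interval (the function name and the volsers below are made up):

```python
def inventory_changes(previous, current):
    """Report items that appeared or disappeared between two intervals."""
    previous, current = set(previous), set(current)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Invented volume serials seen online in two consecutive intervals.
earlier = ["PROD01", "PROD02", "WORK01"]
later = ["PROD01", "PROD02", "PROD03", "WORK01"]
print(inventory_changes(earlier, later))
```

This answers “what changed and when”, to the granularity of the interval. As the slide says, “why?” may not be answerable from the data.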
  33. 33. We Know (Almost) Everything You’d Ever Want To Know ● For processors: ● Serial number and Plant ● “What’s in a name?” ● Device Type and Model ● Actually hardware and software models ● Specialty engine counts ● For Coupling Facilities ● Similar ● For Disk, Tape and Switches ● Enormous amounts of information
  34. 34. The “Almost” We Had Before Is Almost Gone ● CPU ● Memory ● Channels ● Disk and Tape ● Some connectivity information still missing ● Parallel Sysplex Infrastructure ● Connectivity ● Performance ● Traffic ● (Some LPARs are ICF LPARs – just to mess up my graphic)
  35. 35. WLM Constructs ● RMF tells us how the following behave: ● Workloads ● Service Classes ● Service Class Periods ● Report Classes ● We get SOME information on what these represent: ● Description strings ● No classification information ● “Served” service classes may be a bit of a clue ● Policy changes are readily discernible ● Including who did it ● (Usually I see mnemonic policy descriptions)
  36. 36. Parallel Sysplex Infrastructure ● Enormous amount of information on Coupling Facility structures ● XCF groups likewise ● I got job name put into RMF as member name is often useless ● RMF doesn’t know what structures or XCF groups are used for ● So we have to “guess” ● But it’s been a LONG time since I guessed a CF structure’s or XCF group’s use wrong
  37. 37. Address Space ● Key non-performance information in Type 30: ● Program name ● WLM Service Class and Report Class ● But not for “served” work ● Can relate Report Class and Service Class ● And usually figure out what these are REALLY for ● Can detect e.g. CICS regions, DB2 subsystem and MQ subsystem address spaces ● Can dispel myths like “we don’t use Unix System Services” ● Accounting Information and Programmer Name can be interesting
  38. 38. ... but I have to admit I don’t know what you’re doing right now...
  39. 39. Online Monitoring Still Has A Role ● It’s unimpressive to respond to an incident with “No but I can tell you what happened last week” ● Automation probably requires it ● Some things simply aren’t available in an externally-recorded form ● People seem to quite like it
  40. 40. I’m Told I Don’t Do Enough Graphics … So Here Are Some (Almost) Gratuitous Ones :-) Source: http://www.edwardtufte.com/tufte/posters
  41. 41. Provenance Is Important Source: http://www.the-world-heritage-sites.com/messel-pit-fossil-site_germany.htm
  42. 42. Messel Pit Fossil Site, Germany: The Messel Pit Fossil Site is a disused quarry in the village of Messel, Darmstadt-Dieburg, Hesse, about 35 km southeast of Frankfurt-am-Main, Germany. The quarry used to be a mine since 1859, when brown coal and later bituminous shale were mined. By the 1900s, it became well known for a different reason, when it began to yield fossils. Nevertheless, mining continued until as late as 1971, when the shale mine finally closed, and a cement factory built in the quarry also failed. After the quarry became disused, there was a plan to turn it into a garbage dump. Fossil enthusiasts were allowed to dig in the quarry. These amateurs developed a technique to preserve the fine details on small fossils. In time, the Messel Pit became known as the richest site for fossils from the Eocene period, which was between 57 million and 36 million years ago. Today scientists have uncovered exceptionally well-preserved fossils of mammals, including fully articulated skeletons to the contents of the stomach of animals from that period. In 1995, Messel Pit Fossil Site became the first site to be inscribed as a World Heritage Site solely due to fossils. It took place at the 19th session of the World Heritage Committee held in Berlin, Germany, on 4-9 December, 1995. Source: http://www.the-world-heritage-sites.com/messel-pit-fossil-site_germany.htm