Cartographer, or Building A Next Generation Management Framework

Dr. Bobby Krupczak's slides about the Cartographer management agent and the underlying XMP management framework. Presented at the February 10, 2009 meeting of the Atlanta Network and Systems Management Technical User Group (ANSMTUG).


  1. 1. Cartographer or Building a Next Generation Management Framework Bobby Krupczak Chief Scientist Krupczak.org, LLC [email_address] http://www.krupczak.org/cartographer
  2. 2. Overview <ul><li>Background
  3. 3. Overview of network mgmt today
  4. 4. Cartographer
  5. 5. Yet another management framework
  6. 6. Software technology
  7. 7. Demo </li></ul>
  8. 8. Who Am I? <ul><li>BS CISE from UF 1989
  9. 9. Worked in industry on SNMP
  10. 10. MS CS GaTech 1993
  11. 11. Co-founder of Empire Technologies
  12. 12. PhD CS GaTech 1997
  13. 13. Sold Empire to Concord 1999
  14. 14. Krupczak.org 2003 </li></ul>
  15. 15. Management Model <ul><li>Mgmt info is virtual representation
  16. 16. Managers, agents exchange mgmt info
  17. 17. Mgmt is therefore: </li><ul><li>Inspection of
  18. 18. Alteration of
  19. 19. Creation of
  20. 20. Deletion of mgmt info </li></ul></ul>
  21. 21. First-Generation <ul><li>Dumb, lightweight (hopefully) agents
  22. 22. Heavyweight, complex, smart managers
  23. 23. Traditional command-control
  24. 24. Scaling becomes issue
  25. 25. Analogous to CEO managing entire enterprise </li></ul>
  26. 26. 2nd-Generation <ul><li>Push intelligence outwards towards agent
  27. 27. Empire/SystemEDGE, RMON
  28. 28. Increase scaling, reduce reaction time
  29. 29. Some delegation, middle-managers, remote pollers
  30. 30. Exception-management, event de-duplication, root-cause </li></ul>
  32. 32. 2nd-Generation (continued) <ul><li>Agents still work in isolation (stovepipes)
  33. 33. Distribution overhead and agent administrative footprint still non-trivial
  34. 34. SNMPv1, v2c, v3 now deployed
  35. 35. Agent backlash?
  36. 36. CEO now has bank VPs but still manages/controls the enterprise </li></ul>
  37. 37. Cartographer <ul><li>Discover, track relationships between components in distributed system </li><ul><li>Dependencies between network, system, applications
  38. 38. Include network services as well as higher-layer abstractions
  39. 39. Agent based
  40. 40. Topography not topology
  41. 41. Others have examined this approach though mostly in academic research papers </li></ul></ul>
  42. 42. Cartographer (II) <ul><li>Model relationships using dependency graph borrowed from graph theory branch of mathematics
  43. 43. Systems represented as vertexes
  44. 44. Dependencies represented as edges
  45. 45. Directed graphs
  46. 46. System is server if it provides service to some client
  47. 47. System is client if it consumes service </li></ul>
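
The vertex/edge model above can be sketched as a small directed graph. This is only an illustration, not Cartographer's data structure: the class name, method names, and host names (`web01`, `db01`, `dns01`) are all hypothetical. An edge from client to server means "client consumes a service from server".

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph: edge client -> server means the client consumes a service."""

    def __init__(self):
        self.depends_on = defaultdict(set)  # client -> {(server, service)}
        self.serves = defaultdict(set)      # server -> {(client, service)}

    def add_dependency(self, client, server, service):
        self.depends_on[client].add((server, service))
        self.serves[server].add((client, service))

    def is_server(self, system):
        # A system is a server if it provides a service to some client.
        return bool(self.serves.get(system))

    def is_client(self, system):
        # A system is a client if it consumes some service.
        return bool(self.depends_on.get(system))


# Hypothetical hosts and services, for illustration only.
graph = DependencyGraph()
graph.add_dependency("web01", "db01", "mysql")
graph.add_dependency("web01", "dns01", "dns")
graph.add_dependency("db01", "dns01", "dns")

# (is_client, is_server) per system
roles = {s: (graph.is_client(s), graph.is_server(s)) for s in ("web01", "db01", "dns01")}
```

Note that a single system (here `db01`) can be both a client and a server, which is why the roles are computed per system rather than stored as a fixed label.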
  48. 48. Example Dependency Graph
  49. 49. What Do We Do With Data? <ul><li>Discover, analyze dependencies
  50. 50. Diagnose and troubleshoot faults
  51. 51. Security spinoff
  52. 52. Monitor, test, & compare service experiences
  53. 53. Work bottom-up </li></ul>
  54. 54. But I Already Know My Network <ul><li>You may be surprised what you find
  55. 55. Distributed systems are highly dynamic, not static
  56. 56. Automating management necessitates capturing this info and encoding it </li></ul>
  57. 57. What Do We Do? Discover/Analyze <ul><li>Discover dependencies via: </li><ul><li>OS and app configuration </li><ul><li>/etc, .ini, and Windows registry
  58. 58. System APIs </li></ul><li>Dynamically via protocol endpoints </li><ul><li>IPv4 and IPv6 </li></ul></ul><li>Classify into ~ 30 different types
  59. 59. Inbound/outbound/transit
  60. 60. Per-system, per-user, per-app </li></ul>
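
One way the inbound/outbound split could be derived from protocol endpoints is sketched below. The rule used here (a connection whose local port is also a listening port is inbound, otherwise outbound) is an assumption made for illustration, not necessarily how Cartographer classifies; transit traffic cannot be recognized from endpoints alone and is left out. The function name and sample addresses are hypothetical.

```python
def classify(connections, listening_ports):
    """Label each (local_port, remote_addr, remote_port) connection.

    Assumed heuristic: if the local port is one this system listens on,
    a remote client connected to us (inbound); otherwise we connected
    out to a remote service (outbound).
    """
    labelled = []
    for local_port, remote_addr, remote_port in connections:
        if local_port in listening_ports:
            labelled.append(("inbound", remote_addr, local_port))
        else:
            labelled.append(("outbound", remote_addr, remote_port))
    return labelled


# Hypothetical endpoint table for one system.
listening = {22, 80}
conns = [(80, "10.0.0.9", 51000), (43210, "10.0.0.5", 3306)]
labels = classify(conns, listening)
```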
  61. 61. What Do We Do? Discover/Analyze (II) <ul><li>Dependencies tell us what a machine is doing </li><ul><li>Validate configuration and operation
  62. 62. Discover misconfiguration </li></ul><li>Seed automatic configuration for monitoring </li><ul><li>If DB server => automatically monitor components </li></ul></ul>
  63. 63. What Do We Do? Diagnose <ul><li>Who/what is impacted? </li><ul><li>If key app dies => know who is impacted </li></ul><li>Determine root cause/impact </li><ul><li>Given fault, which clients are affected?
  64. 64. Given a client, what faults are affecting it?
  65. 65. We know service A depends on X,Y,Z </li><ul><li>If A fails, examine X, Y, Z </li></ul></ul></ul>
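
Both diagnostic questions above reduce to graph reachability. The sketch below assumes a plain dictionary of dependencies (the service names `A`, `X`, `Y`, `Z` follow the slide; `dns` and `db` are made up). `suspects` walks forward from a failed service to everything it depends on; `impacted` walks the reversed edges from a fault to everything that depends on it.

```python
from collections import deque

# Hypothetical dependency data: each key depends on the listed services.
DEPENDS_ON = {
    "A": ["X", "Y", "Z"],
    "X": ["dns"],
    "Y": ["dns", "db"],
    "Z": [],
    "db": [],
    "dns": [],
}


def suspects(service, depends_on):
    """Everything `service` depends on, directly or transitively."""
    seen, queue = set(), deque(depends_on.get(service, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(depends_on.get(node, []))
    return seen


def impacted(fault, depends_on):
    """Everything that depends on `fault`, directly or transitively."""
    reverse = {}
    for client, servers in depends_on.items():
        for server in servers:
            reverse.setdefault(server, []).append(client)
    return suspects(fault, reverse)


to_examine = suspects("A", DEPENDS_ON)   # if A fails, look at these
hit_by_db = impacted("db", DEPENDS_ON)   # if db fails, these are affected
```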
  66. 66. What Do We Do? Security Spinoff <ul><li>Track dependencies and interactions long-term
  67. 67. Develop model of typical behavior/role of system/app
  68. 68. Deviations from baseline could indicate issues
  69. 69. Social networking for computers </li><ul><li>If my machine starts communicating with those in China . . . . </li></ul></ul>
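
A deliberately simple sketch of the baseline idea: treat the set of remote endpoints a system normally talks to as its baseline, and flag anything new. A real behavioral model would be far richer; the function name and all addresses here are hypothetical (203.0.113.7 is from a documentation-only address range).

```python
def new_peers(baseline, observed):
    """Return remote endpoints seen now that never appeared in the baseline."""
    return sorted(set(observed) - set(baseline))


# Hypothetical (address, port) pairs for one system.
baseline = {("10.0.0.5", 3306), ("10.0.0.2", 53)}
observed = {("10.0.0.5", 3306), ("10.0.0.2", 53), ("203.0.113.7", 6667)}

deviations = new_peers(baseline, observed)
```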
  70. 70. What Do We Do? Compare Service Experience <ul><li>Do you see what I see?
  71. 71. Use dependency data to automatically test services </li><ul><li>Global, centralized testing
  72. 72. Per-system active testing
  73. 73. Per-system passive monitoring </li></ul><li>Detect localized hot-spots </li><ul><li>Pinpoint infrastructure problems </li></ul></ul>
  74. 74. What Is Next Generation About This? <ul><li>Started with observations about how human corporations work </li><ul><li>CEO sets broad policies and goals
  75. 75. Employees implement them, solve problems, run the show </li></ul><li>Managers and agents become peers </li><ul><li>Further push intelligence and command/control downward and outward
  76. 76. P2P architecture utilized
  77. 77. Every agent acts in dual role </li></ul></ul>
  78. 78. Peer-to-Peer <ul><li>Not based on polling and storing of data in central repository </li><ul><li>Not to say this isn't important </li></ul><li>Agents self-organize into p2p overlay networks </li><ul><li>Exchange information with peers
  79. 79. Run distributed algorithms
  80. 80. Self-propagate, self-update </li></ul></ul>
  81. 81. What Is A Peer? <ul><li>Systems are peers if they both utilize same service from same server
  82. 82. Many p2p overlays
  83. 83. Increase scaling (unlimited?)
  84. 84. Reduce reaction time
  85. 85. Analyze more up-to-date info </li></ul>
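
The peer definition above can be sketched by grouping clients on the (server, service) pair they share. This assumes dependency records are available as simple tuples; the function name and host names are hypothetical, and nothing here reflects how Cartographer agents actually discover one another.

```python
from collections import defaultdict


def peer_overlays(dependencies):
    """Group clients by (server, service); each group of 2+ clients is one overlay."""
    groups = defaultdict(set)
    for client, server, service in dependencies:
        groups[(server, service)].add(client)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Hypothetical (client, server, service) records.
deps = [
    ("web01", "db01", "mysql"),
    ("web02", "db01", "mysql"),
    ("web01", "dns01", "dns"),
    ("db01", "dns01", "dns"),
    ("mail01", "ldap01", "ldap"),
]
overlays = peer_overlays(deps)
```

A system can belong to several overlays at once (here `web01` is in both), which matches the "many p2p overlays" point.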
  86. 86. Example P2P Overlay
  87. 87. New Management Framework? <ul><li>Why re-invent the wheel? </li><ul><li>Could make existing IMF work given enough tape and glue
  88. 88. SNMPvX still too cumbersome, inefficient
  89. 89. Protocol limitations
  90. 90. ASN.1/BER too brittle and prone to interoperability problems
  91. 91. WBEM/CIM too heavyweight, complex </li><ul><li>Spend all day modeling, not managing </li></ul><li>Some existing work applying XML to IMF </li></ul></ul>
  92. 92. XML Management Protocol (XMP) <ul><li>Framework in addition to just a protocol </li><ul><li>SMI, protocol, MIBs </li></ul><li>Borrow from and extend the IMF as much as possible
  93. 93. Utilize XML for: </li><ul><li>Data modeling (SMI)
  94. 94. Specification (MIBs)
  95. 95. Transfer syntax (protocol) </li></ul><li>Everything is text </li></ul>
  96. 96. More XML <ul><li>ASN.1 could have been used?
  97. 97. More XML tools,
  98. 98. More widely adopted than ASN.1
  99. 99. XML schemas for structured document </li><ul><li>Modeling
  100. 100. Parsing
  101. 101. Conversion
  102. 102. Validating </li></ul><li>Still need to test interoperability </li></ul>
  103. 103. XMP SMI <ul><li>http://xmlns.krupczak.org/xsd/xmptypes-1.0.xsd
  104. 104. Start with SNMP SMI
  105. 105. Enhance only where necessary </li><ul><li>Do away with OIDs
  106. 106. Tuple of MIB-name, object-name, key
  107. 107. MIB-2 ifInOctets </li><ul><li>From: </li><ul><li>1.3.6.1.2.1.2.2.1.10.1 </li></ul><li>To: </li><ul><li>mib2.ifInOctets.if0 </li></ul></ul></ul></ul>
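
The (MIB-name, object-name, key) tuple can be sketched as a small parser for the dotted form shown on the slide. The type and function names are hypothetical, and two details are assumptions rather than anything the slides state: that scalars are written without a key, and that a key may itself contain dots.

```python
from typing import NamedTuple


class XmpName(NamedTuple):
    mib: str
    obj: str
    key: str  # empty string for scalars (an assumption of this sketch)


def parse_name(text):
    """Split 'mib.object[.key]' into its parts; the key keeps any further dots."""
    parts = text.split(".", 2)
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"expected mib.object[.key], got {text!r}")
    key = parts[2] if len(parts) == 3 else ""
    return XmpName(parts[0], parts[1], key)


table_cell = parse_name("mib2.ifInOctets.if0")
scalar = parse_name("core.sysObjectID")
```

Compared with `1.3.6.1.2.1.2.2.1.10.1`, each component of the name is readable on its own and the key is not required to be numeric.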
  108. 108. XMP SMI (II) <ul><li>SMI type enhancements </li><ul><li>Added several data types and promoted several textual conventions
  109. 109. Everything 64-bit min, although with XML, numbers can be larger w/o breaking 2/3 of framework
  110. 110. With BER, changing from 32-bit to 64-bit breaks SMIs, MIBs, software
  111. 111. Textual conventions specify additional semantics; overloading is poor engineering </li><ul><li>Promote several to standard types </li></ul></ul></ul>
  112. 112. XMP SMI (III) <ul><li>Added extendedBoolean type </li><ul><li>True, False, Unknown </li></ul><li>Added unsupportedVariable so agent can answer queries honestly and completely
  113. 113. Avoid use of inheritance and polymorphism complexities (à la CIM)
  114. 114. Scalar and tabular objects </li></ul>
  115. 115. XMP SMI (IV) <ul><li>Tables are relations </li><ul><li>Support relational table operations
  116. 116. How to marry table permissions with object permissions? </li></ul><li>Need a lot more work on MIB specification & schema </li></ul>
  117. 117. XMP Protocol <ul><li>http://xmlns.krupczak.org/xsd/xmp-1.0.xsd </li></ul>
  118. 118. XMP Protocol (II) <ul><li>Connection-oriented </li><ul><li>Avoid much of intricacies of UDP-based protocols </li></ul><li>What intricacies? </li><ul><li>More efficient for larger data xfers
  119. 119. No need for MIB tricks
  120. 120. No need for object ordering
  121. 121. No built-in race conditions in large tables </li></ul><li>Original rationale for SNMP/UDP valid then, not now? </li></ul>
  122. 122. XMP Protocol (III) <ul><li>Entity initiates session
  123. 123. Also closes session
  124. 124. Stay connected as long as needed
  125. 125. RPC like semantics </li><ul><li>Request/response semantics
  126. 126. Initiator makes requests </li></ul><li>Is this a manager? </li></ul>
  127. 127. XMP Protocol (IV) <ul><li>Message types borrowed from SNMP </li><ul><li>GetRequest (scalars)
  128. 128. Response (scalars, tables)
  129. 129. SetRequest (scalars)
  130. 130. Trap </li><ul><li>First two objects are core.trapType and core.sysObjectID </li></ul><li>Information </li></ul></ul>
  131. 131. Example GetRequest
  132. 132. Example Response
  133. 133. XMP Table Operations <ul><li>SQL-like
  134. 134. SelectTableRequest
  135. 135. InsertTableRequest
  136. 136. DeleteTableRequest
  137. 137. UpdateTableRequest
  138. 138. No overloading, no side-effects </li></ul>
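
Since the table operations are described as SQL-like, their semantics can be illustrated with an ordinary relational table. The sketch below uses Python's built-in `sqlite3` purely as an analogy: it is not XMP, it shows no XMP message syntax, and the table and column names are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical table; column names are illustrative, not taken from an XMP MIB.
db.execute("CREATE TABLE interfaces (name TEXT PRIMARY KEY, in_octets INTEGER)")

# InsertTableRequest analogue: add rows identified by a string key.
db.execute("INSERT INTO interfaces VALUES (?, ?)", ("if0", 1200))
db.execute("INSERT INTO interfaces VALUES (?, ?)", ("if1", 0))

# UpdateTableRequest analogue: change a value in an existing row.
db.execute("UPDATE interfaces SET in_octets = ? WHERE name = ?", (1500, "if0"))

# DeleteTableRequest analogue: remove a row by key.
db.execute("DELETE FROM interfaces WHERE name = ?", ("if1",))

# SelectTableRequest analogue: one request returns every matching row,
# with no row-by-row GetNext traversal.
rows = db.execute("SELECT name, in_octets FROM interfaces WHERE in_octets > 0").fetchall()
```

Each operation does exactly one thing, which is the "no overloading, no side-effects" point: creating a row is an insert, not a set on a special status column.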
  139. 139. Example SelectTableRequest
  140. 140. No GetNext/GetBulk <ul><li>No GetNext/GetBulk needed for table traversal
  141. 141. GetNext yields very little information and no additional semantics
  142. 142. But how do I walk a MIB? </li><ul><li>You don't
  143. 143. In practice, walking only yields syntactic information </li></ul></ul>
  144. 144. Tables, Keys <ul><li>For scalars, no real instance identifier needed
  145. 145. For tables, relation keys </li><ul><li>Keys can be strings, numbers, variable-length </li></ul><li>No explicit notion of ordering </li><ul><li>No need? </li></ul></ul>
  146. 146. XMP Encapsulation in SSL/TCP <ul><li>Utilize SSLv3/TLSv1 for privacy and authentication
  147. 147. Cartographer utilizes its own CA to create/sign X509v3 certs
  148. 148. Each entity embeds own CA
  149. 149. Agent -> Agent requires two-way authentication
  150. 150. Manager does not need to provide cert
  151. 151. TCP/UDP 5270 </li></ul>
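
The two authentication policies above (agent-to-agent requires certificates on both sides, a manager need not present one) can be sketched with Python's standard `ssl` module. This only shows the policy switch; the function name is hypothetical, loading the certificate and CA files is omitted, and a current Python negotiates newer TLS versions than the SSLv3/TLSv1 named on the slide.

```python
import ssl

XMP_PORT = 5270  # port number taken from the slide


def server_context(require_client_cert):
    """Build a server-side TLS context with or without mandatory client certs.

    A real deployment would also call load_cert_chain() for the agent's own
    certificate and load_verify_locations() for the CA; both are left out here.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED if require_client_cert else ssl.CERT_NONE
    return ctx


agent_to_agent = server_context(True)     # two-way authentication
manager_to_agent = server_context(False)  # manager presents no certificate
```

If a single listening socket has to accept both kinds of peer, `ssl.CERT_OPTIONAL` is the corresponding standard setting; whether Cartographer works that way is not stated in the slides.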
  152. 152. XMP MIBs <ul><li>Virtually compatible with SNMP SMI
  153. 153. Implemented MIB-2 in XMP
  154. 154. Can implement others </li><ul><li>HostMIB, SysApplMIB </li></ul><li>How MIBs are specified still under development </li><ul><li>XML schema
  155. 155. Tables, objects, keys
  156. 156. Borrow from relational DB theory and SQL </li></ul></ul>
  157. 157. XMP MIBs (II) <ul><li>MIB names must be unique within universe of XMP
  158. 158. Within a MIB, object names must be unique
  159. 159. Can utilize private-enterprise numbers to help with uniqueness </li><ul><li>Krupczak.org is 16050 </li></ul><li>Core MIB contains agent-engine stats and config
  160. 160. Cartographer MIB implemented </li></ul>
  161. 161. But How Do I Make Money? <ul><li>License model: </li><ul><li>Open source
  162. 162. Closed source
  163. 163. Dual-license </li></ul><li>Traditional closed-source company </li><ul><li>Market for management software mature and consolidating
  164. 164. Unlikely to gain much traction </li></ul><li>Crippleware </li></ul>
  165. 165. Example OSS Companies <ul><li>Example open-source companies: </li><ul><li>Sendmail (OSS, add-on software and services)
  166. 166. Snort (dual license?)
  167. 167. Asterisk (dual license)
  168. 168. OpenNMS (OSS, services)
  169. 169. JBoss – sold for $400m to RedHat
  170. 170. MySQL – sold for $1B to Sun </li></ul></ul>
  171. 171. An Island or Ecosystem? <ul><li>Tremendous investment in existing products & frameworks
  172. 172. Add XMP as new management protocol to existing platforms </li><ul><li>OpenNMS
  173. 173. MRTG
  174. 174. ZenOSS? </li><ul><li>Integration in research phase </li></ul><li>Others? </li></ul></ul>
  175. 175. Integration (continued) <ul><li>SNMP/XMP gateway? </li><ul><li>Not under active consideration
  176. 176. Very difficult computer science problem </li></ul><li>Backport to SNMP, WBEM </li><ul><li>Not under active consideration
  177. 177. More likely than gateway approach </li></ul></ul>
  178. 178. Technologies, Platforms, Engines <ul><li>Agent written entirely in C </li><ul><li>No need to install interpreters, VMs, DLLs </li><ul><li>In past lifetime, having to install Java on all systems was large barrier </li></ul><li>Goal is to run agent out of box
  179. 179. Very small footprint </li><ul><li>Footprint less than 3% is upper-bound </li></ul><li>Engine is 66k lines of C-code
  180. 180. Plugins 9k to 16k lines of C-code </li></ul><li>Ship with libs/DLLs if needed </li></ul>
  181. 181. Platform support <ul><li>Solaris 9+ Sparc (64-bit)
  182. 182. Solaris 9+ x86
  183. 183. Linux 2.4+ on x86 (32, 64-bit)
  184. 184. Windows 2000/XP/2003/Vista/2008 </li><ul><li>Win32 and Win64 </li></ul><li>Agent uses as few libs as possible </li><ul><li>Libxml
  185. 185. Pthreads
  186. 186. Openssl
  187. 187. Iconv, zlib </li></ul></ul>
  188. 188. Big Picture
  189. 189. Agent Pieces/Parts
  190. 190. Licenses <ul><li>Agent engine, GPLv2
  191. 191. MIB-2 plugin, GPLv2
  192. 192. Example plugin, GPLv2
  193. 193. Cartographer plugin, closed source, shrinkwrap software license
  194. 194. Java GUI, closed source, shrinkwrap software license
  195. 195. See release notes and install instructions </li></ul>
  196. 196. Roadmap <ul><li>1.0 released in November 2008 </li><ul><li>Framework
  197. 197. Infrastructure </li></ul><li>1.1 release in Spring/Summer 2009 </li><ul><li>Bug fixes, additional platforms
  198. 198. MIB schema, SMI work
  199. 199. More MIB data
  200. 200. More intelligence
  201. 201. A lot more work on events </li></ul></ul>
  202. 202. More Roadmap <ul><li>2.0 TBD </li><ul><li>Self-propagation (already do self-updating)
  203. 203. Distributed decision making
  204. 204. Root cause, impact
  205. 205. Automatic testing/measurement
  206. 206. More integration </li></ul></ul>
  207. 207. Demo – Cartographer Main
  208. 208. Dependency View
  209. 209. Dependency Query
  210. 210. Dependency Query
  211. 211. Dependency Query (Asterisk)
  212. 212. Process Query
  213. 213. Process Query
  214. 214. Endpoint Query
  215. 215. Endpoint View
  216. 216. MRTG Integration
  217. 217. ONMS Integration
