OSIS19_Cloud: What does observability bring to configuration management? by Nicolas Charles

We speak of service observability when services expose internal states and metrics in order to improve overall availability.
What about the observability of the infrastructures on which those services are deployed, configured and maintained?
The various logs (centralized, aggregated) are a good starting point for analysis, but systems also need to be observed continuously, so that every change is traced and correlated with monitoring. Today, these IT configuration steps should be handled by configuration management tools, which become the gateway to observability of operations.
We will show the value of this approach for modern IT management, with feedback from experience on the challenges of implementing it in Rudder, our open source solution for continuous audit and configuration management.

Published in: Software

  1. OSIS 2019, The Open Source Innovation Spring 2019. @nico_charles, nicolas@rudder.io. What does observability bring to configuration management?
  2. How are the systems? Does the absence of errors or changes in the logs mean success? Aren’t we missing something?
  3. Definition: “Configuration management is a systems engineering process for establishing and maintaining consistency of a product [...] throughout its life.” (Configuration_management)
  4. Let's remember: what does configuration management do? Diagram: configuration, target state, feedback, configuration.
  5. Let's remember: what does configuration management do? Diagram: configuration, target state, then feedback and configuration repeated three times.
  6. Main challenges faced nowadays. DEV, QA, PRODUCTION, RECOVERY. DEV, SEC, OPS, MGMT, EXTERN. Multiple teams, diluted expertise, harder reporting. Heterogeneous systems, reduced visibility, ease of use and understanding.
  7. Getting and understanding the info is complex. Operators, managers, experts and APIs have different needs. Frustration when we need a third party to obtain relevant data. We mistrust what we don’t understand.
  8. Definition (again): “Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.” (Observability)
  9. Monitoring vs. observability: having a factual and deep insight.
  10. Why do we need observability in configuration management? Causality, agency, perspective: trust and prove configuration states; provide insights relevant to different needs; help teams find the best levers for their job.
  11. Observability adoption: databases. Built-in facilities. Tooling ecosystem to extract knowledge.
  12. Observability adoption: software. Legacy: embedding an agent (often proprietary). New developments: best practices, open standards, architectural bricks.
  13. Let’s take an implementation example...
  14. These concepts are core to Rudder. Everyone/thing can be an actor of configuration management. "rules": [ { "id": "32377fd7-02fd-43d0-aab7-28460a91 "name": "Security rules - baseline", "compliance": 100, "mode": "full-compliance", "complianceDetails": { "successAlreadyOK": 87.47, "successNotApplicable": 12.53 },
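The compliance excerpt on slide 14 reads like a machine-readable response that any actor could consume. As a minimal sketch, assuming a payload shaped like that excerpt, the snippet below parses it and checks that the `success*` detail percentages add up to the reported compliance. That relationship holds for the numbers on the slide but is an assumption here, not documented Rudder behaviour. The rule id is truncated on the slide, so a placeholder stands in for it, and `summarize` is a hypothetical helper name.

```python
import json
import math

# Payload shaped like the slide's excerpt. The rule id is truncated on the
# slide, so a placeholder is used instead of guessing the missing characters.
PAYLOAD = """
{
  "rules": [
    {
      "id": "<rule-id>",
      "name": "Security rules - baseline",
      "compliance": 100,
      "mode": "full-compliance",
      "complianceDetails": {
        "successAlreadyOK": 87.47,
        "successNotApplicable": 12.53
      }
    }
  ]
}
"""


def summarize(payload: str) -> list:
    """Return (rule name, compliance, details_match) for each rule.

    details_match is True when the 'success*' detail percentages sum to the
    reported compliance (an assumed relationship, see the lead-in).
    """
    summary = []
    for rule in json.loads(payload)["rules"]:
        success_share = sum(
            value
            for key, value in rule["complianceDetails"].items()
            if key.startswith("success")
        )
        # Percentages are rounded to two decimals, so compare with a tolerance.
        matches = math.isclose(success_share, rule["compliance"], abs_tol=0.01)
        summary.append((rule["name"], rule["compliance"], matches))
    return summary


if __name__ == "__main__":
    for name, compliance, matches in summarize(PAYLOAD):
        print(name, compliance, "details consistent" if matches else "details differ")
```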
  15. Compliance? Diagram: PARAM; RULE (Id); DIRECTIVE (Id, (Components)); GROUP (Id); RUDDER config, global (Policy Mode, Schedule); NODE (Properties, Policy Mode, Schedule); Environmental context; Node configuration (Id, Generated, Files); Change request; Historization; Event logs.
  16. Compliance? Diagram: RUDDER config, global (Policy Mode, Schedule); NODE (Properties, Policy Mode, Schedule); Environmental context; Node configuration (Id, Generated, Files); Change request; Historization; Event logs; PARAM; RULE (Id, Groups + Directives); DIRECTIVE (Id, Components); GROUP (Id).
  20. Compliance? Diagram: Node configuration (Id, Generated, Files), with Metadata (Integrity, Signature) and Config (Id; for Rule R, Directive D1, Component C). RUN (Reports, ...) with METADATA (node id, config id, run timestamp); a second RUN whose METADATA also carries a Signature. Flows: Get Policy; Send configuration reports; Store expected reports. Expected reports (node id, config id, timestamp); Run reports; Historization; Compliance historized.
  21. Compliance? Same diagram, with Expected reports detailed as: node id, config id, timestamp, end of validity.
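Slides 20 and 21 suggest that compliance comes from comparing the reports a run actually sent with the reports expected for a given node and config id. The following is a minimal sketch of that comparison, not Rudder's implementation: the function name, the status strings and the result layout are all hypothetical, and reports are keyed by a (rule, directive, component) triple as hinted by "For Rule R, Directive D1, Component C".

```python
def compute_compliance(expected_config_id, expected, run_config_id, reports):
    """Compare run reports with expected reports for one node.

    expected: set of (rule, directive, component) triples.
    reports:  dict mapping such a triple to a status string.
    All names and status values are illustrative assumptions.
    """
    # A run made against another configuration cannot prove anything
    # about the configuration we expected the node to apply.
    if run_config_id != expected_config_id:
        return {
            "status": "config-mismatch",
            "compliance": 0.0,
            "missing": sorted(expected),
            "unexpected": [],
        }

    missing = sorted(key for key in expected if key not in reports)
    unexpected = sorted(key for key in reports if key not in expected)
    ok = sum(1 for key in expected if reports.get(key) == "success")
    percent = 100.0 * ok / len(expected) if expected else 100.0
    return {
        "status": "computed",
        "compliance": percent,
        "missing": missing,
        "unexpected": unexpected,
    }


if __name__ == "__main__":
    expected = {("R", "D1", "C"), ("R", "D1", "C2")}
    reports = {("R", "D1", "C"): "success"}
    print(compute_compliance("cfg-1", expected, "cfg-1", reports))
```

Keeping the missing and unexpected reports in the result, instead of only a percentage, is what lets an operator see why a node is not compliant.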
  25. Causality and dependencies of events. Why would we need it? We have logs. We have experts.
  26. Causality and dependencies of events.
  27. Causality and dependencies of events. Diagnosis on infrastructures is hard: many systems, dependencies across systems, many actors involved. An issue on one component can impact hundreds of systems. We need to separate the causes from the symptoms.
  28. Causality and dependencies of events. Monitoring can only correlate. Events happen across the whole infrastructure. Causes and precedence help root cause analysis.
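To illustrate the point of slide 28, here is a minimal sketch of what recorded causes make possible: if each event carries links to its direct causes, walking those links separates root causes from symptoms, which correlation alone cannot do. The event names and the `caused_by` mapping are invented for the example.

```python
def root_causes(event_id, caused_by):
    """Follow 'caused by' links from an event back to events with no recorded cause.

    caused_by: dict mapping an event id to the list of its direct causes.
    Returns the sorted list of root events reachable from event_id.
    """
    seen = set()
    roots = set()
    stack = [event_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # also protects against cycles in bad data
        seen.add(current)
        parents = caused_by.get(current, [])
        if parents:
            stack.extend(parents)
        else:
            roots.add(current)
    return sorted(roots)


if __name__ == "__main__":
    # Hypothetical events: two symptoms sharing one underlying change.
    caused_by = {
        "web-503": ["app-restart"],
        "app-restart": ["config-change-42"],
        "db-slow": ["config-change-42"],
    }
    print(root_causes("web-503", caused_by))
    print(root_causes("db-slow", caused_by))
```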
  29. Event sourcing & tracing. Terminology (Dapper & OpenTracing). Trace: description of a “transaction” as it moves through systems. Span: named and timed operation, a piece of the workflow (+ tags and logs). Span context: trace information that accompanies the transaction.
  30. Event sourcing & tracing. What’s in a span? Operation name; start & end timestamps; tags: set of key:value; logs: set of key:value; SpanContext.
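The span contents listed on slide 30 map naturally onto a small data structure. This is a sketch in plain Python rather than the API of any tracing library: the field names (`trace_id`, `span_id`, `parent_span_id`) are assumptions, and each log entry is stored with a timestamp, as OpenTracing describes span logs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    """Trace information that travels with the transaction (names assumed)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


@dataclass
class Span:
    """A named, timed operation with tags and timestamped logs."""
    operation_name: str
    context: SpanContext
    start: float
    end: Optional[float] = None
    tags: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)

    def log(self, timestamp: float, **fields) -> None:
        # Each log entry is a key:value map paired with a timestamp.
        self.logs.append((timestamp, fields))

    def finish(self, timestamp: float) -> None:
        self.end = timestamp

    @property
    def duration(self) -> Optional[float]:
        # None while the span is still open.
        return None if self.end is None else self.end - self.start


if __name__ == "__main__":
    ctx = SpanContext(trace_id="t1", span_id="s1")
    span = Span("check ntp", ctx, start=10.0, tags={"node": "node-1"})
    span.log(10.5, event="service restarted")
    span.finish(12.0)
    print(span.operation_name, span.duration, span.logs)
```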
  31. Event sourcing & tracing. Temporal relationships between spans in a single trace: https://www.jaegertracing.io/docs/1.9/architecture/
  32. Event sourcing & tracing. Configuration management: what would be the traces? Defining the infrastructure state is a trace: each change before validation is a span, and validation, which results in a change request, closes the trace. Computing the node configurations is a trace: computing targets, computing overrides and generating files are spans, and the trace closes with the serialization of the node configurations in the database. Each run on a node is a trace: each configuration check is a span.
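The last mapping on slide 32 (a run on a node is a trace, each configuration check is a span) can be sketched with the standard library alone. Everything named here is hypothetical: `RunTrace`, `run_checks`, the status strings and the way the trace id is built from node id and config id are illustration only, not how Rudder or any tracing library does it.

```python
import itertools
import time
from contextlib import contextmanager


class RunTrace:
    """One agent run on a node, recorded as a trace made of spans."""

    def __init__(self, node_id, config_id):
        self.trace_id = f"{node_id}:{config_id}"  # assumed id scheme
        self.spans = []
        self._ids = itertools.count(1)

    @contextmanager
    def span(self, operation, parent=None, **tags):
        record = {
            "trace_id": self.trace_id,
            "span_id": next(self._ids),
            "parent_id": parent,
            "operation": operation,
            "tags": dict(tags),
            "start": time.monotonic(),
            "end": None,
        }
        self.spans.append(record)
        try:
            yield record
        finally:
            # The span is closed even if the check raises.
            record["end"] = time.monotonic()


def run_checks(node_id, config_id, checks):
    """Run each (name, callable) check as a child span of the run."""
    trace = RunTrace(node_id, config_id)
    with trace.span("agent-run", node_id=node_id, config_id=config_id) as root:
        for name, check in checks:
            with trace.span(name, parent=root["span_id"]) as current:
                current["tags"]["status"] = "success" if check() else "error"
    return trace


if __name__ == "__main__":
    trace = run_checks(
        "node-1",
        "cfg-42",
        [("check ntp", lambda: True), ("check ssh", lambda: False)],
    )
    for s in trace.spans:
        print(s["span_id"], s["parent_id"], s["operation"], s["tags"])
```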
  33. Event sourcing & tracing. Diagram: PARAM; RULE (Id); DIRECTIVE (Id, (Components)); GROUP (Id); Environmental context; Node configuration (Id, Generated, Files, Commit Id), with Metadata (Integrity, CommitId, Signature) and Config (for Rule R, Directive D1, Component C). RUN (Reports, ...) with METADATA (node id, config id, run timestamp); a second RUN whose METADATA also carries a Signature. Flows: Get config; Send configuration reports; Store expected reports; Message bus (twice). Expected reports (node id, config id, timestamp); Run reports; Historization; Compliance historized; Event logs; Change request. Annotations: defining state: trace + spans; trace; run: trace; each step: span.
  34. Event sourcing & tracing. Store traces & events: integrate with systems in place; many tools are compatible with OpenTracing. Correlate with non-observable systems.
  35. What to do with these billions of events? Reactive approach: query, search and analyze traces in case of problems. Proactive approach: process mining (machine learning on these events); detect unusual behaviours, outliers, inconsistencies across systems.
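For the proactive approach on slide 35, a very small stand-in shows the idea of flagging outliers. This is not process mining or machine learning; it is a plain robust-statistics check (modified z-score based on the median absolute deviation) applied to hypothetical run durations per node. The function name, the threshold of 3.5 and the sample data are all assumptions.

```python
from statistics import median


def find_outliers(durations, threshold=3.5):
    """Return node ids whose run duration is unusually far from the median.

    durations: dict mapping a node id to a run duration in seconds.
    Uses the modified z-score 0.6745 * |x - median| / MAD.
    """
    values = list(durations.values())
    if len(values) < 3:
        return []  # too few points to call anything unusual
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values equal the median, so the score
        # cannot be scaled; flag every value that differs from it.
        return sorted(node for node, v in durations.items() if v != med)
    return sorted(
        node
        for node, v in durations.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


if __name__ == "__main__":
    durations = {"n1": 10.0, "n2": 11.0, "n3": 9.0, "n4": 10.5, "n5": 60.0}
    print(find_outliers(durations))
```

The same shape of check could be run over any numeric attribute extracted from spans, such as the number of repaired components per run.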
  36. Closing thoughts. Mark Burgess, founder of configuration management: http://markburgess.org/anomalies.html
  37. Thank you! Any questions? @nico_charles, nicolas@rudder.io
  38. Security? Events, traces and logs hold critical data. Within a simple system, security can be built in (AuthN/AuthZ). For distributed systems, it’s much harder: who can see what? Who defines and enforces the authorizations? Partial visibility of events/traces. Tags on events for authorizations.
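Slide 38 mentions tags on events as a basis for authorizations and partial visibility. One possible reading, sketched below under stated assumptions: a viewer holds grants per tag key, and an event is visible only if it carries an allowed value for every granted key, so anything untagged is hidden by default. The grant format, the tag names and the deny-by-default choice are assumptions for illustration.

```python
def is_visible(event, grants):
    """Decide whether a viewer may see an event.

    event:  dict with a 'tags' dict of key:value strings.
    grants: dict mapping a tag key to the set of values the viewer may see.
    A viewer with no grants sees nothing; a missing tag hides the event.
    """
    if not grants:
        return False
    tags = event.get("tags", {})
    return all(tags.get(key) in allowed for key, allowed in grants.items())


def visible_events(events, grants):
    """Return the partial view of the event stream allowed by the grants."""
    return [event for event in events if is_visible(event, grants)]


if __name__ == "__main__":
    events = [
        {"id": 1, "tags": {"team": "ops", "env": "prod"}},
        {"id": 2, "tags": {"team": "sec", "env": "prod"}},
        {"id": 3, "tags": {"team": "ops"}},  # no env tag: hidden by default
    ]
    grants = {"team": {"ops"}, "env": {"prod", "qa"}}
    print([event["id"] for event in visible_events(events, grants)])
```

This leaves open the harder questions on the slide, namely who defines the grants and which component enforces them.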
