Vayacondios:
Divine into Complex Systems
Huston Hoburg & Flip Kromer
Infochimps, a CSC Company
MongoDB Austin
2014 March 24...
Vayacondios: Divine into Complex Systems

This presentation was given by Flip Kromer and Huston Hoburg on March 24, 2014 at the MongoDB Meetup in Austin.

Vayacondios is a system we're building at Infochimps to gather metrics on highly complex systems and help humans make sense of their operation. You can think of it as a "data goes in, the right thing happens" machine: send in facts from anywhere about anything, and Vayacondios will promptly process and syndicate them to all consumers. Producers don't have to (or get to) worry about the needs of those who will use the data, or the details of transport, storage, filtering or anything else: the data will go where it needs to go. Each consumer, meanwhile, finds that everything they need to know is available to them, on the fly or on demand, without crufty adapters or extraneous dependencies. They don't have to (or get to) worry about the distribution of their sources, the tempo of update, or how the data came to be.

Vayacondios was built for our technical ops team to monitor all the databases and systems they superintend, but it suggests a better way to build database-driven applications of any kind. The quiet tyranny of developing against a traditional database has left us with many bad habits: not duplicating data, using models that serve the query engine rather than the user, and assembling application objects from raw parts on every page refresh. Combining streaming data processing systems with distributed datastores like MongoDB lets you do your query on the way _in_ to the database -- any number of queries, decoupled, of any complexity or tempo. The resulting approach is simpler, fault-tolerant, and scales in terms of both machines and developers. Most importantly, your data models are purely faithful to the needs of your application, uncontaminated by the differing opinions of other consumers or by incidentals of the robots that gather, process, and store the data.

Published in: Engineering

Vayacondios: Divine into Complex Systems

  1. 1. Vayacondios: Divine into Complex Systems Huston Hoburg & Flip Kromer Infochimps, a CSC Company MongoDB Austin 2014 March 24th
  2. 2. Infochimps • Big Data Platform for Large Companies • Cloud::Queries (ElasticSearch, MongoDB, HBase) • Cloud::Hadoop (Dynamic Hadoop) • Cloud::Streams (Storm+Trident) • Managed Service, Enterprise Features • Recently sold to CSC, and it’s quite awesome • We’re Hiring (natch)
  3. 3. Vayacondios • Built for our Visibility Stack… • … but we think it has wider use • “Data Goes In, the Right Thing Happens” • Prompt, Comprehensive and Faithful
  4. 4. Circulatory Immune Clotting
  5. 5. OK, Glass “OK Glass, Show me a skeuomorphism”
  6. 6. Immune Circulatory Digestive Respiratory
  7. 7. Non-Numeric Metrics
  8. 8. Target INR = 2-3 Low Platelets = H.I.T. (bad) Heparin (Blood Thinner)
  9. 9. Low Platelets • Folic Acid, Vitamin B12 • Medication (Valproic Acid, Singulair, Heparin) • Sepsis • HIV • (about three dozen others)
  10. 10. Systems • Anatomical Systems: Circulatory, Immune, etc • Interventions: Drugs, Surgeries, … • Course of Treatment: topline progress indicators • Diagnosis • Practitioner • Medical Devices
  11. 11. ICU • Model the patient, not the data source • Highlight Interactions among systems • Highlight Interactions among numbers • Broaden your view of “systems”
  12. 12. Monitoring Sucks
  13. 13. Operations
  14. 14. System != Machine • Whole-System MongoDB: • Machines it runs on, Volumes it uses • Systems writing to it • Applications and Collections • Data Files, Logs, Repl Sets, Oplog, Arbiters • Codebase repo, Cookbooks, Configuration • Issue Tracker Tickets, Change Events
  15. 15. Operations • Cognitive model for Humans, not from Robots • Go beyond the Time-series Graph • Highlight Interactions • Link to Systems that write to this DB • Link to Github for Repos & Cookbooks • Drill into System • Issues in Issue Tracker • Broaden your view of “systems”
  16. 16. Our Challenge • 15 clients, 15 architectures • < 1 operator per client, 2 continents • 1500 machines in 150 clusters • 30+ technologies (HBase, MongoDB, Storm, …) • 4 Providers (AWS, Metal, VCE, OpenStack) • 3 Virtualizations (AWS, VMWare, OpenStack) • Max 21 minutes downtime / month (99.95% SLA)
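The downtime figure on this slide follows directly from the SLA percentage. As a quick check, assuming a 30-day month (the slide does not say which month length it uses), 0.05% of 43,200 minutes is about 21.6 minutes. The helper name below is made up for illustration.

```ruby
# Downtime budget implied by an availability SLA.
# Assumption: a 30-day month unless told otherwise.
def downtime_budget_minutes(sla_percent, days_in_month = 30)
  minutes_in_month = days_in_month * 24 * 60
  minutes_in_month * (100.0 - sla_percent) / 100.0
end

budget = downtime_budget_minutes(99.95)
puts format("%.1f minutes of downtime allowed per 30-day month", budget)
```

A 31-day month gives a slightly larger budget, so the slide's "21 minutes" reads as the conservative, rounded-down figure.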
  17. 17. Systems to Instrument • WholeSystems: ZookeeperSystem, ElasticsearchSystem, HbaseSystem, HadoopMapredSystem, HadoopHdfsSystem, KafkaSystem, MysqlSystem, MysqlClientSystem, ListenerSetSystem, StormTridentSystem, MongodbSystem, NfsSystem, VayacondiosSystem, TachyonSystem, SplunkSystem, S3System, RdsSystem, PigSystem, HiveSystem, HueSystem • Machines: ZookeeperMachine, ElasticsearchDatanodeMachine, HBaseRegionserverMachine, HBaseMasterMachine, HadoopDnttMachine, HadoopTtonlyMachine, HadoopNamenodeMachine, HadoopJobtrackerMachine, HadoopSecondaryNamenodeMachine, HadoopFailoverMonitorMachine, MysqlServerMachine, KafkaBrokerMachine, PlatformListenerMachine, StormBolterMachine, StormMasterMachine, MongodbMachine, NfsServerMachine, VayacondiosServerMachine, PlatformApiMachine, TachyonServerMachine, HueMachine • Daemons: n, ElasticsearchDaemon, HbaseRegionserverDaemon, HbaseMasterDaemon, HadoopDatanodeDaemon, HadoopTasktrackerDaemon, HadoopNamenodeDaemon, HadoopJobtrackerDaemon, HadoopSecondaryNamenodeDaemon, HadoopFailoverDaemon, KafkaBrokerDaemon, MysqlDaemon, PlatformListenerDaemon, StormNimbusDaemon, StormUiDaemon, StormSupervisorDaemon, MongodbDatanodeDaemon, NfsServerDaemon, NtpDaemon, NfsClientDaemon, VayacondiosServerDaemon, TachyonServerDaemon, PlatformApiServerDaemon, HueBeeswaxDaemon • Providers: AwsProvider, CloudTrailProvider, OpenstackProvider, VceProvider, ChefServerProvider, Route53Provider, ElbProvider • Manifests: most of the above have a planned version and a realized version • Events: MachineLifecycle, CronJobLifecycle, ChefClientLifecycle • Build Artifacts: FitDeployArtifact, DebArtifact, RpmArtifact, GemArtifact, AmiArtifact, OpenstackImageArtifact, VceTemplateArtifact, NpmArtifact, TarballArtifact • PlatformApps: HadoopJobLifecycle (Hive, Pig, Wukong), TridentJobLifecycle, MountweaselLifecycle • OpsProcesses: IncidentLifecycle, ChangeRequestLifecycle, FiredrillLifecycle, GitCommitLifecycle, ProblemLifecycle (JIRA), LunchladyLifecycle
  18. 18. Vayacondios • Visibility Stack for our operations team • Open-sourcing this summer • Internals in Ruby • Access anywhere (HTTP or log file) • MongoDB (but now please forget that fact)
  19. 19. Cognitive Model • MongoDB: • is_a Data store • has_many Network Services • has_many Daemons • has_many Machines • has_many Volumes • has_many Collections • …etc
  20. 20. Model DSL (domain-specific language)
  21. 21. Model DSL (domain-specific language)
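The code on the Model DSL slides did not survive in this transcript. As a rough illustration only, the `is_a` / `has_many` vocabulary from the Cognitive Model slide could be expressed in plain Ruby along these lines. The `SystemModel` base class and everything inside it are hypothetical, not the actual Vayacondios API.

```ruby
# Hypothetical mini-DSL; class and method names are illustrative only.
class SystemModel
  class << self
    # Declare what kind of thing this system is (e.g. :data_store).
    def is_a(kind)
      kinds << kind
    end

    # Declare a one-to-many relationship and define a reader for it.
    def has_many(name)
      relations << name
      define_method(name) { @parts[name] }
    end

    def kinds
      @kinds ||= []
    end

    def relations
      @relations ||= []
    end
  end

  def initialize
    # Each relation starts out as an empty list of parts.
    @parts = Hash.new { |hash, key| hash[key] = [] }
  end
end

# The MongoDB model from the Cognitive Model slide, written in that DSL.
class MongodbSystem < SystemModel
  is_a     :data_store
  has_many :network_services
  has_many :daemons
  has_many :machines
  has_many :volumes
  has_many :collections
end

mongo = MongodbSystem.new
mongo.machines << "mongo-0" << "mongo-1"
puts MongodbSystem.kinds.inspect
puts MongodbSystem.relations.inspect
puts mongo.machines.length
```

Because the relationships are declared rather than coded by hand, other tooling (dashboards, checks) can discover them by reading `MongodbSystem.relations`.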
  22. 22. Faithful • Whiteboard rule: how do folks talk about system? • If you need it, it’s in the system Prompt • As fast as joint laws of Economics & Physics allow Comprehensive
  23. 23. Biographizing Isn’t Pretty
  24. 24. Faithful to Source • crap data => well-formed data • uniform JSON-ready hash • syntax cleaned up • semantically unchanged • encouraged to model it, but let Wookiee win
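A minimal sketch of what "faithful to source" might look like in practice: raw status text becomes a uniform, JSON-ready hash whose syntax is cleaned up but whose meaning is untouched. The raw field names and the `biographize` / `coerce_value` helpers are invented for illustration; the talk does not show the real parsing code.

```ruby
require "json"

# Convert "Key: value" lines into a uniform hash.
# Syntax is normalized (snake_case keys, numeric strings become numbers);
# nothing is renamed to a different concept, dropped, or derived.
def biographize(raw_text)
  raw_text.each_line.with_object({}) do |line, bio|
    key, value = line.split(":", 2)
    next if value.nil? # skip lines that are not key/value pairs
    clean_key = key.strip.downcase.gsub(/[^a-z0-9]+/, "_")
    bio[clean_key] = coerce_value(value.strip)
  end
end

# Turn numeric-looking strings into numbers; leave everything else alone.
def coerce_value(value)
  case value
  when /\A-?\d+\z/      then value.to_i
  when /\A-?\d+\.\d+\z/ then value.to_f
  else value
  end
end

# Hypothetical raw output from some status command.
raw = <<~STATUS
  Uptime Secs:  86400
  Conns-Current: 12
  Repl Set Name: rs0
  Load Avg: 0.75
STATUS

puts JSON.generate(biographize(raw))
```

Interpretation (is 12 connections a lot?) is deliberately left out; on the slides that job belongs to the reporters, not the biography.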
  25. 25. Write Contract • Vaya Con Dios, “Go with God”. As the kingdom of heaven is unknowable, so is further fate of data: • How used • By Whom • How Processed • Where Stored
  26. 26. Reporters/Reports • Assemble Biographies into Reports • Faithful to application • Don’t know when will be run, why, etc
  27. 27. Presentation
  28. 28. Dashboarding
  29. 29. text metric text metric text metric text metric text metric text metric text metric text metric Model-Driven Templates
  30. 30. Repeatable Partials
  31. 31. Model/Presenter/View • Report == Model • Reporter == Presenter • Dashboard .xml == View
  32. 32. Model/Presenter/View • More targets than just dashboard! • Splunk+PagerDuty Alerts • Cucumber tests • Auditing reports (Security, Good Manners)
  33. 33. System Checks • Correctness, Consistency • Attached Directly to the Model • No worthwhile distinction between QA (integration tests) and live Alerts • Drive Splunk+PagerDuty for Alerts • Author Cucumber specs(!) for QA tests
  34. 34. Safe Systems
  35. 35. System Drift • Cognitive Model • Discoverable Interface • Testable Contract
  36. 36. Inevitability • If configured and reported, consistency checks • If reported, dashboard exists • If is_a generic system (eg filesystem), gets correctness tests (eg “capacity < 75%”) • If system A discovers system B: • dashboard has link from A to B • connectivity & security checks from A to B
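One way to read "inevitability" is that a check lives on the generic system type, so anything modeled as that type inherits it automatically. The sketch below assumes that reading; the `Filesystem` module, `ModeledSystem` class and check names are hypothetical, and only the "capacity < 75%" threshold comes from the slide.

```ruby
# Generic system type: anything that is a filesystem gets a capacity check.
module Filesystem
  CAPACITY_LIMIT = 0.75 # "capacity < 75%" from the slide

  def checks
    super + [[:capacity_below_limit, used_fraction < CAPACITY_LIMIT]]
  end
end

# Base class for modeled systems; starts with no checks of its own.
class ModeledSystem
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def checks
    []
  end

  # Names of the checks that are currently failing.
  def failing_checks
    checks.reject { |_check_name, passed| passed }.map(&:first)
  end
end

class DataVolume < ModeledSystem
  include Filesystem

  def initialize(name, used_bytes:, total_bytes:)
    super(name)
    @used_bytes = used_bytes
    @total_bytes = total_bytes
  end

  def used_fraction
    @used_bytes.to_f / @total_bytes
  end
end

roomy = DataVolume.new("data-0", used_bytes: 40, total_bytes: 100)
full  = DataVolume.new("data-1", used_bytes: 90, total_bytes: 100)
puts roomy.failing_checks.inspect
puts full.failing_checks.inspect
```

The same `failing_checks` list could feed both a QA test run and a live alert, which matches the System Checks slide's point that the two need not be distinguished.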
  37. 37. Interaction • Monitoring systems do a terrible job here • Hard sources of failure: • Drift: conceived != realized • Interaction: unexpected consequences • Change: oops
  38. 38. Application Design
  39. 39. Application Design • Visibility into complex systems: • Biography of raw parts (raw Model) => Reporter (Presenter) => Summary of Systems (View-ready Model) • Database-driven Application • Model => Presenter => View
  40. 40. Simple Blog
  41. 41. Blog: Views Author Page Post Page Index Page
  42. 42. Blog: Views Author Page Post Page Index Page PostSynopsisReport PostReport UserReport CommentReport
  43. 43. “Query on the way In” • New/Updated Post: Update Post triggers… • Update PostReport • Update SynopsisReport • Update UserReport
  44. 44. “Query on the way In” • User fullname changes: Update User triggers… • Update UserReport • Update their SynopsisReports • Update their PostReports • Update their CommentReports
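The two "Query on the way In" slides describe writes that fan out into pre-built, view-ready reports. A small in-memory sketch of that idea follows. `BlogStore` and its methods are hypothetical, plain hashes stand in for a datastore such as MongoDB, and the synopsis and comment reports are omitted for brevity.

```ruby
# Writes update the raw records and immediately rebuild every report
# that depends on them, so page views only ever read finished reports.
class BlogStore
  attr_reader :user_reports, :post_reports

  def initialize
    @users = {}
    @posts = {}
    @user_reports = {}
    @post_reports = {}
  end

  # A user change refreshes the user's report and all of their post reports.
  def update_user(id, fullname:)
    @users[id] = { fullname: fullname }
    refresh_user_report(id)
    posts_by(id).each_key { |post_id| refresh_post_report(post_id) }
  end

  # A post change refreshes that post's report and its author's report.
  def update_post(id, author_id:, title:, body:)
    @posts[id] = { author_id: author_id, title: title, body: body }
    refresh_post_report(id)
    refresh_user_report(author_id)
  end

  private

  def posts_by(user_id)
    @posts.select { |_id, post| post[:author_id] == user_id }
  end

  def refresh_post_report(post_id)
    post = @posts.fetch(post_id)
    author = @users.fetch(post[:author_id])
    @post_reports[post_id] = {
      title: post[:title], body: post[:body], author_name: author[:fullname]
    }
  end

  def refresh_user_report(user_id)
    user = @users.fetch(user_id)
    titles = posts_by(user_id).values.map { |post| post[:title] }
    @user_reports[user_id] = { fullname: user[:fullname], post_titles: titles }
  end
end

store = BlogStore.new
store.update_user(1, fullname: "Ada L.")
store.update_post(10, author_id: 1, title: "Hello", body: "First post")
store.update_user(1, fullname: "Ada Lovelace")
puts store.post_reports[10][:author_name]
puts store.user_reports[1][:post_titles].inspect
```

The author's name is duplicated into each post report on purpose: the cost is paid once at write time instead of on every page refresh.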
  45. 45. Vayacondios Contract
  46. 46. Faithful • Whiteboard rule: how do folks talk about system? • If you need it, it’s in the system Prompt • As fast as joint laws of Economics & Physics allow Comprehensive
  47. 47. Faithful • Single concern: subject of the biography • look at what’s offered, look at what reports need Prompt • Run as often as needed (not your concern) Comprehensive
  48. 48. Faithful • One Reporter per Application (*) & Topic • USCE Method: Utiliz’n, Saturat’n, Connections, Errors Prompt • Run as often as needed (not your concern) Comprehensive
  49. 49. Benefits • Separation of concerns: • Source complexity (API, parsing, translation) • Timing • Transport • Individual Applications • Reliability
  50. 50. Benefits • Separation of concerns: Source, Timing, Transport, Individual Applications, Reliability • No external libraries in application • Uniform access times • Reduce risk from multiple dependencies
  51. 51. So What? • There’s not much to it: shims and conventions • VCD is not MongoDB • just like MongoDB is not mmap tables • Power through constraint:
  52. 52. We’re Hiring jobs@infochimps.com github.com/infochimps-labs