PuppetDB
(ノ°ヮ°)ノ*:・゚✧2014
✧
RUN
PDB
deepak giridharagopal
deepak@puppetlabs.com
@grim_radical
[Diagram: the Puppet agent sends facts to the Puppet master, which passes them on to PuppetDB. Yum!]
[Diagram: the Puppet master stores the catalog in PuppetDB and hands it to the Puppet agent. Yum!]
[Diagram: the Puppet agent sends its report to the Puppet master, which passes it on to PuppetDB. Yum!]
[Diagram: the Puppet master sends catalog after catalog to PuppetDB]
Software should be
self-regulating
Self-regulating catalog & fact storage!
Self-regulating report storage!
Self-regulating node storage!
Deduplication of catalogs!
[Diagram: command pipeline with the stages /commands, MQ, Parse, Process, Delayed and Dead Letter Office; commands are tracked by UUID]
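The pipeline labels above can be read as a queue with retries. What follows is only a minimal sketch of that general pattern, not PuppetDB's implementation: the names `submit`, `drain` and `MAX_ATTEMPTS` are made up for illustration, and the in-memory deque stands in for a real message queue.

```python
import uuid
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical retry limit, chosen for the example


def submit(queue, payload):
    """Enqueue a command and return the UUID that identifies it."""
    command_id = str(uuid.uuid4())
    queue.append({"id": command_id, "payload": payload, "attempts": 0})
    return command_id


def drain(queue, handler):
    """Process every queued command.

    A failing command is put back on the queue (the "delayed" path) until
    it has failed MAX_ATTEMPTS times, then it is moved to the dead letter
    list together with the last error.
    """
    processed, dead_letters = [], []
    while queue:
        command = queue.popleft()
        try:
            handler(command["payload"])
            processed.append(command["id"])
        except Exception as error:
            command["attempts"] += 1
            if command["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append((command["id"], repr(error)))
            else:
                queue.append(command)
    return processed, dead_letters
```

The caller gets the UUID back immediately and the work happens later, which is what lets the write path absorb bursts.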
Querying
[Diagram: four nodes (foo.com, bar.com, baz.com, goo.com), each with the same resources: File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo], Exec[echo bar]. Each slide shows one endpoint against that data.]
/v4/resources
/v4/resources/Service
/v4/resources/Service/foo
/v4/nodes/foo.com/resources
/v4/nodes/foo.com/resources/Package
/v4/nodes/foo.com/resources/Package/foo
• Can query facts, nodes, resources, reports, events, metrics
• Advanced queries via AST-based language
• Aggregates, ordering, paging, streaming, subqueries
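As a rough illustration of the endpoint shapes and the AST-based query language, here is a small sketch that only builds URLs and never talks to a server. The helper `resources_url` and the `BASE_URL` value are assumptions made for the example; the query is a JSON array in prefix form, passed in the `query` URL parameter.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # assumption: adjust to your PuppetDB host


def resources_url(node=None, rtype=None, title=None, query=None):
    """Build a /v4 resources URL, optionally scoped to a node, type and title."""
    parts = ["v4"]
    if node is not None:
        parts += ["nodes", node]
    parts.append("resources")
    if rtype is not None:
        parts.append(rtype)
        if title is not None:
            parts.append(title)
    url = BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)
    if query is not None:
        # The AST query travels as JSON in the "query" parameter.
        url += "?" + urlencode({"query": json.dumps(query)})
    return url


# Every File resource titled /foo, expressed as an AST query.
file_foo = ["and", ["=", "type", "File"], ["=", "title", "/foo"]]
example_url = resources_url(query=file_foo)
```

The path-style endpoints cover the common cases; the AST form is for everything else, since operators like `and` nest arbitrarily.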
A long time ago in a galaxy
far, far away…
uh, like here a year ago…
Just released 1.4.0,
Around 11k deployments,
Basic streaming support,
Other stuff…
We’ve been busy!
Soft write failures
Paging support
Resource containment paths in events
Event aggregates
Differential fact storage
Differential edge storage
Pervasive streaming of query results
3dfx Voodoo2 and Soundblaster AWE32 support
Improved de-duplication
Resource parameter caching
PostgreSQL Hot Standby support for faster reads
Debugging of de-duplication algorithm
Full director’s commentary
Compressed responses
A pony
Certificate chain support
Support for Puppet environments
Event subqueries
Prepared statement caching
Direct POST of JSON data in terminus
Brings Aeris from Final Fantasy VII back to life
Profiling support for terminus
Faster message parsing
Structured fact storage and querying
Unified query subsystem in v4 API
Riker’s beard
docs.puppetlabs.com/puppetdb/latest/release_notes.html
We can’t get through that entire list, but I’ll try to highlight a few shiny bits
1.
Differential storage
[Diagram: STORAGE holds bar.com's resources File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo], Exec[echo bar]. bar.com submits a new catalog in which Service[bar] has become Service[zzz]; only that difference needs to be written.]
• Edges and facts, too
• Trades reads for writes, which is a good tradeoff
• PostgreSQL’s heap-only tuples help a lot
~90% fewer writes
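The idea of differential storage can be sketched as a set difference: read what is already stored, then write only what changed. This is an illustration under that assumption, not PuppetDB's actual storage code, and `diff_resources` is a hypothetical helper.

```python
def diff_resources(stored, incoming):
    """Return (to_add, to_remove) that turn the stored set into the incoming one."""
    stored, incoming = set(stored), set(incoming)
    return incoming - stored, stored - incoming


STORED = [
    "File[/foo]", "File[/bar]", "Package[foo]", "Package[bar]",
    "Service[foo]", "Service[bar]", "Exec[echo foo]", "Exec[echo bar]",
]
# The new catalog for bar.com: Service[bar] has become Service[zzz].
INCOMING = [
    "File[/foo]", "File[/bar]", "Package[foo]", "Package[bar]",
    "Service[foo]", "Service[zzz]", "Exec[echo foo]", "Exec[echo bar]",
]

to_add, to_remove = diff_resources(STORED, INCOMING)
```

For the catalog on the slides this yields one resource to add and one to remove, instead of rewriting all eight. The extra read up front is the price paid for skipping the unchanged rows.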
Thanks to the folks at
Spotify, the community,
etc. that helped us with
this!
2.
More effective
de-duplication
Order matters!
{"foo" => "goo",
"bar" => "baz"}
{"bar" => "baz",
"foo" => "goo"}
904d4d…
11c05d…
• Restructuring data prior to hashing results in far fewer false negatives
• The fastest way to persist data is to already have it persisted!
~60-70% boost for users with ordinarily low dedupe rates
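Why order matters for de-duplication can be shown with a small sketch: hashing the serialized data as-is gives different digests for the same content in a different order, while sorting keys first gives one digest. The function names and the choice of SHA-1 over JSON are assumptions for the example, not a description of PuppetDB's hashing.

```python
import hashlib
import json


def naive_hash(mapping):
    """Hash the mapping in whatever order its keys happen to be in."""
    serialized = json.dumps(mapping)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


def canonical_hash(mapping):
    """Hash the mapping after putting it into a canonical (sorted) form."""
    serialized = json.dumps(mapping, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


first = {"foo": "goo", "bar": "baz"}
second = {"bar": "baz", "foo": "goo"}

same_content = first == second
naive_differs = naive_hash(first) != naive_hash(second)
canonical_agrees = canonical_hash(first) == canonical_hash(second)
```

With the naive hash the two mappings look like different catalogs (a false negative for de-duplication) even though they compare equal; with the canonical hash they collapse to a single stored copy.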
Thanks to the folks at
CERN that helped us
debug this!
3.
Hot standby
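[Diagram, before: PuppetDB sends both WRITE and READ traffic to a single PostgreSQL]
[Diagram, after: PuppetDB sends WRITE traffic to PostgreSQL and READ traffic to a PostgreSQL standby, which is kept up to date by REPLICATION]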
Less I/O contention for
reads and writes yields
better throughput
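The read/write split can be pictured as a tiny router that picks a database per statement. This is a deliberately simplified sketch: the `Router` class is hypothetical, the strings stand in for real connections, and classifying statements by their first keyword is an assumption made for brevity.

```python
class Router:
    """Send reads to the standby and everything else to the primary."""

    def __init__(self, primary, standby=None):
        self.primary = primary
        # Without a standby, reads fall back to the primary.
        self.standby = standby if standby is not None else primary

    def target_for(self, sql):
        words = sql.split(None, 1)
        first_keyword = words[0].upper() if words else ""
        return self.standby if first_keyword == "SELECT" else self.primary


router = Router(primary="primary-db", standby="standby-db")
read_target = router.target_for("SELECT * FROM catalogs")
write_target = router.target_for("INSERT INTO catalogs VALUES (1)")
```

Because the standby only replays what the primary has committed, a read may briefly lag behind the latest write; that is the usual trade for the reduced I/O contention.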
4.
Environment support
[Diagram: STORAGE holds File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo], Exec[echo bar] for two nodes: foo.com in TEST and bar.com in PROD]
All Files, please!
Anything for you!
WTF?!
I’m trying my best!
All Files for env TEST, plz!
Anything for you! File[/foo]
Thanks!
We’re friends again! ❤️
• Any reception or transmission of data now includes the environment where possible
• Queries can be isolated to a single environment
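A toy model of the exchange above: once every stored resource carries its environment, "all Files" and "all Files for env TEST" become two different filters. The records and the `find_resources` helper are made up for illustration; which file belongs to which node is an assumption.

```python
# Hypothetical stored resources, each tagged with node and environment.
STORAGE = [
    {"certname": "foo.com", "environment": "TEST", "type": "File", "title": "/foo"},
    {"certname": "foo.com", "environment": "TEST", "type": "Package", "title": "foo"},
    {"certname": "bar.com", "environment": "PROD", "type": "File", "title": "/bar"},
    {"certname": "bar.com", "environment": "PROD", "type": "Service", "title": "bar"},
]


def find_resources(storage, rtype, environment=None):
    """Return resources of one type, optionally limited to one environment."""
    return [
        resource
        for resource in storage
        if resource["type"] == rtype
        and (environment is None or resource["environment"] == environment)
    ]


all_files = [r["title"] for r in find_resources(STORAGE, "File")]
test_files = [r["title"] for r in find_resources(STORAGE, "File", "TEST")]
```

Without the environment tag, TEST and PROD results come back mixed together, which is the "WTF?!" moment on the slide.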
5.
Unified query engine
[Diagram, before: inside PuppetDB, each of Facts, Resources, Nodes, Reports and Events has its own pipeline of Parse, Map valid fields, Map valid operators and Compile to SQL, in front of PostgreSQL]
[Diagram, after: QUERY goes into PuppetDB through one pipeline of Parse to AST, Term rewriting, Apply operators and Compile to SQL, then on to PostgreSQL]
• Common query engine underpinning v4 API
• Operators are available uniformly across all v4 endpoints
• We can add new endpoints and fields much faster
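To make the "one engine, many endpoints" idea concrete, here is a minimal sketch of a shared compiler where an endpoint is nothing more than a table of allowed fields. The field tables, operator set and `compile_query` are assumptions for the example and much simpler than a real query engine (no term rewriting, no subqueries).

```python
# Hypothetical per-endpoint field tables; adding an endpoint means adding a row.
FIELDS = {
    "resources": {"type", "title", "certname", "environment"},
    "nodes": {"certname", "environment"},
}

# Comparison operators shared by every endpoint, mapped to SQL operators.
OPERATORS = {"=": "=", "~": "~"}


def compile_query(endpoint, ast):
    """Compile a prefix-form AST into (where_clause, parameters)."""
    operator, *args = ast
    if operator in ("and", "or"):
        if not args:
            raise ValueError(f"'{operator}' needs at least one clause")
        clauses, params = [], []
        for sub_ast in args:
            clause, sub_params = compile_query(endpoint, sub_ast)
            clauses.append("(" + clause + ")")
            params.extend(sub_params)
        return (" " + operator.upper() + " ").join(clauses), params
    if operator == "not":
        (sub_ast,) = args
        clause, params = compile_query(endpoint, sub_ast)
        return "NOT (" + clause + ")", params
    if operator in OPERATORS:
        field, value = args
        if field not in FIELDS[endpoint]:
            raise ValueError(f"'{field}' is not queryable on {endpoint}")
        # Values are passed as parameters, never spliced into the SQL text.
        return f"{field} {OPERATORS[operator]} %s", [value]
    raise ValueError(f"unknown operator '{operator}'")


where, params = compile_query(
    "resources", ["and", ["=", "type", "File"], ["=", "environment", "TEST"]]
)
```

Since `and`, `or`, `not` and the comparisons live in one place, every endpoint gets them for free, which is the uniformity the bullets describe.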
6.
Structured/Trusted
fact support
{
  "cpus" : {
    "cpu1" : {
      "bogomips": 6000
    }
  },
  "networking" : {
    "eth0" : {
      "ipaddresses" : [ "1.1.1.5" ],
      "macaddresses" : [ "aa:bb:cc:dd:ee:00" ]
    }
  }
}
["=", "path",
["networking",
"eth0",
"macaddresses",
0]]
["~>", "path",
["networking",
"eth.*",
"macaddresses",
".*"]]
• PostgreSQL’s pg_trgm index
• Trusted facts
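The two path queries above can be mimicked in memory: flatten the structured fact into (path, value) pairs, then compare paths either exactly (`=`) or segment by segment with regular expressions (`~>`). This is a sketch of the idea only; `flatten`, `path_equals` and `path_matches` are hypothetical helpers, and matching each pattern against the whole segment is an assumption.

```python
import re

FACTS = {
    "cpus": {"cpu1": {"bogomips": 6000}},
    "networking": {
        "eth0": {
            "ipaddresses": ["1.1.1.5"],
            "macaddresses": ["aa:bb:cc:dd:ee:00"],
        }
    },
}


def flatten(value, prefix=()):
    """Yield (path, leaf_value) pairs; list positions become integer segments."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, prefix + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, prefix + (index,))
    else:
        yield prefix, value


def path_equals(path, wanted):
    """Exact path comparison, like the "=" query."""
    return list(path) == list(wanted)


def path_matches(path, patterns):
    """Per-segment regex comparison, like the "~>" query."""
    return len(path) == len(patterns) and all(
        re.fullmatch(pattern, str(segment)) is not None
        for pattern, segment in zip(patterns, path)
    )


exact = [v for p, v in flatten(FACTS)
         if path_equals(p, ["networking", "eth0", "macaddresses", 0])]
by_regex = [v for p, v in flatten(FACTS)
            if path_matches(p, ["networking", "eth.*", "macaddresses", ".*"])]
```

Both lookups find the same MAC address here; the regex form would also pick up every other `eth` interface on a node that has more than one.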
So where are we now?
More features, but also
more speed
Language bindings:
Ruby, Python,
JavaScript, C/C++, Go,
Clojure, Java…
Trapperkeeper
https://github.com/puppetlabs/trapperkeeper
Puppetboard
https://github.com/nedap/puppetboard
Puppet Explorer
https://github.com/spotify/puppetexplorer
A new deployment
every 10 minutes
Coming soon:
more efficient GC
Coming soon:
simplified query syntax
Coming soon:
historical data
Thanks!

PuppetDB: One Year Faster - PuppetConf 2014