2. openpolicyagent.org
Goals and Non-goals
[Diagram: Policy Management Fabric (out of scope for OPA) and multiple OPA instances]
OPA’s Goal: policy-enable other projects and services, regardless of domain
● Run at the edge to make policy decisions for host-local consumers
● Zero runtime dependencies
● Easy integrations
4. openpolicyagent.org
OPA is an open source, general-purpose policy engine
● Declarative Language (Rego)
○ Is X allowed to call operation Y on resource Z?
○ Which users can SSH into production hosts?
○ What clusters should workload X be deployed to?
○ What annotations must be set on object X?
● Library/Daemon (Go)
○ In-memory storage of data and policies
○ Zero runtime dependencies
○ Evaluation engine: parser, compiler, interpreter
○ Tooling: REPL, test framework, tracing
● Standard Library & Integrations
○ Authorization, admission control, auditing, etc.
○ Kubernetes, Istio, AWS, Terraform, Docker, and more.
7. openpolicyagent.org
OPA is an open source, general-purpose policy engine
[Diagram: OPA holding Data and Logic]
Management API: Management pushes updates
Enforcement API: Service requests decision
Service-specific management sidecar
9. openpolicyagent.org
Enforcement + Management API (REST)
(Tim’s proposal for v2: a small change from v1.)
Management API
    List all policies                  GET /v2/policies
    Insert, modify, delete policies    GET/PUT/DELETE /v2/policies/<path>
    List all data                      GET /v2/data
    Insert, modify, delete raw data    GET/PUT/PATCH/DELETE /v2/data/<path>
Enforcement API
    Get policy decision                GET/POST /v2/decisions/<path>
Query parameters
    ?metrics    include metrics (ex: latency)
    ?watch      stream updates
    ?explain    explain why result is true
10. History
2016: Inception
Requirements
● Decisions about JSON
● Decisions are JSON
● Ease of integration
● Host-local agent
Execution
● New language: Rego
● HTTP API over localhost
● Go binary
2017: Application
Requirements
● Solve real problems
● Build community
● Learn requirements
● Hill-climb implementation
Execution
● Domains: Cloud, Server, Container, Microservices
● Customers, KubeCon, CNCF, Meetups, ...
2018: Hardening
Requirements
● Ease of use
● Performance
● Solve real problems
● Build community
Execution
● v2 of Language/API/Engine
● Leverage Google’s CEL
● Policy Library
● CNCF, Conferences, Users
Today
12. openpolicyagent.org
Dimensions for Use Case Comparison
● Policy
○ What kind of policy?
○ What kind of expressiveness? Iteration, etc.
● Data/context
○ OPA treats data as separate from policy
○ What data does the policy depend on?
○ How does OPA know about that data?
● Decisions
○ Are decisions booleans/strings/numbers/arrays/maps?
● Integration
○ How was the enforcement integration done?
● Policy management
○ How were policies/data pushed into OPA?
● Performance
○ How many queries per second are required?
● Mode
○ Proactive (prevent violations), reactive (fix violations), audit (identify violations)
17. openpolicyagent.org
Use Cases: Kubernetes: Admission Control Decision
Policy Query
POST opa:8181/v1/data/k8s/admission/allow
input:
  kind: Pod
  metadata:
    labels:
      app: nginx
    name: nginx-1493591563-bvl8q
    namespace: production
  spec:
    containers:
    - image: nginx
      name: nginx
      securityContext:
        privileged: true
    dnsPolicy: ClusterFirst
    nodeName: minikube
    restartPolicy: Always
  status:
    containerStatuses:
    - name: nginx
      ready: true
    hostIP: 192.168.99.100
    phase: Running
    podIP: 172.17.0.4
Policy Decision
200 OK
{
  "result": true
}
The policy decision can also be a JSON-patch-like dictionary describing updates to the pod.
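A service could issue the query above roughly as follows. This is a minimal sketch in Go, assuming the policy at k8s/admission/allow yields a boolean. The helper names (buildQuery, decodeDecision, queryAllow) are hypothetical, and the httptest server only stands in for a running OPA so the example is self-contained.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// buildQuery wraps the object under review in the {"input": ...} envelope
// used in the POST body of a data query.
func buildQuery(input interface{}) ([]byte, error) {
	return json.Marshal(map[string]interface{}{"input": input})
}

// decodeDecision extracts a boolean decision from a response body.
// A missing "result" is treated as an undefined decision and reported as an error.
func decodeDecision(body []byte) (bool, error) {
	var resp struct {
		Result *bool `json:"result"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	if resp.Result == nil {
		return false, errors.New("decision undefined")
	}
	return *resp.Result, nil
}

// queryAllow posts the input to <baseURL>/v1/data/k8s/admission/allow.
func queryAllow(baseURL string, input interface{}) (bool, error) {
	payload, err := buildQuery(input)
	if err != nil {
		return false, err
	}
	resp, err := http.Post(baseURL+"/v1/data/k8s/admission/allow",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return decodeDecision(body)
}

func main() {
	// Stand-in for OPA (assumption): always answers like the slide's example.
	opa := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"result": true}`)
	}))
	defer opa.Close()

	// Abbreviated version of the pod from the slide.
	pod := map[string]interface{}{
		"kind": "Pod",
		"metadata": map[string]interface{}{
			"name":      "nginx-1493591563-bvl8q",
			"namespace": "production",
		},
	}
	allowed, err := queryAllow(opa.URL, pod)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("allowed:", allowed)
}
```

Treating an undefined decision as an error is a design choice for this sketch; a caller that wants default-deny semantics could map that error to false instead.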
18. openpolicyagent.org
Use Cases: Kubernetes: Admission Control
Example Policies
● Images may only be pulled from internal registry
● Only scanned images may be deployed in namespaces A, B, and C
● QA team must sign off on image before it is deployed to production
● Stateful deployments must use ‘recreate’ update strategy
● Developers must not modify selectors or labels referred to by selectors after creation
● Containers must have CPU and memory resource requests and limits set
● Containers cannot run with privileged security context
● Services in namespace X should have AWS SSL annotation added
[Diagram: kubectl apply -f app.yaml, apiserver, admission control, OPA]
19. openpolicyagent.org
Use Cases: Kubernetes
● Cluster placement
○ Policy: choose clusters a workload should be deployed to. JSON pointer for analyzing request
○ Data: Depends on cluster metadata (mirrored from k8s)
○ Decision: set of clusters
○ Integration: webhook hardcoded to ask GET /
○ Policy management: k8s ConfigMaps
○ Mode: proactive, reactive, audit
○ Performance: 1s
● Admission control
○ Policy: authorization + modification of incoming request. JSON pointer for analyzing request
○ Data: Depends on pod metadata (mirrored from k8s)
○ Decision: JSON patch describing changes
○ Integration: webhook hardcoded to ask GET /
○ Policy management: k8s ConfigMaps
○ Mode: proactive, reactive, audit
○ Performance: 1s
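To make "Decision: JSON patch describing changes" concrete, here is a minimal sketch in Go of a decision that adds an annotation to an incoming object, expressed as RFC 6902 operations. The annotation key and value, and the names PatchOp, escapePointer, and annotationPatch, are hypothetical. In an OPA deployment the patch would be produced by the policy rather than by hand-written Go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// PatchOp is one RFC 6902 JSON Patch operation.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// escapePointer escapes a key for use as a JSON Pointer token (RFC 6901):
// "~" becomes "~0" and "/" becomes "~1".
func escapePointer(key string) string {
	key = strings.ReplaceAll(key, "~", "~0")
	return strings.ReplaceAll(key, "/", "~1")
}

// annotationPatch returns the operations needed to set an annotation on obj,
// or nil when the annotation already has the wanted value.
// It assumes obj has a "metadata" object, as Kubernetes resources do.
func annotationPatch(obj map[string]interface{}, key, value string) []PatchOp {
	meta, _ := obj["metadata"].(map[string]interface{})
	annotations, ok := meta["annotations"].(map[string]interface{})
	if !ok {
		// No annotations map yet: create it with the single entry.
		return []PatchOp{{
			Op:    "add",
			Path:  "/metadata/annotations",
			Value: map[string]string{key: value},
		}}
	}
	if annotations[key] == value {
		return nil
	}
	// "add" on an existing member replaces its value.
	return []PatchOp{{
		Op:    "add",
		Path:  "/metadata/annotations/" + escapePointer(key),
		Value: value,
	}}
}

func main() {
	service := map[string]interface{}{
		"kind": "Service",
		"metadata": map[string]interface{}{
			"name":        "web",
			"namespace":   "x",
			"annotations": map[string]interface{}{},
		},
	}
	// Hypothetical annotation key and value.
	patch := annotationPatch(service, "example.com/aws-ssl", "enabled")
	out, err := json.MarshalIndent(patch, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

The "/" inside the annotation key has to be escaped as "~1" so that it is not read as a path separator.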
25. openpolicyagent.org
Use Cases: Terraform Architecture
[Diagram: Terraform change, CICD Pipeline, Terraform State, Public Cloud, and two OPA instances]
Risk Management
● Compute risk of infra change
● Limit blast radius based on seniority of author
● Automatic approvals and manual approvals
Public Cloud Resource Audit
● Find public cloud resources not under control of Terraform
● Report violations of policy
29. openpolicyagent.org
New Use Cases
● Ratelimiting
○ Early days of this use case
○ Performance: 1000+ rps
○ Policy: choose ratelimit. Written using GUI/YAML. YAML treated as data in policy.
○ Decision: number
○ Integration: ?
○ Policy management: Custom
○ Mode: proactive
● Data protection: Minio, Kafka, OpenSDS
○ Performance: 1000 rps
○ Policy: AWS IAM policies translated to Rego
○ Decision: allow/deny
○ Integration: GET /path
○ Policy management: Custom minio federation service
○ Mode: proactive
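The rate-limiting case, where user-authored YAML is treated as data and the decision is a number, can be sketched as follows. This is only an illustration in Go under stated assumptions: the document layout (default, per_client) and the function names are hypothetical, JSON stands in for YAML because the Go standard library has no YAML parser, and in OPA the selection logic would live in the policy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// limitsDoc is the user-facing "policy as data" document.
// Field names and client names are hypothetical.
const limitsDoc = `{
  "default": 10,
  "per_client": {"batch-importer": 2, "web-frontend": 100}
}`

// limits mirrors the data document above.
type limits struct {
	Default   int            `json:"default"`
	PerClient map[string]int `json:"per_client"`
}

// loadLimits parses the data document.
func loadLimits(doc string) (limits, error) {
	var l limits
	err := json.Unmarshal([]byte(doc), &l)
	return l, err
}

// chooseLimit is the logic half: it gives the data its meaning.
// The decision is a number (requests per second), not allow/deny.
func chooseLimit(l limits, client string) int {
	if n, ok := l.PerClient[client]; ok {
		return n
	}
	return l.Default
}

func main() {
	l, err := loadLimits(limitsDoc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, client := range []string{"web-frontend", "batch-importer", "unknown"} {
		fmt.Printf("%s: %d rps\n", client, chooseLimit(l, client))
	}
}
```

Users only edit the data document; the meaning of "default" and "per_client" is fixed once in the logic.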
30. openpolicyagent.org
Lessons Learned
● Iteration, JSON pointers, and modules are common expressiveness requirements
○ Policy about images in a k8s pod or about a Terraform plan needs iteration and JSON pointers
● Data as a first-class citizen helps with writing policy
○ YAML/GUI data becomes the user-facing policy language; admin encodes semantics in Logic
● Policy decisions can be more complex than allow/deny
○ Assuming the technology supports it
● Always at the mercy of the system you are integrating with
○ Users willing to modify their application are great! So are systems that support plugins.
● Valuable to operate without a hard dependency on storage
○ Every system already has some storage system; they rarely want another etcd to manage
○ But everyone asks about storage options
● The higher the performance requirement, the simpler the policy
○ The tradeoff is unavoidable, but it’s possible to lessen the impact.
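The first lesson, that policies over a k8s pod need both iteration and JSON pointers, can be illustrated with a small sketch in Go. It resolves a pointer into the pod, iterates over the containers, and reports images outside an internal registry. The function names and the registry host are hypothetical, and the resolver is a minimal reading of RFC 6901 rather than a complete implementation. In OPA this would be a few lines of Rego.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parse decodes a JSON document into generic Go values.
func parse(doc string) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal([]byte(doc), &v)
	return v, err
}

// resolve follows a JSON Pointer through objects and arrays.
// It returns false if any step of the path does not exist.
func resolve(doc interface{}, pointer string) (interface{}, bool) {
	if pointer == "" {
		return doc, true
	}
	if !strings.HasPrefix(pointer, "/") {
		return nil, false
	}
	cur := doc
	for _, tok := range strings.Split(pointer[1:], "/") {
		// Unescape in the order required by RFC 6901: "~1" first, then "~0".
		tok = strings.ReplaceAll(strings.ReplaceAll(tok, "~1", "/"), "~0", "~")
		switch node := cur.(type) {
		case map[string]interface{}:
			next, ok := node[tok]
			if !ok {
				return nil, false
			}
			cur = next
		case []interface{}:
			i, err := strconv.Atoi(tok)
			if err != nil || i < 0 || i >= len(node) {
				return nil, false
			}
			cur = node[i]
		default:
			return nil, false
		}
	}
	return cur, true
}

// disallowedImages iterates over /spec/containers and returns every image
// that is not hosted under the given registry.
func disallowedImages(pod interface{}, registry string) []string {
	node, ok := resolve(pod, "/spec/containers")
	containers, isList := node.([]interface{})
	if !ok || !isList {
		return nil
	}
	var bad []string
	for i := range containers {
		img, ok := resolve(pod, "/spec/containers/"+strconv.Itoa(i)+"/image")
		name, isString := img.(string)
		if ok && isString && !strings.HasPrefix(name, registry+"/") {
			bad = append(bad, name)
		}
	}
	return bad
}

func main() {
	pod, err := parse(`{"kind": "Pod", "spec": {"containers": [
		{"name": "app", "image": "registry.internal/app:1.2"},
		{"name": "nginx", "image": "nginx"}]}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// "registry.internal" is a hypothetical internal registry host.
	fmt.Println("disallowed images:", disallowedImages(pod, "registry.internal"))
}
```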
39. openpolicyagent.org
Integration
[Diagram: OPA embedded in a Go service, OR OPA running next to a service that calls it over HTTP]
HTTP API
    List all policies                      GET v1/policies
    Insert, modify, and delete policies    GET/PUT/DELETE v1/policies/<path>
    Insert and modify raw data             PUT/PATCH v1/data/<path>
    Get policy decision                    GET/POST v1/data/<path>
    Evaluate ad-hoc policy queries         GET v1/query?q=<query>
Query parameters
    ?metrics    include metrics (ex: latency)
    ?watch      stream updates
    ?explain    explain why result is true
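The HTTP integration can be sketched from the client side. The Go example below builds, without sending, a decision URL with the optional query parameters and a PUT request that uploads a policy. The function names, the base address, and the policy text are hypothetical, and the parameter values metrics=true and explain=full are assumptions about how the flags listed above are spelled on the wire.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// dataURL builds the URL for "Get policy decision" (v1/data/<path>),
// optionally asking for metrics and an explanation.
func dataURL(base, path string, metrics bool, explain string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimSuffix(u.Path, "/") + "/v1/data/" + strings.Trim(path, "/")
	q := url.Values{}
	if metrics {
		q.Set("metrics", "true") // assumed value; the slide only shows "?metrics"
	}
	if explain != "" {
		q.Set("explain", explain) // assumed value, e.g. "full"
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// policyRequest builds the request for "Insert, modify ... policies"
// (PUT v1/policies/<path>) with the policy source as a plain-text body.
func policyRequest(base, id, source string) (*http.Request, error) {
	target := strings.TrimSuffix(base, "/") + "/v1/policies/" + id
	req, err := http.NewRequest(http.MethodPut, target, strings.NewReader(source))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "text/plain")
	return req, nil
}

func main() {
	base := "http://localhost:8181" // hypothetical host-local OPA address

	decision, err := dataURL(base, "k8s/admission/allow", true, "full")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("decision URL:", decision)

	// Placeholder policy text, for illustration only.
	req, err := policyRequest(base, "admission", "package k8s.admission\n\ndefault allow := false\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Because the requests are only constructed, the sketch runs without OPA; passing them to an http.Client would perform the actual calls.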
40. openpolicyagent.org
Use Cases: Kubernetes: Admission Control
Example Policies
● Images may only be pulled from internal registry
● Only scanned images may be deployed in namespaces A, B, and C
● QA team must sign off on image before it is deployed to production
● Stateful deployments must use ‘recreate’ update strategy
● Developers must not modify selectors or labels referred to by selectors after creation
● Containers must have CPU and memory resource requests and limits set
● Containers cannot run with privileged security context
● Services in namespace X should have AWS SSL annotation added
● Product teams may only expose services with hostname from whitelist...
[Diagram: kubectl apply -f app.yaml, apiserver, admission control, OPA]