Policy as Code: Guardrails and Compliance with OPA and Sentinel

Enforce infrastructure compliance and governance automatically using Policy as Code with Open Policy Agent (OPA), HashiCorp Sentinel, or AWS Policy.

published: March 25, 2026 reading time: 28 min read author: GeekWorkBench updated: June 17, 2026

Quick Summary

Policy as Code brings governance into automated workflows by writing rules that validate infrastructure configurations before deployment, rejecting bad configurations with clear explanations. Three dominant tools address this space: Open Policy Agent (OPA) as an open-source general-purpose engine, HashiCorp Sentinel for commercial HashiCorp environments, and AWS Policy for native AWS governance. The approach replaces manual review with automated enforcement, making compliance a side effect of the deployment process rather than an afterthought.

Policy as code brings governance and compliance into your automated workflows. Instead of reviewing infrastructure changes manually or hoping naming conventions are followed, you write policies that automatically validate configurations before deployment. Bad configurations get rejected with clear explanations, and compliance becomes a side effect of the deployment process rather than an afterthought.

Three main approaches dominate the landscape. Open Policy Agent (OPA) is an open-source general-purpose policy engine. HashiCorp Sentinel is a commercial product tightly integrated with HashiCorp tools. AWS Policy is the native option for AWS environments. Each has different strengths depending on your stack and requirements.

Introduction

The core idea is straightforward. You write rules that describe what valid infrastructure looks like. During deployment, a policy engine evaluates your proposed configuration against those rules. If the configuration violates a rule, the deployment stops and you get an error explaining what went wrong and why.
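The flow described above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how OPA or Sentinel work internally: the rule functions, the resource fields (`type`, `name`, `region`, `encrypted`), and the region allow-list are all hypothetical.

```python
from typing import Callable, Optional

# A rule inspects one proposed resource and returns a violation message,
# or None when the resource passes.
Rule = Callable[[dict], Optional[str]]

def require_encryption(resource: dict) -> Optional[str]:
    if resource["type"] == "storage_bucket" and not resource.get("encrypted", False):
        return f"{resource['name']}: bucket must be encrypted at rest"
    return None

def restrict_region(resource: dict) -> Optional[str]:
    allowed = {"us-east-1", "eu-west-1"}  # hypothetical allow-list
    if resource.get("region") not in allowed:
        return f"{resource['name']}: region {resource.get('region')!r} is not allowed"
    return None

def evaluate(resources: list[dict], rules: list[Rule]) -> list[str]:
    """Run every rule against every resource and collect the violations."""
    violations = []
    for resource in resources:
        for rule in rules:
            message = rule(resource)
            if message is not None:
                violations.append(message)
    return violations

# Hypothetical proposed configuration: one compliant bucket, one that breaks both rules.
proposed = [
    {"type": "storage_bucket", "name": "logs", "region": "us-east-1", "encrypted": True},
    {"type": "storage_bucket", "name": "scratch", "region": "ap-south-1", "encrypted": False},
]

violations = evaluate(proposed, [require_encryption, restrict_region])
deploy_allowed = not violations  # the deployment stops if anything was violated

for message in violations:
    print(message)
```

A real policy engine adds a dedicated language, data loading, and tooling on top, but the contract is the same: configuration in, list of explained violations out.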

Policies can enforce anything you can express programmatically. Common use cases include requiring encryption at rest, enforcing naming conventions, restricting which regions resources can be deployed to, ensuring cost controls like budget alerts exist, and preventing public exposure of sensitive resources.

The alternative—manual policy review—does not scale. Human reviewers get tired, miss details, and apply rules inconsistently. Automated policy enforcement removes the variability and catches issues before they reach production.

When to Use / When Not to Use

When policy as code makes sense

Policy as code pays off when compliance failures are expensive. If you operate in regulated industries—finance, healthcare, government—or when security incidents from misconfiguration carry significant financial or reputational risk, automated policy enforcement catches problems before they reach production.

Use it when you have multiple teams deploying infrastructure independently. Without automated policy checks, each team might interpret standards differently. A policy library enforced in CI creates consistency without requiring a central approver for every change.

Policy as code also helps when you need audit evidence. Regulated environments often need to prove that controls existed before deployment, not just that someone reviewed a checklist. A policy evaluation log showing which rules passed and which failed is stronger evidence than a sign-off sheet.

When to skip it

Policy as code only makes sense when compliance failures carry real cost. If you’re running a handful of resources with one or two people who know the system, writing policies is overhead you don’t need. A code review from someone familiar with the codebase catches misconfigurations just as well, without the added friction of automated checks.

The real question is how high your change velocity is. If every infrastructure change gets genuine human attention, policy as code is optional. If your team is still experimenting with naming conventions, tagging strategies, or resource architecture, premature policy enforcement locks in decisions before the team has settled on what good looks like. You end up spending engineering cycles updating policies as standards evolve instead of building the actual infrastructure.

One exception worth mentioning: regulated environments where compliance evidence matters. Even a small team in finance, healthcare, or government often needs to demonstrate that controls existed before deployment, not just that someone reviewed a checklist afterward. In those cases, policy as code justifies itself through audit requirements rather than operational risk reduction.

When misconfiguration costs are low and your team is small, manual review works fine. When those costs climb or your team grows past the point where everyone sees every change, automated enforcement becomes worth the investment.

OPA Rego Language Basics

OPA uses a custom language called Rego for writing policies. Rego is declarative—you describe what makes a configuration valid rather than how to validate it.

package terraform.analysis

import future.keywords.if
import future.keywords.in

# Deny S3 buckets without encryption.
# Bind each resource once so both conditions apply to the same bucket.
deny_s3_unencrypted if {
    some rc in input.resource_changes
    rc.type == "aws_s3_bucket"
    not rc.change.after.server_side_encryption_configuration
}

# Deny EC2 instances without tags
deny_ec2_missing_tags if {
    some rc in input.resource_changes
    rc.type == "aws_instance"
    not rc.change.after.tags
}

# Deny RDS instances that are publicly accessible
deny_rds_public_access if {
    some rc in input.resource_changes
    rc.type == "aws_db_instance"
    rc.change.after.publicly_accessible == true
}

# Aggregate rule: defined only when at least one check above fires.
# This is the rule that a query for data.terraform.analysis.deny evaluates.
deny if deny_s3_unencrypted
deny if deny_ec2_missing_tags
deny if deny_rds_public_access

Rego policies operate on input documents. For Terraform, the input is a JSON representation of the plan or state. You write rules that traverse this document and return violations when conditions are met.

The deny prefix is convention—OPA does not require it. You can name your rules anything. The important part is that when a rule evaluates to true, that represents a violation.
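To see what these rules traverse, it can help to look at a trimmed-down input document. The Python sketch below is only an illustration and not part of OPA: the `plan` dictionary imitates the `resource_changes` portion of `terraform show -json` output with made-up resources, and `rds_public_access` is a hypothetical function that mirrors the `deny_rds_public_access` rule.

```python
import json

# Trimmed, hand-written imitation of `terraform show -json` output.
# Real plans carry many more fields; the resource names are made up.
plan = {
    "resource_changes": [
        {
            "address": "aws_db_instance.orders",
            "type": "aws_db_instance",
            "change": {"after": {"publicly_accessible": True}},
        },
        {
            "address": "aws_db_instance.reports",
            "type": "aws_db_instance",
            "change": {"after": {"publicly_accessible": False}},
        },
    ]
}

def rds_public_access(document: dict) -> list[str]:
    """Return the addresses that the deny_rds_public_access rule would match."""
    matches = []
    for rc in document.get("resource_changes", []):
        after = rc["change"].get("after") or {}  # "after" is null for deletions
        if rc["type"] == "aws_db_instance" and after.get("publicly_accessible") is True:
            matches.append(rc["address"])
    return matches

# Print one entry to show the shape a rule walks: type, then change.after.<attribute>.
print(json.dumps(plan["resource_changes"][0], indent=2))
print(rds_public_access(plan))
```

Each Rego rule in the example follows the same path: pick an entry from `resource_changes`, filter on `type`, then test an attribute under `change.after`.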

Writing and Testing Policies

OPA ships with a testing framework that makes policy development iterative rather than trial-and-error.

package terraform.analysis

import future.keywords.if
import future.keywords.in

deny_s3_unencrypted if {
    some bucket in input.resource_changes
    bucket.type == "aws_s3_bucket"
    not bucket.change.after.server_side_encryption_configuration
}

# Tests: OPA only runs rules whose names start with lowercase test_

# A bucket with no encryption block must trigger the deny rule.
# The key is omitted on purpose: an explicit null counts as defined,
# so the negation in the rule would not fire for it.
test_deny_s3_unencrypted_violation if {
    deny_s3_unencrypted with input as {
        "resource_changes": [{
            "type": "aws_s3_bucket",
            "change": {
                "after": {
                    "bucket": "my-bucket"
                }
            }
        }]
    }
}

# An encrypted bucket must not trigger the deny rule.
test_allow_s3_encrypted if {
    not deny_s3_unencrypted with input as {
        "resource_changes": [{
            "type": "aws_s3_bucket",
            "change": {
                "after": {
                    "bucket": "my-bucket",
                    "server_side_encryption_configuration": {
                        "rule": {
                            "apply_server_side_encryption_by_default": {
                                "sse_algorithm": "AES256"
                            }
                        }
                    }
                }
            }
        }]
    }
}

Run tests with opa test. Good test coverage catches regressions when you modify policies and validates that new policies behave as intended.

opa test ./policies/ -v

Integrating with CI/CD Pipelines

The real value of policy as code emerges in automated pipelines. Every infrastructure change gets evaluated against policies before deployment proceeds.

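# GitHub Actions example
name: Terraform Policy Check

on:
  pull_request:
    paths:
      - "**.tf"
      - "policies/**"

jobs:
  OPA:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Terraform may not be preinstalled on the runner image.
      # The wrapper is disabled so that terraform show writes clean JSON.
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false

      - name: Install OPA
        run: |
          curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64
          chmod +x opa
          sudo mv opa /usr/local/bin/opa

      - name: Run Terraform plan
        run: |
          terraform init
          terraform plan -out=plan.tfplan
          terraform show -json plan.tfplan > plan.json

      - name: Evaluate policies
        run: |
          # -d loads the policy directory, -i passes the plan as the input document.
          # Assumes the package defines a deny rule that is undefined when nothing is violated.
          opa eval --fail-defined -d policies/ -i plan.json -f pretty "data.terraform.analysis.deny"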

If any policy returns a violation, the --fail-defined flag causes OPA to exit with a non-zero code, failing the pipeline. Developers see the violation details in the log of the failed check on the pull request; surfacing them as pull request comments requires an extra workflow step.
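If a later pipeline step needs the violations in structured form (for example, to build a summary), the JSON output can be parsed instead of the pretty output. The sketch below is one possible approach: it assumes the `result` / `expressions` / `value` layout that `opa eval -f json` prints, and the function name and sample strings are made up for illustration.

```python
import json

def violations_from_opa(raw_output: str) -> list:
    """Extract rule values from `opa eval -f json` output.

    An undefined query prints an empty JSON object, which yields [].
    """
    document = json.loads(raw_output)
    values = []
    for result in document.get("result", []):
        for expression in result.get("expressions", []):
            value = expression.get("value")
            if isinstance(value, list):
                values.extend(value)   # set-style rule: one entry per violation
            elif value:
                values.append(value)   # boolean-style rule: True means "violated"
    return values

# Hand-written samples of what the command might print.
sample_violation = '{"result": [{"expressions": [{"value": true, "text": "data.terraform.analysis.deny"}]}]}'
sample_messages = '{"result": [{"expressions": [{"value": ["bucket logs is unencrypted"]}]}]}'
sample_clean = "{}"

print(violations_from_opa(sample_violation))
print(violations_from_opa(sample_messages))
print(violations_from_opa(sample_clean))
```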

Terraform Validation with OPA

Beyond CI/CD integration, OPA can validate Terraform code directly without applying changes. This is useful for pre-merge checks that do not require running an actual plan.

The conftest tool integrates OPA with configuration files.

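# .github/workflows/tf-validation.yml
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install conftest
        run: |
          # Save the archive under the same name that tar extracts from
          curl -L -o conftest.tar.gz https://github.com/open-policy-agent/conftest/releases/download/v0.55.0/conftest_0.55.0_Linux_x86_64.tar.gz
          tar xzf conftest.tar.gz conftest
          sudo mv conftest /usr/local/bin/conftest

      - name: Validate Terraform
        run: |
          # conftest reads package main by default, so name the namespace explicitly
          conftest test . --policy policy.rego --namespace terraform.analysis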

Teams using Terraform Cloud (now HCP Terraform) can attach OPA policies to policy sets alongside Sentinel policies. The two languages are unrelated, so moving a policy from one engine to the other means rewriting it rather than porting it.

Real-World Policy Examples

Budget enforcement policies prevent runaway infrastructure costs.

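package terraform.analysis

import future.keywords.if
import future.keywords.in

# Deny plans that contain no budget with a notification configured
deny_missing_cost_alert if {
    not has_budget_alert
}

# aws_budgets_budget is the AWS provider's budget resource;
# its notification blocks appear as a list in the plan JSON
has_budget_alert if {
    some rc in input.resource_changes
    rc.type == "aws_budgets_budget"
    count(rc.change.after.notification) > 0
}

# Deny budgets that track usage instead of cost
deny_unlimited_budget if {
    some rc in input.resource_changes
    rc.type == "aws_budgets_budget"
    rc.change.after.budget_type == "USAGE"
}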

Tagging policies enforce organizational naming standards.

package terraform.analysis

import future.keywords.if
import future.keywords.in

required_tags := {"Environment", "Team", "CostCenter", "Application"}

deny_missing_required_tags if {
    some rc in input.resource_changes
    is_compute_resource(rc.type)
    missing_tags := required_tags - get_tags(rc)
    count(missing_tags) > 0
}

is_compute_resource(resource_type) if {
    resource_type == "aws_instance"
}

is_compute_resource(resource_type) if {
    resource_type == "aws_ecs_service"
}

get_tags(rc) := tags if {
    tags := {k | rc.change.after.tags[k]}
}

Network policies prevent accidental public exposure.

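package terraform.analysis

import future.keywords.if
import future.keywords.in

deny_public_rds if {
    some rc in input.resource_changes
    rc.type == "aws_db_instance"
    rc.change.after.publicly_accessible == true
}

# aws_lb marks internet-facing load balancers with internal = false
deny_public_load_balancer if {
    some rc in input.resource_changes
    rc.type == "aws_lb"
    rc.change.after.internal == false
    not has_waf_access_logging(rc)
}

# Illustrative helper (not defined in the original example): passes when the
# load balancer has access logging enabled. Verifying a WAF association would
# also require looking for an aws_wafv2_web_acl_association resource in the plan.
has_waf_access_logging(rc) if {
    some logs in rc.change.after.access_logs
    logs.enabled == true
}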

OPA vs Sentinel vs AWS Policy Comparison

Aspect	OPA	Sentinel	AWS Policy
Cost	Free, open-source	Commercial (HashiCorp Enterprise)	Free (native AWS)
Scope	Multi-cloud, Kubernetes, CI/CD	HashiCorp tools only	AWS only
Language	Rego	Sentinel (DSL)	IAM JSON/YAML
Testing	Built-in test framework	Built-in test framework	IAM Access Analyzer
CI/CD integration	Native via conftest	Native via Terraform Cloud	Native via AWS Config

Policy Enforcement Flow

flowchart TD
    A[Terraform Plan] --> B[OPA evaluates policies]
    B --> C{All policies pass?}
    C -->|Yes| D[Apply proceeds]
    C -->|No| E[Show violations]
    E --> F[Fix code]
    F --> A
    D --> G[State updated]

Trade-off Analysis

Policy Engine Selection

Factor	OPA	Sentinel	AWS Policy
Cost	Free, open-source	Commercial (Terraform Cloud Enterprise)	Free (native AWS)
Multi-cloud	Full support	HashiCorp only	AWS only
Language paradigm	Rego (declarative, functional)	DSL (imperative-style)	IAM JSON/YAML
Learning curve	Steeper (new language)	Moderate (familiar patterns)	Gentle for AWS users
Policy reuse	Package system for sharing	Module system for sharing	Policy templates
Test framework	Built-in, strong	Built-in, strong	Limited to Access Analyzer
Runtime evaluation	Any stage (plan, apply, admission)	Terraform Cloud only	AWS Config, SCPs
CI/CD integration	Native (conftest, opa)	Native via TFC	Native via AWS Config

OPA Rego Complexity vs Simplicity

Simple policies (allow/deny) are straightforward to write and maintain. The deny_s3_unencrypted pattern—checking a single attribute and returning true if violated—is easy to read and debug.

Complex policies (exceptions, conditions, multi-resource) require careful Rego design. Using helper functions and separating data transformation from logic evaluation keeps policies readable. Writing all logic inline makes the individual pieces hard to test in isolation.

The tradeoff is between brevity and maintainability. Inline policies are shorter but harder to test. Helper functions add indirection but enable unit testing of individual logical components.

A simple policy that checks a single condition looks like this:

package policy.s3

import future.keywords.if
import future.keywords.in

# Fires for any bucket that has no server-side encryption block
deny_s3_unencrypted if {
    some bucket in input.resource.aws_s3_bucket
    not bucket.server_side_encryption_configuration
}

The deny_s3_unencrypted rule fires when an S3 bucket lacks encryption. The structure is flat—one attribute check, one negation. This is easy to trace at runtime: either the condition is true or it is not.

A complex policy that enforces multiple tags across resource types benefits from helper functions:

package policy.tags

import future.keywords.if
import future.keywords.in

# Helper: merge the tag maps found on a resource.
# Assumes tags live under "tags" and "tags_all", as on AWS provider resources.
get_resource_tags(resource) := tags if {
    own := object.get(resource, "tags", {})
    inherited := object.get(resource, "tags_all", {})
    tags := object.union(own, inherited)
}

# Helper: check required tag presence (tags is an object of key/value pairs)
has_required_tags(tags) if {
    required := {"environment", "cost-center", "owner"}
    present := {key | some key, _ in tags}
    missing := required - present
    count(missing) == 0
}

# Main rule: deny resources missing required tags
deny_missing_tags if {
    some resources_of_type in input.resource
    some resource in resources_of_type
    tags := get_resource_tags(resource)
    not has_required_tags(tags)
}

Separating get_resource_tags from has_required_tags lets you test each function independently. You can unit-test the tag extraction logic with mock data without wiring up a full resource graph. You can test the requirement logic without caring which resource type produced the tags.

Testing follows the same split. Simple policies use a single test case per rule:

package policy.s3

import future.keywords.if

test_deny_unencrypted_bucket if {
    deny_s3_unencrypted with input.resource.aws_s3_bucket.anyBucket as {}
}

Complex policies test each helper function with multiple cases:

package policy.tags

import future.keywords.if

test_has_required_tags_all_present if {
    has_required_tags({"environment": "prod", "cost-center": "core", "owner": "platform"})
}

test_has_required_tags_missing_one if {
    not has_required_tags({"environment": "prod", "owner": "platform"})
}

test_get_resource_tags_multiple_sources if {
    tags := get_resource_tags({
        "tags": {"owner": "platform"},
        "tags_all": {"environment": "prod"}
    })
    count(tags) == 2
}

Draw the refactoring line when a rule exceeds two inline conditions or when you find yourself copying the same attribute traversal logic across multiple rules. If you write the same input.resource.aws_s3_bucket[_].server_side_encryption_configuration chain twice, extract it to a helper. If a test case requires more than five assertions to set up, the rule is doing too much.

Performance matters when Rego evaluates large plan files. Every input.resource[_] traversal scans the entire resource graph. A policy with three or four broad traversals on a plan with 500 resources can take seconds to evaluate. Profile with opa eval --profile to identify slow expressions. Narrow the traversal and put cheap filters first: OPA generally evaluates the expressions in a rule body in order and abandons a candidate as soon as one expression fails:

# Slow: walks every resource of every type and filters by type last
deny_s3_unencrypted if {
    some resource_type, resources in input.resource
    some resource in resources
    not resource.server_side_encryption_configuration
    resource_type == "aws_s3_bucket"
}

# Faster: index straight into the S3 buckets, then check encryption
deny_s3_unencrypted if {
    some bucket in input.resource.aws_s3_bucket
    not bucket.server_side_encryption_configuration
}

The second form iterates only over S3 buckets, so the negation is never evaluated for other resource types.

Policy Scope Decisions

Global policies (applied to every resource) catch everything but generate noise when teams have legitimate exceptions. A global “all resources must have cost center tag” creates violations for resources that genuinely cannot have tags.

Resource-type policies target specific risky resources. “Deny S3 buckets without encryption” and “Deny RDS instances publicly accessible” focus on high-impact cases without flagging every untagged resource.

Exception-based policies start with a deny and add exceptions for known legitimate cases. This approach works well but requires maintenance—every legitimate exception becomes a permanent exception unless someone periodically reviews them.

For most organizations, resource-type policies with specific targeting strike the right balance between coverage and noise.

Choosing a policy scope follows a simple heuristic: start with the risk, not the rule. Ask what the worst case is if a resource violates the policy. If the worst case is a data breach or a compliance failure, write a resource-type policy targeting only that resource. If the worst case is cost leakage or operational confusion, a broader scope may be justified—but narrow it to the specific resource category that generates the cost.

Global policies make sense only for a small set of non-negotiable baselines: no public S3 access, no unencrypted databases, no resource that bypasses your tagging taxonomy entirely. Even then, global policies should be combined with resource-type policies for the resources that matter most. A global “tag everything” policy with a resource-type override for “S3 buckets must have encryption” gives you coverage without flagging every untagged resource that genuinely cannot carry a tag.

Exception-based policies are useful when you have a known legitimate case that cannot be remediated immediately. The pattern is common in migrations: you cannot encrypt the legacy bucket until the application team finishes their migration, so you add an exception and set a calendar reminder. The problem is that exceptions accumulate. Every quarter you should audit your exception list and remove the ones that are no longer current. If an exception has been in place for more than 90 days without a remediation plan, it is time to escalate or close it.
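The quarterly audit can be partly automated. The following sketch assumes exceptions are kept as records with an ISO date and an optional remediation note; the field names (`resource_id`, `added`, `remediation_plan`) and the sample entries are hypothetical.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=90)  # the 90-day limit described above

def stale_exceptions(exceptions: list[dict], today: date) -> list[dict]:
    """Return exceptions older than 90 days that have no remediation plan."""
    return [
        entry
        for entry in exceptions
        if today - date.fromisoformat(entry["added"]) > MAX_AGE
        and not entry.get("remediation_plan")
    ]

# Hypothetical exception list.
exceptions = [
    {"resource_id": "aws_s3_bucket.legacy", "added": "2026-01-05", "remediation_plan": ""},
    {"resource_id": "aws_s3_bucket.migrating", "added": "2026-01-05",
     "remediation_plan": "Application team migration in progress"},
    {"resource_id": "aws_s3_bucket.recent", "added": "2026-05-20", "remediation_plan": ""},
]

# Passing the date in explicitly keeps the check reproducible.
overdue = stale_exceptions(exceptions, today=date(2026, 6, 17))
for entry in overdue:
    print(f"escalate or close: {entry['resource_id']}")
```

Only the first entry is reported: the second is old but has a plan, and the third is still inside the 90-day window.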

A practical approach for most teams: start with resource-type policies targeting your highest-risk resources (S3, RDS, IAM roles). Add a global baseline policy only for the three or four rules that apply universally. Review the global policy’s violation output after 30 days—if it is flagging more than 20% of resources, the scope is too broad and needs narrowing.
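The 20% check is simple arithmetic over two sets: the resources the policy evaluated and the resources it flagged. A minimal sketch, with hypothetical resource addresses and function names:

```python
def violation_rate(flagged: set, evaluated: set) -> float:
    """Fraction of evaluated resources that were flagged at least once."""
    if not evaluated:
        return 0.0
    return len(flagged & evaluated) / len(evaluated)

def scope_too_broad(flagged: set, evaluated: set, threshold: float = 0.20) -> bool:
    """True when the policy flags more than the threshold share of resources."""
    return violation_rate(flagged, evaluated) > threshold

# Hypothetical 30-day sample: ten instances evaluated, three flagged.
evaluated = {f"aws_instance.web[{i}]" for i in range(10)}
flagged = {"aws_instance.web[0]", "aws_instance.web[1]", "aws_instance.web[2]"}

print(f"{violation_rate(flagged, evaluated):.0%} flagged")
print("narrow the scope" if scope_too_broad(flagged, evaluated) else "scope looks fine")
```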

Policy Evaluation Timing Decisions

Stage	Benefit	Risk
Pre-commit (local)	Fast feedback, no CI queue	Developers may skip
Pull request CI	Mandatory gate, consistent	Pipeline latency
Plan-time (in Terraform)	Context-aware, sees actual values	Slows down apply
Admission (Kubernetes)	Blocks bad deployments at runtime	Latency impact

The standard approach is pre-merge CI checks with OPA or conftest. This catches violations before they reach any environment. For critical policies, add a second gate at plan-time that evaluates against the actual Terraform plan JSON.

Production Failure Scenarios

Common Policy Failures

Failure	Impact	Mitigation
Overly strict policies	Developers bypass policies or disable enforcement	Start permissive, tighten incrementally
Policy conflicts	Impossible to satisfy two policies simultaneously	Audit policy interactions before deployment
Slow policy evaluation	CI pipeline delays	Cache OPA decisions, optimize Rego queries
Policies not enforced in CI	Violations reach production	Make policy checks mandatory in pipeline
Rego bugs allow violations	False sense of compliance	Write comprehensive tests for every policy

Policy Debug Flow

flowchart TD
    A[Policy violation in CI] --> B[Run OPA locally]
    B --> C[Trace output: opa eval --explain=full]
    C --> D{Is violation real?}
    D -->|Yes| E[Fix infrastructure code]
    D -->|No| F[Fix policy Rego]
    E --> G[Retry pipeline]
    F --> G

Observability Hooks

Track policy enforcement to catch systemic issues and measure compliance.

What to monitor:

Policy violation rate per team or repository
Most commonly violated policies
Policy evaluation duration in CI
Policy change frequency (too many changes may indicate instability)
Exception request rate
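Several of these metrics can be derived from one record per CI run. The sketch below assumes each record carries the team, the rules that fired, and the evaluation time; the record layout and the sample values are hypothetical.

```python
from collections import Counter
from statistics import mean

def summarize(records: list[dict]) -> dict:
    """Aggregate per-team violation counts, top violated policies, and mean evaluation time."""
    per_team = Counter()
    per_policy = Counter()
    for record in records:
        per_team[record["team"]] += len(record["violations"])
        per_policy.update(record["violations"])
    return {
        "violations_per_team": dict(per_team),
        "most_violated": per_policy.most_common(3),
        "mean_eval_seconds": mean(r["eval_seconds"] for r in records) if records else 0.0,
    }

# Hypothetical records, one per pipeline run.
records = [
    {"team": "payments", "violations": ["deny_s3_unencrypted", "deny_rds_public_access"], "eval_seconds": 1.0},
    {"team": "payments", "violations": ["deny_s3_unencrypted"], "eval_seconds": 2.0},
    {"team": "search", "violations": [], "eval_seconds": 3.0},
]

report = summarize(records)
print(report)
```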

# Run OPA with a full evaluation trace to debug
opa eval --fail-defined -d policy.rego -i plan.json --explain=full "data.terraform.analysis.deny"

# Test policies against a plan file
opa eval --fail-defined -d policy.rego -f pretty \
  -i plan.json "data.terraform.analysis.deny"

# Show which rules in the package fire for a plan (rules that do not fire are omitted)
opa eval -d policies/ -i plan.json -f pretty "data.terraform.analysis"

# Check Rego syntax without running
opa check policy.rego

# Run policy tests
opa test ./policies/ -v

Common Pitfalls / Anti-Patterns

Writing policies before understanding the data

OPA policies operate on input documents. For Terraform, that means plan.json or state.json—the JSON output from terraform show -json. Rego policies traverse this structure to find violations, so writing policies without knowing what that structure looks like is backwards.

Start by generating a plan and examining the raw JSON output. Run terraform init && terraform plan -out=plan.tfplan && terraform show -json plan.tfplan > plan.json, then open the file. Look at how resource types are named (aws_s3_bucket, not aws_s3), which attributes live under change.after versus change.before, and which fields are null versus missing entirely. These distinctions matter—a missing field and a null field behave differently in Rego.
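A small script makes this inspection repeatable. The sketch below is one way to do it: it lists each planned resource with its type and separates attributes that are present with a null value from those that carry a value. The function names and the inline sample are made up; in practice the document would come from the plan.json file generated above.

```python
import json

def attribute_state(after: dict, key: str) -> str:
    """Classify an attribute as 'missing', 'null', or 'set'."""
    if key not in after:
        return "missing"
    return "null" if after[key] is None else "set"

def summarize_plan(plan: dict) -> list[dict]:
    """List each planned resource with its attribute names split into set and null."""
    summary = []
    for rc in plan.get("resource_changes", []):
        after = rc.get("change", {}).get("after") or {}  # "after" is null for deletions
        summary.append({
            "address": rc.get("address"),
            "type": rc.get("type"),
            "set": sorted(k for k, v in after.items() if v is not None),
            "null": sorted(k for k, v in after.items() if v is None),
        })
    return summary

# Hand-written sample; for a real plan, load plan.json with json.load instead.
sample_plan = json.loads("""
{
  "resource_changes": [
    {
      "address": "aws_s3_bucket.logs",
      "type": "aws_s3_bucket",
      "change": {"after": {"bucket": "logs", "acl": null}}
    }
  ]
}
""")

for row in summarize_plan(sample_plan):
    print(row)
```

Here `acl` is present but null while `tags` is missing entirely, which is the distinction a Rego `not` expression is sensitive to.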

A common failure mode is writing the policy against the Terraform provider documentation, which does not always match the JSON structure OPA sees. A nested block that the docs show as a single block, such as server_side_encryption_configuration, typically appears in the plan JSON as a list of objects. Booleans show up as strings "true" in some provider versions and proper booleans in others. Checking the actual data upfront resolves these mismatches before you write a single line of Rego.

For Kubernetes manifests, the input is the admission request object. Capture a sample request, for example from OPA's decision logs, and run opa eval against it to see the structure before writing policies. The same principle applies: inspect the actual document, not the expected document.

Making policies too broad

A policy that flags every resource creates noise and trains teams to ignore violations. Instead, target specific risky resources. “Deny S3 buckets without encryption” is actionable. “Deny resources without tags” is noisy when many resources legitimately lack tags.

The concrete cost is alert fatigue. When every resource in a pull request generates a violation, reviewers stop reading the output. A policy that returns 50 violations on a 10-resource change means the 2 real violations in that batch get missed. Teams learn to scroll past the policy check or disable it entirely, which defeats the purpose of enforcement.

The most common source of overly broad policies is copying a global rule from one context to another without adjusting the scope. A “deny untagged resources” rule that works fine in an AWS-only environment flags every Kubernetes pod, Cloud Foundry application, and SaaS resource when you extend policy enforcement across platforms. Before adding a global rule, ask whether it applies uniformly across every resource type in your infrastructure.

Narrowing a broad policy usually means choosing a resource type or a tag as the filter. Instead of “deny resources without cost-center tag,” write “deny aws_instance and aws_db_instance resources without cost-center tag.” This is specific enough to catch the resources that actually generate cost attribution problems without flagging resources that cannot carry user-defined tags.

Audit existing policies for scope creep quarterly. Look for policies that generate more than 10% violation rate across your fleet—that is a signal the rule is either too broad or catching real problems that should be fixed at the source rather than blocked at the policy layer.

Not testing policies

Rego policies are code and can have bugs. A policy with a logic error might pass everything when it should deny, or deny everything when it should pass. Policy defects fall into three types: false passes (a violating resource slips through undetected), false denies (a compliant resource gets flagged incorrectly), and evaluation errors (a rule errors instead of producing the expected result). All three represent defects that ship to production if you skip testing.

OPA’s test framework uses test_ prefixed rules that assert expected policy behavior:

package terraform.analysis

import future.keywords.if
import future.keywords.in

deny_s3_unencrypted if {
    some bucket in input.resource_changes
    bucket.type == "aws_s3_bucket"
    not bucket.change.after.server_side_encryption_configuration
}

# A bucket with no encryption attribute must be denied
test_deny_s3_unencrypted_violation if {
    deny_s3_unencrypted with input.resource_changes as [{
        "type": "aws_s3_bucket",
        "change": {"after": {}}
    }]
}

# A bucket with the encryption attribute present must pass
test_deny_s3_unencrypted_pass if {
    not deny_s3_unencrypted with input.resource_changes as [{
        "type": "aws_s3_bucket",
        "change": {"after": {"server_side_encryption_configuration": {}}}
    }]
}

The first test verifies that an unencrypted bucket triggers a denial. The second confirms that a properly encrypted bucket passes without flagging. Run opa test ./policies/ to execute the full suite—OPA returns non-zero on failure, which fails your CI gate. For a policy library, aim for at least 80% coverage on critical rules that directly block resource creation. Beyond catching bugs, tests function as living compliance documentation: they record exactly which scenarios the policies handle and how they respond, which is exactly what auditors want to see during review.

Hardcoding exceptions in policies

When a legitimate use case violates a policy, the instinct is to add an exception directly in the Rego code. A not exception condition here, an allowed_account_ids list there. This works for the first few exceptions, but exception lists tend to grow indefinitely. Six months later, the policy that started as “deny S3 buckets without encryption” now has forty-seven exceptions for accounts that legitimately do not need encryption, and nobody can remember why each one exists.

The better approach is to keep exception data separate from policy logic. Store exceptions in an external JSON or YAML file that OPA loads as data at evaluation time. The policy checks whether the current resource matches any exception before flagging a violation.

package terraform.analysis

import future.keywords.if
import future.keywords.in

deny_s3_unencrypted if {
    some bucket in input.resource_changes
    bucket.type == "aws_s3_bucket"
    not bucket.change.after.server_side_encryption_configuration
    not is_exception(bucket, "s3_unencrypted")
}

# Plan entries identify a resource by its address (for example aws_s3_bucket.logs)
is_exception(resource, rule_id) if {
    some entry in data.exceptions[rule_id]
    entry.resource_id == resource.address
}

The exception file is readable by non-engineers and reviewable without touching Rego. When a migration project that legitimately needed an unencrypted bucket finishes, you remove the exception from the data file, not from policy code. Exception review becomes a business process rather than an archaeology project.
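One possible shape for that data file is sketched below. Only the `exceptions` → rule id → list of entries with a `resource_id` layout follows from the policy above; the `reason` and `expires` fields, the bucket name, and the Python lookup that mirrors `is_exception` are assumptions added for illustration.

```python
import json

# Document that could be saved as exceptions.json and passed to OPA as data,
# so that it becomes available to policies under data.exceptions.
exceptions_document = {
    "exceptions": {
        "s3_unencrypted": [
            {
                "resource_id": "aws_s3_bucket.legacy_exports",  # hypothetical resource
                "reason": "Application migration in progress",   # hypothetical field
                "expires": "2026-09-30",                         # hypothetical field
            }
        ]
    }
}

def is_exception(resource_id: str, rule_id: str, document: dict = exceptions_document) -> bool:
    """Python mirror of the Rego helper: is this resource exempt from this rule?"""
    entries = document["exceptions"].get(rule_id, [])
    return any(entry["resource_id"] == resource_id for entry in entries)

print(json.dumps(exceptions_document, indent=2))
print(is_exception("aws_s3_bucket.legacy_exports", "s3_unencrypted"))
```

Keeping a reason and an expiry next to each entry gives reviewers what they need to decide whether the exception should still exist.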

If a legitimate case cannot be handled by the policy itself and an exception feels necessary, treat that as a signal the policy needs rethinking. Either the policy is too broad and needs tighter scoping, or the legitimate case reveals a missing dimension in how the policy evaluates risk. When neither option works, escalate through a formal risk acceptance process. That documentation serves as audit evidence—the policy code itself is not the proof.

Interview Questions

1. How does OPA evaluate Terraform plan files?

Expected answer points:

OPA operates on input documents—for Terraform, plan.json or state.json
Terraform plan converts to JSON via `terraform show -json plan.tfplan`
OPA Rego policies traverse the JSON document and return violations when conditions are met
`opa eval --fail-defined` exits non-zero if any policy returns a violation

2. What is the difference between OPA, Sentinel, and AWS Policy?

Expected answer points:

OPA: free, open-source, multi-cloud, uses Rego language
Sentinel: commercial, HashiCorp-only, DSL with imperative-style policies
AWS Policy: free, native AWS, IAM JSON/YAML for SCPs and Access Analyzer
OPA is vendor-neutral; Sentinel locks to HashiCorp; AWS Policy locks to AWS

3. Why should you start with permissive policies and tighten incrementally?

Expected answer points:

Overly strict policies cause developers to bypass or disable enforcement
Starting permissive identifies actual violations before fine-tuning rules
Too strict creates noise, trains teams to ignore violations, defeats purpose
Incrementally tighten as policy library matures and exceptions get documented

4. How do you debug a policy that is not catching violations it should?

Expected answer points:

Run OPA locally with an evaluation trace: `opa eval --explain=full -d policy.rego -i plan.json "data.terraform.analysis.deny"`
Check if the violation is real—if yes, fix infrastructure code; if no, fix policy Rego
Use `opa check` to verify Rego syntax without running
Write comprehensive tests for every policy—Rego bugs cause false passes or false denies

5. Why are hardcoded exceptions in policies problematic?

Expected answer points:

When a legitimate case violates a policy, instinct is to add exception
Over time exceptions accumulate, policy becomes meaningless
Instead: fix the policy to handle the legitimate case, or formalize risk acceptance
Exception-based policies require ongoing maintenance to stay relevant

6. How does conftest integrate OPA with configuration files?

Expected answer points:

conftest is a tool that applies OPA policies to configuration files
Runs outside Terraform, validates HCL/Terraform code directly without plan
Useful for pre-merge checks that do not require running actual plan
`conftest test . --policy policy.rego` validates configs against policies

7. What is the tradeoff between global policies and resource-type policies?

Expected answer points:

Global policies catch everything but generate noise when teams have legitimate exceptions
Resource-type policies target specific risky resources (S3 encryption, RDS public access)
Global "deny untagged resources" is noisy; resource-type is actionable
For most organizations, resource-type policies strike right balance between coverage and noise

8. How do you structure Rego policies for maintainability?

Expected answer points:

Separate data transformation from logic evaluation using helper functions
Inline policies are shorter but harder to test in isolation
Use packages to organize policies by domain (terraform.analysis, kubernetes.admission)
Test each logical component independently, not just end-to-end

9. What monitoring metrics should you track for policy enforcement?

Expected answer points:

Policy violation rate per team or repository: a high rate indicates a policy problem or a training gap
Most commonly violated policies: these may need clearer documentation or an easier path to compliance
Policy evaluation duration in CI: slow evaluation indicates an optimization opportunity
Policy change frequency (too many changes signal instability) and exception request rate

10. When should you evaluate policies at different stages (pre-commit vs CI vs plan-time)?

Expected answer points:

Pre-commit (local): fast feedback, but developers may skip it
Pull request CI: mandatory gate, consistent, adds pipeline latency
Plan-time (in Terraform): context-aware, sees actual values, slows down apply
Standard approach is pre-merge CI with OPA/conftest; critical policies get second gate at plan-time

11. How does OPA's fail-defined flag work and when would you use it in CI/CD?

Expected answer points:

`opa eval --fail-defined` exits with a non-zero code when the query result is defined, so a query that is defined only when violations exist turns violations into a failing exit code
Without `--fail-defined`, OPA exits 0 even if violations exist, and the pipeline continues silently
In CI/CD, `--fail-defined` causes the pipeline to fail when policy violations are detected
Works with `-f pretty` for human-readable output or `-f json` for programmatic parsing
Combined with `-I` for stdin input, evaluates plan.json against policies and blocks bad deployments

12. What is the difference between deny and warn policy rules in OPA?

Expected answer points:

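`deny` rules: by conftest convention, each message a `deny` rule produces is a violation that blocks deployment
`warn` rules: each message a `warn` rule produces is reported as a warning that does not block deployment
OPA itself attaches no special meaning to these rule names; the tool or pipeline that reads the result decides what blocks
Use deny for hard requirements (encryption, no public access), warn for soft recommendations
You can combine both: some policies deny, others warn depending on severity
CI/CD pipelines can be configured to treat warnings as failures for stricter enforcement, for example with conftest's `--fail-on-warn` flag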
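
The deny/warn split above can be sketched as a single conftest-style policy. This is a minimal sketch, not a definitive implementation: it assumes Terraform plan JSON as input, conftest's default `main` namespace, and an OPA release that accepts `import rego.v1`. The `owner` tag convention is a hypothetical example.

```rego
package main

import rego.v1

# Hard requirement: databases must not be publicly accessible.
deny contains msg if {
	some rc in input.resource_changes
	rc.type == "aws_db_instance"
	rc.change.after.publicly_accessible == true
	msg := sprintf("%s: database must not be publicly accessible", [rc.address])
}

# Soft recommendation: instances should carry an "owner" tag
# (hypothetical tagging convention, used only for illustration).
warn contains msg if {
	some rc in input.resource_changes
	rc.type == "aws_instance"
	rc.change.after != null # skip resources that are being destroyed
	not rc.change.after.tags.owner
	msg := sprintf("%s: consider adding an 'owner' tag", [rc.address])
}
```

With this layout, a publicly accessible database fails the run, while a missing tag only produces a warning unless the pipeline is configured to fail on warnings.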

13. How do you structure OPA policies for reuse across different resource types?

Expected answer points:

Use packages to organize policies by domain: `package terraform.analysis.deny`
Create helper functions for common checks: `is_encrypted(rc)`, `has_required_tags(rc)`
Package imports let you share functions across policy files
Data documents allow external configuration without modifying policy code
Test each helper function independently before composing into larger rules
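
One way the helper-based structure above might look in practice is sketched below. It assumes Terraform plan JSON as input and an OPA release that accepts `import rego.v1`. The `required_tags` set and the `planned` rule are hypothetical names introduced for illustration, while `is_encrypted` and `has_required_tags` follow the helper names mentioned above.

```rego
package terraform.analysis

import rego.v1

# Hypothetical required tag set; it could instead come from a data document.
required_tags := {"owner", "environment"}

# Data transformation: resources that will exist after apply
# (destroyed resources have a null "after" state).
planned contains rc if {
	some rc in input.resource_changes
	rc.change.after != null
}

# Helper: true when the planned state enables storage encryption.
is_encrypted(rc) if rc.change.after.storage_encrypted == true

# Helper: true when every required tag key is present.
has_required_tags(rc) if {
	present := object.keys(rc.change.after.tags)
	count(required_tags - present) == 0
}

deny contains msg if {
	some rc in planned
	rc.type == "aws_db_instance"
	not is_encrypted(rc)
	msg := sprintf("%s: storage must be encrypted", [rc.address])
}

deny contains msg if {
	some rc in planned
	rc.type in {"aws_db_instance", "aws_instance"}
	not has_required_tags(rc)
	msg := sprintf("%s: missing one or more required tags", [rc.address])
}
```

Keeping the checks in small functions means each one can be tested with a single mock resource, independent of the `deny` rules that compose them.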

14. What are the limitations of OPA policy evaluation for Terraform plans?

Expected answer points:

OPA evaluates the plan JSON output, not HCL source, and values computed at apply time appear as unknown in the plan, so policies cannot check them
Some resource attributes only appear after apply, not in plan output
Attributes that embed a document as a string, such as IAM policy JSON, must be parsed (for example with `json.unmarshal`) before their structure can be checked
Terraform state must exist for import and resource attribute lookups
Large plans create large JSON that can slow down policy evaluation

15. How does Sentinel differ from OPA for policy enforcement in Terraform Cloud?

Expected answer points:

Sentinel is HashiCorp's commercial policy-as-code product; for Terraform it runs only in Terraform Cloud and Terraform Enterprise, not with the open-source CLI
Sentinel uses its own DSL with imperative-style policy writing
OPA is open-source and vendor-neutral, can evaluate plans locally before TFC submission
Sentinel policies read plan, configuration, state, and run details through built-in imports such as `tfplan/v2`, `tfconfig/v2`, `tfstate/v2`, and `tfrun`
Porting policies between OPA and Sentinel is possible but requires rewriting Rego to Sentinel DSL

16. What is the purpose of the future.keywords import in OPA Rego?

Expected answer points:

`future.keywords.if` enables the `if` keyword, which separates a rule head from its body
`future.keywords.contains` enables the `contains` keyword for multi-value (partial set) rules, as in `deny contains msg if { ... }`
`future.keywords.in` and `future.keywords.every` enable membership checks and universal quantification; `default` is a core keyword and needs no import
OPA 1.0 made these keywords part of the default language, so the imports are needed only on older releases
Makes Rego more readable: `deny if { condition }` vs `deny { condition }`
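
The readability difference is easiest to see side by side. The sketch below assumes Terraform plan JSON as input; the `allowed_types` set is a hypothetical allow-list, and the older rule form is shown only as a comment for comparison.

```rego
package terraform.analysis

# On OPA releases before 1.0, these imports enable the keywords.
import future.keywords.contains
import future.keywords.if
import future.keywords.in

# Hypothetical allow-list of instance types, used only for illustration.
allowed_types := {"t3.micro", "t3.small"}

# Older equivalent without the keywords:
#   deny[msg] {
#       rc := input.resource_changes[_]
#       rc.type == "aws_instance"
#       instance_type := rc.change.after.instance_type
#       not allowed_types[instance_type]
#       msg := sprintf("%s: instance type %s is not allowed", [rc.address, instance_type])
#   }
deny contains msg if {
	some rc in input.resource_changes
	rc.type == "aws_instance"
	instance_type := rc.change.after.instance_type
	not instance_type in allowed_types
	msg := sprintf("%s: instance type %s is not allowed", [rc.address, instance_type])
}
```

The keyword form reads closer to a sentence: the rule's set contains a message if the body holds.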

17. How do you handle policy exceptions without hardcoding them in Rego?

Expected answer points:

Use external data documents to define exceptions rather than hardcoding in policy code
Exception list stored as JSON/YAML, imported as data.exception_list in Rego
Policy checks if the resource matches an exception before flagging violation
Exception list reviewable and manageable without modifying policy logic
Prevents policy code pollution from accumulated hardcoded exceptions
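
A minimal sketch of the data-document approach follows. It assumes a hypothetical `exceptions.json` file loaded alongside the policy so that its contents appear under `data.exception_list`, and it matches exceptions by Terraform resource address. The `is_excepted` helper and the example address are illustrative names, not part of any standard.

```rego
package terraform.analysis

import rego.v1

# Assumed data file, loaded next to the policy (hypothetical contents):
#   {"exception_list": ["aws_db_instance.legacy_reporting"]}

# True when the resource address appears in the reviewed exception list.
# If no exception data is loaded, this is undefined and nothing is excepted.
is_excepted(rc) if rc.address in data.exception_list

deny contains msg if {
	some rc in input.resource_changes
	rc.type == "aws_db_instance"
	rc.change.after.publicly_accessible == true
	not is_excepted(rc)
	msg := sprintf("%s: publicly accessible database without an approved exception", [rc.address])
}
```

Because the list lives outside the policy, adding or retiring an exception is a reviewable data change and the rule logic stays untouched.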

18. What is the tradeoff between policy evaluation speed and policy complexity?

Expected answer points:

Simple deny rules checking single attributes evaluate fast (milliseconds)
Complex rules with nested iterations, multiple helper calls, and deep data traversal are slower
For large Terraform plans, slow policies can add minutes to CI pipeline runtime
Cache frequently evaluated policy decisions to avoid repeated computation
Profile policy evaluation with `opa eval --profile` to identify the most expensive expressions; `--explain=full` prints an evaluation trace, which helps debug logic rather than timing

19. How do you test OPA policies against mocked Terraform plan data?

Expected answer points:

Write test cases using the `with input as` syntax to inject mock data
Test positive cases: resources that should pass (no violation returned)
Test negative cases: resources that should fail (violation returned)
Run `opa test ./policies/ -v` for verbose test output with pass/fail per test
Cover edge cases: null values, empty lists, unexpected data structures
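
The testing points above can be sketched as follows. The policy and its tests share one file here only to keep the example self-contained; in a real repository the tests would normally live in a separate `_test.rego` file. The `mock_plan` helper is a hypothetical name, and the sketch assumes an OPA release that accepts `import rego.v1`.

```rego
package terraform.analysis

import rego.v1

# Policy under test: RDS instances must have storage encryption enabled.
deny contains msg if {
	some rc in input.resource_changes
	rc.type == "aws_db_instance"
	rc.change.after != null # ignore resources being destroyed
	not rc.change.after.storage_encrypted
	msg := sprintf("%s: storage must be encrypted", [rc.address])
}

# Hypothetical helper: builds a one-resource plan with the given "after" state.
mock_plan(after) := {"resource_changes": [{
	"address": "aws_db_instance.example",
	"type": "aws_db_instance",
	"change": {"after": after}
}]}

# Positive case: a compliant resource yields no violations.
test_encrypted_db_passes if {
	plan := mock_plan({"storage_encrypted": true})
	count(deny) == 0 with input as plan
}

# Negative case: an unencrypted resource yields exactly one violation.
test_unencrypted_db_denied if {
	plan := mock_plan({"storage_encrypted": false})
	count(deny) == 1 with input as plan
}

# Edge case: the attribute is missing entirely.
test_missing_attribute_denied if {
	plan := mock_plan({})
	count(deny) == 1 with input as plan
}

# Edge case: a destroyed resource has a null "after" state.
test_destroyed_resource_ignored if {
	plan := mock_plan(null)
	count(deny) == 0 with input as plan
}

# Edge case: a plan with no resource changes.
test_empty_plan_passes if {
	count(deny) == 0 with input as {"resource_changes": []}
}
```

Running `opa test` with `-v` against the directory that holds this file reports each `test_` rule separately.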

20. How does AWS Policy differ from SCPs and when would you use each?

Expected answer points:

AWS Policy (IAM) governs what a specific identity can do—user, role, or service
SCPs (Service Control Policies) set the maximum permissions available to the accounts in an AWS Organization
Use IAM policies for fine-grained resource-level permissions per identity
Use SCPs for guardrails that apply to all accounts: deny certain regions, require encryption
SCPs never grant permissions, only restrict them: an action must be allowed by both the SCP and an IAM policy, and an explicit deny in either one blocks it

Conclusion

Policy as code shifts compliance from manual review to automated enforcement. OPA provides a vendor-neutral, general-purpose policy engine that integrates with Terraform, Kubernetes, and other infrastructure tools. Writing policies in Rego takes practice, but the resulting automated governance is worth the investment.

Start with a few high-impact policies—encryption requirements, tagging enforcement, publicly accessible resource restrictions. Expand coverage as your policy library grows and your team’s confidence with the tooling increases.

For more on DevOps practices, see our post on Cost Optimization, which covers cloud cost governance patterns. For securing your infrastructure and IAM, see Cloud Security. For monitoring policy changes and compliance over time, see Observability Engineering.
