GitOps: Infrastructure as Code with Git for Microservices
Discover GitOps principles and practices for managing microservices infrastructure using Git as the single source of truth.
Managing dozens or hundreds of microservices without a coherent strategy gets messy fast. You need to know what is deployed where, reproduce environments reliably, and recover from failures without digging through manual steps. GitOps solves this by applying version control practices to infrastructure.
The approach took off in the Kubernetes world, but the principles work beyond any single platform. If you run microservices at scale, GitOps changes how you think about deployments, reliability, and team workflows.
What is GitOps and Why It Matters
GitOps takes DevOps best practices (version control, collaboration, compliance) and applies them to infrastructure automation. The core idea: use Git as the single source of truth for application code and infrastructure configuration.
When your entire system state lives in Git, every infrastructure change goes through a pull request. You get peer review, history, and rollback with a single command. Deployment becomes auditable by default, not as an afterthought.
The term came from Weaveworks, but the principles spread across the industry. Kubernetes shops found GitOps compelling because it matches how Kubernetes works.
Traditional infrastructure management often relies on scripts or manual processes. Run a command here, change a config there, and gradually production drifts from what anyone intended. GitOps flips this. Instead of pushing from CI, you define desired state and a controller makes reality match.
Core Principles
GitOps rests on three principles that set it apart from other infrastructure approaches.
Declarative Configuration
Everything in your infrastructure is expressed declaratively. Rather than scripts that run steps, you define what the system should look like. Kubernetes manifests, Helm charts, and infrastructure-as-code templates fit this model. Your Git repository becomes a complete specification of how systems should run.
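As a minimal sketch of what a declarative specification looks like, the snippet below writes a Deployment manifest to `deployment.yaml`. The service name, image, port, and replica count are hypothetical placeholders. The point is that the file states the desired end state (three replicas of one image) rather than the steps to reach it.

```shell
# Write a declarative Deployment manifest (all names and values are illustrative).
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: registry.example.com/user-service:1.4.2
          ports:
            - containerPort: 8080
EOF

# The imperative equivalent would be a sequence of commands such as
# "kubectl create deployment ..." followed by "kubectl scale ...".
# Here the file itself is the specification that gets committed.
echo "wrote deployment.yaml"
```

Committing this file, rather than running commands against the cluster, is what lets an operator later compare it with the live state.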
Versioned and Immutable
Every change produces a new version in Git history. You never modify existing commits; changes go in as new commits. This immutability gives you rollback and audit. If something breaks, you trace what changed, who approved it, and revert to a known-good state.
Pull-Based Reconciliation
A GitOps operator watches the state in Git and compares it to what runs in your cluster. When drift happens, the operator pulls changes to fix it. This contrasts with traditional CI/CD pipelines, where deployment tools push to clusters from outside.
GitOps Operators: ArgoCD and Flux
Two tools dominate GitOps for Kubernetes: ArgoCD and Flux. Both use pull-based reconciliation but differ in style.
ArgoCD, now a CNCF graduated project, gives you a GUI alongside its reconciliation engine. It shows application state, diffs between desired and actual, and offers rollback through the UI. Teams like ArgoCD for visibility across multiple clusters.
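To make this concrete, here is a sketch of an ArgoCD `Application` resource, written to a file so it can be committed. The repository URL, path, and application name are assumptions for illustration. The `automated` sync policy with `selfHeal` is optional and is what turns on automatic drift correction.

```shell
# Write an ArgoCD Application manifest. Repo URL, path, and names are assumptions.
cat > user-service-app.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/environment-repo.git
    targetRevision: main
    path: production
  destination:
    server: https://kubernetes.default.svc
    namespace: user-service
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made in the cluster
EOF

# Applying it requires a cluster with ArgoCD installed:
#   kubectl apply -n argocd -f user-service-app.yaml
echo "wrote user-service-app.yaml"
```

Leaving out `syncPolicy.automated` keeps ArgoCD in a detect-and-report mode where a person triggers each sync, which some teams prefer for production.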
Flux takes a more Git-native approach, designed to fit developer workflows. It uses Custom Resources to define sources and deployments. Flux v2 is built from modular controllers, supports multi-tenancy, and pairs with the companion project Flagger for progressive delivery.
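A rough Flux equivalent uses two Custom Resources: a `GitRepository` that says where to fetch from and a `Kustomization` that says what to apply. The sketch below assumes a hypothetical repository URL and path, and the `v1` API versions assume a reasonably recent Flux v2 release.

```shell
# Write Flux source and Kustomization manifests. URL, names, and path are assumptions.
cat > user-service-flux.yaml <<'EOF'
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: environment-repo
  namespace: flux-system
spec:
  interval: 1m          # how often to poll Git for new commits
  url: https://git.example.com/platform/environment-repo.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: user-service
  namespace: flux-system
spec:
  interval: 10m         # how often to re-check the cluster for drift
  sourceRef:
    kind: GitRepository
    name: environment-repo
  path: ./production
  prune: true           # delete resources that were removed from Git
EOF

# Applying it requires a cluster with Flux installed:
#   kubectl apply -f user-service-flux.yaml
echo "wrote user-service-flux.yaml"
```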
The choice comes down to team preference. ArgoCD suits teams wanting operational visibility. Flux suits teams prioritizing Git-native workflows and programmatic control. Either works fine.
Repository Structure
How you organize Git repos affects how well GitOps works. Two patterns dominate.
Monorepo Approach
Some teams put everything in one repository. Application manifests, infrastructure configs, environment overrides all together. This simplifies management and makes cross-cutting changes easy. But it creates contention when multiple teams work there, and access control gets coarse.
App Repo and Environment Repo Pattern
A more common approach separates concerns. Each microservice has its own repo with code and Kubernetes manifests. A separate environment repo holds configuration for each environment, references app repos, and defines how services compose.
app-repo/
├── deployment.yaml
├── service.yaml
└── ingress.yaml

environment-repo/
├── production/
│   ├── namespace.yaml
│   ├── configmap.yaml
│   └── kustomization.yaml
└── staging/
    ├── namespace.yaml
    ├── configmap.yaml
    └── kustomization.yaml
This separation creates boundaries. Teams iterate on services independently while platform teams manage environments centrally.
Kustomize and Helm both work well in GitOps workflows. Kustomize does lightweight patching. Helm handles sophisticated templating and packaging. Pick based on your complexity needs.
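As an illustration of the lightweight-patching style, the sketch below writes a production `kustomization.yaml` into the environment repo layout shown earlier. The image name and tag are hypothetical. The `images` entry is the field a CI job would typically rewrite to roll out a new build.

```shell
# Create the production overlay file (image name and tag are illustrative).
mkdir -p environment-repo/production
cat > environment-repo/production/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - namespace.yaml
  - configmap.yaml
images:
  - name: registry.example.com/user-service
    newTag: "3f2a9c1"   # pinned to a specific build
EOF

# Once namespace.yaml and configmap.yaml exist, the overlay can be rendered with:
#   kubectl kustomize environment-repo/production
echo "wrote environment-repo/production/kustomization.yaml"
```

A staging overlay would differ only in the namespace and the pinned tag, which keeps environment differences small and reviewable.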
Secret Management with GitOps
Secrets trip up most GitOps implementations. GitOps wants everything in Git, but you cannot commit plaintext secrets. Several approaches handle this.
Sealed Secrets from Bitnami
Sealed Secrets encrypts using public-key cryptography. You encrypt a Secret manifest locally with the Sealed Secrets controller public key, then commit the encrypted version. The controller in your cluster decrypts and creates the actual Secret. Encrypted secrets live in Git, but only your cluster reads them.
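The workflow might look like the sketch below. The secret name, namespace, and value are placeholders, and the `kubeseal` calls are left as comments because they need the CLI installed and the controller's public certificate available.

```shell
# Write a plain Secret manifest locally. This file must NOT be committed.
cat > secret.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: user-service-db
  namespace: production
type: Opaque
stringData:
  password: change-me
EOF

# With kubeseal installed and the controller's public certificate saved as
# pub-cert.pem (for example via "kubeseal --fetch-cert > pub-cert.pem"),
# encrypt the Secret offline and commit only the sealed output:
#   kubeseal --cert pub-cert.pem --format yaml < secret.yaml > sealed-secret.yaml
#   git add sealed-secret.yaml

# Keep the plaintext file out of Git in any case.
echo "secret.yaml" >> .gitignore
```

Sealing against a saved certificate means developers and CI jobs can produce sealed secrets without cluster access.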
HashiCorp Vault Integration
Vault gives you centralized secret management with dynamic credentials, access control, and audit logging. GitOps operators can fetch secrets at deployment time. This keeps secrets out of Git but adds a Vault dependency.
External Secrets Operator
The External Secrets Operator bridges Kubernetes Secrets with external secret stores. It defines an ExternalSecret resource pointing to your secret store. The operator syncs values into the cluster as standard Kubernetes Secrets. Works with AWS Secrets Manager, GCP Secret Manager, Azure Key Vault.
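A sketch of such an `ExternalSecret` follows. The store name, remote key, and property are hypothetical, and the `apiVersion` shown depends on the operator version installed, so check which versions your cluster's CRD serves before using it.

```shell
# Write an ExternalSecret manifest. Store name and remote key are assumptions.
cat > user-service-db-externalsecret.yaml <<'EOF'
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: user-service-db
  namespace: production
spec:
  refreshInterval: 1h            # how often to re-read the external store
  secretStoreRef:
    name: aws-secrets-manager    # a ClusterSecretStore defined separately
    kind: ClusterSecretStore
  target:
    name: user-service-db        # name of the Kubernetes Secret to create
  data:
    - secretKey: password
      remoteRef:
        key: prod/user-service/db
        property: password
EOF

echo "wrote user-service-db-externalsecret.yaml"
```

Only this pointer is committed to Git. The secret value itself stays in the external store.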
Each approach trades convenience against security and complexity. Sealed Secrets is simple but needs key management. Vault is powerful but adds infrastructure. External Secrets is flexible but introduces another component to run.
Drift Detection and Automatic Reconciliation
Configuration drift happens. Someone runs kubectl directly. A helm upgrade changes values unintentionally. A hotfix applied during an incident never makes it back to Git. GitOps operators watch for this drift and fix it.
The operator periodically fetches desired state from Git and compares it to actual cluster state. When differences appear, it applies the desired configuration to restore alignment.
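The loop can be sketched with nothing more than two directories and `diff`. This is a toy model and not how ArgoCD or Flux are implemented: `desired/` stands in for the Git checkout, `actual/` stands in for the cluster, and the file contents are made up.

```shell
# Toy reconciliation loop: two directories stand in for Git and the cluster.
workdir=$(mktemp -d)
mkdir -p "$workdir/desired" "$workdir/actual"

# Desired state as it would be checked out from Git.
printf 'replicas: 3\nimage: user-service:1.4.2\n' > "$workdir/desired/deployment.yaml"
# Actual state after someone scaled the service by hand.
printf 'replicas: 5\nimage: user-service:1.4.2\n' > "$workdir/actual/deployment.yaml"

reconcile() {
  # diff -r exits 0 when the trees match and non-zero when they differ.
  if diff -r "$workdir/desired" "$workdir/actual" > /dev/null; then
    echo "in sync"
  else
    echo "drift detected, applying desired state"
    cp -R "$workdir/desired/." "$workdir/actual/"
  fi
}

first=$(reconcile)    # detects the manual change and overwrites it
second=$(reconcile)   # nothing left to do
echo "$first"
echo "$second"
```

A real operator runs this comparison on a timer or on Git webhooks, and can also prune resources that exist in the cluster but not in Git, which this sketch does not attempt.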
With automatic correction enabled, the payoff is practical: manual mistakes get reverted without anyone having to notice them first, consistent state becomes the default, and you stop waking up to snowflake clusters that drifted overnight.
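The same reconciliation model can reach beyond Kubernetes objects, though not directly. ArgoCD and Flux reconcile Kubernetes resources, so cloud infrastructure usually comes in through a controller that represents it as custom resources, such as Crossplane or a Terraform controller.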
GitOps Flow Diagram
The following diagram shows the typical GitOps workflow and how changes move through the system.
graph TD
A[Developer] -->|Push Code| B[Application Repo]
A -->|Pull Request| C[Environment Repo]
B -->|Image Build| D[Container Registry]
C -->|ArgoCD/Flux| E[GitOps Operator]
D -->|Image Tag Update| C
E -->|Sync| F[Kubernetes Cluster]
F -->|State Check| E
G[Secrets Vault] -->|Fetch Secrets| E
Developers push code to application repos or submit pull requests to environment repos. When code is pushed, CI builds container images and updates image tags in the environment repo. The GitOps operator detects the change, compares it against cluster state, and syncs the desired configuration.
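The image tag update is the one step where CI writes to the environment repo. A minimal sketch of that step, assuming the tag lives in a Kustomize `images` entry and using made-up tag values:

```shell
# Simulate the CI step that bumps an image tag in the environment repo.
envdir=$(mktemp -d)
cat > "$envdir/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
images:
  - name: registry.example.com/user-service
    newTag: "3f2a9c1"
EOF

new_tag="8b41d7e"   # in CI this would typically be the short commit SHA

# Rewrite the newTag line via a temp file (portable across sed variants).
sed "s/newTag: .*/newTag: \"$new_tag\"/" "$envdir/kustomization.yaml" > "$envdir/kustomization.yaml.tmp"
mv "$envdir/kustomization.yaml.tmp" "$envdir/kustomization.yaml"

grep newTag "$envdir/kustomization.yaml"
# A real pipeline would then commit and push, or open a pull request:
#   git commit -am "user-service: deploy $new_tag" && git push
```

The `sed` rewrite assumes a single `images` entry per file. With several images in one overlay, a tool that understands the file structure, such as `kustomize edit set image`, is the safer choice.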
Comparison with Traditional CI/CD Push Model
Traditional CI/CD uses a push model. CI builds an image, then uses kubectl or helm to push directly to clusters. Credentials live in CI systems. Deployments come from external systems connecting inward.
GitOps flips this. A GitOps operator inside your cluster pulls changes and applies them. Credentials do not leave the cluster for deployments. The cluster controls when to sync, not an external system.
The push model works fine for simpler setups. GitOps shines when managing multiple clusters, prioritizing security, or needing continuous reconciliation. Pick based on your actual needs, not hype.
Benefits of GitOps
GitOps changes how teams manage infrastructure.
Strong Audit Trail
Every change flows through Git pull requests. You get a complete history of who changed what, when, and why. Code review gates infrastructure changes. Compliance requirements for change tracking are met as a side effect of the normal workflow.
Quick Rollback
Reverting a bad change means reverting a Git commit and waiting for the operator to sync. This works even when the bad change was a new application version, because the image tag is part of the committed state. Rollback drops from minutes of manual work to a Git revert plus one sync cycle.
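The Git side of a rollback can be tried in a throwaway repository. The sketch below uses a one-line stand-in for a manifest and hypothetical version numbers. It shows that `git revert` adds a new commit restoring the earlier content instead of rewriting history.

```shell
# Demonstrate rollback-by-revert in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity used for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'image: user-service:1.4.2' > deployment.yaml
git add deployment.yaml
git commit -q -m "Deploy 1.4.2"

echo 'image: user-service:1.5.0' > deployment.yaml
git commit -q -am "Deploy 1.5.0"

# 1.5.0 turns out to be bad: revert adds a new commit that restores 1.4.2.
git revert --no-edit HEAD > /dev/null

cat deployment.yaml
git log --oneline
```

After the revert is pushed, the operator sees a new commit like any other and syncs the cluster back to the earlier version.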
Improved Developer Experience
Developers work with infrastructure through familiar Git workflows. Routine deployments no longer require kubectl fluency or memorized procedures. Reviewing infrastructure changes feels like reviewing code.
Consistency Across Environments
The same manifests can target multiple environments. Promoting to production means updating a reference, not manually re-applying configs. Environment parity improves.
Security
Clusters do not need inbound access from deployment tools. The operator pulls changes, removing an attack vector. Credentials stay in the cluster instead of circulating to external systems.
Challenges and Considerations
GitOps adds complexity you have to handle.
Secret Management
Secrets need extra tooling. Committing secrets to Git is not an option. Each secret strategy adds operational overhead.
Large Monorepo Performance
When everything lives in one repo, operations slow down. Git history grows large, diffs become unwieldy, merge conflicts multiply. Careful repo design helps, but it is an ongoing problem.
Learning Curve
Teams used to imperative tools need time to think declaratively. Understanding reconciliation and debugging when it fails takes effort.
Operator Maintenance
The GitOps operator becomes critical infrastructure. Keeping it updated, monitoring its health, and troubleshooting issues needs attention.
Conclusion
GitOps changes how teams manage microservices infrastructure. By using Git as the source of truth with pull-based reconciliation, you get auditability, fast rollback, and environment consistency. The approach fits Kubernetes well since both use declarative models.
Adopting GitOps requires upfront investment in tooling, repo structure, and team training. For teams running multiple clusters or wanting operational visibility, it pays off. Even for smaller deployments, the discipline GitOps forces around version control and review helps.
If you run Kubernetes and have not tried GitOps, ArgoCD or Flux are worth exploring. Start small with one application, see how reconciliation works in practice, then expand.
When to Use / When Not to Use
GitOps solves specific problems but adds complexity that only pays off in certain contexts.
When to Use GitOps
Use GitOps when:
- Running Kubernetes in production with multiple clusters
- Compliance requires complete audit trails for infrastructure changes
- Teams need self-service deployments without cluster credentials
- Environment consistency matters across development, staging, and production
- Rollback speed is critical for incident response
- Multiple teams deploy to shared clusters
- You want single source of truth for both application and infrastructure state
Use GitOps for multi-cluster management when:
- Managing multiple environments (dev, staging, production) with similar configurations
- Running across multiple cloud providers or regions
- Disaster recovery requires rapid environment reproduction
- On-call engineers need quick visibility into cluster state
When Not to Use GitOps
Consider alternatives when:
- Single cluster, single application, simple deployment needs
- Team is small and all members have direct cluster access
- Existing CI/CD pipeline already works well for deployments
- Strictly imperative infrastructure management is required (rare but valid cases)
- Learning curve would block adoption before benefits materialize
- External systems manage cluster state directly (some managed Kubernetes offerings)
GitOps vs Alternatives Trade-offs
| Approach | Best For | Limitations |
|---|---|---|
| GitOps (ArgoCD/Flux) | Declarative infra, audit trails, multi-cluster | Learning curve, additional operators to maintain |
| CI/CD Push Model | Simple setups, existing CI tooling | Credentials outside cluster, no drift correction |
| Infrastructure as Code (Terraform) | Cloud provisioning, heterogeneous infra | State management, not Kubernetes-native |
| kubectl scripts | One-off operations, emergencies | No audit trail, not repeatable, manual |
| Helm-only | Application templating without GitOps | No automatic sync, no UI, limited visibility |
Tool Selection: ArgoCD vs Flux
graph TD
A[GitOps Operator Selection] --> B{Team Priorities}
B -->|Operational Visibility| C[ArgoCD]
B -->|Git-Native Workflows| D[Flux]
B -->|Multi-Tenant Setup| E[ArgoCD]
B -->|Minimal Footprint| F[Flux]
C --> G[GUI, Application CRD, Rollback UI]
D --> H[Controller CRDs, CLI, Programmatic]
E --> G
F --> H
Production Failure Scenarios
GitOps failures can leave clusters in inconsistent states or block deployments entirely. Understanding these scenarios helps you design resilient GitOps workflows.
Common GitOps Failures
| Failure | Impact | Mitigation |
|---|---|---|
| GitOps operator crash | Cluster drifts without correction | High availability deployment, monitoring |
| Repository branch mismatch | Wrong version deployed to cluster | Branch protection, PR reviews, environment gates |
| Image tag confusion | Unknown version running, hard to rollback | Use git SHA tags, immutable image references |
| Sealing key lost or deleted after rotation | Existing sealed secrets become unreadable | Back up sealing keys, retain old keys after rotation, re-seal secrets with the new key |
| Helm chart dependency failure | Deployment fails, partial state | Lockfile for dependencies, test upgrades |
| Cluster connectivity loss | Operator cannot sync, drift accumulates | Local cache for offline operation, alerts |
| Ingress/traffic split mismatch | App unreachable after deploy | Health checks, canary analysis, traffic monitoring |
| CRD version mismatch | Custom resources fail to apply | Test in staging first, version pinning |
Drift Detection Failures
graph TD
A[GitOps Reconciliation Loop] --> B[Fetch Desired State]
B --> C[Compare to Cluster State]
C --> D{Diff Found?}
D -->|Yes| E[Apply Diff]
D -->|No| F[Healthy]
E --> G{Apply Succeeded?}
G -->|No| H[Mark Out of Sync]
G -->|Yes| I[Sync Successful]
H --> J[Alert Team]
I --> A
J --> K[Manual Intervention]
Secret Management Failures
| Failure | Impact | Mitigation |
|---|---|---|
| Sealed Secrets private key lost | Cannot decrypt secrets, services fail | Backup keys in secure location, key rotation |
| Vault unavailable during deploy | Pods fail to start, deployment blocks | Cache secrets locally, graceful degradation |
| External Secrets sync failure | Stale or missing secrets | Retry logic, alerting on sync errors |
| Secret rotation during deploy | Application crashes with old secret | Graceful secret reload, zero-downtime rotation |
Recovery Procedures
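# Force sync an ArgoCD application
argocd app sync user-service --force
# Roll back to the previously deployed revision (automated sync must be disabled first)
argocd app rollback user-service
# Check Flux reconciliation status
flux get kustomizations user-service
# Trigger an immediate Flux reconciliation, refreshing the Git source first
flux reconcile kustomization user-service --with-source
# Suspend Flux reconciliation (for emergencies); undo with flux resume
flux suspend kustomization user-service
# View reconciliation logs from the ArgoCD application controller
kubectl logs -n argocd statefulset/argocd-application-controller
# Check the Sealed Secrets controller (label and namespace depend on the install method)
kubectl get pods -n kube-system -l name=sealed-secrets-controller
# Back up the sealing keys (store this file securely, never in Git)
kubectl get secret -n kube-system -l sealedsecrets.bitnami.com/sealed-secrets-key -o yaml > sealed-secrets-keys.yaml
# Emergency offline decryption with a backed-up private key
kubeseal --recovery-unseal --recovery-private-key sealed-secrets-keys.yaml < sealed-secret.yaml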
Multi-Cluster GitOps Failures
| Failure | Impact | Mitigation |
|---|---|---|
| Cluster-specific config drift | One cluster differs from others | Cluster fleet auditing, compliance checks |
| Cross-cluster dependency mismatch | Service A in cluster 1 incompatible with Service B in cluster 2 | Contract testing, version synchronization |
| Network partition between clusters | GitOps operator isolated, drift accumulates | Regional Git repos, local caching |
Quick Recap
Key Takeaways
- GitOps uses Git as single source of truth for declarative infrastructure
- Pull-based reconciliation keeps clusters matching desired state automatically
- ArgoCD suits teams wanting operational visibility; Flux suits Git-native workflows
- Secret management is the hardest part: Sealed Secrets, Vault, or External Secrets all have trade-offs
- Repository structure matters: app repos + environment repos pattern scales well
- Drift detection and automatic correction are GitOps superpowers
GitOps Readiness Checklist
# Verify ArgoCD CLI access
argocd cluster list
# Check application sync status
argocd app list
# Validate Kubernetes manifests in repo
kubectl apply --dry-run=server -f k8s/
# Test Helm template rendering
helm template myapp ./charts/myapp --debug
# Verify sealed secrets controller
kubectl get crd sealedsecrets.bitnami.com
# Check Flux CRDs installed
kubectl get crd | grep flux
# Validate Git repo structure
git ls-tree -r HEAD --name-only | head -50
Pre-Adoption Checklist
- Kubernetes clusters provisioned and accessible
- Git repository structure designed (app repos, environment repos, or monorepo)
- Secret management strategy selected and implemented
- CI pipeline configured to build and push container images
- GitOps operator (ArgoCD or Flux) installed and configured
- Initial application manifests committed and deploying
- Rollback procedure documented and tested
- Monitoring for GitOps operator health configured
- Team training completed on GitOps workflows
Related Posts
- Kubernetes: Container Orchestration for Microservices - Core Kubernetes concepts that complement GitOps workflows
- Helm Charts: Package Management for Kubernetes - How Helm fits into GitOps repository structures
- Advanced Kubernetes Patterns - Advanced patterns that work well with GitOps deployment models
- Microservices Learning Roadmap - Structured learning path including GitOps fundamentals
- DevOps Learning Roadmap - Broader DevOps context including GitOps principles
- CI/CD Pipelines for Microservices - Pipeline design complementary to GitOps