Azure Core Services: VMSS, AKS, Blob Storage
Navigate Azure's core services for DevOps—Virtual Machine Scale Sets, Azure Kubernetes Service, Blob Storage, and managed databases.
Azure is Microsoft’s cloud platform, and it has matured into a capable alternative to AWS and GCP. If your organization lives in the Microsoft ecosystem, Azure integrates naturally with Active Directory, Visual Studio, and other Microsoft tooling. This post covers the core services for deploying and operating workloads on Azure.
Azure organizes resources hierarchically under Management Groups, Subscriptions, and Resource Groups. Management groups let you apply policies and access controls across multiple subscriptions, useful for large enterprises with many teams or environments.
Introduction
Azure’s compute, storage, and database services cover the ground you need for DevOps workloads. Virtual Machine Scale Sets (VMSS) provision and manage multiple VMs as a single set with automatic load balancing across fault domains. Azure Kubernetes Service (AKS) gives you managed Kubernetes where Microsoft handles the control plane and you manage node pools. Blob Storage handles unstructured data like images, logs, and artifacts with lifecycle policies that move data between access tiers automatically. Azure SQL and other managed databases take care of backups, patching, and high availability so you do not have to.
These services are designed to work together. VMSS handles scalable stateless workloads while AKS suits containerized applications that need the Kubernetes ecosystem. Blob Storage is the place for build artifacts, backups, and static assets with multiple redundancy options depending on how critical the data is. Managed databases let your team focus on writing application code instead of database administration.
Azure organizes resources under Management Groups, Subscriptions, and Resource Groups in that order. This hierarchy lets you apply policies and access controls across multiple subscriptions, which matters when you have many teams or environments. Resource groups hold related resources that share the same lifecycle, so you deploy and delete them together. The rest of this guide covers each service with practical examples, trade-off analysis, and production failure scenarios. You will learn when to use each service, how to configure them for production, and what to monitor.
When to Use
VMSS Uniform vs. Flexible Orchestration
Choose VMSS uniform orchestration when all instances run the same workload and you want simple management. Uniform mode provisions identical VMs and is the traditional VMSS model.
Choose VMSS flexible orchestration when you need a mix of instance types in a single scale set, want explicit control over fault domain distribution, or need better availability guarantees for production workloads. Flexible mode is the newer model and recommended for new deployments.
AKS with Azure AD vs. Without
Choose AKS with Azure AD integration when your organization uses Microsoft Entra ID (formerly Azure Active Directory). Azure AD provides unified identity management, conditional access policies, and role-based access control tied to your existing directory.
Choose AKS without Azure AD integration when you need to support users outside your Microsoft tenant, or when you want to manage Kubernetes RBAC independently from Azure AD. You can always add Azure AD integration later.
Azure DevOps vs. GitHub Actions
Choose Azure DevOps when you need deep integration with Microsoft tooling—work items, test plans, Artifacts feed, and enterprise approval workflows. Azure DevOps has stronger governance features for large organizations.
Choose GitHub Actions when your team lives on GitHub, you have many open-source projects, or you prefer YAML-based pipelines that work across any cloud. GitHub Actions has a larger marketplace of community actions.
When Not to Use Azure
Avoid VMSS flexible orchestration when you want simple, identical VM management. Uniform mode has less operational complexity and works well for homogeneous workloads.
Avoid AKS without Azure AD integration if you are already deep in the Microsoft ecosystem. Without Entra ID integration, you lose unified identity management and conditional access policies.
Avoid Azure DevOps if your team is GitHub-centric or prefers working across multiple clouds. Azure DevOps has strong Microsoft integration but adds lock-in that becomes a burden for cross-platform teams.
Avoid Blob Storage Hot tier for infrequently accessed data. Hot has the highest per-GB storage price, so data that is rarely read costs less in Cool, Cold, or Archive even after their higher access charges.
Azure Resource Hierarchy
Resource groups are logical containers for Azure resources. They hold related resources that share the same lifecycle—deploy and delete together. Resource groups live within subscriptions, which bill all contained resources under one account.
# Create a resource group
az group create \
--name rg-production \
--location eastus
# List resource groups
az group list --output table
# Show a specific resource group
az group show --name rg-production
Tags attach metadata to resources independent of resource group boundaries. Common tags include Environment, Team, CostCenter, and Application. Azure Policy can enforce required tags and restrict which resources can be created without them.
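Before encoding a tagging rule in Azure Policy, it can help to prototype the check locally. The sketch below is plain Python, not an Azure API: `REQUIRED_TAGS`, `missing_tags`, and the sample inventory are hypothetical names chosen for illustration, and the tag set is taken from the common tags listed above.

```python
# Hypothetical required tag set, based on the common tags named in the text.
REQUIRED_TAGS = {"Environment", "Team", "CostCenter", "Application"}


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag names that are absent or empty, sorted."""
    present = {name for name, value in resource_tags.items() if value}
    return sorted(required - present)


if __name__ == "__main__":
    # Illustrative inventory; in practice it could be built from `az resource list`.
    resources = {
        "aks-cluster": {"Environment": "prod", "Team": "platform",
                        "CostCenter": "cc-100", "Application": "api"},
        "webapp-vmss": {"Environment": "prod", "Team": ""},
    }
    for name, tags in resources.items():
        gaps = missing_tags(tags)
        print(name, "OK" if not gaps else "missing: " + ", ".join(gaps))
```

A check like this only reports gaps after the fact. Azure Policy remains the place to block untagged resources at creation time.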
Azure subscriptions act as billing and access control boundaries. Many organizations use separate subscriptions per environment or per application domain. Management groups sit above subscriptions and let you apply governance across the entire Azure footprint from a single point.
flowchart TD
A[Management Group] --> B[Subscription: Production]
A --> C[Subscription: Development]
B --> D[Resource Group: rg-prod-api]
B --> E[Resource Group: rg-prod-web]
C --> F[Resource Group: rg-dev-services]
D --> G[AKS Cluster]
D --> H[VMSS]
D --> I[Blob Storage]
E --> J[Azure SQL]
G --> K[System Node Pool]
G --> L[User Node Pool]
K --> M[CoreDNS Pods]
L --> N[App Pods]
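The same hierarchy shows up in every Azure resource ID, which has the form `/subscriptions/<id>/resourceGroups/<name>/providers/<namespace>/<type>/<name>`. The following sketch splits such an ID into its levels. `parse_resource_id` is a hypothetical helper, the all-zero subscription GUID is a placeholder, and child resources (IDs with extra segments after the resource name) are deliberately ignored.

```python
def parse_resource_id(resource_id):
    """Split a top-level Azure resource ID into its hierarchy levels."""
    parts = [p for p in resource_id.split("/") if p]
    # Expected layout:
    # subscriptions/<id>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    if (len(parts) < 8
            or parts[0].lower() != "subscriptions"
            or parts[2].lower() != "resourcegroups"
            or parts[4].lower() != "providers"):
        raise ValueError(f"not a resource-level ID: {resource_id}")
    return {
        "subscription": parts[1],
        "resource_group": parts[3],
        "provider": parts[5],
        "type": parts[6],
        "name": parts[7],  # segments after this (child resources) are ignored
    }


if __name__ == "__main__":
    example = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/rg-prod-api"
        "/providers/Microsoft.ContainerService/managedClusters/aks-cluster"
    )
    print(parse_resource_id(example))
```

Management groups do not appear in the ID. They sit above the subscription and are resolved separately.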
VMSS for Scalable Compute
Virtual Machine Scale Sets (VMSS) provision and manage multiple VMs as a single set. They automatically balance across fault domains and update domains for high availability. VMSS is Azure’s equivalent to AWS Auto Scaling Groups.
# Create a VMSS
az vmss create \
--name webapp-vmss \
--resource-group rg-production \
--image Ubuntu2204 \
--instance-count 2 \
--vm-sku Standard_D2s_v3 \
--load-balancer my-load-balancer \
--upgrade-policy-mode automatic
# Scale out manually
az vmss scale \
--name webapp-vmss \
--resource-group rg-production \
--new-capacity 5
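# Configure autoscaling: create the autoscale setting (instance limits only)
az monitor autoscale create \
--resource-group rg-production \
--resource webapp-vmss \
--resource-type Microsoft.Compute/virtualMachineScaleSets \
--name webapp-autoscale \
--min-count 2 \
--max-count 10 \
--count 2
# Add a scale-out rule; without at least one rule the setting never changes capacity
az monitor autoscale rule create \
--resource-group rg-production \
--autoscale-name webapp-autoscale \
--condition "Percentage CPU > 80 avg 5m" \
--scale out 1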
VMSS supports both uniform orchestration (all instances identical) and flexible orchestration (mixtures of instance types). Flexible orchestration mode is newer and provides better availability guarantees with explicit control over fault domain distribution.
AKS Cluster Configuration
Azure Kubernetes Service (AKS) provides managed Kubernetes clusters. Microsoft handles the control plane, including upgrades and availability. You manage worker nodes through node pools.
# Create an AKS cluster
az aks create \
--resource-group rg-production \
--name aks-cluster \
--node-count 3 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 5
# Get credentials for kubectl
az aks get-credentials \
--resource-group rg-production \
--name aks-cluster
# Check cluster status
az aks show --resource-group rg-production --name aks-cluster
AKS integrates with Microsoft Entra ID (formerly Azure Active Directory) for authentication, Azure Monitor for logging and monitoring, and Azure Policy for governance enforcement. The Azure Policy add-on validates cluster configurations against organizational standards.
# System node pool for critical system pods
az aks nodepool add \
--resource-group rg-production \
--cluster-name aks-cluster \
--name systempool \
--node-count 1 \
--node-vm-size Standard_D2s_v3 \
--mode System
# User node pool for workloads
az aks nodepool add \
--resource-group rg-production \
--cluster-name aks-cluster \
--name userpool \
--node-count 2 \
--node-vm-size Standard_D4s_v3 \
--mode User
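System node pools run critical system pods like CoreDNS and metrics-server. User node pools run application workloads. This separation keeps system services from competing with application pods for capacity. To reserve the system pool for system pods only, taint it with CriticalAddonsOnly=true:NoSchedule; without the taint, application pods can still be scheduled there.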
Blob Storage Lifecycle Policies
Azure Blob Storage holds unstructured data—images, logs, backups, and artifacts. Azure uses containers (similar to S3 buckets) and blobs (the objects themselves).
# Create a storage account
az storage account create \
--name mystorageaccount \
--resource-group rg-production \
--sku Standard_LRS
# Create a container
az storage container create \
--name artifacts \
--account-name mystorageaccount
# Upload a blob
az storage blob upload \
--file ./app.tar.gz \
--name prod/app.tar.gz \
--container-name artifacts \
--account-name mystorageaccount
# List blobs in container
az storage blob list \
--container-name artifacts \
--account-name mystorageaccount
Lifecycle management policies automate data transitions between access tiers and cleanup of old data.
{
  "rules": [
    {
      "name": "artifact-lifecycle",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["artifacts/prod/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 30
            },
            "delete": {
              "daysAfterModificationGreaterThan": 365
            }
          }
        }
      }
    }
  ]
}
# Apply lifecycle policy (the resource group is required)
az storage account management-policy create \
--account-name mystorageaccount \
--resource-group rg-production \
--policy @policy.json
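To sanity-check a rule before applying it, you can replay its thresholds against a few sample blobs. The sketch below loads the rule from this section and reports which action would apply to a blob of a given age. It is a simplified local model rather than Azure's evaluation engine: `planned_action` and `ACTION_ORDER` are hypothetical names, only `daysAfterModificationGreaterThan` conditions are handled, and the blob path is written as `<container>/<blob name>` to match how `prefixMatch` is expressed.

```python
import json

# The lifecycle rule from this section, trimmed to the fields the sketch reads.
POLICY = json.loads("""
{
  "rules": [
    {
      "name": "artifact-lifecycle",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {"blobTypes": ["blockBlob"], "prefixMatch": ["artifacts/prod/"]},
        "actions": {
          "baseBlob": {
            "tierToCool": {"daysAfterModificationGreaterThan": 30},
            "delete": {"daysAfterModificationGreaterThan": 365}
          }
        }
      }
    }
  ]
}
""")

# Checked in this order so the most final action wins when several thresholds are passed.
ACTION_ORDER = ("delete", "tierToArchive", "tierToCold", "tierToCool")


def planned_action(blob_path, days_since_modified, policy=POLICY):
    """Return the action the policy would take, or None if no rule applies."""
    for rule in policy["rules"]:
        if not rule.get("enabled"):
            continue
        prefixes = rule["definition"]["filters"].get("prefixMatch", [])
        if prefixes and not any(blob_path.startswith(p) for p in prefixes):
            continue
        base_blob = rule["definition"]["actions"].get("baseBlob", {})
        for action in ACTION_ORDER:
            limit = base_blob.get(action, {}).get("daysAfterModificationGreaterThan")
            if limit is not None and days_since_modified > limit:
                return action
    return None


if __name__ == "__main__":
    for age in (10, 31, 400):
        print(age, planned_action("artifacts/prod/app.tar.gz", age))
```

Running a handful of ages through a check like this is a cheap way to catch a delete threshold that is lower than intended.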
Azure SQL and Managed Identity
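Azure SQL Database is a fully managed database service built on the SQL Server engine. PostgreSQL and MySQL are available as separate managed services (Azure Database for PostgreSQL and Azure Database for MySQL). Azure handles backups, patching, and high availability. Your applications connect via connection strings, and managed identities eliminate the need to manage database credentials.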
# Create an Azure SQL Database server
az sql server create \
--name sqlserver-production \
--resource-group rg-production \
--admin-user dbadmin \
--admin-password 'YourPassword123!'
# Create a database
az sql db create \
--resource-group rg-production \
--server sqlserver-production \
--name webappdb \
--service-objective S0
# Configure a managed identity on an App Service
az webapp identity assign \
--resource-group rg-production \
--name webapp
Managed identities let your applications authenticate to Azure resources without storing secrets. The identity is attached to the compute resource—App Service, VM, or AKS pod—and Azure handles token issuance.
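// .NET - connecting with managed identity (uses the Microsoft.Data.SqlClient package)
using Microsoft.Data.SqlClient;

// No user name or password: the token comes from the identity assigned to the host
var connection = new SqlConnection(
"Server=tcp:sqlserver-production.database.windows.net;Database=webappdb;Authentication=Active Directory Managed Identity;"
);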
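For AKS workloads, Microsoft Entra Workload ID links a Kubernetes service account to a managed identity through a federated credential. It replaces the older Azure AD pod-managed identity addon, which is deprecated. Pods running under that service account authenticate to Azure services without embedding credentials in code or configuration.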
Azure DevOps vs GitHub Actions
Azure DevOps and GitHub Actions are the two main CI/CD options for Azure workloads. Azure DevOps has deep integration with Microsoft tooling and enterprise features like work item tracking. GitHub Actions integrates naturally with GitHub repositories and open-source projects.
# GitHub Actions workflow for Azure
name: Deploy to Azure
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Azure Login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy to App Service
        uses: azure/webapps-deploy@v3
        with:
          app-name: my-webapp
          slot-name: production
          package: ./dist
# Azure DevOps pipeline
trigger:
  - main
pool:
  vmImage: ubuntu-latest
stages:
  - stage: Build
    jobs:
      - job: BuildJob
        steps:
          - task: UseDotNet@2
          - script: dotnet build
  - stage: Deploy
    jobs:
      - deployment: DeployJob
        environment: production
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureWebApp@1
                  inputs:
                    azureSubscription: "my-subscription"
                    appName: "my-webapp"
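Both authenticate to Azure with a stored identity: Azure DevOps through service connections, GitHub Actions through repository secrets or OpenID Connect federation. The choice often comes down to where your code lives and which tool your team already uses.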
For more on managing Azure costs, see our post on Cost Optimization.
Compute Service Trade-offs
| Scenario | VMSS Uniform | VMSS Flexible | AKS | Azure App Service |
|---|---|---|---|---|
| Mixed instance types | No | Yes | Yes | No |
| Kubernetes ecosystem | No | No | Yes | No |
| Platform-managed upgrades | Yes | Limited | Yes (control plane) | Yes |
| Scale to zero | No | No | User node pools only | No |
| Pay-per-second billing | Yes | Yes | Yes (per node VM) | Yes (per plan instance) |
| Max scale | 1,000 VMs | 1,000 VMs | 5,000 nodes | 30-100 instances |
Blob Storage Access Tiers
| Tier | Use case | Relative cost |
|---|---|---|
| Hot | Active logs, media, frequent access | Highest storage cost, lowest access cost |
| Cool | Backups, temp data, accessed about monthly | Lower storage cost, higher access cost; 30-day minimum retention |
| Cold | Rarely accessed data that must stay online | Lower storage cost than Cool, higher access cost; 90-day minimum retention |
| Archive | Compliance, decade-old data | Lowest storage cost; offline, rehydration takes hours; 180-day minimum retention |
Production Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| VMSS flexible orchestration zone spread failure | Instances cannot spread across fault domains, availability reduced | Plan zone distribution before deploying, test failure domain behavior |
| AKS node pool out-of-capacity errors | Pods unschedulable, deployments fail | Monitor quota in each region, request increases proactively |
| Azure SQL connection pooling exhaustion | Applications cannot get connections, requests queue and fail | Set appropriate pool size, monitor active connections, implement retry logic |
| Blob storage lifecycle policy misconfiguration causing data loss | Old data deleted prematurely or hot storage bills spike | Test lifecycle policies on non-production containers first, set up alerts on storage costs |
| Workload identity (or legacy Azure AD pod identity) misconfiguration locking out AKS pods | Pods cannot authenticate to Azure resources, services fail | Test identity bindings in dev first, keep a fallback service principal |
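The "implement retry logic" mitigation for connection pool exhaustion usually means retrying transient failures with exponential backoff and jitter, so that many clients do not retry in lockstep. The sketch below shows that pattern in plain Python. `call_with_retry` is a hypothetical helper, the delay values are illustrative defaults, and which exceptions count as transient depends on your database driver.

```python
import random
import time


def call_with_retry(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run operation(), retrying transient failures with backoff and full jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Cap the exponential delay, then wait a random time up to that cap.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

Pass the function that opens the connection or runs the query as `operation`. The `sleep` parameter exists so tests can replace the real wait with a stub.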
Azure Observability Hooks
VMSS and AKS monitoring:
# Check VMSS instance health (instance view for every VM in the scale set)
az vmss get-instance-view \
--resource-group rg-production \
--name webapp-vmss \
--instance-id "*"
# Get AKS cluster health and node status
az aks show \
--resource-group rg-production \
--name aks-cluster \
--query '{status:provisioningState,version:kubernetesVersion}'
# List nodes and their status
kubectl get nodes -o wide
Azure Monitor for containers:
# Enable monitoring on AKS
az aks enable-addons \
--resource-group rg-production \
--name aks-cluster \
--addons monitoring
# Find the Log Analytics workspace that receives container logs
az aks show \
--resource-group rg-production \
--name aks-cluster \
--query addonProfiles.omsagent.config.logAnalyticsWorkspaceResourceID
Key Azure Monitor metrics to alert on:
| Service | Metric | Alert Threshold |
|---|---|---|
| VMSS | CPU utilization | > 80% for 5 minutes |
| VMSS | Available memory | < 20% for 3 minutes |
| AKS | Pod unschedulable time | > 1 minute |
| AKS | Node pressure evictions | any evictions |
| Azure SQL | DTU utilization | > 80% for 5 minutes |
| Azure SQL | Connection failures | > 5 in 5 minutes |
| Blob Storage | Transactions | error rate > 1% |
| Blob Storage | Used capacity | > 80% of quota |
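Thresholds such as "> 80% for 5 minutes" only fire when the breach is sustained, which filters out short spikes. The sketch below models that idea with one sample per minute. It is a simplification of how Azure Monitor evaluates alert rules (which aggregate over a window), and `breaches_for` is a hypothetical helper name.

```python
def breaches_for(samples, threshold, minutes, above=True):
    """True if the most recent `minutes` samples (one per minute) all breach the threshold."""
    if minutes <= 0 or len(samples) < minutes:
        return False  # not enough data to call the breach sustained
    window = samples[-minutes:]
    if above:
        return all(value > threshold for value in window)
    return all(value < threshold for value in window)


if __name__ == "__main__":
    cpu = [50, 85, 90, 88, 91, 84]      # percent CPU, newest sample last
    free_memory = [25, 18, 15, 12]      # percent available memory
    print("CPU alert:", breaches_for(cpu, 80, 5))
    print("Memory alert:", breaches_for(free_memory, 20, 3, above=False))
```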
Common Pitfalls / Anti-Patterns
Not using managed identities. Embedding connection strings or API keys in configuration files is a security risk and a rotation nightmare. Azure managed identities eliminate credentials from code entirely.
Using admin passwords instead of key-based or identity-based authentication. For VMs, SSH keys are more secure and easier to rotate than passwords, and Azure Key Vault can store them. For databases, Microsoft Entra authentication takes the shared admin password out of day-to-day use.
Leaving resources in the wrong resource group. Resource groups determine lifecycle—resources in the same group are deployed and deleted together. Mixing production and development resources in one group means a delete operation wipes the wrong environment.
Not using Azure Policy to enforce tagging. Without Azure Policy, resources get created without required tags, making cost attribution and automation harder. Set policy enforcement before creating resources at scale.
Deploying to the default subscription. The default subscription has low limits and no governance. Use Management Groups to organize subscriptions by environment and team from day one.
Capacity Estimation and Benchmark Data
Use these numbers for initial capacity planning. Actual performance varies by workload characteristics.
VMSS Instance Types
| Series | Best For | Sizes | Max Instances | Network Performance |
|---|---|---|---|---|
| B | Burstable workloads, dev/test | B1s → B20ms | 1000 | Up to 1 Gbps |
| D | General purpose, production | D2s_v3 → D64s_v3 | 1000 | Up to 30 Gbps |
| E | Memory optimized | E2s_v3 → E64s_v3 | 1000 | Up to 30 Gbps |
| F | Compute optimized | F2s_v2 → F72s_v2 | 1000 | Up to 30 Gbps |
| L | Storage optimized | L8s_v2 → L48s_v2 | 1000 | Up to 30 Gbps |
AKS Node Pool Performance
| Parameter | Value | Notes |
|---|---|---|
| Max nodes per cluster | 5,000 (standard tier) | 1,000 default |
| Max pods per node | 250 maximum; default 30 with Azure CNI, 110 with kubenet | Set when the cluster or node pool is created |
| Max node pools per cluster | 100 | Each pool can have different VM size |
| API server latency | 50-200ms p99 | Standard tier SLA: 99.95% with availability zones, 99.9% without |
Blob Storage Performance Targets
| Metric | Value |
|---|---|
| PUT/LIST/DELETE latency | 10-50ms (p99) |
| GET latency | 5-15ms (p99) |
| Max requests per account | 20,000 requests per second (standard) |
| Throughput per account | 60 Gbps (standard) |
| Blob types | Block, Page, Append |
Azure SQL Instance Tiers
| Tier | Compute | Memory | Max Connections | Typical Use |
|---|---|---|---|---|
| Basic | 5 DTU | 2 GB | 30 | Small dev/test |
| S0 | 10 DTU | 2.25 GB | 60 | Small production |
| S1 | 20 DTU | 3 GB | 90 | Medium production |
| S2 | 50 DTU | 7 GB | 200 | Medium production |
| S3 | 100 DTU | 28 GB | 500 | Large production |
| P1 | 125 DTU | 5.1 GB | 1,500 | General purpose |
| P2 | 250 DTU | 20.4 GB | 2,500 | General purpose |
| P4 | 500 DTU | 40.4 GB | 5,000 | Business critical |
Interview Questions
Flexible orchestration is the right choice when you need to mix instance types within a single scale set—for example, combining general-purpose VMs with memory-optimized ones for different workloads on the same cluster. It also gives you explicit control over fault domain distribution, which matters for high-availability production workloads. Uniform mode works fine when all instances run identical workloads and you want minimal operational complexity.
Azure AD integration gives you unified identity management across the Microsoft ecosystem. You get conditional access policies, role-based access control tied to your existing directory, and single sign-on across Azure services. Without Entra ID integration, you manage Kubernetes RBAC separately from your corporate identity system, which creates duplication and governance challenges. For organizations already in the Microsoft ecosystem, skipping Azure AD integration means losing a significant security and management capability.
Managed identities attach an Azure IAM principal directly to a compute resource—App Service, VM, or AKS pod. When the resource needs to access another Azure service, Azure handles token issuance automatically. The credentials never touch your code or configuration files. Connection strings, by contrast, are secrets that sit in configuration files, get logged sometimes, and require rotation when anyone who had access leaves. If a connection string leaks, you have to rotate it everywhere. Managed identities eliminate that entire class of risk.
For CI/CD artifacts, I'd tier based on usage patterns. Fresh builds get accessed frequently—keep them in Hot for the first 7 days. After a week, most artifacts are only referenced for specific deployments—move to Cool. After 30 days, artifacts for completed releases are rarely needed but might need to be recreated—move to Cold or Archive. Set deletion at 365 days for anything older than a year. Use prefix filters so only artifacts/prod/* gets the aggressive policy and artifacts/test/* gets a shorter retention.
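The tiering plan in this answer can be written down as a lifecycle policy document. The sketch below generates one with plain Python so the thresholds live in one place. `lifecycle_rule` and `build_policy` are hypothetical helpers, the rule names are made up, the 30-day retention for test artifacts is an assumed value (the answer only says "shorter"), and `tierToCold` assumes the storage account supports the Cold tier.

```python
import json


def lifecycle_rule(name, prefix, cool_after=None, cold_after=None, delete_after=None):
    """Build one lifecycle rule; thresholds are days since last modification."""
    base_blob = {}
    for action, days in (("tierToCool", cool_after),
                         ("tierToCold", cold_after),
                         ("delete", delete_after)):
        if days is not None:
            base_blob[action] = {"daysAfterModificationGreaterThan": days}
    return {
        "name": name,
        "enabled": True,
        "type": "Lifecycle",
        "definition": {
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
            "actions": {"baseBlob": base_blob},
        },
    }


def build_policy():
    """Hot for 7 days, Cool until day 30, Cold until day 365, then delete."""
    return {
        "rules": [
            lifecycle_rule("prod-artifacts", "artifacts/prod/",
                           cool_after=7, cold_after=30, delete_after=365),
            # Assumed value: test artifacts are simply deleted after 30 days.
            lifecycle_rule("test-artifacts", "artifacts/test/", delete_after=30),
        ]
    }


if __name__ == "__main__":
    # Redirect this output to policy.json to use it with the az CLI.
    print(json.dumps(build_policy(), indent=2))
```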
Azure DevOps wins when you need deep Microsoft integration—work items, test plans, Artifacts feeds, and enterprise approval workflows with strong governance. GitHub Actions wins when your team lives on GitHub, you have many open-source projects, or you want YAML pipelines that work across any cloud. GitHub Actions has a larger marketplace of community actions. The real answer often comes down to where your code lives and which tool your team already knows. Both integrate with Azure through service connections.
Azure Monitor autoscale compares the current metric value against a threshold you define. For CPU-based scaling, you set a threshold like 80% — when average CPU exceeds 80% over a 5-minute window, autoscale adds instances. For memory-based scaling, similar logic applies. You can also use custom metrics from Application Insights or Azure Event Hub. Key settings: set minimum and maximum instance counts, configure the scale-out and scale-in rules (how many instances to add or remove per event), and set the cool-down period to prevent thrashing. Common mistakes: setting thresholds too low (causes unnecessary scaling), too high (causes latency spikes before scaling kicks in), or forgetting to set maximums (cost surprises).
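The decision logic described here (average over a window, thresholds, step size, cool-down, and min/max limits) fits in a few lines. The sketch below is a local model of that logic, not the Azure Monitor implementation. `autoscale_decision` is a hypothetical name, and the 30% scale-in threshold is an assumed value, since the text only gives the scale-out side.

```python
def autoscale_decision(cpu_samples, current, *, min_count=2, max_count=10,
                       scale_out_above=80.0, scale_in_below=30.0, step=1,
                       seconds_since_last_scale=None, cooldown_seconds=300):
    """Return the instance count the rules would ask for."""
    if not cpu_samples:
        return current  # no data: leave capacity alone
    # Cool-down: ignore the metric while a recent scale action is still settling.
    if seconds_since_last_scale is not None and seconds_since_last_scale < cooldown_seconds:
        return current
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_out_above:
        target = current + step
    elif average < scale_in_below:
        target = current - step
    else:
        target = current
    # The min and max limits always win over the rules.
    return max(min_count, min(max_count, target))


if __name__ == "__main__":
    window = [85, 90, 82, 88, 86]  # one CPU sample per minute over 5 minutes
    print(autoscale_decision(window, current=2))
```

Keeping a wide gap between the scale-out and scale-in thresholds is what prevents the thrashing mentioned above: a scale-out lowers average CPU, and that lower value should not immediately trigger a scale-in.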
Resource locks apply directly to a resource, resource group, or subscription. A CanNotDelete lock blocks deletion and a ReadOnly lock blocks modification, whether the request comes from the portal, the CLI, or an API call. Azure Policy works at a broader scope: a management group, subscription, or resource group. A policy with the Deny effect rejects non-compliant create and update requests, while Audit effects report compliance continuously without blocking anything. Use locks for critical production resources where deletion would be catastrophic: apply a delete lock to your production Key Vault, AKS cluster, and primary storage account. Use Azure Policy for ongoing compliance enforcement: ensuring all resources have required tags, that only approved VM sizes are used, or that resources are deployed in allowed regions.
For multi-region resilience: deploy a VMSS in at least two regions, and spread each one across availability zones where the region supports them. Use Azure Traffic Manager or Front Door for global load balancing; either one routes traffic to a healthy region based on latency or failover priority. Replicate Blob Storage using RA-GRS (Read-Access Geo-Redundant Storage) so the secondary region has a readable copy for read-heavy workloads. For the database, use Azure SQL with active geo-replication or Azure Database for PostgreSQL with a replica in the secondary region. Keep state off local VMSS disks: sessions and user data go to a distributed cache (Azure Cache for Redis) or a database that replicates asynchronously to the secondary region. Test regional failover quarterly.
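Priority-based failover, one of the routing modes mentioned above, reduces to "send traffic to the healthy region with the lowest priority number". The sketch below models only that selection step. `pick_region` and the region records are hypothetical, and real services such as Traffic Manager also handle health probing and DNS caching, which are not modelled here.

```python
def pick_region(regions):
    """Return the name of the healthy region with the lowest priority number, or None."""
    healthy = [region for region in regions if region["healthy"]]
    if not healthy:
        return None  # nothing to route to: every region failed its health check
    return min(healthy, key=lambda region: region["priority"])["name"]


if __name__ == "__main__":
    regions = [
        {"name": "eastus", "priority": 1, "healthy": False},   # primary is down
        {"name": "westus2", "priority": 2, "healthy": True},
    ]
    print(pick_region(regions))
```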
AKS operates the control plane for you, but Kubernetes version upgrades only happen when you trigger them or configure an auto-upgrade channel. Upgrade the control plane first, then upgrade each node pool, which replaces or reimages its nodes. Use surge upgrades: set max surge to at least 1 on your node pools so that AKS provisions a new node before draining an old one, keeping capacity available. Run application workloads across multiple replicas distributed across node pools. Upgrade node pools one at a time so pods can move to a pool that is not being drained. After upgrading, verify that your applications still work by running smoke tests, ideally against a non-production cluster before you upgrade production.
Azure Private Link exposes a PaaS service (Azure SQL, Blob Storage, Key Vault) through a private endpoint with a private IP address in your VNet. Traffic from your VNet to the service traverses the Microsoft backbone, never the public internet. Service Endpoints also keep traffic on the Microsoft backbone, but the service is still reached at its public endpoint, and access is limited to selected subnets through the service's firewall rules. Private Link is preferred when you need the highest security: you can disable the service's public endpoint entirely, and the private endpoint is reachable from peered VNets and from on-premises networks over VPN or ExpressRoute. Private Link also works across tenants, making it suitable for accessing shared services in a hub-spoke architecture. Use Service Endpoints when Private Link is not available for the service or when operational simplicity is more important than maximum security.
Azure RBAC uses a role definition (like Contributor or Reader) assigned to a security principal (user, group, or service principal) at a scope (subscription, resource group, or individual resource). Permissions inherit downward — a role at the resource group level grants access to all resources in that group. Common mistakes: assigning Contributor at the subscription level when you only need it on a specific resource, leading to overpermission. Using the same service principal for multiple applications, making it hard to rotate credentials without affecting multiple services. Not scoping to least privilege — for an AKS cluster, the node pool service principal only needs access to the resource group it lives in, not the whole subscription.
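Downward inheritance can be reasoned about as a prefix check on scope paths: an assignment applies to a resource when the assignment's scope is the resource's scope or one of its ancestors. The sketch below models that for subscription, resource group, and resource scopes. `scope_covers` and `effective_roles` are hypothetical helpers, the subscription ID is a placeholder, and management group scopes are left out because they are not a path prefix of subscription IDs.

```python
def scope_covers(assignment_scope, resource_scope):
    """True if a role assigned at assignment_scope is inherited at resource_scope."""
    assigned = assignment_scope.rstrip("/").lower()
    target = resource_scope.rstrip("/").lower()
    # Match whole path segments only, so "rg-prod" does not cover "rg-production".
    return target == assigned or target.startswith(assigned + "/")


def effective_roles(assignments, principal, resource_scope):
    """Return the sorted role names a principal holds on a resource."""
    return sorted({
        role
        for who, role, scope in assignments
        if who == principal and scope_covers(scope, resource_scope)
    })


if __name__ == "__main__":
    sub = "/subscriptions/00000000-0000-0000-0000-000000000000"
    rg = sub + "/resourceGroups/rg-production"
    aks = rg + "/providers/Microsoft.ContainerService/managedClusters/aks-cluster"
    assignments = [
        ("alice", "Reader", sub),        # broad but read-only
        ("alice", "Contributor", rg),    # write access limited to one group
        ("deploy-sp", "Contributor", aks),
    ]
    print(effective_roles(assignments, "alice", aks))
    print(effective_roles(assignments, "deploy-sp", rg))
```

Listing effective roles per resource like this makes over-broad assignments easy to spot, for example a Contributor role at subscription scope that was only needed on one cluster.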
Use Microsoft Entra Workload ID, which replaces the deprecated Azure AD pod identity addon. The cluster needs the OIDC issuer and workload identity features enabled. You create a user-assigned managed identity, assign it the necessary RBAC roles on the target resource (Key Vault, Blob Storage, etc.), then create a federated identity credential that maps a Kubernetes service account to that identity. Pods that run under the service account get a projected service account token mounted at a known path, and the Azure SDK exchanges it for an Entra ID access token automatically. Without workload identity, you would need to store a service principal secret in a Kubernetes secret, which is a credential rotation nightmare and a security risk if the secret leaks.
LRS (Locally Redundant Storage) replicates within a single data center — lowest cost, protects against hardware failure but not data center outages. ZRS (Zone Redundant Storage) replicates across three availability zones in one region — protects against zone failures, good for most production workloads. GRS (Geo-Redundant Storage) replicates to a paired region — protects against regional outages but requires manual failover. GZRS (Geo-Zone Redundant Storage) combines zone redundancy in the primary region with geo-replication to a paired region — maximum durability. Choose LRS for dev/test with low data value. Choose ZRS for production workloads where you need availability zone resilience. Choose GRS or GZRS for disaster recovery scenarios where a full region outage is unacceptable.
Set budgets at the subscription, resource group, or resource level. In Cost Management, create a budget with a threshold (e.g., 80% of monthly spend) and assign alert recipients. Budget alerts can trigger emails, webhook notifications, or Azure Action Groups for automated responses (like scaling down non-production resources). Scope budgets to the level you can act on — if you have one team per resource group, budget per resource group so the team gets direct alerts. Use cost analysis views to identify which resources drive spend — VMSS instances, managed disk storage, and outbound bandwidth are the usual suspects.
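A budget alert is a percentage check of spend against the budget amount. The sketch below shows which thresholds a given spend level would trip. `triggered_thresholds` is a hypothetical helper, and the 50/80/100 percent thresholds and dollar figures are illustrative.

```python
def triggered_thresholds(budget_amount, spend_to_date, thresholds=(50, 80, 100)):
    """Return the alert thresholds (percent of budget) that the current spend has reached."""
    if budget_amount <= 0:
        raise ValueError("budget_amount must be positive")
    percent_used = spend_to_date * 100 / budget_amount
    return [level for level in sorted(thresholds) if percent_used >= level]


if __name__ == "__main__":
    # Illustrative numbers: a 1,000 monthly budget with 850 spent so far.
    print(triggered_thresholds(1000, 850))
```

Running this per resource group, with one budget each, matches the advice above to scope budgets to the level a team can act on.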
Blob Storage immutability lets you set a time-based retention policy on a container, so blobs cannot be modified or deleted for the duration you specify. You can also enable legal hold, which keeps blobs immutable until you explicitly release the hold. Use cases: regulatory compliance requiring audit trails (SEC Rule 17a-4 for financial records, HIPAA for health data), preventing accidental deletion of backup data, and ensuring logs cannot be tampered with for forensic purposes. When you create an immutable storage policy, you specify a retention period in days. The policy starts unlocked so you can test it. Once you lock it, the policy cannot be deleted, the retention period can only be extended, and the container can be deleted only after it no longer holds any blobs.
Conditional access policies evaluate when a user tries to access Azure resources. You can require multi-factor authentication, require device compliance, or block sign-in from unsupported locations. For AKS access: enable Azure AD integration on your cluster so kubectl authentication flows through Entra ID. Create a conditional access policy that targets the AKS API app, requires MFA for all users, and blocks sign-in from non-compliant devices for production cluster admin tasks. Test the policy against a small group first. Use report-only mode to see which users would be affected before enforcing it. This adds a layer of protection on top of Kubernetes RBAC: even if someone steals a user's credentials, the conditional access policy may block the sign-in from an unfamiliar device or location. It does not cover Kubernetes service account tokens, which authenticate to the API server without going through Entra ID.
Active geo-replication creates readable secondary replicas in different regions — you can use these for read-scale workloads and manual failover. Auto-failover groups are a higher-level construct: they manage replication and automatic failover for a group of databases together. Use auto-failover groups when you want automatic failover for your application database — it handles DNS updates so your connection string does not change when failover occurs. Geo-replication gives you more control but requires you to manually trigger failover. For most production workloads, auto-failover groups with at least one secondary in a paired region is the recommended approach. Test failover annually.
Key Vault stores secrets (credentials, API keys, certificates): things that must be encrypted and access-controlled with strict RBAC. App Configuration stores application settings: feature flags, connection strings with non-secret values, and environment-specific configuration. App Configuration supports dynamic refresh without redeployment, making it ideal for feature flags and runtime configuration. Key Vault has no push-based refresh: an application picks up a rotated secret only when it reads the secret again, which often means a restart or a scheduled reload. Use Key Vault for everything that is a secret: database passwords, API tokens, storage keys. Use App Configuration for configuration that changes frequently or needs to vary by environment without redeployment.
The hub is a central VNet that contains shared services — firewall (Azure Firewall or third-party NVA), VPN gateway, ExpressRoute connection, and Active Directory domain controllers. Each spoke is a separate VNet for a workload or team. VNet peering connects spokes to the hub but not to each other, reducing lateral movement risk. The hub provides centralized network security and governance — all traffic between spokes flows through the hub's firewall, giving you a single point of inspection. For DevOps: each team gets their own subscription and spoke VNet, isolation is built in, and shared services (like a CI/CD agent VNet with a self-hosted runner) live in the hub. This approach scales to many teams without each team needing to manage their own firewall and connectivity infrastructure.
Guest Configuration (now called Azure Machine Configuration, part of Azure Policy) lets you audit settings inside virtual machines, including the container runtimes installed on them. You create a Guest Configuration assignment that targets your VM scale sets or individual VMs. The policy definition specifies the desired state: for example, a Docker daemon should not be running, or a specific container runtime configuration should be set. The Guest Configuration extension on the VM reports compliance status back to Azure Policy. You can see compliance in the Azure Policy compliance dashboard. For container security, this lets you audit that container runtimes are configured according to your organizational standards without deploying agents beyond the extension itself.
Further Reading
- Azure VMSS Documentation — Official docs for Virtual Machine Scale Sets including flexible orchestration
- Azure Kubernetes Service (AKS) — Microsoft official AKS documentation with cluster management guides
- Azure Blob Storage — Storage account configuration, lifecycle policies, and access tiers
- Azure SQL — Managed database offerings with identity integration
- Microsoft Entra ID (Azure AD) — Identity and access management for Azure resources
- Azure Policy — Enforce tagging and resource governance at scale
Conclusion
Key Takeaways
- Azure resource hierarchy flows Management Groups → Subscriptions → Resource Groups
- VMSS flexible orchestration mode is recommended for new production deployments
- AKS with Azure AD integration provides unified identity management across the Microsoft ecosystem
- Managed identities replace connection strings for Azure SQL, Blob Storage, and other Azure resources
- Azure Policy enforces tagging and resource governance at the management group, subscription, or resource group level
Azure Onboarding Checklist
# 1. Set up a Management Group for your organization
az account management-group create --name my-org --display-name "My Organization"
# 2. Move the existing production subscription under the management group
az account management-group subscription add \
--name my-org \
--subscription my-subscription-id
# 3. Create a resource group
az group create --name rg-production --location eastus
# 4. Create an AKS cluster with Microsoft Entra ID (Azure AD) integration
az aks create \
--resource-group rg-production \
--name aks-cluster \
--node-count 3 \
--enable-aad
# 5. Enable Azure Policy on the cluster
az aks enable-addons \
--resource-group rg-production \
--name aks-cluster \
--addons azure-policy
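# 6. Create a storage account, then attach the lifecycle policy from policy.json
az storage account create \
--name mystorageaccount \
--resource-group rg-production \
--sku Standard_LRS
az storage account management-policy create \
--account-name mystorageaccount \
--resource-group rg-production \
--policy @policy.json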