Kubernetes Storage: PersistentVolumes, Claims, and StorageClasses
Implement persistent storage in Kubernetes using PersistentVolumes, PersistentVolumeClaims, and StorageClasses for stateful applications across different cloud providers.
Stateless applications are straightforward to run in Kubernetes. Pods get scheduled, they do their work, and if they die, Kubernetes replaces them. Stateful applications are different. A database needs its data to survive pod restarts. A message queue requires persistent storage for messages in transit. For these cases, Kubernetes provides PersistentVolumes, PersistentVolumeClaims, and StorageClasses.
Introduction
PersistentVolumes (PVs) are cluster-wide storage resources whose lifecycle is independent of any pod. PersistentVolumeClaims (PVCs) are namespace-scoped requests for storage. StorageClasses define the provisioner and parameters used for dynamic storage provisioning.
This post covers how persistent storage works in Kubernetes, from basic volume attachment to dynamic provisioning across cloud providers.
If you are new to Kubernetes, start with the Kubernetes fundamentals post. For StatefulSet configuration, see the Kubernetes Workload Resources post.
PV and PVC Lifecycle
A PersistentVolume (PV) is a piece of storage in the cluster. It is a cluster-wide resource, not tied to any specific namespace. A PersistentVolumeClaim (PVC) is a request for storage by a user or pod.
The lifecycle of a PV follows these phases:
- Provisioning: PV is created statically or dynamically
- Binding: PVC binds to a PV that satisfies its requirements
- Using: Pod uses the mounted volume
- Releasing: The user deletes the PVC, and the PV moves to the Released phase
- Reclaiming: PV is retained, deleted, or (deprecated) recycled based on its reclaim policy
Volume Configuration Examples
PersistentVolume example
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-fast-storage
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  awsElasticBlockStore:
    volumeID: vol-0abc123def456
    fsType: ext4
```
PersistentVolumeClaim example
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: fast-ssd
```
Using PVC in a Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: database
  namespace: production
spec:
  containers:
    - name: postgres
      image: postgres:15
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-storage
```
The PVC must exist in the same namespace as the pod. PVs are cluster-scoped, so they have no namespace at all and can be bound by a claim from any namespace.
Access Modes and Usage Patterns
Access modes
| Mode | Abbreviation | Description |
|---|---|---|
| ReadWriteOnce | RWO | Single node can mount as read-write (multiple pods on that node can share it) |
| ReadOnlyMany | ROX | Multiple nodes can mount as read-only |
| ReadWriteMany | RWX | Multiple nodes can mount as read-write |
Not all storage providers support all modes. AWS EBS supports RWO only. NFS supports RWX. Azure Files supports RWO, ROX, RWX.
When to Use Each Access Mode
| Access Mode | Use Case | Limitations |
|---|---|---|
| ReadWriteOnce | Single-pod databases, app data | No multi-node read-write |
| ReadOnlyMany | Shared config files, read-only data | No writes at all |
| ReadWriteMany | Multi-node file sharing, shared data volumes | Not supported by most cloud block storage |
Rule of thumb: Default to ReadWriteOnce for databases and stateful apps. Use ReadWriteMany only when multiple pods across nodes genuinely need write access (NFS, shared file systems).
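As a minimal sketch of the ReadWriteMany case, the claim below requests shared storage from the `nfs-storage` class defined later in this post. The claim name `shared-assets` and the 20Gi size are illustrative assumptions, and the request only succeeds if the backing provisioner actually supports RWX.

```yaml
# Hypothetical claim for a volume that pods on several nodes mount read-write.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-assets          # illustrative name, not from the examples above
  namespace: production
spec:
  accessModes:
    - ReadWriteMany            # needs a backend such as NFS or Azure Files
  resources:
    requests:
      storage: 20Gi            # illustrative size
  storageClassName: nfs-storage
```

Pointing the same claim at a block-storage class such as `fast-ssd` would typically leave it Pending, because EBS cannot satisfy RWX.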
PV Lifecycle Flow
```mermaid
flowchart LR
    A[Provision] --> B[Bind: PVC matches PV]
    B --> C[Pod uses volume]
    C --> D[Release: PVC deleted]
    D --> E{Reclaim Policy}
    E -->|Retain| F[Manual cleanup<br/>or restore]
    E -->|Delete| G[Volume deleted]
    E -->|Recycle| A
```
Static vs Dynamic Provisioning
Static provisioning means you create PVs manually before any PVC requests come in. You know the exact storage you have and you allocate it to workloads manually.
Dynamic provisioning creates PVs automatically when a PVC requests them. Kubernetes uses a StorageClass to determine what kind of storage to provision.
Static provisioning
```yaml
# Admin creates the PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-manual-001
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  hostPath:
    path: /data/pv-manual-001
---
# User creates PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: manual
```
Static provisioning gives administrators precise control. It also requires manual tracking of available storage capacity.
Dynamic provisioning
Dynamic provisioning kicks in when a PVC specifies a StorageClass and no matching PV exists. The StorageClass provisioner creates a new PV automatically.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
# In-tree provisioner name; on current clusters CSI migration hands it to the EBS CSI driver
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  fsType: ext4
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
```
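The WaitForFirstConsumer binding mode delays PV binding and provisioning until a pod that uses the PVC is scheduled. The volume is then created in the availability zone of the node the scheduler picked. With zonal block storage such as EBS, a volume in the wrong zone cannot be attached at all, so this avoids unschedulable pods as well as cross-zone traffic costs.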
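If provisioning should also be limited to certain zones, a StorageClass can carry `allowedTopologies`. The following is a sketch only: the class name and zone values are placeholders, and the topology key a given driver honors may differ from `topology.kubernetes.io/zone`, so check the driver's documentation before relying on it.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-zonal           # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone   # assumed key; driver-specific keys exist
        values:
          - us-east-1a           # placeholder zones
          - us-east-1b
```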
StorageClass Providers
Different cloud providers and storage systems expose different provisioners:
| Provider | Provisioner | Notes |
|---|---|---|
| AWS | kubernetes.io/aws-ebs | gp3, gp2, io1, st1, sc1 |
| GCP | kubernetes.io/gce-pd | pd-standard, pd-ssd |
| Azure | kubernetes.io/azure-disk | Standard_LRS, Premium_LRS |
| NFS | nfs.subvol.io (external) | Requires NFS server |
| Local | kubernetes.io/no-provisioner | For local disks |
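The `kubernetes.io/no-provisioner` entry means that PVs are created by hand. A minimal sketch of a local-disk setup follows; the class name `local-disks`, the node name `worker-1`, and the path `/mnt/disks/ssd1` are assumptions for illustration.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-disks              # hypothetical name
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-worker-1        # hypothetical name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-disks
  local:
    path: /mnt/disks/ssd1        # assumed mount point of the disk on the node
  nodeAffinity:                  # tells the scheduler which node holds the disk
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1       # assumed node name
```

The `nodeAffinity` block is mandatory for local volumes, and pairing it with WaitForFirstConsumer lets the scheduler pick the PV and the node together.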
AWS EBS StorageClass
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  fsType: ext4
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
The allowVolumeExpansion: true field lets you expand volumes without recreating the PVC. You still need to update the PVC spec to request more storage.
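To illustrate, expanding the `database-storage` claim from earlier would mean re-applying it with a larger request. This is a sketch that assumes the claim's StorageClass has `allowVolumeExpansion: true`; the 100Gi target is an arbitrary example.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi   # was 50Gi; only increases are accepted
  storageClassName: fast-ssd
```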
NFS StorageClass
NFS works across multiple availability zones and supports ReadWriteMany access mode:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
provisioner: nfs.subvol.io
parameters:
  server: nfs-server.example.com
  share: /exports
mountOptions:
  - nfsvers=4.1
volumeBindingMode: Immediate
```
StatefulSet Volume Claim Templates
StatefulSets use volumeClaimTemplates to provision storage for each replica automatically:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-cluster
  namespace: database
spec:
  serviceName: postgres-cluster
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
```
For a StatefulSet with 3 replicas, Kubernetes creates 3 PVCs: data-postgres-cluster-0, data-postgres-cluster-1, data-postgres-cluster-2. Each PVC binds to its own PV with an independent lifecycle.
When you scale down the StatefulSet, the PVCs for removed pods remain. You need to clean them up manually or configure automatic deletion, which is not the default behavior.
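Automatic cleanup can be sketched with the StatefulSet's `persistentVolumeClaimRetentionPolicy` field. The fragment below belongs under `spec` of the StatefulSet above; the field is only available on Kubernetes versions that ship the StatefulSet PVC auto-deletion feature, so treat its availability as an assumption about your cluster.

```yaml
# Fragment of a StatefulSet spec, not a complete manifest.
spec:
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Delete    # delete PVCs of replicas removed by a scale-down
    whenDeleted: Retain   # keep PVCs when the StatefulSet itself is deleted (the default)
```

Choosing `Delete` for `whenScaled` trades the safety net of retained data for less manual cleanup, so it fits replicas whose data can be rebuilt from peers.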
CSI Drivers and Abstraction
The Container Storage Interface (CSI) is a standard for storage plugins. CSI drivers replace the in-tree volume plugins (like aws-ebs, gce-pd) and provide a cleaner abstraction.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: ext4
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
CSI drivers get installed as pods and communicate with the storage provider. AWS EBS CSI driver, GCP Persistent Disk CSI driver, and others are maintained by cloud providers and the community.
Benefits of CSI:
- Vendors can release storage plugins without modifying Kubernetes core
- CSI drivers run as pods, not in-tree components
- Standardized interface across storage backends
Data Backup Considerations
PersistentVolumes do not get backed up automatically. You need explicit backup strategies:
Volume snapshots
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-snapshot
  namespace: database
spec:
  volumeSnapshotClassName: ebs-snapshot-class
  source:
    persistentVolumeClaimName: data-postgres-cluster-0
```
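The snapshot above refers to a class named `ebs-snapshot-class`, and a snapshot is only useful if it can be restored. As a sketch, assuming the AWS EBS CSI driver plus the external snapshot controller and its CRDs are installed, the class and a restore claim might look like this. The restored claim name `data-restored` is hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Delete        # remove the backing snapshot when the VolumeSnapshot is deleted
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-restored         # hypothetical name
  namespace: database
spec:
  storageClassName: ebs-sc    # CSI-backed class from the CSI section
  dataSource:
    name: postgres-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi          # must be at least the size of the snapshotted volume
```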
Backup tools
| Tool | Description |
|---|---|
| Velero | Backup and restore K8s resources and PVs |
| Kasten K10 | Policy-driven data protection |
| aws-cli / gcloud | Cloud provider snapshot tools |
Velero backs up PV snapshots to object storage and can restore entire applications:
```bash
velero backup create database-backup --include-namespaces database
velero restore create --from-backup database-backup
```
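One-off backups are easy to forget, so the same backup can also be expressed as a recurring schedule. The manifest below is a sketch that assumes Velero is installed in its default `velero` namespace; the schedule name, cron expression, and retention period are illustrative.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: database-nightly      # hypothetical name
  namespace: velero           # assumed Velero install namespace
spec:
  schedule: "0 2 * * *"       # every day at 02:00
  template:
    includedNamespaces:
      - database
    ttl: 720h0m0s             # keep each backup for 30 days
```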
Trade-off Analysis
Static vs Dynamic Provisioning Comparison
| Aspect | Static Provisioning | Dynamic Provisioning |
|---|---|---|
| Control | Admin creates PVs manually | StorageClass provisions PVs automatically |
| Utilization | May leave storage unused | Better utilization — PVs created on demand |
| Operational burden | High — manual tracking | Low — automated |
| Use case | Pre-allocated storage, local disks | Cloud block storage, elastic workloads |
ReadWriteOnce vs ReadWriteMany
| Aspect | ReadWriteOnce (RWO) | ReadWriteMany (RWX) |
|---|---|---|
| Cloud support | AWS EBS, GCP PD, Azure Disk | Azure Files, NFS, some SAN |
| Concurrent access | Single node, read-write | Multiple nodes, read-write |
| Performance | Typically higher | Typically lower (network filesystem) |
| Use case | Databases, single-pod apps | Shared file systems, multi-node workloads |
CSI vs In-Tree Volume Plugins
| Aspect | CSI Drivers | In-Tree Plugins (aws-ebs, gce-pd) |
|---|---|---|
| Maintenance | Vendors maintain independently | Kubernetes core team maintains |
| Deployment | Runs as pods outside core K8s | Compiled into Kubernetes binaries (kube-controller-manager, kubelet) |
| Feature velocity | Faster updates, new features | Slower release cycle |
| Standardization | CSI standard across all storage | Plugin-specific implementation |
VolumeBindingMode: Immediate vs WaitForFirstConsumer
| Aspect | Immediate | WaitForFirstConsumer |
|---|---|---|
| PV creation timing | At PVC creation | When pod using PVC is scheduled |
| Zone placement | May create cross-AZ volumes | Pod and volume in same AZ |
| Cost | Risk of cross-AZ traffic | Avoids cross-AZ costs |
| Scheduling flexibility | Lower: pod is pinned to the zone of the already-provisioned volume | Higher: volume is provisioned where the pod lands |
Reclaim Policies: Retain vs Delete vs Recycle
| Aspect | Retain | Delete | Recycle |
|---|---|---|---|
| Data after PVC deletion | Preserved on PV | Volume deleted automatically | Volume wiped, PV reused |
| Use case | Production data must survive | Ephemeral test storage | Reusable development volumes |
| Risk | Manual cleanup required | Data loss if accidental | Security risk (incomplete wipe) |
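For dynamically provisioned volumes, the reclaim policy is inherited from the StorageClass, which defaults to Delete. A sketch of a class that keeps volumes after their claims are deleted follows; the class name `ebs-retain` is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-retain             # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain          # PVs created from this class survive PVC deletion
volumeBindingMode: WaitForFirstConsumer
```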
StatefulSet volumeClaimTemplates vs Manual PVCs
| Aspect | volumeClaimTemplates (automatic) | Manual PVCs |
|---|---|---|
| Scaling | New PVC + PV auto-created per replica | Admin must create manually |
| Lifecycle | Independent per replica | Independent per PVC |
| Naming | Predictable naming pattern | Must track names manually |
| Deletion on scale-down | PVCs remain (must clean manually) | Same — manual cleanup |
Common Pitfalls / Anti-Patterns
Using ReadWriteMany for Single-Pod Databases
AWS EBS supports ReadWriteOnce only. If you see ReadWriteMany in a cloud database manifest, something is wrong. Most cloud block storage does not support concurrent mounts.
Not Setting volumeBindingMode: WaitForFirstConsumer
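WaitForFirstConsumer is the right default for most use cases. Without it, your PV might get provisioned in a different availability zone than the nodes your pod can run on. For zonal block storage this leaves the pod unschedulable or restricted to a single zone; for network storage it means cross-zone traffic and unexpected costs.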
Manually Creating PVs for StatefulSets
StatefulSets should use volumeClaimTemplates for dynamic volume provisioning. Manually creating PVs for each replica defeats the purpose of dynamic provisioning and makes scaling painful.
Production Failure Scenarios
Volume Remains Stuck in “Pending” State
A PVC stays in Pending when no PV satisfies its requirements. This happens when the StorageClass does not exist, the provisioner is not running, or no PV matches the requested size/access mode.
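Symptoms: PVC stays Pending with no provisioning events, or with a ProvisioningFailed event. A PVC in a WaitForFirstConsumer class stays Pending by design until a pod references it, so rule that case out first.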
Diagnosis:
```bash
kubectl describe pvc <name> -n <namespace>
kubectl get storageclass
kubectl get pods -n kube-system # Check provisioner pods
```
Mitigation: Verify the StorageClass exists and the provisioner is running. Check that the requested size and accessMode are supported by the storage backend.
Pod Fails to Start Due to Volume Mount Timeout
If a volume is mounted from a remote storage system and the network path is slow or unavailable, the pod can fail to start within the default mount timeout.
Symptoms: ContainerCreating state, MountVolume.SetUp timed out in events.
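Mitigation: First check network reachability to the storage backend and the health of the CSI node plugin on the affected node. The mountOptions field in the StorageClass passes options to the mount itself (for NFS, options such as timeo and retrans) and can help with slow backends. Use local storage when possible for latency-sensitive workloads.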
Volume Capacity Exhaustion
When a PersistentVolume fills up, the application writing to it fails. Database workloads are particularly vulnerable.
Symptoms: No space left on device errors in application logs, pod enters CrashLoopBackOff.
Mitigation: Set allowVolumeExpansion: true on the StorageClass to expand volumes without recreation. Monitor volume capacity and set up alerts at 80% usage. Implement cleanup policies for old data.
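The 80% alert can be sketched as a Prometheus rule. This assumes the Prometheus Operator is installed (it provides the `PrometheusRule` resource) and that kubelet volume metrics are being scraped; the rule name, namespace, and labels are illustrative, and some volume types may not report these metrics.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pvc-capacity             # hypothetical name
  namespace: monitoring          # assumed monitoring namespace
spec:
  groups:
    - name: storage
      rules:
        - alert: PersistentVolumeAlmostFull
          # Ratio of used to total bytes per PVC, as reported by the kubelet
          expr: |
            kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.80
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} is over 80% full"
```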
Interview Questions
Expected answer points:
- PV is the actual storage resource in the cluster — cluster-wide, not namespace-scoped
- PVC is a request for storage by a user or pod — namespace-scoped
- A PVC binds to a PV that satisfies its size, accessMode, and StorageClass requirements
- The PV lifecycle is independent of the pod lifecycle — PV survives pod restarts
Expected answer points:
- Retain: PV and data persist after PVC deletion. Manual cleanup needed. Use for production data that must not be lost.
- Delete: PV and its backing storage are automatically deleted when the PVC is deleted. Use for ephemeral or replaceable storage.
- Recycle: PV is wiped and made available for reuse. Deprecated in favor of dynamic provisioning, so avoid it in new clusters.
Expected answer points:
- Static — admin creates PVs manually before any PVC requests. Gives precise control but requires manual tracking.
- Dynamic — StorageClass provisions PVs automatically when a PVC requests storage and no matching PV exists.
- Dynamic provisioning is preferred for cloud workloads where storage is elastic.
- Static is useful when you have pre-allocated storage or local disks.
Expected answer points:
- ReadWriteOnce (RWO) — single node can mount as read-write. Supported by AWS EBS, GCP PD, Azure Disk.
- ReadOnlyMany (ROX) — multiple nodes can mount as read-only. Supported by NFS, Azure Files.
- ReadWriteMany (RWX) — multiple nodes can mount as read-write. Supported by NFS, Azure Files.
- Most cloud block storage (EBS) only supports RWO.
Expected answer points:
- By default (Immediate), the PV is created as soon as the PVC is created
- WaitForFirstConsumer delays PV creation until the pod using the PVC is actually scheduled
- This lets Kubernetes place the PV in the same availability zone as the pod, avoiding cross-AZ traffic costs
- Always use WaitForFirstConsumer for production cloud workloads
Expected answer points:
- A StorageClass defines the provisioner (e.g., AWS EBS, NFS) and parameters for creating PVs
- When a PVC specifies a StorageClass and no matching PV exists, the provisioner creates a new PV automatically
- StorageClasses are the foundation of dynamic provisioning in Kubernetes
- You can set default StorageClasses with annotation `storageclass.kubernetes.io/is-default-class`
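As an illustration of the last point, a class is marked as the default through an annotation on its metadata. The sketch reuses the `ebs-sc` class from the CSI section; whether your cluster already has a default class is something to check first, since having two defaults is confusing.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

PVCs that omit `storageClassName` are then provisioned from this class.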
Expected answer points:
- StatefulSets use volumeClaimTemplates to create a unique PVC for each replica pod
- Each PVC gets its own PV with independent lifecycle — scaling down does not delete PVCs
- Pod names are stable (e.g., postgres-cluster-0, postgres-cluster-1) for stable network identities
- Deployments share a volume for all replicas — StatefulSets give each replica its own storage
Expected answer points:
- CSI (Container Storage Interface) is a standard interface for storage plugins
- It replaced in-tree volume plugins (aws-ebs, gce-pd) that required modifying Kubernetes core
- CSI drivers run as pods outside the core Kubernetes codebase — vendors can release independently
- Benefits: faster feature development, standardized interface, cleaner abstraction
Expected answer points:
- PVs are not backed up automatically — you need explicit backup strategies
- VolumeSnapshots (snapshot.storage.k8s.io) create point-in-time snapshots via the CSI driver
- Velero backs up PV snapshots to object storage and can restore entire applications
- Cloud provider tools (AWS Snapshot, GCP Snapshots) can also be used for block storage
Expected answer points:
- AWS EBS supports ReadWriteOnce only — it cannot mount to multiple nodes
- If you specify RWX for a cloud database, provisioning will fail or the volume will be unusable
- Single-pod databases should always use RWO; when the pod is rescheduled, the volume is detached and reattached to the new node
- Only use RWX with NFS or shared file systems that explicitly support multi-node read-write
Expected answer points:
- Set `allowVolumeExpansion: true` on the StorageClass
- Edit the PVC spec to request more storage (e.g., `kubectl edit pvc <name>`)
- The CSI driver handles the actual volume expansion on the storage backend
- Not all storage providers support online expansion — some require pod restart
Expected answer points:
- When a StatefulSet is scaled down, the pods are terminated but their PVCs are NOT deleted
- The PVCs remain in the cluster, preserving data
- When you scale back up, the new pods reuse the existing PVCs with matching names (for example data-postgres-cluster-2) and see the old data
- You must manually clean up old PVCs or configure a deletion policy to avoid orphaned volumes
Expected answer points:
- 1. Provisioning — PV created statically by admin or dynamically by StorageClass
- 2. Binding — PVC binds to a PV that satisfies its requirements
- 3. Using — Pod uses the mounted volume
- 4. Releasing — Pod releases the claim (PVC deleted)
- 5. Reclaiming — PV is retained, recycled, or deleted based on reclaim policy
Expected answer points:
- hostPath mounts a directory from the host node — simple but not portable across nodes
- Local PV uses a dedicated block device or partition on a local disk — scheduler knows node affinity
- Local PVs respect node topology (same AZ) and are better for latency-critical workloads
- Both are static provisioning — you create the PV manually with the `no-provisioner` StorageClass
Expected answer points:
- Run `kubectl describe pvc <name>` to see events and look for provisioning failures or no matching PV
- Verify the StorageClass exists: `kubectl get storageclass`
- Check that the provisioner pod is running: `kubectl get pods -n kube-system`
- Verify the requested size and accessMode are supported by the storage backend
Expected answer points:
- Volume expansion only works if the StorageClass has `allowVolumeExpansion: true`
- CSI drivers must support online expansion — some require the pod to be stopped before expanding
- You can only expand, not shrink — be careful about requesting too much storage upfront
- For filesystem volumes, Kubernetes grows the filesystem (ext3, ext4, XFS) after the backend resize; with drivers that lack online expansion this only happens once a pod remounts the volume
Expected answer points:
- Storage provider outage — cloud block storage can become unavailable, causing pod evictions
- Cross-AZ traffic costs — PVs provisioned in the wrong zone cause unexpected network costs
- Storage quota exhaustion — reaching provider storage limits blocks new PVCs
- Deleting a PVC without checking reclaim policy can permanently lose data if the PV was set to Delete
Expected answer points:
- The in-tree NFS volume plugin has no dynamic provisioner; each export needs a manually created PV
- nfs.subvol.io creates dynamic subvolumes on an NFS server for each PVC — no manual PV creation needed
- The subvol.io provisioner creates isolated subdirectories with independent permissions per PVC
- It supports `WaitForFirstConsumer` to ensure the pod and NFS volume are in the same availability zone
Expected answer points:
- CSI drivers can advertise topology constraints (region, zone, node) for each volume they can provision
- `WaitForFirstConsumer` delays volume creation until the scheduler places the pod, then the CSI driver creates the volume in the pod's topology
- If a node in the topology becomes unavailable, volume creation fails even if other nodes have capacity
- Topology constraints prevent volumes from being provisioned in zones where they cannot be attached, avoiding runtime attachment failures
Expected answer points:
- You cannot change the StorageClass of an existing PV — migration requires creating a new PVC with the new StorageClass
- Use Velero to backup the application (including PVCs), then restore to a new namespace using a different StorageClass
- For databases, use native backup/restore (pg_dump, mysqldump) to move data to a new volume with the desired StorageClass
- For zero-downtime migration, scale up the application with a new PVC using the new StorageClass, migrate data, then scale down the old pod
Conclusion
Use this checklist when working with Kubernetes storage:
- Chose the correct access mode (RWO for databases, RWX for shared file storage)
- Used StorageClass with `volumeBindingMode: WaitForFirstConsumer` to avoid cross-zone costs
- Enabled `allowVolumeExpansion: true` on the StorageClass for production
- Used volumeClaimTemplates with StatefulSets, not manually created PVs
- Set up VolumeSnapshots for critical PVs before any major changes
- Monitored volume capacity and set alerts at 80% usage
- Used CSI drivers instead of in-tree provisioners for cloud storage
- Configured Retain reclaim policy for PVs that must survive PVC deletion
- Tested backup and restore procedures in staging
- Used local storage (hostPath or Local PV) for latency-critical workloads where possible
Persistent storage in Kubernetes involves three layers: PersistentVolumes (actual storage), PersistentVolumeClaims (requests), and StorageClasses (provisioning logic). Static provisioning gives you manual control. Dynamic provisioning scales automatically using StorageClass drivers.
StatefulSets use volumeClaimTemplates to create per-replica PVCs with independent lifecycle. CSI drivers provide a standardized interface for storage backends across cloud providers.
Remember that PVs do not get backed up automatically. Plan for snapshots and backup tools like Velero to protect your data. For more on running stateful workloads, see the Kubernetes Workload Resources post and the Advanced Kubernetes post.