File Allocation Methods

Understanding how file systems allocate disk space through contiguous, linked, and indexed allocation strategies, and how modern file systems like FAT, ext2/3/4, and NTFS implement these methods.

published: reading time: 29 min read author: GeekWorkBench

File Allocation Methods

When you save a 2GB video file, something has to decide where all those bytes go on disk. The file allocation method is the strategy a file system uses to assign disk space to files. Choose the wrong allocation method and your hard drive becomes a fragmented mess. Use the right one, and your system hums along efficiently for years.

This isn’t just theory—different allocation methods directly explain why:

  • FAT USB drives get slow after copying many files
  • Linux file systems survive crashes without data loss
  • Database systems carefully manage their own files rather than relying on the OS

Understanding allocation methods gives you insight into every storage decision you’ll ever make.

Introduction

When to Use / When Not to Use

Allocation method knowledge helps with system design and troubleshooting.

When allocation method understanding helps:

  • Formatting drives for specific workloads (large files vs many small files)
  • Understanding why fragmentation occurs and how to prevent it
  • Designing applications that interact heavily with storage
  • Troubleshooting performance issues in file-heavy workloads
  • Choosing the right file system for a given use case

When you can rely on defaults:

  • General desktop and server use (ext4/XFS/NTFS handle this well)
  • Cloud storage where abstracted block devices are provided
  • Containerized workloads with overlay file systems

Architecture or Flow Diagram

graph TD
    A[File Creation Request] --> B{Allocation Strategy}

    B --> C[Contiguous Allocation]
    B --> D[Linked Allocation]
    B --> E[Indexed Allocation]

    C --> F[Find large enough hole]
    D --> G[Track free blocks with linked list]
    E --> H[Build index block for file]

    F --> I[Allocate consecutive blocks]
    G --> J[Allocate from anywhere, link blocks]
    H --> K[Pointers in index block]

    style C stroke:#ff00ff,stroke-width:2px
    style D stroke:#00fff9,stroke-width:2px
    style E stroke:#ff00ff,stroke-width:2px

Core Concepts

Contiguous Allocation

The simplest approach: allocate consecutive blocks for each file.

File "video.mp4" - 5 blocks:
[Block 100][Block 101][Block 102][Block 103][Block 104]
                   ▲ Start block
                   └────── 5 consecutive blocks ──────┘

Directory Entry:
  Name: video.mp4
  Start: 100
  Length: 5

Advantages:

  • Excellent sequential read performance (no seeks between blocks)
  • Simple to implement
  • Small metadata overhead (only start block + length needed)

Disadvantages:

  • External fragmentation (must find large enough continuous hole)
  • Compaction required to reclaim space
  • File growth problematic (adjacent blocks may be taken)
// Simulated contiguous allocation
typedef struct {
    char filename[255];
    int start_block;
    int length_blocks;
} FileEntry;

int allocate_contiguous(int file_id, int blocks_needed) {
    int start = find_contiguous_hole(blocks_needed);
    if (start == -1) {
        // No hole large enough
        return -1;
    }
    allocate(start, blocks_needed);
    return start;
}

Linked Allocation

Each block contains a pointer to the next block:

File "document.txt" - 4 blocks:

[Block 50][data][Block 89] -> [Block 12][data][Block 200] ->
[Block 200][data][NULL]    -> [Block 75][data][NULL]

Directory Entry:
  Name: document.txt
  Start: 50
  End: 75 (cached or NULL if unknown)

Advantages:

  • No external fragmentation (use any free block)
  • File can grow easily (just add more blocks)
  • Simple directory entry (only start block needed)

Disadvantages:

  • Poor random access (must follow chain from start)
  • Pointer overhead (4-8 bytes per block)
  • Pointer integrity critical (corrupted pointer = data loss)
  • No concept of file size without scanning
// Linked allocation block structure
typedef struct {
    char data[BLOCK_SIZE - sizeof(int)];
    int next_block;  // Pointer to next block, -1 = end of file
} LinkedBlock;

Indexed Allocation

Store all block pointers in an index block:

File "database.db" - 7 blocks:

Index Block (Block 500):
┌─────────────────────────────────┐
│ Pointer: 120                    │
│ Pointer: 45                     │
│ Pointer: 389                    │
│ Pointer: 12                     │
│ Pointer: 201                    │
│ Pointer: 88                     │
│ Pointer: 156                    │
│ Pointer: -1 (end of index)      │
└─────────────────────────────────┘


         ├──> [Block 120][data]
         ├──> [Block 45][data]
         ├──> [Block 389][data]
         ├──> [Block 12][data]
         ├──> [Block 201][data]
         ├──> [Block 88][data]
         └──> [Block 156][data]

Directory Entry:
  Name: database.db
  Index Block: 500

Advantages:

  • Direct random access to any block
  • Good sequential performance
  • Blocks can be anywhere on disk
  • File metadata self-contained in index block

Disadvantages:

  • Index block overhead (1 block per file regardless of size)
  • Limit on file size based on index block entries
  • Extra I/O to read index block first

Index Block Extension (ext2/3/4 approach): For large files, use multiple index blocks in a tree structure:

  • Single indirect block (pointers to data blocks)
  • Double indirect block (pointers to single indirect blocks)
  • Triple indirect block (pointers to double indirect blocks)

FAT: File Allocation Table

FAT uses a variation of linked allocation with a central table:

graph LR
    A[FAT Table] --> B[Cluster 0]
    A --> C[Cluster 1]
    A --> D[Cluster 2]
    B --> E[End of file]
    C --> D
    D --> E

    style A stroke:#ff00ff,stroke-width:3px

The FAT table stores one entry per cluster:

  • 0x0000: Free cluster
  • 0xFFFF: End of file
  • Other values: Next cluster in chain
FAT-16 Entry Example:
┌────────┬────────┬────────┬────────┐
│ Cluster│ Value  │ Meaning│ Next   │
├────────┼────────┼────────┼────────┤
│   0    │ 0xFFF8 │ Media  │ -      │
│   1    │ 0xFFFF │ EOF    │ -      │
│   2    │   5    │ Next   │ Cluster 5
│   3    │   6    │ Next   │ Cluster 6
│   4    │   0    │ Free   │ -      │
│   5    │ 0xFFFF │ EOF    │ -      │
│   6    │ 0xFFFF │ EOF    │ -      │
└────────┴────────┴────────┴────────┘

File "example.txt" starting at cluster 2:
[Cluster 2] -> [Cluster 5] -> EOF

ext2/3/4: Block Bitmap and Extents

Linux’s ext file systems use bitmaps and extent trees:

Block Bitmap: One bit per block indicates free/allocated. Fast for finding free space.

Extent-Based Allocation: Instead of listing every block, ext4 uses extents—contiguous ranges:

Extent format in ext4:
Start block: 1024
Length: 50 blocks  (means blocks 1024-1073 are contiguous)

File stored in extents:
  Extent 1: start=1024, len=50   (blocks 1024-1073)
  Extent 2: start=2048, len=25    (blocks 2048-2072)
  Extent 3: start=4096, len=10    (blocks 4096-4105)

This reduces metadata overhead for large, contiguous files and minimizes fragmentation impact on performance.

NTFS: MFT and Attribute Records

NTFS uses the Master File Table (MFT) where each file has one or more records:

MFT Record for large file:

$DATA attribute (non-resident):
  Runlist: cluster 1000-1023, then 2048-2063, then 4096-4105

$FILE_NAME attribute:
  Name: large_video.avi
  Size: 2GB

NTFS can store data directly in the MFT for small files (under 900 bytes), or use runlists pointing to clusters scattered across the volume.

Production Failure Scenarios

Scenario 1: FAT Fragmentation Spiral

What happened: A security camera system wrote continuous 4MB video chunks to a FAT32 USB drive. After a week, the drive showed massive fragmentation. Read speeds dropped from 20MB/s to 2MB/s. Playback became choppy.

Why it happened: FAT can’t handle many small allocations well. Each video chunk became a linked chain of clusters scattered across the drive. Reading one file meant dozens of seeks.

Detection:

# Check fragmentation on FAT/FAT32
defrag /c: /v  # Windows

# Or use CHKDSK
chkdsk e: /r  # Scan and fix

Mitigation:

  • Use larger cluster sizes when formatting (32KB instead of 4KB)
  • Periodically defragment FAT drives
  • For video recording, use dedicated file systems (NTFS, ext4)
  • Consider exFAT for large files on removable media

Scenario 2: ext4 Inode Exhaustion vs Block Exhaustion

What happened: A mail server stored millions of tiny email files. df showed plenty of disk space, but new files couldn’t be created. The issue: inode exhaustion. All inodes were allocated even though blocks remained.

Detection:

df -i  # Check inode usage
# Filesystem   Inodes    IUsed   IFree  IUse% Mounted on
# /dev/sda1  100M       99.9M   100K   99%   /var/mail

tune2fs -l /dev/sda1 | grep -i inode
# Inode count:              100000000
# Inodes per group:         8192
# Block size:               4096

Mitigation:

  • Use mkfs.ext4 -N <inode_count> to provision more inodes when formatting
  • Set inode ratio with mkfs.ext4 -i <bytes_per_inode>
  • Monitor inode usage with alerting

Scenario 3: Database File Fragmentation

What happened: A PostgreSQL database grew to 500GB over months. Query performance degraded despite hardware having plenty of CPU and memory. Analysis showed database file fragmentation at 78%.

Detection:

# Check fragmentation on ext4
sudo fsck.ext4 -nv /dev/sda1

# Check extent tree depth
sudo debugfs -R "extents -v /path/to/database" /dev/sda1

Mitigation:

  • Use e4defrag to defragment:

    sudo e4defrag /var/lib/postgresql/data/
  • For databases, consider preallocate or use raw partitions

  • Choose noatime mount option to reduce fragmentation

  • Plan capacity to avoid >80% disk fullness

Trade-off Table

AspectContiguousLinkedIndexedext4 ExtentsNTFS MFT
Sequential ReadExcellentGoodGoodExcellentExcellent
Random AccessGoodPoorExcellentGoodExcellent
FragmentationHighLowLowLowModerate
Metadata OverheadLowVery LowMediumLowMedium
Large File SupportLimitedGoodLimitedExcellentExcellent
Crash RecoveryPoorPoorGoodExcellentExcellent
ComplexityLowLowMediumHighHigh

Implementation Snippet

Simulating Block Allocation with Bitmaps (Python)

#!/usr/bin/env python3
"""Simulate file allocation with block bitmap."""

class FileSystemSimulator:
    def __init__(self, total_blocks=1000, block_size=4096):
        self.total_blocks = total_blocks
        self.block_size = block_size
        self.block_bitmap = [0] * total_blocks  # 0 = free, 1 = allocated
        self.files = {}

    def allocate_contiguous(self, filename, size_blocks):
        """Allocate contiguous blocks (First Fit)."""
        start = self._find_contiguous_hole(size_blocks)
        if start == -1:
            return None

        # Mark blocks as allocated
        for i in range(start, start + size_blocks):
            self.block_bitmap[i] = 1

        self.files[filename] = {
            'start': start,
            'length': size_blocks,
            'type': 'contiguous'
        }
        return start

    def allocate_linked(self, filename, size_blocks):
        """Allocate using linked list approach."""
        free_blocks = self._find_free_blocks(size_blocks)
        if len(free_blocks) != size_blocks:
            return None

        # Allocate and link
        for i, block in enumerate(free_blocks):
            self.block_bitmap[block] = 1
            if i < len(free_blocks) - 1:
                # Store next pointer (simplified: store in separate structure)
                pass

        self.files[filename] = {
            'start': free_blocks[0],
            'blocks': free_blocks,
            'type': 'linked'
        }
        return free_blocks[0]

    def _find_contiguous_hole(self, size_needed):
        """Find first contiguous hole of sufficient size."""
        consecutive = 0
        start = -1

        for i, allocated in enumerate(self.block_bitmap):
            if allocated == 0:
                if consecutive == 0:
                    start = i
                consecutive += 1
                if consecutive >= size_needed:
                    return start
            else:
                consecutive = 0
                start = -1

        return -1

    def _find_free_blocks(self, count):
        """Find any free blocks."""
        free = []
        for i, allocated in enumerate(self.block_bitmap):
            if allocated == 0:
                free.append(i)
                if len(free) >= count:
                    return free
        return []

    def defragment(self, filename):
        """Defragment a linked file to contiguous allocation."""
        if filename not in self.files or self.files[filename]['type'] != 'linked':
            return False

        old_blocks = self.files[filename]['blocks']
        size = len(old_blocks)
        new_start = self._find_contiguous_hole(size)

        if new_start == -1:
            return False

        # Move data (in reality, this is a complex operation)
        for i in range(size):
            self.block_bitmap[old_blocks[i]] = 0
            self.block_bitmap[new_start + i] = 1

        self.files[filename] = {
            'start': new_start,
            'length': size,
            'type': 'contiguous'
        }
        return True

    def print_status(self):
        """Display allocation status."""
        allocated = sum(self.block_bitmap)
        print(f"Total blocks: {self.total_blocks}")
        print(f"Allocated: {allocated} ({100*allocated/self.total_blocks:.1f}%)")
        print(f"Free: {self.total_blocks - allocated}")
        print(f"\nFiles: {len(self.files)}")
        for name, info in self.files.items():
            print(f"  {name}: start={info.get('start', 'N/A')}, type={info['type']}")

if __name__ == "__main__":
    fs = FileSystemSimulator(100)

    # Test contiguous allocation
    fs.allocate_contiguous("video.mp4", 50)
    fs.allocate_contiguous("document.txt", 5)

    # Test linked allocation
    fs.allocate_linked("database.db", 20)

    fs.print_status()

Checking File Fragmentation with ext4

#!/bin/bash
# check_fragmentation.sh - Check file fragmentation on ext4

FILE=$1
if [ -z "$FILE" ]; then
    echo "Usage: $0 <filename>"
    exit 1
fi

DEVICE=$(df "$FILE" | tail -1 | awk '{print $1}')
MOUNT=$(mount | grep "$DEVICE" | awk '{print $3}')

# Use debugfs to check extent info
echo "=== Extent Information for $FILE ==="
debugfs -R "stat $FILE" "$DEVICE" 2>/dev/null | grep -E "Extent|Block|Size"

echo ""
echo "=== Fragmentation Analysis ==="
filefrag "$FILE"

echo ""
echo "=== Disk Layout ==="
ext2ls -l "$DEVICE" "$FILE" 2>/dev/null || echo "ext2ls not available"

Observability Checklist

Monitoring file allocation health:

  • df -h: Check available space
  • df -i: Check inode usage
  • filefrag -v file: Check file fragmentation level
  • e4defrag -c /mount/point: Check overall fragmentation percentage
# Comprehensive allocation monitoring script
#!/bin/bash
echo "=== Storage Allocation Report ==="
echo ""

echo "--- Space Usage ---"
df -h | grep -v tmpfs

echo ""
echo "--- Inode Usage ---"
df -i | grep -v tmpfs

echo ""
echo "--- Largest Files in /var (if exists) ---"
if [ -d /var ]; then
    du -sh /var/* 2>/dev/null | sort -rh | head -10
fi

echo ""
echo "--- Fragmentation Levels (ext4) ---"
for mount in $(mount | grep ext4 | awk '{print $3}'); do
    echo "Checking $mount..."
    sudo e4defrag -c "$mount" 2>/dev/null | grep -E "Fragmentation|total"
done

Common Pitfalls / Anti-Patterns

Secure File Deletion

File allocation metadata remains after deletion. The actual data isn’t overwritten until blocks are reallocated. For secure deletion:

# Overwrite file contents before deletion
shred -vu /path/to/file

# Or secure-delete entire directory
sfill -f /mount/point

# For ext4, consider encryption (dm-crypt)
# Encrypted FS ensures data is unreadable without key

File System Integrity

Modern file systems include allocation integrity features:

# Check ext4 metadata consistency
sudo e2fsck -n /dev/sda1  # -n = read-only, no changes

# Enable periodic file system checks
sudo tune2fs -l /dev/sda1 | grep -E "Check|state"
# Mount count: 50
# Check interval: 0 days

# Set automatic check interval (in days or mount count)
sudo tune2fs -c 30 /dev/sda1  # Check every 30 mounts
sudo tune2fs -i 90d /dev/sda1  # Check every 90 days

Compliance: Data Localization

Some regulations require knowing where data is physically allocated:

  • Track file allocation patterns for audit
  • Use file system features like ext4 dir_index for compliance tracking
  • Consider LVM for volume-level control

Common Pitfalls / Anti-patterns

1. Filling Disks Above 80%

# BAD: Let disk fill above 85%
# Causes: fragmentation, allocation failures, performance degradation

# GOOD: Monitor and alert
df -h | awk '{if ($5+0 > 80) print "WARNING: "$0}'

# Set up in /etc/rc.local or cron

2. Using Wrong Block Size

# Creating file system with wrong block size
# Small block (4KB): Good for many small files, wastes space on large files
# Large block (8KB): Good for large files, wastes inodes for small files

# Choose based on workload:
# Database: larger blocks
# Mail server: smaller blocks (many small files)

# Check current block size
tune2fs -l /dev/sda1 | grep "Block size"

# Recreate with different block size (requires backup)
sudo mkfs.ext4 -b 8192 /dev/sda1

3. Ignoring Extent Usage in ext4

# Check if files are using extents
debugfs -R "ex list <path>" /dev/sda1

# Force extent usage (disable block mapping)
chattr +e /path/to/file

# Check for non-extent files
debugfs -R "stat <path>" /dev/sda1 | grep -i extent

Quick Recap Checklist

  • Contiguous allocation: Fast but causes fragmentation
  • Linked allocation: Flexible but slow random access
  • Indexed allocation: Good random access, overhead per file
  • FAT: Linked list in a table; fragmentation kills performance
  • ext2/3/4: Bitmap allocation with extent-based files
  • NTFS: MFT with runlists for flexible allocation
  • Monitor inode usage (not just space usage)
  • Keep disk usage below 80% to prevent fragmentation

Interview Questions

1. Compare contiguous, linked, and indexed file allocation methods.

Contiguous allocation stores each file in consecutive blocks. Performance is excellent for sequential access, but external fragmentation becomes severe. Files can't grow if adjacent blocks are taken.

Linked allocation stores each block with a pointer to the next block. Any free block can be used, eliminating external fragmentation. However, random access requires following the entire chain, which is slow. If any pointer is corrupted, subsequent blocks are lost.

Indexed allocation stores all block pointers in an index block. Random access is fast, and blocks can be anywhere. The limitation is file size (limited by index block entries), and there's overhead (one index block per file).

Modern systems use variants: ext4 uses extents (contiguous ranges with pointers), NTFS uses the MFT with runlists, and FAT uses the central allocation table.

2. What is external fragmentation and why does it happen?

External fragmentation occurs when free disk space is split into small, non-contiguous chunks after files are created and deleted over time.

Example: You allocate a 10-block file, then delete it. Then you allocate 3 files of 3 blocks each in different areas. Delete two of them. Now you have three free blocks, but they're not adjacent. A new 5-block file can't be stored contiguously.

In contiguous allocation, this is catastrophic—you might have 100MB free but not be able to allocate a 50MB file. In linked allocation, it doesn't matter since blocks don't need to be adjacent. Indexed allocation is also immune since each block is referenced by index, not position.

Solutions: Compaction (moving files to consolidate free space), defragmentation tools, and using file systems that handle fragmentation gracefully.

3. How does ext4's extent-based allocation improve upon the traditional Unix inode approach?

Traditional Unix inodes use direct and indirect pointers (12 direct pointers, 1 single indirect, 1 double indirect, 1 triple indirect). For a large file, reading block N requires following pointer chains, potentially causing many disk I/Os.

ext4's extent-based allocation stores ranges of contiguous blocks as a single entry: {start_block, length}. A 100MB file might need just 2 extent entries instead of 25,600 block pointers.

Benefits:

  • Reduced metadata overhead (fewer entries per file)
  • Better sequential performance (fewer seek operations)
  • Less fragmentation impact (ranges still work even if scattered)
  • Fewer disk reads to access file metadata
4. Why might `df` show plenty of free space but `touch` fail with "No space left on device"?

This classic puzzle has two common causes:

1. Inode exhaustion: The file system has allocated all inodes. Inodes are created when the file system is formatted (you can have millions of tiny files with zero bytes each but use all inodes). Check with df -i. Solutions: recreate FS with more inodes (mkfs.ext4 -N <count>), or clean up small files.

2. Reserved blocks: ext2/3/4 reserves 5% of blocks for root. Even as root, you can't use these blocks. Check tune2fs -l /dev/sda1 | grep Reserved. Fix with tune2fs -m 0 /dev/sda1 (not recommended for root partition).

3. fragmentation: Not enough contiguous blocks for the allocation request (unlikely with linked/extent-based systems).

5. What is the difference between internal and external fragmentation?

Internal fragmentation: Wasted space within allocated units. If you allocate 4KB blocks and have a 1KB file, you waste 3KB. This is unavoidable with block-based allocation—you choose a block size and accept the waste.

External fragmentation: Free space split into non-contiguous chunks. After many allocations and deletions, you might have 100MB total free space but no single hole large enough for a 50MB file. This affects contiguous allocation methods but not linked or indexed.

Trade-offs:

  • Small block size: Less internal fragmentation, more external
  • Large block size: Less external fragmentation, more internal
  • Linked allocation: No external fragmentation, but pointer overhead
  • Extent-based: Reduces external fragmentation while keeping internal waste low
6. How does NTFS use the Master File Table (MFT)?

NTFS stores everything—including directory entries—as records in the Master File Table (MFT). Each file has at least one MFT record containing attribute/value pairs:

  • $STANDARD_INFORMATION: Timestamps, permissions
  • $FILE_NAME: File name (multiple possible for hard links)
  • $DATA: File content (resident within MFT or non-resident with runlist)

Small files (under ~900 bytes) store content entirely within the MFT record—fastest possible access. Larger files store runlists pointing to data clusters. The MFT itself can grow, and it maintains reserved zones for critical metadata.

NTFS uses update sequence numbers (USNs) to detect torn writes and ensure consistency—similar in concept to journaling but implemented differently.

7. What are the implications of the 1024-file-per-directory limit in FAT12/FAT16?

FAT12 and FAT16 limit directory entries to 16-bit root directory cluster counts. For FAT16 with typical cluster sizes, this limits root directories to 65,536 entries—but practical limits often enforced 512 entries per directory for compatibility.

FAT32 removed this root directory limitation (root can be anywhere, like any other directory), but FAT12/FAT16 root directories were fixed-size at format time.

Implications for modern use:

  • Old floppy disk FAT12 root directories held ~112 files maximum
  • Early USB drives with FAT12/16 would fail to create more than a few hundred files in the root directory
  • Modern SD cards and USB drives use FAT32 with essentially unlimited directory entries
8. How does extent-based allocation reduce metadata overhead compared to block pointers?

Traditional Unix inodes use block pointers:

  • 12 direct pointers
  • 1 single indirect (points to block of pointers)
  • 1 double indirect (points to block of single indirect blocks)
  • 1 triple indirect

A 1GB file with 4KB blocks needs 262,144 data blocks. With block pointers, that's one inode + 256 indirect block entries + etc.

Extent-based allocation stores contiguous ranges: {start: 4096, length: 65536} means "blocks 4096-69631, 65536 blocks (256MB)". A 1GB file might need just 4 extent entries instead of thousands of block pointers.

Benefits: Smaller metadata, fewer disk reads to access file, better sequential performance, less fragmentation impact.

9. Why do file systems reserve 5% of blocks, and how can you change this?

ext2/3/4 reserve 5% of blocks for the root user. This ensures:

  • Root can log in and repair the system even if users fill the disk
  • File system operations (like fsck) have working space
  • Critical daemons don't fail when disk is "full"

For large data partitions (e.g., 10TB file server), 5% is 500GB—significant waste. You can reduce or eliminate the reservation:

# View current reservation
tune2fs -l /dev/sda1 | grep Reserved

Change reservation to 1%

tune2fs -m 1 /dev/sda1

Disable entirely (not recommended for root)

tune2fs -m 0 /dev/sda1

For non-root partitions where you manage recovery, reducing reservation saves substantial space.

9. What is the difference between contiguous allocation with consolidation and extent-based allocation?

Contiguous allocation with consolidation: When a file is deleted and an adjacent block becomes free, the file system merges the free blocks. This helps reduce external fragmentation but still requires allocation to find a large enough contiguous hole.

Extent-based allocation: Stores files as contiguous ranges called extents — a start block and a length. A file can span multiple extents if it grows beyond the initial allocation, but each extent is internally contiguous. This reduces the problem of finding huge contiguous holes while keeping most file access sequential.

Key difference: Contiguous allocation assigns one block range per file; extent allocation allows multiple block ranges per file but prefers to keep each range contiguous.

ext4 uses extent-based allocation. A 100MB file might need just 2 extents (256KB extent entries) instead of 25,600 individual block pointers. This dramatically reduces inode size and disk I/O for file metadata access.

10. What is block group allocation and why does it help with locality?

ext2/3/4 divide the partition into block groups, each containing:

  • Block bitmap (free block tracking)
  • Inode bitmap (free inode tracking)
  • Inode table (inode storage for the group)
  • Data blocks

The allocator tries to keep related data together:

  • New files get inodes in the same block group as their parent directory
  • Data blocks for a file stay within the same block group when possible
  • Linked files (hard links) prefer the same block group

This improves performance by ensuring related data is physically close on disk—reducing seek time during sequential access. The allocator uses heuristics and preallocation hints to maintain locality.

12. How does the free block bitmap work in ext2/3/4, and what are its advantages over linked allocation?

ext2/3/4 use a block bitmap—one bit per block indicating free (0) or allocated (1). To allocate:

  1. Scan the bitmap for a free block
  2. Mark it as allocated (set bit to 1)
  3. Return the block number

Advantages over linked allocation:

  • O(1) free space check: No need to traverse chains to find free space
  • Fast allocation: Bitmap operations are hardware-optimized
  • No pointer corruption: Linked allocation's pointer chain isn't used; if a block is lost, the bitmap still shows it as allocated
  • Easy to find contiguous blocks: Bitmap scanning finds contiguous free ranges for extent allocation
  • Simple recovery: If a pointer is lost, the bitmap shows the actual allocation state

The inode bitmap works similarly for inode allocation. Both bitmaps are stored in each block group, enabling fast local allocation without scanning the entire partition.

13. What is the difference between file allocation table (FAT) and inode-based allocation?

FAT: A central table (stored at the beginning of the partition) where each entry corresponds to a cluster. File cluster chains are stored in this table:

  • FAT[cluster] = next cluster in file chain (or EOF)
  • Directory entries store the starting cluster only
  • The entire FAT must be in memory for efficient operation

Inode-based (ext2/3/4, XFS, NTFS): Each file has its own metadata structure (inode) containing block pointers:

  • Directory entries map names to inode numbers
  • Each inode stores pointers to the file's data blocks
  • No central table needed—inode is self-contained

Key differences:

  • FAT's chain is external to the file; inode's pointers are inside the file's metadata
  • FAT limits file size by table size; inode-based systems limit by pointer count
  • FAT must keep the entire table in memory; inode-based only needs to cache relevant inodes
  • Inode systems support hard links (multiple directory entries pointing to same inode); FAT does not
14. How does journaling improve file system reliability in ext3/4 compared to ext2?

ext2 has no journaling—after a crash, fsck must scan the entire partition to reconstruct the free block bitmap and directory structure. This takes minutes to hours on large partitions.

ext3 adds a write-ahead journal (a separate area on disk):

  1. Before making a metadata change, write it to the journal first
  2. Once the journal write is on disk (synchronized), apply the change to the actual file system
  3. After the change is complete, mark the journal entry as committed

After a crash, recovery is fast:

  • Read the journal (small, linear scan)
  • Replay committed transactions (reapply changes)
  • Usually takes seconds, not minutes

ext4 goes further with journal checksums (detect journal corruption) and data=ordered mode (file data is written before its metadata in the journal), reducing both corruption risk and recovery time.

15. What is the trade-off between larger and smaller block sizes in file allocation?

Small block size (e.g., 4KB):

  • Pros: Less internal fragmentation, efficient for small files, more flexibility in allocation
  • Cons: More metadata overhead (more block pointers needed per file), worse sequential read performance (more seeks), larger bitmaps

Large block size (e.g., 8KB, 16KB):

  • Pros: Better sequential read performance (fewer blocks to seek between), smaller metadata (fewer pointers), better for large files
  • Cons: More internal fragmentation (wasted space in last block), inefficient for many small files, wasted i-nodes

Most file systems use 4KB as the default because it balances well for general workloads. Database workloads often use larger blocks to match their record sizes. The optimal choice depends on the typical file size in your workload.

16. How does Btrfs differ from ext4 in its allocation approach?

ext4 uses traditional block allocation with extents. Btrfs takes a fundamentally different copy-on-write (COW) approach:

extent-based vs block-pointer:

  • ext4: inode stores pointers to data blocks (traditional)
  • Btrfs: inode stores a root node of a B-tree; all block pointers are in the tree

COW semantics:

  • When you modify a block, Btrfs writes a new copy rather than overwriting in place
  • The old block remains until no references point to it
  • This enables snapshots (reference counting on blocks), checksums (verify old data wasn't corrupted), and easy RAID integration

Allocation differences:

  • Btrfs uses chunks (extents of the underlying devices) rather than fixed block groups
  • Dynamic allocation: chunks are allocated from free space and can be relocated
  • Integrated RAID: allocation metadata and data are checksummed and mirrored across devices
17. What is preallocation and when is it useful?

Preallocation reserves disk space for a file before data is written. Instead of allocating blocks as data arrives, the file system reserves a range of blocks upfront.

Use cases:

  • Databases: Preallocate data files to avoid fragmentation as the database grows. Use posix_fallocate() on Linux to reserve space without writing zeros.
  • Video recording: Reserve space for a fixed-length recording so the file doesn't fragment mid-write.
  • Preventing disk full errors: Allocate space for known-size files before writing begins.

Implementation in ext4:

  • extents are pre-allocated for the file
  • Blocks are marked as allocated in the bitmap
  • No data is written until the application writes—blocks remain sparse (read as zeros)

Caveat: Preallocating and then not using the space wastes capacity. Also, some older file systems (FAT) don't support true preallocation and may not reserve the blocks.

18. How does XFS handle allocation differently from ext4?

ext4 allocation approach:

  • Block groups of fixed size (~128MB each)
  • allocator tries to keep files and their metadata in the same block group
  • Uses bitmap-based allocation within block groups

XFS allocation approach:

  • Allocation groups (AGs): XFS partitions the filesystem into allocation groups (typically 1GB each), each with its own bitmap and inode allocator
  • B+tree bitmaps: XFS uses B+trees for free space tracking (not simple bitmaps), enabling faster allocation in large filesystems
  • Delayed allocation: Like ext4, XFS batches allocations and delays physical block assignment until writeback, reducing fragmentation
  • Stripe-aware allocation: XFS can align allocations to RAID stripe boundaries for better performance

XFS handles parallel allocation better because multiple threads can allocate from different AGs simultaneously without contention. This makes XFS excellent for large files and high-throughput workloads.

19. What is the difference between hard links and symbolic links in file allocation?

Hard link: Multiple directory entries (dentries) pointing to the same inode. The file's data blocks are referenced by one inode, but multiple names in different directories point to it.

  • Only works within the same file system (must have same inode number space)
  • Cannot link to directories (would create cycles in directory tree)
  • Deleting a hard link removes one name; the file persists until all hard links are removed
  • No extra disk space for the link itself (just a directory entry)

Symbolic link: A special file containing a path string. The symlink's inode points to data blocks containing the target path.

  • Can cross file systems and mount points
  • Can link to directories
  • Deleting the symlink does not affect the target
  • Deleting the target leaves a "dangling" symlink
  • Uses a small amount of space for the path string

From an allocation perspective: hard links share the same inode and data blocks; symbolic links allocate separate data blocks for the path content.

20. How does defragmentation work on extent-based file systems like ext4?

Fragmentation on ext4 occurs when a file's extents become scattered or when the last extent can't be extended due to adjacent allocated blocks.

Detecting fragmentation:

  • filefrag -v file: Shows extent list for a file
  • e4defrag -c /mount: Shows overall fragmentation percentage

Defragmentation with e4defrag:

  1. Reads the file's current extent list
  2. Calculates optimal layout (contiguous or minimal extents)
  3. Allocates new blocks in the desired locations
  4. Copies data to new blocks (leaving old blocks intact during copy)
  5. Atomically updates inode to point to new blocks
  6. Frees old blocks once new layout is active

The copy-on-write nature of the update ensures the file remains accessible during defragmentation. If the system crashes mid-defrag, the old extents are still valid.

Best practice: Keep disk usage below 80% to minimize fragmentation. ext4's delayed allocation helps—batching writes reduces scatter.

Further Reading

Topic-Specific Deep Dives:

  • ext4 Extent Tree Structure: Study how ext4 actually stores extent information in the inode. The extent header and extent node structures enable the fast directory operations that make ext4 performant.

  • Delayed Allocation: ext4 uses delalloc — writes are held in the page cache and allocated later. This batches allocations and reduces fragmentation. Understanding when this can cause data loss (power failure before allocation) is important.

  • Database File Management: Databases often avoid OS file systems entirely, using raw partitions or raw I/O. Understanding why (bypassing page cache, controlling fsync behavior, avoiding double buffering) explains PostgreSQL and Oracle storage configuration.

  • FAT Fragmentation Physics: On FAT file systems, fragmentation isn’t just about free space scatter—it’s about how the FAT chain grows and how the firmware’s cache interacts with random writes. Understanding the physics explains why FAT USB drives degrade in ways Linux file systems don’t.

  • NTFS MFT Structure: The Master File Table is itself a file. Studying how NTFS stores attribute records (resident vs non-resident), how it handles fragmentation of the MFT itself, and how the $Bitmap works explains Windows storage behavior.

Conclusion

File allocation methods determine how disk space gets assigned to files, with direct consequences for performance and fragmentation. Contiguous allocation delivers excellent sequential throughput but suffers from external fragmentation that eventually requires compaction. Linked allocation eliminates fragmentation but destroys random access performance. Indexed allocation balances both, though with per-file overhead.

Modern file systems combine these approaches: FAT uses a central table for linked-style allocation, ext4 uses extent-based allocation that stores contiguous ranges efficiently, and NTFS uses the MFT with runlists for maximum flexibility. The key insight is that no single approach wins universally — the right choice depends on your workload characteristics.

For continued learning, study how databases implement their own file allocation strategies to bypass OS buffering, why ext4’s delayed allocation reduces fragmentation compared to ext3, and how the trade-offs between allocation methods manifest in real-world performance benchmarks. Understanding these fundamentals helps you diagnose storage performance issues and choose file systems appropriate for your use case.

Category

Related Posts

ASLR & Stack Protection

Address Space Layout Randomization, stack canaries, and exploit mitigation techniques

#operating-systems #aslr-stack-protection #computer-science

Assembly Language Basics: Writing Code the CPU Understands

Learn to read and write simple programs in x86 and ARM assembly, understanding registers, instructions, and the art of thinking in low-level operations.

#operating-systems #assembly-language-basics #computer-science

Boolean Logic & Gates

Understanding AND, OR, NOT gates and how they combine into arithmetic logic units — the building blocks of every processor.

#operating-systems #boolean-logic-gates #computer-science