Materialized Views

Precomputed query results stored as tables. PostgreSQL refresh strategies, indexes on materialized views, and when to use them vs views or denormalization.

published: March 26, 2026 reading time: 12 min read updated: January 1, 1970

Materialized Views

A regular view is a stored query. When you query it, the database runs the underlying query and returns the result. No data is stored. Every query to the view re-executes the query.

A materialized view stores the actual result. The query runs once, the result is saved as a table, and you query that table directly. Fast reads, stale data.

The tradeoff—freshness vs performance—determines when materialized views help and when they don’t.

Materialized Views vs Regular Views vs Denormalization

Regular views are logical constructs. They don’t store data, just the query definition. Use them for hiding complex queries, restricting access without separate tables, or encapsulating business logic.

Materialized views store precomputed results. Use them for accelerating expensive aggregations or joins, replicating remote data locally, or building read replicas for specific query patterns.

Denormalization duplicates data manually. Your code decides what to store where and manages keeping copies in sync. Materialized views let the database handle this automatically.

PostgreSQL Materialized Views

PostgreSQL supports materialized views natively.

CREATE MATERIALIZED VIEW sales_by_month AS
SELECT
    date_trunc('month', order_date) AS month,
    customer_id,
    SUM(total) AS total_sales
FROM orders
GROUP BY date_trunc('month', order_date), customer_id
WITH DATA;

WITH DATA populates the view immediately. Without it, you create an empty view that populates on first refresh.

Query the materialized view like a regular table:

SELECT * FROM sales_by_month WHERE month >= '2025-01-01';

Refreshing Materialized Views

Data goes stale as underlying tables change. PostgreSQL offers two refresh strategies:

Full refresh rewrites the entire materialized view:

REFRESH MATERIALIZED VIEW sales_by_month;

This blocks reads until the refresh finishes. For large views, that means noticeable downtime.

Concurrent refresh rebuilds without blocking reads:

REFRESH MATERIALIZED VIEW CONCURRENTLY sales_by_month;

Concurrent refresh requires a unique index. Without one, PostgreSQL refuses the concurrent option.

CREATE UNIQUE INDEX ON sales_by_month (month, customer_id);

Concurrent refresh takes longer than full refresh but keeps the view available.

Automatic Refresh

PostgreSQL doesn’t have built-in automatic refresh. Options:

Scheduled jobs via pg_cron
Trigger-based refresh on underlying tables
Application-level refresh after writes

Oracle’s materialized views support automatic refresh via refresh groups and schedules. PostgreSQL doesn’t have this.

Oracle Materialized Views

Oracle materialized views offer more features:

On-commit refresh: automatically refresh when underlying tables change
Refresh groups: refresh multiple materialized views together
Query rewrite: the optimizer rewrites queries to use materialized views automatically

CREATE MATERIALIZED VIEW monthly_sales
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month,
    SUM(total) AS total_sales
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date);

BUILD IMMEDIATE populates on creation. REFRESH FAST ON COMMIT uses log-based incremental refresh—only changed rows update, not the entire view.

Refresh Methods

Oracle refresh modes:

REFRESH FAST: incremental, requires materialized view logs on underlying tables
REFRESH COMPLETE: full rebuild
REFRESH FORCE: fast if possible, otherwise complete
NEVER REFRESH: prevents automatic refresh

ALTER MATERIALIZED VIEW monthly_sales REFRESH FAST;

Indexes on Materialized Views

Materialized views are tables. Index them like regular tables to speed up queries.

CREATE MATERIALIZED VIEW order_summary AS
SELECT
    customer_id,
    status,
    COUNT(*) AS order_count,
    SUM(total) AS total_value
FROM orders
GROUP BY customer_id, status;

CREATE INDEX idx_order_summary_customer ON order_summary (customer_id);
CREATE INDEX idx_order_summary_status ON order_summary (status);

For frequently filtered queries, proper indexes make a significant difference.

Oracle supports bitmap indexes on materialized views, which compress well for low-cardinality columns like status flags.

When Materialized Views Make Sense

Materialized views work when:

Underlying data changes infrequently compared to query frequency
Queries are expensive aggregations or multi-table joins
You can tolerate stale data for a known period
Query patterns are predictable and stable

Analytics workloads fit well. Daily sales summaries, monthly rollups, user activity aggregations—these change when new data arrives and are queried constantly.

ETL pipelines use materialized views as intermediate stores. Transform once, store the result, query it repeatedly.

When Materialized Views Don’t Help

Materialized views fail when:

Data changes frequently and you need current values
Query patterns are too diverse to precompute efficiently
Storage overhead exceeds the benefit
Refresh costs outweigh read savings

If you need to refresh every few minutes, the refresh overhead might exceed the read performance benefit. A regular view with proper indexing or query caching might work better.

Materialized View Refresh Flows

flowchart TD
    subgraph FullRefresh["Full Refresh"]
        F1[("Materialized<br/>View")] -->|blocks reads| F2[("Drop + Recreate")]
        F2 --> F3[("New View<br/>Available")]
    end

    subgraph ConcurrentRefresh["Concurrent Refresh"]
        C1[("Old View<br/>Available")] --> C2[("Build index on<br/>new snapshot")]
        C2 --> C3[("Swap atomically<br/>when ready")]
        C3 --> C4[("Old view removed<br/>Reads continue")]
    end

    FullRefresh -.->|concurrent is faster| ConcurrentRefresh

Full refresh blocks reads until complete. Concurrent refresh keeps the old view available while building the new one in the background.

When to Use / When Not to Use Materialized Views

Use materialized views when:

Underlying data changes infrequently compared to query frequency
Queries are expensive aggregations or multi-table joins
You can tolerate stale data for a known period
Query patterns are predictable and stable
You need to replicate remote data locally

Do not use materialized views when:

Data changes frequently and you need current values
Query patterns are too diverse to precompute efficiently
Storage overhead exceeds the benefit
Refresh costs outweigh read savings
You need real-time data accuracy

Materialized View vs Regular View vs Denormalization Trade-offs

Dimension	Regular View	Materialized View	Denormalization
Storage	None — query only	Stores result as table	Manual data duplication
Read performance	Same as query	Fast — reads precomputed	Fast — reads precomputed
Freshness	Always current	Stale until refresh	Manual — application manages
Write impact	None	Refresh blocks or takes time	Updates more complex
Maintenance	None	Periodic refresh needed	Application-level sync
Consistency	Strong	Depends on refresh frequency	Application-dependent
Best for	Hiding complexity, access control	Precomputed aggregations, remote data	Read-heavy workloads, specific shapes

Production Failure Scenarios

Failure	Impact	Mitigation
Full refresh blocking reads	Queries fail during refresh window	Use CONCURRENTLY option with unique index
Refresh lag causing stale reports	Business decisions on outdated numbers	Monitor last_refresh time, alert on staleness
Refresh deadlocks under write load	Refresh and writes block each other	Schedule refresh during low-write windows
Unique index missing for concurrent refresh	PostgreSQL refuses CONCURRENTLY	Always create unique index before concurrent refresh
Materialized view growing too large	Storage bloat, slow refresh	Partition materialized views, limit aggregation scope
Index missing on frequently filtered columns	Slow reads from materialized view	Add indexes like any table — materialized views are tables

Capacity Estimation: Materialized View Storage and Refresh Time

Materialized views store precomputed results as tables, so sizing is straightforward: the view’s result set plus any indexes you create on it.

Storage sizing formula:

mv_storage_bytes = sum(row_width × number_of_rows)
mv_index_storage = sum(index_width × indexed_rows)
total_mv_storage = mv_storage_bytes + mv_index_storage

For a materialized view aggregating sales by day and product category:

Source: 10M rows in fact_sales, 100K rows in dim_products
Materialized result: 365 days × 100 categories = 36,500 rows
Row width: date (4 bytes) + category_id (4 bytes) + product_id (4 bytes) + total_sales (8 bytes) + unit_count (8 bytes) = ~30 bytes per row
MV storage: 36,500 × 30 = ~1.1MB (plus 30% overhead = ~1.5MB)
With indexes: btree on (date, category) adds roughly 36,500 × 24 bytes = ~900KB
Total: ~2.5MB for a materialized view replacing a query scanning 10M rows

Refresh time estimation: Full refresh time depends on the underlying query’s cost:

refresh_time ≈ base_query_time + (index_rebuild_time × number_of_indexes)

If the base query takes 30 seconds on the source data and you have 2 indexes requiring rebuild, full refresh takes roughly 30 + (2 × 5) = 40 seconds. For incremental refresh (REFRESH CONCURRENTLY), the time depends on the delta — typically much faster, proportional to changed rows rather than total rows.

For a 500M-row fact table with daily aggregation materialized view: full refresh might take 5-15 minutes. Incremental refresh of only changed partitions takes seconds. This is why partition-aware materialized views — where the MV is partitioned the same way as the source — enable fast incremental refreshes.

Observability Hooks: Monitoring Materialized View Health

PostgreSQL exposes materialized view metadata through system catalog queries.

-- When was each materialized view last refreshed?
SELECT
    matviewname,
    schemaname,
    last_refresh,
    ROUND(EXTRACT(EPOCH FROM (now() - last_refresh)) / 60, 1) AS minutes_since_refresh
FROM pg_matviews
WHERE schemaname = 'public'
ORDER BY last_refresh;

-- Monitor relation sizes (including materialized views)
SELECT
    relname,
    relkind,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_catalog.pg_statio_user_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(relid) DESC;

Key alerts to configure:

Staleness alert: minutes_since_refresh > your_staleness_threshold — if your business requires daily reports at 8am, alert if the MV has not refreshed since yesterday.
Size anomaly: if MV size grows unexpectedly (suddenly 10× larger), check for unbounded aggregation or missing WHERE clause in the view definition.
Refresh duration: track refresh time over time — if REFRESH CONCURRENTLY starts taking 10× longer, the underlying data distribution has likely changed.

-- Track refresh duration history
CREATE TABLE mv_refresh_history (
    mv_name TEXT,
    refresh_started TIMESTAMPTZ,
    refresh_duration_secs NUMERIC,
    rows_affected BIGINT
);

-- After each refresh:
-- INSERT INTO mv_refresh_history VALUES ('sales_by_month', now(), extract_secs, row_count);

Real-World Case Study: Shopify’s Materialized Views for Analytics

Shopify runs analytics over billions of transactions across their merchant base. Their data warehouse team uses materialized views extensively to precompute common aggregation patterns — daily revenue by merchant, order counts by status, refund rates by product category.

The challenge: merchant-facing analytics need to feel real-time, but recomputing aggregations over billions of rows on every query is impractical. Materialized views bridge this gap by giving sub-second query response times on precomputed results.

Their implementation pattern: partition the source fact table by day, create materialized views that aggregate at the day level, and use REFRESH CONCURRENTLY with partition-level granularity. When new data loads for day D, only the partition containing day D changes — the materialized view refresh reads only affected partitions, not the entire fact table.

The lesson: materialized views work best when source data is partitioned and the aggregation granularity aligns with partition boundaries. Without partition alignment, full refreshes become prohibitively expensive and defeat the purpose of precomputation.

Interview Questions

Q: What is the difference between REFRESH MATERIALIZED VIEW and REFRESH CONCURRENTLY?

REFRESH MATERIALIZED VIEW takes an exclusive lock on the MV — reads fail while refresh runs. For large MVs, this lock duration is unacceptable. REFRESH CONCURRENTLY uses a unique index (required) and refreshes without blocking reads — it builds a new version of the MV, swaps when ready, and uses CREATE INDEX CONCURRENTLY semantics. The tradeoff: CONCURRENTLY takes slightly longer than a blocking refresh because it builds indexes one at a time and must ensure no duplicate keys. It also requires the MV to have a unique index.

Q: How do you keep a materialized view from growing unbounded?

The MV query definition determines its size. If you aggregate by day with no upper bound, 10 years of data makes the MV large and slow to refresh. Add a WHERE clause to limit the time window: WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'. Partition the MV the same way you partition source tables so old partitions can be detached and dropped. Alternatively, use a rolling window materialized view — a scheduled job drops old partitions and creates new ones daily.

Q: A materialized view refresh is causing replication lag on your read replica. What do you do?

Refresh on the primary triggers replication of the new data to replicas. A full refresh on a large MV can generate gigabytes of WAL that replicas must apply, causing lag. The fix: run REFRESH CONCURRENTLY on the replica instead of the primary if replicas serve read traffic. Alternatively, reduce refresh frequency — if the MV is refreshing every minute but the source data only changes hourly, you are paying replication cost for no benefit.

Q: When would you choose a materialized view over denormalization managed in application code?

Materialized views let the database handle refresh logic — you define the query once, the database manages when and how to update results. Application-level denormalization requires your code to know about and maintain every denormalized copy. Choose materialized views when the database can manage the refresh efficiently (incremental, partition-aligned, not too frequent). Choose application denormalization when the denormalized data structure does not map cleanly to a SQL query — for example, precomputing a specific social graph adjacency list that requires graph traversal logic rather than a standard aggregation.

Conclusion

Materialized views precompute expensive queries and store results as tables. They trade freshness for read performance. PostgreSQL and Oracle both support them—Oracle has more sophisticated refresh options.

Use materialized views for aggregations and joins queried frequently but changing slowly. Avoid them when data changes often or query patterns are too varied to precompute effectively.

Denormalization — Manual vs database-managed data duplication
Query Execution Plans — Understanding how queries run
Indexes Databases — Indexing strategies for performance

Materialized Views

Materialized Views

Materialized Views vs Regular Views vs Denormalization

PostgreSQL Materialized Views

Refreshing Materialized Views

Automatic Refresh

Oracle Materialized Views

Refresh Methods

Indexes on Materialized Views

When Materialized Views Make Sense

When Materialized Views Don’t Help

Materialized View Refresh Flows

When to Use / When Not to Use Materialized Views

Materialized View vs Regular View vs Denormalization Trade-offs

Production Failure Scenarios

Capacity Estimation: Materialized View Storage and Refresh Time

Observability Hooks: Monitoring Materialized View Health

Real-World Case Study: Shopify’s Materialized Views for Analytics

Interview Questions

Conclusion

Category

Tags

Related Posts

Denormalization

Connection Pooling: HikariCP, pgBouncer, and ProxySQL

Database Monitoring: Metrics, Tools, and Alerting